Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FC-0040] feat: update search index when object tags change #647

Conversation

yusuf-musleh
Copy link
Member

@yusuf-musleh yusuf-musleh commented Mar 24, 2024

Description

This PR implements updating search index with object tags whenever they change, based on the new CONTENT_OBJECT_TAGS_CHANGED event.

Supporting Information

Testing Instructions

  • Run your local devstack on this branch
  • Clone the openedx-events repo on this branch [FC-0040] feat: Add CONTENT_OBJECT_TAGS_CHANGED signal openedx/openedx-events#327 and make sure it is accessible from your devstack, and install it locally
    • pip uninstall openedx-events
    • pip install -e path/to/openedx-events/
  • Make sure you have meilisearch setup locally, follow the setup instructions here https://github.com/open-craft/tutor-contrib-meilisearch
  • Make sure you also have some sample taxonomies/tags, you can add some from here: https://github.com/open-craft/taxonomy-sample-data
  • Run tutor dev run cms bash and ./manage.py cms reindex_studio
  • View the resulting index at http://meilisearch.local.edly.io:7700/ (see tutor-contrib-meilisearch README for how to get the API key to log in)
  • Check and confirm that is an index with documents in it. If you already have some tags applied to blocks, they will appear there
  • Go back to studio, and add/update tags in a block
  • Go back to meilisearch dashboard, and confirm that the Tags field in the index document for the block has been updated to reflect that changes you made.
    • Note: I recommend having a easily identifiable name for the block/unit that you added tags to so you can easily search for it
    • NOTE: Meilisearch seems to cache queries along with their results in the frontend, so if you simply search it might show you stale data (network tab shows no requests), especially if you're searching for the same query. Make sure you refresh the Meilisearch dashboard (http://meilisearch.local.edly.io:7700/) and then perform the search.
  • Repeat and confirm the above for both units and blocks

Prinvate-ref: FAL-3691

@yusuf-musleh yusuf-musleh force-pushed the yusuf-musleh/update-search-index-when-object-tags-change branch from 89d451c to b2dc9a4 Compare March 24, 2024 05:35
@yusuf-musleh yusuf-musleh force-pushed the yusuf-musleh/update-search-index-when-object-tags-change branch 4 times, most recently from 0d444e4 to 21b8139 Compare March 25, 2024 11:06
@yusuf-musleh yusuf-musleh changed the title [WIP] feat: update search index when object tags change (temp) [WIP] feat: update search index when object tags change Mar 25, 2024
Comment on lines 85 to 141
upsert_xblock_index_doc.delay(
content_object_tags.object_id,
recursive=False,
update_metadata=True,
update_tags=True,
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to set update_metadata=True here, because without it, it was creating new documents when tags are updated, rather than updating the existing one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new documents' issue is unrelated to this parameter.

We changed the way the ID hash was generated upstream, so the first time you tag an object (after the update), a new document is generated. Could you check if this was the case?

If so, using update_metadata=False should work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, adding Tags to a LibraryComponent if passing here (and falling).

A suggestion is to refactor and create a new update function for tagging only that can handle both types of blocks calling searchable_doc_for_library_block or searchable_doc_for_course_block (and eliminate the update_metadata and update_tags parameters from the task). But it's up to you!

Copy link
Member Author

@yusuf-musleh yusuf-musleh Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check if this was the case?
If so, using update_metadata=False should work.

Yes -- that seemed to fix it and using update_metadata=False now works! I updated it.

I also added handling tagging for library blocks, I just opted for checking the type of the object_id for the content object and choosing the correct method based on that.

@@ -162,7 +162,7 @@ def _tags_for_content_object(object_id: UsageKey | LearningContextKey) -> dict:
# tags for each component separately.
all_tags = tagging_api.get_object_tags(object_id).all()
if not all_tags:
return {}
return {Fields.tags: {}}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was needed to clear out tags in the index when unselecting all tags for the block, it would remain the last value otherwise because the Fields.tags value would not be included.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth mentioning that in the code as a comment :)

@yusuf-musleh yusuf-musleh force-pushed the yusuf-musleh/update-search-index-when-object-tags-change branch 6 times, most recently from 61d72bb to 4a85a61 Compare March 26, 2024 09:48
@yusuf-musleh yusuf-musleh changed the title [WIP] feat: update search index when object tags change [FC-0040] feat: update search index when object tags change Mar 26, 2024
@yusuf-musleh yusuf-musleh marked this pull request as ready for review March 26, 2024 10:27
@rpenido rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch from f1a94f3 to a2e9d2a Compare March 26, 2024 18:45
Copy link
Member

@rpenido rpenido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @yusuf-musleh! Good work here!
I left some suggestions and questions in the review.

@rpenido rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch from a2e9d2a to 4a55c89 Compare March 26, 2024 20:57
object_id = kwargs.get('object_id')
CONTENT_OBJECT_TAGS_CHANGED.send_event(
time=now(),
content_object=ContentObjectData(object_id=object_id)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I wonder, what's an object_id? What does it look like?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The object_id is the id of the course or library block as a string that is being tagged. For example:

block-v1:SampleTaxonomyOrg2+STC1+2023_1+type@vertical+block@f8de78f0897049ce997777a3a31b6ea0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add that explanation to the event data docstring? Here that would address all my comments. Thank you!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing! Updated.

Copy link

@mariajgrimaldi mariajgrimaldi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments regarding openedx/openedx-events#327. Also, is this PR going to be open against upstream?

@yusuf-musleh yusuf-musleh force-pushed the yusuf-musleh/update-search-index-when-object-tags-change branch 2 times, most recently from 8f39fc6 to cdde7d7 Compare March 28, 2024 04:31
@yusuf-musleh
Copy link
Member Author

Thanks @rpenido and @mariajgrimaldi for your reviews! I've addressed your comments.

Also, is this PR going to be open against upstream?

Yes, this PR should be opened against upstream master once the base of this PR gets merged.

@rpenido rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch 2 times, most recently from b7aef89 to 45b6bf9 Compare March 28, 2024 19:34
kdmccormick and others added 9 commits April 18, 2024 15:35
…stones-16b82db

feat: Upgrade Python dependency edx-milestones
fix: SSO self-serve tool invalid entityId parsing

Commit generated by workflow `openedx/edx-platform/.github/workflows/upgrade-one-python-dependency.yml@refs/heads/master`
…thon_3.11

Make Tests Python 3.11 Compatible
…dx-enterprise-68b3753

feat: Upgrade Python dependency edx-enterprise
…lash

feat!: Remove the django-splash app.
feat: re-expose deprecated fragments as passthrough

Commit generated by workflow `openedx/edx-platform/.github/workflows/upgrade-one-python-dependency.yml@refs/heads/master`

Co-authored-by: connorhaugh <connorhaugh@users.noreply.github.com>
Co-authored-by: connorhaugh <49422820+connorhaugh@users.noreply.github.com>
Update to a Python 3.11 compatible version

Commit generated by workflow `openedx/edx-platform/.github/workflows/upgrade-one-python-dependency.yml@refs/heads/master`
Update to a Python 3.11 compatible version

Commit generated by workflow `openedx/edx-platform/.github/workflows/upgrade-one-python-dependency.yml@refs/heads/master`
@bradenmacdonald
Copy link
Member

@yusuf-musleh I found a small but significant bug, so I pushed a fix to your PR. And (unrelated) there are some conflicts now. Could you please rebase or merge master into this, then ping me to merge it?

@@ -116,11 +129,19 @@ def test_reindex_meilisearch_disabled(self, mock_meilisearch):
@override_settings(MEILISEARCH_ENABLED=True)
def test_reindex_meilisearch(self, mock_meilisearch):

# Add tags field to doc, since reindex calls with `include_tags=True` internally
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring is out of date - we don't have include_tags=True anymore

@yusuf-musleh yusuf-musleh force-pushed the yusuf-musleh/update-search-index-when-object-tags-change branch from d474406 to 8fc6c45 Compare April 21, 2024 12:13
@yusuf-musleh
Copy link
Member Author

I found a small but significant bug, so I pushed a fix to your PR. And (unrelated) there are some conflicts now. Could you please rebase or merge master into this, then ping me to merge it?

@bradenmacdonald Got it, I've addressed your comments, rebased/fixed conflicts and created a new PR off master, thats good to go (openedx#34559). So I'm closing this PR in favor of that one.

@bradenmacdonald bradenmacdonald deleted the yusuf-musleh/update-search-index-when-object-tags-change branch April 22, 2024 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
10 participants