Skip to content

Conversation

@abeglova
Copy link
Contributor

@abeglova abeglova commented Jul 10, 2024

What are the relevant tickets?

closes https://github.com/mitodl/hq/issues/4798

Description (What does it do?)

If a recreate_index jobs is currently started before a different recreate_index job is finished, the create_backing_index function creates a new index and removes the reindexing tag from the first reindexing index that was created. The first reindexing index is then never deleted and left orphaned.

This PR adds a check for an existing reindexing tag before starting a new reindexing job. In the default state, the job fails if there is an existing reindexing tag. In case something went wrong on a previous run and a current reindexing tag shouldn't exist, there is a --remove_existing_reindexing_tags option to force removing the existing reindexing tag

In addition, this pr deletes any orphaned indexes every time the recreate_index job runs

How can this be tested?

From the command line run

from learning_resources_search.indexing_api import create_backing_index
create_backing_index("program") 
create_backing_index("program") 

This creates an index with the reindexing tag and another orphaned index for programs

Run
docker-compose run web ./manage.py recreate_index --program
You should see an error saying "Reindexing in progress. Reindexing indexes already exist"

Run
docker-compose run web ./manage.py recreate_index --program --remove_existing_reindexing_tags

and the reindexing should finish

From the command line run

from learning_resources_search.indexing_api import get_conn
conn = get_conn()
conn.indices.get_alias(index="*")

Check that there is only one index with "program" in the name

Additionally run docker-compose run web ./manage.py recreate_index --all and verify that everything looks normal

@abeglova abeglova marked this pull request as ready for review July 10, 2024 19:13
@abeglova abeglova changed the title Ab/reindexing changes reindexing changes Jul 10, 2024
"""
try:
if not remove_existing_reindexing_tags:
existing_reindexing_indexes = api.get_existing_reindexing_indexes(indexes)
Copy link
Contributor Author

@abeglova abeglova Jul 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to check both here and in the management command in case start_recreate_index is stuck behind other jobs and another indexing job is started while this task is in the queue. We could just have the check here but it's nicer for the user to know about the existing reindexing indexes right away if it already exists when the management command runs


@app.task(autoretry_for=(RetryError,), retry_backoff=True, rate_limit="600/m")
@app.task(
acks_late=True, autoretry_for=(RetryError,), retry_backoff=True, rate_limit="600/m"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added acks_late=True here just in case, although I don't think this job failing is how orphaned indexes are generally created

@abeglova abeglova added Needs Review An open Pull Request that is ready for review and removed Work in Progress labels Jul 10, 2024
@mbertrand mbertrand self-assigned this Jul 11, 2024
Copy link
Member

@mbertrand mbertrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works great, 👍 , just some minor nitpicks.

return error

api.delete_orphaned_indexes(
indexes, delete_reindexing_tags=remove_existing_reindexing_tags
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are reindex orphans to delete, remove_existing_reindexing_tags has to be True to get to this point. If there aren't any, it seems like delete_reindexing_tags could still be True and it would just step through that conditional block in delete_orphaned_indexes without doing anything? So maybe this argument is not strictly necessary. But it works fine as is anyway, and does avoid running the code in that conditional block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the conditional is not strictly necessary .


finish_recreate_index_dict = {}

delete_orphaned_indexes_mock.assert_called_once()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe check if arguments are correct? :

delete_orphaned_indexes_mock.assert_called_once_with(indexes, False)

finish_recreate_index_dict = {"program": "backing"}

if remove_existing_reindexing_tags:
delete_orphaned_indexes_mock.assert_called_once()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete_orphaned_indexes_mock.assert_called_once_with(indexes, True)

@abeglova abeglova force-pushed the ab/reindexing_changes branch from c96f862 to 85538d2 Compare July 11, 2024 16:07
@abeglova abeglova force-pushed the ab/reindexing_changes branch from 85538d2 to 0c9159e Compare July 11, 2024 17:06
@abeglova abeglova merged commit cb21d9a into main Jul 11, 2024
This was referenced Jul 11, 2024
@rhysyngsun rhysyngsun deleted the ab/reindexing_changes branch February 7, 2025 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Needs Review An open Pull Request that is ready for review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants