Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add atomic/zero-downtime index rebuilding #358

Merged
merged 20 commits into from Sep 6, 2022

Conversation

oehrlein
Copy link
Contributor

@oehrlein oehrlein commented Jul 9, 2021

These changes allow for atomic/zero-downtime index rebuilding using aliases.

There are two new management commands: --use-alias and --use-alias-keep-index, both which only work with the --rebuild command. If --use-alias-keep-index is used with --use-alias, the indices previously aliased will not be deleted. If --use-alias-keep-index isn't used, the old indices will be deleted.

Regarding naming, the defined index name in documents.py becomes the alias name, and indices take the alias name plus a suffix (hyphen + current datetime). I took example from this.

2 big caveats:

  1. This only works in the simplest use case of just rebuilding and replacing indices - it won't work in the event of continuous updates to an index, as outlined in this article in the section "Wait, I know a case when this will not work…"
  2. As mentioned in this example (same example as above), in order to properly deserialize the data in searches, you will need to add this function to your documents in documents.py:
@classmethod
def _matches(cls, hit):
    # override _matches to match indices in a pattern instead of just ALIAS
    # hit is the raw dict as returned by elasticsearch
    return fnmatch(hit["_index"], ALIAS + "-*")

For this to work, remember to add from fnmatch import fnmatch to your imports in documents.py. Also, replace ALIAS with your alias name, or just make that argument one string, e.g. 'my-alias-*'.

As noted in the example:

BlogPost._matches() class method is required for this code to work since
otherwise BlogPost will not be used to deserialize the documents as those
will have index set to the concrete index whereas the class refers to the
alias.

I didn't get around to writing tests or adding to the documentation. With all that being said, please let me know if you have questions and/or suggestions!

EDIT: Please note that caveat 2 has been taken care of in ce05715, so it is no longer an issue.

@oehrlein
Copy link
Contributor Author

oehrlein commented Jul 9, 2021

This resolves #314

@safwanrahman
Copy link
Collaborator

Thanks for the Pull request, I will review it and add my concern into it. @saadmk11 Can you take a look as well?

@saadmk11
Copy link
Contributor

Looks like there is a similar PR #284 that does this as a part of it but has been stale for a long time

Copy link
Contributor

@saadmk11 saadmk11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the pull request 💯 , added some small suggestions.

@safwanrahman I have not tested this locally yet will do it when I get some time.

@oehrlein
Copy link
Contributor Author

Looks like there is a similar PR #284 that does this as a part of it but has been stale for a long time

I tried out the changes in that PR, and unless I misunderstood what was documented, it didn't do exactly what I was looking for. Regardless, yea, it's been stale for a while.

oehrlein and others added 5 commits July 28, 2021 21:26
Co-authored-by: Maksudul Haque <saad.mk112@gmail.com>
Co-authored-by: Maksudul Haque <saad.mk112@gmail.com>
Co-authored-by: Maksudul Haque <saad.mk112@gmail.com>
Co-authored-by: Maksudul Haque <saad.mk112@gmail.com>
Co-authored-by: Maksudul Haque <saad.mk112@gmail.com>
@safwanrahman
Copy link
Collaborator

This seems like a good approach. Though I was thinking to different way to achieve it, but it will achieve the purpose with minimal changes. I will test it locally before mege. I have added a little fix over.

@antunesleo
Copy link

antunesleo commented Aug 3, 2021

First of all, thanks for tackling this one @oehrlein

About the continuous updates to an index issue, I think we can mitigate it running combined your solution with populate command, this way:

First we run search_index --rebuild --atomic
secondly we run a search_index populate in order to index documents that were inserted during the migration

One caveat is that items inserted during the migration will be inconsistent for a brief moment, the items will not be available for search until we run the populate command.
I believe this is acceptable for search engines like ES that are not strongly consistent and the most important thing: we are not loosing any document during the migration.

Another caveat is that we will have to write documents 2 times in this can be an performance bottleneck for bigger clusters.

@oehrlein
Copy link
Contributor Author

oehrlein commented Aug 4, 2021

@safwanrahman Sounds good, I appreciate it!

@oehrlein
Copy link
Contributor Author

oehrlein commented Aug 4, 2021

@antunesleo No problem!

I think for now (and considering it doesn't require any changes to the code), the combination of those two commands works well enough.

Of course, it would be good to eventually try to have it similar to the article I linked in my original comment. Although, it's admittedly not a need of mine at the moment, so I won't be able to get around to it for the time being.

@codecov-commenter
Copy link

codecov-commenter commented Sep 16, 2021

Codecov Report

Merging #358 (ce05715) into master (6470dd7) will decrease coverage by 8.95%.
The diff coverage is 42.10%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #358      +/-   ##
==========================================
- Coverage   67.85%   58.90%   -8.96%     
==========================================
  Files          12       12              
  Lines         756      910     +154     
==========================================
+ Hits          513      536      +23     
- Misses        243      374     +131     
Impacted Files Coverage Δ
...sticsearch_dsl/management/commands/search_index.py 28.70% <37.50%> (-13.50%) ⬇️
django_elasticsearch_dsl/documents.py 77.41% <100.00%> (+0.75%) ⬆️
django_elasticsearch_dsl/fields.py 61.45% <100.00%> (-0.20%) ⬇️
django_elasticsearch_dsl/registries.py 74.16% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 79218c8...ce05715. Read the comment docs.

@oehrlein
Copy link
Contributor Author

@safwanrahman @saadmk11 I made the changes for all your guys' suggestions, so it should all be good to merge!

@safwanrahman
Copy link
Collaborator

Thanks a lot @oehrlein for your patience and patch that is very good design. @saadmk11 Can you test it with any application and check if it is working perfectly? I am being too busy with personal things so maybe it would be better if you can test it and let me know. I will merge it after you test it.

@oehrlein
Copy link
Contributor Author

@safwanrahman No problem, and thanks!

@saadmk11 Let me know if anything needs to be changed!

@saadmk11
Copy link
Contributor

@saadmk11 Let me know if anything needs to be changed!

Hi @oehrlein, Thanks for updating the PR with the requested changes. 💯

I tried to test these changes locally (by running python manage.py search_index --rebuild --use-alias) on an application that already had an index created on the Elasticsearch cluster (using the rebuild command of the current version of django-elasticsearch-dsl).

I am getting this error when I run the command:

Traceback (most recent call last):
  File "manage.py", line 21, in <module>
    main()
  File "manage.py", line 17, in main
    execute_from_command_line(sys.argv)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/base.py", line 323, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django/core/management/base.py", line 364, in execute
    output = self.handle(*args, **options)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 262, in handle
    self._rebuild(models, options)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 242, in _rebuild
    options
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/django_elasticsearch_dsl/management/commands/search_index.py", line 179, in _update_alias
    es_conn.indices.update_aliases({"actions": alias_actions})
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/client/utils.py", line 301, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/client/indices.py", line 612, in update_aliases
    "POST", "/_aliases", params=params, headers=headers, body=body
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/transport.py", line 458, in perform_request
    raise e
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/transport.py", line 426, in perform_request
    timeout=timeout,
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/connection/http_urllib3.py", line 287, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/saad/.virtualenvs/test_project/lib/python3.6/site-packages/elasticsearch/connection/base.py", line 338, in _raise_error
    status_code, error_message, additional_info
elasticsearch.exceptions.RequestError: RequestError(400, 'invalid_alias_name_exception', 'Invalid alias name [some_index_name], an index exists with the same name as the alias')

(I think this is happening because we are trying to create an alias with the same name of the index. As the index name does not have any timestamp/uniqueness on the current implementation it is conflicting with the alias name)

But If I delete the index on ElasticSearch and then run the rebuild command it works as expected because the newly created index name has timestamp/uniqueness according to the PR implementation.

Please let me know If I need to add/update anything to test the changes of this PR. :)

@oehrlein
Copy link
Contributor Author

@saadmk11 No problem!

Thanks for catching this error! I ran into it before, but forgot about it before my original commits.

I made adjustments to fix the issue. Now, when we want an alias to have the name of an index that already exists, it deletes the existing index as well. Likewise, I made changes so that when we want an index to have the name of an alias that already exists, it deletes the indices pointed at the alias to free up the name.

I also made adjustments to account for deleting indices that were created with aliases (after running --rebuild with --use-alias, run --delete with --use-alias to delete those indices).

In addition, I added additional stdout messages in case of doing something alias-specific, but only having regular indices, or vice-versa. For example, trying to delete regular indices after creating indices with an alias (after running --rebuild with --use-alias, run --delete without --use-alias) provides a message explaining the situation and how to resolve it.

Take a look at the changes when you get a chance, and let me know if you have any suggestions and/or think there should be changes!

@oehrlein
Copy link
Contributor Author

oehrlein commented Oct 6, 2021

@saadmk11 Thanks for the kind words and the help + suggestions!

I took your suggestion regarding the _matches function and added it in ce05715. While I understand that having multiple, similarly-named indices will cause matching issues, I think that's less likely to happen than someone trying to use aliases and forgetting to override the _matches function on their own.

There's just 2 suggestions that I'm looking for your response for - everything else should be good! :)

@saadmk11
Copy link
Contributor

I took your suggestion regarding the _matches function and added it in ce05715. While I understand that having multiple, similarly-named indices will cause matching issues, I think that's less likely to happen than someone trying to use aliases and forgetting to override the _matches function on their own.

Thanks for adding this. I would like to know if @safwanrahman has any thoughts on this.

@oehrlein
Copy link
Contributor Author

@saadmk11 No problem!

@safwanrahman Let us know!

@safwanrahman
Copy link
Collaborator

I need to take a deeper look into the PR again. i was on a long vacation recently, so it may take some days to catch things up. I am sorry about that. I will reply the queries once I read the PR.

@oehrlein
Copy link
Contributor Author

@safwanrahman Sounds good, and no worries! Hopefully the vacation was good :)

@oehrlein
Copy link
Contributor Author

@safwanrahman Any luck at being able to take a look at this?

@oehrlein
Copy link
Contributor Author

@safwanrahman Just wanted to follow up again - any luck at being able to take a look at this?

@violuke
Copy link

violuke commented Mar 2, 2022

We're really like to see this too @safwanrahman if you are able to review. Thank you for your time

@rsommerard
Copy link

Hi, any news about the merge of this really useful feature?

@Root-kjh
Copy link

Root-kjh commented Apr 4, 2022

i found that this source code does not erase new indexes when an error occurs in process of creating alias

@phpaulin
Copy link

Would be great to see this PR merged!!

@oehrlein any chance you are considering merging this?

@oehrlein
Copy link
Contributor Author

@phpaulin I definitely would like to, but it's beyond me unfortunately.

@safwanrahman @saadmk11 Any chance you guys will have time soon to look at this again and merge it?

@adamKenneweg
Copy link

please add it @safwanrahman , it's very useful for production deployments

@phpaulin
Copy link

Yes, it would be greatly helpful for our team also @safwanrahman .

@safwanrahman safwanrahman merged commit d33a2da into django-es:master Sep 6, 2022
@safwanrahman
Copy link
Collaborator

@oehrlein Thanks a ton for the PR. I have been quite busy for some months so could not verify to merge it. Finally, I got some time to check the code and it seems to be good from my end.

I have merged it now! 💯 Thanks again for your patience.

@violuke
Copy link

violuke commented Sep 7, 2022

Incredible, thanks so much @safwanrahman and @oehrlein

@oehrlein
Copy link
Contributor Author

oehrlein commented Sep 7, 2022

@safwanrahman No problem, and thanks for taking a look at it and merging it!

@violuke No problem!

@zillasaur
Copy link

@safwanrahman will the release be bumped to include this soon? i'd love to use this functionality!

@zillasaur
Copy link

@oehrlein thank you for doing this, amazing work! is it possible to update the PR description with the correct management command names and also have the proper documentation for this new feature? (pending on the new version)

@fizikurniawan
Copy link

@oehrlein Thank you for the PR. I have a lot of data to index, what will happen if there is new data during the indexing process? Will that data be stored in the new index (when use --use-alias) or not indexed? Because average time to populate takes about 20 min.

@oehrlein
Copy link
Contributor Author

oehrlein commented May 6, 2023

@zillasaur Firstly, my apologies for the long delay to get back to you.

I updated the PR description with the correct management commands. I also added documentation for the management commands and submitted a PR (#446) for it.

@oehrlein
Copy link
Contributor Author

oehrlein commented May 6, 2023

@fizikurniawan Firstly, my apologies for the long delay to get back to you.

To answer your question, I'm not sure. Ultimately, that's more of a question of how populating an Elasticsearch index works, considering populating an index with or without aliases works the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet