Optimize reindex #1233
👌
docs/administrative-tasks.md
Outdated
## Reindexing data

Sometimes, you need to reindex data (in case model breaking changes, defect of workers...).
s/in case/in case of/
s/defect of worker/worker defect/
docs/administrative-tasks.md
Outdated
## Reindexing data

Sometimes, you need to reindex data (in case model breaking changes, defect of workers...).
You can use the `udata search index command` to do so.
s/`udata search index command`/`udata search index` command/
docs/administrative-tasks.md
Outdated
Sometimes, you need to reindex data (in case model breaking changes, defect of workers...).
You can use the `udata search index command` to do so.

This command both support full reindex without arguments or partial with model names as arguments:
supports both full reindex without arguments and partial
docs/administrative-tasks.md
Outdated
udata search index reuses organizations
```

By default the command does delete the previous index in case of success or the new unfinished index in case of error but you can ask to keep indexes with the `-k/--keep` parameter
s/does delete/deletes/
docs/administrative-tasks.md
Outdated
udata search index -f
```

It's possible to do a partial reindex by providing models (support both singular or plural) as arguments:
both singular and plural are supported
udata/search/commands.py
Outdated
def iter_for_index(docs, index_name):
    '''Iter over ES documents ensuring a given index'''
s/Iter/Iterate/
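Only the signature and docstring of `iter_for_index` are visible in this thread. As a rough sketch of what such a generator could look like, assuming the documents are plain dicts in the bulk action format (`_index`, `_id`, `_source`) rather than the document objects udata may actually use, and with a made-up index name:

```python
def iter_for_index(docs, index_name):
    '''Iterate over ES documents ensuring a given index'''
    for doc in docs:
        # Copy so the caller's dict is left untouched,
        # then force the target index in the action metadata.
        action = dict(doc)
        action['_index'] = index_name
        yield action


# Hypothetical documents and index name, for illustration only
docs = [
    {'_index': 'udata', '_id': '1', '_source': {'title': 'A'}},
    {'_id': '2', '_source': {'title': 'B'}},
]
actions = list(iter_for_index(docs, 'udata-2017-10-20'))
```

Every yielded action targets the new index, whatever index the source document named (or did not name).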
})


def enable_refresh(index_name):
Maybe put `refresh_interval` as a parameter with a default value of `1s`? In case this needs to be changed/configured.
Done 👍
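The updated function is not shown in this thread. A minimal sketch of the suggestion, assuming an elasticsearch-py style client passed explicitly as `client` (the real function only takes `index_name`, so the extra argument is an assumption made to keep the example self-contained), might be:

```python
from unittest import mock  # only used to stand in for a real client below


def enable_refresh(client, index_name, refresh_interval='1s'):
    '''
    Enable refresh and force merge. Used after indexing.
    '''
    # Restore periodic refresh, which is assumed to have been
    # disabled while bulk indexing.
    client.indices.put_settings(index=index_name, body={
        'index': {'refresh_interval': refresh_interval}
    })
    # Merge segments once indexing is complete
    client.indices.forcemerge(index=index_name)


# Hypothetical index name; a MagicMock replaces the ES client here
client = mock.MagicMock()
enable_refresh(client, 'udata-2017-10-20', refresh_interval='5s')
```

With the default value, existing callers keep the `1s` behaviour while a different interval can be passed where needed.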
udata/search/commands.py
Outdated
def enable_refresh(index_name):
    '''
    Enable refresh after indexing and force merge
Enable refresh and force merge. Used after indexing. ?
udata/search/commands.py
Outdated
if force or prompt_bool(('Index {0} will be deleted, are you sure ?'
                         .format(index_name))):
if IS_INTERACTIVE and not force:
    msg = 'Index {0} will be deleted, are you sure ?'
s/ ?/?/
This PR brings a huge performance improvement to reindexing, as well as improved error handling and a single, more flexible `index` command.
A single command
Both commands (`search init` and `search reindex`) have been merged into a single `search index` command.
This command is more flexible (models as variable arguments, plural or singular, obsolete index deleted by default...) and more powerful (it can reindex multiple models in a single pass).
The previous commands still exist but show a deprecation warning when used.
According to the deprecation policy, they should be removed in udata 1.4 (an issue should be created).
Full reindexing
Simply execute the command without arguments: `udata search index`.
Partial reindex
Models are now simple arguments instead of the `-t` option from the previous `search reindex` command.
This means that reindexing only reuses and organizations can be done in a single pass with `udata search index reuses organizations`, instead of the two passes (one per model) required before.
The command also accepts plural forms, as mixing up singular and plural is a common error.
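How the command resolves its arguments is not shown in this PR description. As an illustration only, singular and plural names could be normalized along these lines, assuming a hypothetical `KNOWN_MODELS` registry and a naive "strip the trailing s" rule:

```python
# Hypothetical registry: the real command resolves names against udata's models
KNOWN_MODELS = ('dataset', 'reuse', 'organization', 'user')


def normalize_model_names(names):
    '''Map singular or plural names to known singular model names.'''
    resolved = []
    for name in names:
        key = name.lower()
        if key not in KNOWN_MODELS and key.endswith('s'):
            key = key[:-1]  # naive plural handling: "reuses" -> "reuse"
        if key not in KNOWN_MODELS:
            raise ValueError('Unknown model: {0}'.format(name))
        if key not in resolved:
            resolved.append(key)  # keep order, drop duplicates
    return resolved


models = normalize_model_names(['reuses', 'organization', 'Reuse'])
```

Resolving all names up front means an unknown model is reported before any indexing work starts.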
Performance
The performance improvements come from following the "Tune for indexing speed" guide and the bulk indexing section of the "Update settings" documentation.
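Those guides recommend, among other things, disabling refresh and replicas while bulk loading and restoring them afterwards. Which of these settings this PR actually changes is not stated here, so the following is only a sketch under that assumption, with a hypothetical helper name, an explicit `client` argument and assumed restore values:

```python
from contextlib import contextmanager
from unittest import mock  # stands in for a real client in the usage below


@contextmanager
def bulk_indexing_settings(client, index_name,
                           refresh_interval='1s', replicas=1):
    '''Relax index settings while bulk indexing, then restore them.'''
    # No periodic refresh and no replicas during the initial load
    client.indices.put_settings(index=index_name, body={
        'index': {'refresh_interval': '-1', 'number_of_replicas': 0}
    })
    try:
        yield
    finally:
        # Restore the (assumed) regular settings
        client.indices.put_settings(index=index_name, body={
            'index': {
                'refresh_interval': refresh_interval,
                'number_of_replicas': replicas,
            }
        })


# Hypothetical index name; a MagicMock replaces the ES client here
client = mock.MagicMock()
with bulk_indexing_settings(client, 'udata-2017-10-20'):
    pass  # bulk indexing would happen here
```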
Consequences:
Before
After
Error handling
This PR improves error handling during indexing. No more ugly stack traces, only the error details (which were not displayed before, by the way).
Now, both commands also properly handle kill signals and keyboard interrupts.
In case of error, the unfinished index is now properly deleted, which avoids accumulating lots of unfinished indexes (consuming ES memory).
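The cleanup behaviour described above can be sketched as follows. This is not the PR's code: `index_with_cleanup`, `do_index` and `delete_index` are hypothetical names, the `keep` flag stands for an option to preserve the unfinished index, and kill-signal handling is left out to keep the example short:

```python
import sys


def index_with_cleanup(do_index, delete_index, index_name, keep=False):
    '''Run do_index(index_name); drop the unfinished index on failure.'''
    try:
        do_index(index_name)
        return True
    except KeyboardInterrupt:
        print('Indexing interrupted', file=sys.stderr)
    except Exception as e:
        # Only the error details, no stack trace
        print('Indexing failed: {0}'.format(e), file=sys.stderr)
    # Reached only on failure or interruption
    if not keep:
        delete_index(index_name)
    return False


def failing_index(name):
    raise RuntimeError('mapping error')


# Hypothetical usage: record deletions in a list instead of calling ES
deleted = []
ok = index_with_cleanup(failing_index, deleted.append, 'udata-2017-10-20')
```

Catching `KeyboardInterrupt` separately matters because it does not derive from `Exception`, so a bare `except Exception` would skip the cleanup on Ctrl+C.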
For debugging purposes, you can keep the unfinished index with the `-k/--keep` parameter.
Documentation
Finally, the `search index` command is now documented in the "Administrative tasks" section.