Change tootctl search deploy algorithm
#14300
Merged
First, performance killer fixes:

- `StatusesIndex` definition was missing an eager load of `preloadable_poll`, causing an n+1 queries issue
- `StatusesIndex` `delete_if` was calling `searchable_by` without crutches, causing lots of n+1 queries **

The actual command change is a divergence from Chewy-provided tasks, to do things our own way.
- The Chewy-provided tasks went from `AccountsIndex`, to `StatusesIndex`, to `TagsIndex`, which means that the comparatively small `TagsIndex` was not ready to go until the gigantic `StatusesIndex` had finished importing. Now we manually define the order: accounts, tags, statuses.

* We still de-index some stuff, as long as it's in the database. The `delete_if` check is respected; additionally, we have a special optimization for the statuses index, de-indexing statuses when their `searchable_by` field is empty. What we don't try to de-index is records that have been removed from the database completely. Having those linger in ElasticSearch is junk, but does not truly impact search results, since any ElasticSearch results have to be hydrated from the database. However, their presence in ElasticSearch presumes that the live updates through Sidekiq have failed somehow, which is not a normal situation.

** Removing the `delete_if` means that live updates through Sidekiq will always index, and never de-index, statuses that are not searchable by anyone. This has no search results implications but is suboptimal from an index size perspective. Unfortunately, from my reading of Chewy code, it would be difficult if not impossible to monkey-patch it to pass crutches when calling `delete_if`.
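
For illustration only, here is a minimal Ruby sketch of the idea described above: load the data needed for `searchable_by` once per batch (the role crutches play), then split the batch into records to index and records to de-index when `searchable_by` is empty. Everything in it (`Status`, `FakeDatabase`, `local_interactions_for`, `partition_batch`, `INDEX_ORDER`) is a hypothetical stand-in and not the actual Mastodon or Chewy code; the real `searchable_by` considers more data than this toy version.

```ruby
# Hypothetical stand-ins for illustration; none of these are Mastodon or Chewy classes.
Status = Struct.new(:id, :author_id, :author_local)

# Fake data source that counts how many "queries" it receives.
class FakeDatabase
  attr_reader :query_count

  def initialize(interactions)
    @interactions = interactions # { status_id => [local account ids] }
    @query_count = 0
  end

  # One lookup for a whole batch, playing the role of a crutch.
  def local_interactions_for(status_ids)
    @query_count += 1
    status_ids.each_with_object({}) do |id, memo|
      memo[id] = @interactions.fetch(id, [])
    end
  end
end

# Accounts allowed to find a status, computed from preloaded data only.
def searchable_by(status, crutches)
  ids = []
  ids << status.author_id if status.author_local
  ids.concat(crutches.fetch(status.id, []))
  ids.uniq
end

# Split a batch into [records to index, records to de-index].
def partition_batch(statuses, database)
  crutches = database.local_interactions_for(statuses.map(&:id))
  statuses.partition { |status| !searchable_by(status, crutches).empty? }
end

# Explicit processing order: the small indices come before the largest one.
INDEX_ORDER = %i[accounts tags statuses].freeze

database = FakeDatabase.new({ 2 => [10], 3 => [] })
batch = [
  Status.new(1, 7, true),  # local author: searchable by the author
  Status.new(2, 8, false), # remote author with one local interaction
  Status.new(3, 9, false)  # remote author, no local interaction
]
to_index, to_delete = partition_batch(batch, database)
puts "index: #{to_index.map(&:id)}, delete: #{to_delete.map(&:id)}, queries: #{database.query_count}"
```

The point of the sketch is that the per-batch lookup runs once regardless of batch size, whereas calling `searchable_by` without preloaded data would issue lookups for every record.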