Optimize indexing for the autogenerated ID append-only case #20211
Conversation
If Elasticsearch controls the ID values as well as the document versions, we can optimize the code that adds / appends documents to the index. Essentially we can skip the version lookup for all documents unless the same document is delivered more than once. On the Lucene level we can simply call `IndexWriter#addDocument` instead of `#updateDocument`, but on the Engine level we need to ensure that we deoptimize the case once we see the same document more than once. This is done as follows:

1. Mark every request with a timestamp. This is done once, on the first node that receives the request, and is fixed for that request. This can even be the machine-local time (see why below). The important part is that retry requests carry the same value as the original one.
2. In the engine we keep the highest timestamp seen on any "retry" request. This is updated while the retry request holds its doc-id lock. Call this `highestDeOptimzeAddDocumentTimestamp`.
3. When the engine runs an "optimized" request, it compares the request's timestamp with the current `highestDeOptimzeAddDocumentTimestamp` (but doesn't update it). If the request timestamp is higher, it is safe to execute it as optimized (no retry request with the same timestamp has run before). If not, we fall back to "non-optimized" mode, run the request as a retry, and update `highestDeOptimzeAddDocumentTimestamp` unless it has already been raised to a higher value.

Closes elastic#19813
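The timestamp semantics in step 1 can be sketched roughly like this (a hedged sketch with illustrative names, not the real `ReplicationRequest` API): the stamp is set once on the coordinating node, and a retry keeps the original value while marking itself.

```java
// Hedged sketch of step 1 above; AutoIdRequest and its members are
// illustrative names, not the actual Elasticsearch request classes.
class AutoIdRequest {
    static final long UNSET = -1;
    private long autoGeneratedTimestamp = UNSET;
    private boolean retry = false;

    // Called on the first node that handles the request; later nodes see the
    // timestamp already set and leave it alone.
    void stampIfUnset(long wallClockMillis) {
        if (autoGeneratedTimestamp == UNSET) {
            autoGeneratedTimestamp = wallClockMillis;
        }
    }

    // A retry keeps the original timestamp but marks itself, so the engine can
    // take the safe (non-optimized) path and raise its watermark.
    void onRetry() {
        retry = true;
    }

    long timestamp() { return autoGeneratedTimestamp; }
    boolean isRetry() { return retry; }
}
```

Because a retry keeps the original timestamp, step 3's comparison can detect that a retry "covering" this timestamp has already run.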
```java
    if (highestDeOptimzeAddDocumentTimestamp.get() >= index.timestamp()) {
        break;
    }
} while (highestDeOptimzeAddDocumentTimestamp.compareAndSet(deOptimizeTimestamp, index.timestamp()) == false);
```
we need to recapture deOptimizeTimestamp in the loop no?
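For what it's worth, the recapture the comment asks about would look something like this (a standalone sketch, not the actual Engine code): the current value must be re-read on every CAS failure, otherwise the loop can spin forever against a stale snapshot.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a monotonic max-update with the recapture inside the loop.
class MaxTimestamp {
    private final AtomicLong maxUnsafeAutoIdTimestamp = new AtomicLong(-1);

    void maybeRaise(long requestTimestamp) {
        long current;
        do {
            current = maxUnsafeAutoIdTimestamp.get(); // recaptured each iteration
            if (current >= requestTimestamp) {
                return; // another thread already raised it far enough
            }
        } while (maxUnsafeAutoIdTimestamp.compareAndSet(current, requestTimestamp) == false);
    }

    long get() { return maxUnsafeAutoIdTimestamp.get(); }
}
```

On Java 8+ the same effect is available in one call as `maxUnsafeAutoIdTimestamp.updateAndGet(cur -> Math.max(cur, requestTimestamp))`.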
…ons are not loaded
I tested performance on indexing geonames (8.6 M docs) on a fast 8 core box on a fast SSD with plenty of RAM (64 GB, vs 4 GB for JVM heap):
~20% speedup, nice :)
```java
    return isAutogeneratedID;
}

public boolean isCanHaveDuplicates() {
```
public boolean canHaveDuplicates()
?
* add assertions that canHaveDuplicates is only true if isAutoGeneratedID is true
* add more documentation
```java
}
if (allowIdGeneration && id == null) {
    assert autoGeneratedID == false;
    autoGeneratedID = true;
```
can we explicitly set `canHaveDuplicates` to false here (and default it to true)? or at least assert it is false? (I prefer the first)
I don't think we should do this, to be honest. I added assertions along the way to ensure it's only true if autoGeneratedID is true too; also it's documented. We could switch to something like:

```java
enum IDSource {
    External, AutoGenerated, RetryAutoGenerated
}
```

to have the different states clear, WDYT?
@bleskes thx so much for the in-depth reviews, this would have been impossible without that! I will wait for CI to run and push tomorrow morning
I re-tested the performance gain on the latest PR, still ~20% speedup. Nice!
@mikemccand w00t. I presume this was without replicas?
@bleskes yes: single node with defaults, except 4 GB heap.
- use auto-generated ids for indexing elastic#20211
- use rounded dates in queries elastic#20115
This ensures we don't add documents more than once. It's mostly paranoia, except for one case: a shard is relocated away from and back to the same node while an initial request is in flight but has not yet finished AND is retried. Yet this is a possible case, and for that reason we ensure we pass on the maxUnsafeAutoIdTimestamp when we prepare for translog recovery. Relates to #20211
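A minimal sketch of the idea in the commit message above (illustrative names; the real engine wiring differs): before replaying the translog, the engine raises its unsafe-timestamp watermark so that replayed auto-ID operations can never take the optimized append path.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hedged sketch: RecoveringEngineSketch is a made-up name, not the real Engine.
class RecoveringEngineSketch {
    private final AtomicLong maxUnsafeAutoIdTimestamp = new AtomicLong(-1);

    // Pass on the watermark before translog recovery: every auto-ID operation
    // with a timestamp at or below it is forced down the safe updateDocument path.
    void prepareForTranslogRecovery(long recoveredMaxUnsafeTimestamp) {
        maxUnsafeAutoIdTimestamp.updateAndGet(cur -> Math.max(cur, recoveredMaxUnsafeTimestamp));
    }

    boolean mayOptimizeAppend(long requestTimestamp) {
        return requestTimestamp > maxUnsafeAutoIdTimestamp.get();
    }
}
```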
Replication requests may arrive at a replica before the replica's node has processed a required mapping update. In these cases the TransportReplicationAction will retry the request once a new cluster state arrives. Sadly that retry logic failed to call `ReplicationRequest#onRetry`, causing duplicates in the append-only use case. This commit fixes this and also the test which missed the check. I also added an assertion which would have helped find the source of the duplicates. This was discovered by https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=opensuse/174/ Relates #20211
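The bug can be reduced to a small sketch (hypothetical names, not the real TransportReplicationAction): if the retry-on-new-cluster-state path forgets to call `onRetry()`, the resent request still looks like a first attempt, so the engine may index it as an optimized append and produce a duplicate.

```java
// Illustrative sketch of the fixed retry path; class names are made up.
class RetryableRequest {
    private boolean retry = false;
    void onRetry() { retry = true; }
    boolean isRetry() { return retry; }
}

class ClusterStateRetrySketch {
    // The fix: mark the request as a retry before re-dispatching it once the
    // new cluster state (with the required mapping) has arrived.
    void retryWithNewClusterState(RetryableRequest request, Runnable resend) {
        request.onRetry();
        resend.run();
    }
}
```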
If Elasticsearch controls the ID values as well as the document versions we can optimize the code that adds / appends the documents to the index. Essentially we can skip the version lookup for all documents unless the same document is delivered more than once. On the Lucene level we can simply call `IndexWriter#addDocument` instead of `#updateDocument`, but on the Engine level we need to ensure that we deoptimize the case once we see the same document more than once.

This is done as follows:

1. Mark every request with a timestamp. This is done once, on the first node that receives a request, and is fixed for this request. This can even be the machine-local time (see why below). The important part is that retry requests will have the same value as the original one.
2. In the engine we keep the highest timestamp seen on any "retry" request. This is updated while the retry request has its doc-id lock. Call this `maxUnsafeAutoIdTimestamp`.
3. When the engine runs an "optimized" request, it compares the request's timestamp with the current `maxUnsafeAutoIdTimestamp` (but doesn't update it). If the request timestamp is higher, it is safe to execute it as optimized (no retry request with the same timestamp has been run before). If not, we fall back to "non-optimized" mode, run the request as a retry one, and update `maxUnsafeAutoIdTimestamp` unless it's been updated already to a higher value.

Relates to #19813