
Optimize indexing for the autogenerated ID append-only case #20211

Merged
merged 23 commits into from Sep 1, 2016

Conversation

s1monw
Contributor

@s1monw s1monw commented Aug 29, 2016

If Elasticsearch controls the ID values as well as the document
versions, we can optimize the code that adds / appends the documents
to the index. Essentially we can skip the version lookup for all
documents unless the same document is delivered more than once.

On the Lucene level we can simply call IndexWriter#addDocument instead
of #updateDocument, but on the engine level we need to ensure that we de-optimize
the case once we see the same document more than once.

This is done as follows:

  1. Mark every request with a timestamp. This is done once on the first node that
    receives a request and is fixed for this request. This can even be the
    machine's local time (see why later). The important part is that retry
    requests will have the same value as the original one.
  2. In the engine we make sure we keep the highest seen timestamp of "retry" requests.
    This is updated while the retry request holds its doc ID lock. Call this maxUnsafeAutoIdTimestamp.
  3. When the engine runs an "optimized" request, it compares the request's timestamp with the
    current maxUnsafeAutoIdTimestamp (but doesn't update it). If the request's
    timestamp is higher, it is safe to execute it as optimized (no retry request with the same
    timestamp has run before). If not, we fall back to "non-optimized" mode, run the request as a retry,
    and update maxUnsafeAutoIdTimestamp unless it has already been updated to a higher value.

Relates to #19813
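The three steps above can be sketched as a small self-contained class. This is a sketch only: the class and method names are hypothetical, while `maxUnsafeAutoIdTimestamp` follows the description.

```java
import java.util.concurrent.atomic.AtomicLong;

// Rough sketch of the append-only optimization described above.
// Names other than maxUnsafeAutoIdTimestamp are hypothetical.
class AppendOnlyPlanner {
    private final AtomicLong maxUnsafeAutoIdTimestamp = new AtomicLong(-1);

    /** Decide whether an auto-ID index request can skip the version lookup. */
    boolean canOptimizeAddDocument(boolean isRetry, long autoIdTimestamp) {
        if (isRetry) {
            // A retry may have been delivered before; raise the watermark so
            // later "optimized" requests with the same timestamp de-optimize.
            long current;
            do {
                current = maxUnsafeAutoIdTimestamp.get();
                if (current >= autoIdTimestamp) {
                    break; // watermark already high enough
                }
            } while (!maxUnsafeAutoIdTimestamp.compareAndSet(current, autoIdTimestamp));
            return false; // retries always take the safe updateDocument path
        }
        // Safe to addDocument only if no retry with this timestamp (or a later
        // one) has been seen; note we read but do not update the watermark.
        return autoIdTimestamp > maxUnsafeAutoIdTimestamp.get();
    }
}
```

Note how a fresh request with a timestamp at or below the watermark falls back to the non-optimized path, which matches step 3.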

if (highestDeOptimzeAddDocumentTimestamp.get() >= index.timestamp()) {
break;
}
} while(highestDeOptimzeAddDocumentTimestamp.compareAndSet(deOptimizeTimestamp, index.timestamp()) == false);
Contributor

we need to recapture deOptimizeTimestamp in the loop no?
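A variant that recaptures the current value on every iteration might look like the following. This is a sketch, not the PR's actual code; the field name is taken from the diff above.

```java
import java.util.concurrent.atomic.AtomicLong;

class DeOptimizeWatermark {
    final AtomicLong highestDeOptimzeAddDocumentTimestamp = new AtomicLong(-1);

    // Recapture the current value on every iteration so that a concurrent
    // update between get() and compareAndSet() simply triggers another
    // attempt with fresh state instead of spinning on a stale snapshot.
    void updateTo(long requestTimestamp) {
        long current;
        do {
            current = highestDeOptimzeAddDocumentTimestamp.get();
            if (current >= requestTimestamp) {
                return; // already at or above this timestamp; nothing to do
            }
        } while (!highestDeOptimzeAddDocumentTimestamp.compareAndSet(current, requestTimestamp));
    }
}
```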

@mikemccand
Contributor

I tested performance on indexing geonames (8.6 M docs) on a fast 8 core box on a fast SSD with plenty of RAM (64 GB, vs 4 GB for JVM heap):

before:
 Total docs/sec: 44526.0
 Total docs/sec: 45460.2

after:
 Total docs/sec: 53650.4
 Total docs/sec: 53639.6

~20% speedup, nice :)

@s1monw s1monw removed the WIP label Aug 29, 2016
return isAutogeneratedID;
}

public boolean isCanHaveDuplicates() {
Member

public boolean canHaveDuplicates() ?

* add assertions that canHaveDuplicates is only true if isAutoGeneratedID is true
* add more documentation
}
if (allowIdGeneration && id == null) {
assert autoGeneratedID == false;
autoGeneratedID = true;
Contributor

can we explicitly set canHaveDuplicates to false here (and default it to true)? or at least assert it is false? (I prefer the first)

Contributor Author

I don't think we should do this, to be honest. I added assertions along the way to ensure it's only true if autoGeneratedID is true too, and it's documented. We could switch to something like:

enum IDSource {
  External, AutoGenerated, RetryAutoGenerated
}

to have the different states clear, WDYT?
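A minimal sketch of how the suggested enum could subsume the two boolean flags; all names besides `IDSource` and its constants are hypothetical:

```java
// The enum makes the invariant (canHaveDuplicates implies auto-generated ID)
// hold by construction instead of by assertion. Hypothetical helper class.
class IndexRequestIdState {
    enum IDSource { External, AutoGenerated, RetryAutoGenerated }

    static boolean isAutoGeneratedID(IDSource source) {
        return source != IDSource.External;
    }

    static boolean canHaveDuplicates(IDSource source) {
        // Only a retried auto-ID request can have been delivered twice.
        return source == IDSource.RetryAutoGenerated;
    }
}
```

With three explicit states, the invalid combination (canHaveDuplicates true while isAutoGeneratedID false) simply cannot be represented.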

@s1monw
Contributor Author

s1monw commented Aug 31, 2016

@bleskes thx so much for the in-depth reviews, this would have been impossible without that! I will wait for CI to run and push tomorrow morning

@mikemccand
Contributor

I re-tested performance gain on latest PR, still ~20% speedup! Nice:

before:

  Total docs/sec: 44299.1
  Total docs/sec: 43192.2  

after:

  Total docs/sec: 53321.2
  Total docs/sec: 52980.7

@bleskes
Contributor

bleskes commented Aug 31, 2016

@mikemccand w00t. I presume this was without replicas?

@mikemccand
Contributor

@bleskes yes: single node with defaults, except 4 GB heap.

@s1monw s1monw merged commit a0becd2 into elastic:master Sep 1, 2016
jpountz added a commit to jpountz/elasticsearch that referenced this pull request Sep 2, 2016
 - use auto-generated ids for indexing elastic#20211
 - use rounded dates in queries elastic#20115
s1monw added a commit that referenced this pull request Sep 2, 2016
To ensure we don't add documents more than once. This is mostly paranoia,
except for one case: a shard is relocated away from and back to the same node
while an initial request is in flight but has not yet finished AND is retried.

Yet, this is a possible case, and for that reason we ensure we pass on the
maxUnsafeAutoIdTimestamp when we prepare for translog recovery.

Relates to #20211
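A rough sketch of what passing on the watermark during translog-recovery preparation could look like; this is a hypothetical simplified shape, not the actual engine code:

```java
// Sketch only: seeding the engine's watermark before replaying translog ops,
// so in-flight retried auto-ID requests cannot be re-optimized after a
// relocation back to the same node. All names here are hypothetical.
class EngineConfigSketch {
    long maxUnsafeAutoIdTimestamp = -1;

    void prepareForTranslogRecovery(long inheritedMaxUnsafeAutoIdTimestamp) {
        // Never lower the watermark; only inherit a higher one.
        maxUnsafeAutoIdTimestamp =
            Math.max(maxUnsafeAutoIdTimestamp, inheritedMaxUnsafeAutoIdTimestamp);
    }
}
```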
bleskes added a commit that referenced this pull request Oct 31, 2016
A replication request may arrive at a replica before the replica's node has processed a required mapping update. In these cases the TransportReplicationAction will retry the request once a new cluster state arrives. Sadly that retry logic failed to call `ReplicationRequest#onRetry`, causing duplicates in the append-only use case.

This commit fixes this and also the test which missed the check. I also added an assertion which would have helped finding the source of the duplicates.

This was discovered by https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=opensuse/174/

Relates #20211
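The shape of the fix can be sketched as follows; the types here are hypothetical simplifications, while `onRetry` is the real `ReplicationRequest` hook the commit message mentions:

```java
// Sketch: before resending a request on a new cluster state, flag it as a
// retry so the engine treats it as a possible duplicate and de-optimizes.
class RetryOnNewClusterState {
    interface Request { void onRetry(); }

    void retry(Request request, Runnable resend) {
        request.onRetry(); // the previously missing call
        resend.run();      // resend the request under the new cluster state
    }
}
```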
bleskes added a commit that referenced this pull request Nov 1, 2016

bleskes added a commit that referenced this pull request Nov 1, 2016
@clintongormley clintongormley added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Engine :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. labels Feb 13, 2018
6 participants