Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow _update and upsert to read from the transaction log #29264

Merged
merged 5 commits into from Mar 28, 2018

Conversation

@s1monw
Copy link
Contributor

s1monw commented Mar 27, 2018

We historically removed reading from the transaction log to get consistent
results from _GET calls. There was also the motivation that the read-modify-update
principle we apply should not be hidden from the user. We still agree on the fact
that we should not hide these aspects but the impact on updates is quite significant
especially if the same documents is updated before it's written to disk and made serachable.

This change adds back the ability to read from the transaction log but only for update calls.
Calls to the _GET API will always do a refresh if necessary to return consistent results ie.
if stored fields or DocValues Fields are requested.

Closes #26802

We historically removed reading from the transaction log to get consistent
results from _GET calls. There was also the motivation that the read-modify-update
principle we apply should not be hidden from the user. We still agree on the fact
that we should not hide these aspects but the impact on updates is quite significant
especially if the same documents is updated before it's written to disk and made serachable.

This change adds back the ability to read from the transaction log but only for update calls.
Calls to the _GET API will always do a refresh if necessary to return consistent results ie.
if stored fields or DocValues Fields are requested.
@elasticmachine

This comment has been minimized.

Copy link
Collaborator

elasticmachine commented Mar 27, 2018

Copy link
Contributor

jpountz left a comment

Looks good to me overall. I left some minor questions.

}
if (operation.routing() != null && visitor.needsField(FAKE_ROUTING_FIELD) == StoredFieldVisitor.Status.YES) {
visitor.stringField(FAKE_ROUTING_FIELD, operation.routing().getBytes(StandardCharsets.UTF_8));
}

This comment has been minimized.

Copy link
@jpountz

jpountz Mar 27, 2018

Contributor

I know some visitors won't like that _id is not seen, so maybe assert that visitor.needsField("_id") is false?

new VersionsAndSeqNoResolver.DocIdAndVersion(0, ((Translog.Index) operation).version(), reader, 0));
}
} catch (IOException e) {
throw new UncheckedIOException(e);

This comment has been minimized.

Copy link
@jpountz

jpountz Mar 27, 2018

Contributor

we seem to be using EngineException as a wrapper in other places?

}
};
}
return null;

This comment has been minimized.

Copy link
@jpountz

jpountz Mar 27, 2018

Contributor

should we throw an exception?

Copy link
Contributor

jpountz left a comment

LGTM

Copy link
Contributor

ywelsch left a comment

I've left some minor questions and comments. Thanks @s1monw

/**
* Returns the translog location for this version value or null. This is optional and might not be tracked all the time.
*/
public Translog.Location getLocation() {

This comment has been minimized.

Copy link
@ywelsch

ywelsch Mar 28, 2018

Contributor

can you add @Nullable?

TranslogReader translogReader = readers.get(i);
if (translogReader.generation == location.generation) {
reader = translogReader;
onClose = acquireTranslogGenFromDeletionPolicy(current.generation);

This comment has been minimized.

Copy link
@ywelsch

ywelsch Mar 28, 2018

Contributor

closing looks to be an expensive call, because upon closing, the write lock is acquired twice (once in trimUnreferencedReaders, and once in closeFilesIfNoPendingRetentionLocks). I wonder if we need to optimize those two methods now.

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 28, 2018

Author Contributor

I will look into it

}
if (current.generation == location.generation) {
// fsync here to ensure all buffers are written to disk
current.syncUpTo(location.translogLocation + location.size);

This comment has been minimized.

Copy link
@ywelsch

ywelsch Mar 28, 2018

Contributor

why require this?

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 28, 2018

Author Contributor

Because the buffer that we have in memory need to be written to disk if we point to it from the record. I will add a comment

if (versionValue.getLocation() != null) {
try {
Translog.Operation operation = translog.readOperation(versionValue.getLocation());
if (operation != null) {

This comment has been minimized.

Copy link
@ywelsch

ywelsch Mar 28, 2018

Contributor

when do we expect this to be null?

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 28, 2018

Author Contributor

Given the translog generation is not available anymore I think it’s unlikely but I see a chance. I can add a comment in the code

new VersionsAndSeqNoResolver.DocIdAndVersion(0, ((Translog.Index) operation).version(), reader, 0));
}
} catch (IOException e) {
throw new EngineException(shardId, "failed to read operation from translog", e);

This comment has been minimized.

Copy link
@ywelsch

ywelsch Mar 28, 2018

Contributor

do we want to fail the engine (as we do when indexing)?

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 28, 2018

Author Contributor

So this is the read side of things I am not sure any imo exceptions are fatal?!

s1monw added 2 commits Mar 28, 2018
@s1monw

This comment has been minimized.

Copy link
Contributor Author

s1monw commented Mar 28, 2018

@ywelsch I pushed changes. thanks

Copy link
Contributor

ywelsch left a comment

LGTM

new VersionsAndSeqNoResolver.DocIdAndVersion(0, ((Translog.Index) operation).version(), reader, 0));
}
} catch (IOException e) {
maybeFailEngine("realtime_get", e); // lets check if the translog has failed with a tragic event

This comment has been minimized.

Copy link
@ywelsch

ywelsch Mar 28, 2018

Contributor

In other places, we have wrapped the maybeFailEngine call as follows:

try {
    maybeFailEngine("index", e);
} catch (Exception inner) {
    e.addSuppressed(inner);
}
throw ...;

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 28, 2018

Author Contributor

that's bogus I think. There is no exception throw from maybeFaileEngine only errors which should not be handled. I think we are good here?

This comment has been minimized.

Copy link
@s1monw

s1monw Mar 28, 2018

Author Contributor

I will clean this up in a followup

This comment has been minimized.

Copy link
@ywelsch

ywelsch Mar 28, 2018

Contributor

right, makes sense

Copy link
Member

bleskes left a comment

LGTM. Thanks for not letting this go.

@s1monw s1monw merged commit 13e19e7 into elastic:master Mar 28, 2018
2 checks passed
2 checks passed
CLA Commit author is a member of Elasticsearch
Details
elasticsearch-ci Build finished.
Details
@s1monw s1monw mentioned this pull request Mar 28, 2018
s1monw added a commit that referenced this pull request Mar 28, 2018
We historically removed reading from the transaction log to get consistent
results from _GET calls. There was also the motivation that the read-modify-update
principle we apply should not be hidden from the user. We still agree on the fact
that we should not hide these aspects but the impact on updates is quite significant
especially if the same documents is updated before it's written to disk and made serachable.

This change adds back the ability to read from the transaction log but only for update calls.
Calls to the _GET API will always do a refresh if necessary to return consistent results ie.
if stored fields or DocValues Fields are requested.

Closes #26802
@jonaf

This comment has been minimized.

Copy link

jonaf commented Jun 11, 2018

@s1monw Is there any chance this fix will ever get back ported to the 5.x release branch? I'm coming from Elasticsearch 1.7, and it's significantly more work to upgrade to ES 6.x than ES 5.x due to several breaking changes, notably scripting (we currently make heavy use of Groovy). I'm hoping to upgrade to 5.x as a stepping-stone to 6.x, but my use-case involves heavy, rapid indexing, and this bug slows down indexing in 5.x badly.

@bleskes

This comment has been minimized.

Copy link
Member

bleskes commented Jun 11, 2018

@jonaf this is a major change and 5.6 is in it's last maintenance mode (until 7.0 is released). It only receive small crucial bug fixes or security fixes. I'm afraid this one can't go there. I understand you want to upgrade from 1.7 (it's impressive how long it worked for you) but you'd have to invest in moving to 6.x, if you can't find a way to work around this (like batching updates to make sure the extra refresh in 5.x has less costs).

@jonaf

This comment has been minimized.

Copy link

jonaf commented Jun 11, 2018

Thanks for your reply, @bleskes . We're already batching (to the maximum extent that it improves indexing speed), so the impact is too dramatic for us to ignore. We'll have to go all the way to 6.x, then.

@luyuncheng

This comment has been minimized.

Copy link

luyuncheng commented Aug 29, 2019

Hi, I checked the code shows that _GET call always read from a reader for consistency, How about If we use _GET api with preference realtime=false which actually cause refresh action, or preference version the same as document version in translog can get from translog? @s1monw

ywelsch added a commit that referenced this pull request Nov 9, 2019
The realtime GET API currently has erratic performance in case where a document is accessed
that has just been indexed but not refreshed yet, as the implementation will currently force an
internal refresh in that case. Refreshing can be an expensive operation, and also will block the
thread that executes the GET operation, blocking other GETs to be processed. In case of
frequent access of recently indexed documents, this can lead to a refresh storm and terrible
GET performance.

While older versions of Elasticsearch (2.x and older) did not trigger refreshes and instead opted
to read from the translog in case of realtime GET API or update API, this was removed in 5.0
(#20102) to avoid inconsistencies between values that were returned from the translog and
those returned by the index. This was partially reverted in 6.3 (#29264) to allow _update and
upsert to read from the translog again as it was easier to guarantee consistency for these, and
also brought back more predictable performance characteristics of this API. Calls to the realtime
GET API, however, would still always do a refresh if necessary to return consistent results. This
means that users that were calling realtime GET APIs to coordinate updates on client side
(realtime GET + CAS for conditional index of updated doc) would still see very erratic
performance.

This PR (together with #48707) resolves the inconsistencies between reading from translog and
index. In particular it fixes the inconsistencies that happen when requesting stored fields, which
were not available when reading from translog. In case where stored fields are requested, this
PR will reparse the _source from the translog and derive the stored fields to be returned. With
this, it changes the realtime GET API to allow reading from the translog again, avoid refresh
storms and blocking the GET threadpool, and provide overall much better and predictable
performance for this API.
ywelsch added a commit that referenced this pull request Nov 9, 2019
The realtime GET API currently has erratic performance in case where a document is accessed
that has just been indexed but not refreshed yet, as the implementation will currently force an
internal refresh in that case. Refreshing can be an expensive operation, and also will block the
thread that executes the GET operation, blocking other GETs to be processed. In case of
frequent access of recently indexed documents, this can lead to a refresh storm and terrible
GET performance.

While older versions of Elasticsearch (2.x and older) did not trigger refreshes and instead opted
to read from the translog in case of realtime GET API or update API, this was removed in 5.0
(#20102) to avoid inconsistencies between values that were returned from the translog and
those returned by the index. This was partially reverted in 6.3 (#29264) to allow _update and
upsert to read from the translog again as it was easier to guarantee consistency for these, and
also brought back more predictable performance characteristics of this API. Calls to the realtime
GET API, however, would still always do a refresh if necessary to return consistent results. This
means that users that were calling realtime GET APIs to coordinate updates on client side
(realtime GET + CAS for conditional index of updated doc) would still see very erratic
performance.

This PR (together with #48707) resolves the inconsistencies between reading from translog and
index. In particular it fixes the inconsistencies that happen when requesting stored fields, which
were not available when reading from translog. In case where stored fields are requested, this
PR will reparse the _source from the translog and derive the stored fields to be returned. With
this, it changes the realtime GET API to allow reading from the translog again, avoid refresh
storms and blocking the GET threadpool, and provide overall much better and predictable
performance for this API.
debadair added a commit to debadair/elasticsearch that referenced this pull request Nov 13, 2019
The realtime GET API currently has erratic performance in case where a document is accessed
that has just been indexed but not refreshed yet, as the implementation will currently force an
internal refresh in that case. Refreshing can be an expensive operation, and also will block the
thread that executes the GET operation, blocking other GETs to be processed. In case of
frequent access of recently indexed documents, this can lead to a refresh storm and terrible
GET performance.

While older versions of Elasticsearch (2.x and older) did not trigger refreshes and instead opted
to read from the translog in case of realtime GET API or update API, this was removed in 5.0
(elastic#20102) to avoid inconsistencies between values that were returned from the translog and
those returned by the index. This was partially reverted in 6.3 (elastic#29264) to allow _update and
upsert to read from the translog again as it was easier to guarantee consistency for these, and
also brought back more predictable performance characteristics of this API. Calls to the realtime
GET API, however, would still always do a refresh if necessary to return consistent results. This
means that users that were calling realtime GET APIs to coordinate updates on client side
(realtime GET + CAS for conditional index of updated doc) would still see very erratic
performance.

This PR (together with elastic#48707) resolves the inconsistencies between reading from translog and
index. In particular it fixes the inconsistencies that happen when requesting stored fields, which
were not available when reading from translog. In case where stored fields are requested, this
PR will reparse the _source from the translog and derive the stored fields to be returned. With
this, it changes the realtime GET API to allow reading from the translog again, avoid refresh
storms and blocking the GET threadpool, and provide overall much better and predictable
performance for this API.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.