Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add upgrade_only_ancient_segments option to upgrade API #10540

Closed
wants to merge 11 commits into from

Conversation

mikemccand
Copy link
Contributor

PR is based on 1.x.

The new option defaults to false, because it's still important to upgrade old-but-same-Lucene-major-version to get improvements.

If it's set to true, only segments whose major version is older than current major version will be upgraded.

I also fixed the REST upgrade GET API to separately return bytes of ancient segments.

Does anyone know why UpgradeTest doesn't allow testing upgrading ancient segments...? ElasticsearchBackwardsCompatibilityIntegrationTest.backwardsCompatibilityPath gets angry because e.g. 0.90.X is < Version.V_1_6_0.minimumCompatibilityVersion() ... but I thought users are allowed to do such an upgrade?

Closes #10213

@mikemccand mikemccand self-assigned this Apr 10, 2015
The `upgrade` API accepts the following request parameters:

[horizontal]
`upgrade_only_ancient_segments`:: If true, only very old segments (from a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can upgrade be omitted here since its redundant with the upgrade api name?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also can we make a private boolean method shouldUpgrade for the hairy boolean logic. Github review tool just broke so i cant put this comment where it belongs.

runUpgrade(httpClient(), indexToUpgrade, "only_ancient_segments", "true");

// TODO: why do we never test upgrading ancient segments here...?
// E.g. 0.90.x (Lucene 3.x) --> 1.x (Lucene 4.x) ElasticsearchBackwardsCompatIntegrationTest.backwardsCompatibilityPath checks that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, 0.90.x was 4.x right? The backcompat tests can only run on the same major version because of wire format changes between 1.x and 2.0. I think the only way to test this is with the static bwc tests. There used to be an "UpgradeReallyOldIndexTest" or something like that. Perhaps we should have that to test the ancient vs non ancient behavior, and leave the static bwc tests just doing a normal upgrade?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm, 0.90.x was 4.x right?

Oh you're right!

OK I'll add a static index based on UpgradeReallyOldIndexTest on 1.x ..

@s1monw
Copy link
Contributor

s1monw commented Apr 14, 2015

I don't have any complains with this though if @rjernst is happy this LGTM

@s1monw s1monw added the blocker label Apr 14, 2015
@mikemccand
Copy link
Contributor Author

OK I pushed more changes. I think this is ready.

I added a new test case with a static back compat index containing a mix of ancient (Lucene 3.x) and merely old (Lucene 4.x) segments, and confirmed if I upgrade with only_ancient then the "merely old" segments remain.

However, I saw this spooky-yet-apparently-harmless exception on upgrade:

 [2015-04-14 16:34:54,964][TRACE][index.gateway.local      ] [node_t0] [index-0.20.6-and-0.90.6][0] ignoring truncation exception, the translog is either empty or half-written
  1> org.elasticsearch.index.translog.TruncatedTranslogException: translog header truncated
  1> at org.elasticsearch.index.translog.ChecksummedTranslogStream.openInput(ChecksummedTranslogStream.java:120)
  1> at org.elasticsearch.index.gateway.local.LocalIndexShardGateway.recover(LocalIndexShardGateway.java:246)
  1> at org.elasticsearch.index.gateway.IndexShardGatewayService$1.run(IndexShardGatewayService.java:112)
  1> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  1> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  1> at java.lang.Thread.run(Thread.java:745)
  1> Caused by: java.io.EOFException
  1> at org.apache.lucene.store.InputStreamDataInput.readByte(InputStreamDataInput.java:37)
  1> at org.apache.lucene.store.DataInput.readInt(DataInput.java:98)
  1> at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:134)
  1> at org.elasticsearch.index.translog.ChecksummedTranslogStream.openInput(ChecksummedTranslogStream.java:116)
  1> ... 5 more

It's caused by a 0-byte translog file that 0.20.6 left behind even after I flushed the index. It doesn't cause the test to fail (ES really does just ignore it) but it still seems bad... though maybe it's just a known issue with 0.20.6? The 0 byte translog that 0.20.6 wrote somehow survived after upgrade to 0.90.6, adding new docs and flushing.

if 'node' in vars():

if node is not None:
# This only happens if we've hit an exception:
logging.info('Shutting down node with pid %d', node.pid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe move this to a simple helper like shutdown_node(node) since it is exactly the same as above?

@rjernst
Copy link
Member

rjernst commented Apr 15, 2015

LGTM

@mikemccand
Copy link
Contributor Author

maybe move this to a simple helper like shutdown_node(node) since it is exactly the same as above?

Good idea, I just pushed that.

mikemccand added a commit that referenced this pull request Apr 16, 2015
… an old Lucene version are upgraded

This option defaults to false, because it is also important to upgrade
the "merely old" segments since many Lucene improvements happen within
minor releases.

But you can pass true to do the minimal work necessary to upgrade to
the next major Elasticsearch release.

The HTTP GET upgrade request now also breaks out how many bytes of
ancient segments need upgrading.

Closes #10213

Closes #10540
@mikemccand mikemccand removed the review label Apr 16, 2015
@mikemccand
Copy link
Contributor Author

When I merged to master I dropped the static mixed ancient and "merely old" segments test because we cannot test that until we have a 2.0 release.

@clintongormley clintongormley added :Upgrade API and removed :Core/Infra/Core Core issues without another label labels May 30, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants