Remove unsupported `postings_format` / `doc_values_format` #7604

Closed
@mikemccand
Contributor

commented Sep 4, 2014

Today ES allows you to pick e.g. "pulsing", but this is very dangerous because that format, like all other postings/doc values formats from the Lucene codecs module, has no backwards compatibility support in Lucene. So on upgrade you can easily hit strange exceptions that make your index unusable or look like index corruption.

So I removed lucene-codecs JAR entirely from ES, which e.g. removes direct, simple text, memory PF, Lucene's BloomFilteringPF, and disk/memory DVF.

I haven't verified, but I think users can still put the Lucene codecs JAR onto ES's CLASSPATH (e.g. along with a plugin) and then use these formats in their own apps (at their own risk). I think this extra step is better than the ease with which users can today select formats that Lucene doesn't support.

See #7566 and #7238

@mikemccand mikemccand added the review label Sep 4, 2014

@jpountz jpountz added the breaking label Sep 4, 2014

@jpountz

Contributor

commented Sep 4, 2014

+1

@jpountz jpountz removed the review label Sep 4, 2014

@s1monw

Contributor

commented Sep 5, 2014

So how does this work with the reading part if somebody used one of the postings formats before? If we remove the codecs JAR entirely, will we still be able to read old indices created with Lucene < 4.8? I think that can be very tricky though. I am not sure if all the old postings formats and DV formats are in core?

@jpountz

Contributor

commented Sep 5, 2014

The codecs jar only contained experimental codecs for which Lucene doesn't maintain backward compatibility, so I think it's fine? Users who happened to use one of these non-default codecs would need to either reindex or switch their indices to the default codec and trigger a merge before upgrading. All the old postings/dv formats are in core today (but will move to a module in 4.11 that we can add a dependency on when we upgrade to Lucene 4.11, see https://issues.apache.org/jira/browse/LUCENE-5858).

@mikemccand

Contributor Author

commented Sep 5, 2014

Right, users who use the default (back compat supported) codecs will be fine: those are in core (moving to a separate JAR in 5.0). But non-back-compat codecs (e.g. bloom_pulsing, pulsing) won't be recognized anymore, which I think is OK? (Better than the "false corruption" we saw on #7238.)

> either reindex or switch their indices to the default codec and trigger a merge before upgrading.

Hmm do we allow changing the postings_format / doc_values_format in the mapping for a field after it's created? Or is that "write once"?
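
To see which fields would be affected by the removal discussed here, one could scan an index mapping for the format names mentioned in this thread. The following is only a rough sketch: the function name `find_unsupported_formats` is hypothetical, and the sets of format names are assumptions pieced together from the comments above (pulsing, bloom_pulsing, direct, memory PF, disk/memory DVF), not an authoritative list from the ES code base.

```python
# ASSUMPTION: these names mirror the formats mentioned in this thread; the
# exact set of names ES accepts in mappings is not confirmed here.
UNSUPPORTED_POSTINGS = {"pulsing", "bloom_pulsing", "direct", "memory"}
UNSUPPORTED_DOC_VALUES = {"disk", "memory"}


def find_unsupported_formats(properties, prefix=""):
    """Return (field_path, setting, value) tuples for fields using a removed format.

    `properties` is the "properties" dict of a type mapping, as returned by
    the get mapping API.
    """
    hits = []
    for name, spec in sorted(properties.items()):
        path = prefix + name
        postings = spec.get("postings_format")
        if postings in UNSUPPORTED_POSTINGS:
            hits.append((path, "postings_format", postings))
        doc_values = spec.get("doc_values_format")
        if doc_values in UNSUPPORTED_DOC_VALUES:
            hits.append((path, "doc_values_format", doc_values))
        # Recurse into object fields ("properties") and multi-fields ("fields").
        for key in ("properties", "fields"):
            nested = spec.get(key)
            if isinstance(nested, dict):
                hits.extend(find_unsupported_formats(nested, path + "."))
    return hits


if __name__ == "__main__":
    example = {
        "title": {"type": "string", "postings_format": "pulsing"},
        "user": {"properties": {"id": {"type": "string", "doc_values_format": "disk"}}},
        "body": {"type": "string"},
    }
    for hit in find_unsupported_formats(example):
        print(hit)
```

Any field reported by such a scan would need to be migrated (or reindexed) before upgrading.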

@s1monw

Contributor

commented Sep 5, 2014

> Hmm do we allow changing the postings_format / doc_values_format in the mapping for a field after it's created? Or is that "write once"?

you can change it via the update mapping API

@jpountz

Contributor

commented Sep 5, 2014

> Hmm do we allow changing the postings_format / doc_values_format in the mapping for a field after it's created? Or is that "write once"?

It can be updated, see AbstractFieldMapper.merge.

@mikemccand

Contributor Author

commented Sep 5, 2014

> you can change it via the update mapping API

> It can be updated, see AbstractFieldMapper.merge.

OK that's great, so there is a migration path.
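
The migration path described in this thread (switch the field to the default format via the update mapping API, then trigger a merge) might look roughly like the requests built below. This is a sketch under assumptions, not a tested procedure: the helper name `build_migration_requests` is hypothetical, the endpoint shapes (`PUT /{index}/_mapping/{type}` and `POST /{index}/_optimize`) are what I believe ES 1.x exposes, and whether a forced merge rewrites every existing segment with the new format is not confirmed anywhere in this conversation.

```python
import json


def build_migration_requests(index, doc_type, field, field_type="string"):
    """Build (method, path, body) tuples for switching one field to the default format.

    Nothing is sent over the network; the caller decides how to issue the requests.
    """
    # Step 1: update the mapping so new segments use the default postings format.
    mapping_body = {
        doc_type: {
            "properties": {
                field: {"type": field_type, "postings_format": "default"}
            }
        }
    }
    # Step 2: ask for a merge so old segments are rewritten.
    # ASSUMPTION: a merge down to one segment is enough to rewrite old data.
    return [
        ("PUT", "/%s/_mapping/%s" % (index, doc_type),
         json.dumps(mapping_body, sort_keys=True)),
        ("POST", "/%s/_optimize?max_num_segments=1" % index, ""),
    ]


if __name__ == "__main__":
    for method, path, body in build_migration_requests("logs", "event", "title"):
        print(method, path, body)
```

The same pattern would apply to `doc_values_format`; it is left out here to keep the sketch short.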

@s1monw

Contributor

commented Sep 5, 2014

++ just double checking...

@s1monw

Contributor

commented Sep 5, 2014

LGTM

@mikemccand mikemccand closed this in 130fdef Sep 8, 2014

mikemccand added a commit that referenced this pull request Sep 8, 2014

Core: remove built-in support for Lucene's experimental codecs
Lucene's experimental codecs (from the codecs module) do not provide
backwards compatibility and are free to change from release to
release.  When they do change, they typically cannot read
older indices and the resulting exceptions look like index corruption.
So, we are removing built-in support for them to prevent applications
from choosing one and then seeing strange exceptions on upgrade.

Closes #7566

Closes #7604

@clintongormley clintongormley changed the title Don't allow selecting unsupported postings_format / doc_values_format Mapping: Remove unsupported postings_format / doc_values_format Sep 8, 2014

@clintongormley clintongormley changed the title Mapping: Remove unsupported postings_format / doc_values_format Remove unsupported `postings_format` / `doc_values_format` Jun 6, 2015
