_field_names should not get docvalues #10892

Closed
rmuir opened this issue Apr 30, 2015 · 1 comment · Fixed by #10893

rmuir commented Apr 30, 2015

When I try to reproduce @peterskim12's benchmark here (https://github.com/peterskim12/elk-index-size-tests), I see larger space usage on master than expected.

I first looked at the fieldinfos and saw _field_names with SORTED_SET doc values. This should be pretty wasteful and is probably the smoking gun:

field             indexed                       docvalues
_uid              DOCS                          NONE
_source           NONE                          NONE
_type             DOCS                          NONE
_version          NONE                          NUMERIC
@version          DOCS                          SORTED_SET
@timestamp        DOCS                          SORTED_NUMERIC
host              DOCS_AND_FREQS_AND_POSITIONS  NONE
host.raw          DOCS                          SORTED_SET
clientip          DOCS_AND_FREQS_AND_POSITIONS  NONE
clientip.raw      DOCS                          SORTED_SET
ident             DOCS_AND_FREQS_AND_POSITIONS  NONE
ident.raw         DOCS                          SORTED_SET
auth              DOCS_AND_FREQS_AND_POSITIONS  NONE
auth.raw          DOCS                          SORTED_SET
timestamp         DOCS_AND_FREQS_AND_POSITIONS  NONE
timestamp.raw     DOCS                          SORTED_SET
verb              DOCS_AND_FREQS_AND_POSITIONS  NONE
verb.raw          DOCS                          SORTED_SET
request           DOCS_AND_FREQS_AND_POSITIONS  NONE
request.raw       DOCS                          SORTED_SET
httpversion       DOCS_AND_FREQS_AND_POSITIONS  NONE
httpversion.raw   DOCS                          SORTED_SET
response          DOCS                          SORTED_NUMERIC
bytes             DOCS                          SORTED_NUMERIC
referrer          DOCS_AND_FREQS_AND_POSITIONS  NONE
referrer.raw      DOCS                          SORTED_SET
agent             DOCS_AND_FREQS_AND_POSITIONS  NONE
agent.raw         DOCS                          SORTED_SET
_all              DOCS_AND_FREQS_AND_POSITIONS  NONE
_field_names      DOCS                          SORTED_SET
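
For anyone who wants to run the same check on a listing like the one above, here is a minimal Python sketch. It assumes the listing is available as whitespace-separated text, and it treats any field whose name starts with an underscore as a meta field, which is only a heuristic. The helper names `parse_field_infos` and `meta_fields_with_doc_values` are hypothetical and not part of Elasticsearch or Lucene. Only a subset of the rows is transcribed here.

```python
# A subset of the rows from the listing in this issue:
# field name, index options, doc values type.
FIELD_INFOS = """\
_uid DOCS NONE
_source NONE NONE
_type DOCS NONE
_version NONE NUMERIC
@timestamp DOCS SORTED_NUMERIC
host DOCS_AND_FREQS_AND_POSITIONS NONE
host.raw DOCS SORTED_SET
_all DOCS_AND_FREQS_AND_POSITIONS NONE
_field_names DOCS SORTED_SET
"""


def parse_field_infos(text):
    """Parse 'name index_options doc_values' rows into a list of 3-tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # skip blank or malformed lines
            rows.append(tuple(parts))
    return rows


def meta_fields_with_doc_values(rows):
    """Return (name, doc_values) for underscore-prefixed fields with doc values.

    Assumption: a leading underscore marks a meta field.
    """
    return [
        (name, doc_values)
        for name, _index_options, doc_values in rows
        if name.startswith("_") and doc_values != "NONE"
    ]


if __name__ == "__main__":
    for name, doc_values in meta_fields_with_doc_values(parse_field_infos(FIELD_INFOS)):
        print(name, doc_values)
```

With these rows the filter reports `_version` (NUMERIC) and `_field_names` (SORTED_SET). This issue only calls out the second one.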

rmuir commented Apr 30, 2015

I peeked at the docvalues file and saw ~8MB of ordinals for _field_names alone, so it accounts for the space increase that surprised me (75MB -> 83MB total index size).
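
The size comparison above works out as follows, in a small Python sketch that uses only the two totals quoted in the comment (75MB and 83MB). The function name `relative_increase` is made up for illustration.

```python
def relative_increase(before_mb, after_mb):
    """Fractional growth from before_mb to after_mb."""
    return (after_mb - before_mb) / before_mb


before_mb, after_mb = 75, 83       # total index sizes quoted in the comment
delta_mb = after_mb - before_mb    # growth to compare with the ~8MB of ordinals

print(delta_mb)                                        # 8
print(f"{relative_increase(before_mb, after_mb):.1%}") # 10.7%
```

The 8MB difference matches the approximate size of the _field_names ordinals, which is why that one field can explain the whole increase.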

rjernst added a commit to rjernst/elasticsearch that referenced this issue Apr 30, 2015
When doc values were turned on by default, most meta fields
had them explicitly disabled. However, _field_names was missed.
This change forces doc values to always be off for _field_names
and removes the unnecessary support when creating index fields.

closes elastic#10892