Norms disabling on existing fields #4813

Closed
jpountz opened this issue Jan 20, 2014 · 6 comments
Labels: >enhancement, :Search/Mapping, v1.2.0, v2.0.0-beta1

jpountz (Contributor) commented Jan 20, 2014

We should allow disabling norms on existing fields via the update mapping API. Implementation-wise, we would only have to set omitNorms to true in the FieldType: Lucene would then automatically omit norms for documents subsequently added to the index, and would drop the existing norms data from the index upon merges.

However, the reverse operation cannot be supported, so disabling norms would be a destructive operation.
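
For illustration, here is a minimal sketch of what such a mapping update request could look like. It assumes the `norms.enabled` mapping setting used for string fields in Elasticsearch 1.x; the index, type, and field names are hypothetical, and the helper only builds the request rather than sending it.

```python
import json


def build_disable_norms_request(index, doc_type, field, field_type="string"):
    """Build (method, path, body) for a mapping update that disables norms.

    Assumption: the 1.x-style `"norms": {"enabled": false}` field setting.
    The function name and arguments are hypothetical, for illustration only.
    """
    body = {
        doc_type: {
            "properties": {
                field: {
                    "type": field_type,
                    # Destructive: norms cannot be re-enabled afterwards.
                    "norms": {"enabled": False},
                }
            }
        }
    }
    path = "/{}/_mapping/{}".format(index, doc_type)
    return "PUT", path, json.dumps(body, sort_keys=True)


# Hypothetical index, type, and field names.
method, path, body = build_disable_norms_request("my_index", "my_type", "title")
print(method, path)
print(body)
```

Since the operation is one-way, a client would probably want to confirm with the user before issuing this request.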

nik9000 (Member) commented Jan 21, 2014

How would scoring work if some of the documents have norms and some do not? Would it just ignore them on all documents because they are false on the FieldType?

jpountz (Contributor, Author) commented Jan 21, 2014

By default, here is what Lucene would do: Let's assume that we already have 2 segments A and B that have norms. We are now writing segment C and the first n documents have been added with norms enabled while the last maxDoc-n documents have been added with norms disabled because of a mapping update.

On the next refresh, segment C will be written, and no document of segment C will have norms (even though the first documents were added with norms -- disabling norms is a destructive operation). If you run a query on this index, norms will be taken into account on A and B, and norms will be assumed to be all equal to 1 on C since it doesn't have norms.

Then, as background merges happen, A and B are going to be merged with segments that don't have norms and the resulting segment won't have norms either (even for documents that come from A or B).

That is the default behavior. Alternatively, we could wrap the IndexReader so that all segments pretend not to have norms as soon as norms are disabled via a mapping update.
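
The segment behavior described above can be modeled with a small toy simulation. This is only a sketch of the rules as stated in this comment, not Lucene code: `Segment`, `flush`, `merge`, and `effective_norm` are hypothetical names, and a single boolean stands in for the per-field norms information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    has_norms: bool


def flush(name, docs_omit_norms):
    """Write a new segment from buffered documents.

    docs_omit_norms holds one bool per document: True if the document was
    added after norms were disabled. One such document is enough for the
    whole segment to be written without norms.
    """
    return Segment(name, has_norms=not any(docs_omit_norms))


def merge(name, segments):
    """Merged segment keeps norms only if every input segment has them."""
    return Segment(name, has_norms=all(s.has_norms for s in segments))


def effective_norm(segment, stored_norm, norms_disabled=False, wrap_reader=False):
    """Norm value seen at search time.

    Default behavior: segments without norms behave as if every norm is 1.
    With the optional reader wrapper, all segments pretend to have no norms
    as soon as norms are disabled in the mapping.
    """
    if wrap_reader and norms_disabled:
        return 1.0
    return stored_norm if segment.has_norms else 1.0


a = Segment("A", has_norms=True)
b = Segment("B", has_norms=True)
# First 3 docs indexed with norms, last 2 after the mapping update.
c = flush("C", [False, False, False, True, True])
merged = merge("ABC", [a, b, c])

print(c.has_norms, merged.has_norms)
print(effective_norm(a, 0.25), effective_norm(c, 0.25))
print(effective_norm(a, 0.25, norms_disabled=True, wrap_reader=True))
```

Under the default behavior, A and C disagree on the norm until a merge brings them together; with the wrapper, every segment reports 1 immediately.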

nik9000 (Member) commented Jan 21, 2014

> Then, as background merges happen, A and B are going to be merged with segments that don't have norms and the resulting segment won't have norms either (even for documents that come from A or B).

Would that make the results from segment C score higher than those from segments A and B until the merge, unless you did something like wrap the IndexReader? If you aren't using index-time boosts, a value of 1 represents a single term, right?

jpountz (Contributor, Author) commented Jan 21, 2014

Yes, very likely, segment C would score higher.
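
To make the size of the effect concrete, here is a rough sketch assuming the classic length normalization of Lucene's default TF-IDF similarity, `boost / sqrt(numTerms)`. It ignores the lossy single-byte encoding of stored norms, and the function names are hypothetical.

```python
import math


def length_norm(num_terms, boost=1.0):
    """Length normalization assumed here: boost / sqrt(number of terms)."""
    return boost / math.sqrt(num_terms)


def norm_factor(num_terms, segment_has_norms):
    """Norm factor applied at search time.

    Segments without norms behave as if the norm were 1, which matches a
    single-term field with no index-time boost.
    """
    return length_norm(num_terms) if segment_has_norms else 1.0


# Compare the same field length in a segment with norms and one without.
for n in (1, 4, 16, 100):
    with_norms = norm_factor(n, segment_has_norms=True)
    without_norms = norm_factor(n, segment_has_norms=False)
    print(n, with_norms, without_norms, without_norms / with_norms)
```

Under these assumptions, a 16-term field in a segment without norms gets a norm factor about 4 times larger than the same field in a segment with norms, so the gap grows with field length.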

I have to admit I hadn't thought too much about scoring; I mostly thought about users who would realize they were using a particular field solely for matching (without scoring), sorting or aggregations and would like to stop paying the price for norms. So maybe it makes sense to do some wrapping to avoid surprises with scores.

nik9000 (Member) commented Jan 21, 2014

You could get away with just documenting it very well I suppose.

jpountz (Contributor, Author) commented Jan 21, 2014

Right. I need to think more about the consequences of each option... :-)

s1monw added v1.2.0 and removed v1.1.0 labels Mar 12, 2014
jpountz self-assigned this Mar 24, 2014
jpountz added the v2.0.0 label Mar 24, 2014
jpountz added a commit to jpountz/elasticsearch that referenced this issue Mar 24, 2014
jpountz added a commit that referenced this issue Mar 25, 2014
clintongormley added the >enhancement and :Search/Mapping labels Jun 8, 2015