Norms disabling on existing fields #4813
Comments
How would scoring work if some of the documents have norms and some do not? Would it just ignore them on all documents because they are false on the [...]
By default, here is what Lucene would do:

Let's assume that we already have 2 segments A and B that have norms. We are now writing segment C and the first [...] On the next refresh, segment C will be written, and no document of segment C will have norms (even though the first documents were added with norms -- disabling norms is a destructive operation). If you run a query on this index, norms will be taken into account on A and B, and norms will be assumed to be all equal to 1 on C since it doesn't have norms. Then, as background merges happen, A and B are going to be merged with segments that don't have norms, and the resulting segment won't have norms either (even for documents that come from A or B).

That is it for the default behavior. Alternatively, we could wrap the IndexReader to make all segments pretend that they don't have norms as soon as norms get disabled via a mapping update.
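To make the default behavior described above concrete, here is a minimal toy model in plain Java. It is a sketch, not Lucene code: `Segment`, `withNorms`, `withoutNorms`, and `merge` are hypothetical names invented for illustration, and the merge rule simply encodes the description above (a merged segment keeps norms only if every input segment has them; a segment without norms reports 1 for every document).

```java
import java.util.List;

// Toy model only: Segment and merge() are hypothetical, not Lucene classes.
final class NormsMergeSketch {

    static final class Segment {
        final String name;
        final int docCount;
        final float[] norms; // null when the segment was written without norms

        private Segment(String name, int docCount, float[] norms) {
            this.name = name;
            this.docCount = docCount;
            this.norms = norms;
        }

        static Segment withNorms(String name, float[] norms) {
            return new Segment(name, norms.length, norms.clone());
        }

        static Segment withoutNorms(String name, int docCount) {
            return new Segment(name, docCount, null);
        }

        boolean hasNorms() {
            return norms != null;
        }

        // Norm seen at query time: the stored value, or 1 when the segment has no norms.
        float normAt(int doc) {
            return hasNorms() ? norms[doc] : 1.0f;
        }
    }

    // Merging is destructive: one input without norms drops norms for the whole result.
    static Segment merge(String name, List<Segment> inputs) {
        int total = 0;
        boolean allHaveNorms = true;
        for (Segment s : inputs) {
            total += s.docCount;
            allHaveNorms &= s.hasNorms();
        }
        if (!allHaveNorms) {
            return Segment.withoutNorms(name, total);
        }
        float[] merged = new float[total];
        int pos = 0;
        for (Segment s : inputs) {
            System.arraycopy(s.norms, 0, merged, pos, s.docCount);
            pos += s.docCount;
        }
        return Segment.withNorms(name, merged);
    }

    public static void main(String[] args) {
        Segment a = Segment.withNorms("A", new float[] {0.5f, 0.25f});
        Segment c = Segment.withoutNorms("C", 2); // written after norms were disabled
        Segment merged = merge("A+C", List.of(a, c));
        System.out.println(a.normAt(1));       // stored norm from segment A
        System.out.println(c.normAt(0));       // segment C falls back to 1
        System.out.println(merged.hasNorms()); // norms from A are gone after the merge
    }
}
```

In this model, documents that came from A lose their stored norms as soon as A is merged with C, which is why the operation cannot be undone.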
Would that make the results from segment C score higher than those from segments A and B until the merge, unless you did something like wrap the IndexReader? If you aren't using index-time boosts, a value of 1 represents a single term, right?
Yes, very likely, segment C would score higher. I have to admit I hadn't thought too much about scoring. I mostly thought about users who would realize they were using a particular field solely for matching (without scoring), sorting or aggregations and would like to stop paying the price for norms. So maybe it makes sense to do some wrapping to avoid surprises with scores.
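A rough worked example of the score skew, as a sketch under stated assumptions: it assumes the classic TF-IDF length norm of 1 / sqrt(number of terms) with no index-time boost, and it ignores the lossy single-byte encoding that Lucene applies to stored norms. `NormsScoreSketch`, `effectiveNorm`, and the `wrapped` flag are hypothetical names; `wrapped` stands in for the reader-wrapping idea, where every segment pretends to have no norms once they are disabled in the mapping.

```java
// Toy arithmetic only: not Lucene's Similarity API.
final class NormsScoreSketch {

    // Assumed classic length norm with no index-time boost: 1 / sqrt(numTerms).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Norm factor applied at query time for one document.
    // wrapped == true models a reader wrapper that hides norms on every segment.
    static float effectiveNorm(int numTerms, boolean segmentHasNorms, boolean wrapped) {
        if (wrapped || !segmentHasNorms) {
            return 1.0f; // missing norms are treated as 1
        }
        return lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        int numTerms = 16; // same 16-term field value indexed into an old and a new segment
        float inA = effectiveNorm(numTerms, true, false);  // segment A still has norms
        float inC = effectiveNorm(numTerms, false, false); // segment C was written without norms
        System.out.println("A: " + inA + ", C: " + inC + ", ratio: " + (inC / inA));

        // With the wrapper, both segments use the same factor, so scores are comparable.
        System.out.println(effectiveNorm(numTerms, true, true) == effectiveNorm(numTerms, false, true));
    }
}
```

Under these assumptions a norm of 1 does correspond to a single-term field, and an otherwise identical 16-term document gets a norm factor four times larger in segment C than in segment A until the segments are merged or the reader is wrapped.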
You could get away with just documenting it very well I suppose.
Right. I need to think more about the consequences of each option... :-)
We should allow for disabling norms on existing fields via the update mappings API. Implementation-wise, we would only have to set `omitNorms` to true in the `FieldType`, and Lucene would automatically ignore norms on the next fields that would be added to the index and remove data from the index upon merges. However, the reverse operation cannot be supported, so disabling norms would be a destructive operation.