Don't create mapping entry for dynamic templates #6619
Comments
I spent some time looking at the mapping code, but I don't understand it well enough to make progress here... Can someone who knows ObjectMapper.java give some pointers? I tried commenting out the putMapper(mapper) and context.setMappingsModified() calls at the end of parseDynamicValue, but this makes many tests angry...
This will be a rather big change, since we would also need to change every place that looks up a mapping (for search and such). I think that concrete mappings, even with dynamic templates, are very valuable; for example, Kibana can then auto-suggest existing fields. I think there are a lot of improvements we can add to ES even while it creates concrete mappings. One is #6648; another is potentially to move from update-on-write data structures (which have a better concurrency story) to update-in-place concurrent data structures above a certain threshold. Based on my tests, I think we can get very good performance while still maintaining the concrete mappings case. The cluster state is the place that will suffer, for example when someone has 1 million fields. But I think that is simply abusing the system, and things will break in other places (in terms of resources used, ...), not just mappings.
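To make the threshold idea concrete, here is a minimal sketch, not Elasticsearch code. It reads "update on write" as copy-on-write, which is an assumption, and `FieldMap`, `threshold`, and `in_place` are hypothetical names invented for illustration. Writers copy the whole map while it is small and switch to mutating it in place once it reaches the threshold. The in-place mode assumes the underlying dict tolerates a single-key read racing with a single-key write.

```python
import threading
from typing import Dict, Optional


class FieldMap:
    """Hypothetical field-name -> type map that changes update strategy with size."""

    def __init__(self, threshold: int) -> None:
        self._threshold = threshold
        self._lock = threading.Lock()  # serializes writers in both modes
        self._fields: Dict[str, str] = {}
        self.in_place = False

    def put(self, name: str, field_type: str) -> None:
        with self._lock:
            if self.in_place:
                # Large map: mutate the shared dict, no full copy per update.
                self._fields[name] = field_type
                return
            # Small map: copy, modify the copy, then publish it by swapping the
            # reference, so readers always see a complete, unchanging dict.
            updated = dict(self._fields)
            updated[name] = field_type
            self._fields = updated
            if len(updated) >= self._threshold:
                self.in_place = True

    def get(self, name: str) -> Optional[str]:
        # Readers never take the lock in this sketch.
        return self._fields.get(name)
```

The trade-off is that copying costs time proportional to the map size on every new field, which is cheap for small mappings and expensive for very large ones, while in-place updates give up the immutable snapshot that makes concurrent reads easy to reason about.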
update
#6707 has been pushed as well. I think we are in a good state performance-wise, so I'm closing this for now; we can open a new issue if this is still a problem.
#6843 is another one that helps a lot with memory usage here.
Today, when a new field name shows up in a document matching a dynamic template, we record that field name, its type information, etc.
But if you add many, many fields this way, the mappings become very large and serializing them into the cluster state becomes very costly.
I think we may be able to get away with not making a mapping entry and just re-matching that same field the next time it comes up? Or maybe making mapping entries only up to a limit.
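A rough sketch of both variants, as an illustration only and not the actual ObjectMapper logic: `DynamicTemplate`, `Mappings`, `resolve`, and `max_entries` are hypothetical names, and glob-style name matching is an assumption made to keep the example short. Fields get a concrete entry until the limit is reached; after that, each lookup re-runs the template match and stores nothing.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional


class DynamicTemplate:
    """Hypothetical stand-in for a dynamic template: a name pattern plus a field type."""

    def __init__(self, pattern: str, field_type: str) -> None:
        self.pattern = pattern
        self.field_type = field_type

    def matches(self, field_name: str) -> bool:
        return fnmatchcase(field_name, self.pattern)


class Mappings:
    """Stores concrete entries only up to max_entries; later fields are re-matched."""

    def __init__(self, templates: List[DynamicTemplate], max_entries: int) -> None:
        self.templates = templates
        self.max_entries = max_entries
        self.entries: Dict[str, str] = {}  # the part that would be serialized

    def resolve(self, field_name: str) -> Optional[str]:
        known = self.entries.get(field_name)
        if known is not None:
            return known
        for template in self.templates:
            if template.matches(field_name):
                # Record the field only while under the limit; beyond it the
                # template is simply matched again on the next lookup.
                if len(self.entries) < self.max_entries:
                    self.entries[field_name] = template.field_type
                return template.field_type
        return None
```

With `max_entries=0` this is the first variant (never create an entry). Either way, the stored mappings stay bounded, at the price of a template match on every lookup of an unrecorded field.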