Don't create mapping entry for dynamic templates #6619

mikemccand · 2014-06-25T20:49:47Z

Today, when a new field name shows up in a document matching a dynamic template, we record that field name, its type information, etc.

But if you add many, many fields this way, the mappings become very large and serializing them into the cluster state becomes very costly.

I think we may be able to get away with not making a mapping entry and just re-matching that same field against the dynamic templates the next time it comes up? Or maybe making mapping entries only up to a limit.
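To make the idea concrete, here is a rough sketch in Python. It is not Elasticsearch code: the `Mappings` class, the `max_entries` cap, and the glob-style template patterns are all hypothetical and only illustrate "re-match instead of record, or record only up to a limit".

```python
from fnmatch import fnmatchcase

# Hypothetical dynamic templates: (glob pattern on the field name, field type).
TEMPLATES = [
    ("*_i", "long"),
    ("*_s", "string"),
]


class Mappings:
    """Toy mapping store (an assumption for illustration, not ObjectMapper)."""

    def __init__(self, templates, max_entries):
        self.templates = templates
        self.max_entries = max_entries  # cap on concrete mapping entries
        self.entries = {}               # field name -> type; the part that would be serialized

    def match_template(self, field):
        # The first template whose pattern matches the field name wins.
        for pattern, field_type in self.templates:
            if fnmatchcase(field, pattern):
                return field_type
        return None

    def resolve(self, field):
        # Use the concrete entry if one was recorded earlier.
        if field in self.entries:
            return self.entries[field]
        field_type = self.match_template(field)
        # Record a concrete entry only while under the limit; past the limit
        # the field is simply re-matched on every lookup.
        if field_type is not None and len(self.entries) < self.max_entries:
            self.entries[field] = field_type
        return field_type
```

With `max_entries=0` nothing is ever recorded and every lookup re-matches the templates, which is the first variant. A positive cap gives the second variant, where the serialized mappings stop growing once the cap is reached.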

mikemccand · 2014-06-28T09:11:18Z

I spent some time looking at the mapping code but I don't understand it enough to make progress here... can someone who knows ObjectMapper.java give some pointers?

I tried commenting out the putMapper(mapper) and context.setMappingsModified() calls at the end of parseDynamicValue, but this makes many tests angry...

kimchy · 2014-06-30T10:49:06Z

This will be a rather big change, since we would also need to change every place that looks up a mapping (for search and such). I think that concrete mappings, even with dynamic templates, are very valuable; for example, Kibana can then auto-suggest existing fields and such.

I think that there are a lot of improvements that we can add to ES even when it concretely creates mappings. One is this: #6648. The other is potentially to move from update on write data structures (that have a better concurrency story) to update in place concurrent data structures above a certain threshold. Based on my tests, I think we can get to very good perf while still maintaining the concrete mappings case.
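The threshold idea could look roughly like the Python sketch below. It is an illustration only, not the ES implementation: `HybridMap`, its `threshold` parameter, and the `copies` counter are made-up names. Below the threshold every write builds a fresh copy, so readers always see a complete snapshot. At or above it, writes mutate the map in place, which avoids copying a large map for every new field.

```python
import threading


class HybridMap:
    """Copy-on-write while small, update-in-place once large (hypothetical sketch)."""

    def __init__(self, threshold):
        self._threshold = threshold
        self._lock = threading.Lock()  # serializes writers
        self._data = {}
        self.copies = 0                # full copies made so far, for illustration

    def put(self, key, value):
        with self._lock:
            if len(self._data) < self._threshold:
                # Copy on write: build a new dict and swap the reference, so a
                # reader holding the old dict never sees a partial update.
                new_data = dict(self._data)
                new_data[key] = value
                self._data = new_data
                self.copies += 1
            else:
                # Update in place: no copy, so cost no longer grows with map size.
                self._data[key] = value

    def get(self, key, default=None):
        # Reads take no lock; they use whatever dict is currently published.
        return self._data.get(key, default)
```

In Java the in-place side would more likely be a concurrent map such as `ConcurrentHashMap`; the plain dict here only stands in for that.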

The cluster state is the place that will suffer, for example when someone has 1 million fields. But I think that this is simply abusing the system, and things will break in other places (in terms of resources used, ...), not just mappings.

kimchy · 2014-07-03T19:06:16Z

update

Mappings: Update mapping on master in async manner #6648 is in, and will be in 1.3.
More resource efficient analysis wrapping usage #6714 is in, and will be in 1.3 (memory improvement in analysis for many mappings).
Improve performance for many new fields introduction in mapping #6707 is the last one. It brings the perf to acceptable levels, but is a bit trickier in terms of code change, and we are still discussing it.

kimchy · 2014-07-05T15:44:18Z

#6707 has been pushed as well. I think we are in a good state performance-wise, so I am closing this for now; we can open a new issue if this is still a problem.

kimchy · 2014-07-21T13:24:47Z

#6843 is another one that helps a lot with memory usage here.

mikemccand added enhancement labels Jun 25, 2014

areek assigned rjernst and unassigned rjernst Jun 29, 2014

kimchy removed v1.3.0 labels Jul 5, 2014

kimchy closed this as completed Jul 5, 2014

jpountz mentioned this issue Jan 12, 2015

first cut at ephemeral fields #9189

Closed
