Remove possibility for conflicting field definitions and ambiguous field resolution #8870

clintongormley · 2014-12-10T12:35:34Z

Fields with the same name in different types in the same index should have the same mapping. Previously, this has been advised as "best practice", but relying on advice has proved insufficient (see #4081 for the many issues that have resulted from allowing conflicting field definitions). Instead we need to enforce this in the API.

This issue (which replaces #4081) is a meta issue listing all of the changes that need to be made:

Alternatives:

Fields with different types can be renamed to distinguish their purpose, eg login_name vs login_date.
Different types can be separated into different indices.


OlegYch · 2014-12-10T14:25:47Z

what about fields with different path (but same name) and different mapping in one type?

clintongormley · 2014-12-10T14:28:41Z

@OlegYch Since we will no longer support short names, only the full path is used to identify a field, so it is only the path that matters.
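A rough way to check whether an existing index would be affected is to flatten each type's mapping to full paths and compare the declared field types. The sketch below is illustrative only: it works on plain dicts shaped like the `properties` section of a mapping, the helper names `flatten` and `find_conflicts` and the sample mappings are hypothetical, and it compares only the `type` attribute rather than the whole field definition.

```python
from collections import defaultdict


def flatten(properties, prefix=""):
    """Yield (full_path, field_type) for every leaf field in a 'properties' dict."""
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object field: recurse, extending the dotted path.
            yield from flatten(spec["properties"], path + ".")
        else:
            yield path, spec.get("type", "object")


def find_conflicts(mappings):
    """Return {full_path: {field_type: [doc_types]}} for paths declared with more than one type."""
    seen = defaultdict(dict)
    for doc_type, mapping in mappings.items():
        for path, field_type in flatten(mapping.get("properties", {})):
            seen[path].setdefault(field_type, []).append(doc_type)
    return {path: by_type for path, by_type in seen.items() if len(by_type) > 1}


# Hypothetical mappings for two document types in one index.
mappings = {
    "user": {"properties": {
        "login": {"type": "string"},
        "address": {"properties": {"id": {"type": "string"}}},
    }},
    "audit": {"properties": {
        "login": {"type": "date"},
        "shipping": {"properties": {"id": {"type": "long"}}},
    }},
}

conflicts = find_conflicts(mappings)
print(conflicts)
```

Here `login` is reported because the same full path is mapped as `string` in one type and `date` in the other, while `address.id` and `shipping.id` are not, since they share a leaf name but not a path.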

OlegYch · 2014-12-10T14:47:16Z

we are using 'mapping types' precisely to store different kinds of documents in the same index (so that we can use parent/child queries)
so far prefixing field path with _type worked fine for us
this change would mean that we would lose ability to use parent/child queries if there happens to be a conflict, as they would have to be in different indexes
the conflicts will probably be rare, and we could probably change documents schema if they arise, but a better way to prevent or diagnose the conflicts than an exception on put would be nice
i'm also wondering if there is no way to somehow append field type to its name internally (like one would do to resolve conflicts manually)?

clintongormley · 2014-12-10T15:12:07Z

> the conflicts will probably be rare, and we could probably change documents schema if they arise, but a better way to prevent or diagnose the conflicts than an exception on put would be nice
> i'm also wondering if there is no way to somehow append field type to its name internally (like one would do to resolve conflicts manually)?

As you said, conflicts are rare (for most people). Normally a field with the same name refers to the same type of data. The alternative (eg prefixing the field name with the type) would create much more sparsity in the index, and would impact every cross-type query (as multiple fields would need to be queried). Right now, we have opted for making the common case correct and efficient. However, we have left the APIs as they are in case, sometime in the future, we manage to figure out a cleverer way of handling conflicting field definitions.
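To make the cost of the prefixing alternative concrete, here is a small sketch of what storing fields under a type prefix would imply. It is not how Elasticsearch works internally; the helpers `prefix_with_type` and `fields_to_query` and the sample documents are hypothetical.

```python
def prefix_with_type(doc_type, doc):
    """Store each field under '<type>.<field>' so types never share a field name."""
    return {f"{doc_type}.{field}": value for field, value in doc.items()}


def fields_to_query(field, doc_types):
    """A cross-type search on one logical field must target one physical field per type."""
    return [f"{doc_type}.{field}" for doc_type in doc_types]


# Hypothetical documents of two types with overlapping field names.
docs = [
    ("user", {"name": "alice", "login": "alice01"}),
    ("audit", {"name": "alice", "login": "2014-12-10"}),
]
stored = [prefix_with_type(doc_type, doc) for doc_type, doc in docs]

# The index now holds four physical fields, but each document populates only two.
all_fields = sorted({field for doc in stored for field in doc})
query_fields = fields_to_query("name", ["user", "audit"])

print(all_fields)
print(query_fields)
```

Each document leaves the other type's fields empty, which is the sparsity mentioned above, and a search on `name` across both types has to be expanded to one field per type.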

rore · 2014-12-29T13:40:01Z

For the record, I want to raise here again our objection to the way this change is planned.

We have several use cases in which we have an index with custom types that have custom fields. Types are not pre-defined and fields are not pre-defined. There's high potential of fields with the same name under different types, including fields with the same name that have different field type (and it happens). We have a lot of this kind of data.

This modeling was aligned with the way Types were presented by Elasticsearch (and still are - in our last meetup Types were referred to by Boaz as equivalent to "tables").

So with this change we will be forced to either hard-code the type as a prefix for all fields ourselves, or encapsulate all documents with a root "type" node. It can be done but is ugly, requires patchy handling and also reindexing all data.

A better option that will allow such use cases is having a setting on the index level to configure field type isolation. So if we know that we need field type isolation (and we don't have cross-type searches and are willing to pay the overhead), we can set it to "type" level, and internally fields will be prefixed by the type.

clintongormley · 2014-12-29T13:57:21Z

Hi @rore

> There's high potential of fields with the same name under different types, including fields with the same name that have different field type (and it happens). We have a lot of this kind of data.

If you have this, then your data is essentially broken today, and you can end up with incorrect results, exceptions, or even corrupt indices.

The first thing that we're trying to do is to make everything safe and predictable. We are leaving the mapping APIs as they are so that, in the future, we may be able to revisit this decision and provide more alternatives.

jhansen-tt · 2015-06-25T22:43:50Z

+1. This has been a major pain ever since I started using ES, and still happens in 1.6. Deleting the type alone doesn't fix the problem -- I have to completely delete the entire index and re-index everything, and then sometimes the problem goes away. I do set up a strict mapping before I index anything. It really feels like a timing issue between the shards.

jordansissel · 2015-06-25T22:53:03Z

> it seems that default data model for logstash is broken then.

Maybe? This only impacts users who have the same field name mapped to different data types under different document types in the same index. I don't know how many users this affects. Given that Logstash has had this behavior (by default a daily index) for many years, and that I can't recall many reports of this problem from Logstash users in those years, I'm not sure how big a problem this will be for Logstash users.

I will confess that prior to this ticket, my assumption was that type mappings were fully independent even if they shared field names (w/ different mappings) - I now accept this as incorrect, but I don't know how great an impact this has had on Logstash users at this time.

> If you enforce that all the same names need to have the same type you are effectively enforcing the same schema for all different logs from different sources.

These are two different things. The constraint "Fields in the same index but on different document types must have the same mapping" is not the same as saying "the same schema must exist for all documents in the same index regardless of type". The conflict arises only when two fields share the same name but have different mappings in one index.

More research is needed for the Logstash side of things, but it's possible we may want to change the default index to include the 'type' field from Logstash (it'll be a backwards-compatibility-breaking change, if we do this). Hopefully the script from #10214 will help users figure out how this will impact them before upgrading, and we can address further from there.

jhansen-tt · 2015-06-25T23:08:21Z

After reading this:

https://www.elastic.co/guide/en/elasticsearch/reference/1.3/mapping.html

I think this should be taken out of the docs:
> In practice though, this restriction is almost never an issue.

It looks like this is probably my issue, but I think the documentation should be updated to say that using different mapping characteristics on fields with the same name across multiple types is not supported, because some searches fall apart completely, such as sorting.

monowai · 2015-07-02T22:00:11Z

Indexes can have the same field name with a different type. Doesn't this change move the query problem out of the index/type level and up into the index? It seems to me that if I had a query spanning indexes I'd still have the same fieldname+datatype conflict problem.

Is there any merit in resolving this as part of the query DSL? If you have conflicting field+datatypes, could the query allow the caller to specify which field+datatype they wanted, ignoring docs that don't match the criteria?

rjernst · 2015-07-03T00:06:54Z

@monowai The important thing about #8871 was making field types consistent within an index. You are correct that across indexes, the problem can still exist. However, whether this is an error case depends on the query. If the query is not parseable in one of the indexes, an exception would be raised. This was already the case, but now it should be consistently raised, while before mixed field types within an index could have masked the problem (depending on the order the document types were loaded within mappings).
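The remaining cross-index case can be illustrated with a short sketch. It assumes each index is already internally consistent, so one type per field per index is enough; the index names, the flattened `{field: type}` shape, and the helper names are hypothetical.

```python
def field_types_by_index(field, index_mappings):
    """Map each index name to the type it declares for `field` (None if unmapped)."""
    return {index: fields.get(field) for index, fields in index_mappings.items()}


def indices_disagree(field, index_mappings):
    """True if at least two indices map `field` with different types."""
    declared = {
        field_type
        for field_type in field_types_by_index(field, index_mappings).values()
        if field_type is not None
    }
    return len(declared) > 1


# Hypothetical daily indices whose mappings drifted apart.
index_mappings = {
    "logs-2015.07.01": {"status": "long", "message": "string"},
    "logs-2015.07.02": {"status": "string", "message": "string"},
}

print(field_types_by_index("status", index_mappings))
print(indices_disagree("status", index_mappings))
print(indices_disagree("message", index_mappings))
```

A query on `status` spanning both indices is the kind of request that may fail to parse in one of them, whereas `message` is consistent everywhere.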

jpountz mentioned this issue Dec 10, 2014

Mappings: disallow exotic options on meta fields #8143

Closed

This was referenced Dec 10, 2014

Deprecate extracting _routing and _id from document fields #6730

Closed

Remove the _boost field #8875

Closed

jpountz mentioned this issue Dec 10, 2014

Mappings: Ensure that reindexing is always possible #8142

Closed

clintongormley mentioned this issue Dec 10, 2014

Deprecate the _size field #8876

Closed

clintongormley added :Search Foundations/Mapping Index mappings, including merging and defining field types >breaking v2.0.0-beta1 labels Dec 10, 2014

clintongormley mentioned this issue Dec 10, 2014

Remove the ability to delete mappings #8877

Closed

jpountz mentioned this issue Dec 10, 2014

Mappings: Deprecate index_name and path #6677

Closed

clintongormley mentioned this issue Dec 10, 2014

Field resolution should be unambiguous #4081

Closed

clintongormley added the Meta label Dec 17, 2014

clintongormley mentioned this issue Dec 29, 2014

geo_distance - can't filter by inner (object) type UncheckedExecutionException - java.lang.NumberFormatException #5255

Closed

This was referenced Dec 30, 2014

Problems with field resolution in significant terms aggregations #5687

Closed

Prepend the type name to the index_name automatically #5851

Closed

Wildcard expansion on field names should not try to match on name #6494

Closed

clintongormley mentioned this issue Jun 6, 2015

Indexing: index-time sorting #6720

Closed

clintongormley mentioned this issue Jun 14, 2015

Number fields #11650

Closed

monowai mentioned this issue Jun 17, 2015

Display fields by index/type elastic/kibana#4252

Closed

modmac mentioned this issue Jul 28, 2015

Problem with sorting on a filtered query elastic/elasticsearch-php#264

Closed

clintongormley mentioned this issue Aug 6, 2015

ArrayIndexOutOfBoundsException when using 2-level terms aggregation #12685

Closed

clintongormley added v2.0.0 and removed v2.0.0-beta1 v2.0.0 labels Aug 13, 2015

clintongormley closed this as completed Aug 16, 2015

clintongormley mentioned this issue Oct 16, 2015

no buckets returned by terms aggregation on raw field #14161

Closed

This was referenced Nov 21, 2015

sorting does not work when querying an alias #7343

Closed

search by type in URL finds a document of a different type #7635

Closed

This was referenced Dec 2, 2015

copy_to of mapper attachments metadata field isn't working #14946

Closed

Average seems to be wrong #15169

Closed

clintongormley mentioned this issue Dec 11, 2015

XContentBuilder throws NumberFormatException for Date field #15375

Closed

This was referenced Jan 10, 2016

Not getting All data in case of multiple types. Getting data for one specific type. #15533

Closed

Allowing dots in field names #15951

Closed

clintongormley mentioned this issue Jan 29, 2016

omit_norms: true PUT _mapping exception #16298

Closed

This was referenced Feb 29, 2016

Aggregations returns wrong results #16849

Closed

ALLOCATION_FAILED Field [user] is defined as a field in mapping [syslog] #16695

Closed

clintongormley mentioned this issue Apr 6, 2016

Invalid shift value in prefixCoded bytes #17559

Closed

rjernst mentioned this issue Jun 30, 2016

Nested query against documents with multiple nested fields with same second-level field names only finds hits for the first-defined nested field #19193

Closed

rjernst mentioned this issue Mar 20, 2017

Remove support for types? #15613

Closed

songdongsheng mentioned this issue Jun 7, 2017

different object can not have same field name with different types Erudika/para#11

Closed

javanna added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 16, 2024
