Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: change the way unsupported data types fields are handled #50823

Merged
merged 9 commits into from
Jan 20, 2020

Conversation

astefan
Copy link
Contributor

@astefan astefan commented Jan 9, 2020

By default, new field types in SQL are treated as UNSUPPORTED, but because "flattened" adds a _keyed sub-field to the flattened field, the fields discovery algorithm in SQL (looking at the output of field_caps API) is broken by the _keyed suffix.

This PR fixes this situation for the "flattened" data type, but also for other future unsupported data types, by marking an entire hierarchy of fields and sub-fields under an unsupported data type field, as unsupported.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/SQL)

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if there isn't a generic approach that we can take to treat flattened as unsupported without being aware of it.
That is take into account a multi field starting with _ and potentially looking at the type and notice that we don't recognize it - essentially extending the type discovery to multi fields with special "internal" keys.

@@ -39,6 +39,7 @@ check.dependsOn internalClusterTest

dependencies {
compileOnly project(path: xpackModule('core'), configuration: 'default')
compileOnly project(path: xpackModule('mapper-flattened'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this creates a runtime issue as plugins that are not marked as extensible (such as Painless) cannot have their classes imported into a different project.
In other words the FlatObjectFieldMapper.CONTENT_TYPE won't be available which would result in a CNFE.
As such as I would simply declare the String constant internally and use it verbatim instead of linking to it.

@@ -493,6 +495,12 @@ public void resolveAsSeparateMappings(String indexWildcard, String javaRegex, bo
if (typeEntry.getKey().startsWith("_") && typeCap.getType().startsWith("_")) {
continue;
}

// skip the "flattened" type of field's "_keyed" subfield
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we cannot make this a bit generic by looking for field names that have leaves starting with _ and if so, look at their type - if its not recognized, mark the field as unsupported.

@astefan astefan requested a review from costin January 13, 2020 16:42
@astefan
Copy link
Contributor Author

astefan commented Jan 13, 2020

@elasticmachine run elasticsearch-ci/bwc

@astefan
Copy link
Contributor Author

astefan commented Jan 13, 2020

@elasticmachine run elasticsearch-ci/default-distro

@astefan
Copy link
Contributor Author

astefan commented Jan 13, 2020

@elasticmachine update branch

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -495,6 +501,7 @@ public void resolveAsSeparateMappings(String indexWildcard, String javaRegex, bo
if (typeEntry.getKey().startsWith("_") && typeCap.getType().startsWith("_")) {
continue;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: extra new line

@polyfractal polyfractal added v7.7.0 and removed v7.6.0 labels Jan 15, 2020
@astefan astefan requested a review from bpintea January 15, 2020 16:01
Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - added just a few comments regarding styling.

String inherited = unsupportedParent.getInherited();
String type = unsupportedParent.getOriginalType();

if (inherited == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole block can be further simplified by

esField = new UnsupportedEsField(esField.getName(), type, inherited != null ? inherited : unsupportedParent.getName(), esField.getProperties());

for (Entry<String, EsField> entry : properties.entrySet()) {
EsField field = entry.getValue();
UnsupportedEsField u;
if (field instanceof UnsupportedEsField) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the if needed - the result will be the same regardless of whether field is UnsupportedEsField or not.

@@ -217,8 +217,14 @@ private static Attribute handleSpecialFields(UnresolvedAttribute u, Attribute na
// unsupported types
else if (DataTypes.isUnsupported(fa.dataType())) {
UnsupportedEsField unsupportedField = (UnsupportedEsField) fa.field();
named = u.withUnresolvedMessage(
"Cannot use field [" + fa.name() + "] type [" + unsupportedField.getOriginalType() + "] as is unsupported");
if (unsupportedField.hasInherited()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the message is the same in the branch, simply do the if on the prefix

String inheritedMessage = "";
if (unsupportedField.hasInherited()) {
   inheritedMessage = format("in hierarchy (field [{}"], unsupportedField.getInherited()));
}
named = u.withUnresolvedMessage(... + inheritedMessage);

Further more replace the string concatenation in the message with MessageFormat.format to trim it a bit.

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm unclear on why propagate unsupported is needed...

@@ -91,7 +91,8 @@ private static void walkMapping(String name, Object value, Map<String, EsField>
break;
case UNSUPPORTED:
String type = content.get("type").toString();
field = new UnsupportedEsField(name, type);
field = new UnsupportedEsField(name, type, null, properties);
propagateUnsupportedType(name, type, properties);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is propagated needed? The parents should always be discovered before children which means an unsupported parent will discovered before its children, which in turn, will already know their parent is unsupported and thus become unsupported as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@costin If I remember well, the method pair startWalking-walkMapping does the discovery the other way around: start with a mapping looking like parent=[child=[another_child=[type:keyword]], type:text], type:foobar]], get the parent, look at properties like "it's nested", or "it's object", or "has sub-fields" here, if yes, dig into those and so on and so forth.

Then after discovering all the children, create the parent itself here. So, by the time the algorithm finds out that a root field is unsupported, the children already have some types (their own).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach is used only in the case of some tests that use a .json file with mappings. I made this change because I wanted to also test the error message (here) that gets printed out from the Analyzer.

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@astefan astefan added the v7.6.1 label Jan 20, 2020
@astefan astefan merged commit 7adb286 into elastic:master Jan 20, 2020
@astefan astefan changed the title SQL: Treat flattened data type fields as UNSUPPORTED SQL: change the way unsupported data types fields are handled Jan 20, 2020
astefan added a commit to astefan/elasticsearch that referenced this pull request Jan 20, 2020
…c#50823)

The hierarchy of fields/sub-fields under a field that is of an
unsupported data type will be marked as unsupported as well. Until this
change, the behavior was to set the unsupported data type field's
hierarchy as empty.

Example, considering the following hierarchy of fields/sub-fields
a -> b -> c -> d, if b would be of type "foo", then b, c and d will
be marked as unsupported.

(cherry picked from commit 7adb286)
astefan added a commit that referenced this pull request Jan 20, 2020
#51220)

The hierarchy of fields/sub-fields under a field that is of an
unsupported data type will be marked as unsupported as well. Until this
change, the behavior was to set the unsupported data type field's
hierarchy as empty.

Example, considering the following hierarchy of fields/sub-fields
a -> b -> c -> d, if b would be of type "foo", then b, c and d will
be marked as unsupported.

(cherry picked from commit 7adb286)
astefan added a commit that referenced this pull request Jan 20, 2020
#51220)

The hierarchy of fields/sub-fields under a field that is of an
unsupported data type will be marked as unsupported as well. Until this
change, the behavior was to set the unsupported data type field's
hierarchy as empty.

Example, considering the following hierarchy of fields/sub-fields
a -> b -> c -> d, if b would be of type "foo", then b, c and d will
be marked as unsupported.

(cherry picked from commit 7adb286)
(cherry picked from commit df36169)
astefan added a commit to astefan/elasticsearch that referenced this pull request Jan 21, 2020
…c#50823) (elastic#51220)

The hierarchy of fields/sub-fields under a field that is of an
unsupported data type will be marked as unsupported as well. Until this
change, the behavior was to set the unsupported data type field's
hierarchy as empty.

Example, considering the following hierarchy of fields/sub-fields
a -> b -> c -> d, if b would be of type "foo", then b, c and d will
be marked as unsupported.

(cherry picked from commit 7adb286)
astefan added a commit that referenced this pull request Jan 21, 2020
The hierarchy of fields/sub-fields under a field that is of an
unsupported data type will be marked as unsupported as well. Until this
change, the behavior was to set the unsupported data type field's
hierarchy as empty.

Example, considering the following hierarchy of fields/sub-fields
a -> b -> c -> d, if b would be of type "foo", then b, c and d will
be marked as unsupported.

(cherry picked from commit 7adb286)
@astefan astefan deleted the ignore_flattened_fields branch January 21, 2020 14:55
@astefan astefan removed the v7.3.3 label Jan 21, 2020
astefan added a commit that referenced this pull request Jan 21, 2020
The hierarchy of fields/sub-fields under a field that is of an
unsupported data type will be marked as unsupported as well. Until this
change, the behavior was to set the unsupported data type field's
hierarchy as empty.

Example, considering the following hierarchy of fields/sub-fields
a -> b -> c -> d, if b would be of type "foo", then b, c and d will
be marked as unsupported.

(cherry picked from commit 7adb286)
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
…c#50823)

The hierarchy of fields/sub-fields under a field that is of an unsupported data type will be marked as unsupported as well. Until this change, the behavior was to set the unsupported data type field's hierarchy as empty.

Example, considering the following hierarchy of fields/subfields  a -> b -> c -> d, if b would be of type "foo", then b, c and d will be marked as unsupported.
@polyfractal polyfractal added v7.6.0 and removed v7.6.1 labels Feb 7, 2020
@ywelsch
Copy link
Contributor

ywelsch commented Feb 27, 2020

The backport PR seems to have been merged. I'm therefore removing the backport pending label here. Please shout if this is incorrect

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants