Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parsing for InternalScriptedMetric aggregation #24738

Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Expand Up @@ -412,7 +412,8 @@ public enum ValueType {
OBJECT_OR_STRING(START_OBJECT, VALUE_STRING),
OBJECT_ARRAY_BOOLEAN_OR_STRING(START_OBJECT, START_ARRAY, VALUE_BOOLEAN, VALUE_STRING),
OBJECT_ARRAY_OR_STRING(START_OBJECT, START_ARRAY, VALUE_STRING),
VALUE(VALUE_BOOLEAN, VALUE_NULL, VALUE_EMBEDDED_OBJECT, VALUE_NUMBER, VALUE_STRING);
VALUE(VALUE_BOOLEAN, VALUE_NULL, VALUE_EMBEDDED_OBJECT, VALUE_NUMBER, VALUE_STRING),
VALUE_OBJECT_ARRAY(VALUE_BOOLEAN, VALUE_NULL, VALUE_EMBEDDED_OBJECT, VALUE_NUMBER, VALUE_STRING, START_OBJECT, START_ARRAY);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given that we add this to ObjectParser we should add tests for it as part of the object parser tests too. Maybe even commit it upstream from a separate PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why this is necessary, those enums are merely combinations of token types and the parsing test added in this PR would fail if this wasn't working (indeed it did before I added this new constant includinf ARRAY_START and OBJECT_START). We also don't use any of the enums other than STRING and FLOAT in ObjectParserTests currently.
As for adding this as a separate PR upstream, that PR would only add an enum thats not used? I think we can add it together with merging the feature branch.


private final EnumSet<XContentParser.Token> tokens;

Expand Down
Expand Up @@ -124,7 +124,7 @@ public Object getProperty(List<String> path) {

@Override
public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
return builder.field("value", aggregation());
return builder.field(CommonFields.VALUE.getPreferredName(), aggregation());
}

@Override
Expand Down
@@ -0,0 +1,83 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

package org.elasticsearch.search.aggregations.metrics.scripted;

import org.elasticsearch.common.xcontent.ObjectParser;
import org.elasticsearch.common.xcontent.ObjectParser.ValueType;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.common.xcontent.XContentParser.Token;
import org.elasticsearch.common.xcontent.XContentParserUtils;
import org.elasticsearch.search.aggregations.ParsedAggregation;

import java.io.IOException;
import java.util.Collections;
import java.util.List;

public class ParsedScriptedMetric extends ParsedAggregation implements ScriptedMetric {
private List<Object> aggregation;

@Override
public String getType() {
return ScriptedMetricAggregationBuilder.NAME;
}

@Override
public Object aggregation() {
if (aggregation.size() != 1) {
throw new IllegalStateException("aggregation was not reduced");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this error up-to-date?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, at least in InternalScriptedMetric. I think in the parsed version we can ommit it.

Copy link
Member

@javanna javanna May 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea that was my point, it doesn't apply to the parsed version of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left an assert for tests and a comment though, otherwise this might look like a mistake later on.

}
return aggregation.get(0);
}

@Override
public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
return builder.field(CommonFields.VALUE.getPreferredName(), aggregation());
}

private static final ObjectParser<ParsedScriptedMetric, Void> PARSER = new ObjectParser<>(ParsedScriptedMetric.class.getSimpleName(), true,
ParsedScriptedMetric::new);

static {
declareAggregationFields(PARSER);
PARSER.declareField((agg, value) -> agg.aggregation = Collections.singletonList(value),
ParsedScriptedMetric::parseValue, CommonFields.VALUE, ValueType.VALUE_OBJECT_ARRAY);
}

private static Object parseValue(XContentParser parser) throws IOException {
Token token = parser.currentToken();
if (token == XContentParser.Token.VALUE_NULL) {
return null;
} else if (token.isValue()) {
return XContentParserUtils.parseStoredFieldsValue(parser);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that it feels weird to use this method from here, given its name, as it was really meant to parse stored fields values based on assumptions made around the possible types we have there. I wouldn't want changes made there to be reflected here in the future. Shall we copy what it does over here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thats what I had before. I agree that a little copy/paste is better than to introduce unneeded dependencies in this case. Glad to change back.

} else if (token == XContentParser.Token.START_OBJECT) {
return parser.map();
} else if (token == XContentParser.Token.START_ARRAY) {
return parser.list();
}
return null;
}

public static ParsedScriptedMetric fromXContent(XContentParser parser, final String name) {
ParsedScriptedMetric aggregation = PARSER.apply(parser, null);
aggregation.setName(name);
return aggregation;
}
}
Expand Up @@ -58,6 +58,7 @@
import org.elasticsearch.search.aggregations.metrics.percentiles.hdr.InternalHDRPercentilesTests;
import org.elasticsearch.search.aggregations.metrics.percentiles.tdigest.InternalTDigestPercentilesRanksTests;
import org.elasticsearch.search.aggregations.metrics.percentiles.tdigest.InternalTDigestPercentilesTests;
import org.elasticsearch.search.aggregations.metrics.scripted.InternalScriptedMetricTests;
import org.elasticsearch.search.aggregations.metrics.sum.InternalSumTests;
import org.elasticsearch.search.aggregations.metrics.valuecount.InternalValueCountTests;
import org.elasticsearch.search.aggregations.pipeline.InternalSimpleValueTests;
Expand Down Expand Up @@ -130,6 +131,7 @@ private static List<InternalAggregationTestCase> getAggsTests() {
aggsTests.add(new InternalAdjacencyMatrixTests());
aggsTests.add(new SignificantLongTermsTests());
aggsTests.add(new SignificantStringTermsTests());
aggsTests.add(new InternalScriptedMetricTests());
return Collections.unmodifiableList(aggsTests);
}

Expand Down
Expand Up @@ -19,7 +19,10 @@

package org.elasticsearch.search.aggregations.metrics.scripted;

import com.google.common.base.Supplier;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this what you meant to use?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, good catch. Seems to do the same thing as java.util.function.Supplier apparently though.


import org.elasticsearch.ElasticsearchException;
import org.elasticsearch.common.geo.GeoPoint;
import org.elasticsearch.common.io.stream.Writeable.Reader;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.env.Environment;
Expand All @@ -30,20 +33,49 @@
import org.elasticsearch.script.ScriptService;
import org.elasticsearch.script.ScriptSettings;
import org.elasticsearch.script.ScriptType;
import org.elasticsearch.search.aggregations.ParsedAggregation;
import org.elasticsearch.search.aggregations.pipeline.PipelineAggregator;
import org.elasticsearch.test.InternalAggregationTestCase;
import org.junit.Before;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class InternalScriptedMetricTests extends InternalAggregationTestCase<InternalScriptedMetric> {

private static final String REDUCE_SCRIPT_NAME = "reduceScript";
// randomized only once so that any random test instance has the same value
private boolean hasReduceScript = randomBoolean();
private boolean hasReduceScript;
private Supplier<Object>[] valueTypes;
private final Supplier<Object>[] leafValueSuppliers = new Supplier[] { () -> randomInt(), () -> randomLong(), () -> randomDouble(),
() -> randomFloat(), () -> randomBoolean(), () -> randomAlphaOfLength(5), () -> new GeoPoint(randomDouble(), randomDouble()),
() -> null };
private final Supplier<Object>[] nestedValueSuppliers = new Supplier[] { () -> new HashMap<String, Object>(),
() -> new ArrayList<>() };



Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remove some of these empty lines?

@Override
@Before
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need @before here, the parent method already has it.

public void setUp() throws Exception {
super.setUp();
hasReduceScript = randomBoolean();
// we want the same value types (also for nested lists, maps) for all random aggregations
int levels = randomIntBetween(1, 3);
valueTypes = new Supplier[levels];
for (int i = 0; i < levels; i++) {
if (i < levels - 1) {
valueTypes[i] = randomFrom(nestedValueSuppliers);
} else {
// the last one needs to be a leaf value, not map or list
valueTypes[i] = randomFrom(leafValueSuppliers);
}
}
}

@Override
protected InternalScriptedMetric createTestInstance(String name, List<PipelineAggregator> pipelineAggregators,
Expand All @@ -56,7 +88,26 @@ protected InternalScriptedMetric createTestInstance(String name, List<PipelineAg
if (hasReduceScript) {
reduceScript = new Script(ScriptType.INLINE, MockScriptEngine.NAME, REDUCE_SCRIPT_NAME, params);
}
return new InternalScriptedMetric(name, randomAlphaOfLength(5), reduceScript, pipelineAggregators, metaData);
Object randomValue = randomValue(valueTypes, 0);
return new InternalScriptedMetric(name, randomValue, reduceScript, pipelineAggregators, metaData);
}

@SuppressWarnings("unchecked")
private static Object randomValue(Supplier<Object>[] valueTypes, int level) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's worth having this in RandomObjects instead. But given that it is only used here I don't have a strong opinion on that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is also very specialized given that before dance that creates the suppliers, maybe wise to leave it here for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Supplier<Object> sup = valueTypes[level];
Object value = sup.get();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can save a line and go Object value = valueTypes[level].get(); directly

if (value instanceof Map) {
int elements = randomIntBetween(1,5);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space after comma

for (int i = 0; i < elements; i++) {
((Map<String, Object>) value).put(randomAlphaOfLength(5), randomValue(valueTypes, level + 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you cast once before the for loop?

}
} else if (value instanceof List) {
int elements = randomIntBetween(1,5);
for (int i = 0; i < elements; i++) {
((List<Object>) value).add(randomValue(valueTypes, level + 1));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, cast before then use in loop?

}
}
return value;
}

/**
Expand Down Expand Up @@ -105,4 +156,52 @@ protected Reader<InternalScriptedMetric> instanceReader() {
return InternalScriptedMetric::new;
}

@Override
protected void assertFromXContent(InternalScriptedMetric aggregation, ParsedAggregation parsedAggregation) {
assertTrue(parsedAggregation instanceof ParsedScriptedMetric);
ParsedScriptedMetric parsed = (ParsedScriptedMetric) parsedAggregation;

assertValues(aggregation.aggregation(), parsed.aggregation());
}

private void assertValues(Object expected, Object actual) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be static?

if (expected instanceof Long) {
// longs that fit into the integer range are parsed back as integer
if (actual instanceof Integer) {
assertEquals(((Long) expected).intValue(), actual);
} else {
assertEquals(expected, actual);
}
} else if (expected instanceof Float) {
// based on the xContent type, floats are sometimes parsed back as doubles
if (actual instanceof Double) {
assertEquals(expected, ((Double) actual).floatValue());
} else {
assertEquals(expected, actual);
}
} else if (expected instanceof GeoPoint) {
assertTrue(actual instanceof Map);
GeoPoint point = (GeoPoint) expected;
Map<String, Object> pointMap = (Map<String, Object>) actual;
assertEquals(point.getLat(), pointMap.get("lat"));
assertEquals(point.getLon(), pointMap.get("lon"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here we have a difference between transport client and our rest client. Would it make sense to add an extra post-parsing treatment for maps that in case of a map of 2 keys lat and lon converts it back to a GeoPoint?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know, this won't be the only case but one of many. I don't think we should be forced to treat each of those as a special case in the rest client, the documentation says we don't support them anyway. Maybe @colings86 has an opinion about this though...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should draw a line here. Users need to understand that we are going through REST and some things have to change as the protocol is different. I am not sure that patching this to make things work like with the transport client will work, there will be edge cases that we don't handle anyways. After all it should be easy for people to migrate, and a few users may be hit by this right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, really all we can do is provide a best effort for compatibility between the two and there will inevitably be differences. I think this is a case where the extra effort to make the two the same is not worth it

Copy link
Member

@tlrx tlrx May 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good for me, I didn't have a strong feeling about it.

} else if (expected instanceof Map) {
Map<String, Object> expectedMap = (Map<String, Object>) expected;
Map<String, Object> actualMap = (Map<String, Object>) actual;
assertEquals(expectedMap.size(), actualMap.size());
for (String key : expectedMap.keySet()) {
assertValues(expectedMap.get(key), actualMap.get(key));
}
} else if (expected instanceof List) {
List<Object> expectedList = (List<Object>) expected;
List<Object> actualList = (List<Object>) actual;
assertEquals(expectedList.size(), actualList.size());
Iterator<Object> actualIterator = actualList.iterator();
for (Object element : expectedList) {
assertValues(element, actualIterator.next());
}
} else {
assertEquals(expected, actual);
}
}
}
Expand Up @@ -96,6 +96,8 @@
import org.elasticsearch.search.aggregations.metrics.percentiles.tdigest.InternalTDigestPercentiles;
import org.elasticsearch.search.aggregations.metrics.percentiles.tdigest.ParsedTDigestPercentileRanks;
import org.elasticsearch.search.aggregations.metrics.percentiles.tdigest.ParsedTDigestPercentiles;
import org.elasticsearch.search.aggregations.metrics.scripted.ParsedScriptedMetric;
import org.elasticsearch.search.aggregations.metrics.scripted.ScriptedMetricAggregationBuilder;
import org.elasticsearch.search.aggregations.metrics.stats.ParsedStats;
import org.elasticsearch.search.aggregations.metrics.stats.StatsAggregationBuilder;
import org.elasticsearch.search.aggregations.metrics.stats.extended.ExtendedStatsAggregationBuilder;
Expand Down Expand Up @@ -182,6 +184,7 @@ public abstract class InternalAggregationTestCase<T extends InternalAggregation>
map.put(AdjacencyMatrixAggregationBuilder.NAME, (p, c) -> ParsedAdjacencyMatrix.fromXContent(p, (String) c));
map.put(SignificantLongTerms.NAME, (p, c) -> ParsedSignificantLongTerms.fromXContent(p, (String) c));
map.put(SignificantStringTerms.NAME, (p, c) -> ParsedSignificantStringTerms.fromXContent(p, (String) c));
map.put(ScriptedMetricAggregationBuilder.NAME, (p, c) -> ParsedScriptedMetric.fromXContent(p, (String) c));

namedXContents = map.entrySet().stream()
.map(entry -> new NamedXContentRegistry.Entry(Aggregation.class, new ParseField(entry.getKey()), entry.getValue()))
Expand Down