Support metrics counter types in ESQL #107877
Hi @dnhatn, I've created a changelog YAML for you.
Pinging @elastic/es-analytical-engine (Team:Analytics)
Pinging @elastic/es-storage-engine (Team:StorageEngine)
Looks good in general, but there are edge cases that might need some refining:
- stats min(tx) -> argument of [min(tx)] must be [datetime or numeric except unsigned_long], found value [min(tx)] type [counter_long]. It's good we have a verification exception, but its message is not entirely true.
- tests are needed in VerifierTests for error messages when using these types in various places in queries
- it seems I cannot aggregate on these types, but I can group by them (i.e. min(time) by counter). Is this expected? count(counter) seems to also work...
x-pack/plugin/src/yamlRestTest/resources/rest-api-spec/test/esql/40_tsdb.yml
Left some questions. Probably good to go but I'll recheck after you answer.
.../src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/ToDouble.java
...ck/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/LocalExecutionPlanner.java
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/type/EsqlDataTypes.java
@@ -212,7 +219,8 @@ public static boolean isRepresentable(DataType t) {
             && t != FLOAT
             && t != SCALED_FLOAT
             && t != SOURCE
-            && t != HALF_FLOAT;
+            && t != HALF_FLOAT
+            && isCounterType(t) == false;
I think counters are representable though. They can be loaded into blocks now.
I disabled this mainly to avoid the map in AbstractFunctionTestCase #107877 (comment). Would you be okay if I enable this in a follow-up?
👍
Delaying is fine.
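For readers following the thread above, here is a minimal, self-contained sketch of the check being discussed. It is not the actual EsqlDataTypes code: the class RepresentableSketch and its nested Type enum are hypothetical stand-ins, and only the exclusions visible in the diff hunk are modeled (the real method may exclude more types).

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch; the names below are illustrative and not taken from the Elasticsearch code base.
final class RepresentableSketch {

    // A small, assumed subset of the data types mentioned in the diff.
    enum Type { LONG, INTEGER, DOUBLE, FLOAT, SCALED_FLOAT, SOURCE, HALF_FLOAT, COUNTER_LONG, COUNTER_INTEGER, COUNTER_DOUBLE }

    private static final Set<Type> COUNTERS = EnumSet.of(Type.COUNTER_LONG, Type.COUNTER_INTEGER, Type.COUNTER_DOUBLE);

    // Only the exclusions that are visible in the hunk above.
    private static final Set<Type> EXCLUDED = EnumSet.of(Type.FLOAT, Type.SCALED_FLOAT, Type.SOURCE, Type.HALF_FLOAT);

    static boolean isCounterType(Type t) {
        return COUNTERS.contains(t);
    }

    // Mirrors the shape of the changed condition: counters are treated as not representable for now.
    static boolean isRepresentable(Type t) {
        return EXCLUDED.contains(t) == false && isCounterType(t) == false;
    }

    public static void main(String[] args) {
        for (Type t : Type.values()) {
            System.out.println(t + " representable: " + isRepresentable(t));
        }
    }
}
```

If counters were enabled in the follow-up mentioned above, the only change in this sketch would be dropping the isCounterType clause from isRepresentable.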
...ck/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/action/EsqlQueryResponseTests.java
DataTypes.NULL
),
"boolean or counter_double or datetime or numeric or string"
),
I've grown to hate this Map. I made it, but I dislike it....
You've done the right thing to it, but I'm sorry you had to.
There's another version of errorsForCasesWithoutExamples which takes another parameter that outputs a matcher for the expected type. You could use that rather than changing the map.
x-pack/plugin/src/yamlRestTest/resources/rest-api-spec/test/esql/40_tsdb.yml
Good question. Some usages, such as count(counter) or to_string(counter), seem to make sense. Should we support them all, or still require casting in these cases?
I would be cautious and start with requiring casting everywhere; there may be edge cases that we couldn't think of atm, and it may be difficult to remove the support once it's merged. If we get demand for this, we'll analyze these requests and adapt at that time.
Thanks @astefan. We are on the same page.
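The "require casting everywhere" position agreed on above could be sketched roughly as follows. This is an assumption-laden illustration, not the ES|QL Verifier: the class CounterVerifierSketch, the method checkArgument, and the wording of the error message are all hypothetical.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of a "cast before use" rule; not the actual ES|QL Verifier.
final class CounterVerifierSketch {

    // The three counter type names discussed in this PR.
    private static final Set<String> COUNTER_TYPES = Set.of("counter_long", "counter_integer", "counter_double");

    // Returns an error message (wording invented for this sketch) when a function
    // is applied directly to a counter-typed argument, or empty when the argument is fine.
    static Optional<String> checkArgument(String function, String argName, String argType) {
        if (COUNTER_TYPES.contains(argType)) {
            return Optional.of(
                "argument of [" + function + "(" + argName + ")] has type [" + argType + "]; cast it to its root type first"
            );
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(checkArgument("min", "tx", "counter_long"));
        System.out.println(checkArgument("min", "tx", "long"));
    }
}
```

Rejecting every direct use, including count(counter) and to_string(counter), keeps the rule simple to state; individual functions can be allowed later without breaking existing queries.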
This looks good, Nhat. Aside from some superficial nits, my main comment is about adding CSV tests with counters in various places, such as:
- filtering (WHERE counter > 10)
- expressions both named and anonymous + scalar functions (EVAL X = counter % 3 > 2 | WHERE counter_long > counter_double)
- using it in agg functions MAX(counter), MIN(counter % 2)
Thanks!
...c/main/java/org/elasticsearch/xpack/esql/expression/function/aggregate/NumericAggregate.java
.../src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/convert/ToDouble.java
@@ -344,7 +344,8 @@ private PhysicalOperation planTopN(TopNExec topNExec, LocalExecutionPlannerConte
                 case "version" -> TopNEncoder.VERSION;
                 case "boolean", "null", "byte", "short", "integer", "long", "double", "float", "half_float", "datetime", "date_period",
                     "time_duration", "object", "nested", "scaled_float", "unsigned_long", "_doc" -> TopNEncoder.DEFAULT_SORTABLE;
-                case "geo_point", "cartesian_point", "geo_shape", "cartesian_shape" -> TopNEncoder.DEFAULT_UNSORTABLE;
+                case "geo_point", "cartesian_point", "geo_shape", "cartesian_shape", "counter_long", "counter_integer", "counter_double" ->
Separate from this PR but it makes sense to centralize these strings in one class to avoid bugs due to typos.
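One possible shape for that centralization, as a hedged sketch only: the class TypeNamesSketch, its constants, and the Encoder enum are hypothetical, and the sketch assumes the counter types share the geo types' unsortable encoder, which the truncated diff line suggests but does not show.

```java
// Hypothetical sketch: type-name strings declared once and reused in the switch.
final class TypeNamesSketch {

    // Compile-time constants, so they are legal as case labels.
    static final String LONG = "long";
    static final String DOUBLE = "double";
    static final String GEO_POINT = "geo_point";
    static final String COUNTER_LONG = "counter_long";
    static final String COUNTER_INTEGER = "counter_integer";
    static final String COUNTER_DOUBLE = "counter_double";

    // Stand-in for the TopNEncoder constants referenced in the diff.
    enum Encoder { DEFAULT_SORTABLE, DEFAULT_UNSORTABLE }

    static Encoder encoderFor(String typeName) {
        return switch (typeName) {
            case LONG, DOUBLE -> Encoder.DEFAULT_SORTABLE;
            // Assumption: counters use the same encoder as the geo types.
            case GEO_POINT, COUNTER_LONG, COUNTER_INTEGER, COUNTER_DOUBLE -> Encoder.DEFAULT_UNSORTABLE;
            default -> throw new IllegalArgumentException("unsupported type [" + typeName + "]");
        };
    }

    public static void main(String[] args) {
        System.out.println(encoderFor(COUNTER_LONG));
        System.out.println(encoderFor(LONG));
    }
}
```

With constants, a misspelled name such as COUNTER_LNOG fails at compile time, whereas a misspelled string literal in a case label would only surface at runtime as an unmatched type.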
Apart from Costin's comments (especially the additional tests), LGTM.
I think all of those should fail unless you wrap them in
Thank you all for your reviews. I'll have to delay the CSV tests because I'm going to build a CSV dataset for the tsdb suite. Also, the CSV test infra doesn't support error checking yet. The YAML tests should have good coverage.
We should widen the numeric root types before converting them into counter types. Relates #107877
This commit adds support for numeric metrics counter fields in ES|QL. These counter types, including counter_long, counter_integer, and counter_double, are different from their parent types. Users will have limited interaction with these counter types, restricted to:
to_long(a_long_counter))
These restrictions are intentional to prevent misuse. If users want to use them as numeric values, explicit casting to their root types is required.
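To illustrate the relationship between a counter type and its root type, here is a small sketch. It assumes the root type name is what remains after dropping the counter_ prefix, which matches the to_long(a_long_counter) example but is not confirmed by the description for the other types; CounterRootTypeSketch and its methods are hypothetical helpers.

```java
// Hypothetical sketch: derive a counter's root type and the matching explicit cast.
final class CounterRootTypeSketch {

    private static final String PREFIX = "counter_";

    static boolean isCounterType(String typeName) {
        return typeName.startsWith(PREFIX);
    }

    // Assumption: counter_long -> long, counter_integer -> integer, counter_double -> double.
    static String rootType(String typeName) {
        return isCounterType(typeName) ? typeName.substring(PREFIX.length()) : typeName;
    }

    // Builds the ES|QL expression a user would write to treat the counter as a plain number.
    static String explicitCast(String field, String fieldType) {
        if (isCounterType(fieldType) == false) {
            return field; // non-counter fields need no cast
        }
        return "to_" + rootType(fieldType) + "(" + field + ")";
    }

    public static void main(String[] args) {
        System.out.println(explicitCast("a_long_counter", "counter_long"));
    }
}
```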