Add variable field support in aggregations #18
Support $(var:field:<type>) syntax alongside existing $(var:list). No behavioral changes yet — field_type is parsed but unused. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Field_var carries a Tjson.var and a simple_type parsed from the
field type annotation. typeof_value returns Simple typ directly,
bypassing the mapping lookup.
Also adds parse_field_type helper to convert user-facing type
strings ("string", "int", "float", "int64") to simple_type.
Note: the value type definition was moved after pp_simple_type
because [@@deriving show] on Field_var needs the printer in scope.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
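The commit above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `simple_type`, `var`, `value`, `result_type`, and the `lookup` parameter are simplified, hypothetical stand-ins for the real definitions, which are not shown in this PR page.

```ocaml
(* Hypothetical, simplified stand-ins for the project's real types. *)
type simple_type = Bool | Int | Int64 | Double | String | Json

type var = { optional : bool; list : bool; name : string }

type value =
  | Field of string                 (* static field: type comes from the mapping *)
  | Field_var of var * simple_type  (* dynamic field: type comes from the annotation *)

type result_type = Simple of simple_type

(* [lookup] stands in for the mapping lookup; its signature is an assumption. *)
let typeof_value ~(lookup : string -> simple_type) = function
  | Field f -> Simple (lookup f)
  | Field_var (_v, typ) -> Simple typ (* annotation used directly, no lookup *)
```

The point of the design is that a `Field_var` never consults the mapping, since the field name is only known at runtime.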
- Parse $(var:field:<type>) in aggregation field position
- Register field variables as string input parameters
- Skip numeric/date field validation for dynamic fields
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests $(group_by:field:string) syntax: verifies string input parameter and string bucket key output type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add field variable section under Variables with usage example and supported type annotations.
Use descriptive names like _typ, _name, _metric, _keys etc. instead of bare _ in pattern matches added by this PR.
let parse_field_type = function
  | "string" -> String
  | "int" -> Int
  | "float" -> Double
  | "int64" -> Int64
  | s -> fail "unsupported field type annotation %S, expected: string, int, float, int64" s
why not simple_of_es_type above
match String.nsplit s ":" with
| [name] -> name, false, None
| [name; "list"] -> name, true, None
| [name; "field"; typ] -> name, false, Some typ
i think better syntax is just name:type? maybe it will be useful in more contexts than dynamic field and overall there is no semantic importance to it being field, it is currently a field type only by the virtue of place where it is used
    | Bool | Json as t -> eprintfn "W: field %S expected to be numeric, but has type %s" f (show_simple_type t)
  end
| Field_num (Script _) -> ()
| Field_num (Field_var (_v, _typ)) -> () (* cannot validate dynamic field type *)
but we know the type and can validate it same as Field f above?
let field_var_cstrs = match agg with
  | Dynamic _dv -> []
  | Static agg ->
    let vs = match agg with
      | Simple_metric (_metric, v) -> [v]
      | Value_count v | Cardinality v
      | Histogram v | Range v -> [v]
      | Range_keyed (v, _keys) -> [v]
      | Terms { term; _ } | Significant_terms { term; _ } | Significant_text { term; _ } -> [term]
      | Date_histogram { on; _ } | Date_range { on; _ } -> [on]
      | Multi_terms { terms; _ } -> terms
      | Weighted_avg { value; weight } -> [value.value; weight.value]
      | Filter _q -> []
      | Filters _fs -> []
      | Filters_dynamic _dv -> []
      | Top_hits _th -> []
      | Nested _path -> []
      | Reverse_nested _rpath -> []
      | Bucket_sort _bs -> []
      | Cumulative_sum _cp -> []
    in
    List.concat_map field_var_constraints vs
in
this can be done in the above match where all the constraints are computed already
Type annotations now use ES type names (keyword, long, double) instead of custom names (string, int, float). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The :field: segment was redundant — the type annotation is useful regardless of context, and the usage site already determines semantics. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
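A rough sketch of what the simplified `name:type` parsing could look like after this change. It is an illustration only: `parse_var_spec` is a hypothetical name, it uses the standard library's `String.split_on_char` rather than the `String.nsplit` call from the diff, and it ignores the optional `?` suffix.

```ocaml
(* Sketch only: returns (name, is_list, type annotation).
   [parse_var_spec] is a hypothetical name, not the project's function. *)
let parse_var_spec (s : string) : string * bool * string option =
  match String.split_on_char ':' s with
  | [name] -> name, false, None
  | [name; "list"] -> name, true, None      (* existing list syntax *)
  | [name; typ] -> name, false, Some typ    (* any other suffix is a type annotation *)
  | _ -> failwith (Printf.sprintf "bad variable spec %S" s)
```

Because `list` is matched first, it stays reserved and cannot be used as a type annotation in this sketch.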
The annotated type is known at codegen time, so we can warn when a non-numeric type is used in a numeric context — same as Field. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
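The check described here might look something like the sketch below. The type and both functions are hypothetical stand-ins; in particular, which types count as numeric is an assumption, and the function returns the warning text instead of printing it so the behavior is easy to inspect.

```ocaml
(* Hypothetical stand-in for the project's simple_type. *)
type simple_type = Bool | Int | Int64 | Double | String | Json

let show_simple_type = function
  | Bool -> "Bool" | Int -> "Int" | Int64 -> "Int64"
  | Double -> "Double" | String -> "String" | Json -> "Json"

(* Returns a warning message when the annotated type is not numeric.
   Assumption: Int, Int64 and Double are the numeric types. *)
let numeric_warning ~name typ =
  match typ with
  | Int | Int64 | Double -> None
  | Bool | String | Json as t ->
    Some (Printf.sprintf
            "W: field variable %S expected to be numeric, but has type %s"
            name (show_simple_type t))
```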
Eliminates redundant second match on aggregation type. Each arm now handles its own field_var_constraints inline, matching the existing pattern for on_int_var and Field_num/Field_date constraints. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
rr0gi left a comment
please make fixes and merge
let dummy_expunge = Dict ["dummy_expunge", Simple Json]

let field_var_constraints = function
  | Field_var (v, _typ) -> [On_var (v, Eq_type String)]
i think this warrants a comment that String here is for the variable value itself, not for the aggregation result, to avoid confusion
| The type annotation determines the output bucket key type. The variable itself becomes a `string` input parameter — the caller passes the ES field name at runtime (e.g., `"keyword_en"`). | ||
| Supported type annotations: any ES type accepted by `simple_of_es_type` (`keyword`, `text`, `long`, `double`, `float`, `boolean`, `date`, `int64`, `ip`, `murmur3`). |
this is external documentation, they don't know what is simple_of_es_type.
| Supported type annotations: any ES type accepted by `simple_of_es_type` (`keyword`, `text`, `long`, `double`, `float`, `boolean`, `date`, `int64`, `ip`, `murmur3`). | |
| Type can be any ES type - `keyword`, `text`, `long`, `double`, `float`, `boolean`, `date`, `int64`, `ip`, `murmur3`. |
let debug_dump = false

type var = { optional : bool; list : bool; name : string } [@@deriving show]
type var = { optional : bool; list : bool; name : string; field_type : string option } [@@deriving show]
| type var = { optional : bool; list : bool; name : string; field_type : string option } [@@deriving show] | |
| type var = { optional : bool; list : bool; name : string; type_ : string option } [@@deriving show] |
but at this point will be cleaner to get rid of list bool and rename type_ to hint and introduce Common.var parsed from generic Tjson.var, not in this PR
let opt_suffix = if optional then "?" else "" in
let name_s = name ^ opt_suffix in
match field_type with
| Some ft -> sprintf "$(%s:%s)" name_s ft
natural question is what if it is both a list and has a type; guess we can assert for now to be clear
String is the type of the variable value (field name passed at runtime), not the aggregation result type. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
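With the requested comment in place, the helper might read roughly as below. The surrounding types are simplified, hypothetical stand-ins so the snippet compiles on its own, and the empty list for non-variable fields is an assumption, since the diff only shows the `Field_var` arm.

```ocaml
(* Hypothetical, simplified stand-ins for the project's real types. *)
type simple_type = Bool | Int | Int64 | Double | String | Json
type var = { optional : bool; list : bool; name : string }
type value = Field of string | Field_var of var * simple_type
type var_type = Eq_type of simple_type
type constr = On_var of var * var_type

let field_var_constraints = function
  (* String is the type of the variable's own value (the ES field name
     passed at runtime), not the type of the aggregation result. *)
  | Field_var (v, _typ) -> [On_var (v, Eq_type String)]
  | Field _ -> [] (* assumption: static fields add no variable constraints *)
```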
External users don't know what simple_of_es_type is. List supported types directly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The field is no longer field-specific after the syntax simplification. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The parser branches already prevent this, but the explicit check makes the invariant visible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
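One way such an explicit check could be written is sketched below. The record uses the `type_` field name from the review suggestion, `show_var` is a hypothetical name, and the `None` branches are assumptions because the diff only shows the typed branch.

```ocaml
type var = { optional : bool; list : bool; name : string; type_ : string option }

(* Sketch: render a variable back to its $(...) form, rejecting the
   combination the parser is never supposed to produce. *)
let show_var { optional; list; name; type_ } =
  let name_s = name ^ (if optional then "?" else "") in
  match type_, list with
  | Some _, true -> invalid_arg "variable cannot be both a list and typed"
  | Some t, false -> Printf.sprintf "$(%s:%s)" name_s t
  | None, true -> Printf.sprintf "$(%s:list)" name_s   (* assumed form *)
  | None, false -> Printf.sprintf "$(%s)" name_s       (* assumed form *)
```

Matching on the pair makes the forbidden combination a visible case rather than an implicit property of the parser.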
Tests $(bucket_field:long) in a histogram aggregation, exercising the Field_num constraint validation path and float bucket key output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Author: Vitalii Shapoval <vitalii_shapoval@ukr.net> Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Addresses #3
- `$(var:field:<type>)` syntax for parameterizing the `"field"` value in aggregations (e.g., `$(group_by:field:string)`)
- The variable becomes a `string` input parameter (the ES field name passed at runtime)
- The type annotation (`string`, `int`, `float`, `int64`) determines the output bucket key type
Changes
- Support `$(var:field:<type>)` alongside existing `$(var:list)`
- Add `Field_var` constructor to `value` type, `parse_field_type` helper, and `typeof_value` handling
- Pass `Field_var` through aggregation analysis and constraint resolution
- Skip `Field_num (Field_var _)` in constraint validation
Test Plan
- `field_var_agg` with terms aggregation using `$(group_by:field:string)`