Support square bracket array syntax in flexible ingest pipelines #133790

jbaiera · 2025-08-29T08:03:17Z

Adds support for array indexing in the new flexible ingest document access pattern.

The classic way to index into an array field in an ingest node is to use an integer field name. This is resolved in a context-sensitive manner. If the field path finds itself at an array when handling a field name, it will try to parse the name as an integer and use it to index into the array. If the path finds itself at an object when handling an integer field name, it will treat the integer as a regular field name:

| get `a.b.c.1` | Resolves to... |
| --- | --- |
| `{a: {b: {c: [0, 1, 2] } } }` | `1` |
| `{a: {b: {c: {0: foo, 1: bar, 2: baz} } } }` | `bar` |
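
The context-sensitive lookup described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the ingest node implementation: `classic_get` is a hypothetical name, and documents are modeled as plain dicts and lists with string keys.

```python
def classic_get(doc, path):
    """Resolve a dotted path the classic way (illustrative sketch only).

    A path element is used as an array index when the current value is a
    list, and as a plain field name when the current value is a dict.
    """
    current = doc
    for element in path.split("."):
        if isinstance(current, list):
            # At an array: the element must parse as an integer index.
            current = current[int(element)]
        elif isinstance(current, dict):
            # At an object: the element is always a field name, even "1".
            current = current[element]
        else:
            raise ValueError(f"cannot resolve [{element}] against a leaf value")
    return current


print(classic_get({"a": {"b": {"c": [0, 1, 2]}}}, "a.b.c.1"))  # 1
print(classic_get({"a": {"b": {"c": {"0": "foo", "1": "bar", "2": "baz"}}}}, "a.b.c.1"))  # bar
```

The same path string yields an array element in one document and a map entry in the other, which is the ambiguity the new syntax is meant to remove.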

Confusingly, if you try to write a value to an array index, and that array does not exist, it will create a path of objects and use the index as the field name for the value:

| set `a.b.c.1` to `foo` | Results in... |
| --- | --- |
| `{a: {b: {c: [0, 1, 2] } } }` | `{a: {b: {c: [0, foo, 2] } } }` |
| `{}` | `{a: {b: {c: {1: foo} } } }` |
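
A rough Python model of the classic write behavior shows why this happens. The function name `classic_set` is hypothetical, and the sketch assumes that any missing intermediate value is created as an object, which is what the second row of the table suggests.

```python
def classic_set(doc, path, value):
    """Set a value using classic dotted-path rules (illustrative sketch only)."""
    elements = path.split(".")
    current = doc
    for element in elements[:-1]:
        if isinstance(current, list):
            current = current[int(element)]
        else:
            # Assumption: a missing intermediate value becomes an object,
            # never an array, because nothing in the path says otherwise.
            current = current.setdefault(element, {})
    last = elements[-1]
    if isinstance(current, list):
        current[int(last)] = value  # existing array: "1" acts as an index
    else:
        current[last] = value  # object: "1" is just a field name
    return doc


print(classic_set({"a": {"b": {"c": [0, 1, 2]}}}, "a.b.c.1", "foo"))
print(classic_set({}, "a.b.c.1", "foo"))
```

Because the path carries no information about whether `1` was meant as an index, the empty document ends up with a field named `1` instead of an array.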

Since the flexible access pattern is a new way to denote field access in ingest documents, we will add support for new array indexing syntax which is closer to how most programming languages surface the concept. This will allow for explicit control over whether a field path element is meant to be a field name or an array index, and will hopefully cut down on set operations that inconsistently create fields when array accesses were intended.

The new syntax follows some simple rules:

Arrays are indexed by using square brackets to denote the position to use.

a[0]
a.b.c[0]
a[1].b[5].c[0]

Square brackets may be repeated to index higher dimensionality arrays.

a[0][2]
a[2].b[0][1].c[0][2][0]

Using numbers after a dot will always be treated as a field name.

a.0 ==> field a then field 0
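
As a hedged illustration of these three rules, the following Python tokenizer splits a path into field and index tokens. It is a sketch, not the `FieldPath` parser from this PR: `parse_path` is a hypothetical name, and the sketch is deliberately lenient about malformed input such as doubled dots.

```python
import re

# Either a bracketed non-negative integer, or a run of characters that
# contains no dot and no square bracket (a single field name).
_TOKEN = re.compile(r"\[(\d+)\]|([^.\[\]]+)")


def parse_path(path):
    """Split a path into ("field", name) and ("index", n) tokens."""
    tokens = []
    pos = 0
    while pos < len(path):
        if path[pos] == ".":
            pos += 1  # dots only separate elements
            continue
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos} in [{path}]")
        if match.group(1) is not None:
            tokens.append(("index", int(match.group(1))))
        else:
            # Numbers outside brackets stay strings: they are field names.
            tokens.append(("field", match.group(2)))
        pos = match.end()
    return tokens


print(parse_path("a[1].b[5].c[0]"))
print(parse_path("a[0][2]"))
print(parse_path("a.0"))
```

Note that `a.0` produces two field tokens, while `a[0]` produces a field token followed by an index token, so the intent is explicit in the path itself.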

When retrieving a field, square brackets interrupt the ability to chain dotted fields together. This is because dotted field names can only be accessed on map data. Array indices can only be applied to array data.

| Field Path | Explanation of traversal |
| --- | --- |
| `a[0]` | field `a` then array index `0` |
| `a[0].b` | field `a` then array index `0` then field `b` |
| `a.b.c[2].d.e.f[1].g` | field `a.b.c` then array index `2` then field `d.e.f` then array index `1` then field `g` |
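
The grouping shown in the table can be reproduced with a small Python sketch that keeps each run of dotted names together and breaks only at square brackets. The helper names `split_segments` and `explain` are hypothetical, and the sketch silently ignores stray brackets rather than validating the path.

```python
import re

# A bracketed index, or a run of characters up to the next bracket.
_SEGMENT = re.compile(r"\[(\d+)\]|([^\[\]]+)")


def split_segments(path):
    """Group a path into dotted-field runs separated by array indices."""
    segments = []
    for index, fields in _SEGMENT.findall(path):
        if index:
            segments.append(("index", int(index)))
        else:
            # Drop the dot that joins a field run to the preceding bracket.
            segments.append(("field", fields.strip(".")))
    return segments


def explain(path):
    """Render the traversal in the same wording as the table."""
    parts = []
    for kind, value in split_segments(path):
        label = "array index" if kind == "index" else "field"
        parts.append(f"{label} `{value}`")
    return " then ".join(parts)


print(explain("a.b.c[2].d.e.f[1].g"))
```

Keeping `a.b.c` as one segment reflects the point made above: a dotted run is resolved against map data as a unit, and only a bracket forces a switch to array data.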

Setting a value on a document with a field path that includes array indices will require the arrays to exist on the document.

| Document | Field Path | Result |
| --- | --- | --- |
| `{a: [ 0, 1, 2 ] }` | `set a[1] = 5` | `{a: [ 0, 5, 2 ] }` |
| `{a: [ {b:foo} ]}` | `set a[0].b = bar` | `{a:[ {b:bar} ]}` |
| `{a: [] }` | `set a[0] = bar` | Exception `index [0] out of bounds for array with size [0]` |
| `{a: {} }` | `set a[0] = bar` | Exception `could not resolve array index [0] against field type [Map]` |
| `{}` | `set a[0] = bar` | Exception `could not resolve field [a]` |
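
The strict behavior in this table can be modeled roughly as follows. This Python sketch is not the PR's implementation: `flexible_set` and `PathError` are hypothetical names, the error messages are copied from the table, and the sketch assumes that every intermediate element must already exist, which may be stricter than the real behavior for plain dotted paths.

```python
import re

# A bracketed index like [3], or a single field name.
_TOKEN = re.compile(r"\[(\d+)\]|([^.\[\]]+)")


class PathError(Exception):
    """Hypothetical error type; messages mirror the table above."""


def _resolve(current, kind, key):
    """Step into one existing path element or raise a PathError."""
    if kind == "index":
        if not isinstance(current, list):
            type_name = "Map" if isinstance(current, dict) else type(current).__name__
            raise PathError(f"could not resolve array index [{key}] against field type [{type_name}]")
        if key >= len(current):
            raise PathError(f"index [{key}] out of bounds for array with size [{len(current)}]")
        return current[key]
    if not isinstance(current, dict) or key not in current:
        raise PathError(f"could not resolve field [{key}]")
    return current[key]


def flexible_set(doc, path, value):
    """Set a value; arrays named in the path are never created implicitly."""
    tokens = [("index", int(i)) if i else ("field", f) for i, f in _TOKEN.findall(path)]
    current = doc
    for kind, key in tokens[:-1]:
        current = _resolve(current, kind, key)
    kind, key = tokens[-1]
    if kind == "index":
        _resolve(current, kind, key)  # validates the type and the bounds
        current[key] = value
    elif isinstance(current, dict):
        current[key] = value  # assumption: a final field may be created
    else:
        raise PathError(f"could not resolve field [{key}]")
    return doc


print(flexible_set({"a": [0, 1, 2]}, "a[1]", 5))
print(flexible_set({"a": [{"b": "foo"}]}, "a[0].b", "bar"))
```

Failing fast on a missing or mistyped array is what prevents the accidental creation of a field named `0` that the classic syntax allows.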

Append operations continue to work as expected:

| Document | Field Path | Result |
| --- | --- | --- |
| `{a: [ 0, 1, 2 ] }` | `a append 5` | `{a: [ 0, 1, 2, 5 ] }` |
| `{a: [ {b:foo} ]}` | `a append {c = bar}` | `{a:[ {b:foo}, {c:bar} ]}` |
| `{a: [] }` | `a append bar` | `{a: [bar] }` |
| `{a: {} }` | `a append bar` | `{a: [{}, bar] }` |
| `{}` | `a append bar` | `{a: [bar] }` |
| `{a: [0, 1] }` | `a[0] append bar` | `{a: [ [0, bar], 1] }` |
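
The append rows follow one pattern: an existing list grows, any other existing value is wrapped into a list, and a missing field becomes a one-element list. The Python sketch below models only that final step. `append_at` is a hypothetical helper that takes the already-resolved parent container, so path resolution is left out.

```python
def append_at(container, key, value):
    """Append value at container[key] (illustrative sketch only).

    container is a dict (key is a field name) or a list (key is an index).
    """
    if isinstance(container, dict) and key not in container:
        # Missing field: start a new single-element list.
        container[key] = [value]
        return
    existing = container[key]
    if isinstance(existing, list):
        existing.append(value)
    else:
        # A non-list value is wrapped so that nothing is overwritten.
        container[key] = [existing, value]


doc = {"a": [0, 1]}
append_at(doc["a"], 0, "bar")  # models `a[0] append bar`
print(doc)  # {'a': [['bar' appended to 0], 1]} i.e. {'a': [[0, 'bar'], 1]}
```

With the bracket syntax the parent of `a[0]` is the array itself, so the element at index `0` is wrapped in place rather than a new field being added.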

elasticsearchmachine · 2025-08-29T08:03:46Z

Pinging @elastic/es-data-management (Team:Data Management)

masseyke

IngestDocumentTests::testRemoveFieldIgnoreMissing fails when it takes the "case 1" path (so about a third of the time), but otherwise it all looks good to me.

masseyke

LGTM

jbaiera · 2025-09-12T06:30:21Z

Converting this to draft. We are going to put this on hold until we can determine how we'd like to expose this new syntax to the scripting field API.

jbaiera added 6 commits August 29, 2025 03:24

Update FieldPath to parse array syntax

273d48a

Only parse the paths in the new way when flexible is enabled

0a40db9

Add retrieval logic for array fields to flexible access pattern

0026043

Make access pattern optional

34234a5

Update test cases for new array syntax

23e51ab

Add rest test cases

aa23790

jbaiera added the >non-issue, :Data Management/Data streams, and v9.2.0 labels Aug 29, 2025

elasticsearchmachine added the Team:Data Management label Aug 29, 2025

[CI] Auto commit changes from spotless

6c8be1d

jbaiera requested a review from masseyke August 29, 2025 08:13

masseyke requested changes Sep 3, 2025

jbaiera added 2 commits September 4, 2025 00:47

Fix broken test case

9ed8278

Merge branch 'main' into streams-ingest-pipeline-field-array-syntax

4a8d959

jbaiera requested a review from masseyke September 4, 2025 04:48

masseyke approved these changes Sep 4, 2025

Merge branch 'main' into streams-ingest-pipeline-field-array-syntax

1eead39

jbaiera mentioned this pull request Sep 11, 2025

Reserve square bracket syntax for ingest document flexible field access pattern #134172

Merged

jbaiera marked this pull request as draft September 12, 2025 06:29

elasticsearchmachine added v9.3.0 and removed v9.2.0 labels Oct 2, 2025
