Member

@jbaiera jbaiera commented Aug 29, 2025

Adds support for array indexing in the new flexible ingest document access pattern.

The classic way to index into an array field in an ingest node is to use an integer field name, which is interpreted in a context-sensitive manner. If the field path is at an array when handling a field name, it tries to parse the name as an integer and uses it to index into the array. If the path is at an object when handling an integer field name, it treats the integer as a regular field name:

| get a.b.c.1 | Resolves to... |
| --- | --- |
| {a: {b: {c: [0, 1, 2] } } } | 1 |
| {a: {b: {c: {0: foo, 1: bar, 2: baz} } } } | bar |
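The context-sensitive lookup described above can be sketched in a few lines. This is only an illustration of the rule, not the Elasticsearch implementation; the function name `classic_get` is hypothetical.

```python
def classic_get(doc, path):
    """Resolve a dotted path the classic way (illustrative sketch only)."""
    current = doc
    for element in path.split("."):
        if isinstance(current, list):
            # At an array: the element is parsed as an integer index.
            current = current[int(element)]
        elif isinstance(current, dict):
            # At an object: the element is a field name, even if it looks numeric.
            current = current[element]
        else:
            raise KeyError(f"cannot resolve [{element}] in path [{path}]")
    return current


array_doc = {"a": {"b": {"c": [0, 1, 2]}}}
object_doc = {"a": {"b": {"c": {"0": "foo", "1": "bar", "2": "baz"}}}}

print(classic_get(array_doc, "a.b.c.1"))   # 1
print(classic_get(object_doc, "a.b.c.1"))  # bar
```

The same path string yields an array element or a map entry depending purely on the shape of the document.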

Confusingly, if you try to write a value to an array index, and that array does not exist, it will create a path of objects and use the index as the field name for the value:

| set a.b.c.1 to foo | Results in... |
| --- | --- |
| {a: {b: {c: [0, 1, 2] } } } | {a: {b: {c: [0, foo, 2] } } } |
| {} | {a: {b: {c: {1: foo} } } } |
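A rough sketch of why the classic write behaves this way: missing path elements are always created as objects, so a numeric element only acts as an index when a list already exists. The helper `classic_set` is a hypothetical name used for illustration, not the actual ingest code.

```python
def classic_set(doc, path, value):
    """Set a value using classic dotted-path rules (illustrative sketch only)."""
    elements = path.split(".")
    current = doc
    for element in elements[:-1]:
        if isinstance(current, list):
            current = current[int(element)]
        elif isinstance(current, dict):
            # Missing intermediate containers are created as objects, never arrays.
            current = current.setdefault(element, {})
        else:
            raise KeyError(f"cannot resolve [{element}] in path [{path}]")
    last = elements[-1]
    if isinstance(current, list):
        current[int(last)] = value   # existing array: numeric element is an index
    else:
        current[last] = value        # object: numeric element is a field name
    return doc


print(classic_set({"a": {"b": {"c": [0, 1, 2]}}}, "a.b.c.1", "foo"))
# {'a': {'b': {'c': [0, 'foo', 2]}}}
print(classic_set({}, "a.b.c.1", "foo"))
# {'a': {'b': {'c': {'1': 'foo'}}}}
```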

Since the flexible access pattern is a new way to denote field access in ingest documents, we will add support for new array indexing syntax which is closer to how most programming languages surface the concept. This will allow for explicit control over whether a field path element is meant to be a field name or an array index, and will hopefully cut down on set operations that inconsistently create fields when array accesses were intended.

The new syntax follows some simple rules:

Arrays are indexed by using square brackets to denote the position to use.

a[0]
a.b.c[0]
a[1].b[5].c[0]

Square brackets may be repeated to index into higher-dimensional arrays.

a[0][2]
a[2].b[0][1].c[0][2][0]

A number after a dot is always treated as a field name, never as an array index.

a.0 ==> field a then field 0

When retrieving a field, square brackets interrupt the ability to chain dotted fields together. This is because dotted field names can only be accessed on map data. Array indices can only be applied to array data.

| Field Path | Explanation of traversal |
| --- | --- |
| a[0] | field a then array index 0 |
| a[0].b | field a then array index 0 then field b |
| a.b.c[2].d.e.f[1].g | field a.b.c then array index 2 then field d.e.f then array index 1 then field g |
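One way to picture the traversal rules above is a small tokenizer that splits a path into dotted field runs and array indices. This is a sketch under assumptions: the name `parse_path` and the `("field", ...)` / `("index", ...)` step tuples are invented for illustration and do not come from the PR.

```python
import re

# Either a bracketed non-negative integer, or a run of characters without brackets.
_TOKEN = re.compile(r"\[(\d+)\]|([^\[\]]+)")


def parse_path(path):
    """Split a field path into ("field", name) and ("index", n) steps (sketch)."""
    steps = []
    pos = 0
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"invalid path [{path}] at position {pos}")
        if match.group(1) is not None:
            steps.append(("index", int(match.group(1))))
        else:
            raw = match.group(2)
            if steps:
                # A field run that follows an index must be introduced by a dot.
                if not raw.startswith("."):
                    raise ValueError(f"expected '.' in path [{path}] at position {pos}")
                raw = raw[1:]
            if any(part == "" for part in raw.split(".")):
                raise ValueError(f"empty field name in path [{path}]")
            # Dotted names stay together as one chained field run; numbers
            # inside the run (as in "a.0") remain field names.
            steps.append(("field", raw))
        pos = match.end()
    return steps


print(parse_path("a[0].b"))
# [('field', 'a'), ('index', 0), ('field', 'b')]
print(parse_path("a.b.c[2].d.e.f[1].g"))
# [('field', 'a.b.c'), ('index', 2), ('field', 'd.e.f'), ('index', 1), ('field', 'g')]
```

Keeping each dotted run as a single step mirrors the table: brackets are the only thing that breaks a chain of dotted fields.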

Setting a value on a document with a field path that includes array indices will require the arrays to exist on the document.

| Document | Field Path | Result |
| --- | --- | --- |
| {a: [ 0, 1, 2 ] } | set a[1] = 5 | {a: [ 0, 5, 2 ] } |
| {a: [ {b:foo} ]} | set a[0].b = bar | {a:[ {b:bar} ]} |
| {a: [] } | set a[0] = bar | Exception: index [0] out of bounds for array with size [0] |
| {a: {} } | set a[0] = bar | Exception: could not resolve array index [0] against field type [Map] |
| {} | set a[0] = bar | Exception: could not resolve field [a] |
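The strictness shown in the table can be sketched as follows. This is an assumption-laden illustration, not the real implementation: `strict_set` and `parse` are hypothetical names, each dotted segment is treated as its own field lookup for simplicity, and `ValueError` merely stands in for whatever exception type the ingest code raises. The messages are copied from the table.

```python
import re


def parse(path):
    """Split a path into ("field", name) / ("index", n) steps (simplified)."""
    return [("index", int(i)) if i else ("field", n)
            for i, n in re.findall(r"\[(\d+)\]|([^.\[\]]+)", path)]


def _check_index(container, index):
    # An index step is only valid against an existing array that is long enough.
    if not isinstance(container, list):
        kind = "Map" if isinstance(container, dict) else type(container).__name__
        raise ValueError(
            f"could not resolve array index [{index}] against field type [{kind}]")
    if index >= len(container):
        raise ValueError(
            f"index [{index}] out of bounds for array with size [{len(container)}]")


def strict_set(doc, path, value):
    """Set a value without ever creating missing arrays or parents (sketch)."""
    steps = parse(path)
    current = doc
    for kind, key in steps[:-1]:
        if kind == "index":
            _check_index(current, key)
        elif not isinstance(current, dict) or key not in current:
            raise ValueError(f"could not resolve field [{key}]")
        current = current[key]
    kind, key = steps[-1]
    if kind == "index":
        _check_index(current, key)
    elif not isinstance(current, dict):
        raise ValueError(f"could not resolve field [{key}]")
    current[key] = value
    return doc


print(strict_set({"a": [0, 1, 2]}, "a[1]", 5))            # {'a': [0, 5, 2]}
print(strict_set({"a": [{"b": "foo"}]}, "a[0].b", "bar"))  # {'a': [{'b': 'bar'}]}
```

Unlike the classic behavior, nothing is created on the way to an index, so a mistyped or missing array surfaces as an error instead of a silently added object field.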

Append operations continue to work as expected:

| Document | Field Path | Result |
| --- | --- | --- |
| {a: [ 0, 1, 2 ] } | a append 5 | {a: [ 0, 1, 2, 5 ] } |
| {a: [ {b:foo} ]} | a append {c:bar} | {a:[ {b:foo}, {c:bar} ]} |
| {a: [] } | a append bar | {a: [bar] } |
| {a: {} } | a append bar | {a: [{}, bar] } |
| {} | a append bar | {a: [bar] } |
| {a: [0, 1] } | a[0] append bar | {a: [ [0, bar], 1] } |
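The append rows can be reproduced with a short sketch: append to an existing list, wrap an existing non-list value into a list, or create a new list when the field is missing. The names `append_value` and `parse` are hypothetical, and the parser is a simplified stand-in that treats every dotted segment as a separate field.

```python
import re


def parse(path):
    """Split a path into ("field", name) / ("index", n) steps (simplified)."""
    return [("index", int(i)) if i else ("field", n)
            for i, n in re.findall(r"\[(\d+)\]|([^.\[\]]+)", path)]


def append_value(doc, path, value):
    """Append a value at the given path (illustrative sketch only)."""
    steps = parse(path)
    parent = doc
    for _, key in steps[:-1]:
        parent = parent[key]
    kind, key = steps[-1]
    if kind == "field" and key not in parent:
        # Missing field: start a new list holding just the value.
        parent[key] = [value]
        return doc
    existing = parent[key]
    if isinstance(existing, list):
        existing.append(value)
    else:
        # Existing non-list value: wrap it in a list together with the new value.
        parent[key] = [existing, value]
    return doc


print(append_value({"a": [0, 1, 2]}, "a", 5))   # {'a': [0, 1, 2, 5]}
print(append_value({"a": {}}, "a", "bar"))      # {'a': [{}, 'bar']}
print(append_value({}, "a", "bar"))             # {'a': ['bar']}
print(append_value({"a": [0, 1]}, "a[0]", "bar"))  # {'a': [[0, 'bar'], 1]}
```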

@jbaiera jbaiera added >non-issue :Data Management/Data streams Data streams and their lifecycles v9.2.0 labels Aug 29, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Data Management Meta label for data/management team label Aug 29, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@jbaiera jbaiera requested a review from masseyke August 29, 2025 08:13
Member

@masseyke masseyke left a comment

IngestDocumentTests::testRemoveFieldIgnoreMissing fails when it takes the "case 1" path (so about a third of the time), but otherwise it all looks good to me.

@jbaiera jbaiera requested a review from masseyke September 4, 2025 04:48
Member

@masseyke masseyke left a comment

LGTM

Member Author

jbaiera commented Sep 12, 2025

Converting this to draft. We are going to put this on hold until we can determine how we'd like to expose this new syntax to the scripting field API.

Labels

:Data Management/Data streams Data streams and their lifecycles >non-issue Team:Data Management Meta label for data/management team v9.3.0

3 participants