Composite Aggregation with multiple sources breaks query structure #8704

@ianschmaltz

Elastic.Clients.Elasticsearch version: 9.1.7

Elasticsearch version: 9.1.2

.NET runtime version: .NET Framework v4.8

Operating system version: Windows 11


Description of the problem including expected versus actual behavior:

When using the Fluent API to define a composite aggregation with more than one field, the client generates invalid JSON.

Instead of serializing each entry in the sources array as a separate object, the Fluent API combines them into a single object, which is not supported by Elasticsearch and causes the query to be rejected.


Steps to reproduce:

  1. Use the Fluent API to define a composite aggregation with two .Add() calls inside .Sources(...). Example:
var searchResponse = client.Client.SearchAsync<DBODespesa>(s => s
    .Index(nomeIndice)
    .Size(0)
    .Query(finalQuery)
    // The composite aggregation below is where the malformed "sources" array is produced
    .Aggregations(aggregations => aggregations
        .Add("group_by", aggregation => aggregation
            .Composite(composite => composite
                .Size(65536)
                .Sources(sources => sources
                    .Add("descricao", src => src
                        .Terms(terms => terms.Field(campoDescricao))
                    )
                    .Add("codigo", src2 => src2
                        .Terms(terms => terms.Field(campoChave))
                    )
                )
            )
            .Aggregations(aggregations2 => aggregations2
                .Add("soma_empenhado", aggregation1 => aggregation1
                    .Sum(sum => sum
                        .Field(x => x.ValorEmpenho)
                    )
                )
                .Add("soma_liquidado", aggregation1 => aggregation1
                    .Sum(sum => sum
                        .Field(x => x.ValorLiquidado)
                    )
                )
                .Add("soma_pago", aggregation1 => aggregation1
                    .Sum(sum => sum
                        .Field(x => x.ValorPago)
                    )
                )
                .Add("soma_rap", aggregation1 => aggregation1
                    .Sum(sum => sum
                        .Field(x => x.ValorRap)
                    )
                )
                .Add("primeiro_registro", aggregation1 => aggregation1
                    .TopHits(top_hits => top_hits
                        .Size(1)
                    )
                )
            )
        )
    )
).Result;
  2. Run the query using .SearchAsync(...)
  3. The query fails due to malformed JSON in the sources array

Expected behavior

"sources": [
  { "descricao": { "terms": { "field": "unidadeGestora.keyword" } } },
  { "codigo": { "terms": { "field": "codigoUnidadeGestora" } } }
]

Actual behavior

"sources": [
  {
    "descricao": { "terms": { "field": "unidadeGestora.keyword" } },
    "codigo": { "terms": { "field": "codigoUnidadeGestora" } }
  }
]
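
To make the expected wire format concrete, here is a minimal sketch that builds the sources array with plain System.Text.Json, independent of the Elastic client. The class and method names (CompositeSourcesShape, BuildSourcesJson) are hypothetical. It only illustrates the shape Elasticsearch expects, one single-key object per source in declaration order; it is not the client's API and not a confirmed workaround.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, for illustration only: it does not use Elastic.Clients.Elasticsearch.
public static class CompositeSourcesShape
{
    // Emits one single-key object per composite source, preserving the given order.
    public static string BuildSourcesJson(params (string Name, string Field)[] sources)
    {
        var list = new List<Dictionary<string, object>>();
        foreach (var (name, field) in sources)
        {
            // Each source gets its own dictionary, so it serializes as a separate array entry
            // instead of being merged into one object with several keys.
            list.Add(new Dictionary<string, object>
            {
                [name] = new Dictionary<string, object>
                {
                    ["terms"] = new Dictionary<string, object> { ["field"] = field }
                }
            });
        }
        return JsonSerializer.Serialize(list);
    }
}
```

Calling CompositeSourcesShape.BuildSourcesJson(("descricao", "unidadeGestora.keyword"), ("codigo", "codigoUnidadeGestora")) yields an array with two entries, matching the expected behavior above rather than the single merged object the Fluent API currently emits.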

This breaks composite aggregations with more than one field and prevents pagination/sorting on the server side.


Provide ConnectionSettings (if relevant): We're using the official Elastic.Clients.Elasticsearch client with default ElasticsearchClientSettings, strongly typed models, and fluent API approach.


Provide DebugInformation (if relevant):
Query returns error:
ElasticsearchClientException: Request failed to execute. The server returned 400 - Bad Request. Invalid composite aggregation: sources must be an array of single-key objects.


Extra Context

  • We have not found a documented workaround using the Fluent API.
  • This issue blocks real-world use of composite aggregations with multi-field grouping.
  • Elastic Support indicated that this is potentially a client-side serialization bug.

Support case ID (if helpful): 01980458


Requested Help:

  • Confirm whether this is a bug
  • Provide a Fluent API example that works with multiple sources
  • Suggest workaround if bug is confirmed

Thanks in advance!
