Description
When an aggregation is created solely to support a HAVING, WHERE, or ORDER BY clause (i.e., the aggregation is not in the SELECT list), the aggregation result leaks into the output row set as an extra column.
Reproduction
ORDER BY case
SELECT profile.city AS city,
AVG(age) AS avg_age
FROM dql_users
GROUP BY profile.city
ORDER BY COUNT(*) DESC
Actual result (3 columns instead of 2):
| city | avg_age | count_all |
|---|---|---|
| paris | 27.5 | 2.0 |
| lyon | 40.0 | 1.0 |
| marseille | 50.0 | 1.0 |
HAVING case
SELECT profile.city AS city,
AVG(age) AS avg_age
FROM dql_users
GROUP BY profile.city
HAVING COUNT(*) >= 1
Expected: 2 columns (city, avg_age). count_all should not appear.
WHERE case (same pattern)
Any aggregation created via Criteria.extractAggregationFields for WHERE filtering that is not in SELECT will also leak.
Root Cause
SingleSearch.aggregates collects aggregations from SELECT, HAVING, WHERE, and ORDER BY (since issues #52 and #53). All collected aggregations are passed to the bridge layer and included in the ES query. The result parser (ElasticConversion.parseAggregations) includes all aggregation values in the output rows without distinguishing between SELECT aggregations and auxiliary ones.
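The leak described above can be illustrated with a toy model. This is only a sketch in Python, not the project's code: the `parse_aggregations` function and the simplified bucket shape are hypothetical stand-ins for ElasticConversion.parseAggregations and the real Elasticsearch response.

```python
from typing import Any


def parse_aggregations(bucket: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in for the result parser.

    It copies every aggregation value found in the bucket into the row,
    without checking whether the aggregation came from the SELECT list.
    """
    row: dict[str, Any] = {"city": bucket["key"]}
    for name, agg in bucket["aggs"].items():
        row[name] = agg["value"]
    return row


# Simplified, assumed bucket shape: avg_age comes from SELECT,
# count_all exists only to support ORDER BY COUNT(*).
bucket = {
    "key": "paris",
    "aggs": {"avg_age": {"value": 27.5}, "count_all": {"value": 2.0}},
}

row = parse_aggregations(bucket)
print(row)  # {'city': 'paris', 'avg_age': 27.5, 'count_all': 2.0}
```

Because nothing in the parser distinguishes the two aggregations, the auxiliary `count_all` value ends up in the row alongside the selected columns.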
Suggested Fix
The output column filtering (in SearchApi.extractOutputFieldNames or normalizeRow) should exclude aggregation columns that are not present in the SELECT clause.
Possible approaches:
- Mark Field instances with their origin (SELECT vs ORDER BY vs HAVING/WHERE) and filter at output time
- Use Select.fieldsWithComputedAliases (not aggregates) to determine output columns
- Post-filter rows to remove columns not in the SELECT field list
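The first and third approaches can be combined, as in the minimal Python sketch below. Everything in it is an assumption made for illustration: the `Origin` enum, the `Field` dataclass, and the `output_columns` and `normalize_row` helpers are hypothetical and do not reflect the project's actual types or signatures.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any


class Origin(Enum):
    """Hypothetical tag recording which clause introduced a field."""
    SELECT = auto()
    ORDER_BY = auto()
    HAVING = auto()
    WHERE = auto()


@dataclass(frozen=True)
class Field:
    alias: str
    origin: Origin


def output_columns(fields: list[Field]) -> list[str]:
    """Keep only aliases that originate from the SELECT clause."""
    return [f.alias for f in fields if f.origin is Origin.SELECT]


def normalize_row(row: dict[str, Any], columns: list[str]) -> dict[str, Any]:
    """Post-filter a parsed row down to the allowed output columns."""
    return {c: row[c] for c in columns if c in row}


fields = [
    Field("city", Origin.SELECT),
    Field("avg_age", Origin.SELECT),
    Field("count_all", Origin.ORDER_BY),  # auxiliary, must not be output
]
raw_row = {"city": "paris", "avg_age": 27.5, "count_all": 2.0}

clean_row = normalize_row(raw_row, output_columns(fields))
print(clean_row)  # {'city': 'paris', 'avg_age': 27.5}
```

The auxiliary aggregation is still sent to Elasticsearch, so sorting and filtering keep working; only the final projection drops it.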
Related
- Issue #52 (ORDER BY on aggregation alias not applied to ES terms aggregation): introduced ORDER BY aggregation extraction
- Issue #53 (Aggregations referenced only in HAVING or WHERE are not created): introduced HAVING/WHERE aggregation extraction
- Issue #45 (Filter out Elasticsearch metadata columns from result sets by default): similar output filtering concern