Skip to content

HAVING COUNT(*) fails when COUNT(*) has no alias #50

@fupelaqu

Description

@fupelaqu

Description

When a GROUP BY query uses COUNT(*) without an alias and references it in a HAVING clause, Elasticsearch rejects the query with:

[action_request_validation_exception] Validation Failed: 1: No aggregation found for path [*]

The bucket_selector aggregation generated for HAVING uses * as its buckets_path, which is not a valid aggregation name in Elasticsearch.

Reproduction

SELECT profile.city AS city,
       COUNT(*),
       AVG(age) AS avg_age
FROM dql_users
GROUP BY profile.city
HAVING COUNT(*) >= 1

Without alias, COUNT(*) is translated to an aggregation named *, and the HAVING clause generates:

"having_filter": {
  "bucket_selector": {
    "buckets_path": { "*": "*" },
    "script": { "source": "params.* >= 1" }
  }
}

Elasticsearch cannot resolve * as an aggregation path.

Workaround

Add an alias to COUNT(*):

SELECT profile.city AS city,
       COUNT(*) AS cnt,
       AVG(age) AS avg_age
FROM dql_users
GROUP BY profile.city
HAVING COUNT(*) >= 1

Expected Behaviour

COUNT(*) without alias should generate a valid aggregation name (e.g., count_*) and the HAVING clause should reference that name correctly.

Code Path Analysis

Aggregation naming

  1. Identifier.metricName (sql/…/package.scala:924-941): when COUNT(*) has no alias, aliasOrName returns "*", then metricName maps it to "count_all"
  2. ElasticAggregation.aggName (bridge/…/ElasticAggregation.scala:131-145): without alias, builds "count__index" via s"${aggType}_${sourceField}" (sourceField is "_index" for COUNT(*))

HAVING → bucket_selector

  1. Having.script (sql/…/query/Having.scala:36-44): generates painless script via MetricSelectorScript.metricSelector(criteria)
  2. Criteria.extractAllMetricsPath (sql/…/query/Where.scala:66-74): recursively extracts metric names from HAVING expressions — returns identifier.allMetricsPath
  3. extractMetricsPathForBucket (bridge/…/ElasticAggregation.scala:737-753): looks up metric name via agg.aggName == metricName || agg.field == metricName

Root cause

The metric name extracted from the HAVING expression ("*" from the identifier) does not match the actual ES aggregation name ("count__index"). When an alias is provided (e.g., AS cnt), both the aggregation name and the HAVING reference use "cnt", so the lookup succeeds.

Context

Related to issue #41 (computed alias for unnamed columns). The auto-alias count_* is already generated for output column naming, but it is not used as the aggregation name in the Elasticsearch query.

Key Files

File Purpose
sql/…/package.scala (lines 906, 924-941) Identifier.aliasOrName, metricName
bridge/…/ElasticAggregation.scala (lines 131-145) aggName computation
bridge/…/ElasticAggregation.scala (lines 513-531) bucket_selector generation for HAVING
bridge/…/ElasticAggregation.scala (lines 737-753) extractMetricsPathForBucket lookup
sql/…/query/Having.scala (lines 36-44) Having.script painless generation
sql/…/query/Where.scala (lines 66-74, 294-299) extractAllMetricsPath from criteria
sql/…/transform/TransformBucketSelectorConfig.scala bucket_selector JSON structure

SBT Modules

Module sbt project Purpose
core core ElasticConversion, response parsing
sql sql SQL parser, Having, Identifier, transform layer
bridge (shared) softclient4es-sql-bridge ElasticAggregation (shared code)
es8/bridge softclient4es8-sql-bridge ES8-specific bridge
testkit softclient4es-core-testkit Shared test specs (ReplGatewayIntegrationSpec)
es8/java es8java ES8 Java client + tests (JavaClient8ReplGatewayIntegrationSpec)

Quick test command

sbt "es8java/testOnly *JavaClient8ReplGateway*"

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions