Use this issue to track InQL RFC 016, which already exists at docs/rfcs/016_core_aggregate_functions.md.
Area
- Specification (RFCs)
- Package & tests
- Documentation
Summary
RFC 016 defines InQL's core aggregate function set: count, sum, avg, min, and max, including argument forms, input type rules, null behavior, empty-input behavior, result type rules, aliases, and required diagnostics.
Motivation
These aggregates form the minimum portable aggregate surface for dataframe and query-block work. They need explicit semantics before broader aggregate families and modifiers can be made stable.
Proposal sketch
The RFC specifies the core aggregate vocabulary, its typing and null/empty-input behavior, and the diagnostics required when aggregate calls are used incorrectly. It keeps aggregate measures distinct from scalar expressions and aligns aggregate helper behavior with the registry model.
Open design questions to resolve before Planned:
- What exact integer type should count return across local and backend execution?
- Should avg over integers return a floating type, decimal type, or backend-derived numeric type?
Alternatives considered
The RFC rejects defining all Spark aggregates at once, making count permanently argumentless, and returning zero from sum on empty input.
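To make the rejected alternatives concrete, here is a minimal Python sketch of the behavior they imply. It is not taken from the RFC text: it assumes SQL-style semantics in which sum ignores nulls and yields a null (None here) on empty or all-null input, and in which count accepts both a no-argument row-counting form and a one-argument form that skips nulls. The helper names count_rows, count_values, and sum_values are hypothetical.

```python
from typing import Iterable, Optional


def count_rows(rows: Iterable[object]) -> int:
    """Assumed no-argument form: counts every row, nulls included."""
    return sum(1 for _ in rows)


def count_values(values: Iterable[Optional[int]]) -> int:
    """Assumed one-argument form: counts only non-null values."""
    return sum(1 for v in values if v is not None)


def sum_values(values: Iterable[Optional[int]]) -> Optional[int]:
    """Assumed sum semantics: nulls are skipped; empty input gives None, not 0."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        # The RFC rejects returning zero here; None stands in for a null result.
        return None
    return sum(non_null)


if __name__ == "__main__":
    print(count_rows([1, None, 3]))    # counts all three rows
    print(count_values([1, None, 3]))  # counts the two non-null values
    print(sum_values([]))              # None under the assumed semantics
```

Under these assumptions, count stays well-defined on empty input (it returns 0), while sum distinguishes "no values" from "values that add up to zero".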
Impact / compatibility
This RFC affects aggregate helpers, grouped query checking, Prism/Substrait aggregate lowering, docs for null/empty-input behavior, and compatibility aliases such as mean where semantics match.
Implementation notes (optional)
Handle after RFC 014 so each aggregate lands as a registry-backed function entry with explicit lifecycle, type, null, and Substrait mapping metadata.
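As a rough illustration of what a registry-backed entry might carry, the sketch below models the metadata named above (lifecycle, type, null behavior, Substrait mapping) as a Python dataclass. Everything in it is an assumption for discussion: the AggregateEntry and Lifecycle names, the field layout, the lifecycle states other than Planned, and the string values are hypothetical and are not the RFC 014 registry API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple


class Lifecycle(Enum):
    # Only "Planned" appears in this issue; the other states are placeholders.
    DRAFT = "draft"
    PLANNED = "planned"
    STABLE = "stable"


@dataclass(frozen=True)
class AggregateEntry:
    name: str
    lifecycle: Lifecycle
    result_type: str       # free-text rule; a real registry would use typed rules
    null_behavior: str
    empty_input: str
    substrait_name: str    # placeholder for the Substrait mapping metadata
    aliases: Tuple[str, ...] = ()


REGISTRY: Dict[str, AggregateEntry] = {}


def register(entry: AggregateEntry) -> None:
    """Register an entry under its name and every alias, rejecting collisions."""
    for key in (entry.name, *entry.aliases):
        if key in REGISTRY:
            raise ValueError(f"duplicate aggregate name: {key}")
        REGISTRY[key] = entry


register(AggregateEntry(
    name="avg",
    lifecycle=Lifecycle.DRAFT,
    result_type="unresolved (open design question)",
    null_behavior="ignore nulls (assumed)",
    empty_input="null (assumed)",
    substrait_name="avg",
    aliases=("mean",),
))
```

With this shape, an alias such as mean resolves to the same entry object as avg, so the two cannot drift apart in type or null behavior.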
Checklist