Skip to content

test(workflow-operator): add unit test coverage for AggregationOperation#4908

Merged
aglinxinyuan merged 5 commits into
apache:mainfrom
aglinxinyuan:test-aggregation-operation-spec
May 4, 2026
Merged

test(workflow-operator): add unit test coverage for AggregationOperation#4908
aglinxinyuan merged 5 commits into
apache:mainfrom
aglinxinyuan:test-aggregation-operation-spec

Conversation

@aglinxinyuan
Copy link
Copy Markdown
Contributor

@aglinxinyuan aglinxinyuan commented May 4, 2026

What changes were proposed in this PR?

Adds AggregationOperationSpec covering the public surface of AggregationOperation (common/workflow-operator/src/main/scala/org/apache/texera/amber/operator/aggregate/AggregationOperation.scala), which previously had no test coverage.

The new spec pins:

  • getAggregationAttribute per kind: SUM/MIN/MAX preserve the input type, COUNT → INTEGER, AVERAGE → DOUBLE, CONCAT → STRING; null aggFunction throws RuntimeException.
  • getAggFunc validation: SUM/MIN/MAX throw UnsupportedOperationException for non-numeric attribute types; null aggFunction throws.
  • Per-kind DistributedAggregation semantics (init / iterate / merge / finalAgg) for SUM, COUNT, AVERAGE, CONCAT — including the null-handling differences between count-all (attribute = null) and count-non-null, and the rule that AVERAGE of zero values projects to null.
  • getFinal: COUNT is rewritten to SUM over the partial column; non-COUNT aggregations keep their kind but rebind both attribute and resultAttribute to the partial column.
  • AveragePartialObj: case-class field exposure and value equality.

No production code changed; this is test-only.

Any related issues, documentation, discussions?

Closes #4907

How was this PR tested?

Added 19 new unit tests in AggregationOperationSpec. Verified locally:

sbt 'WorkflowOperator/Test/testOnly org.apache.texera.amber.operator.aggregate.AggregationOperationSpec'
# → Tests: succeeded 19, failed 0

sbt 'WorkflowOperator/Test/scalafmtCheck'
# → clean

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code

Pin the contract of `AggregationOperation`'s public API: result-attribute
typing per aggregation kind, error paths for unsupported types and null
aggFunction, the per-kind `DistributedAggregation` semantics (init,
iterate, merge, finalAgg) for SUM/COUNT/AVERAGE/CONCAT including the
null-handling differences between count-all and count-non-null and the
"average of zero values is null" rule, and the `getFinal` rewrite that
turns COUNT into SUM over the partial column while leaving other
aggregations' aggFunction intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 46.90%. Comparing base (94ab9c7) to head (623b23d).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #4908      +/-   ##
============================================
+ Coverage     42.10%   46.90%   +4.79%     
- Complexity     2174     2179       +5     
============================================
  Files           980      844     -136     
  Lines         36298    27319    -8979     
  Branches       3783     2536    -1247     
============================================
- Hits          15283    12813    -2470     
+ Misses        20096    13726    -6370     
+ Partials        919      780     -139     
Flag Coverage Δ
access-control-service 39.53% <ø> (ø)
amber 43.03% <ø> (+0.05%) ⬆️
computing-unit-managing-service 0.00% <ø> (ø)
config-service 0.00% <ø> (ø)
file-service 33.24% <ø> (ø)
workflow-compiling-service 47.72% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds focused unit test coverage for AggregationOperation (workflow-operator) to pin key parts of its public contract and distributed aggregation semantics, addressing issue #4907.

Changes:

  • Introduces AggregationOperationSpec with tests for result-column typing by aggregation kind (SUM/COUNT/AVERAGE/MIN/MAX/CONCAT) and null-aggFunction error handling.
  • Adds tests for getAggFunc validation and semantics for SUM/COUNT/AVERAGE/CONCAT, including init/iterate/merge/finalAgg behaviors.
  • Adds tests for getFinal rewrite behavior (COUNT → SUM over partial/result column).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…ping coverage

Address review feedback on apache#4908:
- (Yicong-Huang / Copilot) The original spec broadly duplicated
  AggregateOpSpec's getAggregationAttribute, per-kind iterate, and
  getFinal coverage. Removed the duplicate cases and added a header
  comment documenting what AggregateOpSpec already covers so future
  contributors don't re-add them.
- (Copilot) Misleading CONCAT description that said "skip null" while
  asserting null contributes a delimiter — removed (the iterate behavior
  is already covered in AggregateOpSpec).
- (Yicong-Huang) Add partial+final pipeline coverage. New end-to-end
  tests run a real worker COUNT / SUM, emit partial output tuples, then
  re-aggregate them via getFinal to confirm the two-stage result matches
  a single-pass aggregation.

What this spec uniquely covers (vs AggregateOpSpec):
- getAggFunc validation errors (non-numeric on SUM/MIN/MAX, null aggFunction).
- CONCAT merge (AggregateOpSpec exercises CONCAT iterate, not merge directly).
- Two-stage worker→final pipeline for COUNT and SUM.
- AveragePartialObj field/equality contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…tes and test names

Address Copilot feedback on apache#4908:
- Coverage-notes header: drop the inaccurate "merge" claim about
  AggregateOpSpec (it exercises iterate/finalAgg but never calls
  merge directly).
- Coverage-notes header: replace "AveragePartialObj value-class
  exposure" — AveragePartialObj is a plain case class, not a value
  class (does not extend AnyVal). Reword to "case-class field
  exposure / value equality".
- Test names: rename "non-numeric types on SUM/MIN/MAX" to
  "unsupported attribute types on SUM/MIN/MAX". SUM/MIN/MAX accept
  TIMESTAMP alongside the numerics, so "non-numeric" was inaccurate.
  Added inline comments listing the actual accepted types.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@aglinxinyuan aglinxinyuan enabled auto-merge (squash) May 4, 2026 08:23
@aglinxinyuan aglinxinyuan merged commit 9960041 into apache:main May 4, 2026
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add unit test coverage for AggregationOperation

4 participants