test(amber): add unit test coverage for DataFrame equality and inMemSize #4784

Merged
Yicong-Huang merged 3 commits into apache:main from aglinxinyuan:xinyuan-test-data-frame-spec
May 3, 2026

@aglinxinyuan aglinxinyuan commented May 3, 2026

What changes were proposed in this PR?

Add DataPayloadSpec covering the custom equals and inMemSize of DataFrame:

  • inMemSize: zero for an empty frame; sum of contained tuple sizes otherwise
  • equals is reflexive on a single empty-frame instance (df == df)
  • equals returns true for two distinct empty-frame instances
  • equals rejects non-DataFrame values and null
  • equals rejects frames whose lengths differ
  • equals treats element-wise equal frames as equal
  • equals respects element order
  • equals rejects frames whose elements differ
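
The contract listed above can be illustrated with a minimal, self-contained sketch. `SketchTuple`, `SketchFrame`, and the 8-bytes-per-field size model are hypothetical stand-ins invented for this example; the real `Tuple` and `DataFrame` classes in Amber may be implemented differently.

```scala
// Hypothetical stand-ins; the real Tuple and DataFrame in Amber may differ.
final case class SketchTuple(values: Seq[Any]) {
  // Assumed size model, for illustration only: 8 bytes per field.
  def inMemSize: Long = values.length * 8L
}

final class SketchFrame(val frame: Array[SketchTuple]) {
  // Summing over an empty array yields 0, so an empty frame has size zero.
  def inMemSize: Long = frame.map(_.inMemSize).sum

  // JVM arrays compare by reference, so equality is spelled out element by element.
  override def equals(obj: Any): Boolean = obj match {
    case that: SketchFrame =>
      frame.length == that.frame.length &&
        frame.indices.forall(i => frame(i) == that.frame(i))
    case _ => false // covers null and non-frame values
  }

  // Kept consistent with equals: equal frames hash alike.
  override def hashCode(): Int = frame.toSeq.hashCode()
}

object SketchFrameDemo {
  def main(args: Array[String]): Unit = {
    val a = SketchTuple(Seq(1))
    val b = SketchTuple(Seq(2, 3))
    val empty = new SketchFrame(Array.empty[SketchTuple])

    assert(empty.inMemSize == 0L)
    assert(new SketchFrame(Array(a, b)).inMemSize == a.inMemSize + b.inMemSize)
    assert(empty == empty)                                     // reflexive
    assert(empty == new SketchFrame(Array.empty[SketchTuple])) // distinct empty instances
    assert(!empty.equals(null) && !empty.equals("not a frame"))
    assert(new SketchFrame(Array(a)) != new SketchFrame(Array(a, b)))    // lengths differ
    assert(new SketchFrame(Array(a, b)) == new SketchFrame(Array(a, b))) // element-wise equal
    assert(new SketchFrame(Array(a, b)) != new SketchFrame(Array(b, a))) // order matters
  }
}
```

Each assertion in `SketchFrameDemo` corresponds to one bullet in the list; the actual spec expresses the same checks as ScalaTest cases against the real `DataFrame`.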

Any related issues, documentation, discussions?

Closes #4783

How was this PR tested?

sbt "WorkflowExecutionService/testOnly org.apache.texera.amber.engine.common.ambermessage.DataPayloadSpec" — 9/9 tests pass.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.7)

Add DataPayloadSpec covering DataFrame.inMemSize aggregation across
contained tuples and the custom equals contract: empty equality,
non-DataFrame rejection, length mismatch, element-wise equality, order
sensitivity, and element-mismatch rejection.

Closes apache#4783

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 3, 2026 01:49
@github-actions github-actions Bot added the engine label May 3, 2026

Copilot AI left a comment

Pull request overview

Adds ScalaTest unit coverage for DataFrame’s custom equals and inMemSize behavior in Amber’s messaging payloads, addressing the gap reported in #4783.

Changes:

  • Introduces DataPayloadSpec covering DataFrame.inMemSize for empty vs non-empty frames.
  • Adds DataFrame.equals tests for type mismatch/null, length mismatch, element-wise equality, order sensitivity, and differing elements.

codecov-commenter commented May 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 42.35%. Comparing base (945849c) to head (d2e6248).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #4784      +/-   ##
============================================
- Coverage     42.84%   42.35%   -0.49%     
  Complexity     2027     2027              
============================================
  Files           957      957              
  Lines         34077    34094      +17     
  Branches       3753     3753              
============================================
- Hits          14600    14442     -158     
- Misses        18693    18870     +177     
+ Partials        784      782       -2     
Flag     Coverage Δ
amber    40.74% <ø> (+0.01%) ⬆️
python   84.15% <ø> (ø)

Flags with carried forward coverage won't be shown.

@aglinxinyuan aglinxinyuan requested a review from Yicong-Huang May 3, 2026 01:59
- Use the schema's Attribute (`schema.getAttribute("v")`) when adding
  fields in the `tuple` helper so it stays consistent with the schema
  under test, rather than constructing a fresh Attribute.
- Split the empty-frame equality test into a true reflexivity case
  (`df == df`) and a separate case for two distinct empty frames, so
  the wording matches what is asserted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
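
The first point in the commit message above, reusing the schema's own attribute inside the `tuple` helper, can be sketched roughly as follows. `SketchAttribute`, `SketchSchema`, `SketchRow`, and `TupleHelperSketch` are hypothetical miniature types made up for this illustration; only the `schema.getAttribute("v")` lookup and the `tuple` helper name come from the commit message, and Texera's real schema and tuple-builder APIs are not reproduced here.

```scala
// Hypothetical miniature schema model; Texera's real Schema, Attribute,
// and tuple builder are not reproduced here.
final case class SketchAttribute(name: String, typeName: String)

final case class SketchSchema(attributes: List[SketchAttribute]) {
  // Look up an attribute by name and fail loudly if it is absent.
  def getAttribute(name: String): SketchAttribute =
    attributes
      .find(_.name == name)
      .getOrElse(throw new NoSuchElementException(s"no attribute named $name"))
}

final case class SketchRow(fields: Map[SketchAttribute, Any])

object TupleHelperSketch {
  val schema: SketchSchema = SketchSchema(List(SketchAttribute("v", "integer")))

  // Reuses the schema's own attribute instead of constructing a fresh one,
  // so a change to the schema cannot silently diverge from the helper.
  def tuple(value: Int): SketchRow =
    SketchRow(Map(schema.getAttribute("v") -> value))
}
```

With this shape, a row built by `tuple` is keyed by exactly the attribute the schema declares, which is the consistency property the commit message describes.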
@Yicong-Huang Yicong-Huang enabled auto-merge (squash) May 3, 2026 03:15
@Yicong-Huang Yicong-Huang merged commit 8cf5b9a into apache:main May 3, 2026
15 checks passed