@zifeif2
Contributor

@zifeif2 zifeif2 commented Jan 16, 2026

What changes were proposed in this pull request?

Add repartition integration tests for the Aggregate, Dedup, FMGWS (flatMapGroupsWithState), and SessionWindow operators, as well as for a query that contains multiple operators.

The tests verify that:

  • state data is correct after the rewrite
  • a query resumed after repartitioning loads the correct state data
  • the repartition batch has the correct metadata

The tests cover the following dimensions:

  • increasing/decreasing the number of partitions
  • enabling/disabling changelog checkpointing
  • enabling/disabling checkpoint IDs
  • state version, for both the Aggregate and FMGWS operators
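
To illustrate how the dimensions above combine into test cases, here is a minimal sketch in Python. It is not the test code from this PR: the dimension names and the `build_test_matrix` helper are hypothetical, and the state version dimension is left out because the description does not say which versions are covered.

```python
from itertools import product

# Hypothetical dimension names and values, chosen for illustration only.
# They mirror the bullets above and are not identifiers from the PR.
DIMENSIONS = {
    "partition_change": ["increase", "decrease"],
    "changelog_checkpointing": [True, False],
    "checkpoint_id": [True, False],
}


def build_test_matrix(dimensions):
    """Return one dict per combination of dimension values (cross product)."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]


cases = build_test_matrix(DIMENSIONS)
for case in cases:
    print(case)
```

With three two-valued dimensions this yields eight combinations per operator; adding a state version dimension would multiply that count by the number of versions tested.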

Why are the changes needed?

We need integration tests to ensure data correctness when repartitioning across a complete set of operators.

Does this PR introduce any user-facing change?

No

How was this patch tested?

See the added tests.

Was this patch authored or co-authored using generative AI tooling?

Yes

@github-actions
JIRA Issue Information

=== Task SPARK-54365 ===
Summary: Test repartition for Agg, Dedup, session window, FMGWS
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

@zifeif2 zifeif2 force-pushed the repartition-test branch 3 times, most recently from b5cd624 to fe13a54 Compare January 16, 2026 05:32
@zifeif2 zifeif2 marked this pull request as ready for review January 16, 2026 17:36
Contributor

@micheal-o micheal-o left a comment

Looks good. Nice structure. Just some minor comments. Thanks.