[SPARK-57117][BUILD] Upgrade zstd-jni to 1.5.7-9 #56163

Closed
dongjoon-hyun wants to merge 4 commits into apache:master from dongjoon-hyun:SPARK-57117

@dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented May 28, 2026

What changes were proposed in this pull request?

This PR aims to upgrade zstd-jni to 1.5.7-9 for Apache Spark 5.

Why are the changes needed?

To bring the latest bug fixes and improvements from upstream zstd-jni v1.5.7-9.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7 (1M context)

@dongjoon-hyun
Member Author

Could you review this PR, @LuciferYang?

@dongjoon-hyun
Member Author

All tests passed.

Contributor

@LuciferYang LuciferYang left a comment

Should we update the benchmark results?

@dongjoon-hyun
Member Author

Thank you, @LuciferYang. Let me regenerate them.

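| Case | Best Time(ms) | Avg Time(ms) | Stdev(ms) | Rate(M/s) | Per Row(ns) | Relative |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| Compression 10000 times at level 1 with buffer pool | 593 | 594 | 1 | 0.0 | 59304.9 | 1.1X |
| Compression 10000 times at level 2 with buffer pool | 626 | 627 | 1 | 0.0 | 62575.0 | 1.0X |
| Compression 10000 times at level 3 with buffer pool | 730 | 730 | 0 | 0.0 | 72959.7 | 0.9X |
| Compression 10000 times at level 1 without buffer pool | 1707 | 1708 | 2 | 0.0 | 170666.5 | 1.0X |

Contributor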

I wonder why these JDK 21 results are much worse than before?

Member Author

@dongjoon-hyun dongjoon-hyun May 28, 2026

Hi, @peter-toth. Since the underlying machine changes randomly between GitHub Actions runs (AMD EPYC 7763 64-Core Processor vs. Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz), we usually check the Relative column first. If the relative numbers show no significant change, it's okay. Technically, GitHub Actions is not a completely isolated environment in terms of performance.
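
For illustration only, here is a minimal Python sketch of the "check the Relative column first" approach described above. It is not part of this PR or of Spark's benchmark tooling. The helper names `parse_relative` and `relative_regressions`, the assumed row layout (case name, best ms, avg ms, stdev ms, rate, per-row ns, relative), and the 20% tolerance are all hypothetical choices made for this example.

```python
import re

# Assumed row layout: <case name> <best ms> <avg ms> <stdev ms>
# <rate M/s> <per-row ns> <relative>X
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<best>\d+)\s+(?P<avg>\d+)\s+(?P<stdev>\d+)\s+"
    r"(?P<rate>[\d.]+)\s+(?P<per_row>[\d.]+)\s+(?P<relative>[\d.]+)X$"
)


def parse_relative(lines):
    """Map each benchmark case name to its Relative value; skip non-matching lines."""
    result = {}
    for line in lines:
        match = ROW.match(line.strip())
        if match:
            result[match.group("name")] = float(match.group("relative"))
    return result


def relative_regressions(before, after, tolerance=0.2):
    """Return cases whose Relative value dropped by more than `tolerance`.

    `tolerance` is a fraction of the old value; 0.2 is an arbitrary
    threshold chosen for this sketch, not a project convention.
    """
    flagged = {}
    for name, old in before.items():
        new = after.get(name)
        if new is not None and old > 0 and (old - new) / old > tolerance:
            flagged[name] = (old, new)
    return flagged


# The four rows quoted in this conversation.
rows = [
    "Compression 10000 times at level 1 with buffer pool 593 594 1 0.0 59304.9 1.1X",
    "Compression 10000 times at level 2 with buffer pool 626 627 1 0.0 62575.0 1.0X",
    "Compression 10000 times at level 3 with buffer pool 730 730 0 0.0 72959.7 0.9X",
    "Compression 10000 times at level 1 without buffer pool 1707 1708 2 0.0 170666.5 1.0X",
]
relatives = parse_relative(rows)
for case, value in relatives.items():
    print(f"{case}: {value}X")
```

To compare two runs, parse the old and new result files with `parse_relative` and pass both dictionaries to `relative_regressions`. Absolute times are deliberately ignored, since they shift with the runner's CPU.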

@dongjoon-hyun
Member Author

Merged to master for Apache Spark 5.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-57117 branch May 28, 2026 15:11