Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline CHUNKCOPY and CHUNKUNROLL #1669

Merged
merged 2 commits into from
Feb 22, 2024

Conversation

ccawley2011
Copy link
Contributor

This slightly decreases the shared library size on x86_64 when both SSE2 and SSSE3 are enabled.

Copy link

codecov bot commented Feb 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (7745c28) 83.28% compared to head (b2aaa39) 83.01%.
Report is 13 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1669      +/-   ##
===========================================
- Coverage    83.28%   83.01%   -0.28%     
===========================================
  Files          135      135              
  Lines        10924    10365     -559     
  Branches      2815     2815              
===========================================
- Hits          9098     8604     -494     
+ Misses        1125     1063      -62     
+ Partials       701      698       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@mtl1979 mtl1979 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nmoinvaz nmoinvaz added Architecture Architecture specific optimization labels Feb 20, 2024
@Dead2
Copy link
Member

Dead2 commented Feb 21, 2024

X86-64 i9-9900K

Develop ca0e463 Feb 21

 Tool: minigzip   Levels: 1-9
 Runs: 70         Trim worst: 40

 Level   Comp   Comptime min/avg/max/stddev  Decomptime min/avg/max/stddev  Compressed size
 1     54.166%      0.150/0.154/0.156/0.001        0.034/0.036/0.038/0.001        8,523,732
 2     43.871%      0.244/0.249/0.250/0.001        0.032/0.036/0.038/0.002        6,903,609
 3     42.387%      0.295/0.297/0.299/0.001        0.031/0.034/0.036/0.001        6,670,099
 4     41.647%      0.317/0.324/0.326/0.002        0.030/0.034/0.035/0.001        6,553,723
 5     41.216%      0.344/0.352/0.354/0.002        0.032/0.034/0.035/0.001        6,485,938
 6     41.037%      0.390/0.397/0.400/0.003        0.030/0.033/0.035/0.001        6,457,776
 7     40.778%      0.493/0.498/0.503/0.003        0.030/0.034/0.035/0.001        6,416,919
 8     40.704%      0.605/0.613/0.618/0.004        0.029/0.033/0.034/0.001        6,405,244
 9     40.409%      0.912/0.919/0.925/0.004        0.029/0.033/0.035/0.002        6,358,951

 avg1  42.913%                        0.423                          0.034
 tot                                114.095                          9.195       60,775,991

   text    data     bss     dec     hex filename
 138190    1352       8  139550   2211e libz-ng.so.2

PR #1669

 Level   Comp   Comptime min/avg/max/stddev  Decomptime min/avg/max/stddev  Compressed size
 1     54.166%      0.148/0.154/0.155/0.002        0.031/0.036/0.038/0.002        8,523,732
 2     43.871%      0.244/0.249/0.250/0.001        0.032/0.036/0.038/0.001        6,903,609
 3     42.387%      0.292/0.296/0.298/0.001        0.030/0.034/0.036/0.002        6,670,099
 4     41.647%      0.318/0.324/0.326/0.002        0.030/0.034/0.035/0.001        6,553,723
 5     41.216%      0.344/0.350/0.353/0.002        0.032/0.034/0.036/0.001        6,485,938
 6     41.037%      0.392/0.397/0.400/0.002        0.031/0.033/0.035/0.001        6,457,776
 7     40.778%      0.492/0.498/0.502/0.003        0.031/0.033/0.035/0.001        6,416,919
 8     40.704%      0.606/0.612/0.615/0.003        0.030/0.033/0.035/0.001        6,405,244
 9     40.409%      0.909/0.916/0.921/0.003        0.029/0.033/0.035/0.001        6,358,951

 avg1  42.913%                        0.422                          0.034
 tot                                113.849                          9.179       60,775,991

   text    data     bss     dec     hex filename
 137766    1352       8  139126   21f76 libz-ng.so.2

Slightly smaller library size, as advertised.
No discernible speed difference.

@nmoinvaz
Copy link
Member

nmoinvaz commented Feb 21, 2024

I think this commit should be split into two. One for inlining, and the other for turning off mixed sse2/ssse3. The second commit should contain: This slightly decreases the shared library size on x86_64 when both SSE2 and SSSE3 are enabled. in the commit message.

@ccawley2011
Copy link
Contributor Author

I think this commit should be split into two. One for inlining, and the other for turning off mixed sse2/ssse3. The second commit should contain: This slightly decreases the shared library size on x86_64 when both SSE2 and SSSE3 are enabled. in the commit message.

OK, done.

@Dead2
Copy link
Member

Dead2 commented Feb 22, 2024

I think perhaps these commits should be reordered. If the functions are static inline, then the ssse3 one won't find the sse2 functions as they won't exist any more, and cause a compiler error when compiling the first commit (bad for bisecting)
Also, I suspect that it is probably the static inline part that actually reduces library size.

This slightly decreases the shared library size on x86_64 when both SSE2 and SSSE3 are enabled.
@ccawley2011
Copy link
Contributor Author

I think perhaps these commits should be reordered. If the functions are static inline, then the ssse3 one won't find the sse2 functions as they won't exist any more, and cause a compiler error when compiling the first commit (bad for bisecting) Also, I suspect that it is probably the static inline part that actually reduces library size.

Good point, fixed.

@Dead2 Dead2 merged commit 7cca3e6 into zlib-ng:develop Feb 22, 2024
139 of 140 checks passed
@ccawley2011 ccawley2011 deleted the chunkset-ssse3-inline branch February 22, 2024 22:01
@Dead2 Dead2 mentioned this pull request May 30, 2024
@Dead2 Dead2 mentioned this pull request Jun 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Architecture Architecture specific optimization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants