Changed the decoding loop to detect more invalid cases of corruption sooner #3677

Merged: Cyan4973 merged 5 commits into dev from detectOverflow on Jul 5, 2023

@Cyan4973 Cyan4973 commented Jun 16, 2023

The main objective of this PR is to detect an additional invalid case of corruption without reliance on the checksum.
For the record, many cases of corruption are possible, and several of them are undetectable, except by the final checksum.
Even for those which are theoretically detectable, such detection must remain practical, i.e. not cost a lot of performance nor increase complexity too much.

This is one of them. Prior attempts to add this one corruption case to the list of early-detected ones were unsuccessful, as they led to more complex code on top of slower decompression speed. Upon discussion with @ip7z, I decided to have another look at the topic.

The newly proposed change fixes the issue and, imho, makes the code better (i.e. more readable) for maintenance. To achieve this, I had to modify the main decoding loop, which changes the scope and interface of the decodeSequence function. All decoding loops were impacted, though the changes are more pronounced for the splitLitBuffer variant.
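To make the idea concrete, here is a minimal sketch in Python of a decoding loop that checks for bitstream exhaustion after every sequence instead of leaving it to a final checksum. It is a toy under loud assumptions: `BitReader`, `decode_sequence`, `decode_all` and the fixed 4/4/8-bit field layout are all invented for illustration and do not reflect zstd's bitstream format or its actual decodeSequence code. Treating overflow as the corruption case is only a guess suggested by the branch name `detectOverflow`.

```python
class BitReader:
    """Toy forward, LSB-first bit reader (hypothetical, not zstd's bitstream)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0                 # bits consumed so far
        self.limit = len(data) * 8   # bits actually present in the input

    def read_bits(self, n: int) -> int:
        value = 0
        for i in range(n):
            if self.pos < self.limit:
                byte_index, bit_index = divmod(self.pos, 8)
                bit = (self.data[byte_index] >> bit_index) & 1
            else:
                # Past the end: keep going on zero padding instead of branching
                # out, and let the caller check for overflow afterwards.
                bit = 0
            value |= bit << i
            self.pos += 1
        return value

    def overflowed(self) -> bool:
        return self.pos > self.limit


def decode_sequence(reader: BitReader):
    """Read one (literal length, match length, offset) triple: 4 + 4 + 8 bits."""
    lit_len = reader.read_bits(4)
    match_len = reader.read_bits(4)
    offset = reader.read_bits(8)
    return lit_len, match_len, offset


def decode_all(data: bytes, nb_sequences: int):
    reader = BitReader(data)
    sequences = []
    for _ in range(nb_sequences):
        seq = decode_sequence(reader)
        # Early detection: reject as soon as a sequence needed more bits than
        # the input holds, rather than finishing the block on garbage.
        if reader.overflowed():
            raise ValueError("corrupted input: bitstream exhausted before the last sequence")
        sequences.append(seq)
    if reader.pos != reader.limit:
        raise ValueError("corrupted input: bitstream not fully consumed")
    return sequences
```

In this toy, a 4-byte input holds exactly two sequences: asking for three trips the in-loop check on the third iteration, while asking for one trips the end-of-stream check.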

In terms of performance, the outcome is mixed.
Modifying the hottest loop in the code is bound to impact performance measurably, even if the generated assembly changes only in a minor way. We also know that this code is incredibly sensitive to instruction alignment side effects, which are essentially random, so we expect fairly large swings in either direction.

To verify this, the decompression speed of this patch was benched on an i7-9700k workstation with several different compilers and versions. Here is the detailed outcome, comparing this commit (left) with dev branch (right):

compile with gcc-7                                                                │compile with gcc-7
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  722.6 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  746.2 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  885.4 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  946.8 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  379.6 MB/s, 1169.1 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  370.7 MB/s, 1227.4 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  285.8 MB/s, 1036.2 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  279.1 MB/s, 1101.8 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  212.6 MB/s, 1010.5 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  209.6 MB/s, 1071.1 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  151.9 MB/s,  818.0 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  150.3 MB/s,  872.7 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  109.6 MB/s,  984.4 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  109.5 MB/s, 1055.1 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   86.0 MB/s,  784.6 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   85.9 MB/s,  850.8 MB/s
compile with gcc-8                                                                │compile with gcc-8
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  742.5 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  655.9 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  886.1 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  917.9 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  374.5 MB/s, 1124.9 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  379.7 MB/s, 1197.5 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  278.4 MB/s,  982.2 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  283.9 MB/s, 1058.7 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  211.5 MB/s,  982.4 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  206.0 MB/s, 1037.2 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  150.1 MB/s,  799.9 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  147.2 MB/s,  842.3 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  103.2 MB/s,  954.5 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),   98.7 MB/s, 1009.6 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   82.2 MB/s,  761.1 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   79.0 MB/s,  806.1 MB/s
compile with gcc-9                                                                │compile with gcc-9
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  731.4 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  693.0 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  917.2 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  878.8 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  375.4 MB/s, 1176.1 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  377.6 MB/s, 1206.9 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  277.3 MB/s, 1021.5 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  279.4 MB/s, 1073.8 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  207.7 MB/s, 1017.5 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  212.0 MB/s, 1034.7 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  149.3 MB/s,  809.0 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  151.7 MB/s,  836.5 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  115.8 MB/s, 1003.9 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  115.1 MB/s,  999.9 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   91.1 MB/s,  791.1 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   90.6 MB/s,  795.1 MB/s
compile with gcc-10                                                               │compile with gcc-10
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  754.4 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  730.1 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  939.2 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s, 1010.1 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  379.9 MB/s, 1204.2 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  373.3 MB/s, 1311.9 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  280.1 MB/s, 1073.0 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  278.9 MB/s, 1164.2 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  213.2 MB/s, 1059.2 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  217.5 MB/s, 1146.1 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  152.1 MB/s,  878.8 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  155.7 MB/s,  933.5 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  114.3 MB/s, 1039.8 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  114.7 MB/s, 1125.0 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   90.0 MB/s,  847.2 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   90.2 MB/s,  904.5 MB/s
compile with gcc-11                                                               │compile with gcc-11
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  717.4 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  724.5 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  916.3 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  936.8 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  375.7 MB/s, 1181.5 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  375.9 MB/s, 1186.7 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  282.8 MB/s, 1034.4 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  280.2 MB/s, 1036.7 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  210.8 MB/s, 1039.3 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  207.0 MB/s, 1037.7 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  151.1 MB/s,  852.7 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  148.1 MB/s,  842.6 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  112.9 MB/s, 1011.4 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  113.6 MB/s, 1026.5 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   88.1 MB/s,  813.6 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   89.2 MB/s,  821.1 MB/s
compile with clang-6.0                                                            │compile with clang-6.0
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  693.0 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  756.6 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  897.2 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  932.8 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  386.3 MB/s, 1164.7 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  387.4 MB/s, 1222.4 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  285.5 MB/s, 1031.7 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  287.7 MB/s, 1091.3 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  212.4 MB/s, 1016.3 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  207.3 MB/s, 1058.9 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  153.0 MB/s,  828.7 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  150.4 MB/s,  863.9 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  115.8 MB/s,  981.2 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  119.6 MB/s, 1044.5 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   90.5 MB/s,  783.4 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   93.7 MB/s,  842.5 MB/s
compile with clang-7                                                              │compile with clang-7
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  763.9 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  739.3 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  919.2 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  932.8 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  369.8 MB/s, 1187.8 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  377.3 MB/s, 1222.3 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  272.3 MB/s, 1057.4 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  279.8 MB/s, 1092.8 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  215.1 MB/s, 1034.8 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  214.3 MB/s, 1067.4 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  154.0 MB/s,  843.7 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  152.9 MB/s,  874.0 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  118.0 MB/s, 1002.5 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  116.4 MB/s, 1031.5 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   91.7 MB/s,  800.6 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   90.9 MB/s,  827.3 MB/s
compile with clang-8                                                              │compile with clang-8
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  683.5 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  754.2 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  962.9 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  927.6 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  393.3 MB/s, 1253.5 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  393.7 MB/s, 1217.8 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  286.4 MB/s, 1104.6 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  291.3 MB/s, 1086.0 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  207.4 MB/s, 1085.2 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  205.8 MB/s, 1054.2 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  149.9 MB/s,  874.1 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  148.4 MB/s,  856.8 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  118.4 MB/s, 1066.3 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  118.3 MB/s, 1033.3 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   90.8 MB/s,  848.4 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   91.3 MB/s,  829.8 MB/s
compile with clang-9                                                              │compile with clang-9
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  751.9 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  784.2 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  927.2 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  932.0 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  373.3 MB/s, 1209.4 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  370.7 MB/s, 1245.0 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  275.8 MB/s, 1063.2 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  280.4 MB/s, 1108.9 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  204.8 MB/s, 1055.4 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  204.6 MB/s, 1069.6 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  147.3 MB/s,  854.0 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  147.2 MB/s,  863.5 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  117.4 MB/s, 1035.9 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  116.9 MB/s, 1043.7 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   92.6 MB/s,  828.0 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   92.0 MB/s,  830.1 MB/s
compile with clang-10                                                             │compile with clang-10
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  747.2 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  759.1 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  935.8 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  941.3 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  387.5 MB/s, 1211.0 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  387.7 MB/s, 1240.6 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  286.6 MB/s, 1064.7 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  288.4 MB/s, 1098.9 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  211.1 MB/s, 1068.7 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  204.6 MB/s, 1071.4 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  153.1 MB/s,  870.8 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  149.3 MB/s,  868.8 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  120.1 MB/s, 1039.7 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  120.0 MB/s, 1056.2 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   93.5 MB/s,  830.2 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   93.3 MB/s,  846.2 MB/s
compile with clang-11                                                             │compile with clang-11
 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  752.5 MB/s  │ 3#enwik9.L22.zst    :1000000000 -> 215031773 (x4.650),   0.00 MB/s,  726.7 MB/s
 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  959.2 MB/s  │ 3#lesia.tar.L19.zst : 211957760 ->  52990423 (x4.000),   0.00 MB/s,  944.2 MB/s
 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  377.4 MB/s, 1261.9 MB/s  │ 1#silesia.tar       : 211957760 ->  73422067 (x2.887),  372.3 MB/s, 1242.9 MB/s
 1#enwik8            : 100000000 ->  40667563 (x2.459),  278.4 MB/s, 1109.6 MB/s  │ 1#enwik8            : 100000000 ->  40667563 (x2.459),  279.0 MB/s, 1102.7 MB/s
 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  206.2 MB/s, 1096.7 MB/s  │ 3#silesia.tar       : 211957760 ->  66523984 (x3.186),  210.1 MB/s, 1081.9 MB/s
 3#enwik8            : 100000000 ->  35461800 (x2.820),  149.8 MB/s,  883.3 MB/s  │ 3#enwik8            : 100000000 ->  35461800 (x2.820),  152.4 MB/s,  882.3 MB/s
 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  119.6 MB/s, 1079.4 MB/s  │ 5#silesia.tar       : 211957760 ->  63040521 (x3.362),  119.2 MB/s, 1058.2 MB/s
 5#enwik8            : 100000000 ->  33702880 (x2.967),   93.2 MB/s,  859.7 MB/s  │ 5#enwik8            : 100000000 ->  33702880 (x2.967),   92.5 MB/s,  848.4 MB/s

As expected, performance changes were essentially random, depending on compiler version. One could say they are rather more favorable and more stable for gcc, and rather unfavorable for clang, mostly due to 2 bad versions. But this is just because the starting point of these comparisons (dev branch) was also randomly impacted by instruction alignment, and was a bit more detrimental to the gcc baseline and more advantageous to clang.

So far, no surprise, nothing conclusive. It's just a pity that such a setup doesn't allow us to detect small changes (~1% range) with confidence due to the much larger random impact of instruction alignment.
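As a rough illustration of how wide those swings are, the sketch below recomputes the relative change for a single row of the table (level 1 on silesia.tar) for each compiler. The figures are copied from the benchmark output above, with the left column taken as this patch and the right as dev, as stated. The names `SPEEDS`, `relative_change` and `changes_by_compiler` are made up for this example.

```python
# Level-1 silesia.tar decompression speeds in MB/s, copied from the table above.
# compiler -> (this patch, dev branch)
SPEEDS = {
    "gcc-7": (1169.1, 1227.4),
    "gcc-8": (1124.9, 1197.5),
    "gcc-9": (1176.1, 1206.9),
    "gcc-10": (1204.2, 1311.9),
    "gcc-11": (1181.5, 1186.7),
    "clang-6.0": (1164.7, 1222.4),
    "clang-7": (1187.8, 1222.3),
    "clang-8": (1253.5, 1217.8),
    "clang-9": (1209.4, 1245.0),
    "clang-10": (1211.0, 1240.6),
    "clang-11": (1261.9, 1242.9),
}


def relative_change(patch: float, dev: float) -> float:
    """Change of patch versus dev, in percent (negative means slower)."""
    return (patch / dev - 1.0) * 100.0


def changes_by_compiler(speeds):
    return {name: relative_change(patch, dev) for name, (patch, dev) in speeds.items()}


changes = changes_by_compiler(SPEEDS)
for name, pct in changes.items():
    print(f"{name:10s} {pct:+6.2f}%")

# Distance between the best and worst compiler for the same source change.
spread = max(changes.values()) - min(changes.values())
print(f"spread across compilers: {spread:.1f} percentage points")
```

When the spread between compilers is several points wide for the same source change, a genuine effect in the ~1% range cannot be separated from alignment noise with a single compiler.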

To complete the picture, I'm adding tests for the M1 Pro platform. As the cpu architecture is radically different, I was hoping that issues such as the random impact of instruction alignment would not be present there.

I was too optimistic.
The effect of this change is pretty positive when compiling with default system compiler (Apple clang version 14.0.3 (clang-1403.0.22.14.1)) :

./zstd -b1e6i5 ~/dev/bench/silesia.tar            │
 1#silesia.tar       : 211972608 ->  73422432 (x2.887),  610.8 MB/s, 1530.8 MB/s │ 1#silesia.tar       : 211972608 ->  73422432 (x2.887),  609.9 MB/s, 1480.5 MB/s
 2#silesia.tar       : 211972608 ->  69499071 (x3.050),  461.2 MB/s, 1419.2 MB/s │ 2#silesia.tar       : 211972608 ->  69499071 (x3.050),  468.2 MB/s, 1361.7 MB/s
 3#silesia.tar       : 211972608 ->  66523575 (x3.186),  383.3 MB/s, 1413.0 MB/s │ 3#silesia.tar       : 211972608 ->  66523575 (x3.186),  383.0 MB/s, 1340.1 MB/s
 4#silesia.tar       : 211972608 ->  65324711 (x3.245),  351.6 MB/s, 1412.6 MB/s │ 4#silesia.tar       : 211972608 ->  65324711 (x3.245),  350.6 MB/s, 1334.0 MB/s
 5#silesia.tar       : 211972608 ->  63045668 (x3.362),  183.1 MB/s, 1409.6 MB/s │ 5#silesia.tar       : 211972608 ->  63045668 (x3.362),  182.8 MB/s, 1329.0 MB/s
 6#silesia.tar       : 211972608 ->  61547760 (x3.444),  128.5 MB/s, 1502.9 MB/s │ 6#silesia.tar       : 211972608 ->  61547760 (x3.444),  128.3 MB/s, 1419.3 MB/s

This is a non-negligible +4-5% decompression speed improvement across the board, not bad!

Unfortunately, the trend reverses when using gcc, provided by brew :

compile with gcc-11                                                              │compile with gcc-11
 1#silesia.tar       : 211972608 ->  73422432 (x2.887),  609.2 MB/s, 1538.9 MB/s │ 1#silesia.tar       : 211972608 ->  73422432 (x2.887),  602.1 MB/s, 1723.6 MB/s
 2#silesia.tar       : 211972608 ->  69499071 (x3.050),  470.1 MB/s, 1439.7 MB/s │ 2#silesia.tar       : 211972608 ->  69499071 (x3.050),  467.1 MB/s, 1676.9 MB/s
 3#silesia.tar       : 211972608 ->  66523575 (x3.186),  378.4 MB/s, 1431.7 MB/s │ 3#silesia.tar       : 211972608 ->  66523575 (x3.186),  374.9 MB/s, 1699.0 MB/s
 4#silesia.tar       : 211972608 ->  65324711 (x3.245),  347.4 MB/s, 1431.6 MB/s │ 4#silesia.tar       : 211972608 ->  65324711 (x3.245),  344.6 MB/s, 1720.1 MB/s
 5#silesia.tar       : 211972608 ->  63045668 (x3.362),  187.6 MB/s, 1427.0 MB/s │ 5#silesia.tar       : 211972608 ->  63045668 (x3.362),  187.7 MB/s, 1720.1 MB/s
 6#silesia.tar       : 211972608 ->  61547760 (x3.444),  133.8 MB/s, 1522.5 MB/s │ 6#silesia.tar       : 211972608 ->  61547760 (x3.444),  133.9 MB/s, 1823.1 MB/s

compile with gcc-12                                                              │compile with gcc-12
 1#silesia.tar       : 211972608 ->  73422432 (x2.887),  613.7 MB/s, 1552.4 MB/s │ 1#silesia.tar       : 211972608 ->  73422432 (x2.887),  613.2 MB/s, 1710.3 MB/s
 2#silesia.tar       : 211972608 ->  69499071 (x3.050),  472.3 MB/s, 1454.3 MB/s │ 2#silesia.tar       : 211972608 ->  69499071 (x3.050),  472.2 MB/s, 1664.2 MB/s
 3#silesia.tar       : 211972608 ->  66523575 (x3.186),  377.9 MB/s, 1445.8 MB/s │ 3#silesia.tar       : 211972608 ->  66523575 (x3.186),  377.5 MB/s, 1680.0 MB/s
 4#silesia.tar       : 211972608 ->  65324711 (x3.245),  347.7 MB/s, 1448.7 MB/s │ 4#silesia.tar       : 211972608 ->  65324711 (x3.245),  346.8 MB/s, 1700.7 MB/s
 5#silesia.tar       : 211972608 ->  63045668 (x3.362),  184.5 MB/s, 1444.7 MB/s │ 5#silesia.tar       : 211972608 ->  63045668 (x3.362),  184.3 MB/s, 1696.3 MB/s
 6#silesia.tar       : 211972608 ->  61547760 (x3.444),  129.4 MB/s, 1541.2 MB/s │ 6#silesia.tar       : 211972608 ->  61547760 (x3.444),  129.5 MB/s, 1798.0 MB/s

compile with gcc-13                                                              │compile with gcc-13
 1#silesia.tar       : 211972608 ->  73422432 (x2.887),  617.4 MB/s, 1550.3 MB/s │ 1#silesia.tar       : 211972608 ->  73422432 (x2.887),  617.0 MB/s, 1713.0 MB/s
 2#silesia.tar       : 211972608 ->  69499071 (x3.050),  477.0 MB/s, 1454.3 MB/s │ 2#silesia.tar       : 211972608 ->  69499071 (x3.050),  476.1 MB/s, 1663.7 MB/s
 3#silesia.tar       : 211972608 ->  66523575 (x3.186),  379.6 MB/s, 1446.0 MB/s │ 3#silesia.tar       : 211972608 ->  66523575 (x3.186),  379.0 MB/s, 1682.8 MB/s
 4#silesia.tar       : 211972608 ->  65324711 (x3.245),  349.6 MB/s, 1449.9 MB/s │ 4#silesia.tar       : 211972608 ->  65324711 (x3.245),  348.6 MB/s, 1700.3 MB/s
 5#silesia.tar       : 211972608 ->  63045668 (x3.362),  184.4 MB/s, 1445.7 MB/s │ 5#silesia.tar       : 211972608 ->  63045668 (x3.362),  183.7 MB/s, 1697.9 MB/s
 6#silesia.tar       : 211972608 ->  61547760 (x3.444),  129.3 MB/s, 1539.8 MB/s │ 6#silesia.tar       : 211972608 ->  61547760 (x3.444),  129.2 MB/s, 1800.4 MB/s

Now we are talking about a pretty severe 10-12% decompression speed drop compared to dev!

Yet, there are a few considerations.
To begin with, the performance of gcc on the dev branch is exceptionally stellar. We are talking about a ~+20% performance advantage over clang! This is impressive.
Even after the change, where clang gains +4-5% while gcc loses 10-12%, gcc is still in the lead, though by a reduced margin of +2-3%.
This makes me wonder where the exceptional performance of gcc on the dev branch comes from.
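For anyone who wants to recompute these margins, here is a small sketch that averages the per-level ratios from the M1 Pro tables above, using Apple clang and gcc-12 and reading the left column as this patch and the right as dev. `mean_change_pct` and the constant names are invented for the example. The averages depend on which rows and which gcc version are included, so they need not land exactly on the rounded figures quoted in the text.

```python
from statistics import mean

# Decompression speeds in MB/s on silesia.tar, levels 1 to 6, copied from the
# M1 Pro tables above (left column = this patch, right column = dev).
CLANG_PATCH = [1530.8, 1419.2, 1413.0, 1412.6, 1409.6, 1502.9]
CLANG_DEV = [1480.5, 1361.7, 1340.1, 1334.0, 1329.0, 1419.3]
GCC12_PATCH = [1552.4, 1454.3, 1445.8, 1448.7, 1444.7, 1541.2]
GCC12_DEV = [1710.3, 1664.2, 1680.0, 1700.7, 1696.3, 1798.0]


def mean_change_pct(new, old):
    """Arithmetic mean of the per-level ratios new/old, as a percent change."""
    return (mean(n / o for n, o in zip(new, old)) - 1.0) * 100.0


clang_change = mean_change_pct(CLANG_PATCH, CLANG_DEV)    # patch vs dev, clang
gcc_change = mean_change_pct(GCC12_PATCH, GCC12_DEV)      # patch vs dev, gcc-12
gcc_lead_dev = mean_change_pct(GCC12_DEV, CLANG_DEV)      # gcc vs clang on dev
gcc_lead_patch = mean_change_pct(GCC12_PATCH, CLANG_PATCH)  # gcc vs clang after the patch

print(f"clang, patch vs dev : {clang_change:+.1f}%")
print(f"gcc-12, patch vs dev: {gcc_change:+.1f}%")
print(f"gcc lead on dev     : {gcc_lead_dev:+.1f}%")
print(f"gcc lead after patch: {gcc_lead_patch:+.1f}%")
```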

This could be due to one or a combination of effects. Two come to mind:

  1. M1 Pro is actually not immune to random instruction alignment side-effects, or other cpu architecture side-effects that are effectively uncontrollable from C. gcc was simply "lucky" when generating the dev binary, and no longer after the change.
  2. gcc might have performance heuristics that happen to work well with the previous decoding loop, but no longer get triggered properly after the change introduced by this patch.

Given that all 3 gcc versions tested have the same behavior, the explanation 2) feels a bit more likely. In which case, it would be interesting to understand why, and find a mitigation which allows gcc to shine again. But this is difficult to investigate. There is no equivalent to perf counter on macos. I presume understanding the performance profile of the generated binary implies better proficiency with Xcode tooling, and Xcode might be tied to clang.

Now, to be fair, on macos M1 Pro, I would also expect clang to be a more common compiler than gcc, making clang results a bit more important for this platform.

To summarize, if we ignore performance results or consider them non-conclusive, I am in favor of this PR, because :

  1. It achieves the objective (add an additional corruption case that can be detected without checksum)
  2. It makes the decoding loop code (slightly) easier to read and maintain

I believe that both of these properties are desirable.

@Cyan4973 Cyan4973 changed the title from "Changed the decoding loop to detect more cases of invalid corruption sooner" to "Changed the decoding loop to detect more invalid cases of corruption sooner" Jun 16, 2023
@Cyan4973 Cyan4973 self-assigned this Jun 16, 2023
make a mock initialization to please the tool
@Cyan4973 Cyan4973 marked this pull request as ready for review June 17, 2023 00:04
@terrelln terrelln left a comment

The change looks correct to me. My server is off right now, but I will benchmark this when I get home today, just to double check perf on my machine.

@terrelln

I've measured on my devserver and measure a 2.5% regression with our version of clang, and a 4% regression with our version of gcc. I still have to measure on my home server.

@Cyan4973

> I've measured on my devserver and measure a 2.5% regression with our version of clang, and a 4% regression with our version of gcc. I still have to measure on my home server.

Not sure if you have access to multiple versions of these compilers,
this would be useful to distinguish random instruction alignment impacts
from a genuine performance regression.

@terrelln

> Not sure if you have access to multiple versions of these compilers,
> this would be useful to distinguish random instruction alignment impacts
> from a genuine performance regression.

Unfortunately not.

I'm okay landing this. I like the move of the reload into decode sequence; it is logical and seems like it should be "good". When we get closer to a release, we could re-measure, and if we still notice a regression, we could attempt to recoup some of the loss.

@Cyan4973 Cyan4973 merged commit 118200f into dev Jul 5, 2023
91 checks passed
@Cyan4973 Cyan4973 deleted the detectOverflow branch September 12, 2023 20:45
anakinxc pushed a commit to secretflow/spu that referenced this pull request Mar 27, 2024
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [com_github_facebook_zstd](https://togithub.com/facebook/zstd) | http_archive | patch | `v1.5.5` -> `v1.5.6` |

---

### Release Notes

<details>
<summary>facebook/zstd (com_github_facebook_zstd)</summary>

### [`v1.5.6`](https://togithub.com/facebook/zstd/releases/tag/v1.5.6): Zstandard v1.5.6 - Chrome Edition

[Compare Source](https://togithub.com/facebook/zstd/compare/v1.5.5-kernel...v1.5.6)

This release highlights the deployment of Google [Chrome
123](https://developer.chrome.com/blog/new-in-chrome-123), introducing
`zstd-encoding` for Web traffic, introduced as a preferable option for
compression of dynamic contents. With limited web server support for
`zstd-encoding` due to its novelty, we are launching an updated
Zstandard version to facilitate broader adoption.

##### New stable parameter `ZSTD_c_targetCBlockSize`

Using `zstd` compression for large documents over the Internet, data is
segmented into smaller blocks of up to 128 KB, for incremental updates.
This is crucial for applications like Chrome that process parts of
documents as they arrive. However, on slow or congested networks, there
can be some brief unresponsiveness in the middle of a block
transmission, delaying update. To mitigate such scenarios, `libzstd`
introduces the new parameter `ZSTD_c_targetCBlockSize`, enabling the
division of blocks into even smaller segments to enhance initial byte
delivery speed. Activating this feature incurs a cost, both runtime
(equivalent to -2% speed at level 8) and a slight compression efficiency
decrease (<0.1%), but offers some interesting latency reduction, notably
beneficial in areas with less powerful network infrastructure.

##### Granular binary size selection

`libzstd` provides build customization, including options to compile
only the compression or decompression modules, minimizing binary size.
Enhanced in `v1.5.6`
([source](https://togithub.com/facebook/zstd/tree/dev/lib#modular-build)),
it now allows for even finer control by enabling selective inclusion or
exclusion of specific components within these modules. This advancement
aids applications needing precise binary size management.

##### Miscellaneous Enhancements

This release includes various minor enhancements and bug fixes to
enhance user experience. Key updates include an expanded list of
recognized compressed file suffixes for the `--exclude-compressed` flag,
improving efficiency by skipping presumed incompressible content.
Furthermore, compatibility has been broadened to include additional
chipsets (`sparc64`, `ARM64EC`, `risc-v`) and operating systems (`QNX`,
`AIX`, `Solaris`, `HP-UX`).

#### Change Log

api: Promote `ZSTD_c_targetCBlockSize` to Stable API by [@felixhandte](https://togithub.com/felixhandte)
api: new experimental `ZSTD_d_maxBlockSize` parameter, to reduce streaming decompression memory, by [@terrelln](https://togithub.com/terrelln)
perf: improve performance of param `ZSTD_c_targetCBlockSize`, by [@Cyan4973](https://togithub.com/Cyan4973)
perf: improved compression of arrays of integers at high compression, by [@Cyan4973](https://togithub.com/Cyan4973)
lib: reduce binary size with selective build-time exclusion, by [@felixhandte](https://togithub.com/felixhandte)
lib: improved huffman speed on small data and linux kernel, by [@terrelln](https://togithub.com/terrelln)
lib: accept dictionaries with partial literal tables, by [@terrelln](https://togithub.com/terrelln)
lib: fix CCtx size estimation with external sequence producer, by [@embg](https://togithub.com/embg)
lib: fix corner case decoder behaviors, by [@Cyan4973](https://togithub.com/Cyan4973) and [@aimuz](https://togithub.com/aimuz)
lib: fix zdict prototype mismatch in static_only mode, by [@ldv-alt](https://togithub.com/ldv-alt)
lib: fix several bugs in magicless-format decoding, by [@embg](https://togithub.com/embg)
cli: add common compressed file types to `--exclude-compressed` by [@daniellerozenblit](https://togithub.com/daniellerozenblit) (requested by [@dcog989](https://togithub.com/dcog989))
cli: fix mixing `-c` and `-o` commands with `--rm`, by [@Cyan4973](https://togithub.com/Cyan4973)
cli: fix erroneous exclusion of hidden files with `--output-dir-mirror` by [@felixhandte](https://togithub.com/felixhandte)
cli: improved time accuracy on BSD, by [@felixhandte](https://togithub.com/felixhandte)
cli: better errors on argument parsing, by [@KapJI](https://togithub.com/KapJI)
tests: better compatibility with older versions of `grep`, by [@Cyan4973](https://togithub.com/Cyan4973)
tests: lorem ipsum generator as default content generator, by [@Cyan4973](https://togithub.com/Cyan4973)
build: cmake improvements by [@terrelln](https://togithub.com/terrelln), [@sighingnow](https://togithub.com/sighingnow), [@gjasny](https://togithub.com/gjasny), [@JohanMabille](https://togithub.com/JohanMabille), [@Saverio976](https://togithub.com/Saverio976), [@gruenich](https://togithub.com/gruenich), [@teo-tsirpanis](https://togithub.com/teo-tsirpanis)
build: bazel support, by [@jondo2010](https://togithub.com/jondo2010)
build: fix cross-compiling for AArch64 with lld by [@jcelerier](https://togithub.com/jcelerier)
build: fix Apple platform compatibility, by [@nidhijaju](https://togithub.com/nidhijaju)
build: fix Visual 2012 and lower compatibility, by [@Cyan4973](https://togithub.com/Cyan4973)
build: improve win32 support, by [@DimitriPapadopoulos](https://togithub.com/DimitriPapadopoulos)
build: better C90 compliance for zlibWrapper, by [@emaste](https://togithub.com/emaste)
port: make: fat binaries on macos, by [@mredig](https://togithub.com/mredig)
port: ARM64EC compatibility for Windows, by [@dunhor](https://togithub.com/dunhor)
port: QNX support by [@klausholstjacobsen](https://togithub.com/klausholstjacobsen)
port: MSYS2 and Cygwin makefile installation and test support, by [@QBos07](https://togithub.com/QBos07)
port: risc-v support validation in CI, by [@Cyan4973](https://togithub.com/Cyan4973)
port: sparc64 support validation in CI, by [@Cyan4973](https://togithub.com/Cyan4973)
port: AIX compatibility, by [@likema](https://togithub.com/likema)
port: HP-UX compatibility, by [@likema](https://togithub.com/likema)
doc: Improved specification accuracy, by [@elasota](https://togithub.com/elasota)
bug: Fix and deprecate ZSTD_generateSequences
([#&#8203;3981](https://togithub.com/facebook/zstd/issues/3981)), by
[@&#8203;terrelln](https://togithub.com/terrelln)

#### Full change list (auto-generated)

- Add win32 to windows-artifacts.yml by
[@&#8203;Kim-SSi](https://togithub.com/Kim-SSi) in
[facebook/zstd#3600
- Fix mmap-dict help output by
[@&#8203;daniellerozenblit](https://togithub.com/daniellerozenblit) in
[facebook/zstd#3601
- \[oss-fuzz] Fix simple_round_trip fuzzer with overlapping
decompression by [@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3612
- Reduce streaming decompression memory by (128KB - blockSizeMax) by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3616
- removed travis & appveyor scripts by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3621
- Add ZSTD_d_maxBlockSize parameter by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3617
- \[doc] add decoder errata paragraph by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3620
- add makefile entry to build fat binary on macos by
[@&#8203;mredig](https://togithub.com/mredig) in
[facebook/zstd#3614
- Disable unused variable warning in msan configurations by
[@&#8203;danlark1](https://togithub.com/danlark1) in
[facebook/zstd#3624

[facebook/zstd#3634
- Allow Build-Time Exclusion of Individual Compression Strategies by
[@&#8203;felixhandte](https://togithub.com/felixhandte) in
[facebook/zstd#3623
- Get zstd working with ARM64EC on Windows by
[@&#8203;dunhor](https://togithub.com/dunhor) in
[facebook/zstd#3636
- minor : update streaming_compression example by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3631
- Fix UBSAN issue (zero addition to NULL) by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3658
- Add options in Makefile to cmake by
[@&#8203;sighingnow](https://togithub.com/sighingnow) in
[facebook/zstd#3657
- fix a minor inefficiency in compress_superblock by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3668
- Fixed a bug in the educational decoder by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3659
- changed LLU suffix into ULL for Visual 2012 and lower by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3664
- fixed decoder behavior when nbSeqs==0 is encoded using 2 bytes by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3669
- detect extraneous bytes in the Sequences section by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3674
- Bitstream produces only zeroes after an overflow event by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3676
- Update FreeBSD CI images to latest supported releases by
[@&#8203;emaste](https://togithub.com/emaste) in
[facebook/zstd#3684
- Clean up a false error message in the LDM debug log by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3686
- Hide ASM symbols on Apple platforms by
[@&#8203;nidhijaju](https://togithub.com/nidhijaju) in
[facebook/zstd#3688
- Changed the decoding loop to detect more invalid cases of corruption
sooner by [@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3677
- Fix Intel Xcode builds with assembly by
[@&#8203;gjasny](https://togithub.com/gjasny) in
[facebook/zstd#3665
- Save one byte on the frame epilogue by
[@&#8203;Coder-256](https://togithub.com/Coder-256) in
[facebook/zstd#3700
- Update fileio.c: fix build failure with enabled LTO by
[@&#8203;LocutusOfBorg](https://togithub.com/LocutusOfBorg) in
[facebook/zstd#3695
- fileio_asyncio: handle malloc fails in AIO_ReadPool_create by
[@&#8203;void0red](https://togithub.com/void0red) in
[facebook/zstd#3704
- Fix typographical error in README.md by
[@&#8203;nikohoffren](https://togithub.com/nikohoffren) in
[facebook/zstd#3701
- Fixed typo by
[@&#8203;alexsifivetw](https://togithub.com/alexsifivetw) in
[facebook/zstd#3712
- Improve dual license wording in README by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3718
- Unpoison Workspace Memory Before Custom-Free by
[@&#8203;felixhandte](https://togithub.com/felixhandte) in
[facebook/zstd#3725
- added ZSTD_decompressDCtx() benchmark option to fullbench by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3726
- No longer reject dictionaries with literals maxSymbolValue < 255 by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3731
- fix: ZSTD_BUILD_DECOMPRESSION message by
[@&#8203;0o001](https://togithub.com/0o001) in
[facebook/zstd#3728
- Updated Makefiles for full MSYS2 and Cygwin installation and testing …
by [@&#8203;QBos07](https://togithub.com/QBos07) in
[facebook/zstd#3720
- Work around nullptr-with-nonzero-offset warning by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3738
- Fix & refactor Huffman repeat tables for dictionaries by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3737
- zdictlib: fix prototype mismatch by
[@&#8203;ldv-alt](https://togithub.com/ldv-alt) in
[facebook/zstd#3733
- Fixed zstd cmake shared build on windows by
[@&#8203;JohanMabille](https://togithub.com/JohanMabille) in
[facebook/zstd#3739
- Added qnx in the posix test section of platform.h by
[@&#8203;klausholstjacobsen](https://togithub.com/klausholstjacobsen) in
[facebook/zstd#3745
- added some documentation on ZSTD_estimate\*Size() variants by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3755
- Improve macro guards for ZSTD_assertValidSequence by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3770
- Stop suppressing pointer-overflow UBSAN errors by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3776
- fix x32 tests on Github CI by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3777
- Fix new typos found by codespell by
[@&#8203;DimitriPapadopoulos](https://togithub.com/DimitriPapadopoulos)
in
[facebook/zstd#3771
- Do not test WIN32, instead test \_WIN32 by
[@&#8203;DimitriPapadopoulos](https://togithub.com/DimitriPapadopoulos)
in
[facebook/zstd#3772
- Fix a very small formatting typo in the lib/README.md file by
[@&#8203;dloidolt](https://togithub.com/dloidolt) in
[facebook/zstd#3763
- Fix pzstd Makefile to allow setting `DESTDIR` and `BINDIR` separately
by [@&#8203;paulmenzel](https://togithub.com/paulmenzel) in
[facebook/zstd#3752
- Remove FlexArray pattern from ZSTDMT by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3786
- solving flexArray issue
[#&#8203;3785](https://togithub.com/facebook/zstd/issues/3785) in fse by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3789
- Add doc on how to use it with cmake FetchContent by
[@&#8203;Saverio976](https://togithub.com/Saverio976) in
[facebook/zstd#3795
- Correct FSE probability bit consumption in specification by
[@&#8203;elasota](https://togithub.com/elasota) in
[facebook/zstd#3806
- Add Bazel module instructions to README.md by
[@&#8203;jondo2010](https://togithub.com/jondo2010) in
[facebook/zstd#3812
- Clarify that a stream containing too many Huffman weights is invalid
by [@&#8203;elasota](https://togithub.com/elasota) in
[facebook/zstd#3813
- \[cmake] Require CMake version 3.5 or newer by
[@&#8203;gruenich](https://togithub.com/gruenich) in
[facebook/zstd#3807
- Three fixes for the Linux kernel by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3822
- \[huf] Improve fast huffman decoding speed in linux kernel by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3826
- \[huf] Improve fast C & ASM performance on small data by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3827
- update xxhash library to v0.8.2 by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3820
- Modernize macros to use `do { } while (0)` by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3831
- Clarify that the presence of weight value 1 is required, and a lone
implied 1 weight is invalid by
[@&#8203;elasota](https://togithub.com/elasota) in
[facebook/zstd#3814
- Move offload API params into ZSTD_CCtx_params by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3839
- Update FreeBSD CI: drop 12.4 (nearly EOL) by
[@&#8203;emaste](https://togithub.com/emaste) in
[facebook/zstd#3845
- Make offload API compatible with static CCtx by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3854
- zlibWrapper: convert to C89 / ANSI C by
[@&#8203;emaste](https://togithub.com/emaste) in
[facebook/zstd#3846
- Fix a nullptr dereference in ZSTD_createCDict_advanced2() by
[@&#8203;michoecho](https://togithub.com/michoecho) in
[facebook/zstd#3847
- Cirrus-CI: Add FreeBSD 14 by
[@&#8203;emaste](https://togithub.com/emaste) in
[facebook/zstd#3855
- CI: meson: use builtin handling for MSVC by
[@&#8203;eli-schwartz](https://togithub.com/eli-schwartz) in
[facebook/zstd#3858
- cli: better errors on argument parsing by
[@&#8203;KapJI](https://togithub.com/KapJI) in
[facebook/zstd#3850
- Clarify that probability tables must not contain non-zero
probabilities for invalid values by
[@&#8203;elasota](https://togithub.com/elasota) in
[facebook/zstd#3817
- \[x-compile] Fix cross-compiling for AArch64 with lld by
[@&#8203;jcelerier](https://togithub.com/jcelerier) in
[facebook/zstd#3760
- playTests.sh does no longer needs grep -E by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3865
- minor: playTests.sh more compatible with older versions of grep by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3877
- disable Intel CET Compatibility tests by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3884
- improve cmake test by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3883
- add sparc64 compilation test by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3886
- add a lorem ipsum generator by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3890
- Update Dependency in Intel CET Test; Re-Enable Test by
[@&#8203;felixhandte](https://togithub.com/felixhandte) in
[facebook/zstd#3893
- Improve compression of Arrays of Integers (High compression mode) by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3895
- \[Zstd] Less verbose log for patch mode. by
[@&#8203;sandreenko](https://togithub.com/sandreenko) in
[facebook/zstd#3899
- fix
[`5921623`](https://togithub.com/facebook/zstd/commit/5921623844651008)
by [@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3900
- Fix fuzz issue
[`5131069`](https://togithub.com/facebook/zstd/commit/5131069967892480)
by [@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3902
- Advertise Availability of Security Vulnerability Notifications by
[@&#8203;felixhandte](https://togithub.com/felixhandte) in
[facebook/zstd#3909
- updated setup-msys2 to v2.22.0 by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3914
- Lorem Ipsum generator update by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3913
- Reduce scope of variables by
[@&#8203;gruenich](https://togithub.com/gruenich) in
[facebook/zstd#3903
- Improve speed of ZSTD_c_targetCBlockSize by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3915
- More regular block sizes with `targetCBlockSize` by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3917
- removed sprintf usage from zstdcli.c by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3916
- Export a `zstd::libzstd` CMake target if only static or dynamic
linkage is specified. by
[@&#8203;teo-tsirpanis](https://togithub.com/teo-tsirpanis) in
[facebook/zstd#3811
- fix version of actions/checkout by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3926
- minor Makefile refactoring by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3753
- lib/decompress: check for reserved bit corruption in zstd by
[@&#8203;aimuz](https://togithub.com/aimuz) in
[facebook/zstd#3840
- Fix state table formatting by
[@&#8203;elasota](https://togithub.com/elasota) in
[facebook/zstd#3816
- Specify offset 0 as invalid and specify required fixup behavior by
[@&#8203;elasota](https://togithub.com/elasota) in
[facebook/zstd#3824
- update -V documentation by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3928
- fix LLU->ULL by [@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3929
- Fix building xxhash on AIX 5.1 by
[@&#8203;likema](https://togithub.com/likema) in
[facebook/zstd#3860
- Fix building on HP-UX 11.11 PA-RISC by
[@&#8203;likema](https://togithub.com/likema) in
[facebook/zstd#3862
- Fix AsyncIO reading seed queueing by
[@&#8203;yoniko](https://togithub.com/yoniko) in
[facebook/zstd#3940
- Use ZSTD_LEGACY_SUPPORT=5 in "make test" by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3943
- Pin sanitizer CI jobs to ubuntu-20.04 by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3945
- chore: fix some typos by
[@&#8203;acceptacross](https://togithub.com/acceptacross) in
[facebook/zstd#3949
- new method to deal with offset==0 erroneous edge case by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3937
- add tests inspired from
[#&#8203;2927](https://togithub.com/facebook/zstd/issues/2927) by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3948
- cmake refactor: move HP-UX specific logic into its own function by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3946
- Fix [#&#8203;3719](https://togithub.com/facebook/zstd/issues/3719) :
mixing -c, -o and --rm by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3942
- minor: fix incorrect debug level by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3936
- add RISC-V emulation tests to Github CI by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3934
- prevent XXH64 from being autovectorized by XXH512 by default by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3933
- Stop Hardcoding the POSIX Version on BSDs by
[@&#8203;felixhandte](https://togithub.com/felixhandte) in
[facebook/zstd#3952
- Convert the CircleCI workflow to a GitHub Actions workflow by
[@&#8203;jk0](https://togithub.com/jk0) in
[facebook/zstd#3901
- Add common compressed file types to --exclude-compressed by
[@&#8203;daniellerozenblit](https://togithub.com/daniellerozenblit) in
[facebook/zstd#3951
- Export ZSTD_LEGACY_SUPPORT in tests/Makefile by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3955
- Exercise ZSTD_findDecompressedSize() in the simple decompression
fuzzer by [@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3959
- Update `ZSTD_RowFindBestMatch` comment by
[@&#8203;yoniko](https://togithub.com/yoniko) in
[facebook/zstd#3947
- Add the zeroSeq sample by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3954
- \[cpu] Backport fix for rbx clobbering on Windows with Clang by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3957
- Do not truncate file name in verbose mode by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3956
- updated documentation by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3958
- \[asm]\[aarch64] Mark that BTI and PAC are supported by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3961
- Use `utimensat()` on FreeBSD by
[@&#8203;felixhandte](https://togithub.com/felixhandte) in
[facebook/zstd#3960
- reduce the amount of #include in cover.h by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3962
- Remove Erroneous Exclusion of Hidden Files and Folders in
`--output-dir-mirror` by
[@&#8203;felixhandte](https://togithub.com/felixhandte) in
[facebook/zstd#3963
- Promote `ZSTD_c_targetCBlockSize` Parameter to Stable API by
[@&#8203;felixhandte](https://togithub.com/felixhandte) in
[facebook/zstd#3964
- \[cmake] Always create libzstd target by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3965
- Remove incorrect docs regarding ZSTD_findFrameCompressedSize() by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3967
- add line number to debug traces by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3966
- bump version number by
[@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3969
- Export zstd's public headers via BUILD_INTERFACE by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3968
- Fix bug with streaming decompression of magicless format by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3971
- pzstd: use c++14 without conditions by
[@&#8203;kanavin](https://togithub.com/kanavin) in
[facebook/zstd#3682
- Fix bugs in simple decompression fuzzer by
[@&#8203;yoniko](https://togithub.com/yoniko) in
[facebook/zstd#3978
- Fuzzing and bugfixes for magicless-format decoding by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3976
- Fix & fuzz ZSTD_generateSequences by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3981
- Fail on errors when building fuzzers by
[@&#8203;yoniko](https://togithub.com/yoniko) in
[facebook/zstd#3979
- \[cmake] Emit warnings for contradictory build settings by
[@&#8203;terrelln](https://togithub.com/terrelln) in
[facebook/zstd#3975
- Document the process for adding a new fuzzer by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3982
- Fix -Werror=pointer-arith in fuzzers by
[@&#8203;embg](https://togithub.com/embg) in
[facebook/zstd#3983
- Doc update by [@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3977
- v1.5.6 by [@&#8203;Cyan4973](https://togithub.com/Cyan4973) in
[facebook/zstd#3984

#### New Contributors

- [@&#8203;Kim-SSi](https://togithub.com/Kim-SSi) made their first
contribution in
[facebook/zstd#3600
- [@&#8203;mredig](https://togithub.com/mredig) made their first
contribution in
[facebook/zstd#3614
- [@&#8203;dunhor](https://togithub.com/dunhor) made their first
contribution in
[facebook/zstd#3636
- [@&#8203;sighingnow](https://togithub.com/sighingnow) made their first
contribution in
[facebook/zstd#3657
- [@&#8203;nidhijaju](https://togithub.com/nidhijaju) made their first
contribution in
[facebook/zstd#3688
- [@&#8203;gjasny](https://togithub.com/gjasny) made their first
contribution in
[facebook/zstd#3665
- [@&#8203;Coder-256](https://togithub.com/Coder-256) made their first
contribution in
[facebook/zstd#3700
- [@&#8203;LocutusOfBorg](https://togithub.com/LocutusOfBorg) made their
first contribution in
[facebook/zstd#3695
- [@&#8203;void0red](https://togithub.com/void0red) made their first
contribution in
[facebook/zstd#3704
- [@&#8203;nikohoffren](https://togithub.com/nikohoffren) made their
first contribution in
[facebook/zstd#3701
- [@&#8203;alexsifivetw](https://togithub.com/alexsifivetw) made their
first contribution in
[facebook/zstd#3712
- [@&#8203;0o001](https://togithub.com/0o001) made their first
contribution in
[facebook/zstd#3728
- [@&#8203;QBos07](https://togithub.com/QBos07) made their first
contribution in
[facebook/zstd#3720
- [@&#8203;JohanMabille](https://togithub.com/JohanMabille) made their
first contribution in
[facebook/zstd#3739
- [@&#8203;klausholstjacobsen](https://togithub.com/klausholstjacobsen)
made their first contribution in
[facebook/zstd#3745
- [@&#8203;Saverio976](https://togithub.com/Saverio976) made their first
contribution in
[facebook/zstd#3795
- [@&#8203;elasota](https://togithub.com/elasota) made their first
contribution in
[facebook/zstd#3806
- [@&#8203;jondo2010](https://togithub.com/jondo2010) made their first
contribution in
[facebook/zstd#3812
- [@&#8203;gruenich](https://togithub.com/gruenich) made their first
contribution in
[facebook/zstd#3807
- [@&#8203;michoecho](https://togithub.com/michoecho) made their first
contribution in
[facebook/zstd#3847
- [@&#8203;KapJI](https://togithub.com/KapJI) made their first
contribution in
[facebook/zstd#3850
- [@&#8203;jcelerier](https://togithub.com/jcelerier) made their first
contribution in
[facebook/zstd#3760
- [@&#8203;sandreenko](https://togithub.com/sandreenko) made their first
contribution in
[facebook/zstd#3899
- [@&#8203;teo-tsirpanis](https://togithub.com/teo-tsirpanis) made their
first contribution in
[facebook/zstd#3811
- [@&#8203;aimuz](https://togithub.com/aimuz) made their first
contribution in
[facebook/zstd#3840
- [@&#8203;acceptacross](https://togithub.com/acceptacross) made their
first contribution in
[facebook/zstd#3949
- [@&#8203;jk0](https://togithub.com/jk0) made their first contribution
in
[facebook/zstd#3901

**Full Changelog**:
facebook/zstd@v1.5.5...v1.5.6

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/secretflow/spu).

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>