Hi, the following patchset replaces some computations with lookup tables and improves performance a bit more. It uses GHC primitives directly, but the portability line already says GHC, so hopefully this is not a problem. If it is a problem, it might be possible to use bytestring with OverloadedStrings and unsafe indexing to get almost the same performance increase (though that approach was consistently slower on some other benchmarks of mine).
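To illustrate the general technique (not the patch itself), here is a minimal sketch of a lookup table stored as an unboxed string literal and read with a GHC primitive. It assumes a hex (base16) encoder purely for the example; the names `hexDigit`, `encodeByte`, and `encodeHex` are hypothetical and do not come from the patchset.

```haskell
{-# LANGUAGE MagicHash #-}

import Data.Bits (shiftR, (.&.))
import Data.Char (chr)
import GHC.Exts (Int (I#), indexWord8OffAddr#)
import GHC.Word (Word8 (W8#))

-- Hypothetical helper: map a nibble (0..15) to its ASCII hex digit by
-- reading a static table instead of branching on (n < 10).
-- The literal ends in '#', so it is a raw Addr# with no allocation.
-- There is no bounds check: the caller must pass an index in 0..15.
hexDigit :: Int -> Word8
hexDigit (I# i) = W8# (indexWord8OffAddr# "0123456789abcdef"# i)

-- Split a byte into its high and low nibble and look both up.
-- Both indices are guaranteed to be in 0..15 by construction.
encodeByte :: Word8 -> (Word8, Word8)
encodeByte w =
  ( hexDigit (fromIntegral (w `shiftR` 4))
  , hexDigit (fromIntegral (w .&. 0x0f))
  )

-- List-based driver for demonstration only; a real encoder would write
-- directly into a ByteString buffer.
encodeHex :: [Word8] -> String
encodeHex = concatMap toChars
  where
    toChars w =
      let (hi, lo) = encodeByte w
      in [chr (fromIntegral hi), chr (fromIntegral lo)]
```

For example, `encodeHex [0x00, 0xff, 0x1a]` evaluates to `"00ff1a"`. The table lookup trades a comparison and an addition per nibble for a single memory read from a 16-byte constant.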
The benchmarks below show mean/lb/ub before and after the optimization on my machine; for each benchmark, the first line is before and the second is after.
benchmarking encode/8
mean: 100.5066 ns, lb 100.4542 ns, ub 100.6346 ns, ci 0.950
mean: 89.37258 ns, lb 89.34455 ns, ub 89.42334 ns, ci 0.950
benchmarking encode/32
mean: 292.5857 ns, lb 292.5121 ns, ub 292.7101 ns, ci 0.950
mean: 254.6374 ns, lb 249.8842 ns, ub 261.5605 ns, ci 0.950
benchmarking encode/128
mean: 1.038159 us, lb 1.037985 us, ub 1.038537 us, ci 0.950
mean: 850.7781 ns, lb 850.6118 ns, ub 851.1164 ns, ci 0.950
benchmarking encode/1024
mean: 7.768683 us, lb 7.766970 us, ub 7.772629 us, ci 0.950
mean: 6.838571 us, lb 6.693303 us, ub 7.040833 us, ci 0.950
benchmarking encode/65536
mean: 511.2344 us, lb 510.7413 us, ub 511.9317 us, ci 0.950
mean: 426.9267 us, lb 426.2879 us, ub 427.5551 us, ci 0.950
benchmarking decode/8
mean: 491.4554 ns, lb 491.3104 ns, ub 491.7289 ns, ci 0.950
mean: 420.7300 ns, lb 419.1382 ns, ub 427.0422 ns, ci 0.950
benchmarking decode/32
mean: 1.146035 us, lb 1.145754 us, ub 1.146530 us, ci 0.950
mean: 815.8817 ns, lb 815.1795 ns, ub 817.5643 ns, ci 0.950
benchmarking decode/128
mean: 3.738534 us, lb 3.737629 us, ub 3.740522 us, ci 0.950
mean: 2.379186 us, lb 2.376243 us, ub 2.382636 us, ci 0.950
benchmarking decode/1024
mean: 29.88323 us, lb 29.87664 us, ub 29.89754 us, ci 0.950
mean: 16.65282 us, lb 16.65004 us, ub 16.65840 us, ci 0.950
benchmarking decode/65536
mean: 1.915516 ms, lb 1.914163 ms, ub 1.919997 ms, ci 0.950
mean: 1.056637 ms, lb 1.055106 ms, ub 1.057994 ms, ci 0.950
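To compare the before and after means at a glance, the ratios can be computed with a short script. This is only a convenience sketch: the means are copied from the output above and converted to nanoseconds, and the names `means`, `speedup`, and `report` are made up for the example.

```haskell
import Text.Printf (printf)

-- (benchmark, mean before in ns, mean after in ns), taken from the
-- criterion output above with us and ms values converted to ns.
means :: [(String, Double, Double)]
means =
  [ ("encode/8",     100.5066,  89.37258)
  , ("encode/32",    292.5857,  254.6374)
  , ("encode/128",   1038.159,  850.7781)
  , ("encode/1024",  7768.683,  6838.571)
  , ("encode/65536", 511234.4,  426926.7)
  , ("decode/8",     491.4554,  420.7300)
  , ("decode/32",    1146.035,  815.8817)
  , ("decode/128",   3738.534,  2379.186)
  , ("decode/1024",  29883.23,  16652.82)
  , ("decode/65536", 1915516,   1056637)
  ]

-- Speedup factor: values above 1 mean the optimized version is faster.
speedup :: Double -> Double -> Double
speedup before after = before / after

-- Print one line per benchmark with its speedup factor.
report :: IO ()
report = mapM_ line means
  where
    line (name, before, after) =
      printf "%-13s %.2fx\n" name (speedup before after)
```

Running `report` in GHCi prints one speedup factor per benchmark.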