Improve encoding/decoding speed #1500
Conversation
... by not going through a `Term` intermediate

This gives a ~28% performance improvement for decoding, which means that cache lookups are now faster. Here are the decoding benchmarks before and after this change:

Before:

```
benchmarked Issue #108/Binary
time                 266.5 μs   (265.7 μs .. 267.4 μs)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 266.3 μs   (265.6 μs .. 267.1 μs)
std dev              2.418 μs   (1.891 μs .. 3.436 μs)

benchmarking Kubernetes/Binary ... took 36.94 s, total 56 iterations
benchmarked Kubernetes/Binary
time                 641.3 ms   (623.0 ms .. 655.4 ms)
                     0.999 R²   (0.997 R² .. 1.000 R²)
mean                 679.7 ms   (665.5 ms .. 702.6 ms)
std dev              29.48 ms   (14.15 ms .. 39.05 ms)
```

After:

```
benchmarked Issue #108/Binary
time                 282.2 μs   (279.6 μs .. 284.7 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 281.9 μs   (280.7 μs .. 287.7 μs)
std dev              7.089 μs   (2.550 μs .. 15.44 μs)
variance introduced by outliers: 11% (moderately inflated)

benchmarking Kubernetes/Binary ... took 27.57 s, total 56 iterations
benchmarked Kubernetes/Binary
time                 499.1 ms   (488.1 ms .. 506.6 ms)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 498.9 ms   (494.4 ms .. 503.9 ms)
std dev              8.539 ms   (6.236 ms .. 12.56 ms)
```

There's a slight performance regression for the decoding microbenchmark, but in practice my testing on real examples matches the performance improvement seen in the larger benchmark, which is based on an example cache product from `dhall-kubernetes`.

Note that this is a breaking change because:

* There is no longer a `FromTerm` nor a `ToTerm` class. Now we use the `Serialise` class, and `{encode,decode}Expression` now work on `ByteString`s instead of `Term`s
* I further narrowed the types of several encoding/decoding utilities to expect `Void` for the first type parameter of `Expr`
* This is a regression with respect to stripping 55799 CBOR tags, mainly because properly handling the tags at every possible point in the syntax tree would considerably complicate the code
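The shape of the change can be sketched with a toy expression type. This is not dhall's actual code (the `Expr` type and decoder below are illustrative only): the point is that cborg's token-level API lets the decoder peek at the next CBOR token and branch directly, so no intermediate `Term` tree is ever allocated.

```haskell
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Codec.CBOR.Decoding
    (Decoder, TokenType (..), decodeBool, decodeString, peekTokenType)
import Codec.CBOR.Encoding (Encoding, encodeBool, encodeString)
import Codec.CBOR.Read (deserialiseFromBytes)
import Codec.CBOR.Write (toLazyByteString)
import Data.Text (Text)

-- Toy stand-in for a syntax tree
data Expr = BoolLit Bool | TextLit Text
    deriving (Eq, Show)

encodeExpr :: Expr -> Encoding
encodeExpr (BoolLit b) = encodeBool b
encodeExpr (TextLit t) = encodeString t

-- Decode straight from CBOR tokens; no generic Term is built first
decodeExpr :: Decoder s Expr
decodeExpr = do
    tokenType <- peekTokenType
    case tokenType of
        TypeBool   -> BoolLit <$> decodeBool
        TypeString -> TextLit <$> decodeString
        _          -> fail "decodeExpr: unexpected token"

main :: IO ()
main = do
    let bytes = toLazyByteString (encodeExpr (TextLit "hello"))
    print (fmap snd (deserialiseFromBytes decodeExpr bytes))
```

The same peek-and-branch structure, scaled up to the full grammar, is what replaces the old `FromTerm`/`ToTerm` round trip.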
The speedup is great, but it seems to me that especially the decoding got more low-level and a bit trickier. Would you mind checking that the acceptance test suite gives us good test coverage there? Otherwise I can do that too!
There's also a build failure in `dhall-lsp-server`:

```
C:\projects\dhall-haskell\dhall-lsp-server\src\Dhall\LSP\Backend\Dhall.hs:168:30: error:
    * Couldn't match type `Src' with `Void'
      Expected type: Expr Void Void
        Actual type: Expr Src Void
    * In the first argument of `Dhall.hashExpressionToCode', namely
        `alphaNormal'
      In the expression: Dhall.hashExpressionToCode alphaNormal
      In an equation for `hashNormalToCode':
          hashNormalToCode (Normal expr)
            = Dhall.hashExpressionToCode alphaNormal
            where
                alphaNormal = Dhall.alphaNormalize expr
    |
168 |     Dhall.hashExpressionToCode alphaNormal
    |                                ^^^^^^^^^^^
Command exited with code 1
```
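This kind of `Src` vs. `Void` mismatch is typically resolved by erasing the source annotations before hashing (dhall exposes `denote` for this). The general pattern can be sketched with a toy annotated AST; the names below are illustrative, not dhall's real definitions:

```haskell
-- Sketch of annotation erasure, analogous to dhall's `denote`.
-- `Note` wraps a subexpression with source information of type `s`;
-- removing every `Note` lets the annotation parameter change freely,
-- e.g. from a source-span type to `Void`.
data Expr s
    = BoolLit Bool
    | And (Expr s) (Expr s)
    | Note s (Expr s)          -- source annotation
    deriving Show

denote :: Expr s -> Expr t
denote (BoolLit b) = BoolLit b
denote (And l r)   = And (denote l) (denote r)
denote (Note _ e)  = denote e

main :: IO ()
main = print (denote example :: Expr ())
  where
    example = Note "1:1" (And (BoolLit True) (Note "1:8" (BoolLit False)))
```

Because `denote` never produces a `Note`, the result can be typed at any annotation parameter, which is exactly what the narrowed `Expr Void Void` signatures require.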
```haskell
        return (BoolLit b)

    TypeString -> do
        s <- Decoding.decodeString
```
Can you add a type annotation that shows that this is a `Text`? `decodeString` is pretty confusing! :/
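For reference, cborg's `decodeString` does return `Text` (`decodeString :: Decoder s Text`), so the suggested annotation could look like this sketch; the surrounding `decodeLabel` decoder is hypothetical, not code from this PR:

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

module Main where

import Codec.CBOR.Decoding (Decoder, decodeString)
import Codec.CBOR.Encoding (encodeString)
import Codec.CBOR.Read (deserialiseFromBytes)
import Codec.CBOR.Write (toLazyByteString)
import Data.Text (Text)

-- Hypothetical fragment: the pattern signature makes it obvious at the
-- use site that `decodeString` produces a `Text`.
decodeLabel :: Decoder s Text
decodeLabel = do
    s :: Text <- decodeString   -- annotation documents the result type
    pure s

main :: IO ()
main =
    print (fmap snd (deserialiseFromBytes decodeLabel
        (toLazyByteString (encodeString "example"))))
```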
There are a lot of `decodeString`s that would need to be changed if we did that.
Right. Never mind then!
@sjakobi: I did a code coverage check and the main things that are not exercised by the test suite are:
Only one of these concerns me: I believe the code is not correctly handling
Thanks for checking! :) 👍 I can take care of adding test cases for the currently uncovered code to the test suite.
... as suggested by @sjakobi
👍
@Gabriel439 had discovered in dhall-lang/dhall-haskell#1500 that we were missing some test coverage here.