Optimize binary decoding of builtins #1810

Merged May 25, 2020 (2 commits)

@sjakobi sjakobi commented May 24, 2020

  • Use decodeUtf8ByteArray to avoid UTF16-encoding the scrutinee.

  • Optimize the pattern matching by grouping the patterns by length.

    GHC currently doesn't produce static length information for string
    literals. Consequently the pattern matching worked somewhat like this:

    s <- decodeString
    
    let len_s = length s
    
    if len_s == length "Natural/build" && sameBytes s "Natural/build"
        then return NaturalBuild
        else if len_s == length "Natural/fold" && sameBytes s "Natural/fold"
                 ...
    

    Decoding Sort, the most extreme case, would involve a total of 32
    conditional jumps as a consequence of length comparisons alone.

    Judging by the Core, grouping the patterns by length gets that number
    down to 8: one jump to check the length of the decoded string, and
    (unfortunately) still one for each of the 7 candidate literals of
    length 4.

    The number of string content comparisons should be unchanged.
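As a rough illustration of the grouping idea, here is a minimal sketch, not the code from this PR. The `Builtin` type, its constructor names, and `decodeBuiltin` are hypothetical and cover only a handful of names; the sketch assumes the decoded string is already available as a `ShortByteString`.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString.Short (ShortByteString)
import qualified Data.ByteString.Short as SBS

-- Hypothetical, cut-down builtin type for illustration only;
-- a real decoder would cover many more names.
data Builtin
    = NaturalBuild
    | NaturalFold
    | BuiltinList
    | BuiltinSort
    | BuiltinText
    deriving (Eq, Show)

-- Dispatch on the length of the scrutinee first, then compare only
-- against the literals of that length. Each (==) still goes through
-- ShortByteString's Eq instance.
decodeBuiltin :: ShortByteString -> Maybe Builtin
decodeBuiltin s = case SBS.length s of
    4   | s == "List" -> Just BuiltinList
        | s == "Sort" -> Just BuiltinSort
        | s == "Text" -> Just BuiltinText
    12  | s == "Natural/fold"  -> Just NaturalFold
    13  | s == "Natural/build" -> Just NaturalBuild
    -- Unknown lengths, and known lengths where no guard matched, end up here.
    _   -> Nothing
```

With `OverloadedStrings` enabled, `decodeBuiltin "Sort"` evaluates to `Just BuiltinSort`. An unknown four-letter name such as `"Nope"` enters the length-4 group, fails all three guards, and falls through to `Nothing`.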

The result of these optimizations is that the time to decode the cache for cpkg
is reduced by 7-9%. Decoding time for the Prelude goes down by 13-16%.

This also changes the encoding of builtins to use encodeUtf8ByteArray, which
avoids UTF16-encoding and decoding the builtin strings. I haven't measured
the performance impact of that change, though.

Context: #1804.

sjakobi commented May 24, 2020

> Judging by the Core, we can get that number down to 8 by grouping
> the patterns by length: One to check the length of the decoded string,
> and (unfortunately) still one each for the 7 candidate literals of
> length 4.

These completely unnecessary length checks on the literals come from ShortByteString's Eq instance. To avoid them, we could try using the compareByteArrays# primop instead. The primitive package has a nice backward-compatible wrapper for it, but unfortunately it isn't currently exported: haskell/primitive#131

@Gabriella439 Gabriella439 left a comment

Very nice work! 🙂

@mergify mergify bot merged commit 93313dc into master May 25, 2020
@mergify mergify bot deleted the sjakobi/builtins-binary branch May 25, 2020 00:44