Improve UTF-8 parser performance #8

Closed
wants to merge 1 commit into from


jwilm
Collaborator

@jwilm jwilm commented Jul 11, 2017

A benchmark for UTF-8 parsing performance was added. Using this
benchmark, several strategies for parsing were tested versus the
original lookup table implementation:

  • Pure match

    name                          pure_lookup ns/iter  pure_match ns/iter  diff ns/iter   diff %  speedup
    tests::parse_bench_utf8_demo  68,984               52,731                   -16,253  -23.56%   x 1.31
    tests::std_string_parse_utf8  42,472               42,144                      -328   -0.77%   x 1.01

  • Match with packed lookup in Ground state

    name                          pure_lookup ns/iter  match_and_lookup ns/iter  diff ns/iter   diff %  speedup
    tests::parse_bench_utf8_demo  68,984               68,922                             -62   -0.09%   x 1.00
    tests::std_string_parse_utf8  42,472               36,788                          -5,684  -13.38%   x 1.15

  • Match with unpacked lookup in Ground state

    name                          pure_lookup ns/iter  match_and_lookup_unpacked ns/iter  diff ns/iter  diff %  speedup
    tests::parse_bench_utf8_demo  68,984               63,727                                   -5,257  -7.62%   x 1.08
    tests::std_string_parse_utf8  42,472               42,787                                      315   0.74%   x 0.99
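
For readers checking the tables, the derived columns appear to follow from the two timing columns by simple arithmetic. The sketch below is an assumption about how they relate, not the tool that produced them; `compare` is a hypothetical helper name.

```rust
/// Derives the `diff ns/iter`, `diff %`, and `speedup` columns from two
/// timings in nanoseconds per iteration. Hypothetical helper for illustration.
fn compare(old_ns: f64, new_ns: f64) -> (f64, f64, f64) {
    // Negative values mean the new implementation is faster.
    let diff = new_ns - old_ns;
    // Change relative to the original (pure_lookup) timing, in percent.
    let diff_pct = diff / old_ns * 100.0;
    // Values above 1.0 mean the new implementation is faster.
    let speedup = old_ns / new_ns;
    (diff, diff_pct, speedup)
}
```

Calling `compare(68_984.0, 52_731.0)` gives values that round to -16,253 ns/iter, -23.56%, and x 1.31, which matches the first row of the pure match table.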

Of these implementations, the pure match performed best in
microbenchmarks, and that is the implementation retained in this commit.
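
To make the two ends of the comparison concrete, here is a minimal sketch of classifying the first byte of a UTF-8 sequence with a `match` on byte ranges versus a 256-entry lookup table. It is not the code from this commit: the `Class` enum and the function names are hypothetical, and the real parser tracks more states than this.

```rust
/// What the first byte of a sequence tells the parser to expect.
/// Illustrative only; not the state set used in this commit.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Class {
    Ascii,   // 0x00..=0x7F: a complete code point
    Lead2,   // 0xC2..=0xDF: one continuation byte follows
    Lead3,   // 0xE0..=0xEF: two continuation bytes follow
    Lead4,   // 0xF0..=0xF4: three continuation bytes follow
    Invalid, // continuation byte, or a byte that can never start a sequence
}

/// "Pure match" style: the compiler chooses how to lower the range checks.
const fn classify_match(byte: u8) -> Class {
    match byte {
        0x00..=0x7F => Class::Ascii,
        0xC2..=0xDF => Class::Lead2,
        0xE0..=0xEF => Class::Lead3,
        0xF0..=0xF4 => Class::Lead4,
        _ => Class::Invalid,
    }
}

/// Builds the table at compile time from the match above, so the two
/// strategies cannot disagree.
const fn build_table() -> [Class; 256] {
    let mut table = [Class::Invalid; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = classify_match(i as u8);
        i += 1;
    }
    table
}

static TABLE: [Class; 256] = build_table();

/// "Lookup table" style: a single indexed load per byte.
fn classify_lookup(byte: u8) -> Class {
    TABLE[byte as usize]
}
```

Both functions return the same class for every byte, so any timing difference comes only from how the decision is encoded. Which one is faster depends on the input mix and the target CPU, which is why the numbers above were measured rather than assumed.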

@jwilm jwilm mentioned this pull request Jul 11, 2017
@chrisduerr chrisduerr self-assigned this Nov 23, 2019
@chrisduerr chrisduerr self-requested a review December 8, 2019 15:14
@chrisduerr
Member

PR #35 should implement this, along with some additional changes.

@chrisduerr chrisduerr closed this Dec 8, 2019
@chrisduerr chrisduerr deleted the improve-utf8-parse branch December 8, 2019 15:15
@chrisduerr chrisduerr removed their request for review December 8, 2019 15:15
@chrisduerr chrisduerr removed their assignment Dec 8, 2019