Skip to content

Optimize convert_UTF8_to_JSON for mostly ASCII strings #629

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Oct 19, 2024

Conversation

casperisfine
Copy link

If we assume that even UTF-8 strings are mostly ASCII, we can implement a fast path for the ASCII parts.

Before:

== Encoding mixed utf8 (20012001 bytes)
ruby 3.4.0dev (2024-10-18T15:12:54Z master d1b5c10957) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json     5.000 i/100ms
                  oj     9.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     49.403 (± 2.0%) i/s   (20.24 ms/i) -    250.000 in   5.062647s
                  oj    100.120 (± 2.0%) i/s    (9.99 ms/i) -    504.000 in   5.035349s
           rapidjson     26.404 (± 0.0%) i/s   (37.87 ms/i) -    132.000 in   5.001025s

Comparison:
                json:       49.4 i/s
                  oj:      100.1 i/s - 2.03x  faster
           rapidjson:       26.4 i/s - 1.87x  slower

After:

== Encoding mixed utf8 (20012001 bytes)
ruby 3.4.0dev (2024-10-18T15:12:54Z master d1b5c10957) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    10.000 i/100ms
                  oj     9.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     95.686 (± 2.1%) i/s   (10.45 ms/i) -    480.000 in   5.018575s
                  oj     96.875 (± 2.1%) i/s   (10.32 ms/i) -    486.000 in   5.019097s
           rapidjson     26.260 (± 3.8%) i/s   (38.08 ms/i) -    132.000 in   5.033151s

Comparison:
                json:       95.7 i/s
                  oj:       96.9 i/s - same-ish: difference falls within error
           rapidjson:       26.3 i/s - 3.64x  slower

If we assume that even UTF-8 strings are mostly ASCII, we can implement a
fast path for the ASCII parts.

Before:

```
== Encoding mixed utf8 (20012001 bytes)
ruby 3.4.0dev (2024-10-18T15:12:54Z master d1b5c10957) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json     5.000 i/100ms
                  oj     9.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     49.403 (± 2.0%) i/s   (20.24 ms/i) -    250.000 in   5.062647s
                  oj    100.120 (± 2.0%) i/s    (9.99 ms/i) -    504.000 in   5.035349s
           rapidjson     26.404 (± 0.0%) i/s   (37.87 ms/i) -    132.000 in   5.001025s

Comparison:
                json:       49.4 i/s
                  oj:      100.1 i/s - 2.03x  faster
           rapidjson:       26.4 i/s - 1.87x  slower
```

After:

```
== Encoding mixed utf8 (20012001 bytes)
ruby 3.4.0dev (2024-10-18T15:12:54Z master d1b5c10957) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    10.000 i/100ms
                  oj     9.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     95.686 (± 2.1%) i/s   (10.45 ms/i) -    480.000 in   5.018575s
                  oj     96.875 (± 2.1%) i/s   (10.32 ms/i) -    486.000 in   5.019097s
           rapidjson     26.260 (± 3.8%) i/s   (38.08 ms/i) -    132.000 in   5.033151s

Comparison:
                json:       95.7 i/s
                  oj:       96.9 i/s - same-ish: difference falls within error
           rapidjson:       26.3 i/s - 3.64x  slower
```
scratch[2] = hexdig[ch >> 12];
scratch[3] = hexdig[(ch >> 8) & 0xf];
scratch[4] = hexdig[(ch >> 4) & 0xf];
scratch[5] = hexdig[ch & 0xf];
fbuffer_append(out_buffer, scratch, 6);
} else if ((ch & 0xE0) == 0xC0) { /* leading 3 bits are 0b110 */
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes me wonder if we could actually store the character length in the escape table, and directly jump. 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@eregon eregon Oct 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to care is this doesn't guarantee there are 1-5 bytes left in the String if the String is broken.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never enter this path if the string is broken:

convert_UTF8_to_JSON(buffer, obj, state->script_safe);

We rely on the string coderange to guarantee us we don't have to deal with broken UTF-8.

The assumption is that most strings have their coderange known. If we wanted to avoid scanning the coderange when it is unknown we could have an alternative function that does protect against broken strings, but not sure if it's worth it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right a broken String is just an exception on JSON.dump (unlike on JSON.load), nice.

Since we're looking up the table anyway, we might as well store the
UTF-8 char length in it. For single byte characters that don't need
escaping we store `0`.

This helps on strings with lots of multi-byte characters:

Before:

```
== Encoding mostly utf8 (20004001 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json     6.000 i/100ms
                  oj    10.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     67.978 (± 1.5%) i/s   (14.71 ms/i) -    342.000 in   5.033062s
                  oj    100.876 (± 2.0%) i/s    (9.91 ms/i) -    510.000 in   5.058080s
           rapidjson     26.389 (± 7.6%) i/s   (37.89 ms/i) -    132.000 in   5.027681s

Comparison:
                json:       68.0 i/s
                  oj:      100.9 i/s - 1.48x  faster
           rapidjson:       26.4 i/s - 2.58x  slower
```

After:

```
== Encoding mostly utf8 (20004001 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json     7.000 i/100ms
                  oj    10.000 i/100ms
           rapidjson     2.000 i/100ms
Calculating -------------------------------------
                json     75.187 (± 2.7%) i/s   (13.30 ms/i) -    378.000 in   5.030111s
                  oj     95.196 (± 2.1%) i/s   (10.50 ms/i) -    480.000 in   5.043565s
           rapidjson     25.969 (± 3.9%) i/s   (38.51 ms/i) -    130.000 in   5.011471s

Comparison:
                json:       75.2 i/s
                  oj:       95.2 i/s - 1.27x  faster
           rapidjson:       26.0 i/s - 2.90x  slower
```
Profiling revealed that we were spending lots of time growing the buffer.
Buffer operations is definitely something we want to optimize, but for
this specific benchmark what we're interested in is UTF-8 scanning performance.

Each iteration of the two scaning benchmark were producing 20MB of JSON,
now they only produce 5MB.

Now:

```
== Encoding mostly utf8 (5001001 bytes)
ruby 3.4.0dev (2024-10-18T19:01:45Z master 7be9a333ca) +YJIT +PRISM [arm64-darwin23]
Warming up --------------------------------------
                json    35.000 i/100ms
                  oj    36.000 i/100ms
           rapidjson    10.000 i/100ms
Calculating -------------------------------------
                json    359.161 (± 1.4%) i/s    (2.78 ms/i) -      1.820k in   5.068542s
                  oj    359.699 (± 0.6%) i/s    (2.78 ms/i) -      1.800k in   5.004291s
           rapidjson     99.687 (± 2.0%) i/s   (10.03 ms/i) -    500.000 in   5.017321s

Comparison:
                json:      359.2 i/s
                  oj:      359.7 i/s - same-ish: difference falls within error
           rapidjson:       99.7 i/s - 3.60x  slower
```
@byroot byroot merged commit cfd569a into ruby:master Oct 19, 2024
73 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants