casperisfine
Rather than copying into a buffer to unescape and then copying that buffer into the final string, we can unescape directly into the final string.

The downside is that if the string contains a lot of escape sequences, the returned string ends up with more capacity than strictly necessary, but it's probably fine.
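
To make the idea concrete, here is a rough Ruby sketch of it. It is not the code from this PR (the real parser is a C extension); `unescape_into_final` and `SIMPLE_ESCAPES` are hypothetical names, the input is assumed to be the well-formed contents of a JSON string without the surrounding quotes, and surrogate pairs are deliberately not handled.

```ruby
# Hypothetical illustration only; the actual change lives in the C parser.
SIMPLE_ESCAPES = {
  '"' => '"', "\\" => "\\", "/" => "/",
  "b" => "\b", "f" => "\f", "n" => "\n", "r" => "\r", "t" => "\t",
}.freeze

def unescape_into_final(raw)
  # The unescaped form is never longer than the escaped form, so the raw
  # byte size is a safe upper bound: allocate the final string once.
  out = String.new(capacity: raw.bytesize, encoding: Encoding::UTF_8)
  pos = 0
  while (backslash = raw.index("\\", pos))
    # Copy the literal run before the escape straight into the result.
    out << raw[pos...backslash]
    code = raw[backslash + 1]
    if code == "u"
      # \uXXXX: append the codepoint (assumes it is not a surrogate half).
      out << raw[backslash + 2, 4].hex
      pos = backslash + 6
    else
      out << SIMPLE_ESCAPES.fetch(code)
      pos = backslash + 2
    end
  end
  # Copy whatever follows the last escape.
  out << raw[pos..]
  out
end
```

Calling `unescape_into_final('line1\nline2')` builds the result in a single allocation. The trade-off mentioned above is visible here too: the capacity is sized for the escaped input, so a heavily escaped string leaves some of it unused.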

Before:

```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    56.000 i/100ms
                  oj    58.000 i/100ms
           oj strict    74.000 i/100ms
          Oj::Parser    76.000 i/100ms
           rapidjson    52.000 i/100ms
Calculating -------------------------------------
                json    556.659 (± 2.9%) i/s    (1.80 ms/i) -      2.800k in   5.034719s
                  oj    604.077 (± 3.8%) i/s    (1.66 ms/i) -      3.016k in   5.001546s
           oj strict    706.942 (± 3.5%) i/s    (1.41 ms/i) -      3.552k in   5.030954s
          Oj::Parser    752.917 (± 3.2%) i/s    (1.33 ms/i) -      3.800k in   5.052707s
           rapidjson    546.470 (± 3.5%) i/s    (1.83 ms/i) -      2.756k in   5.049855s

Comparison:
                json:      556.7 i/s
          Oj::Parser:      752.9 i/s - 1.35x  faster
           oj strict:      706.9 i/s - 1.27x  faster
                  oj:      604.1 i/s - 1.09x  faster
           rapidjson:      546.5 i/s - same-ish: difference falls within error

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    29.000 i/100ms
                  oj    32.000 i/100ms
           oj strict    38.000 i/100ms
          Oj::Parser    42.000 i/100ms
           rapidjson    38.000 i/100ms
Calculating -------------------------------------
                json    317.858 (± 3.1%) i/s    (3.15 ms/i) -      1.595k in   5.023245s
                  oj    348.168 (± 2.6%) i/s    (2.87 ms/i) -      1.760k in   5.058431s
           oj strict    394.599 (± 2.8%) i/s    (2.53 ms/i) -      1.976k in   5.012073s
          Oj::Parser    403.771 (± 3.0%) i/s    (2.48 ms/i) -      2.058k in   5.101578s
           rapidjson    383.441 (± 3.7%) i/s    (2.61 ms/i) -      1.938k in   5.061355s

Comparison:
                json:      317.9 i/s
          Oj::Parser:      403.8 i/s - 1.27x  faster
           oj strict:      394.6 i/s - 1.24x  faster
           rapidjson:      383.4 i/s - 1.21x  faster
                  oj:      348.2 i/s - 1.10x  faster
```

After:

```
== Parsing twitter.json (567916 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    56.000 i/100ms
                  oj    62.000 i/100ms
           oj strict    73.000 i/100ms
          Oj::Parser    76.000 i/100ms
           rapidjson    54.000 i/100ms
Calculating -------------------------------------
                json    561.009 (± 7.5%) i/s    (1.78 ms/i) -      2.800k in   5.039548s
                  oj    601.124 (± 4.3%) i/s    (1.66 ms/i) -      3.038k in   5.064686s
           oj strict    707.455 (± 3.4%) i/s    (1.41 ms/i) -      3.577k in   5.062540s
          Oj::Parser    751.799 (± 3.1%) i/s    (1.33 ms/i) -      3.800k in   5.059509s
           rapidjson    535.641 (± 3.2%) i/s    (1.87 ms/i) -      2.700k in   5.045816s

Comparison:
                json:      561.0 i/s
          Oj::Parser:      751.8 i/s - 1.34x  faster
           oj strict:      707.5 i/s - 1.26x  faster
                  oj:      601.1 i/s - same-ish: difference falls within error
           rapidjson:      535.6 i/s - same-ish: difference falls within error

== Parsing citm_catalog.json (1727030 bytes)
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                json    30.000 i/100ms
                  oj    32.000 i/100ms
           oj strict    36.000 i/100ms
          Oj::Parser    42.000 i/100ms
           rapidjson    39.000 i/100ms
Calculating -------------------------------------
                json    313.248 (± 7.3%) i/s    (3.19 ms/i) -      1.560k in   5.014118s
                  oj    341.977 (± 4.1%) i/s    (2.92 ms/i) -      1.728k in   5.063332s
           oj strict    387.062 (± 6.2%) i/s    (2.58 ms/i) -      1.944k in   5.045961s
          Oj::Parser    400.423 (± 4.0%) i/s    (2.50 ms/i) -      2.016k in   5.044513s
           rapidjson    379.046 (± 6.1%) i/s    (2.64 ms/i) -      1.911k in   5.064461s

Comparison:
                json:      313.2 i/s
          Oj::Parser:      400.4 i/s - 1.28x  faster
           oj strict:      387.1 i/s - 1.24x  faster
           rapidjson:      379.0 i/s - 1.21x  faster
                  oj:      342.0 i/s - same-ish: difference falls within error
```

@byroot byroot merged commit dc19b2f into ruby:master Oct 31, 2024
36 checks passed