Skip to content

Allocate the FBuffer struct on the stack #657

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 29, 2024

Conversation

casperisfine
Copy link

Ref: #655

The actual buffer is still on the heap, but this saves a pair of malloc/free.

This helps a lot on micro-benchmarks

Before:

ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   531.598k i/100ms
          JSON reuse   417.666k i/100ms
Calculating -------------------------------------
                  Oj      5.735M (± 1.3%) i/s  (174.35 ns/i) -     28.706M in   5.005900s
          JSON reuse      4.604M (± 1.4%) i/s  (217.18 ns/i) -     23.389M in   5.080779s

Comparison:
                  Oj:  5735475.6 i/s
          JSON reuse:  4604380.3 i/s - 1.25x  slower

After:

ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower

Bench:

require 'benchmark/ips'
require 'oj'
require 'json'

json_encoder = JSON::State.new(JSON.dump_default_options)
test_data = [1, "string", { a: 1, b: 2 }, [3, 4, 5]]

Oj.default_options = Oj.default_options.merge(mode: :compat)

Benchmark.ips do |x|
  x.config(time: 5, warmup: 2)

  x.report("Oj") do
    Oj.dump(test_data)
  end

  x.report("JSON reuse") do
    json_encoder.generate(test_data)
  end

  x.compare!(order: :baseline)
end

Ref: ruby#655

The actual buffer is still on the heap, but this saves a pair
of malloc/free.

This helps a lot on micro-benchmarks

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   531.598k i/100ms
          JSON reuse   417.666k i/100ms
Calculating -------------------------------------
                  Oj      5.735M (± 1.3%) i/s  (174.35 ns/i) -     28.706M in   5.005900s
          JSON reuse      4.604M (± 1.4%) i/s  (217.18 ns/i) -     23.389M in   5.080779s

Comparison:
                  Oj:  5735475.6 i/s
          JSON reuse:  4604380.3 i/s - 1.25x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

Bench:

```ruby
require 'benchmark/ips'
require 'oj'
require 'json'

json_encoder = JSON::State.new(JSON.dump_default_options)
test_data = [1, "string", { a: 1, b: 2 }, [3, 4, 5]]

Oj.default_options = Oj.default_options.merge(mode: :compat)

Benchmark.ips do |x|
  x.config(time: 5, warmup: 2)

  x.report("Oj") do
    Oj.dump(test_data)
  end

  x.report("JSON reuse") do
    json_encoder.generate(test_data)
  end

  x.compare!(order: :baseline)
end
```
@byroot byroot merged commit 48e22da into ruby:master Oct 29, 2024
35 checks passed
@casperisfine casperisfine deleted the fbuffer-stack branch October 29, 2024 09:35
casperisfine pushed a commit to casperisfine/json that referenced this pull request Oct 29, 2024
Ref: ruby#655
Followup: ruby#657

Assuming the generator might be used for fairly small documents
we can start with a reasonable buffer size of the stack, and if
we outgrow it, we can spill on the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use case for fiarly small JSON document in actual real world
scenarios, so trashing the GC less in such case make sense.

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```
byroot added a commit to casperisfine/json that referenced this pull request Oct 29, 2024
Ref: ruby#655
Followup: ruby#657

Assuming the generator might be used for fairly small documents
we can start with a reasonable buffer size of the stack, and if
we outgrow it, we can spill on the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use case for fiarly small JSON document in actual real world
scenarios, so trashing the GC less in such case make sense.

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```
casperisfine pushed a commit to casperisfine/json that referenced this pull request Oct 29, 2024
Ref: ruby#655
Followup: ruby#657

Assuming the generator might be used for fairly small documents
we can start with a reasonable buffer size of the stack, and if
we outgrow it, we can spill on the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use case for fiarly small JSON document in actual real world
scenarios, so trashing the GC less in such case make sense.

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```
casperisfine pushed a commit to casperisfine/json that referenced this pull request Oct 29, 2024
Ref: ruby#655
Followup: ruby#657

Assuming the generator might be used for fairly small documents
we can start with a reasonable buffer size of the stack, and if
we outgrow it, we can spill on the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use case for fiarly small JSON document in actual real world
scenarios, so trashing the GC less in such case make sense.

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```
casperisfine pushed a commit to casperisfine/json that referenced this pull request Oct 30, 2024
Ref: ruby#655
Followup: ruby#657

Assuming the generator might be used for fairly small documents
we can start with a reasonable buffer size of the stack, and if
we outgrow it, we can spill on the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use case for fiarly small JSON document in actual real world
scenarios, so trashing the GC less in such case make sense.

Before:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```
hsbt pushed a commit to hsbt/ruby that referenced this pull request Nov 1, 2024
Ref: ruby/json#655
Followup: ruby/json#657

Assuming the generator might be used for fairly small documents
we can start with a reasonable buffer size of the stack, and if
we outgrow it, we can spill on the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use case for fiarly small JSON document in actual real world
scenarios, so trashing the GC less in such case make sense.

Before:

```
ruby 3.3.4 (2024-07-09 revision ruby/json@be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision ruby/json@be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```

ruby/json@fe607f4806
hsbt pushed a commit to ruby/ruby that referenced this pull request Nov 1, 2024
Ref: ruby/json#655
Followup: ruby/json#657

Assuming the generator might be used for fairly small documents
we can start with a reasonable buffer size of the stack, and if
we outgrow it, we can spill on the heap.

In a way this is optimizing for micro-benchmarks, but there are
valid use case for fiarly small JSON document in actual real world
scenarios, so trashing the GC less in such case make sense.

Before:

```
ruby 3.3.4 (2024-07-09 revision ruby/json@be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   518.700k i/100ms
          JSON reuse   483.370k i/100ms
Calculating -------------------------------------
                  Oj      5.722M (± 1.8%) i/s  (174.76 ns/i) -     29.047M in   5.077823s
          JSON reuse      5.278M (± 1.5%) i/s  (189.46 ns/i) -     26.585M in   5.038172s

Comparison:
                  Oj:  5722283.8 i/s
          JSON reuse:  5278061.7 i/s - 1.08x  slower
```

After:

```
ruby 3.3.4 (2024-07-09 revision ruby/json@be1089c8ec) +YJIT [arm64-darwin23]
Warming up --------------------------------------
                  Oj   517.837k i/100ms
          JSON reuse   548.871k i/100ms
Calculating -------------------------------------
                  Oj      5.693M (± 1.6%) i/s  (175.65 ns/i) -     28.481M in   5.004056s
          JSON reuse      5.855M (± 1.2%) i/s  (170.80 ns/i) -     29.639M in   5.063004s

Comparison:
                  Oj:  5692985.6 i/s
          JSON reuse:  5854857.9 i/s - 1.03x  faster
```

ruby/json@fe607f4806
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants