Use memcpy() of standard C library #742

Merged: ohler55 merged 1 commit into ohler55:develop from Watson1978:memcpy on Jan 15, 2022
@Watson1978 (Collaborator)

Ruby's headers define their own `ruby_nonempty_memcpy()` and redefine `memcpy` to it, so every `memcpy()` call in code that includes those headers goes through this wrapper:

```c
static inline void *
ruby_nonempty_memcpy(void *dest, const void *src, size_t n)
{
    if (n) {
        return memcpy(dest, src, n);
    }
    else {
        return dest;
    }
}
RBIMPL_SYMBOL_EXPORT_END()
#undef memcpy
#define memcpy ruby_nonempty_memcpy
```
https://github.com/ruby/ruby/blob/master/include/ruby/internal/memory.h

The wrapper adds an `if` check to every call, which is unnecessary here and has some overhead.
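
As a rough illustration of the point above, here is a minimal sketch of how a C file can stop routing `memcpy()` through such a wrapper by undefining the macro. The names `nonempty_memcpy_standin`, `copy_via_wrapper`, `copy_direct`, and `check_copies_match` are hypothetical and exist only in this sketch. The stand-in replaces the real Ruby header so that the example compiles on its own, and this is not necessarily how the PR itself makes the change.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the wrapper that Ruby's memory.h installs.
 * It is defined before the macro below, so the memcpy() inside it
 * refers to the C library's memcpy. */
static inline void *
nonempty_memcpy_standin(void *dest, const void *src, size_t n)
{
    if (n) {
        return memcpy(dest, src, n);
    }
    else {
        return dest;
    }
}
#undef memcpy
#define memcpy nonempty_memcpy_standin

/* With the macro active, this call goes through the size check. */
static void copy_via_wrapper(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Remove the macro so that later calls reach the C library directly. */
#undef memcpy

static void copy_direct(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Both paths should produce identical bytes for a non-empty copy. */
static void check_copies_match(void)
{
    const char src[] = "oj";
    char a[sizeof src] = {0};
    char b[sizeof src] = {0};

    copy_via_wrapper(a, src, sizeof src);
    copy_direct(b, src, sizeof src);
    assert(memcmp(a, b, sizeof src) == 0);
}
```

The wrapper presumably exists to make zero-length copies safe, so the direct path assumes that callers never pass an invalid pointer together with `n == 0`.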

Similar: ohler55#735

Benchmark        | before (i/s) | after (i/s) | result
--               | --           | --          | --
Oj.dump (macOS)  | 7.839k       | 8.000k      | 1.021x
Oj.dump (Linux)  | 9.140k       | 10.043k     | 1.099x
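
The `result` column appears to be the "after" throughput divided by the "before" throughput. A small sketch of that arithmetic follows, using the iterations-per-second figures from the table. The helper names `speedup` and `check_table_ratios` are hypothetical.

```c
#include <assert.h>

/* Ratio of "after" to "before" throughput; above 1.0 means faster. */
static double speedup(double before_ips, double after_ips)
{
    return after_ips / before_ips;
}

/* Checks the computed ratios against the rounded values in the table. */
static void check_table_ratios(void)
{
    double mac_ratio = speedup(7839.0, 8000.0);     /* macOS row */
    double linux_ratio = speedup(9140.0, 10043.0);  /* Linux row */

    assert(mac_ratio > 1.0205 && mac_ratio < 1.0215);     /* rounds to 1.021 */
    assert(linux_ratio > 1.0985 && linux_ratio < 1.0995); /* rounds to 1.099 */
}
```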

### Environment
- macOS
  - macOS 12.1
  - Apple M1 Max
  - Apple clang version 13.0.0 (clang-1300.0.29.30)
  - Ruby 3.1.0
- Linux
  - Zorin OS 16
  - AMD Ryzen 7 5700G
  - gcc version 11.1.0
  - Ruby 3.1.0

### macOS
#### Before
```
Warming up --------------------------------------
             Oj.dump   783.000  i/100ms
Calculating -------------------------------------
             Oj.dump      7.839k (± 0.9%) i/s -    118.233k in  15.084888s
```

#### After
```
Warming up --------------------------------------
             Oj.dump   803.000  i/100ms
Calculating -------------------------------------
             Oj.dump      8.000k (± 0.8%) i/s -    120.450k in  15.057950s
```

### Linux
#### Before
```
Warming up --------------------------------------
             Oj.dump   887.000  i/100ms
Calculating -------------------------------------
             Oj.dump      9.140k (± 0.9%) i/s -    137.485k in  15.043569s
```

#### After
```
Warming up --------------------------------------
             Oj.dump   972.000  i/100ms
Calculating -------------------------------------
             Oj.dump     10.043k (± 0.9%) i/s -    150.660k in  15.002236s
```

### Test code
```ruby
require 'benchmark/ips'
require 'oj'

data = (0..10000).to_a

Benchmark.ips do |x|
  x.time = 15

  x.report('Oj.dump') { Oj.dump(data, mode: :compat) }
end
```

@Watson1978 (Collaborator, Author)

Here is the stack trace from before the change.

[Image: Screenshot 2022-01-16 0 18 06]

@ohler55 ohler55 merged commit 7ed0b3b into ohler55:develop Jan 15, 2022
@Watson1978 Watson1978 deleted the memcpy branch January 15, 2022 15:24