Replace other `iter().position()` instances with `memchr` #302

nagisa · 2021-05-08T22:10:02Z

This might provide similar improvements to those seen in #301, depending on how hot the relevant code is. This makes `memchr` a non-optional dependency, due to its use in the `read_string` method of `pod::Bytes`. If this is a problem, it is still possible to make it an optional dependency used by the coff and macho code.

philipc

This is great, thanks. If the dependency is ever a problem we can make it optional with a naive implementation when disabled.

Let's merge this instead of #301. Can you delete the unneeded fixme?
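For illustration, the optional-dependency fallback mentioned above could look roughly like the sketch below. It is not code from this PR: the Cargo feature name `memchr` and the helpers `find_byte` and `read_nul_terminated` are assumptions made up for the example. Only `memchr::memchr(needle, haystack)` is the crate's real entry point.

```rust
/// Index of the first `needle` byte in `haystack`, if any.
/// Assumes a Cargo feature named `memchr` gates the optional dependency.
#[cfg(feature = "memchr")]
fn find_byte(needle: u8, haystack: &[u8]) -> Option<usize> {
    memchr::memchr(needle, haystack)
}

/// Naive fallback used when the (assumed) `memchr` feature is disabled.
/// This is the `iter().position()` pattern that the PR replaces.
#[cfg(not(feature = "memchr"))]
fn find_byte(needle: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

/// Hypothetical helper: splits `data` at the first NUL byte, returning the
/// bytes before it and the bytes after it, or `None` if there is no NUL.
fn read_nul_terminated(data: &[u8]) -> Option<(&[u8], &[u8])> {
    let nul = find_byte(0, data)?;
    Some((&data[..nul], &data[nul + 1..]))
}
```

Callers only ever see `find_byte`, so enabling or disabling the feature changes speed but not behavior.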

src/read/archive.rs

philipc · 2021-05-10T09:09:43Z

I experimented with using reads instead of mmap (https://github.com/philipc/object/tree/pr302) and it seems promising:

mmap:
          5,324.10 msec task-clock                #    0.997 CPUs utilized          
    18,797,166,250      cycles                    #    3.531 GHz                    
    28,849,521,693      instructions              #    1.53  insn per cycle         

read:
          2,655.91 msec task-clock                #    0.983 CPUs utilized          
     9,361,529,424      cycles                    #    3.525 GHz                    
     7,898,562,868      instructions              #    0.84  insn per cycle

Note that I changed the loop to cover the file open too (just the inner loop is slower, but that's expected to be better for mmap since it's already in memory after the first loop). It's still not ideal because it does a hash lookup and allocation for every read, but fixing that is harder and something I'll think about. Also not sure how well it translates to real workloads.

This makes parsing of the archive headers significantly faster. The `ar` example adjusted to parse the same archive 1 million times, when run with the rlib of the `object` crate itself, produces the following metrics:

            788.19 msec task-clock:u              #    0.998 CPUs utilized
     2,502,967,113      cycles:u                  #    3.176 GHz
     7,780,571,392      instructions:u            #    3.11  insn per cycle

In contrast to the following for the old code:

          1,061.09 msec task-clock:u              #    0.998 CPUs utilized
     3,374,141,510      cycles:u                  #    3.180 GHz
    12,012,570,139      instructions:u            #    3.56  insn per cycle

This results in a reduction of about 1B cycles, or a 25% reduction in wall clock time. Originally `perf` would show a heavy hotspot (in the area of 50% of the total runtime) in `parse_sysv_extended_name`.

Here, instead of figuring out the extents of the integer ahead of time, we check for the spaces while we compute the number itself. This further reduces the runtime of the aforementioned case (see previous commit) to:

            580.57 msec task-clock:u              #    0.997 CPUs utilized
     1,843,957,595      cycles:u                  #    3.176 GHz
     5,901,570,558      instructions:u            #    3.20  insn per cycle

`perf report` still shows that most of the time is spent parsing sysv archive names (which makes sense, since it's pretty much all the program does after all!).
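A minimal sketch of the single-pass idea, assuming a space-padded ASCII numeric field like those in archive headers. The function name and exact behavior are illustrative assumptions, not necessarily what the crate does: the loop accumulates the value and stops at the first space, rather than first scanning for the end of the digits and then parsing.

```rust
/// Hypothetical single-pass parser for a space-padded numeric field.
/// Returns `None` for an empty field, a leading space, a non-digit byte,
/// or overflow. Parsing stops at the first space; anything after it is
/// ignored. `radix` must be at most 36, otherwise `to_digit` panics.
fn parse_u64_digits(digits: &[u8], radix: u32) -> Option<u64> {
    // A field that is empty or starts with padding contains no number.
    if digits.first().map_or(true, |&b| b == b' ') {
        return None;
    }
    let mut result: u64 = 0;
    for &c in digits {
        if c == b' ' {
            // Padding reached: the number is complete.
            return Some(result);
        }
        let digit = (c as char).to_digit(radix)?;
        result = result
            .checked_mul(u64::from(radix))?
            .checked_add(u64::from(digit))?;
    }
    Some(result)
}
```

For example, `parse_u64_digits(b"1234      ", 10)` yields `Some(1234)`, and an octal mode field such as `b"644 "` parses with radix 8.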

This makes `memchr` a non-optional dependency, due to its use in `read_string` method of `pod::Bytes`.

nagisa · 2021-05-10T12:55:51Z

Done.

philipc requested changes May 10, 2021

nagisa added 3 commits May 10, 2021 15:55

Replace other .position() patterns with memchr

de7ecdd

nagisa force-pushed the nagisa/memchr-everywhere branch from b3d84de to de7ecdd Compare May 10, 2021 12:55

philipc approved these changes May 10, 2021

philipc merged commit 8ee62e1 into gimli-rs:master May 10, 2021

nagisa mentioned this pull request May 11, 2021

Use the object crate for metadata reading rust-lang/rust#83640

Merged

mcbegamerxx954 pushed a commit to mcbegamerxx954/object that referenced this pull request Jun 15, 2024

Merge pull request gimli-rs#302 from nagisa/nagisa/memchr-everywhere

3cacb7e

Replace other `iter().position()` instances with `memchr`
