Replace other iter().position() instances with memchr #302

Merged 3 commits on May 10, 2021

Commits on May 10, 2021

  1. Use memchr::memchr{,2} for archive parsing

    This makes parsing of the archive headers significantly faster. With the
    `ar` example adjusted to parse the same archive 1 million times and run
    on the rlib of the `object` crate itself, the new code produces the
    following metrics:
    
        788.19 msec     task-clock:u              #    0.998 CPUs utilized
     2,502,967,113      cycles:u                  #    3.176 GHz
     7,780,571,392      instructions:u            #    3.11  insn per cycle
    
    In contrast to the following for the old code:
    
      1,061.09 msec     task-clock:u              #    0.998 CPUs utilized
     3,374,141,510      cycles:u                  #    3.180 GHz
    12,012,570,139      instructions:u            #    3.56  insn per cycle
    
    This results in a reduction of about 0.87B cycles, or a roughly 26%
    reduction in wall-clock time.
    
    Originally `perf` would show a heavy hotspot (in the area of 50% of the
    total runtime) in `parse_sysv_extended_name`.
    nagisa committed May 10, 2021
    7353fff
  2. Further reduce the runtime of parse_u64_digits

    Here, instead of figuring out the extents of the integer ahead of time,
    we check for the spaces while we compute the number itself. This further
    reduces the runtime of the aforementioned case (see previous commit) to:
    
        580.57 msec     task-clock:u              #    0.997 CPUs utilized
     1,843,957,595      cycles:u                  #    3.176 GHz
     5,901,570,558      instructions:u            #    3.20  insn per cycle
    
    `perf report` still shows that most of the time is spent parsing
    sysv archive names (which makes sense – it's pretty much all the program
    does, after all!).
    nagisa committed May 10, 2021
    119d995
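
The single-pass idea described in the second commit can be sketched roughly as follows. This is an illustration only, not the code from the `object` crate: the function names `parse_digits_two_pass` and `parse_digits_single_pass` are hypothetical, and the sketch assumes a space-padded numeric field of the kind found in archive headers.

```rust
/// Hypothetical "before" version: locate the end of the digits first,
/// then walk the digits a second time to compute the value.
fn parse_digits_two_pass(field: &[u8], radix: u32) -> Option<u64> {
    // First pass: find the extent of the number (it ends at the first space).
    let len = field.iter().position(|&b| b == b' ').unwrap_or(field.len());
    if len == 0 {
        return None;
    }
    // Second pass: accumulate the value, rejecting non-digits and overflow.
    let mut value: u64 = 0;
    for &b in &field[..len] {
        let digit = (b as char).to_digit(radix)?;
        value = value
            .checked_mul(u64::from(radix))?
            .checked_add(u64::from(digit))?;
    }
    Some(value)
}

/// Hypothetical "after" version: check for the terminating space while
/// computing the number, so the field is only walked once.
fn parse_digits_single_pass(field: &[u8], radix: u32) -> Option<u64> {
    // An empty field, or one that starts with padding, holds no number.
    if field.first().map_or(true, |&b| b == b' ') {
        return None;
    }
    let mut value: u64 = 0;
    for &b in field {
        if b == b' ' {
            // Reached the padding: the number is complete.
            break;
        }
        let digit = (b as char).to_digit(radix)?;
        value = value
            .checked_mul(u64::from(radix))?
            .checked_add(u64::from(digit))?;
    }
    Some(value)
}
```

Both functions are intended to return the same result for any input; the second simply avoids the separate scan for the end of the digits.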
  3. Replace other .position() patterns with memchr

    This makes `memchr` a non-optional dependency, due to its use in the
    `read_string` method of `pod::Bytes`.
    nagisa committed May 10, 2021
    de7ecdd
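
For readers unfamiliar with the pattern named in the PR title, here is a minimal sketch of an `iter().position()` byte search of the kind being replaced. The helper `read_c_string` is hypothetical and is not the crate's `read_string`; it uses only the standard library so that it compiles without extra dependencies, and the comment marks where a `memchr::memchr` call could be substituted.

```rust
/// Hypothetical helper: splits `data` at the first NUL byte, returning the
/// bytes before it and the remainder after it. Returns `None` when there is
/// no NUL terminator.
fn read_c_string(data: &[u8]) -> Option<(&[u8], &[u8])> {
    // The pattern in question: a closure-based search over the bytes.
    // With the `memchr` crate as a dependency, this line could instead be
    // `let nul = memchr::memchr(0, data)?;`, which also yields the index
    // of the first matching byte as an `Option<usize>`.
    let nul = data.iter().position(|&b| b == 0)?;
    Some((&data[..nul], &data[nul + 1..]))
}
```

Because both searches return the index of the first match as `Option<usize>`, the swap should leave the surrounding logic unchanged.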