Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[perf] path::parent_path() is slow #88

Closed
tstack opened this issue Jan 20, 2021 · 6 comments
Closed

[perf] path::parent_path() is slow #88

tstack opened this issue Jan 20, 2021 · 6 comments
Assignees
Labels
available on master Fix is done on master branch, issue closed on next release enhancement New feature or request
Milestone

Comments

@tstack
Copy link

tstack commented Jan 20, 2021

Describe the bug
The path::parent_path() method is unreasonably slow compared to something like POSIX's dirname(3).

To Reproduce
Using this file:

#include <libgen.h>
#include <chrono>
#include "ghc/filesystem.hpp"

int main(int argc, char *argv[])
{
  {
    std::string path_str = "/a/really/long/path/to/test/performance/of/parent_path";

    {
      ghc::filesystem::path path(path_str);
      auto start = std::chrono::system_clock::now();

      for (int lpc = 0; lpc < 1000000; lpc++) {
        auto par = path.parent_path();
      }
      auto stop = std::chrono::system_clock::now();

      printf("ghc duration: %d\n", (int32_t) (stop - start).count());
    }
    {
      auto start = std::chrono::system_clock::now();

      for (int lpc = 0; lpc < 1000000; lpc++) {
        auto par = (char *) malloc(path_str.length() + 1);

        strcpy(par, path_str.c_str());
        dirname(par);
        free(par);
      }
      auto stop = std::chrono::system_clock::now();

      printf("dirname(3) duration: %d\n", (int32_t) (stop - start).count());
    }
  }
}

Compiled with:

$ g++ --std=c++14 -O3 path_perf.cc

On my 2013 iMac with a 3.5Ghz Core i7, I get the following results:

ghc duration: 2337478
dirname(3) duration: 94162

Expected behavior
The performance of parent_path() should be a lot better.

Additional context
I'm using this library in https://github.com/tstack/lnav and parent_path() turned out to be a major source of slowness when a lot of files were open.

(This library is great, thanks for making it!)

@trapexit
Copy link

To be fair... have you done a performance comparison with some standard versions?

https://github.com/gulrak/filesystem/blob/master/include/ghc/filesystem.hpp#L2703

Might also be worth replacing the implementation with dirname and see how that compares.

@tstack
Copy link
Author

tstack commented Jan 20, 2021

have you done a performance comparison with some standard versions?

Using the following compiler:

Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 11.0.3 (clang-1103.0.32.62)
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

And updating the test file to use std::filesystem, the results are:

fs duration: 133115
dirname(3) duration: 97202

Which is a reasonable result.

@gulrak gulrak self-assigned this Jan 20, 2021
@gulrak
Copy link
Owner

gulrak commented Jan 20, 2021

I'll look into it, thanks for reporting.

@gulrak gulrak added the enhancement New feature or request label Jan 20, 2021
@gulrak
Copy link
Owner

gulrak commented Jan 20, 2021

Okay, I could reproduce your measurements. Thanks for delivering the nice test!

On my MacBook Pro i9, 2.4GHz I got:

ghc duration: 2261615
dirname(3) duration: 93431

I reworked the implementation and now get:

ghc duration: 177320
dirname(3) duration: 95402

It's not as fast as the libc++ std::filesystem::path::parent_path() that gets here:

fs duration: 105919
dirname(3) duration: 94327

But they use string_views internally and as this implementation needs to compile from C++11 on, and I don't want do differentiate too much of the code, I think I am quite happy with the improvement.

gulrak added a commit that referenced this issue Jan 20, 2021
@tstack
Copy link
Author

tstack commented Jan 21, 2021

That looks like a good improvement to me, thanks for taking a look.

@gulrak gulrak added the available on master Fix is done on master branch, issue closed on next release label Jan 21, 2021
@gulrak gulrak added this to the v1.4.2 milestone Jan 21, 2021
@gulrak gulrak modified the milestones: v1.4.2, v1.5.0 Jan 30, 2021
@gulrak
Copy link
Owner

gulrak commented Feb 7, 2021

This is now part of release v1.5.0

@gulrak gulrak closed this as completed Feb 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
available on master Fix is done on master branch, issue closed on next release enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants