Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmark custom getline implemention #3101

Closed
springmeyer opened this issue Oct 1, 2015 · 8 comments
Closed

Benchmark custom getline implemention #3101

springmeyer opened this issue Oct 1, 2015 · 8 comments
Milestone

Comments

@springmeyer
Copy link
Member

@artemp added a new getline impl in be437eb that is able to cope with embedded newlines in csv lines. To feel comfortable going forward with this we need to benchmark compared to std::getline to ensure performance is not terrible. Our getline will do slightly more work so as long as it is not > 10% slower we should be okay.

@springmeyer springmeyer added this to the Mapnik v3.0.6 milestone Oct 1, 2015
@springmeyer
Copy link
Member Author

Slower, but not by much:

$ ./benchmark/out/test_getline --iterations 10000000 --line "one          
two"
std::getline:                              t:0 i:10000000     285 milliseconds
csv_utils::getline_csv:                    t:0 i:10000000     320 milliseconds

@springmeyer
Copy link
Member Author

Sometimes faster when multithreaded:

$ ./benchmark/run
Inheriting from mapnik-settings.env
std::getline:                              t:0 i:10000000     293 milliseconds
csv_utils::getline_csv:                    t:0 i:10000000     338 milliseconds
std::getline:                              t:20 i:500000      488 milliseconds
csv_utils::getline_csv:                    t:20 i:500000      371 milliseconds

@springmeyer
Copy link
Member Author

Ugh: profiling shows that most time is being taken in locale locking due to the stream.widen called per loop (

std::getline(s,csv_line,s.widen(newline));
). So we're not actually isolating getline yet...

@springmeyer
Copy link
Member Author

Before:

std::getline:                              t:20 i:500000      488 milliseconds
csv_utils::getline_csv:                    t:20 i:500000      371 milliseconds

After 336170c:

$ ./benchmark/out/test_getline --iterations 500000 --threads 20
std::getline:                              t:20 i:500000       32 milliseconds
csv_utils::getline_csv:                    t:20 i:500000       61 milliseconds

@springmeyer
Copy link
Member Author

I'm comfortable with perf for now. I've modified the tests so that they pass with the new getline implementation. If getline shows up in profiling as a bottleneck in the future then we can optimize it perhaps by avoiding the sentry (which is currently the bottleneck) and using C-style code that does not use std::ios: http://stackoverflow.com/a/8123198

@springmeyer
Copy link
Member Author

for the record, sentry and clear appear to be the bottlenecks in our getline impl currently:

@springmeyer
Copy link
Member Author

hmm, github image uploads are now working right now...

@artemp
Copy link
Member

artemp commented Oct 2, 2015

@springmeyer - thanks for the bench 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants