optimize small Getters #124

Open
winterland1989 opened this issue Sep 21, 2016 · 3 comments
Comments

@winterland1989

winterland1989 commented Sep 21, 2016

After some profiling, I came to the conclusion that the current running logic is too complex: in order to support bytesRead, we have two Decoder types, and the overhead of runGetIncremental is noticeable when the getter is small (around 100 ns compared to cereal on my machine).

This puts binary at a disadvantage in various benchmarks: many benchmarks focus on a single combinator, and the overhead of running it renders that kind of benchmark meaningless.

There are cases where we really do need to run small getters, but I haven't actually needed bytesRead yet. So let's either find a way to improve bytesRead, or remove it?
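For context, the two Decoder types in question look roughly like this (sketched from Data.Binary.Get and Data.Binary.Get.Internal of that era; the internal type is also called Decoder in the library, renamed IDecoder here so both fit in one module):

```haskell
import qualified Data.ByteString as B
import Data.Int (Int64)

type ByteOffset = Int64

-- Public incremental result (Data.Binary.Get), roughly:
data Decoder a
  = Fail !B.ByteString !ByteOffset String
  | Partial (Maybe B.ByteString -> Decoder a)
  | Done !B.ByteString !ByteOffset a

-- Internal decoder (Data.Binary.Get.Internal), roughly. Note the extra
-- BytesRead constructor: runGetIncremental has to interpret it in an
-- outer loop to track the byte offset, which costs something even when
-- the getter itself is tiny.
data IDecoder a
  = IFail !B.ByteString String
  | IPartial (Maybe B.ByteString -> IDecoder a)
  | IDone !B.ByteString a
  | IBytesRead {-# UNPACK #-} !Int64 (Int64 -> IDecoder a)
```

The translation from the internal type to the public one is where the per-run overhead mentioned above lives.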

@kolmodin
Owner

Advancing the buffer pointer was moved to the outer loop since that gave a large speedup. This was done a few years ago when the library switched to CPS.
Since then other libraries have tried different approaches - IIRC attoparsec updates the pointer in the innermost loop and still has great performance.
It's definitely worth looking into. All in all, binary is due for an overhaul to try out some "new" ideas and a change of API.

@winterland1989
Author

Yes, attoparsec encodes Pos into the CPS Parser type. My intuition is that by doing the same trick we can eliminate the extra Decoder type and speed up runGetIncremental. I'll try to verify this when I get time.
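A minimal sketch of that trick, threading the position through the continuations instead of a dedicated BytesRead constructor (names Get', runGet', bytesRead', getWord8' are illustrative, not binary's actual API, and the failure continuation is omitted for brevity):

```haskell
{-# LANGUAGE RankNTypes #-}
import qualified Data.ByteString as B
import Data.Word (Word8)

-- A CPS getter that carries the byte position alongside the input,
-- in the style of attoparsec's Pos.
newtype Get' a = Get'
  { runGet' :: forall r.
       B.ByteString                      -- remaining input
    -> Int                               -- bytes consumed so far ("Pos")
    -> (B.ByteString -> Int -> a -> r)   -- success continuation
    -> r
  }

-- bytesRead now falls out for free: just hand the current position to
-- the success continuation; no extra Decoder constructor to interpret.
bytesRead' :: Get' Int
bytesRead' = Get' $ \inp pos ks -> ks inp pos pos

-- Primitives advance the position inline:
getWord8' :: Get' Word8
getWord8' = Get' $ \inp pos ks ->
  case B.uncons inp of
    Just (w, rest) -> ks rest (pos + 1) w
    Nothing        -> error "sketch: a real version would demand more input"
```

Whether updating the position in the innermost loop costs anything in practice is exactly the open question; attoparsec suggests it can be cheap.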

@winterland1989
Author

winterland1989 commented Sep 21, 2016

Partially resolved in #125. Now the overhead is low enough not to be a practical problem, but it still skews some single-combinator benchmarks, for example the one in the scanner package (which is fairly meaningless anyway).

I think maybe we should document the trade-offs of the current approach somewhere, so that people won't mistakenly think binary is slow. Actually, it's the fastest one in my benchmark.
