Ideas for better bytestring builder #194
Comments
@fumieval I've seen your work and I've been meaning to reach out to you about it. It looks promising, but I'd like to better understand its trade-offs: is it really strictly superior in all cases, or can you point out pathological cases where the current builder would be at a significant disadvantage (these need not be blockers)? Is it possible to switch to your implementation while being 100% API compatible with the old implementation (i.e. same type signatures, no observable semantic differences)? |
bytestring's Builder tends to be slow when working with small primitives (e.g. Word8). From my small benchmark:
AFAIK both |
Hrm; about the performance penalty of |
Because |
@fumieval so are you saying that the difference/penalty is mostly noticeable for small-ish serializations (i.e. in the <= 32KiB range)? |
Yes, currently it has a big (constant?) overhead. If the whole process takes more than 100μs, I think mason gets faster than others (as shown in https://github.com/fumieval/mason#performance) in most cases. |
@fumieval what if the caller could provide a size hint for the expected output buffer size (i.e. an integer argument)? Would it then be possible to mitigate the constant overhead and get on par with the current builder's performance? |
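For reference, the current bytestring builder already accepts a buffer-size hint through `Data.ByteString.Builder.Extra`. The sketch below only illustrates that existing API; `buildWithHint` and its `sizeHint` argument are hypothetical names, and it makes no claim about whether a similar hint would remove mason's constant overhead.

```haskell
import qualified Data.ByteString.Builder as B
import Data.ByteString.Builder.Extra
  (defaultChunkSize, safeStrategy, toLazyByteStringWith)
import qualified Data.ByteString.Lazy as L

-- Hypothetical helper: the first buffer is allocated with the caller's
-- estimate, later buffers fall back to bytestring's default chunk size.
buildWithHint :: Int -> B.Builder -> L.ByteString
buildWithHint sizeHint =
  toLazyByteStringWith (safeStrategy sizeHint defaultChunkSize) L.empty

main :: IO ()
main = do
  let out = buildWithHint 4096 (foldMap B.word8 (replicate 1000 42))
  -- Report the total length and how many chunks were produced.
  print (L.length out, length (L.toChunks out))
```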
I think the problem really is the overhead of thread creation, not about the buffer size. fast-builder implements a rather complicated trick to avoid this issue: http://hackage.haskell.org/package/fast-builder-0.1.2.0/docs/src/Data.ByteString.FastBuilder.Internal.html#toLazyByteStringWith |
I wasn't able to replace Builder with Mason. Looks like fast builder doesn't provide options to wrap lazy string into builder. |
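For comparison, this is how the current `Data.ByteString.Builder` wraps an existing lazy ByteString via `lazyByteString`. A drop-in replacement would presumably need an equivalent; whether mason or fast-builder offer one is not confirmed here. The `framed` function is a hypothetical example.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy.Char8 as L8

-- Hypothetical framing function: an 8-byte big-endian length prefix
-- followed by the payload, which is wrapped without unpacking it.
framed :: L8.ByteString -> B.Builder
framed payload =
  B.int64BE (L8.length payload) <> B.lazyByteString payload

main :: IO ()
main =
  -- 8 bytes of length prefix plus the 5-byte payload.
  print (L8.length (B.toLazyByteString (framed (L8.pack "hello"))))
```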
A new approach based on delimited continuations seems promising: ghc-proposals/ghc-proposals#313 (comment) |
@fumieval haven't looked at a lot of your work (which is rather broad btw, thank you), but have you considered something along the lines of builder or small-bytearray-builder. These could be re-worked to use Ptr rather easily, and I'm interested to see how they stack up. Additionally, @andrewthad has done a lot of thinking about builders and may have some interesting input here. |
In bytebuild, I use GC-managed |
@chessai small-bytearray-builder's design looks very interesting, but both builder and small-bytearray-builder produce fully strict byte arrays only, while we need streaming behaviour for toLazyByteString and hPutBuilder. @andrewthad I didn't know about bytes-builder-shootout, that looks useful! If mason works as intended (primitives are well fused and inlined), producing strict ByteString should be faster than fast-builder. I'll try adding mason to bytes-builder-shootout |
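A minimal sketch of the streaming behaviour mentioned above, using only the current bytestring API. `rows` is a hypothetical workload; the point is that `hPutBuilder` writes through the handle's buffer and `toLazyByteString` yields a sequence of chunks, rather than one fully materialised strict array.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as L
import System.IO (BufferMode (BlockBuffering), hSetBinaryMode, hSetBuffering, stdout)

-- Hypothetical workload: the numbers 1..n, one per line.
rows :: Int -> B.Builder
rows n = foldMap (\i -> B.intDec i <> B.char7 '\n') [1 .. n]

main :: IO ()
main = do
  -- The hPutBuilder documentation recommends binary mode and block buffering.
  hSetBinaryMode stdout True
  hSetBuffering stdout (BlockBuffering Nothing)
  -- Streams the output through the handle's buffer.
  B.hPutBuilder stdout (rows 3)
  -- The lazy result is a list of strict chunks produced on demand.
  print (length (L.toChunks (B.toLazyByteString (rows 100000))))
```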
I added mason to bytes-builder-shootout. The latest release of mason didn't do very well in this benchmark:
I investigated the code and it turns out that the lack of worker-wrapper transformation (which isn't necessary in fast-builder and small-bytearray-builder) is slowing it down. Flattening the internal structure drastically improved the speed.
|
@fumieval is there anything actionable here? |
Unfortunately, mason's performance has regressed a lot since GHC 9.2, and I didn't manage to figure out why. I decided to put the project on hold until the delimited continuation primops arrive. |
@fumieval Delimited continuations have been merged: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7942 Any new progress? |
As https://github.com/haskell-perf/strict-bytestring-builders pointed out, current Data.ByteString.Builder tends to be slow especially when primitives are small.
Some alternatives have good benchmark results (cf. https://www.reddit.com/r/haskell/comments/e6yg76/mason_faster_bytestring_builder/, https://github.com/fumieval/mason#performance, http://hackage.haskell.org/package/fast-builder); we may be able to take some tricks from those.
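To make the "small primitives" case concrete, here is a sketch of the workload shape in question. It is not a benchmark and reports no timings; `perByte` and `packed` are hypothetical names. Both builders produce the same bytes, but the first composes one `word8` primitive per byte.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)

-- One Builder primitive per byte: the pattern described as slow.
perByte :: [Word8] -> B.Builder
perByte = foldMap B.word8

-- The same bytes packed first and emitted as a single lazy ByteString.
packed :: [Word8] -> B.Builder
packed = B.lazyByteString . L.pack

main :: IO ()
main = do
  let input = take 100000 (cycle [0 .. 255])
      a = B.toLazyByteString (perByte input)
      b = B.toLazyByteString (packed input)
  -- Both routes yield identical output; only the construction differs.
  print (L.length a, a == b)
```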
It also makes sense to expand the API so that
hPutBuilder
I think it's important that the internal APIs are exposed. At the moment it's difficult to send the contents of a Builder over a socket because BuildSignal is opaque.
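A workaround that is possible today, sketched under assumptions: run the Builder to a lazy ByteString and hand each chunk to a send function. `sendBuilder` is a hypothetical helper, and its `send` argument stands in for something like `sendAll` from the network package applied to a socket. This goes through intermediate chunks rather than filling a socket buffer directly, which is the kind of cost that exposing the internals would help avoid.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as L
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Hypothetical helper: `send` is an assumed stand-in for a socket send.
sendBuilder :: (S.ByteString -> IO ()) -> B.Builder -> IO ()
sendBuilder send = mapM_ send . L.toChunks . B.toLazyByteString

main :: IO ()
main = do
  -- Collect the "sent" chunks in an IORef instead of using a real socket.
  sent <- newIORef []
  sendBuilder (\chunk -> modifyIORef' sent (chunk :)) (B.string7 "GET / HTTP/1.0\r\n\r\n")
  chunks <- readIORef sent
  print (S.concat (reverse chunks))
```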