Make Data.ByteString.Lazy.Char8.lines less strict #562
Conversation
@Bodigrim Something went wrong early in downloading GHC 8.2 for Ubuntu, which also cancelled all the other CI jobs. Perhaps a transient glitch, but it seems I have no permission to kick the CI. Please make it go. A review would also be great.
I retried CI without success. I don't have time to investigate that further right now.
Although it was (partially) documented, the old strictness behavior is surprising and I expect any potential performance benefit is negligible outside of the degenerate small-chunks case, so I'm happy to replace it. (Have you benchmarked to verify that any performance impact is negligible?)
(force-pushed from c79262c to 827df30)
We don't have existing benchmarks for `lines`, but looking at the code path taken when the initial chunk holds multiple lines I see no opportunity for performance degradation; if anything the new code should be faster. The in-chunk loop boils down to:

```haskell
lines (Chunk c0 cs0) = let l :| ls = lines1 c0 cs0 in l : ls
  where
    lines1 c cs
        | len > 1 = c1 <| lines1 (B.unsafeDrop 1 t0) cs
        ...
      where
        (h0, t0) = B.break (== 0x0a) c
        len = B.length t0
        c1 = if B.null h0 then Empty else Chunk h0 Empty
```

which is essentially the same as the old:

```haskell
loop0 c cs =
    case B.elemIndex (c2w '\n') c of
        ...
        Just n | n /= 0 -> Chunk (B.unsafeTake n c) Empty
                           : loop0 (B.unsafeDrop (n+1) c) cs
               | otherwise -> Empty
                              : loop0 (B.unsafeTail c) cs
```

Strict ByteStrings optimise `break (== c)`, so the two searches should compile to comparable code.
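To make the laziness argument concrete without the library's internal `Chunk`/`Empty` spine, here is a self-contained sketch that models a lazy ByteString as a plain list of `String` chunks (a simplification; `lazyLines` and `consNE` here are illustrative names, not the library's code). The lazily-matching `consNE` is what lets a finished line be emitted before the rest of the input is examined:

```haskell
import Data.List.NonEmpty (NonEmpty (..), toList)

-- Split a chunked stream into lines, modeled on the patched loop:
-- finished lines are emitted lazily, and a line spanning several
-- chunks still makes its first chunk's characters available early.
lazyLines :: [String] -> [String]
lazyLines []       = []
lazyLines (c : cs) = toList (go c cs)
  where
    go :: String -> [String] -> NonEmpty String
    go c' cs' = case break (== '\n') c' of
      (h, _ : t)
        | not (null t) -> consNE h (go t cs')  -- more lines in this chunk
        | otherwise    -> h :| lazyLines cs'   -- newline ended the chunk
      (h, []) -> case cs' of                   -- line spans into later chunks
        []       -> h :| []
        (d : ds) -> let l :| ls = go d ds      -- lazy pattern: the outer cons
                    in (h ++ l) :| ls          -- cell is built immediately

    -- Cons a finished line without forcing the remaining lines
    -- (NonEmpty guarantees the lazy pattern match cannot fail).
    consNE :: a -> NonEmpty a -> NonEmpty a
    consNE x ~(y :| ys) = x :| (y : ys)
```

For instance, `take 3 (head (lazyLines ("abc" : repeat "xyz")))` yields `"abc"` even though the input never ends, whereas an eager version would search forever for the first newline.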
(force-pushed from e5b2716 to 564ee0a)
@vdukhovni please rebase atop #563

Done.

Rebasing on #563 should resolve the CI issues...
(force-pushed from 7a10c98 to e191a74)
Rebased on #563.
@clyring I think it is looking pretty clean now. Once there are no residual issues and you mark the PR approved, I'll squash and wait for any final reviews.
I looked at the generated "core" output from GHC 9.2 (with -O), and indeed there are no unwanted thunks in the expected case that a chunk has an internal newline. The interface file has wrappers to unbox the arguments and call into the main loop, which itself compiles to a tight worker.
FWIW, the unboxed `(n+1)` is computed twice; it is easy to patch the code to avoid that too, but I think the micro-optimisation is not worth the price in readability:

```diff
--- a/Data/ByteString/Lazy/Char8.hs
+++ b/Data/ByteString/Lazy/Char8.hs
@@ -882,7 +882,8 @@ lines (Chunk c0 cs0) = unNE $! go c0 cs0
     go :: S.ByteString -> ByteString -> NonEmpty ByteString
     go c cs = case B.elemIndex (c2w '\n') c of
         Just n
-            | n + 1 < B.length c -> consNE c' $ go (B.unsafeDrop (n+1) c) cs
+            | n1 <- n + 1
+            , n1 < B.length c -> consNE c' $ go (B.unsafeDrop n1 c) cs
             -- 'c' was a multi-line chunk
             | otherwise -> c' :| lines cs
             -- 'c' was a single-line chunk
```
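The pattern-guard form in that patch (`| n1 <- n + 1, n1 < ...`) binds a name inside a guard so the value is computed once and reused. A tiny standalone illustration of the syntax (the function `classify` is hypothetical, not part of the library):

```haskell
-- A pattern guard ('m <- n + 1') binds m for the remaining guards
-- and for the alternative's right-hand side.
classify :: Int -> String
classify n = case n of
  _ | m <- n + 1
    , m < 10    -> "small successor: " ++ show m
    | otherwise -> "large"
```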
This looks good now. Just two aesthetic quibbles:

- You may wish to remove `Data.ByteString.break` from the commit message.
- The comment for `consNE` is now out-of-date.

The maintainers will generally squash when merging, so doing so yourself is not required.

I did also confirm with a toy benchmark (`nf L8.lines (L.fromStrict $ stimes 32 loremIpsum)`) that this patch does not regress performance in the typical many-more-lines-than-chunks situation. (...and 827df30 was about 60% slower, almost entirely because `(<|)` is terrible.)
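That toy benchmark isn't in the repository; below is a rough, boot-libraries-only rendition of the same shape of measurement. The `loremIpsum` text and the crude `getCPUTime` timing are stand-ins for whatever the real harness used; a proper comparison would use a benchmarking library such as tasty-bench:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Semigroup (stimes)
import System.CPUTime (getCPUTime)
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L8

-- Stand-in for the lorem ipsum text mentioned in the comment:
-- 40 identical 56-byte lines, each terminated by a newline.
loremIpsum :: S.ByteString
loremIpsum = S.concat
  (replicate 40 "Lorem ipsum dolor sit amet, consectetur adipiscing elit.\n")

-- Force the result of 'lines' by summing the line lengths, so the
-- whole list structure is actually built.
forceLines :: L8.ByteString -> Int
forceLines = fromIntegral . sum . map L8.length . L8.lines

main :: IO ()
main = do
  let input = L8.fromStrict (stimes (32 :: Int) loremIpsum)
  t0 <- getCPUTime
  let n = forceLines input
  n `seq` pure ()
  t1 <- getCPUTime
  putStrLn ("total line bytes: " ++ show n
            ++ ", cpu picoseconds: " ++ show (t1 - t0))
```

The timing printed is a single unaveraged sample, so treat it as a smoke test rather than a benchmark result.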
The current implementation of `lines` in Data.ByteString.Lazy.Char8 is too strict. When a "line" spans multiple chunks it traverses all the chunks to the first line boundary before constructing the list head. For example, `lines <$> getContents` reading a large file with no line breaks does not make the first chunk of the (only) line available until the entire file is read into memory.
Thanks for pitching in. I've updated the comments as suggested and squashed. I hope this is it. Also, is it worth bothering with the hypothetical patch below (from #562 (comment))? Does your benchmark show any difference?

```diff
--- a/Data/ByteString/Lazy/Char8.hs
+++ b/Data/ByteString/Lazy/Char8.hs
@@ -882,7 +882,8 @@ lines (Chunk c0 cs0) = unNE $! go c0 cs0
     go :: S.ByteString -> ByteString -> NonEmpty ByteString
     go c cs = case B.elemIndex (c2w '\n') c of
         Just n
-            | n + 1 < B.length c -> consNE c' $ go (B.unsafeDrop (n+1) c) cs
+            | n1 <- n + 1
+            , n1 < B.length c -> consNE c' $ go (B.unsafeDrop n1 c) cs
             -- 'c' was a multi-line chunk
             | otherwise -> c' :| lines cs
             -- 'c' was a single-line chunk
```
```haskell
prop_lines_empty_invariant =
    True === case LC.lines (LC.pack "\nfoo\n") of
        Empty : _ -> True
        _         -> False
```
Could we also have a randomized test that uses `invariant` or `checkInvariant` to check the output of `lines`? I was expecting this test to exist already, but I couldn't find it.
Could that be a separate PR, or would you care to propose the test? It'd take me some scarce cycles to page in (or acquire) the know-how to construct randomised tests for this; I don't use the facilities in question often enough in real life...
Yeah, this can be done separately. To generate the test input, I'd simply use some arbitrary `LazyByteString`s, interleaved with additional newlines.
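As a stopgap for that suggestion, here is a self-contained sketch of the same idea that exhaustively enumerates small inputs instead of generating random ones (the helpers `smallInputs`, `chunkings`, and `linesAgree` are illustrative names, not proposed test code). It checks that `lines` on every chunking of a string agrees with the Prelude's `lines` on the flat string:

```haskell
import Control.Monad (replicateM)
import qualified Data.ByteString.Char8 as S
import qualified Data.ByteString.Lazy.Char8 as L8

-- Every string over a tiny alphabet up to the given length.
smallInputs :: Int -> [String]
smallInputs n = concatMap (\k -> replicateM k "a\n") [0 .. n]

-- Every way to cut a string into non-empty contiguous chunks.
chunkings :: String -> [[String]]
chunkings [] = [[]]
chunkings s  = [ take k s : rest
               | k <- [1 .. length s]
               , rest <- chunkings (drop k s) ]

-- 'lines' on the chunked lazy string must agree with the Prelude's
-- 'lines' on the flat string, regardless of chunk boundaries.
linesAgree :: String -> Bool
linesAgree s =
  all (\ch -> map L8.unpack (L8.lines (L8.fromChunks (map S.pack ch)))
                == lines s)
      (chunkings s)

checkAll :: Bool
checkAll = all linesAgree (smallInputs 5)
```

A QuickCheck version would replace the enumeration with `Arbitrary` generators and could additionally run `checkInvariant` on each result, as suggested above.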
I have opened #564 to track this.
Cheers!
Are we waiting for @Bodigrim to review, or can this now be merged? By the way, when squashing I did inadvertently merge that ...
Thanks @vdukhovni!
The current implementation of `lines` in Data.ByteString.Lazy.Char8 is too strict. When a "line" spans multiple chunks it traverses all the chunks to the first line boundary before constructing the list head. For example, `lines <$> getContents` reading a large file with no line breaks does not make the first chunk of the (only) line available until the entire file is read into memory.

Co-authored-by: Viktor Dukhovni <ietf-dane@dukhovni.org>
(cherry picked from commit eb352a9)
The current implementation of `lines` in Data.ByteString.Lazy.Char8 is too strict. When a "line" spans multiple chunks it traverses all the chunks to the first line boundary before constructing the list head. For example, `lines <$> getContents` reading a large file with no line breaks does not make the first chunk of the (only) line available until the entire file is read into memory.

Now that `Data.ByteString.break` is optimised for the `(== c)` case, we can get efficient code for the common many-lines-per-chunk use-case without being needlessly strict. Tests added to make sure that the first chunk is available promptly without looking further.
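For context on that last point, `break (== c)` on a strict ByteString splits at the first occurrence of a single byte, the same position that `elemIndex` locates, which is what keeps the per-chunk search cheap. A minimal illustration (`firstLineOf` is a hypothetical helper, not library code):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Char8 as B

-- break (== '\n') splits a strict chunk at the first newline; per the
-- commit message, the (== c) predicate is the specially optimised case.
firstLineOf :: B.ByteString -> B.ByteString
firstLineOf = fst . B.break (== '\n')
```

For example, `firstLineOf "ab\ncd"` is `"ab"`, the prefix ending where `B.elemIndex '\n'` reports index 2.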