Deduplicate findIndexOrEnd by exporting it from Data.ByteString.Internal #337

Merged (2 commits) on Jan 11, 2021

noughtmare
Contributor

Closes #334

I am open to other suggestions for the section name "Internal indexing".

Member

@sjakobi sjakobi left a comment

> I am open to other suggestions for the section name "Internal indexing".

Just "Indexing" would also do, IMHO.

Does this PR improve any benchmark results? Feel free to add benchmarks if the current suite doesn't cover the issue from #334.

@noughtmare
Contributor Author

noughtmare commented Dec 19, 2020

> Does this PR improve any benchmark results? Feel free to add benchmarks if the current suite doesn't cover the issue from #334.

As far as I'm aware, no existing benchmarks test the affected functions. I could add a new one in this PR, but then we would only see the new "score", with no baseline to compare against. The affected functions are those in D.B.Lazy that use findIndexOrEnd: takeWhile, dropWhile, break, group and groupBy.
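To make the dependency concrete, here is a minimal sketch of how a lazy takeWhile can be built on a per-chunk findIndexOrEnd. The names findIndexOrEndSketch and takeWhileSketch are hypothetical stand-ins written for illustration, not the library's actual definitions, and the simple indexing loop is an assumption rather than the optimised code this PR exports.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Internal as LI
import qualified Data.ByteString.Unsafe as U
import Data.Word (Word8)

-- Hypothetical stand-in for the internal helper: the index of the first
-- byte satisfying p, or the length of the chunk when no byte matches.
findIndexOrEndSketch :: (Word8 -> Bool) -> S.ByteString -> Int
findIndexOrEndSketch p bs = go 0
  where
    n = S.length bs
    go i
      | i >= n = n
      | p (U.unsafeIndex bs i) = i  -- safe: 0 <= i < n here
      | otherwise = go (i + 1)

-- Sketch of a lazy takeWhile: scan one strict chunk at a time and stop
-- at the first chunk that contains a byte failing the predicate.
takeWhileSketch :: (Word8 -> Bool) -> L.ByteString -> L.ByteString
takeWhileSketch p = go
  where
    go LI.Empty = LI.Empty
    go (LI.Chunk c cs) =
      case findIndexOrEndSketch (not . p) c of
        0 -> LI.Empty
        n | n < S.length c -> LI.Chunk (S.take n c) LI.Empty
          | otherwise -> LI.Chunk c (go cs)
```

Because every chunk goes through the helper once, any slowdown in the lazy module's copy of findIndexOrEnd shows up directly in takeWhile, dropWhile, break and the grouping functions.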

I think group is an odd one, because all the others have a custom predicate argument. I think group could actually be improved by using a function that uses memchr internally, but that is probably something for another issue.

So, I am now thinking of adding extra benchmarks in BenchAll.hs for the takeWhile, dropWhile, break and groupBy functions.

The BenchAll.hs file is a bit strange. At the top it says "Benchmark all 'Builder' functions.", but I think other people have added benchmarks to this file that are not related to the Builder functions, so it might be okay to put these benchmarks there.

Also, I don't know what the best input would be for these benchmarks. For dropWhile, takeWhile and break I could just use a lazy ByteString that is a few chunks large where each chunk is filled with a single byte repeated 4k times so that the condition is never met. But groupBy might need some other kind of input.
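A minimal sketch of such inputs follows. The chunk count of 10 and the reading of "4k" as 4096 bytes are assumptions, and the alternating zero/one input is only one plausible candidate for groupBy, not necessarily what the benchmarks ended up using.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L

-- Assumed sizes: "4k" is read as 4096 bytes; 10 chunks is arbitrary.
chunkSize, chunkCount :: Int
chunkSize = 4096
chunkCount = 10

-- Each chunk is the byte 0 repeated chunkSize times, so a predicate such
-- as (/= 0) is never met and every chunk is scanned to its end.
zeroes :: L.ByteString
zeroes = L.fromChunks (replicate chunkCount (S.replicate chunkSize 0))

-- Alternating 0, 1, 0, 1, ...: adjacent bytes always differ, so grouping
-- by equality yields one group per byte (a worst case for group).
zeroOne :: L.ByteString
zeroOne =
  L.fromChunks (replicate chunkCount (S.pack (take chunkSize (cycle [0, 1]))))
```

Building the value with L.fromChunks rather than L.replicate keeps the chunk boundaries explicit, which matters because the functions under test do their work per chunk.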

@Bodigrim
Contributor

BenchAll.hs is kinda kitchen sink these days, please add benchmarks for dropWhile and takeWhile there.

@Bodigrim
Contributor

Bodigrim commented Jan 9, 2021

@noughtmare sorry to nudge, but I'd like to get this merged before the next release.

@noughtmare
Contributor Author

noughtmare commented Jan 11, 2021

Sorry for the long wait. I'm thinking of reordering the history so that the benchmarks are added before the changes. That way, you can more easily benchmark before and after the changes. I'm also having trouble with naming things.

@noughtmare
Contributor Author

noughtmare commented Jan 11, 2021

I've added some benchmarks. The results on my machine:

| Benchmark | Old | New |
|---|---|---|
| takeWhile | 46.15 μs | 35.34 μs |
| dropWhile | 49.15 μs | 36.65 μs |
| break | 45.95 μs | 50.95 μs |
| group zeroes | 6.874 μs | 6.283 μs |
| group zero-one | 247.0 μs | 248.7 μs |
| groupBy (>=) | 191.1 μs | 187.1 μs |
| groupBy (>) | 522.4 μs | 508.4 μs |

No spectacular changes, but almost everything improves a little. I think group zero-one is not all that much different, but I'm confused by the regression of break.
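To put "a little" into numbers, the relative change per benchmark can be computed from the figures in the table above. This is only a sketch of the arithmetic: the names relativeChange, results and changes are made up for illustration, and single-run timings like these carry measurement noise that the calculation ignores.

```haskell
-- Relative change in percent; a negative value means the new time is lower.
relativeChange :: Double -> Double -> Double
relativeChange old new = (new - old) / old * 100

-- (benchmark, old time in μs, new time in μs), copied from the table above.
results :: [(String, Double, Double)]
results =
  [ ("takeWhile", 46.15, 35.34)
  , ("dropWhile", 49.15, 36.65)
  , ("break", 45.95, 50.95)
  , ("group zeroes", 6.874, 6.283)
  , ("group zero-one", 247.0, 248.7)
  , ("groupBy (>=)", 191.1, 187.1)
  , ("groupBy (>)", 522.4, 508.4)
  ]

-- Percentage change for every benchmark in the table.
changes :: [(String, Double)]
changes = [(name, relativeChange old new) | (name, old, new) <- results]
```

Evaluating changes in GHCi shows takeWhile and dropWhile dropping by roughly a quarter, while break is the only entry that moves upward by a comparable margin.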

@Bodigrim
Contributor

On my machine benchmarks are:

findIndexOrEnd/takeWhile                 mean 36.32 μs  ( +- 1.579 μs  )
findIndexOrEnd/dropWhile                 mean 39.87 μs  ( +- 1.647 μs  )
findIndexOrEnd/break                     mean 39.18 μs  ( +- 1.237 μs  )
findIndexOrEnd/group zeroes              mean 4.920 μs  ( +- 138.9 ns  )
findIndexOrEnd/group zero-one            mean 232.8 μs  ( +- 19.99 μs  )
findIndexOrEnd/groupBy (>=)              mean 151.0 μs  ( +- 7.790 μs  )
findIndexOrEnd/groupBy (>)               mean 440.4 μs  ( +- 19.53 μs  )

vs.

findIndexOrEnd/takeWhile                 mean 29.68 μs  ( +- 728.2 ns  )
findIndexOrEnd/dropWhile                 mean 32.34 μs  ( +- 624.3 ns  )
findIndexOrEnd/break                     mean 32.33 μs  ( +- 488.3 ns  )
findIndexOrEnd/group zeroes              mean 4.733 μs  ( +- 78.60 ns  )
findIndexOrEnd/group zero-one            mean 190.5 μs  ( +- 6.970 μs  )
findIndexOrEnd/groupBy (>=)              mean 128.9 μs  ( +- 3.507 μs  )
findIndexOrEnd/groupBy (>)               mean 380.9 μs  ( +- 18.70 μs  )

So performance improves all across the board, and I do not observe any regression for break.

@Bodigrim Bodigrim requested a review from sjakobi January 11, 2021 18:43
@noughtmare
Contributor Author

It occurred to me that these benchmarks will need to be changed if rewrite rules are added that would rewrite dropWhile (== x) to something that uses breakByte or spanByte like the strict module has. Maybe we should pre-emptively change the benchmark?

@Bodigrim
Contributor

@noughtmare yes, this is a good idea, let's replace dropWhile (== 0) by dropWhile even or something similar.
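The concern can be illustrated with a small sketch. Everything here is hypothetical: dropWhileSketch and dropByteSketch are stand-in names, the rule is not one that exists in the lazy module, and whether a rule like this fires at all depends on optimisation settings and inlining.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Word (Word8)

-- Hypothetical generic path: scans with an arbitrary predicate.
dropWhileSketch :: (Word8 -> Bool) -> L.ByteString -> L.ByteString
dropWhileSketch = L.dropWhile
{-# NOINLINE dropWhileSketch #-}

-- Hypothetical stand-in for a specialised single-byte path (in a real
-- library this is where a spanByte-style fast path would live).
dropByteSketch :: Word8 -> L.ByteString -> L.ByteString
dropByteSketch x = L.dropWhile (== x)
{-# NOINLINE dropByteSketch #-}

-- Illustrative rewrite rule: redirects the (== x) shape to the fast path.
{-# RULES
"dropWhileSketch/eq" forall x. dropWhileSketch (== x) = dropByteSketch x
  #-}

-- Matches the rule's left-hand side, so it may stop measuring the
-- generic findIndexOrEnd-based path once such a rule exists.
viaEq :: L.ByteString -> L.ByteString
viaEq = dropWhileSketch (== 0)

-- `even` does not have the (== x) shape, so the rule cannot match and
-- the generic path is always the one being measured.
viaEven :: L.ByteString -> L.ByteString
viaEven = dropWhileSketch even
```

On an all-zero input both versions drop every byte, so switching the predicate keeps the workload the same while making the benchmark robust against future rules of this kind.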

@noughtmare
Contributor Author

I have changed the benchmarks.

@Bodigrim Bodigrim added this to the 0.11.1.0 milestone Jan 11, 2021
@Bodigrim Bodigrim merged commit eea70ff into haskell:master Jan 11, 2021
@Bodigrim
Contributor

Thanks!

Successfully merging this pull request may close these issues.

findIndexOrEnd is faster in Data.ByteString than in Data.ByteString.Lazy for no reason.