New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Span and pipelinify all the things #1

Open
wants to merge 4 commits into
base: master
from

Conversation

3 participants
@davidfowl

davidfowl commented Jun 8, 2018

This blog post basically nerd sniped me. So here's a version of the line parser and file reader that tries to be extremely optimal (there's some other things that can be done but I have to sleep). It's using System.IO.Pipelines to do the file reading and Span to do the line parsing. I copied BufferReader over from the ASP.NET Core common repository (Kestrel uses this internally for HTTP Parsing).

On my machine these are the results:

#13 ViaPipeReader
Took: 6,625 ms
Allocated: 504 kb
Peak Working Set: 15,772 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0

Latest Results:

Ok now we're cooking:

#13 ViaPipeReader
Took: 5,844 ms
Allocated: 16 kb
Peak Working Set: 13,120 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0
@indy-singh

This comment has been minimized.

Show comment
Hide comment
@indy-singh

indy-singh Jun 8, 2018

Owner

Hi @davidfowl

Love it! I honestly didn't expect the article to generate so much discussion. A lot of people have been asking about a Span<T> version and now I can just point them to V13 - thanks for contributing!

Cheers,
Indy

Owner

indy-singh commented Jun 8, 2018

Hi @davidfowl

Love it! I honestly didn't expect the article to generate so much discussion. A lot of people have been asking about a Span<T> version and now I can just point them to V13 - thanks for contributing!

Cheers,
Indy

@kierenj

This comment has been minimized.

Show comment
Hide comment
@kierenj

kierenj Jun 8, 2018

You wouldn't fancy putting together a similar blog write-up of this, @davidfowl ? I mean, the code is great, but another decent blog post on these topics would make the world a better place?

kierenj commented Jun 8, 2018

You wouldn't fancy putting together a similar blog write-up of this, @davidfowl ? I mean, the code is great, but another decent blog post on these topics would make the world a better place?

@davidfowl

This comment has been minimized.

Show comment
Hide comment
@davidfowl

davidfowl Jun 8, 2018

Allocations are worse here because of the ArrayPoolMemoryPool implementation (Not the ArrayPool):

image

There's an unfortunate allocation per return/return. Luckily the pool is pluggable but I haven't bothered to go further.

davidfowl commented Jun 8, 2018

Allocations are worse here because of the ArrayPoolMemoryPool implementation (Not the ArrayPool):

image

There's an unfortunate allocation per return/return. Luckily the pool is pluggable but I haven't bothered to go further.

@davidfowl

This comment has been minimized.

Show comment
Hide comment
@davidfowl

davidfowl Jun 8, 2018

Here's what 12 looks like:

image

Allocations are much better. I'll work on a fix later on today.

davidfowl commented Jun 8, 2018

Here's what 12 looks like:

image

Allocations are much better. I'll work on a fix later on today.

@davidfowl

This comment has been minimized.

Show comment
Hide comment
@davidfowl

davidfowl Jun 9, 2018

Ok added a simpler array pool that doesn't allocate per rent/return 😄 👍

Took: 6,000 ms
Allocated: 80 kb
Peak Working Set: 15,464 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0

Still 80K though. Here's the memory profile:

image

davidfowl commented Jun 9, 2018

Ok added a simpler array pool that doesn't allocate per rent/return 😄 👍

Took: 6,000 ms
Allocated: 80 kb
Peak Working Set: 15,464 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0

Still 80K though. Here's the memory profile:

image

@davidfowl

This comment has been minimized.

Show comment
Hide comment
@davidfowl

davidfowl Jun 9, 2018

4K buffer. I think to go into the single digit realm I'll need a custom pipe implementation that just uses a single 4K buffer for the entire read.

#13 ViaPipeReader
Took: 6,406 ms
Allocated: 56 kb
Peak Working Set: 15,264 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0

davidfowl commented Jun 9, 2018

4K buffer. I think to go into the single digit realm I'll need a custom pipe implementation that just uses a single 4K buffer for the entire read.

#13 ViaPipeReader
Took: 6,406 ms
Allocated: 56 kb
Peak Working Set: 15,264 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0
@davidfowl

This comment has been minimized.

Show comment
Hide comment
@davidfowl

davidfowl Jun 9, 2018

Ok now we're cooking:

#13 ViaPipeReader
Took: 5,844 ms
Allocated: 16 kb
Peak Working Set: 13,120 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0

davidfowl commented Jun 9, 2018

Ok now we're cooking:

#13 ViaPipeReader
Took: 5,844 ms
Allocated: 16 kb
Peak Working Set: 13,120 kb
Gen 0 collections: 0
Gen 1 collections: 0
Gen 2 collections: 0

indy-singh added a commit that referenced this pull request Jun 9, 2018

Added v14 - 8kb allocations
Skipping v13 as #1

- Drop in allocation comes from fixing a small buffer size on the file stream. But this may back fire as there is a trade off between small buffer size vs time.

- Potential speed increase (but really dependent on the line length) but making sure we only touch each index once.

Should really start using BenchmarkDotNet now, as things are now in the nitty gritty
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment