IAsyncEnumerable 2 times slower to enumerate than a regular IEnumerable? #1560
Comments
Does increasing the buffer size significantly make a difference? The only place async happens is reading from the stream (`CsvHelper/src/CsvHelper/CsvFieldReader.cs`, line 117 in 8185f5d).
@JoshClose Setting the buffer size to a higher value significantly increased the speed. Reading 410,000 lines of CSV took ~7 seconds synchronously, and now only ~9 seconds asynchronously. This seems about right. Thoughts?

EDIT: It seems that for every 410,000 lines read on my machine, 2 seconds of overhead are added to the total computation when doing async IO. No matter how high I set the buffer, I can't bring it up to the speed of the sync IO. Perhaps this IS the overhead of 410,000 thread switches and context switches (if any).

TL;DR: If I had 5 million lines in a CSV, synchronous IO would complete 24.38 seconds faster than asynchronous IO.
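For reference, a minimal sketch of how the buffer could be raised, assuming a CsvHelper version contemporary with this issue (where `CsvReader.Configuration` is still mutable). The 128 KB figure and the `data.csv` path are illustrative assumptions, not values from the thread:

```csharp
using (var reader = new StreamReader("data.csv")) // path is a placeholder
using (var csv = new CsvHelper.CsvReader(reader, System.Globalization.CultureInfo.InvariantCulture))
{
    // Larger buffer => fewer (and cheaper, amortized) async reads.
    csv.Configuration.BufferSize = 128 * 1024; // 128 KB; illustrative only
    // ...enumerate csv.GetRecords<T>() or csv.GetRecordsAsync<T>() as usual.
}
```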
@spiritbob I would suggest that this is expected behavior. Async methods are not designed to make things faster; they're designed to keep threads from blocking, and that does add overhead. Based on what you've listed, my bet is that you're reading data from a local file, which is likely not a good use case for an async method. Async reading is better when you're reading from a network stream or another remote data source where delays and lags are to be expected. That makes better use of the thread pool.
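The per-call overhead being described can be seen without CsvHelper at all. This sketch (not from the thread; names, sizes, and the deliberately tiny buffer are our assumptions) times the same in-memory data read once with `Stream.Read` and once with `Stream.ReadAsync`, so the async state-machine cost is paid on every call even though each read completes synchronously:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

public static class AsyncReadOverhead
{
    // Returns (sync elapsed, async elapsed) for reading `data` in
    // `bufferSize`-byte chunks. A small buffer maximizes the call count,
    // which is where async bookkeeping overhead accumulates.
    public static (TimeSpan Sync, TimeSpan Async) Measure(byte[] data, int bufferSize)
    {
        var buffer = new byte[bufferSize];

        var sw = Stopwatch.StartNew();
        using (var ms = new MemoryStream(data))
        {
            while (ms.Read(buffer, 0, buffer.Length) > 0) { }
        }
        var syncElapsed = sw.Elapsed;

        sw.Restart();
        using (var ms = new MemoryStream(data))
        {
            // Run on a thread-pool thread and block-wait, mirroring the
            // console-app test described later in this thread. Each await
            // pays state-machine bookkeeping even though a MemoryStream
            // read never actually yields.
            Task.Run(async () =>
            {
                while (await ms.ReadAsync(buffer, 0, buffer.Length) > 0) { }
            }).GetAwaiter().GetResult();
        }
        var asyncElapsed = sw.Elapsed;

        return (syncElapsed, asyncElapsed);
    }
}
```

With a buffer of a few dozen bytes over a megabyte of data, the async pass typically takes measurably longer, which is consistent with the overhead reported above shrinking as the buffer grows.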
@TonyValenti I'm reading an IFormFile. I agree that this is the expected behavior.
@spiritbob Even though it's an IFormFile, it can be backed by any stream: network, file, and so forth. Are you reading from the disk, or are you taking this in as an HTTP request? I'm going to guess there is no buffer backing it. I would test this by wrapping it in a BufferedStream with a buffer of at least 64-128k. We drastically sped up an app of ours that was performing a lot of reads from a network share. Do you have sample code you can share?
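A minimal sketch of that suggestion, assuming the upload's backing stream comes from something like `IFormFile.OpenReadStream()` in ASP.NET Core (the helper name and default size are ours; the 128 KB value follows the 64-128k range suggested above):

```csharp
using System.IO;

public static class StreamBuffering
{
    // Wrap an arbitrary source stream in a read buffer before handing it
    // to the CSV reader. BufferedStream batches many small reads into
    // fewer large ones, which matters most when each underlying read is
    // expensive (network share, HTTP body, etc.).
    public static Stream WithReadBuffer(Stream source, int bufferSize = 128 * 1024)
    {
        return new BufferedStream(source, bufferSize);
    }
}
```

Disposing the returned `BufferedStream` also disposes the source, so only the wrapper needs a `using`.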
@joefeser It's a simple HTTP request in ASP.NET Core 3.1. I believe if the file is less than a certain size it's stored in memory, and otherwise on the hard disk? I forget the exact numbers; feel free to enlighten me. Was your approach applied to that framework? If so, how?
@spiritbob Network streams should never be used for performance benchmarks. There is no telling how many packet analyzers sit in the path, especially on a corporate network.
@joefeser Sorry, if you were referring to my actual testing environment: I believe I read the file from disk, but my practical use case is ASP.NET Core.
What the hell?
I get that you leverage the async-await flow, which, done correctly, means no thread is ever blocked, but I'd gladly block a thread to do the IO if it meant up to a 2x increase in speed.

What is happening at the lower levels? You can easily test this: just call `CsvReader.GetRecordsAsync<YourObject>()` and enumerate, versus `CsvReader.GetRecords<YourObject>()` and enumerate. I feel like there's some fake-async going on, because there's no way this gets slowed down this much by the overhead of thread switching/context switching alone.

Tested in a console application, running the process on a thread-pool thread (the main console thread block-waiting for it to finish).