-
Notifications
You must be signed in to change notification settings - Fork 5k
[API Proposal]: Get/Enumerate line spans from TextReader #114987
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Tagging subscribers to this area: @dotnet/area-system-io |
There will be lifetime problem to return spans. The API shape of The shape of |
The key difference between |
I see. The current implementation of StreamReader holds a byte buffer and a char buffer at the same time, for ReadLine it uses a StringBuilder which internally could use linked list storage thus could not get a single span. Any chances the char buffer be resizable, or be backed by ArrayPool/Pipelines? The Pipelines introduction is just too long and warns about a lot of gotchas, and doesn't include how to get a line span, but StringBuilder instead. Looks way more complex compared to the example I have up here. I'm OK with TextReader's ReadLineAsSpan being abstract or throws by default. Need StreamReader to work and StringReader to test. As for decoding overhead, yes, ultimately one would need some sort of Utf8StreamReader and some sort of Utf8Regex to avoid decoding. But those sound like asking too much? |
The example in OP doesn't work at all because of lifetime violation. It should at least be: bool TryReadLine(Span<char> destination); |
This shape could work, with an additional parameter I don't get why returning span violates lifetime though. The following code compiles fine. class MyReader
{
char[] _charBuffer = ['1', '\n'];
public ReadOnlySpan<char> ReadLine() => _charBuffer.AsSpan(0..1);
} |
No, this requires the char buffer to be persisted as a state of the reader. Subsequent read will either invalidate the span, or prevent it to be pooled in any form at all. |
I meant the TryReadLine you proposed. |
Simply throw away the old value and get the new one. Spans are not meant to be kept for a long time by callers, for example, the Span returned by |
This is considered unsafe and not widely used, only with strong performance considerations. |
Uh oh!
There was an error while loading. Please reload this page.
Background and motivation
When parsing a file line by line, and there's no need to keep the whole line as a string, span based APIs will be helpful for reducing unnecessary allocations up to the file size, as we already have span based Regex APIs.
API Proposal
API Usage
ReadLineAsSpan:
EnumerateLines:
Alternative Designs
Risks
For the span returning proposal, subsequent read will invalidate the span, user should not keep the value before the next call.
For the
TryReadLine
design, user need to manage the buffer by themselves.The text was updated successfully, but these errors were encountered: