Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use new 'PeekableStream' to do binary categorization. #2753

Merged
merged 6 commits into from
Jan 11, 2024

Conversation

scottoneil-ms
Copy link
Contributor

PeekableStream offers a very weak Seek(Origin) capability that expires once a configurable amount of Stream has been read.

moving files around
public class PeekableStreamTests
{
[Fact]
public void PeekableStream_BoundaryRewinds()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PeekableStream_BoundaryRewinds

So, can we assume that there wasn't a need to produce other directed unit tests because the enumerated artifact class and maybe zip archive already exercise what's interesting? I think that's true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this covers a lot, but I've thought of some other cases to hit and you added one as well.

@michaelcfanning
Copy link
Member

Release note, please! For this excellent change.

@michaelcfanning
Copy link
Member

I know the class is internal and we're swapping out for function that should be equivalent. We'll want signal in the release notes of when we made the switch-over though, in case users report an issue in the enumerated artifact class.


In reply to: 1885935208

throw new System.NotImplementedException();
}

public override int Read(byte[] buffer, int offset, int count)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read

ok! I think I understand. your peekable stream is kinda like a pocket knife with a 'stop' before it fully opens. users can read the header bytes and you will 'stop' when you hit the transition point that will require you to abandon the rewind buffer. this gives the caller the change to easily fully inspect the header without worrying about an over-read. At this point, either the caller presumably rewinds (to get the entire stream by passing it to another reader) OR moves forward.

Two things:

  1. this is weird, but for readability, I might have ordered this code to reflect these three states reflecting where we are in the stream, 1) within the rewind buffer, 2) at the transition point, 3) past the transition point.

Simple because this makes it easier to build a mental model of what's going on by reading your commented code.

Second thing, consider a large comment at the top that describes the general approach. For fun, ask Copilot to generate it. :) haha.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "stop" point is more an implementation detail to make implementation easier than an API feature. The caller is in control of the buffer size. That's 100% of what they need to know to use the class correctly. I would not want them forming dependencies on the partial read behavior.

What the partial reads do, is let me operate over one byte range per Read() call. That simplifies things a bit. It would be harder to read and get correct (I claim) if I got hung up on delivering all the requested bytes, and had to compose both an internal buffer read and a call into the underlying stream, in the same call.

Copy link
Member

@michaelcfanning michaelcfanning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@scottoneil-ms scottoneil-ms merged commit 08c4e34 into main Jan 11, 2024
8 checks passed
@scottoneil-ms scottoneil-ms deleted the users/scott/peekable branch January 11, 2024 01:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants