Split reader to a Parser and a Reader
#449
Because of `impl Deref<Target = Reader> for NsReader`, it can be confusing if NsReader's own `reader` field is inaccessible
These functions will be reused by the async reading methods. Co-authored-by: Sophie Tauchert <sophie.tauchert@relaxdays.de>
This will allow us to replace the reader while keeping the state, which is required to implement sync-to-async conversion and vice versa
This commit only moves code and adds `pub` to all fields and methods. In the future we should add tests just for parsing
This avoids duplicating constructor code
Codecov Report
@@ Coverage Diff @@
## master #449 +/- ##
==========================================
+ Coverage 51.24% 51.30% +0.06%
==========================================
Files 27 28 +1
Lines 13295 13312 +17
==========================================
+ Hits 6813 6830 +17
Misses 6482 6482
{
    let mut reader = Self::from_reader(s.as_bytes());
    reader.encoding = EncodingRef::Explicit(UTF_8);
    reader.parser.encoding = EncodingRef::Explicit(UTF_8);
Encoding doesn't feel like it belongs on the parser. It's primarily an aspect of reading and of how bytes are provided to the reader, although the parser can override the original default. But as I mentioned on the other PR, even the manual override isn't really something that either encoding_rs or encoding_rs_io will give you, so that will need some thought.
Encoding is an interesting field. It can be part of the configuration (when you create a reader with from_str) and it can be part of the parse state (when you create a reader with other functions). Because I plan to use it in the following way:
/// Convert any synchronous reader to an asynchronous one if the inner reader supports that
impl<R: AsyncBufRead + Unpin> From<Reader<R>> for Reader<TokioReader<R>> {
    fn from(reader: Reader<R>) -> Self {
        Self {
            reader: TokioReader(reader.reader),
            parser: reader.parser,
        }
    }
}
/// Convert any asynchronous reader to a synchronous one if the inner reader supports that
impl<R> From<Reader<TokioReader<R>>> for Reader<R> {
    fn from(reader: Reader<TokioReader<R>>) -> Self {
        Self {
            reader: reader.reader.0,
            parser: reader.parser,
        }
    }
}
it is convenient to leave it in the Parser. It can be moved later if required.
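For illustration only, here is a self-contained sketch of the same idea without tokio. The `Parser` fields, the `AsyncAdapter` wrapper and the `into_async` / `into_sync` methods are hypothetical, simplified stand-ins and not the actual quick-xml types. The point it tries to show is that swapping the inner reader leaves the parse state (offset, encoding) untouched:

```rust
// Hypothetical, simplified stand-ins; these are NOT the real quick-xml types.

/// Parse state that has to survive a swap of the underlying reader.
#[derive(Clone, Debug, PartialEq)]
struct Parser {
    /// Number of bytes consumed so far.
    offset: usize,
    /// Name of the encoding in use (a plain string here for simplicity).
    encoding: &'static str,
}

/// Marker wrapper standing in for an async adapter such as `TokioReader`.
struct AsyncAdapter<R>(R);

struct Reader<R> {
    reader: R,
    parser: Parser,
}

impl<R> Reader<R> {
    fn from_reader(reader: R) -> Self {
        Self {
            reader,
            parser: Parser { offset: 0, encoding: "UTF-8" },
        }
    }

    /// Wrap the inner reader and carry the parser state over unchanged.
    fn into_async(self) -> Reader<AsyncAdapter<R>> {
        Reader {
            reader: AsyncAdapter(self.reader),
            parser: self.parser,
        }
    }
}

impl<R> Reader<AsyncAdapter<R>> {
    /// Unwrap the inner reader, again keeping the parser state.
    fn into_sync(self) -> Reader<R> {
        Reader {
            reader: self.reader.0,
            parser: self.parser,
        }
    }
}

fn main() {
    let mut reader = Reader::from_reader("<a/>".as_bytes());
    reader.parser.offset = 4; // pretend that one event was already read

    let async_reader = reader.into_async();
    assert_eq!(async_reader.parser.offset, 4);

    let sync_reader = async_reader.into_sync();
    assert_eq!(sync_reader.parser.offset, 4);
    assert_eq!(sync_reader.parser.encoding, "UTF-8");
    assert_eq!(sync_reader.reader, b"<a/>");
}
```

Keeping the encoding next to the offset in one struct means a conversion only has to move two fields, which is the convenience argued for above.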
src/reader/mod.rs
Outdated
TagState::Exit => return Ok(Event::Eof),
let event = match self.parser.state {
    ParseState::Init => self.read_until_open(buf, true),
    ParseState::Closed => self.read_until_open(buf, false),
Might make sense to rename these variants as e.g. ParseState::ClosedTag, ParseState::OpenTag
#[derive(Clone)]
pub(super) struct Parser {
    /// current buffer position, useful for debugging errors
    /// Number of bytes read from the source of data since the parser was created
Which is not going to work once we support multiple encodings, unfortunately. At least not unless "source of data" is defined as the decoded stream of data.
We can change it when we change how encodings are handled
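To make the concern in this thread concrete, here is a small sketch using only the standard library. It assumes a UTF-16LE source as an example of a non-UTF-8 encoding; the helper names `to_utf16le` and `from_utf16le` are made up for this illustration. It shows that a position counted in source bytes and a position counted in the decoded UTF-8 stream can differ for the same place in the document:

```rust
/// Encode `text` as UTF-16LE bytes, standing in for a non-UTF-8 data source.
fn to_utf16le(text: &str) -> Vec<u8> {
    text.encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect()
}

/// Decode UTF-16LE bytes into a UTF-8 `String`; `None` on malformed input.
fn from_utf16le(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16(&units).ok()
}

fn main() {
    let xml = "<a>é</a>";
    let source = to_utf16le(xml);
    let decoded = from_utf16le(&source).expect("valid UTF-16LE");

    // Position of the closing tag in the decoded (UTF-8) stream.
    let decoded_offset = decoded.find("</a>").expect("closing tag present");
    // Position of the same tag in the original UTF-16LE byte stream:
    // every UTF-16 code unit before it occupies two bytes.
    let source_offset = decoded[..decoded_offset].encode_utf16().count() * 2;

    assert_eq!(decoded_offset, 5); // "<a>" is 3 bytes, "é" is 2 bytes in UTF-8
    assert_eq!(source_offset, 8); // 4 code units, 2 bytes each
    println!("decoded offset = {decoded_offset}, source offset = {source_offset}");
}
```

Whichever of the two the field ends up counting, the doc comment would need to say so once more than one encoding is supported.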
Why don't you like "rebase and merge"?
Without a merge commit it is hard to figure out in which PR the work was done, and where one PR finished and the next one started.
This PR has two goals:
reader/mod.rs file to keep it maintainable
Also, I include in this PR some minor changes that simplify the async implementation.
(If you merge, please do so with a merge commit)