Skip to content

Conversation

@Mingun
Copy link
Collaborator

@Mingun Mingun commented Jul 31, 2022

This PR has two goals:

  • reduce size of the reader/mod.rs file to keep it maintainable
  • allow to replate the underlying reader and keep the state. Thus we can convert sync reader to async reader and vice versa if the underlying reader allow this

Also, I include in this PR some minor changes that simplifies async implementation.

(If merge please do that with merge commit)

Mingun and others added 7 commits July 31, 2022 15:53
It can confuse due to impl Deref<Target = Reader> for NsReader
if NsReader's own `reader` would unaccessible
This functions will be reused by async reading methods

Co-authored-by: Sophie Tauchert <sophie.tauchert@relaxdays.de>
This will allow us to replace reader while keep the state, which would required
to implement a sync-to-async conversion and vice versa
This commit only moves code + adds a `pub` to all fields / methods
In the future we should add tests just for parsing
This would allow to not duplicate constructor code
@Mingun Mingun requested a review from dralley July 31, 2022 12:40
@codecov-commenter
Copy link

codecov-commenter commented Jul 31, 2022

Codecov Report

Merging #449 (bddd109) into master (ad57bc2) will increase coverage by 0.06%.
The diff coverage is 96.87%.

@@            Coverage Diff             @@
##           master     #449      +/-   ##
==========================================
+ Coverage   51.24%   51.30%   +0.06%     
==========================================
  Files          27       28       +1     
  Lines       13295    13312      +17     
==========================================
+ Hits         6813     6830      +17     
  Misses       6482     6482              
Flag Coverage Δ
unittests 51.30% <96.87%> (+0.06%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/reader/ns_reader.rs 65.13% <83.33%> (+1.77%) ⬆️
src/reader/mod.rs 89.10% <94.44%> (-1.50%) ⬇️
src/reader/parser.rs 98.60% <98.60%> (ø)
src/reader/slice_reader.rs 100.00% <100.00%> (ø)
src/reader/buffered_reader.rs 76.14% <0.00%> (+0.50%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ad57bc2...bddd109. Read the comment docs.

{
let mut reader = Self::from_reader(s.as_bytes());
reader.encoding = EncodingRef::Explicit(UTF_8);
reader.parser.encoding = EncodingRef::Explicit(UTF_8);
Copy link
Collaborator

@dralley dralley Jul 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encoding doesn't feel like it belongs on the parser. It's primarily an aspect of reading and the way bytes are provided to the reader, but the parser can override the original default. But as I mentioned on the other PR even the manual override isn't really something that encoding_rs nor encoding_rs_io will give you so that will need some thought.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encoding is an interesting field. In can be part of configuration (when you create reader from_str) and it can be part of the parse state (when you create reader by other functions). Because I plan to use that in such manner:

/// Convert any synchronous reader to asynchronous one if inner reader support that
impl<R: AsyncBufRead + Unpin> From<Reader<R>> for Reader<TokioReader<R>> {
    fn from(reader: Reader<R>) -> Self {
        Self {
            reader: TokioReader(reader.reader),
            parser: reader.parser,
        }
    }
}

/// Convert any asynchronous reader to a synchronous one if inner reader support that
impl<R> From<Reader<TokioReader<R>>> for Reader<R> {
    fn from(reader: Reader<TokioReader<R>>) -> Self {
        Self {
            reader: reader.reader.0,
            parser: reader.parser,
        }
    }
}

it is convenient to leave it in the Parser. It will be possible to move it later if it is required

TagState::Exit => return Ok(Event::Eof),
let event = match self.parser.state {
ParseState::Init => self.read_until_open(buf, true),
ParseState::Closed => self.read_until_open(buf, false),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might make sense to rename these variants as e.g. ParseState::ClosedTag, ParseState::OpenTag

#[derive(Clone)]
pub(super) struct Parser {
/// current buffer position, useful for debugging errors
/// Number of bytes read from the source of data since the parser was created
Copy link
Collaborator

@dralley dralley Jul 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which is not going to work once we support multiple encodings, unfortunately. At least not unless "source of data" is defined as the decoded stream of data.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can change it when we changes the working with encodings

@dralley
Copy link
Collaborator

dralley commented Jul 31, 2022

Why don't you like "rebase and merge"

@Mingun
Copy link
Collaborator Author

Mingun commented Jul 31, 2022

Without merge commit it is hard to figure out in which PR work was done and where the one PR finished and the new started

@dralley dralley merged commit 9ccc686 into tafia:master Jul 31, 2022
@Mingun Mingun deleted the parser branch July 31, 2022 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants