Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ExactSizeIterator for Pair iterators #833

Merged
merged 5 commits into from Apr 17, 2023

Conversation

MucTepDayH16
Copy link
Contributor

PR adds impl ExactSizeIterator and size_hint accordingly for Pairs, Tokens, FlatPairs and LinesSpan. This can hint rust compiler to allocate proper memory amount while collecting to vector or similar containers

Signed-off-by: MucTepDayH16 <denisdrozhzhin1999@gmail.com>
@MucTepDayH16 MucTepDayH16 requested a review from a team as a code owner April 12, 2023 22:09
@MucTepDayH16 MucTepDayH16 requested review from NoahTheDuke and removed request for a team April 12, 2023 22:09
Signed-off-by: MucTepDayH16 <denisdrozhzhin1999@gmail.com>
Signed-off-by: MucTepDayH16 <denisdrozhzhin1999@gmail.com>
Signed-off-by: MucTepDayH16 <denisdrozhzhin1999@gmail.com>
Copy link
Contributor

@tomtau tomtau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the PR! I think this is ok to merge, but it'd be good to check if there is any impact on the iterator performance

Comment on lines 106 to 118
fn len(&self) -> usize {
let mut start = self.start;
let mut count = 0;
while start < self.end {
start += 1;
while start < self.end && !self.is_start(start) {
start += 1;
}

count += 1;
}
count
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it ok for a len to have a linear complexity / a loop there? maybe we can check with this benchmark https://github.com/pest-parser/pest/blob/master/grammars/benches/json.rs#L91 to see if it had any impact? (@huacnlee used to to check that the impact of the line index )
alternatively, perhaps there could be a private field to maintain the length iteratively?

@@ -92,6 +92,12 @@ impl<'i, R: RuleType> Tokens<'i, R> {
}
}

impl<'i, R: RuleType> ExactSizeIterator for Tokens<'i, R> {
fn len(&self) -> usize {
self.end - self.start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likely ok, but perhaps good to check / add a test that this works with non-ASCII UTF-8 characters

Signed-off-by: MucTepDayH16 <denisdrozhzhin1999@gmail.com>
@tomtau tomtau merged commit 20f0842 into pest-parser:master Apr 17, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants