Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes parsing of macro invocations in Lists and SExps and some cases … #654

Merged
merged 2 commits into from
Oct 6, 2023

Conversation

popematt
Copy link
Contributor

@popematt popematt commented Oct 5, 2023

…in Structs

Issue #, if available:

None

Description of changes:

As I was starting to write test cases in ion-tests for macros, I discovered that macros were working when they appeared at the top level of an Ion stream, or when they were nested no more than one layer deep. I've fixed that, though some issues still remain with macros in structs (see #653).

See specific notes in 🗺️ comments.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@codecov
Copy link

codecov bot commented Oct 5, 2023

Codecov Report

Attention: 16 lines in your changes are missing coverage. Please review.

Files Coverage Δ
src/lazy/text/buffer.rs 93.46% <85.71%> (-1.46%) ⬇️

... and 2 files with indirect coverage changes

📢 Thoughts on this report? Let us know!.

Copy link
Contributor Author

@popematt popematt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the long term solution needs to be some way of extracting the matcher logic from the TextBufferView and then having the TextBufferView keep track of what version it is currently reading. We could attach the matcher logic to TextEncoding_1_0 and TextEncoding_1_1 by way of a trait impl (with most of the logic being in the default implementation, and only places where it diverges having specific implementations), and then store the version in TextBufferView as an enum or a type-state. A change_version function on TextBufferView would allow a Reader to switch between Ion 1.0 and Ion 1.1 as needed.

value(None, tag(")")),
pair(
opt(Self::match_annotations),
// We need the s-expression parser to recognize the input `--3` as the operator `--` and the
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Because this delegated to match_sexp_value (implied _1_0), it would use the Ion 1.0 matchers for matching anything that is not an E-Expression inside a S-Expression, which caused this to happen:

outer::(     // Handled by match_sexp_1_1
  abc        // Handled by match_value_1_0
  (:foo)     // Handled by match_e_expression
  inner::(   // Handled by match_sexp_1_0
    def      // Handled by match_value_1_0
    (:bar)   // Invalid (for match_sexp_1_0)
  )
)

The solution was to copy/paste the content from match_sexp_value, changing it to use match_value_1_1 instead of match_value_1_0.

src/lazy/text/buffer.rs Outdated Show resolved Hide resolved
@@ -790,7 +807,7 @@ impl<'data> TextBufferView<'data> {
},
),
map(
match_and_length(Self::match_struct),
match_and_length(Self::match_struct_1_1),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ I added match_struct_1_1, so it got updated here, inside match_value_1_1.

.map(|matched| Some(RawValueExpr::MacroInvocation(matched))),
value(None, tag("]")),
terminated(
Self::match_annotated_value_1_1.map(Some),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Same problem here as in match_sexp_value_1_1. Fixed by copy/pasting the content of match_list_value, and altering this line to point to match_annotated_value_1_1.

@@ -998,7 +1019,7 @@ impl<'data> TextBufferView<'data> {
}
// Scan ahead to find the end of this struct.
let struct_body = self.slice_to_end(1);
let struct_iter = RawTextStructIterator_1_1::new(struct_body);
let struct_iter = RawTextStructIterator_1_0::new(struct_body);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ Because match_ion_struct was using RawTextStructIterator_1_1, it was theoretically possible that Ion 1.0 would accept E-Expressions as long as they were found inside a struct.

@@ -1021,6 +1042,37 @@ impl<'data> TextBufferView<'data> {
Ok((remaining, matched))
}

pub fn match_struct_1_1(self) -> IonMatchResult<'data> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This is basically a copy/paste of match_struct, but it uses RawTextStructIterator_1_1.

@@ -2276,8 +2328,9 @@ mod tests {
P: Parser<TextBufferView<'data>, O, IonParseError<'data>>,
{
let result = self.try_match(parser);
let (_remaining, match_length) = result
.unwrap_or_else(|_| panic!("Unexpected parse fail for input '{}'", self.input));
let (_remaining, match_length) = result.unwrap_or_else(|e| {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ This is a developer experience improvement. It causes a test failure to print the whole error, not just "Unexpected parse fail for input".

@@ -2709,6 +2762,56 @@ mod tests {
mismatch_sexp(input);
}
}
#[test]
fn test_match_sexp_1_1() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗺️ I added some tests for Ion 1.1 SExps and Lists, as well as some test cases for E-Expressions that are nested inside other things. This is far from comprehensive, but it's at least an improvement.

@popematt popematt requested a review from zslayton October 5, 2023 07:00
@zslayton
Copy link
Contributor

zslayton commented Oct 6, 2023

I think the long term solution needs to be some way of extracting the matcher logic from the TextBufferView and then having the TextBufferView keep track of what version it is currently reading. We could attach the matcher logic to TextEncoding_1_0 and TextEncoding_1_1 by way of a trait impl (with most of the logic being in the default implementation, and only places where it diverges having specific implementations), and then store the version in TextBufferView as an enum or a type-state. A change_version function on TextBufferView would allow a Reader to switch between Ion 1.0 and Ion 1.1 as needed.

I think that's a great idea.

@popematt popematt merged commit 755bc99 into amazon-ion:main Oct 6, 2023
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants