Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create generic ftl serializer #241

Merged
merged 30 commits into from
Nov 10, 2022
Merged

Conversation

RumovZ
Copy link
Contributor

@RumovZ RumovZ commented Nov 11, 2021

This updates @Michael-F-Bryan's work from #184 to work with the current master branch. It started out as more of a quick and ugly fix, so please let me know if there's anything you want me to smooth out.

@zbraniecki
Copy link
Collaborator

@RumovZ thank you for your contribution and apologies for a late response.

I like your PR and I think it's generally landable, but I'd like to ask for one change - for round trip validation instead of writing unit tests with strings of FTL, could you leverage existing fixtures - https://github.com/projectfluent/fluent-rs/tree/master/fluent-syntax/tests/fixtures ?

The way I imagine it is that you'd add a ./fluent-syntax/tests/serializer_fixtures.rs and pull FTL files from the fixtures directory and round trip them to see if the output is the same.

If any of the FTL doesn't roundtrip properly, you could have a blacklist static array with names of files that don't roundtrip correctly yet to fix in follow ups.

For such files we'd like to make sure that the AST created from it at least doesn't crash when serialized.

Let me know how it sounds to you.

I'm also adding @eemeli as a second reviewer.

@zbraniecki
Copy link
Collaborator

In the unit tests, I'd like to instead see a very small and basic corpus of tests that are parse -> modify -> serialize operations like:

  • change ID
  • change simple string value
  • add PluralRules variant
  • change variable reference
  • remove message
  • add message at the end
  • add message in the middle
  • add message at the top
  • add a comment
  • remove a comment
  • edit a comment

Feel free to pick any of the above, and it's totally ok if you don't add all. We can extend it as we go. I'd like to land your PR soon.

@RumovZ
Copy link
Contributor Author

RumovZ commented Apr 24, 2022

Sounds good, but it's been a while, and I'll have to reimmerse myself in the matter. Nevertheless, I'll definitely try to implement this.

@RumovZ
Copy link
Contributor Author

RumovZ commented Apr 25, 2022

I had a look and the fixtures aren't suitable for parse-serialise roundtrip testing (i.e. comparing plaintext), as they are not normalised. In fact, only the two empty files pass the test.
A lot more pass the slightly weaker parse-serialise-parse roundtrip test (i.e. comparing ASTs), but the majority also fail this one. There may be actual issues with the serializer, but I think most fail due to inconsistent whitespace around junk, which might not be trivial for the parser to preserve.
Would you like to see such a parse-serialise-parse roundtrip test? Then I'd investigate those issues.

For such files we'd like to make sure that the AST created from it at least doesn't crash when serialized.

Not sure what you mean by that. Aren't both parsing and serialising infallible, if we allow for junk?

@alerque
Copy link
Collaborator

alerque commented Apr 25, 2022

Would you like to see such a parse-serialise-parse roundtrip test?

I don't think a serializer would be any use without the ability to round-trip a parsed AST.

I can understand not being able to round-trip from the existing fixtures as many of them are designed to be problematic sources that can be parsed (partially or fully) into functional ASTs. But if you have an AST, you should be able to serialize it and then get an identical AST back by deserializing. And if you can dump the serialized format back to an FTL file, that should also parse back to the same AST. Otherwise something is broken.

@RumovZ
Copy link
Contributor Author

RumovZ commented Apr 25, 2022

Certainly, but serialize(parse(ftl)) == ftl implies parse(serialize(parse(ftl))) == parse(ftl), whereas the reversal is not true, so I'm asking whether testing the latter is sufficient from the maintainers' point of view.
I was under the impression that @zbraniecki was referring to the former, but maybe I had misunderstood.

@gregtatum
Copy link
Member

I'm triaging the pull requests, and it appears that this one has feedback that is unaddressed and is stale. Feel free to re-open as needed.

Suggestion: If you want to continue this work, if it's possible to carve it out into smaller PRs, that would make it easier for maintainers to review and land. I'm not sure without digging into the PR if that's feasible here.

@gregtatum gregtatum closed this Oct 27, 2022
@RumovZ
Copy link
Contributor Author

RumovZ commented Oct 27, 2022

As is stated above, it is not possible to address @zbraniecki's feedback or it needs to be clarified. There is no point in splitting this PR if maintainers are not interested in a merge (nor would it make any sense from a technical point of view).

@alerque
Copy link
Collaborator

alerque commented Oct 27, 2022

This PR is already a follow up to #184 and is an ongoing work for something that would be useful. It does not make sense to split it up into smaller PRs either. The design requirement needs to be settled on so it can be addressed, not closed. I think the diagnosis of "unaddressed feedback" is backwards here.

@gregtatum
Copy link
Member

Ok, I can re-open, but I'm not sure I have the bandwidth myself right now to dive into it myself.

@gregtatum gregtatum reopened this Oct 28, 2022
Copy link
Member

@gregtatum gregtatum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've had time to take some time with this PR. The implementation looks really nice. I'm marking request changes for two things:

  1. I have several docs requests that I think should be added before landing.
  2. For Zibi's suggestion I can see why he would want to use existing fixtures. Could you please give a few examples of how the existing fixtures are failing to roundtrip? It would be good to know if there are actual bugs, or if it's just whitespace issues.

If it's just whitespace issues, I would like to at least land a couple of tests initially that pass the round tripping, then we can follow-up looking at incorporating more round-tripping fixtures. This would provide more confidence for correctness and long-term maintainability.

fluent-syntax/src/serializer.rs Show resolved Hide resolved
fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Show resolved Hide resolved
@gregtatum
Copy link
Member

What Zibi says sounds reasonable. Mainly I would like to see tests that provide good coverage to ensure we don't have bugs. The advantage of having .ftl files is that they are easy to write generic tests against beyond this single case. I don't mind the inline tests like you have in the current changeset, since they are acting as unit tests next to where the code is written. However, I'd also be fine with migrating them all to .ftl files at your discretion.

Thanks for getting back to me on this. After two hiatuses of half a year, it would be cathartic to see it merged.

Now that I'm up to speed, I'll make sure and be available to help get this merged in if you are available to help contribute it! Thanks for sticking through the slow process.

I'm not sure if we're talking about the same thing #241 (comment). Please clarify if you're referring to comparing serializations or ASTs, like I've suggested here.

Mostly I was nervous about the comment above that the lack of round tripping may be bugs.

tests/fixtures/well-formed/*

Suggestion: Maybe normalized would be better than well-formed, since whitespace differences would still be well-formed, but maybe not normalized.

@RumovZ
Copy link
Contributor Author

RumovZ commented Nov 8, 2022

Is the documentation sufficient?

The roundtrip tests revealed a few minor errors with the serializer and parser. I fixed one parse error, but one remains and prevents the last fixture from roundtripping correctly:

key12 =
    { "." }
        four

The parser swallows the four spaces in front of "four", which is wrong, if the Python implemenation is to be believed. This behaviour is also inconsistent with the parsing of

key12 =
{ "." }
    four

(the actual text in the fixture), in which case the whitespace is preserved.

I find the parser code quite hard to get into, so I won't try to fix this one and this PR is finished from my side.

@zbraniecki
Copy link
Collaborator

Thanks for catching that! @stasm - do you remember how you intended for the dedentation to work here?

Copy link
Member

@gregtatum gregtatum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making the changes! This is looking almost ready to merge.

I have a few minor pieces of feedback, and it looks like CI is failing on one of the tests.

Also, when you're ready for re-review please hit the little refresh icon next to my review name so it'll show up in my review queue. I believe from your message that you were ready for re-review.

This arrow:
screenshot of the review UI

fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Outdated Show resolved Hide resolved
fluent-syntax/tests/serializer_fixtures.rs Outdated Show resolved Hide resolved
fluent-syntax/tests/serializer_fixtures.rs Outdated Show resolved Hide resolved
fluent-syntax/src/parser/pattern.rs Outdated Show resolved Hide resolved
fluent-syntax/src/serializer.rs Show resolved Hide resolved
@gregtatum
Copy link
Member

I find the parser code quite hard to get into, so I won't try to fix this one and this PR is finished from my side.

@RumovZ Would you mind filing this as a new issue?

@RumovZ RumovZ requested a review from gregtatum November 9, 2022 22:42
Copy link
Member

@gregtatum gregtatum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Woo! Thanks for being patient and working through the review process. Let's merge this!

@gregtatum gregtatum merged commit 00f1499 into projectfluent:master Nov 10, 2022
@RumovZ
Copy link
Contributor Author

RumovZ commented Nov 10, 2022

Amazing! Thank you for your feedback and support.
In the end, I didn't submit an issue for c009674. It's not a bug if you consider two TextElements equivalent to a single TextElement with the concatenation of their texts. At least, it's explicitly mentioned now.

@RumovZ RumovZ deleted the serializer branch November 10, 2022 16:25
@alerque
Copy link
Collaborator

alerque commented Nov 10, 2022

Thanks for all the work on this guys!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants