Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update parser to support editions syntax #2000

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Update parser to support editions syntax #2000

wants to merge 1 commit into from

Conversation

dcodeIO
Copy link
Member

@dcodeIO dcodeIO commented Jun 3, 2024

  • Recognizes the edition keyword. Falls back to proto3 for now.
  • Updates the parser to support more option identifiers with parenthesized extension references, e.g. features.(pb.java).legacy_closed_enum. Retains backwards compatibility with assumptions made by prior options logic.
  • Adds dummy schemas for Cpp/Go/Java features, in case these are imported by the CLI.
  • Adds a test case based on an example editions schema.

cc @protobufjs/maintainers

@@ -0,0 +1 @@
syntax = "proto3";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of these dummies? It seems like this could conflict with the actual versions. Also, it doesn't actually define any custom features, so anyone referencing those would have problems

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose I had in mind was that an import of these files may be present in a schema originating in another language or toolchain, so having dummies prevents "file not found" errors in pbjs. Left them blank because the definitions within are not of relevance to protobuf.js anyway. But now that you say it, given that the dummies might leak out to other toolchains, it seems like a good idea to include their full contents.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, also usually people would import them and reference the features defined in them. So I think you'd still hit errors from missing options definitions?

OTOH copying them verbatim here seems like a good way to get version skew issues. How do you deal with descriptor.proto? We've been modifying that pretty rapidly in the last few years

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The missing definitions aren't an issue for protobuf.js itself. It currently parses any options literally without looking up schemas. Hence the historic comment in the parser on how properties are assumed. Similarly, descriptor.proto is optional and only updated sporadically, as it is not needed internally for protobuf.js to function. Perhaps a data point that descriptor.proto hasn't leaked out so far.

if (acceptTypeRef && typeRefRe.test(token))
return token;
if (acceptIdentifier && nameRe.test(token))
return readIdentifier(token); // `ENUM_VALUE`

/* istanbul ignore next */
throw illegal(token, "value");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think readRanges needs to be updated as well. In edition 2023 we changed the grammar of reserved field/enum value names to be identifiers instead of strings. This avoids accidental concatenation

option features.(pb.go).legacy_unmarshal_json_enum = true;\
option features.(pb.java).legacy_closed_enum = true;\
message A {\
repeated int32 b = 1 [features.repeated_field_encoding = EXPANDED];\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding reserved field identifiers would be good to test the grammar changes

var option = name;
var identifier = readOptionIdentifier();

// Historically, `(some.option).prop` has been interpreted as a property
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like exactly what it is today. What's the change in behavior here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly a comment I found useful to explain why the somewhat strange code exists in the first place. Say, someone stumbles over this part, wonders why it's there, and decides to make a PR to remove it because it doesn't seem useful to them. Other than that, there's indeed no change in behavior other than also applying the fallback to syntax previously not supported, e.g. a.(b.c).d.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I think I misunderstood what was happening here. FWIW it looks like you still don't support multiple extensions (e.g. a.(b).c.(d).e), but we hopefully won't ever need that

Comment on lines +328 to +329
syntax = "proto3";
isProto3 = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems strange. Why treat editions files like proto3?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only added the fallback to produce valid schemas that can be tested, including during resolve. I guess another option is to test via the parser only, and for now produce an "editions are not supported" error during resolve?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, this is just a temporary test hook? That makes sense, but I definitely like the failure approach better. That way nobody starts depending on this behavior

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants