Skip to content

Conversation

@mkruskal-google
Copy link
Member

This reverts commit dd052c9, reversing changes made to 0d3eaed.

This breaks https://github.com/kythe/kythe, as well as some internal code.

…-code-info-pseudo-options"

This reverts commit dd052c9, reversing
changes made to 0d3eaed.
@mkruskal-google mkruskal-google merged commit 264292e into protocolbuffers:main Sep 26, 2022
@mkruskal-google
Copy link
Member Author

@jhump FYI

@jhump
Copy link
Contributor

jhump commented Sep 26, 2022

This breaks https://github.com/kythe/kythe, as well as some internal code.

@mkruskal-google, how did it break things? Are those tools expecting the source code info as it was previously being generated? These changes are very minor, so I'm surprised that anything would break. Is it just a test expectation that went red -- like some data derived from the source code info that is now slightly different?

@mkruskal-google
Copy link
Member Author

Hey Joshua, sorry for the delay on this. @jaysachs can probably fill you in on some details. I'm not entirely sure what broke on their end.

jaysachs pushed a commit to jaysachs/kythe that referenced this pull request Oct 1, 2022
@jaysachs
Copy link

jaysachs commented Oct 1, 2022

So the original PR changed what turned up as the text pan for the default value expression. See kythe/kythe@master...jaysachs:kythe:proto-default-field-change-sadness for how we'd have to adapt. Per my (admittedly rudimentary) reading of the code, our only option in the face of the original PR as originally presented to restore the original desired behavior is to grovel through the bytes of the buffer, rather than depending on the spans as returned by the parser. I think we need the value location in file descriptor, not the no-op that the original PR created.

(@justbuchanan you may also want to chime in here)

@jhump
Copy link
Contributor

jhump commented Oct 3, 2022

@jaysachs, thanks for sharing that. Although I don't quite follow what you're using the span for. Are you extracting the text from the original source based on that location? Why not instead use the default_value in the field descriptor proto?

Currently, options are represented inconsistently in three different ways:

  1. Field default has a span that is just the value.
  2. Field json_name has two spans: one that is just the value and another that is the entire declaration.
  3. All other options have one span that is the entire declaration.

My PR was just trying to make things consistent by having everything look like bullet 3 above. The main reason for that was for non-compact option declarations (like on files, messages, enums, and services): since these are considered "full declarations" they include comment information, and the leading and trailing comments are around the entire declaration, not just the value. So I think it would be confusing to have a "mixed" location: where the span is for just the value but the comments are for the broader declaration.

Perhaps instead everything should look more like bullet 2. I find it a little confusing to have redundant locations, but if it's reasonably documented that this is expected (one for the whole declaration, which may have comments, and another just the value), this is probably okay. However, I think that would also require changes to kythe since it would have to know to extract the smaller span of the two locations for the same path.

@jaysachs
Copy link

jaysachs commented Oct 3, 2022

Kythe is specifically interested in the span of the default value, for the purposes of overlaying decorations on the code. (Note that this is only an issue for enums, as default values for int64, string etc are literal values, so there's no symbol reference to decorate in those cases.)

Option (2) could work, we could adapt the code to choose the proper span.

@jhump
Copy link
Contributor

jhump commented Jan 23, 2024

@jaysachs, ping, how hard would it be to have Kythe handle a situation where the descriptor had two spans for the same path for default value? I'd like to revisit this soon and implement what was mentioned above, two spans for the default value (one with the full declaration that includes the option name, another with just the value), is the fairly straight-forward. But I don't want to land a change like that only to have it reverted again.

The spans should be emitted (IMO) with the full declaration first and then the value second, mainly so they are effectively sorted by span start (since the value's span must begin after the option name in the full declaration). That is also the easiest way to implement it since the parser starts parsing the full declaration before the value portion.

I wonder if this would even require a change to Kythe at all. If it encounters multiple locations for the path to the default value, how would it behave? If it would use the last location (perhaps a long-shot, since keep first seems more likely), then it would "just work" with the proposed descriptor tweaks.

@jaysachs
Copy link

Do you have proposed PR so we could evaluate it?

@zrlk for context

@jhump
Copy link
Contributor

jhump commented Jan 29, 2024

Do you have proposed PR so we could evaluate it?

@jaysachs, if you mean a PR to this protocolbuffers project, I don't yet. But I can probably get one together this week and ping this thread with a link.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants