Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add options to schema output #437

Merged
merged 10 commits into from
May 19, 2022
Merged

feat: add options to schema output #437

merged 10 commits into from
May 19, 2022

Conversation

Vilsol
Copy link
Collaborator

@Vilsol Vilsol commented Dec 8, 2021

This adds options support for outputSchema. This would fix #295

There is only one catch with this feature, and it's that it requires modification to the ts-proto-descriptors lib.

For each of the following:

  • FileOptions
  • MessageOptions
  • FieldOptions
  • OneofOptions
  • EnumOptions
  • EnumValueOptions
  • ServiceOptions
  • MethodOptions

Inside of the decode method for each, before the while loop it must first initialize a new untyped object:

message.encodedOptions = {};

And then change the default switch case to the following:

default:
    const startPos = reader.pos;
    reader.skipType(tag & 7);
    message.encodedOptions[tag>>>3] = [...(message.encodedOptions[tag>>>3] || []), reader.buf.slice(startPos, reader.pos)];
    break;

Due to these requirements, I have made this a draft PR.

I am not sure what would be the best approach to modify these files. Should we add a hardcoded check for these type names in this library itself, or should we make a git patch that gets applied every time before the ts-proto-descirptors lib gets published? Or maybe there is an even more elegant solution.

I am also considering adding exact generated typings for the ProtoMetadata options, where the type for options object would be a perfect match for the one in the actual protoMetadata object itself.

Also, I had to resort to base64 encoding any messages and then they would get decoded at runtime, which is not a clean way to do this. Maybe there is some way to decode it at compile time?

@Vilsol Vilsol marked this pull request as ready for review December 13, 2021 09:49
@stephenh
Copy link
Owner

Hey @Vilsol , this is a great PR! Thanks for tackling this! Sorry for the delay in reviewing this, I'm still figuring out the details of extensions/unknown fields/etc. in protobuf, so took me awhile to find a dedicated time to look at the PR & think about thinks.

I was thinking, for this change you need:

default:
    const startPos = reader.pos;
    reader.skipType(tag & 7);
    message.encodedOptions[tag>>>3] = [...(message.encodedOptions[tag>>>3] || []), reader.buf.slice(startPos, reader.pos)];
    break;

This basically sounds like adding unknown field support, i.e. while decode-ing if we see a key we don't know about, just "put it somewhere".

Wdyt about breaking that functionality out into its own PR? That we could we do:

  1. In general having unknown field support in ts-proto would be great
  2. As well as .encode storing keys in message._unknownFields, we could have .decode watch for the special ._unknownFields and if it exists, output/echo the still-encoded values back onto the wire (i.e. as mentioned here https://developers.google.com/protocol-buffers/docs/proto#extensions). (Granted, you don't need .decode support for this PR, but it seems potentially easy to sneak into a PR that adds .encode support for it...hrm, although I guess it'd need to re-create the tag, so maybe it's a little more nuanced)
  3. When this initial _unknownFields support is released in ts-proto, we could then use that new ts-proto to re-build the ts-proto-descriptors, and that would get the _unknownFields functionality you need into the ts-proto FileDescriptorProto/etc. types that you need

We could potentially put unknown field support behind a feature flag, in case users want to turn it off, but it'd probably be a good thing to have on by default, I think...

Other than that tactical approach, I had a few inline questions about maybe getting the options nudged into the existing metadata types where I think it might be a little more ergonomic to access, but otherwise this looks great!

Wdyt about the approach of an initial/separate PR to support unknown fields, and then doing this PR on top of that one?

@Vilsol
Copy link
Collaborator Author

Vilsol commented Dec 16, 2021

The separate PR is not a bad idea, would mean that anything decoded would unconditionally get re-encoded guaranteeing consistency on any type. I will see if I can get something like that written up, but most likely only after the holidays.

The reason why I did not decode the options in-place under the current FileDescriptorProto was because that would be unexpected to find some untyped field in a fully typed object, but having it there would definitely make it easier to access. My only concern is that FileDescriptorProto.fromPartial that is currently used would strip the field away, but I think we could just try and cast the object to it, considering that it just got serialized from itself.

@stephenh
Copy link
Owner

My only concern is that FileDescriptorProto.fromPartial

Ah shoot, that's a good point; you're right that it's not clear how fromPartial should handle extensions...

It could do a large destructure i.e.:

fromPartial(o) {
  const { firstName, lastName, ...unknownFields } = foo;
  return { firstName: ...coerce/default..., lastName: ...coerce/default..., unknownFields };

And then unknownFields would be a hash of field name --> value...

But for encode unknownFields would be a hash of field number --> value, so sometimes it's keyed by field name, and sometimes keyed by field number. Which is not great.

Maybe unknown fields should be set directly on the object, i.e. foo[12] would be an unknown/extension field that encode saw that had field number 12, and foo["someOption"] would be an unknown/extension field that fromPartial saw that had the field name someOption...

I'm a little wary to fundamentally change/restructure fromPartial to handle extra/unknown keys...

@shimonmagal
Copy link

I tried using the branch for the simplest enum - I got this error with:
--ts_proto_opt=outputJsonMethods=false,outputPartialMethods=false,outputClientImpl=true,outputSchema=true:

FAILED!Cannot convert undefined or null to objectTypeError: Cannot convert undefined or null to object
at Function.keys ()
at encodedOptionsToOptions (C:\Users\shimo\temp\visol\ts-proto\build\schema.js:181:30)
at Object.generateSchema (C:\Users\shimo\temp\visol\ts-proto\build\schema.js:78:23)
at Object.generateFile (C:\Users\shimo\temp\visol\ts-proto\build\main.js:156:33)
at C:\Users\shimo\temp\visol\ts-proto\build\plugin.js:22:37
at Array.map ()
at main (C:\Users\shimo\temp\visol\ts-proto\build\plugin.js:21:53)

I tried to fix this with just adding a x || [] bypass there, but didn't get to see the EnumValue option manifest in the generated proto. I must be missing something.


I just did clone, then yarn, then yarn build

Lastly ran

protoc --ts_proto_opt=outputJsonMethods=false,outputPartialMethods=false,outputClientImpl=true,outputSchema=true --plugin=protoc-gen-ts_proto=/path/to/protoc-gen-ts_proto --ts_proto_out=. ./A.proto

A.proto:

syntax = "proto3";

import "google/protobuf/descriptor.proto";

extend google.protobuf.EnumValueOptions {
  optional uint32 my_enum_value_option = 50005;
}

enum MyEnum {
  FOO = 0 [(my_enum_value_option) = 321];
  BAR = 1 [(my_enum_value_option) = 3212];
}

@Vilsol
Copy link
Collaborator Author

Vilsol commented Feb 17, 2022

@stephenh This should now be fully working. I have switched it over to the new _unknownFields feature and added a few safety checks.

Copy link
Owner

@stephenh stephenh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Vilsol apologies for the really late review; I had a few pending comments that I forgot to hit publish on, and then also have just been busy. I think this is good to merge as-is but I also had a few curious questions. Thanks!

src/schema.ts Outdated
export function generateSchema(ctx: Context, fileDesc: FileDescriptorProto, sourceInfo: SourceInfo): Code[] {
const { options } = ctx;
const chunks: Code[] = [];

fileDesc.extension.forEach((extension) => {
extensionCache[extension.number] = extension;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the key to this cache probably needs to be extension.extendee+extension.number? I.e. the number could be used twice across different extensions.

chunks.push(code`
type ProtoMetaMessageOptions = {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having this separate ProtoMetaMessageOptions, could we output options within the existing fileDescriptor structure?

I.e. a file-level option would be at:

protoMetadata.fileDescriptor.options["optionKey"]

I guess right now we're letting ts-poet .toJSON-ify the descriptor into the generated source code, and we'd probably need to nudge the options around a bit, i.e. like how you're base64 encoding messages...

Copy link
Collaborator Author

@Vilsol Vilsol Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why they get output outside of the existing object is because the existing objects get processed through fromPartial which would strip all of those fields from the runtime.

They would also lose all type-safety if we did this.

src/schema.ts Outdated
@@ -57,13 +96,181 @@ export function generateSchema(ctx: Context, fileDesc: FileDescriptorProto, sour
descriptor.sourceCodeInfo?.location.filter((loc) => loc['leadingComments'] || loc['trailingComments']) || [],
};

let fileOptions: Code | undefined;
if (descriptor.options) {
fileOptions = encodedOptionsToOptions(ctx, (descriptor.options as any)['encodedOptions']);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like here, I wonder if we could do something like:

if (descriptor.options) {
  for (const key in Object.keys(descriptor.options.encodedOptions)) {
    const ext = ...use extension cache to find `extensionee=FileDescriptor` && `number=key`
    descriptor.options[ext.jsonName] = ...option value...
  }
}

And then repeat for the message option / service options, etc...


describe('options', () => {
it('generates types correctly', () => {
expect(protoMetadata).toMatchInlineSnapshot(`
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This inline snapshot is huge; can we move it to a regular non-inline snapshot?

Also, could you add one or two tests that specifically load options? I get if the snapshot is correct, it will work, but I'm mostly want test coverage of the API you're specifically using to look up options, so that they act as documentation / reminder of what not to break, if we happen to otherwise refactor some aspects of the metadata.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have extracted the snapshot to a separate file.

I don't fully understand what you mean by the extra tests you are requesting?

src/schema.ts Outdated
return undefined;
}
const resultOptions: Code[] = [];
for (const key of Object.keys(encodedOptions)) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really minor could do:

for (const [key, value] of Object.entries(a)) {

src/schema.ts Outdated
}
}

function encodedOptionsToOptions(ctx: Context, encodedOptions: { [key: number]: Uint8Array[] }): Code | undefined {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own future future/maintenance, can we add like an in-source comment along the lines of:

/** Takes the protoc's input of options as proto-encoded messages, and turns them into embedded-able-in-source-code representations. */

@Vilsol
Copy link
Collaborator Author

Vilsol commented Mar 15, 2022

@stephenh I have pushed some the requested changes and commented on the other review comments.

@Vilsol Vilsol requested a review from stephenh March 31, 2022 11:07
@Vilsol
Copy link
Collaborator Author

Vilsol commented May 16, 2022

@stephenh just wanted to notify you in case you missed it. The PR now complies with all the requested changes.

@stephenh
Copy link
Owner

Hey @Vilsol ! Thanks for the ping on this! I've looked at this PR a few times, but end up getting distracted because I'm just not that much of an expert on the options/schema output side of things.

Everything looks great to my naive eyes, I just haven't had the chance to really think deeply about this; but that's fine, I'm just going to trust you on it. Thanks! :-)

@stephenh stephenh merged commit e8e4e39 into stephenh:main May 19, 2022
stephenh pushed a commit that referenced this pull request May 27, 2022
# [1.113.0](v1.112.2...v1.113.0) (2022-05-27)

### Features

* add options to schema output ([#437](#437)) ([e8e4e39](e8e4e39))
@stephenh
Copy link
Owner

🎉 This PR is included in version 1.113.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Are custom options read?
3 participants