Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbol missing from generated types #306

Open
cortopy opened this issue May 14, 2022 · 9 comments
Open

Symbol missing from generated types #306

cortopy opened this issue May 14, 2022 · 9 comments

Comments

@cortopy
Copy link

cortopy commented May 14, 2022

Codegen will create typescript definitions that do not include the MESSAGE_TYPE symbol, even though it will be present in all response messages.

Following the example in the manual for this definition:

message Person {
    string name = 1;
    uint64 id = 2;
    int32 years = 3 [json_name = "baz"];
    optional bytes data = 5;
}

I would expect that the typescript interface is something like:

import { MessageType, MESSAGE_TYPE } from '@protobuf-ts/runtime';

export interface Person {
    /**
     * @generated from protobuf field: string name = 1;
     */
    name: string;
    /**
     * @generated from protobuf field: uint64 id = 2;
     */
    id: bigint;
    /**
     * @generated from protobuf field: int32 years = 3 [json_name = "baz"];
     */
    years: number;
    /**
     * maybe a jpeg?
     *
     * @generated from protobuf field: optional bytes data = 5;
     */
    data?: Uint8Array;


    // This what's missing
    [MESSAGE_TYPE]: MessageType<Person>
}
@jcready
Copy link
Contributor

jcready commented May 15, 2022

The [MESSAGE_TYPE] field is a non-enumerable property that is added to the object as a helpful "branding" that can be used for reflection purposes, but is not actually a required to make a valid (in this case) Person message object. The idea to add this to the generated interface was also considered. The generated TS interface is defining the required shape of an object, but the object can have other fields and still pass type checking. For example this following would have no TS errors:

const personWithExtra = {
    name: 'Foo',
    id: 123n,
    years: 123,
    extra: 'bar' // not in interface
};

function onlyPerson(input: Person) {}

onlyPerson(personWithExtra); // no errors

There is a helpful type-guard containsMessageType that you can import from the @protobuf-ts/runtime which will adjust the type of object (in the truthy case) to include the non-enumerable MESSAGE_TYPE property such that you can safely access it.

That function and the symbol should probably be documented in the manual as there doesn't seem to be any mention of it. Perhaps because it is still considered experimental:

/**
* The symbol used as a key on message objects to store the message type.
*
* Note that this is an experimental feature - it is here to stay, but
* implementation details may change without notice.
*/
export const MESSAGE_TYPE: unique symbol = Symbol.for("protobuf-ts/message-type");

@timostamm
Copy link
Owner

Thanks for the summary, @jcready. We have two symbol properties, the other one is "protobuf-ts/unknown":

export const symbol: unique symbol = Symbol.for("protobuf-ts/unknown");

It is used to store unknown fields during deserialization in the message instance. That way, old code can work on a message with added fields, and not lose any data during serialization. It is a core feature of Protobuf. If an implementation doesn't support that, it's not conformant.

Because of that, it seemed pretty harmless to add the experimental MESSAGE_TYPE as well. It is added for each message, though, while "protobuf-ts/unknown" is only added if an unknown field is encountered.

@cortopy, how did you run into the property? Was it through unit tests?

@cortopy
Copy link
Author

cortopy commented May 19, 2022

I found about the MESSAGE_TYPE symbol when fixing a bug in my frontend application. I use redux and some of the messages that come from server responses are directly stored in the redux store without further modification. I thought the message would include only the properties in the type and so I was surprised that when I was diffing the value stored in redux with one the app generates there were some discordance.

Even though Symbols are not enumerable, I would still prefer as much type correctness as possible. This was one of those nasty bugs that are difficult to pin down and so I had to write some logic to remove the symbol before putting in store. This wasn't enough to fix the bug and it may be the case that I didn't need the logic, but having "hidden" properties, whether enumerable or not, is not something I was expecting

@jcready
Copy link
Contributor

jcready commented May 19, 2022

@timostamm I wonder if we could avoid the whole mess of extra properties on the objects by using a WeakMap to store the message type (and possibly the unknown fields). Something to consider for v3.

// runtime/src/message-type-contract.ts
const MessageTypeMap = new WeakMap<object, IMessageType<any>>();
export const getMessageType = <T extends object>(ref: T): IMessageType<T> | undefined => MessageTypeMap.get(ref);

// runtime/src/binary-format-contract.ts
const UnknownFieldMap = new WeakMap<object, UnknownField[]>();
export const getUnknownFields = (ref: object): UnknownField[] | undefined => UnknownFieldMap.get(ref);
// Usage
import { getMessageType, getUnknownFields, IMessageType, UnknownField } from '@protobuf-ts/runtime';

const MessageType: IMessageType<typeof someReference> | undefined = getMessageType(someReference);
if (MessageType) {
    console.log(MessageType.toJson(someReference));
}

const UnknownFields: UnknownField[] | undefined = getUnknownFields(someReference);
if (UnknownFields) {
    console.log(UnknownFields.map((uf) => uf.no));
}

@timostamm
Copy link
Owner

Ah, WeakMap is definitely a nice idea. I've been using it back in the days in ActionScript and didn't realize JavaScript implementations finally caught up :)

Can't think of any drawbacks. We export both symbols, so it is a breaking change, but it seems worth it for a new major.

@nebez
Copy link

nebez commented Feb 23, 2023

This would be useful for users of SvelteKit as well. The default way of passing data from client/server around in that framework gets tripped up on these symbols and requires you clone new objects without the symbols to work.

@jcready
Copy link
Contributor

jcready commented Jun 29, 2023

FWIW my suggested solution using WeakMap does have some unintended consequences with regard to unknown fields. Currently the behavior of unknown fields is to set an ENUMERABLE Symbol (unlike the message type symbol which is not enumerable) on the message object.

If we wish to replace the usage of Symbols with WeakMaps then what do we do for unknown fields now that spreading the message object will no longer preserve the unknown fields? The manual specifically says that the way to drop unknown fields is to use MessageType.clone(). But since spreading will no longer preserve the unknown fields I wonder if for the next major release if we should do the opposite (e.g. MessageType.clone/create/mergePartial() will preserve unknown fields, while spreading will not).

Though that then begs the question: if one were to MessageType.mergePartial() and both the target and source had referentially different unknown fields, do we concat?

@jcready
Copy link
Contributor

jcready commented Sep 8, 2023

I also want to point out that stamping the message with the Symbol is rather expensive. When benchmarking binary reads (on nodejs v20) I'm seeing an almost 2.5x improvement if I comment out this bit from the create() method:

globalThis.Object.defineProperty(message, MESSAGE_TYPE, { enumerable: false, value: this });
With stamp:      550,101 ops/sec ±1.24% (95 runs sampled)
Without stamp: 1,323,473 ops/sec ±0.87% (92 runs sampled)

To be fair, using the WeakMap approach is even slower:

Using WeakMap:   311,061 ops/sec ±4.00% (63 runs sampled)

@jcready
Copy link
Contributor

jcready commented Sep 9, 2023

So far I've only found one way to maintain the current symbol stamp (and all its enumerability behaviors) while avoiding the slow down: define an EMPTY class for creating message instances and then define the symbol property on that class's prototype.

Inside the base MessageType class constructor:

this.MessageInstance = class MessageInstance{};
globalThis.Object.defineProperty(this.MessageInstance.prototype, MESSAGE_TYPE, { value: this });

Then the create() method for each generated class would replace

const message = { repeatedField: [] };
globalThis.Object.defineProperty(message, MESSAGE_TYPE, { enumerable: false, value: this });

with

const message = new this.MessageInstance();
message.repeatedField = [];

// alternatively
const message = Object.assign(new this.MessageInstance(), { repeatedField: [] });

Even though the defineProperty bit is expensive we're now only performing it once per message "class" instead of once per message "instance".


I still think the WeakMap approach could work and be fast so long as we used that MessageInstance class approach like shown above (just without the symbol). The only difference is that instead of adding every message "instance" (this) to the WeakMap, we instead add the this.__proto__ to it.

Previous proposal:

// somewhere in @protobuf-ts/runtime
const MessageTypeMap = new WeakMap<object, IMessageType<any>>();
export const setMessageType = <T extends object>(ref: T, MessageType: IMessageType<T>): void => {
    MessageTypeMap.set(ref, MessageType);
};
export const getMessageType = <T extends object>(ref: T): IMessageType<T> | undefined =>
    MessageTypeMap.get(ref);

// generated file, Foo$Type extends MessageType<Foo>'s create() method:
const message = {};
setMessageType(message, this);

New proposal:

// somewhere in @protobuf-ts/runtime
const MessageTypeMap = new WeakMap<object, IMessageType<any>>();
export const setMessageType = <T extends object>(ref: T, MessageType: IMessageType<T>): void => {
    MessageTypeMap.set(ref.__proto__, MessageType);
};
export const getMessageType = <T extends object>(ref: T): IMessageType<T> | undefined =>
    MessageTypeMap.get(ref.__proto__);

// generated file, Foo$Type extends MessageType<Foo>'s create() method:
const message = new this.MessageInstance();
setMessageType(message, this);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants