Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work In Progress: casttype (new type of customtype) #36

Closed
tv42 opened this issue Dec 22, 2014 · 30 comments
Closed

Work In Progress: casttype (new type of customtype) #36

tv42 opened this issue Dec 22, 2014 · 30 comments

Comments

@tv42
Copy link

tv42 commented Dec 22, 2014

main.go:

package main

//go:generate protoc -I. -I$GOPATH/src -I$GOPATH/src/github.com/gogo/protobuf/protobuf --gogo_out=. repro.proto

type T uint32

func main() {
}

repro.proto:

package repro;

option go_package = "main";

import "github.com/gogo/protobuf/gogoproto/gogo.proto";

option (gogoproto.unmarshaler_all) = true;

message M {
  required uint32 X = 1 [(gogoproto.nullable) = false, (gogoproto.customtype) = "T"];
}
$ go generate && go build
# _/home/tv/src/2014/protobuf-bug-type
./repro.pb.go:69: invalid operation: m.X |= uint32(b) & 127 << shift (mismatched types T and uint32)
@tv42 tv42 changed the title customtype for uint32 breaks unmarshaler build with mismatches types across |= customtype for uint32 breaks unmarshaler build with mismatched types across |= Dec 22, 2014
@awalterschulze
Copy link
Member

custom type only works on bytes. I think this needs clearer documentation
and a clearer error. Thank you for reporting
On Dec 22, 2014 3:10 AM, "Tv" notifications@github.com wrote:

main.go:

package main
//go:generate protoc -I. -I$GOPATH/src -I$GOPATH/src/github.com/gogo/protobuf/protobuf --gogo_out=. repro.proto
type T uint32
func main() {
}

repro.proto:

package repro;
option go_package = "main";
import "github.com/gogo/protobuf/gogoproto/gogo.proto";
option (gogoproto.unmarshaler_all) = true;
message M {
required uint32 X = 1 [(gogoproto.nullable) = false, (gogoproto.customtype) = "T"];
}

$ go generate && go build

_/home/tv/src/2014/protobuf-bug-type./repro.pb.go:69: invalid operation: m.X |= uint32(b) & 127 << shift (mismatched types T and uint32)


Reply to this email directly or view it on GitHub
#36.

@awalterschulze
Copy link
Member

Maybe this could be a feature request.
This would mean that you would have to write MarshalUint32, UnmarshalUint32 and SizeUint32 methods for you customtype, since you might want to use T as a uint32 in certain cases and bytes or int32 in other cases. Also the coder of the customtype would have to marshal those exactly right. Whereas bytes is really hard to mess up.

@tv42
Copy link
Author

tv42 commented Dec 22, 2014

I see what you're saying. My expectation, trying to use it, was that if the proto side of things says "uint32" and my type is just a type T uint32, it would marshal like an uint32 uint32 would have ;) It's already really close; all it seems to need is a type conversion around that |=. (And then non-assignable type T's will crap out on a different weird error, but that's a different problem; that can't really be avoided without protoc-gen-gogo trying to parse the Go source, I think.)

If I understand what you're saying, you're implying this is undesirable because what is a uint32 in Go might on the wire be represented by just about anything; maybe the Go variable is e.g. just a lookup key to var cache map[uint32]string, and the strings are the actual on-wire content. I see that, but also in this case the protobuf type was uint32, and that seems like a sane default to me.

I understand that implementation is more complex.

Also, just to avoid misunderstand, there's no such thing as MarshalUint32 currently, right? I'm wary of http://godoc.org/github.com/gogo/protobuf/proto#Marshaler because as far as I know it ends up on the wire as (protobuf type bytes, or it fails, right?), and my data is just an uvarint in itself.

As a workaround, I set gogoproto.unmarshaler=false for that message, that made it work. That particular message wasn't performance critical for me, for now, anyway.

I'll loop back to this problem when my code is functionality-wise in better shape, and I can start worrying about performance again ;)

Thank you.

@awalterschulze
Copy link
Member

Yes I don't want protoc-gen-gogo to parse Go source.

It does seem like a sensible default.

I don't know how much effort this would be to implement though.

Yes there is no such thing as MarshalUint32 currently and I also don't think that would be a good idea for the future.

Conclusion:
There might be a cool feature to add here.
Make bytes work with MarshalBinary as discussed in issue #28
Make string work with MarshalText
All other types are assumed to be aliased types, since this is already working when generation is turned off and I can't think of any other sensible use case at this time.

I think issue 28 should be sorted out first.
And then this case might also become less important?

@tv42
Copy link
Author

tv42 commented Jan 4, 2015

It seems just changing

                m.X |= (uint32(b) & 0x7F) << shift

to

                m.X |= T((uint32(b) & 0x7F) << shift)

would be enough. I'm not sure if you want to keep this separate from customtype, which is always a length-prefixed []byte thing.

@tv42
Copy link
Author

tv42 commented Jan 4, 2015

Oh and MarshalBinary won't really solve this, because the wire format is different when the .proto says uint32 vs bytes.

@awalterschulze
Copy link
Member

I agree MarshalBinary does not solve this, it might just make this use case less important.

@awalterschulze
Copy link
Member

We should also think about what happens when the type is another message and the field has a customtype.

For example:

message A {
optional Uuid Id = 1 [(gogoproto.customtype="uuid")];
}

message Uuid {
optional fixed64 First8Bytes = 1;
optional fixed64 Second8Bytes = 2;
}

I have seen this exact use case in the wild.

This means that the go client would prefer just to work with the uuid type, but another client prefers the message type.

@petermattis
Copy link

See PR #46 for a stab at fixing customtype when used with integer wire-types. It's not quite complete as behavior of integer wire types with a customtype differs if a generated marshaler/unmarshaler is used vs the reflection-based ones. The reflection-based marshal/unmarshal will use the customtypes Marshal and Unmarshal routines while the generated ones do not. My feeling is that customtypes for integer wire-types should not need Marshal and Unmarshal methods but should use the same encoding and decoding as the wire-type with the appropriate casting to make it work. Allowing custom Marshal and Umarshal removes the portability of the protos to other languages.

Somewhat problematically, the behavior I'm proposing would be a backwards incompatible change as customtype does work with integer wire-types if you're using reflection-based marshaling and unmarshaling.

@bdarnell
Copy link

#46 works only if the custom type is an alias for an integer type (i.e. type NodeID int64). With a custom unmarshaler you could use types that are not integers at all. I'm not sure if it's worth supporting but I can see some potential use cases for that (which do not impair portability to other languages).

The use case I have in mind is our type NodeID int64 and type StoreID int64. We have to cast plain integers into these types all the time, but NodeID(x) will work just as well whether x is a StoreID or an int64. It never makes sense to cast from a NodeID to a StoreID. We could instead define our types as type NodeID struct { ID int64 } with a custom unmarshaler, and conversion methods to/from plain integers, which would disallow casts directly from one type to the other. This is admittedly a lot of work for a fairly small increase in compile-time safety, but it is a use case for extending custom types beyond integer aliases.

@awalterschulze
Copy link
Member

What do you think about
#47

@petermattis
Copy link

@bdarnell
Copy link

Types and names must both match, so the more cumbersome type NodeID struct { NodeID int64 } would work to prevent casting: http://play.golang.org/p/e39xF_NTup

@petermattis
Copy link

@bdarnell For portability to other languages I think the wire-type encodings should not be tampered with. Custom types for integer wire-types should probably have Marshal<wire-type>() <wire-type> and Unmarshal<wire-type>(val <wire-type) methods. For example, for an int32 wire-type you would need to implement this interface:

type MarshalerInt32 struct {
  MarshalInt32() int32
}
type UnmarshalerInt32 struct {
  UnmarshalInt32(val int32)
}

@awalterschulze
Copy link
Member

I hope to get a chance on Monday to think about this again.
Thank you for all the input.
I'll reply again asap.

@awalterschulze
Copy link
Member

Unfortunately I did not get a chance.
I will try to get to this as soon as possible.
Sorry for the delay.

@laurent-descrivan
Copy link
Contributor

I too needed to serialize aliased int types, so I made some modifications just as a proof of concept (PR #50) for generated unmarshal code.
It seems there are quite a few ramifications when designing about this subject, and I am probably missing some issues, since I am not that familiar with the project (and protobuf for that matter).

However, here is my rationale of different type structures, using int64 as an example:

Go custom type Go internal type Protobuf Options Use case
int64 int64 int64 None Simple int serialization, already working
int64 int64 bytes Impossible Impossible, since go int64 cannot have (un)marshal method
MyInt64 int64 bytes (gogoproto.customtype) = "MyInt64" Possible, by implementing (un)marshal methods on MyInt64, since PR #47 no longer expect to alias a struct
MyInt64 int64 int64 (gogoproto.customtype) = "MyInt64" Impossible yet, but possible by implementing simple casting from int64 to MyInt64 as in PoC PR #50

The idea is that gogoproto.customtype is no more an option that means "use (un)marshal methods", but an option that means "use this go type, using different methods":

  • If you use protobuf bytes: []byte is ok as-is, any custom type (even aliased []byte) is ok via (un)marshal, non aliazed types other than []byte forbidden.
  • If you use any other protobuf type except Messages, Enum and Group: any castable go type is ok, no data adaptation beyond cast will be performed.
  • For custom type Messages, Enum and Group, I have not thought about it.

I am not sure I made myself clear :) But if I'm not mistaken that seems to match the use cases.

@awalterschulze
Copy link
Member

Ok I propose a new customtype called casttype.
All this will do in marshaling and unmarshaling situations is simply cast the type.
This means random values can be generated for tests and Equal is also simple.
It will be quite a bit of work, but very doable.

@awalterschulze
Copy link
Member

Sorry for the super long delay.

@awalterschulze
Copy link
Member

Any volunteers?

@awalterschulze
Copy link
Member

Ok so no volunteers, thats fair :)
Maybe I should ask if this would at least solve everyones problem?
I think so, but I would like to make sure.

@laurent-descrivan
Copy link
Contributor

Sorry I was in holidays :) I'm afraid I am not up for the task, considering I'm not familiar enough with the project.
I think the new casttype option is the correct approach, as it clearly separates custom casting from custom marshaling.

@awalterschulze
Copy link
Member

Ok great thats one up vote :)

@awalterschulze awalterschulze changed the title customtype for uint32 breaks unmarshaler build with mismatched types across |= Feature: casttype (new type of customttype) May 22, 2015
@awalterschulze awalterschulze changed the title Feature: casttype (new type of customttype) Feature: casttype (new type of customtype) May 22, 2015
@awalterschulze
Copy link
Member

I really want to clean up some on these long pending issues.
So I started working on casttype first
The work is on the casttype branch.
Here is the commit.
a2aaeaf

If someone would try to use it and see if it breaks it would be greatly appreciated.
I would love some feedback before I merge this back into the proto3 branch and add some docs to gogo.github.io.

Here is an example:

message Castaway {
    optional int64 Int32Ptr = 1 [(gogoproto.casttype) = "int32"];
    optional int64 Int32 = 2 [(gogoproto.nullable) = false, (gogoproto.casttype) = "int32"];
    optional uint64 MyUint64Ptr = 3 [(gogoproto.casttype) = "github.com/gogo/protobuf/test/casttype.MyUint64Type"];
    optional uint64 MyUint64 = 4 [(gogoproto.nullable) = false, (gogoproto.casttype) = "github.com/gogo/protobuf/test/casttype.MyUint64Type"];
}

@awalterschulze awalterschulze changed the title Feature: casttype (new type of customtype) Work In Progress: casttype (new type of customtype) Jun 2, 2015
@awalterschulze
Copy link
Member

The casttype branch now contains all the new features that have recently been added and the bug that has been fixed.
If someone gives me positive feedback on casttype I can merge this back into the proto3 branch.

@laurent-descrivan
Copy link
Contributor

Perfect ! Thank you for this work. I hope to test it as soon as I have some time in the next weeks.

@awalterschulze
Copy link
Member

cockroachdb seems to be having success with casttype :)
I would like one more success story before I merge, if possible.

@awalterschulze
Copy link
Member

merged to master :)
I hope everyone enjoys.

@laurent-descrivan
Copy link
Contributor

Yes we do :) Briefly tested it, so far so good. I had no time to use it thoroughly for production yet, but will do! Thanks!

@awalterschulze
Copy link
Member

Good to hear :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants