Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix protobuf enums representation in generated scala. #100

Closed
toddburnside opened this issue Jun 26, 2019 · 5 comments
Closed

Fix protobuf enums representation in generated scala. #100

toddburnside opened this issue Jun 26, 2019 · 5 comments
Projects

Comments

@toddburnside
Copy link

toddburnside commented Jun 26, 2019

This issue is a related to Mu issue higherkindness/mu-scala#611.

Given this protobuf enum:

enum Test {
  LATER = 0;
  HELLO = 1;
  GOODBYE = 2;
  HI = 5;
}

This scala code is currently generated:

sealed trait Test
object Test {
  case object LATER extends Test
  case object HELLO extends Test
  case object GOODBYE extends Test
  case object HI extends Test
}

Not only does this lose the integral value assigned to each enum member, but it does not work correctly with PBDirect - an empty field is sent over the wire and deserialization produces the wrong value.

For PBDirect, the enum needs to be encoded like this:

sealed trait Test extends Pos
object Test {
  case object LATER extends Test with Pos._0
  case object HELLO extends Test with Pos._1
  case object GOODBYE extends Test with Pos._2
  case object HI extends Test with Pos._5
}

One issue involved here is that, in mu/Transform.scala, both protobuf and avro enums are mapped to the MuF TSum, where the integral value associated with the enum members.

Since avro enums don't have the concept of assigned integral values, this makes them not isomorphic with the protobuf enums. So, one way forward would be to create a new MuF instance named something like TSumIntegral or TSumValued that preserves the integer associated with the protobuf enums.

This would be a fairly simple change, but it does mean that the code generated by skeumorph for the protobuf enums would be PBDirect specific. (Pos, Pos._1, etc. are defined in PBDirect.) This would mean that the MuF representation would not be completely decoupled from protobuf.

I would love to hear if someone has some better ideas.

Note: If we do add the extends Pos etc., import pbdirect.Pos will need to be added to the top of the generated source file. This is done by Mu in ProtoSrcGenerator.scala. The ProtoSrcGenTests would also need to be updated.
The skeumorph readme also contains an example enum and generated code.

@pepegar
Copy link
Member

pepegar commented Jun 26, 2019

Hi @toddburnside!

Thanks for the report, that makes totally sense.

Indeed, as you describe, they both seem not to form an isomorphism but an epimorphism. The first option you propose, adding new values to the AST (TSumIntegral/TSumValued) I'd avoid it in order to not grow the AST too big, since it's already a bit too much.

The approach I'm following in the UAST branch is using annotations to add metadata to some types. In order to solve this, we could represent protobuf enums as Ann[TSum, shapeless.Nat]. This would:

  • add the needed information to protobuf enums to represent positions.
  • decouple from PBDirect, since we would use shapeless.Nat instead of pbdirect's Pos.

Some examples of annotations I'm using in UAST are precision & sign in numeric values of protobuf (see https://github.com/higherkindness/skeuomorph/pull/61/files#diff-0d4b6356cd9fc5dfda83f34f307481e3R109)

What do you think?

@toddburnside
Copy link
Author

@pepegar, those are some interesting suggestions and would be a definite improvement on my idea. I'll have to look through your code more and think about how I can make it work for this situation. Off the top of my head:

  • I need the numbers at the individual symbols within a TSum.
  • The integral value needs to persist through the transition to scala source code.
  • Our fork of PBDirect needs to be modified to use whatever we come up with instead if it's current Enum implementation.

I'll look into it more tomorrow.

@pepegar pepegar added this to To do in skeuomorph Jul 5, 2019
@toddburnside
Copy link
Author

Sorry for the long delay...

I’ve been looking at the uast branch, and unless I’m missing something, the use of Ann[TEnum, …] doesn’t fully solve this problem. In that branch, the TProtoEnum is represented as Ann[TEnum, annotations.EnumAnnotation, A], where the EnumAnnotation has fieldNumbers, options, and aliases, which preserves all the information in a protobuf enum. The problem is that the protobuf TProtoEnum is translated to a mu TEnum before the scala code is generated, and the field numbers (or values) are lost.

Maybe we could update the mu TEnum (TSum in the master branch) to be something like final case class TEnum[A](name: String, symbols: List[String], values: Option[List[Int]])? This would allow for a single value in the AST to support both Avro and protobuf enums.

On the use of shapeless.Nat in the scala encoding instead of the pbdirect.Pos, I have yet to be able to come up with a way to make the serializing/deserializing work. But, I’ve got a few more things to try.

@SemanticBeeng
Copy link

SemanticBeeng commented Jul 29, 2019

@pepegar: (side discussion) - while this solution is evolving is there any way to go around by using hand crafted enum class in Scala (with PBDirect Pos) and somehow have mu use it instead of the generated one?

@juanpedromoreno
Copy link
Member

This is now fixed in the latest version, see higherkindness/mu-scala#611 (comment)

skeuomorph automation moved this from To do to Done Jan 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
skeuomorph
  
Done
Development

No branches or pull requests

5 participants