Add a `@pbIndex` annotation and use it for reading and writing using the correct field indices by cb372 · Pull Request #21 · 47degrees/pbdirect

cb372 · 2019-12-13T17:27:40Z

Add an annotation called @pbIndex for annotating message fields with their protobuf indices (field numbers):

case class MyMessage(
  @pbIndex(1) a: Int,
  @pbIndex(5) b: String
)

This annotation is used when writing to protobuf, to write the fields with the correct indices, and when reading from protobuf, to look for the correct index for each field. The previous behaviour was to assume field indices started at 1 and increased sequentially.

Note: This is a breaking change in behaviour for pbdirect, as users now must annotate every field of their message. (We use some shapeless voodoo to verify at compile time that every field is annotated.) Skeuomorph will need to be updated to generate these annotations when it generates code for Mu. I changed my mind and made the annotation optional. For unannotated case classes, the existing behaviour is preserved.

This is the first step towards making pbdirect a "proper" spec-compliant protobuf serializer, with the ability to handle default scalar values, oneof fields, fixed-width integer fields, packed repeated fields, etc.

Also in this PR:

refactored the writer code, splitting "message writers" and "field writers" into separate traits
refactored the writer tests accordingly
added some property-based tests for the roundtrip of a complex message to protobuf and back, to test for consistency between our reader and writer

If people are happy with the annotation approach, in later PRs we can add similar annotations on fields to support:

Other approaches I considered before deciding on using an annotation:

tagging the type with shapeless Nat, e.g. case class Msg(a: Int @@ _1, b: String @@ _2)
- Nats only go up to the magic number 22
- quite intrusive, e.g. could interfere with Avro serialization in Mu
using a type annotation instead, e.g. case class Msg(a: Int @pbIndex(1), b: String @pbIndex(2))
- should work quite nicely with use of shapeless Coproduct to encode oneof fields, e.g. case class Msg(a: Int @pbIndex(1) :+: String @pbIndex(2) :+: CNil)
- support for working with type annotations exists in shapeless master but hasn't been released (shapeless hasn't been released for 2 years!)

cb372 · 2019-12-17T18:11:38Z

@@ -0,0 +1,29 @@
+/*
+ * Copyright (c) 2019 Beyond the lines


@juanpedromoreno What do you think we should do about the copyright headers in this project? I am not a lawyer but it seems strange to add this copyright to newly added files.

Should we change it for something like this?
https://github.com/ChristopherDavenport/circuit/blob/master/core/src/main/scala/io/chrisdavenport/circuit/CircuitBreaker.scala#L2

Yeah, that sounds reasonable to me.

We can tweak the sbt-headers accordingly to make the copyright the more accurate possible.

I've been looking into the MIT licence a little.

Apparently there is no need for a copyright header in every file. The Apache Foundation recommend it if you are using Apache 2.0, but there is no need for MIT.

There is no need to list changes. The example @fedefernandez linked above is from a project using Apache 2.0, which has a requirement to list changes.

The only requirement is a LICENSE file containing a copyright notice and the MIT licence text.

So the simplest solution could be:

remove the source file headers

add a note to the LICENSE file saying 47 Degrees forked the project in 2019 and we claim copyright for any code added or changed since the fork.

What do you think? @juanpedromoreno @fedefernandez

References:

https://choosealicense.com/licenses/mit/

advice from some random SO user

Sounds good to me

Sounds good to me too 👍

Cool, I'll do that in a separate PR after I've merged this.

cb372 · 2019-12-17T18:20:53Z

+        (FieldIndex(annotation.first :: annotation.more.toList), value)
+    }
+  }
+


This is probably the most interesting part of the PR.

Given some case class instance, e.g.

case class MyMessage(@pbIndex(1) a: Int, @pbIndex(2) b: String) val msg = MyMessage(123, "hello")

we summon the generic representation:

123 :: "hello" :: HNil

and the list of @pbIndex annotations on the fields:

Some(pbIndex(1)) :: Some(pbIndex(2)) :: HNil

We validate that every field is annotated and zip the two HLists together using the zipWithFieldIndex polymorphic function, giving us an HList of tuples:

(FieldIndex(1), 123) :: (FieldIndex(2), "hello") :: HNil

Then we use consWriter and hnilWriter to inductively resolve the correct PBFieldWriter instance for each field and write the fields to the output stream.

cb372 · 2019-12-17T18:26:18Z

+      case Some(annotation) => FieldIndex(annotation.first :: annotation.more.toList)
+    }
+  }
+


This is similar to the writer, but in reverse.

We use the annotations to build an HList of FieldIndexes:

FieldIndex(1) :: FieldIndex(2) :: HNil

and give that HList to the PBProductReader, along with the input bytes.

consProductReader makes use of the head of the HList of FieldIndexes to see what index it should be looking for, then passes the tail of the HList in the recursive step.

Add an annotation called `@pbIndex` for annotating message fields with their protobuf indices (field numbers). The idea is that skeuomorph will generate these annotations when it generates code for Mu. Started work on making use of these annotations to write fields with the correct field numbers when encoding messages to protobuf. Note: This is a breaking change in behaviour for pbdirect, as you now *have to* annotate every field of your message. (We use some shapeless voodoo to verify at compile time that every field is annotated.) This is the first step towards making pbdirect a "proper" spec-compliant protobuf serializer, with the ability to handle default scalar values, `oneof` fields, decoding messages without relying on field order, etc. Still to do: * finish implementation of protobuf encoding * decide what to do about annotating `oneof` fields (coproducts), as they have multiple field numbers * implement protobuf decoding (a bit more complicated) * write more tests

The existing instance for Functor is still there, but these are added for convenience so people don't have to remember to add the appropriate cats imports.

This is to support encoding of shapeless Coproducts as protobuf 'oneof' fields in future. (That functionality is not implemented yet.)

This at least gives us some confidence that the library is consistent with itself, even if it's not yet compliant with the protobuf spec.

tzimisce012

👍

juanpedromoreno

This is great! Thanks @cb372 :)

I'd suggest adding some docs to the README file, but code here LGTM

juanpedromoreno · 2019-12-18T10:06:48Z

@@ -0,0 +1,29 @@
+/*
+ * Copyright (c) 2019 Beyond the lines


Yeah, that sounds reasonable to me.

We can tweak the sbt-headers accordingly to make the copyright the more accurate possible.

cb372 · 2019-12-18T10:40:49Z

I'd suggest adding some docs to the README file

Good point, will do.

See 47degrees/pbdirect#21 Also tidy up the generated code slightly by putting each field on a separate line.

Also fixed incorrect byte array contents in example

BenFradet

👍

I just have a question: thinking from the user's perspective, could we have an all or nothing in terms of specifying @pbIndex? I.e. if none are specified, they are inferred from the generic representation.

cb372 · 2019-12-19T11:10:47Z

@BenFradet I agree, making the annotation mandatory is probably too disruptive for users. I've made it optional in 22380f3.

cb372 changed the title ~~[WIP] Add a @pbIndex annotation and use it for writing~~ Add a @pbIndex annotation and use it for reading and writing using the correct field indices Dec 17, 2019

cb372 requested a review from juanpedromoreno December 17, 2019 18:10

cb372 marked this pull request as ready for review December 17, 2019 18:10

cb372 commented Dec 17, 2019

View reviewed changes

cb372 added 8 commits December 17, 2019 18:27

Fix serialization of maps

a8ee5e5

Restore support for writing of sealed traits

6139dda

Provide PBFieldWriter instances for Option and List

5d2d79e

The existing instance for Functor is still there, but these are added for convenience so people don't have to remember to add the appropriate cats imports.

Separate tests for PBMessageWriter and PBFieldWriter

fd405e1

Use pbIndex annotations when reading from protobuf

b66fc5e

Make pbIndex annotation support multiple indices

e1e7c6b

This is to support encoding of shapeless Coproducts as protobuf 'oneof' fields in future. (That functionality is not implemented yet.)

Add property-based tests for protobuf round-trip

fca8e48

This at least gives us some confidence that the library is consistent with itself, even if it's not yet compliant with the protobuf spec.

cb372 force-pushed the field-indices branch from 07cb9ea to fca8e48 Compare December 17, 2019 18:28

tzimisce012 approved these changes Dec 17, 2019

View reviewed changes

Comment thread src/main/scala/pbdirect/PBFieldWriter.scala

juanpedromoreno approved these changes Dec 18, 2019

View reviewed changes

cb372 added a commit to higherkindness/skeuomorph that referenced this pull request Dec 18, 2019

Add pbIndex annotation to generated Scala code

6a6dd54

See 47degrees/pbdirect#21 Also tidy up the generated code slightly by putting each field on a separate line.

cb372 mentioned this pull request Dec 18, 2019

Add pbIndex annotation to generated Scala code higherkindness/skeuomorph#187

Merged

cb372 added 2 commits December 18, 2019 15:03

Update readme

c08e3b8

Also fixed incorrect byte array contents in example

More readme updates

f51539f

juanpedromoreno approved these changes Dec 18, 2019

View reviewed changes

BenFradet approved these changes Dec 18, 2019

View reviewed changes

Make the pbIndex annotation optional

22380f3

cb372 merged commit cbd378a into master Dec 19, 2019

cb372 deleted the field-indices branch December 19, 2019 11:26

cb372 mentioned this pull request Dec 19, 2019

Update license file and remove license headers from source files #24

Merged

Conversation

cb372 commented Dec 13, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tzimisce012 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

juanpedromoreno left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cb372 commented Dec 18, 2019

Uh oh!

BenFradet left a comment

Choose a reason for hiding this comment

Uh oh!

cb372 commented Dec 19, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

cb372 commented Dec 13, 2019 •

edited

Loading