Add support for introspecting avro-wrapped protobufs #265

Closed
nevillelyh opened this issue Sep 19, 2016 · 6 comments
Labels: enhancement (New feature or request), question ❓ (Further information is requested)

Comments

@nevillelyh
Contributor

Avro's DataFileWriter has a .setMeta(String, String) method, and we wrap Protobuf bytes in Avro files. Maybe we could store the Protobuf schema there to make ad-hoc file operations such as cat and getschema easier?

@nevillelyh nevillelyh added the enhancement New feature or request label Sep 19, 2016
@nevillelyh
Contributor Author

scala> TrackPB.getDescriptor.toProto
res1: com.google.protobuf.DescriptorProtos.DescriptorProto =
name: "TrackPB"
field {
  name: "trackId"
  number: 1
  label: LABEL_REQUIRED
  type: TYPE_STRING
}
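
A minimal sketch of the idea above, assuming each Avro record is a bare BYTES value holding one serialized message and the descriptor's text form is stored under a made-up metadata key. `ProtoInAvroSketch`, `SchemaKey`, and the key name `protobuf.descriptor` are hypothetical, and `StringValue` (a well-known type bundled with protobuf-java) stands in for a compiled type such as `TrackPB`. This is not necessarily how Scio lays out its files.

```scala
import java.io.File
import java.nio.ByteBuffer

import com.google.protobuf.StringValue
import org.apache.avro.Schema
import org.apache.avro.file.{DataFileReader, DataFileWriter}
import org.apache.avro.generic.{GenericDatumReader, GenericDatumWriter}

object ProtoInAvroSketch {
  // Hypothetical metadata key. Keys starting with "avro." are reserved by Avro.
  val SchemaKey = "protobuf.descriptor"

  // Assumption: every Avro record is just the serialized Protobuf message.
  val bytesSchema: Schema = Schema.create(Schema.Type.BYTES)

  def write(file: File, messages: Seq[StringValue]): Unit = {
    val writer =
      new DataFileWriter[ByteBuffer](new GenericDatumWriter[ByteBuffer](bytesSchema))
    // Metadata must be set before create(), which writes the file header.
    writer.setMeta(SchemaKey, StringValue.getDescriptor.toProto.toString)
    writer.create(bytesSchema, file)
    try messages.foreach(m => writer.append(ByteBuffer.wrap(m.toByteArray)))
    finally writer.close()
  }

  // Reads back only the stored descriptor text, without decoding any records.
  def readSchema(file: File): String = {
    val reader =
      new DataFileReader[ByteBuffer](file, new GenericDatumReader[ByteBuffer](bytesSchema))
    try reader.getMetaString(SchemaKey)
    finally reader.close()
  }
}
```

With something like this in place, a getschema-style tool only needs to open the file header: `ProtoInAvroSketch.readSchema(file)` returns the descriptor text without knowing the compiled message class.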

@ravwojdyla
Contributor

Cool idea! That said, the user would then have to know that even though she is saving data as protobufs, she needs to use avro-tools to find the schema, and that there is this custom metadata in there, etc. Isn't this too low-level and "scio"-specific?

@nevillelyh
Contributor Author

nevillelyh commented Sep 19, 2016

I mean we can make a proto-tools for quickly eyeballing a file. Not much work and maybe worth it if Protobuf becomes popular.

@nevillelyh nevillelyh added the question ❓ Further information is requested label Oct 11, 2016
@nevillelyh nevillelyh changed the title from "Set schema in ProtoBuf IO?" to "Add support for introspecting avro-wrapped protobufs" Oct 20, 2016
@nevillelyh
Contributor Author

https://github.com/nevillelyh/protobuf-generic

This lib can do the following:

  • generate schema/JSON for any compiled protobuf type T, which we can attach to Avro files as metadata.
  • encode/decode binary protobuf as JSON given the above schema.

I'll look into the Scio side of things.

@nevillelyh nevillelyh self-assigned this Oct 22, 2016
@nevillelyh
Contributor Author

This will be supported in 0.2.6, due for release tomorrow. Closing this now.
https://github.com/spotify/scio/wiki/Protobuf#file-format
