Support Avro Message Parser #41
I guess you can write a custom parser and override the configs to use it. If it's reusable, maybe also submit a PR so that it can be included in the default parsers Secor ships with.
I've looked into building a custom parser using Avro's GenericRecord. To deserialize records within a single message parser, we need a schema repository that associates Kafka topics with Avro schemas. Camus uses the same concept, and Avro has a very active ticket for implementing a schema repo: https://issues.apache.org/jira/browse/AVRO-1124. To get something up and running now, I think the best approach is to introduce an interface for the schema repo and provide an implementation backed by local configuration. I'll work on a PR for this.
I am looking at exactly this problem and have run Camus. I would like to flush to S3 based on a size-based upload policy and also partition based on properties of the (Avro) message, which it looks like Secor allows whereas Camus does not. In our case we have a static in-memory repository of Avro schemas indexed by an 8-byte schema fingerprint. We use this fingerprint as the first 8 bytes of the Kafka message and look it up to find the schema for decoding the rest of the message. Since schema creation and migration is tied to our source-control process, this works for us. Clearly a dynamically updatable schema repository is necessary if you want to aggregate arbitrary Avro messages without redeploying code. @upio I am about to look into implementing an Avro message parser for Secor. If you get in touch soon we may be able to avoid duplicating effort.
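The fingerprint-prefix framing described above can be sketched in a few lines. This is a hypothetical illustration, not code from Secor or Camus: the class and map names are invented, and it assumes an 8-byte (64-bit) fingerprint prefix, matching Avro's CRC-64 fingerprint size. Only the byte-level splitting is shown; the actual Avro decoding of the body is omitted.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: split a Kafka message payload into a schema
// fingerprint prefix and the Avro-encoded body. A static in-memory
// repository keyed by the fingerprint would then supply the schema
// used to decode the body.
class FingerprintFramer {
    static final int FINGERPRINT_BYTES = 8;  // assumed 64-bit fingerprint

    // The first 8 bytes of the message hold the schema fingerprint.
    static long fingerprintOf(byte[] message) {
        return ByteBuffer.wrap(message, 0, FINGERPRINT_BYTES).getLong();
    }

    // Everything after the fingerprint is the Avro-encoded record.
    static byte[] bodyOf(byte[] message) {
        byte[] body = new byte[message.length - FINGERPRINT_BYTES];
        System.arraycopy(message, FINGERPRINT_BYTES, body, 0, body.length);
        return body;
    }
}
```

Because the fingerprint-to-schema mapping is baked in at build time, this scheme works only when schema changes go through the same deploy process as code, as the comment notes.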
@silasdavis It sounds like you're further along with this than I am. I haven't used Camus; I've only looked at how it uses schema repositories. So far all I've done is hook up a basic "repository" that uses a hard-coded configuration to associate a Kafka topic with a schema. Basically an interface and a HashMap ;) I'd be interested in hearing what you have learned from using Camus.
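The "interface and a HashMap" repository described above might look like the following. This is a hypothetical sketch, not Secor code: the names are invented, and it stores raw schema-JSON strings to stay dependency-free, where a real implementation would parse and return `org.apache.avro.Schema` objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a schema repository interface with a
// hard-coded, configuration-driven implementation. A dynamic
// implementation (e.g. backed by AVRO-1124's schema repo) could
// later be swapped in behind the same interface.
interface SchemaRepository {
    // Returns the schema JSON for a topic, or null if unknown.
    String getSchemaJson(String topic);
}

class HardCodedSchemaRepository implements SchemaRepository {
    private final Map<String, String> schemasByTopic = new HashMap<>();

    HardCodedSchemaRepository(Map<String, String> schemasByTopic) {
        this.schemasByTopic.putAll(schemasByTopic);
    }

    @Override
    public String getSchemaJson(String topic) {
        return schemasByTopic.get(topic);
    }
}
```

Keeping the interface minimal means the parser only ever asks "what schema does this topic use?", so swapping the lookup strategy later does not touch the parsing code.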
Secor currently supports only Thrift and JSON messages out of the box. It should also provide a configurable Avro message parser that uses Avro's GenericRecord and a configurable timestamp field.
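The timestamp-driven piece of such a parser can be sketched as below. This is a hypothetical illustration, not Secor's actual API: the class name, config field, and `dt=yyyy-MM-dd` partition format are assumptions. The real parser would first deserialize the payload with Avro's GenericDatumReader and call `record.get(timestampField)`; that Avro plumbing is elided so the sketch stays self-contained, and the already-extracted field value is accepted instead.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical sketch: derive an output partition from a configurable
// timestamp field extracted from a deserialized Avro GenericRecord.
class AvroTimestampSketch {
    private final String timestampField;  // would come from Secor config

    AvroTimestampSketch(String timestampField) {
        this.timestampField = timestampField;
    }

    // GenericRecord.get(...) returns an Object; we assume the field
    // holds a millisecond epoch timestamp as a numeric value.
    String partitionFor(Object timestampValue) {
        long millis = ((Number) timestampValue).longValue();
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return "dt=" + fmt.format(new Date(millis));
    }
}
```

Making the field name configurable is what keeps the parser generic: different topics can name their timestamp field differently while sharing one parser implementation.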