[avro] Add support for reading schema from Avro-encoded file #8

Open
cowtowncoder opened this issue May 6, 2016 · 9 comments
Labels
avro pr-needed Feature request for which PR likely needed (no active development but idea is workable)

Comments

@cowtowncoder
Member

(moved from FasterXML/jackson-dataformat-avro#10)

Avro streams may include an embedded schema. Since it should be relatively safe either to auto-detect it or to make reading it the default when no schema is specified, we should support this mode.

As to sample data, maybe this project:

https://github.com/miguno/avro-cli-examples

has data we could use for confirming proper usage.

A follow-up feature should probably be producing output with an embedded schema, but that would be a separate RFE.

cowtowncoder added a commit that referenced this issue Feb 17, 2017
…aybe next step. #1 only affects `master` will do next
@bkenned4

Is this ticket resolved? I noticed it's referenced in the cherry-pick commit.

@cowtowncoder
Member Author

@bkenned4 No; may have accidentally included issue id of the old repo.

As to implementation, I suspect auto-detection may be slightly risky (it is possible for encoded data to start with the same 4 bytes). But as long as it's a format feature, disabled by default, it may make sense. In addition to forcing use
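To make the risk concrete, here is a minimal sketch of such a collision. It relies on two facts from the Avro specification: object container files begin with the four bytes `O`, `b`, `j`, `1`, and ints are written as zig-zag varints. The class and method names are made up for illustration, and the encoder only handles values that fit in a single varint byte.

```java
import java.util.Arrays;

final class AvroMagicCollision {
    // Avro object container files start with these four bytes: 'O', 'b', 'j', 1.
    static final byte[] CONTAINER_MAGIC = {'O', 'b', 'j', 1};

    // Zig-zag encode an int as Avro's binary encoding does, so that small
    // negative and positive values both map to small unsigned values.
    static int zigZag(int n) {
        return (n << 1) ^ (n >> 31);
    }

    // Encode values as Avro ints. This sketch only supports values whose
    // zig-zag form fits in a single varint byte (0..127).
    static byte[] encodeSmallInts(int... values) {
        byte[] out = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            int z = zigZag(values[i]);
            if (z < 0 || z > 0x7F) {
                throw new IllegalArgumentException("needs more than one byte: " + values[i]);
            }
            out[i] = (byte) z;
        }
        return out;
    }

    public static void main(String[] args) {
        // A schema-less record with four int fields holding these values
        // serializes to exactly the container-file magic.
        byte[] record = encodeSmallInts(-40, 49, 53, -1);
        System.out.println(Arrays.equals(record, CONTAINER_MAGIC)); // prints "true"
    }
}
```

So a plain record whose first four int fields happen to be -40, 49, 53 and -1 is indistinguishable from a container header by its first 4 bytes alone, which is why an opt-in setting looks safer than unconditional sniffing.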

@bkenned4

@cowtowncoder clear. thanks for the context

@iehrlich

@cowtowncoder what's the status of this issue? This would be 100% useful in a number of cases. For example, another part of my system generates Avro documents, and I know for sure that the schema is present, so at least an option to enable schema detection manually would be nice!

@njawad25

@cowtowncoder given that the last comment was a few years ago, I'm not sure where this issue stands, but it looks like it is still open. I'm currently writing a Spring Boot application that consumes multiple files in different formats (XML, CSV, Avro), and this would help a lot with keeping the code clean and easy to follow.
Thank you

@cowtowncoder cowtowncoder added the pr-needed Feature request for which PR likely needed (no active development but idea is workable) label Jun 24, 2022
@cowtowncoder
Member Author

At this point I do not have time to work on this feature, even though I fully agree that this would be a great feature.

However: if someone has the itch and would like to try to produce a PR, I will find time to help get the PR refined and hopefully merged. At this point such a contribution could make it into the upcoming 2.14.0 and earn kudos for a really, really nice addition from happy users. :)

Also: one thing that can help motivate others is to "up vote" issue with "thumbs up" reaction. While that does not change anyone's availability, sometimes it can help prioritize things nonetheless.

@cowtowncoder
Member Author

An additional idea: if you don't think you know how to tackle a somewhat advanced feature like this one (it's not trivial to figure out where and how to plug it in if you're not familiar with the project), one thing that would be helpful is simply a unit test: a test that tries to read an input file containing an embedded Schema, using the default parser/mapper with no extra settings -- and that would currently fail.

The feature implementor would only need to modify the test lightly to enable schema-reading (I think an AvroParser.Feature is needed, since due to zero redundancy it is not possible to detect 100% reliably that content starts with a Schema) but could use it to verify that the feature works.
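As a rough illustration of the opt-in idea, the sketch below gates header sniffing behind a boolean flag. The flag is a stand-in for a parser feature that does not exist yet, and `EmbeddedSchemaSniffer` is a hypothetical name, not part of Jackson. Only the magic bytes (`O`, `b`, `j`, `1`) come from the Avro specification. A real implementation would go on to parse the header metadata to obtain the schema.

```java
import java.io.IOException;
import java.io.PushbackInputStream;
import java.util.Arrays;

final class EmbeddedSchemaSniffer {
    // Avro object container files start with these four bytes: 'O', 'b', 'j', 1.
    static final byte[] CONTAINER_MAGIC = {'O', 'b', 'j', 1};

    // Hypothetical opt-in flag, standing in for a not-yet-existing parser feature.
    private final boolean readEmbeddedSchema;

    EmbeddedSchemaSniffer(boolean readEmbeddedSchema) {
        this.readEmbeddedSchema = readEmbeddedSchema;
    }

    // Returns true only when the flag is enabled AND the stream starts with
    // the container magic. Peeked bytes are pushed back, so the caller can
    // still read the stream from its beginning. The stream needs a pushback
    // buffer of at least CONTAINER_MAGIC.length bytes.
    boolean looksLikeContainer(PushbackInputStream in) throws IOException {
        if (!readEmbeddedSchema) {
            return false; // disabled by default: never guess
        }
        byte[] head = new byte[CONTAINER_MAGIC.length];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream is shorter than the magic
            }
            n += r;
        }
        in.unread(head, 0, n);
        return n == head.length && Arrays.equals(head, CONTAINER_MAGIC);
    }
}
```

With the flag off, the stream is left untouched and existing behavior is unchanged. With it on, a caller could branch to container-file handling before any schema-less decoding starts.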

@MichalFoksa
Contributor

How do you imagine this functionality? Something like this:

AvroMapper mapper = AvroMapper.builder().build();
InputStream in = getClass().getResourceAsStream("twitter.avro");
Twitter twitter = mapper.readValue(in, Twitter.class);

?

@cowtowncoder
Member Author

@MichalFoksa Yes, something like that.


5 participants