Unrecognized codec: gzip when using AvroFileReaderWriterFactory #482

Closed
eugenemiretsky opened this issue Jan 7, 2019 · 3 comments

Comments

@eugenemiretsky

I get the error shown below.
Is it possible to turn off compression, or set it to something that works with Avro?
We didn't specify gzip anywhere.

2019-01-07 06:22:36,461 [Thread-5] (com.pinterest.secor.io.impl.AvroFileReaderWriterFactory) ERROR Error creating codec factory
org.apache.avro.AvroRuntimeException: Unrecognized codec: gzip
	at org.apache.avro.file.CodecFactory.fromString(CodecFactory.java:102)
	at com.pinterest.secor.io.impl.AvroFileReaderWriterFactory$AvroFileWriter.getCodecFactory(AvroFileReaderWriterFactory.java:143)
	at com.pinterest.secor.io.impl.AvroFileReaderWriterFactory$AvroFileWriter.<init>(AvroFileReaderWriterFactory.java:135)
	at com.pinterest.secor.io.impl.AvroFileReaderWriterFactory.BuildFileWriter(AvroFileReaderWriterFactory.java:67)
	at com.pinterest.secor.util.ReflectionUtil.createFileWriter(ReflectionUtil.java:154)
	at com.pinterest.secor.common.FileRegistry.getOrCreateWriter(FileRegistry.java:133)
	at com.pinterest.secor.writer.MessageWriter.write(MessageWriter.java:94)
	at com.pinterest.secor.consumer.Consumer.consumeNextMessage(Consumer.java:165)
	at com.pinterest.secor.consumer.Consumer.run(Consumer.java:98)
2019-01-07 06:22:36,461 [Thread-3] (com.pinterest.secor.io.impl.AvroFileReaderWriterFactory) ERROR Error creating codec factory
@HenryCaiHaiying
Contributor

It seems to me that you specified the gzip codec in your Secor config.

@richiesgr

richiesgr commented Nov 30, 2020

Hi,
I would like to add more on this because I have just checked it.
The first problem is that Avro doesn't support Gzip compression by default.
The second problem is that the Avro writer and the MessageWriter both read the same configuration parameter:
secor.compression.codec
So if you set it to org.apache.hadoop.io.compress.GzipCodec, you get an exception because the Avro writer tries to use it and fails.
If you set it to null (no compression), the MessageWriter fails with an exception because it tries to instantiate a class with that name.
If you leave it empty, it fails because a value is required there.

So I don't know what to put here. I am trying to get BigQuery to read the files, and I don't think it supports anything other than Gzip or uncompressed, and you can't set either of them!
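
To make the mismatch concrete, here is a minimal sketch of how one configured value could be translated into a name that Avro accepts. It is not Secor code: the class `AvroCodecNames`, its methods, and the gzip-to-deflate mapping are all hypothetical. The only facts it relies on are the codec names Avro defines ("null", "deflate", "snappy", "bzip2", "xz", and "zstandard" in Avro 1.9+) and the Hadoop class name quoted above. Note that Avro's "deflate" compresses the data blocks inside the Avro container; it does not produce a gzip file.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Secor or Avro.
final class AvroCodecNames {
    // Codec names Avro defines; "zstandard" requires Avro 1.9 or later.
    private static final Set<String> AVRO_CODECS = new HashSet<>(
            Arrays.asList("null", "deflate", "snappy", "bzip2", "xz", "zstandard"));

    // True if the name is one Avro's codec lookup would recognize.
    static boolean isAvroCodec(String name) {
        return name != null && AVRO_CODECS.contains(name);
    }

    // Maps a configured value (a Hadoop codec class name or a short name)
    // to an Avro codec name. Assumption: gzip is mapped to "deflate" and
    // anything unknown or empty falls back to "null" (no compression).
    static String toAvroCodecName(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return "null";
        }
        String value = configured.trim();
        // org.apache.hadoop.io.compress.GzipCodec -> GzipCodec
        String simple = value.substring(value.lastIndexOf('.') + 1);
        // GzipCodec -> Gzip
        if (simple.endsWith("Codec")) {
            simple = simple.substring(0, simple.length() - "Codec".length());
        }
        simple = simple.toLowerCase(Locale.ROOT);
        if (simple.equals("gzip")) {
            return "deflate";
        }
        return isAvroCodec(simple) ? simple : "null";
    }

    public static void main(String[] args) {
        System.out.println(toAvroCodecName("org.apache.hadoop.io.compress.GzipCodec"));
        System.out.println(toAvroCodecName("org.apache.hadoop.io.compress.SnappyCodec"));
        System.out.println(toAvroCodecName(""));
    }
}
```

With a mapping like this, the MessageWriter could keep receiving the Hadoop class name while the Avro writer receives a name that `CodecFactory.fromString` recognizes. Whether the resulting files suit a given reader still has to be checked against that reader's documentation.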

@HenryCaiHaiying
Contributor

HenryCaiHaiying commented Nov 30, 2020 via email
