Learning and researching Avro-related technology

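Notes on developing Avro MapReduce programs

When adding the avro-mapred artifact in Maven for MR2/YARN, you must specify the hadoop2 classifier: groupId org.apache.avro, artifactId avro-mapred, version ${avro.version}, classifier hadoop2.
When submitting the job to a Hadoop cluster, refer to the run.sh script under the script directory.
The Hadoop dependencies must include hadoop-mapreduce-client-core, hadoop-mapreduce-client-common, and hadoop-mapreduce-client-jobclient.
The Avro version must match the one that corresponds to your Hadoop version. This example uses Hadoop 2.6.0, for which the corresponding Avro version is 1.7.4.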
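
As a rough illustration of the dependency notes above, the relevant pom.xml entries might look like the fragment below. This is a sketch, not the repository's actual POM: the `hadoop.version` property name is an assumption introduced here, and the versions simply mirror the Hadoop 2.6.0 / Avro 1.7.4 pairing mentioned above.

```xml
<!-- Sketch only: hadoop.version is an assumed property name, not taken from the original POM. -->
<properties>
  <avro.version>1.7.4</avro.version>
  <hadoop.version>2.6.0</hadoop.version>
</properties>

<dependencies>
  <!-- The hadoop2 classifier selects the avro-mapred build for MR2/YARN. -->
  <dependency>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-mapred</artifactId>
    <version>${avro.version}</version>
    <classifier>hadoop2</classifier>
  </dependency>

  <!-- The three MapReduce client artifacts listed in the notes. -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-common</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
</dependencies>
```

Keeping both versions in properties makes it easier to change them together when targeting a different Hadoop release.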

Reference documentation

Quoted from the Avro documentation

The Avro MapReduce API is an Avro module for running MapReduce programs which produce or consume Avro data files.

If you are using Maven, simply add the following dependency to your POM:

groupId: org.apache.avro
artifactId: avro-mapred
version: 1.7.4
classifier: hadoop2

Then write your program using the Avro MapReduce javadoc for guidance.

At runtime, include the avro and avro-mapred JARs in the HADOOP_CLASSPATH; and the avro, avro-mapred and paranamer JARs in -libjars.

To enable Snappy compression on output files call AvroJob.setOutputCodec(job, "snappy") when configuring the job. You will also need to include the snappy-java JAR in -libjars.
