super-sponge/avro

Notes from studying and experimenting with Avro-related technologies

Notes on developing Avro MapReduce programs

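  • When adding the avro-mapred artifact in Maven for MR2/YARN, you must specify the hadoop2 classifier: groupId org.apache.avro, artifactId avro-mapred, version ${avro.version}, classifier hadoop2.
  • When submitting the job to a Hadoop cluster, follow the run.sh script in the script directory.
  • The Hadoop dependencies must include hadoop-mapreduce-client-core, hadoop-mapreduce-client-common, and hadoop-mapreduce-client-jobclient.
  • The Avro version must match the Avro version that corresponds to your Hadoop release. This example uses Hadoop 2.6.0, for which the corresponding Avro version is 1.7.4.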

Reference documentation

Excerpt

The Avro MapReduce API is an Avro module for running MapReduce programs which produce or consume Avro data files.

If you are using Maven, simply add the following dependency to your POM:

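    <dependency>
      <groupId>org.apache.avro</groupId>
      <artifactId>avro-mapred</artifactId>
      <version>1.7.4</version>
      <classifier>hadoop2</classifier>
    </dependency>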

Then write your program using the Avro MapReduce javadoc for guidance.

At runtime, include the avro and avro-mapred JARs in the HADOOP_CLASSPATH; and the avro, avro-mapred and paranamer JARs in -libjars.
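The two JAR lists use different separators: HADOOP_CLASSPATH is an ordinary classpath (colon-separated on Unix-like systems), while -libjars takes a comma-separated list. The following is a minimal sketch of assembling such a command line. The class name `AvroJobCommand`, the JAR paths, the job JAR, and the driver class are all hypothetical placeholders, not taken from this repository's run.sh. It also assumes the driver parses generic options (for example via `ToolRunner`), since otherwise -libjars is not interpreted.

```java
import java.util.Arrays;
import java.util.List;

public class AvroJobCommand {

    // HADOOP_CLASSPATH entries are joined with ':' (Unix-style classpath).
    static String hadoopClasspath(List<String> jars) {
        return String.join(":", jars);
    }

    // -libjars expects a comma-separated list of JARs to ship to the tasks.
    static String libJars(List<String> jars) {
        return String.join(",", jars);
    }

    // Builds the full command line; generic options such as -libjars
    // go after the main class and before the job's own arguments.
    static String command(String jobJar, String mainClass,
                          List<String> classpathJars, List<String> shippedJars,
                          String... jobArgs) {
        return String.join(" ",
                "HADOOP_CLASSPATH=" + hadoopClasspath(classpathJars),
                "hadoop", "jar", jobJar, mainClass,
                "-libjars", libJars(shippedJars),
                String.join(" ", jobArgs));
    }

    public static void main(String[] args) {
        // Hypothetical JAR locations; adjust to the actual files and versions.
        List<String> onClasspath = Arrays.asList(
                "lib/avro.jar", "lib/avro-mapred-hadoop2.jar");
        List<String> shipped = Arrays.asList(
                "lib/avro.jar", "lib/avro-mapred-hadoop2.jar", "lib/paranamer.jar");

        System.out.println(command("avro-demo.jar", "com.example.Driver",
                onClasspath, shipped, "input", "output"));
    }
}
```

The client JVM needs avro and avro-mapred on HADOOP_CLASSPATH to build and submit the job, while the task JVMs receive their copies (plus paranamer) through -libjars.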

To enable Snappy compression on output files call AvroJob.setOutputCodec(job, "snappy") when configuring the job. You will also need to include the snappy-java JAR in -libjars.
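As a rough illustration of where that call sits, here is a sketch of configuring a job with the `org.apache.avro.mapred` API, the one in which `AvroJob.setOutputCodec` takes a `JobConf`. The class name `SnappyOutputConfig` and the string output schema are placeholders chosen for the example. Compiling it requires the avro, avro-mapred, and Hadoop JARs on the classpath.

```java
import org.apache.avro.Schema;
import org.apache.avro.mapred.AvroJob;
import org.apache.hadoop.mapred.JobConf;

public class SnappyOutputConfig {

    // Applies Avro output settings to an existing job configuration.
    static JobConf configure(JobConf job) {
        // Placeholder schema: a real job would use its own record schema.
        AvroJob.setOutputSchema(job, Schema.create(Schema.Type.STRING));
        // Request Snappy-compressed Avro output files.
        AvroJob.setOutputCodec(job, "snappy");
        return job;
    }

    public static void main(String[] args) {
        JobConf job = configure(new JobConf());
        // Print the codec recorded in the job configuration.
        System.out.println(job.get(AvroJob.OUTPUT_CODEC));
    }
}
```

The codec is only recorded in the job configuration here. The snappy-java JAR still has to reach the tasks through -libjars for the output to be written.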
