Latest release version (available in Maven Central) - 1.0.0.


Maven plugin for running Apache Hadoop jobs in pseudo-distributed mode (the default configuration can be overridden with the hmp.hadoopConf property).

Verified on Linux and Mac OS X / JDK 1.6.0_33 / Apache Hadoop 0.21.0 and 1.0.2.


start - start NameNode, DataNode, JobTracker and TaskTracker
copyFromLocal - copy file/directory from local file system to HDFS
submitJob - submit job to Apache Hadoop
copyToLocal - copy file/directory from HDFS to local file system
stop - stop daemons started by the 'start' goal (necessary only if -Dhmp.autoShutdown=false was used)
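
A typical session with the goals above might look like the following sketch. It assumes the plugin is already configured in pom.xml and that the "hadoop" goal prefix resolves as in the commands quoted in this README. The function name hadoop_session and the MVN override variable are illustrative only and are not part of the plugin.

```shell
# Hypothetical helper (not provided by the plugin): start the daemons,
# submit the job, then stop the daemons.
# -Dhmp.autoShutdown=false keeps the daemons alive between Maven invocations,
# which is why the explicit hadoop:stop at the end is needed.
# Set MVN=echo to print the Maven arguments instead of running Maven.
hadoop_session() {
  mvn_cmd="${MVN:-mvn}"
  "$mvn_cmd" hadoop:start -Dhmp.autoShutdown=false &&
    "$mvn_cmd" hadoop:submitJob &&
    "$mvn_cmd" hadoop:stop
}
```

Calling hadoop_session runs the three goals in order and stops at the first failure. If hadoop:submitJob fails, the daemons are left running and 'mvn hadoop:stop' has to be run by hand.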

NOTE: maven-dependency-plugin & maven-jar-plugin are used to assemble the job jar. See the sample project for details.


Add the following snippet to your pom.xml (adjust <hadoopHome/> if needed) and run "mvn hadoop:start -Dhmp.autoShutdown=false".



See sample-maven-project.


If you use a custom Hadoop conf directory (the default one is shown here), hdfs-site.xml should include the snippet provided below. Otherwise, 'mvn hadoop:start hadoop:submitJob' may fail (during the hadoop:submitJob goal) because the DataNode is not yet available.



Apache License, Version 2.0