hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE
Java
sakserv Merge pull request #50 from NoRights/master
Support for System property of HADOOP_HOME in windows which fixes issue #49
Latest commit 0021581 Oct 26, 2017
Permalink
Failed to load latest commit information.
hadoop-mini-clusters-activemq [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-common #49 Support for System property of HADOOP_HOME in windows Oct 26, 2017
hadoop-mini-clusters-hbase [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-hdfs [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-hivemetastore [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-hiveserver2 [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-hyperscaledb [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-kafka [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-kdc [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-knox [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-mapreduce [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-mongodb [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-oozie [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-storm [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-yarn [maven-release-plugin] prepare for next development iteration Sep 25, 2017
hadoop-mini-clusters-zookeeper [maven-release-plugin] prepare for next development iteration Sep 25, 2017
windows_libs add missing 2.6.1.0 windows libs Oct 22, 2017
.dockerignore added support for running tests in docker in preperation for using do… Feb 11, 2016
.gitignore remove ide related assests Apr 4, 2017
.travis.yml undo travis changes Jun 29, 2017
CHANGES.md prepare for next release Sep 25, 2017
Dockerfile added support for running tests in docker in preperation for using do… Feb 11, 2016
LICENSE Initial commit Jan 4, 2015
README.md prepare for next release Sep 25, 2017
checkstyle.xml added checkstyle plugin, disabled test failures due to checkstyle war… Nov 3, 2015
docker_build.sh added support for running tests in docker in preperation for using do… Feb 11, 2016
pom.xml [maven-release-plugin] prepare for next development iteration Sep 25, 2017

README.md

hadoop-mini-clusters

hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE, without the need for a full blown development cluster or container orchestration. It allows the user to debug with the full power of the IDE. It provides a consistent API around the existing Mini Clusters across the ecosystem, eliminating the tedious task of learning the nuances of each project's approach.

Coverage Status

Modules:

The project structure changed with 0.1.0. Each mini cluster now resides in a module of its own. See the module names below.

Modules Included:

  • hadoop-mini-clusters-hdfs - Mini HDFS Cluster
  • hadoop-mini-clusters-yarn - Mini YARN Cluster (no MR)
  • hadoop-mini-clusters-mapreduce - Mini MapReduce Cluster
  • hadoop-mini-clusters-hbase - Mini HBase Cluster
  • hadoop-mini-clusters-zookeeper - Curator based Local Cluster
  • hadoop-mini-clusters-hiveserver2 - Local HiveServer2 instance
  • hadoop-mini-clusters-hivemetastore - Derby backed HiveMetaStore
  • hadoop-mini-clusters-storm - Storm LocalCluster
  • hadoop-mini-clusters-kafka - Local Kafka Broker
  • hadoop-mini-clusters-oozie - Local Oozie Server - Thanks again Vladimir
  • hadoop-mini-clusters-mongodb - I know... not Hadoop
  • hadoop-mini-clusters-activemq - Thanks Vladimir Zlatkin!
  • hadoop-mini-clusters-hyperscaledb - For testing various databases
  • hadoop-mini-clusters-knox - Local Knox Gateway
  • hadoop-mini-clusters-kdc - Local Key Distribution Center (KDC)

Tests:

Tests are included to show how to configure and use each of the mini clusters. See the *IntegrationTest classes.

Using:

  • Maven Central - latest release
<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters</artifactId>
    <version>0.1.14</version>
</dependency>

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-common</artifactId>
    <version>0.1.14</version>
</dependency>

Profile Support:

Multiple versions of HDP are available. The current list is:

  • HDP 2.6.2.0 (default)
  • HDP 2.6.1.0
  • HDP 2.6.0.3
  • HDP 2.5.3.0
  • HDP 2.5.0.0
  • HDP 2.4.2.0
  • HDP 2.4.0.0
  • HDP 2.3.4.0
  • HDP 2.3.2.0
  • HDP 2.3.0.0

To use a different profiles, add the profile name to your maven build:

mvn test -P2.3.0.0

Note that backwards compatibility is not guarenteed.

Examples:

HDFS Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hdfs</artifactId>
    <version>0.1.14</version>
</dependency>
HdfsLocalCluster hdfsLocalCluster = new HdfsLocalCluster.Builder()
    .setHdfsNamenodePort(12345)
    .setHdfsNamenodeHttpPort(12341)
    .setHdfsTempDir("embedded_hdfs")
    .setHdfsNumDatanodes(1)
    .setHdfsEnablePermissions(false)
    .setHdfsFormat(true)
    .setHdfsEnableRunningUserAsProxyUser(true)
    .setHdfsConfig(new Configuration())
    .build();
                
hdfsLocalCluster.start();

YARN Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-yarn</artifactId>
    <version>0.1.14</version>
</dependency>
YarnLocalCluster yarnLocalCluster = new YarnLocalCluster.Builder()
    .setNumNodeManagers(1)
    .setNumLocalDirs(Integer.parseInt(1)
    .setNumLogDirs(Integer.parseInt(1)
    .setResourceManagerAddress("localhost:37001")
    .setResourceManagerHostname("localhost")
    .setResourceManagerSchedulerAddress("localhost:37002")
    .setResourceManagerResourceTrackerAddress("localhost:37003")
    .setResourceManagerWebappAddress("localhost:37004")
    .setUseInJvmContainerExecutor(false)
    .setConfig(new Configuration())
    .build();
   
yarnLocalCluster.start();

MapReduce Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-mapreduce</artifactId>
    <version>0.1.14</version>
</dependency>
MRLocalCluster mrLocalCluster = new MRLocalCluster.Builder()
    .setNumNodeManagers(1)
    .setJobHistoryAddress("localhost:37005")
    .setResourceManagerAddress("localhost:37001")
    .setResourceManagerHostname("localhost")
    .setResourceManagerSchedulerAddress("localhost:37002")
    .setResourceManagerResourceTrackerAddress("localhost:37003")
    .setResourceManagerWebappAddress("localhost:37004")
    .setUseInJvmContainerExecutor(false)
    .setConfig(new Configuration())
    .build();

mrLocalCluster.start();

HBase Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hbase</artifactId>
    <version>0.1.14</version>
</dependency>
HbaseLocalCluster hbaseLocalCluster = new HbaseLocalCluster.Builder()
    .setHbaseMasterPort(25111)
    .setHbaseMasterInfoPort(-1)
    .setNumRegionServers(1)
    .setHbaseRootDir("embedded_hbase")
    .setZookeeperPort(12345)
    .setZookeeperConnectionString("localhost:12345")
    .setZookeeperZnodeParent("/hbase-unsecure")
    .setHbaseWalReplicationEnabled(false)
    .setHbaseConfiguration(new Configuration())
    .activeRestGateway()
        .setHbaseRestHost("localhost")
        .setHbaseRestPort(28000)
        .setHbaseRestReadOnly(false)
        .setHbaseRestThreadMax(100)
        .setHbaseRestThreadMin(2)
        .build()
    .build();

hbaseLocalCluster.start();

Zookeeper Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-zookeeper</artifactId>
    <version>0.1.14</version>
</dependency>
ZookeeperLocalCluster zookeeperLocalCluster = new ZookeeperLocalCluster.Builder()
    .setPort(12345)
    .setTempDir("embedded_zookeeper")
    .setZookeeperConnectionString("localhost:12345")
    .setMaxClientCnxns(60)
    .setElectionPort(20001)
    .setQuorumPort(20002)
    .setDeleteDataDirectoryOnClose(false)
    .setServerId(1)
    .setTickTime(2000)
    .build();

zookeeperLocalCluster.start();

HiveServer2 Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hiveserver2</artifactId>
    <version>0.1.14</version>
</dependency>
HiveLocalServer2 hiveLocalServer2 = new HiveLocalServer2.Builder()
    .setHiveServer2Hostname("localhost")
    .setHiveServer2Port(12348)
    .setHiveMetastoreHostname("localhost")
    .setHiveMetastorePort(12347)
    .setHiveMetastoreDerbyDbDir("metastore_db")
    .setHiveScratchDir("hive_scratch_dir")
    .setHiveWarehouseDir("warehouse_dir")
    .setHiveConf(new HiveConf())
    .setZookeeperConnectionString("localhost:12345")
    .build();

hiveLocalServer2.start();

HiveMetastore Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hivemetastore</artifactId>
    <version>0.1.14</version>
</dependency>
HiveLocalMetaStore hiveLocalMetaStore = new HiveLocalMetaStore.Builder()
    .setHiveMetastoreHostname("localhost")
    .setHiveMetastorePort(12347)
    .setHiveMetastoreDerbyDbDir("metastore_db")
    .setHiveScratchDir("hive_scratch_dir")
    .setHiveWarehouseDir("warehouse_dir")
    .setHiveConf(new HiveConf())
    .build();

hiveLocalMetaStore.start();

Storm Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-storm</artifactId>
    <version>0.1.14</version>
</dependency>
StormLocalCluster stormLocalCluster = new StormLocalCluster.Builder()
    .setZookeeperHost("localhost")
    .setZookeeperPort(12345)
    .setEnableDebug(true)
    .setNumWorkers(1)
    .setStormConfig(new Config())
    .build();

stormLocalCluster.start();

Kafka Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-kafka</artifactId>
    <version>0.1.14</version>
</dependency>
KafkaLocalBroker kafkaLocalBroker = new KafkaLocalBroker.Builder()
    .setKafkaHostname("localhost")
    .setKafkaPort(11111)
    .setKafkaBrokerId(0)
    .setKafkaProperties(new Properties())
    .setKafkaTempDir("embedded_kafka")
    .setZookeeperConnectionString("localhost:12345")
    .build();

kafkaLocalBroker.start();

Oozie Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-oozie</artifactId>
    <version>0.1.14</version>
</dependency>
OozieLocalServer oozieLocalServer = new OozieLocalServer.Builder()
    .setOozieTestDir("embedded_oozie")
    .setOozieHomeDir("oozie_home")
    .setOozieUsername(System.getProperty("user.name"))
    .setOozieGroupname("testgroup")
    .setOozieYarnResourceManagerAddress("localhost")
    .setOozieHdfsDefaultFs("hdfs://localhost:8020/")
    .setOozieConf(new Configuration())
    .setOozieHdfsShareLibDir("/tmp/oozie_share_lib")
    .setOozieShareLibCreate(Boolean.TRUE)
    .setOozieLocalShareLibCacheDir("share_lib_cache")
    .setOoziePurgeLocalShareLibCache(Boolean.FALSE)
    .setOozieShareLibFrameworks(
        Lists.newArrayList(Framework.MAPREDUCE_STREAMING, Framework.OOZIE))
    .build();

OozieShareLibUtil oozieShareLibUtil = new OozieShareLibUtil(
    oozieLocalServer.getOozieHdfsShareLibDir(),
    oozieLocalServer.getOozieShareLibCreate(), 
    oozieLocalServer.getOozieLocalShareLibCacheDir(),
    oozieLocalServer.getOoziePurgeLocalShareLibCache(), 
    hdfsLocalCluster.getHdfsFileSystemHandle(),
    oozieLocalServer.getOozieShareLibFrameworks());
oozieShareLibUtil.createShareLib();

oozieLocalServer.start();

MongoDB Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-mongodb</artifactId>
    <version>0.1.14</version>
</dependency>
MongodbLocalServer mongodbLocalServer = new MongodbLocalServer.Builder()
    .setIp("127.0.0.1")
    .setPort(11112)
    .build();

mongodbLocalServer.start();

ActiveMQ Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-activemq</artifactId>
    <version>0.1.14</version>
</dependency>
ActivemqLocalBroker amq = new ActivemqLocalBroker.Builder()
    .setHostName("localhost")
    .setPort(11113)
    .setQueueName("defaultQueue")
    .setStoreDir("activemq-data")
    .setUriPrefix("vm://")
    .setUriPostfix("?create=false")
    .build();

amq.start();

HyperSQL DB Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-hyperscaledb</artifactId>
    <version>0.1.14</version>
</dependency>
hsqldbLocalServer = new HsqldbLocalServer.Builder()
    .setHsqldbHostName("127.0.0.1")
    .setHsqldbPort("44111")
    .setHsqldbTempDir("embedded_hsqldb")
    .setHsqldbDatabaseName("testdb")
    .setHsqldbCompatibilityMode("mysql")
    .setHsqldbJdbcDriver("org.hsqldb.jdbc.JDBCDriver")
    .setHsqldbJdbcConnectionStringPrefix("jdbc:hsqldb:hsql://")
    .build();

hsqldbLocalServer.start();

Knox Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-knox</artifactId>
    <version>0.1.14</version>
</dependency>
KnoxLocalCluster knoxCluster = new KnoxLocalCluster.Builder()
    .setPort(8888)
    .setPath("gateway")
    .setHomeDir("embedded_knox")
    .setCluster("mycluster")
    .setTopology(XMLDoc.newDocument(true)
        .addRoot("topology")
            .addTag("gateway")
                .addTag("provider")
                    .addTag("role").addText("authentication")
                    .addTag("enabled").addText("false")
                    .gotoParent()
                .addTag("provider")
                    .addTag("role").addText("identity-assertion")
                    .addTag("enabled").addText("false")
                    .gotoParent()
                .gotoParent()
            .addTag("service")
                .addTag("role").addText("NAMENODE")
                .addTag("url").addText("hdfs://localhost:8020")
                .gotoParent()
            .addTag("service")
                .addTag("role").addText("WEBHDFS")
                .addTag("url").addText("http://localhost:50070/webhdfs")
        .gotoRoot().toString())
        .build();

knoxCluster.start();

KDC Example

<dependency>
    <groupId>com.github.sakserv</groupId>
    <artifactId>hadoop-mini-clusters-kdc</artifactId>
    <version>0.1.14</version>
</dependency>
KdcLocalCluster kdcLocalCluster = new KdcLocalCluster.Builder()
        .setPort(34340)
        .setHost("127.0.0.1")
        .setBaseDir("embedded_kdc")
        .setOrgDomain("ORG")
        .setOrgName("ACME")
        .setPrincipals("hdfs,hbase,yarn,oozie,oozie_user,zookeeper,storm,mapreduce,HTTP".split(","))
        .setKrbInstance("127.0.0.1")
        .setInstance("DefaultKrbServer")
        .setTransport("TCP")
        .setMaxTicketLifetime(86400000)
        .setMaxRenewableLifetime(604800000)
        .setDebug(false)
        .build();
kdcLocalCluster.start();

Find how to integrate KDC with HDFS, Zookeeper or HBase in the tests under hadoop-mini-clusters-kdc/src/test/java/com/github/sakserv/minicluster/impl

Modifying Properties

To change the defaults used to construct the mini clusters, modify src/main/java/resources/default.properties as needed.

Intellij Testing

If you desire running the full test suite from Intellij, make sure Fork Mode is set to method (Run -> Edit Configurations -> fork mode)

InJvmContainerExecutor

YarnLocalCluster now supports Oleg Z's InJvmContainerExecutor. See Oleg Z's Github for more.