Kettle plugin that provides support for interacting with many "big data" projects, including Hadoop, Hive, HBase, Cassandra, MongoDB, and others.


Pentaho Big Data Plugin

The Pentaho Big Data Plugin Project provides support for an ever-expanding Big Data community within the Pentaho ecosystem. It is a plugin for the Pentaho Kettle engine which can be used within Pentaho Data Integration (Kettle), Pentaho Reporting, and the Pentaho BI Platform.

Building

The Pentaho Big Data Plugin is built with Apache Ant and uses Apache Ivy for dependency management. To build the project you need only Ant 1.7.0 or newer; the build scripts will download Ivy if you do not already have it installed.

$ git clone git://
$ cd big-data-plugin
$ ant

This will produce a plugin archive in dist/pentaho-big-data-plugin-${project.revision}.tar.gz (and .zip). This archive can then be extracted into your Pentaho Data Integration plugin directory.
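The extraction step can be sketched as a small shell helper. This is only an illustration: the function name `install_plugin` is hypothetical, and both the archive path and the plugins directory are assumptions that depend on your build revision and on where Pentaho Data Integration is installed.

```shell
# Hypothetical helper (not part of the project): unpack a built plugin
# archive into a PDI plugins directory. Both arguments are assumptions;
# adjust them to your environment.
install_plugin() {
  archive="$1"      # e.g. the .tar.gz produced under dist/ by the Ant build
  plugins_dir="$2"  # e.g. the plugins directory of your PDI installation

  # Refuse to continue if the archive is missing (for example, the build failed).
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi

  # Create the target directory if needed, then extract the archive into it.
  mkdir -p "$plugins_dir" &&
    tar -xzf "$archive" -C "$plugins_dir"
}
```

It might be called as `install_plugin dist/<archive>.tar.gz <PDI install>/plugins`, with both placeholders replaced by your actual paths.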

Further Reading

Additional documentation is available on the Community wiki: Big Data Plugin for Java Developers

License

Licensed under the Apache License, Version 2.0. See LICENSE.txt for more information.
