Storm provisioning for Hortonworks Sandbox 2.0 VM

Puppet provisioning project to add Storm and Kettle-Storm to Hortonworks Sandbox 2.0 VM

Hortonworks has a long, detailed document on how to add Storm to their Sandbox VM ( ).

This project automates that process plus the process of installing the Pentaho Kettle-Storm project.

Getting started


The Hortonworks Sandbox 2.0 (VirtualBox version) :

The Sandbox VM should be running and configured with internet access.

To provision the VM:

  1. Log in to the VM via SSH (e.g. with PuTTY on Windows) as instructed on the VM's splash screen.

  2. Run these commands:

    # git clone
    # cd hw-sandbox-storm-provision
    # ./
  3. When the provisioning is done, you should be able to tail the storm logs:

    # tail -F /var/log/storm/*
  4. The last step is to set up port forwarding for the two new HTTP ports.

    1. Open the VirtualBox application
      1. Select the Hortonworks Sandbox 2.0 VM from the list on the left
      2. Click the Settings button on the toolbar
      3. Click the Network icon on the Settings dialog toolbar
      4. Under the Advanced section for Adapter 1, click Port Forwarding
      5. Click the + button to add a new port forward
      6. Name: storm-ui; Host Port: 8880; Guest Port: 8880
      7. Click the + button again to add a new port forward
      8. Name: storm-logviewer; Host Port: 8881; Guest Port: 8881
  5. Visit the Storm Nimbus UI: http://localhost:8880/
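
The two port forwards from step 4 can probably also be added from the host's command line with VBoxManage instead of the Settings dialog. This is a minimal sketch, not part of the project: it assumes the VM is registered as "Hortonworks Sandbox 2.0" (check with `VBoxManage list vms`) and that Adapter 1 uses NAT. The `VM_NAME` and `APPLY` variables and both helper functions are hypothetical names. By default the script only prints the commands; set `APPLY=1` to run them.

```shell
#!/bin/sh
# Assumption: the VM name as registered in VirtualBox; override via VM_NAME.
VM_NAME="${VM_NAME:-Hortonworks Sandbox 2.0}"

# Build a NAT rule string: name,protocol,host-ip,host-port,guest-ip,guest-port.
# Empty host-ip and guest-ip fields let VirtualBox use its defaults.
natpf_rule() {
    printf '%s,tcp,,%s,,%s' "$1" "$2" "$3"
}

# Print the VBoxManage command for one rule, or run it when APPLY=1.
forward_port() {
    rule=$(natpf_rule "$1" "$2" "$3")
    if [ "${APPLY:-0}" = "1" ]; then
        # controlvm targets a running VM; natpf1 means NAT adapter 1.
        VBoxManage controlvm "$VM_NAME" natpf1 "$rule"
    else
        echo "VBoxManage controlvm \"$VM_NAME\" natpf1 \"$rule\""
    fi
}

forward_port storm-ui 8880 8880
forward_port storm-logviewer 8881 8881
```

If the VM is powered off, `VBoxManage modifyvm "$VM_NAME" --natpf1 "$rule"` is the equivalent form. Afterwards, `curl -I http://localhost:8880/` on the host is a quick way to see whether the Storm UI answers.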

To try out the Storm Starter sample topologies (jobs):


To try out the Kettle-Storm sample transformation:

  1. Since this transformation uses twitter4j, you must set up credentials for the demo.

    1. Create app and access token

      1. Log in to the Twitter Dev site:
      2. Click the "Create New App" button.
      3. Fill out the "Application details" form.
        1. The contents of this form are not important for the demo.
        2. You can leave the "Callback URL" field blank.
      4. Click the "Create your Twitter application" button.
      5. On the "Application Management" page for your newly created app, click the "manage API keys" link under "Application settings" / "API key".
      6. On the "API Keys" page, click the "Create my access token" button at the bottom.
      7. Wait a few moments, then refresh the page to see your access token.
    2. Edit file

      1. In the hw-sandbox-storm-provision directory of your VM, edit the file
      2. Copy each of the four values from the Twitter App API Keys page into this file:
        1. API key == "oauth.consumerKey"
        2. API secret == "oauth.consumerSecret"
        3. Access token == "oauth.accessToken"
        4. Access token secret == "oauth.accessTokenSecret"
    3. Add the modified file to the jar so it can be found in the classpath

      1. In the hw-sandbox-storm-provision directory of your VM, run this command:

        # jar uf kettle-engine-storm-0.0.2-SNAPSHOT-for-remote-topology.jar
    4. Submit the jar to the Storm cluster, passing in the Kettle transformation to run

      # storm jar kettle-engine-storm-0.0.2-SNAPSHOT-for-remote-topology.jar org.pentaho.kettle.engines.storm.KettleStorm demo-twitter4j.ktr
    5. By default, the transformation runs for 15 seconds, then shuts down automatically.

    6. View the results in the output text file

      # cat /home/storm/tweets.txt
  2. Optionally, modify the duration or change the filter keywords by editing the demo-twitter4j.ktr file.
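
Before adding the edited credentials file to the jar, it may help to confirm that all four values were actually filled in, since a blank key tends to surface only later as an authentication failure. The sketch below is an illustration, not part of the project: `check_twitter4j_props` is a hypothetical helper, and it assumes the file uses plain `key=value` lines with the standard twitter4j key names (`oauth.consumerKey` and so on).

```shell
#!/bin/sh
# check_twitter4j_props FILE
# Prints each OAuth key that is missing or has an empty value in FILE.
# Returns 0 when all four keys are set, 1 otherwise.
check_twitter4j_props() {
    file="$1"
    missing=0
    for key in oauth.consumerKey oauth.consumerSecret \
               oauth.accessToken oauth.accessTokenSecret; do
        # Accept "key=value" or "key = value"; the value must be non-blank.
        # The dot in the key is left unescaped, which is loose but harmless here.
        if ! grep -Eq "^${key}[[:space:]]*=[[:space:]]*[^[:space:]]" "$file"; then
            echo "missing or empty: $key"
            missing=1
        fi
    done
    return "$missing"
}
```

Call it as `check_twitter4j_props "$PROPS_FILE"`, where `PROPS_FILE` is a placeholder for the path of the file you edited in the hw-sandbox-storm-provision directory.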