A library to process sensor data events real time using storm for storm-contrib
surajwaghulde committed Mar 3, 2012
1 parent b7b5cf9 commit e637168
Showing 7 changed files with 496 additions and 0 deletions.
48 changes: 48 additions & 0 deletions examples/movingAverageWithSpikeDetection/README
@@ -0,0 +1,48 @@
Technical Details:

This is a fun project I created to apply the newest real-time computing platforms to data generated by sensor devices. It is a library that continuously listens to and analyzes streams of data produced by various sensor devices. The library is built on Storm, a distributed stream computing platform (http://www.slideshare.net/nathanmarz/storm-distributed-and-faulttolerant-realtime-computation). The platform continuously processes streaming data and emits the results as another stream, so you can chain such stream-processing stages into a pipeline of any length.

I have written a library that computes a moving average and detects spikes in a continuous stream of data; it can be applied to financial data or to any other stream. The goal is a library of bolts that each perform one operation, so that we can reuse these bolts rather than start over from scratch.
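
To illustrate the idea, here is a minimal sketch of a moving average with spike detection in plain Java, without any Storm dependency. It is not the code of the bolts in this commit: the class name, the window size and the relative threshold are assumptions chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper for illustration; not part of this commit. */
class MovingAverageSpikeDetector {
    private final int windowSize;    // assumed: how many recent values are averaged
    private final double threshold;  // assumed: relative deviation that counts as a spike
    private final Deque<Double> window = new ArrayDeque<Double>();
    private double sum = 0.0;

    MovingAverageSpikeDetector(int windowSize, double threshold) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
        this.threshold = threshold;
    }

    /**
     * Compares the value with the average of the values seen before it,
     * then adds the value to the window. The first value is never a spike.
     */
    boolean isSpike(double value) {
        boolean spike = false;
        if (!window.isEmpty()) {
            double average = sum / window.size();
            spike = Math.abs(value - average) > threshold * Math.abs(average);
        }
        window.addLast(value);
        sum += value;
        if (window.size() > windowSize) {
            sum -= window.removeFirst();  // drop the oldest value
        }
        return spike;
    }

    /** Average of the values currently in the window (0.0 if empty). */
    double average() {
        return window.isEmpty() ? 0.0 : sum / window.size();
    }

    public static void main(String[] args) {
        MovingAverageSpikeDetector detector = new MovingAverageSpikeDetector(3, 0.2);
        double[] readings = {50.0, 52.0, 51.0, 90.0, 53.0};
        for (double reading : readings) {
            System.out.println(reading + " spike=" + detector.isSpike(reading));
        }
    }
}
```

In a topology, one bolt could keep the running average and a second bolt could apply the threshold; the sketch combines both steps in one class only to stay short.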

Let us start by deploying the project on a single node, which is very simple. I benchmarked this project at 96,000 sensor values per second on a cluster of 3 machines. It detects a spike within a few milliseconds.

Hardware Requirements: (Optional, since you can generate sensor-like input using InputStreamSpout)

1. An Arduino kit wired with a photoresistor circuit.
2. Connect the kit to your laptop over a serial port and run the light program included in this commit to generate light intensity events.
3. It will list the serial port being used on the machine.

Software Requirements:

1. Download Storm version 0.6.2 from https://github.com/nathanmarz/storm/downloads
2. Install Maven. I assume Java 1.6 is already installed.


Running my project:
1. Download the storm-starter project from https://github.com/nathanmarz/storm-starter/downloads, unzip it, and rename m2-pom.xml to pom.xml.
2. Copy the project archive movingAverageWithSpikeDetection.tar.gz to the storm-starter/src/jvm directory and extract it there using "tar -zxf movingAverageWithSpikeDetection.tar.gz".
3. Build the project with Maven by running "mvn clean install" in the storm-starter folder (this installs all the libraries needed for serialization and the rest of the distributed system).
4. Run "mvn eclipse:eclipse" to create an Eclipse .project file for convenience.
5. Open Eclipse and import the movingAverageWithSpikeDetection project.
6. Open LightEventSpout.java and change the PORT_NAMES[] entry to the serial port that the Arduino kit is using on your machine. The baud rate is set to 9600 in LightEventSpout.java, so if you change it for an experiment, make sure the baud rate on the device and the one in the program match.
7. Upload the light program to the Arduino kit.
8. Run SpikeDetectionTopology.java. (This automatically starts ZooKeeper cluster management and runs the program on it with one node.) If you get a PortInUse exception, create the folder /var/lock/ and give it 775 permissions; this is a problem with the Arduino-to-Java interface.


Creating a cluster of machines and running my project on distributed cluster:

1. Download ZooKeeper from http://download.filehat.com/apache/zookeeper/zookeeper-3.3.3/
2. Unzip ZooKeeper and rename zoo_sample.cfg in the conf folder to zoo.cfg.
3. Go to the bin folder in ZooKeeper and start a ZooKeeper instance with the command "zkServer.sh start".
4. Unzip the storm-0.6.2 folder.
5. Copy storm.yaml.example to storm.yaml and add all the machine names (you can use IP addresses) to storm.zookeeper.servers. This tells Storm that ZooKeeper is running on every machine to coordinate the distributed system.
6. Set nimbus.host to the name of the master machine, which is the one connected to the Arduino.
(Remember, all these steps need to be done on every machine.)
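
As a rough illustration of steps 5 and 6, a storm.yaml for a three-machine cluster might look like the fragment below. The host names are hypothetical placeholders; storm.zookeeper.servers and nimbus.host are the keys named in the steps above.

```yaml
# Hypothetical host names; replace them with your machine names or IP addresses.
storm.zookeeper.servers:
    - "machine1"
    - "machine2"
    - "machine3"

# The master machine (the one connected to the Arduino).
nimbus.host: "machine1"
```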


Now your cluster is set up. Do the following to run the project on the cluster of machines:

1. On every machine, go to the storm-0.6.2 folder, go to the bin directory inside it, and run "./storm supervisor".
2. On the master machine, run "./storm nimbus". This starts the master process, called nimbus, which distributes the runnable programs over the cluster dynamically.
3. Now run the project from the master node using the command "./storm jar movingAverageSpikeDetection.jar movingAverage.SpikeDetectionTopology". This submits the topology to the cluster, which uses ZooKeeper for coordination. If you get a PortInUse exception, create the folder /var/lock/ and give it 775 permissions; this is a problem with the Arduino-to-Java interface.

38 changes: 38 additions & 0 deletions examples/movingAverageWithSpikeDetection/arduino/light
@@ -0,0 +1,38 @@
/*
 * A simple programme that will change the intensity of
 * an LED based on the amount of light incident on
 * the photoresistor, and report each reading over serial.
 */

// Photoresistor pin
int lightPin = 0;  // the analog pin the photoresistor is connected to
                   // the photoresistor is not calibrated to any units, so
                   // this is simply a raw sensor value (relative light)
// LED pin
int ledPin = 9;    // the pin the LED is connected to
                   // we are controlling brightness, so we use
                   // one of the PWM (pulse width modulation) pins

void setup()
{
  Serial.begin(9600);       // must match DATA_RATE in LightEventSpout.java
  pinMode(ledPin, OUTPUT);  // sets the LED pin to output
}

/*
 * loop() - this function will start after setup()
 * finishes and then repeat
 */
void loop()
{
  int raw = analogRead(lightPin);  // read the raw light level

  // Scale the raw reading from 0..900 to 1000..9999 so that every value
  // sent over serial has exactly four digits, as LightEventSpout expects.
  int lightLevel = map(raw, 0, 900, 1000, 9999);
  lightLevel = constrain(lightLevel, 1000, 9999);  // keep the value between 1000 and 9999

  // analogWrite() takes a duty cycle of 0..255, so scale separately for the LED.
  int brightness = map(raw, 0, 900, 0, 255);
  brightness = constrain(brightness, 0, 255);

  analogWrite(ledPin, brightness);  // set the LED brightness
  Serial.println(lightLevel);       // send the four-digit reading
}
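
The scaling arithmetic in loop() can be checked off the device. The sketch below is a plain Java port of the integer formula that Arduino's map() uses, followed by a clamp like constrain(); the class and method names are made up for this example. It shows why the clamp is needed: a raw reading above 900 would otherwise produce a five-digit value.

```java
/** Hypothetical helper for illustration; mirrors the arithmetic of the light program. */
class LightLevelScaling {

    /** Integer re-mapping with the same formula as Arduino's map(). */
    static long map(long x, long inMin, long inMax, long outMin, long outMax) {
        return (x - inMin) * (outMax - outMin) / (inMax - inMin) + outMin;
    }

    /** Clamps x to the range [low, high], like Arduino's constrain(). */
    static long constrain(long x, long low, long high) {
        return Math.max(low, Math.min(high, x));
    }

    /** Scales a raw analog reading to the four-digit value sent over serial. */
    static long toSerialValue(int raw) {
        return constrain(map(raw, 0, 900, 1000, 9999), 1000, 9999);
    }

    public static void main(String[] args) {
        int[] rawReadings = {0, 450, 900, 1023};
        for (int raw : rawReadings) {
            System.out.println(raw + " -> " + toSerialValue(raw));
        }
    }
}
```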
@@ -0,0 +1,64 @@
package movingAverage;

import java.util.Map;
import java.util.Random;

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.IRichSpout;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

/**
 * Spout that emits synthetic sensor readings, so the topology can be run
 * without the Arduino hardware.
 */
public class InputStreamSpout implements IRichSpout {
    private static final long serialVersionUID = 1L;

    private SpoutOutputCollector collector;
    // Readings still to emit; set to -1 once the final -1.0 value has been sent.
    private int count = 1000000;
    private String deviceID = "Arduino";

    private final Random random = new Random();

    @Override
    public boolean isDistributed() {
        return true;
    }

    @Override
    public void open(@SuppressWarnings("rawtypes") final Map conf, final TopologyContext context,
            final SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        if (count > 0) {
            // Emit a random reading in the range [50, 60).
            count--;
            collector.emit(new Values(deviceID, (random.nextDouble() * 10) + 50));
        } else if (count == 0) {
            // Emit -1.0 exactly once after the last reading, then stay idle.
            // (Decrementing on every call would eventually wrap around.)
            count = -1;
            collector.emit(new Values(deviceID, -1.0));
        }
        // try {
        //     Thread.sleep(20);
        // } catch (InterruptedException e) {
        //     e.printStackTrace();
        // }
    }

    @Override
    public void close() {
    }

    @Override
    public void ack(final Object id) {
    }

    @Override
    public void fail(final Object id) {
    }

    @Override
    public void declareOutputFields(final OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("string", "double"));
    }

}
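
The emission pattern of InputStreamSpout (a fixed number of random readings, then a single -1.0, then nothing) can be tried without a Storm cluster. The following sketch reproduces only that pattern in plain Java; the class name, the seed parameter and the use of null for "nothing to emit" are assumptions made for the example.

```java
import java.util.Random;

/** Hypothetical stand-in for the spout's emission logic; not part of this commit. */
class SyntheticReadings {
    static final double FINAL_VALUE = -1.0;  // the value sent once after the last reading

    private final Random random;
    private int remaining;
    private boolean finished = false;

    SyntheticReadings(int count, long seed) {
        this.remaining = count;
        this.random = new Random(seed);  // seeded so that runs are repeatable
    }

    /**
     * Returns the next reading in [50, 60), then FINAL_VALUE exactly once,
     * and null on every call after that.
     */
    Double next() {
        if (remaining > 0) {
            remaining--;
            return random.nextDouble() * 10 + 50;
        }
        if (!finished) {
            finished = true;
            return FINAL_VALUE;
        }
        return null;
    }

    public static void main(String[] args) {
        SyntheticReadings readings = new SyntheticReadings(3, 42L);
        for (Double value = readings.next(); value != null; value = readings.next()) {
            System.out.println(value);
        }
    }
}
```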
@@ -0,0 +1,174 @@
package movingAverage;

import java.util.Map;

import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.IRichSpout;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

import java.io.InputStream;
import java.io.OutputStream;
import gnu.io.CommPortIdentifier;
import gnu.io.SerialPort;
import gnu.io.SerialPortEvent;
import gnu.io.SerialPortEventListener;
import java.util.Enumeration;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class LightEventSpout implements IRichSpout, SerialPortEventListener {

    private static final long serialVersionUID = 1L;

    SerialPort serialPort;
    /** The port we're normally going to use. */
    private static final String PORT_NAMES[] = {
        "/dev/tty.usbmodemfa131", // Mac OS X
    };
    /** Buffered input stream from the port */
    private InputStream input;
    /** The output stream to the port */
    private OutputStream output;
    /** Milliseconds to block while waiting for port open */
    private static final int TIME_OUT = 2000;
    /** Default bits per second for COM port. */
    private static final int DATA_RATE = 9600;

    private SpoutOutputCollector collector;
    private String deviceID = "Arduino";
    /** Readings parsed by the serial listener, waiting to be emitted by nextTuple(). */
    private BlockingQueue<Integer> blockingQueue = new LinkedBlockingQueue<Integer>();

    @Override
    public boolean isDistributed() {
        return true;
    }

    @Override
    public void open(@SuppressWarnings("rawtypes") final Map conf, final TopologyContext context,
            final SpoutOutputCollector collector) {
        this.collector = collector;

        CommPortIdentifier portId = null;
        Enumeration portEnum = CommPortIdentifier.getPortIdentifiers();

        // iterate through the available ports, looking for one listed in PORT_NAMES
        while (portEnum.hasMoreElements()) {
            CommPortIdentifier currPortId = (CommPortIdentifier) portEnum.nextElement();
            for (String portName : PORT_NAMES) {
                if (currPortId.getName().equals(portName)) {
                    portId = currPortId;
                    System.out.println(portName);
                    System.out.println(portId);
                    break;
                }
            }
        }

        if (portId == null) {
            System.out.println("Could not find COM port.");
            return;
        }

        try {
            // open serial port, and use class name for the appName.
            serialPort = (SerialPort) portId.open(this.getClass().getName(),
                    TIME_OUT);
            System.out.println("serial port : " + serialPort);

            // set port parameters
            serialPort.setSerialPortParams(DATA_RATE,
                    SerialPort.DATABITS_8,
                    SerialPort.STOPBITS_1,
                    SerialPort.PARITY_NONE);

            // open the streams
            input = serialPort.getInputStream();
            output = serialPort.getOutputStream();

            // add event listeners
            serialPort.addEventListener(this);
            serialPort.notifyOnDataAvailable(true);
        } catch (Exception e) {
            System.err.println(e.toString());
        }
    }

    /**
     * This should be called when you stop using the port.
     * This will prevent port locking on platforms like Linux.
     */
    public synchronized void closeSerial() {
        if (serialPort != null) {
            serialPort.removeEventListener();
            serialPort.close();
        }
    }

    @Override
    public void nextTuple() {
        try {
            // Blocks until the serial listener has queued a reading.
            collector.emit(new Values(deviceID, blockingQueue.take()));
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the calling thread can shut down.
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void close() {
        // Release the serial port when the spout shuts down.
        closeSerial();
    }

    @Override
    public void ack(final Object id) {
    }

    @Override
    public void fail(final Object id) {
    }

    @Override
    public void declareOutputFields(final OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("string", "double"));
    }

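    @Override
    public void serialEvent(SerialPortEvent oEvent) {
        if (oEvent.getEventType() == SerialPortEvent.DATA_AVAILABLE) {
            try {
                int available = input.available();
                byte chunk[] = new byte[available];
                // read() may return fewer bytes than requested, so parse only what was read
                int read = input.read(chunk, 0, available);
                int j = 0;
                // Skip the first, possibly partial, line so that parsing
                // starts on a line boundary.
                while (j < read && chunk[j] != '\n') {
                    j++;
                }
                j++;
                // Each reading is exactly four ASCII digits (1000..9999).
                byte number[] = new byte[4];
                int count = 0;
                for (; j < read; j++) {
                    byte b = chunk[j];
                    if (b == '\r' || b == '\n') {
                        // Queue only complete four-digit readings. This also avoids
                        // queuing the previous value again on the '\n' of "\r\n".
                        if (count == number.length) {
                            System.out.println(new String(number));
                            blockingQueue.add(Integer.parseInt(new String(number)));
                        }
                        count = 0;
                    } else if (count < number.length) {
                        number[count++] = b;
                    } else {
                        // More than four characters: mark the line as malformed.
                        count = number.length + 1;
                    }
                }
            } catch (Exception e) {
                System.out.println("Error");
                System.err.println(e.toString());
            }
        }
        // Ignore all the other eventTypes, but you should consider the other ones.
    }

}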
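
serialEvent() above parses each chunk on its own and discards a reading that is split across two chunks. One alternative, sketched below in plain Java, is to buffer the bytes of an unfinished line until its terminator arrives. This is a different technique from the one in the commit; the class and method names are hypothetical, and it accepts any integer line rather than only four-digit values.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical line-buffering parser for illustration; not part of this commit. */
class SerialLineParser {
    // Characters of the current, not yet terminated, line.
    private final StringBuilder pending = new StringBuilder();

    /** Consumes one chunk of serial bytes and returns the readings completed by it. */
    List<Integer> feed(byte[] chunk) {
        List<Integer> readings = new ArrayList<Integer>();
        for (byte b : chunk) {
            if (b == '\r' || b == '\n') {
                if (pending.length() > 0) {
                    try {
                        readings.add(Integer.parseInt(pending.toString()));
                    } catch (NumberFormatException e) {
                        // Skip a malformed line instead of failing the whole chunk.
                    }
                    pending.setLength(0);
                }
            } else {
                pending.append((char) b);
            }
        }
        return readings;
    }

    public static void main(String[] args) {
        SerialLineParser parser = new SerialLineParser();
        // The reading 5678 is split across the two chunks.
        System.out.println(parser.feed("1234\r\n56".getBytes()));
        System.out.println(parser.feed("78\r\n9999\r\n".getBytes()));
    }
}
```

A spout could call feed() from its serial event handler and add the returned readings to its queue.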
