Latest commit 4367f23 Apr 26, 2018

IoT Predictive Maintenance using Recurrent Neural Networks

This demo uses the MapR Converged Data Platform, PySpark, TensorFlow, and Python 3.5 to predict the next-period value of a time series from an IoT device. This is particularly useful in manufacturing and Industry 4.0, where sensors attached to components send data back for real-time monitoring. The demo enhances real-time monitoring with prediction capabilities that can generate alerts when a prediction exceeds a threshold of normal behavior.
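As a rough illustration of the idea, and not the demo's actual model code, the sketch below frames a series of readings as (input window, next value) pairs, which is the usual shape of training data for a next-step recurrent model, and checks a predicted value against a threshold. The function names, the window size, the threshold, and the sample readings are all hypothetical.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], window: int) -> Tuple[List[List[float]], List[float]]:
    """Frame a series as (inputs, targets) for next-step prediction.

    Each input holds `window` consecutive readings; its target is the
    reading that immediately follows that window.
    """
    if window < 1 or window >= len(series):
        raise ValueError("window must be >= 1 and shorter than the series")
    n_samples = len(series) - window
    inputs = [list(series[i:i + window]) for i in range(n_samples)]
    targets = [series[i + window] for i in range(n_samples)]
    return inputs, targets


def should_alert(prediction: float, threshold: float) -> bool:
    """Return True when a predicted value exceeds the normal-behavior threshold."""
    return prediction > threshold


# Made-up sensor readings, for illustration only.
readings = [0.10, 0.12, 0.11, 0.13, 0.40]
X, y = make_windows(readings, window=3)
```

In a real pipeline the windows would feed a trained model, and `should_alert` would be applied to the model's output rather than to raw readings.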

If we can accurately predict when a piece of hardware will fail, and replace that component before it fails, we can achieve much higher levels of operational efficiency.

With many devices now including sensors and other components that send diagnostic reports, predictive maintenance using big data is becoming increasingly accurate and effective.

What are we working with?

A sensor attached to an automated manufacturing device captures position and calibration at each timestamp. The sensor captures real-time data on the device and the device's positioning. The data is stored for historical analysis to identify trends and patterns and to determine whether any devices need to be taken out of production for health checks and maintenance.


The data set consists of 2,014 .dat files that, when unpacked, are in XML format.

When the files are loaded into the sandbox, take the last 10 .dat files and move them to another folder. These will be the "real-time" data that the producer script will open and read into the MapR stream that is created and that our .
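The hold-out step above could be scripted along these lines. This is a minimal sketch: it assumes "last" means last by sorted file name, and the function name and any directory names you pass in are hypothetical, not part of the demo package.

```python
import shutil
from pathlib import Path
from typing import List


def hold_out_last_files(source_dir: Path, target_dir: Path, count: int = 10) -> List[Path]:
    """Move the last `count` .dat files (by sorted file name) into target_dir."""
    if count <= 0:
        return []
    dat_files = sorted(source_dir.glob("*.dat"))
    target_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in dat_files[-count:]:
        destination = target_dir / path.name
        shutil.move(str(path), str(destination))
        moved.append(destination)
    return moved
```

If the file names do not sort chronologically, sort on whatever does (for example, a timestamp inside each file) before taking the last ten.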

Download the MapR Sandbox

To get started, download the MapR Sandbox and install it in VirtualBox:

Before you start your new "Sandbox" VM, add the following rules to the Port Forwarding settings for the NAT network interface. If you are running the gateway on a different port, substitute your port number for '8082' below. Add a rule for Jupyter as well.
Kafka_REST TCP 8082 8082

jupyter TCP 9999 9999
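If you prefer the command line to the VirtualBox GUI, the same two rules can be expressed as `VBoxManage modifyvm ... --natpf1` calls. The sketch below only builds and prints the commands. It assumes the VM is registered under the name "Sandbox", that NAT is on network adapter 1, and that the VM is powered off; the helper name is hypothetical.

```python
from typing import List


def natpf_command(vm_name: str, rule_name: str, host_port: int, guest_port: int) -> List[str]:
    """Build a VBoxManage call that adds one TCP forwarding rule on NAT adapter 1."""
    # Rule format: name,protocol,host ip,host port,guest ip,guest port
    # (empty host/guest IP fields mean "any").
    rule = f"{rule_name},tcp,,{host_port},,{guest_port}"
    return ["VBoxManage", "modifyvm", vm_name, "--natpf1", rule]


commands = [
    natpf_command("Sandbox", "Kafka_REST", 8082, 8082),  # VM name is an assumption
    natpf_command("Sandbox", "jupyter", 9999, 9999),
]
for argv in commands:
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` once you have confirmed the VM name with `VBoxManage list vms`.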

Log in as the 'root' user and install the Kafka REST Gateway.

Set the flush timeout for the Kafka REST gateway buffer, and then restart the Warden service:

$ ssh -p 2222 root@localhost

password: mapr

[root@maprdemo ~]# yum install mapr-kafka-rest

[root@maprdemo ~]# echo '' >> /opt/mapr/kafka-rest/kafka-rest-2.0.1/config/

[root@maprdemo ~]# service mapr-warden restart

[root@maprdemo ~]# exit

Log in as the 'user01' user and create the MapR Stream and Topic for this demo:

$ ssh -p 2222 user01@localhost

password: mapr

[user01@maprdemo ~]$ maprcli stream create -path /user/user01/iot_stream -produceperm p -consumeperm p -topicperm p

[user01@maprdemo ~]$ maprcli stream topic create -path /user/user01/iot_stream -topic sensor_record
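For orientation, here is a hedged sketch of how a record might be posted to this topic through the Kafka REST gateway. It is not the demo's producer script. It assumes the gateway is reachable on the forwarded port 8082, that it accepts the `application/vnd.kafka.json.v1+json` content type, and that a MapR topic is addressed as the URL-encoded `<stream path>:<topic>`. The record fields are made up, and the standard library's `urllib` is used here only to keep the sketch dependency-free.

```python
import json
import urllib.parse
import urllib.request

STREAM_PATH = "/user/user01/iot_stream"  # from the maprcli commands above
TOPIC = "sensor_record"
GATEWAY = "http://localhost:8082"        # assumes the forwarded Kafka_REST port


def topic_url(gateway: str, stream_path: str, topic: str) -> str:
    """URL-encode '<stream path>:<topic>' into a single path segment."""
    full_name = f"{stream_path}:{topic}"
    return f"{gateway}/topics/{urllib.parse.quote(full_name, safe='')}"


def build_request(url: str, record: dict) -> urllib.request.Request:
    """Wrap one JSON record in the gateway's 'records' envelope."""
    body = json.dumps({"records": [{"value": record}]}).encode("utf-8")
    headers = {"Content-Type": "application/vnd.kafka.json.v1+json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


url = topic_url(GATEWAY, STREAM_PATH, TOPIC)
request = build_request(url, {"timestamp": 0, "position": 0.0})  # hypothetical fields

# To actually send it (requires a running gateway):
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```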

Now use scp or vi to copy the following Python files into your Sandbox VM at '/user/user01'.

Install Anaconda Python (note: instead of installing Miniconda, replace with "wget"). The Python files included in this demo package assume that you have installed Python 3.5.2 and the "Requests HTTP Library for Python".

For the visualization:

Note: Plotly no longer supports cloud streaming. I will be working on another visualization application to replace Plotly.

First, go to the Plotly website and set up an account. Once you have an account, go to your account settings; the menu on the left has a selection for API key. Click that and then "Regenerate Key". Then set up two Streaming API tokens. Once this is completed, install the plotly package in the Sandbox and set up your credentials. In the sandbox, do the following:

[user01@maprdemo ~]$ pip install plotly

[user01@maprdemo ~]$ vi ~/.plotly/.credentials

Add your stream tokens, username, and API key. To view your visualization, click on the My Files tab on the Plotly website and then "view".
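Instead of editing the file by hand, the credentials could be written from Python. This sketch assumes the JSON layout used by older plotly releases (keys `username`, `api_key`, and `stream_ids`); check it against the version you installed. The helper name and the values in the usage comment are placeholders.

```python
import json
from pathlib import Path
from typing import List


def write_plotly_credentials(path: Path, username: str, api_key: str, stream_ids: List[str]) -> None:
    """Write a plotly credentials file as JSON (key names are an assumption)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    credentials = {
        "username": username,
        "api_key": api_key,
        "stream_ids": stream_ids,  # the two Streaming API tokens
    }
    path.write_text(json.dumps(credentials, indent=4))


# Example call (placeholders, not real credentials):
# write_plotly_credentials(Path.home() / ".plotly" / ".credentials",
#                          "your-username", "your-api-key", ["token-1", "token-2"])
```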


Justin Brandenburg

Data Scientist, MapR Data Technologies


Mike Aube

Solutions Engineer - MapR Federal, MapR Data Technologies