This repository has been archived by the owner on May 27, 2019. It is now read-only.
Steve Martinelli edited this page Apr 18, 2018 · 13 revisions

Umbrella SI Journey

IoT Edge Analysis: Perform analysis with IoT elements

Short Name

Change Point Detection on IoT Time series data

Short Description

A method for detecting change points in sensor data. Sensors mounted on IoT devices, automated manufacturing equipment such as robot arms, and process monitoring and control equipment collect and transmit time-stamped data on a continuous basis.

Offering Type

Emerging Tech

Introduction

Change point detection involves collating statistics on time series data and identifying whether a change point has occurred. The core building blocks compute statistical parameters from the time series data and compare a previous dataset, taken from a time range in the past, with the current series from a recent time range. A statistical comparison between the two datasets reveals any change points. This journey uses the R statistical software, with sample sensor data loaded into the Data Science Experience cloud.

Author

By Krishna Prabu D, Shikha Maheshwari

Code

Video

Overview

This journey takes you through the end-to-end flow of steps for collating statistics on such time series data and identifying whether a change point has occurred. The core building blocks compute statistical parameters from the time series data and compare a previous dataset, taken from a time range in the past, with the current series from a recent time range. A statistical comparison between the two datasets reveals any change points. This journey uses the R statistical software, with sample sensor data loaded into the Data Science Experience cloud.

All the intermediary steps are modularized and all code is open sourced, so developers can use or modify the modules and sub-modules as they see fit for their specific application. When you have completed this journey, you will understand how to:

  1. Read sensor data for a single sensor
  2. Extract two time series datasets, one from the past and one from the present
  3. Compress these datasets by translating them into a set of statistics that accurately describe their characteristics
  4. Compare these statistics and quantify the differences
  5. Analyze these comparisons to detect any change points between the previous dataset and the current dataset
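
The steps above can be sketched in a few lines. The journey's own notebook is written in R; the Python below is only an illustration, and every function name, the choice of statistics (mean, standard deviation, median), the record layout, and the thresholds are assumptions rather than the journey's actual code.

```python
import statistics

def read_sensor(records, sensor_id):
    """Keep one sensor's readings, ordered by timestamp."""
    rows = sorted((ts, value) for sid, ts, value in records if sid == sensor_id)
    return [value for _, value in rows]

def split_windows(values, prev_len, curr_len):
    """Take the most recent curr_len readings and the prev_len readings before them."""
    if len(values) < prev_len + curr_len:
        raise ValueError("not enough readings for both windows")
    previous = values[-(prev_len + curr_len):-curr_len]
    current = values[-curr_len:]
    return previous, current

def summarize(window):
    """Compress a window into a few descriptive statistics."""
    return {
        "mean": statistics.fmean(window),
        "stdev": statistics.stdev(window),
        "median": statistics.median(window),
    }

def compare(prev_stats, curr_stats):
    """Quantify the differences, scaled by the previous window's spread."""
    scale = prev_stats["stdev"] or 1.0  # avoid dividing by zero on a flat window
    return {
        "mean_shift": (curr_stats["mean"] - prev_stats["mean"]) / scale,
        "median_shift": (curr_stats["median"] - prev_stats["median"]) / scale,
        "stdev_ratio": curr_stats["stdev"] / scale,
    }

def has_change_point(comparison, max_shift=3.0, max_ratio=2.0):
    """Flag a change when a shift or an increase in spread exceeds its assumed threshold."""
    return (
        abs(comparison["mean_shift"]) > max_shift
        or abs(comparison["median_shift"]) > max_shift
        or comparison["stdev_ratio"] > max_ratio
    )

def detect(records, sensor_id, prev_len, curr_len):
    """Chain the steps for one sensor and return True if a change is flagged."""
    values = read_sensor(records, sensor_id)
    previous, current = split_windows(values, prev_len, curr_len)
    return has_change_point(compare(summarize(previous), summarize(current)))

# Hand-made sample data: (sensor_id, timestamp, value) tuples.
stable = [20.1, 19.9, 20.0, 20.2, 19.8, 20.1, 19.9, 20.0]
shifted_records = [("sensor-01", ts, v) for ts, v in enumerate(stable + [23.0, 23.2, 22.9, 23.1])]
steady_records = [("sensor-01", ts, v) for ts, v in enumerate(stable + [20.0, 20.1, 19.9, 20.0])]

print(detect(shifted_records, "sensor-01", prev_len=8, curr_len=4))
print(detect(steady_records, "sensor-01", prev_len=8, curr_len=4))
```

Scaling each difference by the previous window's standard deviation keeps the thresholds unit-free, so the same settings can be tried on sensors that report in different units.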

Flow

  1. Sign up for the Data Science Experience
  2. Create Bluemix services
  3. Create Node-RED App and inject IoT data
  4. Create the notebook
  5. Add the data and configuration files
  6. Update the notebook with service credentials
  7. Run the notebook
  8. Download the results

Included components

  • IBM Data Science Experience: Analyze data using RStudio, Jupyter, and Python in a configured, collaborative environment that includes IBM value-adds, such as managed Spark.
  • Bluemix Object Storage: A Bluemix service that provides an unstructured cloud data store to build and deliver cost effective apps and services with high reliability and fast speed to market.
  • IBM Node-RED Cloud Foundry App: Develop, deploy, and scale server-side JavaScript® apps with ease. The IBM SDK for Node.js™ provides enhanced performance, security, and serviceability.
  • Db2 Warehouse on Cloud: IBM Db2 Warehouse on Cloud is a fully managed, enterprise-class cloud data warehouse service powered by IBM BLU Acceleration.
  • Internet of Things platform: This service is the hub for IBM Watson IoT and lets you communicate with and consume data from connected devices and gateways. Use the built-in web console dashboards to monitor your IoT data and analyze it in real time.

Featured technologies

  • R: R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.
  • Jupyter Notebooks: An open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
  • Data Science: Systems and scientific methods to analyze structured and unstructured data in order to extract knowledge and insights.
  • Analytics: Analytics delivers the value of data for the enterprise.

Blog

Change point detection is used to detect behavioural changes in time series data, and its applications to IoT sensor data are especially wide. Traditional change point detection implementations use rule-based methods that compare two data points, or two sets of time series, to decide whether a significant change has taken place. This journey uses a statistical approach to detect such change points instead. At its core, it relies on Node-RED in IBM Bluemix and the R Spark services in IBM Data Science Experience.
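
To make the contrast between a rule-based check and a statistical one concrete, here is a minimal sketch. It is written in Python for illustration only (the journey itself uses R), and the specific choices are assumptions: the rule compares two single readings against a fixed jump, and the statistical check uses Welch's t statistic on two windows, which is not necessarily the statistic the journey computes.

```python
import math
import statistics

def rule_based_change(previous_value, current_value, max_jump):
    """Rule-based check: flag when two single readings differ by more than max_jump."""
    return abs(current_value - previous_value) > max_jump

def welch_t(previous, current):
    """Welch's t statistic for the difference in means between two windows."""
    var_prev = statistics.variance(previous)
    var_curr = statistics.variance(current)
    std_err = math.sqrt(var_prev / len(previous) + var_curr / len(current))
    return (statistics.fmean(current) - statistics.fmean(previous)) / std_err

def statistical_change(previous, current, t_threshold=3.0):
    """Statistical check: flag when the window means differ by more than noise explains."""
    return abs(welch_t(previous, current)) > t_threshold

# Hand-made windows (illustrative values only).
previous = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0, 10.0, 10.0]
spiky = [10.0, 9.0, 11.0, 10.0, 9.0, 11.0, 10.0, 15.0]     # one outlier, no lasting shift
shifted = [12.0, 13.0, 11.0, 12.0, 14.0, 10.0, 12.0, 12.0]  # whole window moved up by 2

for name, window in (("spiky", spiky), ("shifted", shifted)):
    rule = rule_based_change(previous[-1], window[-1], max_jump=3.0)
    stat = statistical_change(previous, window)
    print(name, "rule:", rule, "statistical:", stat)
```

With these hand-made numbers, the rule reacts to the single outlier and misses the sustained shift of 2 units, while the window-level statistic does the opposite.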

All components are designed to be reused either as a complete flow or as individual components. With that purpose in mind, all components are fully configurable, so experiments can be repeated by tweaking the parameters. The outputs, which are statistical metrics, can be used further in downstream applications. The logical architecture, or flow, of the journey can be split into two main modules. The first module uses IBM Bluemix to collect data from an IoT sensor source and inject it into a Db2 database in the cloud. The second module uses R statistical functions, written in an R Spark Jupyter Notebook in IBM Data Science Experience, to read this data from Db2 and compute the statistics that determine whether a change point has occurred.
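
The two-module split and the idea of rerunning an experiment with different parameters can be sketched as follows. This is an illustration under stated assumptions: an in-memory SQLite database stands in for Db2, Python stands in for the R notebook, and the table layout, configuration keys, and the scaled mean-shift metric are all hypothetical.

```python
import sqlite3
import statistics

# Hypothetical configuration; these key names are not the journey's actual settings.
CONFIG = {
    "sensor_id": "sensor-01",
    "previous_window": 6,
    "current_window": 3,
    "mean_shift_threshold": 3.0,
}

def ingest(conn, rows):
    """Module 1 stand-in: store (sensor_id, ts, value) rows in a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (sensor_id TEXT, ts INTEGER, value REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()

def analyze(conn, config):
    """Module 2 stand-in: read one sensor's series and compare two windows."""
    cursor = conn.execute(
        "SELECT value FROM readings WHERE sensor_id = ? ORDER BY ts",
        (config["sensor_id"],),
    )
    values = [row[0] for row in cursor]
    prev_n, curr_n = config["previous_window"], config["current_window"]
    previous = values[-(prev_n + curr_n):-curr_n]
    current = values[-curr_n:]
    shift = abs(statistics.fmean(current) - statistics.fmean(previous))
    scaled_shift = shift / statistics.stdev(previous)
    return {
        "scaled_mean_shift": scaled_shift,
        "change_point": scaled_shift > config["mean_shift_threshold"],
    }

conn = sqlite3.connect(":memory:")
sample = [5.0, 5.2, 4.8, 5.1, 4.9, 5.0, 5.3, 5.4, 5.2]
ingest(conn, [("sensor-01", ts, value) for ts, value in enumerate(sample)])

baseline = analyze(conn, CONFIG)
# Repeat the experiment with a more sensitive threshold; nothing else changes.
sensitive = analyze(conn, {**CONFIG, "mean_shift_threshold": 2.0})
print(baseline)
print(sensitive)
```

Because the analysis step only receives a connection and a configuration, the same stored data can be re-analyzed with different window lengths or thresholds without touching the ingestion side, and the returned metrics can be passed on to downstream applications.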

Links

  • Change Point detection: In statistical analysis, change detection or change point detection tries to identify times when the probability distribution of a stochastic process or time series changes. In general the problem concerns both detecting whether or not a change has occurred, or whether several changes might have occurred, and identifying the times of any such changes.

  • Time Series: A time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.

  • Introduction to Statistical Change point: Changepoint analysis for time series is an increasingly important aspect of statistics. Simply put, a changepoint is an instance in time where the statistical properties before and after this time point differ. With potential changes naturally occurring in data and many statistical methods assuming a "no change" setup, changepoint analysis is important in both applied and theoretical statistics.

  • A Survey of Methods for Time Series Change Point Detection: Change points are abrupt variations in time series data. Such abrupt changes may represent transitions that occur between states. Detection of change points is useful in modelling and prediction of time series and is found in application areas such as medical condition monitoring, climate change detection, speech and image analysis, and human activity analysis. This survey article enumerates, categorizes, and compares many of the methods that have been proposed to detect change points in time series.
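
The first link above notes that the problem covers both detecting whether a change occurred and identifying when it occurred. As a rough illustration of the second part, the sketch below scans every candidate split of a series and picks the one that minimizes the squared error around two separate means. This least-squares scan is a generic textbook technique, shown in Python for illustration; it is not taken from the journey's R code, and the function name and sample values are assumptions.

```python
import statistics

def locate_mean_change(values, min_segment=2):
    """Return the index where splitting the series best explains a single shift in mean."""

    def squared_error(segment):
        mean = statistics.fmean(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_index, best_cost = None, float("inf")
    # Try every split that leaves at least min_segment readings on each side.
    for k in range(min_segment, len(values) - min_segment + 1):
        cost = squared_error(values[:k]) + squared_error(values[k:])
        if cost < best_cost:
            best_index, best_cost = k, cost
    return best_index

# Hand-made series whose level moves from about 1 to about 3 at index 5.
series = [1.0, 1.1, 0.9, 1.0, 1.05, 3.0, 3.1, 2.9, 3.0]
print(locate_mean_change(series))
```

The scan always returns some index, even for a series with no real change, so in practice it would be paired with a test of whether the two segments actually differ.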