Data Exploration for PI System
Projects built around PI System must be treated as data-driven projects.
The functional team is not fully equipped to test data delivery on its own. It needs help to challenge the data quality, it needs some continuous control monitoring, and it needs to accept failure and build it into its process.
Install Anaconda. TODO: change this to pure Python.
Create the environment with: conda create --name dataexploration matplotlib pandas (TODO: change this to pure Python.)
A running, reachable PI System with PI WEB-API.
PI WEB-API should be configured for basic authentication (username/password).
The conf/credentials.yml file has not been pushed, for security reasons. Instead, a conf/credentials.yaml.template file is provided: copy it, rename the copy to conf/credentials.yml, and fill in the username and password used for basic authentication.
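As a rough illustration of what basic authentication means for a client, the sketch below builds (but does not send) a request carrying an HTTP Basic Authorization header, using only the standard library. The function name build_piwebapi_request, the host pi.example.com and the credentials are hypothetical; in practice the username and password would be read from conf/credentials.yml, which this sketch does not parse.

```python
import base64
import urllib.request


def build_piwebapi_request(base_url, username, password, path=""):
    """Return a urllib Request carrying an HTTP Basic Authorization header."""
    # Basic authentication is "username:password" encoded in base64.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical values for illustration only; nothing is sent over the network.
req = build_piwebapi_request(
    "https://pi.example.com/piwebapi", "user", "secret", "assetservers"
)
print(req.full_url)
```

The request could then be passed to urllib.request.urlopen, or the same header reused with whichever HTTP client the scripts rely on.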
- PI-Web-API-Client-Python - PI Web API client for Python
- csv - read CSV files
- pandas - work with data structures and handle missing data
- numpy - perform calculations on the data
Data Exploration description
This model consists of several steps/scripts.
Get the Data from PI System
This script retrieves the data from Asset Framework and PI Data Archive. To keep the tests independent of any particular environment and to make the testing rules easy to prepare, we first insert some AFElements and PIPoints, with their appropriate data, and then run our models against them.
This script cleans and preprocesses the data from your sensor values. Before doing any other task, you have to format your file with this script. Before running it, add a header row reading "date time value", separated by spaces, as the first line of your file; otherwise the script will not work.
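To make the expected input concrete, here is a minimal sketch of reading a space-separated "date time value" file with the standard csv module. It is not the project's cleaning script: the function name read_sensor_rows, the timestamp format %Y-%m-%d %H:%M:%S and the rule "silently drop malformed rows" are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime


def read_sensor_rows(handle):
    """Parse space-separated 'date time value' rows, skipping unparseable lines."""
    reader = csv.reader(handle, delimiter=" ", skipinitialspace=True)
    header = next(reader)
    if header != ["date", "time", "value"]:
        raise ValueError(f"unexpected header: {header}")
    rows = []
    for record in reader:
        if len(record) != 3:
            continue  # drop blank or malformed lines
        try:
            # Assumed timestamp layout; adjust to match the real export.
            stamp = datetime.strptime(f"{record[0]} {record[1]}", "%Y-%m-%d %H:%M:%S")
            value = float(record[2])
        except ValueError:
            continue  # drop rows whose timestamp or value cannot be parsed
        rows.append((stamp, value))
    return rows


# Made-up sample standing in for a real sensor export.
sample = io.StringIO(
    "date time value\n"
    "2021-03-01 00:00:00 4.2\n"
    "2021-03-01 00:10:00 bad\n"
    "\n"
    "2021-03-01 00:20:00 4.4\n"
)
cleaned = read_sensor_rows(sample)
print(len(cleaned), "valid rows")
```

The header check mirrors the requirement above: without the "date time value" first row, the file is rejected instead of being misread.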
Decision Tree Model
This script uses an unsupervised decision tree model to classify your time-series values.
It generates a file named leak.csv containing the values identified as leaks.
Generate some missing data
Pass a CSV file in "date time value" format. The script identifies the sampling frequency and then writes the missing rows to a CSV file, autofill-output.csv, so that you can fill in the values and merge them.
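One possible way to identify the frequency and list the missing rows is sketched below with the standard library only. It assumes that the frequency is the most common gap between consecutive timestamps; the actual script may use a different rule, and the function names infer_step and missing_stamps are hypothetical.

```python
from collections import Counter
from datetime import datetime


def infer_step(stamps):
    """Return the most common gap between consecutive sorted timestamps."""
    gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return Counter(gaps).most_common(1)[0][0]


def missing_stamps(stamps):
    """List the timestamps absent from a series that should be regularly spaced."""
    ordered = sorted(set(stamps))  # needs at least two distinct timestamps
    step = infer_step(ordered)
    present = set(ordered)
    missing = []
    current = ordered[0] + step
    while current < ordered[-1]:
        if current not in present:
            missing.append(current)
        current += step
    return missing


# Made-up series sampled every 10 minutes, with 00:30 and 00:40 missing.
series = [
    datetime(2021, 3, 1, 0, 0),
    datetime(2021, 3, 1, 0, 10),
    datetime(2021, 3, 1, 0, 20),
    datetime(2021, 3, 1, 0, 50),
    datetime(2021, 3, 1, 1, 0),
]
gaps = missing_stamps(series)
for stamp in gaps:
    print(stamp)
```

Each timestamp returned would correspond to one row with an empty value in autofill-output.csv.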
Merge two files together
This script takes two CSV files in "date time value" format and adds the filled values to the original file at the required positions.
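The merge can be pictured as combining two timestamp-keyed series and sorting the result. The sketch below works on in-memory lists of (timestamp, value) pairs; the function name merge_filled is hypothetical, and the rule that the original value wins when both inputs contain the same timestamp is an assumption, not something the script is known to do.

```python
from datetime import datetime


def merge_filled(original, filled):
    """Combine two lists of (timestamp, value) rows, ordered by timestamp.

    Rows from `filled` land at their chronological position. Assumed rule:
    when both inputs contain the same timestamp, the original value is kept.
    """
    merged = dict(filled)
    merged.update(dict(original))  # original rows overwrite duplicates
    return sorted(merged.items())


# Made-up data: the original file lacks 00:10, which the filled file provides.
original_rows = [
    (datetime(2021, 3, 1, 0, 0), 4.2),
    (datetime(2021, 3, 1, 0, 20), 4.4),
]
filled_rows = [
    (datetime(2021, 3, 1, 0, 10), 4.3),
    (datetime(2021, 3, 1, 0, 20), 9.9),  # duplicate timestamp, ignored
]
result = merge_filled(original_rows, filled_rows)
for stamp, value in result:
    print(stamp, value)
```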
Statistics on data received
This script generates some periodic statistics on the data received. The results are put into csv files. Statistics implemented:
- Percentage of received data per hour. A low threshold and a high threshold are then calculated from the data received.
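The hourly statistic can be sketched as follows. The way the thresholds are derived is not described here, so the rule used below (mean minus or plus k population standard deviations, with the lower bound clamped at zero) is purely an assumption, as are the names hourly_reception and thresholds and the expected_per_hour parameter.

```python
from collections import Counter
from datetime import datetime, timedelta
from statistics import mean, pstdev


def hourly_reception(stamps, expected_per_hour):
    """Map each hour to the percentage of expected samples actually received."""
    counts = Counter(s.replace(minute=0, second=0, microsecond=0) for s in stamps)
    # Hours with no data at all do not appear in the result.
    return {hour: 100.0 * n / expected_per_hour for hour, n in sorted(counts.items())}


def thresholds(percentages, k=2.0):
    """Assumed rule: mean -/+ k population standard deviations, low clamped at 0."""
    centre = mean(percentages)
    spread = pstdev(percentages)
    return max(0.0, centre - k * spread), centre + k * spread


# Made-up data: a full first hour (6 samples), then only 3 samples in the second.
start = datetime(2021, 3, 1, 0, 0)
stamps = [start + timedelta(minutes=10 * i) for i in range(6)]
stamps += [start + timedelta(hours=1, minutes=10 * i) for i in range(3)]

received = hourly_reception(stamps, expected_per_hour=6)
low, high = thresholds(list(received.values()))
for hour, pct in received.items():
    print(hour, f"{pct:.1f}%")
```

An hour whose percentage falls below the low threshold or above the high one would then be a candidate for closer inspection.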