# Exploratory Data Analysis + Basic Modeling

### Context
The data come from sensors on a multistage chemical process and cover temperatures, chemical flow rates, additive ratios, pressures, and blending ratios at various stages of the process. In a real situation, we would need to think about how molecules travel through the process and at what time they experienced a given process setting. This data set has already been preprocessed to "flatten" those time series elements, meaning residence time effects have already been accounted for. The data is real production data and has some issues that we will have to explore and account for.
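To make the "flattening" idea concrete, here is a minimal sketch of aligning sensor readings by residence time. It is not the preprocessing that was actually applied to this data set: the column names `T_stage_1` and `T_stage_2`, the values, and the residence times are all made up for illustration.

```python
import pandas as pd

# Hypothetical one-minute sensor log for two stages (made-up values).
idx = pd.date_range("2017-01-01 00:00", periods=6, freq="min")
raw = pd.DataFrame(
    {
        "T_stage_1": [200.0, 201.0, 202.0, 203.0, 204.0, 205.0],
        "T_stage_2": [180.0, 181.0, 182.0, 183.0, 184.0, 185.0],
    },
    index=idx,
)

# Assumed residence times: minutes between each stage and the quality sample point.
residence_min = {"T_stage_1": 3, "T_stage_2": 1}

# Shift each sensor forward by its lag so that the row at time t holds the
# setting that the material sampled at time t experienced in that stage.
aligned = pd.DataFrame(
    {col: raw[col].shift(lag) for col, lag in residence_min.items()}
).dropna()

print(aligned)
```

After the shift, each row can be treated as an independent observation, which is presumably why the flattened data set can be handled as an ordinary table.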

A simple diagram of our process:

![Process Diagram](https://i.imgur.com/4lPQtRn.jpg)

### Modeling Objective
Predict the parameter "Quality" given the sensor data.
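In pandas terms, this objective amounts to separating the `Quality` column (the target) from the remaining columns (the candidate predictors). The sketch below uses two rows and three columns copied from the `data.head()` output of this notebook as a stand-in for the full frame. The names `sample`, `X`, and `y` are illustrative only.

```python
import pandas as pd

# Two rows and three columns copied from the data preview;
# the real frame has 23 columns.
sample = pd.DataFrame(
    {
        "Main_Mass_Flow": [17947.958984, 17942.625],
        "Blending": [24.0, 24.0],
        "Quality": [45.075, 44.6825],
    }
)

y = sample["Quality"]                 # target
X = sample.drop(columns=["Quality"])  # candidate predictors

print(X.shape, y.shape)
```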

## Load libraries and data

In [1]:
import time
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os
%matplotlib inline

In [2]:
data = pd.read_csv("../../datasets/anonymized_SAP_data.csv")
data.shape

(2709, 23)

In [3]:
data.head()

| Unnamed: 0.1 | Unnamed: 0 | Date.Time | Main_Mass_Flow | Additive_1_Ratio | Additive_2_Ratio | Additive_3_Ratio | Additive_4_Ratio | Additive_5_Ratio | Additive_6_Ratio | Flow_Gas_Ratio | ... | T_Zone_3 | T_Zone_4 | T_Zone_5 | T_Zone_6 | T_Zone_7 | T_Zone_8 | T_Zone_9 | T_Zone_10 | Blending | Quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1543100 | 2017-01-01 02:42:00 | 17947.958984 | 0.000626 | 0.053146 | 0.000876 | 0.001 | 0.00091 | 0.00525 | 2.53477 | ... | 197.865295 | 175.733353 | 177.881012 | 180.187149 | 175.28804 | 181.461487 | 184.349884 | 178.632538 | 24.0 | 45.075 |
| 1 | 1601100 | 2017-01-01 03:40:00 | 17942.625 | 0.000626 | 0.053146 | 0.000876 | 0.001 | 0.00091 | 0.00525 | 2.524599 | ... | 198.242508 | 175.395889 | 176.09288 | 178.021576 | 173.222076 | 180.226273 | 183.61171 | 177.633362 | 24.0 | 44.6825 |
| 2 | 1670100 | 2017-01-01 04:49:00 | 17955.152344 | 0.000626 | 0.053146 | 0.000876 | 0.001 | 0.00091 | 0.00525 | 2.528997 | ... | 197.993622 | 175.388733 | 176.331268 | 178.255264 | 173.343567 | 179.927109 | 183.406296 | 177.319366 | 24.0 | 44.29 |
| 3 | 1709100 | 2017-01-01 05:28:00 | 17965.117188 | 0.000626 | 0.053146 | 0.000876 | 0.001 | 0.00091 | 0.00525 | 2.5197 | ... | 198.104874 | 175.564926 | 175.403198 | 177.945908 | 172.33783 | 178.773697 | 182.526581 | 176.517807 | 24.0 | 44.4275 |
| 4 | 1768100 | 2017-01-01 06:27:00 | 17949.132812 | 0.000626 | 0.053146 | 0.000876 | 0.001 | 0.00091 | 0.00525 | 2.525926 | ... | 197.611877 | 174.955109 | 174.076019 | 176.105621 | 170.777313 | 177.310883 | 181.188675 | 175.84726 | 24.0 | 44.565 |
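The preview suggests two cleanup steps worth considering before any analysis: the `Unnamed: 0.1` and `Unnamed: 0` columns look like row indices left over from earlier saves, and `Date.Time` is read as plain text unless it is parsed. The sketch below is one possible way to handle both, under the assumption that the `Unnamed` columns carry no process information. It uses a few rows and columns copied from the preview in place of the CSV file, and the names `csv_text`, `df`, and `leftover` are illustrative.

```python
import io

import pandas as pd

# A few columns and rows copied from the preview, as a stand-in for the CSV file.
csv_text = """Unnamed: 0.1,Unnamed: 0,Date.Time,Main_Mass_Flow,Quality
0,1543100,2017-01-01 02:42:00,17947.958984,45.075
1,1601100,2017-01-01 03:40:00,17942.625,44.6825
2,1670100,2017-01-01 04:49:00,17955.152344,44.29
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date.Time"])

# Assumption: the "Unnamed: ..." columns are saved row indices, not sensors.
leftover = [c for c in df.columns if c.startswith("Unnamed")]
df = df.drop(columns=leftover).set_index("Date.Time").sort_index()

print(df.dtypes)
# Time between consecutive samples; the rows are not evenly spaced.
print(df.index.to_series().diff().dropna())
```

Indexing by timestamp makes the irregular sampling visible (58 and 69 minutes between the first three rows), which is worth keeping in mind for any later time-based split or resampling.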
