# Factory Process Regression
		

## Introduction

### Description of physical setup

- The data comes from a continuous flow process. 
- Sample rate is 1 Hz.
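
The stated 1 Hz sample rate can be checked directly against the time stamps. The snippet below is a minimal sketch on a toy frame: the column name `time_stamp` and the values are made up for illustration and may not match the real file.

```python
import pandas as pd

# Toy data: the column name "time_stamp" and these values are assumptions.
toy = pd.DataFrame({
    "time_stamp": [
        "2019-03-06 10:52:33",
        "2019-03-06 10:52:34",
        "2019-03-06 10:52:35",
        "2019-03-06 10:52:37",  # deliberate 2-second gap
    ]
})

stamps = pd.to_datetime(toy["time_stamp"])

# Differences between consecutive samples; the first diff is NaT, so drop it.
steps = stamps.diff().dropna()

# At 1 Hz the typical step should be exactly one second.
median_step = steps.median()

# Any step that is not one second marks a gap or a duplicate.
gaps = steps[steps != pd.Timedelta(seconds=1)]

print("median step:", median_step)
print("irregular steps:", len(gaps))
```

On the real data, a non-empty `gaps` would suggest missing samples that need handling before any lag-based features are built.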


- In the first stage, Machines 1, 2, and 3 operate in parallel, and feed their outputs into a step that combines the flows.
- Output from the combiner is measured in 15 locations. These measurements are the **primary measurements to predict.**


- Next, the output flows into a second stage, where Machines 4 and 5 process in series.
- Measurements are made again in the same 15 locations. These are the **secondary measurements to predict**.


- Measurements are noisy.
- Each measurement also has a target or *Setpoint* (setpoints are included in the first row of data).
- The goal is to predict the measurements (or the error versus setpoints) for as many of the 15 measurements as possible.
- Some measurements will be more predictable than others!
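
The "error versus setpoints" target mentioned above can be sketched as a simple column-wise difference. This example assumes that every `...Actual` column has a sibling column ending in `.Setpoint`; that naming, the helper `setpoint_errors`, and the toy values are assumptions for illustration, not taken from the dataset.

```python
import pandas as pd

# Toy frame standing in for the real data. Column names are assumed to
# follow a "<prefix>.Actual" / "<prefix>.Setpoint" pattern.
toy = pd.DataFrame({
    "Stage1.Output.Measurement0.U.Actual":   [12.1, 11.9, 12.4],
    "Stage1.Output.Measurement0.U.Setpoint": [12.0, 12.0, 12.0],
    "Stage1.Output.Measurement1.U.Actual":   [3.2, 3.0, 2.7],
    "Stage1.Output.Measurement1.U.Setpoint": [3.0, 3.0, 3.0],
})


def setpoint_errors(frame: pd.DataFrame) -> pd.DataFrame:
    """Return Actual - Setpoint for every column pair found in `frame`."""
    errors = {}
    for actual_col in frame.filter(regex=r"\.Actual$").columns:
        prefix = actual_col[: -len(".Actual")]
        setpoint_col = prefix + ".Setpoint"
        # Skip measurements that have no matching setpoint column.
        if setpoint_col in frame.columns:
            errors[prefix + ".Error"] = frame[actual_col] - frame[setpoint_col]
    return pd.DataFrame(errors, index=frame.index)


errors = setpoint_errors(toy)
print(errors.round(2))
```

Predicting the error instead of the raw measurement centers each target around zero, which can make targets with different setpoints easier to compare.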

### Tasks

- Prediction of the measurements after the first stage is the primary interest.
- Prediction of the measurements after the second stage is nice to have, but that data is much noisier.

### Note on variable naming conventions

- *~.C.Setpoint* -- Setpoint for Controlled variable
- *~.C.Actual* -- Actual value of Controlled variable
- *~.U.Actual* -- Actual value of Uncontrolled variable
- *Others* -- Environmental or raw material variables, States / events, etc.
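
The suffix convention above lends itself to a small helper that sorts column names into these four groups. This is a sketch only: `classify_column` is a hypothetical helper, and apart from `Stage1.Output.Measurement0.U.Actual` the example names are invented for illustration.

```python
def classify_column(name: str) -> str:
    """Map a column name to its group based on the suffix convention."""
    if name.endswith(".C.Setpoint"):
        return "controlled setpoint"
    if name.endswith(".C.Actual"):
        return "controlled actual"
    if name.endswith(".U.Actual"):
        return "uncontrolled actual"
    # Environmental or raw material variables, states, events, etc.
    return "other"


# Example names; all but the Stage1 output column are invented.
examples = [
    "Machine1.SomeTemperature.C.Setpoint",
    "Machine1.SomeTemperature.C.Actual",
    "Stage1.Output.Measurement0.U.Actual",
    "AmbientConditions.SomeHumidity",
]

labels = {name: classify_column(name) for name in examples}
for name, label in labels.items():
    print(f"{label:22s} {name}")
```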

### Column Description

```
Start col |   End col | Description
        0 |         0 | Time stamp
        1 |         2 | Factory ambient conditions
        3 |         6 | First stage, Machine 1, raw material properties (material going in to Machine 1)
        7 |        14 | First stage, Machine 1 process variables
       15 |        18 | First stage, Machine 2, raw material properties (material going in to Machine 2)
       19 |        26 | First stage, Machine 2 process variables
       27 |        30 | First stage, Machine 3, raw material properties (material going in to Machine 3)
       31 |        38 | First stage, Machine 3 process variables
       39 |        41 | Combiner stage process parameters. Here we combine the outputs from Machines 1, 2, and 3.
       42 |        71 | PRIMARY OUTPUT TO CONTROL: Measurements of 15 features (in mm), along with setpoint or target for each
       72 |        78 | Second stage, Machine 4 process variables
       79 |        85 | Second stage, Machine 5 process variables
       86 |       115 | SECONDARY OUTPUT TO CONTROL: Measurements of 15 features (in mm), along with setpoint or target for each
```
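
The table above can be turned into positional slices with `DataFrame.iloc`. The sketch below encodes the start and end columns from the table. The group keys (`stage1_output` and so on) and the helper `select_group` are hypothetical names, and the toy frame only mimics the 116-column layout.

```python
import numpy as np
import pandas as pd

# (start, end) column positions, both inclusive, copied from the table.
COLUMN_GROUPS = {
    "time_stamp": (0, 0),
    "ambient": (1, 2),
    "machine1_raw": (3, 6),
    "machine1_process": (7, 14),
    "machine2_raw": (15, 18),
    "machine2_process": (19, 26),
    "machine3_raw": (27, 30),
    "machine3_process": (31, 38),
    "combiner": (39, 41),
    "stage1_output": (42, 71),
    "machine4_process": (72, 78),
    "machine5_process": (79, 85),
    "stage2_output": (86, 115),
}


def select_group(frame: pd.DataFrame, group: str) -> pd.DataFrame:
    """Return the columns of one group; iloc's end is exclusive, hence +1."""
    start, end = COLUMN_GROUPS[group]
    return frame.iloc[:, start:end + 1]


# Toy frame with the same number of columns as the described layout.
toy = pd.DataFrame(np.zeros((2, 116)), columns=[f"col{i}" for i in range(116)])

primary = select_group(toy, "stage1_output")
total_columns = sum(end - start + 1 for start, end in COLUMN_GROUPS.values())

print(primary.shape)   # 15 measurements plus 15 setpoints
print(total_columns)   # the groups should cover every column exactly once
```

Positional slicing is brittle if the column order ever changes, so name-based selection is a safer choice once the naming pattern has been confirmed.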

### Acknowledgement

The data was taken from [this project](https://www.kaggle.com/supergus/multistage-continuousflow-manufacturing-process).



## Imports & Drive Mount

In [None]:
%tensorflow_version 2.x
%matplotlib inline

import random

import numpy as np
import pandas as pd

import matplotlib
import matplotlib.pyplot as plt

from tqdm.notebook import tqdm

In [None]:
matplotlib.rcParams['figure.figsize'] = (25, 6)
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

In [None]:
from google.colab import drive
drive.mount('/content/drive/')

Mounted at /content/drive/


## Loading data

In [None]:
import pandas as pd

df = pd.read_csv('/content/drive/My Drive/ml-college/time-series-analysis/data/manufacturing/continuous_factory_process.csv')

Let's check if we can see the structure of the dataset as described...

In [None]:
for i, c in enumerate(df.columns.values):
    print(i, c)

We'll use the pandas `filter` method to get the right subset of columns:

In [None]:
df.filter(regex=r'^Stage1\.Output\.Measurement.*Actual$').columns

Index(['Stage1.Output.Measurement0.U.Actual',
       'Stage1.Output.Measurement1.U.Actual',
       'Stage1.Output.Measurement2.U.Actual',
       'Stage1.Output.Measurement3.U.Actual',
       'Stage1.Output.Measurement4.U.Actual',
       'Stage1.Output.Measurement5.U.Actual',
       'Stage1.Output.Measurement6.U.Actual',
       'Stage1.Output.Measurement7.U.Actual',
       'Stage1.Output.Measurement8.U.Actual',
       'Stage1.Output.Measurement9.U.Actual',
       'Stage1.Output.Measurement10.U.Actual',
       'Stage1.Output.Measurement11.U.Actual',
       'Stage1.Output.Measurement12.U.Actual',
       'Stage1.Output.Measurement13.U.Actual',
       'Stage1.Output.Measurement14.U.Actual'],
      dtype='object')
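
Because `filter(regex=...)` applies a regular-expression search to each column label, an unescaped `.` matches any character. A stricter variant of this selection escapes the dots and requires a numeric index. The sketch below compares the two on a toy frame; the Setpoint, Stage2, and decoy column names are assumptions for illustration.

```python
import pandas as pd

# Empty toy frame: only the column labels matter here.
toy = pd.DataFrame(columns=[
    "Stage1.Output.Measurement0.U.Actual",
    "Stage1.Output.Measurement0.U.Setpoint",  # assumed name
    "Stage1.Output.Measurement1.U.Actual",
    "Stage2.Output.Measurement0.U.Actual",    # assumed name
    "Stage1xOutput.Measurement0.U.Actual",    # decoy to show loose matching
])

# Loose pattern: each "." matches any character, so the decoy slips through.
loose = toy.filter(regex="^Stage1.Output.Measurement.*Actual$").columns.tolist()

# Strict pattern: literal dots and a numeric measurement index.
strict = r"^Stage1\.Output\.Measurement\d+\.U\.Actual$"
targets = toy.filter(regex=strict).columns.tolist()

print(len(loose), "columns matched by the loose pattern")
print(targets)
```

With the column names shown in the output above, both patterns select the same 15 columns; the stricter form only guards against accidental matches.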