The Tennessee Eastman Process (TEP) is hard to treat analytically because its variables are correlated, their distributions are non-normal, and the process dynamics are nonlinear.

<img src="https://miro.medium.com/max/700/0*OC6s0YDZcQxACUAJ.jpg" width="600" height="600"/>

# Process description

<img src="https://raw.githubusercontent.com/gmxavier/TEP-meets-LSTM/master/tep_flowsheet.png" width="900" height="900"/>


<img src="https://www.researchgate.net/profile/Liang-Ma-28/publication/335313863/figure/fig4/AS:794479560753152@1566430096256/Process-flow-diagram-of-Tennessee-Eastman-process.png" width="900" height="900"/>

### Manipulated Variables

Variable | Description
-------- | -----------
`XMV(1)`  | D Feed Flow (stream 2)            (Corrected Order)
`XMV(2)`  | E Feed Flow (stream 3)            (Corrected Order)
`XMV(3)`  | A Feed Flow (stream 1)            (Corrected Order)
`XMV(4)`  | A and C Feed Flow (stream 4)
`XMV(5)`  | Compressor Recycle Valve
`XMV(6)`  | Purge Valve (stream 9)
`XMV(7)`  | Separator Pot Liquid Flow (stream 10)
`XMV(8)`  | Stripper Liquid Product Flow (stream 11)
`XMV(9)`  | Stripper Steam Valve
`XMV(10)` | Reactor Cooling Water Flow
`XMV(11)` | Condenser Cooling Water Flow
`XMV(12)` | Agitator Speed

### Continuous Process Measurements

Variable | Description | Unit
-------- | ----------- | ----
`XMEAS(1)`  | A Feed  (stream 1)                  | kscmh
`XMEAS(2)`  | D Feed  (stream 2)                  | kg/hr
`XMEAS(3)`  | E Feed  (stream 3)                  | kg/hr
`XMEAS(4)`  | A and C Feed  (stream 4)            | kscmh
`XMEAS(5)`  | Recycle Flow  (stream 8)            | kscmh
`XMEAS(6)`  | Reactor Feed Rate  (stream 6)       | kscmh
`XMEAS(7)`  | Reactor Pressure                    | kPa gauge
`XMEAS(8)`  | Reactor Level                       | %
`XMEAS(9)`  | Reactor Temperature                 | Deg C
`XMEAS(10)` | Purge Rate (stream 9)               | kscmh
`XMEAS(11)` | Product Sep Temp                    | Deg C
`XMEAS(12)` | Product Sep Level                   | %
`XMEAS(13)` | Prod Sep Pressure                   | kPa gauge
`XMEAS(14)` | Prod Sep Underflow (stream 10)      | m3/hr
`XMEAS(15)` | Stripper Level                      | %
`XMEAS(16)` | Stripper Pressure                   | kPa gauge
`XMEAS(17)` | Stripper Underflow (stream 11)      | m3/hr
`XMEAS(18)` | Stripper Temperature                | Deg C
`XMEAS(19)` | Stripper Steam Flow                 | kg/hr
`XMEAS(20)` | Compressor Work                     | kW
`XMEAS(21)` | Reactor Cooling Water Outlet Temp   | Deg C
`XMEAS(22)` | Separator Cooling Water Outlet Temp | Deg C

### Sampled Process Measurements

- Reactor Feed Analysis (Stream 6)
  > - Sampling interval = 0.1 hr
  > - Dead Time = 0.1 hr
  > - Mole %
  
Variable | Description
-------- | -----------
`XMEAS(23)` | Component A
`XMEAS(24)` | Component B
`XMEAS(25)` | Component C
`XMEAS(26)` | Component D
`XMEAS(27)` | Component E
`XMEAS(28)` | Component F

- Purge Gas Analysis (Stream 9)
  > - Sampling interval = 0.1 hr
  > - Dead Time = 0.1 hr
  > - Mole %

Variable | Description
-------- | -----------
`XMEAS(29)` | Component A
`XMEAS(30)` | Component B
`XMEAS(31)` | Component C
`XMEAS(32)` | Component D
`XMEAS(33)` | Component E
`XMEAS(34)` | Component F
`XMEAS(35)` | Component G
`XMEAS(36)` | Component H

- Product Analysis (Stream 11)
  > - Sampling interval = 0.25 hr
  > - Dead Time = 0.25 hr
  > - Mole %

Variable | Description
-------- | -----------
`XMEAS(37)` | Component D
`XMEAS(38)` | Component E
`XMEAS(39)` | Component F
`XMEAS(40)` | Component G
`XMEAS(41)` | Component H

### Process Disturbances

Variable | Description | Type
-------- | ----------- | ----
`IDV(1)`  | A/C Feed Ratio, B Composition Constant (Stream 4)        | Step
`IDV(2)`  | B Composition, A/C Ratio Constant (Stream 4)             | Step
`IDV(3)`  | D Feed Temperature (Stream 2)                            | Step
`IDV(4)`  | Reactor Cooling Water Inlet Temperature                  | Step
`IDV(5)`  | Condenser Cooling Water Inlet Temperature                | Step
`IDV(6)`  | A Feed Loss (Stream 1)                                   | Step
`IDV(7)`  | C Header Pressure Loss - Reduced Availability (Stream 4) | Step
`IDV(8)`  | A, B, C Feed Composition (Stream 4)                      | Random Variation
`IDV(9)`  | D Feed Temperature (Stream 2)                            | Random Variation
`IDV(10)` | C Feed Temperature (Stream 4)                            | Random Variation
`IDV(11)` | Reactor Cooling Water Inlet Temperature                  | Random Variation
`IDV(12)` | Condenser Cooling Water Inlet Temperature                | Random Variation
`IDV(13)` | Reaction Kinetics                                        | Slow Drift
`IDV(14)` | Reactor Cooling Water Valve                              | Sticking
`IDV(15)` | Condenser Cooling Water Valve                            | Sticking
`IDV(16)` | Unknown                                                  | Unknown
`IDV(17)` | Unknown                                                  | Unknown
`IDV(18)` | Unknown                                                  | Unknown
`IDV(19)` | Unknown                                                  | Unknown
`IDV(20)` | Unknown                                                  | Unknown
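
When code needs the disturbance type behind a fault label, the table above can be encoded as a small lookup. The sketch below is illustrative only: the names `FAULT_TYPE` and `fault_type` are not part of the dataset or of any library, and it assumes that fault label n corresponds to `IDV(n)`.

```python
# Disturbance type per fault number, transcribed from the IDV table.
# Assumption: fault label n in the data corresponds to IDV(n); 0 means no fault.
FAULT_TYPE = {0: "normal"}
FAULT_TYPE.update({n: "step" for n in range(1, 8)})
FAULT_TYPE.update({n: "random variation" for n in range(8, 13)})
FAULT_TYPE[13] = "slow drift"
FAULT_TYPE.update({n: "sticking" for n in (14, 15)})
FAULT_TYPE.update({n: "unknown" for n in range(16, 21)})


def fault_type(fault_number):
    """Return the disturbance type for a fault label (accepts ints or floats)."""
    return FAULT_TYPE[int(fault_number)]
```

A lookup like this makes it easy to report detection results per disturbance type rather than per individual fault.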

### Tennessee-Eastman Dataset – Summary of Key Concepts

#### 1. **Target Variable**

* **`faultNumber`** is the target.
* Values:

  * `0`: normal operation (no fault)
  * `1–20`: specific fault types

#### 2. **File Types**

* Files with only `faultNumber = 0`:

  * Simulate fault-free operation
  * Used for training (e.g., PCA baseline) or monitoring control limits
* Files with `faultNumber` transitioning from `0` to a fault:

  * Simulate fault onset and progression
  * Used for detection and classification tasks

#### 3. **`simulationRun` Column**

* Each unique `simulationRun` is one full time series (500 samples per run in the training files, 960 in the testing files)
* Represents an independent simulation instance
* Used to:

  * Separate train/test datasets
  * Maintain time dependencies within each run

#### 4. **What Varies Across `simulationRun`**

* Only the **random seed** is different
* All noise and disturbances use **fixed distributions** (e.g., Gaussian)
* Changing `simulationRun` creates different noise realizations from the same distribution

#### 5. **Use Cases**

* Unsupervised models: train on normal (`faultNumber == 0`) runs
* Supervised models: train on labeled faulty runs
* Evaluation: split by `simulationRun` to ensure independent samples and avoid leakage
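
Splitting by `simulationRun` rather than by row can be sketched as follows. The helper name `split_by_run`, the 20% test fraction, and the toy frame are assumptions made for illustration; only the column names come from the dataset.

```python
import numpy as np
import pandas as pd


def split_by_run(df, test_fraction=0.2, seed=0):
    """Split a TEP-style frame so that no simulationRun appears in both parts."""
    runs = np.sort(df["simulationRun"].unique())
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(runs)
    n_test = max(1, int(round(test_fraction * len(runs))))
    is_test = df["simulationRun"].isin(shuffled[:n_test])
    return df[~is_test], df[is_test]


# Synthetic stand-in for the real data: 10 runs of 5 samples each.
toy = pd.DataFrame({
    "faultNumber": 0,
    "simulationRun": np.repeat(np.arange(1, 11), 5),
    "sample": np.tile(np.arange(1, 6), 10),
    "xmeas_1": np.random.default_rng(1).normal(size=50),
})
train_part, test_part = split_by_run(toy, test_fraction=0.2)
```

Because whole runs are assigned to one side, every time series stays intact and no run leaks between the two parts.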



#### 6. **Unsupervised Model Example**

This example uses PCA as an encoder-decoder and detects process anomalies through the Squared Prediction Error (SPE), that is, the reconstruction error of each sample. The general idea and the step-by-step procedure follow.

---

### **Goal**

Train PCA on fault-free behavior (`faultNumber == 0`) so it captures the "normal space" of process variability. Any significant deviation from this space in future data (from faulty runs) results in high reconstruction error (SPE), signaling an anomaly.

---

### **Step-by-Step Procedure**

#### **Step 1: Collect and Prepare Normal Data**

* **Filter data**: Keep only rows where `faultNumber == 0`.
* **Group by**: `simulationRun` to respect time order.
* **Drop columns**: Remove `faultNumber`, `simulationRun`, and any timestamps.
* **Concatenate** all normal runs into a single dataset.
* **Standardize**: Apply mean-centering and scaling (e.g., z-score).
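
A minimal sketch of this preparation step, assuming the three bookkeeping columns are named `faultNumber`, `simulationRun`, and `sample` as in the loaded files. The helper `prepare_normal` and the toy frame are made up for illustration.

```python
import numpy as np
import pandas as pd

META_COLS = ["faultNumber", "simulationRun", "sample"]


def prepare_normal(df):
    """Return standardized fault-free data plus the mean and std used to scale it."""
    normal = df[df["faultNumber"] == 0].sort_values(["simulationRun", "sample"])
    X = normal.drop(columns=META_COLS).to_numpy(dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    return (X - mean) / std, mean, std


# Synthetic stand-in: two normal runs and one faulty run, two variables.
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "faultNumber": np.repeat([0, 0, 1], 20),
    "simulationRun": np.repeat([1, 2, 1], 20),
    "sample": np.tile(np.arange(1, 21), 3),
    "xmeas_1": rng.normal(0.25, 0.03, 60),
    "xmeas_2": rng.normal(3660.0, 30.0, 60),
})
X_norm, mu, sigma = prepare_normal(toy)
```

The returned `mu` and `sigma` should be kept, since new data has to be scaled with the training statistics rather than its own.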

#### **Step 2: Train the PCA Model**

* **Fit PCA**: On the standardized normal data only.
* **Retain components**: Enough to capture \~95–99% of variance.

This gives:

* A projection matrix $P$
* Principal scores $T = XP$
* Reconstruction: $\hat{X} = TP^T$
* Reconstruction error (SPE), computed per sample (row $x_i$ of $X$): $\text{SPE}_i = \| x_i - \hat{x}_i \|^2$
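
The quantities above can be computed with plain NumPy. This is a sketch under the assumption that $X$ is already centered and scaled; the function names `fit_pca` and `spe` and the synthetic data are illustrative, and PCA is obtained here through an SVD rather than a dedicated library class.

```python
import numpy as np


def fit_pca(X, var_target=0.95):
    """Fit PCA on centered data X (n_samples x n_features) via SVD.

    Returns the loading matrix P (n_features x k), where k is the smallest
    number of components whose cumulative explained variance reaches var_target.
    """
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(explained), var_target)) + 1
    k = min(k, len(s))
    return Vt[:k].T


def spe(X, P):
    """Squared prediction error of each row of X under loadings P."""
    T = X @ P          # scores, T = XP
    X_hat = T @ P.T    # reconstruction, X_hat = T P^T
    return np.sum((X - X_hat) ** 2, axis=1)


# Synthetic stand-in for normal data: 5 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
W = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0, 1.0]])
X = latent @ W + 0.01 * rng.normal(size=(500, 5))
X = X - X.mean(axis=0)

P = fit_pca(X, var_target=0.95)
spe_normal = spe(X, P)
# A point orthogonal to both rows of W lies outside the "normal space".
spe_off = spe(np.array([[1.0, -1.0, 0.0, 0.0, 0.0]]), P)
```

In this toy case two components are retained, the SPE of the normal samples stays near the noise level, and the off-subspace point has an SPE close to its squared norm.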

#### **Step 3: Establish Control Limits**

* Calculate SPE for the training (normal) data.
* Fit a control limit:

  * Empirically (e.g., the 99.5th percentile of the normal SPE values), or
  * Statistically (e.g., a chi-squared approximation or a kernel density estimate).

This threshold distinguishes between **expected variation** and **anomalous deviation**.
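
The empirical variant is a one-liner. In the sketch below the SPE values are drawn from a chi-squared distribution purely as a synthetic stand-in for training SPEs, and `empirical_limit` is a hypothetical helper name.

```python
import numpy as np


def empirical_limit(spe_normal, percentile=99.5):
    """Control limit as an empirical percentile of the normal-operation SPE."""
    return float(np.percentile(spe_normal, percentile))


# Synthetic stand-in for SPE values computed on fault-free training data.
rng = np.random.default_rng(0)
spe_normal = rng.chisquare(df=3, size=10_000)

limit = empirical_limit(spe_normal, percentile=99.5)
# Fraction of normal samples that would raise an alarm at this limit.
false_alarm_rate = float(np.mean(spe_normal > limit))
```

By construction about 0.5% of the normal samples exceed the limit, so the percentile directly sets the expected false alarm rate on data that resembles the training set.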

#### **Step 4: Apply to New (Possibly Faulty) Data**

* Standardize new samples using the training mean and std.
* Project using PCA model.
* Reconstruct and compute SPE.
* Compare SPE to the control limit:

  * **SPE below threshold** → likely normal
  * **SPE above threshold** → potential anomaly
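
Put together, scoring new samples might look like the sketch below. Everything in it is a toy assumption: three variables that follow a fixed linear pattern, a single retained component, and the helper name `score_new`. What carries over to real data is that the training mean, std, loadings, and limit are reused unchanged.

```python
import numpy as np


def score_new(X_new, mean, std, P, limit):
    """Standardize with training statistics, reconstruct, and flag high SPE."""
    Z = (X_new - mean) / std
    residual = Z - (Z @ P) @ P.T
    spe_new = np.sum(residual**2, axis=1)
    return spe_new, spe_new > limit


# Toy training data: x2 = 2 * x1 and x3 = -x1 around fixed offsets, plus noise.
rng = np.random.default_rng(0)
t = rng.normal(size=(2000, 1))
offsets = np.array([10.0, 50.0, 5.0])
X_train = np.hstack([t, 2 * t, -t]) + 0.05 * rng.normal(size=(2000, 3)) + offsets

mean, std = X_train.mean(axis=0), X_train.std(axis=0, ddof=1)
Z_train = (X_train - mean) / std
P = np.linalg.svd(Z_train, full_matrices=False)[2][:1].T  # keep one component
spe_train = np.sum((Z_train - (Z_train @ P) @ P.T) ** 2, axis=1)
limit = np.percentile(spe_train, 99.5)

x_ok = np.array([[11.0, 52.0, 4.0]])   # follows the learned pattern
x_bad = np.array([[11.0, 48.0, 4.0]])  # second variable moves the wrong way
spe_ok, flags_ok = score_new(x_ok, mean, std, P, limit)
spe_bad, flags_bad = score_new(x_bad, mean, std, P, limit)
```

Each value in `x_bad` is within the range seen in training; it is flagged only because it breaks the relationship between the variables, which is what SPE is meant to catch.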

#### **Step 5: Postprocessing**

* Plot SPE over time to observe fault onset points.
* Optionally use smoothing or alarm delay mechanisms to reduce false positives.
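
One simple alarm delay mechanism is to require several consecutive exceedances before raising an alarm. The sketch below assumes a window of three samples; the function name `debounce_alarms` and the example values are illustrative.

```python
import numpy as np


def debounce_alarms(exceeds, n_consecutive=3):
    """Alarm only at samples that complete a run of n_consecutive exceedances."""
    exceeds = np.asarray(exceeds, dtype=bool)
    alarms = np.zeros_like(exceeds)
    run = 0
    for i, flag in enumerate(exceeds):
        run = run + 1 if flag else 0  # length of the current exceedance run
        alarms[i] = run >= n_consecutive
    return alarms


# Made-up SPE trace: one isolated spike, then a sustained excursion.
spe_trace = np.array([0.1, 5.0, 0.2, 5.0, 5.0, 5.0, 5.0, 0.1])
limit = 1.0
alarms = debounce_alarms(spe_trace > limit, n_consecutive=3)
first_alarm = int(np.argmax(alarms))
```

The isolated spike at index 1 is ignored, and the alarm is raised at index 5, two samples after the sustained excursion starts. The window length trades false alarms against detection delay.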



### Assumptions

* Data is always **mean-centered** before PCA, using the mean of the training (normal) data.
* Faults create changes in process variable relationships that PCA cannot reconstruct well.




## Faults explained

The dataset labels **20 Tennessee Eastman fault types** (`faultNumber` 1–20). Each fault corresponds to the disturbance `IDV(n)` with the same number in the Process Disturbances table above. The faults are grouped below by disturbance type:


### **I. Step Changes (Faults 1–7)**

These faults represent a **sudden shift** in a feed or utility condition.

| Fault | Description                                                                                             |
| ----- | ------------------------------------------------------------------------------------------------------- |
| **1** | A/C feed ratio in stream 4, with B composition held constant. Shifts the balance of reactants A and C.  |
| **2** | B composition in stream 4, with the A/C ratio held constant. Changes the amount of inert B in the feed. |
| **3** | D feed temperature (stream 2). Commonly reported as hard to detect.                                     |
| **4** | Reactor cooling water inlet temperature. Changes heat removal from the reactor.                         |
| **5** | Condenser cooling water inlet temperature. Similar to 4, but affects the condenser.                     |
| **6** | A feed loss (stream 1). Cuts the flow of A and disrupts stoichiometry.                                  |
| **7** | C header pressure loss (stream 4). Reduces the availability of the A and C feed.                        |

---

### **II. Random Variations (Faults 8–12)**

These inject **stochastic variability** into a feed or utility condition.

| Fault  | Description                                                                          |
| ------ | ------------------------------------------------------------------------------------ |
| **8**  | A, B, C feed composition (stream 4) varies randomly.                                 |
| **9**  | D feed temperature (stream 2) varies randomly. Commonly reported as hard to detect.  |
| **10** | C feed temperature (stream 4) varies randomly.                                       |
| **11** | Reactor cooling water inlet temperature varies randomly. Random counterpart of 4.    |
| **12** | Condenser cooling water inlet temperature varies randomly. Random counterpart of 5.  |

---

### **III. Slow Drift (Fault 13)**

This simulates a **gradual change** in the process itself.

| Fault  | Description                                                              |
| ------ | ------------------------------------------------------------------------ |
| **13** | Reaction kinetics drift slowly. The effect builds up over a long period. |

---

### **IV. Valve Sticking (Faults 14–15)**

These simulate **equipment faults** rather than feed or utility changes.

| Fault  | Description                                                                              |
| ------ | ---------------------------------------------------------------------------------------- |
| **14** | Reactor cooling water valve sticks. Reactor cooling can no longer be regulated smoothly. |
| **15** | Condenser cooling water valve sticks. Commonly reported as hard to detect.               |

---

### **V. Unknown (Faults 16–20)**

| Fault     | Description                                                              |
| --------- | ------------------------------------------------------------------------ |
| **16–20** | Not specified in the disturbance table. Treat these as black-box faults. |

---

This classification helps you:

* Understand which faults are **abrupt vs gradual**
* Distinguish **process disturbances** from **valve (actuator) faults**
* Prepare **different detection methods** per fault category

---

The table below lists each fault together with the dataset columns most directly tied to the disturbed stream or unit, according to the variable tables above: measured variables (`xmeas_x`) and manipulated variables (`xmv_x`). Treat these as starting points for inspection rather than a definitive list. Closed-loop control spreads each disturbance across many variables, so the strongest responses should be checked on the data.

| Fault No | Description                                       | Related Variables (xmeas\_x or xmv\_x)                                                                      |
| -------- | ------------------------------------------------- | ----------------------------------------------------------------------------------------------------------- |
| 1        | A/C feed ratio step change (stream 4)             | `xmeas_4` (A and C feed), `xmeas_23` and `xmeas_25` (A and C in reactor feed)                               |
| 2        | B composition step change (stream 4)              | `xmeas_24` (B in reactor feed), `xmeas_30` (B in purge)                                                     |
| 3        | D feed temperature step change (stream 2)         | No direct measurement; `xmeas_9` (reactor temperature) is the nearest downstream temperature                |
| 4        | Reactor cooling water inlet temperature step      | `xmeas_9` (reactor temperature), `xmeas_21` (reactor cooling water outlet temp), `xmv_10`                   |
| 5        | Condenser cooling water inlet temperature step    | `xmeas_11` (product separator temp), `xmeas_22` (separator cooling water outlet temp), `xmv_11`             |
| 6        | A feed loss (stream 1)                            | `xmeas_1` (A feed), `xmv_3` (A feed flow)                                                                   |
| 7        | C header pressure loss (stream 4)                 | `xmeas_4` (A and C feed), `xmv_4` (A and C feed flow)                                                       |
| 8        | Random variation in A, B, C feed composition      | `xmeas_23` to `xmeas_25` (A, B, C in reactor feed)                                                          |
| 9        | Random variation in D feed temperature            | Same as fault 3                                                                                             |
| 10       | Random variation in C feed temperature (stream 4) | No direct measurement; stream 4 enters the stripper, so `xmeas_18` (stripper temperature) is a natural check |
| 11       | Random variation in reactor cooling water inlet temperature   | Same as fault 4                                                                                 |
| 12       | Random variation in condenser cooling water inlet temperature | Same as fault 5                                                                                 |
| 13       | Slow drift in reaction kinetics                   | No single variable; reactor conditions (`xmeas_7` to `xmeas_9`) and the composition measurements            |
| 14       | Reactor cooling water valve sticking              | `xmv_10` (reactor cooling water flow), `xmeas_9`, `xmeas_21`                                                |
| 15       | Condenser cooling water valve sticking            | `xmv_11` (condenser cooling water flow), `xmeas_22`                                                         |
| 16–20    | Unknown                                           | Not documented                                                                                              |

---

### Notes:

* `xmeas_x`: measured variables (process sensor outputs), indexed by position in the dataset (e.g., `xmeas_3` is the third measured variable).
* `xmv_x`: manipulated variables (process inputs controlled by valves, feed rates, etc.).
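* Index numbers may vary slightly depending on dataset version. Commonly used TEP datasets have 41 measured variables and 11 manipulated variables (`xmv_1` to `xmv_11`); the agitator speed `XMV(12)` does not appear as a column in the data loaded below.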


In [None]:
# Map each process variable name to a human-readable description
X_dict = {
    "XMEAS_1": "A_feed_stream",
    "XMEAS_2": "D_feed_stream",
    "XMEAS_3": "E_feed_stream",
    "XMEAS_4": "Total_fresh_feed_stripper",
    "XMEAS_5": "Recycle_flow_into_rxtr",
    "XMEAS_6": "Reactor_feed_rate",
    "XMEAS_7": "Reactor_pressure",
    "XMEAS_8": "Reactor_level",
    "XMEAS_9": "Reactor_temp",
    "XMEAS_10": "Purge_rate",
    "XMEAS_11": "Separator_temp",
    "XMEAS_12": "Separator_level",
    "XMEAS_13": "Separator_pressure",
    "XMEAS_14": "Separator_underflow",
    "XMEAS_15": "Stripper_level",
    "XMEAS_16": "Stripper_pressure",
    "XMEAS_17": "Stripper_underflow",
    "XMEAS_18": "Stripper_temperature",
    "XMEAS_19": "Stripper_steam_flow",
    "XMEAS_20": "Compressor_work",
    "XMEAS_21": "Reactor_cooling_water_outlet_temp",
    "XMEAS_22": "Condenser_cooling_water_outlet_temp",
    "XMEAS_23": "Composition_of_A_rxtr_feed",
    "XMEAS_24": "Composition_of_B_rxtr_feed",
    "XMEAS_25": "Composition_of_C_rxtr_feed",
    "XMEAS_26": "Composition_of_D_rxtr_feed",
    "XMEAS_27": "Composition_of_E_rxtr_feed",
    "XMEAS_28": "Composition_of_F_rxtr_feed",
    "XMEAS_29": "Composition_of_A_purge",
    "XMEAS_30": "Composition_of_B_purge",
    "XMEAS_31": "Composition_of_C_purge",
    "XMEAS_32": "Composition_of_D_purge",
    "XMEAS_33": "Composition_of_E_purge",
    "XMEAS_34": "Composition_of_F_purge",
    "XMEAS_35": "Composition_of_G_purge",
    "XMEAS_36": "Composition_of_H_purge",
    "XMEAS_37": "Composition_of_D_product",
    "XMEAS_38": "Composition_of_E_product",
    "XMEAS_39": "Composition_of_F_product",
    "XMEAS_40": "Composition_of_G_product",
    "XMEAS_41": "Composition_of_H_product",
    "XMV_1": "D_feed_flow_valve",
    "XMV_2": "E_feed_flow_valve",
    "XMV_3": "A_feed_flow_valve",
    "XMV_4": "Total_feed_flow_stripper_valve",
    "XMV_5": "Compressor_recycle_valve",
    "XMV_6": "Purge_valve",
    "XMV_7": "Separator_pot_liquid_flow_valve",
    "XMV_8": "Stripper_liquid_product_flow_valve",
    "XMV_9": "Stripper_steam_valve",
    "XMV_10": "Reactor_cooling_water_flow_valve",
    "XMV_11": "Condenser_cooling_water_flow_valve",
    "XMV_12": "Agitator_speed",
}
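
The keys of `X_dict` are upper case (`XMEAS_1`), while the columns of the loaded data are lower case (`xmeas_1`). A minimal sketch of renaming columns under that assumption, using a two-entry excerpt of the dictionary and a made-up frame:

```python
import pandas as pd

# Two-entry excerpt of the mapping, for illustration only.
X_dict_excerpt = {
    "XMEAS_1": "A_feed_stream",
    "XMV_10": "Reactor_cooling_water_flow_valve",
}

# Made-up frame with lower-case column names.
toy = pd.DataFrame({
    "sample": [1, 2],
    "xmeas_1": [0.25, 0.26],
    "xmv_10": [42.1, 40.6],
})

# Lower-case the keys so they match the column names, then rename.
rename_map = {key.lower(): label for key, label in X_dict_excerpt.items()}
readable = toy.rename(columns=rename_map)
```

Columns without an entry in the mapping, such as `sample`, are left unchanged by `rename`.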

In [None]:
import pyreadr
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt


# Paths to the RData files: fault-free and faulty, training and testing
train_normal_path = 'data/TEP_FaultFree_Training.RData'
train_faulty_path = 'data/TEP_Faulty_Training.RData'

test_normal_path = 'data/TEP_FaultFree_Testing.RData'
test_faulty_path = 'data/TEP_Faulty_Testing.RData'

# pyreadr.read_r returns a dict-like object keyed by the R object name
train_normal_complete = pyreadr.read_r(train_normal_path)['fault_free_training']
train_faulty_complete = pyreadr.read_r(train_faulty_path)['faulty_training']

test_normal_complete = pyreadr.read_r(test_normal_path)['fault_free_testing']
# The faulty testing set is not loaded here
#test_faulty_complete = pyreadr.read_r(test_faulty_path)['faulty_testing']

In [None]:
train_faulty_complete.head()

Unnamed: 0,faultNumber,simulationRun,sample,xmeas_1,xmeas_2,xmeas_3,xmeas_4,xmeas_5,xmeas_6,xmeas_7,...,xmv_2,xmv_3,xmv_4,xmv_5,xmv_6,xmv_7,xmv_8,xmv_9,xmv_10,xmv_11
0,0,1.0,1,0.25171,3672.4,4466.3,9.5122,27.057,42.473,2705.6,...,54.494,24.527,59.71,22.357,40.149,40.074,47.955,47.3,42.1,15.345
1,0,1.0,2,0.25234,3642.2,4568.7,9.4145,26.999,42.586,2705.2,...,53.269,24.465,60.466,22.413,39.956,36.651,45.038,47.502,40.553,16.063
2,0,1.0,3,0.2484,3643.1,4507.5,9.2901,26.927,42.278,2703.5,...,54.0,24.86,60.642,22.199,40.074,41.868,44.553,47.479,41.341,20.452
3,0,1.0,4,0.25153,3628.3,4519.3,9.3347,26.999,42.33,2703.9,...,53.86,24.553,61.908,21.981,40.141,40.066,48.048,47.44,40.78,17.123
4,0,1.0,5,0.21763,3655.8,4571.0,9.3087,26.901,42.402,2707.7,...,53.307,21.775,61.891,22.412,37.696,38.295,44.678,47.53,41.089,18.681


In [None]:
# Keep only the first simulation run and drop the first three bookkeeping columns
# (faultNumber, simulationRun, sample), leaving the process variables.
df_train = train_normal_complete[train_normal_complete.simulationRun == 1].iloc[:, 3:]

# One subplot per variable: a 13 x 4 grid gives 52 axes.
fig, axes = plt.subplots(13, 4, figsize=(30, 50))
# axes.ravel() flattens the 2D grid of axes into a 1D array,
# so each column can be paired with its own subplot.
for axis, column in zip(axes.ravel(), df_train.columns):
    df_train[column].plot(ax=axis)
    axis.legend()