# Fault detection, given a feed forward control system
This case demonstrates fault detection in a simple feed forward control system using:
- engineering expertise,
- a labelled data set (supervised machine learning),
- an unlabelled data set (unsupervised machine learning).

This demonstration introduces some practical considerations for implementing fault detection using engineering expertise and machine learning.

## Introduction to the case

![Figure](figures/FaultDetection01a.png)

The case in the Figure consists of a light, a resistor, a ground connection and a switch. The components outside the marked area remain out of scope, although they may still influence fault detection. The evidence for the fault detection has been confined to:
- the position of the switch (S<sub>1</sub>), 
- the voltage V<sub>0</sub> before the light,
- the voltage V<sub>1</sub> between the light and the resistor, and
- the voltage V<sub>2</sub> after the resistor.

The scope of this case and the evidence may not be ideal, but this is not exceptional in observational studies of operational data. 

## Fault detection using engineering expertise
At the design stage, an engineer may use a bottom-up Failure Mode and Effect Analysis (FMEA) to identify the failure modes of each component. Suppose the engineer decides that the following failure modes apply.
 
<table>
<tr>
<th>Component <th>Failure mode <th>Code <th>Failure effect
<tr>
<td>wire to light <td>break circuit <td>F1 <td>No light when switched on
<tr>
<td>light <td>break circuit <br /> short circuit <td>F2 <br />F3 <td>No light when switched on
<tr>
<td>resistor <td>break circuit <br /> short circuit <td>F4 <br />F5 <td>No light when switched on <br />Light when switched on
<tr>
<td>wire to ground <td>break circuit <td>F6 <td>No light when switched on
</table>

Evidently, this table is arbitrary, and operating experience may reveal that this bottom-up FMEA is incomplete. In a conventional FMEA, the effects of the failure modes involve HSE, operational and non-operational impact. Here, however, the expected sensor values given *one* particular failure mode are what matter, as illustrated in the incomplete table below. [Click here](https://github.com/chrisrijsdijk/RAMS/blob/master/data/Arduino_trainingset_deploymentNormalopenLabelDesign.xlsx) for the equivalent table that will be loaded by the script.
 
<table>
<tr>
<th>Component <th>Failure mode <th>Code <th colspan=3>Switch closed S<sub>1</sub>=1 <th colspan=3>Switch open S<sub>1</sub>=0
<tr>
<td> <td> <td> <td colspan=1>V<sub>0</sub> in V <td colspan=1>V<sub>1</sub> in V <td colspan=1>V<sub>2</sub> in V <td colspan=1>V<sub>0</sub> in V <td colspan=1>V<sub>1</sub> in V <td colspan=1>V<sub>2</sub> in V
<tr>
<td> <td>HEALTHY SYSTEM <td> <td colspan=1>3.4 <td colspan=1>1.4 <td colspan=1>0 <td colspan=1>0 <td colspan=1>0 <td colspan=1>0 
<tr>
<td>wire to light <td>break circuit <td>F1 <td colspan=1>- <td colspan=1>- <td colspan=1>- <td colspan=1>- <td colspan=1>- <td colspan=1>- 
<tr>
<td>light <td>break circuit <br /> short circuit <td>F2 <br />F3 <td colspan=1>5 <br />- <td colspan=1>0 <br />- <td colspan=1>0 <br />- <td colspan=1>- <br />- <td colspan=1>- <br />- <td colspan=1>- <br />-
<tr>
<td>resistor <td>break circuit <br /> short circuit <td>F4 <br />F5 <td colspan=1>- <br />- <td colspan=1>- <br />- <td colspan=1>- <br />- <td colspan=1>- <br />- <td colspan=1>- <br />- <td colspan=1>- <br />-
<tr>
<td>wire to ground <td>break circuit <td>F6 <td colspan=1>- <td colspan=1>- <td colspan=1>- <td colspan=1>- <td colspan=1>- <td colspan=1>- 
</table>

Given a closed switch (S<sub>1</sub>=1), the design engineer may hope to detect the failure modes from the voltage measurements. Given an open switch (S<sub>1</sub>=0), however, the voltage measurements will not reveal any of the failure modes. To detect faults at an open switch, other measurements should be introduced. In this way, the engineer may improve the health monitoring system.

An FMEA may suffice for reliability engineering practice. For fault detection, however, the system states are more important. In this case, the system state follows from six binary failure modes and one binary switch. So, the number of system states $n$ is tentatively:

$ n= 2^{6} \times 2 = 128 $

However, a short circuit and a break circuit cannot occur simultaneously in the light, nor in the resistor. Therefore, the number of system states $n$ reduces to:

$ n= (2^{6} - 2^{4} - 2^{4} + 2^{2} ) \times 2 =72 $
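The two counts above can be checked by brute-force enumeration. The following is a minimal sketch, not part of the linked script; the names `FAILURE_MODES`, `EXCLUSIVE_PAIRS` and `enumerate_states` are introduced here for illustration only, and the failure mode codes are taken from the FMEA table.

```python
from itertools import product

# Failure mode codes as used in the FMEA table.
FAILURE_MODES = ["F1", "F2", "F3", "F4", "F5", "F6"]
# Pairs that cannot be active together (break circuit vs. short circuit).
EXCLUSIVE_PAIRS = [("F2", "F3"), ("F4", "F5")]


def enumerate_states():
    """Return all feasible (failure mode flags, switch position) combinations."""
    states = []
    for flags in product([0, 1], repeat=len(FAILURE_MODES)):
        active = dict(zip(FAILURE_MODES, flags))
        if any(active[a] and active[b] for a, b in EXCLUSIVE_PAIRS):
            continue  # skip physically impossible combinations
        for switch in (0, 1):
            states.append((flags, switch))
    return states


n_tentative = 2 ** len(FAILURE_MODES) * 2  # no exclusions applied
n_feasible = len(enumerate_states())       # exclusions applied
print(n_tentative, n_feasible)  # prints: 128 72
```

Enumerating the states explicitly also yields the list of states for which a signature would have to be assessed, which is where the analysis burden comes from.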

The FMEA already showed that the voltage measurements are informative about the system state, i.e. the ordered lists (V<sub>0</sub>=0, V<sub>1</sub>=0, V<sub>2</sub>=0) and (V<sub>0</sub>=5, V<sub>1</sub>=0, V<sub>2</sub>=0) map to different subsets of the possible system states. Still, the FMEA also showed that the ordered list (V<sub>0</sub>=0, V<sub>1</sub>=0, V<sub>2</sub>=0) does *not* map to a unique system state. So, the system state is *just partially* identifiable from the voltage measurements.
    
An engineer often reduces the analysis burden by using a top-down FMEA rather than the bottom-up FMEA that has been introduced here. A top-down FMEA starts from the assigned functions. Let the assigned function be 
    
*to produce light, given a closed switch*
    
Then, the FMEA simplifies to failure modes that imply a loss of the assigned function as shown below. 
 
<table>
<tr>
<th>Component <th>Failure mode <th>Code <th colspan=3>Switch closed S<sub>1</sub>=1 
<tr>
<td>  <td> <td> <td colspan=1>V<sub>0</sub> in V <td colspan=1>V<sub>1</sub> in V <td colspan=1>V<sub>2</sub> in V 
<tr>
<td colspan=3> DARK, GIVEN A CLOSED SWITCH <td colspan=1> <td colspan=1> <td colspan=1>  
<tr>
<td>wire to light <td>break circuit <td>F1 <td colspan=1>- <td colspan=1>- <td colspan=1>-  
<tr>
<td>light <td>break circuit <br /> short circuit <td>F2 <br />F3 <td colspan=1>5 <br />- <td colspan=1>0 <br />- <td colspan=1>0 <br />- 
<tr>
<td>resistor <td>break circuit <td>F4 <td colspan=1>- <td colspan=1>- <td colspan=1>-  
<tr>
<td>wire to ground <td>break circuit <td>F6 <td colspan=1>- <td colspan=1>- <td colspan=1>-  
</table>

Then, the number of relevant system states becomes

$ n= (2^{5} - 2^{3} ) =24 $

which reduces the analysis burden. These kinds of simplifications are common practice in conventional Reliability Centred Maintenance implementations to compose a proactive maintenance policy. However, the $72-24=48$ irrelevant system states may still occur in practice, and fault detection may still be required during these irrelevant but possible system states. Even in this simple case, the engineer may be required to assess the signatures of *72* system states. State space explosions are a common concern as each state requires attention.
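As a cross-check, both counts can also be written as a product of the number of states per component. The symbols $n_{all}$ and $n_{relevant}$ are introduced here for illustration only. In the bottom-up case, each wire has 2 states (healthy, break circuit), the light and the resistor each have 3 states (healthy, break circuit, short circuit) and the switch has 2 positions:

$ n_{all}= 2 \times 3 \times 3 \times 2 \times 2 = 72 $

In the top-down case, the resistor is left with 2 states because its short circuit (F5) does not imply a loss of the assigned function, and the switch is fixed at closed:

$ n_{relevant}= 2 \times 3 \times 2 \times 2 = 24 $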

In this script, the predefined signature $[V_{0,s},V_{1,s}, V_{2,s}, S_{1,s}]$ that is closest to the measured signature $[V_{0,m},V_{1,m}, V_{2,m}, S_{1,m}]$ will be assigned to the measurement. Therefore, the distance $d$ to each of the predefined signatures is calculated before the minimum is chosen:

$ min(d)= \min_{s} \sqrt{(V_{0,m}-V_{0,s})^{2}+(V_{1,m}-V_{1,s})^{2}+(V_{2,m}-V_{2,s})^{2}+(S_{1,m}-S_{1,s})^{2}} $
    
The calculation of $min(d)$ requires a choice of scaling that determines the weight of the switch position with respect to the voltages. The script does not rescale the measurements. So, the voltages range from 0 to 5 V and the switch position is just a binary 0 or 1. The predefined signature $[V_{0,s},V_{1,s}, V_{2,s}, S_{1,s}]$ that attains $min(d)$ is deemed applicable. To warn about measurements that are very different from the predefined signatures, $min(d)$ should not exceed some limit:
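The effect of the scaling choice can be illustrated with a small sketch. It uses the two healthy-system signatures from the table above. The measurement and the `switch_weight` parameter are hypothetical and are not part of the linked script, which leaves the measurements unscaled (equivalent to a weight of 1).

```python
import math

# Signatures [V0, V1, V2, S1] from the HEALTHY SYSTEM row of the table.
HEALTHY_CLOSED = (3.4, 1.4, 0.0, 1)
HEALTHY_OPEN = (0.0, 0.0, 0.0, 0)


def distance(measured, signature, switch_weight=1.0):
    """Euclidean distance with an optional weight on the switch position."""
    *v_m, s_m = measured
    *v_s, s_s = signature
    squared = sum((a - b) ** 2 for a, b in zip(v_m, v_s))
    squared += (switch_weight * (s_m - s_s)) ** 2
    return math.sqrt(squared)


# Hypothetical measurement: switch closed, but all voltages read zero.
measured = (0.0, 0.0, 0.0, 1)

for w in (1.0, 5.0):
    d_closed = distance(measured, HEALTHY_CLOSED, w)
    d_open = distance(measured, HEALTHY_OPEN, w)
    nearest = "healthy, closed" if d_closed < d_open else "healthy, open"
    print(f"weight={w}: d_closed={d_closed:.2f}, d_open={d_open:.2f}, nearest={nearest}")
```

Without rescaling, this measurement lies closest to the healthy signature at an open switch, although the switch is reported as closed. With a weight of 5 on the switch position, the healthy signature at a closed switch becomes the nearest one. So the scaling choice affects which signature is assigned.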

$ if\ (min(d)\ <\ limit): \\
\qquad "assign\ the\ label\ of\ the\ predefined\ signature\ to\ the\ measurements" \\
 else: \\
\qquad "report\ anomaly" $

This condition warns about unanticipated measurements.
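The assignment rule with its anomaly limit can be sketched as follows. This is a minimal illustration under stated assumptions, not the linked script: only the three signatures that are filled in in the table above are used, and the value of `LIMIT` as well as the names `SIGNATURES` and `classify` are hypothetical.

```python
import math

# Predefined signatures [V0, V1, V2, S1]; only the rows that are filled in
# in the table above are included.
SIGNATURES = {
    "healthy, switch closed": (3.4, 1.4, 0.0, 1),
    "healthy, switch open": (0.0, 0.0, 0.0, 0),
    "F2, switch closed": (5.0, 0.0, 0.0, 1),
}

LIMIT = 1.0  # hypothetical threshold; the source does not state a value


def classify(measured, signatures=SIGNATURES, limit=LIMIT):
    """Return (label, min distance); the label is 'anomaly' beyond the limit."""
    label, d_min = min(
        ((name, math.dist(measured, sig)) for name, sig in signatures.items()),
        key=lambda pair: pair[1],
    )
    if d_min < limit:
        return label, d_min
    return "anomaly", d_min


# Hypothetical measurements, unscaled as in the script.
print(classify((3.3, 1.5, 0.1, 1)))  # close to the healthy, closed signature
print(classify((2.0, 2.0, 2.0, 1)))  # far from every predefined signature
```

The first measurement is labelled as healthy with a closed switch, whereas the second exceeds the limit for every predefined signature and is reported as an anomaly. A tighter limit reports more anomalies, and a looser limit assigns more measurements to a predefined signature.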

In conclusion, a proactive maintenance policy is often composed from a top-down FMEA, whereas fault detection should in principle work in all possible system states, even those that are not deemed critical. Still, an FMEA may be helpful to determine the system states. Reliability engineers often deem a top-down FMEA tedious, yet an analysis of all possible system states is even more labour intensive. In this simple case, identifying the signatures of all system states is still tractable, but state space explosions are often a concern.

    
## Limit states relying on a labelled data set

At the deployment stage, an engineer may capture the evolution of the voltages and of the switch position, as shown in the time series in Figure YYY.
    


...

# [Click here to see the script](https://nbviewer.jupyter.org/github/chrisrijsdijk/RAMS/blob/master/notebook/Arduino_diagnostics_ensemble.ipynb) 
