# 3. Using H1st.AI to Encode Human Insights as a Model and Harmonize Human + ML in a H1st.Graph

### 3a. Use case analysis: turning on safe-mode vs post-mortem analysis

The H1ST.AI approach to this problem begins by thinking about the end-users of the decision system and their use cases.

What are the use cases for such an automotive cybersecurity system? We can envision two distinct use cases:
  1. The onboard intrusion detection system (IDS) detects an attack event in real time and puts the car into a safe mode, so that drivers can reach a safe location rather than being stranded on the highway with a malfunctioning car.
  2. A security expert reviews the attack in post-mortem mode, in which the IDS provides message-by-message attack vs normal classification.

For use case #1, "safe mode triggering by attack event detection", the ML requirement is a near-zero false positive rate (FPR).

To give an example, each car might emit about 100 CAN messages per second. For a fleet of just 1000 cars, each driven 1 hour per day, a message-level FPR of 0.00001 still means 0.00001 x 100 msg/s x 3600 s x 1000 cars = 3600 false positive events per day!
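The arithmetic above can be reproduced with a short script. This is only a back-of-the-envelope sketch; the function name `daily_false_positives` is made up for illustration, and the inputs are the example figures from the text, not measured values.

```python
def daily_false_positives(fpr, msgs_per_second, seconds_driven_per_day, num_cars):
    """Expected number of false-positive messages per day across a fleet."""
    return fpr * msgs_per_second * seconds_driven_per_day * num_cars


# Example figures from the text: FPR of 1e-5, 100 msg/s, 1 hour/day, 1000 cars.
fleet_fp = daily_false_positives(
    fpr=1e-5, msgs_per_second=100, seconds_driven_per_day=3600, num_cars=1000
)
print(round(fleet_fp))  # 3600
```

Even a message-level FPR that looks tiny on paper multiplies into thousands of spurious alerts once message rate and fleet size are taken into account.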

Additionally, for deployment and anticipated regulatory purposes, the system should behave robustly and explainably. Explainability is a complex subject; here we mean that one can anticipate the system’s behavior reasonably well and account for it in legal and regulatory settings. The iForest and GBM models we saw earlier don’t quite meet this requirement: it is hard to explain precisely how they classify attacks, even when they achieve good accuracy.

For use case #2, "post-mortem analysis", the requirement turns out to be very different. Some FPR can be traded for a higher true positive rate (TPR). And the system might not need to be highly explainable, since it is after all the job of the security experts to analyze the attacks in depth and make the final decisions.
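To make the FPR/TPR trade-off concrete, here is a toy sketch. The scores and labels are invented for illustration, and `tpr_fpr` is a hypothetical helper, not part of H1st. A strict threshold suits safe-mode triggering (no false alarms, some missed attacks), while a lenient one suits post-mortem review (more attacks caught, more false alarms for the expert to sift through).

```python
def tpr_fpr(scores, labels, threshold):
    """Compute (TPR, FPR) when messages scoring >= threshold are called attacks."""
    predicted = [score >= threshold for score in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


# Invented classifier scores and ground-truth labels (1 = attack message).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

strict = tpr_fpr(scores, labels, threshold=0.75)   # safe-mode style
lenient = tpr_fpr(scores, labels, threshold=0.3)   # post-mortem style
print("strict  TPR=%.2f FPR=%.2f" % strict)
print("lenient TPR=%.2f FPR=%.2f" % lenient)
```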

### 3b. Problem (re)formulation into H1st.AI Graph

We reformulate the problem as a decision graph, where the outermost flow detects attack events and the corresponding yes branch handles message classification. For this tutorial we focus on injection attacks, which are the most common in the wild (we will revisit this later).

The graph looks like this.

<img src="http://docs.arimo.com/H1ST_AI_Tutorial/img/graph2.png" alt="automotive cybersecurity solution graph"/>
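The control flow of this graph can be sketched in plain Python, without H1st. This is only an illustration of the branching logic: `run_decision_flow` and the two callables passed to it are hypothetical stand-ins for the event detector and the message classifier, and the toy "windows" are just lists of numbers.

```python
def run_decision_flow(windows, detect_event, classify_messages):
    """Outer flow flags attack windows; only flagged windows reach the classifier."""
    event_results, message_results = [], []
    for window in windows:
        in_attack = detect_event(window)
        event_results.append(in_attack)
        if in_attack:
            # "yes" branch: classify each message in the window
            message_results.append(classify_messages(window))
        else:
            # "no" branch: nothing to do
            message_results.append(None)
    return {"event_results": event_results, "message_results": message_results}


# Toy data: a window is "in attack" if it holds more than 4 messages,
# and a message is "injected" if its value is 9.
windows = [[1, 1, 1], [1, 9, 9, 9, 1], [1, 1]]
out = run_decision_flow(
    windows,
    detect_event=lambda w: len(w) > 4,
    classify_messages=lambda w: [int(v == 9) for v in w],
)
print(out)
```

Note that the flow yields two separate outputs, one per window and one per message, which mirrors the two use cases discussed in 3a.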

### 3c. Encoding human insights for event detection as a H1st.Model

Recall that when we started analyzing the CAN dataset, we remarked that the normal data is highly regular, especially in terms of the message frequency for each CAN ID.

It turns out that using message frequency statistics for injection event detection is highly accurate for safe-mode use cases (high TPR, low FPR). This surprising fact was first pointed out by the original CAN bus hackers Chris Valasek and Charlie Miller in the seminal white paper [Adventures in Automotive Networks and Control Units](https://ioactive.com/pdfs/IOActive_Adventures_in_Automotive_Networks_and_Control_Units.pdf).

> It is pretty straightforward to detect the attacks discussed in this paper.  They always involve either sending new, unusual CAN packets or flooding the CAN bus with common packets... Additionally, the frequency of normal CAN packets is very predictable... Therefore we propose that a system can detect CAN anomalies based on the known frequency of certain traffic and can alert a system or user if frequency levels vary drastically from what is well known. 
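The detection rule described in the quote can be sketched in a few lines of plain Python. This is a toy illustration rather than the tutorial's model: the timestamps are synthetic, and both the helper name `count_per_window` and the rule "flag any window busier than the busiest normal window" are assumptions made for this example.

```python
from collections import Counter


def count_per_window(timestamps_ms, window_ms):
    """Return the number of messages in each consecutive window."""
    counts = Counter(t // window_ms for t in timestamps_ms)
    n_windows = max(counts) + 1 if counts else 0
    return [counts.get(i, 0) for i in range(n_windows)]


WINDOW_MS = 1000

# Synthetic normal traffic: one message every 10 ms for 3 seconds.
normal = list(range(0, 3000, 10))
max_normal = max(count_per_window(normal, WINDOW_MS))

# Synthetic attack: the same traffic plus a flood of injected messages
# during the second window (1000 ms to 2000 ms).
attacked = normal + list(range(1000, 2000, 5))
attack_counts = count_per_window(attacked, WINDOW_MS)

# Flag any window that is busier than the busiest normal window.
flagged = [count > max_normal for count in attack_counts]
print(attack_counts, flagged)
```

Because normal traffic is so regular, a simple count threshold already separates the flooded window from the clean ones, and the rule is easy to explain.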

Using H1ST, we can encode insights of such “human” models and use them just like ML models. An h1.Model is essentially anything that can predict. H1ST provides tools to help automate their saving and loading, too, easing the way for using them in an integrated decision system.

A data-science project in H1ST.AI is designed to be a Python-importable package. You can create such a project using the `h1` command-line tool.

Organizing model code this way makes it easy to use. The Model API is designed so that models can be used interactively in notebooks as well as in more complex projects such as this one.

In a H1ST project structure, we typically organize models under the `models` directory; e.g., the content of `models/msg_freq_event_detector.py` looks like this. Training is quite simple: it loops through a number of files to compute window statistics, such as how many messages per CAN ID are found and what the min, max and percentile values are.

```{note}
The H1st package of the full tutorial is available from the H1st Github project at [https://github.com/h1st-ai/h1st/tree/master/examples/AutomotiveCybersecurity](https://github.com/h1st-ai/h1st/tree/master/examples/AutomotiveCybersecurity).

Simply go ahead and clone it, then follow along.
```

In [1]:
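import pandas as pd
import h1st as h1

# project-local modules providing WINDOW_SIZE, SENSORS and the data helpers used below
import AutomotiveCybersecurity.config as config
import AutomotiveCybersecurity.util as util

SENSORS = ["SteeringAngle", "CarSpeed", "YawRate", "Gx", "Gy"]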

class MsgFreqEventDetectorModel(h1.Model):
    def load_data(self, num_files=None):
        return util.load_data(num_files)
    
    def train(self, prepared_data):
        files = prepared_data["train_normal_files"]
        
        from collections import defaultdict
        def count_messages(f):
            df = pd.read_csv(f)
            df.columns = ['Timestamp', 'Label', 'CarSpeed', 'SteeringAngle', 'YawRate', 'Gx', 'Gy']
            counts = defaultdict(list)
            
            for window_start in util.gen_windows(df, window_size=config.WINDOW_SIZE, step_size=config.WINDOW_SIZE):
                w_df = df[(df.Timestamp >= window_start) & (df.Timestamp < window_start + config.WINDOW_SIZE)]
                for sensor in config.SENSORS:
                    counts[sensor].append(len(w_df.dropna(subset=[sensor])))

            return pd.DataFrame(counts)
        
        ret = [count_messages(f) for f in files]
        df = pd.concat(ret)

        self.stats = df.describe()
    
    def predict(self, data):
        df = data["df"]
        window_results = []
        for window_start in data["window_starts"]:
            in_window = (df.Timestamp >= window_start) & (df.Timestamp < window_start + config.WINDOW_SIZE)
            w_df = df[in_window]
            # Assumption: flag a sensor when its message count in this window
            # exceeds the maximum count observed in normal training data.
            result = {
                sensor: int(len(w_df.dropna(subset=[sensor])) > self.stats.at["max", sensor])
                for sensor in config.SENSORS
            }
            result["WindowInAttack"] = any(result.values())
            result["window_start"] = window_start  # needed by downstream nodes
            window_results.append(result)
        return {"event_detection_results": window_results}

Now let's import and train this `MsgFreqEventDetectorModel`.


```{note}
Here, we call `h1.init()` to make sure we can import the package in our notebooks even when the package is not installed (as long as the notebooks are within the project folder structure).
```

In [2]:
h1.init()

from AutomotiveCybersecurity.models.msg_freq_event_detector import MsgFreqEventDetectorModel

m = MsgFreqEventDetectorModel()

In [15]:
# one long trip is sufficient to compute freq stats at sub-second window level for each car model
data = m.load_data(num_files=1)

In [16]:
m.train(data)

In [17]:
m.stats

| | SteeringAngle | CarSpeed | YawRate | Gx | Gy |
|---|---|---|---|---|---|
| count | 28920.0 | 28920.0 | 28920.0 | 28920.0 | 28920.0 |
| mean | 34.328008 | 17.164696 | 34.329011 | 34.329011 | 34.329011 |
| std | 1.247765 | 2.085938 | 1.331684 | 1.331684 | 1.331684 |
| min | 31.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 33.0 | 17.0 | 33.0 | 33.0 | 33.0 |
| 50% | 34.0 | 17.0 | 34.0 | 34.0 | 34.0 |
| 75% | 35.0 | 18.0 | 35.0 | 35.0 | 35.0 |
| max | 42.0 | 24.0 | 41.0 | 41.0 | 41.0 |


The nice thing about an h1st.Model is that we can easily save and load it. By default, the "model", "stats" and "metrics" properties are persisted, and they support a variety of flavors and data structures.

In [22]:
m.persist()

2020-09-02 23:15:07,644 INFO h1st.model_repository.model_repository: Saving stats property...


'01EH99NG3T9XF02RDQTCWAPAPF'

### 3d. Working with H1st Graph

Let's now make some event-level predictions.

Note that since the model was persisted using the H1st model repo, we can easily come back to a notebook or script and load the trained model or computed statistics.

Importantly, H1st allows much speedier integration into a Graph (and later deployment, too).

In [18]:
data['attack_files'][0]

's3://h1st-tutorial-autocyber/attack-samples/20181114_Driver2_Trip1-0.parquet'

In [4]:
import pandas as pd

from AutomotiveCybersecurity.graph import WindowGenerator
from AutomotiveCybersecurity.models.msg_freq_event_detector import MsgFreqEventDetectorModel

graph = h1.Graph()
graph.start()\
     .add(WindowGenerator())\
     .add(MsgFreqEventDetectorModel().load())
graph.end()

df = pd.read_parquet(data['attack_files'][0])

results = graph.predict({"df": df})
results.keys()

2020-09-02 22:59:32,242 INFO h1st.model_repository.model_repository: Loading version 01EH8BNKHBKYXBE3RNN8H0DZFW ....


dict_keys(['window_starts', 'event_detection_results'])

And we should see that we can now detect attack events.

### 3e. Adding a message classifier, harmonizing human + ML models in the graph

For message-level classification we can simply bring back our gradient-boosted trees, which did a decent job of recognizing injection messages. (Integrating a sequence model such as a bidirectional LSTM is left as an exercise for the reader.)

For convenience, we've reorganized it as a H1st.Model, ready for use. The content of `models/gradient_boosting_msg_classifier.py` looks like this.

In [7]:
FEATURES = SENSORS + ["%s_TimeDiff" % s for s in SENSORS]

class GradientBoostingMsgClassifierModel(h1.Model):
    def load_data(self, num_files=None):
        return util.load_data_daic(num_files, shuffle=True)

    def prep_data(self, data):
        # concat multiple files into separate training/test pd.DataFrame
        def concat_processed_files(files):
            dfs = []
            for f in files:
                z = pd.read_csv(f)
                z.columns = ['Timestamp', 'Label', 'CarSpeed', 'SteeringAngle', 'YawRate', 'Gx', 'Gy',]
                z = util.compute_timediff_fillna(z)
                dfs.append(z)
            df2 = pd.concat(dfs)
            return df2
        return {
            "train_attack_df": concat_processed_files(data["train_attack_files"]),
            "test_attack_df": concat_processed_files(data["test_attack_files"])
        }

    def train(self, prepared_data):
        df = prepared_data["train_attack_df"]
        # only needed on scikit-learn < 1.0, where this estimator was still experimental
        from sklearn.experimental import enable_hist_gradient_boosting
        from sklearn.ensemble import HistGradientBoostingClassifier
        X = df[FEATURES]
        y = df.Label == "Tx"
        self.model = HistGradientBoostingClassifier(max_iter=500).fit(X, y)

    def evaluate(self, prepared_data):
        df = prepared_data["test_attack_df"]
        ypred = self.model.predict(df[FEATURES])
        import sklearn.metrics
        cf = sklearn.metrics.confusion_matrix(df.Label == "Tx", ypred)
        acc = sklearn.metrics.accuracy_score(df.Label == "Tx", ypred)
        print(cf)
        print("Accuracy = %.4f" % acc)
        self.metrics = {"confusion_matrix": cf, "accuracy": acc}
    
    def predict(self, data):
        df = data["df"].copy()
        df = util.compute_timediff_fillna(df)
        df['MsgIsAttack'] = 0
        df['WindowInAttack'] = 0
        for event_result in data["event_detection_results"]:
            if event_result['WindowInAttack']:
                # print("window %s in attack: event_result = %s" % (event_result['window_start'], event_result))
                in_window = (df.Timestamp >= event_result['window_start']) & (df.Timestamp < event_result['window_start'] + WINDOW_SIZE)
                w_df = df[in_window]
                if len(w_df) > 0:  # skip empty windows: the classifier cannot predict on zero rows
                    ypred = self.model.predict(w_df[FEATURES])
                    df.loc[in_window, "WindowInAttack"] = 1
                    df.loc[in_window, "MsgIsAttack"] = ypred.astype(int)
        # return once, after all attack windows have been processed
        return {"injection_window_results": df}

In [8]:
import h1st as h1
h1.init()

from AutomotiveCybersecurity.models.gradient_boosting_msg_classifier import GradientBoostingMsgClassifierModel

m2 = GradientBoostingMsgClassifierModel()
data = m2.load_data(num_files=6)

In [9]:
prepared_data = m2.prep_data(data)

len train_attack_df = 796422
len test_attack_df = 1113009


In [18]:
prepared_data["train_attack_df"]

| | Timestamp | SteeringAngle | CarSpeed | YawRate | Gx | Gy | Label | AttackSensor | AttackMethod | AttackParams | AttackEventIndex | SteeringAngle_TimeDiff | CarSpeed_TimeDiff | YawRate_TimeDiff | Gx_TimeDiff | Gy_TimeDiff |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 0.030932 | 0.8 | 0.0 | 0.190211 | 0.009452 | 0.034356 | Normal | | | 0.0 | | -1.000000 | -1.0 | -1.000000 | -1.000000 | -1.000000 |
| 5 | 0.036363 | 0.8 | 0.0 | 0.190211 | 0.009452 | 0.034356 | Normal | | | 0.0 | | 0.010708 | -1.0 | -1.000000 | -1.000000 | -1.000000 |
| 6 | 0.038058 | 0.8 | 0.0 | 0.190317 | 0.009466 | 0.034129 | Normal | | | 0.0 | | -1.000000 | -1.0 | 0.012148 | 0.012148 | 0.012148 |
| 7 | 0.049180 | 0.8 | 0.0 | 0.190317 | 0.009466 | 0.034129 | Normal | | | 0.0 | | 0.012817 | -1.0 | -1.000000 | -1.000000 | -1.000000 |
| 8 | 0.049489 | 0.8 | 0.0 | 0.190422 | 0.009480 | 0.033902 | Normal | | | 0.0 | | -1.000000 | -1.0 | 0.011431 | 0.011431 | 0.011431 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 264768 | 1199.974999 | 7.2 | 0.0 | 0.156113 | -0.047354 | 0.010066 | Normal | | | 0.0 | | -1.000000 | -1.0 | 0.012123 | 0.012123 | 0.012123 |
| 264769 | 1199.982700 | 7.2 | 0.0 | 0.156113 | -0.047354 | 0.010066 | Normal | | | 0.0 | | 0.011751 | -1.0 | -1.000000 | -1.000000 | -1.000000 |
| 264770 | 1199.984296 | 7.2 | 0.0 | 0.156266 | -0.046999 | 0.010154 | Normal | | | 0.0 | | -1.000000 | -1.0 | 0.009297 | 0.009297 | 0.009297 |
| 264771 | 1199.994381 | 7.2 | 0.0 | 0.156266 | -0.046999 | 0.010154 | Normal | | | 0.0 | | 0.011681 | -1.0 | -1.000000 | -1.000000 | -1.000000 |


In [19]:
m2.train(prepared_data)

In [20]:
m2.evaluate(prepared_data)

[[1063112    1876]
 [  22391   25630]]
Accuracy = 0.9782


In [21]:
m2.persist()

2020-09-02 23:15:04,406 INFO h1st.model_repository.model_repository: Saving metrics property...
2020-09-02 23:15:04,408 INFO h1st.model_repository.model_repository: Saving model property...


'01EH99NCYMAZTWX9VYM38ZXWHF'

In [10]:
class NoOp(h1.Action):
    def call(self, command, inputs):
        pass

graph = h1.Graph()
graph.start()\
     .add(WindowGenerator())\
     .add(h1.Decision(MsgFreqEventDetectorModel().load(),
                      decision_field="WindowInAttack",
                      result_field="event_detection_results"))\
     .add(yes=GradientBoostingMsgClassifierModel().load(),
          no=NoOp())
graph.end()

results = graph.predict({"df": df})
results.keys()

2020-09-02 23:00:43,445 INFO h1st.model_repository.model_repository: Loading version 01EH8BNKHBKYXBE3RNN8H0DZFW ....
2020-09-02 23:00:43,452 INFO h1st.model_repository.model_repository: Loading version 01EH70YES5K33C78XQX0MHJ9RV ....


dict_keys(['window_starts', 'event_detection_results', 'injection_window_results'])

Now let's evaluate the whole graph, especially focusing on the event-level TPR & FPR since they are crucial in the safe-mode deployment use case.

In [23]:
from AutomotiveCybersecurity.util import evaluate_event_graph

evaluate_event_graph(graph, data['test_attack_files'][-2:])

Event-level confusion matrix
[[7218    0]
 [  18 1012]]
Event TPR = 0.9825, FPR = 0.0000


(7218, 0, 18, 1012)

Now that's something! Event-level FPR = 0.0%, with zero false positives!

(Note that this is still a subsample of the data, but when you try it on the full dataset the result should be the same: zero false positives at event level.)
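The event-level rates can be re-derived from the confusion matrix printed above. The sketch below assumes the usual scikit-learn layout, `[[TN, FP], [FN, TP]]`, which matches the returned tuple `(7218, 0, 18, 1012)`; the helper name `rates_from_confusion` is made up for this example.

```python
def rates_from_confusion(cm):
    """Return (TPR, FPR) from a 2x2 confusion matrix laid out as [[TN, FP], [FN, TP]]."""
    (tn, fp), (fn, tp) = cm
    tpr = tp / (tp + fn)  # share of attack windows that were detected
    fpr = fp / (fp + tn)  # share of normal windows wrongly flagged
    return tpr, fpr


tpr, fpr = rates_from_confusion([[7218, 0], [18, 1012]])
print("Event TPR = %.4f, FPR = %.4f" % (tpr, fpr))  # Event TPR = 0.9825, FPR = 0.0000
```

In other words, 18 of the 1030 attack windows were missed, while none of the 7218 normal windows triggered a false alarm.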

The message-level accuracy should be nearly the same because we used the same classifier. However, the decomposition separates the concerns and requirements of the two use cases. We're much more comfortable with the solution now, in terms of accuracy as well as robustness and explainability.

Another point worth highlighting is that we get multiple output streams from H1st.Graph: event-level outputs and message-level outputs, exactly what we need for the two use cases we highlighted, safe-mode triggering and post-mortem analysis.