## Train a Model and Predict Device Activities

This notebook walks through training a model to detect the activities of a given device. An activity is any action a device allows its users to perform. Each activity should have at least three repeated experiments so that the model learns from representative data.

**Before you begin, download the required pcap files.** Request the dataset at https://moniotrlab.ccis.neu.edu/imc19/. Once access has been granted, download the `iot-model.tgz` archive and decompress it into the current folder. The resulting file structure should be `traffic/us/yi-camera/{activity_name}/{datetime}.{length}.pcap`.
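Before running the pipeline, it can help to confirm that every activity folder holds at least three experiments. The following is a minimal sketch, not part of the repository: the helper names `count_experiments` and `underrepresented` are hypothetical, and the example paths are illustrative names modeled on the layout above.

```python
from collections import Counter
from pathlib import PurePosixPath

MIN_EXPERIMENTS = 3  # minimum repeated experiments per activity, per the text above


def count_experiments(pcap_paths):
    """Count .pcap files per activity, taking the parent folder as the activity name."""
    counts = Counter()
    for raw in pcap_paths:
        path = PurePosixPath(raw)
        if path.suffix != ".pcap":
            continue  # skip anything that is not a capture file
        counts[path.parent.name] += 1
    return counts


def underrepresented(counts, minimum=MIN_EXPERIMENTS):
    """Return the activities that have fewer than `minimum` experiments."""
    return sorted(name for name, n in counts.items() if n < minimum)


# Illustrative paths only (assumed names following the documented layout).
example_paths = [
    "traffic/us/yi-camera/power/2019-04-25_19:28:58.154s.pcap",
    "traffic/us/yi-camera/power/2019-04-25_19:25:30.155s.pcap",
    "traffic/us/yi-camera/power/2019-04-25_19:21:40.166s.pcap",
    "traffic/us/yi-camera/local_move/2019-04-25_19:47:16.40s.pcap",
]
example_counts = count_experiments(example_paths)
print(dict(example_counts))
print("Too few experiments:", underrepresented(example_counts))
```

To check a real download, pass the paths found on disk instead, for example `[str(p) for p in pathlib.Path("traffic/us/yi-camera").glob("*/*.pcap")]`.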

**IMPORTANT:** Use `python3` and install all the dependencies:
- `pip install -r requirements.txt`


#### Extract pcap files to per-flow level info
Output has been truncated because of length.

In [1]:
!./raw2intermediate.sh exp_list.txt tagged-intermediate/us

Running raw2intermediate.sh...
tagged-intermediate/us/yi-camera/power/2019-04-25_19:28:58.154s.txt exists.
tagged-intermediate/us/yi-camera/power/2019-04-25_19:25:30.155s.txt exists.
tagged-intermediate/us/yi-camera/power/2019-04-25_19:21:40.166s.txt exists.
tagged-intermediate/us/yi-camera/local_move/2019-04-25_19:47:16.40s.txt exists.
tagged-intermediate/us/yi-camera/local_move/2019-04-25_19:48:09.53s.txt exists.
tagged-intermediate/us/yi-camera/local_move/2019-04-25_19:46:16.41s.txt exists.
tagged-intermediate/us/yi-camera/android_lan_photo/2019-04-26_22:48:43.35s.txt exists.
tagged-intermediate/us/yi-camera/android_lan_photo/2019-04-26_22:07:05.36s.txt exists.
tagged-intermediate/us/yi-camera/android_lan_photo/2019-04-26_21:58:17.36s.txt exists.
tagged-intermediate/us/yi-camera/android_lan_photo/2019-04-26_21:27:00.31s.txt exists.
tagged-intermediate/us/yi-camera/android_lan_photo/2019-04-26_22:08:57.36s.txt exists.
tagged-intermediate/us/yi-camera/android_lan_photo/2019-04-26_21:3

#### Parse per-flow info to features per-activity
Output has been truncated because of length.

In [2]:
!python extract_tbp_features.py tagged-intermediate/us/ features/us/

Running extract_tbp_features.py...
Input files located in: tagged-intermediate/us/
Output files placed in: features/us/
mkdir: created directory 'features'
mkdir: created directory 'features/us'
mkdir: created directory 'features/us//caches'
Feature files to be generated from following devices: yi-camera
Total packets: 160
    Saved to features/us//caches/yi-camera_power_2019-04-25_19:25:30.155s.csv
Total packets: 162
    Saved to features/us//caches/yi-camera_power_2019-04-25_19:21:40.166s.csv
Total packets: 152
    Saved to features/us//caches/yi-camera_power_2019-04-25_19:28:58.154s.csv
Total packets: 1557
    Saved to features/us//caches/yi-camera_android_wan_photo_2019-04-27_22:08:01.37s.csv
Total packets: 1573
    Saved to features/us//caches/yi-camera_android_wan_photo_2019-04-27_21:44:25.36s.csv
Total packets: 1715
    Saved to features/us//caches/yi-camera_android_wan_photo_2019-04-27_22:29:48.37s.csv
Total packets: 1590
    Saved to features/us//caches/yi-camera_android_wan_p

#### Train the model using the features
Rerunning the command below skips training for models that already exist. To retrain, delete the `.model` and `.label.txt` files in `tagged-models/us/`.
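The cleanup can also be scripted. The sketch below is a hedged example rather than a tool shipped with the repository: `clear_trained_models` is a hypothetical helper, and it assumes the trained artifacts sit directly inside the model folder with names ending in `.model` or `.label.txt`.

```python
from pathlib import Path


def clear_trained_models(model_dir):
    """Delete *.model and *.label.txt files directly inside model_dir.

    Returns the names of the removed files. Subfolders are left untouched.
    """
    folder = Path(model_dir)
    if not folder.is_dir():
        return []  # nothing has been trained yet
    removed = []
    for path in sorted(folder.iterdir()):
        is_artifact = path.name.endswith(".model") or path.name.endswith(".label.txt")
        if path.is_file() and is_artifact:
            path.unlink()
            removed.append(path.name)
    return removed


# Usage (uncomment to actually delete the files before retraining):
# print(clear_trained_models("tagged-models/us/"))
```

Because the function removes only files with those two endings, other contents of the folder, such as the `output` subfolder, are preserved.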

In [3]:
!python train_rf_models.py features/us/ tagged-models/us/

Running train_rf_models.py...
mkdir: created directory 'tagged-models'
mkdir: created directory 'tagged-models/us'
mkdir: created directory 'tagged-models/us//output'
Scanning features/us//yi-camera.csv
  Data points: 2490 
	Variable: spanOfGroup          Importance: 0.402
	Variable: q90                  Importance: 0.071
	Variable: meanTBP              Importance: 0.063
	Variable: q60                  Importance: 0.05
	Variable: q80                  Importance: 0.05
	Variable: kurtosisLength       Importance: 0.045
	Variable: meanBytes            Importance: 0.043
	Variable: q70                  Importance: 0.04
	Variable: q40                  Importance: 0.039
	Variable: medAbsDev            Importance: 0.036
	Variable: skewLength           Importance: 0.029
	Variable: medianTBP            Importance: 0.026
	Variable: varTBP               Importance: 0.025
	Variable: q50                  Importance: 0.021
	Variable: kurtosisTBP          Importance: 0.02
	Variable: skewTBP            

#### Predict activities given a pcap file

In [4]:
!python -W ignore predict.py yi-camera sample_yi_camera_recording.pcap sample_result.csv tagged-models/us/

Running predict.py...
mkdir: created directory 'user-intermediates/'
Model: tagged-models/us//yi-camera.model
Total packets: 1621
Number of slices: 2
Results:
             ts        ts_end  ts_delta  num_pkt              state
0  1.556329e+09  1.556329e+09  0.000019     1620  android_lan_watch
Results saved to sample_result.csv


In [5]:
!cat sample_result.csv

ts,ts_end,ts_delta,num_pkt,state,device
1556329377.198794,1556329407.828307,1.9e-05,1620,android_lan_watch,yi-camera


Explanation: Between epoch times 1556329377.198794 and 1556329407.828307, the network traffic from yi-camera was predicted to match the `android_lan_watch` activity, that is, using the Android companion app to watch video from the camera while both devices are connected to the same Wi-Fi network.
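Epoch timestamps are hard to read at a glance, so a small post-processing step can convert them. The sketch below is an illustration only: `describe_predictions` is a hypothetical helper, it assumes the column names shown in `sample_result.csv` above, and it reports times in UTC.

```python
import csv
import io
from datetime import datetime, timezone


def describe_predictions(csv_text):
    """Convert each result row's epoch times to UTC strings and compute its duration."""
    fmt = "%Y-%m-%d %H:%M:%S"
    described = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        start = float(row["ts"])
        end = float(row["ts_end"])
        described.append({
            "device": row["device"],
            "state": row["state"],
            "start_utc": datetime.fromtimestamp(start, tz=timezone.utc).strftime(fmt),
            "end_utc": datetime.fromtimestamp(end, tz=timezone.utc).strftime(fmt),
            "duration_s": end - start,
        })
    return described


# The contents of sample_result.csv as printed above.
sample = (
    "ts,ts_end,ts_delta,num_pkt,state,device\n"
    "1556329377.198794,1556329407.828307,1.9e-05,1620,android_lan_watch,yi-camera\n"
)
for item in describe_predictions(sample):
    print(f"{item['device']}: {item['state']} from {item['start_utc']} "
          f"to {item['end_utc']} UTC ({item['duration_s']:.1f} s)")
```

To process the file on disk, read it first, for example `describe_predictions(open("sample_result.csv").read())`.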