# Video Identification of Suspicious Behavior

This notebook processes your video data in the following steps:
* Use the data produced by the `Video Identification of Suspicious Behavior: Preparation` notebook
* Load the training data
* Train the model against the training data
* Generate predictions against the test data using this model
* Check whether the videos contain any suspicious activity

The source data used in this notebook comes from the [EC Funded CAVIAR project/IST 2001 37540](http://homepages.inf.ed.ac.uk/rbf/CAVIAR/).

<img src="https://databricks.com/wp-content/uploads/2018/09/mnt_raela_video_splash.png" width=900/>

### Prerequisite
* Execute the `Video Identification of Suspicious Behavior: Preparation` notebook to set up the image and feature datasets

### Cluster Configuration
* Suggested cluster configuration:
 * Databricks Runtime Version: `Databricks Runtime for ML` (e.g. 4.1 ML or 4.2 ML)
 * Driver: 64GB RAM Instance (e.g. `Azure: Standard_D16s_v3, AWS: r4.4xlarge`)
 * Workers: 2x 64GB RAM Instance (e.g. `Azure: Standard_D16s_v3, AWS: r4.4xlarge`)
 * Python: `Python 3`
 
### Libraries to Install Manually
To install, refer to **Upload a Python PyPI package or Python Egg** [Databricks](https://docs.databricks.com/user-guide/libraries.html#upload-a-python-pypi-package-or-python-egg) | [Azure Databricks](https://docs.azuredatabricks.net/user-guide/libraries.html#upload-a-python-pypi-package-or-python-egg)

* Python Libraries:
 * `opencv-python`: 3.4.2 
 
### Libraries Already Included in Databricks Runtime for ML
Because we're using *Databricks Runtime for ML*, you do **not** need to install the following libraries:
* Python Libraries:
 * `h5py`: 2.7.1
 * `tensorflow`: 1.7.1
 * `keras`: 2.1.5 (Using TensorFlow backend)
 * *You can check the installed version by running `import tensorflow as tf; print(tf.__version__)`*

* JARs:
 * `spark-deep-learning-1.0.0-spark2.3-s_2.11.jar`
 * `tensorframes-0.3.0-s_2.11.jar`
 * *You can check by reviewing the cluster's Spark UI > Environment*
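
The Python version checks above can be wrapped in a small helper. The following is a minimal sketch, assuming each library exposes a `__version__` attribute, as `tensorflow`, `keras`, `h5py`, and `cv2` (the import name of `opencv-python`) generally do. `report_versions` is a hypothetical helper name, not part of Databricks or any of these libraries.

```python
import importlib


def report_versions(module_names):
    """Return {import name: version string} for the given modules.

    Hypothetical helper: a module that cannot be imported maps to
    "not installed", and one without a __version__ attribute maps to
    "unknown".
    """
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
            continue
        versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions
```

Calling `report_versions(["h5py", "tensorflow", "keras", "cv2"])` in a notebook cell should then list the versions to compare against the ones named in this section.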

In [2]:
%run ./video_config

* Read the previously generated Parquet files containing the training dataset
* Read the hand-labeled data

In [4]:
# Prefix to add prior to join
prefix = "dbfs:" + targetImgPath

# Read in hand-labeled data 
from pyspark.sql.functions import expr
labels = spark.read.csv(labeledDataPath, header=True, inferSchema=True)
labels_df = labels.withColumn("filePath", expr("concat('" + prefix + "', ImageName)")).drop('ImageName')

# Read in features data (saved in Parquet format)
featureDF = spark.read.parquet(imgFeaturesPath)

# Create the training dataset by joining labels and features
train = featureDF.join(labels_df, featureDF.origin == labels_df.filePath).select("features", "label", featureDF.origin)

# Validate number of images used for training
train.count()
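
The join in this cell only keeps rows where `prefix + ImageName` equals `origin` exactly, so a mismatched prefix or file name silently shrinks the training set. The plain-Python sketch below mimics that inner join on a few made-up rows. The paths, labels, and the helper name `inner_join_on_path` are illustrative assumptions, not values from the dataset.

```python
def inner_join_on_path(features, labels, prefix):
    """Mimic the inner join: keep features whose origin equals prefix + ImageName."""
    label_by_path = {prefix + name: label for name, label in labels}
    return [
        (vector, label_by_path[origin], origin)
        for origin, vector in features
        if origin in label_by_path
    ]


# Hypothetical rows for illustration only.
prefix = "dbfs:/example/frames/"
features = [
    ("dbfs:/example/frames/a.jpg", [0.1, 0.2]),
    ("dbfs:/example/frames/b.jpg", [0.3, 0.4]),
]
labels = [("a.jpg", 1), ("c.jpg", 0)]  # c.jpg has no matching feature row

train_rows = inner_join_on_path(features, labels, prefix)
print(len(train_rows))  # only a.jpg appears on both sides
```

Comparing `train.count()` with the number of labeled rows is one quick way to spot this kind of mismatch.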

In [5]:
from pyspark.ml.classification import LogisticRegression

# Fit LogisticRegression Model
lr = LogisticRegression(maxIter=20, regParam=0.05, elasticNetParam=0.3, labelCol="label")
lrModel = lr.fit(train)
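
The `regParam` and `elasticNetParam` arguments control an elastic-net penalty that is added to the logistic loss. As documented for Spark MLlib, the penalty is `regParam * (elasticNetParam * ||w||_1 + (1 - elasticNetParam) / 2 * ||w||_2^2)`. The sketch below evaluates that formula for a made-up weight vector. The function name and the weights are assumptions for illustration, not values taken from `lrModel`.

```python
def elastic_net_penalty(weights, reg_param, elastic_net_param):
    """Elastic-net penalty: a mix of L1 and squared-L2 norms of the weights."""
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return reg_param * (
        elastic_net_param * l1 + (1.0 - elastic_net_param) / 2.0 * l2_squared
    )


# Same hyperparameters as the cell above, hypothetical weights.
penalty = elastic_net_penalty([1.0, -2.0], reg_param=0.05, elastic_net_param=0.3)
print(round(penalty, 4))  # 0.1325
```

With `elasticNetParam=0.3`, the penalty leans toward the L2 term while keeping a smaller L1 component that can push some coefficients to zero.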

In [6]:
# Load Test Data
featuresTestDF = spark.read.parquet(imgFeaturesTestPath)

# Generate predictions on test data
result = lrModel.transform(featuresTestDF)
result.createOrReplaceTempView("result")

In [7]:
from pyspark.sql.functions import udf
from pyspark.sql.types import FloatType

# Extract the first and second elements of the probability vector
firstelement = udf(lambda v: float(v[0]), FloatType())
secondelement = udf(lambda v: float(v[1]), FloatType())

# The second element is the probability of the positive class (label 1)
predictions = result.withColumn("prob2", secondelement('probability'))
predictions.createOrReplaceTempView("predictions")
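
Because the UDF is declared with `FloatType`, `prob2` is stored as a 32-bit float, so it differs slightly from the 64-bit entry in the `probability` vector. The sketch below reproduces that rounding in plain Python with the standard `struct` module. The sample vector is copied from the first row of the query output in this notebook, and `to_float32` is a hypothetical helper name.

```python
import struct


def to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


# Probability vector from the first row of the query output.
probability = [0.017307610033970747, 0.9826923899660293]
prob2 = to_float32(probability[1])

print(probability[1])
print(prob2)  # close to, but not equal to, the 64-bit value
```

If the full precision matters, declaring the UDF with `DoubleType` instead of `FloatType` would likely avoid this rounding.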

In [8]:
%sql
select origin, probability, prob2, prediction from predictions where prediction = 1 order by prob2 desc

origin,probability,prob2,prediction
dbfs:/mnt/tardis6/videos/cctvFrames/test/Fight_OneManDownframe0024.jpg,"List(1, 2, List(), List(0.017307610033970747, 0.9826923899660293))",0.98269236,1.0
dbfs:/mnt/tardis6/videos/cctvFrames/test/Fight_OneManDownframe0014.jpg,"List(1, 2, List(), List(0.03786095034365344, 0.9621390496563466))",0.96213907,1.0
dbfs:/mnt/tardis6/videos/cctvFrames/test/Fight_OneManDownframe0017.jpg,"List(1, 2, List(), List(0.0451830428080614, 0.9548169571919385))",0.95481694,1.0
dbfs:/mnt/tardis6/videos/cctvFrames/test/Fight_OneManDownframe0016.jpg,"List(1, 2, List(), List(0.07447960986961318, 0.9255203901303868))",0.92552036,1.0
dbfs:/mnt/tardis6/videos/cctvFrames/test/Fight_OneManDownframe0019.jpg,"List(1, 2, List(), List(0.11471190107131549, 0.8852880989286844))",0.8852881,1.0
dbfs:/mnt/tardis6/videos/cctvFrames/test/LeftBoxframe0027.jpg,"List(1, 2, List(), List(0.129674019216907, 0.870325980783093))",0.870326,1.0
dbfs:/mnt/tardis6/videos/cctvFrames/test/LeftBoxframe0033.jpg,"List(1, 2, List(), List(0.1747258698663635, 0.8252741301336365))",0.8252741,1.0
dbfs:/mnt/tardis6/videos/cctvFrames/test/LeftBag_AtChairframe0022.jpg,"List(1, 2, List(), List(0.19848773904188566, 0.8015122609581143))",0.80151224,1.0
dbfs:/mnt/tardis6/videos/cctvFrames/test/LeftBag_AtChairframe0023.jpg,"List(1, 2, List(), List(0.21661128842070904, 0.783388711579291))",0.78338873,1.0
dbfs:/mnt/tardis6/videos/cctvFrames/test/LeftBoxframe0034.jpg,"List(1, 2, List(), List(0.21938627263970645, 0.7806137273602934))",0.7806137,1.0


View the top three most suspicious images based on the `prob2` column.
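
The selection can also be sketched in plain Python. The example below sorts a few `(origin, prob2)` pairs copied from the query output above, keeps the three highest, and strips the `dbfs:` scheme, since the `displayImg` calls in this notebook take paths without it. The helper name `top_suspicious` is an assumption for illustration.

```python
def top_suspicious(rows, n=3):
    """Return the n paths with the highest prob2, without the dbfs: scheme."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    paths = []
    for origin, _ in ranked[:n]:
        if origin.startswith("dbfs:"):
            origin = origin[len("dbfs:"):]
        paths.append(origin)
    return paths


base = "dbfs:/mnt/tardis6/videos/cctvFrames/test/"
rows = [
    (base + "LeftBoxframe0027.jpg", 0.870326),
    (base + "Fight_OneManDownframe0014.jpg", 0.96213907),
    (base + "Fight_OneManDownframe0024.jpg", 0.98269236),
    (base + "Fight_OneManDownframe0017.jpg", 0.95481694),
]

for path in top_suspicious(rows):
    print(path)
```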

In [10]:
displayImg("/mnt/tardis6/videos/cctvFrames/test/Fight_OneManDownframe0024.jpg")

In [11]:
displayImg("/mnt/tardis6/videos/cctvFrames/test/Fight_OneManDownframe0014.jpg")

In [12]:
displayImg("/mnt/tardis6/videos/cctvFrames/test/Fight_OneManDownframe0017.jpg")

## View the Source Video
View the source video from which the suspicious images were extracted.

![](https://s3.us-east-2.amazonaws.com/databricks-dennylee/media/Fight_OneManDown.gif)