# Neural Networks in Forensics
#### Jason Saporta
#### 11/7/17

# What is Machine Learning?
- Branch of Artificial Intelligence.
- Has existed since the late 1950s.
- Data-driven approach dominant for about the last 20 years (since SVMs).
- **Optimization** vs. **Probabilistic Model**-based ML.
- **Supervised**, **Unsupervised**, and **Reinforcement** learning.

# What is Deep Learning?
- Branch of Machine Learning.
- Usually involves neural networks, but not always.
- Key benefit: **Automatic feature extraction**.
- Drawbacks: Data-hungry, computationally intensive.
- **Feature Hierarchy**: higher-level features constructed from lower-level ones.

![](https://devblogs.nvidia.com/parallelforall/wp-content/uploads/2015/11/hierarchical_features.png)
Image taken from [NVIDIA blog](https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/).

# What are Neural Networks?
- A machine learning method for **function approximation**.
- Network represents a function from a (vector) input to a (vector) output.
- For each node, output $a \left( \sum_i w_i x_i \right)$, where the $x_i$ values come from the previous layers and the $w_i$ are learned weights (parameters). $a(\cdot)$ is called an **activation function**.
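
As a minimal sketch of the node computation above, the following pure-Python snippet evaluates $a \left( \sum_i w_i x_i \right)$ for one node and then for a whole layer. The function names (`sigmoid`, `node_output`, `layer_output`), the example weights, and the choice of a logistic activation are illustrative assumptions. As in the formula above, no bias term is included.

```python
import math

def sigmoid(z):
    """Logistic activation, one common choice for a(.)."""
    return 1.0 / (1.0 + math.exp(-z))

def node_output(weights, inputs, activation=sigmoid):
    """Compute a(sum_i w_i * x_i) for a single node."""
    z = sum(w * x for w, x in zip(weights, inputs))
    return activation(z)

def layer_output(weight_rows, inputs, activation=sigmoid):
    """Apply node_output once per node; each row holds one node's weights."""
    return [node_output(row, inputs, activation) for row in weight_rows]

x = [1.0, 2.0]                                   # outputs of the previous layer
hidden = layer_output([[0.5, -0.25], [1.0, 1.0]], x)
print(hidden)
```

Stacking several such layers, each feeding its outputs to the next, gives the vector-to-vector function described in the bullets.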

![](http://neuralnetworksanddeeplearning.com/images/tikz1.png)
Image from Michael Nielsen's [Neural Networks and Deep Learning](http://neuralnetworksanddeeplearning.com/).

# Types of Neural Networks
- 3 main classes: **Multilayer Perceptron**, **Convolutional**, and **Recurrent**.
- Other varieties: Siamese, GANs, Capsule networks, etc.
- Different architectures correspond to different situations.

In a nutshell:
- Convolutional NNs for computer vision.
- Recurrent NNs for time series data.
- Siamese NNs for matching problems.
- GANs or VAEs for generating new data.

# Training Neural Networks I
- Treat outputs as independently normally distributed and find MLEs.
- Easily extended to classification using a logistic activation in the final layer.
- Almost exclusively use first-order iterative optimization methods.
- When data don't fit in memory, use stochastic optimization techniques.
- Bayesian approaches are also possible.
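
To make the first two bullets concrete, here is a short derivation sketch. It assumes a fixed variance $\sigma^2$ shared by all outputs, and it introduces notation that does not appear above: $f_w(x_n)$ for the network output on example $n$ with weights $w$, $N$ examples, and $K$ output components.

$$
-\log p(y \mid x, w) = \sum_{n=1}^{N} \sum_{k=1}^{K} \left[ \frac{\left( y_{nk} - f_w(x_n)_k \right)^2}{2\sigma^2} + \frac{1}{2} \log\left( 2\pi\sigma^2 \right) \right]
$$

The second term does not depend on $w$, so under these assumptions the MLE minimizes the sum of squared errors:

$$
\hat{w} = \arg\min_w \sum_{n=1}^{N} \left\| y_n - f_w(x_n) \right\|^2
$$

For binary classification with a logistic final activation, treating $f_w(x_n)$ as a Bernoulli probability gives the cross-entropy loss instead:

$$
-\log p(y \mid x, w) = -\sum_{n=1}^{N} \left[ y_n \log f_w(x_n) + (1 - y_n) \log\left( 1 - f_w(x_n) \right) \right]
$$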

# Training Neural Networks II
- **Backpropagation** makes computing derivatives computationally efficient.
- Implemented in NN packages, so you don't need to do any manual calculus.
- Training alternates between forward and backward passes.
- Many computations $\rightarrow$ use GPUs.
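
The alternation between forward and backward passes can be sketched by hand for a tiny network. The example below is an illustration only, not what any NN package does internally: it assumes a network with two inputs, two tanh hidden nodes, one linear output, squared-error loss, and a single made-up training example. All function names and parameter values are hypothetical.

```python
import math

def forward(params, x):
    """Forward pass: tanh hidden layer, then a linear output node."""
    W1, w2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y_hat = sum(w * hj for w, hj in zip(w2, h))
    return y_hat, h

def loss(params, x, y):
    """Squared-error loss for one example."""
    y_hat, _ = forward(params, x)
    return 0.5 * (y_hat - y) ** 2

def backward(params, x, y):
    """Backward pass: apply the chain rule from the output back to the first layer."""
    W1, w2 = params
    y_hat, h = forward(params, x)
    d_out = y_hat - y                               # dL/dy_hat
    grad_w2 = [d_out * hj for hj in h]              # dL/dw2_j
    # dL/dz_j for hidden pre-activation z_j, using tanh'(z) = 1 - tanh(z)^2
    d_hidden = [d_out * w2j * (1.0 - hj ** 2) for w2j, hj in zip(w2, h)]
    grad_W1 = [[dj * xi for xi in x] for dj in d_hidden]
    return grad_W1, grad_w2

def gradient_step(params, x, y, lr=0.1):
    """One first-order update: move each weight against its gradient."""
    W1, w2 = params
    grad_W1, grad_w2 = backward(params, x, y)
    new_W1 = [[w - lr * g for w, g in zip(row, g_row)]
              for row, g_row in zip(W1, grad_W1)]
    new_w2 = [w - lr * g for w, g in zip(w2, grad_w2)]
    return new_W1, new_w2

# Hypothetical starting weights and a single made-up training example.
params = ([[0.7, 0.5], [-0.2, -0.5]], [0.1, -0.2])
x, y = [0.5, -1.0], 0.25

initial_loss = loss(params, x, y)
for _ in range(50):
    params = gradient_step(params, x, y)
final_loss = loss(params, x, y)
print(initial_loss, final_loss)
```

Each call to `gradient_step` runs one forward pass and one backward pass. The backward pass reuses the hidden activations computed on the way forward, which is where the efficiency comes from.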

# Training Neural Networks III
- NN packages take care of autodiff, GPU management, etc. They use CUDA C++ behind the scenes.
- Broadly applicable: Use when you want fast matrix (tensor) operations and don't want to write raw C++.

<center><img src="https://s3.amazonaws.com/keras.io/img/keras-logo-2018-large-1200.png" height=65% width=65%></center>

```python
import keras
from keras.layers import Dense

# Load MNIST, flatten each 28x28 image to a 784-vector, and scale pixels to [0, 1].
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784) / 255
x_test = x_test.reshape(-1, 784) / 255

# One-hot encode the digit labels (10 classes).
y_train = keras.utils.to_categorical(y_train)
y_test = keras.utils.to_categorical(y_test)

# Multilayer perceptron: one hidden layer of 100 ReLU units, softmax output.
model = keras.models.Sequential()
model.add(Dense(100, input_shape=(784, ), activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32, verbose=0)

# Prints [test loss, test accuracy].
print(model.evaluate(x_test, y_test, verbose=0))
```

```
Using TensorFlow backend.

[0.099712034922046583, 0.97450000000000003]
```


# Why Use (Deep) Neural Networks?
In almost every field where machine learning has been applied, you can find claims of better results with deep neural networks. In particular:
- Image Recognition
- Game Playing (Atari, Go, etc.)
- Speech Recognition
- Automatic Translation
- Forensics Applications!

In general: Consider deep learning when you have **lots of data** and expect the best performance to come from **complex features** that would be hard to design by hand.

# Handwriting
- I couldn't find clear NN-based results for handwriting matching.
- See [Graves 2013](https://arxiv.org/abs/1308.0850) for examples of handwriting generation taking style into account.
- Approaches to handwriting problems involve recurrent networks with LSTM (long short-term memory) cells.

![](http://colah.github.io/posts/2015-08-Understanding-LSTMs/img/LSTM3-chain.png)
Image from [Chris Olah's blog](http://colah.github.io/posts/2015-08-Understanding-LSTMs/).

# Shoeprints
- [Cross-Domain Forensic Shoeprint Matching](https://vision.ics.uci.edu/papers/KongSRF_BMVC_2017/KongSRF_BMVC_2017.pdf) from CSAFE UCI.
- Goal: Determine which type of shoe left an impression found at a crime scene.
- Method: Put low- and mid-level features from a pre-trained CNN into a Siamese network with a custom matching function, then train end-to-end to fine-tune the weights.
- Outcome: State-of-the-art performance.

![](https://vision.ics.uci.edu/papers/KongSRF_BMVC_2017/icon_drop.jpg)

Can this network be modified to get good results on other shoe-matching problems?

# Bullets
- A small amount of research from the early 2000s applied simple multilayer perceptrons to bullet matching. Results are unclear; few experiments seem to have been performed.

Idea:
- We can use recurrent neural networks to take in bullet signatures and output a learned feature representation. Two of these representations could be put into a matching function with some other trainable parameters. Train the whole thing end-to-end with logistic loss and we should have a good method for matching bullets.
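
A rough sketch of how this idea fits together, in pure Python. Everything here is an assumption made for illustration: a plain tanh recurrent cell stands in for whatever recurrent architecture would actually be used, the matching function is a logistic function of the weighted absolute difference between the two representations, the signatures are made-up numbers, and the parameters are untrained. In practice a NN package would learn all of these weights jointly by backpropagating the logistic loss.

```python
import math

def encode(signature, w_in, W_rec):
    """Run a simple tanh recurrent cell over a 1-D signature; return the final hidden state."""
    h = [0.0] * len(w_in)
    for x_t in signature:
        h = [math.tanh(w_in[j] * x_t + sum(W_rec[j][k] * h[k] for k in range(len(h))))
             for j in range(len(h))]
    return h

def match_probability(sig_a, sig_b, params):
    """Encode both signatures with the SAME weights, then score the pair."""
    w_in, W_rec, v, b = params
    h_a = encode(sig_a, w_in, W_rec)
    h_b = encode(sig_b, w_in, W_rec)
    score = b + sum(v_j * abs(a - c) for v_j, a, c in zip(v, h_a, h_b))
    return 1.0 / (1.0 + math.exp(-score))

def logistic_loss(p, is_match):
    """Binary cross-entropy for one pair; is_match is 1 or 0."""
    return -(is_match * math.log(p) + (1 - is_match) * math.log(1.0 - p))

# Hypothetical, untrained parameters with hidden size 2.
params = ([0.8, -0.5],                 # w_in: input-to-hidden weights
          [[0.1, 0.3], [-0.2, 0.4]],   # W_rec: hidden-to-hidden weights
          [-3.0, -3.0],                # v: weights on |h_a - h_b|
          2.0)                         # b: bias of the matching function

sig_1 = [0.1, 0.4, -0.2, 0.3]          # made-up bullet signatures
sig_2 = [0.1, 0.4, -0.2, 0.3]
sig_3 = [-0.5, 0.9, 0.7, -0.8]

p_same = match_probability(sig_1, sig_2, params)
p_diff = match_probability(sig_1, sig_3, params)
print(p_same, p_diff, logistic_loss(p_same, 1))
```

Sharing the encoder weights between the two inputs is what makes the network Siamese. Using the absolute difference keeps the score symmetric, so the order in which two bullets are compared does not matter.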

# Steganography/Steganalysis
- Neural networks have been used for both hiding messages in images and trying to find them.
- Use GANs to create convincing images containing hidden messages.
- Approaches to steganalysis involve using convolutional neural networks with custom activation functions to tease out small changes to pixels.

Ideas:
- Do these approaches work with realistic images?
- Is it even possible to detect messages without some knowledge of the encoding algorithm?
- Can we construct credible sets for the predictions these algorithms generate?