In this notebook we aim to reproduce the methods described in the paper: "Camera model identification with the use of Deep Convolutional Neural Networks" Tauma, A. et al 2016. However, we will try to use the Socrates dataset provided by Eurecom.

Since this is a very preliminary test, we will download only a small sample of the dataset to ensure that the pipeline works.



In [1]:
import tensorflow
from tensorflow import keras
from tensorflow.keras import layers

We need to decide what the input size will be/How are we going to retrieve it

In [None]:
INPUT_SIZE = (256, 256, 3)

In [2]:
#preprocessing, Define a sequential block to preprocess the input
#Exact shape to be discussed, place-holder for now, we just rescale
preprocessing = keras.Sequential(
    [
     layers.experimental.preprocessing.Rescaling(1./255)
    ]
)

In [5]:
#Define a model architecture based on their description 
def get_model(width=256, height=256, channels=3, num_classes=10):
  inputs = keras.Input(shape=(width, height, channels))
  x = preprocessing(inputs)
  x = layers.Conv2D(filters=64, kernel_size=(3,3), strides=(2,2), activation='relu') (x)
  x = layers.Conv2D(filters=64, kernel_size=(3,3), strides=(2,2), activation='relu') (x)
  x = layers.Conv2D(filters=32, kernel_size=(3,3), strides=(2,2), activation='relu') (x)
  x = layers.MaxPooling2D(pool_size=(3,3), strides=(2,2) ) (x)
  x = layers.Dense(256, activation='relu') (x)
  x = layers.Dropout(0.2) (x)
  x = layers.Dense(4096, activation='relu') (x)
  x = layers.Dropout(0.2) (x)
  outputs = layers.Dense(num_classes, activation='softmax') (x)
  return keras.Model(inputs, outputs)

model = get_model()
model.summary()

Model: "functional_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         [(None, 256, 256, 3)]     0         
_________________________________________________________________
sequential (Sequential)      (None, 256, 256, 3)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 127, 127, 64)      1792      
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 63, 63, 64)        36928     
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 31, 31, 32)        18464     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 15, 15, 32)        0         
_________________________________________________________________
dense_3 (Dense)              (None, 15, 15, 256)      

A priori, without having done any test, I am a bit skeptical about this model's architecure. It is an adaptation to AlexNet, which is a bit outdated, maybe other adaptations of more recent architectures may work much better

The next steps are to import a small subset of our data (to test the pipeline), contaning, for example, the images caputred by 10 different models. Of those, for each class, split the images into a train and val set, and perform a training to see if the pipeline works.

**Summary of next steps**
* Import a small subset of images
* Decide how to preprocess the images so all have the same input size (Crop the center 256^2 pixels?)
* Decide the preprocessing pipeline; in the paper they define a de-noising filter, it shouldn't be hard to implement.
* Launch a training just to test the pipeline
* Then, go back to analyzing the whole dataset structure, and decide on an input size, actual train and val sets, number of classes, etc.
* Explore other architectures

