# Ablation Study
The term "ablation study" comes from experimental neuropsychology, where parts of animals' brains were removed and the resulting changes in behavior were observed. Since neural networks are loosely modeled on how brains work, it is easy to see how the term made its way into this field. In machine learning, an ablation study is an experimental check of how a neural network functions: for example, hidden layers or features are removed and the effect of each change is measured with various performance metrics (accuracy, recall, F1, etc.). This helps us better understand how our system works and how individual layers affect the result.
(https://arxiv.org/abs/1901.08644)
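
The procedure can be sketched as a small loop: score a baseline, then score each variant that changes exactly one component, and report the difference. This is only an illustrative sketch; `run_ablation`, `toy_score` and the configuration keys are hypothetical names with made-up numbers, not part of Keras or of the experiments below.

```python
from typing import Callable, Dict


def run_ablation(
    baseline: dict,
    variants: Dict[str, dict],
    evaluate_fn: Callable[[dict], float],
) -> Dict[str, float]:
    """Return the score change of each variant relative to the baseline."""
    baseline_score = evaluate_fn(baseline)
    deltas = {}
    for name, overrides in variants.items():
        # Each variant changes one thing and keeps the rest of the baseline.
        config = {**baseline, **overrides}
        deltas[name] = evaluate_fn(config) - baseline_score
    return deltas


def toy_score(config: dict) -> float:
    """Stand-in for 'train and evaluate'; the numbers are made up."""
    score = 0.90
    if config["use_dropout"]:
        score += 0.02
    if config["num_conv_layers"] >= 2:
        score += 0.05
    return score


baseline = {"use_dropout": True, "num_conv_layers": 2}
variants = {
    "no_dropout": {"use_dropout": False},
    "one_conv_layer": {"num_conv_layers": 1},
}
deltas = run_ablation(baseline, variants, toy_score)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.2f}")
```

In the notebook itself, the role of `evaluate_fn` is played by training a Keras model and evaluating it on the MNIST test set.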

Below I present my experiments, using as an example the CNN implementation for the MNIST dataset from the Keras documentation.
https://keras.io/examples/vision/mnist_convnet/


In [1]:
import os
import random

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing import image

In [2]:
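seed_value = 22
# Seed every source of randomness we control: Python hashing, `random`, NumPy and TensorFlow.
# Note: PYTHONHASHSEED only takes effect if it is set before the interpreter starts.
os.environ['PYTHONHASHSEED'] = str(seed_value)
random.seed(seed_value)
np.random.seed(seed_value)
tf.compat.v1.set_random_seed(seed_value)
# Restrict TensorFlow to a single thread to reduce nondeterministic scheduling.
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)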

Unfortunately, even with random seeds set in all of these places, consecutive runs still do not produce identical results. From what I found, the problem lies in running the code on a GPU; on a CPU the results are the same every time.
https://stackoverflow.com/questions/32419510/how-to-get-reproducible-results-in-keras
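
One commonly cited source of such run-to-run differences is that floating-point addition is not associative, so a parallel reduction that sums the same values in a different order can give a slightly different result. The snippet below only illustrates that effect in plain Python; it makes no claim about which GPU operations are responsible in this notebook.

```python
# Floating-point addition is not associative: the grouping changes the result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right == right_to_left)       # False
print(abs(left_to_right - right_to_left))   # tiny, but non-zero
```

Differences of this size are harmless on their own, but over many training steps they can lead to slightly different final weights and scores.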

## Required Definitions

In [3]:
# Code from Keras example
def load_mnist():
  # Model / data parameters
  num_classes = 10
  input_shape = (28, 28, 1)

  # the data, split between train and test sets
  (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

  # Scale images to the [0, 1] range
  x_train = x_train.astype("float32") / 255
  x_test = x_test.astype("float32") / 255
  # Make sure images have shape (28, 28, 1)
  x_train = np.expand_dims(x_train, -1)
  x_test = np.expand_dims(x_test, -1)
  print("x_train shape:", x_train.shape)
  print(x_train.shape[0], "train samples")
  print(x_test.shape[0], "test samples")


  # convert class vectors to binary class matrices
  y_train = keras.utils.to_categorical(y_train, num_classes)
  y_test = keras.utils.to_categorical(y_test, num_classes)
  return (x_train, y_train), (x_test, y_test)
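
For reference, the two preprocessing steps in `load_mnist` can be illustrated without Keras. This is a plain-Python sketch of what scaling to the [0, 1] range and one-hot encoding produce; `scale_pixels` and `one_hot` are hypothetical helper names, not Keras functions.

```python
def scale_pixels(pixels):
    """Map 8-bit pixel values (0..255) to floats in [0, 1]."""
    return [p / 255 for p in pixels]


def one_hot(labels, num_classes):
    """Turn integer class labels into binary class vectors."""
    return [
        [1.0 if c == label else 0.0 for c in range(num_classes)]
        for label in labels
    ]


scaled = scale_pixels([0, 51, 255])
encoded = one_hot([0, 2], num_classes=3)
print(scaled)
print(encoded)
```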

In [4]:
# Code from Keras Example
def compile_and_fit(data, model_test, batch_size = 128, epochs = 15, verbose=False, optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]):
  (x_train, y_train), (x_test, y_test) = data

  model_test.compile(loss=loss, optimizer=optimizer, metrics=metrics)
  print("Fitting model")
  model_test.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1, verbose = (1 if verbose else 0))

In [5]:
# Code from now on is written by me
def evaluate(data, model_eval):
  (_, _), (x_test, y_test) = data
  score = model_eval.evaluate(x_test, y_test, verbose=0)
  return score

In [6]:
def print_score(score):
  print("Test loss:", score[0])
  print("Test accuracy:", score[1])

In [7]:
data = load_mnist()
def ablation(model, data=data, verbose=False, batch_size = 128, epochs = 15, optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]):
  if verbose:
    print(model.summary())
  compile_and_fit(data, model, verbose=verbose, batch_size = batch_size, epochs = epochs, optimizer=optimizer, loss=loss, metrics=metrics)
  score = evaluate(data, model)
  print_score(score)
  return score

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples


In [8]:
input_shape = (28, 28, 1)
num_classes = 10

In [9]:
layers_original = [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
model_original = keras.Sequential(layers_original)

model_add_conv = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(128, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

model_add_dense = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dense(512, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

model_avg_pool = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.AveragePooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.AveragePooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

model_sigmoid_output = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="sigmoid"),
    ]
)

model_dropout_low = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.2),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

model_dropout_high = keras.Sequential(
    [
        keras.Input(shape=input_shape),
        layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, kernel_size=(3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Flatten(),
        layers.Dropout(0.8),
        layers.Dense(num_classes, activation="softmax"),
    ]
)
#https://keras.io/api/layers/
#https://github.com/fchollet/deep-learning-with-python-notebooks

## Layer Change Tests

In [10]:
original = ablation(model_original, verbose=True)
#Original Model

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1600)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1600)              0         
_________________________________________________________________
dense (Dense)                (None, 10)                1
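
The output shapes and parameter counts in such a summary follow directly from the layer definitions. The sketch below recomputes them for the original architecture, assuming the Keras defaults used above ("valid" padding and stride 1 for convolutions, pooling stride equal to the pool size); the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping pooling (floor division)."""
    return size // pool


def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


size = 28
size = conv_out(size)        # 26
size = pool_out(size)        # 13
size = conv_out(size)        # 11
size = pool_out(size)        # 5
flat = size * size * 64      # 1600 features after Flatten

params_conv1 = conv_params(1, 32)    # 320
params_conv2 = conv_params(32, 64)   # 18496
params_dense = flat * 10 + 10        # weights plus biases of the output layer

print(flat, params_conv1, params_conv2, params_dense)
```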

In [11]:
ablation(model_add_conv)
#One additional convolution layer with 128 filters and relu activation

Fitting model
Test loss: 0.03916343301534653
Test accuracy: 0.9883999824523926


[0.03916343301534653, 0.9883999824523926]

In [12]:
ablation(model_add_dense)
#One additional dense layer with 512 neurons and relu activation

Fitting model
Test loss: 0.02572786621749401
Test accuracy: 0.9934999942779541


[0.02572786621749401, 0.9934999942779541]

In [13]:
ablation(model_avg_pool)
#Pooling change from max2D to avg2D

Fitting model
Test loss: 0.03129206970334053
Test accuracy: 0.989799976348877


[0.03129206970334053, 0.989799976348877]
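
To make the difference between the two pooling variants concrete, here is a plain-Python sketch of 2x2 max and average pooling on a made-up 4x4 feature map; `pool2x2` is a hypothetical helper, not the Keras implementation.

```python
def pool2x2(grid, reduce_fn):
    """Apply reduce_fn to each non-overlapping 2x2 window of a 2D list."""
    pooled = []
    for r in range(0, len(grid) - 1, 2):
        row = []
        for c in range(0, len(grid[0]) - 1, 2):
            window = [grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1]]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


feature_map = [
    [1, 0, 0, 0],
    [0, 0, 0, 8],
    [2, 2, 4, 4],
    [2, 2, 4, 4],
]
max_pooled = pool2x2(feature_map, max)
avg_pooled = pool2x2(feature_map, mean)
print(max_pooled)  # [[1, 8], [2, 4]]
print(avg_pooled)  # [[0.25, 2.0], [2.0, 4.0]]
```

Max pooling keeps the strongest activation in each window, while average pooling smooths isolated peaks; in uniform regions the two agree.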

## Layer Parameter Changes

In [14]:
print("Original Model Score")
print_score(original)

Original Model Score
Test loss: 0.02682262286543846
Test accuracy: 0.991599977016449


In [15]:
ablation(model_sigmoid_output)
#Sigmoid activation function in the output dense layer (prev. softmax)

Fitting model
Test loss: 0.026978613808751106
Test accuracy: 0.9905999898910522


[0.026978613808751106, 0.9905999898910522]
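
One possible reason the accuracy barely moves: the predicted class is the arg max of the output vector, and both softmax and sigmoid increase monotonically with each logit, so for the same logits they pick the same class. A small sketch with made-up logits (the trained networks will of course not have identical logits):

```python
import math


def softmax(logits):
    shifted = [z - max(logits) for z in logits]  # for numerical stability
    exps = [math.exp(z) for z in shifted]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    return [1 / (1 + math.exp(-z)) for z in logits]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


logits = [-1.2, 0.3, 2.5, 0.1]  # made-up example values
print(argmax(softmax(logits)), argmax(sigmoid(logits)))  # 2 2
print(sum(softmax(logits)))  # close to 1
print(sum(sigmoid(logits)))  # need not sum to 1
```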

In [16]:
ablation(model_dropout_low)
#Only the dropout factor is changed from the original model (from 0.5 to 0.2)

Fitting model
Test loss: 0.025646276772022247
Test accuracy: 0.9919000267982483


[0.025646276772022247, 0.9919000267982483]

In [17]:
ablation(model_dropout_high)
#Only the dropout factor is changed from the original model (from 0.5 to 0.8)

Fitting model
Test loss: 0.03042452409863472
Test accuracy: 0.9894999861717224


[0.03042452409863472, 0.9894999861717224]
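
For context, dropout at rate r zeroes each input with probability r during training and scales the surviving values by 1 / (1 - r), so the expected activation stays the same; at inference time it passes values through unchanged. A plain-Python sketch of this "inverted dropout" (the function name and the seeded generator are illustrative assumptions):

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability `rate`; rescale the survivors."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(22)
activations = [1.0] * 10_000
for rate in (0.2, 0.5, 0.8):
    dropped = dropout(activations, rate, rng)
    zero_fraction = dropped.count(0.0) / len(dropped)
    mean_activation = sum(dropped) / len(dropped)
    print(f"rate={rate}: zeroed={zero_fraction:.2f}, mean={mean_activation:.2f}")
```

At a rate of 0.8 only about one in five of the 1600 flattened features reaches the output layer in any given training step, which makes the training signal much noisier.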

## Training Parameter Changes

In [18]:
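# NOTE: keras.Sequential(layers_original) reuses the same layer objects, so all of
# these models share weights with the already trained model_original.
# For independently initialized copies, keras.models.clone_model(model_original)
# could be used instead.
models = [keras.Sequential(layers_original) for _ in range(3)]

ablation(models[0], verbose=True, batch_size=512, epochs=10)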

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 11, 11, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1600)              0         
_________________________________________________________________
dropout (Dropout)            (None, 1600)              0         
_________________________________________________________________
dense (Dense)                (None, 10)               

[0.02474917471408844, 0.9907000064849854]

In [19]:
ablation(models[1], batch_size=64, epochs=20)

Fitting model
Test loss: 0.02691711112856865
Test accuracy: 0.9923999905586243


[0.02691711112856865, 0.9923999905586243]

In [20]:
ablation(models[2], batch_size=128, epochs=15, loss='poisson', optimizer='Adadelta')

Fitting model
Test loss: 0.10261466354131699
Test accuracy: 0.9926999807357788


[0.10261466354131699, 0.9926999807357788]
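
The loss value above is not on the same scale as the earlier ones. Keras documents its Poisson loss as the mean over the last axis of `y_pred - y_true * log(y_pred)`; for a one-hot target and a softmax output over 10 classes this works out to (1 + cross-entropy) / 10. The sketch below checks that relation in plain Python on a made-up prediction (the small epsilon Keras adds inside the logarithm is ignored):

```python
import math


def categorical_crossentropy(y_true, y_pred):
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))


def poisson(y_true, y_pred):
    terms = [p - t * math.log(p) for t, p in zip(y_true, y_pred)]
    return sum(terms) / len(terms)


# Made-up softmax output for a sample whose true class is 3.
y_pred = [0.001, 0.001, 0.001, 0.991, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001]
y_true = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

ce = categorical_crossentropy(y_true, y_pred)
po = poisson(y_true, y_pred)
print(ce, po, (1 + ce) / len(y_pred))
```

Under that relation, a Poisson loss of about 0.1026 corresponds to a cross-entropy of roughly 0.026, which is in the same range as the other runs.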

## Data Augmentation

In [21]:
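# NOTE: horizontal flips and rotations of up to 20 degrees can turn a digit into
# something that no longer looks like its class, and the feature-wise
# normalization below is applied only to the batches produced by the generator.
datagenerator = image.ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    horizontal_flip=True)
(X_train, y_train), (X_test, y_test) = data
# NOTE: reuses the layer objects (and trained weights) of model_original.
model_original_2 = keras.Sequential(layers_original)

# Compute the dataset mean and std needed for feature-wise normalization.
datagenerator.fit(X_train)

model_original_2.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model_original_2.fit(datagenerator.flow(X_train, y_train, batch_size = 128), epochs = 15)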

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<tensorflow.python.keras.callbacks.History at 0x7fe66632c310>

In [22]:
# NOTE: X_test is evaluated as-is, without the generator's feature-wise normalization.
print_score(model_original_2.evaluate(X_test, y_test, verbose=0))

Test loss: 0.24782820045948029
Test accuracy: 0.9642000198364258
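
One thing to keep in mind when reading this number: `featurewise_center` and `featurewise_std_normalization` standardize the training batches, while `X_test` is passed to `evaluate` unchanged. The sketch below shows, in plain Python with made-up pixel values, what applying the training-set statistics to both splits would look like; `fit_standardizer` is a hypothetical helper, not a Keras function.

```python
import statistics


def fit_standardizer(train_values):
    """Return a function that applies the training mean and std to any data."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)

    def standardize(values):
        return [(v - mean) / std for v in values]

    return standardize


train_pixels = [0.0, 0.0, 0.2, 0.9, 1.0]  # made-up values in [0, 1]
test_pixels = [0.0, 0.5, 1.0]

standardize = fit_standardizer(train_pixels)
print(standardize(train_pixels))  # what the model sees during training
print(standardize(test_pixels))   # same transform, so inputs are comparable
print(test_pixels)                # raw inputs live on a different scale
```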


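## Conclusions
An ablation study is a useful tool for better understanding how a neural network works. In this case, only the run with loss='poisson' and optimizer='Adadelta' produced a noticeably different number, and only in the final loss value (about 0.103, versus roughly 0.025 to 0.04 in the other runs). Because that value is computed with a different loss function, it is not directly comparable, and the accuracy was essentially unchanged. Data augmentation as configured here does not suit MNIST: horizontal flips and rotations distort the digits, and test accuracy dropped markedly (0.964 versus 0.992 for the original model). Part of that drop may also come from evaluating on a test set that was not normalized the way the augmented training batches were. None of the remaining changes had a significant effect on accuracy; across repeated runs (because of GPU nondeterminism) the values were sometimes higher and sometimes lower than those of the original model. This appears to be because MNIST is an exceptionally easy task for neural networks. Note also that the models in the training-parameter and augmentation sections were built from the same layer objects as the original model, so they started from its trained weights rather than from scratch.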
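
To put "no significant effect" into numbers, one can estimate the sampling noise of an accuracy measured on 10,000 test images with the binomial standard error sqrt(p(1 - p) / n). This is a rough sketch that treats test predictions as independent and ignores the run-to-run training variance:

```python
import math


def accuracy_standard_error(accuracy, n_samples):
    """Binomial standard error of an accuracy estimate."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_samples)


baseline_acc = 0.9916   # original model, from the results above
n_test = 10_000

se = accuracy_standard_error(baseline_acc, n_test)
print(f"standard error: {se:.4f}")
print(f"approx. 95% interval: +/- {1.96 * se:.4f}")
```

Differences of a few tenths of a percentage point between variants are therefore of the same order as this noise, which is consistent with the variants trading places between runs.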