# Denoising with Noise2Void
This notebook is adapted from the 2D example [denoising2D_SEM](https://github.com/juglab/n2v/tree/master/examples/2D/denoising2D_SEM) of the GitHub implementation.

We now turn to denoising with deep learning. Denoising was not a problem for the *E. coli* swarming data. For many of our microscope images of biofilms, however, denoising is more difficult.

In [None]:
# We import all our dependencies.
import tensorflow as tf
from n2v.models import N2VConfig, N2V
import skimage.filters
import numpy as np
from csbdeep.utils import plot_history
from n2v.utils.n2v_utils import manipulate_val_data
from n2v.internals.N2V_DataGenerator import N2V_DataGenerator
from matplotlib import pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import urllib
import os
import zipfile

## Selecting a GPU
This workstation has 4 GPUs. To keep this notebook from allocating all four, we tell CUDA to use only the first GPU by setting the [environment variable](https://de.wikipedia.org/wiki/Umgebungsvariable) `CUDA_VISIBLE_DEVICES`. With `tf.config.list_physical_devices` we can list the available compute devices.

In [None]:
%env CUDA_DEVICE_ORDER=PCI_BUS_ID
%env CUDA_VISIBLE_DEVICES=0

In [None]:
tf.config.list_physical_devices()
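
The `%env` magics only work inside a notebook. As a minimal sketch, the same selection could be done in plain Python, assuming the variables are set before TensorFlow first initializes CUDA. The helper name `select_gpu` is hypothetical and not part of n2v or TensorFlow.

```python
import os

def select_gpu(index):
    """Restrict CUDA to a single GPU (hypothetical helper, not part of n2v)."""
    # Order devices by PCI bus ID, as in the %env cell above.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Only the listed device is visible to CUDA applications started afterwards.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]

visible = select_gpu(0)
print(visible)
```

Setting the variables after TensorFlow has already claimed the GPUs has no effect, so a call like this belongs at the very top of a script.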


# Training Data Preparation

For training we load __one__ set of low-SNR images and use the <code>N2V_DataGenerator</code> to extract training <code>X</code> and validation <code>X_val</code> patches.

In [None]:
# We create our DataGenerator-object.
# It will help us load data and extract patches for training and validation.
datagen = N2V_DataGenerator()

In [None]:
# We load all the '.tif' files from the 'data' directory.
# If you want to load other types of files see the RGB example.
# The function will return a list of images (numpy arrays).
#imgs = datagen.load_imgs_from_directory(directory = "data/")
imgs = datagen.load_imgs_from_directory(directory = "/extdata/readonly/f-prak-v15/v-cholera-biofilm")

# Let's look at the shape of the images.
print(imgs[0].shape, imgs[1].shape)
# The function automatically added two extra dimensions to the images:
# One at the beginning, is used to hold a potential stack of images such as a movie.
# One at the end, represents channels.
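
The two extra dimensions described in the comments can be illustrated with plain NumPy. This is only a sketch of the resulting layout, not the loader's actual code; the image size is an assumption.

```python
import numpy as np

# Stand-in for a single 2D image as read from a '.tif' file (assumed size).
image_2d = np.zeros((1024, 1024), dtype=np.uint16)

# Add a leading sample axis S and a trailing channel axis C: (S, Y, X, C).
image_syxc = image_2d[np.newaxis, ..., np.newaxis]
print(image_syxc.shape)

# Indexing with [0, ..., 0] recovers the original 2D image.
recovered = image_syxc[0, ..., 0]
print(recovered.shape)
```

This is why later cells index the loaded images with `[0, ..., 0]` before plotting them.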

## Denoising with a Gaussian filter
We look at a random image and denoise it with a Gaussian filter. You can also try other filters, sizes, and sigmas to denoise the image.

In [None]:
img = imgs[np.random.randint(0, len(imgs))][0, 300:700, 300:700, 0]
denoise = skimage.filters.gaussian(img.astype(float), sigma=0.8)

plt.figure(figsize=(16,10))
plt.subplot(121)
plt.imshow(img)
plt.axis('off')
plt.title('Raw image')
plt.subplot(122)
plt.imshow(denoise)
plt.axis('off')
plt.title('Gaussian filter')
None;
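
To make clear what a Gaussian filter does, here is a minimal NumPy sketch of a separable Gaussian blur. It is an illustration under simple assumptions (reflected borders, kernel cut off at four sigma) and not the scikit-image implementation, whose boundary handling and other details may differ. The function names are hypothetical.

```python
import numpy as np

def gaussian_kernel_1d(sigma, truncate=4.0):
    """Normalized 1D Gaussian kernel, cut off at truncate * sigma."""
    radius = int(truncate * sigma + 0.5)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    return kernel / kernel.sum()

def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter along rows, then along columns."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    # Reflect the image at its borders so the output keeps the input size.
    padded = np.pad(image.astype(float), radius, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

# Synthetic noisy image: constant signal plus Gaussian noise (made-up data).
demo_rng = np.random.default_rng(0)
noisy = 100.0 + demo_rng.normal(scale=10.0, size=(32, 32))
blurred = gaussian_blur(noisy, sigma=0.8)
print(noisy.std(), blurred.std())
```

Each output pixel is a weighted average of its neighbors, which suppresses noise but also blurs edges. A larger sigma averages over a wider neighborhood.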

## Denoising with deep learning
Noise2Void is an unsupervised method. Deep learning methods always have to be trained on some data. Below, the images we loaded above are prepared for training: patches are cut out of the large images, and a training set and a validation set are created again. No test set is needed here because the method is unsupervised.

In [None]:
# We use the first three quarters of the images to extract training patches and store them in 'X'.
imgs_len = len(imgs)
split = imgs_len // 4 * 3
X = datagen.generate_patches_from_list(imgs[:split], shape=(96,96))

# We use the remaining images to extract validation patches,
# so that training and validation images do not overlap.
X_val = datagen.generate_patches_from_list(imgs[split:], shape=(96,96))

# Patches are created so they do not overlap.
# (Note: this is not the case if you specify a number of patches. See the docstring for details!)
# Non-overlapping patches would also allow us to split them into a training and validation set 
# per image. This might be an interesting alternative to the split we performed above.
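
The non-overlapping tiling and a split by image can be sketched in plain NumPy. This is a simplified stand-in for illustration, not the `N2V_DataGenerator` implementation; `tile_patches` and the demo images are hypothetical.

```python
import numpy as np

def tile_patches(image, patch_shape):
    """Cut a 2D image into non-overlapping patches; incomplete border tiles are dropped."""
    height, width = image.shape
    ph, pw = patch_shape
    patches = [
        image[y:y + ph, x:x + pw]
        for y in range(0, height - ph + 1, ph)
        for x in range(0, width - pw + 1, pw)
    ]
    return np.stack(patches)

# Hypothetical stand-ins for eight loaded 2D images, each filled with its own index.
demo_imgs = [np.full((200, 300), i, dtype=float) for i in range(8)]

# Disjoint split by image: first three quarters for training, the rest for validation.
demo_split = len(demo_imgs) // 4 * 3
train_patches = np.concatenate([tile_patches(im, (96, 96)) for im in demo_imgs[:demo_split]])
val_patches = np.concatenate([tile_patches(im, (96, 96)) for im in demo_imgs[demo_split:]])
print(train_patches.shape, val_patches.shape)
```

Because every image ends up on exactly one side of the split, no validation patch comes from an image the network has seen during training.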

In [None]:
# Just in case you don't know how to access the docstring of a method:
#datagen.generate_patches_from_list?

In [None]:
# Let's look at one of our training and validation patches.
plt.figure(figsize=(14,7))
plt.subplot(1,2,1)
plt.imshow(X[0,...,0], cmap='magma')
plt.title('Training Patch');
plt.subplot(1,2,2)
plt.imshow(X_val[100,...,0], cmap='magma')
plt.title('Validation Patch');

# Configuration
Deep learning models find their own optimal parameters to achieve good results. Nevertheless, a great many settings can be configured by hand. They concern the network architecture, the type and length of training, and much more. Many methods can be configured through a config class.

Noise2Void comes with a special config-object, where we store network-architecture and training specific parameters. See the docstring of the <code>N2VConfig</code> constructor for a description of all parameters.

When creating the config-object, we provide the training data <code>X</code>. From <code>X</code> we extract <code>mean</code> and <code>std</code> that will be used to normalize all data before it is processed by the network. We also extract the dimensionality and number of channels from <code>X</code>.
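
As a rough sketch of what such a normalization looks like, assuming a single global mean and standard deviation computed over all training patches (the exact handling inside n2v, for example per channel, may differ):

```python
import numpy as np

def normalize(data, mean, std):
    """Shift and scale the data to zero mean and unit standard deviation."""
    return (data - mean) / std

def denormalize(data, mean, std):
    """Invert the normalization, e.g. for a network prediction."""
    return data * std + mean

# Made-up training patches with layout (S, Y, X, C).
demo_rng = np.random.default_rng(0)
X_demo = demo_rng.normal(loc=120.0, scale=15.0, size=(8, 96, 96, 1))

mean, std = X_demo.mean(), X_demo.std()
X_norm = normalize(X_demo, mean, std)
print(X_norm.mean(), X_norm.std())
```

The same `mean` and `std` have to be reused at prediction time, which is why they are stored in the config rather than recomputed per image.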

Compared to supervised training (i.e. traditional CARE), we recommend using N2V with an increased <code>train_batch_size</code> and with <code>batch_norm</code> enabled.
To keep the network from learning the identity we have to manipulate the input pixels during training. For this we have the parameter <code>n2v_manipulator</code> with default value <code>'uniform_withCP'</code>. Most pixel manipulators will compute the replacement value based on a neighborhood. With <code>n2v_neighborhood_radius</code> we can control its size. 

Other pixel manipulators:
* normal_withoutCP: samples the neighborhood according to a normal gaussian distribution, but without the center pixel
* normal_additive: adds a random number to the original pixel value. The random number is sampled from a gaussian distribution with zero-mean and sigma = <code>n2v_neighborhood_radius</code>
* normal_fitted: uses a random value from a gaussian normal distribution with mean equal to the mean of the neighborhood and standard deviation equal to the standard deviation of the neighborhood.
* identity: performs no pixel manipulation
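
The idea behind a neighborhood-based manipulator such as <code>uniform_withCP</code> can be sketched as follows. This is a simplified illustration, not the n2v implementation: the function name is hypothetical, and the window definition (a square of side 2 * radius + 1, clipped at the patch border) is an assumption.

```python
import numpy as np

def mask_uniform_with_cp(patch, coords, radius, rng):
    """Replace each selected pixel by a random pixel from its neighborhood.

    The neighborhood includes the center pixel itself ("withCP"),
    so a selected pixel can occasionally keep its own value.
    """
    masked = patch.copy()
    height, width = patch.shape
    for y, x in coords:
        dy, dx = rng.integers(-radius, radius + 1, size=2)
        # Clip to the patch so that pixels near the border stay valid.
        ny = int(np.clip(y + dy, 0, height - 1))
        nx = int(np.clip(x + dx, 0, width - 1))
        masked[y, x] = patch[ny, nx]
    return masked

demo_rng = np.random.default_rng(0)
demo_patch = demo_rng.normal(size=(64, 64))
demo_coords = [(10, 10), (20, 40), (63, 0)]
masked = mask_uniform_with_cp(demo_patch, demo_coords, radius=2, rng=demo_rng)

# Only the selected coordinates can differ from the original patch.
changed = np.argwhere(masked != demo_patch)
print(changed.tolist())
```

During training the network receives the manipulated patch and is asked to predict the original values at the manipulated positions, so copying the input pixel no longer solves the task.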

For faster training multiple pixels per input patch can be manipulated. In our experiments we manipulated about 0.198% of the input pixels per patch. For a patch size of 64 by 64 pixels this corresponds to about 8 pixels. This fraction can be tuned via <code>n2v_perc_pix</code>.
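
The arithmetic behind these numbers can be checked quickly. The helper below is hypothetical; it only multiplies the patch area by the percentage.

```python
def manipulated_pixels(patch_shape, perc_pix):
    """Expected number of manipulated pixels for a patch and a percentage value."""
    n_pixels = 1
    for size in patch_shape:
        n_pixels *= size
    return n_pixels * perc_pix / 100.0

# 64 x 64 patch as in the text: about 8 pixels.
per_patch_64 = manipulated_pixels((64, 64), 0.198)
# 96 x 96 patch as used in this notebook: about 18 pixels.
per_patch_96 = manipulated_pixels((96, 96), 0.198)
print(per_patch_64, per_patch_96)
```

With the 96 by 96 patches used in this notebook, the same percentage therefore manipulates roughly 18 pixels per patch.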

For Noise2Void training it is possible to pass arbitrarily large patches to the training method. From these patches random subpatches of size <code>n2v_patch_shape</code> are extracted during training. Default patch shape is set to (64, 64).  
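
Extracting a random subpatch amounts to a random crop. A minimal sketch, assuming 2D patches and a hypothetical helper name (the n2v data pipeline handles this internally):

```python
import numpy as np

def random_subpatch(patch, shape, rng):
    """Crop a random window of the given shape out of a larger 2D patch."""
    ph, pw = shape
    y0 = int(rng.integers(0, patch.shape[0] - ph + 1))
    x0 = int(rng.integers(0, patch.shape[1] - pw + 1))
    return patch[y0:y0 + ph, x0:x0 + pw]

demo_rng = np.random.default_rng(0)
large_patch = demo_rng.normal(size=(96, 96))
sub = random_subpatch(large_patch, (64, 64), demo_rng)
print(sub.shape)
```

If the subpatch shape equals the patch shape, as with the 96 by 96 setting used below, the crop always returns the full patch.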

<font color='red'>Warning:</font> to make this example notebook execute faster, we have set <code>train_epochs</code> to only 10. <br>For better results we suggest 100 to 200 <code>train_epochs</code>.

## We start with only 10 epochs
Once we have understood the rest of the notebook, we can train for longer.

In [None]:
# train_steps_per_epoch is set to (number of training patches)/(batch size),
# so that each training patch is shown about once per epoch.
config = N2VConfig(X, unet_kern_size=3, 
                   train_steps_per_epoch=int(X.shape[0]/128), train_epochs=10, train_loss='mse', batch_norm=True, 
                   train_batch_size=128, n2v_perc_pix=0.198, n2v_patch_shape=(96, 96), 
                   n2v_manipulator='uniform_withCP', n2v_neighborhood_radius=5)

# Let's look at the parameters stored in the config-object.
vars(config)
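
A small worked example of the step count, using a made-up number of training patches. Integer division truncates, so up to one batch worth of patches is not covered by the steps of a single epoch.

```python
def steps_per_epoch(n_patches, batch_size):
    """Number of full batches that fit into the training set."""
    return n_patches // batch_size

n_patches = 10_000   # hypothetical number of training patches
batch_size = 128     # as in the config above

steps = steps_per_epoch(n_patches, batch_size)
leftover = n_patches - steps * batch_size
print(steps, leftover)
```

A larger batch size therefore means fewer steps per epoch for the same amount of data.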

In [None]:
# a name used to identify the model
model_name = 'n2v_biofilm'
# the base directory in which our model will live
basedir = 'models'
# We are now creating our network model.
model = N2V(config, model_name, basedir=basedir)

# Training

Training the model will likely take some time. We recommend monitoring the progress with TensorBoard, which allows you to inspect the losses during training. Furthermore, you can look at the predictions for some of the validation images, which can be helpful to recognize problems early on.

You can start TensorBoard in a terminal from the current working directory with <code>tensorboard --logdir=.</code> Then connect to http://localhost:6006/ with your browser.

In [None]:
# We are ready to start training now.
history = model.train(X, X_val)

## Proper training
Once the first 10 epochs have finished, we can increase the number of epochs and run the proper training. Then we can already move on to the next notebook.

### After training, let's plot training and validation loss.
What can you see in these graphs?

In [None]:
print(sorted(list(history.history.keys())))
plt.figure(figsize=(16,5))
plot_history(history,['loss','val_loss']);
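
One way to read such curves is to compare the two losses numerically. The sketch below uses made-up loss values for illustration; in the notebook, `history.history` could be passed instead. The helper `summarize_history` is hypothetical.

```python
def summarize_history(history_dict):
    """Report the epoch with the lowest validation loss and the final train/val gap."""
    train = history_dict["loss"]
    val = history_dict["val_loss"]
    best_index = min(range(len(val)), key=val.__getitem__)
    return {
        "best_epoch": best_index + 1,          # epochs counted from 1
        "best_val_loss": val[best_index],
        "final_gap": val[-1] - train[-1],      # positive: validation worse than training
    }

# Made-up values: the training loss keeps falling, the validation loss turns upward.
example_history = {
    "loss": [1.00, 0.80, 0.70, 0.65, 0.62],
    "val_loss": [1.05, 0.85, 0.78, 0.79, 0.83],
}
summary = summarize_history(example_history)
print(summary)
```

If both curves are still falling at the last epoch, longer training is likely to help. A validation loss that rises while the training loss keeps falling points to overfitting.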

In [None]:
input_val = imgs[0][0, ..., 0]
pred_val = model.predict(input_val, axes='YX')

In [None]:
sigma = 1
gauss = skimage.filters.gaussian(input_val, sigma)
slx = slice(400, 600)
sly = slice(400, 600)
# Let's look at the results.
plt.figure(figsize=(16,16))
plt.subplot(2,2,1)
plt.imshow(input_val[sly, slx], cmap="magma")
plt.title('Input');
plt.subplot(2, 2, 2)
plt.imshow(pred_val[sly, slx], cmap="magma")
plt.title('Prediction');
plt.subplot(2, 2, 3)
plt.imshow(gauss[sly, slx], cmap="magma")
plt.title('Gaussian filter, sigma = {:0.1f}'.format(sigma));