# README

1. Segmentation
    - 1.1 Load images and convert to png
    - 1.2 Annotate images using cvat
    - 1.3 Preprocessing of images
        - Create training/validation/test fraction text files
        - Normalization
        - Creating border labels
        - Augmentation (elastic deformation)
    - 1.4 Training
    - 1.5 Prediction
    - 1.6 Evaluation

## 1. Segmentation

With the help of the scripts from the Broad Bioimage Benchmark Collection (BBBC), found at https://github.com/carpenterlab/unet4nuclei,
I have recreated their experiment using images from Aits lab.

The BBBC repository contains the full scripts for their experiments; for our purpose the scripts have been modified.

Some modifications were needed because of outdated versions of Python packages, and some because of different image formats.
In our experiment it was also important to include micronuclei in the model prediction, which the BBBC group did not cover, so the scripts have been modified for that purpose.


#### Set up folders and scripts

In the folder where the scripts are stored, there should also be a script called config.py and a folder named utils containing scripts with functions. The file tree should look like this:


**2_Final_Models**  
- **1_Model1**
    - **Unet**
        - 02-training.py
        - 03-prediction.py
        - 04-evaluation.ipynb
        - config.py
        - **utils**
            - augmentation.py
            - data_provider.py
            - dirtools.py
            - evaluation.py
            - experiment.py
            - metrics.py
            - model_builder.py
            - objectives.py
                
- **2_Model2**
    - **Unet**
        - 02-training.py
        - 03-prediction.py
        - 04-evaluation.ipynb
        - config.py
        - **utils**
            - augmentation.py
            - data_provider.py
            - dirtools.py
            - evaluation.py
            - experiment.py
            - metrics.py
            - model_builder.py
            - objectives.py
                
- **3_Model3**
    - **Unet**
        - 02-training.py
        - 03-prediction.py
        - 04-evaluation.ipynb
        - config.py
        - **utils**
            - augmentation.py
            - data_provider.py
            - dirtools.py
            - evaluation.py
            - experiment.py
            - metrics.py
            - model_builder.py
            - objectives.py
                
- **4_Model4**
    - **Unet**
        - 02-training.py
        - 03-prediction.py
        - 04-evaluation.ipynb
        - config.py
        - **utils**
            - augmentation.py
            - data_provider.py
            - dirtools.py
            - evaluation.py
            - experiment.py
            - metrics.py
            - model_builder.py
            - objectives.py
                
- **data**
    - **1_raw_annotations**
    - **2_raw_images**
    - **3_preprocessing_of_data**
        - 00-load-and-reformat-dataset.py
        - 01-Augmentation.py 
        - config.py 
    - **4_filelists**
        - 1-2_training.txt
        - 3_training.txt
        - 4_training.txt
        - TEST.txt
        - VALIDATION.txt
    

The folder **2_Final_Models/data/1_raw_annotations** is where the annotations will be stored.

The folder **2_Final_Models/data/2_raw_images** is where the images will be stored.
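
The data folders can be checked, and created if missing, before any script is run. The following is a minimal sketch and not part of the original scripts: `EXPECTED_DATA_DIRS`, `missing_data_dirs` and `create_data_dirs` are hypothetical names, and the folder names are the ones listed in the file tree above.

```python
from pathlib import Path

# Sub-folders that the file tree above places under 2_Final_Models/data
EXPECTED_DATA_DIRS = [
    "1_raw_annotations",
    "2_raw_images",
    "3_preprocessing_of_data",
    "4_filelists",
]


def missing_data_dirs(data_root):
    """Return the expected sub-folders that do not exist under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_DATA_DIRS if not (root / name).is_dir()]


def create_data_dirs(data_root):
    """Create every missing sub-folder; existing folders are left untouched."""
    for name in missing_data_dirs(data_root):
        (Path(data_root) / name).mkdir(parents=True)
```

Calling `missing_data_dirs` on the data folder returns an empty list when the layout is complete.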


The config.py file is set up to look like this:

## config.py

```python
import os
import utils.dirtools

config_vars = {}

# ************ 01 ************ #
# ****** PREPROCESSING ******* #
# **************************** #

# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# 01.01 INPUT DIRECTORIES AND FILES

config_vars["root_directory"] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/'

# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# 01.02 DATA PARTITION INFO

## Maximum number of training images (use 0 for all)
config_vars["max_training_images"] = 0

## Generate partitions?
## If False, load predefined partitions (training.txt, validation.txt and test.txt)
config_vars["create_split_files"] = False

# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# 01.03 IMAGE STORAGE OPTIONS

## Transform gray scale TIF images to PNG
config_vars["transform_images_to_PNG"] = True
config_vars["pixel_depth"] = 8

# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# 01.04 PRE-PROCESSING OF ANNOTATIONS

## Minimum object area in pixels
config_vars["min_nucleus_size"] = 10

## Pixels of the boundary (min 2 pixels)
config_vars["boundary_size"] = 2

# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# 01.05 DATA AUGMENTATION USING ELASTIC DEFORMATIONS

## Elastic deformation takes a lot of time to compute.
## It is computed only once in the preprocessing.
config_vars["augment_images"] =  False

## Augmentation parameters. 
## Calibrate parameters using the 00-elastic-deformation.ipynb
config_vars["elastic_points"] = 16
config_vars["elastic_distortion"] = 5

## Number of augmented images
config_vars["elastic_augmentations"] = 10


# ************ 02 ************ #
# ********* TRAINING ********* #
# **************************** #

# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# 02.01 OPTIMIZATION

config_vars["learning_rate"] = 1e-4

config_vars["epochs"] = 15

config_vars["steps_per_epoch"] = 500

# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# 02.02 BATCHES

config_vars["batch_size"] = 10

config_vars["val_batch_size"] = 10

# >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# 02.03 DATA NORMALIZATION

config_vars["rescale_labels"] = True

config_vars["crop_size"] = 256

# ************ 03 ************ #
# ******** PREDICTION ******** #
# **************************** #

config_vars["cell_min_size"] = 30

config_vars["boundary_boost_factor"] = 1

# ************ 04 ************ #
# ******** EVALUATION ******** #
# **************************** #

config_vars["object_dilation"] = 3

# **************************** #
# ******** FINAL SETUP ******* #
# **************************** #

config_vars = utils.dirtools.setup_working_directories(config_vars)

```

### 1.1 Load images and convert to png


I have worked with 100 randomly selected images from the full data set of approx. 6 million images. The images are in the .C01 format, which is a microscopy format. The BBBC scripts expect the input images to be .tiff or .png, so all 100 images are first converted to png.

For this I have created a script that takes a directory as input and writes the converted images to a chosen output directory.

The script uses argparse for its command-line interface. To use it, download the script and follow these steps:

#### Downloading and installing bftools:

```bash
cd ~/bin
wget http://downloads.openmicroscopy.org/latest/bio-formats/artifacts/bftools.zip
unzip bftools.zip
rm bftools.zip
export PATH=$PATH:~/bin/bftools
```

#### Installing required python packages:

```bash
# argparse, os, subprocess and pathlib ship with Python; only tqdm needs installing
pip install tqdm
```

#### The program is run like:

```bash
python3 format_convertion.py -i INDIR -o OUTDIR -ift IN_FILETYPE -oft OUT_FILETYPE
```
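
Before converting, the script pairs every input file that has the chosen extension with an output path in the new directory. A minimal sketch of that pairing step, assuming a hypothetical helper `pair_input_output` (the script itself builds two lists in a loop):

```python
from pathlib import Path


def pair_input_output(input_dir, output_dir, in_ext, out_ext):
    """Pair each file ending in `.in_ext` with an output path ending in `.out_ext`.

    Only the last extension is replaced, so `plate.A01.C01` maps to `plate.A01.png`.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pairs = []
    for path in sorted(input_dir.iterdir()):
        if path.is_file() and path.suffix == "." + in_ext:
            pairs.append((path, output_dir / (path.stem + "." + out_ext)))
    return pairs
```

Each pair can then be passed to the conversion tool as input and output file.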

## Script format_convertion.py:

```python
import argparse
import os
import os.path
import subprocess
from tqdm import tqdm
from pathlib import Path

#################################### ARGPARSE ##################################
usage = 'Enter the directory name where the files to convert are located, what format to convert the files to \
and name a directory where you want the converted files to end up.'
parser = argparse.ArgumentParser(description=usage)
parser.add_argument(
    '-i',
    dest = 'infile',
    metavar = 'INDIR',
    type = str,
    help = 'set the directory where the input files are located',
    required = True
    )
parser.add_argument(
    '-o',
    dest = 'outfile',
    metavar = 'OUTDIR',
    type = str,
    help = 'set the directory to store the converted files',
    required = True
    )
parser.add_argument(
    '-ift',
    dest = 'input_filetype',
    metavar = 'IN_FILETYPE',
    type = str,
    help = 'Set what format the input files are, e.g C01 png',
    required = True
    )
parser.add_argument(
    '-oft',
    dest = 'output_filetype',
    metavar = 'OUT_FILETYPE',
    type = str,
    help = 'Chose format to convert to, e.g. tiff or png',
    required = True
    )
args = parser.parse_args()
################################################################################

# Convert the input to the absolute path
input_dir = os.path.abspath(args.infile)
output_dir = os.path.abspath(args.outfile)


out_filetype = '.{}'.format(args.output_filetype)
in_filetype = args.input_filetype

# If the output directory does not exist,
# a directory will be created with that name.
my_file = Path(output_dir)
if not my_file.exists():
    os.mkdir(output_dir)

# If the path provided is not a directory, raise error
if not os.path.isdir(input_dir):
    raise argparse.ArgumentTypeError('Input must be a directory')
if not os.path.isdir(output_dir):
    raise argparse.ArgumentTypeError('Output must be a directory')

input_files = []
converted_files = []
os.chdir(input_dir)
for i in sorted(os.listdir(input_dir)):
    # Split off only the last extension, so names containing dots are kept intact
    name, extension = os.path.splitext(i)
    if extension == '.' + in_filetype: # Checks that filename ends with format chosen
        input_files.append(os.path.join(input_dir, i))
        converted_files.append(os.path.join(output_dir, name + out_filetype))

# tqdm creates a progress bar to follow the conversion.
for i,j in tqdm(zip(input_files,converted_files), total = len(input_files)):
    # Runs bfconvert from bftools, which needs to be preinstalled; error output goes to DEVNULL.
    subprocess.run(['bfconvert', '-overwrite', '-nogroup',i,j],stdout = subprocess.PIPE, stderr = subprocess.DEVNULL)
    # Runs ImageMagick's convert: auto-level and store as 16-bit unsigned grayscale.
    subprocess.run(['convert', i, '-auto-level', '-depth', '16', '-define', 'quantum:format=unsigned', '-type', 'grayscale', j],stdout = subprocess.PIPE, stderr = subprocess.DEVNULL)
```

#### Download BBBC images

The images from the BBBC experiment can be found here:

https://data.broadinstitute.org/bbbc/BBBC039/images.zip

The images are extracted and put in the folder **2_Final_Models/data/2_raw_images**


And the annotations for the images:

https://data.broadinstitute.org/bbbc/BBBC039/masks.zip

The mask images are extracted and put in the folder **2_Final_Models/data/1_raw_annotations**



## 1.2 Annotate images using cvat


We have annotated 50 images for the experiment using the annotation program CVAT. Information about the program and installation instructions can be found on its GitHub page, https://github.com/opencv/cvat.

Only one label, nucleus, is used for annotation, and each nucleus is drawn with the polygon tool.

The work is saved using the option “DUMP ANNOTATION” -> “MASK ZIP 1.1”.

This creates and downloads a zip file with two folders. One contains class images, in which all nuclei are shown in the same color. The other contains object images: RGB images in which every object has a different color, so that the objects can be told apart.

The object images were used to create our labels. These images are not in the same format as the BBBC images, so the script had to be modified to fit them.
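
To illustrate what an object image encodes, the sketch below maps every distinct color to an integer label. It is an illustration under assumptions, not the method used in the preprocessing script further down (which takes the lightness channel and labels connected components): `rgb_objects_to_labels` is a hypothetical helper, black is assumed to be background, and two objects that happen to share a color would receive the same label.

```python
import numpy as np


def rgb_objects_to_labels(rgb):
    """Map each distinct color of an (H, W, 3) object image to an integer label.

    Black (0, 0, 0) is assumed to be background and gets label 0.
    """
    height, width, _ = rgb.shape
    flat = rgb.reshape(-1, 3)
    # Unique colors are sorted lexicographically, so black comes first if present
    colors, inverse = np.unique(flat, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    has_background = bool((colors[0] == 0).all())
    labels = inverse if has_background else inverse + 1
    return labels.reshape(height, width)
```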


The images we have annotated can be found here:

{insert link}


## 1.3 Preprocessing of images

For training, the image set is divided into three parts: a training set, a validation set and a test set.

For our experiments we have trained four different models with different training sets.
10 images are held out as a test set and 10 are used for validation; the training set differs between the models:

Model 1: 30 images from Aitslab

Model 2: 30 images from Aitslab, augmented with elastic transformation so that a total of 300 images are used.

Model 3: The same as Model 2, with an additional 100 images from the BBBC group.

Model 4: The same as Model 2, with an additional 200 images from the BBBC group.
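
A split like this can be written to file lists with a few lines of Python. The sketch below is one possible way to do it, not the procedure that produced the lists in **4_filelists**: `split_filelist` and `write_filelist` are hypothetical helpers, and the one-file-name-per-line layout of the text files is an assumption.

```python
import random


def split_filelist(filenames, n_validation, n_test, seed=0):
    """Shuffle the file names reproducibly and split them into three partitions."""
    names = sorted(filenames)
    random.Random(seed).shuffle(names)
    test = names[:n_test]
    validation = names[n_test:n_test + n_validation]
    training = names[n_test + n_validation:]
    return {"training": training, "validation": validation, "test": test}


def write_filelist(path, names):
    """Write one file name per line (assumed layout of the *.txt lists)."""
    with open(path, "w") as handle:
        handle.write("\n".join(names) + "\n")
```

With 50 annotated images, `split_filelist(names, n_validation=10, n_test=10)` leaves 30 names in the training partition.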

### Normalization and creation of boundary labels

First the images are normalized: pixel values are clipped at the 0.1 and 99.9 percentiles and rescaled to the range 0-1, and the result is saved as an 8-bit png.
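
The normalization can be tried on a single array with NumPy alone. This is a sketch under assumptions: `normalize_to_uint8` is a hypothetical helper, the 99.9 percentile matches the value used in the script further down, and the final cast uses plain rounding instead of `skimage.img_as_ubyte`.

```python
import numpy as np


def normalize_to_uint8(img, percentile=99.9):
    """Clip to the [100 - percentile, percentile] range and rescale to 0..255."""
    low = np.percentile(img, 100 - percentile)
    high = np.percentile(img, percentile)
    if high == low:
        # Constant image: nothing to stretch, return an all-zero image
        return np.zeros(img.shape, dtype=np.uint8)
    clipped = np.clip(img.astype(np.float64), low, high)
    scaled = (clipped - low) / (high - low)  # values in [0, 1]
    return np.round(scaled * 255).astype(np.uint8)
```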

Then boundary labels are created.
The skimage module is used both to identify the individual objects and to find their boundaries. The boundaries are placed just outside the objects.
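
The resulting label image has three channels: background, interior and boundary. The sketch below reproduces that encoding on a label matrix with NumPy only. It is a simplified stand-in for `skimage.segmentation.find_boundaries(..., mode='outer')`: the hypothetical helper `outer_boundary` only marks background pixels that touch an object in the 4-neighbourhood and does not handle objects that touch each other.

```python
import numpy as np


def outer_boundary(labels):
    """Mark background pixels that have an object pixel as 4-neighbour."""
    foreground = labels > 0
    padded = np.pad(foreground, 1, mode="constant", constant_values=False)
    neighbour_is_object = (
        padded[:-2, 1:-1] | padded[2:, 1:-1] | padded[1:-1, :-2] | padded[1:-1, 2:]
    )
    return neighbour_is_object & ~foreground


def three_class_label(labels):
    """Encode a label matrix as an (H, W, 3) uint8 image.

    Channel 0 = background, channel 1 = interior, channel 2 = boundary.
    """
    boundary = outer_boundary(labels)
    out = np.zeros(labels.shape + (3,), dtype=np.uint8)
    out[(labels == 0) & ~boundary, 0] = 255
    out[(labels != 0) & ~boundary, 1] = 255
    out[boundary, 2] = 255
    return out
```

Every pixel ends up in exactly one of the three channels.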

## 00-load-and-reformat-dataset.py

```python
import glob
import os
import shutil
import zipfile
import requests
from config import config_vars
import random
import matplotlib.pyplot as plt
import numpy as np
import pathlib
from tqdm.notebook import tqdm
import skimage.io
import skimage.morphology
import skimage.segmentation
import utils.dirtools
import utils.augmentation
from skimage.util import img_as_ubyte
from skimage.color import rgb2lab

# Create output directories for transformed data

os.makedirs(config_vars["normalized_images_dir"], exist_ok=True)
os.makedirs(config_vars["boundary_labels_dir"], exist_ok=True)

config_vars["raw_images_dir"]='/home/maloua/Malou_Master/5_Models/2_Final_Models/data/2_raw_images/'
config_vars["raw_annotations_dir"]='/home/maloua/Malou_Master/5_Models/2_Final_Models/data/1_raw_annotations/'

# ## Normalize images

if config_vars["transform_images_to_PNG"]:
    
    filelist = sorted(os.listdir(config_vars["raw_images_dir"]))

    # run over all raw images
    for filename in tqdm(filelist):

        # load image and its annotation
        orig_img = skimage.io.imread(config_vars["raw_images_dir"] + filename)       

        # IMAGE

        # normalize to [0,1]
        percentile = 99.9
        high = np.percentile(orig_img, percentile)
        low = np.percentile(orig_img, 100-percentile)

        img = np.minimum(high, orig_img)
        img = np.maximum(low, img)

        img = (img - low) / (high - low) # gives float64, thus cast to 8 bit later
        img = skimage.img_as_ubyte(img) 

        skimage.io.imsave(config_vars["normalized_images_dir"] + filename[:-3] + 'png', img)    
else:
    config_vars["normalized_images_dir"] = config_vars["raw_images_dir"]

# ## Create boundary labels

filelist = sorted(os.listdir(config_vars["raw_annotations_dir"]))
total_objects = 0

# run over all raw annotations
for filename in tqdm(filelist):
    
    # GET ANNOTATION
    annot = skimage.io.imread(config_vars["raw_annotations_dir"] + filename)

    # reduce the annotation to a single channel (2D annotations are used as they are)
    if annot.ndim == 3:
        if annot.shape[2] == 3:
            # RGB object image: convert to CIELAB and keep the lightness channel
            annot = rgb2lab(annot)
        annot = annot[:,:,0]

    # label the annotations nicely to prepare for future filtering operation
    annot = skimage.morphology.label(annot)
    total_objects += len(np.unique(annot)) - 1
      
    # find boundaries
    boundaries = skimage.segmentation.find_boundaries(annot, mode = 'outer')

    # BINARY LABEL
    
    # prepare buffer for binary label
    label_binary = np.zeros((annot.shape + (3,)))
    
    # write binary label
    label_binary[(annot == 0) & (boundaries == 0), 0] = 1
    label_binary[(annot != 0) & (boundaries == 0), 1] = 1
    label_binary[boundaries == 1, 2] = 1
    
    label_binary = img_as_ubyte(label_binary)
    # save it - converts image to range from 0 to 255
    skimage.io.imsave(config_vars["boundary_labels_dir"] + filename, label_binary)
    
print("Total objects: ",total_objects)
```

### Augmentation

For Model 2, the 30 training images are augmented with elastic transformation, which is done with the following script:


## 01-Augmentation.py

```python
import os
import pathlib
from tqdm.notebook import tqdm
import skimage.io
import skimage.segmentation
from config import config_vars
import utils.dirtools
import utils.augmentation

config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/1-2_training.txt'
config_vars['path_files_validation'] ='/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/VALIDATION.txt'
config_vars['path_files_test'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/TEST.txt'


# ## Augment (elastic transformation)

config_vars["augment_images"] = True
def generate_augmented_examples(filelist, n_augmentations, n_points, distort, dir_boundary_labels, dir_images_normalized_8bit):
    
    updated_filelist = []
    
    # run over all raw images
    for filename in tqdm(filelist):
        print("Augmenting {}".format(filename))
            
        # check if boundary labels were calculated 
        my_file = pathlib.Path(dir_boundary_labels + filename)
        
        if my_file.is_file():
            
            # load image 
            x = skimage.io.imread(dir_images_normalized_8bit + filename)
            
            # load annotation 
            y = skimage.io.imread(dir_boundary_labels + filename)
            
            for n in range(1,n_augmentations):
                # augment image and annotation 
                x_augmented, y_augmented = utils.augmentation.deform(x, y, points = n_points, distort = distort)
                
                # filename for augmented images
                filename_augmented = os.path.splitext(filename)[0] + '_aug_{:03d}'.format(n) + os.path.splitext(filename)[1]
                skimage.io.imsave(dir_images_normalized_8bit + filename_augmented, x_augmented)
                skimage.io.imsave(dir_boundary_labels + filename_augmented, y_augmented)
                updated_filelist.append(filename_augmented)
                
    return filelist + updated_filelist 

if config_vars["augment_images"]:
    
    tmp_value = config_vars["max_training_images"]
    config_vars["max_training_images"] = 0
    tmp_partitions = utils.dirtools.read_data_partitions(config_vars, load_augmented=False)
    
    training_files = generate_augmented_examples(
        tmp_partitions["training"], 
        config_vars["elastic_augmentations"], 
        config_vars["elastic_points"], 
        config_vars["elastic_distortion"], 
        config_vars["boundary_labels_dir"], 
        config_vars["normalized_images_dir"]
    )
    
    config_vars["max_training_images"] = tmp_value

```

## 1.4 Training

The training script is the same for each model except for the variables "config_vars['path_files_training']", "experiment_name" and "data_partitions", which are modified as follows:

for Model 1, the script is the one below,

for Model 2:

```python
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/1-2_training.txt'

data_partitions = utils.dirtools.read_data_partitions(config_vars)

experiment_name = 'Model_2'
```

for Model 3:

```python
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/3_training.txt'

data_partitions = utils.dirtools.read_data_partitions(config_vars)

experiment_name = 'Model_3'
```

for Model 4:
```python
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/4_training.txt'

data_partitions = utils.dirtools.read_data_partitions(config_vars)

experiment_name = 'Model_4'
```
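
Instead of editing each copy of the script by hand, the per-model differences could be kept in one lookup table. This is only a sketch of that idea: `MODEL_TRAINING_LISTS` and `model_settings` are hypothetical names, the paths are the ones used in this README, and the `load_augmented` argument that the Model 1 script passes to `read_data_partitions` is left out.

```python
import os

# Folder holding the file lists, as used throughout this README
FILELIST_DIR = "/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/"

# Training list per experiment, taken from the snippets above
MODEL_TRAINING_LISTS = {
    "Model_1": "1-2_training.txt",
    "Model_2": "1-2_training.txt",
    "Model_3": "3_training.txt",
    "Model_4": "4_training.txt",
}


def model_settings(experiment_name, filelist_dir=FILELIST_DIR):
    """Return the experiment name and file-list paths for one model."""
    return {
        "experiment_name": experiment_name,
        "path_files_training": os.path.join(
            filelist_dir, MODEL_TRAINING_LISTS[experiment_name]
        ),
        "path_files_validation": os.path.join(filelist_dir, "VALIDATION.txt"),
        "path_files_test": os.path.join(filelist_dir, "TEST.txt"),
    }
```

The three path entries can be copied into `config_vars` with `config_vars.update(...)` after removing the `experiment_name` key.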


### 02-training.py

```python
import sys
import os

import numpy as np
import skimage.io

import tensorflow as tf

import keras.backend
import keras.callbacks
import keras.layers
import keras.models
import keras.optimizers

import utils.model_builder
import utils.data_provider
import utils.metrics
import utils.objectives
import utils.dirtools

from config import config_vars

config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/1-2_training.txt'
config_vars['path_files_validation'] ='/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/VALIDATION.txt'
config_vars['path_files_test'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/TEST.txt'

experiment_name = 'Model_1'

config_vars = utils.dirtools.setup_experiment(config_vars, experiment_name)

data_partitions = utils.dirtools.read_data_partitions(config_vars, load_augmented = False)


# build session running on GPU 1
configuration = tf.compat.v1.ConfigProto()
configuration.gpu_options.allow_growth = True
configuration.gpu_options.visible_device_list = "1"
session = tf.compat.v1.Session(config = configuration)

# apply session
tf.compat.v1.keras.backend.set_session(session)

train_gen = utils.data_provider.random_sample_generator(
    config_vars["normalized_images_dir"],
    config_vars["boundary_labels_dir"],
    data_partitions["training"],
    config_vars["batch_size"],
    config_vars["pixel_depth"],
    config_vars["crop_size"],
    config_vars["crop_size"],
    config_vars["rescale_labels"]
)

val_gen = utils.data_provider.single_data_from_images(
     config_vars["normalized_images_dir"],
     config_vars["boundary_labels_dir"],
     data_partitions["validation"],
     config_vars["val_batch_size"],
     config_vars["pixel_depth"],
     config_vars["crop_size"],
     config_vars["crop_size"],
     config_vars["rescale_labels"]
)


# build model
model = utils.model_builder.get_model_3_class(config_vars["crop_size"], config_vars["crop_size"], activation=None)
model.summary()

#loss = "categorical_crossentropy"
loss = utils.objectives.weighted_crossentropy

metrics = [keras.metrics.categorical_accuracy, 
           utils.metrics.channel_recall(channel=0, name="background_recall"), 
           utils.metrics.channel_precision(channel=0, name="background_precision"),
           utils.metrics.channel_recall(channel=1, name="interior_recall"), 
           utils.metrics.channel_precision(channel=1, name="interior_precision"),
           utils.metrics.channel_recall(channel=2, name="boundary_recall"), 
           utils.metrics.channel_precision(channel=2, name="boundary_precision"),
          ]

optimizer = keras.optimizers.RMSprop(lr=config_vars["learning_rate"])

model.compile(loss=loss, metrics=metrics, optimizer=optimizer)

# Performance logging
callback_csv = keras.callbacks.CSVLogger(filename=config_vars["csv_log_file"])

callbacks=[callback_csv]


# TRAIN
statistics = model.fit_generator(
    generator=train_gen,
    steps_per_epoch=config_vars["steps_per_epoch"],
    epochs=config_vars["epochs"],
#    epochs = 3,
    validation_data=val_gen,
    validation_steps=int(len(data_partitions["validation"])/config_vars["val_batch_size"]),
    callbacks=callbacks,
    verbose = 1
)

model.save_weights(config_vars["model_file"])
```

## 1.5 Prediction

As with the training scripts, the prediction scripts are the same for each model except for the variables "config_vars['path_files_training']" and "experiment_name", which are modified as follows:

for Model 1, the script is the one below,

for Model 2:

```python
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/1-2_training.txt'

experiment_name = 'Model_2'
```

for Model 3:

```python
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/3_training.txt'

experiment_name = 'Model_3'
```

for Model 4:
```python
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/4_training.txt'

experiment_name = 'Model_4'
```

## 03-prediction.py

```python
import os
import os.path
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

import skimage.io
import skimage.morphology

import tensorflow as tf
import keras

import utils.dirtools
import utils.metrics
import utils.model_builder
print(skimage.__version__)


# # Configuration

from config import config_vars

# Partition of the data to make predictions (test or validation)

config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/1-2_training.txt'
config_vars['path_files_validation'] ='/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/VALIDATION.txt'
config_vars['path_files_test'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/TEST.txt'

partition = "validation"

experiment_name = 'Model_1'

config_vars = utils.dirtools.setup_experiment(config_vars, experiment_name)

data_partitions = utils.dirtools.read_data_partitions(config_vars)


# Configuration to run on GPU
configuration = tf.compat.v1.ConfigProto()
configuration.gpu_options.allow_growth = True
configuration.gpu_options.visible_device_list = "0"

session = tf.compat.v1.Session(config = configuration)

# apply session
tf.compat.v1.keras.backend.set_session(session)


# # Load images and run predictions

image_names = [os.path.join(config_vars["normalized_images_dir"], f) for f in data_partitions[partition]]

imagebuffer = skimage.io.imread_collection(image_names)

images = imagebuffer.concatenate()

dim1 = images.shape[1]
dim2 = images.shape[2]

images = images.reshape((-1, dim1, dim2, 1))

# preprocess (assuming images are encoded as 8-bits in the preprocessing step)
images = images / 255

# build model and load weights
model = utils.model_builder.get_model_3_class(dim1, dim2)
model.load_weights(config_vars["model_file"])

# Normal prediction time
predictions = model.predict(images, batch_size=1)


from scipy import ndimage as ndi

# Code inspired by scikit-images source-code for skimage.morphology.remove_small_objects()
def remove_large_objects(image, min_size):
    out = np.copy(image)
    
    if out.dtype == bool:
        selem = ndi.generate_binary_structure(image.ndim,1)
        ccs = np.zeros_like(image, dtype=np.int32)
        ndi.label(image, selem, output=ccs)
    else:
        ccs = out
    component_sizes = np.bincount(ccs.ravel())
    too_large = component_sizes > min_size
    too_large_mask = too_large[ccs]
    out[too_large_mask] = 0
    return out

def pred_to_label(pred, cell_min_size, cell_label=1):
    # Only marks interior of cells (cell_label = 1 is interior, cell_label = 2 is boundary)
    cell_orig = (pred == cell_label)
    
    cell_small = (pred == 1) + (pred == 2)
    cell_small = remove_large_objects(cell_small,100) 
    
    cell_concat = cell_orig + cell_small
    
    # fill small holes first, then drop small objects from the hole-filled mask
    cell_orig = skimage.morphology.remove_small_holes(cell_concat, area_threshold=cell_min_size)
    cell_orig = skimage.morphology.remove_small_objects(cell_orig, min_size=cell_min_size)
    # label cells only
    [label, num] = skimage.morphology.label(cell_orig, return_num=True)
    return label


# # Transform predictions to label matrices

for i in range(len(images)):

    filename = imagebuffer.files[i]
    filename = os.path.basename(filename)
    
    probmap = predictions[i].squeeze()
    
    skimage.io.imsave(config_vars["probmap_out_dir"] + filename, probmap)
    
    pred = np.argmax(probmap * [1, 1, 1], -1)
    
    label = pred_to_label(pred, config_vars["cell_min_size"])
    
    skimage.io.imsave(config_vars["labels_out_dir"] + filename, label)
```
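
The last loop of the script turns each three-channel probability map into a class map by taking the per-pixel argmax (0 = background, 1 = interior, 2 = boundary). A small sketch of that step follows. `probmap_to_classes` and its `boundary_boost` argument are hypothetical: they only illustrate how a weight on the boundary channel, such as `boundary_boost_factor` in config.py, could enter, whereas the script above uses a weight of 1 for every channel.

```python
import numpy as np


def probmap_to_classes(probmap, boundary_boost=1.0):
    """Return the per-pixel class index of an (H, W, 3) probability map.

    The boundary channel (index 2) is multiplied by boundary_boost first.
    """
    weights = np.array([1.0, 1.0, boundary_boost])
    return np.argmax(probmap * weights, axis=-1)
```

With `boundary_boost` larger than 1, pixels where the boundary probability is close to the interior probability are assigned to the boundary class.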

## 1.6 Evaluation

The evaluation script is the same for each model except for the variables "config_vars['path_files_training']" and "experiment_name", which are modified as follows:

for Model 1, the script is the one below,

for Model 2:

```python
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/1-2_training.txt'

experiment_name = 'Model_2'
```

for Model 3:

```python
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/3_training.txt'

experiment_name = 'Model_3'
```

for Model 4:
```python
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/4_training.txt'

experiment_name = 'Model_4'
```
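
The evaluation compares true and predicted objects through their intersection over union (IoU). The sketch below shows the idea on two label matrices. `iou_matrix` and `f1_at_threshold` are hypothetical stand-ins written for illustration, not the functions in `utils/evaluation.py`, and the F1 formula used is the common 2TP / (2TP + FP + FN).

```python
import numpy as np


def iou_matrix(ground_truth, prediction):
    """IoU between every true object (rows) and every predicted object (columns).

    Both inputs are label matrices with 0 = background and objects numbered 1..n.
    """
    n_true = int(ground_truth.max())
    n_pred = int(prediction.max())
    iou = np.zeros((n_true, n_pred))
    for t in range(1, n_true + 1):
        true_mask = ground_truth == t
        for p in range(1, n_pred + 1):
            pred_mask = prediction == p
            union = np.logical_or(true_mask, pred_mask).sum()
            if union > 0:
                iou[t - 1, p - 1] = np.logical_and(true_mask, pred_mask).sum() / union
    return iou


def f1_at_threshold(iou, threshold=0.5):
    """Count matches at one IoU threshold and return (TP, FP, FN, F1)."""
    matches = iou >= threshold
    tp = int(matches.any(axis=1).sum())      # true objects with a match
    fn = iou.shape[0] - tp                   # true objects without a match
    fp = int((~matches.any(axis=0)).sum())   # predictions matching nothing
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return tp, fp, fn, f1
```

Rows without a match correspond to missed objects and columns without a match to extra objects, the same two error types that the `show` function in the script highlights.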

## 04-evaluation.py

```python

import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sb

import skimage.io
import skimage.morphology
import skimage.segmentation

import utils.dirtools
import utils.evaluation
from config import config_vars


# Partition of the data to make predictions (test or validation)
config_vars['path_files_training'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/1-2_training.txt'
config_vars['path_files_validation'] ='/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/VALIDATION.txt'
config_vars['path_files_test'] = '/home/maloua/Malou_Master/5_Models/2_Final_Models/data/4_filelists/TEST.txt'

partition = "validation"

experiment_name = 'Model_1'

config_vars = utils.dirtools.setup_experiment(config_vars, experiment_name)

data_partitions = utils.dirtools.read_data_partitions(config_vars)


# Display prediction along with segmentation to visualize errors

def show(ground_truth, prediction, threshold=0.5, image_name="N"):
    
    # Compute Intersection over Union
    IOU = utils.evaluation.intersection_over_union(ground_truth, prediction)
    
    # Create diff map
    diff = np.zeros(ground_truth.shape + (3,))
    A = ground_truth.copy()
    B = prediction.copy()
    A[A > 0] = 1
    B[B > 0] = 1
    D = A - B
    #diff[D > 0,:2] = 1
    #diff[D < 0,1:] = 1
    
    # Object-level errors
    C = IOU.copy()
    C[C>=threshold] = 1
    C[C<threshold] = 0
    missed = np.where(np.sum(C,axis=1) == 0)[0]
    extra = np.where(np.sum(C,axis=0) == 0)[0]

    for m in missed:
        diff[ground_truth == m+1, 0] = 1
    for e in extra:
        diff[prediction == e+1, 2] = 1
    
    # Display figures
    fig, ax = plt.subplots(1, 4, figsize=(18,6))
    ax[0].imshow(ground_truth)
    ax[0].set_title("True objects:"+str(len(np.unique(ground_truth))))
    ax[1].imshow(diff)
    ax[1].set_title("Segmentation errors:"+str(len(missed)))
    ax[2].imshow(prediction)
    ax[2].set_title("Predicted objects:"+str(len(np.unique(prediction))))
    ax[3].imshow(IOU)
    ax[3].set_title(image_name)


from scipy import ndimage as ndi

# Code inspired by scikit-images source-code for skimage.morphology.remove_small_objects()
def remove_large_objects(image, min_size):
    out = np.copy(image)
    
    if out.dtype == bool:
        selem = ndi.generate_binary_structure(image.ndim,1)
        ccs = np.zeros_like(image, dtype=np.int32)
        ndi.label(image, selem, output=ccs)
    else:
        ccs = out
    component_sizes = np.bincount(ccs.ravel())
    too_large = component_sizes > min_size
    too_large_mask = too_large[ccs]
    out[too_large_mask] = 0
    return out

def pred_to_label(pred, cell_min_size, cell_label=1):
    # Only marks interior of cells (cell_label = 1 is interior, cell_label = 2 is boundary)
    cell_orig = (pred == cell_label)
    
    cell_small = (pred == 1) + (pred == 2)
    cell_small = remove_large_objects(cell_small,100) 
    
    cell_concat = cell_orig + cell_small
    
    cell_orig = skimage.morphology.remove_small_holes(cell_concat, area_threshold=cell_min_size)
    cell_orig = skimage.morphology.remove_small_objects(cell_concat, min_size=cell_min_size)
    # label cells only
    [label, num] = skimage.morphology.label(cell_concat, return_num=True)
    return label


# Run the evaluation

all_images = data_partitions[partition]
from skimage.color import rgb2gray,rgb2lab

results = pd.DataFrame(columns=["Image", "Threshold", "F1", "Jaccard", "TP", "FP", "FN"])
false_negatives = pd.DataFrame(columns=["False_Negative", "Area"])
splits_merges = pd.DataFrame(columns=["Image_Name", "Merges", "Splits"])

for image_name in all_images:
    # Load ground truth data
    img_filename = os.path.join(config_vars["boundary_labels_dir"], image_name)
    ground_truth = skimage.io.imread(img_filename)
    ground_truth = ground_truth.squeeze()
    #if len(ground_truth.shape) == 3:
    #    ground_truth = rgb2lab(ground_truth)
    #    ground_truth = ground_truth[:,:,0]
    
    ground_truth = np.argmax(ground_truth * [1, 1, 1], -1)
    
    ground_truth = pred_to_label(ground_truth, config_vars["cell_min_size"])
    # Transform to label matrix
    #ground_truth = skimage.morphology.label(ground_truth)
    
    # Load predictions
    pred_filename = os.path.join(config_vars["labels_out_dir"], image_name)
    prediction = skimage.io.imread(pred_filename)
    
    # Apply object dilation
    if config_vars["object_dilation"] > 0:
        struct = skimage.morphology.square(config_vars["object_dilation"])
        prediction = skimage.morphology.dilation(prediction, struct)
    elif config_vars["object_dilation"] < 0:
        struct = skimage.morphology.square(-config_vars["object_dilation"])
        prediction = skimage.morphology.erosion(prediction, struct)
        
    # Relabel objects sequentially (the 30-pixel margin crop used for the DeepCell comparison is disabled here)
    ground_truth = skimage.segmentation.relabel_sequential(ground_truth)[0] #[30:-30,30:-30])[0]
    prediction = skimage.segmentation.relabel_sequential(prediction)[0] #[30:-30,30:-30])[0]
    
    # Compute evaluation metrics
    results = utils.evaluation.compute_af1_results(
        ground_truth, 
        prediction, 
        results, 
        image_name
    )
    
    false_negatives = utils.evaluation.get_false_negatives(
        ground_truth, 
        prediction, 
        false_negatives, 
        image_name
    )
    
    splits_merges = utils.evaluation.get_splits_and_merges(
        ground_truth, 
        prediction, 
        splits_merges, 
        image_name
    )
    
    # Display an example image
    #if image_name == all_images[0]:
    show(ground_truth, prediction, image_name=image_name)


# Display accuracy results

average_performance = results.groupby("Threshold").mean().reset_index()

R = results.groupby("Image").mean().reset_index()
g = sb.jointplot(data=R[R["F1"] > 0.4], x="Jaccard", y="F1")

average_performance
R.sort_values(by="F1",ascending=False)



# Plot accuracy results

sb.regplot(data=average_performance, x="Threshold", y="F1", order=3, ci=None)
average_performance



# Compute and print Average F1

average_F1_score = average_performance["F1"].mean()
jaccard_index = average_performance["Jaccard"].mean()
print("Average F1 score:", average_F1_score)
print("Jaccard index:", jaccard_index)

# Summarize False Negatives by area

false_negatives = false_negatives[false_negatives["False_Negative"] == 1]

false_negatives.groupby(
    pd.cut(
        false_negatives["Area"], 
        [0,250,625,900,10000], # Area intervals
        labels=["Tiny nuclei","Small nuclei","Normal nuclei","Large nuclei"],
    )
)["False_Negative"].sum()


# Summarize splits and merges

print("Splits:",np.sum(splits_merges["Splits"]))
print("Merges:",np.sum(splits_merges["Merges"]))

# Report false positives

print("Extra objects (false positives):",results[results["Threshold"].round(3) == 0.7].sum()["FP"])
```
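The size filter in `remove_large_objects` above relies on indexing a per-label array with the label image itself. Below is a minimal sketch of that trick on a hand-made label image. The array and the `max_size` value are made up for illustration, and the background is explicitly excluded here, which the original function does not do (setting background pixels to 0 has no effect anyway).

```python
import numpy as np

# Hand-made label image: 0 = background, 1 = a 6-pixel object, 2 = a 2-pixel object.
labels = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
])

max_size = 4  # hypothetical threshold, in pixels

# component_sizes[k] is the number of pixels carrying label k.
component_sizes = np.bincount(labels.ravel())
too_large = component_sizes > max_size
too_large[0] = False  # leave the background label untouched

# Indexing the per-label flags with the label image gives a per-pixel mask.
filtered = labels.copy()
filtered[too_large[labels]] = 0

print(component_sizes)  # [7 6 2]
print(filtered)         # only the 2-pixel object is left
```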

# utils scripts

## augmentation.py

```python
import numpy as np
import skimage.transform
from skimage.util import img_as_ubyte
# Based on example code from:
# http://scikit-image.org/docs/dev/auto_examples/transform/plot_piecewise_affine.html

def deform(image1, image2, points=10, distort=5.0):
    
    # create deformation grid 
    rows, cols = image1.shape[0], image1.shape[1]
    src_cols = np.linspace(0, cols, points)
    src_rows = np.linspace(0, rows, points)
    src_rows, src_cols = np.meshgrid(src_rows, src_cols)
    src = np.dstack([src_cols.flat, src_rows.flat])[0]

    # add distortion to coordinates
    s = src[:, 1].shape
    dst_rows = src[:, 1] + np.random.normal(size=s)*np.random.uniform(0.0, distort, size=s)
    dst_cols = src[:, 0] + np.random.normal(size=s)*np.random.uniform(0.0, distort, size=s)
    
    dst = np.vstack([dst_cols, dst_rows]).T

    tform = skimage.transform.PiecewiseAffineTransform()
    tform.estimate(src, dst)

    out_rows = rows 
    out_cols = cols
    out1 = skimage.transform.warp(image1, tform, output_shape=(out_rows, out_cols), mode="symmetric")
    out2 = skimage.transform.warp(image2, tform, output_shape=(out_rows, out_cols), mode="symmetric")
    
    return img_as_ubyte(out1), img_as_ubyte(out2)


def resize(x, y):
    wf = 1 + np.random.uniform(-0.25, 0.25)
    hf = 1 + np.random.uniform(-0.25, 0.25)

    w,h = x.shape[0:2]

    wt, ht = int(wf*w), int(hf*h)

    new_x = skimage.transform.resize(x, (wt,ht))
    new_y = skimage.transform.resize(y, (wt,ht))

    return new_x, new_y

```
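To see what `deform` feeds into the piecewise affine transform, here is a small NumPy-only sketch of the control-point grid and its random displacement. The image size, the seed and the use of `np.random.default_rng` are assumptions for illustration; the warping step itself is left out.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable

rows, cols, points, distort = 256, 256, 10, 5.0  # illustrative values

# Regular grid of control points covering the image.
src_cols = np.linspace(0, cols, points)
src_rows = np.linspace(0, rows, points)
src_rows, src_cols = np.meshgrid(src_rows, src_cols)
src = np.column_stack([src_cols.ravel(), src_rows.ravel()])

# Each control point is displaced by Gaussian noise with a random scale in [0, distort).
n = src.shape[0]
dst = src + rng.normal(size=(n, 2)) * rng.uniform(0.0, distort, size=(n, 2))

print(src.shape, dst.shape)  # (100, 2) (100, 2)
```

With `points=10` there are 100 control points, and `distort` bounds the scale of the noise added to each of them.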

## data_provider.py

```python
import os
import os.path
import numpy as np

import skimage.io
import keras.preprocessing.image

import utils.augmentation


def setup_working_directories(config_vars):

    ## Expected raw data directories:
    config_vars["raw_images_dir"] = os.path.join(config_vars["root_directory"], 'raw_images/')
    config_vars["raw_annotations_dir"] = os.path.join(config_vars["root_directory"], 'raw_annotations/')

    ## Split files
    config_vars["path_files_training"] = os.path.join(config_vars["root_directory"], 'training.txt')
    config_vars["path_files_validation"] = os.path.join(config_vars["root_directory"], 'validation.txt')
    config_vars["path_files_test"] = os.path.join(config_vars["root_directory"], 'test.txt')

    ## Transformed data directories:
    config_vars["normalized_images_dir"] = os.path.join(config_vars["root_directory"], 'norm_images/')
    config_vars["boundary_labels_dir"] = os.path.join(config_vars["root_directory"], 'boundary_labels/')

    return config_vars

def single_data_from_images(x_dir, y_dir, image_names, batch_size, bit_depth, dim1, dim2, rescale_labels):

    ## Prepare image names
    x_image_names = [os.path.join(x_dir, f) for f in image_names]
    y_image_names = [os.path.join(y_dir, f) for f in image_names]

    ## Load all images in memory
    x = skimage.io.imread_collection(x_image_names).concatenate()
    y = skimage.io.imread_collection(y_image_names).concatenate()

    ## Crop the desired size
    x = x[:, 0:dim1, 0:dim2]
    x = x.reshape(-1, dim1, dim2, 1)
    y = y[:, 0:dim1, 0:dim2, :]

    ## Setup Keras Generators
    rescale_factor = 1./(2**bit_depth - 1)
    
    if(rescale_labels):
        rescale_factor_labels = rescale_factor
    else:
        rescale_factor_labels = 1

    gen_x = keras.preprocessing.image.ImageDataGenerator(rescale=rescale_factor)
    gen_y = keras.preprocessing.image.ImageDataGenerator(rescale=rescale_factor_labels)
    
    seed = 42

    stream_x = gen_x.flow(
        x,
        batch_size=batch_size,
        seed=seed
    )
    stream_y = gen_y.flow(
        y,
        batch_size=batch_size,
        seed=seed
    )
    
    flow = zip(stream_x, stream_y)
    
    return flow


def random_sample_generator(x_dir, y_dir, image_names, batch_size, bit_depth, dim1, dim2, rescale_labels):

    do_augmentation = True
    
    # get image names
    print('Training with',len(image_names), 'images.')

    # get dimensions right -- understand data set
    n_images = len(image_names)
    ref_img = skimage.io.imread(os.path.join(y_dir, image_names[0]))

    if(len(ref_img.shape) == 2):
        gray = True
    else:
        gray = False
    
    # rescale images
    rescale_factor = 1./(2**bit_depth - 1)
    if(rescale_labels):
        rescale_factor_labels = rescale_factor
    else:
        rescale_factor_labels = 1
        
    while(True):
        
        if(gray):
            y_channels = 1
        else:
            y_channels = 3
            
        # buffers for a batch of data
        x = np.zeros((batch_size, dim1, dim2, 1))
        y = np.zeros((batch_size, dim1, dim2, y_channels))
        
        # get one image at a time
        for i in range(batch_size):
                       
            # get random image
            img_index = np.random.randint(low=0, high=n_images)
            
            # open images
            x_big = skimage.io.imread(os.path.join(x_dir, image_names[img_index])) * rescale_factor
            y_big = skimage.io.imread(os.path.join(y_dir, image_names[img_index])) * rescale_factor_labels

            # resizing
            #x_big, y_big = utils.augmentation.resize(patch_x, patch_y)


            # get random crop (randint's upper bound is exclusive, hence the + 1)
            start_dim1 = np.random.randint(low=0, high=x_big.shape[0] - dim1 + 1)
            start_dim2 = np.random.randint(low=0, high=x_big.shape[1] - dim2 + 1)

            patch_x = x_big[start_dim1:start_dim1 + dim1, start_dim2:start_dim2 + dim2] #* rescale_factor
            patch_y = y_big[start_dim1:start_dim1 + dim1, start_dim2:start_dim2 + dim2] #* rescale_factor_labels

            if(do_augmentation):
                
                rand_flip = np.random.randint(low=0, high=2)
                rand_rotate = np.random.randint(low=0, high=4)
                
                # flip
                if(rand_flip == 0):
                    patch_x = np.flip(patch_x, 0)
                    patch_y = np.flip(patch_y, 0)
                
                # rotate
                for rotate_index in range(rand_rotate):
                    patch_x = np.rot90(patch_x)
                    patch_y = np.rot90(patch_y)

                # illumination
                ifactor = 1 + np.random.uniform(-0.75, 0.75)
                patch_x *= ifactor
                    
            # save image to buffer
            x[i, :, :, 0] = patch_x
            
            if(gray):
                y[i, :, :, 0] = patch_y
            else:
                y[i, :, :, 0:y_channels] = patch_y
            
        # return the buffer
        yield(x, y)


```
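The important point in `random_sample_generator` is that the image and its label get exactly the same crop, flip and rotation. A minimal sketch of that idea, using a hypothetical helper `paired_random_patch` and a toy label derived from the image so the alignment can be checked:

```python
import numpy as np

def paired_random_patch(image, label, size, rng):
    """Crop, flip and rotate `image` and `label` with the same random choices."""
    top = rng.integers(0, image.shape[0] - size + 1)
    left = rng.integers(0, image.shape[1] - size + 1)
    patch_x = image[top:top + size, left:left + size]
    patch_y = label[top:top + size, left:left + size]

    if rng.integers(0, 2) == 0:          # vertical flip, half of the time
        patch_x, patch_y = np.flip(patch_x, 0), np.flip(patch_y, 0)

    k = rng.integers(0, 4)               # 0-3 quarter turns
    patch_x, patch_y = np.rot90(patch_x, k), np.rot90(patch_y, k)
    return patch_x, patch_y

rng = np.random.default_rng(42)
image = rng.random((64, 64))
label = image > 0.5                      # toy label derived from the image

px, py = paired_random_patch(image, label, 32, rng)
print(px.shape, py.shape)                # (32, 32) (32, 32)
print(np.array_equal(py, px > 0.5))      # True: label still matches the image
```

The illumination change is applied to the image only, which is why it is not part of this paired helper.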

## dirtools.py 

```python
import os
import glob
import random 

def create_image_lists(dir_raw_images, fraction_train = 0.5, fraction_validation = 0.25):
    file_list = os.listdir(dir_raw_images)

    if (fraction_train + fraction_validation >= 1):
        print("fraction_train + fraction_validation is >= 1!")
        print("setting fraction_train = 0.5, fraction_validation = 0.25")
        fraction_train = 0.5
        fraction_validation = 0.25
        
    fraction_test = 1 - fraction_train - fraction_validation

    image_list = [x for x in file_list if x.endswith("png") ]

    random.shuffle(image_list)

    index_train_end = int( len(image_list) * fraction_train)
    index_validation_end = index_train_end + int(len(image_list) * fraction_validation)

    # split into three parts; note that the return order is (training, test, validation)
    image_list_train = image_list[0:index_train_end]
    image_list_validation = image_list[index_train_end:index_validation_end]
    image_list_test = image_list[index_validation_end:]
    return (image_list_train, image_list_test, image_list_validation)


def write_path_files(file_path, names):
    # Write one file name per line
    with open(file_path, 'w') as myfile:
        for line in names:
            myfile.write(line + '\n')


def setup_working_directories(config_vars):

    ## Expected raw data directories:
    config_vars["raw_images_dir"] = os.path.join(config_vars["root_directory"], 'raw_images/')
    config_vars["raw_annotations_dir"] = os.path.join(config_vars["root_directory"], 'raw_annotations/')

    ## Split files
    config_vars["path_files_training"] = os.path.join(config_vars["root_directory"], 'training.txt')
    config_vars["path_files_validation"] = os.path.join(config_vars["root_directory"], 'validation.txt')
    config_vars["path_files_test"] = os.path.join(config_vars["root_directory"], 'test.txt')

    ## Transformed data directories:
    config_vars["normalized_images_dir"] = os.path.join(config_vars["root_directory"], 'norm_images/')
    config_vars["boundary_labels_dir"] = os.path.join(config_vars["root_directory"], 'boundary_labels/')

    return config_vars


def read_data_partitions(config_vars, load_augmented=True):
    with open(config_vars["path_files_training"]) as f:
        training_files = f.read().splitlines()
        if config_vars["max_training_images"] > 0:
            random.shuffle(training_files)
            training_files = training_files[0:config_vars["max_training_images"]]
        
    with open(config_vars["path_files_validation"]) as f:
        validation_files = f.read().splitlines()
    
    with open(config_vars["path_files_test"]) as f:
        test_files = f.read().splitlines()

    # Add augmented images to the training list
    if load_augmented:
        files = glob.glob(os.path.join(config_vars["root_directory"], "norm_images", "*_aug_*.png"))
        files = [os.path.basename(f) for f in files]
        augmentedtraining = []
        augmentedvalidation = []
        for trf in training_files:
            augmentedtraining += [f for f in files if f.startswith(trf.split(".")[0])]
        training_files += augmentedtraining
        #for vlf in validation_files:
        #    augmentedvalidation += [f for f in files if f.startswith(vlf.split(".")[0])]
        #validation_files += augmentedvalidation
        #else:
         #   training_files += files

    partitions = {
        "training": training_files,
        "validation": validation_files,
        "test": test_files
    }

    return partitions

def setup_experiment(config_vars, tag):

    # Output dirs
    config_vars["experiment_dir"] = os.path.join(config_vars["root_directory"], "experiments/" + tag + "/out/")
    config_vars["probmap_out_dir"] = os.path.join(config_vars["experiment_dir"], "prob/")
    config_vars["labels_out_dir"] = os.path.join(config_vars["experiment_dir"], "segm/")

    # Files
    config_vars["model_file"] = os.path.join(config_vars["root_directory"], "experiments", tag, "model.hdf5")
    config_vars["csv_log_file"] = os.path.join(config_vars["root_directory"], "experiments", tag, "log.csv")

    # Make output directories
    os.makedirs(config_vars["experiment_dir"], exist_ok=True)
    os.makedirs(config_vars["probmap_out_dir"], exist_ok=True)
    os.makedirs(config_vars["labels_out_dir"], exist_ok=True)

    return config_vars


```
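The index arithmetic behind the training/validation/test split can be tried out without any images. This is a sketch with a hypothetical helper `split_names` and made-up file names; unlike `create_image_lists` it takes a seed and returns the lists in training, validation, test order.

```python
import random

def split_names(names, fraction_train=0.5, fraction_validation=0.25, seed=0):
    """Shuffle `names` and cut them into training, validation and test lists."""
    names = list(names)
    random.Random(seed).shuffle(names)
    train_end = int(len(names) * fraction_train)
    validation_end = train_end + int(len(names) * fraction_validation)
    return names[:train_end], names[train_end:validation_end], names[validation_end:]

files = ["img_%02d.png" % i for i in range(20)]   # made-up file names
training, validation, test = split_names(files)
print(len(training), len(validation), len(test))  # 10 5 5
```

Whatever is left after the training and validation fractions ends up in the test list, so rounding never drops a file.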

## evaluation.py

```python

import numpy as np
import pandas as pd


def intersection_over_union(ground_truth, prediction):
    
    # Count objects
    true_objects = len(np.unique(ground_truth))
    pred_objects = len(np.unique(prediction))
    
    # Compute intersection
    h = np.histogram2d(ground_truth.flatten(), prediction.flatten(), bins=(true_objects,pred_objects))
    intersection = h[0]
    
    # Area of objects
    area_true = np.histogram(ground_truth, bins=true_objects)[0]
    area_pred = np.histogram(prediction, bins=pred_objects)[0]
    
    # Calculate union
    area_true = np.expand_dims(area_true, -1)
    area_pred = np.expand_dims(area_pred, 0)
    union = area_true + area_pred - intersection
    
    # Exclude background from the analysis
    intersection = intersection[1:,1:]
    union = union[1:,1:]
    
    # Compute Intersection over Union
    union[union == 0] = 1e-9
    IOU = intersection/union
    
    return IOU
    


def measures_at(threshold, IOU):
    
    matches = IOU > threshold
    
    true_positives = np.sum(matches, axis=1) == 1   # Correct objects
    false_positives = np.sum(matches, axis=0) == 0  # Extra objects
    false_negatives = np.sum(matches, axis=1) == 0  # Missed objects
    
    assert np.all(np.less_equal(true_positives, 1))
    assert np.all(np.less_equal(false_positives, 1))
    assert np.all(np.less_equal(false_negatives, 1))
    
    TP, FP, FN = np.sum(true_positives), np.sum(false_positives), np.sum(false_negatives)
    
    f1 = 2*TP / (2*TP + FP + FN + 1e-9)
    
    return f1, TP, FP, FN

# Compute F1 score for all IoU thresholds

def compute_af1_results(ground_truth, prediction, results, image_name):

    # Compute IoU
    IOU = intersection_over_union(ground_truth, prediction)
    if IOU.shape[0] > 0:
        jaccard = np.max(IOU, axis=0).mean()
    else:
        jaccard = 0.0
    
    # Calculate F1 score at all thresholds
    for t in np.arange(0.5, 1.0, 0.05):
        f1, tp, fp, fn = measures_at(t, IOU)
        res = {"Image": image_name, "Threshold": t, "F1": f1, "Jaccard": jaccard, "TP": tp, "FP": fp, "FN": fn}
        row = len(results)
        results.loc[row] = res
        
    return results

# Count number of False Negatives at 0.7 IoU

def get_false_negatives(ground_truth, prediction, results, image_name, threshold=0.7):

    # Compute IoU
    IOU = intersection_over_union(ground_truth, prediction)
    
    true_objects = len(np.unique(ground_truth))
    if true_objects <= 1:
        return results
        
    area_true = np.histogram(ground_truth, bins=true_objects)[0][1:]
    true_objects -= 1
    
    # Identify False Negatives
    matches = IOU > threshold
    false_negatives = np.sum(matches, axis=1) == 0  # Missed objects

    data = np.asarray([ 
        area_true.copy(), 
        np.array(false_negatives, dtype=np.int32)
    ])

    results = pd.concat([results, pd.DataFrame(data=data.T, columns=["Area", "False_Negative"])])
        
    return results

# Count the number of splits and merges

def get_splits_and_merges(ground_truth, prediction, results, image_name):

    # Compute IoU
    IOU = intersection_over_union(ground_truth, prediction)
    
    matches = IOU > 0.1
    merges = np.sum(matches, axis=0) > 1
    splits = np.sum(matches, axis=1) > 1
    r = {"Image_Name":image_name, "Merges":np.sum(merges), "Splits":np.sum(splits)}
    results.loc[len(results)+1] = r
    return results
``` 
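A toy example can make the object-level matching easier to follow. The sketch below builds the IoU matrix with `np.bincount` instead of `np.histogram2d` (a swapped-in but equivalent way to count overlaps, assuming sequential integer labels) and then counts TP, FP and FN at one threshold in the same way as `measures_at`. The helper name `iou_matrix` and the two small label images are made up for illustration.

```python
import numpy as np

def iou_matrix(ground_truth, prediction):
    """Pairwise IoU between labelled objects (rows: true, columns: predicted)."""
    n_true = ground_truth.max() + 1
    n_pred = prediction.max() + 1
    # Joint count of label pairs = pixel overlap between every pair of objects.
    pair_index = ground_truth.ravel() * n_pred + prediction.ravel()
    intersection = np.bincount(pair_index, minlength=n_true * n_pred).reshape(n_true, n_pred)
    area_true = intersection.sum(axis=1, keepdims=True)
    area_pred = intersection.sum(axis=0, keepdims=True)
    union = area_true + area_pred - intersection
    # Drop the background row and column.
    return intersection[1:, 1:] / np.maximum(union[1:, 1:], 1)

ground_truth = np.array([[1, 1, 0, 0],
                         [1, 1, 0, 2],
                         [0, 0, 0, 2]])
prediction = np.array([[1, 1, 0, 0],
                       [1, 0, 0, 0],
                       [0, 0, 0, 0]])

iou = iou_matrix(ground_truth, prediction)   # object 1: 3/4, object 2: 0
matches = iou > 0.7
tp = int(np.sum(matches.sum(axis=1) == 1))   # true objects with exactly one match
fp = int(np.sum(matches.sum(axis=0) == 0))   # predicted objects without a match
fn = int(np.sum(matches.sum(axis=1) == 0))   # true objects without a match
f1 = 2 * tp / (2 * tp + fp + fn)
print(tp, fp, fn, round(f1, 3))              # 1 0 1 0.667
```

Here the first nucleus is found with an IoU of 0.75, and the second one is missed completely, so it counts as a false negative.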

## experiment.py

```python
import sys
import os
import os.path
    
import numpy as np
import pandas as pd
    
import tensorflow as tf
    
import keras.backend
import keras.callbacks
import keras.layers
import keras.models
import keras.optimizers
    
import utils.model_builder
import utils.data_provider
import utils.metrics
import utils.objectives
import utils.dirtools
import utils.evaluation
    
import skimage.io
import skimage.morphology
import skimage.segmentation


def run(config_vars, data_partitions, experiment_name, partition, GPU="2"):

    # Device configuration
    configuration = tf.ConfigProto()
    configuration.gpu_options.allow_growth = True
    configuration.gpu_options.visible_device_list = GPU
    session = tf.Session(config = configuration)
    
    # apply session
    keras.backend.set_session(session)

    # # Step 02
    # # Training a U-Net model    
    
    train_gen = utils.data_provider.random_sample_generator(
        config_vars["normalized_images_dir"],
        config_vars["boundary_labels_dir"],
        data_partitions["training"],
        config_vars["batch_size"],
        config_vars["pixel_depth"],
        config_vars["crop_size"],
        config_vars["crop_size"],
        config_vars["rescale_labels"]
    )
    
    val_gen = utils.data_provider.single_data_from_images(
         config_vars["normalized_images_dir"],
         config_vars["boundary_labels_dir"],
         data_partitions["validation"],
         config_vars["val_batch_size"],
         config_vars["pixel_depth"],
         config_vars["crop_size"],
         config_vars["crop_size"],
         config_vars["rescale_labels"]
    )
    
    model = utils.model_builder.get_model_3_class(config_vars["crop_size"], config_vars["crop_size"], activation=None)
    
    loss = utils.objectives.weighted_crossentropy
    
    metrics = [keras.metrics.categorical_accuracy, 
               utils.metrics.channel_recall(channel=0, name="background_recall"), 
               utils.metrics.channel_precision(channel=0, name="background_precision"),
               utils.metrics.channel_recall(channel=1, name="interior_recall"), 
               utils.metrics.channel_precision(channel=1, name="interior_precision"),
               utils.metrics.channel_recall(channel=2, name="boundary_recall"), 
               utils.metrics.channel_precision(channel=2, name="boundary_precision"),
              ]
    
    optimizer = keras.optimizers.RMSprop(lr=config_vars["learning_rate"])
    
    model.compile(loss=loss, metrics=metrics, optimizer=optimizer)
    
    callback_csv = keras.callbacks.CSVLogger(filename=config_vars["csv_log_file"])
    
    callbacks=[callback_csv]
    
    # TRAIN
    statistics = model.fit_generator(
        generator=train_gen,
        steps_per_epoch=config_vars["steps_per_epoch"],
        epochs=config_vars["epochs"],
        validation_data=val_gen,
        validation_steps=int(len(data_partitions["validation"])/config_vars["val_batch_size"]),
        callbacks=callbacks,
        verbose = 1
    )
    
    model.save_weights(config_vars["model_file"])
    
    print('Training Done! :)')
    
    
    # # Step 03
    # # Predict segmentations
        
    image_names = [f for f in data_partitions[partition] if f.startswith("IXM")]
    image_names = [os.path.join(config_vars["normalized_images_dir"], f) for f in image_names]#data_partitions[partition]]
    
    imagebuffer = skimage.io.imread_collection(image_names)
    
    images = imagebuffer.concatenate()
    
    dim1 = images.shape[1]
    dim2 = images.shape[2]
    
    images = images.reshape((-1, dim1, dim2, 1))
    
    images = images / 255
    
    model = utils.model_builder.get_model_3_class(dim1, dim2)
    model.load_weights(config_vars["model_file"])
    
    predictions = model.predict(images, batch_size=1)
    
    for i in range(len(images)):
    
        filename = imagebuffer.files[i]
        filename = os.path.basename(filename)
        
        probmap = predictions[i].squeeze()
        
        skimage.io.imsave(config_vars["probmap_out_dir"] + filename, probmap)
        
        pred = utils.metrics.probmap_to_pred(probmap, config_vars["boundary_boost_factor"])
    
        label = utils.metrics.pred_to_label(pred, config_vars["cell_min_size"])
        
        skimage.io.imsave(config_vars["labels_out_dir"] + filename, label)
    
    
    # # Step 04
    # # Evaluation of performance
    
    all_images = data_partitions[partition]
    #all_images = [f for f in data_partitions[partition] if f.startswith("IXM")]
    
    
    # Columns match the rows written by utils.evaluation.compute_af1_results
    results = pd.DataFrame(columns=["Image", "Threshold", "F1", "Jaccard", "TP", "FP", "FN"])
    false_negatives = pd.DataFrame(columns=["False_Negative", "Area"])
    splits_merges = pd.DataFrame(columns=["Image_Name", "Merges","Splits"])

    for image_name in all_images:
        img_filename = os.path.join(config_vars["raw_annotations_dir"], image_name)
        ground_truth = skimage.io.imread(img_filename)
        if len(ground_truth.shape) == 3:
            ground_truth = ground_truth[:,:,0]

        ground_truth = skimage.morphology.label(ground_truth)

        pred_filename = os.path.join(config_vars["labels_out_dir"], image_name)
        prediction = skimage.io.imread(pred_filename) #.replace(".png",".tiff"))

        if config_vars["object_dilation"] > 0:
            struct = skimage.morphology.square(config_vars["object_dilation"])
            prediction = skimage.morphology.dilation(prediction, struct)
        elif config_vars["object_dilation"] < 0:
            struct = skimage.morphology.square(-config_vars["object_dilation"])
            prediction = skimage.morphology.erosion(prediction, struct)

        ground_truth = skimage.segmentation.relabel_sequential(ground_truth[30:-30,30:-30])[0] # )[0] #
        prediction = skimage.segmentation.relabel_sequential(prediction[30:-30,30:-30])[0] # )[0] #

        # evaluation.py (see above) provides compute_af1_results, not compute_ap_results
        results = utils.evaluation.compute_af1_results(
            ground_truth, 
            prediction, 
            results, 
            image_name
        )

        false_negatives = utils.evaluation.get_false_negatives(
            ground_truth, 
            prediction, 
            false_negatives, 
            image_name
        )

        splits_merges = utils.evaluation.get_splits_and_merges(
            ground_truth, 
            prediction, 
            splits_merges, 
            image_name
        )


    # # Report of results

    output = {}

    # Average F1 per IoU threshold, then averaged over thresholds.
    # The value is still stored under the "MAP" key so that existing callers keep working.
    average_performance = results.groupby("Threshold").mean().reset_index()
    mean_average_f1 = average_performance["F1"].mean()
    output["MAP"] = mean_average_f1
    
    false_negatives = false_negatives[false_negatives["False_Negative"] == 1]
    
    missed = false_negatives.groupby(
        pd.cut(
            false_negatives["Area"], 
            [0,250,625,900,10000], # Area intervals
            labels=["Tiny nuclei","Small nuclei","Normal nuclei","Large nuclei"],
        )
    )["False_Negative"].sum()
    
    output["Missed"] = missed
    output["Splits"] = np.sum(splits_merges["Splits"])
    output["Merges"] = np.sum(splits_merges["Merges"])

    return output
```

## metrics.py

```python
import numpy as np
import skimage.morphology
import skimage.segmentation
import skimage.io
import keras.backend as K
import tensorflow as tf

debug = False

def channel_precision(channel, name):
    def precision_func(y_true, y_pred):
        y_pred_tmp = K.cast(tf.equal( K.argmax(y_pred, axis=-1), channel), "float32")
        true_positives = K.sum(K.round(K.clip(y_true[:,:,:,channel] * y_pred_tmp, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred_tmp, 0, 1)))
        precision = true_positives / (predicted_positives + K.epsilon())
    
        return precision
    precision_func.__name__ = name
    return precision_func


def channel_recall(channel, name):
    def recall_func(y_true, y_pred):
        y_pred_tmp = K.cast(tf.equal( K.argmax(y_pred, axis=-1), channel), "float32")
        true_positives = K.sum(K.round(K.clip(y_true[:,:,:,channel] * y_pred_tmp, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true[:,:,:,channel], 0, 1)))
        recall = true_positives / (possible_positives + K.epsilon())
    
        return recall
    recall_func.__name__ = name
    return recall_func


## PROBMAP TO CONTOURS TO LABEL

def probmap_to_contour(probmap, threshold = 0.5):
    # assume 2D input
    outline = probmap >= threshold
    
    return outline

def contour_to_label(outline, image):
    # see notebook contours_to_labels for why we do what we do here
    
    # get connected components
    labels = skimage.morphology.label(outline, background=1)
    skimage.morphology.remove_small_objects(labels, min_size = 100, in_place = True)
    
    n_ccs = np.max(labels)

    # buffer label image
    filtered_labels = np.zeros_like(labels, dtype=np.uint16)

    # relabel as we don't know what connected component the background has been given before
    label_index = 1
    
    # start at 1 (0 is contours), end at number of connected components
    for i in range(1, n_ccs + 1):

        # get mask of connected components
        mask = labels == i

        # get mean
        mean = np.mean(np.take(image.flatten(),np.nonzero(mask.flatten())))

        if(mean > 50/255):
            filtered_labels[mask] = label_index
            label_index = label_index + 1
            
    return filtered_labels


## PROBMAP TO PRED TO LABEL

def probmap_to_pred(probmap, boundary_boost_factor):
    # we need to boost the boundary class to make it more visible
    # this shrinks the cells a little bit but avoids undersegmentation
    pred = np.argmax(probmap * [1, 1, boundary_boost_factor], -1)
    
    return pred


def pred_to_label(pred, cell_min_size, cell_label=1):
    # Only marks interior of cells (cell_label = 1 is interior, cell_label = 2 is boundary)
    cell=(pred == cell_label)
    # fix cells
    cell = skimage.morphology.remove_small_holes(cell, area_threshold=cell_min_size)
    cell = skimage.morphology.remove_small_objects(cell, min_size=cell_min_size)
    
    # label cells only
    [label, num] = skimage.morphology.label(cell, return_num=True)
    return label


def compare_two_labels(label_model, label_gt, return_IoU_matrix):
    
    # get number of detected nuclei
    nb_nuclei_gt = np.max(label_gt)
    nb_nuclei_model = np.max(label_model)
    
    # catch the case of an empty picture in model and gt
    if nb_nuclei_gt == 0 and nb_nuclei_model == 0:
        if(return_IoU_matrix):
            return [0, 0, 1, np.empty(0)]     
        else:
            return [0, 0, 1]
    
    # catch the case of empty picture in model
    if nb_nuclei_model == 0:
        if(return_IoU_matrix):
            return [0, nb_nuclei_gt, 0, np.empty(0)]     
        else:
            return [0, nb_nuclei_gt, 0]
    
    # catch the case of empty picture in gt
    if nb_nuclei_gt == 0:
        if(return_IoU_matrix):
            return [nb_nuclei_model, 0, 0, np.empty(0)]     
        else:
            return [nb_nuclei_model, 0, 0]
    
    # build IoU matrix
    IoUs = np.full((nb_nuclei_gt, nb_nuclei_model), -1, dtype = np.float32)

    # calculate IoU for each nucleus index_gt in GT and nucleus index_pred in prediction    
    # TODO improve runtime of this algorithm
    for index_gt in range(1,nb_nuclei_gt+1):

        nucleus_gt = label_gt == index_gt
        number_gt = np.sum(nucleus_gt)

        for index_model in range(1,nb_nuclei_model+1):
            
            if debug:
                print(index_gt, "/", index_model)
            
            nucleus_model = label_model == index_model 
            number_model = np.sum(nucleus_model)
            
            same_and_1 = np.sum((nucleus_gt == nucleus_model) * nucleus_gt)
            
            IoUs[index_gt-1,index_model-1] = same_and_1 / (number_gt + number_model - same_and_1)
    
    # get matches and errors
    detection_map = (IoUs > 0.5)
    nb_matches = np.sum(detection_map)

    detection_rate = IoUs * detection_map
    
    nb_overdetection = nb_nuclei_model - nb_matches
    nb_underdetection = nb_nuclei_gt - nb_matches
    
    mean_IoU = np.mean(np.sum(detection_rate, axis = 1))
    
    if(return_IoU_matrix):
        result = [nb_overdetection, nb_underdetection, mean_IoU, IoUs]
    else:
        result = [nb_overdetection, nb_underdetection, mean_IoU]
    return result

def splits_and_merges_3_class(y_model_pred, y_gt_pred):
    
    # get segmentations
    label_gt = pred_to_label(y_gt_pred, cell_min_size=2)
    label_model = pred_to_label(y_model_pred, cell_min_size=2)
    
    # compare labels
    result = compare_two_labels(label_model, label_gt, False)
        
    return result

def splits_and_merges_boundary(y_model_outline, y_gt_outline, image):
    
    # get segmentations
    label_gt = contour_to_label(y_gt_outline, image)
    label_model = contour_to_label(y_model_outline, image)
    
    # compare labels
    result = compare_two_labels(label_model, label_gt, False)
        
    return result
```
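The effect of the boundary boost in `probmap_to_pred` is easiest to see on two pixels. This is a small sketch with made-up probabilities and an illustrative boost factor of 2.0 (the real value comes from `config_vars["boundary_boost_factor"]`):

```python
import numpy as np

# Class probabilities for two pixels, in (background, interior, boundary) order.
probmap = np.array([
    [0.1, 0.5, 0.4],   # interior narrowly beats boundary
    [0.7, 0.2, 0.1],   # clear background
])

plain = np.argmax(probmap, axis=-1)
boosted = np.argmax(probmap * [1, 1, 2.0], axis=-1)  # 2.0 is an illustrative boost factor

print(plain)    # [1 0]
print(boosted)  # [2 0]
```

Only the uncertain pixel changes class, from interior to boundary, which is what helps to separate touching nuclei.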

## model_builder.py

```python
import keras.layers
import keras.models
import tensorflow as tf

CONST_DO_RATE = 0.5

option_dict_conv = {"activation": "relu", "border_mode": "same"}
option_dict_bn = {"mode": 0, "momentum" : 0.9}


# returns a core model from gray input to 64 channels of the same size
def get_core(dim1, dim2):
    
    x = keras.layers.Input(shape=(dim1, dim2, 1))

    a = keras.layers.Convolution2D(64, 3, 3, **option_dict_conv)(x)  
    a = keras.layers.BatchNormalization(**option_dict_bn)(a)

    a = keras.layers.Convolution2D(64, 3, 3, **option_dict_conv)(a)
    a = keras.layers.BatchNormalization(**option_dict_bn)(a)

    
    y = keras.layers.MaxPooling2D()(a)

    b = keras.layers.Convolution2D(128, 3, 3, **option_dict_conv)(y)
    b = keras.layers.BatchNormalization(**option_dict_bn)(b)

    b = keras.layers.Convolution2D(128, 3, 3, **option_dict_conv)(b)
    b = keras.layers.BatchNormalization(**option_dict_bn)(b)

    
    y = keras.layers.MaxPooling2D()(b)

    c = keras.layers.Convolution2D(256, 3, 3, **option_dict_conv)(y)
    c = keras.layers.BatchNormalization(**option_dict_bn)(c)

    c = keras.layers.Convolution2D(256, 3, 3, **option_dict_conv)(c)
    c = keras.layers.BatchNormalization(**option_dict_bn)(c)

    
    y = keras.layers.MaxPooling2D()(c)

    d = keras.layers.Convolution2D(512, 3, 3, **option_dict_conv)(y)
    d = keras.layers.BatchNormalization(**option_dict_bn)(d)

    d = keras.layers.Convolution2D(512, 3, 3, **option_dict_conv)(d)
    d = keras.layers.BatchNormalization(**option_dict_bn)(d)

    
    # UP

    d = keras.layers.UpSampling2D()(d)

    y = keras.layers.merge.concatenate([d, c], axis=3)

    e = keras.layers.Convolution2D(256, 3, 3, **option_dict_conv)(y)
    e = keras.layers.BatchNormalization(**option_dict_bn)(e)

    e = keras.layers.Convolution2D(256, 3, 3, **option_dict_conv)(e)
    e = keras.layers.BatchNormalization(**option_dict_bn)(e)

    e = keras.layers.UpSampling2D()(e)

    
    y = keras.layers.merge.concatenate([e, b], axis=3)

    f = keras.layers.Convolution2D(128, 3, 3, **option_dict_conv)(y)
    f = keras.layers.BatchNormalization(**option_dict_bn)(f)

    f = keras.layers.Convolution2D(128, 3, 3, **option_dict_conv)(f)
    f = keras.layers.BatchNormalization(**option_dict_bn)(f)

    f = keras.layers.UpSampling2D()(f)

    
    y = keras.layers.merge.concatenate([f, a], axis=3)

    y = keras.layers.Convolution2D(64, 3, 3, **option_dict_conv)(y)
    y = keras.layers.BatchNormalization(**option_dict_bn)(y)

    y = keras.layers.Convolution2D(64, 3, 3, **option_dict_conv)(y)
    y = keras.layers.BatchNormalization(**option_dict_bn)(y)

    return [x, y]


def get_model_3_class(dim1, dim2, activation="softmax"):
    
    [x, y] = get_core(dim1, dim2)

    y = keras.layers.Convolution2D(3, 1, 1, **option_dict_conv)(y)

    if activation is not None:
        y = keras.layers.Activation(activation)(y)

    model = keras.models.Model(x, y)
    
    return model

```
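`get_core` applies three `MaxPooling2D` layers and later concatenates the upsampled feature maps with the earlier ones. With the default 2x2 pooling this should only line up when both image sides are multiples of 8, which matters in the prediction step where the model is built with the full image size. A small check, using a hypothetical helper `fits_unet`:

```python
def fits_unet(dim1, dim2, n_poolings=3):
    """Return True if both sides survive `n_poolings` 2x2 poolings without rounding."""
    factor = 2 ** n_poolings
    return dim1 % factor == 0 and dim2 % factor == 0

print(fits_unet(256, 256))    # True
print(fits_unet(1080, 1080))  # True (1080 = 8 * 135)
print(fits_unet(520, 700))    # False (700 is not divisible by 8)
```

Images that fail this check would probably need to be cropped or padded before prediction.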

## objectives.py

```python
import keras.metrics
import tensorflow as tf


def weighted_crossentropy(y_true, y_pred):

    class_weights = tf.constant([[[[1., 1., 10.]]]])

    unweighted_losses = tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_pred)

    weights = tf.reduce_sum(class_weights * y_true, axis=-1)

    weighted_losses = weights * unweighted_losses

    loss = tf.reduce_mean(weighted_losses)

    return loss
```
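To check what the loss does without TensorFlow, the same computation can be written in NumPy. This is a sketch and not the training code: the function name `weighted_crossentropy_np` is made up, and it assumes one-hot labels in (background, interior, boundary) order and raw logits as input.

```python
import numpy as np

def weighted_crossentropy_np(y_true, logits, class_weights=(1.0, 1.0, 10.0)):
    """NumPy version of the per-pixel weighted softmax cross-entropy."""
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    unweighted = -(y_true * log_probs).sum(axis=-1)
    # One-hot labels select the weight of the true class for every pixel.
    weights = (np.asarray(class_weights) * y_true).sum(axis=-1)
    return float((weights * unweighted).mean())

# Two pixels with uninformative (all-zero) logits: one background, one boundary.
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
logits = np.zeros((2, 3))
print(weighted_crossentropy_np(y_true, logits))  # (1 + 10) / 2 * ln(3), about 6.04
```

Both pixels have the same unweighted loss of ln(3), but the boundary pixel counts ten times as much in the mean.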