# Siamese U-Net Quickstart

*Yuxi Long*

## 1. Introduction

The Siamese U-Net is an improvement on the original U-Net architecture. It adds a second encoder that encodes an additional frame besides the frame we are trying to predict. See [this paper](https://pubmed.ncbi.nlm.nih.gov/31927473/). This repository contains an implementation of this network.

If you need help using a function, you can always try running `help(whichever_interesting_function)` or just look at the source code. If you need help using a class (one that is directly under the `biu.siam_unet` directory), working through the examples in this notebook will probably be more helpful than reading the documentation of that class.

IMPORTANT: Two packages that depend on your hardware need to be installed manually before running biu. To install CUDA 11.1, which is officially supported by PyTorch, navigate to [its installation page](https://developer.nvidia.com/cuda-11.1.1-download-archive) and follow the on-screen instructions. Because PyTorch depends on your installed CUDA version, it needs to be installed manually as well, through [the official PyTorch website](https://pytorch.org/get-started/locally/). Select the CUDA version that matches your installation on that page and run the generated command in your terminal. biu doesn't depend on a specific version of CUDA and has been tested with PyTorch 1.7.0+.

Finally, to import the Siamese U-Net package, write `import biu.siam_unet as unet`.
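
Before importing biu, it can help to confirm that PyTorch is installed and can see your GPU. The snippet below is a minimal sketch, not part of biu: the helper name `describe_torch_environment` is made up for this example, and it only uses the standard PyTorch attributes `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()`. If PyTorch is missing it reports that instead of raising an error.

```python
def describe_torch_environment():
    """Return a small dict describing the local PyTorch/CUDA setup.

    Hypothetical helper for this quickstart; it is not part of biu.
    """
    try:
        import torch
    except ImportError:
        # PyTorch is not installed (yet); report that instead of failing.
        return {
            "torch_installed": False,
            "torch_version": None,
            "cuda_build": None,
            "cuda_available": False,
        }

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        # True only if a usable GPU and driver are actually present.
        "cuda_available": torch.cuda.is_available(),
    }


info = describe_torch_environment()
print(info)
```

If `cuda_available` is `False` even though PyTorch is installed, the CUDA version selected on the PyTorch website probably does not match the one installed on your machine.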

### Basics of Siam U-Net

The Siamese U-Net takes two inputs: the frame we are trying to infer (the "current frame") and the frame before it (the "previous frame"). From these two inputs it tries to infer where the cell boundaries are.

```
current frame --->|                 |
                  |   Siam U-Net    | ---> cell boundaries ("labels")
previous frame -->|                 |

```

To train the Siamese U-Net, we need to first do the work of drawing the cell boundaries ourselves, so the machine can learn how to draw the cell boundaries given the frames. Example training data can be found at <https://filedn.eu/lKfS794F9UgX7PDuBQcfChB/DeepTissue/training_data.zip>. 

You can use any drawing software to create the training data. We have found [Autodesk Sketchbook](https://www.autodesk.com/products/sketchbook/overview) or [GIMP](https://www.gimp.org/) to be quite useful. Just make sure to use a tablet with a stylus and turn on a podcast while you draw. To create the training data, first isolate the frame you want, either with `biu.helpers.extract_frame_of_movie` or with GIMP or whatever magical tool you have. Open the image in your drawing software, decrease its opacity, add another layer on top of the image, and start drawing your labels (what you want the network to predict). We recommend the pen tool with color black and radius 1.

After you have finished drawing labels for the entire image, remove (or hide) the image layer so that only the label layer is left. Export your label, convert it to grayscale using any tool you can find (we have found <https://online-photo-converter.com/black-and-white-image> to be handy), give it the same file name as the image, and move on to the next frame you wish to label. You don't need to create a label for every frame, but in theory the more labeled frames you have, the better your neural network will learn. Always feel free to train the network and see what output you get even if you don't think you have enough labels.

Once you are done creating labels, move on to the next step.

## 2. Data preparation

Because the Siamese U-Net requires an additional input for training, we need to supply an additional frame and use the appropriate dataloader for it.

#### If you know which frame you drew the label with

The dataloader in `siam_unet_cosh` takes an image that results from concatenating the previous frame with the current frame. If you already know which frame of which movie you want to train on, you can create this concatenated data using `generate_siam_unet_input_imgs.py`.

In [None]:
from biu.siam_unet.helpers.generate_siam_unet_input_imgs import generate_coupled_image

# despite its name, movie_dir is the path to the tif movie itself
movie_dir = '/media/longyuxi/H is for HUGE/docmount backup/unet_pytorch/training_data/test_data/new_microscope/21B11-shgGFP-kin-18-bro4.tif'  # change this
frame = 10  # the frame you drew the label with; change this
out_dir = './training_data/training_data/yokogawa/siam_data/image/'  # change this

generate_coupled_image(movie_dir, frame, out_dir)
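
To make "concatenating the previous frame with the current frame" concrete, here is a toy sketch in plain Python. It is only an illustration: the hypothetical `couple_frames` helper assumes the two frames are joined side by side with the previous frame on the left, whereas the actual layout produced by `generate_coupled_image` is not described in this notebook and may differ.

```python
def couple_frames(previous_frame, current_frame):
    """Join two equally sized frames side by side (previous frame on the left).

    Frames are lists of rows; each row is a list of pixel values.
    Hypothetical helper; the real layout used by biu is an assumption here.
    """
    if len(previous_frame) != len(current_frame):
        raise ValueError("frames must have the same number of rows")

    coupled = []
    for prev_row, cur_row in zip(previous_frame, current_frame):
        if len(prev_row) != len(cur_row):
            raise ValueError("frames must have the same number of columns")
        # Each output row is the previous-frame row followed by the current one.
        coupled.append(list(prev_row) + list(cur_row))
    return coupled


# Two tiny 2x3 "frames" standing in for real microscope images.
previous_frame = [[0, 1, 2], [3, 4, 5]]
current_frame = [[10, 11, 12], [13, 14, 15]]

coupled = couple_frames(previous_frame, current_frame)
print(coupled)  # [[0, 1, 2, 10, 11, 12], [3, 4, 5, 13, 14, 15]]
```

The coupled image has the same height as one frame and twice its width, which is why a single file can carry both inputs to the network.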

#### If you don't know which frame you drew the label with

If you have frames and labels, but you don't know which frame of which movie each frame comes from, you can use `find_frame_of_image`. This function takes your query image and compares it against a list of tif files you specify through the parameter `search_space`.

In [None]:
from biu.siam_unet.helpers.find_frame_of_image import find_frame_of_image

image_name = './training_data/training_data/yokogawa/lateral_epidermis/image/83.tif'

razer_local_search_dir = '/media/longyuxi/H is for HUGE/docmount backup/all_movies'
tifs_names = [
    '21B11-shgGFP-kin-18-bro4',
    '21B25_shgGFP_kin_1_Pos0',
    '21C04_shgGFP_kin_2_Pos4',
    '21C26_shgGFP_Pos12',
    '21D16_shgGFPkin_Pos7',
]
search_space = [razer_local_search_dir + '/' + t + '.tif' for t in tifs_names]

find_frame_of_image(image_name, search_space=search_space)


This function not only prints what it finds to stdout, but also writes a machine-readable file, whose location is specified by `machine_readable_output_filename`, listing the frames it located with high confidence (i.e. an MSE of < 1000 and matching frame numbers). This output can then be used by `generate_siam_unet_input_imgs.py`.

In [None]:
from biu.siam_unet.helpers.generate_siam_unet_input_imgs import utilize_search_result

utilize_search_result(
    './training_data/training_data/yokogawa/amnioserosa/search_result_mr.txt',
    './training_data/test_data/new_microscope',
    './training_data/training_data/yokogawa/amnioserosa/label/',
    './training_data/training_data/yokogawa/siam_amnioserosa_sanitize_test/',
)
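
The MSE criterion mentioned above can be illustrated with a small, self-contained sketch. This is not the code of `find_frame_of_image`: the helpers `mean_squared_error` and `best_matching_frame` are hypothetical, images are simplified to flat lists of pixel values, and only the "MSE below 1000" part of the confidence check is modeled (the "matching frame numbers" part is left out).

```python
def mean_squared_error(image_a, image_b):
    """MSE between two images given as flat sequences of pixel values."""
    if len(image_a) != len(image_b):
        raise ValueError("images must contain the same number of pixels")
    if not image_a:
        raise ValueError("images must not be empty")
    return sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)


def best_matching_frame(query, frames, max_mse=1000):
    """Return (frame_index, mse) of the closest frame.

    Returns None when even the best candidate has an MSE of max_mse or more,
    i.e. when the match is not considered confident.
    """
    errors = [mean_squared_error(query, frame) for frame in frames]
    best_index = min(range(len(errors)), key=errors.__getitem__)
    best_error = errors[best_index]
    if best_error < max_mse:
        return best_index, best_error
    return None


# A toy "movie" with three 4-pixel frames and a slightly noisy copy of frame 1.
movie = [[0, 0, 0, 0], [100, 120, 90, 110], [200, 210, 220, 230]]
query = [101, 119, 90, 112]

match = best_matching_frame(query, movie)
print(match)  # (1, 1.5)
```

A query that has been rescaled or heavily edited after extraction will produce a large MSE against every frame, so it is reported as not found rather than matched to the wrong frame.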


Finally, organize the labels and images as shown below. An example can be found at `siam_package/training_data/yokogawa/siam_lateral_epidermis`:

```
siam_package/training_data/yokogawa/siam_lateral_epidermis
|
├── image
│   ├── 105.tif
│   ├── 111.tif
│   ├── 120.tif
│   ├── 121.tif
│   ├── 1.tif
│   ├── 2.tif
│   ├── 3.tif
│   ├── 5.tif
│   ├── 7.tif
│   └── 83.tif
└── label
    ├── 105.tif
    ├── 111.tif
    ├── 120.tif
    ├── 121.tif
    ├── 1.tif
    ├── 2.tif
    ├── 3.tif
    ├── 5.tif
    ├── 7.tif
    └── 83.tif
```
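
Since every image needs a label with the same file name, a quick consistency check before training can save a failed run. The following is a minimal sketch using only the standard library; the helper `check_image_label_pairs` is hypothetical and simply assumes the `image/` and `label/` subfolder layout shown above.

```python
from pathlib import Path


def check_image_label_pairs(dataset_dir, pattern="*.tif"):
    """Compare file names in <dataset_dir>/image and <dataset_dir>/label.

    Hypothetical helper, not part of biu. Returns a dict with the names that
    are present in both folders and the names missing from either one.
    """
    dataset_dir = Path(dataset_dir)
    image_names = {p.name for p in (dataset_dir / "image").glob(pattern)}
    label_names = {p.name for p in (dataset_dir / "label").glob(pattern)}
    return {
        "paired": sorted(image_names & label_names),
        "missing_label": sorted(image_names - label_names),
        "missing_image": sorted(label_names - image_names),
    }
```

Calling `check_image_label_pairs('./training_data/training_data/yokogawa/siam_lateral_epidermis')` should then report empty `missing_label` and `missing_image` lists when the folder is organized as in the tree above.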

## 3. Training

Training is simple. For example:

In [None]:
from biu.siam_unet import *

dataset = 'yokogawa/siam_lateral_epidermis'
base_dir = './'

# path to training data (images and labels with identical names in separate folders)
dir_images = f'{base_dir}/training_data/training_data/{dataset}/image/'
dir_masks = f'{base_dir}/training_data/training_data/{dataset}/label/'

print('starting to create training dataset')
# create training data set
data = DataProcess(
    [dir_images, dir_masks], data_path='./data', dilate_mask=0, aug_factor=10,
    create=False, invert=False, clip_thres=(0.2, 99.8), dim_out=(256, 256),
    shiftscalerotate=(0, 0, 0),
)

save_dir = f'{base_dir}/models/siam_bce_amnio'
# create trainer
training = Trainer(
    data, num_epochs=500, batch_size=12, load_weights=False, lr=0.0001,
    n_filter=32, save_iter=True, save_dir=save_dir,
)

training.start()


Note that `n_filter` is set to `32` here. The network will also work with a different value, but you must pass the same value of `n_filter` when predicting (see the Predict section).
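
One way to avoid mismatched `n_filter` values is to store the training parameters next to the model and read them back before predicting. The sketch below is not a biu feature as far as this notebook describes: the helpers `save_train_config` and `load_train_config` and the file name `train_config.json` are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_train_config(save_dir, **params):
    """Write the training parameters to <save_dir>/train_config.json."""
    save_dir = Path(save_dir)
    save_dir.mkdir(parents=True, exist_ok=True)
    config_path = save_dir / "train_config.json"
    config_path.write_text(json.dumps(params, indent=2))
    return config_path


def load_train_config(save_dir):
    """Read the parameters back as a dict."""
    return json.loads((Path(save_dir) / "train_config.json").read_text())


# Demonstration in a temporary folder; in practice save_dir would be the
# same directory that holds the trained model files.
with tempfile.TemporaryDirectory() as tmp:
    save_train_config(tmp, n_filter=32, lr=0.0001, batch_size=12)
    config = load_train_config(tmp)

print(config["n_filter"])  # 32
```

At prediction time the loaded value could then be passed on, for example as `n_filter=config["n_filter"]`.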

## 4. Predict

Predicting is simple as well. Just swap in your own parameters:

In [None]:
# load package
import os

from biu.siam_unet import *
from biu.siam_unet.helpers import tif_to_mp4

os.nice(10)  # lower this process's scheduling priority (Unix only)

base_dir = './'
out_dir = f'{base_dir}/predicted_out'
model = f'{base_dir}/models/siam_bce_amnio/model_epoch_100.pth'

tif_file = f'{base_dir}/training_data/test_data/new_microscope/21C04_shgGFP_kin_2_Pos4.tif'

result_file = f'{out_dir}/siam_bce_amnio_100_epochs_21C04_shgGFP_kin_2_Pos4.tif'
out_mp4_file = result_file[:-4] + '.mp4'

print('starting to predict file')
# predict file 
predict = Predict(tif_file, result_file, model, invert=False, resize_dim=(512, 512), n_filter=32)
# convert to mp4
tif_to_mp4.convert_to_mp4(result_file, output_file=out_mp4_file, normalize_to_0_255=True)
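
When predicting several movies, the output names can be derived from the input name instead of being typed by hand. This is a small sketch with `pathlib`; the helper `prediction_paths` and its `model_tag` argument are hypothetical and merely reproduce the naming pattern used in the cell above.

```python
from pathlib import Path


def prediction_paths(tif_file, out_dir, model_tag):
    """Return (result_tif, result_mp4) paths for one input movie.

    The result is named <model_tag>_<input file name without extension>.tif
    and the mp4 path differs from it only in its suffix.
    """
    tif_file = Path(tif_file)
    result_file = Path(out_dir) / f"{model_tag}_{tif_file.stem}.tif"
    return result_file, result_file.with_suffix(".mp4")


result_file, out_mp4_file = prediction_paths(
    "./training_data/test_data/new_microscope/21C04_shgGFP_kin_2_Pos4.tif",
    "./predicted_out",
    "siam_bce_amnio_100_epochs",
)
print(result_file)
print(out_mp4_file)
```

Unlike slicing with `result_file[:-4]`, `with_suffix` keeps working if the input ends in `.tiff` or another extension of a different length. Convert the paths with `str(...)` if a function expects plain strings.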

## Appendix: An annotated structure of the siam_unet package

Below is an annotated structure of the siam_unet package. Use `help(function)` to read the docstring of each function for a better understanding.

```
Package                                         Use

.
├── __init__.py
├── data.py                                     dataloader script
├── siam_unet.py                                Siam U-Net model
├── train.py                                    training script
├── losses.py                                   loss functions
├── predict.py                                  prediction script
├── helpers                                     helper functions (usually not 
                                                        so useful except the 
                                                        ones mentioned in this notebook)
                                                        
│   ├── average_tifs.py                             averages a list of tiff files
│   ├── create_pixel_value_histogram.py             creates histograms for the 
                                                        pixel values in tif 
                                                        files. Useful for 
                                                        debugging during training
│   ├── cuda_test.py                                tests cuda functionality
│   ├── extract_frame_of_movie.py                   extract a certain frame of a 
                                                        tif movie 
│   ├── find_frame_of_image.py                      finds the frame number of 
                                                        a given query image 
                                                        within search_space.
│   ├── generate_plain_image.py                     generates a plain image
│   ├── generate_siam_unet_input_imgs.py            generates a coupled image 
                                                        for Siam U-Net training
│   ├── low_mem_tif_utils.py                        utilities for handling tif 
                                                        files with low memory 
                                                        usage
│   ├── threshold_images.py                         thresholds each frame of a 
                                                        tif movie
│   ├── tif_to_mp4.py                               uses ffmpeg to convert a tif 
                                                        movie to mp4
│   └── util.py                                     various utilities. see docstring

```