# DeePiCt 2D U-Net segmentation

This Colab notebook creates predictions with already trained 2D models for cytosol and organelle segmentation. The tomogram that you want to segment must be reachable from Colab, for example on Google Drive. The initial spectrum matching filter step is not included in this notebook, so apply it to the tomogram beforehand. For more details about the model, see the [DeePiCt GitHub repository](https://github.com/ZauggGroup/DeePiCt/blob/main/README.md).

## Instructions:
* This notebook includes 4 steps to segment the tomogram and an optional step 5 to visualize the result.
* Make sure that the filtered tomogram is available on an online share, for example Google Drive.
* Run the cells in the order in which they are displayed. To run a cell, click the play button in its left corner.
* Some cells contain parameters that you need to define, so enter all the required information correctly before running the cell. A parameter value is only saved once you run its cell.




# Configurations
___

### Make sure you have GPU access enabled by going to Runtime -> Change Runtime Type -> Hardware accelerator and selecting GPU
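
Before installing anything, it can help to confirm that the runtime actually sees a GPU. The following is a minimal sketch, not part of DeePiCt: the helper name `gpu_visible` is hypothetical, and it assumes the NVIDIA driver tool `nvidia-smi` is on the path, as it normally is on Colab GPU runtimes.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if nvidia-smi is installed and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA driver tools on this runtime, so no GPU is attached.
        return False
    try:
        # "nvidia-smi -L" prints one line per GPU, e.g. "GPU 0: ...".
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

If this prints `GPU visible: False`, change the runtime type as described above and reconnect before continuing.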

## Step 1. Installations

In [None]:
#@markdown ## 1.1. Run this cell to connect your Google Drive to Colab

#@markdown * Click on the URL.

#@markdown * Sign in to your Google Account.

#@markdown You will either have to:
#@markdown * copy the authorisation code and enter it into the box below, OR

#@markdown * in newer versions of Google Colab, just click "Allow" and it should connect.

#@markdown * Click on the "Folder" icon on the left and press the refresh button. Your Google Drive folder should now be available here as "gdrive".

# mount user's Google Drive to Google Colab.
from google.colab import drive
drive.mount('/content/gdrive')

In [None]:
#@markdown ## 1.2. Run this cell to install necessary packages

#@markdown The code in this cell: 
#@markdown * Gets the git repository of DeePiCt

!git clone https://github.com/ZauggGroup/DeePiCt.git

#@markdown * Installs required packages

!pip install mrcfile
!pip install h5py==2.10.0
!pip install tensorflow-gpu==2.0.0
!pip install keras==2.3.1

## Step 2. Set the data variables and config file

___

In [None]:
#@markdown ## 2.1. Choose the model based on what you want to segment. The available models predict cytosol and organelles.

# Define the variable:
predict_type = "cytosol" #@param ["cytosol","organelles"]

models_weights = {"cytosol": "https://www.dropbox.com/sh/oavbtcvusi07xbh/AAAI0DrqdCOVKeCLjf0EcdBva/2d_cnn/vpp_model_cytosol_eq.h5?dl=0",
                  "organelles": "https://www.dropbox.com/sh/oavbtcvusi07xbh/AAA2DxQVSKqIygfHa51mdM30a/2d_cnn/vpp_model_organelles_eq.h5?dl=0"}


!wget -O model_weights.h5 {models_weights[predict_type]}
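
A download from a share link can silently save an HTML page instead of the weights. As a quick sanity check, the sketch below compares the first bytes of a file with the 8-byte HDF5 signature. The helper `looks_like_hdf5` is a hypothetical name, and the check assumes the file has no HDF5 user block in front of the signature, which is the usual case for Keras weight files.

```python
import os
import tempfile

# Every HDF5 file without a user block starts with these 8 bytes.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if the file at `path` starts with the HDF5 signature."""
    with open(path, "rb") as f:
        return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


# Demonstration on two throwaway files: one with the signature, one with HTML.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.h5")
    bad = os.path.join(tmp, "bad.h5")
    with open(good, "wb") as f:
        f.write(HDF5_SIGNATURE + b"\x00" * 16)
    with open(bad, "wb") as f:
        f.write(b"<!DOCTYPE html>")
    good_ok = looks_like_hdf5(good)
    bad_ok = looks_like_hdf5(bad)

print(good_ok, bad_ok)  # True False
```

In this notebook the check would be `looks_like_hdf5('/content/model_weights.h5')`. A `False` result suggests the download did not return the model file.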

In [46]:
#@markdown ## 2.2. Define important variables

#@markdown ### Define the following information in the given variables:

srcdir = '/content/DeePiCt/2d_cnn'
original_config_file = '/content/DeePiCt/2d_cnn/config.yaml'
model_path = '/content/model_weights.h5'

# Define the following variables:

# @markdown * **ID/name for the tomogram**:
tomo_name = '180426_005' #@param {type:"string"}

# @markdown * **Path to the tomogram .mrc file**:
tomogram_path = '/content/gdrive/MyDrive/tomo_data/match_spectrum_filt.mrc' #@param {type:"string"}

# @markdown * **Use n/2 slices above and below z center. If 0, select all labeled slices**:
z_cutoff = 0  #@param {type:"integer"}

# @markdown * **Path where to save the config .yaml file (you can leave the default option)**:
user_config_file = '/content/gdrive/MyDrive/DeePiCt_2d/config.yaml'  #@param {type:"string"}

# @markdown * **Path where to save the data .csv file (you can leave the default option)**:
user_data_file = '/content/gdrive/MyDrive/DeePiCt_2d/data.csv' #@param {type:"string"}

# @markdown * **Path to folder where to save prediction files (you can leave the default option)**:
user_prediction_folder = '/content/gdrive/MyDrive/DeePiCt_2d/predictions/'  #@param {type:"string"}


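import os

# Create the output folders if they do not exist yet.
os.makedirs(os.path.dirname(user_config_file), exist_ok=True)
os.makedirs(os.path.dirname(user_data_file), exist_ok=True)
# Create the prediction folder itself, whether or not the path ends with a slash.
os.makedirs(user_prediction_folder, exist_ok=True)

# A cutoff of 0 means "use all slices" and is stored as None in the config.
if z_cutoff == 0:
    z_cutoff = None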
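
To make the `z_cutoff` description concrete, here is a small sketch of one plausible reading of "n/2 slices above and below z center". It is an illustration only and not DeePiCt's own slicing code. The function `central_slice_range` is a hypothetical helper.

```python
def central_slice_range(n_slices, z_cutoff=None):
    """Return (start, stop) indices of the z slices that would be kept.

    Assumption: z_cutoff = n keeps n // 2 slices on each side of the
    central slice, and None keeps the whole stack.
    """
    if z_cutoff is None:
        return 0, n_slices
    center = n_slices // 2
    half = z_cutoff // 2
    # Clamp to the valid index range so small stacks do not go out of bounds.
    start = max(0, center - half)
    stop = min(n_slices, center + half)
    return start, stop


print(central_slice_range(200, 50))  # (75, 125)
print(central_slice_range(200))      # (0, 200)
```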


In [47]:
#@markdown ## 2.3. Create data csv file and yaml config file
#@markdown Run this cell to create the .csv data file and .yaml config file

import csv
import yaml

header = ['tomo_name','id','data','filtered_data']

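# Row describing the tomogram; the unfiltered 'data' column stays empty
# because only the filtered tomogram is used for prediction.
data = [tomo_name, tomo_name, '', tomogram_path]

# newline='' lets the csv module control line endings itself.
with open(user_data_file, 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)

    # write the header
    writer.writerow(header)

    # write the data
    writer.writerow(data)
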
data_dictionary = dict(zip(header, data))

def read_yaml(file_path):
    with open(file_path, "r") as stream:
        data = yaml.safe_load(stream)
    return data

def save_yaml(data, file_path):
    with open(file_path, 'w') as yaml_file:
        yaml.dump(data, yaml_file, default_flow_style=False)

d = read_yaml(original_config_file)
d['prediction_data'] = user_data_file
d['output_dir'] = user_prediction_folder

d['preprocessing']['filtering']['active'] = False
d['preprocessing']['filtering']['target_spectrum'] = ''
d['preprocessing']['filtering']['lowpass_cutoff'] = 350
d['preprocessing']['filtering']['smoothen_cutoff'] = 20
d['preprocessing']['slicing']['z_cutoff'] = z_cutoff

d['training']['evaluation']['active'] = False
d['training']['production']['active'] = False

d['prediction']['active'] = True
d['prediction']['model'] = model_path


save_yaml(d, user_config_file)
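
The data file written by this cell holds one header row and one row for the tomogram. The sketch below writes the same layout to a temporary folder and reads it back with `csv.DictReader`, which is a simple way to inspect what the prediction script will receive. The row values are just the notebook defaults, used here for illustration.

```python
import csv
import os
import tempfile

header = ['tomo_name', 'id', 'data', 'filtered_data']
# Example row using the notebook's default values (illustrative only).
row = ['180426_005', '180426_005', '',
       '/content/gdrive/MyDrive/tomo_data/match_spectrum_filt.mrc']

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.csv')

    # Write the file the same way the notebook does.
    with open(path, 'w', encoding='UTF8', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerow(row)

    # Read it back as a list of dictionaries keyed by the header names.
    with open(path, 'r', encoding='UTF8', newline='') as f:
        records = list(csv.DictReader(f))

for key, value in records[0].items():
    print(f"{key}: {value!r}")
```

To inspect the real file, replace `path` with `user_data_file` and skip the writing step.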

## Step 3. Predict with trained neural network

___

In [None]:
#@markdown ## 3.1. Segment the tomogram
#@markdown Run this cell to create the segmentation

import os

prediction = os.path.join(user_prediction_folder, data_dictionary['id'] + "_pred.mrc")

!python /content/DeePiCt/2d_cnn/scripts/predict_organelles.py \
        --features {data_dictionary['filtered_data']} \
        --output {prediction} \
        --model {model_path} \
        --config {user_config_file}

## Step 4. Post-processing of the prediction

___


In [None]:
#@markdown ## 4.1. Post-processing of the prediction
#@markdown Run this cell to post-process the prediction

import os

post_processed_prediction = os.path.join(user_prediction_folder, data_dictionary['id'] + "_post_processed_pred.mrc")


!python3 /content/DeePiCt/2d_cnn/scripts/postprocess.py \
        --input {prediction} \
        --output {post_processed_prediction} \
        --config {user_config_file}

## Step 5. Visualize results

___

In [None]:
#@markdown ## 5.1. Read the tomogram and the prediction
#@markdown Run this cell to read the tomogram and the predictions

import mrcfile


def read_tomogram(filename):
    with mrcfile.open(filename, permissive=True) as m:
        return m.data

tomogram = read_tomogram(data_dictionary['filtered_data'])
predictions = read_tomogram(post_processed_prediction)
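
Before plotting, it is worth checking that the two volumes line up and that the chosen slice index exists. This is a minimal sketch on synthetic arrays. `check_overlay_inputs` is a hypothetical helper and assumes both volumes are indexed as (z, y, x), which is how the visualization cell indexes them.

```python
import numpy as np


def check_overlay_inputs(tomogram, predictions, z_axis):
    """Raise an error if the volumes differ in shape or z_axis is out of range."""
    if tomogram.shape != predictions.shape:
        raise ValueError(
            f"Shape mismatch: tomogram {tomogram.shape} vs "
            f"predictions {predictions.shape}"
        )
    if not 0 <= z_axis < tomogram.shape[0]:
        raise IndexError(
            f"z_axis {z_axis} is outside the valid range 0..{tomogram.shape[0] - 1}"
        )


# Synthetic stand-ins for the tomogram and its prediction.
demo_tomogram = np.zeros((8, 16, 16), dtype=np.float32)
demo_predictions = np.zeros((8, 16, 16), dtype=np.uint8)

check_overlay_inputs(demo_tomogram, demo_predictions, z_axis=3)
print("inputs are consistent")
```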

In [None]:
#@markdown ## 5.2. Visualize the prediction
#@markdown Run this cell to visualize a particular z slice


z_axis = 100 #@param {type:"integer"}

import numpy as np
import matplotlib.pyplot as plt

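# First figure: the tomogram slice on its own
plt.figure(figsize=(10, 10))
plt.imshow(tomogram[z_axis], cmap='gray')

# Second figure: the same slice with the prediction overlaid
plt.figure(figsize=(10, 10))
plt.imshow(tomogram[z_axis], cmap='gray')
# Per-pixel opacity for this slice only: predicted pixels are drawn
# at 0.8 opacity, all other pixels stay fully transparent.
alphas = np.where(predictions[z_axis] > 0, 0.8, 0.0)
plt.imshow(predictions[z_axis], alpha=alphas, cmap='Blues')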
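
If the chosen slice shows no overlay, the prediction may simply be empty at that depth. The sketch below counts predicted pixels per z slice and returns the slice with the most, which is a reasonable starting value for `z_axis`. `busiest_slice` is a hypothetical helper, shown here on a small synthetic mask.

```python
import numpy as np


def busiest_slice(predictions):
    """Return (index, counts): the z slice with the most predicted pixels
    and the number of predicted pixels in every slice."""
    # Flatten each slice and count the pixels with a positive label.
    counts = (predictions > 0).reshape(predictions.shape[0], -1).sum(axis=1)
    return int(np.argmax(counts)), counts


# Synthetic mask with 5 slices: slice 1 has 1 predicted pixel, slice 3 has 4.
demo_mask = np.zeros((5, 4, 4), dtype=np.uint8)
demo_mask[1, 0, 0] = 1
demo_mask[3, :2, :2] = 1

best_z, per_slice = busiest_slice(demo_mask)
print(best_z, per_slice.tolist())  # 3 [0, 1, 0, 4, 0]
```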