# STAPL-3D organoid segmentation demo

This notebook demonstrates the core components of the STAPL3D organoid pipeline: **...** and **...**.

If you have not yet followed the STAPL-3D README, please find STAPL-3D and its installation instructions [here](https://github.com/RiosGroup/STAPL3D) before running this demo.

Because STAPL-3D is designed for large data files, we provide small cutouts and precomputed summary data that are downloaded as you progress through the notebook.

Let's start with some general settings and imports.

In [1]:
# Show all output
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# Imports.
import os
import yaml
import zipfile
import urllib.request
from pprint import pprint

# Yaml printing function.
def yprint(ydict):
    """Print dictionary in yaml formatting."""
    print(yaml.dump(ydict, default_flow_style=False))


First, define where you want the data to be downloaded by changing *projectdir*; default is the current demo directory. The name of the dataset is *'FGS_RL1_Exp001'*.

In [2]:
projectdir = os.path.abspath('.')

dataset = 'FGS_RL1_Exp001'
datadir = os.path.join(projectdir, dataset)


We provided a ...
> REPLACE: preprocessed data cutout in the Imaris v5.5 file format, which is an hdf5 file with 5 dimensions. A free [Imaris Viewer](https://imaris.oxinst.com/imaris-viewer) is available, and the file format can be inspected with [HDFview](https://www.hdfgroup.org/downloads/hdfview/), `h5ls`, or `h5py`.

We download the file and name it according to the default STAPL-3D pipeline conventions.

Download and extract the data.

In [3]:
zipfilepath = os.path.join(projectdir, f'{dataset}.zip')

# TODO: upload and create public link
# TODO: all these files are hosted on my surfdrive: need to transfer ownership
#if not os.path.exists(zipfilepath):
#    url = '<public_link>/download'
#    urllib.request.urlretrieve(url, zipfilepath)

# TODO: rename in zipfile to *'FGS_RL1_Exp001'*
if not os.path.exists(datadir):
    with zipfile.ZipFile(zipfilepath, 'r') as zf:
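        # Extract next to the zip file, so this also works when projectdir is not the current directory.
        zf.extractall(projectdir)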
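
The download-if-missing and extract pattern used in the cell above can be wrapped in a small helper. The sketch below is an illustration only: `fetch_and_extract` is a hypothetical name, not part of STAPL3D, and the `url` argument is optional because the public link for the dataset is not yet available.

```python
import os
import zipfile
import urllib.request


def fetch_and_extract(zipfilepath, targetdir, url=None):
    """Download a zip archive if it is missing, then extract it into targetdir.

    Hypothetical helper for illustration; not part of STAPL3D.
    Returns the list of archive member names.
    """
    if not os.path.exists(zipfilepath):
        if url is None:
            raise FileNotFoundError(f'{zipfilepath} not found and no url given')
        urllib.request.urlretrieve(url, zipfilepath)

    with zipfile.ZipFile(zipfilepath, 'r') as zf:
        # Extract into an explicit directory rather than the current working directory.
        zf.extractall(targetdir)
        return zf.namelist()
```

Passing the target directory explicitly keeps the result independent of the directory the notebook happens to be started from.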


The name of the extracted dataset is *'FGS_RL1_Exp001'*. Change into that directory.


In [4]:
os.chdir(datadir)
f'working in directory: {os.path.abspath(".")}'


'working in directory: d:\\mkleinnijenhuis\\STAPL3D\\demos\\FGS_RL1_Exp001'

STAPL3D parameters are preferably defined in a [yaml](https://yaml.org) parameter file, which has a simple structure and can be parsed from both Python and `bash`. We download the example file, read it into a dictionary, and list its main entries.

In [5]:
parameter_file = f'{dataset}.yml'

# Download the yml-file.
if not os.path.exists(parameter_file):
    url = 'https://surfdrive.surf.nl/files/index.php/s/WknwEh5etW9IHWZ/download'
    urllib.request.urlretrieve(url, parameter_file)

# Load parameter file.
with open(parameter_file, 'r') as ymlfile:
    cfg = yaml.safe_load(ymlfile)

# List all entries.
cfg.keys()


('FGS_RL1_Exp001.yml', <http.client.HTTPMessage at 0x27c9f9422e0>)

dict_keys(['dataset', 'biasfield', 'splitter', 'segmentation_prep', 'segmentation_edt', 'segmentation', 'segmentation_small', 'segmentation_medium', 'segmentation_large', 'segmentation_filter', 'segmentation_plot', 'features', 'backproject'])
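
The log lines printed further down in this notebook (for example `splitter:split`) pair a top-level entry of the parameter file with one of its steps. A minimal sketch of how such pairs can be listed from the nested dictionary is given below. It uses a small hand-written excerpt based on the values shown in this notebook rather than the downloaded file, and `list_steps` is a hypothetical helper, not a STAPL3D function.

```python
def list_steps(cfg):
    """Return 'entry:step' names for every dictionary nested one level down."""
    return [
        f'{entry}:{step}'
        for entry, steps in cfg.items() if isinstance(steps, dict)
        for step, params in steps.items() if isinstance(params, dict)
    ]


# Hand-written excerpt for illustration; the real file has more entries.
cfg_excerpt = {
    'splitter': {'split': {'volumes': {'mean': {}}}},
    'segmentation_prep': {
        'estimate': {
            'mask_clip': {'ids_image': 'mean', 'ods_mask': 'clip', 'threshold': 65000},
        },
    },
}

steps = list_steps(cfg_excerpt)
print(steps)
```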

In [None]:
# Imports.
import napari
import numpy as np
from glob import glob

from stapl3d import Image, blocks, backproject
from stapl3d.segmentation import segment, features


We specify the input using a format specifier *'{f}.czi'*, which selects all files with the czi extension in the data directory.
In this example, there are 6 files in the dataset, so we set the maximum number of simultaneous workers to 6.

In [7]:
filespec_raw = '{f}.czi'

max_workers = 6
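
One way to picture the format specifier is as a glob pattern in which the `{f}` field is replaced by a wildcard. The sketch below follows that reading; it is an assumption about the behaviour, not STAPL3D's actual file-matching code, and `expand_filespec` is a hypothetical helper.

```python
import glob
import os


def expand_filespec(filespec, directory='.'):
    """Return the sorted file stems matching a '{f}'-style specifier in directory."""
    # Escape the directory so special characters in its path are taken literally.
    pattern = os.path.join(glob.escape(directory), filespec.format(f='*'))
    return sorted(
        os.path.splitext(os.path.basename(path))[0]
        for path in glob.glob(pattern)
    )
```

Under this reading, `len(expand_filespec(filespec_raw))` gives the number of input files and could be used to choose `max_workers` instead of hard-coding it.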


## Splitt3r

For organoid segmentation, we calculate the mean over channels in order to achieve maximal coverage and optimal signal for segmentation. The parameter file specifies the volumes to generate:

In [9]:
cfg['splitter']['split']

{'volumes': {'mean': {}}}
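
The `mean` volume is the average of the channels at each voxel. The sketch below shows that reduction with NumPy on a small synthetic array. The channel-last axis order (z, y, x, c) and the rounding back to the input dtype are assumptions made for the example, not necessarily what STAPL3D does internally.

```python
import numpy as np


def channel_mean(data, channel_axis=-1):
    """Average a multichannel volume over its channel axis, keeping the input dtype."""
    # Accumulate in float64 to avoid overflow of integer intensities.
    mean = data.mean(axis=channel_axis, dtype=np.float64)
    return np.rint(mean).astype(data.dtype)


# Synthetic 4-channel volume with axes (z, y, x, c); the values are made up.
rng = np.random.default_rng(0)
volume = rng.integers(0, 65535, size=(4, 8, 8, 4), dtype=np.uint16)
mean_volume = channel_mean(volume)
print(mean_volume.shape, mean_volume.dtype)
```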

In [8]:
from stapl3d import blocks

splitt3r = blocks.Splitt3r(filespec_raw, parameter_file, max_workers=max_workers)
splitt3r.run()


Running  splitter:blockinfo in 6 jobs over 6 workers
Running  splitter:split in 6 jobs over 6 workers


This creates the *'blocks'* directory, containing hdf5-files with the same names as the czi-files, but with the *'.h5'* extension.

In [11]:
filepaths = os.listdir('blocks')
filepaths

['FGS_RL1_Exp001-2_Img020_34T.h5',
 'FGS_RL1_Exp001-2_Img024_36T.h5',
 'FGS_RL1_Exp001-2_Img040_MDO4.h5',
 'FGS_RL1_Exp001-3_Img003_10T.h5',
 'FGS_RL1_Exp001-3_Img006_13T.h5',
 'FGS_RL1_Exp001-3_Img026_169M.h5']

Each of these now contains a 3D volume named *'mean'* and a group *'block_info'* containing some metadata.

In [16]:
import h5py
# Open read-only; the explicit mode avoids version-dependent defaults.
f = h5py.File(os.path.join('blocks', filepaths[0]), 'r')
f.keys()
f['mean']
f.close()


<KeysViewHDF5 ['block_info', 'mean']>

<HDF5 dataset "mean": shape (50, 1024, 1024), type "<u2">

We visualize the mean volumes with the napari viewer.

In [17]:
block_idxs = list(range(len(splitt3r.filepaths)))
splitt3r.view(block_idxs, images=['mean'])


## Segment3r

For the purposes of the demo, we split the segmentation over a number of parameter file entries: *segmentation_prep*, *segmentation_edt*, *segmentation* and *segmentation_filter*. We select a step by passing the `step_id` argument to the segmenter, e.g.:
```
segment3r = segment.Segment3r(filespec_raw, parameter_file, max_workers=max_workers, step_id='segmentation_prep')
```
and then run it calling
```
segment3r.estimate()
```

### Prep

In the first step,
1. a clipping mask is generated
2. the data is smoothed
3. an organoid mask is generated

Furthermore,

4. the data is downsampled and then smoothed
5. an organoid mask is generated at the lower resolution.

The associated parameters are as follows:

In [37]:
for i, (k, v) in enumerate(cfg['segmentation_prep']['estimate'].items()):
    print(f'{i+1}.')
    yprint({k: v})

# FIXME: mask_dset_ds is not correct

1.
mask_clip:
  ids_image: mean
  ods_mask: clip
  threshold: 65000

2.
prep_dset:
  filter:
    inplane: false
    sigma: 2.5
    type: gaussian
  ids_image: mean
  ods_image: prep

3.
mask_dset:
  ids_image: prep
  ods_mask: mask
  otsu:
    perc_range:
    - 0
    - 99

4.
prep_dset_ds:
  downsample:
    factors:
      x: 5
      y: 5
      z: 1
    ods: mean_ds
  filter:
    inplane: false
    sigma: 2.5
    type: gaussian
  ids_image: mean
  ods_image: prep_ds

5.
mask_dset_ds:
  dilate: {}
  fill: 2D
  ids_image: prep_ds
  ods_mask: mask_ds
  otsu:
    perc_range:
    - 0
    - 99
  size_filter_vx: 100
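
The `otsu` entries combine Otsu's threshold with a percentile range. A plausible reading is that only intensities between the two percentiles enter the histogram, presumably so that a few saturated voxels do not dominate it. The NumPy sketch below implements that reading; how STAPL3D applies `perc_range` internally is an assumption here, and `otsu_threshold` and `otsu_mask` are hypothetical helpers.

```python
import numpy as np


def otsu_threshold(values, nbins=256):
    """Otsu's method: pick the histogram split that maximises between-class variance."""
    counts, edges = np.histogram(values, bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2
    # Class weights and class means for every candidate split position.
    w0 = np.cumsum(counts)
    w1 = np.cumsum(counts[::-1])[::-1]
    m0 = np.cumsum(counts * centers) / np.maximum(w0, 1)
    m1 = (np.cumsum((counts * centers)[::-1]) / np.maximum(w1[::-1], 1))[::-1]
    between = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return centers[np.argmax(between)]


def otsu_mask(image, perc_range=(0, 99)):
    """Threshold image with Otsu's method, using only values inside perc_range."""
    lo, hi = np.percentile(image, perc_range)
    inside = image[(image >= lo) & (image <= hi)]
    return image > otsu_threshold(inside)


# Synthetic example: dim background, brighter foreground, a few saturated voxels.
image = np.full((10, 20, 20), 100.0)
image[:, 5:15, 5:15] = 1000.0
image[0, 0, :5] = 65535.0
mask = otsu_mask(image)
```

In this synthetic case the mask covers the bright block and the saturated voxels, while the background stays outside it.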



In [None]:
segment3r = segment.Segment3r(filespec_raw, parameter_file, max_workers=max_workers, step_id='segmentation_prep')
segment3r.estimate()


In [None]:
block_idxs = [0]
images = ['mean', 'prep', 'mean_ds', 'prep_ds']
labels = ['clip', 'mask', 'mask_ds']
segment3r.view(block_idxs, images, labels)
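
To make the relation between the full-resolution and downsampled layers concrete, the sketch below builds a clipping mask and a downsampled volume with NumPy. The threshold and factors are taken from the parameter listing above. Marking voxels at or above the threshold and downsampling by block averaging are assumptions for illustration and may differ from what STAPL3D does; both function names are hypothetical.

```python
import numpy as np


def clip_mask(volume, threshold=65000):
    """Mark voxels at or above the threshold (assumed here to be saturated)."""
    return volume >= threshold


def downsample_mean(volume, factors):
    """Downsample a (z, y, x) volume by averaging non-overlapping blocks.

    Trailing voxels that do not fill a complete block are cropped.
    """
    fz, fy, fx = factors
    nz, ny, nx = (size // f for size, f in zip(volume.shape, factors))
    cropped = volume[:nz * fz, :ny * fy, :nx * fx]
    blocks = cropped.reshape(nz, fz, ny, fy, nx, fx)
    return blocks.mean(axis=(1, 3, 5))


# Synthetic stack standing in for a 'mean' volume; factors follow z: 1, y: 5, x: 5.
volume = np.zeros((4, 52, 52), dtype=np.uint16)
volume[0, 0, 0] = 65535
clip = clip_mask(volume)
volume_ds = downsample_mean(volume, (1, 5, 5))
print(clip.sum(), volume_ds.shape)
```

With this cropping variant, a (50, 1024, 1024) block like the one inspected earlier would be reduced to (50, 204, 204).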


Calculate the distance transform.

In [None]:
segment3r = segment.Segment3r(filespec_raw, parameter_file, max_workers=max_workers, step_id='segmentation_edt')
segment3r.estimate()


In [None]:
block_idxs = [0]
images = ['prep_ds', 'mask_ds_edt']
labels = ['mask_ds', 'blobs_ds_label']
segment3r.view(block_idxs, images, labels)


Perform the segmentation.

In [None]:
segment3r = segment.Segment3r(filespec_raw, parameter_file, max_workers=max_workers, step_id='segmentation')
segment3r.estimate()


In [None]:
block_idxs = [0]
images = ['prep_ds', 'mask_ds_edt']
labels = ['blobs_ds_label', 'mask_ds_seeds', 'blobs_ds_raw']
segment3r.view(block_idxs, images, labels)


Postprocess the organoid segmentation.

In [None]:
segment3r = segment.Segment3r(filespec_raw, parameter_file, max_workers=max_workers, step_id='segmentation_filter')
segment3r.estimate()


In [None]:
block_idxs = [0]
images = ['mean', 'prep']
labels = ['blobs_expand', 'mask', 'blobs', 'clip', 'blobs_clip', 'blobs_clip_deleted']
segment3r.view(block_idxs, images, labels)


Generate a report for each stack.

In [None]:
segment3r = segment.Segment3r(filespec_raw, parameter_file, max_workers=max_workers, step_id='segmentation_plot')
segment3r.estimate()


Merge the reports into the logs directory.

In [None]:
segment3r.postprocess()


## Feature extraction