# STAPL-3D preprocessing demo

This notebook demonstrates the core components of the STAPL-3D preprocessing pipeline: **z-stack shading** correction and **3D inhomogeneity** correction.

If you have not yet followed the STAPL-3D README, please find STAPL-3D and its installation instructions [here](https://github.com/RiosGroup/STAPL3D) before starting this demo.

Because STAPL-3D is all about big datafiles, we provide small cutouts and precomputed summary data that will be downloaded while progressing through the notebook.

Let's start with some general settings and imports.

In [None]:
# Qt gui for napari viewer
%gui qt

# Show all output
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# Imports.
import os
import yaml
import zipfile
import urllib.request
from pprint import pprint

# Yaml printing function.
def yprint(ydict):
    """Print dictionary in yaml formatting."""
    print(yaml.dump(ydict, default_flow_style=False))


First, define where you want the data to be downloaded by changing *projectdir*; default is the current demo directory. The name of the dataset is *'HFK16w'* (for Human Fetal Kidney - 16 weeks). We create a directory for the dataset and jump to it.

In [None]:
projectdir = '.'
dataset = 'HFK16w'

datadir = os.path.join(projectdir, dataset)

os.makedirs(datadir, exist_ok=True)
os.chdir(datadir)
f'working in directory: {os.path.abspath(".")}'


STAPL-3D parameters are preferably defined in a [yaml](https://yaml.org) parameter file. It has a simple structure and can be parsed in both Python and `bash`. We will download the example, read it into a dictionary, list all entries and show the entry that describes the default directory structure for STAPL-3D.

In [None]:
parameter_file = f'{dataset}.yml'

# Download the yml-file.
if not os.path.exists(parameter_file):
    url = 'https://surfdrive.surf.nl/files/index.php/s/WZZB4GCBghexOiy/download'
    urllib.request.urlretrieve(url, parameter_file)

# Load parameter file.
with open(parameter_file, 'r') as ymlfile:
    cfg = yaml.safe_load(ymlfile)

# List all entries.
cfg.keys()

# Inspect directory tree.
yml_entry = 'dirtree'
yprint(cfg[yml_entry])  # in yaml format
pprint(cfg[yml_entry])  # as a dictionary


## Shading correction

Shading correction (or flatfield correction) attempts to remove the intensity gradients that may be present in the xy-plane of the z-stacks that make up the dataset. These originate from imperfections in the microscope's optics and manifest as a grid over the assembled 3D volume. Because the shading is channel-specific, STAPL-3D estimates a 2D profile for each channel separately from the data.

We provide a 2-ch z-stack of data (106 x 1024 x 1024 x 2) in the data archive for demonstration purposes. These are two channels extracted from an 8-channel dataset of 262 stacks, i.e. ~0.1% of the data. The stack includes a nuclear channel (DAPI) and a membrane channel (NCAM1).


In [None]:
# Download the czi-file.

czi_filepath = f'{dataset}.czi'
if not os.path.exists(czi_filepath):
    url = 'https://surfdrive.surf.nl/files/index.php/s/Ly85srzZmdWJCyJ/download'
    urllib.request.urlretrieve(url, czi_filepath)


We define the parameters to the shading correction module in the yaml parameter file.

In [None]:
# Show the shading parameters.
yprint(cfg['shading'])


This means that, in this example, we calculate the *median* value for z-stacks concatenated over X and Y, while masking any value < *1000*. We use the *20%* of planes with the highest median intensities to calculate the 1D shading profile, which is fit with a *3rd order* polynomial. The output files of this processing step are postfixed with *_shading*.
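To make these parameters concrete, here is a minimal NumPy sketch of one way such a 1D profile could be estimated. It is an illustration under stated assumptions, not the STAPL-3D implementation: the function `estimate_profile`, the name `noise_threshold`, and the per-plane normalisation step are hypothetical choices made for this example.

```python
import numpy as np

def estimate_profile(stack, axis, noise_threshold=1000,
                     quantile_threshold=0.8, order=3):
    """Sketch: 1D shading profile of a zyx stack (hypothetical helper).

    axis=2 concatenates over X (one median per z,y); axis=1 over Y.
    """
    # Mask low-intensity voxels so background does not bias the medians.
    data = np.where(stack < noise_threshold, np.nan, stack.astype(float))
    # Median over the concatenated axis: one value per (z, position).
    medians = np.nanmedian(data, axis=axis)
    # Keep only the brightest planes (top 20% for a 0.8 quantile).
    plane_medians = np.nanmedian(medians, axis=1)
    cutoff = np.nanquantile(plane_medians, quantile_threshold)
    selected = medians[plane_medians >= cutoff]
    # Assumption: normalise each plane to its maximum, then average.
    normed = selected / np.nanmax(selected, axis=1, keepdims=True)
    profile = np.nanmean(normed, axis=0)
    # Fit a polynomial of the requested order to get a smooth profile.
    pos = np.arange(profile.size)
    valid = np.isfinite(profile)
    coeffs = np.polyfit(pos[valid], profile[valid], order)
    fit = np.polyval(coeffs, pos)
    return fit / fit.max()

# Synthetic stack with a parabolic intensity fall-off along y.
y = np.arange(32)
true_profile = 1 - 0.3 * ((y - 15.5) / 15.5) ** 2
z_weight = np.linspace(0.5, 1.0, 10)
stack = (5000 * z_weight[:, None, None]
         * true_profile[None, :, None] * np.ones((1, 1, 16)))
stack[:, :, :3] = 0  # background columns that end up masked

profile_y = estimate_profile(stack, axis=2)
```

On this noise-free synthetic stack the fitted profile follows the parabola that was put in; on real data the fit smooths over noise and structure in the medians.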

The estimation of the shading profile is done in parallel for channels. The number of concurrent processes can be set by specifying 'max_workers' in the yml-file, or as an argument. The default is to use as many processors as there are channels in the dataset, if available.
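The default described above can be sketched in a few lines. This is only an illustration: `resolve_workers` and `estimate_channel` are hypothetical names, and a thread pool stands in for the process pool so that the snippet runs as-is in a notebook.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def resolve_workers(n_channels, max_workers=0):
    """Pick the number of concurrent workers (hypothetical helper).

    A non-positive max_workers means: one worker per channel,
    capped at the number of processors that are available.
    """
    available = os.cpu_count() or 1
    requested = max_workers if max_workers > 0 else n_channels
    return max(1, min(requested, available))

def estimate_channel(channel):
    """Placeholder for the per-channel shading estimation."""
    return channel, f'profile for channel {channel}'

channels = range(8)
n_workers = resolve_workers(len(channels))

# Threads are used here for simplicity; the text above describes processes.
with ThreadPoolExecutor(max_workers=n_workers) as pool:
    results = dict(pool.map(estimate_channel, channels))
```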

Note that for cluster-deployment (SGE or SLURM), more specific configurations can be set in the yaml.


Now run the shading estimation.

In [None]:
from stapl3d.preprocessing import shading

deshad3r = shading.Deshad3r(czi_filepath, parameter_file, prefix=dataset)
deshad3r.run()


For each channel, this will write the estimated shading profile as an image (.tif) and a processing report (.pdf), as well as the calculated medians (.npz), a logfile (.log) and parameters (.yml) to the *HFK16w/shading/* directory.

In [None]:
sorted(os.listdir('shading'))


For the single stack, these do not look great, because the algorithm needs multiple z-stacks to reliably estimate the shading profile. Therefore, we provide pre-calculated medians for the full 262-stack dataset (in *HFK16w/shading_full*) to demonstrate the expected output.

In [None]:
# Download and extract the shading_full directory.
shading_dir = os.path.abspath('shading_full')
zipfilepath = f'{shading_dir}.zip'

if not os.path.exists(shading_dir):
    url = 'https://surfdrive.surf.nl/files/index.php/s/yEqSRZdvQ9nYb7Q/download'
    urllib.request.urlretrieve(url, zipfilepath)

    with zipfile.ZipFile(zipfilepath, 'r') as zf:
        zf.extractall()


First, let's plot the results for a single channel:

In [None]:
# Generate and show the report of channel 0
ch = 0

dataset_full = '190910_RL57_FUnGI_16Bit_25x_125um_shading'
filestem = os.path.join('shading_full', f'{dataset_full}_C{ch:03}')
paths = {
    'report': f'{filestem}.pdf',
    'shading': f'{filestem}.tif',
    'profile_X': f'{filestem}_X.npz',
    'profile_Y': f'{filestem}_Y.npz',
}
deshad3r.report(outputpath=None, ioff=False, channel=ch, outputs=paths)


The first row shows the result for concatenating the data over *X*, i.e. yielding a median value for each *yz*-coordinate. The left plot shows the medians of the planes (selected using a *quantile_threshold* parameter of 0.8) in rainbow colours. The right plot shows the normalized profile with confidence intervals as well as the normalized fit. The bottom left shows the median profile over *z*, with the selected planes indicated by tick marks. The bottom right image shows the 2D shading profile used for correcting each plane in each z-stack of the channel. The red dashed traces indicate an arbitrary threshold to help with flagging potential issues with the data; the viridis colormap is also clipped to red at this threshold.

Now, let's run it for all channels, sending the output to pdfs in *HFK16w/shading_full*.

In [None]:
for ch in range(8):
    filestem = os.path.join('shading_full', f'{dataset_full}_C{ch:03}')
    paths = {
        'report': f'{filestem}.pdf',
        'shading': f'{filestem}.tif',
        'profile_X': f'{filestem}_X.npz',
        'profile_Y': f'{filestem}_Y.npz',
    }
    deshad3r.report(outputpath=paths['report'], ioff=True, channel=ch, outputs=paths)

# Merge the pdfs into one file.
deshad3r.inputpaths['postprocess']['report'] = os.path.join(shading_dir, f'{dataset_full}_C???.pdf')
deshad3r.outputpaths['postprocess']['report'] = os.path.join(shading_dir, f'{dataset_full}.pdf')
deshad3r.postprocess()


## Stitching

We stitch in the proprietary Zeiss Zen or Arivis software packages. If you plan to use your own imaging data, please stitch it and then convert the result to an Imaris or STAPL-3D file format. For the bias field estimation demo below, we provide a downsampled image of the stitching result in hdf5 format. For the segmentation demo, we provide a cutout of the fully preprocessed file in Imaris format: *HFK16w_shading_stitching_biasfield.ims*. A free viewer for these data can be downloaded [here](https://imaris.oxinst.com/imaris-viewer).

We provide an interface to Fiji BigStitcher in beta. Provide the path to FIJI as an environment variable 'FIJI' or directly below.

## Inhomogeneity correction

We next correct the stitched file for inhomogeneities such as depth attenuation and uneven penetration of clearing agents and antibodies. This is done using the *N4* algorithm ([Tustison et al., 2010](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3071855), as implemented in SimpleITK) on a downsampled image. For this demo, we provide the downsampled data in an hdf5 file.

We download the data, use the STAPL-3D Image class to get some info about this image, and display the parameters.

In [None]:
stitch_stem = f'{dataset}_shading_stitching'
bfc_filepath = f'{stitch_stem}.h5'

# Download the hdf5-file.
if not os.path.exists(bfc_filepath):
    url = 'https://surfdrive.surf.nl/files/index.php/s/WkMMCW5e4wgNUgb/download'
    urllib.request.urlretrieve(url, bfc_filepath)

# Print image info.
from stapl3d import Image
image_in = '{}/data'.format(bfc_filepath)
im = Image(image_in)
im.load(load_data=False)
props = im.get_props()
im.close()

pprint(props)


The image dimensions are *zyxc* = *106 x 263 x 249 x 8* with voxels of *1.2 x 21.3 x 21.3* $\mu$m. This is a good size for estimating the slow variations over the volume. Default parameters are:

In [None]:
yprint(cfg['biasfield']['estimate'])


The `n_iterations`, `n_fitlevels` and `n_bspline_cps` are passed to the [ITK-filter](https://simpleitk.org/doxygen/latest/html/classitk_1_1simple_1_1N4BiasFieldCorrectionImageFilter.html). On workstations, the `tasks` parameter will set the number of processors ITK will use. 
(Note that for HPC cluster deployment, there is more control: channels are distributed over separate jobs, and the number of threads used for each channel can be set separately.)

If an Imaris pyramid image is provided, data will be taken at `resolution_level` and further downsampled with `downsample_factors`. Because the hdf5 file already contains downsampled data, we set `downsample_factors` to unity, i.e. no further downsampling.
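As an illustration of what per-axis downsampling factors mean, the sketch below block-averages a *zyx* volume with NumPy. How STAPL-3D downsamples internally is not shown in this notebook, so the averaging scheme and the helper `downsample` are assumptions made for this example only.

```python
import numpy as np

def downsample(volume, factors):
    """Block-average a 3D zyx volume by integer factors (hypothetical helper)."""
    # Crop so that every axis is divisible by its factor.
    crop = tuple(slice(0, (s // f) * f) for s, f in zip(volume.shape, factors))
    cropped = volume[crop]
    # Split each axis into (n_blocks, factor) and average over the factor axes.
    shape = []
    for s, f in zip(cropped.shape, factors):
        shape += [s // f, f]
    return cropped.reshape(shape).mean(axis=(1, 3, 5))

volume = np.arange(4 * 4 * 4, dtype=float).reshape(4, 4, 4)

# Downsample in-plane only, leaving z untouched.
small = downsample(volume, (1, 2, 2))

# Factors of one leave the volume unchanged.
same = downsample(volume, (1, 1, 1))
```

With all factors equal to one, the output is identical to the input, which corresponds to the setting used in this demo.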

<!-- 
To use a mask in the estimation, the `mask` input can either be 
 - set to `True`, in which case the path defaults to `{dataset}{cfg['mask']['postfix']}.h5/mask`
 - contain the path to a mask image (in which background should be `0`).
 The mask image is expected to be the same size as the input image, i.e. it will also be downsampled with `downsample_factors`. -->
 


In [None]:
from stapl3d.preprocessing import biasfield

# Estimate on the downsampled 8-channel hdf5 file downloaded above.
homogeniz3r = biasfield.Homogeniz3r(bfc_filepath, parameter_file, prefix=dataset)
homogeniz3r.estimate()


Merge the reports to a single pdf.

In [None]:
homogeniz3r.postprocess()


Next, we (re)generate the bias field correction report of a single channel to inspect the result.

In [None]:
# Generate and show the report of a single channel
ch = 3
_, opaths = homogeniz3r.fill_paths('estimate', reps={'c': ch})
homogeniz3r.report(outputpath=None, ioff=False, channel=ch, outputs=opaths)


The left column shows orthogonal sections of the downsampled dataset for the uncorrected (top) and corrected data (middle) as well as the estimated bias field (bottom). Plotted on the left and top of the images are profiles of the median values over the three axes. The right column offers a closer comparison of the profiles (*mean + SD*) for the corrected (green) vs uncorrected (red) data. The bias field correction yields a much flatter profile for *z*, as well as *xy*. Low-frequency inhomogeneities are removed, while the detail of the specific staining is retained in the corrected data.

Now, run the estimation with a more reasonable number of iterations (N=50). This will take a lot longer, therefore we run it for a single channel. It will overwrite the data for the specified channel generated in the previous test.

In [None]:
homogeniz3r.channels = [ch]  # only estimate the specified channel
homogeniz3r.n_iterations = 50  # override the yaml-defined parameter
homogeniz3r.estimate()


Next, inspect the report for the new estimation, which used a more appropriate number of iterations, and compare it with the previous one.


In [None]:
# Generate and show the report of the selected channel
_, opaths = homogeniz3r.fill_paths('estimate', reps={'c': ch})
homogeniz3r.report(outputpath=None, ioff=False, channel=ch, outputs=opaths)


Note that above, we only estimated the inhomogeneities at a low resolution. To apply the estimation to a full-resolution dataset and generate a file that merges all channels in a single (symlinked) hdf5-file, use `homogeniz3r.apply()` and `homogeniz3r.postprocess()`. Alternatively, use `homogeniz3r.run()` to execute all steps in a single call.
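Conceptually, applying a low-resolution estimate to full-resolution data means upsampling the bias field and dividing it out, since N4 models the bias as multiplicative. The NumPy sketch below illustrates that idea with nearest-neighbour upsampling; the helper `apply_bias_field` and the interpolation choice are assumptions for this example, not a description of what `homogeniz3r.apply()` does internally.

```python
import numpy as np

def apply_bias_field(data, bias_lowres, factors):
    """Divide data by an upsampled low-resolution bias field (hypothetical helper)."""
    bias = bias_lowres.astype(float)
    # Nearest-neighbour upsampling: repeat voxels along each axis.
    for axis, factor in enumerate(factors):
        bias = np.repeat(bias, factor, axis=axis)
    # Crop in case the upsampled field is slightly larger than the data.
    bias = bias[tuple(slice(0, s) for s in data.shape)]
    if bias.shape != data.shape:
        raise ValueError('upsampled bias field does not cover the data')
    # Guard against division by (near) zero.
    return data / np.clip(bias, 1e-6, None)

# Synthetic example: a known signal multiplied by a smooth bias.
rng = np.random.default_rng(0)
truth = rng.uniform(1000, 2000, size=(4, 8, 8))
bias_lowres = np.linspace(0.8, 1.6, 4 * 2 * 2).reshape(4, 2, 2)
factors = (1, 4, 4)
bias_full = np.repeat(np.repeat(bias_lowres, 4, axis=1), 4, axis=2)
observed = truth * bias_full

corrected = apply_bias_field(observed, bias_lowres, factors)
```

A smoother interpolation (e.g. linear or B-spline) would avoid blocky transitions in the upsampled field; nearest-neighbour is used here only to keep the sketch short.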


### 3D visualization

In addition to the static reports, we can explore the results in ND using napari.
- Ctrl/Cmd-E to roll axes
- Ctrl/Cmd-G to toggle overlay / side-by-side view


In [None]:
images = ['data', 'corr', 'bias']

homogeniz3r.view(opaths['file'], images)

homogeniz3r.viewer.title = 'STAPL3D homogeniz3r demo'

axes = homogeniz3r.viewer.axes
dims = homogeniz3r.viewer.dims
cam = homogeniz3r.viewer.camera
layers = homogeniz3r.viewer.layers

# Move scrollbars to centreslices.
cslcs = [int(s / 2) for s in props['shape'][:3]]
dims.current_step = cslcs

# Show axes.
axes.visible = True

# Rescale the z-axis for better appreciation in the xz and yz views
orig_scale = [s for s in layers['data'].scale]  # keep for convenience
for lay in layers:
    lay.scale = [1, 1, 1]

# Set equal contrast limits for uncorrected and corrected volumes.
clim = [0, 10000]
layers['data'].contrast_limits = layers['corr'].contrast_limits = clim

# Toggle bias off for now
layers['bias'].visible = False
layers['bias'].colormap = 'magma'

# Ctrl/Cmd-E to roll axes
# dims.order = (2, 0, 1)

# Ctrl/Cmd-G to toggle overlay / side-by-side view
# homogeniz3r.viewer.grid.enabled = True


In [None]:
# look at the bias landscape
dims.ndisplay = 3
dims.order = (0, 1, 2)

cam.zoom = 2.0
cam.center = cslcs
cam.angles = (-10, -10, 155)

layers['bias'].visible = True
layers['bias'].contrast_limits = [0.8, 2.0]
layers['bias'].colormap = 'magma'


Instead of the guided look at a single channel, the results of all channels can be loaded as 4D images. Note that if you followed this tutorial exactly, only one channel was processed with the right parameters, so only that channel shows clear differences between the corrected and uncorrected data.


In [None]:
viewer_settings = {
    'title': 'STAPL3D homogeniz3r demo',
    'crosshairs': [int(s / 2) for s in props['shape'][:4]],
    'axes_visible': False,
    'clim': [0, 10000],
}
homogeniz3r.view(images=['data', 'corr'], settings=viewer_settings)
