<h4>OTLS raw data processing script</h4>

This script loads the raw HDF5 (`.h5`) file as downloaded from [PCa_Bx_3Dpathology](https://www.cancerimagingarchive.net/collection/pca_bx_3dpathology/).
For a given OTLS sample, it does the following:
- Opens the 2x downsampled OTLS volume (1 um/voxel)
- Assigns the nuclear channel to channels 1 and 2, and the eosin channel to channel 3
- Rescales the eosin channel so that its 99th-percentile intensity matches that of the nuclear channel
- Saves each 2D OTLS slice as its own .tiff file
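
The group layout that the script reads (`t00000/s00/1/cells` for the nuclear channel and `t00000/s01/1/cells` for eosin) can be explored on a toy file before touching the real data. The sketch below builds a tiny in-memory HDF5 file with the same group names; the file name `toy.h5` and the array shape are made up for illustration and say nothing about the real dataset's dimensions.

```python
import h5py
import numpy as np

# Build a tiny in-memory HDF5 file (nothing is written to disk).
# 'toy.h5' and the (4, 6, 8) shape are hypothetical placeholders.
with h5py.File('toy.h5', 'w', driver='core', backing_store=False) as f:
    for channel in ('s00', 's01'):  # s00: nuclear, s01: eosin
        f.create_dataset(f't00000/{channel}/1/cells',
                         data=np.zeros((4, 6, 8), dtype=np.uint16))

    dset = f['t00000/s00/1/cells']
    shape, dtype = dset.shape, dset.dtype  # metadata only, no voxel data read
    block = dset[:, 0, :]                  # reads one 2D block, not the full volume

    paths = []
    f.visit(paths.append)                  # collect every group/dataset path

print(shape, dtype, block.shape)
print(paths)
```

Checking `shape` and `dtype` this way is cheap, which may be useful for estimating memory needs before reading a full volume with `[()]`.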

In [12]:
import numpy as np
import os
import h5py
from PIL import Image
import imageio
from tqdm import tqdm

In [6]:
def cat_str(s):
    # Left-pad the slice index string with zeros so that every 2D tissue image
    # has a 4-digit ID (keeps files in order when sorted by name)
    # e.g., 0001, 0010
    return s.zfill(4)
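
To see why the zero-padding matters, the short sketch below compares how padded and unpadded file names sort. The helper is repeated here so the snippet runs on its own; the index values are arbitrary.

```python
def cat_str(s):
    # Same padding logic as the helper above: left-pad with zeros to 4 characters
    gap = 4 - len(s)
    return '0' * gap + s

indices = (1, 10, 2, 100)  # arbitrary example slice indices

padded_sorted = sorted(cat_str(str(i)) + '.tiff' for i in indices)
unpadded_sorted = sorted(str(i) + '.tiff' for i in indices)

print(padded_sorted)    # ['0001.tiff', '0002.tiff', '0010.tiff', '0100.tiff']
print(unpadded_sorted)  # ['1.tiff', '10.tiff', '100.tiff', '2.tiff']
```

With padding, lexicographic order matches slice order; without it, slice 2 sorts after slice 100. For non-negative indices, Python's built-in `str.zfill(4)` gives the same result.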

In [7]:
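def scale(eosin, nuclear_max):
    # Rescale the eosin channel so that its 99th-percentile intensity matches
    # nuclear_max (the 99th percentile of the nuclear channel)
    eosin_max = np.percentile(eosin, 99)

    eosin_new = nuclear_max / eosin_max * eosin
    # Clip to the uint16 range first: values above the 99th percentile can scale
    # past 65535, and casting them directly would not be reliable
    eosin_new = np.clip(eosin_new, 0, np.iinfo(np.uint16).max).astype(np.uint16)

    return eosin_new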
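
The percentile matching can be checked on synthetic data. The sketch below uses random arrays as stand-ins for the real channels (the shapes and intensity ranges are assumptions, not properties of the dataset) and confirms that, before the integer cast, the 99th percentile of the rescaled eosin equals that of the nuclear channel. It also clips to the `uint16` range before casting, a precaution for cases where bright outliers would otherwise exceed 65535.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the real volumes
nuclear = rng.integers(100, 4000, size=(4, 8, 8)).astype(np.uint16)
eosin = rng.integers(100, 1000, size=(4, 8, 8)).astype(np.uint16)

nuclear_max = np.percentile(nuclear, 99)
eosin_max = np.percentile(eosin, 99)

# Linear rescaling; the result is float64 at this point
scaled = nuclear_max / eosin_max * eosin

# Clip to the representable range, then cast to 16-bit
scaled_u16 = np.clip(scaled, 0, np.iinfo(np.uint16).max).astype(np.uint16)

print(np.percentile(scaled, 99), nuclear_max)
```

Because the rescaling is a single multiplicative factor, relative contrast within the eosin channel is preserved; only its overall brightness changes.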

In [11]:
fname = '/path/to/parent_dir'
fpath = os.path.join(fname, 'data-f0.h5')

nuclear_min = 100  # Floor intensity value for this OTLS dataset (not used below)

with h5py.File(fpath, 'r') as f:
    # 's00' is the nuclear channel; '1' is the 2x downsampled level (1 um/voxel)
    nuclear_raw = f['t00000']['s00']['1']['cells'][()]
    nuclear = nuclear_raw.transpose(1, 0, 2).astype(np.uint16)

    # 's01' is the eosin channel; '1' is the 2x downsampled level (1 um/voxel)
    eosin_raw = f['t00000']['s01']['1']['cells'][()]
    eosin = eosin_raw.transpose(1, 0, 2).astype(np.uint16)

    del nuclear_raw, eosin_raw

print("Finding nuclear_max...")
nuclear_max = np.percentile(nuclear, 99)

print("\nFinding eosin_max...")
eosin_new = scale(eosin, nuclear_max)
img_new = np.stack([nuclear, nuclear, eosin_new], axis=-1)  # img_new: (depth, width, height, channel)

print("\nSaving 2D image tiff stack...")
for idx in tqdm(range(len(img_new))):
    basename_new = cat_str(str(idx)) + '.tiff'    
    fname_new = os.path.join(fname, basename_new)
    imageio.imwrite(fname_new, img_new[idx])

Finding nuclear_max...
Finding eosin_max...
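
The channel assignment and per-slice indexing used above can be illustrated on small constant arrays. In this sketch the shapes and intensity values are hypothetical; it only shows that stacking along the last axis yields a 3-channel volume whose first axis indexes the 2D slices that get written out.

```python
import numpy as np

# Hypothetical small volumes standing in for the real nuclear and eosin arrays
depth, rows, cols = 3, 4, 5
nuclear = np.full((depth, rows, cols), 200, dtype=np.uint16)
eosin_new = np.full((depth, rows, cols), 150, dtype=np.uint16)

# Nuclear fills channels 1 and 2, eosin fills channel 3
img_new = np.stack([nuclear, nuclear, eosin_new], axis=-1)
print(img_new.shape)  # (3, 4, 5, 3)

# Each img_new[idx] is one 2D, 3-channel, 16-bit image
first_slice = img_new[0]
print(first_slice.shape, first_slice.dtype)  # (4, 5, 3) uint16
```

Note that `np.stack` allocates a new array holding all three channels, so peak memory is roughly the two input volumes plus three volumes' worth for the stacked result.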
