# STAPL-3D feature extraction demo

This notebook demonstrates the core components of the STAPL-3D feature extraction module.

If you have not yet followed the STAPL-3D README, find STAPL-3D and its installation instructions [here](https://github.com/RiosGroup/STAPL3D) before starting this demo.


First, define where you have put the data. Please change *datadir* to point to the *HFK16w* directory that you have unzipped.

In [1]:
import os

datadir = './HFK16w'
dataset = 'HFK16w'
filestem = os.path.join(datadir, dataset)


Next, we define the block size and block margin, and read the image dimensions from the input file without loading the image data itself.

In [2]:
from stapl3d import Image
from glob import glob

n_proc = 16

bs = 176  # blocksize
bm = 64  # blockmargin

ipf = ''
blockdir = os.path.join(datadir, 'blocks_{:04d}'.format(bs))
filelist = glob(os.path.join(blockdir, '{}_*{}.h5'.format(dataset, ipf)))
filelist.sort()

image_in = '{}_bfc_block.ims'.format(filestem)
im = Image(image_in)
im.load(load_data=False)
dims = im.dims
im.close()

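To make the roles of the block size and block margin concrete, here is a minimal sketch of how one axis could be tiled into overlapping blocks. It is an illustration under stated assumptions, not STAPL-3D's own splitting code: `block_slices` is a hypothetical helper, the axis length of 500 is made up, and the margin is assumed to extend each block on both sides, clipped at the image border.

```python
def block_slices(dim_size, blocksize, margin):
    """Return (start, stop) pairs for margin-padded blocks along one axis."""
    slices = []
    for start in range(0, dim_size, blocksize):
        stop = min(start + blocksize, dim_size)
        # Assumption: the margin pads the block on both sides, clipped to the image.
        slices.append((max(start - margin, 0), min(stop + margin, dim_size)))
    return slices


# Hypothetical axis length; the real value would come from `dims`.
for lo, hi in block_slices(dim_size=500, blocksize=176, margin=64):
    print(lo, hi)
```

Under these assumptions, neighbouring blocks overlap by twice the margin, which is why a segment near a block boundary can appear in more than one block.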

The STAPL-3D feature extraction module offers fast extraction of features from large datasets. We create a feature table for each datablock using parallel processing, then combine these tables while filtering out duplicates of segments that are represented in multiple datablocks.

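The duplicate filtering can be pictured with a small, library-free sketch. The helper `merge_block_tables` and the toy rows are hypothetical, and keeping the first occurrence of each label is an assumption made for illustration; the rule STAPL-3D applies internally may differ.

```python
def merge_block_tables(tables):
    """Concatenate per-block rows, keeping the first row seen for each label."""
    merged = {}
    for rows in tables:
        for row in rows:
            # setdefault leaves an existing entry untouched, so later duplicates are dropped.
            merged.setdefault(row['label'], row)
    return [merged[label] for label in sorted(merged)]


# Toy data: label 9 sits in the overlap and is reported by both blocks.
block0 = [{'block': 0, 'label': 3, 'area': 100}, {'block': 0, 'label': 9, 'area': 80}]
block1 = [{'block': 1, 'label': 9, 'area': 80}, {'block': 1, 'label': 11, 'area': 60}]

features = merge_block_tables([block0, block1])
print([(row['label'], row['block']) for row in features])
```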
In [3]:
segs = [
    'segm/labels_memb_del_relabeled_fix', 
    'segm/labels_memb_del_relabeled_fix_memb',
    'segm/labels_memb_del_relabeled_fix_nucl',
]

featdir = os.path.join(datadir, 'profiling', 'features')
os.makedirs(featdir, exist_ok=True)

seg_paths = [os.path.join(datadir, '{}.h5/{}'.format(dataset, seg)) for seg in segs]
seg_names = ['full', 'memb', 'nucl']
data_paths = [os.path.join(datadir, '{}_bfc_block.ims'.format(dataset))]
data_names = ['ch{:02d}'.format(ch) for ch in range(8)]
aux_data_path = []
downsample_factors = [1, 1, 1]  # trailing comma removed: it turned this into a one-element tuple
outputstem = os.path.join(featdir, dataset)
blocksize = [dims[0], bs, bs, dims[3], dims[4]]
blockmargin = [0, bm, bm, 0, 0]
blockrange = []
channels = []
filter_borderlabels = True
min_labelsize = 50
split_features = True
fset_morph = 'maximal'
fset_intens = 'medium'
fset_addit = []

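As a quick sanity check on the path templates above, this sketch evaluates them for the first segmentation. Reading the result as an HDF5 file followed by an internal dataset path is an assumption based on the `'{}.h5/{}'` format string; `posixpath` replaces `os.path` here only so that the output is identical on every platform.

```python
import posixpath

datadir = './HFK16w'
dataset = 'HFK16w'
seg = 'segm/labels_memb_del_relabeled_fix'

# Same template as seg_paths, evaluated for a single segmentation.
seg_path = posixpath.join(datadir, '{}.h5/{}'.format(dataset, seg))

# Assumption: everything up to '.h5' is the file, the remainder the internal dataset.
file_stem, _, internal_path = seg_path.partition('.h5/')
h5_file = file_stem + '.h5'

# Same template as data_names: one zero-padded name per channel.
data_names = ['ch{:02d}'.format(ch) for ch in range(8)]

print(h5_file)
print(internal_path)
print(data_names)
```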

In [4]:
import multiprocessing
from stapl3d.segmentation.features import export_regionprops

arglist = []
for block_idx, datafile in enumerate(filelist):
    args = [
        seg_paths,
        seg_names,
        data_paths,
        data_names,
        aux_data_path,
        downsample_factors,
        outputstem,
        blocksize,
        blockmargin,
        [block_idx, block_idx + 1],
        channels,
        filter_borderlabels,
        min_labelsize,
        split_features,
        fset_morph,
        fset_intens,
        fset_addit,
        ]
    arglist.append(tuple(args))

with multiprocessing.Pool(processes=n_proc) as pool:
    pool.starmap(export_regionprops, arglist)

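The call pattern in the cell above, one argument tuple per block that `starmap` unpacks into positional arguments, can be tried out on a toy function. In this sketch `describe_block` is a hypothetical stand-in for the real worker, and `ThreadPool` is swapped in for `multiprocessing.Pool` because it exposes the same `starmap` method without requiring the worker to be importable by child processes.

```python
from multiprocessing.pool import ThreadPool


def describe_block(stem, block_range, min_size):
    """Hypothetical stand-in for a per-block worker such as export_regionprops."""
    return '{}_block{:03d}_min{}'.format(stem, block_range[0], min_size)


# One argument tuple per block, mirroring how arglist is built in the demo.
arglist = [('HFK16w', [idx, idx + 1], 50) for idx in range(3)]

# starmap unpacks each tuple into positional arguments and preserves input order.
with ThreadPool(processes=2) as pool:
    results = pool.starmap(describe_block, arglist)

print(results)
```

Because every tuple carries its own single-block range, each worker call handles exactly one block and the calls do not depend on each other.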

Running the export on every datablock generates three CSV files per block, containing the features for the full segment and for its nuclear and membrane parts; they are written to *HFK16w/profiling/features*. Finally, we merge all CSV files, taking the intensity features from the subsegments and retaining the most relevant morphological and spatial features. We then write the result to a single output file, *HFK16w/profiling/features/HFK16w_features.csv*.

In [5]:
from stapl3d.segmentation.features import postprocess_features

postprocess_features(
    seg_paths,
    blocksize,
    blockmargin,
    blockrange=[],
    csv_dir=featdir,
    csv_stem=dataset,
    feat_pf='_features',
    segm_pfs=['full', 'memb', 'nucl'],
    ext='csv',
    min_size_nucl=min_labelsize,
    save_border_labels=True,
    split_features=True,
    fset_morph=fset_morph,
    fset_intens=fset_intens,
)



| Unnamed: 0 | block | label | area | com_z | com_y | com_x | equivalent_diameter | extent | fractional_anisotropy | major_axis_length | minor_axis_length | ch03_mean_intensity | ch05_mean_intensity | ch06_mean_intensity | ch07_mean_intensity | ch00_mean_intensity | ch01_mean_intensity | ch02_mean_intensity | ch04_mean_intensity | polarity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | 9539 | 13 | 71 | 79 | 26.312861 | 0.083728 | 0.553242 | 115.644706 | 52.169134 | 2279.847573 | 3642.349827 | 2681.082608 | 2837.526365 | 2700.592200 | 2983.458119 | 3711.245309 | 3294.517664 | 0.100843 |
| 1 | 0 | 9 | 1741 | 16 | 20 | 86 | 14.925654 | 0.235525 | 0.470216 | 30.509788 | 16.845797 | 2608.482481 | 3016.347501 | 3113.519242 | 1969.399770 | 10297.904078 | 1835.790350 | 7691.045951 | 3041.091327 | 0.032776 |
| 2 | 0 | 11 | 381 | 16 | 46 | 93 | 8.994467 | 0.148828 | 0.379327 | 21.465101 | 14.587997 | 3240.871391 | 2994.469816 | 2472.532808 | 3777.729659 | 3131.695538 | 2554.433071 | 2732.467192 | 2880.412073 | 0.046587 |
| 3 | 0 | 13 | 4104 | 14 | 98 | 111 | 19.864131 | 0.152361 | 0.545889 | 60.207242 | 27.701614 | 2734.029240 | 4627.703216 | 2181.793129 | 3009.348684 | 4811.009503 | 3019.333090 | 3068.956384 | 3082.065058 | 0.118614 |
| 4 | 0 | 15 | 4004 | 14 | 112 | 147 | 19.701463 | 0.216667 | 0.537521 | 52.288112 | 24.646941 | 3179.589161 | 2780.460040 | 2162.513487 | 3618.914585 | 8226.547453 | 2135.581169 | 2508.384865 | 2306.678571 | 0.054093 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 59776 | 61 | 94686 | 8264 | 93 | 1283 | 1318 | 25.084025 | 0.199229 | 0.558871 | 60.159631 | 26.551042 | 7685.074540 | 2863.654647 | 1961.907309 | 1550.988988 | 11934.622217 | 3059.709584 | 3441.079743 | 2736.184898 | 0.000000 |
| 59778 | 61 | 94688 | 2539 | 91 | 1290 | 1375 | 16.926014 | 0.190501 | 0.378428 | 32.407363 | 20.993438 | 9142.198897 | 2695.053564 | 2311.173690 | 3807.687278 | 6004.465538 | 2917.433635 | 1609.811343 | 2763.711304 | 0.000000 |
| 59779 | 61 | 94693 | 5268 | 93 | 1320 | 1386 | 21.588164 | 0.263664 | 0.319747 | 36.882803 | 25.983951 | 7647.136105 | 2769.556948 | 2623.637244 | 2890.243736 | 5586.335421 | 2245.228740 | 2162.981587 | 3046.222096 | 0.060626 |
| 59782 | 61 | 94718 | 3714 | 92 | 1329 | 1283 | 19.213851 | 0.252859 | 0.351044 | 34.391686 | 23.455411 | 5388.410339 | 2822.964190 | 2284.174744 | 1845.872644 | 10755.364028 | 3050.807754 | 5639.850027 | 2925.248250 | 0.050362 |

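The `equivalent_diameter` values in the table appear consistent with the diameter of a sphere whose volume equals the `area` column (the voxel count of a 3D segment). The sketch below checks this for the first two rows; the interpretation of the columns is an assumption inferred from the numbers, and `sphere_equivalent_diameter` is a hypothetical helper, not a STAPL-3D function.

```python
import math


def sphere_equivalent_diameter(volume):
    """Diameter of a sphere with the given volume: d = (6 V / pi) ** (1/3)."""
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)


# (area, equivalent_diameter) pairs copied from the first two rows of the table.
rows = [(9539, 26.312861), (1741, 14.925654)]

for area, reported in rows:
    computed = sphere_equivalent_diameter(area)
    print(area, round(computed, 3), reported)
```

If the computed and reported values agree, the column can be read as a size measure that is independent of segment shape, which complements the axis-length features.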