## Feature extraction postnatal development - napari_simpleitk_image_processing

The workflow for extracting features with `napari_simpleitk_image_processing` and its `label_statistics` function was already demonstrated [here].

This notebook extracts features from all images in the `statistics-data` folder using a for loop.

In [1]:
import apoc
import numpy as np
import os
import pandas as pd
import pyclesperanto_prototype as cle

from napari_simpleitk_image_processing import label_statistics

import sys
sys.path.append("../..")
from quapos_lm import rescale_image, rescale_segmentation, predict_image

In [2]:
# Load classifier
quapos_lm = apoc.ObjectSegmenter(opencl_filename = "../../01-training-and-validation/02-quapos-lm.cl")
quapos_lm.feature_importances()

{'gaussian_blur=1': 0.32557488170342097,
 'difference_of_gaussian=1': 0.4231073391932076,
 'laplace_box_of_gaussian_blur=1': 0.25131777910337144}

Next, the folder containing the images used for statistical analysis is defined and its files are listed.

In [4]:
# Define directory for images
images = "../../data/02-data-for-pixel-classifier/statistics-data/"

# Define file list
file_list = os.listdir(images)

# Show file list
print(file_list)

['C1-p08-20x-zoom-1.2-replicate-1.tif', 'C1-p08-20x-zoom-1.5-replicate-1.tif', 'C1-p08-20x-zoom-1.7-replicate-1.tif', 'C1-p08-20x-zoom-2.2-replicate-2.tif', 'C1-p08-20x-zoom-2.5-replicate-2.tif', 'C1-p08-20x-zoom-2.7-replicate-2.tif', 'C1-p08-20x-zoom-3.2-replicate-3.tif', 'C1-p08-20x-zoom-3.5-replicate-3.tif', 'C1-p08-20x-zoom-3.7-replicate-3.tif', 'C1-p08-20x-zoom-4.2-replicate-4.tif', 'C1-p08-20x-zoom-4.5-replicate-4.tif', 'C1-p08-20x-zoom-4.7-replicate-4.tif', 'C1-p10-20x-zoom-1.1-suse-replicate-5.tif', 'C1-p10-20x-zoom-1.3-suse-replicate-5.tif', 'C1-p10-20x-zoom-1.4-suse-replicate-5.tif', 'C1-p10-20x-zoom-2.2-suse-replicate-6.tif', 'C1-p10-20x-zoom-2.3-suse-replicate-6.tif', 'C1-p10-20x-zoom-2.5-suse-replicate-6.tif', 'C1-p10-20x-zoom-3.1-suse-replicate-7.tif', 'C1-p10-20x-zoom-3.3-suse-replicate-7.tif', 'C1-p10-20x-zoom-3.5-suse-replicate-7.tif', 'C1-p12-20x-zoom-flo-1.4-replicate-8.tif', 'C1-p12-20x-zoom-flo-1.5-replicate-8.tif', 'C1-p12-20x-zoom-flo-1.7-replicate-8.tif', 'C1-p1
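
`os.listdir` returns every entry in a folder in arbitrary order, so hidden or non-image files can end up in the list and the order may differ between machines. The following is a minimal sketch of a filtered, sorted listing. The helper name `list_tif_files` and the demo file names are hypothetical and not part of the original notebook.

```python
import os
import tempfile


def list_tif_files(folder):
    """Return the .tif/.tiff file names in `folder`, sorted alphabetically."""
    return sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith((".tif", ".tiff"))
    )


# Demonstration on a throwaway folder with made-up file names
with tempfile.TemporaryDirectory() as demo_folder:
    for name in ["C1-p10-b.tif", "C1-p08-a.tif", "notes.txt"]:
        open(os.path.join(demo_folder, name), "w").close()
    demo_file_list = list_tif_files(demo_folder)

print(demo_file_list)
```

Since `image_id` is derived from the position of a file in the list, changing the order of the list also changes which id each image receives.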

### Extract features

A for loop now extracts the features from every image. In addition, information from each filename is used to add the columns `age`, `biol_repl`, and `image_id`.
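
The filename convention can be checked in isolation before running the full loop. The sketch below assumes the pattern seen in the file list (age as the second dash-separated token, replicate number as the last token before the extension). The helper `parse_file_name` and the table `features_demo` are hypothetical stand-ins, not part of the original notebook.

```python
import pandas as pd


def parse_file_name(file_name):
    """Extract age (int) and biological replicate (str) from a file name."""
    parts = file_name.split("-")
    age = int(parts[1].replace("p", ""))   # "p10" -> 10
    biol_repl = parts[-1].split(".")[0]    # "5.tif" -> "5"
    return age, biol_repl


age, biol_repl = parse_file_name("C1-p10-20x-zoom-1.1-suse-replicate-5.tif")

# Stand-in for a per-image feature table (values are made up)
features_demo = pd.DataFrame({"label": [1, 2, 3]})

# Assigning a scalar to a column broadcasts it to every row
features_demo["age"] = age
features_demo["biol_repl"] = biol_repl
features_demo["image_id"] = 0

print(features_demo)
```

Note that the replicate is kept as a string here. Convert it with `int(...)` if a numeric column is preferred.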

In [9]:
# Define empty list to store the extracted features of each image
features = []

# Iterate over all files in the image folder
for i, file_name in enumerate(file_list):
    
    # Load image
    image = cle.imread(images + file_name)
    
    # Predict the image
    prediction = predict_image(image=image, classifier=quapos_lm)
    
    # Rescale the image
    image_rescaled = rescale_image(image=image, voxel_x=0.323, voxel_y=0.323, voxel_z=0.49)
    
    # Rescale the prediction
    prediction_rescaled = rescale_segmentation(segmentation=prediction, voxel_x=0.323, voxel_y=0.323, voxel_z=0.49)
    
    # Extract features with napari_simpleitk_image_processing's label_statistics function
    # Here all possible features are extracted
    features_i = label_statistics(
        intensity_image=image_rescaled,
        label_image=prediction_rescaled,
        size=True,
        intensity=True,
        perimeter=True,
        shape=True,
        position=True,
        moments=True)
    
    # Add information from the filename into corresponding columns
    # Split the filename
    file_name_split = file_name.split("-")
    
    # Add age in a respective column
    age = int(file_name_split[1].replace("p", ""))
    features_i["age"] = pd.Series([age for x in range(len(features_i))])
    
    # Add the biological replicate in a respective column
    biol_repl = file_name_split[-1].split(".")
    biol_repl = biol_repl[0]
    features_i["biol_repl"] = pd.Series([biol_repl for x in range(len(features_i))])
    
    # Add the image_id in a respective column
    features_i["image_id"] = pd.Series([i for x in range(len(features_i))])
    
    # Append the measurements of the current image to the list
    features.append(features_i)
    
# Concatenate the measurements of all images into one dataframe
features = pd.concat(features)
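
By default, `pd.concat` keeps the row index of each input table, so the index of the combined table restarts for every image. The following minimal sketch, with made-up stand-in tables, shows the difference that `ignore_index=True` makes. Whether a continuous index is wanted here is a judgment call, not something the original notebook prescribes.

```python
import pandas as pd

# Two small stand-ins for per-image feature tables (values are made up)
image_0 = pd.DataFrame({"label": [1, 2], "image_id": 0})
image_1 = pd.DataFrame({"label": [1, 2, 3], "image_id": 1})

# Default behaviour: each table keeps its own 0..n-1 index
stacked = pd.concat([image_0, image_1])
print(stacked.index.tolist())      # [0, 1, 0, 1, 2]

# ignore_index=True renumbers the rows from 0 to N-1
renumbered = pd.concat([image_0, image_1], ignore_index=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3, 4]
```

A repeating index matters mainly for label-based selection such as `.loc[0]`, which then returns one row per image rather than a single row.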

In [10]:
# Show the table with the extracted features
features

| | label | maximum | mean | median | minimum | sigma | sum | variance | bbox_0 | bbox_1 | ... | principal_axes5 | principal_axes6 | principal_axes7 | principal_axes8 | principal_moments0 | principal_moments1 | principal_moments2 | age | biol_repl | image_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 803.0 | 314.092437 | 282.884766 | 186.0 | 99.098496 | 261639.0 | 9820.511878 | 4 | 290 | ... | 0.955010 | -0.957034 | 0.113538 | -0.266826 | 1.832462 | 11.716350 | 43.131628 | 8 | 1 | 0 |
| 1 | 2 | 1845.0 | 620.114058 | 466.576172 | 174.0 | 407.194822 | 467566.0 | 165807.623096 | 5 | 281 | ... | 0.021927 | -0.020838 | -0.015309 | -0.999666 | 2.216001 | 5.510770 | 23.444648 | 8 | 1 | 0 |
| 2 | 3 | 564.0 | 274.831858 | 253.494141 | 154.0 | 91.856545 | 62112.0 | 8437.624936 | 21 | 275 | ... | 0.053457 | -0.054283 | -0.013105 | -0.998440 | 0.824688 | 3.316142 | 9.906961 | 8 | 1 | 0 |
| 3 | 4 | 540.0 | 285.008439 | 268.189453 | 153.0 | 89.769054 | 67547.0 | 8058.482979 | 34 | 277 | ... | -0.171591 | 0.154089 | 0.082694 | -0.984590 | 0.638453 | 3.475667 | 12.067510 | 8 | 1 | 0 |
| 4 | 5 | 264.0 | 234.200000 | 238.798828 | 202.0 | 20.192821 | 5855.0 | 407.750000 | 45 | 279 | ... | 0.034095 | 0.008930 | -0.036309 | -0.999301 | 0.255054 | 0.484601 | 2.409145 | 8 | 1 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 173 | 174 | 685.0 | 346.677288 | 326.101562 | 187.0 | 92.460071 | 424333.0 | 8548.864700 | 1324 | 341 | ... | 0.409121 | 0.620022 | -0.664031 | 0.417895 | 9.008137 | 17.118816 | 38.706719 | 24 | 28 | 83 |
| 174 | 175 | 609.0 | 337.340530 | 326.101562 | 188.0 | 81.507414 | 343750.0 | 6643.458582 | 1351 | 348 | ... | -0.411856 | 0.349005 | -0.921846 | 0.168509 | 6.496385 | 8.731998 | 49.841969 | 24 | 28 | 83 |
| 175 | 176 | 347.0 | 255.431818 | 252.820312 | 193.0 | 37.513655 | 11239.0 | 1407.274313 | 1365 | 349 | ... | 0.202175 | -0.194022 | -0.061208 | -0.979086 | 0.326646 | 1.280348 | 2.477201 | 24 | 28 | 83 |
| 176 | 177 | 345.0 | 247.479915 | 245.492188 | 176.0 | 32.455513 | 117058.0 | 1053.360295 | 1367 | 352 | ... | 0.815519 | 0.833391 | -0.530961 | -0.153428 | 3.571197 | 4.750325 | 50.039745 | 24 | 28 | 83 |


In [7]:
features.to_csv("../../measurements/wt-postnatal-development/01-a-feature-extraction-simpleitk.csv", index = False)
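
`to_csv` does not create missing folders, so the output directory has to exist before the table is written. The sketch below creates a nested folder inside a temporary directory and reads the file back as a round-trip check. The table `features_demo` and the file name `features-demo.csv` are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Made-up stand-in for the concatenated feature table
features_demo = pd.DataFrame({"label": [1, 2], "age": [8, 8]})

with tempfile.TemporaryDirectory() as root:
    # Hypothetical output folder, mirroring the nesting used above
    out_dir = os.path.join(root, "measurements", "wt-postnatal-development")
    os.makedirs(out_dir, exist_ok=True)   # no error if it already exists

    out_path = os.path.join(out_dir, "features-demo.csv")
    features_demo.to_csv(out_path, index=False)

    # Read the file back to confirm the round trip
    reloaded = pd.read_csv(out_path)

print(reloaded.equals(features_demo))
```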