<h2>Data Analysis - Batch Processing - Quantification of cell populations</h2>

This notebook processes the .csv files produced by Batch Processing (Average Intensity or Colocalization) to:

1. Define cell populations based on single or multiple markers (positive, negative or a combination of both).
2. Plot the resulting data using Plotly.
3. Extract the number of cells positive for a marker based on colocalization (using a user-defined threshold).
4. Aggregate all per-label results in a single .csv file ("BP_populations_marker_+_summary_{method}.csv").
5. Save summary % results on a per-cell-population basis in a .csv file ("BP_populations_marker_+_summary_{method}.csv").

In [17]:
from pathlib import Path
from utils_data_analysis import calculate_perc_pops, plot_perc_pop_per_filename_roi

In [18]:
# Define the path containing your results
results_path = Path("./results/070824 x/2D/Cellpose")

# Input the method used to define cells as positive for a marker ("avg_int", "coloc") #TODO: "pixel_class"
method = "avg_int"

# Define the channels you want to analyze using the following structure:
# markers = [(channel_name, channel_nr, cellular_location),(..., ..., ...)]
markers = [("neun", 0, "nucleus"), ("sox2", 1, "nucleus")]

# Define the (min, max) average intensity range used to select each population of interest
# You can define several populations for the same marker (e.g. neun_high and neun_low)
# Max values are set to 255 because the test input images are 8-bit; higher bit depths can yield higher max avg_int values
min_max_per_marker = [{"marker": "neun", "min_max": (50,115), "population":"neun_low"},
                      {"marker": "neun", "min_max": (115,255), "population":"neun_high"},
                      {"marker": "sox2", "min_max": (65,255), "population":"sox2"},]

# Define cell populations based on one or more markers: (name, True) means positive for that population, (name, False) means negative
# Subpopulation names refer to the "population" entries in min_max_per_marker, which matters when several populations share a marker, as for "neun"
# For a cell_pop defined by a single population, give it a name that differs from the population name in min_max_per_marker (e.g. by adding a +)
cell_populations = [{"cell_pop": "ox_stress_total", "subpopulations": [("sox2", True)]},
                    {"cell_pop": "no_ox_stress_total", "subpopulations": [("sox2", False)]},
                    {"cell_pop": "ox_stress_neun_high", "subpopulations": [("neun_high", True), ("sox2", True)]},
                    {"cell_pop": "ox_stress_neun_low", "subpopulations": [("neun_low", True), ("sox2", True)]},
                    {"cell_pop": "non_ox_stress_neun_high", "subpopulations": [("neun_high", True), ("sox2", False)]},
                    {"cell_pop": "non_ox_stress_neun_low", "subpopulations": [("neun_low", True), ("sox2", False)]}]
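
To make the relationship between `min_max_per_marker` and `cell_populations` concrete, the following is a minimal plain-Python sketch of how such definitions could be turned into percentages. It is not the implementation of `calculate_perc_pops`. The per-cell record layout (one dictionary of average intensities per cell), the helper names `classify_cell` and `percentage_per_cell_pop`, the toy intensity values, and the half-open interval `min <= value < max` are all assumptions made for illustration.

```python
def classify_cell(avg_int, min_max_per_marker):
    """Return {population: bool} for one cell.

    avg_int maps marker name -> average intensity (hypothetical layout).
    A cell counts as positive when min <= value < max (assumed convention).
    """
    flags = {}
    for entry in min_max_per_marker:
        low, high = entry["min_max"]
        value = avg_int[entry["marker"]]
        flags[entry["population"]] = low <= value < high
    return flags


def percentage_per_cell_pop(cells, min_max_per_marker, cell_populations):
    """Percentage of cells that match every (population, expected) pair."""
    flags_per_cell = [classify_cell(c, min_max_per_marker) for c in cells]
    result = {}
    for pop in cell_populations:
        matches = sum(
            all(flags[name] == expected for name, expected in pop["subpopulations"])
            for flags in flags_per_cell
        )
        result[pop["cell_pop"]] = 100 * matches / len(flags_per_cell)
    return result


# Toy data: four cells with made-up average intensities
cells = [
    {"neun": 130, "sox2": 80},  # neun_high, sox2 positive
    {"neun": 60, "sox2": 20},   # neun_low, sox2 negative
    {"neun": 70, "sox2": 90},   # neun_low, sox2 positive
    {"neun": 10, "sox2": 30},   # neun negative, sox2 negative
]
thresholds = [
    {"marker": "neun", "min_max": (50, 115), "population": "neun_low"},
    {"marker": "neun", "min_max": (115, 255), "population": "neun_high"},
    {"marker": "sox2", "min_max": (65, 255), "population": "sox2"},
]
pops = [
    {"cell_pop": "ox_stress_total", "subpopulations": [("sox2", True)]},
    {"cell_pop": "ox_stress_neun_low", "subpopulations": [("neun_low", True), ("sox2", True)]},
    {"cell_pop": "non_ox_stress_neun_low", "subpopulations": [("neun_low", True), ("sox2", False)]},
]

toy_percentages = percentage_per_cell_pop(cells, thresholds, pops)
print(toy_percentages)
```

In this sketch a `(name, False)` pair selects cells that fall outside the population's intensity range, so `("sox2", False)` covers every cell that is not `sox2`-positive.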

In [19]:
# Extract model and segmentation type from results Path
# Calculate percentages of each cell population, save them as a summary .csv
percentage_true, model_name, segmentation_type = calculate_perc_pops(results_path, method, min_max_per_marker, cell_populations)

percentage_true

Unnamed: 0,filename,ROI,neun_low,neun_high,sox2,ox_stress_total,no_ox_stress_total,ox_stress_neun_high,ox_stress_neun_low,non_ox_stress_neun_high,non_ox_stress_neun_low
0,A1_Brain1_TR1,full_image,35.188129,12.541954,31.831832,31.831832,68.168168,0.547606,3.744921,11.994347,31.443208
1,A1_Brain1_TR2,full_image,33.940736,8.021117,11.393052,11.393052,88.606948,0.13624,0.970708,7.884877,32.970027
2,A1_Brain2_TR1,full_image,29.061514,17.014984,31.999211,31.999211,68.000789,0.35489,2.799685,16.660095,26.26183
3,A1_Brain2_TR2,full_image,32.59302,18.690511,32.448803,32.448803,67.551197,1.009518,5.595616,17.680992,26.997404
4,A2_Brain1_TR1_HipSub,full_image,28.462966,9.127721,26.370845,26.370845,73.629155,0.202463,1.248524,8.925257,27.214442
5,A2_Brain1_TR1_SVZ,full_image,11.90368,1.226715,36.347115,36.347115,63.652885,0.090868,2.089959,1.135847,9.813721
6,A2_Brain1_TR2_HipSub,full_image,29.122007,21.117446,31.345496,31.345496,68.654504,0.570125,4.526796,20.54732,24.595211
7,A2_Brain1_TR2_SVZ,full_image,13.497773,3.23741,45.889003,45.889003,54.110997,0.308325,4.282288,2.929085,9.215485
8,A2_Brain2_TR1_HipSub,full_image,28.376857,16.884286,30.031517,30.031517,69.968483,0.360198,3.185502,16.524088,25.191355
9,A2_Brain2_TR1_SVZ,full_image,43.90528,24.301242,64.945652,64.945652,35.054348,13.509317,28.10559,10.791925,15.799689
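
Because each cell is either `sox2`-positive or `sox2`-negative, the summary columns should satisfy a few identities: `ox_stress_total + no_ox_stress_total` should be 100, and the `ox_stress_*` and `non_ox_stress_*` columns for a given `neun` population should add up to that population's percentage. The sketch below checks these identities on the first two rows of the table, copied by hand. The helper name `check_row` and the tolerance of `1e-4` (chosen to absorb rounding in the displayed values) are assumptions, not part of `utils_data_analysis`.

```python
import math

# First two rows of the summary table above, copied by hand
rows = [
    {"filename": "A1_Brain1_TR1", "neun_low": 35.188129, "neun_high": 12.541954,
     "ox_stress_total": 31.831832, "no_ox_stress_total": 68.168168,
     "ox_stress_neun_high": 0.547606, "ox_stress_neun_low": 3.744921,
     "non_ox_stress_neun_high": 11.994347, "non_ox_stress_neun_low": 31.443208},
    {"filename": "A1_Brain1_TR2", "neun_low": 33.940736, "neun_high": 8.021117,
     "ox_stress_total": 11.393052, "no_ox_stress_total": 88.606948,
     "ox_stress_neun_high": 0.13624, "ox_stress_neun_low": 0.970708,
     "non_ox_stress_neun_high": 7.884877, "non_ox_stress_neun_low": 32.970027},
]


def check_row(row, tol=1e-4):
    """Return True when the complementary columns of one row add up as expected."""
    total_ok = math.isclose(
        row["ox_stress_total"] + row["no_ox_stress_total"], 100.0, abs_tol=tol
    )
    high_ok = math.isclose(
        row["ox_stress_neun_high"] + row["non_ox_stress_neun_high"],
        row["neun_high"], abs_tol=tol,
    )
    low_ok = math.isclose(
        row["ox_stress_neun_low"] + row["non_ox_stress_neun_low"],
        row["neun_low"], abs_tol=tol,
    )
    return total_ok and high_ok and low_ok


# Filenames whose percentages do not add up; an empty list means all rows pass
inconsistent = [row["filename"] for row in rows if not check_row(row)]
print(inconsistent)
```

A non-empty list would point to a mismatch between the population names used in `cell_populations` and those defined in `min_max_per_marker`, or to overlapping intensity ranges.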


In [20]:
# Plot the resulting cell population percentages on a per-filename, per-ROI basis
plot_perc_pop_per_filename_roi(percentage_true, model_name, segmentation_type)