# ABBA cell count analysis

This notebook is the last step in the ABBA whole-brain cell counting analysis.  
It assumes you have completed the following steps:
- Aligned the brain slices in ABBA and exported them to a QuPath project.
- Detected the cells of interest in QuPath and exported the detections to ```.csv``` files (one per slice) in a folder called ```results```.
- If there are regions to exclude, drawn them and exported them to ```.txt``` files (one per slice) in a folder called ```regions_to_exclude```.

Run this notebook to load the cell counts and analyze them.

## Before we start ...
The majority of the functions and classes we need are written in 3 files: ```brain_hierarchy.py```, ```readCSV_helpers.py``` and ```pls_helpers.py```. The cell below imports ```BraiAn```, together with a few standard modules, so that we can use these functions and classes later:

In [None]:
import BraiAn
import os
import numpy as np
from collections import namedtuple
GroupDirectory = namedtuple("GroupDirectory", "name dirs")
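
As a quick illustration of how the ```GroupDirectory``` named tuple defined above can be used, the sketch below builds one group and reads its fields. The animal folder names are placeholders, not folders from any real experiment.

```python
from collections import namedtuple

# Same definition as in the cell above: a group name plus its animal folders.
GroupDirectory = namedtuple("GroupDirectory", "name dirs")

# Hypothetical group; the folder names are placeholders.
example_group = GroupDirectory(name="Control", dirs=["animal_A", "animal_B"])

print(example_group.name)        # fields can be read by name
print(len(example_group.dirs))   # number of animals in the group
name, dirs = example_group       # positional unpacking also works
```

Keeping the name and the folders together in one object means a group can be passed around as a single value instead of two parallel variables.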

## The Allen Brain Atlas

We start by importing the mouse Allen Brain Atlas, which holds information about all brain regions (for example, their parent region and child regions in the brain hierarchy).

In [None]:
# from https://help.brain-map.org/display/api/Downloading+an+Ontology%27s+Structure+Graph
# StructureGraph id=1
path_to_allen_json = "./data/AllenMouseBrainOntology.json"

branches_to_exclude = ["retina","VS","grv","fiber tracts"]
AllenBrain = BraiAn.AllenBrainHierarchy(path_to_allen_json, branches_to_exclude)

#parent_region = AllenBrain.parent_region
#direct_subregions = AllenBrain.direct_subregions
#full_name = AllenBrain.full_name
#regions = AllenBrain.list_all_subregions("root", mode="depth")
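
To make the idea of parent and child regions concrete, here is a minimal sketch of a depth-first traversal over a nested structure graph. It does not use BraiAn: the tiny ```toy_graph``` and the helpers ```list_subregions``` and ```parent_map``` are hypothetical, and the ```acronym``` and ```children``` keys are assumed to mirror the layout of the ontology JSON.

```python
# Toy structure graph: each node has an acronym and a list of children.
# This is made-up data, not the real Allen ontology.
toy_graph = {
    "acronym": "root",
    "children": [
        {"acronym": "grey", "children": [
            {"acronym": "CH", "children": []},
            {"acronym": "BS", "children": []},
        ]},
        {"acronym": "VS", "children": []},
    ],
}

def list_subregions(node):
    """Return the acronyms of `node` and all its descendants, depth-first."""
    acronyms = [node["acronym"]]
    for child in node["children"]:
        acronyms.extend(list_subregions(child))
    return acronyms

def parent_map(node, parent=None, out=None):
    """Map every acronym to the acronym of its parent (None for the root)."""
    out = {} if out is None else out
    out[node["acronym"]] = parent
    for child in node["children"]:
        parent_map(child, node["acronym"], out)
    return out

print(list_subregions(toy_graph))   # ['root', 'grey', 'CH', 'BS', 'VS']
print(parent_map(toy_graph)["CH"])  # grey
```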

We can also visualize the hierarchy of brain regions as a network (a tree). **Note that running the cell below may take a few minutes**.

In [None]:
# Plot the brain region hierarchy
# Plotting requires PyDot (pydot) to be installed
fig = AllenBrain.plot_plotly_graph()
fig.show()

Based on the graph above, you might want to specify the regions on which to do further PLS analysis:  
*Note: to see more information about a region, hover over it with your mouse.*

- Specify a level. The analysis can only be done on one level (slice) of the brain region hierarchy.

- To exclude the brain regions that belong to a certain branch, add the *abbreviated* name of the node at the start of that branch to the ```branches_to_exclude``` list defined above.  
Example:  
```branches_to_exclude = ["retina", "VS"]```  
means that **all the subregions that belong to the retina and the ventricular systems** are excluded from the PLS analysis.
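
The effect of excluding a branch can be sketched on a toy tree: removing a node also removes everything below it. This is only an illustration under assumptions. ```prune_branches``` and ```acronyms``` are hypothetical helpers, the tree is made-up data, and the way BraiAn implements the exclusion internally may differ.

```python
def prune_branches(node, branches_to_exclude):
    """Return a copy of `node` without the excluded branches.

    Returns None when `node` itself is the start of an excluded branch.
    """
    if node["acronym"] in branches_to_exclude:
        return None
    pruned_children = (prune_branches(child, branches_to_exclude)
                       for child in node["children"])
    return {"acronym": node["acronym"],
            "children": [c for c in pruned_children if c is not None]}

def acronyms(node):
    """List the acronyms of `node` and its descendants, depth-first."""
    result = [node["acronym"]]
    for child in node["children"]:
        result.extend(acronyms(child))
    return result

# Made-up tree: "VS" has two subregions that must disappear with it.
toy_graph = {"acronym": "root", "children": [
    {"acronym": "retina", "children": []},
    {"acronym": "VS", "children": [
        {"acronym": "VL", "children": []},
        {"acronym": "V3", "children": []},
    ]},
    {"acronym": "grey", "children": [
        {"acronym": "CH", "children": []},
    ]},
]}

pruned = prune_branches(toy_graph, ["retina", "VS"])
print(acronyms(pruned))  # ['root', 'grey', 'CH']
```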

## Load data

Now, we're ready to read the ```.csv``` files with the cell counts, and also the exclusion files (if there were regions to exclude).  
Below, you have to specify:
- ```experiment```: The name of the experiment, which selects the data folder under ```./data/experiments/```.
- ```groups```: A list of ```GroupDirectory``` entries, one per group (e.g., Control group and Stress group). Each entry has a meaningful ```name``` and ```dirs```, the list of names of the folders corresponding to the animals in that group. The results of each animal must be stored in their own folder.
- ```plots_output_folder```: The name of the subfolder in which the plots are saved.
- ```animals_root```: Path to the folder that contains the animal folders.
- ```area_key```: The name of the column in the ```.csv``` files that refers to the size of a brain area.
- ```tracer_key```: The name of the column in the ```.csv``` files that refers to the tracer used to highlight the marker (e.g. ```Num AF647```).
- ```marker```: The name of the marker we would like to highlight (e.g. CFos).

Several example configurations follow. Run only the cell for the dataset you want to analyze, since each one overwrites ```groups``` and ```plots_output_folder```.

TODO: try modifying this to obtain the density in mm^2 (from microns).
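
Converting a density from µm² to mm² is simple arithmetic: 1 mm is 1000 µm, so 1 mm² is 10⁶ µm². A minimal sketch follows, where ```density_per_mm2``` is a hypothetical helper and the numbers are made up; it is not how BraiAn performs the conversion internally.

```python
UM2_PER_MM2 = 1_000_000  # 1 mm = 1000 µm, so 1 mm² = 1000² µm²

def density_per_mm2(cell_count, area_um2):
    """Cells per mm² from a cell count and an area measured in µm²."""
    area_mm2 = area_um2 / UM2_PER_MM2
    return cell_count / area_mm2

# Hypothetical region: 150 cells detected in 500,000 µm² (= 0.5 mm²).
print(density_per_mm2(150, 500_000))  # 300.0
```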

In [None]:
# SOUMNYA FEMALES+MALES - 2 Groups {Stress|Control}
experiment = "soumnya"
groups = [
    GroupDirectory(
        name="Control",
        dirs=["42C", "43C", "44C", "49C", "50C", "51C", "52C", "58C", "60C", "74C", "76C", "81C", "83C", "89C", "91C"]
    ),
    GroupDirectory(
        name="Stress",
        dirs=["45S", "46S", "47S", "48S", "54S", "55S", "56S", "62S", "64S", "78S", "80S", "85S", "87S", "93S", "95S"]
    ),
]
plots_output_folder = "C-S"

In [None]:
# SOUMNYA ALL - 2 Groups {Stress|Control} + 2 Groups {Males|Females}
experiment = "soumnya"
groups = [
    GroupDirectory(
        name="Control (Females)",
        dirs=["42C", "44C", "49C", "60C", "74C", "76C", "58C"]
    ),
    GroupDirectory(
        name="Stress (Females)",
        dirs=["46S", "48S", "54S", "56S", "62S", "64S", "80S", "78S"]
    ),
    GroupDirectory(
        name="Control (Males)",
        dirs=["43C", "50C", "51C", "52C", "81C", "83C", "89C", "91C"]
    ),
    GroupDirectory(
        name="Stress (Males)",
        dirs=["47S", "55S", "85S", "95S", "45S", "87S", "93S"]
    ),
]
plots_output_folder = "CF-SF-CM-SM"

In [None]:
# SOUMNYA FEMALES - 2 Groups {Stress|Control}
experiment = "soumnya"
groups = [
    GroupDirectory(
        name="Control (Females)",
        dirs=["42C", "44C", "49C", "60C", "74C", "76C", "58C"]
    ),
    GroupDirectory(
        name="Stress (Females)",
        dirs=["46S", "48S", "54S", "56S", "62S", "64S", "80S", "78S"]
    ),
]
plots_output_folder = "CF-SF"

In [None]:
# SOUMNYA MALES - 2 Groups {Stress|Control}
experiment = "soumnya"
groups = [
    GroupDirectory(
        name="Control (Males)",
        dirs=["43C", "50C", "51C", "52C", "81C", "83C", "89C", "91C"] 
    ),
    GroupDirectory(
        name="Stress (Males)",
        dirs=["47S", "55S", "85S", "95S", "45S", "87S", "93S"]   
    ),
]
plots_output_folder = "CM-SM"

In [None]:
# SHILA - 3 Groups {Control|Stress|Resilient}
experiment = "shila"
groups = [
    GroupDirectory(
        name="Control",
        dirs=["16C", "17C", "19C"] # 18C
    ),
    GroupDirectory(
        name="Stress",
        dirs=["5S", "8S", "10S", "13S", "14S"]
    ),
    GroupDirectory(
        name="Resilient",
        dirs=["1R", "2R", "3R", "4R", "11R"]
    ),
]
plots_output_folder = "C-S-R"

In [None]:
# SHILA - 2 Groups {Control|Stress+Resilient}
experiment = "shila"
groups = [
    GroupDirectory(
        name="Control",
        dirs=["16C", "17C", "19C"] # 18C
    ),
    GroupDirectory(
        name="Stress",
        dirs=["5S", "10S", "13S", "14S", "1R", "2R", "3R", "4R", "11R"] # 8S
    ),
]
plots_output_folder = "C-S"

In [None]:
# TEST DATA - Sample data
experiment = "test"
groups = [
    GroupDirectory(
        name="Control",
        dirs=["Control_16C", "Control_17C", "Control_18C", "Control_19C", "Control_42C"]
    ),
    GroupDirectory(
        name="Stress",
        dirs=["Stress_5S", "Stress_8S", "Stress_10S", "Stress_13S"]
    ),
    GroupDirectory(
        name="Resilient",
        dirs=["Resilient_1R", "Resilient_2R", "Resilient_3R", "Resilient_4R", "Resilient_11R"]
    ),
]
plots_output_folder = "C-S-R"

In [None]:
# ####################################### SET PARAMETERS ####################################

animals_root = f"./data/experiments/{experiment}/QuPath_output/"
area_key = "Area um^2"
tracer_key = "Num AF647"
marker = "CFos"

data_output_path = f"./data/experiments/{experiment}/BraiAn_norm_output/"
plots_output_path = f"./plots/{experiment}/{plots_output_folder}"


# ###########################################################################################


# Create the output folders if they do not exist yet
os.makedirs(data_output_path, exist_ok=True)
os.makedirs(plots_output_path, exist_ok=True)
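
Before loading the data, it can help to check that every folder listed in a group actually exists under the root folder. The sketch below shows one way to do that with the standard library. ```missing_animal_dirs``` is a hypothetical helper, and the demonstration runs on a temporary folder with placeholder animal names rather than on real data.

```python
import os
import tempfile

def missing_animal_dirs(animals_root, dirs):
    """Return the entries of `dirs` that are not folders inside `animals_root`."""
    return [d for d in dirs
            if not os.path.isdir(os.path.join(animals_root, d))]

# Demonstration on a throwaway folder tree with placeholder animal names.
with tempfile.TemporaryDirectory() as tmp_root:
    os.makedirs(os.path.join(tmp_root, "animal_A"))
    missing = missing_animal_dirs(tmp_root, ["animal_A", "animal_B"])

print(missing)  # ['animal_B']
```

In the notebook, the same check could be run once per group, passing ```animals_root``` and the ```dirs``` of that group.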

Now we load the results of each group separately. They are saved further below.

**Note**: regions marked for exclusion are removed automatically.

In [None]:
from typing import List

groups_slices: List[List[BraiAn.SlicedBrain]] = []
for group in groups:
    # One SlicedBrain per animal folder of the group
    group_slices = [BraiAn.SlicedBrain(animal_dir,
                                        os.path.join(animals_root, animal_dir),
                                        AllenBrain,
                                        area_key,
                                        tracer_key,
                                        marker,
                                        area_units="µm2")
                    for animal_dir in group.dirs]
    groups_slices.append(group_slices)
    print(f"Imported all brain slices from {len(group.dirs)} animals of {group.name} group.")

In [None]:
AllenBrain.select_from_csv("./data/AllenSummaryStructures.csv")
root_plot = BraiAn.plot_region_density("root", *groups_slices, width=1000, height=500)
cvar_plot = BraiAn.plot_cv_above_threshold(AllenBrain, *groups_slices, cv_threshold=1, width=1000, height=500)
root_plot.show()

In [None]:
# print("N regions above threshold:", sum([(brain.data > cv_threshold).sum() for brain in cvar_brains]))
# print("N regions below threshold:", sum([(brain.data <= cv_threshold).sum() for brain in cvar_brains]))
cvar_plot.show()
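
The coefficient of variation (CV) of a region is the standard deviation of its per-slice densities divided by their mean; a region counts as highly variable when its CV exceeds ```cv_threshold```. The sketch below reproduces that computation with pandas on made-up numbers. It assumes the sample standard deviation (pandas' default, ```ddof=1```), which may differ from what BraiAn uses, and the region names are placeholders.

```python
import pandas as pd

# Hypothetical per-slice densities. The index holds the region name,
# so a region appears once per slice in which it was measured.
all_slices = pd.Series(
    [100.0, 110.0, 90.0, 10.0, 200.0, 30.0],
    index=["region_A", "region_A", "region_A",
           "region_B", "region_B", "region_B"],
    name="density",
)

# Number of slices, mean and sample standard deviation per region
per_region = all_slices.groupby(all_slices.index).agg(["count", "mean", "std"])

# Coefficient of variation and comparison with the threshold
cv_threshold = 1
per_region["cvar"] = per_region["std"] / per_region["mean"]
per_region["above_threshold"] = per_region["cvar"] > cv_threshold

print(per_region)
```

With these numbers, ```region_A``` has a CV of 0.1 and stays below the threshold, while ```region_B``` has a CV of about 1.3 and is flagged.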

In [None]:
r = "IG"
n_group = 2
n_animal = 3
sliced_brain = groups_slices[n_group-1][n_animal-1]
sliced_brain = BraiAn.merge_sliced_hemispheres(sliced_brain)
all_slices_df = sliced_brain.concat_slices()
slices_per_area = all_slices_df.groupby(all_slices_df.index).count().iloc[:,0]
print(f"""Summary for brain region '{r}' of {sliced_brain.name}:
    - N slices: {slices_per_area[r]}
    - Mean: {BraiAn.AnimalBrain(sliced_brain, mode="avg").data[r]:.2f} {sliced_brain.marker}/mm²,
    - S.D.: {BraiAn.AnimalBrain(sliced_brain, mode="std").data[r]:.2f} {sliced_brain.marker}/mm²,
    - Coefficient of Variation: {BraiAn.AnimalBrain(sliced_brain, mode="cvar").data[r]}
""")

In [None]:
# NOTE: brains are being written WITH Left/Right discrimination
# If you desire to save them without, call AnimalBrain with hemisphere_distinction=False

# One AnimalBrain per animal, grouped in the same order as `groups`
groups_sum_brains: List[List[BraiAn.AnimalBrain]] = [
    [BraiAn.AnimalBrain(sliced_brain) for sliced_brain in sliced_brain_list]
    for sliced_brain_list in groups_slices
]
for group, sum_brains in zip(groups, groups_sum_brains):
    group_output_path = os.path.join(data_output_path, group.name)
    for animal in sum_brains:
        animal.write_all_brains(group_output_path)

In [None]:
animal_groups: List[BraiAn.AnimalGroup] = [BraiAn.AnimalGroup(groups[i].name, groups_sum_brains[i], AllenBrain) for i in range(len(groups))]

In [None]:
# Save results
for i in range(len(groups)):
    animal_groups[i].to_csv(data_output_path, f"results_cell_counts_{groups[i].name}.csv", overwrite=True)

In [None]:
normalization = "Density"
low_threshold = 700 # Only plot bars with value larger than threshold
top_threshold = 4_000 # np.inf
# regions_to_plot = BraiAn.regions_to_plot(groups=animal_groups, normalization=normalization, low_threshold=low_threshold, top_threshold=top_threshold)
regions_to_plot = AllenBrain.get_selected_regions() # selects the Summary Structures
fig = BraiAn.plot_groups(normalization, AllenBrain, *animal_groups,
                            selected_regions=regions_to_plot, use_acronyms=False, height=10_000)
fig.show()

file_title = f"barplot_{animal_groups[0].marker}_{normalization}_{len(animal_groups)}groups.png"
fig.write_image(os.path.join(plots_output_path, file_title))
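
The ```low_threshold``` and ```top_threshold``` parameters above suggest restricting the bar plot to regions whose value lies within a range. A minimal sketch of such a filter follows. ```regions_in_range``` is a hypothetical helper, the densities are made up, and treating both bounds as strict is an assumption; it does not reproduce ```BraiAn.regions_to_plot```.

```python
import math

def regions_in_range(mean_density, low_threshold=0.0, top_threshold=math.inf):
    """Keep the regions whose mean density lies strictly between the thresholds."""
    return [region for region, value in mean_density.items()
            if low_threshold < value < top_threshold]

# Hypothetical mean densities per region (marker detections per mm²).
mean_density = {"region_A": 500.0, "region_B": 1200.0, "region_C": 5000.0}

print(regions_in_range(mean_density, low_threshold=700, top_threshold=4_000))
# ['region_B']
```

Leaving ```top_threshold``` at its default of infinity keeps every region above the lower bound, which matches the ```np.inf``` alternative noted in the cell above.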