# Data preparation

Here we show how to use the `braian` Python library to prepare the data and perform a preliminary analysis.

This notebook is the last step in the ABBA whole-brain cell counting analysis.  
It assumes you have done the following steps:
- Alignment of brain slices in ABBA, exported to a QuPath project.
- Detection of the cells of interest in QuPath. The detections should be exported to ```.csv``` files (one per slice) in a folder called ```results```.
- If there are regions to exclude, you should have drawn them and exported them to ```.txt``` files (one per slice) in a folder called ```regions_to_exclude```.
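
Before loading anything, it can help to check that each animal folder actually contains the expected files. The snippet below is a minimal sketch using only the standard library. It assumes each animal has its own folder holding a `results` and (optionally) a `regions_to_exclude` subfolder; the helper `summarize_animal_dir` and the folder name `animal_A` are hypothetical and not part of `braian`.

```python
import tempfile
from pathlib import Path


def summarize_animal_dir(animal_dir):
    """Count the per-slice detection and exclusion files of one animal folder."""
    animal_dir = Path(animal_dir)
    results_dir = animal_dir / "results"
    exclusions_dir = animal_dir / "regions_to_exclude"
    # A missing folder simply counts as zero files.
    detections = list(results_dir.glob("*.csv")) if results_dir.is_dir() else []
    exclusions = list(exclusions_dir.glob("*.txt")) if exclusions_dir.is_dir() else []
    return {
        "animal": animal_dir.name,
        "n_detection_files": len(detections),
        "n_exclusion_files": len(exclusions),
    }


# Demo on a throwaway directory that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    animal = Path(tmp) / "animal_A"  # hypothetical animal name
    (animal / "results").mkdir(parents=True)
    (animal / "regions_to_exclude").mkdir()
    (animal / "results" / "slice_1.csv").write_text("placeholder")
    (animal / "results" / "slice_2.csv").write_text("placeholder")
    (animal / "regions_to_exclude" / "slice_1.txt").write_text("placeholder")
    summary = summarize_animal_dir(animal)

print(summary)
```

An animal with fewer exclusion files than detection files is not necessarily a problem, since exclusions are only needed for slices that have regions to exclude.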

Run this notebook to load the cell counts and run a preliminary analysis on them.

## Before we start ...
### Set parameters

In [None]:
CONFIG_FILE_NAME = "config_example.yml"                     # assumes the file is in DATA_ROOT directory

# Script's code

In [None]:
import braian
import braian.config
import braian.plot as bap
import braian.stats as bas

import plotly.io as pio
from pathlib import Path

# This ensures BraiAn's figures work in multiple places:
pio.renderers.default = "plotly_mimetype+notebook"

In [None]:
root_dir = Path.cwd().absolute().parent
config_file = root_dir/CONFIG_FILE_NAME
config = braian.config.BraiAnConfig(root_dir/"data", config_file)

config.output_dir.mkdir(parents=True, exist_ok=True)

## The Allen Brain Atlas

We start by importing the Allen Mouse Brain Atlas, which provides information about every brain region, such as its parent region and its child regions in the brain hierarchy.
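
To make the idea of parent and child regions concrete, here is a minimal sketch of a region hierarchy and a depth-first listing of subregions. It is a toy illustration with made-up acronyms, not real atlas regions and not how `braian` stores the ontology; the helpers `parent_of` and `subregions_depth_first` are hypothetical.

```python
# Toy hierarchy: each region maps to its direct subregions.
# The acronyms are made up for illustration (NOT real Allen atlas regions).
toy_children = {
    "root": ["A", "B"],
    "A": ["A1", "A2"],
    "B": [],
    "A1": [],
    "A2": [],
}


def parent_of(region, children):
    """Return the parent of `region`, or None if it is the top of the tree."""
    for parent, kids in children.items():
        if region in kids:
            return parent
    return None


def subregions_depth_first(region, children):
    """List `region` followed by all its descendants, visited depth-first."""
    ordered = [region]
    for child in children[region]:
        ordered.extend(subregions_depth_first(child, children))
    return ordered


all_regions = subregions_depth_first("root", toy_children)
print(all_regions)
print(parent_of("A1", toy_children))
```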

In [None]:
atlas_ontology = config.read_atlas_ontology()
selected_regions = atlas_ontology.get_regions("summary structures")
print(f"You selected {len(selected_regions)} regions to plot.")

#parent_region = atlas_ontology.parent_region
#direct_subregions = atlas_ontology.direct_subregions
#full_name = atlas_ontology.full_name
#regions = atlas_ontology.list_all_subregions("root", mode="depth")

In [None]:
# Plot brain region hierarchy
bap.hierarchy(atlas_ontology)

## Load data

Now, we're ready to read the ```.csv``` files with the cell counts, and also the exclusion files (if there were regions to exclude).  
Below, you have to specify:
- ```animals_root```: Absolute path to the folder that contains the animal folders.
- ```group_1_dirs```: A list of names of the folders corresponding to animals in **Group 1** (e.g., Control group). Note that the results of each animal must be stored in their own folder.
- ```group_2_dirs```: A list of names of the folders corresponding to animals in **Group 2** (e.g., Stress group).
- ```group_1_name```: A meaningful string for Group 1.
- ```group_2_name```: A meaningful string for Group 2.
- ```area_key```: A string of the column in the ```.csv``` files that refers to the size of a brain area
- ```tracer_key```: A string of the column in the ```.csv``` files that refers to the tracer number used to highlight the marker
- ```marker```: A string of the marker we would like to highlight (e.g. CFos)

Try modifying this to obtain densities in mm² (converted from microns).
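
A density is a cell count divided by the area of the region. As a minimal sketch of the unit conversion, assuming the area column is expressed in µm², the density per mm² can be computed as below. The column names `area_um2` and `cFos`, the region labels and the helper `density_per_mm2` are hypothetical and only serve the example.

```python
import pandas as pd

UM2_PER_MM2 = 1_000_000  # 1 mm = 1000 µm, hence 1 mm² = 10^6 µm²


def density_per_mm2(counts, areas_um2):
    """Convert counts and areas in µm² into densities in cells/mm²."""
    return counts / (areas_um2 / UM2_PER_MM2)


# Made-up values for two regions of a single slice.
slice_data = pd.DataFrame(
    {"area_um2": [2_000_000.0, 500_000.0], "cFos": [100, 50]},
    index=["RegionA", "RegionB"],
)
densities = density_per_mm2(slice_data["cFos"], slice_data["area_um2"])
print(densities)
```

Here `RegionA` covers 2 mm² and holds 100 cells, so its density is 50 cells/mm²; `RegionB` covers 0.5 mm² and holds 50 cells, giving 100 cells/mm².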

Now, we load the Control and Stress results separately in two pandas dataframes, and save the results.

**Note**: regions to exclude are automatically excluded.

In [None]:
project_sliced = config.project_from_qupath(sliced=True)

In [None]:
region_name = "root"
bap.plot_region_density(region_name, project_sliced, width=1000, height=500)

In [None]:
project_cvar = project_sliced.to_project(braian.SliceMetrics.CVAR, min_slices=0, fill_nan=False)

In [None]:
ms = dict()
for group_cvar in project_cvar.groups:
    print(f"{group_cvar.name}")
    for marker in group_cvar.markers:
        ms[marker] = ms.get(marker, 0) + sum([(brain_cvar[marker].data >  1).sum() for brain_cvar in group_cvar.animals])
ms

In [None]:
for group_cvar in project_cvar.groups:
    print(f"{group_cvar.name}")
    for marker in group_cvar.markers:
        print(f"\t{marker}: #regions > threshold:",  sum([(brain_cvar[marker].data >  1).sum() for brain_cvar in group_cvar.animals]))
        print(f"\t{marker}: #regions <= threshold:", sum([(brain_cvar[marker].data <= 1).sum() for brain_cvar in group_cvar.animals]))

In [None]:
CVAR_THRESHOLD = 1
atlas_ontology.select_summary_structures() # we want to plot these regions in the coefficient of variation figure
cvar_plot = bap.plot_cv_above_threshold(atlas_ontology, project_sliced, cv_threshold=CVAR_THRESHOLD, width=1000, height=500)
cvar_plot.show()
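
# The coefficient of variation (CV) of a region is the standard deviation of
# its per-slice densities divided by their mean. The sketch below recomputes it
# with plain pandas on made-up numbers, to show what a threshold of 1 means.
# It is an illustration under assumptions, not `braian`'s implementation: it
# uses pandas' default sample standard deviation (ddof=1), and the region and
# slice names are hypothetical.

```python
import pandas as pd

# Made-up densities: one row per slice, one column per region.
densities = pd.DataFrame(
    {"RegionA": [10.0, 10.0, 10.0], "RegionB": [0.0, 0.0, 30.0]},
    index=["slice_1", "slice_2", "slice_3"],
)

mean = densities.mean()
std = densities.std()  # sample standard deviation (ddof=1), an assumption
cvar = std / mean      # NaN when both the mean and the std are zero

CV_THRESHOLD = 1
# Regions whose densities vary a lot from slice to slice.
flagged = cvar[cvar > CV_THRESHOLD].index.tolist()
print(cvar)
print(flagged)
```

# Both regions have the same mean density, but only `RegionB` is flagged,
# because its cells are concentrated in a single slice.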

In [None]:
def print_region_stats(brain: braian.SlicedBrain, region_acronym: str, marker=None):
    brain = braian.SlicedBrain.merge_hemispheres(brain)
    slice_count = brain.count()
    if region_acronym not in slice_count:
        print(f"Can't find region '{region_acronym}' for animal '{brain.name}'")
        return
    markers = brain.markers if marker is None else [marker]
    brain_avg = braian.AnimalBrain.from_slices(brain, mode=braian.SliceMetrics.MEAN,  densities=True)
    brain_std = braian.AnimalBrain.from_slices(brain, mode=braian.SliceMetrics.STD,   densities=True)
    brain_cvar = braian.AnimalBrain.from_slices(brain, mode=braian.SliceMetrics.CVAR, densities=True)
    for m in markers:
        print(f"""Summary for brain region '{region_acronym}' of marker '{m}':
            - N slices: {slice_count[region_acronym]}
            - Mean: {brain_avg[m][region_acronym]:.2f} {m}/mm²,
            - S.D.: {brain_std[m][region_acronym]:.2f} {m}/mm²,
            - Coefficient of Variation: {brain_cvar[m][region_acronym]}
        """)

import pandas as pd
def check_slices(brain: braian.SlicedBrain, region_acronym: str):
    slices = []
    brain = braian.SlicedBrain.merge_hemispheres(brain)
    for slice in brain.slices:
        if region_acronym not in slice.markers_density.index:
            continue
        region_densities = slice.markers_density.loc[region_acronym].copy()
        region_densities.index += " density"
        region_densities.name = slice.name
        slices.append(region_densities)
    return pd.concat(slices, axis=1) if len(slices) != 0 else None

In [None]:
animal_name = "287HC"
region_acronym = "LGv"

from IPython.display import display

if animal_name in project_sliced:
    brain = project_sliced[animal_name]
    print_region_stats(brain, region_acronym, marker=None) #, marker="cFos")
    display(check_slices(brain, region_acronym))
else:
    print(f"Can't find animal '{animal_name}' in the project")

In [None]:
# NOTE: brains are being written WITH Left/Right discrimination
# If you desire to save them without, call AnimalBrain with hemisphere_distinction=False

project = config.project_from_sliced(project_sliced, fill_nan=False)

for group in project.groups:
    group.to_csv(config.output_dir, overwrite=True)
    dgroup = braian.AnimalGroup(group.name, [bas.density(braian.AnimalBrain.merge_hemispheres(a)) for a in group.animals],
                                brain_ontology=atlas_ontology, fill_nan=True)
    dgroup.to_csv(config.output_dir, overwrite=True)
    for animal in group.animals:
        animal = braian.AnimalBrain.merge_hemispheres(animal)
        output = animal.to_csv(config.output_dir, overwrite=True)
        # print(f"{animal} saved to {output}")

In [None]:
import importlib
import sys
__imported_modules = sys.modules.copy()
for module_name, module in __imported_modules.items():
    if not module_name.startswith("braian"): # and not module_name.startswith("bgheatmaps"):
        continue
    try:
        # print("reload:", module_name)
        importlib.reload(module)
    except ModuleNotFoundError:
        continue