## DO NOT TOUCH THIS MASTER VERSION OF THE NOTEBOOK. Create a duplicate notebook under your own name (e.g. marcelo_segmentador.ipynb) for your own use.


# Analyzing Segmentador data
Gabriel Emilio Herrera Oropeza <br>
13/06/2022

## *INSTRUCTIONS TO DEVELOPERS*

**Make sure to fetch and pull the latest code and run `pip install .` before using the Jupyter notebooks. The code is constantly being improved, and the workflow below works best with the newest version.**

This document describes the workflow for analyzing data from segmented images that were previously processed by `insert tool name`. We show how to import, visualise, filter and cluster the data using robust, simple-to-use functions. Many of these functions take arguments that can be modified. To display the usage of a function, run `help(name_of_function)`.

We begin by importing the `tool name` module.

In [None]:
from ngtools.analyzer import Analyzor

## Create Analyzor object class

We provide an `Analyzor` class that stores the segmented nuclei data and facilitates its downstream processing. To construct the object, pass the path to the Segmentador output directory to `Analyzor`:

In [None]:
path_to_experiments = "../data/sample_output"
obj = Analyzor(path_to_experiments, pattern="output*.csv", collated_csv=None)

## Data preprocessing

### Normalize DAPI intensity

In [None]:
obj.normChannel(channel = "dapi", method = "mode", nbins = 100, intensity_type = "total")
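
As a rough illustration of what a histogram-based mode normalization could look like, the sketch below bins the intensities into `nbins` equal-width bins, takes the centre of the most populated bin as the mode, and divides every value by it. This is an assumption about what `method = "mode"` does, not the `ngtools` implementation; the helper names `histogram_mode` and `normalize_by_mode` and the sample values are hypothetical.

```python
from collections import Counter


def histogram_mode(values, nbins=100):
    """Return the centre of the most populated of `nbins` equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # all values identical: the mode is that value
    width = (hi - lo) / nbins
    # Assign each value to a bin; the maximum is clamped into the last bin.
    counts = Counter(min(int((v - lo) / width), nbins - 1) for v in values)
    # Most populated bin; ties are resolved in favour of the lowest bin.
    top_bin = max(counts, key=lambda b: (counts[b], -b))
    return lo + (top_bin + 0.5) * width


def normalize_by_mode(values, nbins=100):
    """Divide every value by the histogram mode (assumed to be non-zero)."""
    mode = histogram_mode(values, nbins)
    return [v / mode for v in values]


# Hypothetical total DAPI intensities: most nuclei near 100, two near 200.
total_dapi = [98.0, 100.0, 101.0, 102.0, 99.5, 100.5, 200.0, 203.0, 55.0]
normalized = normalize_by_mode(total_dapi, nbins=10)
print([round(v, 2) for v in normalized])
```

After this kind of normalization the bulk of the population sits close to 1, so nuclei with roughly twice the DNA content appear close to 2.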

### Select single cells
Identify single cells based on DNA marker content. (Note: we need to provide an option to select the range of the spread around the summed DAPI intensity, since the blebb and no-blebb samples show different spreads.)

In [None]:
obj.findSingleCells(byExperiment = True, nbins = 100, spread = 0.4, channel = None)
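
One way to picture the `spread` parameter is as a window around the main peak of the normalized DAPI signal. The sketch below assumes the peak sits at 1.0 after mode normalization and flags a cell as single when its value lies within `spread` of that peak. This interpretation, the helper `flag_single_cells` and the sample values are assumptions for illustration, not the behaviour of `findSingleCells`.

```python
def flag_single_cells(norm_dapi, spread=0.4, center=1.0):
    """Flag values lying within `spread` of `center` (bounds inclusive)."""
    lower, upper = center - spread, center + spread
    return [lower <= v <= upper for v in norm_dapi]


# Hypothetical mode-normalized total DAPI intensities.
norm_dapi = [0.45, 0.92, 1.00, 1.08, 1.35, 1.95, 2.60]

is_single = flag_single_cells(norm_dapi, spread=0.4)
narrow = flag_single_cells(norm_dapi, spread=0.1)
print(sum(is_single), "cells kept with spread=0.4")
print(sum(narrow), "cells kept with spread=0.1")
```

A narrower spread keeps fewer cells, which is why conditions with differently shaped DAPI distributions may need different values.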

In [None]:
# Keep only single cells
obj.filterCells(filter = "isSingleCell == True", inplace = True)

### Normalize channel intensities

In [None]:
obj.normChannel(channel = "beta3", method = "mode", nbins = 100, intensity_type = "avg")
obj.normChannel(channel = "rfp", method = "mode", nbins = 100, intensity_type = "avg")
obj.normChannel(channel = "ngn", method = "mode", nbins = 100, intensity_type = "avg")

### Filter cells

In [None]:
obj.count(["laminB1_group","gfap_group"])
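
Conceptually, counting over several group columns amounts to tallying each combination of labels. The sketch below shows that idea with `collections.Counter`; the `count_by` helper and the `"high"`/`"low"` labels are hypothetical, and the actual output format of `obj.count` may differ.

```python
from collections import Counter


def count_by(cells, columns):
    """Count cells for each combination of values in `columns`."""
    return Counter(tuple(cell[c] for c in columns) for cell in cells)


# Hypothetical per-cell group labels.
cells = [
    {"laminB1_group": "high", "gfap_group": "low"},
    {"laminB1_group": "high", "gfap_group": "low"},
    {"laminB1_group": "low", "gfap_group": "low"},
    {"laminB1_group": "low", "gfap_group": "high"},
]

counts = count_by(cells, ["laminB1_group", "gfap_group"])
for combo, n in sorted(counts.items()):
    print(combo, n)
```

Running the same count before and after filtering makes it easy to see how many cells each group lost.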

In [None]:
%matplotlib tk
chosen_cells = obj.chooseCells(x = "rfp_group", y = "beta3_group")

In [None]:
%matplotlib inline

In [None]:
obj.showCells(cells = chosen_cells, n=10, ch2show = {'red': "rfp", 'green': "beta3"}, 
              order_by = "avg_intensity_rfp", ascending = True, 
             filter = None, show_nucleus = False)

In [None]:
obj.filterCells(cells = chosen_cells)

In [None]:
obj.count(["laminB1_group","gfap_group"])

## Dimensional reduction and clustering

### Prepare matrix

Below are the nuclear features used for dimensional reduction:

In [None]:
obj.showADataVars()


In [None]:
exclude_feat = ['total_intensity_laminB1', 'gfap_x_actin', 'gfap_x_laminB1', 'actin_x_laminB1', 'gfap_x_actin_x_laminB1',
               'total_intensity_gfap', 'total_intensity_actin','total_intensity_dapi', 'total_intensity_core_dapi', 
                'total_intensity_internal_ring_dapi', 'total_intensity_external_ring_dapi']
obj.excludeVars(vars = exclude_feat)
obj.showADataVars()

In [None]:
## This graphic should be before

In [None]:
obj.plotVarDist(vars = "all", data_type="scaled")

In [None]:
# optional rescaling
obj.normAData(method = "maxabsscaler")
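
The name `maxabsscaler` suggests max-abs scaling, as done by scikit-learn's `MaxAbsScaler`: each feature is divided by its largest absolute value so that all features fall in [-1, 1]. Whether `normAData` wraps that class is an assumption; the plain-Python sketch below only illustrates the arithmetic, using a hypothetical helper and made-up feature values.

```python
def max_abs_scale(rows):
    """Scale each column of `rows` by its maximum absolute value."""
    n_features = len(rows[0])
    # An all-zero column is left unchanged by using a scale of 1.0.
    scales = [max(abs(row[j]) for row in rows) or 1.0 for j in range(n_features)]
    return [[row[j] / scales[j] for j in range(n_features)] for row in rows]


# Hypothetical feature matrix: one row per nucleus, one column per feature.
features = [
    [2.0, -50.0],
    [-4.0, 25.0],
    [1.0, 100.0],
]

scaled = max_abs_scale(features)
print(scaled)
```

Unlike standardization, this rescaling does not shift the data, so zeros stay zeros and the sign of each value is preserved.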

### Cluster and dim reduction

In [None]:
obj.findNeighbours(method = "umap")
obj.findClusters(method = "leiden", res=0.6)
obj.runDimReduc(method = "umap")
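
Graph-based clustering such as Leiden operates on a neighbourhood graph of the cells, which is presumably what `findNeighbours` builds. As a minimal sketch of the underlying idea, the brute-force example below finds the `k` nearest neighbours of each point by Euclidean distance. It is a conceptual illustration with a hypothetical helper and toy coordinates, not the method `ngtools` uses.

```python
import math


def knn_indices(points, k):
    """Return, for each point, the indices of its k nearest other points."""
    neighbours = []
    for i, p in enumerate(points):
        # Distance to every other point, sorted from nearest to farthest.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


# Two well-separated toy groups in a 2D feature space.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.2),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.3),
]

graph = knn_indices(points, k=2)
for i, nbrs in enumerate(graph):
    print(i, "->", nbrs)
```

Each point only links to members of its own group here, which is the structure a community-detection algorithm then turns into clusters.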

In [None]:
# Plot UMAP showing features
obj.plotDim(hue = "leiden", method="umap")

In [None]:
obj.plotDim(hue = "avg_intensity_rfp", method="umap")

In [None]:
obj.plotData(x="leiden", y = "avg_intensity_laminB1", plot_type = "violin")

In [None]:
obj.showCells(RGB_contrasts=[4,3,4], n=5, ch2show={'red':'laminB1', 'green':'gfap'}, filter = "leiden == '0'")

In [None]:
# Plot UMAP showing features
obj.plotDim(hue = "leiden", method="umap")

In [None]:
obj.plotDim(hue = "avg_intensity_rfp", method="umap")

In [None]:
obj.plotData(x="leiden", y = "avg_intensity_rfp", plot_type = "violin")

In [None]:
obj.showCells(RGB_contrasts=[4,3,4], n=5, ch2show={'red':'rfp', 'green':'beta3'}, filter = "leiden == '0'")

#### DIFFMAP

In [None]:
obj.findNeighbours(method = "umap")
obj.findClusters(method = "leiden")
obj.runDimReduc(method = "diffmap")

In [None]:
obj.plotDim(hue = "leiden", method="diffmap")

In [None]:
obj.plotDim(hue = "avg_intensity_gfap", method="diffmap")

#### Pseudotime
Choose a root cell for diffusion pseudotime:

In [None]:
%matplotlib tk
root_cells = obj.chooseCells(reduction = "diffmap")

In [None]:
%matplotlib inline

In [None]:
obj.runPT(root_cells = root_cells)

In [None]:
obj.plotDim(hue = "dpt_pseudotime", method="diffmap")

#### Stacked violin plot

In [None]:
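import matplotlib.pyplot as plt
import scanpy as sc

# NOTE: `adata` (an AnnData object) and `data_cols` (a list of feature names) are not defined above and must be supplied.
fig, ax = plt.subplots(figsize = (5, 7))
sc.pl.stacked_violin(adata, data_cols, groupby = 'experiment', swap_axes = True, ax = ax, dendrogram = True)
fig.tight_layout()
plt.show()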

#### Pseudotime - heatmap

In [None]:
# Enter order of clusters in pseudotime
pseudotime_path = [3,4,7]
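
If the cluster order is not obvious from the plots, one option is to rank clusters by their mean pseudotime. The sketch below does this for per-cell cluster labels and pseudotime values; the helper name and the sample data are hypothetical, and in practice the inputs would come from the `leiden` and `dpt_pseudotime` annotations.

```python
from collections import defaultdict
from statistics import mean


def order_clusters_by_pseudotime(labels, pseudotime):
    """Return cluster labels sorted by their mean pseudotime (ascending)."""
    grouped = defaultdict(list)
    for label, t in zip(labels, pseudotime):
        grouped[label].append(t)
    return sorted(grouped, key=lambda label: mean(grouped[label]))


# Hypothetical per-cell cluster labels and pseudotime values.
labels = ["4", "3", "7", "3", "4", "7"]
pseudotime = [0.45, 0.10, 0.90, 0.20, 0.55, 0.80]

order = order_clusters_by_pseudotime(labels, pseudotime)
print(order)
```

The resulting order is only a starting point; it is still worth checking it against the diffusion map before using it as the path.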

In [None]:
# Heatmap - pseudotime
sc.pl.paga_path(
    adata, 
    pseudotime_path, 
    data_cols,
    show_node_names = True,
    n_avg = 50,
    annotations = ['dpt_pseudotime'],
    show_colorbar = True,
    color_map = 'coolwarm',
    groups_key = 'leiden',
    color_maps_annotations = {'dpt_pseudotime': 'viridis'},
    title = 'Path',
    return_data = False,
    normalize_to_zero_one = True,
    show = True
)

### Save Object