# Analyzing Segmentador data
Gabriel Emilio Herrera Oropeza <br>
13/06/2022

## *INSTRUCTIONS TO DEVELOPERS*

**Make sure to fetch and pull the latest code and run `pip install .` before using the Jupyter notebooks. The code is constantly being improved, and the workflow below works best with the newest version.**

This document describes the workflow to analyze data from segmented images that were previously processed by `insert tool name`. We show how to import, visualise, filter and cluster the data using robust, simple-to-use functions. Many of these functions take arguments that can be modified. To display the usage of a function, run `help(name_of_function)`.
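
As a minimal illustration of inspecting a function's usage, the sketch below uses a hypothetical stand-in function, `show_cell`, which is not part of the package. The same calls should apply to any Python function or method that carries a docstring.

```python
import inspect


def show_cell(n=5, show_nucleus=True):
    """Hypothetical stand-in used only to demonstrate help()."""
    return n, show_nucleus


# Print the docstring and signature to the console.
help(show_cell)

# Collect the default value of each argument programmatically.
signature = inspect.signature(show_cell)
defaults = {name: param.default for name, param in signature.parameters.items()}
```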

We begin by importing the `tool name` module.

In [None]:
from ngtools.analyzer import Analyzor

## Analyzor object

We provide an `Analyzor` object class that facilitates the storage of segmented nuclei data and its downstream processing. To construct this object, run the following:

In [None]:
path_to_experiments = "/media/cdn-bc/RAID/Projects/FH021_Marcelo_seganalysis/outputs/NGNdays3"
nga = Analyzor(path_to_experiments)

## Centering DAPI

In [None]:
nga.ctrDAPI()

### Identify Single Cells
Identify single cells based on DNA marker content.

In [None]:
nga.findSingleCells()
nga.showData()

## Displaying image of cells

The `Analyzor` object holds the path to the image of each cell and can display this image.
Simply run the `showCell` function:

In [None]:
nga.showCell()

Running the above function without any input parameters will invoke an interactive prompt. Alternatively, you may provide the number of cells and the channels to display:

In [None]:
nga.showCell(n=5, ch2show = {'red': "RFP", 'green': "Beta3"})

To maximise the use of RGB channels, `showCell` will show the DAPI/nucleus as a separate layer by default. This can be switched off using the `show_nucleus` flag:

In [None]:
nga.showCell(n=5, ch2show = {'red': "RFP", 'green': "Beta3"}, show_nucleus = False)
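
The mapping passed to `ch2show` assigns one channel name to each RGB colour. As a rough sketch of how such a mapping could be turned into a composite image, the example below stacks named single-channel arrays into an RGB array. This is an assumption for illustration only: `compose_rgb` and the toy arrays are hypothetical, and `showCell` may work differently internally.

```python
import numpy as np


def compose_rgb(channels, ch2show):
    """Stack named single-channel images into an RGB array (hypothetical helper)."""
    shape = next(iter(channels.values())).shape
    rgb = np.zeros(shape + (3,), dtype=float)
    for idx, colour in enumerate(("red", "green", "blue")):
        name = ch2show.get(colour)
        if name is None:
            continue  # colours missing from the mapping stay black
        img = channels[name].astype(float)
        peak = img.max()
        # Scale each channel to the range [0, 1] so the channels are comparable.
        rgb[..., idx] = img / peak if peak > 0 else img
    return rgb


# Toy 2x2 images with made-up intensities.
channels = {
    "RFP": np.array([[0, 2], [4, 8]]),
    "Beta3": np.array([[1, 1], [0, 5]]),
}
rgb = compose_rgb(channels, {"red": "RFP", "green": "Beta3"})
```

Any colour that is absent from the mapping (blue in this case) is left empty in the sketch.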

**TO DO**: Order_by feature

## Plotting data

In [None]:
nga.plotData(x = "nuclear_area", y = "avg_intensity_dapi", hue = "laminB1_group")

In [None]:
nga.plotData(x = "nuclear_area", y = "avg_intensity_dapi", x_trans = "log", y_trans = "log", hue = "experiment")

In [None]:
nga.plotData(x="experiment", y = "nuclear_area", plot_type = "violin")

Check selection of single cells:

In [None]:
nga.plotData("iNs", "isSingleCell", 
                hue = "isSingleCell", alpha = 0.5, 
                y_trans = "log")

In [None]:
# Keep only single cells
nga.filterCells(expr = "isSingleCell == True")
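
The `expr` string resembles the expression syntax of pandas' `DataFrame.query`. Assuming it is evaluated in a similar way (an assumption, not confirmed by the package), the filter above can be pictured on a toy table whose values are made up for illustration:

```python
import pandas as pd

# Toy table; column names mirror those used above, values are invented.
cells = pd.DataFrame({
    "nuclear_area": [120.0, 310.0, 95.0, 150.0],
    "isSingleCell": [True, False, True, True],
})

# Keep only the rows for which the expression evaluates to True.
single_cells = cells.query("isSingleCell == True")

# Number of rows and columns remaining after filtering.
n_rows, n_cols = single_cells.shape
```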

In [None]:
nga.dim()

### Intensity Normalisation
Statistic-based normalisation of intensity data. **Options are: mode, mean, and median.** *nbins* is used only when *method* is *mode*. The DAPI channel is not normalised.

In [None]:
nga.normIntensity(method = "mode", nbins = 100)
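
To make the three options concrete, here is a minimal sketch of statistic-based normalisation on made-up intensities. It assumes that each value is divided by the chosen statistic and that the mode is estimated as the centre of the fullest histogram bin. Both are assumptions for illustration, and `reference_value` and `normalise` are hypothetical helpers rather than the implementation behind `normIntensity`.

```python
import numpy as np


def reference_value(values, method="mode", nbins=100):
    """Return the statistic used as the normalisation reference."""
    values = np.asarray(values, dtype=float)
    if method == "mean":
        return float(values.mean())
    if method == "median":
        return float(np.median(values))
    if method == "mode":
        # Estimate the mode as the centre of the most populated bin.
        counts, edges = np.histogram(values, bins=nbins)
        peak = int(np.argmax(counts))
        return float((edges[peak] + edges[peak + 1]) / 2)
    raise ValueError(f"Unknown method: {method!r}")


def normalise(values, method="mode", nbins=100):
    """Divide every value by the chosen reference statistic."""
    values = np.asarray(values, dtype=float)
    return values / reference_value(values, method=method, nbins=nbins)


# Made-up intensities for a single channel.
intensities = np.array([10.0, 10.0, 10.0, 20.0, 40.0])
norm_by_median = normalise(intensities, method="median")
```

In this sketch `nbins` only matters for the mode, because the mean and median do not require a histogram.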

Observe data before normalisation for a channel. The red line represents the statistical method value used for normalisation.

In [None]:
nga.plotData("experiment", "avg_intensity_rfp", plot_type = "violin", data_type="norm", hue="iNs")

## Data Exploration

### Linear relationships

In [None]:
nga.plotData("avg_intensity_core_dapi", "nuclear_area", plot_type = "line", hue="experiment", data_type="norm")

### Dimension Reduction

In [None]:
nga.colnames()

In [None]:
nga.buildAData(excluded_features=['angle','iNs','total_intensity_core_dapi', 'total_intensity_internal_ring_dapi',
                                 'total_intensity_external_ring_dapi', 'total_intensity_dapi', 'total_intensity_rfp',
                                 'total_intensity_laminB1', 'total_intensity_beta3', 'beta3_x_rfp', 'beta3_x_laminB1',
                                 'rfp_x_laminB1', 'beta3_x_rfp_x_laminB1'])
nga.normAData()

In [None]:
nga.showADataVars()
nga.showADataObs()

#### UMAP

In [None]:
nga.findNeighbours(method = "umap")
nga.findClusters(method = "leiden", res=0.6)
nga.runDimReduc(method = "umap")

In [None]:
# Plot UMAP showing features
nga.plotDim(hue = "iNs", method="umap")
nga.plotDim(hue = "leiden", method="umap")

#### DIFFMAP

In [None]:
nga.findNeighbours(method = "gauss")
nga.findClusters(method = "leiden")
nga.runDimReduc(method = "diffmap")

In [None]:
nga.plotDim(hue = "leiden", method="diffmap")
nga.plotDim(hue = "iNs", method="diffmap")

#### Pseudotime
Choose a root cell for diffusion pseudotime:

In [None]:
nga.runPT(root = 3)

In [None]:
nga.plotDim(hue = "dpt_pseudotime", method="diffmap")

#### Stacked violin plot

In [None]:
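import matplotlib.pyplot as plt
import scanpy as sc

# `adata` (the AnnData object) and `data_cols` (the features to plot) must already be defined.
fig, ax = plt.subplots(figsize = (5, 7))
sc.pl.stacked_violin(adata, data_cols, groupby = 'experiment', swap_axes = True, ax = ax, dendrogram = True)
fig.tight_layout()
plt.show()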

#### Pseudotime - heatmap

In [None]:
# Enter order of clusters in pseudotime
pseudotime_path = [3,4,7]
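
One possible way to choose this order, shown here as a sketch on made-up values, is to rank clusters by their mean pseudotime. The toy `obs` table stands in for the per-cell annotations (for example `adata.obs`), and the column names `leiden` and `dpt_pseudotime` follow those used above. This heuristic is an assumption and does not replace inspecting the plots.

```python
import pandas as pd

# Toy per-cell annotations; cluster labels and pseudotime values are invented.
obs = pd.DataFrame({
    "leiden": ["3", "3", "4", "4", "7", "7"],
    "dpt_pseudotime": [0.0, 0.1, 0.4, 0.5, 0.9, 1.0],
})

# Average pseudotime per cluster, sorted from earliest to latest.
mean_pt = obs.groupby("leiden")["dpt_pseudotime"].mean().sort_values()

# Cluster labels in order of increasing mean pseudotime.
suggested_path = [int(cluster) for cluster in mean_pt.index]
```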

In [None]:
# Heatmap - pseudotime
sc.pl.paga_path(
    adata, 
    pseudotime_path, 
    data_cols,
    show_node_names = True,
    n_avg = 50,
    annotations = ['dpt_pseudotime'],
    show_colorbar = True,
    color_map = 'coolwarm',
    groups_key = 'leiden',
    color_maps_annotations = {'dpt_pseudotime': 'viridis'},
    title = 'Path',
    return_data = False,
    normalize_to_zero_one = True,
    show = True
)

### Save Object

In [None]:
adata.write("/save/path/filename.hdf5")