# Marker Gene Detection
In this notebook we will use a Giotto object that has already been clustered with Leiden clustering. We'll go over the following methods for detecting marker genes:
- Gini-index method
- scran
- MAST

Each method can identify marker genes either between 2 selected (groups of) clusters or for each individual cluster against all the others.

We'll import the Leiden-clustered data directly below.

In [1]:
source("scripts/clustered_obj.R")

Skipping install of 'Giotto' from a github remote, the SHA1 (1b60529f) has not changed since last install.
  Use `force = TRUE` to force installation




 giotto environment found at 
 /Users/Natalie_1/Library/r-miniconda/envs/giotto_env/bin/pythonw 
Giotto environment is already installed, set force_environment = TRUE to reinstall 
Consider to install these (optional) packages to run all possible Giotto commands:  MAST tiff biomaRt trendsceek multinet RTriangle FactoMiner
 Giotto does not automatically install all these packages as they are not absolutely required and this reduces the number of dependencies
 no external python path was provided, but a giotto python environment was found and will be used 

 first scale genes and then cells 
return_plot = TRUE and return_gobject = TRUE 

          plot will not be returned to object, but can still be saved with save_plot = TRUE or manually 
hvg  was found in the gene metadata information and will be used to select highly variable genes 


“You're computing too large a percentage of total singular values, use a standard svd instead.”


### 1. Gini-index method
We can use the Gini-index method to find markers between [2 groups](https://rubd.github.io/Giotto_site/reference/findGiniMarkers.html) of clusters:

In [11]:
# between 2 groups
gini_markers = findGiniMarkers(gobject = my_giotto_object,
                               cluster_column = 'leiden_clus',
                               group_1 = 1,
                               group_2 = 2)
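
To make the idea behind the method concrete: a Gini coefficient measures how unevenly a quantity is spread, so a gene whose expression is concentrated in one cluster scores high and a gene expressed evenly everywhere scores near zero. The sketch below is a toy base-R illustration of that idea only. The `gini()` helper and the per-cluster means are hypothetical, and this is not Giotto's own scoring code, which may combine several scores and ranks.

```r
# Gini coefficient of a non-negative vector:
# 0 = spread evenly, values near 1 = concentrated in few entries.
gini <- function(x) {
  stopifnot(all(x >= 0))
  if (sum(x) == 0) return(0)
  x <- sort(x)
  n <- length(x)
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

# Hypothetical mean expression of two genes across 4 clusters.
cluster_means <- rbind(
  housekeeping = c(5, 5, 5, 5),  # same level in every cluster
  marker       = c(0, 0, 0, 8)   # expressed in a single cluster
)
colnames(cluster_means) <- paste0("cluster_", 1:4)

# One Gini score per gene (row).
gini_scores <- apply(cluster_means, 1, gini)
print(gini_scores)
```

With 4 clusters, a gene expressed in exactly one of them reaches the maximum possible value of 0.75, while the evenly expressed gene scores 0.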

Or between one group and all the [others](https://rubd.github.io/Giotto_site/reference/findGiniMarkers_one_vs_all.html):

In [12]:
# for each cluster
gini_markers = findGiniMarkers_one_vs_all(gobject = my_giotto_object,
                                          cluster_column = 'leiden_clus')


 start with cluster  1 

 start with cluster  2 

 start with cluster  3 

 start with cluster  4 
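
A one-vs-all run returns markers for every cluster in a single table, so a common next step is to keep only the top few genes per cluster. The following is a minimal base-R sketch on a toy table. The column names `genes`, `cluster` and `score` are assumptions made for illustration, so check `colnames(gini_markers)` on your own result and substitute the real ranking column.

```r
# Toy stand-in for a one-vs-all marker table (column names are assumptions).
toy_markers <- data.frame(
  genes   = c("GeneA", "GeneB", "GeneC", "GeneD", "GeneE", "GeneF"),
  cluster = c(1, 1, 1, 2, 2, 2),
  score   = c(0.9, 0.4, 0.7, 0.2, 0.8, 0.6),
  stringsAsFactors = FALSE
)

# Keep the n highest-scoring genes within each cluster.
top_n_per_cluster <- function(markers, n = 2) {
  per_cluster <- split(markers, markers$cluster)
  top <- lapply(per_cluster, function(d) head(d[order(-d$score), ], n))
  out <- do.call(rbind, top)
  rownames(out) <- NULL
  out
}

top_markers <- top_n_per_cluster(toy_markers, n = 2)
print(top_markers)
```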


### 2. Scran

Please note that you'll need to [install scran](https://bioconductor.org/packages/release/bioc/html/scran.html) if working locally. It has already been pre-loaded in the Binder environment for this notebook. 

Now we'll check out scran's implementation of finding markers [between 2 groups](https://rubd.github.io/Giotto_site/reference/findScranMarkers.html):

In [13]:
# between 2 groups
scran_markers = findScranMarkers(gobject = my_giotto_object,
                                 cluster_column = 'leiden_clus',
                                 group_1 = 1,
                                 group_2 = 2)
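
As far as the scran documentation describes, its marker detection is built on pairwise tests between groups of cells, with Welch t-tests as the default. The sketch below shows that kind of test in base R on simulated data. The gene names, group sizes and expression values are made up, and this is an illustration of the statistic, not scran's or Giotto's implementation.

```r
set.seed(1)
n_cells <- 30

# Simulated log-normalized expression: rows are genes, columns are cells.
expr <- rbind(
  up_in_1 = c(rnorm(n_cells, mean = 3), rnorm(n_cells, mean = 0)),
  flat    = rnorm(2 * n_cells, mean = 1)
)
clus <- rep(c(1, 2), each = n_cells)  # cluster label of each cell

# Welch t-test per gene (t.test() uses var.equal = FALSE by default).
welch <- t(apply(expr, 1, function(g) {
  tt <- t.test(g[clus == 1], g[clus == 2])
  c(mean_diff = unname(tt$estimate[1] - tt$estimate[2]),
    p_value   = tt$p.value)
}))
welch <- as.data.frame(welch)

# Adjust for testing many genes at once (Benjamini-Hochberg).
welch$fdr <- p.adjust(welch$p_value, method = "BH")
print(welch)
```

The gene simulated with a higher mean in cluster 1 gets a large positive `mean_diff` and a very small p-value, which is the pattern to look for in a real marker table.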

Or between one group and all the [others](https://rubd.github.io/Giotto_site/reference/findScranMarkers_one_vs_all.html):

In [14]:
# for each cluster
scran_markers = findScranMarkers_one_vs_all(gobject = my_giotto_object,
                                            cluster_column = 'leiden_clus')

using 'Scran' to detect marker genes. If used in published research, please cite:
  Lun ATL, McCarthy DJ, Marioni JC (2016).
  'A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor.'
  F1000Res., 5, 2122. doi: 10.12688/f1000research.9501.2. 




 start with cluster  1 

 start with cluster  2 

 start with cluster  3 

 start with cluster  4 


### 3. MAST
You'll also need to [install MAST](https://bioconductor.org/packages/3.12/bioc/html/MAST.html) if working locally. It's been pre-loaded in the Binder environment for this notebook.

We can run MAST's implementation to find markers between [2 cluster groups](https://rubd.github.io/Giotto_site/reference/findMastMarkers.html):

In [15]:
# between 2 groups
mast_markers = findMastMarkers(gobject = my_giotto_object,
                               cluster_column = 'leiden_clus',
                               group_1 = 1,
                               group_2 = 2)

Assuming data assay in position 1, with name et is log-transformed.


Done!

Combining coefficients and standard errors

Calculating log-fold changes

Calculating likelihood ratio tests

Refitting on reduced model...


Done!
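
The log messages above (fitting, likelihood ratio tests, refitting on a reduced model) reflect the hurdle-model idea behind MAST: one component models whether a gene is detected at all, a second models its expression level in the cells where it is detected, and a likelihood ratio test compares fits with and without the group term. The base-R sketch below reproduces that idea for a single simulated gene. All values are made up, and MAST's real fit includes refinements that are not shown here.

```r
set.seed(42)
n <- 100
group <- factor(rep(c("g1", "g2"), each = n))

# Simulated gene: detected more often, and at a higher level, in g1.
detected <- rbinom(2 * n, 1, prob = ifelse(group == "g1", 0.8, 0.3))
level    <- pmax(rnorm(2 * n, mean = ifelse(group == "g1", 3, 2), sd = 0.5), 0.1)
expr     <- detected * level
is_detected <- expr > 0

# Discrete part: logistic regression on detection, with and without group.
disc_full <- glm(is_detected ~ group, family = binomial)
disc_null <- glm(is_detected ~ 1, family = binomial)
lr_disc   <- disc_null$deviance - disc_full$deviance

# Continuous part: Gaussian model on detected cells only.
pos <- data.frame(y = expr[is_detected], group = group[is_detected])
cont_full <- lm(y ~ group, data = pos)
cont_null <- lm(y ~ 1, data = pos)
lr_cont   <- as.numeric(2 * (logLik(cont_full) - logLik(cont_null)))

# Combined test: sum the two statistics, one degree of freedom each.
lr_total <- lr_disc + lr_cont
p_hurdle <- pchisq(lr_total, df = 2, lower.tail = FALSE)
print(c(lr_disc = lr_disc, lr_cont = lr_cont, p_hurdle = p_hurdle))
```

Splitting the test this way lets a gene count as a marker if it differs in detection rate, in expression level, or in both.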



Or between each cluster and all the [others](https://rubd.github.io/Giotto_site/reference/findMastMarkers_one_vs_all.html):

In [16]:
# for each cluster
mast_markers = findMastMarkers_one_vs_all(gobject = my_giotto_object,
                                          cluster_column = 'leiden_clus')

using 'MAST' to detect marker genes. If used in published research, please cite:
  McDavid A, Finak G, Yajima M (2020).
  MAST: Model-based Analysis of Single Cell Transcriptomics. R package version 1.14.0,
  https://github.com/RGLab/MAST/.




 start with cluster  1 


Assuming data assay in position 1, with name et is log-transformed.


Done!

Combining coefficients and standard errors

Calculating log-fold changes

Calculating likelihood ratio tests

Refitting on reduced model...


Done!




 start with cluster  2 


Assuming data assay in position 1, with name et is log-transformed.


Done!

Combining coefficients and standard errors

Calculating log-fold changes

Calculating likelihood ratio tests

Refitting on reduced model...


Done!




 start with cluster  3 


Assuming data assay in position 1, with name et is log-transformed.


Done!

Combining coefficients and standard errors

Calculating log-fold changes

Calculating likelihood ratio tests

Refitting on reduced model...


Done!




 start with cluster  4 


Assuming data assay in position 1, with name et is log-transformed.


Done!

Combining coefficients and standard errors

Calculating log-fold changes

Calculating likelihood ratio tests

Refitting on reduced model...


Done!



### 4. All markers
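We can also use a single wrapper that dispatches to one of the three methods above for [2 groups](https://rubd.github.io/Giotto_site/reference/findMarkers.html). The method is selected with the `method` argument and defaults to scran when none is given: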

In [17]:
# between 2 groups, using the wrapper's default method
markers = findMarkers(gobject = my_giotto_object,
                      cluster_column = 'leiden_clus',
                      group_1 = 1,
                      group_2 = 2)

Or for each cluster against all the [others](https://rubd.github.io/Giotto_site/reference/findMarkers_one_vs_all.html):

In [18]:
# for each cluster, using the wrapper's default method
markers = findMarkers_one_vs_all(gobject = my_giotto_object,
                                 cluster_column = 'leiden_clus')

using 'Scran' to detect marker genes. If used in published research, please cite:
  Lun ATL, McCarthy DJ, Marioni JC (2016).
  'A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor.'
  F1000Res., 5, 2122. doi: 10.12688/f1000research.9501.2. 




 start with cluster  1 

 start with cluster  2 

 start with cluster  3 

 start with cluster  4 
