# CRI iAtlas notebooks
## Exploring the pseudobulk single-cell RNAseq data available in iAtlas

Repo: https://github.com/CRI-iAtlas/iatlas-notebooks/ 

Notebook: query_iatlas_single_cell_datasets.ipynb 

Date: September 13, 2024 

Author: Carolina Heimann

---

landing page: https://www.cri-iatlas.org/

portal: https://isb-cgc.shinyapps.io/iatlas/

email: support@cri-iatlas.org

---

## Getting started

In [None]:
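# Install any missing packages, then load them.
# Note: install.packages() only searches the configured repositories (CRAN by default);
# a package that is not available there must be installed from its own source beforehand.
try({
    packages <- c("magrittr", "dplyr", "tidyr", "ggplot2", "iatlasGraphQLClient")

    invisible(sapply(packages, function(x) {
      # Install only when the package is missing
      if (!requireNamespace(x, quietly = TRUE)) {
        install.packages(x)
      }
      # Load the package whether or not it was just installed
      suppressPackageStartupMessages(library(x, character.only = TRUE))
    }))},
    silent = TRUE
)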

# Exploring the single-cell datasets and features


The iAtlas single-cell RNAseq data is stored in a database that can be queried with functions from the `iatlasGraphQLClient` package. 
The database provides clinical annotation, pseudobulk gene expression, and immune features.

As a first step, let's take a look at the available datasets and features.

## Datasets available

In [1]:
#single cell datasets that we have in the iAtlas database
sc_datasets <- iatlasGraphQLClient::query_datasets(types = "scrna")
sc_datasets

display,name,type
<chr>,<chr>,<chr>
Bi 2021 - ccRCC - PD-1,Bi_2021,scrna
"Krishna 2021 - ccRCC, PD-1",Krishna_2021,scrna
Li 2022 - ccRCC,Li_2022,scrna
MSK - SCLC,MSK,scrna
"Shiao 2024 - BRCA, PD-1",Shiao_2024,scrna
Vanderbilt - colon polyps,Vanderbilt,scrna
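
The returned table is a regular data frame, so it can be filtered like any other. The sketch below is an illustration only: it rebuilds the rows printed above as a local data frame (the name `sc_datasets_example` is hypothetical) so it runs without a database connection, and it assumes that a dataset's `display` label mentions the therapy when one applies.

```r
# Local copy of the table printed above (values copied from the output).
sc_datasets_example <- data.frame(
  display = c("Bi 2021 - ccRCC - PD-1", "Krishna 2021 - ccRCC, PD-1",
              "Li 2022 - ccRCC", "MSK - SCLC",
              "Shiao 2024 - BRCA, PD-1", "Vanderbilt - colon polyps"),
  name = c("Bi_2021", "Krishna_2021", "Li_2022", "MSK",
           "Shiao_2024", "Vanderbilt"),
  type = "scrna",
  stringsAsFactors = FALSE
)

# Keep the datasets whose display label mentions PD-1.
is_pd1 <- grepl("PD-1", sc_datasets_example$display, fixed = TRUE)
pd1_datasets <- sc_datasets_example[is_pd1, ]

# Dataset names to pass on to later queries.
pd1_datasets$name
```

With the live query result, the same filter can be applied to `sc_datasets` directly.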


## Immune Features

In [3]:
#immune features available for the Bi_2021 cohort
features_df <- iatlasGraphQLClient::query_features(cohorts = 'Bi_2021')
head(features_df)

name,display,class,order,unit,method_tag
<chr>,<chr>,<chr>,<int>,<chr>,<chr>
age_at_diagnosis,Age At Diagnosis,Clinical,,Year,
Module3_IFN_score,IFN-gamma Response,Core Expression Signature,1.0,Score,ExpSig
LIexpression_score,Lymphocyte Infiltration,Core Expression Signature,4.0,Score,ExpSig
CSF1_response,Macrophage Regulation,Core Expression Signature,3.0,Score,ExpSig
Module11_Prolif_score,Proliferation,Auxiliary Expression Signature,1.0,Score,ExpSig
TGFB_score_21050467,TGF-beta Response,Core Expression Signature,2.0,Score,ExpSig
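
As a small sketch of how the `class` and `order` columns can be used, the example below selects the core expression signatures and sorts them by `order`. It works on a local copy of the six rows printed above (the name `features_example` is hypothetical); treating `order` as a display order is an assumption based on the column name.

```r
# Local copy of the rows printed above (a subset of the columns).
features_example <- data.frame(
  name = c("age_at_diagnosis", "Module3_IFN_score", "LIexpression_score",
           "CSF1_response", "Module11_Prolif_score", "TGFB_score_21050467"),
  display = c("Age At Diagnosis", "IFN-gamma Response", "Lymphocyte Infiltration",
              "Macrophage Regulation", "Proliferation", "TGF-beta Response"),
  class = c("Clinical", "Core Expression Signature", "Core Expression Signature",
            "Core Expression Signature", "Auxiliary Expression Signature",
            "Core Expression Signature"),
  order = c(NA, 1L, 4L, 3L, 1L, 2L),
  stringsAsFactors = FALSE
)

# Select one feature class and sort it by the `order` column.
core <- features_example[features_example$class == "Core Expression Signature", ]
core <- core[order(core$order), ]
core$display

# Count how many features fall in each class.
class_counts <- table(features_example$class)
class_counts
```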


## Clinical Annotation

In [4]:
#clinical annotation that is available for the sc datasets
clinical_options <- iatlasGraphQLClient::query_tags(datasets = sc_datasets$name)
head(clinical_options)

tag_name,tag_long_display,tag_short_display,tag_characteristics,tag_color,tag_order,tag_type
<chr>,<chr>,<chr>,<chr>,<lgl>,<int>,<chr>
Biopsy_Site,Biopsy Site,Biopsy Site,Site where sample was collected from.,,18,parent_group
Cancer_Tissue,Cancer Tissue,Cancer Tissue,Original tumor tissue.,,14,parent_group
Clinical_Benefit,Clinical Benefit,Clinical Benefit,Patients have clinical benefit when mRECIST response is different than Progressive Disease.,,4,parent_group
Clinical_Stage,Clinical Stage,Clinical Stage,Clinical stage of cancer.,,17,parent_group
FFPE,FFPE Samples,FFPE Samples,Indicates whether the sample is FFPE or not.,,20,parent_group
ICI_Pathway,ICI Pathway,ICI Pathway,Pathway that is being targeted by the ICI treatment.,,6,parent_group
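
The tag table can be reordered and turned into a lookup from tag name to display label. The sketch below uses a local copy of the six rows printed above (the names `tags_example` and `tag_labels` are hypothetical), so it runs offline. The same steps should apply to `clinical_options` if the live query returns the columns shown.

```r
# Local copy of the rows printed above (a subset of the columns).
tags_example <- data.frame(
  tag_name = c("Biopsy_Site", "Cancer_Tissue", "Clinical_Benefit",
               "Clinical_Stage", "FFPE", "ICI_Pathway"),
  tag_short_display = c("Biopsy Site", "Cancer Tissue", "Clinical Benefit",
                        "Clinical Stage", "FFPE Samples", "ICI Pathway"),
  tag_order = c(18L, 14L, 4L, 17L, 20L, 6L),
  tag_type = "parent_group",
  stringsAsFactors = FALSE
)

# Sort the tags by tag_order.
tags_sorted <- tags_example[order(tags_example$tag_order), ]
tags_sorted$tag_name

# Named vector: look up a display label by tag name.
tag_labels <- setNames(tags_example$tag_short_display, tags_example$tag_name)
tag_labels[["FFPE"]]
```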


## Gene Expression

In [15]:
#genes with expression data for the MSK cohort (we will query expression values in the next section)
genes_df <- iatlasGraphQLClient::query_genes(cohorts = 'MSK')
head(genes_df)

ERROR: Error: Gateway Timeout (HTTP 504)
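
A gateway timeout like the one above can be transient, so one option is to wrap the query in a small retry helper. The sketch below is not part of `iatlasGraphQLClient`: `with_retry` and `flaky_query` are hypothetical names, and the simulated query only mimics a call that fails twice before succeeding. Retrying may not help if the request is simply too large for the server.

```r
# Call query_fn() up to max_attempts times, doubling the wait after each failure.
with_retry <- function(query_fn, max_attempts = 3, wait_seconds = 0) {
  last_error <- NULL
  for (attempt in seq_len(max_attempts)) {
    result <- tryCatch(query_fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_attempts) {
      Sys.sleep(wait_seconds * 2^(attempt - 1))
    }
  }
  # All attempts failed: re-raise the last error.
  stop(last_error)
}

# Simulated query (hypothetical): fails twice, then returns a placeholder table.
calls <- 0
flaky_query <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("Gateway Timeout (HTTP 504)")
  data.frame(gene = c("GENE_A", "GENE_B"), stringsAsFactors = FALSE)
}

genes_example <- with_retry(flaky_query)
genes_example

# With the real client, the call from the cell above would be wrapped as:
# genes_df <- with_retry(
#   function() iatlasGraphQLClient::query_genes(cohorts = "MSK"),
#   wait_seconds = 5
# )
```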
