# Inference of multimodal GRNs

To infer gene regulatory networks (GRNs) from transcriptomics and chromatin accessibility data at single-cell resolution, we will use the Python framework CellOracle ([docs](https://morris-lab.github.io/CellOracle.documentation), [paper](https://doi.org/10.1038/s41586-022-05688-9)).

CellOracle first generates a "scaffold" of regulatory links from the scATAC-seq data alone, and then contextualizes those links for each cell type cluster using scRNA-seq data:

<img src="../src/celloracle.png" width="600" height="400" />


## Scaffold of regulatory links
In this notebook, the scaffold GRN has already been generated to save time. However, since it is one of the most crucial steps, we will review how it is done:

1. Inference of co-accessibility with Cicero ([docs](https://morris-lab.github.io/CellOracle.documentation/notebooks/01_ATAC-seq_data_processing/option1_scATAC-seq_data_analysis_with_cicero/01_atacdata_analysis_with_cicero_and_monocle3.html))
2. Transcription start site (TSS) annotation ([docs](https://morris-lab.github.io/CellOracle.documentation/notebooks/01_ATAC-seq_data_processing/option1_scATAC-seq_data_analysis_with_cicero/02_preprocess_peak_data.html))
3. Transcription factor (TF) motif scanning ([docs](https://morris-lab.github.io/CellOracle.documentation/notebooks/02_motif_scan/02_atac_peaks_to_TFinfo_with_celloracle_20200801.html))

Explore these vignettes to better grasp how the pipeline works. Can you answer these questions?

- Co-accessibility: `run_cicero` has a `window` parameter, set to 500 kb by default. How do you think it affects the results?
- TSS annotation: Do you think TSS annotation is robust? Where can it fail?
- Motif scanning: What happens if you change the scanning algorithm? And the motif database?
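
As a starting point for the co-accessibility question, here is a minimal sketch of how a distance cutoff bounds which peak pairs can be linked at all. It is not how Cicero computes co-accessibility: the function names and the peaks are hypothetical, and the only assumption carried over is that peak IDs follow the `chrom_start_end` pattern.

```python
from itertools import combinations


def parse_peak(peak_id):
    """Split a 'chrom_start_end' peak ID into (chromosome, midpoint)."""
    # rsplit keeps chromosome names that contain underscores intact
    chrom, start, end = peak_id.rsplit("_", 2)
    return chrom, (int(start) + int(end)) // 2


def candidate_pairs(peak_ids, window):
    """Return peak pairs on the same chromosome whose midpoints lie within `window` bp."""
    pairs = []
    for a, b in combinations(peak_ids, 2):
        chrom_a, mid_a = parse_peak(a)
        chrom_b, mid_b = parse_peak(b)
        if chrom_a == chrom_b and abs(mid_a - mid_b) <= window:
            pairs.append((a, b))
    return pairs


# Hypothetical peaks, not taken from the workshop data
peaks = [
    "chr1_1000_1500",
    "chr1_200000_200500",
    "chr1_900000_900500",
    "chr2_1000_1500",
]

for window in (100_000, 500_000, 1_000_000):
    n_pairs = len(candidate_pairs(peaks, window))
    print(f"window={window:>9,} bp -> {n_pairs} candidate pairs")
```

A small window can miss distal enhancers, while a large one admits more candidate pairs, which increases both the computation and the chance of spurious links.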

## Contextualization with transcriptomics

From the scaffold regulatory links, build GRNs for each of the cell type clusters in your selected trajectory following this [vignette](https://morris-lab.github.io/CellOracle.documentation/notebooks/04_Network_analysis/Network_analysis_with_Paul_etal_2015_data.html). Can you answer these questions?

- How many TFs do your GRNs have? And TF-Gene edges?
- How similar or different are your GRNs? What metrics can you use?
- Which TFs seem to play a big role in your trajectory? Do they make biological sense?
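
One possible way to approach the first two questions is sketched below, assuming each cluster's GRN has already been reduced to a list of `(TF, target)` pairs (for example, from the source and target columns of a link table). The helper names and the toy edges are hypothetical, and Jaccard similarity is only one of several metrics you could choose.

```python
def summarize(edges):
    """Count distinct TFs (edge sources) and distinct TF-gene edges."""
    edge_set = set(edges)
    n_tfs = len({tf for tf, _ in edge_set})
    return {"n_tfs": n_tfs, "n_edges": len(edge_set)}


def jaccard(edges_a, edges_b):
    """Jaccard similarity of two edge sets: |intersection| / |union|."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy GRNs for two hypothetical clusters (illustrative edges only)
grns = {
    "cluster_A": [("Gata1", "Klf1"), ("Gata1", "Epor"), ("Spi1", "Csf1r")],
    "cluster_B": [
        ("Gata1", "Klf1"),
        ("Spi1", "Csf1r"),
        ("Spi1", "Irf8"),
        ("Cebpa", "Csf3r"),
    ],
}

for name, edges in grns.items():
    print(name, summarize(edges))

print("Jaccard A vs B:", jaccard(grns["cluster_A"], grns["cluster_B"]))
```

Jaccard similarity ignores edge weights and signs, so two GRNs that share edges but differ in regulatory strength will still look similar. Comparing edge coefficients or per-TF centrality scores would capture that difference.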

To load the inferred scaffold regulatory links and your trajectory, run this:

```python
import celloracle as co
import scanpy as sc
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

# Load base GRN
base_GRN = pd.read_csv('base_GRN_dataframe.csv', index_col=0)

# Read trajectory AnnData
adata = sc.read_h5ad('name_of_your_trajectory.h5ad')
```
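
Before building the cluster-specific GRNs, it can help to check what the scaffold contains. The sketch below runs on a toy table and assumes the base GRN has a `peak_id` column, a `gene_short_name` column, and one 0/1 column per TF. Check `base_GRN.head()` to confirm that your file matches this layout. The function name and the toy values are hypothetical.

```python
import pandas as pd

# Toy stand-in for base_GRN (assumed layout: peak, gene, then one 0/1 column per TF)
toy_base_GRN = pd.DataFrame({
    "peak_id": ["chr1_1000_1500", "chr1_200000_200500", "chr2_1000_1500"],
    "gene_short_name": ["GeneA", "GeneA", "GeneB"],
    "Gata1": [1, 0, 1],
    "Spi1": [0, 1, 1],
    "Cebpa": [0, 0, 0],
})


def summarize_base_grn(df):
    """Summarize peaks, genes, TFs, and candidate TF-gene links in a base GRN table."""
    tf_cols = [c for c in df.columns if c not in ("peak_id", "gene_short_name")]
    # A TF is a candidate regulator of a gene if its motif occurs in any peak linked to that gene
    tf_by_gene = df.groupby("gene_short_name")[tf_cols].max()
    return {
        "n_peaks": df["peak_id"].nunique(),
        "n_genes": df["gene_short_name"].nunique(),
        "n_tfs": len(tf_cols),
        "n_tfs_with_targets": int((tf_by_gene.sum(axis=0) > 0).sum()),
        "n_candidate_edges": int(tf_by_gene.to_numpy().sum()),
    }


print(summarize_base_grn(toy_base_GRN))
```

If the layout assumption holds, calling `summarize_base_grn(base_GRN)` on the loaded scaffold gives an upper bound on the TF-gene edges that the contextualization step can retain.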