# Overview
This notebook shows how to use CEFCON for the following tasks:
- constructing cell-lineage-specific GRNs,
- identifying driver regulators,
- finding regulon-like gene modules,
- visualizing and analyzing the results.

We use the scRNA-seq data from Nestorowa et al. (2016, Blood), which profiles the differentiation of mouse hematopoietic stem and progenitor cells.

### Contents
- [1. Load the preprocessed data](#section1)
- [2. Construct cell-lineage-specific GRNs](#section2)
- [3. Identify driver regulators for each lineage](#section4)
- [4. Identify regulon-like gene modules (RGMs)](#section5)
- [5. Visualization and analyses of results](#section6)

In [2]:
import cefcon as cf
import matplotlib.pyplot as plt

import warnings
warnings.filterwarnings("ignore")

In [3]:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
plt.rcParams["savefig.dpi"] = 300
plt.rcParams["figure.figsize"] = [6, 4.5]

<a id = "section1"> </a>
## 1. Load the preprocessed data

Here, we use the mouse hematopoiesis data provided by [Nestorowa et al. (2016, Blood)](https://doi.org/10.1182/blood-2016-05-716480). See this [notebook]() for details on scRNA-seq data preprocessing.

In [4]:
# Load the preprocessed data
adata = cf.datasets.mouse_hsc_nestorowa16()
adata

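Load the prior gene interaction network. The NicheNet network uses human gene symbols, so it is converted to mouse gene symbols afterwards: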

In [None]:
prior_network = cf.datasets.load_human_prior_interaction_network(dataset='nichenet')
# Convert the gene symbols of the prior gene interaction network to the mouse gene symbols
prior_network = cf.datasets.convert_human_to_mouse_network(prior_network)
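To make the conversion step concrete, here is a minimal sketch of how an edge list might be mapped from human to mouse gene symbols. It is an illustration only, not CEFCON's implementation: the `HUMAN_TO_MOUSE` table, the `convert_edges` helper, and the placeholder symbol `ZNF000` are all hypothetical, and how `convert_human_to_mouse_network` treats unmapped or one-to-many orthologs is not shown in this notebook.

```python
# Hypothetical human-to-mouse ortholog table (illustration only).
# A real table would come from an ortholog resource, not from this dict.
HUMAN_TO_MOUSE = {"GATA1": "Gata1", "SPI1": "Spi1", "TAL1": "Tal1"}


def convert_edges(edges, mapping):
    """Map both endpoints of each (source, target) edge.

    Edges with an endpoint missing from the mapping are dropped
    (an assumption made for this sketch).
    """
    converted = []
    for source, target in edges:
        if source in mapping and target in mapping:
            converted.append((mapping[source], mapping[target]))
    # Remove duplicate edges while preserving the original order.
    return list(dict.fromkeys(converted))


# "ZNF000" is a made-up symbol standing in for a gene without an ortholog.
human_edges = [("GATA1", "TAL1"), ("SPI1", "GATA1"), ("GATA1", "ZNF000")]
mouse_edges = convert_edges(human_edges, HUMAN_TO_MOUSE)
print(mouse_edges)
```

Dropping unmapped edges keeps the sketch simple; a real pipeline may prefer to log them so that the loss of prior information can be checked.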

Alternatively, you can set `prior_network` to the file path of a prior interaction network; the `cf.data_preparation` function then imports the specified file.

In [5]:
prior_network = './Reference_Networks/combined_network_Mouse.txt'
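Before passing a custom file, it can help to inspect it. The sketch below reads a delimited edge list and counts duplicate edges. It assumes a tab-separated file with a header row and the source and target genes in the first two columns; the column names `from` and `to`, the in-memory sample, and the `read_edge_list` helper are hypothetical, and the actual layout of `combined_network_Mouse.txt` is not shown in this notebook.

```python
import csv
import io

# In-memory stand-in for a prior network file (hypothetical content).
sample = "from\tto\nGata1\tTal1\nSpi1\tGata1\nGata1\tTal1\n"


def read_edge_list(handle, delimiter="\t"):
    """Return the header row and a list of (source, target) pairs."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    # Keep only rows that have at least a source and a target column.
    edges = [(row[0], row[1]) for row in reader if len(row) >= 2]
    return header, edges


header, edges = read_edge_list(io.StringIO(sample))
n_duplicates = len(edges) - len(set(edges))
print(header, len(edges), n_duplicates)
```

To check a real file, replace `io.StringIO(sample)` with an open file handle, for example `open(path, newline="")`.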

<a id = "section2"> </a>
## 2. Construct cell-lineage-specific GRNs


In [6]:
data = cf.data_preparation(adata, prior_network)

Assign a CUDA device:

In [7]:
CUDA = '0'

Lineage-by-lineage computation:

In [8]:
%%time
cefcon_results_dict = {}
for li, data_li in data.items():
    # We suggest using multiple repeats to reduce the run-to-run randomness of the results.
    cefcon_GRN_model = cf.NetModel(epochs=350, repeats=3, cuda=CUDA, seed=-1)
    cefcon_GRN_model.run(data_li)

    cefcon_results = cefcon_GRN_model.get_cefcon_results(edge_threshold_avgDegree=8)
    cefcon_results_dict[li] = cefcon_results
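The `edge_threshold_avgDegree` argument presumably controls how sparse the final network is. The following is a minimal sketch of one plausible reading, in which the number of retained edges equals the target average degree multiplied by the number of genes. This is an assumption made for illustration, not CEFCON's confirmed behavior; `threshold_by_avg_degree` and the toy scores are hypothetical.

```python
def threshold_by_avg_degree(scored_edges, avg_degree):
    """Keep the top-scoring edges, at most avg_degree * n_genes of them.

    scored_edges: list of (source, target, score) tuples.
    The gene count is taken from the candidate edges themselves.
    """
    genes = {g for source, target, _ in scored_edges for g in (source, target)}
    n_keep = int(avg_degree * len(genes))
    # Rank edges from the highest to the lowest score.
    ranked = sorted(scored_edges, key=lambda edge: edge[2], reverse=True)
    return ranked[:n_keep]


# Toy candidate edges over four genes (scores are made up).
candidates = [
    ("A", "B", 0.9), ("B", "C", 0.8), ("C", "D", 0.1),
    ("A", "C", 0.7), ("D", "A", 0.3), ("B", "D", 0.5),
]
kept = threshold_by_avg_degree(candidates, avg_degree=1)
print(kept)
```

Under this reading, a larger value yields a denser network, so `edge_threshold_avgDegree=8` would retain roughly eight edges per gene on average.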

<a id = "section4"> </a>
## 3. Identify driver regulators for each lineage


In [9]:
%%time
for li, result_li in cefcon_results_dict.items():
    print(f'Lineage - {li}:')
    result_li.gene_influence_score()
    result_li.driver_regulators()

<a id = "section5"> </a>
## 4. Identify regulon-like gene modules (RGMs)


In [10]:
%%time
for li, result_li in cefcon_results_dict.items():
    print(f'Lineage - {li}:')
    result_li.RGM_activity()

<a id = "section6"> </a>
## 5. Visualization and analyses of results

In [11]:
# Check the names of lineages
print(list(data.keys()))

Here, we choose one lineage for further analysis.

In [12]:
lineage = 'E_pseudotime'
result = cefcon_results_dict[lineage]

\> Plot degree distribution of the inferred lineage-specific GRN and show some key topological metrics

In [13]:
result.plot_network_degree_distribution()
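For readers unfamiliar with degree distributions, the sketch below computes one by hand for a small directed network. It is a generic illustration on a toy edge list, not a reproduction of what `plot_network_degree_distribution` computes or reports; the `degree_distribution` helper and the example edges are hypothetical.

```python
from collections import Counter


def degree_distribution(edges):
    """Return (histogram, in_degree, out_degree) for directed edges.

    histogram maps a total degree (in + out) to the number of genes
    that have that degree.
    """
    out_degree = Counter(source for source, _ in edges)
    in_degree = Counter(target for _, target in edges)
    genes = set(out_degree) | set(in_degree)
    # Counter returns 0 for missing keys, so the sum is safe for all genes.
    total = {g: out_degree[g] + in_degree[g] for g in genes}
    return Counter(total.values()), in_degree, out_degree


# Toy regulator -> target edges (illustration only).
toy_edges = [("Gata1", "Tal1"), ("Gata1", "Klf1"), ("Spi1", "Gata1"), ("Tal1", "Klf1")]
histogram, in_degree, out_degree = degree_distribution(toy_edges)
print(sorted(histogram.items()))
```

In a directed GRN, out-degree counts the targets of a regulator and in-degree counts the regulators of a gene, so the two are often examined separately as well.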

\> Plot gene embedding clusters

In [14]:
result.plot_gene_embedding_with_clustering(n_neighbors=30, resolution=1)

\> Plot influence scores of driver regulators

In [15]:
result.plot_influence_score()

\> Plot Venn diagram of MDS (minimum dominating set) driver genes, MFVS (minimum feedback vertex set) driver genes, and top-ranked genes

In [16]:
result.plot_driver_genes_Venn()
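The regions of a three-set Venn diagram can also be listed explicitly with plain set operations, which is handy when the gene names in each overlap are needed. The sketch below is a generic illustration; the `venn3_regions` helper and the three toy gene sets are hypothetical and are not taken from the CEFCON results.

```python
def venn3_regions(a, b, c):
    """Return the seven disjoint regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "only_a": a - b - c,
        "only_b": b - a - c,
        "only_c": c - a - b,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "all_three": a & b & c,
    }


# Toy gene sets standing in for MDS, MFVS, and top-ranked genes.
mds_genes = {"Gata1", "Spi1", "Tal1"}
mfvs_genes = {"Gata1", "Klf1", "Tal1"}
top_ranked_genes = {"Gata1", "Spi1", "Cebpa"}

regions = venn3_regions(mds_genes, mfvs_genes, top_ranked_genes)
for name, members in regions.items():
    print(name, sorted(members))
```

The seven regions are disjoint and together cover the union of the three sets, so their sizes match the counts drawn in a Venn diagram.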

\> Plot heat map of the activity matrix of RGMs

In [17]:
result.plot_RGM_activity_heatmap(cell_label=None, type='out')

\> Plot network with user-specified genes

In [18]:
genes = result.driver_regulator.index[0:30]

network = result.plot_network(genes)

\> Plot controllability metrics of all the lineages

In [19]:
cf.utils.plot_controllability_metrics(cefcon_results_dict, return_value=False)
