# How Many Receptors Are in Each NeuroMMSig Signature?

**Author:** [Charles Tapley Hoyt](https://github.com/cthoyt)

**Estimated Run Time:** 10 seconds

This notebook uses PyBEL and Bio2BEL HGNC to assess how many HGNC genes belonging to gene families whose names contain the word "receptor" are present in each NeuroMMSig signature in the context of Alzheimer's disease.

## Import

In [21]:
import sys
import os
import time

import bio2bel_hgnc
import pandas as pd
import pybel
import pybel_tools

from bio2bel_hgnc.models import GeneFamily
from pybel_tools import selection, summary, utils

## Environment

In [3]:
print(sys.version)

3.6.3 (default, Oct  9 2017, 09:47:56) 
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.37)]


In [2]:
time.asctime()

'Tue Mar  6 14:14:55 2018'

## Dependencies

In [4]:
pybel.utils.get_version()

'0.11.2-dev'

In [5]:
pybel_tools.utils.get_version()

'0.5.2-dev'

# Identifying Receptor-Encoding Genes

This section requires that the `bio2bel_hgnc` package is installed and populated.

In [6]:
hgnc_manager = bio2bel_hgnc.Manager()
hgnc_manager

<Manager connection=mysql+mysqldb://root@localhost/pybel?charset=utf8>

In [7]:
hgnc_manager.summarize()

{'families': 1092, 'genes': 42472, 'uniprots': 20037}

In [None]:
receptor_families = hgnc_manager.session.query(GeneFamily).filter(GeneFamily.family_name.contains('receptor')).all()

How many receptor families are there?

In [8]:
len(receptor_families)

136

In [None]:
receptor_genes = {gene for family in receptor_families for gene in family.hgncs}

How many unique genes belong to one (or more) of these families?

In [9]:
len(receptor_genes)

2139
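The two steps above (filter families by name, then pool their genes) can be illustrated without a database. The following is a minimal sketch, assuming a hypothetical `families` mapping from family name to gene symbols in place of the `GeneFamily` query; the helper name and the toy data are made up for illustration. Lowercasing the name before matching is also an assumption, since the case sensitivity of the SQL `contains` filter depends on the database.

```python
def collect_receptor_genes(families):
    """Pool the genes of every family whose name contains 'receptor'."""
    receptor_families = {
        name: genes
        for name, genes in families.items()
        if 'receptor' in name.lower()  # assumption: case-insensitive match
    }
    # A set comprehension counts a gene once, even if it is in several families
    return {gene for genes in receptor_families.values() for gene in genes}


# Hypothetical toy data, not taken from HGNC
families = {
    'Toy receptors A': {'GENE1', 'GENE2'},
    'Toy receptors B': {'GENE2', 'GENE3'},
    'Toy kinases': {'GENE4'},
}

toy_receptor_genes = collect_receptor_genes(families)
print(len(toy_receptor_genes))
```

Here `GENE2` belongs to two receptor families but contributes only once to the pooled set, which is why the gene count answers "one (or more) of these families".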

# Load NeuroMMSig Signatures

This section requires the `BMS_BASE` environment variable to be set.

In [10]:
neurommsig_ad_path = os.path.join(os.environ['BMS_BASE'], 'aetionomy', 'alzheimers', 'alzheimers.gpickle')

assert os.path.exists(neurommsig_ad_path)
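If `BMS_BASE` is unset, `os.environ['BMS_BASE']` raises a bare `KeyError`. A minimal sketch of a more descriptive check follows; the helper name `get_neurommsig_ad_path` is hypothetical, and the path components are copied from the cell above.

```python
import os


def get_neurommsig_ad_path(environ=None):
    """Build the path to the Alzheimer's disease pickle from BMS_BASE."""
    # Accepting a mapping makes the helper easy to try without touching os.environ
    environ = os.environ if environ is None else environ
    base = environ.get('BMS_BASE')
    if not base:
        raise RuntimeError('Set the BMS_BASE environment variable before running this section')
    return os.path.join(base, 'aetionomy', 'alzheimers', 'alzheimers.gpickle')
```

Calling `get_neurommsig_ad_path()` with no argument reads the real environment; the existence check with `os.path.exists` still applies to the returned path.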

In [11]:
neurommsig_ad = pybel.from_pickle(neurommsig_ad_path)

In [None]:
signatures = pybel_tools.selection.get_subgraphs_by_annotation(neurommsig_ad, 'Subgraph')

How many signatures does NeuroMMSig contain in the context of Alzheimer's disease?

In [12]:
len(signatures)

128
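Conceptually, splitting a graph by an annotation means grouping its edges by the value of that annotation. The sketch below does this on plain tuples rather than on a `BELGraph`, and it is not the `pybel_tools` implementation. The `(source, target, annotations)` edge layout and the `group_edges_by_annotation` helper are assumptions for illustration, as is sending edges that lack the annotation to an `'Undefined'` group, which mirrors the `Undefined` entry in the results table of this notebook.

```python
from collections import defaultdict


def group_edges_by_annotation(edges, annotation, sentinel='Undefined'):
    """Group (source, target) pairs by the value of one edge annotation."""
    groups = defaultdict(list)
    for source, target, annotations in edges:
        # Edges without the annotation fall into the sentinel group (assumption)
        value = annotations.get(annotation, sentinel)
        groups[value].append((source, target))
    return dict(groups)


# Hypothetical toy edges; node names are placeholders
edges = [
    ('A', 'B', {'Subgraph': 'Amyloidogenic subgraph'}),
    ('C', 'D', {'Subgraph': 'Tau protein subgraph'}),
    ('A', 'E', {'Subgraph': 'Amyloidogenic subgraph'}),
    ('F', 'G', {}),
]

toy_signatures = group_edges_by_annotation(edges, 'Subgraph')
print(len(toy_signatures))
```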

# Summarize

In [13]:
signature_genes = {
    signature_name: pybel.struct.summary.get_names_by_namespace(signature, 'HGNC')
    for signature_name, signature in signatures.items()
}

In [None]:
df = pd.DataFrame([
    (signature_name, len(genes)) 
    for signature_name, genes in signature_genes.items()
], columns=('Name', 'Receptors'))

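The 15 signatures with the highest counts are shown. Note that `len(genes)` counts every HGNC name in a signature and `receptor_genes` is not used to filter them, so the `Receptors` column is an upper bound on the number of receptor-encoding genes.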

In [20]:
df.sort_values('Receptors', ascending=False).head(15)

| | Name | Receptors |
|---|---|---|
| 6 | Undefined | 895 |
| 4 | Amyloidogenic subgraph | 243 |
| 45 | Regulation of actin cytoskeleton subgraph | 160 |
| 0 | Non-amyloidogenic subgraph | 124 |
| 68 | Interleukin signaling subgraph | 112 |
| 2 | Inflammatory response subgraph | 109 |
| 1 | Gamma secretase subgraph | 96 |
| 29 | Tumor necrosis factor subgraph | 89 |
| 70 | Tau protein subgraph | 83 |
| 14 | miRNA subgraph | 81 |
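Because the cell that builds `df` takes `len(genes)` directly, a receptor-specific count needs an intersection with the receptor gene symbols. The following is a minimal sketch, assuming the receptor genes are available as a set of HGNC symbols; the `receptor_symbols` argument and the `count_receptors` helper are hypothetical, and the toy data is not from NeuroMMSig. Deriving that set from the `receptor_genes` model objects depends on the model's attribute names, which are not shown in this notebook.

```python
def count_receptors(signature_genes, receptor_symbols):
    """Count, per signature, the gene names that are also receptor symbols."""
    counts = [
        (name, len(set(genes) & receptor_symbols))
        for name, genes in signature_genes.items()
    ]
    # Highest receptor count first
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data
toy_signature_genes = {
    'Signature B': {'G3', 'G4'},
    'Signature A': {'G1', 'G2', 'G3'},
}
toy_receptor_symbols = {'G1', 'G2', 'G4'}

toy_counts = count_receptors(toy_signature_genes, toy_receptor_symbols)
print(toy_counts)
```

The resulting pairs can be passed to `pd.DataFrame(..., columns=('Name', 'Receptors'))` in the same way as in the summary cell.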
