# Abstract

**Author:** [Charles Tapley Hoyt](https://github.com/cthoyt)

This notebook outlines a simple way to assess the citations that contribute to a subgraph.

In [1]:
import os
from collections import defaultdict, Counter
import itertools as itt

import pybel
from pybel.constants import CITATION, CITATION_NAME, CITATION_REFERENCE, CITATION_TYPE

import pybel_tools as pbt

In [2]:
pybel.__version__

'0.4.1'

In [3]:
pbt.__version__

'0.1.3-dev'

In [4]:
bms_base = os.environ['BMS_BASE']

## Loading

The graph is loaded from a precompiled gpickle.

In [5]:
pickle_path = os.path.join(bms_base, 'aetionomy', 'alzheimers.gpickle')

In [6]:
graph = pybel.from_pickle(pickle_path)

pbt.summary.print_summary(graph)

Name: 
Number of nodes: 11420
Number of edges: 64013
Network density: 0.0004908784925238285
Number weakly connected components: 65
Average in-degree: 5.605341506129597
Average out-degree: 5.605341506129597


## Filtering

The graph is filtered to a specific subgraph: the Apoptosis signaling subgraph.
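As a rough illustration of what filtering by an annotation value means, the sketch below keeps only the edges of a toy edge list whose annotation matches the target. It does not use PyBEL; the `filter_edges_by_annotation` helper, the `'annotations'` and `'Subgraph'` keys, and the node labels are all assumptions made for the example and not the pybel_tools implementation.

```python
# Toy edges as (source, target, data) triples standing in for graph edges.
# The 'annotations' / 'Subgraph' keys and all values are hypothetical.
edges = [
    ('a', 'b', {'annotations': {'Subgraph': 'Apoptosis signaling subgraph'}}),
    ('a', 'c', {'annotations': {'Subgraph': 'Other subgraph'}}),
    ('b', 'd', {'annotations': {'Subgraph': 'Apoptosis signaling subgraph'}}),
    ('e', 'f', {'annotations': {}}),
]


def filter_edges_by_annotation(edges, value, annotation='Subgraph'):
    """Keep only the edges whose annotation equals the given value."""
    return [
        (u, v, d)
        for u, v, d in edges
        if d.get('annotations', {}).get(annotation) == value
    ]


kept = filter_edges_by_annotation(edges, 'Apoptosis signaling subgraph')

# The nodes of the resulting subgraph are the endpoints of the kept edges.
kept_nodes = {node for u, v, _ in kept for node in (u, v)}

print(len(kept), sorted(kept_nodes))
```

Edges without the annotation are dropped along with edges annotated to a different subgraph, which is why the filtered graph is so much smaller than the full one.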

In [7]:
target_subgraph = 'Apoptosis signaling subgraph'

In [8]:
subgraph = pbt.selection.get_subgraph_by_annotation(graph, target_subgraph)

pbt.summary.print_summary(subgraph)

Name: Alzheimer's Disease Model - Subgraph - Apoptosis signaling subgraph
Number of nodes: 130
Number of edges: 211
Network density: 0.012581991651759094
Number weakly connected components: 10
Average in-degree: 1.623076923076923
Average out-degree: 1.623076923076923
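
The density and average degree in both summaries can be checked from the node and edge counts alone. The sketch below assumes the usual definition of density for a directed graph, edges divided by the number of ordered node pairs; the function names are chosen for this example only.

```python
def directed_density(n_nodes, n_edges):
    """Edges divided by the number of possible ordered node pairs."""
    return n_edges / (n_nodes * (n_nodes - 1))


def average_degree(n_nodes, n_edges):
    """Average in-degree (equal to average out-degree) of a directed graph."""
    return n_edges / n_nodes


# Node and edge counts taken from the two summaries printed above.
for label, n_nodes, n_edges in [('full graph', 11420, 64013), ('subgraph', 130, 211)]:
    print(label, directed_density(n_nodes, n_edges), average_degree(n_nodes, n_edges))
```

Under that assumption, the values should agree with the printed summaries up to floating-point rounding. The subgraph is much denser than the full graph even though its average degree is lower, because density is normalized by the number of possible node pairs.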


## Analysis

The unique citations for every pair of nodes are calculated. This removes the bias from node pairs that are connected by many edges, which happens when a relationship carries many annotations and causes a Cartesian explosion. This process can also be performed with [pbt.summary.count_pmids](http://pybel-tools.readthedocs.io/en/latest/summary.html#pybel_tools.summary.count_pmids).
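
To see why counting per node pair matters, the following toy sketch compares a naive per-edge count with the per-pair count. It uses plain tuples instead of a BEL graph, and the node labels and citation identifiers are invented for the illustration.

```python
from collections import Counter, defaultdict
import itertools as itt

# Toy multigraph edges as (source, target, citation id); all values are made up.
# The pair ('a', 'b') has three parallel edges from the same citation.
edges = [
    ('a', 'b', '111'),
    ('a', 'b', '111'),
    ('a', 'b', '111'),
    ('a', 'b', '222'),
    ('b', 'c', '222'),
    ('c', 'd', '222'),
]

# Naive count: every parallel edge counts once.
per_edge = Counter(pmid for _, _, pmid in edges)

# Per-pair count: each citation counts once per (source, target) pair.
pair_to_pmids = defaultdict(set)
for u, v, pmid in edges:
    pair_to_pmids[u, v].add(pmid)
per_pair = Counter(itt.chain.from_iterable(pair_to_pmids.values()))

print(per_edge)
print(per_pair)
```

In the naive count both citations tie at three edges, while the per-pair count shows that citation `'222'` supports three distinct node pairs and `'111'` only one.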

In [9]:
citations = defaultdict(set)

# Collect the distinct citations supporting each (source, target) pair
for u, v, d in subgraph.edges_iter(data=True):
    c = d[CITATION]
    citations[u, v].add((c[CITATION_TYPE], c[CITATION_REFERENCE], c[CITATION_NAME]))

# Count the number of node pairs that each citation supports
counter = Counter(itt.chain.from_iterable(citations.values()))

for (_, pmid, name), count in counter.most_common():
    print('{}\t{}\t{}'.format(int(pmid.strip()), count, name))

19499146	27	Acta Biochim Biophys Sin (Shanghai). 2009 Jun;41(6):437-45.
22496686	11	J Toxicol. 2012;2012:187297. Epub 2012 Feb 8.
16153637	9	Eur J Pharmacol. 2005 Sep 27;520(1-3):1-11
17869087	7	J Nutr Biochem. 2008 Jul;19(7):459-66. Epub 2007 Sep 14
22122372	7	J Neurochem. 2012 Jan;120 Suppl 1:9-21. doi: 10.1111/j.1471-4159.2011.07519.x. Epub 2011 Nov 28.
19918364	6	PLoS One. 2009 Nov 12;4(11):e7820
11592846	6	Neurobiol Dis. 2001 Oct;8(5):764-73
12548636	6	Proteomics. 2003 Jan;3(1):73-7.
14744432	5	Cell. 2004 Jan 23;116(2):205-19.
18997293	4	J Alzheimers Dis. 2008 Nov;15(3):397-407
22236693	4	J Negat Results Biomed. 2012 Jan 12;11:5. doi: 10.1186/1477-5751-11-5.
24821282	4	J Neurochem. 2014 May 12. doi: 10.1111/jnc.12761
17316167	4	Curr Alzheimer Res. 2007 Feb;4(1):67-72
19734902	4	Nat Genet. 2009 Oct;41(10):1088-93. doi: 10.1038/ng.440. Epub 2009 Sep 6.
20847424	4	J Alzheimers Dis. 2010;22(3):741-63
22235318	4	PLoS One. 2012;7(1):e29641
15671026	3	J Biol Chem2005
22523685	3	J Aging R

# Conclusions

The five papers that supported the most node pairs in the Apoptosis signaling subgraph were:

1. Acta Biochim Biophys Sin (Shanghai). 2009 Jun;41(6):437-45. ([pmid:19499146](https://www.ncbi.nlm.nih.gov/pubmed/19499146)) (27 node pairs)
2. J Toxicol. 2012;2012:187297. Epub 2012 Feb 8. ([pmid:22496686](https://www.ncbi.nlm.nih.gov/pubmed/22496686)) (11 node pairs)
3. Eur J Pharmacol. 2005 Sep 27;520(1-3):1-11 ([pmid:16153637](https://www.ncbi.nlm.nih.gov/pubmed/16153637)) (9 node pairs)
4. J Nutr Biochem. 2008 Jul;19(7):459-66. Epub 2007 Sep 14 ([pmid:17869087](https://www.ncbi.nlm.nih.gov/pubmed/17869087)) (7 node pairs)
5. J Neurochem. 2012 Jan;120 Suppl 1:9-21. doi: 10.1111/j.1471-4159.2011.07519.x. Epub 2011 Nov 28. ([pmid:22122372](https://www.ncbi.nlm.nih.gov/pubmed/22122372)) (7 node pairs)
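
A list like this one could also be generated from the counter instead of being typed by hand. The sketch below is one way to do that, assuming the counter keys are `(type, pmid, name)` tuples as in the analysis cell; the `format_top_citations` helper is hypothetical, the `'PubMed'` citation type is an assumption, and the stand-in counter holds only two entries copied from the output above.

```python
from collections import Counter

# Stand-in for the notebook's counter; the 'PubMed' type value is assumed.
counter = Counter({
    ('PubMed', '19499146', 'Acta Biochim Biophys Sin (Shanghai). 2009 Jun;41(6):437-45.'): 27,
    ('PubMed', '22496686', 'J Toxicol. 2012;2012:187297. Epub 2012 Feb 8.'): 11,
})


def format_top_citations(counter, n=5):
    """Render the n most common citations as markdown numbered-list lines."""
    lines = []
    for rank, ((_, pmid, name), count) in enumerate(counter.most_common(n), start=1):
        pmid = pmid.strip()
        url = 'https://www.ncbi.nlm.nih.gov/pubmed/{}'.format(pmid)
        lines.append('{}. {} ([pmid:{}]({})) ({} node pairs)'.format(rank, name, pmid, url, count))
    return lines


lines = format_top_citations(counter)
print('\n'.join(lines))
```

Generating the list this way keeps the ranking and the counts in step with the analysis if the underlying graph changes.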