# Space Mice Genes
## Heat Propagation and Clustering Package


----------------------

Author: Mikayla Webster (13webstermj@gmail.com)

Date: 2nd May, 2018

----------------------

<a id='toc'></a>
## Table of Contents
1. [Background](#background)
2. [Import packages](#import)
3. [Define Analysis Preferences](#pref)
4. [Load Networks](#load)
5. [Run Heat Propagation](#heat)
6. [Clustering](#cluster)

## Background
<a id='background'></a>

## Import packages
<a id='import'></a>

In [14]:
import sys
code_path = '../../network_bio_toolkit'
sys.path.append(code_path)

import Heat
reload(Heat)

import pandas as pd

## Define Analysis Preferences
<a id='pref'></a>

In [15]:
symbol = 'symbol'
entrez = 'entrez'

human = 'human'
mouse = 'mouse'

heat = Heat.Heat(gene_type = symbol, species = mouse)

## Load Networks
<a id='load'></a>

1. Load the DEG (differentially expressed gene) file
2. Load the background network from NDEx

In [16]:
# load DEG file
DEG_filename = "../../DEG_databases/DE_CoeffspaceFlight - groundControl_glds48_20180312.csv"  
heat.create_DEG_list(DEG_filename, p_value_filter = 0.05, sep = ',')

print("Number of DEG's: " + str(len(heat.DEG_list)))

Number of DEG's: 181
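
The internals of `create_DEG_list` are not shown in this notebook. As a rough illustration of what a p-value filter on a DEG table could look like, here is a minimal sketch using only the standard library. The column names (`gene_symbol`, `logFC`, `adj_p_value`), the toy rows, the strict `<` comparison, and the helper `filter_degs` are all assumptions made for the example, not the format of the real DEG file or the package's code.

```python
import csv
import io

# Hypothetical DEG table; the column names and values are made up for illustration.
CSV_TEXT = u"""gene_symbol,logFC,adj_p_value
Actb,1.8,0.001
Gapdh,-0.4,0.20
Myh7,-2.1,0.03
Ttn,0.9,0.07
"""

def filter_degs(handle, p_value_filter=0.05, sep=',',
                gene_col='gene_symbol', p_col='adj_p_value'):
    """Return the gene names whose p-value is strictly below the cutoff."""
    reader = csv.DictReader(handle, delimiter=sep)
    return [row[gene_col] for row in reader
            if float(row[p_col]) < p_value_filter]

deg_list = filter_degs(io.StringIO(CSV_TEXT), p_value_filter=0.05)
print("Number of DEG's: " + str(len(deg_list)))
```

Lowering `p_value_filter` shrinks the seed list, which in turn shrinks the DEG-filtered network used later.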


In [5]:
# load the BioGRID background network from the NDEx server
heat.load_ndex_from_server(UUID = '36f7d8fd-23dc-11e8-b939-0ac135e8bacf', relabel_node_field = 'name')

print("\nNumber of interactions: " + str(len(list(heat.G_DEG.edges()))))


Number of interactions: 111
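
The count above comes from `heat.G_DEG`, the background network filtered by the DEG list. How the package performs that filtering is not shown here. One plausible reading, sketched below under the assumption that an interaction is kept only when both of its endpoints are DEGs, is an induced subgraph. The helper `filter_edges_by_genes` and the toy edges are hypothetical.

```python
def filter_edges_by_genes(edges, genes):
    """Keep only the interactions whose two endpoints are both in `genes`."""
    gene_set = set(genes)
    return [(u, v) for u, v in edges if u in gene_set and v in gene_set]

# Toy background network and DEG list (made-up values).
background_edges = [('Actb', 'Myh7'), ('Actb', 'Gapdh'),
                    ('Myh7', 'Ttn'), ('Ttn', 'Gapdh')]
toy_deg_list = ['Actb', 'Myh7', 'Ttn']

deg_edges = filter_edges_by_genes(background_edges, toy_deg_list)
print("Number of interactions: " + str(len(deg_edges)))
```

Under this assumption, the filtered network can have far fewer interactions than the full background network, because any edge touching a non-DEG is dropped.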


## Run Heat Propagation
<a id='heat'></a>

In [18]:
Wprime = heat.normalized_adj_matrix() # optional. Only needed if you want to inspect Wprime
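
The notebook does not show how `normalized_adj_matrix` builds `Wprime`. A common choice in network propagation is symmetric degree normalization, where each entry of the adjacency matrix is divided by the square root of the product of the two endpoint degrees. The sketch below illustrates that choice with NumPy on a toy graph. It is an assumption about the general technique, not a copy of the package's code, and `normalized_adjacency` is a hypothetical helper.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetric degree normalization: W'[i, j] = A[i, j] / sqrt(d_i * d_j)."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1)
    # Isolated nodes have degree 0; leave their rows and columns at zero.
    inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    return A * np.outer(inv_sqrt, inv_sqrt)

# Toy path graph a - b - c (degrees 1, 2, 1).
A_toy = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]])
W_toy = normalized_adjacency(A_toy)
```

With this normalization the matrix stays symmetric, and edges attached to high-degree hub nodes receive smaller weights than edges between low-degree nodes.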

In [21]:
heat.draw_heat_prop(Wprime = Wprime, # optional; calculated automatically if omitted
                  num_nodes = 500,
#                  random_walk = False,
                  edge_width = 2,
                  edge_smooth_enabled = True,
                  edge_smooth_type = 'bezier',
                  node_size_multiplier = 5,
                  hover = False,
                  hover_connected_edges = False,
                  largest_connected_component = False,
                  physics_enabled = True,
                  node_font_size = 20,
                  graph_id = 1,
                  node_shadow_x = 6)
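
For readers unfamiliar with heat propagation, the following is a minimal sketch of the standard iterative scheme, assuming the update F <- alpha * W'F + (1 - alpha) * Y, where Y marks the seed genes and W' is a degree-normalized adjacency matrix. Whether `draw_heat_prop` uses exactly this update, and which `alpha` it uses, is not stated in this notebook. The function `propagate_heat`, the value `alpha = 0.5`, and the toy graph are assumptions for illustration.

```python
import numpy as np

def propagate_heat(W, seed_heat, alpha=0.5, tol=1e-8, max_iter=1000):
    """Iterate F <- alpha * W.F + (1 - alpha) * Y until F stops changing."""
    Y = np.asarray(seed_heat, dtype=float)
    F = Y.copy()
    for _ in range(max_iter):
        F_next = alpha * W.dot(F) + (1.0 - alpha) * Y
        if np.abs(F_next - F).max() < tol:
            return F_next
        F = F_next
    return F

# Toy path graph a - b - c - d, symmetrically normalized by degree.
A_toy = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
d = A_toy.sum(axis=1)
W_toy = A_toy / np.sqrt(np.outer(d, d))

# Only node 'a' (index 0) is a seed, i.e. a DEG in this notebook's terms.
heat_scores = propagate_heat(W_toy, [1.0, 0.0, 0.0, 0.0])

# Rank nodes from hottest to coldest; a top-N cut such as num_nodes
# would keep the first N entries of this ranking.
ranking = np.argsort(-heat_scores)
```

In this toy example the heat decays with distance from the seed, so the ranking follows the path order a, b, c, d.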

## Clustering 
<a id='cluster'></a>

Parameter information:
- **G_DEG**: background network filtered by DEG list, output of load_STRING_to_digraph
- **DG_universe**: full background network, output of create_graph.load_STRING_to_digraph 
- **seed_nodes**: list of DEGs, output of create_graph.create_DEG_list
- **Wprime**: calculated automatically if not specified, output of visualizations.normalized_adj_matrix
- **num_top_genes**: number of genes to display in the output graph
- **cluster_size_cut_off**: clusters smaller than this threshold are colored grey
- **remove_stray_nodes**: remove clusters smaller than the cluster size cut-off
- **r**: increases spacing between clusters. Recommended number between 0.5 and 4.0
- **x_offset**: modify if some clusters are overlapping. Extra helpful when x_offset != y_offset
- **y_offset**: modify if some clusters are overlapping. Extra helpful when x_offset != y_offset
- **node_spacing**: recommended number between 500 and 2000
- **node_size_multiplier**: as you scale node_spacing, scale this number. Recommended number between 5 and 25
- **physics_enabled**: Nodes will bounce around when you click and drag them. Only set to True when the number of nodes is 200 or less
- **node_font_size**: as you scale node_spacing, scale this number. Recommended number between 20 and 50
- **graph_id**: Allows rendering of multiple graphs in one notebook. Just make sure each graph has a unique id. 
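
To make the interaction between `cluster_size_cut_off` and `remove_stray_nodes` concrete, here is a small sketch of the behaviour described in the list above: clusters smaller than the cut-off are either colored grey or dropped entirely. The function `apply_cluster_cut_off`, the node-to-cluster mapping, and the color labels are hypothetical, and the package's actual clustering code may differ.

```python
from collections import Counter

def apply_cluster_cut_off(node_to_cluster, cluster_size_cut_off=5,
                          remove_stray_nodes=False):
    """Map each node to a color label, greying out or dropping small clusters."""
    sizes = Counter(node_to_cluster.values())
    colors = {}
    for node, cluster in node_to_cluster.items():
        if sizes[cluster] >= cluster_size_cut_off:
            colors[node] = 'cluster_%s' % cluster  # cluster keeps its own color
        elif not remove_stray_nodes:
            colors[node] = 'grey'                  # small cluster: greyed out
        # else: the node is removed from the output graph
    return colors

# Toy assignment: cluster 0 has three genes, cluster 1 has one.
toy_clusters = {'Actb': 0, 'Myh7': 0, 'Ttn': 0, 'Gapdh': 1}

kept_grey = apply_cluster_cut_off(toy_clusters, cluster_size_cut_off=2)
dropped = apply_cluster_cut_off(toy_clusters, cluster_size_cut_off=2,
                                remove_stray_nodes=True)
```

In the first call the single-gene cluster stays in the graph but is grey. In the second call it is removed.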

In [22]:
heat.draw_clustering(rad_positions = False,
                Wprime = Wprime,
                k = None,
                largest_connected_component = True,
                num_top_genes = 500,
                cluster_size_cut_off = 5,
                remove_stray_nodes = True,
                node_spacing = 700,
                node_size_multiplier = 5,
                physics_enabled = False,
                node_font_size = 16,
                graph_id = 2
               )

In [23]:
heat.draw_clustering(Wprime = Wprime,
                num_top_genes = 500,
                cluster_size_cut_off = 5,
#                remove_stray_nodes = True,
                r = 1.2,
#                x_offset = 2,
#                y_offset = 2,
                node_spacing = 700,
                node_size_multiplier = 12,
                physics_enabled = False,
                node_font_size = 45,
                graph_id = 3,
                node_shadow_x = 6
               )