# To-do
- Implement DER
- Implement ODER, C-ODER
- Visualization..?
- Calculate component sizes of a graph with Tarjan's algorithm
- Plot of largest SCC

Go through the C++ tutorials linked in Glotzdocs.

In [1]:
# Import needed packages
import time
import numpy as np
import pandas as pd
import scipy
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
# not yet installed
import networkx as nx
import percolate

In [3]:
# For cleanliness, suppress all warnings.
import warnings
warnings.filterwarnings('ignore')
# To show each warning only on its first appearance instead, use:
# warnings.filterwarnings(action='once')

In [22]:
from sem_graphs import gnp_random_graph

# 00 - Percolation implementations (simple)

### Network science background
- Erdos-Renyi graphs (random, non-directed graphs): [link here](https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model)
- Directed Erdos-Renyi graphs, DER (directed edges)
- Paper covers two models:
  - Ordered, directed Erdos-Renyis (ODER)
  - Competitive ODER
  
I'm currently using Networkx to play around with these. In the generator source code [here](https://github.com/networkx/networkx/blob/1174e443263f8a60dc82083ad1c563a4c25e5582/networkx/generators/random_graphs.py), `erdos_renyi_graph` is just an alias for `gnp_random_graph`.
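
For reference, the G(n, p) model behind these generators can be sketched without networkx. This is a minimal illustration, not the networkx source: `gnp_edges` is a hypothetical helper name, and it assumes nodes are labelled `0..n-1` and that every candidate edge is kept independently with probability `p`. With `directed=True` it gives a directed ER (DER) edge list.

```python
import itertools
import random


def gnp_edges(n, p, directed=False, seed=None):
    """Return the edge list of a G(n, p) random graph on nodes 0..n-1.

    Hypothetical helper for illustration; not the networkx implementation.
    """
    rng = random.Random(seed)
    # Undirected: visit each unordered pair once.
    # Directed: visit each ordered pair once (no self-loops).
    if directed:
        pairs = itertools.permutations(range(n), 2)
    else:
        pairs = itertools.combinations(range(n), 2)
    # Keep each candidate edge independently with probability p.
    return [pair for pair in pairs if rng.random() < p]


edges = gnp_edges(100, 0.5, seed=0)
print(len(edges), "edges out of", 100 * 99 // 2, "possible")
```

Looping over every pair costs O(n^2) time regardless of `p`, which is likely why large graphs are slow to generate this way. For sparse graphs, networkx also provides `fast_gnp_random_graph`.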

## Playing around with networkx capabilities

So, looks like this is not so straightforward.

Networkx works from different graph classes, e.g. [here](https://github.com/networkx/networkx/blob/a660d5728b3b3463b08c03ab8138f62468487c71/networkx/classes/digraph.py).

Class documentation is here: https://docs.python.org/3/tutorial/classes.html

Read through this: http://www.souravsengupta.com/cds2015/python/LPTHW.pdf

Unit tests: http://docs.python-guide.org/en/latest/writing/tests/

In [23]:
# ER example here (done with networkx)
# Based on this, will take about 500 seconds to make a graph with 10e6 nodes
# Can definitely parallelize this with map/pool!
# Reference: http://chriskiehl.com/article/parallelism-in-one-line/ < this is python 2.7
# Good reference on python 3: https://www.ploggingdev.com/2017/01/multiprocessing-and-multithreading-in-python-3/

n = [5000]
timed = []  # flat list: sem_graphs time, then networkx time, for each N
for N in n:
    # Time the local implementation from sem_graphs
    start = time.time()
    gnp_random_graph(int(N), 0.5)
    timed.append(time.time() - start)
    # Time the networkx implementation for comparison
    start = time.time()
    er = nx.erdos_renyi_graph(int(N), 0.5)
    timed.append(time.time() - start)

print(timed)
    
# plt.scatter(n,timed)
# plt.subplot(121)
# nx.draw(er, with_labels=True, font_weight='bold')
# plt.subplot(122)
# nx.draw_shell(er, with_labels=True, font_weight='bold')

TypeError: add_edge() argument after * must be an iterable, not int

In [None]:
# DER example here
er = nx.erdos_renyi_graph(10,.5,directed=True)
plt.subplot(121)
nx.draw(er, with_labels=True, font_weight='bold')
plt.subplot(122)
nx.draw_shell(er, with_labels=True, font_weight='bold')

### Model 1: Ordered, Directed Erdos-Renyi (ODER)
- Generalization of the directed ER model to ordered graphs
- Two large components form, then merge explosively (a discontinuous jump in the size of the largest strongly connected component)

Ordered, directed graphs can be made using the OrderedDiGraph class, source [here](https://github.com/networkx/networkx/blob/386b71a7af6c4898331f62987d8ced3f5621b680/networkx/classes/ordered.py).

From the python `collections` documentation [here](https://docs.python.org/3/library/collections.html#collections.OrderedDict): "an OrderedDict is a dict that remembers the order that keys were first inserted. If a new entry overwrites an existing entry, the original insertion position is left unchanged. Deleting an entry and reinserting it will move it to the end."
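
The ordering behaviour quoted above can be checked directly. A small sketch using only the standard library; the keys and values are arbitrary placeholders.

```python
from collections import OrderedDict

ranks = OrderedDict()
ranks["a"] = 1
ranks["b"] = 2
ranks["c"] = 3

ranks["a"] = 10          # overwrite: "a" keeps its original position
order_after_overwrite = list(ranks)

del ranks["b"]
ranks["b"] = 20          # delete and reinsert: "b" moves to the end
order_after_reinsert = list(ranks)

print(order_after_overwrite)  # ['a', 'b', 'c']
print(order_after_reinsert)   # ['a', 'c', 'b']
```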

In [None]:
# ODER example here

### Model 2: Competitive ODER
- Adds competition: preference for connecting nodes of similar rank
- See similar discontinuous jump in cluster size, but "more explosive"
- Get an effective phase separation of the two large components: one containing the lower-ranked users, one containing the higher-ranked users
- TAKEAWAY: Some bias towards grouping similar-ranked nodes leads to formation of two distinct groups of nodes (classes) with little flow of information between the classes

In [None]:
# CODER example here

# 01 - Clustering implementations

### Classical undirected Newman-Ziff (NZ) clustering algorithm

In [None]:
# Newman-Ziff implementation on ER
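
A minimal sketch of the Newman-Ziff idea for undirected graphs: add edges one at a time and track the largest cluster with a union-find structure (union by size, path compression). The function names are hypothetical, and the edge sequence here is drawn as uniformly random node pairs, which is an assumption about how the ER process is sampled (repeated pairs are allowed and simply change nothing).

```python
import random


def find(parent, i):
    """Return the root of i's cluster, compressing the path on the way."""
    root = i
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every node on the path directly at the root.
    while parent[i] != root:
        parent[i], i = root, parent[i]
    return root


def largest_cluster_curve(n, edges):
    """Largest cluster size after each edge in `edges` is added, in order."""
    parent = list(range(n))   # each node starts as its own cluster
    size = [1] * n            # cluster size, valid at root nodes only
    largest = 1
    curve = []
    for u, v in edges:
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            # Union by size: attach the smaller cluster to the larger one.
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru
            size[ru] += size[rv]
            largest = max(largest, size[ru])
        curve.append(largest)
    return curve


rng = random.Random(0)
n = 1000
edges = [tuple(rng.sample(range(n), 2)) for _ in range(2 * n)]
curve = largest_cluster_curve(n, edges)
print(curve[n // 4 - 1] / n, curve[-1] / n)
```

One sweep over the edge list gives the largest-cluster size at every edge density, so there is no need to rebuild the graph for each density value.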

### Clustering algorithm implemented in paper
Implementation of Tarjan's algorithm. Python source [here](https://github.com/bwesterb/py-tarjan).

Gives $O(E\log{E})$ clustering performance when using the pseudocode in section 5.
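
For experimenting with component sizes, here is a sketch of the textbook version of Tarjan's strongly connected components algorithm, which finds all SCCs of a fixed graph in one depth-first pass. It is not the pseudocode from section 5 of the paper and not the py-tarjan source; the function name and the `(n, edges)` input format are assumptions made for this example. It is written iteratively to avoid Python's recursion limit on large graphs.

```python
def strongly_connected_components(n, edges):
    """Return the SCCs of a directed graph on nodes 0..n-1 as lists of nodes."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)

    index = [None] * n      # discovery order of each node
    low = [0] * n           # smallest index reachable from the node's subtree
    on_stack = [False] * n
    stack = []              # nodes whose SCC is not yet finalized
    components = []
    counter = 0

    for start in range(n):
        if index[start] is not None:
            continue
        index[start] = low[start] = counter
        counter += 1
        stack.append(start)
        on_stack[start] = True
        # Explicit DFS stack of (node, iterator over its successors).
        work = [(start, iter(adj[start]))]
        while work:
            node, successors = work[-1]
            advanced = False
            for nxt in successors:
                if index[nxt] is None:
                    # Unvisited successor: descend into it.
                    index[nxt] = low[nxt] = counter
                    counter += 1
                    stack.append(nxt)
                    on_stack[nxt] = True
                    work.append((nxt, iter(adj[nxt])))
                    advanced = True
                    break
                if on_stack[nxt]:
                    # Edge back into the current, unfinished SCC.
                    low[node] = min(low[node], index[nxt])
            if advanced:
                continue
            # All successors explored: pop node and propagate its low-link.
            work.pop()
            if work:
                parent = work[-1][0]
                low[parent] = min(low[parent], low[node])
            if low[node] == index[node]:
                # node is the root of an SCC: pop its members off the stack.
                component = []
                while True:
                    w = stack.pop()
                    on_stack[w] = False
                    component.append(w)
                    if w == node:
                        break
                components.append(component)
    return components


example_edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 3), (5, 5)]
sizes = sorted((len(c) for c in strongly_connected_components(6, example_edges)),
               reverse=True)
print(sizes)  # [3, 2, 1]
```

The first entry of `sizes` is the size of the largest SCC, which is the quantity to plot against edge density.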

In [None]:
# Implementation of clustering algorithm from the paper

# 02 - Thermodynamics

In [None]:
# Plot SCC ("strongly connected component") versus edge density, per paper

In [None]:
# Find critical exponents; the authors don't actually report any

# 03 - Something else interesting?
- Can we say something else interesting from this work that the authors might not have thought about?
- What could have made this paper more interesting?
- How innovative is this, actually?