Chapter 2 Exercises

# Erdős-Rényi Graphs

Code examples from [Think Complexity, 2nd edition](https://thinkcomplex.com).

Copyright 2016 Allen Downey, [MIT License](http://opensource.org/licenses/MIT)

In [1]:
%matplotlib inline

import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
import seaborn as sns
import random

from utils import decorate, savefig

# I set the random seed so the notebook 
# produces the same results every time.
np.random.seed(17)

# TODO: remove this when NetworkX is fixed
from warnings import simplefilter
import matplotlib.cbook
simplefilter("ignore", matplotlib.cbook.mplDeprecation)

## Exercises

In [2]:
def all_pairs(nodes):
    for i, u in enumerate(nodes):
        for j, v in enumerate(nodes):
            if i < j:
                yield u, v

In [3]:
def make_complete_graph(n):
    G = nx.Graph()
    nodes = range(n)
    G.add_nodes_from(nodes)
    G.add_edges_from(all_pairs(nodes))
    return G

In [4]:
# From the original notebook
def reachable_nodes(G, start):
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(G.neighbors(node))
    return seen

**Exercise 2.3:** In my implementation of `reachable_nodes`, you might be bothered by the apparent inefficiency of adding *all* neighbors to the stack without checking whether they are already in `seen`.  Write a version of this function that checks the neighbors before adding them to the stack.  Does this "optimization" change the order of growth?  Does it make the function faster?

In [5]:
def reachable_nodes_precheck(G, start):
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            # Filter out neighbors that have already been seen
            # before pushing them onto the stack.
            unseen_neighbors = set(G.neighbors(node)).difference(seen)

            stack.extend(unseen_neighbors)

    # Return the set of reachable nodes, as reachable_nodes does.
    return seen

In [6]:
complete = make_complete_graph(1000)

In [7]:
%timeit len(reachable_nodes(complete, 0))

171 ms ± 28 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [8]:
%timeit len(reachable_nodes_precheck(complete, 0))

134 ms ± 26.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
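
A single graph size cannot show the order of growth, so one way to probe it is to time both versions at several sizes and see how the run time scales. The following is a minimal sketch under some assumptions: `reachable` and `complete_adjacency` are hypothetical helpers written for this illustration, and the graph is a plain dict of neighbor lists rather than a NetworkX graph so that the block runs on its own. Because it reads neighbors with `G[node]`, the same function should also accept an `nx.Graph`.

```python
import timeit


def reachable(G, start, precheck=False):
    """Stack-based traversal; G[node] must yield the neighbors of node."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            neighbors = G[node]
            if precheck:
                # Drop neighbors that were already visited.
                neighbors = set(neighbors) - seen
            stack.extend(neighbors)
    return seen


def complete_adjacency(n):
    """Complete graph on n nodes as a dict of neighbor lists (illustrative)."""
    return {u: [v for v in range(n) if v != u] for u in range(n)}


timings = []
for n in [100, 200, 400]:
    G = complete_adjacency(n)
    # Take the best of three single runs to reduce noise.
    t_plain = min(timeit.repeat(lambda: reachable(G, 0), number=1, repeat=3))
    t_check = min(timeit.repeat(lambda: reachable(G, 0, precheck=True),
                                number=1, repeat=3))
    timings.append((n, t_plain, t_check))
    print(f"n={n}: plain {t_plain:.4f} s, precheck {t_check:.4f} s")
```

If both versions have the same order of growth, doubling `n` in a complete graph should multiply both run times by roughly four, since the number of edges grows with the square of `n`. The exact numbers depend on the machine.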


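We expect the order of growth to stay the same with the precheck. Both versions visit each node once and examine each of its neighbors once, so the work remains proportional to the number of nodes plus the number of edges, O(n + m). The precheck can still be faster by a constant factor because it avoids re-adding already-seen nodes to the stack, so fewer entries are pushed and popped. In the timings above, the precheck version averaged 134 ms against 171 ms, but that gap is not much larger than the reported standard deviations (about 27 to 28 ms), so the speedup is modest at best.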
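
To make the constant-factor argument concrete, we can count how many entries each version pushes onto the stack. This is a sketch rather than part of the original solution: `count_pushes` and `complete_adj` are hypothetical names, and the complete graph is built as a plain dict of neighbor lists so that the block is self-contained. Indexing with `G[node]` should work the same way on an `nx.Graph`.

```python
def count_pushes(G, start, precheck):
    """Traverse G from start; return (seen, number of stack pushes)."""
    seen = set()
    stack = [start]
    pushes = 0
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            neighbors = G[node]
            if precheck:
                # Keep only neighbors that have not been visited yet.
                neighbors = set(neighbors) - seen
            pushes += len(neighbors)
            stack.extend(neighbors)
    return seen, pushes


n = 50
# Complete graph on n nodes as an adjacency dict (illustrative stand-in).
complete_adj = {u: [v for v in range(n) if v != u] for u in range(n)}

seen_plain, pushes_plain = count_pushes(complete_adj, 0, precheck=False)
seen_check, pushes_check = count_pushes(complete_adj, 0, precheck=True)

print(seen_plain == seen_check)  # True: both versions find the same nodes
print(pushes_plain)              # 2450, which is n * (n - 1)
print(pushes_check)              # 1225, which is n * (n - 1) / 2
```

On a complete graph the precheck halves the number of pushes, but both counts still grow with the square of `n`, that is, in proportion to the number of edges. That is consistent with a constant-factor improvement and an unchanged order of growth.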