<h1> Maximum Excluded Mass Burning

The Maximum Excluded Mass Burning (MEMB) algorithm [1] calculates the box-covering for a given network as follows:

1. Initially, mark all nodes as uncovered and non-centres.
2. For each non-centre node calculate the excluded mass. The excluded mass is defined as the number of uncovered nodes within a radius $r_B$ of the node.
3. Let the node with the maximum excluded mass be $p$, let $p$ be a centre and let all nodes within a radius $r_B$ from $p$ be covered.
4. Repeat steps 2 and 3 until all nodes in the network are covered. 
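
As a rough illustration of steps 1 to 4, the following dependency-free sketch runs the procedure on a seven-node path graph with $r_B = 1$. The helper names `ball` and `memb_sketch` and the adjacency-dictionary representation are assumptions made for this example only; ties are broken by taking the first node encountered.

```python
from collections import deque


def ball(adj, source, radius):
    """Return the set of nodes within `radius` hops of `source` (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == radius:
            continue  # Do not expand beyond the requested radius.
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def memb_sketch(adj, rB):
    """Illustrative MEMB loop: repeatedly pick the non-centre with maximum excluded mass."""
    balls = {v: ball(adj, v, rB) for v in adj}  # Step 2 needs these repeatedly, so cache them.
    uncovered = set(adj)                        # Step 1: everything starts uncovered.
    centres = []
    while uncovered:                            # Step 4: repeat until all nodes are covered.
        best, best_mass = None, 0
        for v in adj:
            if v in centres:
                continue
            mass = len(balls[v] & uncovered)    # Step 2: excluded mass of v.
            if mass > best_mass:
                best, best_mass = v, mass
        centres.append(best)                    # Step 3: p becomes a centre ...
        uncovered -= balls[best]                # ... and its ball becomes covered.
    return centres


# Path graph 0 - 1 - 2 - 3 - 4 - 5 - 6 (hypothetical toy input).
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
centres = memb_sketch(path, rB=1)
print(centres)  # [1, 4, 5]
```

Node 1 is chosen first because it is the first node whose ball contains three uncovered nodes; node 5 is only needed at the end to cover node 6.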

The code in this notebook calculates the box-covering according to the MEMB algorithm as well as an amended accelerated method. It also finds the renormalised graphs under $\ell_B$ box renormalisation and can identify if a network has fractal properties.
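
For orientation, the two candidate scaling laws that the fitting functions in this notebook compare can be written as follows. The symbols $A$ and $c$ match the parameters of `find_best_fractal_fit` and `find_best_exp_fit`; reading $c$ in the power law as the box dimension $d_B$ follows the usual convention of [1] and is an interpretation, not something the code computes separately.

```latex
% Fractal network: the number of boxes decays as a power law in the box diameter.
N_B(\ell_B) \approx A \, \ell_B^{-c}, \qquad c = d_B

% Non-fractal network: the number of boxes decays exponentially in the box diameter.
N_B(\ell_B) \approx A \, e^{-c \, \ell_B}

% Relation between box diameter and the radius used by MEMB (odd \ell_B).
\ell_B = 2 r_B + 1
```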

<h2> 1. Introduction 

<h2> Module Imports <a class="anchor" id="module-imports"></a>

In [None]:
import math
import random
import statistics
import time
from operator import itemgetter

import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
import scipy.stats
from scipy.io import mmread

<h1> Workspace <a class="anchor" id="workspace"></a>

In [None]:
eduG = read_mtx_graph_format("web-edu.mtx")
notredameG = nx.read_gml("web-NotreDame.gml")

<h2> Results <a class="anchor" id="results"></a>

<h3>$(2,3)$-flower

3 iterations

In [None]:
# (3,2)-flower 3 Iterations
lB = [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31]
NB = [470, 95, 45, 20, 20, 10, 10, 6, 5, 5, 5, 4, 3, 2, 2, 2]

In [None]:
plot_lB_NB(lB, NB)
plot_loglog_lB_NB(lB, NB)

<h3> $(1,3)$-flower

4 iterations

In [None]:
# (3, 1)-flower 4 Iterations
lB = np.array([1, 3, 5, 7, 9, 11, 13, 15])
NB = np.array([684, 172, 44, 12, 4, 2, 1, 1])

In [None]:
plot_lB_NB(lB, NB)
plot_loglog_lB_NB(lB, NB)

<h3> Web Networks

web-edu.mtx

In [None]:
# Web Edu Network
lB = [1, 3, 5, 7, 9, 11, 13]
NB = [3031, 249, 12, 6, 3, 2, 1]

In [None]:
best_exp_fit, best_exp_score = find_best_fit_iteratively(lB, NB, find_best_exp_fit, linspace_N=100)
print(best_exp_fit)

In [None]:
best_frac_fit, best_frac_score = find_best_fit_iteratively(lB, NB, find_best_fractal_fit, linspace_N=100)
print(best_frac_fit)

In [None]:
find_best_fractal_fit(lB, NB, A_min=0, A_max=5000, c_min=0, c_max=10)

In [None]:
plot_best_fit_comparison(lB, NB, best_exp_fit[0], best_exp_fit[1], best_frac_fit[0], best_frac_fit[1])

web-NotreDame.gml

In [None]:
G=nx.read_gml("32flower5iter.gml")

In [None]:
print(len(G.nodes()))

In [None]:
G = nx.read_gml("32flower3iter.gml")
print(len(G.nodes()))

In [None]:
current_graph = G.copy()
lB = 3
count = 1
while len(list(current_graph.nodes())) > 1:
    new_graph = main(current_graph, lB, count)
    current_graph = new_graph.copy()
    count += 1

In [None]:
box_mass = [14, 12, 12, 14, 14, 6, 8, 7, 6, 6, 8, 7, 5, 7, 6, 8, 7, 8, 8, 9, 4, 4, 4, 5, 4, 4, 4, 4, 5, 5, 3, 4, 4, 4, 3, 4, 4, 3, 4, 4, 4, 4, 3, 3, 5, 5, 5, 4, 4, 5, 4, 4, 4, 4, 5, 5, 4, 3, 3, 3, 4, 3, 4, 3, 4, 4, 5, 4, 5, 3, 3, 3, 3, 5, 4, 4, 3, 4, 3, 4, 4, 5, 3, 4, 3, 5, 4, 4, 5, 5, 4, 4, 5, 3, 4]
box_avg = sum(box_mass)/len(box_mass)
print(box_avg)

In [None]:
box_mass = [7, 6, 7, 6, 7, 4, 4, 4, 4, 5, 5, 3, 4, 3, 4, 4, 5, 4, 4, 5]
box_avg = sum(box_mass)/len(box_mass)
print(box_avg)

<h1> Functions <a class="anchor" id="functions"></a>

<h2> Read Graphs <a class="anchor" id="read"></a>

In [None]:
def read_mtx_graph_format(filepath):
    """
    Reads graphs stored in the .mtx file format. Use with, for example, graphs from www.networkrepository.com.
    Note: Some files may need to be edited to make sure that scipy.io can read them. Files should have a header starting with %%MatrixMarket and a single line denoting the number of values in each column. 
    
    Args:
        filepath (str): Filepath to .mtx file
        
    Returns:
        G (networkx.Graph): Network read from file. 
    """
    # Read the file using the scipy.io file reader. 
    mmf = mmread(filepath)
    # Generate a graph from this file. 
    G = nx.from_scipy_sparse_array(mmf)
    # Return the graph.
    return G
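
To make the header requirement in the docstring concrete, here is a small sketch of what a minimal coordinate-format `.mtx` file looks like and how its edge list could be read with the standard library alone. The file contents and the helper `parse_mtx_edges` are illustrative assumptions; the notebook itself relies on `scipy.io.mmread`, which handles many more variants of the format.

```python
def parse_mtx_edges(text):
    """Parse a coordinate-format Matrix Market string into (rows, cols, edges)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines[0].startswith("%%MatrixMarket"):
        raise ValueError("missing %%MatrixMarket header")
    # Drop the header and any '%' comment lines.
    body = [line for line in lines[1:] if not line.startswith("%")]
    # The first remaining line is the size line: rows, columns, stored entries.
    rows, cols, entries = (int(tok) for tok in body[0].split())
    # Each following line starts with a 1-indexed (row, column) pair.
    edges = [tuple(int(tok) for tok in line.split()[:2]) for line in body[1:]]
    if len(edges) != entries:
        raise ValueError("size line does not match the number of entries")
    return rows, cols, edges


# Hypothetical file contents: a path on three nodes.
example = """%%MatrixMarket matrix coordinate pattern symmetric
% a path on three nodes
3 3 2
2 1
3 2
"""
rows, cols, edges = parse_mtx_edges(example)
print(rows, cols, edges)  # 3 3 [(2, 1), (3, 2)]
```

If `mmread` rejects a downloaded file, comparing its first lines against this layout is a quick way to spot a missing header or size line.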

<h2> Maximum Excluded Mass Burning <a class="anchor" id="MEMB"></a>

In [None]:
def MEMB(G, lB, deterministic=True):
    """
    Implements the Maximum Excluded Mass Burning algorithm according to [1].
    Note: Only works for odd values of lB.
    
    Args: 
        G (networkx.Graph): The network the algorithm is to be applied to. 
        lB (int): The diameter of the boxes used to cover the network.
        deterministic (bool) (opt): If False, choose randomly from nodes with equal excluded mass. If True, choose the first such node in iteration order.
    
    Returns: 
        centres (list): A list of nodes assigned to be centres under the MEMB algorithm. 
    """
    # Start with all nodes being uncovered and non-centres. 
    uncovered = list(G.nodes())
    non_centres = list(G.nodes())
    
    # Initialise empty lists for the covered and centre nodes. 
    covered = []
    centres = []
    
    # Each box can have diameter of up to lB, so the maximum radius is rB = (lB-1)/2.
    # Integer division keeps rB an integer hop count; for odd lB the value is unchanged.
    rB = (lB-1)//2
    
    # Initialise an empty dictionary to store nodes and a list of nodes in the graphs centred on these nodes with a radius rB.
    # Doing this stops us from having to generate the same subgraphs multiple times, which is expensive. 
    eg_dict = {}
    
    # For each node find the graph centred on that node with radius rB. 
    for node in G.nodes():
        H = nx.ego_graph(G, node, radius=rB)
        eg_dict[node] = list(H.nodes()) # Add the list of nodes in that graph to the dictionary.

    # Iterate while there are still nodes uncovered in the graph.
    while len(uncovered) > 0:

        # Start with a maximum excluded mass of zero, and no node p [1].
        p = None
        maximum_excluded_mass = 0
        
        # For the non-deterministic method, keep a list of nodes with equal maximum excluded mass 
        possible_p = []

        # For each node that isn't a centre, find the excluded mass.
        for node in non_centres:
            # The excluded mass is the number of uncovered nodes within a radius of rB.
            excluded_mass = len(list(set(eg_dict[node])-set(covered)))
            # If the excluded mass of this node is greater than the current excluded mass, choose this node.
            if excluded_mass > maximum_excluded_mass:
                p = node # Update p.
                maximum_excluded_mass = excluded_mass # Update maximum excluded mass.
                possible_p = [node] # Update list of possible nodes for non-deterministic method.
            # If the excluded mass of this node is equal to the current maximum excluded mass, then add this node to the list of possible p.
            elif excluded_mass == maximum_excluded_mass:
                possible_p.append(node)
        
        # If the non-deterministic method is chosen, then randomly choose a node from the list of possible p.
        if not deterministic:
            p = random.choice(possible_p)
                
        # Add the chosen p to the list of centres. 
        centres.append(p)
        # Remove the chosen p from the list of non-centres.
        non_centres.remove(p)

        # Find the graph centred on the node p with radius rB.
        H = eg_dict[p] 
        # Iterate through the nodes in this subgraph.
        for node in H:
            covered.append(node) # Cover the nodes in the subgraph.
            # Remove these nodes from the list of uncovered nodes.
            if node in uncovered:
                uncovered.remove(node)        
    
    # Once all the nodes are covered, return the list of centres. 
    return centres
    
        

In [None]:
def degree_based_MEMB(G, lB, deterministic=True, N=10):
    """
    Implements the Maximum Excluded Mass Burning algorithm according to [1], adjusted to prioritise nodes with high degree. 
    This reduces the running time of the traditional MEMB without losing accuracy.
    Note: Intended for odd values of lB.
    
    Args: 
        G (networkx.Graph): The network the algorithm is to be applied to. 
        lB (int): The diameter of the boxes used to cover the network.
        deterministic (bool) (opt): If False, choose randomly from nodes with equal excluded mass. If True, choose the first such candidate, i.e. the one with the highest degree.
        N (int): The number of highest-degree nodes considered as candidate centres in each iteration. 
    
    Returns: 
        centres (list): A list of nodes assigned to be centres under the MEMB algorithm. 
    """
    
    # If the diameter lB is less than or equal to 2, then the maximum radius is 0 and so every node is in its own box.
    if lB == 1 or lB == 2:
        return list(G.nodes())
    
    # Start with all nodes being uncovered and non-centres. 
    uncovered = list(G.nodes())
    
    # Initialise empty lists for the covered and centre nodes. 
    covered = []
    centres = []
    
    # Each box can have diameter of up to lB, so the maximum radius is rB = (lB-1)/2.
    # Integer division keeps rB an integer hop count; for odd lB the value is unchanged.
    rB = (lB-1)//2

    # Find the N nodes with the greatest degree centrality. 
    dc = nx.degree_centrality(G)
    top_N_dc = dict(sorted(dc.items(), key=itemgetter(1), reverse=True)[:N])

    # Initialise an empty dictionary to store nodes and a list of nodes in the graphs centred on these nodes with a radius rB.
    # Doing this stops us from having to generate the same subgraphs multiple times, which is expensive. 
    eg_dict = {}
    for node in top_N_dc:
        H = nx.ego_graph(G, node, radius=rB)
        eg_dict[node] = list(H.nodes())
    
    # On the initial iteration, set maiden to True.
    maiden = True
    
    # This variable checks if the algorithm does not find a solution, and then looks at the next N nodes. 
    failed = False
    
    # Iterate while there are still nodes uncovered in the graph.
    while len(uncovered) > 0:
        
        # For all iterations except the first, calculate the new dictionary of nodes in each subgraph of radius rB.
        if not maiden:
            eg_dict = calc_next_iter(G, uncovered, N, dc, eg_dict, rB, centres, p, failed)
        maiden = False

        # Start with a maximum excluded mass of zero, and no node p [1].
        p = None
        maximum_excluded_mass = 0
        
        # For the non-deterministic method, keep a list of nodes with equal maximum excluded mass 
        possible_p = []
        
        # For each of the top N nodes by degree, find the excluded mass. 
        for node in eg_dict:
            # The excluded mass is the number of uncovered nodes within a radius of rB. 
            excluded_mass = len(list(set(eg_dict[node])-set(covered)))
            # If the excluded mass of this node is greater than the current excluded mass, choose this node.
            if excluded_mass > maximum_excluded_mass:
                p = node # Update p.
                maximum_excluded_mass = excluded_mass # Update maximum excluded mass. 
                possible_p = [node] # Update list of possible nodes for non-deterministic method.
            # If the excluded mass of this node is equal to the current maximum excluded mass, then add this node to the list of possible p.
            elif excluded_mass == maximum_excluded_mass:
                possible_p.append(node)
            
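        # If the non-deterministic method is chosen, then randomly choose a node from the list of possible p.
        # Only do so when some candidate covers at least one new node, so that a failed round still leaves p as None.
        if not deterministic and maximum_excluded_mass > 0:
            p = random.choice(possible_p)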
        
        # Check if the method fails to find a node p. 
        # The method only fails if every single node has zero uncovered nodes within a radius rB. 
        if p is None:
            # If it does fail, remove all the nodes that have already been tried from the list of top N nodes. 
            failed=True # Set the failed variable.
            for tried_node in eg_dict:
                dc.pop(tried_node)
        else:    
            # Otherwise, add the new p to the list of centre nodes. 
            centres.append(p)
            # Update the list so that the newly covered nodes are in the list of covered nodes and not in the list of uncovered nodes. 
            for node in eg_dict[p]:
                if node in uncovered:
                    uncovered.remove(node)
                    covered.append(node)
            failed=False # Reset the failed variable.

    # Once all the nodes are covered return the list of centres found in the graph.
    return centres

In [None]:
def calc_next_iter(G, uncovered,  N, dc, eg_dict, rB, centres, p, failed):
    """
    Finds the top N nodes to check in the next iteration of the algorithm.
    
    Args:
        G (networkx.Graph): The network being analysed. 
        uncovered (list): A list of nodes in the network which are uncovered at this stage.
        N (int): The number of nodes to check in the next iteration of the algorithm.
        dc (dict): Dictionary containing the degree centrality of each node which is yet to be checked as a centre.
        eg_dict (dict): Dictionary with keys as nodes and values as the list of nodes within a distance of rB from that node.
        rB (int): The radius rB to be checked. 
        centres (list): List of nodes identified as centres. 
        p (str): Name of the node chosen as the most recent p [1].
        failed (Bool): True if the previous iteration of the algorithm failed, and False otherwise. 
    
    Returns:
        eg_dict (dict): Returns an updated version of eg_dict with the nodes to be checked for the next iteration. 
    """
    
    # If the previous iteration succeeded, then remove the most recent node p from the dictionary. 
    # This node was chosen as a centre in the last stage, and so it does not need to be considered again.
    if not failed:
        dc.pop(p)
        
    # Choose the top N nodes by degree centrality. 
    top_N_dc = dict(sorted(dc.items(), key=itemgetter(1), reverse=True)[:N])
    
    # Initialise an empty updated version of eg_dict.
    new_eg_dict = {}

    # For each of the top N nodes, assign the list of nodes in the subgraph of radius rB centred around the node to the new dictionary.
    for node in top_N_dc:
        # If the subgraph has already been found then reference the old dictionary to prevent recalculating the subgraph.
        if node in eg_dict:
            new_eg_dict[node] = eg_dict[node]
        # If not, then find the graph and add it to the new dictionary.
        else:
            H = nx.ego_graph(G, node, radius=rB)
            new_eg_dict[node] = list(H.nodes())
    
    # Once this is done for all the relevant nodes, return the new updated dictionary.
    return new_eg_dict

In [None]:
def original_method(G, lB):
    N = len(G.nodes())
    start = time.time()
    new_centres = MEMB(G,lB)
    print(new_centres)
    end = time.time()
    print(len(new_centres))
    print(end-start)
    #[2939, 2940, 2943]

In [None]:
def degree_method(G, lB):
    N = len(G.nodes())
    start = time.time()
    db_centres = degree_based_MEMB(G,lB,N=500)
    print(db_centres)
    end = time.time()
    print(len(db_centres))
    print(end-start)
    #[2939, 2940, 2943]

<h2> Find Best Fit

In [None]:
def sum_of_squares_deviation(y, est_y):
    sum_of_squares = 0
    for (yi, est_yi) in zip(y, est_y):
        sum_of_squares += (est_yi - yi) ** 2
    return sum_of_squares

In [None]:
def find_best_exp_fit(x, y, A_min=0, A_max=100, c_min=0, c_max=2, linspace_N=100):
    """
    Finds the best exponential fit of the form y = Ae^{-cx} according to the sum of squares deviation.
    """
    best_fit = (None, None)
    best_score = None
    for A in np.linspace(A_min, A_max, linspace_N):
        for c in np.linspace(c_min, c_max, linspace_N):
            # Evaluate the model at the supplied x values rather than at a global variable.
            est_y = [A * math.e ** (-c*i) for i in x]
            score = sum_of_squares_deviation(y, est_y)
            if best_score is None:
                best_score = score
                best_fit = (A, c)
            elif score < best_score:
                best_score = score
                best_fit = (A, c)
    return best_fit, best_score

In [None]:
def find_best_fractal_fit(x, y, A_min=0, A_max=100, c_min = 0, c_max = 2, linspace_N=100):
    """
    Finds the best power-law fit of the form y = Ax^{-c} according to the sum of squares deviation.
    """
    best_fit = (None, None)
    best_score = None
    for A in np.linspace(A_min, A_max, linspace_N):
        for c in np.linspace(c_min, c_max, linspace_N):
            # Evaluate the model at the supplied x values rather than at a global variable.
            est_y = [A * i ** (-c) for i in x]
            score = sum_of_squares_deviation(y, est_y)
            if best_score is None:
                best_score = score
                best_fit = (A, c)
            elif score < best_score:
                best_score = score
                best_fit = (A, c)
    return best_fit, best_score
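
The grid search above minimises the squared error on the raw scale. A different and commonly used technique is ordinary least squares in log-log space, where a power law $y = Ax^{-c}$ becomes a straight line with slope $-c$. The sketch below is that alternative, not the notebook's method; the function name `loglog_fit` and the synthetic data are assumptions made for illustration.

```python
import math


def loglog_fit(x, y):
    """Least-squares line through (log x, log y); returns (A, c) for y = A * x**(-c)."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log y = intercept + slope * log x, so A = exp(intercept) and c = -slope.
    return math.exp(intercept), -slope


# Synthetic data drawn from an exact power law with A = 1000 and c = 2.
synthetic_lB = [1, 3, 5, 7, 9]
synthetic_NB = [1000 * l ** -2 for l in synthetic_lB]
A, c = loglog_fit(synthetic_lB, synthetic_NB)
print(round(A, 6), round(c, 6))  # 1000.0 2.0
```

Because the logarithm compresses large values, this fit weights small $N_B$ more heavily than the raw-scale grid search does, so the two approaches can disagree on noisy data.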

In [None]:
def find_best_fit_iteratively(x, y, method, linspace_N=100):
    
    A_min = 0
    A_max = 10000
    c_min = 0
    c_max = 12.5
    
    A_diff = A_max - A_min
    c_diff = c_max - c_min
    
    for i in range(3):
        best_fit, best_score = method(x, y, A_min=A_min, A_max=A_max, c_min=c_min, c_max=c_max, linspace_N=linspace_N)
        
        A_diff = A_diff / 10
        c_diff = c_diff / 5
        
        best_A_approximation = best_fit[0]
        best_c_approximation = best_fit[1]
        
        A_min = max(0, best_A_approximation - (A_diff/2))
        A_max = best_A_approximation + (A_diff/2)
        c_min = max(0, best_c_approximation - (c_diff/2))
        c_max = best_c_approximation + (c_diff/2)

    return best_fit, best_score
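
To see what resolution the three refinement rounds reach, the window arithmetic can be traced on its own. This is a sketch under the defaults used above (initial widths 10000 and 12.5, shrink factors 10 and 5); the helper `search_windows` is hypothetical and ignores the clipping at zero, which can only make a window narrower.

```python
def search_windows(A_width=10000, c_width=12.5, rounds=3, linspace_N=100):
    """List (A_width, c_width, A_step, c_step) for each grid search that is actually run."""
    rows = []
    for _ in range(rounds):
        # np.linspace with linspace_N points has linspace_N - 1 equal gaps.
        rows.append((A_width, c_width, A_width / (linspace_N - 1), c_width / (linspace_N - 1)))
        A_width /= 10
        c_width /= 5
    return rows


rows = search_windows()
for A_width, c_width, A_step, c_step in rows:
    print(A_width, c_width, round(A_step, 4), round(c_step, 5))
```

With the defaults, the final search covers a window of width 100 in $A$ and 0.5 in $c$, so the returned parameters are resolved to roughly 1 and 0.005 respectively.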

<h2> Plotting

In [None]:
def plot_best_fit_comparison(lB, NB, exp_A, exp_c, frac_A, frac_c):
    lB = np.array(lB)
    NB = np.array(NB)
    
    est_NB_exp = [exp_A * math.e ** (-exp_c*i) for i in lB]
    est_NB_frac = [frac_A * i ** (-frac_c) for i in lB]
    
    fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(10, 3))
    axes[0].plot(lB, NB, color='#000066', label="Empirical Data")
    axes[0].plot(lB, est_NB_exp, color='#C00000', label="Best Fit")
    axes[1].plot(lB, NB, color='#000066')
    axes[1].plot(lB, est_NB_frac, color='#C00000')

    fig.suptitle('Non-Fractal Network Model', fontsize=16)

    axes[0].set_xlabel(r'$\ell_B$')
    axes[0].set_title('Exponential Relation')
    axes[0].set_ylabel('$N_B$')
    axes[0].text(12, 600, r"$SSR \approx 6.367$")

    axes[1].set_xlabel(r'$\ell_B$')
    axes[1].set_title('Power-Law Relation')
    axes[1].set_ylabel('$N_B$')
    axes[1].text(12, 600, r"$SSR \approx 3566$")

    fig.legend(loc="upper left")
    fig.tight_layout()

In [None]:
def plot_lB_NB(lB, NB):
    plt.plot(lB, NB, color='#C00000')
    plt.xlabel(r'$\ell_B$')
    plt.ylabel('$N_B$')
    plt.title(r'The optimal number of boxes $N_B$ against the diameter $\ell_B$.')
    plt.show()

In [None]:
def plot_loglog_lB_NB(lB, NB):
    plt.loglog(lB, NB, color='#C00000')
    plt.xlabel(r'$\ell_B$')
    plt.ylabel('$N_B$')
    plt.title(r'The optimal number of boxes $N_B$ against the diameter $\ell_B$.')
    plt.show()

<h2> Finding Boxes

In [None]:
def find_central_distance(G, centres):
    central_distance = {}
    for v in list(G.nodes()):
        shortest_path_len = None
        closest_neighbour = None
        for u in centres:
            path_len = nx.shortest_path_length(G, v, u)
            if shortest_path_len is None or shortest_path_len > path_len:
                shortest_path_len = path_len
                closest_neighbour = u
        central_distance[v] = shortest_path_len
    return central_distance
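
The function above calls a shortest-path routine once per (node, centre) pair. A possible alternative, sketched here without `networkx`, is a single multi-source breadth-first search that starts from all centres at once and reaches every node exactly once. The name `central_distance_bfs` and the adjacency-dictionary input are assumptions for this example; nodes unreachable from every centre are simply absent from the result.

```python
from collections import deque


def central_distance_bfs(adj, centres):
    """Distance from every reachable node to its nearest centre, via one multi-source BFS."""
    distance = {c: 0 for c in centres}  # Every centre is at distance 0 from itself.
    queue = deque(centres)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in distance:       # First visit is along a shortest path.
                distance[v] = distance[u] + 1
                queue.append(v)
    return distance


# Path graph 0 - 1 - ... - 6 with hypothetical centres 1, 4 and 5.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
distances = central_distance_bfs(path, centres=[1, 4, 5])
print(sorted(distances.items()))
# [(0, 1), (1, 0), (2, 1), (3, 1), (4, 0), (5, 0), (6, 1)]
```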

In [None]:
def assign_nodes_to_boxes(G, centres, central_distance):
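    # Map each node to the ID of the box it belongs to.
    boxes = {}

    # List the non-centre nodes in order of increasing distance from their nearest centre.
    sorted_non_centres = []
    sorted_dict = dict(sorted(central_distance.items(), key=itemgetter(1)))
    for key in sorted_dict:
        if key not in centres:
            sorted_non_centres.append(key)

    # Each centre seeds its own box, with IDs 0, ..., k-1.
    box_id = 0
    for node in centres:
        boxes[node] = box_id
        box_id += 1
    # Each remaining node joins the box of a randomly chosen neighbour that is strictly closer to a centre.
    # Processing nodes in order of distance guarantees that such neighbours have already been assigned.
    for node in sorted_non_centres:
        possible_boxes = []
        for neighbour in G.neighbors(node):
            if central_distance[node] > central_distance[neighbour]:
                possible_boxes.append(boxes[neighbour])
        boxes[node] = random.choice(possible_boxes)
    return boxes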

In [None]:
def find_boxes(nodes_to_boxes, centres):
    """
    Finds a list of nodes assigned to each box in a network.
    
    Args:
        nodes_to_boxes (dict): A dictionary with nodes as keys and their corresponding boxes as values. 
        centres (list): A list of the nodes found as centres under the MEMB algorithm. 
    
    Returns:
        boxes (dict): A dictionary with boxes as keys and a list of nodes in that box as the value. 
    """
    # Initialise an empty dictionary to store the boxes.
    boxes = {}
    # The box IDs are 0, ..., k-1 where k is the number of centres. 
    for i in range(len(centres)): # Iterate over the box IDs.
        # Initialise an empty list of nodes.
        nodes = []
        # Check if each node belongs in the current box.
        for node in nodes_to_boxes:
            # If it does, add it to the list of nodes.
            if nodes_to_boxes[node] == i:
                nodes.append(node)
        # Assign the list of nodes to the box. 
        boxes[i] = nodes
    # Return the dictionary of boxes to nodes. 
    return boxes
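
The same grouping can also be done in one pass over the nodes instead of one pass per box. The following is a minimal sketch of that variant; the name `group_nodes_by_box` and the example assignment are hypothetical.

```python
def group_nodes_by_box(nodes_to_boxes, n_boxes):
    """Invert a node -> box ID mapping into box ID -> list of nodes in a single pass."""
    # Create every box up front so that empty boxes still appear in the result.
    boxes = {box_id: [] for box_id in range(n_boxes)}
    for node, box_id in nodes_to_boxes.items():
        boxes[box_id].append(node)
    return boxes


# Hypothetical assignment of seven nodes to three boxes.
example_assignment = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
grouped = group_nodes_by_box(example_assignment, n_boxes=3)
print(grouped)  # {0: [0, 1, 2], 1: [3, 4], 2: [5, 6]}
```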

<h1> Renormalised Graph


In [None]:
def renormalise_graph(G, boxes, nodes_to_boxes):
    renormalisedG = nx.Graph()
    for box in boxes:
        renormalisedG.add_node(box)
    for edge in G.edges():
        source = edge[0]
        target = edge[1]
        renormalised_source = nodes_to_boxes[source]
        renormalised_target = nodes_to_boxes[target]
        renormalisedG.add_edge(renormalised_source, renormalised_target)
    renormalisedG.remove_edges_from(nx.selfloop_edges(renormalisedG))
    nx.draw_kamada_kawai(renormalisedG, node_color = list(renormalisedG.nodes()))
    return renormalisedG
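
Stripped of the drawing step, renormalisation amounts to relabelling each edge by the boxes of its endpoints and discarding edges that stay inside one box. A small dependency-free sketch of that idea follows; `renormalise_edges` and the toy box assignment are assumptions for illustration.

```python
def renormalise_edges(edges, nodes_to_boxes):
    """Return the set of undirected box-to-box edges induced by the original edges."""
    renormalised = set()
    for u, v in edges:
        box_u, box_v = nodes_to_boxes[u], nodes_to_boxes[v]
        if box_u != box_v:  # Edges inside a box would become self-loops, so skip them.
            renormalised.add((min(box_u, box_v), max(box_u, box_v)))
    return renormalised


# Path graph 0 - 1 - ... - 6 split into three hypothetical boxes.
path_edges = [(i, i + 1) for i in range(6)]
toy_boxes = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
box_edges = renormalise_edges(path_edges, toy_boxes)
print(sorted(box_edges))  # [(0, 1), (1, 2)]
```

Only the edges (2, 3) and (4, 5) cross a box boundary, so the seven-node path collapses to a three-node path.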

In [None]:
def main(G, lB, count=1):

    centres = degree_based_MEMB(G, lB, deterministic=True, N=10)
    central_distance = find_central_distance(G, centres)
    nodes_to_boxes = assign_nodes_to_boxes(G, centres, central_distance)
    colourmap = []
    for node in G.nodes():
        colourmap.append(nodes_to_boxes[node])
    plt.figure(1)
    nx.draw_kamada_kawai(G, node_color = colourmap)
    boxes = find_boxes(nodes_to_boxes, centres)
    
    print("Iteration {0}".format(count))
    box_sizes = []
    for box in boxes:
        box_sizes.append(len(boxes[box]))
    print(len(G.nodes()), box_sizes)
    
    boxes_file_path = "32flower5iter_boxes_" + str(count)
    renormalised_file_path = "32flower5iter_renormalised_" + str(count)
    
    export_to_gephi(G, nodes_to_boxes, boxes_file_path)
    plt.figure(2)
    renormalisedG = renormalise_graph(G, boxes, nodes_to_boxes)
    gephi_dict = {}
    for node in renormalisedG:
        gephi_dict[node] = node
    
    export_to_gephi(renormalisedG, gephi_dict, renormalised_file_path)
    plt.show()
    return renormalisedG

In [None]:
#H = nx.watts_strogatz_graph(100, 4,0.7)
eduG = read_mtx_graph_format("web-edu.mtx")
current_graph = eduG.copy()
lB = 3
count = 1
while len(list(current_graph.nodes())) > 1:
    new_graph = main(current_graph, lB, count)
    current_graph = new_graph.copy()
    count += 1

In [None]:
H = nx.watts_strogatz_graph(25, 4,0.7)
centres = degree_based_MEMB(H, 3, deterministic=True, N=10)
central_distance = find_central_distance(H, centres)
nodes_to_boxes = assign_nodes_to_boxes(H, centres, central_distance)
colourmap = []
for node in H.nodes():
    colourmap.append(nodes_to_boxes[node])
nx.draw_kamada_kawai(H, with_labels=True, node_color = colourmap)
boxes = find_boxes(nodes_to_boxes, centres)
print(boxes)

In [None]:
renormalise_graph(H, boxes, nodes_to_boxes)

In [None]:
def export_to_gephi(G, nodes_to_boxes, file_path):
    H = G.copy()
    nx.set_node_attributes(H, nodes_to_boxes, 'boxes')
    nx.write_graphml(H, file_path)

<h1> References

[1] C. Song, L. K. Gallos, S. Havlin, and H. A. Makse, “How to calculate the fractal dimension of a complex network: The box covering algorithm,” Journal of Statistical Mechanics, 2007.

<h1> Deprecated Code

In [None]:
def calculate_excluded_mass(G, covered, node, rB):
    H = nx.ego_graph(G, node, radius=rB)
    uncovered_size = len(list(set(H.nodes())-set(covered)))
    #print("MEMB of {0} is {1}".format(node, uncovered_size))
    return uncovered_size

In [None]:
def subgraph_degree_centrality(G, uncovered):
    N = len(uncovered)
    dc = {}
    for node in G:
        neighbourhood = set(list((nx.neighbors(G, node))))
        uncovered_neighbourhood = neighbourhood.intersection(set(uncovered))
        dc[node] = len(uncovered_neighbourhood)/N
    return dc

def subgraph_closeness_centrality(G, uncovered, top_N_dc):
    N = len(uncovered)
    cs_dict = {}
    for node in top_N_dc:
        csN = N
        sum_d = 0
        for uncovered_node in uncovered:
            if node == uncovered_node:
                csN -= 1
            else:
                d = nx.shortest_path_length(G, source=node, target=uncovered_node)
                sum_d += d
        cs_dict[node] = csN/sum_d
    #print(cs_dict)
    return cs_dict