## Girvan-Newman Algorithm

We saw this algorithm in the lecture. It identifies communities of vertices within a graph by repeatedly removing the edges with the highest betweenness.


In [None]:
fbgraph={
    'A':['B','E','F'],
    'B':['A','E','F','I'],
    'C':['G','H'],
    'D':['G','H','E'],
    'E':['D','F','A','B','I'],
    'F':['A','B','E','I'],
    'G':['C','D','H'],
    'H':['C','D','G','K'],
    'I':['E','B','F','J'],
    'J':['I','K','L'],
    'K':['H','J','L'],
    'L':['J','K'],
}

First we modify the BFS algorithm so that it records every shortest path rather than just one. This means the predecessor of a vertex becomes a list of predecessors. Whilst we are at it, let's count the number of shortest paths to each vertex as we go along and store a top-down version of the breadth-first search tree (it will be useful to be able to identify the edges both before and after a vertex in the tree).

In [None]:
import numpy as np

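def BFS_GN(graphdict,source):
    #breadth-first search from source that records every shortest path, not just one
    #returns the tuple (distance,predecessor,topdown,sp) described in the comments below
    
    #initialisation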
    colour={}
    distance={}
    predecessor={}  #this will store for each vertex a list of predecessors on shortest paths from the source
    topdown={} #this will store for each vertex a list of successors on shortest paths from the source
    sp={}  #this will store the number of shortest paths from the source to each vertex
    for v in graphdict.keys():
        colour[v]='w'
        distance[v]=np.inf
        predecessor[v]=[]
        sp[v]=0
        
    colour[source]='g'
    distance[source]=0
    queue=[source]
    sp[source]=1
    
    #iteration
    while queue != []:
        u=queue[0]
        topdown[u]=[]
        for vertex in graphdict[u]:
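            if colour[vertex]=='g' and distance[vertex]==distance[u]+1: #alternative shortest path to a vertex already discovered
                topdown[u].append(vertex)
                predecessor[vertex].append(u)
                sp[vertex]+=sp[u] #every shortest path to u extends to a shortest path to vertex
            elif colour[vertex]=='w': #first discovery of vertex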
                colour[vertex]='g'
                distance[vertex]=distance[u]+1
                predecessor[vertex]=[u]
                sp[vertex]=sp[u]
                topdown[u].append(vertex)
                queue.append(vertex)
        
        colour[u]='b'
        
        queue=queue[1:] #remove u from the front; slicing a one-element list gives []
            
    return (distance,predecessor,topdown,sp)
        
BFS_GN(fbgraph,'C')

To compute the vertex and edge credits, we need to work back up the tree, processing all the vertices at a given distance from the source before moving one level closer. To help us do this, it is useful to have a reverse index of the distance dictionary computed by the BFS: a dictionary mapping each distance to the list of vertices at that distance.

In [None]:

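def revvalues(adict):
    #invert adict: map each value to the list of keys that carry it
    #also return the largest value seen (0 if adict is empty or has no positive values)
    revdict={}
    maxsofar=0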
    for (key,value) in adict.items():
        if value > maxsofar:
            maxsofar=value
        if value in revdict:
            revdict[value].append(key)
        else:
            revdict[value]=[key]
            
    return revdict,maxsofar

#### test this with some different dictionary inputs
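As a starting point for such tests, here is a small self-contained sketch. The helper `revvalues_sketch` is a hypothetical, compact stand-in for `revvalues`, included only so that the block runs on its own; it should behave the same way for non-negative values. The input is the first three BFS levels from `'C'` in `fbgraph`.

```python
def revvalues_sketch(adict):
    # Hypothetical stand-in for revvalues: group keys by their value.
    revdict = {}
    for key, value in adict.items():
        revdict.setdefault(value, []).append(key)
    # Largest value seen, or 0 for an empty dictionary.
    return revdict, max(adict.values(), default=0)

# Distances of the vertices within two steps of 'C' in fbgraph.
distance_example = {'C': 0, 'G': 1, 'H': 1, 'D': 2, 'K': 2}
levels, deepest = revvalues_sketch(distance_example)
print(levels)   # {0: ['C'], 1: ['G', 'H'], 2: ['D', 'K']}
print(deepest)  # 2
```

Other inputs worth trying are an empty dictionary and a dictionary in which every key has the same value.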



### Exercise 3a
Write a function which takes a graph and a source vertex, calls the modified BFS algorithm, calls revvalues on the distance dictionary and then computes the credit for each vertex and each edge.

Hint:
I would use a while loop which stops when the current distance of vertices from the source is 0.  Then for each vertex at the current distance, compute its vertex credit and the edge credits of edges to each predecessor in the tree.  Then decrease the current distance by 1.  I would store vertex credits in a dictionary and edge credits in a dictionary of dictionaries.
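
The bottom-up pass described in the hint might look roughly like the sketch below. It is one possible shape rather than a model answer, and it runs on a hand-built diamond graph instead of the output of `BFS_GN`. The name `credits_sketch` and the example dictionaries are made up for illustration. It assumes the usual Girvan-Newman rule (each vertex starts with credit 1 and passes its credit to its predecessors in proportion to their shortest-path counts) and that every distance is finite.

```python
def credits_sketch(distance, predecessor, sp):
    # Group vertices by distance (the role played by revvalues in the notebook).
    levels = {}
    for v, d in distance.items():
        levels.setdefault(d, []).append(v)
    current = max(levels)  # assumes all distances are finite

    vertex_credit = {v: 1.0 for v in distance}  # every vertex starts with credit 1
    edge_credit = {v: {} for v in distance}     # edge_credit[p][v] for tree edge p -> v

    while current > 0:
        for v in levels[current]:
            for p in predecessor[v]:
                # p receives a share proportional to its number of shortest paths.
                share = vertex_credit[v] * sp[p] / sp[v]
                edge_credit[p][v] = share
                vertex_credit[p] += share
        current -= 1
    return vertex_credit, edge_credit

# Diamond: S is joined to A and B, which are both joined to T.
distance = {'S': 0, 'A': 1, 'B': 1, 'T': 2}
predecessor = {'S': [], 'A': ['S'], 'B': ['S'], 'T': ['A', 'B']}
sp = {'S': 1, 'A': 1, 'B': 1, 'T': 2}

vertex_credit, edge_credit = credits_sketch(distance, predecessor, sp)
print(edge_credit['A']['T'])  # 0.5
print(edge_credit['S']['A'])  # 1.5
```

The two shortest paths from S to T split T's credit evenly, so each edge into T carries 0.5, and each edge out of S carries 1 + 0.5 = 1.5. The source's own entry in `vertex_credit` is accumulated as a side effect and is not needed for the edge scores.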

### Exercise 3b

Write a function which takes a graph and runs the Girvan-Newman betweenness calculations using each vertex in turn as the source. Compute and return the total betweenness score for each edge.
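
The summation step on its own could be sketched as follows. The function name `total_edge_betweenness_sketch`, its `halve` parameter and the input format (one dictionary of dictionaries of edge credits per source) are assumptions for illustration. The example credits are worked out by hand for the path graph A-B-C. Dividing the totals by 2 is a common convention, since each shortest path is found once from each of its endpoints; whether to do so here depends on the definition used in the lecture.

```python
def total_edge_betweenness_sketch(per_source_edge_credit, halve=True):
    totals = {}
    for edge_credit in per_source_edge_credit:
        for u, children in edge_credit.items():
            for v, credit in children.items():
                # The graph is undirected, so (u, v) and (v, u) are the same edge.
                edge = frozenset((u, v))
                totals[edge] = totals.get(edge, 0.0) + credit
    if halve:
        # Each shortest path has been counted from both of its endpoints.
        totals = {edge: total / 2 for edge, total in totals.items()}
    return totals

# Hand-computed edge credits for the path graph A-B-C, one entry per source.
per_source = [
    {'A': {'B': 2.0}, 'B': {'C': 1.0}},  # source A
    {'B': {'A': 1.0, 'C': 1.0}},         # source B
    {'C': {'B': 2.0}, 'B': {'A': 1.0}},  # source C
]

scores = total_edge_betweenness_sketch(per_source)
print(scores[frozenset(('A', 'B'))])  # 2.0
print(scores[frozenset(('B', 'C'))])  # 2.0
```

Using `frozenset` keys is one way to make the edge identity independent of direction; sorted tuples would work equally well.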