<a href="https://colab.research.google.com/github/thefr33radical/codeblue/blob/master/algo_ds/graphs_trees.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

 # **Graphs & Trees Workbook**

---




> **References**
  * [Networkx](https://networkx.github.io/documentation/stable/reference/algorithms/)
  * [GeeksforGeeks](https://www.geeksforgeeks.org/graph-data-structure-and-algorithms/)

In [0]:
# Find the maximum width of the tree
# https://www.geeksforgeeks.org/height-generic-tree-parent-array/

def find_max_width(graph):
  pass  # TODO: not implemented yet

parent = [-1, 0, 0, 0, 3, 1, 1, 2]

# Construct the tree as an adjacency list: node -> list of children.
# There is exactly one entry per node in the parent array.
tree={}
for p in range(len(parent)):
  tree[p]=[]

for p in range(len(parent)):
  if parent[p]==-1:
    continue
  tree[parent[p]].append(p)

print(tree)

{0: [1, 2, 3], 1: [5, 6], 2: [7], 3: [4], 4: [], 5: [], 6: [], 7: []}
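
The width computation itself can be sketched as a level-by-level walk over the same parent array. This assumes "width" means the largest number of nodes on any single level; `max_width_from_parents` is a hypothetical helper name, not something defined elsewhere in this workbook.

```python
def max_width_from_parents(parent):
    """Return the largest number of nodes found on any single level."""
    # Build node -> children, with one entry per node
    children = {node: [] for node in range(len(parent))}
    roots = []
    for node, par in enumerate(parent):
        if par == -1:
            roots.append(node)
        else:
            children[par].append(node)

    width, level = 0, roots
    while level:
        width = max(width, len(level))
        # The next level is every child of the current level
        level = [c for node in level for c in children[node]]
    return width


print(max_width_from_parents([-1, 0, 0, 0, 3, 1, 1, 2]))  # 4
```

For the example array the levels are `[0]`, `[1, 2, 3]` and `[5, 6, 7, 4]`, so the widest level holds 4 nodes.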


# **Shortest Path Algorithms**

*Best algorithms to use to find shortest paths*
* Unweighted -> **BFS**, T(C) = O(V+E)
* Weighted without negative edges -> **Dijkstra's**, T(C) = O(E log V) with a binary heap
* Weighted with negative edges -> **Bellman-Ford**, T(C) = O(V x E)
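
For the unweighted case, a minimal BFS sketch is shown here. It assumes the graph is a dict mapping each node to a list of neighbours; the function name and the example graph are illustrative only.

```python
from collections import deque

def bfs_shortest_paths(graph, src):
    """Return {node: number of edges on the shortest path from src}."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, []):
            if v not in dist:  # the first visit is the shortest in an unweighted graph
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical example graph
example = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': ['E'], 'E': []}
print(bfs_shortest_paths(example, 'A'))  # {'A': 0, 'B': 1, 'C': 1, 'D': 2, 'E': 3}
```

Nodes that cannot be reached from `src` are simply absent from the result.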


In [0]:
'''
Bellman-Ford Algorithm

* Used to find the shortest distance between the source node and all other nodes.
* Shortest paths are undefined if a negative CYCLE is reachable from the source.
* Dynamic Programming concept
* T(C) = O(V X E)
* Disadvantage : Time complexity is higher than Dijkstra's algorithm, T(C) = O(E log V) with a binary heap
* Advantage : Can be used for graphs with negative edges.

Algorithm :
1. Create Edge list(u,v,wt)
2. Create dist[] & parent[]
3. For V-1 times :
      for u,v in edgelist:
        Relax(u,v) 
4. for u,v in edgelist:
        Relax_final(u,v)  

Relax(u,v):
  if dist[v] > dist[u]+wt(u,v):
    dist[v] = dist[u]+wt(u,v)
    parent[v]=u

Relax_final(u,v):
  if dist[v] > dist[u]+wt(u,v)
    =>negative CYCLE

* Why we loop V-1 times ? A shortest path that contains no cycle visits each node at most once, so it has at most V-1 edges.
* Why we do Vth relaxation ? To find a negative CYCLE. If any dist[v] still decreases on this iteration, a negative cycle is reachable from the source and shortest paths cannot be found.

References:
* https://en.wikipedia.org/wiki/Bellman%E2%80%93Ford_algorithm
* https://www.youtube.com/watch?v=-mOEd_3gTK0
* https://www.geeksforgeeks.org/bellman-ford-algorithm-simple-implementation/
* https://www.geeksforgeeks.org/bellman-ford-algorithm-dp-23/

'''

G={'A':[('B',-1),('C',4)],'B':[('C',3),('D',2),('E',2)],'C':[],'D':[('B',1),('C',5)],'E':[('D',-3)]}

def bellman_ford(G,src):

  # Build the edge list and initialise the distance and parent maps
  edges=[]
  dist={}
  parent={}

  for vertex in G:
    dist[vertex]=float('inf')
    parent[vertex]=None

    for e in G[vertex]:
      edges.append((vertex,e[0],e[1]))

  dist[src]=0

  # Repeat V-1 times: a shortest path without cycles has at most V-1 edges
  for _ in range(len(G)-1):
    for u,v,wt in edges:
      # Relax(u,v): take the path through u if it is shorter
      if dist[v] > dist[u]+wt:
        dist[v]=dist[u]+wt
        parent[v]=u

  # Relax_final: any further improvement means a negative cycle is reachable
  for u,v,wt in edges:
    if dist[v] > dist[u]+wt:
      print("Negative Cycle Found")
      break

  print('parent graph :',parent,'\nDistance graph:',dist)

bellman_ford(G,'A')

In [0]:
'''
Dijkstras Algorithm

* Greedy algorithm

Algorithm:

1. Initialize dist[] to inf and parent[] to empty
2. Insert every vertex into a binary heap keyed by dist (inf to begin with)
3. Start with dist[src]=0 and insert (0,src) into the binary heap
4. while binary heap is not empty:
      u = extract_min_dist(binary_heap)
      for each adjacent vertex v of u that is still in the heap:
        if dist[v] > dist[u]+wt(u,v):
            update dist[v] and its key in the heap
            update parent[v]

'''
import heapq

class Graph():
  def __init__(self, V):
    self.size = V    # number of vertices, labelled 0..V-1
    self.graph = {}  # adjacency list: vertex -> [(weight, neighbour), ...]

  def addEdge(self, src, dest, wt):
    # Undirected edge: record it in both directions
    self.graph.setdefault(src, []).append((wt,dest))
    self.graph.setdefault(dest, []).append((wt,src))
        
def dijkstra(G,src):
  print(G.graph)
  parent=[0]*G.size

  # 999 stands in for infinity; it must exceed every real path length
  dist=[999]*G.size
  dist[src]=0
  q=[]

  # Seed the min-heap with every vertex keyed by its tentative distance
  for v in G.graph.keys():
    if v==src:
      heapq.heappush(q,(0,src))
    else:
      heapq.heappush(q,(999,v))

  heapq.heapify(q)
  inheap=[True]*G.size

  while len(q) > 0:

    # Extract the vertex with the smallest tentative distance
    min_vertex=heapq.heappop(q)
    u=min_vertex[1]
    inheap[u]=False

    # Visit every neighbour of u
    for adj in G.graph[u]:
      wt=adj[0]
      node=adj[1]
      print(u,node,wt)

      # Relax the edge if the neighbour is still in the heap and the path through u is shorter
      if inheap[node] and dist[node] > dist[u]+wt:
        dist[node] = dist[u]+wt
        parent[node]=u

        # Decrease-key by linear search: drop the stale entry, then append the new one
        for i in range(len(q)):
          if q[i][1]==node:
            del q[i]
            break
        print(q)
        q.append((dist[node],node))

    # Restore the heap invariant once all neighbours of u are processed
    heapq.heapify(q)

  print(dist)



graph = Graph(9) 

graph.addEdge(0, 1, 4) 
graph.addEdge(0, 7, 8) 
graph.addEdge(1, 2, 8) 
graph.addEdge(1, 7, 11) 
graph.addEdge(2, 3, 7) 
graph.addEdge(2, 8, 2) 
graph.addEdge(2, 5, 4) 
graph.addEdge(3, 4, 9) 
graph.addEdge(3, 5, 14) 
graph.addEdge(4, 5, 10) 
graph.addEdge(5, 6, 2) 
graph.addEdge(6, 7, 1) 
graph.addEdge(6, 8, 6) 
graph.addEdge(7, 8, 7) 
#graph.dijkstra(0) 

dijkstra(graph,0)

{0: [(4, 1), (8, 7)], 1: [(4, 0), (8, 2), (11, 7)], 7: [(8, 0), (11, 1), (1, 6), (7, 8)], 2: [(8, 1), (7, 3), (2, 8), (4, 5)], 3: [(7, 2), (9, 4), (14, 5)], 8: [(2, 2), (6, 6), (7, 7)], 5: [(4, 2), (14, 3), (10, 4), (2, 6)], 4: [(9, 3), (10, 5)], 6: [(2, 5), (1, 7), (6, 8)]}
0 1 4
[(999, 2), (999, 5), (999, 4), (999, 3), (999, 8), (999, 7), (999, 6)]
0 7 8
[(999, 2), (999, 5), (999, 4), (999, 3), (999, 8), (999, 6), (4, 1)]
1 0 4
1 2 8
[(8, 7), (999, 3), (999, 5), (999, 8), (999, 6), (999, 4)]
1 7 11
7 0 8
7 1 11
7 6 1
[(12, 2), (999, 3), (999, 4), (999, 8), (999, 5)]
7 8 7
[(12, 2), (999, 3), (999, 4), (999, 5), (9, 6)]
6 5 2
[(12, 2), (999, 3), (15, 8), (999, 4)]
6 7 1
6 8 6
5 2 4
5 3 14
[(12, 2), (15, 8), (999, 4)]
5 4 10
[(12, 2), (15, 8), (25, 3)]
5 6 2
2 1 8
2 3 7
[(15, 8), (21, 4)]
2 8 2
[(21, 4), (19, 3)]
2 5 4
8 2 2
8 6 6
8 7 7
3 2 7
3 4 9
3 5 14
4 3 9
4 5 10
[0, 4, 12, 19, 21, 11, 9, 8, 14]
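
The linear search used above to update a heap entry costs O(V) per update. A common alternative, sketched here as an assumption rather than as the workbook's method, pushes a fresh entry on every improvement and skips stale entries when they are popped ("lazy deletion"). The name `dijkstra_lazy` is hypothetical; the adjacency format mirrors `Graph.graph`.

```python
import heapq

def dijkstra_lazy(adj, src):
    """adj maps vertex -> [(weight, neighbour), ...]; returns {vertex: distance}."""
    dist = {v: float('inf') for v in adj}
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: u was already settled with a shorter distance
        for wt, v in adj[u]:
            if d + wt < dist[v]:
                dist[v] = d + wt
                heapq.heappush(heap, (dist[v], v))
    return dist

# Same undirected edges as the example above: (u, v, weight)
edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7), (2, 8, 2), (2, 5, 4),
         (3, 4, 9), (3, 5, 14), (4, 5, 10), (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]
adj = {}
for u, v, wt in edges:
    adj.setdefault(u, []).append((wt, v))
    adj.setdefault(v, []).append((wt, u))

result = dijkstra_lazy(adj, 0)
print([result[v] for v in range(9)])  # [0, 4, 12, 19, 21, 11, 9, 8, 14]
```

The heap may hold several entries per vertex, so it can grow to O(E) items, but each push and pop stays O(log E).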


In [0]:
'''
Prim's Algorithm
'''
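
This cell is still empty. A minimal sketch of Prim's algorithm follows, assuming an undirected, connected graph stored as `vertex -> [(weight, neighbour), ...]`; the function name and the small example graph are illustrative only.

```python
import heapq

def prim_mst(adj, start):
    """Return (total weight, [(u, v, weight), ...]) of a minimum spanning tree."""
    visited = {start}
    # Candidate edges leaving the tree, cheapest first
    heap = [(wt, start, v) for wt, v in adj[start]]
    heapq.heapify(heap)
    total, tree = 0, []
    while heap and len(visited) < len(adj):
        wt, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        total += wt
        tree.append((u, v, wt))
        for next_wt, nxt in adj[v]:
            if nxt not in visited:
                heapq.heappush(heap, (next_wt, v, nxt))
    return total, tree

# Hypothetical example: edges A-B (1), B-C (2), A-C (3), C-D (4)
adj = {
    'A': [(1, 'B'), (3, 'C')],
    'B': [(1, 'A'), (2, 'C')],
    'C': [(3, 'A'), (2, 'B'), (4, 'D')],
    'D': [(4, 'C')],
}
total, tree = prim_mst(adj, 'A')
print(total, tree)  # 7 [('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 4)]
```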

In [0]:
''' 
Kruskal's Algorithm
'''
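
This cell is also empty. One possible sketch of Kruskal's algorithm is given below, assuming vertices labelled `0..V-1` and an edge list of `(u, v, weight)` tuples. The union-find helper is a simplified version with path halving only; all names are illustrative.

```python
def kruskal_mst(num_vertices, edges):
    """Return (total weight, chosen edges) of a minimum spanning forest."""
    parent = list(range(num_vertices))

    def find(x):
        # Walk to the set representative, halving the path on the way
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, tree = 0, []
    for u, v, wt in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:  # different components, so no cycle is formed
            parent[ru] = rv
            total += wt
            tree.append((u, v, wt))
    return total, tree

# Edges of the 9-vertex graph used in the Dijkstra cell
edges = [(0, 1, 4), (0, 7, 8), (1, 2, 8), (1, 7, 11), (2, 3, 7), (2, 8, 2), (2, 5, 4),
         (3, 4, 9), (3, 5, 14), (4, 5, 10), (5, 6, 2), (6, 7, 1), (6, 8, 6), (7, 8, 7)]
total, tree = kruskal_mst(9, edges)
print(total, len(tree))  # 37 8
```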

# **Traversals**

* BFS
* DFS
* Beam Search
* A*
* Informed Search/Best First Search
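
The first two traversals in the list can be sketched as follows, assuming the graph is a dict mapping each node to a list of neighbours. The function names are illustrative; the example reuses the tree built at the top of this workbook.

```python
from collections import deque

def bfs_order(graph, start):
    """Visit nodes level by level using a queue."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return order

def dfs_order(graph, start):
    """Visit nodes depth first using an explicit stack."""
    seen, order, stack = set(), [], [start]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        order.append(u)
        # Push neighbours in reverse so they are visited in listed order
        stack.extend(reversed(graph[u]))
    return order

tree = {0: [1, 2, 3], 1: [5, 6], 2: [7], 3: [4], 4: [], 5: [], 6: [], 7: []}
print(bfs_order(tree, 0))  # [0, 1, 2, 3, 5, 6, 7, 4]
print(dfs_order(tree, 0))  # [0, 1, 5, 6, 2, 7, 3, 4]
```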

In [0]:
'''


'''


# **Distance Measures**

In [0]:
'''


barycenter(G[, weight, attr, sp]) Calculate barycenter of a connected graph, optionally with edge weights.
center(G[, e, usebounds]) Returns the center of the graph G.
Eccentricity : the eccentricity of a node is the maximum distance from that node to any other node in the graph.
Diameter : the maximum distance between any pair of vertices in a graph, i.e. the maximum eccentricity.
Radius : the minimum eccentricity over all nodes.
Center of a graph : The node(s) whose eccentricity is minimum (equal to the radius).
extrema_bounding(G[, compute]) Compute requested extreme distance metric of undirected graph G
periphery(G[, e, usebounds]) Returns the periphery of the graph G.
radius(G[, e, usebounds]) Returns the radius of the graph G.
resistance_distance(G, nodeA, nodeB[, …]) Returns the resistance distance between node A and node B on graph G.
Network Density : The ratio of the number of edges in the graph to the maximum possible number of edges. Complete graph has ND=1, Empty graph has ND=0.

References:
* https://www.geeksforgeeks.org/graph-measurements-length-distance-diameter-eccentricity-radius-center/
* https://www.tutorialspoint.com/graph_theory/graph_theory_basic_properties.htm
'''
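
To make the definitions concrete, the sketch below computes them by hand with BFS instead of calling Networkx. It assumes an undirected, connected, unweighted graph stored as a dict of neighbour lists; the helper name and the path graph are illustrative.

```python
from collections import deque

def eccentricities(graph):
    """Return {node: greatest BFS distance from that node to any other node}."""
    ecc = {}
    for src in graph:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        ecc[src] = max(dist.values())
    return ecc

# Hypothetical path graph a - b - c - d
path = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}
ecc = eccentricities(path)
diameter = max(ecc.values())
radius = min(ecc.values())
center = [n for n, e in ecc.items() if e == radius]
periphery = [n for n, e in ecc.items() if e == diameter]

n = len(path)
m = sum(len(nbrs) for nbrs in path.values()) // 2  # each edge is listed twice
density = 2 * m / (n * (n - 1))

print(ecc)                 # {'a': 3, 'b': 2, 'c': 2, 'd': 3}
print(diameter, radius)    # 3 2
print(center, periphery)   # ['b', 'c'] ['a', 'd']
print(density)             # 0.5
```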

# **Similarity Measures**

In [0]:
'''
graph_edit_distance(G1, G2[, node_match, …])

Returns GED (graph edit distance) between graphs G1 and G2.

optimal_edit_paths(G1, G2[, node_match, …])

Returns all minimum-cost edit paths transforming G1 to G2.

optimize_graph_edit_distance(G1, G2[, …])

Returns consecutive approximations of GED (graph edit distance) between graphs G1 and G2.

optimize_edit_paths(G1, G2[, node_match, …])

GED (graph edit distance) calculation: advanced interface.

simrank_similarity(G[, source, target, …])

Returns the SimRank similarity of nodes in the graph G.

simrank_similarity_numpy(G[, source, …])

Calculate SimRank of nodes in G using matrices with numpy.'''

# Centrality

#### References
 * [Betweenness](https://www.analyticsvidhya.com/blog/2018/04/introduction-to-graph-theory-network-analysis-python-codes/)
 * [Closeness](https://en.wikipedia.org/wiki/Closeness_centrality)

In [0]:
'''
 Closeness Centrality

* Closeness centrality measures how near a node is to all other nodes.
It is defined as the reciprocal of the sum (or average) of the shortest-path distances between the node and all other nodes.
For disconnected graphs : sum of (reciprocal of distances between the node and all other nodes), also known as harmonic centrality.
* The smaller the average shortest path from the node to all other nodes, the higher its closeness.


'''
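
Both variants can be sketched with plain BFS, assuming an unweighted graph stored as a dict of neighbour lists. The function names are illustrative, and `closeness` uses the "reciprocal of the average distance" form restricted to reachable nodes.

```python
from collections import deque

def bfs_distances(graph, src):
    """Return {node: shortest-path length from src} for reachable nodes."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def closeness(graph, node):
    """Reciprocal of the average distance to the other reachable nodes."""
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def harmonic(graph, node):
    """Sum of reciprocal distances; unreachable nodes contribute nothing."""
    dist = bfs_distances(graph, node)
    return sum(1 / d for d in dist.values() if d > 0)

# Hypothetical path graph a - b - c - d
path = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}
print(closeness(path, 'b'))  # 0.75
print(closeness(path, 'a'))  # 0.5
print(harmonic(path, 'b'))   # 2.5
```

The inner node `b` scores higher than the end node `a` under both measures, as expected.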



In [0]:
'''
Betweenness Centrality:
* The number of shortest paths between all other pairs of nodes that pass through the node, counted as a fraction of the shortest paths for each pair.

References:
* https://www.analyticsvidhya.com/blog/2018/04/introduction-to-graph-theory-network-analysis-python-codes/


'''
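
A compact way to compute this is Brandes' accumulation scheme, sketched below for an unweighted, undirected graph stored as a dict of neighbour lists. The result is unnormalised, and the function name is illustrative.

```python
from collections import deque

def betweenness_centrality(graph):
    """Unnormalised betweenness for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        # BFS from s, counting shortest paths (sigma) and recording predecessors
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        preds = {v: [] for v in graph}
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in graph[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Accumulate dependencies from the farthest nodes back towards s
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once from each endpoint
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical path graph a - b - c - d
path = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}
print(betweenness_centrality(path))  # {'a': 0.0, 'b': 2.0, 'c': 2.0, 'd': 0.0}
```

Node `b` lies on the shortest paths for the pairs (a, c) and (a, d), which gives its score of 2.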

In [0]:
'''
Stress Centrality:
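* The total number of shortest paths between all other pairs of nodes that pass through the node (betweenness without dividing by the number of shortest paths per pair).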

References:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0200091
'''

# Connectivity

In [0]:
'''
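Kosaraju's Algorithm :
Strongly connected graph: (Directed)
* A directed graph is strongly connected if every vertex is reachable from every other vertex.
* Kosaraju's algorithm finds the strongly connected components with two DFS passes: one on the graph and one on its reverse.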
References :
* https://www.geeksforgeeks.org/connectivity-in-a-directed-graph/
'''
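
A minimal iterative sketch of Kosaraju's algorithm follows. It assumes a directed graph stored as a dict in which every node appears as a key; the function name and the example graph are illustrative.

```python
def kosaraju_scc(graph):
    """Return the strongly connected components as a list of node lists."""
    # Pass 1: record nodes in order of DFS completion
    visited, finish_order = set(), []
    for start in graph:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # Every neighbour has been explored, so node is finished
                finish_order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph in decreasing finish time
    reverse = {node: [] for node in graph}
    for u in graph:
        for v in graph[u]:
            reverse[v].append(u)

    assigned, components = set(), []
    for start in reversed(finish_order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = [], [start]
        while stack:
            node = stack.pop()
            component.append(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components

# Hypothetical example: cycle 0 -> 1 -> 2 -> 0, plus 2 -> 3 and cycle 3 <-> 4
example = {0: [1], 1: [2], 2: [0, 3], 3: [4], 4: [3]}
components = kosaraju_scc(example)
print(components)            # [[0, 2, 1], [3, 4]]
print(len(components) == 1)  # False: the graph is not strongly connected
```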

# Coloring

# **Link analysis**

# Clustering

In [0]:
# DAG

# **PageRank**