# Report 3

## Basic Graph Algorithms

#### Marcin Kapiszewski 156048
#### Adam Tomys 156057

Group 2

### Compared digraph representations:

1. Incidence matrix:
    - The matrix consists of vertices (rows), arcs (columns), and values (cells)
    - Values:
        - 1 means the vertex is the head of the arc
        - 0 means the arc is not connected to that vertex
        - -1 means the vertex is the tail of the arc
2. Adjacency matrix:
    - The matrix consists of starting vertices (rows), ending vertices (columns), and values (cells)
    - Values:
        - 1 means that the arc from the row vertex (tail) to the column vertex (head) exists
        - 0 means that the arc from the row vertex (tail) to the column vertex (head) does not exist
        - -1 means that there is an arc from the ending vertex to the starting vertex (this value is optional)
3. Arc list:
    - The list contains one entry per arc
    - Each arc is represented as a tuple containing the tail and the head
4. Adjacency list:
    - It consists of one node for each vertex
    - Each node contains all of the successors of the vertex
    - It is possible to store all predecessors too, but as an additional list
5. Forward star (in our case uses the built-in hashtable):
    - It is the same as the adjacency list, but instead of storing successors in a list it uses an AVL tree or a hashtable
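
As an illustration of the descriptions above, the following minimal sketch builds all five representations for one small digraph using plain Python containers. The example graph and all variable names are hypothetical and are not taken from the `graph_representations` package used later in this report; the fifth structure uses a dict of sets as a stand-in for the hashtable-based variant.

```python
# Hypothetical example digraph: 3 vertices, arcs given as (tail, head) pairs.
num_vertices = 3
arcs = [(0, 1), (0, 2), (1, 2)]

# 1. Incidence matrix: rows are vertices, columns are arcs.
incidence = [[0] * len(arcs) for _ in range(num_vertices)]
for col, (tail, head) in enumerate(arcs):
    incidence[tail][col] = -1  # -1 marks the tail of the arc
    incidence[head][col] = 1   # 1 marks the head of the arc

# 2. Adjacency matrix: rows are starting vertices, columns are ending vertices.
adjacency = [[0] * num_vertices for _ in range(num_vertices)]
for tail, head in arcs:
    adjacency[tail][head] = 1
    adjacency[head][tail] = -1  # optional marker for the reversed direction

# 3. Arc list: one (tail, head) tuple per arc.
arc_list = list(arcs)

# 4. Adjacency list: one list of successors per vertex.
successors = [[] for _ in range(num_vertices)]
for tail, head in arcs:
    successors[tail].append(head)

# 5. Hash-based variant: successors kept in a set (hashtable) per vertex.
forward_star = {v: set() for v in range(num_vertices)}
for tail, head in arcs:
    forward_star[tail].add(head)

print(incidence)
print(adjacency)
print(successors)
print(forward_star)
```

The set-based variant answers "is there an arc from u to v?" in expected constant time, whereas the plain adjacency list has to scan the successor list of u.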

### Implementation Difficulties:
We encountered no difficulties.

### DAG Generation:
TODO: describe how the DAG is generated (this part is not written yet).
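
The `generate_DAG` module is not shown in this report, so the following is only a hedged sketch of one common way to build a random DAG with a given saturation. It is not necessarily what `generateDAG` does. It assumes the result is an adjacency matrix where `matrix[tail][head] == 1` marks an arc (which matches how the matrix is read later in the notebook), and it only allows arcs from a lower to a higher vertex id, which rules out cycles. The function name `generate_dag_sketch` and the `seed` parameter are hypothetical.

```python
import random


def generate_dag_sketch(num_vertices, saturation, seed=None):
    """Return an adjacency matrix of a random DAG (illustrative sketch only).

    Arcs only go from a lower vertex id to a higher one, so the matrix is
    strictly upper triangular and the graph cannot contain a cycle.
    """
    rng = random.Random(seed)
    # All arcs allowed in the DAG: V * (V - 1) / 2 candidate pairs.
    candidates = [(i, j)
                  for i in range(num_vertices)
                  for j in range(i + 1, num_vertices)]
    # Saturation is taken as the fraction of candidate arcs that exist.
    num_arcs = round(saturation * len(candidates))
    matrix = [[0] * num_vertices for _ in range(num_vertices)]
    for tail, head in rng.sample(candidates, num_arcs):
        matrix[tail][head] = 1
    return matrix


full = generate_dag_sketch(5, 1.0)
half = generate_dag_sketch(5, 0.5, seed=0)
print(sum(map(sum, full)), sum(map(sum, half)))
```

In this sketch the vertex ids 0, 1, ..., V - 1 already form a valid topological order; shuffling the ids afterwards would hide that order from the sorting algorithms.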

In [1]:
%load_ext autoreload
%autoreload 2

### Memory Comparison:
V - number of vertices, A - number of arcs
1. Incidence matrix:
    - V * A
    - stores bools or small signed integers (if -1 is used to represent the tail)
2. Adjacency matrix:
    - V * V
    - stores bools or small signed integers (if -1 is used to represent "reversed" arc)
3. Arc list:
    - 2 * A
    - stores ids of heads and tails, typically represented by integers
4. Adjacency list:
    - V + A
    - stores ids of heads and tails, typically represented by integers
5. Forward star (in our case uses the built-in hashtable):
    - V + A
    - stores ids of heads and tails, typically represented by integers
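
To make the formulas above concrete, here is a small worked example that evaluates them for a fully saturated DAG with 1000 vertices. It counts stored cells only, ignoring the size of each cell and any container overhead. The helper name `storage_cells` is hypothetical, and the arc count assumes a full DAG has V * (V - 1) / 2 arcs.

```python
def storage_cells(num_vertices, num_arcs):
    """Number of stored cells per representation, using the formulas above."""
    return {
        "incidence matrix": num_vertices * num_arcs,
        "adjacency matrix": num_vertices * num_vertices,
        "arc list": 2 * num_arcs,
        "adjacency list": num_vertices + num_arcs,
        "forward star": num_vertices + num_arcs,
    }


V = 1000
A = V * (V - 1) // 2  # arcs in a fully saturated DAG: 499500
cells = storage_cells(V, A)
for name, count in cells.items():
    print(f"{name}: {count}")
```

For this dense graph the incidence matrix needs roughly 500 times more cells than the adjacency matrix, while the adjacency matrix, arc list, and adjacency list all stay within about a factor of two of each other.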

### Compared topological sorts:

1. Inspired by Kahn's algorithm:
    - First, we count the number of predecessors of each node
    - We initialize an output list, a counter of visited nodes set to 0, and a queue containing all vertices that have no predecessors
    - While the queue is not empty:
        - Increment the visited counter by one
        - Remove a vertex from the queue and add it to the output
        - For each successor of the dequeued node:
            - Decrease its number of predecessors by one
            - If the number is now 0, add it to the queue
    - If the number of visited nodes differs from the number of all nodes, the graph contains a cycle and cannot be sorted, so an error or message should be returned
    - Otherwise, return the output list
2. Making use of DFS (graph coloring):
    - Our code uses recursion for this
    - First, we initialize a stack as our output and a set of visited nodes
    - Then we iterate over every node:
        - If it has not been visited yet, we call our helper function dfs
    - At the end, we return the stack as a list (a stack was used for fast pushing of new values on the left)

    - Our helper dfs function:
        - It takes the graph, the current node, the set of visited nodes, and the output stack as arguments
        - It adds the node to the visited set
        - For each successor of the node that has not been visited, it calls dfs again
        - At the end, the current node is pushed to the left of the stack
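
The two procedures described above can be sketched as follows. This is an illustrative version, not the code from the `kahn` and `dfs_sort` modules: it assumes the graph is given as a plain dict mapping every vertex to a list of its successors, whereas the real implementations work on the `Digraph` interface. The function names with the `_sketch` suffix are hypothetical.

```python
from collections import deque


def kahn_sort_sketch(successors):
    """Topological sort in the style of Kahn's algorithm."""
    # Count the predecessors of every vertex.
    in_degree = {v: 0 for v in successors}
    for v in successors:
        for s in successors[v]:
            in_degree[s] += 1
    # Start with the vertices that have no predecessors.
    queue = deque(v for v in successors if in_degree[v] == 0)
    order = []
    visited = 0
    while queue:
        node = queue.popleft()
        order.append(node)
        visited += 1
        for s in successors[node]:
            in_degree[s] -= 1
            if in_degree[s] == 0:
                queue.append(s)
    if visited != len(successors):
        raise ValueError("graph contains a cycle; no topological order exists")
    return order


def _dfs(successors, node, visited, stack):
    """Visit node, then all unvisited successors, then push node on the left."""
    visited.add(node)
    for s in successors[node]:
        if s not in visited:
            _dfs(successors, s, visited, stack)
    stack.appendleft(node)


def dfs_sort_sketch(successors):
    """Topological sort using recursive DFS; assumes the input is acyclic."""
    visited = set()
    stack = deque()  # a deque gives constant-time pushes on the left
    for node in successors:
        if node not in visited:
            _dfs(successors, node, visited, stack)
    return list(stack)


# Hypothetical example graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3
example = {0: [1, 2], 1: [3], 2: [3], 3: []}
kahn_order = kahn_sort_sketch(example)
dfs_order = dfs_sort_sketch(example)
print(kahn_order, dfs_order)
```

The two functions may return different orders for the same graph, and both can be valid. Note that the recursive version, as sketched here, does not detect cycles and can exceed Python's default recursion limit on long paths.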

In [2]:
from collections import defaultdict

from generate_DAG import generateDAG

from kahn import kahn_sort
from dfs_sort import dfs_sort

from graph_representations.adjacencyList import AdjacencyList
from graph_representations.adjacencyMatrix import AdjacencyMatrix
from graph_representations.arcList import ArcList
from graph_representations.forwardStar import ForwardStar
from graph_representations.incidenceMatrix import IncidenceMatrix
from graph_representations.digraphRepresentation import Digraph

from evaluate_program import measureTime

import plotly.graph_objects as go
from plotly.subplots import make_subplots

In [3]:
sortAlgorithms = [kahn_sort, dfs_sort]
graphRepresentations = [IncidenceMatrix, AdjacencyMatrix, ArcList, AdjacencyList, ForwardStar]
nodesNumbers = [100, 200, 300, 400, 500, 750, 1000, 1250, 1500, 2000, 2500, 3000, 4000, 5000]

In [4]:
def inicializaceGraph(graph: Digraph, DAG):
    len_dag = len(DAG)
    for node in range(len_dag):
        graph.addNode(node)
    for startNode in range(len_dag):
        for endNode in range(len_dag):
            if DAG[startNode][endNode] == 1:
                graph.addEdge(startNode, endNode)

In [5]:
def evaluateSortAlgorithm(sortAlgorithm, saturation):
    graphRepresentation: Digraph
    times = {} # times.nodesNumber.graphRepresentation
    for nodesNumber in nodesNumbers:
        print(f"{nodesNumber}:")
        times[nodesNumber] = {}
        DAG = generateDAG(nodesNumber, saturation)
        for graphRepresentation in graphRepresentations:
            if (graphRepresentation.__name__ == "IncidenceMatrix" and nodesNumber >= 1500) or (graphRepresentation.__name__ == "ArcList" and nodesNumber >= 2000):
                continue
            graph = graphRepresentation()
            inicializaceGraph(graph, DAG)
            times[nodesNumber][graphRepresentation.__name__] = measureTime(sortAlgorithm, [graph])
            print(f"\t{graphRepresentation.__name__} --> {times[nodesNumber][graphRepresentation.__name__]}")
    return times

In [None]:
import json
for saturation in (1.0, 0.75, 0.5, 0.25, 0.1, 0.01):
    times = {}
    for sortAlgorithm in sortAlgorithms:
        times[sortAlgorithm.__name__] = evaluateSortAlgorithm(sortAlgorithm, saturation)
    with open(f"times_{saturation*100}.json", "w") as f:
        json.dump(times, f)

In [8]:
def compareSortTimesForGraphSizes(times, title):
    titles = [f"nodes: {nodesNumber}  edges:{((nodesNumber-1)*nodesNumber)//2}" for nodesNumber in  nodesNumbers]
    fig = make_subplots(rows=((len(times) - 1)//2 + 1), cols=2,subplot_titles=titles)

    i=0
    for nodesNumber in nodesNumbers:
        d = times[nodesNumber]
        fig.add_trace(go.Bar(x=list(d.keys()), y=list(d.values()), text=[round(x,3) for x in d.values()],
                            textposition="auto", name=""),
                        row=i//2+1, col=i%2+1)
        fig.update_yaxes(title_text="time [s]", row=i//2+1, col=i%2+1)
        i+=1

    fig.update_layout(height=1600, width=1200,
                    title_text=title)

    fig.show()

In [9]:
def compareSortTimesForGraphRepresentation(times, title):
    fig = make_subplots(rows=3, cols=2, subplot_titles=[representation.__name__ for representation in graphRepresentations])
    colors = {"kahn_sort": "blue", "dfs_sort": "red"}
    i=0
    for graphRepresentation in graphRepresentations:
        for sort, sort_times in times.items():
            # Collect this algorithm's times for the current representation only,
            # starting from an empty dict so results of another algorithm cannot leak in
            d = {}
            for nodesNumber, time in sort_times.items():
                # edgesNumber = ((nodesNumber-1)*nodesNumber)/2
                if graphRepresentation.__name__ in time:
                    d[nodesNumber] = time[graphRepresentation.__name__]
            fig.add_trace(go.Scatter(x=list(d.keys()), y=list(d.values()), text=[round(x,3) for x in d.values()], 
                                     name=sort, legendgroup=sort, showlegend=(i==0), marker=dict(color=colors[sort])),
                            row=i//2+1, col=i%2+1)
        fig.update_yaxes(title_text="time [s]", row=i//2+1, col=i%2+1)
        i+=1

    fig.update_layout(height=800, width=1200,
                    title_text=title)

    fig.show()

In [10]:
compareSortTimesForGraphRepresentation(times, "sort comparison")

In [11]:
compareSortTimesForGraphSizes(times["kahn_sort"], "kahn_sort")

In [12]:
compareSortTimesForGraphSizes(times["dfs_sort"], "dfs_sort")

### Conclusions:

TODO first rest of report needed
    
