# Graphs
#### Adjacency list
Given an adjacency list, return:
- the number of nodes in the graph
- the number of edges in the graph
- the degree of a node
- the neighbours of a node.

In [2]:
adjacency_list = [
    [1],       # node 0
    [0,2,5,4], # node 1
    [1,4,5],   # node 2
    [],        # node 3
    [5,2,1],   # node 4
    [1,2,4],   # node 5
]

In [3]:
def num_nodes(al):
    return len(al)

def num_edges(al):
    e = 0
    for n in al:
        e += len(n)
    return e//2

def degree(node):
    return len(node)

def neighbours(node):
    for n in node:
        print(n)

In [4]:
print(f"Number of nodes in adjacency list: {num_nodes(adjacency_list)}")
print(f"Number of edges in adjacency list: {num_edges(adjacency_list)}")
print(f"Degree of node 4 from adjacency list: {degree(adjacency_list[4])}")
print(f"Neighbours of node 4 from adjacency list...")
neighbours(adjacency_list[4])

Number of nodes in adjacency list: 6
Number of edges in adjacency list: 7
Degree of node 4 from adjacency list: 3
Neighbours of node 4 from adjacency list...
5
2
1


#### Transforming edge lists into adjacency lists
The above adjacency list could have been given as an edge list. Edge lists are less convenient to query: finding a node's neighbours takes $O(E)$ time because every edge has to be scanned, compared with $O(degree(node))$ for an adjacency list. As such, it is generally a good idea to transform an edge list into an adjacency list.
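
To make the cost difference concrete, here is a minimal sketch of a neighbour lookup done directly on an edge list. The helper name `neighbours_from_edges` and the variable `example_edges` are hypothetical (they are not part of the original notebook). The lookup has to scan every edge, hence $O(E)$ per call.

```python
def neighbours_from_edges(edges, node):
    """Return the neighbours of `node` by scanning the whole edge list: O(E)."""
    result = []
    for n1, n2 in edges:
        if n1 == node:
            result.append(n2)
        elif n2 == node:
            result.append(n1)
    return result


# Same edges as the graph used throughout this notebook.
example_edges = [[0, 1], [1, 2], [1, 4], [1, 5], [2, 4], [2, 5], [4, 5]]
print(neighbours_from_edges(example_edges, 4))  # [1, 2, 5]
print(neighbours_from_edges(example_edges, 3))  # []
```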

Note that if checking whether two nodes are adjacent becomes a bottleneck, we can store each node's neighbours in a set instead of a list. The check then takes $O(1)$ time on average.
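
A minimal sketch of that idea, assuming the graph arrives as an edge list plus a node count. The names `build_adjacency_sets` and `adjacent` are hypothetical helpers introduced only for illustration.

```python
def build_adjacency_sets(V, edges):
    """Like an adjacency list, but each node's neighbours are kept in a set."""
    graph = [set() for _ in range(V)]
    for n1, n2 in edges:
        graph[n1].add(n2)
        graph[n2].add(n1)
    return graph


def adjacent(graph, n1, n2):
    """Set membership test: O(1) on average."""
    return n2 in graph[n1]


example_edges = [[0, 1], [1, 2], [1, 4], [1, 5], [2, 4], [2, 5], [4, 5]]
adjacency_sets = build_adjacency_sets(6, example_edges)
print(adjacent(adjacency_sets, 1, 5))  # True
print(adjacent(adjacency_sets, 0, 4))  # False
```

One trade-off of this design: sets do not preserve insertion order, so the order in which neighbours are iterated may differ from the list version.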

In [6]:
edges = [[0,1],[1,2],[1,4], [1,5], [2,4], [2,5], [4,5]]
V = 6

In [7]:
def build_adjacency_list(V, edges):
    graph = [[] for _ in range(V)]
    for n1, n2 in edges:
        graph[n1].append(n2)
        graph[n2].append(n1)
    return graph  

In [8]:
build_adjacency_list(6, edges)

[[1], [0, 2, 4, 5], [1, 4, 5], [], [1, 2, 5], [1, 2, 4]]

#### Adjacency list validation
Given an adjacency list, write a function that returns whether the graph is a valid undirected graph. This requires:
- every node is between 0 and V-1
- there are no self-loops (nodes connected to themselves)
- there are no parallel edges connecting the same two nodes
- every adjacency between two nodes is recorded in both nodes' lists.

In [10]:
def validate(graph):
    V = len(graph)
    # Iterate through each node and its adjacencies, checking that every value is between 0 and V-1
    # Return False if any of a node's adjacencies is the node itself (self-loop)
    # Return False if any node has repeated adjacencies (parallel edges)
    # For each adjacency, check that the counterpart node lists this node in return
    for i, node in enumerate(graph):
        seen = set()
        for a in node:
            if a < 0 or a >= V or a == i or a in seen or i not in graph[a]:
                return False
            else:
                seen.add(a)
    return True

In [11]:
validate(adjacency_list)

True

In [12]:
# The above validation function scans the counterpart node's whole adjacency list (`i not in graph[a]`) each time it checks for a matching adjacency (requirement 4).
# This is somewhat inefficient and could be improved by recording each edge the first time it is seen and cancelling it when its counterpart appears.
# This requires more space, but brings the running time down to O(V+E).
def validate(graph):
    V=len(graph)
    edges = set()
    for i, node in enumerate(graph):
        seen = set()
        for a in node:
            if a < 0 or a >= V or a == i or a in seen:
                return False
            seen.add(a)
            edge = (min(i, a), max(i, a))
            if edge in edges:
                edges.remove(edge)
            else:
                edges.add(edge)
    return len(edges) == 0

In [13]:
validate(adjacency_list)

True

### Adjacency matrices
An adjacency matrix uses booleans to represent connected nodes, i.e. the entry at $(c,v)$ is `True` when node $c$ is connected to node $v$. This can be useful for particularly dense graphs, where nodes are connected to most other nodes. It requires $O(V^2)$ space, but only requires $O(1)$ time to check if two nodes are neighbours.
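
As a rough illustration, an adjacency matrix for the undirected graph used above could be built from its edge list as follows. `build_adjacency_matrix` is a hypothetical helper, not something defined elsewhere in this notebook.

```python
def build_adjacency_matrix(V, edges):
    """Build a V x V boolean matrix; O(V^2) space regardless of edge count."""
    matrix = [[False] * V for _ in range(V)]
    for n1, n2 in edges:
        matrix[n1][n2] = True
        matrix[n2][n1] = True  # undirected: record both directions
    return matrix


example_edges = [[0, 1], [1, 2], [1, 4], [1, 5], [2, 4], [2, 5], [4, 5]]
matrix = build_adjacency_matrix(6, example_edges)
print(matrix[1][5])    # True  (O(1) adjacency check)
print(matrix[0][4])    # False
print(sum(matrix[4]))  # 3, the degree of node 4 (requires an O(V) row scan)
```

Note the trade-off: the adjacency check is $O(1)$, but finding a node's degree or neighbours now means scanning a whole row.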

### Big O analysis
- Space usage: $O(V+E)$
- Count nodes: $O(1)$
- Count edges: $O(V)$
- Node degree: $O(1)$
- Iterate through a node's neighbours: $O(degree(node))$
- Find if two nodes are adjacent: $O(degree(node1))$ (can be optimized to $O(1)$)
- Initialize from edge list and node count: $O(V+E)$
- Validate adjacency list (no self-loops, no parallel edges, no missing directions): $O(V+E)$

The maximum number of edges in a directed graph (with no self-loops or parallel edges) is $V*(V-1)$. For an undirected graph, it's half that: $V*(V-1)/2$
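
These bounds can be sanity-checked by enumerating every possible edge for a small $V$. The sketch below assumes no self-loops and no parallel edges; `max_edges` is a hypothetical helper.

```python
from itertools import combinations, permutations


def max_edges(V, directed=False):
    """Maximum edge count for a graph without self-loops or parallel edges."""
    return V * (V - 1) if directed else V * (V - 1) // 2


V = 6
# Undirected edges are unordered pairs of distinct nodes.
undirected = list(combinations(range(V), 2))
# Directed edges are ordered pairs of distinct nodes.
directed = list(permutations(range(V), 2))

print(len(undirected), max_edges(V))               # 15 15
print(len(directed), max_edges(V, directed=True))  # 30 30
```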

#### Complexity simplifications
| Graph type       | Range of edge count       | Example of simplification    |
|------------------|---------------------------|------------------------------|
| All undirected graphs | $0 \le E \le V*(V-1)/2$ | $O(V+E)$ (cannot be simplified) |
| Complete graphs  | $E = V*(V-1)/2$           | $O(V+E) = O(V+V^2) = O(V^2)$ |
| Connected graphs | $V-1 \le E \le V*(V-1)/2$ | $O(V+E) = O(E)$              |
| Acyclic graphs (aka forests) | $0 \le E \le V-1$ | $O(V+E) = O(V)$          |
| Trees (connected and acyclic) | $E = V-1$    | $O(V+E) = O(V)$              |
| Graphs with max degree k (k is const) | $0 \le E \le V*k/2$ | $O(V+E) = O(V+k*V) = O(V)$ |

#### Graph path
Given an adjacency list for an undirected graph, return a simple path (one with no repeated nodes) between two nodes - or an empty array if there is no path.

In [17]:
# First attempt: depth-first search that builds the path as it recurses
def path(graph, node1, node2):
    seen = set()

    def visit(node, current):
        # Returns the path from node1 to node2 that passes through `node`,
        # or an empty list if no such path exists.
        if node in seen:
            return []
        seen.add(node)
        current.append(node)
        if node == node2:
            return current
        for n in graph[node]:
            result = visit(n, current)
            if result:  # pass the first successful path back up the call stack
                return result
        current.pop()  # dead end: remove this node from the path and backtrack
        return []

    return visit(node1, [])


In [18]:
# Solution creates a tree of predecessors
def path(graph, node1, node2):
    predecessors = {node2: None}

    def visit(node):
        for nbr in graph[node]:
            if nbr not in predecessors:
                predecessors[nbr] = node
                visit(nbr)
    visit(node2)
    if node1 not in predecessors:
        return []
    print(f"Predecessors: {predecessors.items()}")
    path = [node1]
    while path[-1] != node2: # while final item in path is not yet node2...
        path.append(predecessors[path[-1]])
    return path

In [20]:
print(f"Path from node 0 to 4 on graph defined above: {path(adjacency_list, 0,4)}")
print(f"Path from node 0 to 3 on graph defined above: {path(adjacency_list, 0,3)}")

Predecessors: dict_items([(4, None), (5, 4), (1, 5), (0, 1), (2, 1)])
Path from node 0 to 4 on graph defined above: [0, 1, 5, 4]
Path from node 0 to 3 on graph defined above: []
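
The recursive `visit` above goes one call deeper for every node on a long chain, so on large graphs it can hit CPython's default recursion limit (typically 1000 frames). As a hedged alternative, here is a sketch of the same predecessor-tree idea using an explicit stack. `path_iterative` and `example_graph` are hypothetical names, and because nodes are visited in a different order, the path returned may differ from the recursive version while still being a valid simple path.

```python
def path_iterative(graph, node1, node2):
    """Predecessor-tree path search with an explicit stack instead of recursion."""
    predecessors = {node2: None}
    stack = [node2]
    while stack:
        node = stack.pop()
        for nbr in graph[node]:
            if nbr not in predecessors:
                predecessors[nbr] = node
                stack.append(nbr)
    if node1 not in predecessors:
        return []
    # Walk from node1 back towards node2 along the predecessor links.
    result = [node1]
    while result[-1] != node2:
        result.append(predecessors[result[-1]])
    return result


# Same graph as `adjacency_list` at the top of the notebook.
example_graph = [[1], [0, 2, 5, 4], [1, 4, 5], [], [5, 2, 1], [1, 2, 4]]
print(path_iterative(example_graph, 0, 4))  # [0, 1, 4]
print(path_iterative(example_graph, 0, 3))  # []
```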


#### Tree check
Check whether an adjacency list for an undirected graph represents a tree. A tree is acyclic and connected.

In [22]:
# We can determine if a graph is connected by tracking visits to all neighbouring nodes in a set.
# The set should be the same size as the node count.
# We determine whether a graph is cyclic by visiting each node's neighbours recursively.
# At each visit there are three possible outcomes:
# - the node visited has not yet been seen, in which case we add it to the set of seen nodes and visit its neighbours
# - the node has been seen because it was the directly preceding node, in which case we ignore it
# - the node has been seen but is not the directly preceding node. This indicates that the graph is cyclic.

def tree_check(graph):
    predecessors = {0:None}
    found_cycle = False
    
    def visit(node):
        nonlocal found_cycle
        if found_cycle:
            return # This exits the recursive function call
        for nbr in graph[node]:
            if nbr not in predecessors:
                predecessors[nbr] = node
                visit(nbr)
            elif predecessors[node] != nbr: # we expect the predecessor to be in the neighbour list
                found_cycle = True                   

    visit(0)
    connected = len(graph) == len(predecessors)
    return connected and not found_cycle

In [23]:
adjacency_list_connected_acyclic = [
    [2],       # node 0
    [2,5], # node 1
    [0,1,3,4],   # node 2
    [2],        # node 3
    [2],   # node 4
    [1],   # node 5
]

adjacency_list_forest_acyclic = [
    [2],       # node 0
    [5], # node 1
    [0,3],   # node 2
    [2],        # node 3
    [],   # node 4
    [1],   # node 5
]

adjacency_list_connected_cyclic = [
    [1],       # node 0
    [0,2,5], # node 1
    [1,3,4],   # node 2
    [2],        # node 3
    [2,5],   # node 4
    [1,4],   # node 5
]

In [24]:
print(f"Checking if connected, acyclic graph is tree: {tree_check(adjacency_list_connected_acyclic)}")
print(f"Checking if forest graph is tree: {tree_check(adjacency_list_forest_acyclic)}")
print(f"Checking if connected, cyclic graph is tree: {tree_check(adjacency_list_connected_cyclic)}")

Checking if connected, acyclic graph is tree: True
Checking if forest graph is tree: False
Checking if connected, cyclic graph is tree: False
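
An alternative check follows from the table above: a tree is connected and has exactly $V-1$ edges, so counting edges can replace the cycle detection. The sketch below assumes a valid undirected adjacency list (every edge recorded in both directions). The name `is_tree_by_edge_count` is hypothetical, and treating the empty graph as "not a tree" is an assumption made here for illustration.

```python
def is_tree_by_edge_count(graph):
    """A graph is a tree iff it is connected and has exactly V - 1 edges."""
    V = len(graph)
    if V == 0:
        return False  # assumption: the empty graph is not treated as a tree
    E = sum(len(nbrs) for nbrs in graph) // 2  # each edge is listed twice
    if E != V - 1:
        return False
    # Connectivity check from node 0 using an explicit stack.
    seen = {0}
    stack = [0]
    while stack:
        node = stack.pop()
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return len(seen) == V


# The same three test graphs as above, redefined so this cell is self-contained.
connected_acyclic = [[2], [2, 5], [0, 1, 3, 4], [2], [2], [1]]
forest_acyclic = [[2], [5], [0, 3], [2], [], [1]]
connected_cyclic = [[1], [0, 2, 5], [1, 3, 4], [2], [2, 5], [1, 4]]
print(is_tree_by_edge_count(connected_acyclic))  # True
print(is_tree_by_edge_count(forest_acyclic))     # False
print(is_tree_by_edge_count(connected_cyclic))   # False
```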


#### Spanning tree
Given an undirected, connected graph, return a set of edges forming a spanning tree, i.e. a tree that spans every node with no cycles.

In [26]:
def spanning_tree(graph):
    predecessors = {}  # maps child -> parent; the root (node 0) is left out so it produces no edge

    def visit(node):
        for nbr in graph[node]:
            if nbr not in predecessors and nbr != 0:
                predecessors[nbr] = node
                visit(nbr)

    visit(0)
    return [[parent, child] for child, parent in predecessors.items()]

In [27]:
spanning_tree(adjacency_list_connected_cyclic)

[[0, 1], [1, 2], [2, 3], [2, 4], [4, 5]]

In [28]:
spanning_tree(adjacency_list_connected_acyclic)

[[0, 2], [2, 1], [1, 5], [2, 3], [2, 4]]

#### Reachability queries
Given an adjacency list for an undirected graph and a list of queries, each represented by a pair of node indices, return a boolean list indicating whether the node pair for each query is connected.

In [30]:
# To answer this, I create a map of connected components, represented using a dictionary that records the component associated with each node.
# This avoids repeated traversals of the graph: once every node has been labelled, each query is answered with two O(1) dictionary lookups.
# This may be an inefficient approach if the query list is short and the graph is large and highly disconnected.

def reachability(graph, queries):
    seen = dict()
    component = 1

    def visit(node):
        for nbr in graph[node]:
            if nbr not in seen:
                seen[nbr] = seen[node]
                visit(nbr)
    
    for node in range(len(graph)):
        if node not in seen:
            seen[node] = component
            component += 1
            visit(node)
    
    
    return [seen[q[0]] == seen[q[1]] for q in queries]

In [31]:
reachability(build_adjacency_list(6, edges), [[0,4],[0,3]])

[True, False]
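
For the case flagged in the comments above (a short query list on a large graph), a single query could instead be answered by searching only the start node's component and stopping as soon as the target is found. This is a sketch under that assumption, not a replacement for the component map. `reachable` and `example_graph` are hypothetical names.

```python
def reachable(graph, start, target):
    """Answer one query by exploring from `start`, stopping early at `target`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                stack.append(nbr)
    return False


# Same graph as build_adjacency_list(6, edges) above.
example_graph = [[1], [0, 2, 4, 5], [1, 4, 5], [], [1, 2, 5], [1, 2, 4]]
print([reachable(example_graph, a, b) for a, b in [[0, 4], [0, 3]]])  # [True, False]
```

Each call costs up to $O(V+E)$ in the worst case, so this only pays off when there are few queries.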

#### Strongly connected graph
Return whether a directed graph is strongly connected, i.e. every node can reach every other node by following the direction of the edges.

In [33]:
# Conditions for strong connection:
# 1. We must be able to get from the root node to all other nodes following the direction of their connections.
# 2. We must be able to get from every other node back to the root (following the direction of their connections)

def strongly_connected(graph):
    reverse_graph = [[] for _ in graph]
    for node in range(len(graph)):
        for nbr in graph[node]:
            reverse_graph[nbr].append(node)
    # Seed both sets with the root so that a single-node graph counts as strongly connected
    seen = {0}
    seen_reverse = {0}
    def visit(node, graph, seen):
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                visit(nbr, graph, seen)
    visit(0,graph,seen)
    visit(0,reverse_graph,seen_reverse)
    return len(seen) == len(seen_reverse) == len(graph)
    

In [34]:
adjacency_list_strongly_connected = [
    [1,3],       # node 0
    [2], # node 1
    [0],   # node 2
    [2],        # node 3
]

adjacency_list_weakly_connected = [
    [1,2,3],       # node 0
    [2], # node 1
    [],   # node 2
    [2],        # node 3
]

adjacency_list_disconnected = [
    [1],       # node 0
    [0], # node 1
    [3],   # node 2
    [2],        # node 3
]

In [35]:
print(f"Checking if 'strongly connected' graph is strongly connected: {strongly_connected(adjacency_list_strongly_connected)}")
print(f"Checking if 'weakly connected' graph is strongly connected: {strongly_connected(adjacency_list_weakly_connected)}")
print(f"Checking if 'disconnected' graph is strongly connected: {strongly_connected(adjacency_list_disconnected)}")

Checking if 'strongly connected' graph is strongly connected: True
Checking if 'weakly connected' graph is strongly connected: False
Checking if 'disconnected' graph is strongly connected: False
