# Graphs

> A graph consists of vertices and the edges that connect those vertices

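## Key Concepts

> Undirected graph: edges have no direction, i.e. every edge can be traversed both ways
>
> Directed graph: every edge has a direction
>
> Degree: in an undirected graph, the number of edges incident to a vertex
>
> Out-degree: in a directed graph, the number of edges leaving a vertex
>
> In-degree: in a directed graph, the number of edges entering a vertex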
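
As a minimal sketch of these definitions, the snippet below counts degrees from a list of edges. The helper names `degrees` and `in_out_degrees` and the sample edges are illustrative assumptions, not part of the notebook's own code.

```python
from collections import Counter

def degrees(edges):
    """Degree of each vertex in an undirected graph given as (u, v) pairs."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def in_out_degrees(edges):
    """In-degree and out-degree of each vertex in a directed graph (u -> v)."""
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        outdeg[u] += 1   # the edge leaves u
        indeg[v] += 1    # the edge enters v
    return indeg, outdeg

edges = [(0, 1), (0, 2), (1, 2)]        # hypothetical sample edges
deg = degrees(edges)                    # read the pairs as undirected edges
indeg, outdeg = in_out_degrees(edges)   # read the same pairs as directed edges
print(dict(deg))
print(dict(indeg), dict(outdeg))
```

The same edge list gives every vertex degree 2 when read as undirected, while the directed reading splits that count into in-degree and out-degree.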

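## In-Memory Representations

> Adjacency matrix: a graph with v vertices is stored as a v*v two-dimensional matrix A, where A[i][j] = 1 means there is an edge between vertices i and j
>> Pros: simple to work with, since everything is a matrix operation
>>
>> Cons: wastes a lot of space on sparse graphs, which have few edges
>
> Adjacency list: a graph with v vertices is stored as an array A of length v, where A[i] is a list of the vertices adjacent to vertex i
>> Pros: small memory footprint
>>
>> Cons: operations are more involved; checking whether an edge exists means scanning a list, unless the list is replaced by a dynamic data structure with fast lookup, such as a skip list or a red-black tree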
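
To make the trade-off concrete, here is a small sketch that builds both representations for the same undirected graph. The function names `build_matrix` and `build_adj_list` and the sample edges are assumptions chosen for illustration.

```python
def build_matrix(n, edges):
    """n x n matrix; matrix[i][j] == 1 means there is an edge between i and j."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = 1
        matrix[j][i] = 1   # undirected: mirror the entry
    return matrix

def build_adj_list(n, edges):
    """adj[i] lists the vertices adjacent to vertex i."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)   # undirected: store both directions
    return adj

edges = [(0, 1), (1, 2)]       # hypothetical sparse graph on 4 vertices
matrix = build_matrix(4, edges)
adj = build_adj_list(4, edges)

print(matrix[1][2] == 1)       # edge test on the matrix is a single lookup
print(2 in adj[1])             # edge test on the list scans adj[1]
print(sum(len(row) for row in matrix), sum(len(a) for a in adj))  # stored entries
```

The matrix always stores v*v cells, while the list stores two entries per undirected edge, which is where the space saving on sparse graphs comes from.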

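## Search Algorithms

> BFS (Breadth First Search): a "carpet" search that visits the vertices one layer at a time
>
> DFS (Depth First Search): a "maze" search that follows one path to its end and backtracks along it when there is nowhere left to go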
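
The "one layer at a time" behaviour of BFS can be sketched by grouping vertices by their distance from the start. The function name `bfs_levels` and the sample adjacency list are assumptions for illustration; it takes a plain adjacency list rather than a graph object.

```python
from collections import deque

def bfs_levels(adj, s):
    """Return the vertices reachable from s, grouped by distance from s."""
    visited = [False] * len(adj)
    visited[s] = True
    levels = []
    frontier = deque([s])
    while frontier:
        levels.append(list(frontier))       # everything in the queue is one layer
        for _ in range(len(frontier)):      # expand exactly this layer
            w = frontier.popleft()
            for v in adj[w]:
                if not visited[v]:
                    visited[v] = True       # mark on enqueue so v is queued once
                    frontier.append(v)
    return levels

adj = [[1, 2], [0, 3], [0, 3], [1, 2]]      # hypothetical 4-cycle 0-1-3-2-0
print(bfs_levels(adj, 0))                   # [[0], [1, 2], [3]]
```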

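### Complexity Analysis

> time complexity
>> BFS: visits every vertex and scans every edge in the adjacency list, O(V + E)
>> DFS: visits every vertex and scans every edge in the adjacency list, O(V + E)

> space complexity
>> BFS, DFS: the auxiliary arrays visited and prev plus the queue or recursion stack take O(V); the adjacency list adj itself takes O(V + E)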

In [1]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [None]:
class Graph:
    def __init__(self, v, directed=False):
        self.v = v
        self.directed = directed
        self.adj = [[] for _ in range(self.v)]
    
    def add_edge(self, s, t, directed=None):
        self.adj[s].append(t)
        # The directed argument, when given, takes precedence over self.directed
        if directed is None:
            directed = self.directed
        if not directed:
            self.adj[t].append(s)

g = Graph(8)
g.add_edge(0, 1)
g.add_edge(0, 3)
g.add_edge(1, 2)
g.add_edge(1, 4)
g.add_edge(2, 5)
g.add_edge(3, 4)
g.add_edge(4, 5)
g.add_edge(4, 6)
g.add_edge(5, 7)
g.add_edge(6, 7)

In [27]:
from collections import deque

# BFS
def bfs(g, s, t):
    if s == t:
        return [t]
    queue = deque()
    queue.append(s)
    visited = [False] * g.v
    # Mark vertices when they are enqueued, so each vertex is enqueued once
    # and prev[v] keeps the first (shortest) predecessor
    visited[s] = True
    prev = [-1] * g.v
    while queue:
        w = queue.popleft()
        for v in g.adj[w]:
            if visited[v]:
                continue
            visited[v] = True
            prev[v] = w
            if v == t:
                return path(prev, s, t)
            queue.append(v)
    return []

def path(prev, s, t):
    # Walk the predecessor links back from t to s, then reverse
    p = []
    while prev[t] != -1 and s != t:
        p.append(t)
        t = prev[t]
    p.append(s)
    return p[::-1]

bfs(g, 0, 6)

[0, 1, 4, 6]

In [28]:
# DFS

def dfs(g, s, t):
    visited = [False] * g.v
    prev = [-1] * g.v
    found = False

    def recurdfs(g, s, t):
        nonlocal found
        visited[s] = True
        if s == t:
            found = True
            return
        for v in g.adj[s]:
            if found:
                return
            if visited[v]:
                continue
            prev[v] = s
            recurdfs(g, v, t)
    
    recurdfs(g, s, t)
    # Return [] when t is unreachable; the path found is not necessarily the shortest
    return path(prev, s, t) if found else []

dfs(g, 0, 6)


[0, 1, 2, 5, 4, 6]

## Topological Sort

> Derives a global order from local (pairwise) ordering constraints
>
> Common algorithms: Kahn, DFS

In [37]:
# a -> b means a comes before b
topo1 = Graph(4, True)
topo1.add_edge(2, 0)
topo1.add_edge(2, 3)
topo1.add_edge(3, 0)
topo1.add_edge(0, 1)
topo1.add_edge(3, 1)

# a -> b means a depends on b
topo2 = Graph(4, True)
topo2.add_edge(0, 2)
topo2.add_edge(0, 3)
topo2.add_edge(1, 0)
topo2.add_edge(1, 3)
topo2.add_edge(3, 2)

circled_topo = Graph(3, True)
circled_topo.add_edge(0, 1)
circled_topo.add_edge(1, 2)
circled_topo.add_edge(2, 1)

from collections import deque

# Kahn's algorithm (an edge a -> b means a comes before b)
def topo1_sort_by_kahn(graph):
    indegree = [0] * graph.v
    for i in range(graph.v):
        for j in graph.adj[i]:
            indegree[j] += 1
    res = []
    # Start from the vertices that have no predecessors
    queue = deque([v for v, e in enumerate(indegree) if e == 0])
    while queue:
        v = queue.popleft()
        res.append(v)
        for i in graph.adj[v]:
            indegree[i] -= 1
            if indegree[i] == 0:
                queue.append(i)
    
    # A cycle leaves some vertices with a nonzero in-degree, so they never get emitted
    return res if len(res) == graph.v else []

print(f'a->b means a comes before b, kahn: {topo1_sort_by_kahn(topo1)}')

# Kahn's algorithm (an edge a -> b means a depends on b): reverse the edges first
def topo2_sort_by_kahn(graph):
    inverse_adj = [[] for _ in range(graph.v)]
    for i in range(graph.v):
        for j in graph.adj[i]:
            inverse_adj[j].append(i)
    g = Graph(graph.v, True)
    g.adj = inverse_adj
    return topo1_sort_by_kahn(g)

topo2_sort_by_kahn(topo2) 
topo2_sort_by_kahn(circled_topo)

# DFS (an edge a -> b means a comes before b); assumes the graph has no cycle
def topo1_sort_by_dfs(graph):
    res = []
    visited = set()
    def dfs(v):
        if v in visited:
            return
        visited.add(v)
        for i in graph.adj[v]:
            dfs(i)
        # v is appended after everything that must come after it
        res.append(v)

    for i in range(graph.v):
        dfs(i)
    return res[::-1]

topo1_sort_by_dfs(topo1)

# DFS (an edge a -> b means a depends on b); returns [] when a cycle is found
def topo2_sort_by_dfs(graph):
    res = []
    visited = set()              # vertices whose DFS has started
    nopre = [False] * graph.v    # vertices whose dependencies are all emitted
    def dfs(v):
        if nopre[v]:
            return True
        
        # v has started but not finished: we walked back into it, so there is a cycle
        if v in visited:
            return False

        visited.add(v)
        for i in graph.adj[v]:
            if not dfs(i):
                return False
        nopre[v] = True
        res.append(v)
        return True

    for i in range(graph.v):
        if not dfs(i):
            return []
    
    return res

topo2_sort_by_dfs(topo2)
print(f'a->b means a depends on b, dfs: {topo2_sort_by_dfs(circled_topo)}')

a->b means a comes before b, kahn: [2, 3, 0, 1]


[2, 3, 0, 1]

[]

[2, 3, 0, 1]

[2, 3, 0, 1]

a->b means a depends on b, dfs: []
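
One way to sanity-check any of these results is to verify the order directly against the edges. The sketch below is a hypothetical helper (`is_topological_order` is not part of the notebook) that takes a plain adjacency list in which an edge a -> b means a comes before b.

```python
def is_topological_order(adj, order):
    """True if order lists every vertex exactly once and each edge a -> b has a before b."""
    if sorted(order) != list(range(len(adj))):
        return False
    position = {v: i for i, v in enumerate(order)}
    return all(position[a] < position[b]
               for a in range(len(adj))
               for b in adj[a])

# Same edges as topo1: 2->0, 2->3, 3->0, 0->1, 3->1
adj = [[1], [], [0, 3], [0, 1]]
print(is_topological_order(adj, [2, 3, 0, 1]))   # True
print(is_topological_order(adj, [0, 1, 2, 3]))   # False: 2 must come before 0
```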


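## Shortest Paths

### Dijkstra's Algorithm

> Core idea
> 1. Initialize the distance from the start to every other vertex as infinity, and the distance from the start to itself as 0
> 2. Push the start vertex onto a priority (ordered) queue
> 3. Loop, popping the vertex with the smallest distance from the queue each time
>> 1. If the vertex is the destination, stop
>> 2. For each neighbor, if the distance through the current vertex is smaller than the neighbor's recorded distance, update the distance and push the neighbor onto the queue
>
> Dijkstra's algorithm can also be viewed as a DP algorithm: each step settles the best available subproblem, but the other candidates stay in the queue instead of being discarded, which is the clearest difference from a greedy algorithm that commits to a single path
>
> Time complexity: O(E log V) with a binary heap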

In [17]:
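import heapq

class WeightGraph(Graph):
    def __init__(self, v, directed=False):
        super().__init__(v, directed)
    
    def add_edge(self, s, t, w, directed=None):
        # Store (neighbor, weight) pairs; the directed argument, when given,
        # takes precedence over self.directed
        self.adj[s].append((t, w))
        if directed is None:
            directed = self.directed
        if not directed:
            self.adj[t].append((s, w))
    
    def dijkstra(self, s, t):
        distance = [float('inf')] * self.v
        distance[s] = 0
        pre = [None] * self.v
        queue = [(0, s)]  # min-heap of (distance from s, vertex)
        while queue:
            dist, v = heapq.heappop(queue)
            if v == t:
                return self._path(s, t, pre)
            # Skip stale entries: a shorter distance to v was already found
            if dist > distance[v]:
                continue
            for n, w in self.adj[v]:
                if dist + w < distance[n]:
                    pre[n] = v
                    distance[n] = dist + w
                    heapq.heappush(queue, (distance[n], n))
        # t is unreachable from s
        return []


    def _path(self, s, t, pre):
        # Follow the predecessor links back from t to s
        if s == t:
            return [s]
        return self._path(s, pre[t], pre) + [t]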


weight_graph = WeightGraph(6, directed=True)
weight_graph.add_edge(0, 1, 10)
weight_graph.add_edge(0, 4, 15)
weight_graph.add_edge(1, 2, 15)
weight_graph.add_edge(1, 3, 2)
weight_graph.add_edge(2, 5, 5)
weight_graph.add_edge(3, 2, 1)
weight_graph.add_edge(3, 5, 12)
weight_graph.add_edge(4, 5, 10)

weight_graph.dijkstra(0, 5)

[0, 1, 3, 2, 5]

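### A* Algorithm

> Whereas Dijkstra orders vertices only by dist, the distance g(i) from the start to vertex i, the A* algorithm additionally uses a heuristic function h(i) to estimate the distance from vertex i to the destination, and each time pops the vertex with the smallest f(i) = g(i) + h(i) from the queue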

In [None]:
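def _heuristic(self, s, t):
    # Placeholder estimate of the remaining distance from s to t.
    # Returning 0 never overestimates and makes A* behave like Dijkstra;
    # replace it with a problem-specific estimate such as straight-line distance.
    return 0

def a_star(self, s, t):
    dist = [(float('inf'), float('inf'))] * self.v # (f, g)
    dist[s] = (self._heuristic(s, t), 0)
    queue = [(*dist[s], s)]
    while queue:
        f, g, v = heapq.heappop(queue)
        if v == t:
            return g
        # Skip stale queue entries for v
        if f > dist[v][0]:
            continue
        for n, w in self.adj[v]:
            if g + w < dist[n][1]:
                # f(n) = g(n) + h(n)
                dist[n] = (g + w + self._heuristic(n, t), g + w)
                heapq.heappush(queue, (*dist[n], n))
    
    # t is unreachable from s
    return -1

setattr(WeightGraph, '_heuristic', _heuristic)
setattr(WeightGraph, 'a_star', a_star)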
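
To make the role of h(i) concrete, here is a minimal self-contained sketch of A* on a small grid, independent of the `WeightGraph` class. The function name `a_star_grid`, the sample grid, and the choice of Manhattan distance as the heuristic are all assumptions for illustration; Manhattan distance is a reasonable choice here because it never overestimates the remaining cost when moves are unit-cost steps in four directions.

```python
import heapq

def a_star_grid(grid, start, goal):
    """Length of a shortest 4-neighbor path on a grid of 0 (free) / 1 (wall), or -1."""
    def h(cell):
        # Manhattan distance to the goal
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    best_g = {start: 0}
    queue = [(h(start), 0, start)]          # entries are (f, g, cell)
    while queue:
        f, g, cell = heapq.heappop(queue)
        if cell == goal:
            return g
        if g > best_g[cell]:
            continue                        # stale entry, a shorter route was found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get((nr, nc), float('inf')):
                    best_g[(nr, nc)] = ng
                    heapq.heappush(queue, (ng + h((nr, nc)), ng, (nr, nc)))
    return -1

grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
print(a_star_grid(grid, (0, 0), (2, 0)))    # 6: the path has to go around the wall
```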


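### Bellman-Ford Algorithm

> Core idea
>> In a graph with V vertices, a shortest path between two vertices uses at most V-1 edges, so relaxing every edge V-1 times is enough to find it
>
> Use cases
> 1. Sparse graphs
> 2. Negative edge weights
> 3. Detecting negative cycles
> 
> Time complexity: O(VE)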

In [None]:
def bellman_ford(self, s, t):
    dist = [float('inf')] * self.v
    dist[s] = 0
    pre = [None] * self.v
    # Relax every edge V-1 times
    for _ in range(self.v - 1):
        for i in range(self.v):
            for v, w in self.adj[i]:
                if dist[i] != float('inf') and dist[i] + w < dist[v]:
                    dist[v] = dist[i] + w
                    pre[v] = i

    # Detect negative cycles: an edge that can still be relaxed means one is reachable from s
    for i in range(self.v):
        for v, w in self.adj[i]:
            if dist[i] != float('inf') and dist[i] + w < dist[v]:
                return []
    
    # t is unreachable from s
    if dist[t] == float('inf'):
        return []

    return self._path(s, t, pre)

setattr(WeightGraph, 'bellman_ford', bellman_ford)
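
As a small illustration of the negative-edge and negative-cycle cases, the sketch below runs the same relaxation idea on a plain edge list. The function name `bellman_ford_edges`, the early exit when a round changes nothing, and the sample edges are assumptions for illustration rather than part of the class above.

```python
def bellman_ford_edges(n, edges, s):
    """Distances from s over directed (u, v, w) edges, or None if a negative cycle is reachable."""
    inf = float('inf')
    dist = [inf] * n
    dist[s] = 0
    for _ in range(n - 1):
        changed = False
        for u, v, w in edges:
            # inf + w is still inf, so unreachable vertices never relax anything
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break                            # distances are final, stop early
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return None                      # still relaxable: negative cycle
    return dist

# The negative edge 2 -> 1 makes the detour through 2 cheaper than the direct edge 0 -> 1
print(bellman_ford_edges(3, [(0, 1, 4), (0, 2, 5), (2, 1, -3)], 0))   # [0, 2, 5]
# 1 -> 2 -> 1 has total weight -2, so no shortest distance exists
print(bellman_ford_edges(3, [(0, 1, 1), (1, 2, -1), (2, 1, -1)], 0))  # None
```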



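### Floyd-Warshall Algorithm

> Based on DP: a two-dimensional array holds the shortest distance between every pair of vertices, with the state transition dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
>
> Time complexity: O(V^3)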

In [None]:
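def floyd(self):
    # dist[i][j] starts as the direct edge weight, or infinity when there is no edge
    dist = [[float('inf')] * self.v for _ in range(self.v)]
    for i in range(self.v):
        dist[i][i] = 0
        for j, w in self.adj[i]:
            dist[i][j] = min(dist[i][j], w)
    
    # Allow vertex k as an intermediate stop, one k at a time
    for k in range(self.v):
        for i in range(self.v):
            for j in range(self.v):
                dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
    
    return dist

setattr(WeightGraph, 'floyd', floyd)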
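
A worked example of the state transition on a three-vertex graph may help. The sketch below is a standalone version that takes a weight matrix directly; the name `floyd_warshall` and the sample weights are assumptions for illustration.

```python
INF = float('inf')

def floyd_warshall(weights):
    """All-pairs shortest distances from an n x n weight matrix (INF means no edge)."""
    n = len(weights)
    dist = [row[:] for row in weights]       # copy so the input is left untouched
    for k in range(n):                       # k is the newly allowed intermediate vertex
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

weights = [
    [0,   4,   11],
    [INF, 0,   2],
    [3,   INF, 0],
]
dist = floyd_warshall(weights)
print(dist[0][2])   # 6: going 0 -> 1 -> 2 (4 + 2) beats the direct edge of 11
print(dist[1][0])   # 5: there is no direct edge, but 1 -> 2 -> 0 costs 2 + 3
```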

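### Dijkstra vs. Bellman-Ford vs. Floyd-Warshall

| **Algorithm**      | **Type**                     | **Time complexity**            | **Space complexity** | **Graph types**           | **Negative edge weights** | **Typical use** |
|--------------------|------------------------------|--------------------------------|----------------------|---------------------------|---------------------------|-----------------|
| **Dijkstra**       | Single-source shortest paths | \(O(E \log V)\) (with a heap)  | \(O(V)\)             | **Directed / undirected** | **Not supported**         | **Non-negative edge weights, sparse graphs (E ≪ V²)** |
| **Bellman-Ford**   | Single-source shortest paths | \(O(VE)\)                      | \(O(V)\)             | **Directed / undirected** | **Supported**             | **Negative edge weights, negative-cycle detection** |
| **Floyd-Warshall** | All-pairs shortest paths     | \(O(V^3)\)                     | \(O(V^2)\)           | **Directed / undirected** | **Supported**             | **Small graphs (V up to about 1000)** |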