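1. Actor (the player):

> To play this game well and earn as high a reward as possible, you need to implement a function that takes a state as input and outputs an action. A neural network can be used to approximate this function.
> 
> The remaining task is to train that neural network so that it performs better (earns a higher reward).

2. Critic (the judge):

> To train the actor, you need to know how well the actor is actually performing and adjust the network parameters according to that performance. This is where the "Q-value" from reinforcement learning comes in.
> 
> But the Q-value is also an unknown function, so it too can be approximated with a neural network.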


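Training the Actor-Critic

- The actor observes the current state of the game and takes an action.

- The critic scores the actor's action, based on both the state and the action.

- The actor adjusts its policy (the actor network's parameters) according to the critic's score, aiming to do better next time.

- The critic adjusts its scoring policy (the critic network's parameters) according to the reward given by the system (which plays the role of ground truth) and the scores of the other judges (the critic target).

At first the actor performs randomly and the critic scores randomly.

But thanks to the reward, the critic's scores become more and more accurate and the actor performs better and better.

This feels somewhat like a `GAN`: two networks colliding with each other, boom!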
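
The four training steps can be sketched on a toy problem. This is a minimal illustration under several assumptions that are not part of the notes: the "game" has a single state and two actions (`reward_for` is a made-up reward rule), the actor and critic are small tables rather than neural networks, and because each episode lasts one step the critic's target is simply the reward, so no separate critic target network is needed.

```python
import math
import random

random.seed(0)

def softmax(prefs):
    """Turn action preferences into probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

# Toy one-state game (hypothetical): action 1 pays reward 1, action 0 pays 0.
def reward_for(action):
    return 1.0 if action == 1 else 0.0

actor_prefs = [0.0, 0.0]   # actor parameters: one preference per action
critic_q = [0.0, 0.0]      # critic parameters: one Q-value estimate per action
ACTOR_LR, CRITIC_LR = 0.1, 0.1   # illustrative learning rates

for step in range(2000):
    # 1. The actor looks at the (single) state and picks an action.
    probs = softmax(actor_prefs)
    action = random.choices([0, 1], weights=probs)[0]

    # 2. The critic scores that action.
    score = critic_q[action]

    # 3. The actor shifts its policy in the direction the score suggests
    #    (policy-gradient step for a softmax policy).
    for a in range(2):
        grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
        actor_prefs[a] += ACTOR_LR * score * grad_log_pi

    # 4. The critic moves its estimate toward the observed reward.
    reward = reward_for(action)
    critic_q[action] += CRITIC_LR * (reward - critic_q[action])

final_probs = softmax(actor_prefs)
print("policy:", final_probs)
print("critic:", critic_q)
```

Both tables start uninformed, yet the reward signal alone is enough for the critic's estimate of the paying action to approach 1 and for the actor to prefer that action.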

# GCN practice code

- Import the basic libraries

In [23]:
import torch

torch.__version__  # torch.version is a submodule; __version__ holds the version string

In [24]:
import torch
from torch_geometric.data import Data
from torch_geometric.utils import from_networkx

import networkx as nx
import numpy as np
from random import randint, expovariate
import matplotlib.pyplot as plt

- Generate the network

In [25]:
S_CPU_MAX = []
S_BW_MAX = []

# Randomly generate a graph (20 nodes, 100 links)
net = nx.gnm_random_graph(n=20, m=100)

# Set the CPU capacity of every node and track the minimum and maximum along the way
min_cpu_capacity = 1.0e10
max_cpu_capacity = 0.0
for node_id in net.nodes:
    net.nodes[node_id]['CPU'] = randint(50, 100)
    net.nodes[node_id]['LOCATION'] = randint(0, 2)
    
    if net.nodes[node_id]['CPU'] < min_cpu_capacity:
        min_cpu_capacity = net.nodes[node_id]['CPU']
    if net.nodes[node_id]['CPU'] > max_cpu_capacity:
        max_cpu_capacity = net.nodes[node_id]['CPU']

# Set the bandwidth of every link and track the minimum and maximum along the way
min_bandwidth_capacity = 1.0e10
max_bandwidth_capacity = 0.0
for edge_id in net.edges:
    net.edges[edge_id]['bandwidth'] = randint(50, 100)
    
    if net.edges[edge_id]['bandwidth'] < min_bandwidth_capacity:
        min_bandwidth_capacity = net.edges[edge_id]['bandwidth']
    if net.edges[edge_id]['bandwidth'] > max_bandwidth_capacity:
        max_bandwidth_capacity = net.edges[edge_id]['bandwidth']

# data=True: returns a NodeDataView that yields (node_id, attribute_dict) pairs rather than node IDs only
for s_node_id, s_node_data in net.nodes(data=True):
    S_CPU_MAX.append(s_node_data['CPU'])

# Sum the bandwidth of the links incident to each substrate node
for s_node_id in range(len(net.nodes)):
    total_node_bandwidth = 0.0
    for link_id in net[s_node_id]:
        total_node_bandwidth += net[s_node_id][link_id]['bandwidth']
    S_BW_MAX.append(total_node_bandwidth)
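
The cell above tracks the minimum and maximum CPU and bandwidth, but the later cells never use them. One plausible use, and this is an assumption rather than something the notebook does, is min-max scaling, so that features on very different scales (CPU between 50 and 100, per-node bandwidth sums in the hundreds) become comparable. The helper `min_max_scale` and the sample values below are illustrative only.

```python
def min_max_scale(values):
    """Scale a list of numbers to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

cpu = [52, 95, 100, 63]                        # illustrative CPU capacities
bandwidth_sum = [727.0, 636.0, 927.0, 421.0]   # illustrative per-node bandwidth sums

# Each feature column is scaled by its own minimum and maximum.
cpu_scaled = min_max_scale(cpu)
bw_scaled = min_max_scale(bandwidth_sum)
print(cpu_scaled)
print(bw_scaled)
```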

In [26]:
# S_CPU_Free
s_CPU_remaining = []
s_bandwidth_remaining = []

# 1 marks a node that is currently occupied, 0 a free one
current_embedding = [0] * len(net.nodes)

# Remaining CPU of each node
for s_node_id, s_node_data in net.nodes(data=True):
    s_CPU_remaining.append(s_node_data['CPU'])
    
# Remaining bandwidth around each node (sum over its incident links)
for s_node_id in range(len(net.nodes)):
    total_node_bandwidth = 0.0
    for link_id in net[s_node_id]:
        total_node_bandwidth += net[s_node_id][link_id]['bandwidth']
    s_bandwidth_remaining.append(total_node_bandwidth)

In [27]:
# Feature matrix of the substrate network
substrate_features = []
substrate_features.append(S_CPU_MAX)
substrate_features.append(S_BW_MAX)

substrate_features.append(s_CPU_remaining)
substrate_features.append(s_bandwidth_remaining)

substrate_features.append(current_embedding)

print(substrate_features)

[[52, 95, 100, 63, 96, 63, 50, 60, 60, 58, 97, 73, 58, 63, 80, 54, 51, 80, 52, 83], [727.0, 636.0, 927.0, 421.0, 932.0, 724.0, 698.0, 960.0, 482.0, 992.0, 885.0, 663.0, 824.0, 845.0, 709.0, 515.0, 552.0, 788.0, 931.0, 685.0], [52, 95, 100, 63, 96, 63, 50, 60, 60, 58, 97, 73, 58, 63, 80, 54, 51, 80, 52, 83], [727.0, 636.0, 927.0, 421.0, 932.0, 724.0, 698.0, 960.0, 482.0, 992.0, 885.0, 663.0, 824.0, 845.0, 709.0, 515.0, 552.0, 788.0, 931.0, 685.0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]


In [28]:
substrate_features = torch.tensor(substrate_features)
print(substrate_features)
print(substrate_features.shape)

# Transpose from [num_features, num_nodes] to [num_nodes, num_features]
substrate_features = torch.transpose(substrate_features, 0, 1)
print(substrate_features)
print(substrate_features.shape)

tensor([[ 52.,  95., 100.,  63.,  96.,  63.,  50.,  60.,  60.,  58.,  97.,  73.,
          58.,  63.,  80.,  54.,  51.,  80.,  52.,  83.],
        [727., 636., 927., 421., 932., 724., 698., 960., 482., 992., 885., 663.,
         824., 845., 709., 515., 552., 788., 931., 685.],
        [ 52.,  95., 100.,  63.,  96.,  63.,  50.,  60.,  60.,  58.,  97.,  73.,
          58.,  63.,  80.,  54.,  51.,  80.,  52.,  83.],
        [727., 636., 927., 421., 932., 724., 698., 960., 482., 992., 885., 663.,
         824., 845., 709., 515., 552., 788., 931., 685.],
        [  0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
           0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.]])
torch.Size([5, 20])
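tensor([[ 52., 727.,  52., 727.,   0.],
        [ 95., 636.,  95., 636.,   0.],
        [100., 927., 100., 927.,   0.],
        [ 63., 421.,  63., 421.,   0.],
        [ 96., 932.,  96., 932.,   0.],
        [ 63., 724.,  63., 724.,   0.],
        [ 50., 698.,  50., 698.,   0.],
        [ 60., 960.,  60., 960.,   0.],
        [ 60., 482.,  60., 482.,   0.],
        [ 58., 992.,  58., 992.,   0.],
        [ 97., 885.,  97., 885.,   0.],
        [ 73., 663.,  73., 663.,   0.],
        [ 58., 824.,  58., 824.,   0.],
        [ 63., 845.,  63., 845.,   0.],
        [ 80., 709.,  80., 709.,   0.],
        [ 54., 515.,  54., 515.,   0.],
        [ 51., 552.,  51., 552.,   0.],
        [ 80., 788.,  80., 788.,   0.],
        [ 52., 931.,  52., 931.,   0.],
        [ 83., 685.,  83., 685.,   0.]])
torch.Size([20, 5])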

In [8]:
# substrate_features = torch.reshape(substrate_features, (-1,))
# print(substrate_features.shape)
# print(substrate_features)

torch.Size([100])
tensor([  56.,  586.,   56.,  586.,    0.,   73.,  621.,   73.,  621.,    0.,
         100.,  823.,  100.,  823.,    0.,   92.,  413.,   92.,  413.,    0.,
          88.,  892.,   88.,  892.,    0.,   97., 1170.,   97., 1170.,    0.,
          58.,  960.,   58.,  960.,    0.,   56.,  816.,   56.,  816.,    0.,
          87.,  704.,   87.,  704.,    0.,   53.,  780.,   53.,  780.,    0.,
          83.,  638.,   83.,  638.,    0.,   81.,  744.,   81.,  744.,    0.,
          85.,  546.,   85.,  546.,    0.,   77.,  823.,   77.,  823.,    0.,
          61.,  872.,   61.,  872.,    0.,   86.,  582.,   86.,  582.,    0.,
          96.,  857.,   96.,  857.,    0.,   97.,  612.,   97.,  612.,    0.,
          85.,  851.,   85.,  851.,    0.,   84., 1030.,   84., 1030.,    0.])


In [11]:
# vnr_cpu = torch.tensor([10])
# vnr_bw = torch.tensor([30])
# pending = torch.tensor([2])
# substrate_features = torch.cat((substrate_features, vnr_cpu, vnr_bw, pending), 0)

# substrate_features
# substrate_features.shape

torch.Size([103])
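
The two commented-out cells above flatten the `[20, 5]` feature matrix into 100 values and then append three request-level values (`vnr_cpu`, `vnr_bw`, `pending`), giving 103. The shape arithmetic can be checked with plain lists; this sketch uses an all-zero placeholder matrix and does not reproduce the tensor code itself.

```python
NUM_NODES, NUM_FEATURES = 20, 5

# Placeholder node-major feature matrix (all zeros, for shape checking only).
features = [[0.0] * NUM_FEATURES for _ in range(NUM_NODES)]

# Row-major flatten, the same element order torch.reshape(x, (-1,)) produces.
flat = [value for row in features for value in row]

# Request-level values taken from the commented-out cell above.
vnr_cpu, vnr_bw, pending = 10, 30, 2
state = flat + [vnr_cpu, vnr_bw, pending]

print(len(flat), len(state))
```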

In [29]:
print(substrate_features)

tensor([[ 52., 727.,  52., 727.,   0.],
        [ 95., 636.,  95., 636.,   0.],
        [100., 927., 100., 927.,   0.],
        [ 63., 421.,  63., 421.,   0.],
        [ 96., 932.,  96., 932.,   0.],
        [ 63., 724.,  63., 724.,   0.],
        [ 50., 698.,  50., 698.,   0.],
        [ 60., 960.,  60., 960.,   0.],
        [ 60., 482.,  60., 482.,   0.],
        [ 58., 992.,  58., 992.,   0.],
        [ 97., 885.,  97., 885.,   0.],
        [ 73., 663.,  73., 663.,   0.],
        [ 58., 824.,  58., 824.,   0.],
        [ 63., 845.,  63., 845.,   0.],
        [ 80., 709.,  80., 709.,   0.],
        [ 54., 515.,  54., 515.,   0.],
        [ 51., 552.,  51., 552.,   0.],
        [ 80., 788.,  80., 788.,   0.],
        [ 52., 931.,  52., 931.,   0.],
        [ 83., 685.,  83., 685.,   0.]])
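
The printed matrix is node-major: one row per substrate node and one column per feature, which is the `[num_nodes, num_features]` layout that `GCNConv` expects for its node features. The transpose that produced it from the feature-major list can be mirrored with plain lists, shown here for the first three nodes only.

```python
# Feature-major layout: one row per feature, one column per node
# (first three nodes of the printed output above).
feature_major = [
    [52, 95, 100],           # S_CPU_MAX
    [727.0, 636.0, 927.0],   # S_BW_MAX
    [52, 95, 100],           # s_CPU_remaining
    [727.0, 636.0, 927.0],   # s_bandwidth_remaining
    [0, 0, 0],               # current_embedding
]

# zip(*rows) groups the i-th entry of every row, which is a transpose.
node_major = [list(column) for column in zip(*feature_major)]

print(len(feature_major), len(feature_major[0]))  # rows, columns before
print(len(node_major), len(node_major[0]))        # rows, columns after
print(node_major[0])
```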


- Using `from_networkx`
    - convert the NetworkX graph into a torch_geometric `Data` object

In [30]:
data = from_networkx(net)

In [31]:
print(data)

Data(CPU=[20], LOCATION=[20], bandwidth=[200], edge_index=[2, 200])
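
The graph was created with 100 undirected links, yet `edge_index` has 200 columns and `bandwidth` has 200 entries: each undirected link is stored as two directed edges, one per direction, and its attribute is stored once for each. The sketch below rebuilds that doubling on a tiny graph with a hypothetical helper `to_edge_index`; the column order that `from_networkx` actually produces may differ.

```python
def to_edge_index(undirected_edges):
    """Return [sources, targets] with both directions of every undirected edge."""
    sources, targets = [], []
    for u, v in undirected_edges:
        sources += [u, v]
        targets += [v, u]
    return [sources, targets]

edges = [(0, 1), (1, 2), (0, 2)]   # a small undirected triangle
bandwidth = [60, 75, 90]           # one value per undirected edge

edge_index = to_edge_index(edges)
# Each edge attribute is repeated once per direction, in the same order.
edge_bandwidth = [bw for bw in bandwidth for _ in range(2)]

print(edge_index)
print(edge_bandwidth)
```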


### Graph Convolutional Network
- Define the GCN class

In [32]:
from torch.nn import Linear
from torch_geometric.nn import GCNConv


class GCN(torch.nn.Module):
    def __init__(self):
        super(GCN, self).__init__()
        torch.manual_seed(12345)
        # in_channels: number of input features per node; out_channels: number of output features per node
        self.conv1 = GCNConv(in_channels=5, out_channels=4)
        self.conv2 = GCNConv(in_channels=4, out_channels=4)
        self.conv3 = GCNConv(in_channels=4, out_channels=1)
        self.classifier = Linear(1, 20)

    def forward(self, x, edge_index):
        h = self.conv1(x, edge_index)
        h = h.tanh()
        h = self.conv2(h, edge_index)
        h = h.tanh()
        h = self.conv3(h, edge_index)
        h = h.tanh()  # Final GNN embedding space.
        
        # Apply a final (linear) classifier.
        out = self.classifier(h)

        return out, h

model = GCN()
print(model)

GCN(
  (conv1): GCNConv(5, 4)
  (conv2): GCNConv(4, 4)
  (conv3): GCNConv(4, 1)
  (classifier): Linear(in_features=1, out_features=20, bias=True)
)
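
To make the three `GCNConv` layers less of a black box: each one applies a linear map to the node features and then averages over neighbours using the symmetrically normalised adjacency matrix with self-loops, i.e. D^-1/2 (A + I) D^-1/2 X W plus a bias. The following is a plain-Python sketch of that propagation rule on a 3-node path graph. The bias is omitted, the 1x1 weight is picked by hand, and `gcn_propagate` is an illustrative helper, not part of torch_geometric.

```python
import math

def gcn_propagate(adjacency, features, weight):
    """One GCN layer without bias: D^-1/2 (A + I) D^-1/2 X W."""
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalisation by the degrees of both endpoints.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Linear transform X W (weight has shape [in_features, out_features]).
    xw = [[sum(x * w for x, w in zip(row, col)) for col in zip(*weight)]
          for row in features]
    # Neighbourhood aggregation with the normalised adjacency.
    return [[sum(norm[i][k] * xw[k][c] for k in range(n))
             for c in range(len(xw[0]))] for i in range(n)]

# Path graph 0 - 1 - 2 with one scalar feature per node.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
features = [[1.0], [2.0], [3.0]]
weight = [[1.0]]   # hand-picked 1x1 weight, purely illustrative

hidden = gcn_propagate(adjacency, features, weight)
print(hidden)
```

Stacking three such layers, as the class above does, lets every node's output depend on nodes up to three hops away.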


In [33]:
model = GCN()

print(substrate_features.shape, data.edge_index.shape)
print(substrate_features)

out, embedding = model(substrate_features, data.edge_index)
# out, embedding = model(data.x, data.edge_index)
print(f'Embedding shape: {list(embedding.shape)}')

torch.Size([20, 5]) torch.Size([2, 200])
tensor([[ 52., 727.,  52., 727.,   0.],
        [ 95., 636.,  95., 636.,   0.],
        [100., 927., 100., 927.,   0.],
        [ 63., 421.,  63., 421.,   0.],
        [ 96., 932.,  96., 932.,   0.],
        [ 63., 724.,  63., 724.,   0.],
        [ 50., 698.,  50., 698.,   0.],
        [ 60., 960.,  60., 960.,   0.],
        [ 60., 482.,  60., 482.,   0.],
        [ 58., 992.,  58., 992.,   0.],
        [ 97., 885.,  97., 885.,   0.],
        [ 73., 663.,  73., 663.,   0.],
        [ 58., 824.,  58., 824.,   0.],
        [ 63., 845.,  63., 845.,   0.],
        [ 80., 709.,  80., 709.,   0.],
        [ 54., 515.,  54., 515.,   0.],
        [ 51., 552.,  51., 552.,   0.],
        [ 80., 788.,  80., 788.,   0.],
        [ 52., 931.,  52., 931.,   0.],
        [ 83., 685.,  83., 685.,   0.]])
Embedding shape: [20, 1]


In [34]:
print(embedding)
print(embedding.shape)

tensor([[0.9209],
        [0.8982],
        [0.9205],
        [0.8444],
        [0.9381],
        [0.8986],
        [0.9124],
        [0.9377],
        [0.8682],
        [0.9381],
        [0.9202],
        [0.8987],
        [0.9236],
        [0.9322],
        [0.9102],
        [0.8654],
        [0.8890],
        [0.9067],
        [0.9201],
        [0.8939]], grad_fn=<TanhBackward0>)
torch.Size([20, 1])
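
The embedding holds one scalar per substrate node. In an actor-critic setting like the one described at the top of these notes, one possible way to turn such scores into an action is a softmax over the nodes that skips nodes already in use (the role `current_embedding` could play). The notebook does not do this; the sketch below is an assumption, and `masked_softmax` and the `occupied` vector are hypothetical.

```python
import math

def masked_softmax(scores, occupied):
    """Softmax over scores, giving probability 0 to occupied entries."""
    free = [s for s, o in zip(scores, occupied) if not o]
    if not free:
        raise ValueError("no free node to choose from")
    m = max(free)  # subtract the max for numerical stability
    exps = [0.0 if o else math.exp(s - m) for s, o in zip(scores, occupied)]
    total = sum(exps)
    return [e / total for e in exps]

scores = [0.9209, 0.8982, 0.9205, 0.8444, 0.9381]  # first five rows of the embedding above
occupied = [0, 0, 1, 0, 0]                          # hypothetical: node 2 is already in use

probs = masked_softmax(scores, occupied)
best = max(range(len(probs)), key=probs.__getitem__)
print(probs, best)
```

Because the scores printed above all lie close together, the resulting probabilities over free nodes are nearly uniform.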


In [35]:
print(out.shape)
print(out)

torch.Size([20, 20])
tensor([[ 0.6984, -0.1664,  1.0992,  0.2334,  0.0841, -0.7660, -0.6426,  0.2313,
          0.8649,  0.2041,  0.9959,  0.2864,  0.2800,  0.1531, -0.8035,  1.3454,
         -0.8052,  1.2415,  0.3402, -1.1153],
        [ 0.6996, -0.1617,  1.0916,  0.2430,  0.0748, -0.7520, -0.6358,  0.2139,
          0.8627,  0.2221,  0.9823,  0.3015,  0.2814,  0.1532, -0.7935,  1.3352,
         -0.8030,  1.2291,  0.3286, -1.0975],
        [ 0.6984, -0.1663,  1.0991,  0.2335,  0.0840, -0.7658, -0.6425,  0.2311,
          0.8649,  0.2044,  0.9957,  0.2866,  0.2800,  0.1531, -0.8034,  1.3452,
         -0.8052,  1.2413,  0.3400, -1.1150],
        [ 0.7024, -0.1505,  1.0735,  0.2657,  0.0527, -0.7188, -0.6196,  0.1728,
          0.8574,  0.2646,  0.9500,  0.3372,  0.2849,  0.1533, -0.7698,  1.3111,
         -0.7976,  1.1997,  0.3013, -1.0555],
        [ 0.6975, -0.1700,  1.1050,  0.2261,  0.0912, -0.7766, -0.6478,  0.2445,
          0.8666,  0.1905,  1.0062,  0.2750,  0.2789,  0.1530, -0.