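1. Actor (the player):

> To play the game well and earn as high a reward as possible, you need a function that takes a state as input and outputs an action. A neural network can approximate this function.
>
> The remaining task is to train that network so that it performs better (earns a higher reward).

2. Critic (the judge):

> To train the actor, you need to know how well it is actually performing, and then adjust the network's parameters accordingly. This is where the "Q-value" from reinforcement learning comes in.
>
> The Q-value is itself an unknown function, so it can also be approximated with a neural network.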
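
As a minimal sketch of the two roles, assuming PyTorch and a continuous action space: the class names `Actor` and `Critic`, the layer sizes, and the `tanh` output are illustrative choices and do not come from these notes.

```python
import torch
from torch import nn


class Actor(nn.Module):
    """Maps a state to an action (hypothetical sizes)."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),  # bound actions to [-1, 1]
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


class Critic(nn.Module):
    """Maps a (state, action) pair to a scalar Q-value estimate."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # The critic sees both the state and the action the actor chose.
        return self.net(torch.cat([state, action], dim=-1))


actor = Actor(state_dim=4, action_dim=2)
critic = Critic(state_dim=4, action_dim=2)
state = torch.randn(8, 4)        # a batch of 8 made-up states
action = actor(state)            # shape [8, 2]
q_value = critic(state, action)  # shape [8, 1]
print(action.shape, q_value.shape)
```

Both networks start with random weights, so the actions and scores are meaningless until training begins.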


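Training Actor-Critic

- The actor observes the current state of the game and chooses an action.

- The critic scores the actor's move based on both the state and the action.

- The actor adjusts its policy (the actor network's parameters) according to the critic's score, aiming to do better next time.

- The critic adjusts its scoring (the critic network's parameters) using the reward from the environment (which plays the role of ground truth) and the score given by another judge (the critic target).

At the start, the actor acts randomly and the critic scores randomly.

Because of the reward signal, however, the critic's scores gradually become more accurate and the actor's performance gradually improves.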
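
The four steps above can be sketched as a single update function. This is only one possible reading, assuming a DDPG-style setup (deterministic actor, Q(s, a) critic, slowly updated target networks). The names `update` and `q`, the hyperparameters `gamma` and `tau`, and the random batch are all hypothetical.

```python
import copy
import torch
from torch import nn

torch.manual_seed(0)
state_dim, action_dim, gamma, tau = 4, 2, 0.99, 0.005

actor = nn.Sequential(nn.Linear(state_dim, 32), nn.ReLU(),
                      nn.Linear(32, action_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 32), nn.ReLU(),
                       nn.Linear(32, 1))
# Target networks: slowly updated copies that play "the other judge".
actor_target = copy.deepcopy(actor)
critic_target = copy.deepcopy(critic)

actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)


def q(net, state, action):
    """Score a (state, action) pair with the given critic network."""
    return net(torch.cat([state, action], dim=-1))


def update(state, action, reward, next_state, done):
    # Critic step: regress Q(s, a) towards reward + gamma * Q_target(s', a').
    with torch.no_grad():
        next_action = actor_target(next_state)
        target = reward + gamma * (1.0 - done) * q(critic_target, next_state, next_action)
    critic_loss = nn.functional.mse_loss(q(critic, state, action), target)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor step: change the policy so that the critic scores it higher.
    actor_loss = -q(critic, state, actor(state)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()

    # Soft-update the targets towards the online networks.
    with torch.no_grad():
        for net, tgt in ((actor, actor_target), (critic, critic_target)):
            for p, p_t in zip(net.parameters(), tgt.parameters()):
                p_t.mul_(1.0 - tau).add_(tau * p)
    return critic_loss.item(), actor_loss.item()


# One update on a made-up batch of transitions.
batch = 16
losses = update(
    state=torch.randn(batch, state_dim),
    action=torch.rand(batch, action_dim) * 2 - 1,
    reward=torch.randn(batch, 1),
    next_state=torch.randn(batch, state_dim),
    done=torch.zeros(batch, 1),
)
print(losses)
```

In practice the batch would come from transitions collected while the actor plays, not from random tensors.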

It feels somewhat like a `GAN`: two networks pushing against each other, boom!

# GCN practice code

- Import the basic libraries

In [1]:
import torch

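# Note: `torch.version` is a module (hence the output below); the version string is `torch.__version__`.
torch.version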



<module 'torch.version' from 'e:\\conda3\\envs\\test\\lib\\site-packages\\torch\\version.py'>

In [2]:
import torch
from torch_geometric.data import Data
from torch_geometric.utils import from_networkx

import networkx as nx
import numpy as np
from random import randint, expovariate
import matplotlib.pyplot as plt

- Generate the network

In [3]:
S_CPU_MAX = []
S_BW_MAX = []

# Randomly generate a graph (20 nodes, 100 links)
net = nx.gnm_random_graph(n=20, m=100)

# Set the CPU capacity of every node while tracking the minimum and maximum
min_cpu_capacity = 1.0e10
max_cpu_capacity = 0.0
for node_id in net.nodes:
    net.nodes[node_id]['CPU'] = randint(50, 100)
    net.nodes[node_id]['LOCATION'] = randint(0, 2)
    if net.nodes[node_id]['CPU'] < min_cpu_capacity:
        min_cpu_capacity = net.nodes[node_id]['CPU']
    if net.nodes[node_id]['CPU'] > max_cpu_capacity:
        max_cpu_capacity = net.nodes[node_id]['CPU']

# Set the bandwidth of every link while tracking the minimum and maximum
min_bandwidth_capacity = 1.0e10
max_bandwidth_capacity = 0.0
for edge_id in net.edges:
    net.edges[edge_id]['bandwidth'] = randint(50, 100)
    if net.edges[edge_id]['bandwidth'] < min_bandwidth_capacity:
        min_bandwidth_capacity = net.edges[edge_id]['bandwidth']
    if net.edges[edge_id]['bandwidth'] > max_bandwidth_capacity:
        max_bandwidth_capacity = net.edges[edge_id]['bandwidth']

# data=True: returns a NodeDataView, which yields each node's ID together with its attribute dict
for s_node_id, s_node_data in net.nodes(data=True):
    S_CPU_MAX.append(s_node_data['CPU'])

# For each substrate node, sum the bandwidth of its incident links
for s_node_id in range(len(net.nodes)):
    total_node_bandwidth = 0.0
    for link_id in net[s_node_id]:
        total_node_bandwidth += net[s_node_id][link_id]['bandwidth']
    S_BW_MAX.append(total_node_bandwidth)
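
The same statistics can also be gathered with NetworkX built-ins instead of manual bookkeeping. The sketch below rebuilds a comparable graph so that it runs on its own. The fixed seeds are an assumption added for reproducibility and are not part of the cell above.

```python
import networkx as nx
from random import randint, seed

seed(0)
net = nx.gnm_random_graph(n=20, m=100, seed=0)
for node_id in net.nodes:
    net.nodes[node_id]['CPU'] = randint(50, 100)
for edge_id in net.edges:
    net.edges[edge_id]['bandwidth'] = randint(50, 100)

# Per-node CPU capacity, in node order.
cpu = nx.get_node_attributes(net, 'CPU')
S_CPU_MAX = [cpu[n] for n in net.nodes]
min_cpu_capacity, max_cpu_capacity = min(S_CPU_MAX), max(S_CPU_MAX)

# Weighted degree = sum of 'bandwidth' over the links incident to each node.
S_BW_MAX = [float(bw) for _, bw in net.degree(weight='bandwidth')]

# Per-link bandwidth and its extremes.
bandwidths = [bw for _, _, bw in net.edges(data='bandwidth')]
min_bandwidth_capacity, max_bandwidth_capacity = min(bandwidths), max(bandwidths)

print(min_cpu_capacity, max_cpu_capacity)
print(min_bandwidth_capacity, max_bandwidth_capacity)
```

`net.degree(weight='bandwidth')` yields `(node, weighted_degree)` pairs, which matches the nested loop that sums the bandwidth around each node.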

In [4]:
# S_CPU_Free
s_CPU_remaining = []
s_bandwidth_remaining = []

# 1 marks a node that is currently occupied, 0 a free one
current_embedding = [0] * len(net.nodes)

# Remaining CPU of each node
for s_node_id, s_node_data in net.nodes(data=True):
    s_CPU_remaining.append(s_node_data['CPU'])

# Remaining bandwidth around each node (sum over its incident links)
for s_node_id in range(len(net.nodes)):
    total_node_bandwidth = 0.0
    for link_id in net[s_node_id]:
        total_node_bandwidth += net[s_node_id][link_id]['bandwidth']
    s_bandwidth_remaining.append(total_node_bandwidth)

In [5]:
# Substrate network feature matrix
substrate_features = []
substrate_features.append(S_CPU_MAX)
substrate_features.append(S_BW_MAX)
substrate_features.append(s_CPU_remaining)
substrate_features.append(s_bandwidth_remaining)
substrate_features.append(current_embedding)

print(substrate_features)

[[95, 82, 77, 51, 78, 62, 87, 92, 66, 100, 61, 68, 52, 76, 50, 68, 57, 55, 85, 66], [856.0, 1243.0, 733.0, 587.0, 899.0, 705.0, 786.0, 556.0, 628.0, 763.0, 782.0, 763.0, 699.0, 957.0, 715.0, 960.0, 871.0, 737.0, 447.0, 1199.0], [95, 82, 77, 51, 78, 62, 87, 92, 66, 100, 61, 68, 52, 76, 50, 68, 57, 55, 85, 66], [856.0, 1243.0, 733.0, 587.0, 899.0, 705.0, 786.0, 556.0, 628.0, 763.0, 782.0, 763.0, 699.0, 957.0, 715.0, 960.0, 871.0, 737.0, 447.0, 1199.0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]


In [6]:
substrate_features = torch.tensor(substrate_features)
print(substrate_features)
print(substrate_features.shape)

# transpose: swap dimensions 0 and 1, giving one row per node
substrate_features = torch.transpose(substrate_features, 0, 1)
print(substrate_features)
print(substrate_features.shape)

tensor([[  95.,   82.,   77.,   51.,   78.,   62.,   87.,   92.,   66.,  100.,
           61.,   68.,   52.,   76.,   50.,   68.,   57.,   55.,   85.,   66.],
        [ 856., 1243.,  733.,  587.,  899.,  705.,  786.,  556.,  628.,  763.,
          782.,  763.,  699.,  957.,  715.,  960.,  871.,  737.,  447., 1199.],
        [  95.,   82.,   77.,   51.,   78.,   62.,   87.,   92.,   66.,  100.,
           61.,   68.,   52.,   76.,   50.,   68.,   57.,   55.,   85.,   66.],
        [ 856., 1243.,  733.,  587.,  899.,  705.,  786.,  556.,  628.,  763.,
          782.,  763.,  699.,  957.,  715.,  960.,  871.,  737.,  447., 1199.],
        [   0.,    0.,    0.,    0.,    0.,    0.,    0.,    0.,    0.,    0.,
            0.,    0.,    0.,    0.,    0.,    0.,    0.,    0.,    0.,    0.]])
torch.Size([5, 20])
tensor([[  95.,  856.,   95.,  856.,    0.],
        [  82., 1243.,   82., 1243.,    0.],
        [  77.,  733.,   77.,  733.,    0.],
        [  51.,  587.,   51.,  587.,    0.],
    

In [7]:
# substrate_features = torch.reshape(substrate_features, (-1,))
# print(substrate_features.shape)
# print(substrate_features)

In [8]:
# vnr_cpu = torch.tensor([10])
# vnr_bw = torch.tensor([30])
# pending = torch.tensor([2])
# substrate_features = torch.cat((substrate_features, vnr_cpu, vnr_bw, pending), 0)

# substrate_features
# substrate_features.shape

In [9]:
print(substrate_features)

tensor([[  95.,  856.,   95.,  856.,    0.],
        [  82., 1243.,   82., 1243.,    0.],
        [  77.,  733.,   77.,  733.,    0.],
        [  51.,  587.,   51.,  587.,    0.],
        [  78.,  899.,   78.,  899.,    0.],
        [  62.,  705.,   62.,  705.,    0.],
        [  87.,  786.,   87.,  786.,    0.],
        [  92.,  556.,   92.,  556.,    0.],
        [  66.,  628.,   66.,  628.,    0.],
        [ 100.,  763.,  100.,  763.,    0.],
        [  61.,  782.,   61.,  782.,    0.],
        [  68.,  763.,   68.,  763.,    0.],
        [  52.,  699.,   52.,  699.,    0.],
        [  76.,  957.,   76.,  957.,    0.],
        [  50.,  715.,   50.,  715.,    0.],
        [  68.,  960.,   68.,  960.,    0.],
        [  57.,  871.,   57.,  871.,    0.],
        [  55.,  737.,   55.,  737.,    0.],
        [  85.,  447.,   85.,  447.,    0.],
        [  66., 1199.,   66., 1199.,    0.]])
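
The raw features span very different ranges (CPU up to 100, bandwidth sums above 1000), and the notebook tracks minimum and maximum capacities without using them. One possible use, shown here purely as an assumption, is column-wise min-max scaling before the features reach the `tanh` layers of the model. The helper name `min_max_normalize` and the `eps` guard are hypothetical.

```python
import torch


def min_max_normalize(features: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Scale every column of a [num_nodes, num_features] tensor to [0, 1]."""
    col_min = features.min(dim=0, keepdim=True).values
    col_max = features.max(dim=0, keepdim=True).values
    # Guard against constant columns (e.g. an all-zero embedding flag).
    span = (col_max - col_min).clamp_min(eps)
    return (features - col_min) / span


# Three rows taken from the feature matrix above (CPU, bandwidth sum, occupied flag).
features = torch.tensor([[95.0, 856.0, 0.0],
                         [82.0, 1243.0, 0.0],
                         [51.0, 587.0, 0.0]])
normalized = min_max_normalize(features)
print(normalized)
```

A constant column stays at zero and does not produce a division by zero.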


- Use `from_networkx`
    - convert the NetworkX graph into a torch_geometric `Data` object

In [10]:
data = from_networkx(net)

In [11]:
print(data)

Data(edge_index=[2, 200], CPU=[20], LOCATION=[20], bandwidth=[200], num_nodes=20)
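
`edge_index` has 200 columns although the graph has 100 links, which is consistent with every undirected link being stored once per direction. The plain-Python sketch below illustrates that layout on a made-up 4-node graph. The column order produced by `from_networkx` may differ from the order shown here.

```python
# Hypothetical graph with 3 undirected links: (u, v, bandwidth).
links = [(0, 1, 60), (0, 2, 75), (2, 3, 90)]

sources, targets, bandwidth = [], [], []
for u, v, bw in links:
    # Store each undirected link once per direction.
    sources += [u, v]
    targets += [v, u]
    bandwidth += [bw, bw]

edge_index = [sources, targets]  # shape [2, 2 * num_links]
print(edge_index)   # [[0, 1, 0, 2, 2, 3], [1, 0, 2, 0, 3, 2]]
print(bandwidth)    # [60, 60, 75, 75, 90, 90]
```

The edge attribute is duplicated in the same way, which is why `bandwidth` also has 200 entries in the `Data` object above.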


### Graph Convolution Network
- Define the GCN class

In [12]:
from torch.nn import Linear
from torch_geometric.nn import GCNConv


class GCN(torch.nn.Module):
    def __init__(self):
        super(GCN, self).__init__()
        torch.manual_seed(12345)
        # in_channels: number of input features per node   out_channels: number of output features per node
        self.conv1 = GCNConv(in_channels=5, out_channels=4)
        self.conv2 = GCNConv(in_channels=4, out_channels=4)
        self.conv3 = GCNConv(in_channels=4, out_channels=1)
        self.classifier = Linear(1, 20)

    def forward(self, x, edge_index):
        h = self.conv1(x, edge_index)
        h = h.tanh()
        h = self.conv2(h, edge_index)
        h = h.tanh()
        h = self.conv3(h, edge_index)
        h = h.tanh()  # Final GNN embedding space.
        
        # Apply a final (linear) classifier.
        out = self.classifier(h)

        return out, h

model = GCN()
print(model)

GCN(
  (conv1): GCNConv(5, 4)
  (conv2): GCNConv(4, 4)
  (conv3): GCNConv(4, 1)
  (classifier): Linear(in_features=1, out_features=20, bias=True)
)
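
To make the role of each `GCNConv` layer more concrete, here is a dense sketch of the propagation rule it is based on, D^-1/2 (A + I) D^-1/2 X W: add self-loops, normalize symmetrically by node degree, then apply a weight matrix. The function name `gcn_layer_dense` and the toy graph are hypothetical, and the bias term is omitted. This illustrates the rule and is not the library's implementation.

```python
import torch


def gcn_layer_dense(x: torch.Tensor, edge_index: torch.Tensor,
                    weight: torch.Tensor) -> torch.Tensor:
    """Dense form of the propagation rule D^-1/2 (A + I) D^-1/2 X W."""
    num_nodes = x.size(0)
    adj = torch.zeros(num_nodes, num_nodes)
    adj[edge_index[0], edge_index[1]] = 1.0
    adj = adj + torch.eye(num_nodes)         # add self-loops
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)  # diagonal of D^-1/2
    adj_norm = deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)
    return adj_norm @ x @ weight


# Hypothetical path graph 0 - 1 - 2, each link stored in both directions.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])
x = torch.tensor([[1.0], [2.0], [3.0]])  # one feature per node
weight = torch.ones(1, 1)                # trivial 1x1 weight matrix
out = gcn_layer_dense(x, edge_index, weight)
print(out)
```

Each output row mixes a node's own feature with those of its neighbours, so stacking three such layers lets information travel up to three hops.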


In [13]:
model = GCN()

print(substrate_features.shape, data.edge_index.shape)
print(substrate_features)

out, embedding = model(substrate_features, data.edge_index)
# out, embedding = model(data.x, data.edge_index)
print(f'Embedding shape: {list(embedding.shape)}')

torch.Size([20, 5]) torch.Size([2, 200])
tensor([[  95.,  856.,   95.,  856.,    0.],
        [  82., 1243.,   82., 1243.,    0.],
        [  77.,  733.,   77.,  733.,    0.],
        [  51.,  587.,   51.,  587.,    0.],
        [  78.,  899.,   78.,  899.,    0.],
        [  62.,  705.,   62.,  705.,    0.],
        [  87.,  786.,   87.,  786.,    0.],
        [  92.,  556.,   92.,  556.,    0.],
        [  66.,  628.,   66.,  628.,    0.],
        [ 100.,  763.,  100.,  763.,    0.],
        [  61.,  782.,   61.,  782.,    0.],
        [  68.,  763.,   68.,  763.,    0.],
        [  52.,  699.,   52.,  699.,    0.],
        [  76.,  957.,   76.,  957.,    0.],
        [  50.,  715.,   50.,  715.,    0.],
        [  68.,  960.,   68.,  960.,    0.],
        [  57.,  871.,   57.,  871.,    0.],
        [  55.,  737.,   55.,  737.,    0.],
        [  85.,  447.,   85.,  447.,    0.],
        [  66., 1199.,   66., 1199.,    0.]])
Embedding shape: [20, 1]


In [14]:
print(embedding)
print(embedding.shape)

tensor([[0.7550],
        [0.8366],
        [0.7595],
        [0.6825],
        [0.7760],
        [0.7318],
        [0.7655],
        [0.6840],
        [0.7156],
        [0.7574],
        [0.7429],
        [0.7295],
        [0.7424],
        [0.7824],
        [0.7400],
        [0.8235],
        [0.7747],
        [0.7684],
        [0.6608],
        [0.8380]], grad_fn=<TanhBackward0>)
torch.Size([20, 1])


In [15]:
print(out.shape)
print(out)

torch.Size([20, 20])
tensor([[-0.9013,  0.0658,  0.2151, -0.8476,  0.4945, -1.5472,  0.3259, -0.5638,
         -1.6169, -0.0826, -0.5106, -0.9745,  0.1948,  0.6400,  0.3509, -0.2696,
         -0.3130, -0.4192, -0.3950,  0.6328],
        [-0.9017,  0.1159,  0.1428, -0.9109,  0.4796, -1.6282,  0.3370, -0.6216,
         -1.6902, -0.1157, -0.5544, -0.9749,  0.2495,  0.6201,  0.3672, -0.2327,
         -0.2910, -0.4151, -0.3817,  0.6808],
        [-0.9013,  0.0685,  0.2111, -0.8511,  0.4937, -1.5517,  0.3265, -0.5670,
         -1.6210, -0.0844, -0.5130, -0.9745,  0.1978,  0.6389,  0.3518, -0.2676,
         -0.3118, -0.4189, -0.3943,  0.6354],
        [-0.9009,  0.0212,  0.2794, -0.7914,  0.5078, -1.4754,  0.3161, -0.5125,
         -1.5518, -0.0531, -0.4718, -0.9741,  0.1462,  0.6577,  0.3365, -0.3024,
         -0.3325, -0.4227, -0.4069,  0.5901],
        [-0.9014,  0.0787,  0.1965, -0.8639,  0.4907, -1.5681,  0.3288, -0.5787,
         -1.6358, -0.0911, -0.5219, -0.9746,  0.2089,  0.6349,  0.