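## Graph Autoencoder (GAE) & Variational Graph Autoencoder (VGAE)
### Core idea of GAE  
GAE is an unsupervised learning framework built on graph convolutional networks. An encoder-decoder structure learns low-dimensional node embeddings and reconstructs the graph structure from them.  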
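#### Encoder
* A 2-layer GCN maps the node features and the graph structure to low-dimensional embeddings:
$$
Z = \mathrm{GCN}(X, A) = \tilde{A}\,\mathrm{ReLU}(\tilde{A}XW_{0})W_{1}
$$
where:
1. $\tilde{A}=D^{-\frac{1}{2}}AD^{-\frac{1}{2}}$ is the symmetrically normalized adjacency matrix, with $D$ the degree matrix
2. $X$ is the node feature matrix and $A$ is the adjacency matrix
3. $W_{0}, W_{1}$ are learnable parameters
4. No activation is applied after the second layer, which matches `GCNEncoder.forward` in the code cells of this notebook
* The output is a deterministic node embedding $Z$: each node corresponds to one fixed vector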
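
The encoder formula can be checked on a small dense example. The following is a minimal sketch in plain PyTorch, not the notebook's `GCNConv`-based encoder: the helper names `normalize_adjacency` and `gcn_encode`, the 4-node path graph, and the random weights are all assumptions made for illustration. Self-loops are added before normalizing, following the usual GCN convention.

```python
import torch

def normalize_adjacency(A: torch.Tensor) -> torch.Tensor:
    """Return D^{-1/2} A D^{-1/2} for a dense adjacency matrix A."""
    deg = A.sum(dim=1)
    # Isolated nodes have degree 0; map their scaling factor to 0 instead of inf.
    d_inv_sqrt = torch.where(deg > 0, deg.pow(-0.5), torch.zeros_like(deg))
    return d_inv_sqrt.unsqueeze(1) * A * d_inv_sqrt.unsqueeze(0)

def gcn_encode(X, A, W0, W1):
    """Two-layer GCN encoder: Z = A_norm @ ReLU(A_norm @ X @ W0) @ W1."""
    A_norm = normalize_adjacency(A)
    H = torch.relu(A_norm @ X @ W0)   # hidden layer with ReLU
    return A_norm @ H @ W1            # linear output layer, no activation

torch.manual_seed(0)
# Toy graph (assumption): path 0-1-2-3, plus self-loops.
A = torch.tensor([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=torch.float)
A = A + torch.eye(4)
X = torch.randn(4, 5)    # 4 nodes, 5 input features
W0 = torch.randn(5, 8)   # hidden width 8
W1 = torch.randn(8, 2)   # embedding dimension 2
A_norm = normalize_adjacency(A)
Z = gcn_encode(X, A, W0, W1)
```

Each row of `Z` is the fixed embedding of one node; running the function twice with the same inputs gives the same result, which is what "deterministic" means here.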
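#### Decoder
* The adjacency matrix is reconstructed from the inner products of the node embeddings:  
$$
\hat{A} = \sigma(ZZ^T)
$$
1. $\sigma$ is the sigmoid function, and $\hat{A}_{ij}$ is the predicted probability that an edge exists between node $i$ and node $j$  
#### Loss function  
* Minimize the cross-entropy between the original adjacency matrix $A$ and the reconstructed matrix $\hat{A}$:  
$$
L_{GAE} = - \sum_{i, j}\left[A_{ij}\log\hat{A}_{ij}+(1-A_{ij})\log(1-\hat{A}_{ij})\right]
$$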
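
The decoder and the loss can be written in a few lines of dense PyTorch. This is a sketch under stated assumptions: `inner_product_decode` and `reconstruction_loss` are hypothetical helper names, the graph is a toy 4-node path, and the loss sums over all node pairs exactly as in the formula. PyG's `GAE.recon_loss` differs in that it samples negative edges instead of enumerating every pair.

```python
import torch
import torch.nn.functional as F

def inner_product_decode(Z: torch.Tensor) -> torch.Tensor:
    """Edge probabilities sigmoid(Z Z^T) for every node pair."""
    return torch.sigmoid(Z @ Z.t())

def reconstruction_loss(Z: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy between A and sigmoid(Z Z^T), summed over all pairs."""
    logits = Z @ Z.t()
    # The *_with_logits variant applies the sigmoid internally,
    # which is numerically safer than log(sigmoid(...)).
    return F.binary_cross_entropy_with_logits(logits, A, reduction="sum")

torch.manual_seed(0)
# Toy graph (assumption): path 0-1-2-3.
A = torch.tensor([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=torch.float)
Z = torch.randn(4, 2)               # stand-in embeddings
A_hat = inner_product_decode(Z)     # symmetric, entries in (0, 1)
loss = reconstruction_loss(Z, A)

# The same value computed directly from the formula.
manual = -(A * torch.log(A_hat) + (1 - A) * torch.log(1 - A_hat)).sum()
```

Because $ZZ^T$ is symmetric, the decoder can only model undirected graphs, and the diagonal entries $\hat{A}_{ii}=\sigma(\lVert z_i\rVert^2)$ are always at least 0.5.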

In [9]:
import torch
from torch_geometric.datasets import Planetoid
import torch_geometric.transforms as T
from torch_geometric.nn import GCNConv
from torch_geometric.utils import train_test_split_edges
from torch_geometric.nn import GAE

In [10]:
# Encoder
class GCNEncoder(torch.nn.Module):
    def __init__(self, in_channels, out_channels):
        super(GCNEncoder, self).__init__()
        self.conv1 = GCNConv(in_channels, 2*out_channels, cached=True)
        self.conv2 = GCNConv(2*out_channels, out_channels, cached=True)
    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv2(x, edge_index)

In [11]:
out_channels = 2 # dimension of the latent node embedding
num_features = 100 # number of input node features
epochs = 100

model = GAE(GCNEncoder(num_features, out_channels))

# move to GPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

# Initialize the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

In [5]:
import os.path as osp
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans
from sklearn.metrics.cluster import (v_measure_score, homogeneity_score, completeness_score)
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import torch_geometric.transforms as T
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import GCNConv
from torch_geometric.nn.models.autoencoder import ARGVA
from torch_geometric.utils import train_test_split_edges

In [None]:
class VEncoder(torch.nn.Module):
    def __init__(self, in_channels, out_channels):
        super(VEncoder, self).__init__()
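        # Layer 1: input features -> 2x the latent dimension (shared features for the mean and log-std branches)
        self.conv1 = GCNConv(in_channels, 2*out_channels, cached=True)
        # Mean branch: maps the hidden features to the latent mean vector
        self.conv_mu = GCNConv(2*out_channels, out_channels, cached=True)
        # Log-std branch: maps the hidden features to the latent log standard deviation vector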
        self.conv_logstd = GCNConv(2*out_channels, out_channels, cached=True)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv_mu(x, edge_index), self.conv_logstd(x, edge_index)

In [None]:
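# Discriminator (adversarial regularizer on the embeddings, not a decoder; ARGVA uses an inner-product decoder by default)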
class Discriminator(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super(Discriminator, self).__init__()
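        # Fully connected layer 1: input dimension -> hidden dimension
        self.lin1 = torch.nn.Linear(in_channels, hidden_channels)
        # Fully connected layer 2: hidden -> hidden
        self.lin2 = torch.nn.Linear(hidden_channels, hidden_channels)
        # Output layer: hidden -> discriminator output dimension (e.g. a single real/fake score)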
        self.lin3 = torch.nn.Linear(hidden_channels, out_channels)
    
    def forward(self, x):
        x = F.relu(self.lin1(x))
        x = F.relu(self.lin2(x))
        x = self.lin3(x)
        return x


#### Initialize the model

In [16]:
num_features = 1000
latent_size = 32

# Encoder
encoder = VEncoder(num_features, out_channels=latent_size)
# Discriminator
discriminator = Discriminator(in_channels=latent_size, hidden_channels=64, out_channels=1)

# Model
model = ARGVA(encoder, discriminator)

# Define the optimizers
discriminator_optimizer = torch.optim.Adam(discriminator.parameters(), lr=0.001)
encoder_optimizer = torch.optim.Adam(encoder.parameters(), lr=0.005)

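### Core idea of VGAE  
VGAE adds variational inference on top of GAE: each node embedding is treated as a probability distribution (a Gaussian), and embeddings are sampled from it to capture uncertainty in the data.  
#### Encoder
* Two GCNs that share their first-layer weights output the mean and the log standard deviation, respectively:
$$
\mu = \mathrm{GCN}_{\mu}(X, A),\quad \log\sigma = \mathrm{GCN}_{\sigma}(X, A)
$$
* The node embedding $z_i$ is sampled from the approximate posterior:  
$$
q(z_{i}|X, A) = \mathcal{N}(z_{i}\,|\,\mu_{i}, \mathrm{diag}(\sigma^{2}_{i}))
$$
1. The reparameterization trick keeps the sampling step differentiable so that gradients can propagate: $z_{i} = \mu_{i} + \sigma_{i}\odot\epsilon$, $\epsilon \sim \mathcal{N}(0, I)$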
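
The reparameterization step is short enough to verify by hand. The sketch below is plain PyTorch with a hypothetical helper name `reparameterize`; it takes the log standard deviation as input, matching the encoder outputs above, and checks that gradients reach both the mean and the log-std.

```python
import torch

def reparameterize(mu: torch.Tensor, logstd: torch.Tensor) -> torch.Tensor:
    """Sample z = mu + sigma * eps with eps ~ N(0, I), keeping the graph differentiable."""
    eps = torch.randn_like(mu)             # the noise carries no gradient
    return mu + torch.exp(logstd) * eps    # sigma = exp(log sigma)

torch.manual_seed(0)
# 3 nodes, 2 latent dimensions; mu = 0 and sigma = 1 so that z equals eps.
mu = torch.zeros(3, 2, requires_grad=True)
logstd = torch.zeros(3, 2, requires_grad=True)

z = reparameterize(mu, logstd)
z.sum().backward()

# dz/dmu = 1 and dz/dlogstd = sigma * eps, which is just eps (= z) in this setup.
```

Sampling `z` directly from a distribution object would block the gradient at the sampling node; moving the randomness into `eps` is what lets the encoder weights be trained by backpropagation. PyG's `VGAE` exposes the same idea as `reparametrize`, which returns the mean unchanged in evaluation mode.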
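#### Decoder  
* Same as GAE: edge probabilities are computed from inner products:  
$$
p(A_{ij}=1|z_{i}, z_{j}) = \sigma(z^{T}_{i}z_{j})
$$
1. Here $\sigma$ is the sigmoid function, not the standard deviation. When reconstructing the adjacency matrix, the sampled embeddings $Z$ are used directly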
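#### Loss function
* Maximize the evidence lower bound (ELBO), which combines a reconstruction term and a KL divergence term:  
$$
L_{VGAE} = E_{q(Z|X,A)}[\log p(A|Z)]-\mathrm{KL}[q(Z|X,A)\,||\,p(Z)]
$$
1. $p(Z)=\prod_{i}\mathcal{N}(z_{i}|0,I)$ is the prior (a standard Gaussian for each node)
2. The KL term keeps the posterior close to the prior, which acts as a regularizer against overfitting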
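
For a diagonal Gaussian posterior and a standard Gaussian prior, the KL term has the closed form $-\frac{1}{2}\sum_{j}(1 + 2\log\sigma_{j} - \mu_{j}^{2} - \sigma_{j}^{2})$ per node. The sketch below computes it in PyTorch and compares it against `torch.distributions.kl_divergence`. The helper name `kl_to_standard_normal` and the random inputs are assumptions for illustration.

```python
import torch
from torch.distributions import Normal, kl_divergence

def kl_to_standard_normal(mu: torch.Tensor, logstd: torch.Tensor) -> torch.Tensor:
    """Per-node KL[N(mu, diag(sigma^2)) || N(0, I)], with sigma = exp(logstd)."""
    return -0.5 * torch.sum(1 + 2 * logstd - mu.pow(2) - torch.exp(2 * logstd), dim=1)

torch.manual_seed(0)
mu = torch.randn(5, 3)              # 5 nodes, 3 latent dimensions
logstd = 0.1 * torch.randn(5, 3)

kl = kl_to_standard_normal(mu, logstd)

# Reference value from PyTorch's distribution utilities, summed over latent dimensions.
posterior = Normal(mu, logstd.exp())
prior = Normal(torch.zeros_like(mu), torch.ones_like(mu))
reference = kl_divergence(posterior, prior).sum(dim=1)

# The KL term vanishes exactly when the posterior equals the prior.
kl_zero = kl_to_standard_normal(torch.zeros(1, 3), torch.zeros(1, 3))
```

In practice the negative ELBO is minimized, so this KL value is added to the reconstruction loss. PyG's `VGAE.kl_loss` uses the same expression but averages it over nodes.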

In [12]:
from torch_geometric.nn import VGAE

In [14]:
class VariationalGCNEncoder(torch.nn.Module):
    def __init__(self, in_channels, out_channels):
        super(VariationalGCNEncoder, self).__init__()
        self.conv1 = GCNConv(in_channels, 2*out_channels, cached=True)
        self.conv_mu = GCNConv(2*out_channels, out_channels, cached=True)
        self.conv_logstd = GCNConv(2*out_channels, out_channels, cached=True)
    
    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv_mu(x, edge_index), self.conv_logstd(x, edge_index)

In [15]:
out_channels = 2
num_features = 100
epochs = 300

model = VGAE(VariationalGCNEncoder(num_features, out_channels))
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)