# KGCN Paper Review & Code Implementation
**KGCN: Knowledge Graph Convolutional Networks for Recommender Systems**  
*Hongwei Wang et al. (2019)*  
🔗 [Paper link](https://arxiv.org/abs/1904.12575)

## 3.1 Problem Formulation

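*Goal: learn a function that predicts whether user u will be interested in item v*
$$\hat{y}_{uv} = F(u, v \mid \Theta, Y, G)$$
- Θ : parameters of the model F
- Y : user-item interaction matrix (e.g., clicks, ratings)
- G : knowledge graph (KG), a set of (head, relation, tail) triples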
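
As a minimal sketch of these two inputs, the toy example below holds Y as a dense 0/1 matrix and G as a list of triples. All IDs are hypothetical and unrelated to the dataset loaded later in this notebook, which stores the same information as (user, item, label) rows.

```python
# Toy example with hypothetical IDs: 2 users x 3 items.
# Y[u][v] = 1 means user u interacted with item v.
Y = [
    [1, 0, 1],
    [0, 1, 0],
]

# Knowledge graph G as (head, relation, tail) triples over entity IDs.
# Items 0..2 double as entities; entities 3 and 4 are non-item entities.
G = [
    (0, 0, 3),  # item 0 --relation 0--> entity 3
    (1, 0, 3),  # item 1 --relation 0--> entity 3
    (2, 1, 4),  # item 2 --relation 1--> entity 4
]

def positive_pairs(Y):
    """Return the (user, item) pairs with y_uv = 1."""
    return [(u, v) for u, row in enumerate(Y) for v, y in enumerate(row) if y == 1]

pairs = positive_pairs(Y)
print(pairs)  # [(0, 0), (0, 2), (1, 1)]
```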

## 3.2 KGCN Layer (Model Components)

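*1. Relation importance score (user u, relation r)*
$$\pi_{ur} = g(u,r)$$

*2. Neighbor aggregation (with attention weights)*
$$v_u^{N(v)} = \sum_{e\in N(v)}\tilde{\pi}_{ur_{v,e}}\,e$$
$$\tilde{\pi}_{ur_{v,e}}=\frac{\exp(\pi_{ur_{v,e}})}{\sum_{e'\in N(v)}\exp(\pi_{ur_{v,e'}})}$$

Here $r_{v,e}$ is the relation linking $v$ to its neighbor $e$, and $N(v)$ is the set of entities directly connected to $v$.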
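
The two steps above can be sketched for a single user and a single item as follows. This is a minimal illustration that assumes $g$ is an inner product (one possible choice) and uses random toy embeddings; the sizes `d` and `K` are arbitrary.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
d, K = 4, 3                      # toy embedding size and number of neighbors

u = torch.randn(d)               # user embedding
r = torch.randn(K, d)            # embeddings of the relations linking v to each neighbor
e = torch.randn(K, d)            # embeddings of the K neighbors in N(v)

pi = r @ u                       # pi_ur = g(u, r) with g = inner product, shape (K,)
pi_tilde = F.softmax(pi, dim=0)  # normalize the scores over the neighbors of v
v_neigh = pi_tilde @ e           # attention-weighted sum of neighbors, shape (d,)

print(pi_tilde.sum().item())     # the weights sum to 1 (up to float error)
print(tuple(v_neigh.shape))
```

Because the scores depend on `u`, the same item neighborhood is weighted differently for different users, which is what makes the aggregation personalized.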

*3. Three aggregation methods*
- Sum : $ReLU(W(v+v_u^{N(v)})+b)$
- Concat : $ReLU(W[v;v_u^{N(v)}]+b)$
- Neighbor : $ReLU(Wv_u^{N(v)}+b)$
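
The three variants differ only in how the entity embedding and its neighborhood embedding are combined before the linear map. A minimal PyTorch sketch, where the class name `Aggregator` and the `mode` argument are illustrative choices and not taken from the paper or this notebook:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Aggregator(nn.Module):
    """Sum / concat / neighbor aggregators (illustrative sketch)."""

    def __init__(self, dim, mode="sum"):
        super().__init__()
        if mode not in ("sum", "concat", "neighbor"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        in_dim = 2 * dim if mode == "concat" else dim
        self.linear = nn.Linear(in_dim, dim)  # holds W and b

    def forward(self, v, v_neigh):
        if self.mode == "sum":
            x = v + v_neigh
        elif self.mode == "concat":
            x = torch.cat([v, v_neigh], dim=-1)
        else:  # "neighbor": drop the entity's own embedding
            x = v_neigh
        return F.relu(self.linear(x))

v = torch.randn(5, 8)        # batch of 5 entity embeddings, dim 8
v_neigh = torch.randn(5, 8)  # their aggregated neighborhood embeddings
shapes = {}
for mode in ("sum", "concat", "neighbor"):
    out = Aggregator(8, mode)(v, v_neigh)
    shapes[mode] = tuple(out.shape)
print(shapes)
```

Only the concat variant changes the input width of `W` (to `2 * dim`); all three return a vector of the original dimension.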

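## 3.3 Learning Algorithm

*Iterative structure*
KGCN stacks several layers (hops) and repeatedly propagates and aggregates neighbor information in the order 0-hop → 1-hop → ..., up to depth H. The prediction uses the item representation obtained after H hops:
$$\hat{y}_{uv}=f(u, v_u^{(H)})$$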
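
To make the hop structure concrete, the sketch below collects the entities needed at each hop around one item, sampling a fixed number `k` of neighbors per entity so that hop h contains k^h entities. The function name, the toy graph, and the self-loop fallback are assumptions for illustration.

```python
import random

def receptive_field(v, kg, n_hops, k, rng):
    """Return [hop-0 entities, hop-1 entities, ...] around item v.

    kg maps an entity to a list of neighbor entities. Every entity contributes
    exactly k sampled neighbors (with replacement if it has fewer than k).
    """
    layers = [[v]]
    for _ in range(n_hops):
        nxt = []
        for e in layers[-1]:
            neigh = kg.get(e) or [e]          # no neighbors: fall back to a self-loop
            if len(neigh) >= k:
                nxt.extend(rng.sample(neigh, k))
            else:
                nxt.extend(rng.choices(neigh, k=k))
        layers.append(nxt)
    return layers

kg = {0: [1, 2, 3], 1: [4], 2: [4, 5], 3: [0]}   # toy, hypothetical graph
layers = receptive_field(0, kg, n_hops=2, k=2, rng=random.Random(0))
print([len(layer) for layer in layers])  # [1, 2, 4]
```

Aggregation then runs in the opposite direction: the outermost hop is folded into the hop before it, and so on, until only the representation of the item itself remains.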

*Training loss*
- Cross entropy with negative sampling and L2 regularization
$$L = \sum_{u}\left[\sum_{v:y_{uv}=1}J(y_{uv}, \hat{y}_{uv})-\sum_{i=1}^{T_u}\mathbb{E}_{v_i\sim P(v_i)}J(0,\hat{y}_{uv_i})\right]+\lambda\|F\|_2^2$$
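
The negative-sampling term can be illustrated with a small sampler that draws, for each user, as many unobserved items as the user has positives (a 1:1 ratio) from a uniform distribution. The function name and the toy interactions are hypothetical; the notebook below instead loads interactions that already carry 0/1 labels.

```python
import random

def sample_negatives(positives, n_items, rng):
    """For each user, pair every positive with one uniformly drawn unseen item."""
    rows = []
    for u, pos_items in positives.items():
        seen = set(pos_items)
        candidates = [v for v in range(n_items) if v not in seen]
        n_neg = min(len(seen), len(candidates))   # cannot draw more than exist
        negs = rng.sample(candidates, n_neg)
        rows += [(u, v, 1) for v in sorted(seen)]
        rows += [(u, v, 0) for v in negs]
    return rows

positives = {0: [1, 4], 1: [2]}   # toy, hypothetical interactions
rows = sample_negatives(positives, n_items=6, rng=random.Random(0))
print(len(rows), sum(label for _, _, label in rows))  # 6 3
```

Each resulting `(user, item, label)` row can then be scored by the model and passed to a binary cross-entropy loss.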

In [11]:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from tqdm import tqdm
import random
import pandas as pd
import pickle

In [12]:
device = (torch.device("mps") if torch.backends.mps.is_available() else torch.device("cpu"))
print("Using device:", device)

Using device: mps


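## Book-Crossing Model Settings (Hyperparameters)

| Item | Setting |
|:---:|:---:|
| Embedding dimension | 64 |
| Learning rate | 0.0002 |
| Optimizer | Adam |
| L2 regularization coefficient | 2e-5 |
| Negative sampling | 1:1 ratio |
| Batch size | 256 |
| Training epochs | up to 1000 (typically 200 to 400) |
| Number of layers (receptive field depth, H) | 1 |
| Neighbor sample size (K) | 8 |
| Initialization | Xavier Uniform (not stated explicitly; assumed here as a common default) |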
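
Since the initialization scheme is an assumption, it has to be applied explicitly: `nn.Embedding` draws its weights from a standard normal distribution by default. A minimal sketch, where the helper name `init_embeddings` is hypothetical:

```python
import torch
import torch.nn as nn

def init_embeddings(module):
    """Apply Xavier-uniform init to every nn.Embedding inside `module`."""
    for m in module.modules():
        if isinstance(m, nn.Embedding):
            nn.init.xavier_uniform_(m.weight)

emb = nn.Embedding(100, 64)
init_embeddings(emb)

# Xavier uniform samples from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)).
bound = (6.0 / (100 + 64)) ** 0.5
print(emb.weight.abs().max().item(), bound)
```

Calling the same helper on a full model would re-initialize its user, entity, and relation embedding tables in one pass.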

In [13]:
config = {
    'device': str(device),  # reuse the device detected above instead of hard-coding 'mps'
    'dataset': 'book',
    'embedding_dim': 64,
    'n_layers': 1,
    'lr': 0.0002,
    'batch_size': 256,
    'l2': 2e-5,
    'n_epoch': 200,
    'n_neighbor': 8,
    'H': 1,
    'aggregator': 'sum'
}

In [14]:
interactions = pd.read_csv("data/processed/interactions.csv")

# Sample positives and negatives separately to get a small, balanced subset
pos_sample = interactions[interactions['label'] == 1].sample(5000, random_state=42)
neg_sample = interactions[interactions['label'] == 0].sample(5000, random_state=42)

# Concatenate into a single DataFrame
small_data = pd.concat([pos_sample, neg_sample]).reset_index(drop=True)

print(small_data['label'].value_counts())  # 5000 / 5000

with open("data/processed/kg_triples.pkl", "rb") as f:
    kg_triples = pickle.load(f)

with open("data/processed/user2id.pkl", "rb") as f:
    user2id = pickle.load(f)
with open("data/processed/item2id.pkl", "rb") as f:
    item2id = pickle.load(f)
with open("data/processed/entity2id.pkl", "rb") as f:
    entity2id = pickle.load(f)
with open("data/processed/relation2id.pkl", "rb") as f:
    relation2id = pickle.load(f)

print(f"Interactions: {interactions.shape}")
print(f"KG triples: {len(kg_triples)}")

label
1    5000
0    5000
Name: count, dtype: int64
Interactions: (20000263, 3)
KG triples: 80141


In [15]:
from collections import defaultdict

def build_kg_dict(kg_triples):
    kg_dict = defaultdict(list)
    for h, r, t in kg_triples:
        kg_dict[h].append((r, t))
    return kg_dict

kg_dict = build_kg_dict(kg_triples)
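
`build_kg_dict` stores only head → tail edges, so an entity that appears only as a tail ends up with no neighbors. One common variant, shown here as an assumption and not as what the cell above does, also stores each edge in the reverse direction. The function name and toy triples are hypothetical.

```python
from collections import defaultdict

def build_undirected_kg_dict(triples):
    """Like build_kg_dict, but also stores each edge in the tail -> head direction."""
    kg = defaultdict(list)
    for h, r, t in triples:
        kg[h].append((r, t))
        kg[t].append((r, h))
    return kg

toy_triples = [(0, 0, 3), (1, 0, 3), (2, 1, 4)]   # hypothetical triples
kg = build_undirected_kg_dict(toy_triples)
print(kg[3])  # [(0, 0), (0, 1)]
```

With the directed version, entity 3 would have an empty neighbor list and fall back to the self-loop handling in `sample_neighbors`.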

In [16]:
class KGCN(nn.Module):
    def __init__(self, num_users, num_entities, num_relations, embed_dim, kg_dict, n_neighbors=8, n_hops=1):
        super(KGCN, self).__init__()
        self.user_embed = nn.Embedding(num_users, embed_dim)
        self.entity_embed = nn.Embedding(num_entities, embed_dim)
        self.relation_embed = nn.Embedding(num_relations, embed_dim)

        self.kg = kg_dict
        self.n_neighbors = n_neighbors
        self.n_hops = n_hops

    def sample_neighbors(self, entities):
        """Return (neighbor ids, relation ids), each of shape (batch, n_neighbors)."""
        batch_neighbors = []
        batch_relations = []
        for e in entities.tolist():
            neighbors = self.kg.get(e, [])

            # Entities without outgoing edges would leave nothing to aggregate,
            # so add a self-loop (relation id 0 is reused as a placeholder).
            if len(neighbors) == 0:
                neighbors = [(0, e)]

            # Uniformly sample n_neighbors when there are enough; otherwise
            # pad by repeating the last neighbor.
            if len(neighbors) >= self.n_neighbors:
                sampled = random.sample(neighbors, self.n_neighbors)
            else:
                sampled = list(neighbors)
                sampled += [sampled[-1]] * (self.n_neighbors - len(sampled))

            batch_neighbors.append([t for r, t in sampled])
            batch_relations.append([r for r, t in sampled])

        return torch.LongTensor(batch_neighbors), torch.LongTensor(batch_relations)


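    def aggregate(self, user_emb, entities):
        # Simplified sum aggregator: ReLU(v + v_neigh), without the learnable W and b.
        e_embed = self.entity_embed(entities)                             # (B, d)
        for _ in range(self.n_hops):
            # Neighbors are always drawn around the input items, so each extra hop
            # re-aggregates a 1-hop neighborhood rather than widening the receptive field.
            n_entities, n_relations = self.sample_neighbors(entities)
            n_emb = self.entity_embed(n_entities.to(entities.device))     # (B, K, d)
            r_emb = self.relation_embed(n_relations.to(entities.device))  # (B, K, d)
            # pi_ur = <u, r>: user-specific relation score, shape (B, K, 1)
            scores = torch.sum(user_emb.unsqueeze(1) * r_emb, dim=-1).unsqueeze(-1)
            attn = F.softmax(scores, dim=1)        # normalize over the K neighbors
            agg = torch.sum(attn * n_emb, dim=1)   # weighted neighbor sum, (B, d)
            e_embed = F.relu(e_embed + agg)
        return e_embed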

    def forward(self, users, items):
        u = self.user_embed(users)
        v = self.aggregate(u, items)
        scores = torch.sum(u * v, dim=-1)
        return torch.sigmoid(scores)

In [17]:

class KGDataset(Dataset):
    def __init__(self, df):
        self.users = df['user'].values
        self.items = df['item'].values
        self.labels = df['label'].values

    def __len__(self):
        return len(self.users)

    def __getitem__(self, idx):
        return self.users[idx], self.items[idx], self.labels[idx]

train_data = KGDataset(small_data)
train_loader = DataLoader(train_data, batch_size=256, shuffle=True)

In [18]:
model = KGCN(
    num_users=len(user2id),
    num_entities=max(max(item2id.values()), max(entity2id.values())) + 1,
    num_relations=len(relation2id),
    embed_dim=64,
    kg_dict=kg_dict
).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=0.0002, weight_decay=2e-5)
loss_fn = nn.BCELoss()

for epoch in range(2):  # quick sanity check with only 2 epochs
    model.train()
    total_loss = 0
    for users, items, labels in tqdm(train_loader, desc=f"Epoch {epoch+1}"):
        users = users.to(device)
        items = items.to(device)
        labels = labels.float().to(device)

        preds = model(users, items)
        loss = loss_fn(preds, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_loss += loss.item()  # running sum over batches, not a mean

    print(f"Epoch {epoch+1} | Loss: {total_loss:.4f}")

Epoch 1 | Loss: 139.4906
Epoch 2 | Loss: 137.8773