# DGL 101: Use DGL to implement simple node classification on the Karate Club data

Almost every introductory programming class starts with a "Hello World" example. What MNIST is to deep learning, Zachary's Karate Club is to graph learning. The karate club is a social network of 34 members, with a link between each pair of members who interact outside the club. The club later splits into two communities, one led by the instructor and one by the club president. The network is visualized below, with color indicating the community.
<img src='./images/karat_club.png' align='center' width="400px" height="300px" />
The club is a typical node classification task that relies purely on graph structure. In this tutorial, we use a Graph Convolutional Network (GCN), a basic Graph Neural Network, to classify the nodes.

You will learn:
- How to define a graph by adding nodes and edges;
- How to set up features and labels for nodes;
- How to define a GCN model using DGL's building modules;
- How to train the GCN model; and
- How to check the results.

Note: this tutorial uses PyTorch as the backend. An MXNet version of GCN is available <a href='https://github.com/dmlc/dgl/blob/master/examples/mxnet/gcn/gcn.py'>here</a> and a TensorFlow version <a href='https://github.com/dmlc/dgl/blob/master/examples/tensorflow/gcn/gcn.py'>here</a>. More examples can be found in our <a href="https://github.com/dmlc/dgl/examples">GitHub repository</a>.
<!--
### Implementing simple node classification with DGL
-->

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import optim
import dgl
from dgl.nn.pytorch import GraphConv
import dgl.function as fn
import networkx as nx
import pandas as pd
import numpy as np

print(dgl.__version__)

0.4.3


## DGL 101: Karate Club Classification

Here we use PyTorch to implement a node classification algorithm. The code below has five steps.

In [2]:
# five steps of training 
# ----------- 1. create graph and features ------ #
# first, create the graph
g = dgl.DGLGraph()
g.add_nodes(34)

# second, add edges
edge_list = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2),
        (4, 0), (5, 0), (6, 0), (6, 4), (6, 5), (7, 0), (7, 1),
        (7, 2), (7, 3), (8, 0), (8, 2), (9, 2), (10, 0), (10, 4),
        (10, 5), (11, 0), (12, 0), (12, 3), (13, 0), (13, 1), (13, 2),
        (13, 3), (16, 5), (16, 6), (17, 0), (17, 1), (19, 0), (19, 1),
        (21, 0), (21, 1), (25, 23), (25, 24), (27, 2), (27, 23),
        (27, 24), (28, 2), (29, 23), (29, 26), (30, 1), (30, 8),
        (31, 0), (31, 24), (31, 25), (31, 28), (32, 2), (32, 8),
        (32, 14), (32, 15), (32, 18), (32, 20), (32, 22), (32, 23),
        (32, 29), (32, 30), (32, 31), (33, 8), (33, 9), (33, 13),
        (33, 14), (33, 15), (33, 18), (33, 19), (33, 20), (33, 22),
        (33, 23), (33, 26), (33, 27), (33, 28), (33, 29), (33, 30),
        (33, 31), (33, 32)]
src, dst = tuple(zip(*edge_list))
g.add_edges(src, dst)
g.add_edges(dst, src)
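As a quick sanity check that does not depend on DGL, the edge list can be inspected with plain Python. The sketch below repeats the same pairs and counts the directed edges and a couple of node degrees; the variable names `directed` and `degree` are introduced here for illustration only.

```python
from collections import Counter

edge_list = [(1, 0), (2, 0), (2, 1), (3, 0), (3, 1), (3, 2),
        (4, 0), (5, 0), (6, 0), (6, 4), (6, 5), (7, 0), (7, 1),
        (7, 2), (7, 3), (8, 0), (8, 2), (9, 2), (10, 0), (10, 4),
        (10, 5), (11, 0), (12, 0), (12, 3), (13, 0), (13, 1), (13, 2),
        (13, 3), (16, 5), (16, 6), (17, 0), (17, 1), (19, 0), (19, 1),
        (21, 0), (21, 1), (25, 23), (25, 24), (27, 2), (27, 23),
        (27, 24), (28, 2), (29, 23), (29, 26), (30, 1), (30, 8),
        (31, 0), (31, 24), (31, 25), (31, 28), (32, 2), (32, 8),
        (32, 14), (32, 15), (32, 18), (32, 20), (32, 22), (32, 23),
        (32, 29), (32, 30), (32, 31), (33, 8), (33, 9), (33, 13),
        (33, 14), (33, 15), (33, 18), (33, 19), (33, 20), (33, 22),
        (33, 23), (33, 26), (33, 27), (33, 28), (33, 29), (33, 30),
        (33, 31), (33, 32)]

# Each undirected pair becomes two directed edges,
# mirroring the two add_edges calls (src -> dst and dst -> src).
directed = edge_list + [(v, u) for u, v in edge_list]

# Out-degree of every node in the directed graph
# (equal to its degree in the undirected network).
degree = Counter(u for u, _ in directed)

print(len(edge_list), len(directed))
print(len(degree), degree[0], degree[33])
```

Every member appears in at least one pair, so the number of distinct nodes in `degree` matches the 34 nodes added to the graph.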

In [3]:
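# third, add one-hot features to the nodes (row i of the identity matrix for node i)
g.ndata['feats'] = torch.eye(34)
# only two nodes are labeled: node 0 (class 0) and node 33 (class 1)
labeled_nodes = torch.tensor([0, 33])
labeled_labels = torch.tensor([0, 1])
# fourth, create the initial inputs (the same one-hot features)
inputs = torch.eye(34)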
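With identity features, every node's input is a one-hot vector, so multiplying the inputs by a weight matrix simply picks out one row of weights per node. The dependency-free sketch below illustrates this on a toy size; the `matmul` helper and the weight values are made up for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


n = 4  # toy size; the tutorial uses 34 nodes
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical 4 x 2 weight matrix standing in for a layer's parameters.
weight = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6], [0.7, 0.8]]

# One-hot inputs times the weights return the weights unchanged:
# node i effectively owns row i as its learnable embedding.
projected = matmul(identity, weight)
print(projected == weight)
```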

In [4]:
# ----------- 2. create model -------------- #
# build a two-layer GCN
class GCN(nn.Module):
    def __init__(self, in_feats, h_feats, num_classes):
        super(GCN, self).__init__()
        # layer 1: input features -> hidden features
        self.gcn_layer1 = GraphConv(in_feats, h_feats)
        # layer 2: hidden features -> one logit per class
        self.gcn_layer2 = GraphConv(h_feats, num_classes)

    def forward(self, g, in_feat):
        h = self.gcn_layer1(g, in_feat)
        h = F.relu(h)
        h = self.gcn_layer2(g, h)
        return h  # raw logits; the softmax is applied in the loss

# create a GCN: 34 input features, 16 hidden units, 2 classes
net = GCN(34, 16, 2)
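To make the layer more concrete, here is a minimal sketch of one graph-convolution step in plain Python. It assumes the propagation rule from the GCN paper, in which each node sums its neighbors' features scaled by 1 / sqrt(d_i * d_j) and then applies a linear map, with no bias and no self-loops. Treat it as an illustration of the idea rather than a reproduction of `GraphConv` internals; the toy graph, features, and weights are made up.

```python
import math

# Toy undirected path graph 0 - 1 - 2, stored as neighbor lists (hypothetical data).
neighbors = {0: [1], 1: [0, 2], 2: [1]}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
weight = [[2.0], [3.0]]  # maps 2 input features to 1 output feature


def linear(vec, w):
    """Apply a weight matrix (list of rows) to a feature vector."""
    return [sum(v * w[k][j] for k, v in enumerate(vec))
            for j in range(len(w[0]))]


def gcn_layer(neighbors, features, weight):
    """One propagation step with symmetric normalization 1 / sqrt(d_i * d_j)."""
    out = {}
    for i, nbrs in neighbors.items():
        agg = [0.0] * len(features[i])
        for j in nbrs:
            norm = 1.0 / math.sqrt(len(neighbors[i]) * len(neighbors[j]))
            agg = [a + norm * x for a, x in zip(agg, features[j])]
        out[i] = linear(agg, weight)
    return out


hidden = gcn_layer(neighbors, features, weight)
for node, value in hidden.items():
    print(node, [round(v, 4) for v in value])
```

Stacking two such steps with a ReLU in between gives the same overall shape as the two-layer model in this tutorial.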

In [5]:
# ----------- 3. set up loss and optimizer -------------- #
# the loss is computed inside the training loop
optimizer = optim.Adam(net.parameters(), lr=0.01)

# ----------- 4. training ------------------------------ #
for e in range(30):
    # forward pass over the whole graph
    logits = net(g, inputs)

    # compute the loss on the two labeled nodes only
    logp = F.log_softmax(logits, 1)
    loss = F.nll_loss(logp[labeled_nodes], labeled_labels)

    # backpropagation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    print('In epoch {}, loss: {}'.format(e, loss.item()))

In epoch 0, loss: 0.6628820300102234
In epoch 1, loss: 0.6366796493530273
In epoch 2, loss: 0.6087723970413208
In epoch 3, loss: 0.5795037746429443
In epoch 4, loss: 0.5500038862228394
In epoch 5, loss: 0.5205397605895996
In epoch 6, loss: 0.49064093828201294
In epoch 7, loss: 0.4591338634490967
In epoch 8, loss: 0.4272027313709259
In epoch 9, loss: 0.3962421715259552
In epoch 10, loss: 0.3670524060726166
In epoch 11, loss: 0.3390410840511322
In epoch 12, loss: 0.3119989335536957
In epoch 13, loss: 0.28574976325035095
In epoch 14, loss: 0.26048415899276733
In epoch 15, loss: 0.23621729016304016
In epoch 16, loss: 0.21323105692863464
In epoch 17, loss: 0.19137248396873474
In epoch 18, loss: 0.17083191871643066
In epoch 19, loss: 0.15199896693229675
In epoch 20, loss: 0.1346510946750641
In epoch 21, loss: 0.11884148418903351
In epoch 22, loss: 0.10438705235719681
In epoch 23, loss: 0.09142228215932846
In epoch 24, loss: 0.07982584834098816
In epoch 25, loss: 0.0695841982960701
In epoch 2
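The loss in the loop is the mean negative log-likelihood over the two labeled nodes only. The dependency-free sketch below spells out that computation; the logits are made-up values, and `log_softmax` and `nll_loss` are re-implemented here for illustration (they are not the PyTorch functions).

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def nll_loss(log_probs, labels):
    """Mean negative log-likelihood of the true labels."""
    return -sum(lp[y] for lp, y in zip(log_probs, labels)) / len(labels)


# Hypothetical logits for the two labeled nodes (node 0 and node 33).
logits = [[0.0, 0.0], [0.0, 0.0]]
labels = [0, 1]

loss = nll_loss([log_softmax(r) for r in logits], labels)
print(round(loss, 4))  # uninformative logits give ln(2), about 0.6931
```

An untrained two-class model therefore starts near ln 2, which is in line with the first-epoch loss in the log above.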

In [6]:
# ----------- 5. check results ------------------------ #
# predicted class of each node = index of its largest logit
pred = torch.argmax(logits, dim=1)
print(pred.numpy())

[0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 0 1 0 1 1 1 1 1 1 0 1 1 1 1 1]
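One simple way to read the prediction vector is to group node IDs by predicted class. The sketch below copies the printed predictions into a plain list (so it runs without PyTorch) and builds a hypothetical `communities` dictionary from it.

```python
# Predictions copied from the output above, one entry per node 0..33.
pred = [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0,
        0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1]

# Map each predicted class to the list of node IDs assigned to it.
communities = {}
for node, label in enumerate(pred):
    communities.setdefault(label, []).append(node)

for label in sorted(communities):
    print(label, communities[label])
```

The two labeled nodes, 0 and 33, should land in different groups, which is a minimal check that training fit the labels.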


### Visualize the results (not part of the training above)
We store every node's classification result in each epoch and then render them as a GIF, so we can see how the results change during training. Figuring out how to do this is a take-home exercise.
<img src='./images/classification.gif' align='center' width="600px" height="300px" />

### Save the graph to a text file and read it back

In [7]:
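# save the edge list with the help of networkx
nx_g = g.to_networkx()
nx.write_edgelist(nx_g, 'karat_club.txt', delimiter=',', data=False)

# read the edge list back with pandas
# caution: the file has no header row, so read_csv uses its first line
# (the edge 0,1) as the column names '0' and '1' and drops it from the data;
# pass header=None and explicit names to keep every edge
edgelist = pd.read_csv('karat_club.txt')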

In [8]:
edgelist.tail(2)

      0   1
153  33  31
154  33  32


In [9]:
# build (src, dst) tuples from the two columns and pass them to a DGL graph
src = edgelist['0'].values
dst = edgelist['1'].values

edges = list(zip(src, dst))
print(edges)

graph = dgl.DGLGraph(edges)

[(0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 10), (0, 11), (0, 12), (0, 13), (0, 17), (0, 19), (0, 21), (0, 31), (1, 0), (1, 2), (1, 3), (1, 7), (1, 13), (1, 17), (1, 19), (1, 21), (1, 30), (2, 0), (2, 1), (2, 3), (2, 7), (2, 8), (2, 9), (2, 13), (2, 27), (2, 28), (2, 32), (3, 0), (3, 1), (3, 2), (3, 7), (3, 12), (3, 13), (4, 0), (4, 6), (4, 10), (5, 0), (5, 6), (5, 10), (5, 16), (6, 0), (6, 4), (6, 5), (6, 16), (7, 0), (7, 1), (7, 2), (7, 3), (8, 0), (8, 2), (8, 30), (8, 32), (8, 33), (9, 2), (9, 33), (10, 0), (10, 4), (10, 5), (11, 0), (12, 0), (12, 3), (13, 0), (13, 1), (13, 2), (13, 3), (13, 33), (14, 32), (14, 33), (15, 32), (15, 33), (16, 5), (16, 6), (17, 0), (17, 1), (18, 32), (18, 33), (19, 0), (19, 1), (19, 33), (20, 32), (20, 33), (21, 0), (21, 1), (22, 32), (22, 33), (23, 25), (23, 27), (23, 29), (23, 32), (23, 33), (24, 25), (24, 27), (24, 31), (25, 23), (25, 24), (25, 31), (26, 29), (26, 33), (27, 2), (27, 23), (27, 24), (27, 33), (28, 2), (28, 31), (28, 3
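When a plain edge list is read back, every line should be treated as data. The following stdlib-only sketch shows a save-and-reload round trip in which nothing is interpreted as a header; the small edge list and the temporary file are illustrative assumptions, not the tutorial's data.

```python
import csv
import os
import tempfile

edges = [(0, 1), (0, 2), (1, 0), (2, 0)]  # small made-up edge list

path = os.path.join(tempfile.mkdtemp(), "edges.txt")

# Write one "src,dst" pair per line, with no header row.
with open(path, "w", newline="") as f:
    csv.writer(f).writerows(edges)

# Read every line back as data and convert the fields to integers.
with open(path, newline="") as f:
    loaded = [(int(s), int(d)) for s, d in csv.reader(f)]

print(loaded == edges)
```

Comparing the reloaded list with the original is a cheap way to confirm that no edge was lost on the way through the file.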

### Take-home exercise

Print out each club member’s features during training.

## Basic operations on a DGL graph

1. Generate graphs in different ways and save them
2. Explore graph information and structure
3. Assign features to nodes/edges
4. Message-passing and reduce (aggregation) functions

The data is based on a sample graph from a real paper. It is a bipartite graph with two node types, User and Item, and one edge type, comment_on. For DGL, we need one more edge type for the reverse direction, commented_by.
The sample data is:
<img src='./images/XY-Test-Data.png' width=40%>
- comment_on: [(0,0),(1,0),(2,0),(1,1),(1,2),(3,1),(4,1),(4,2)]
- commented_by: [(0,0),(0,1),(0,2),(1,1),(2,1),(1,3),(1,4),(2,4)]
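The reverse relation does not have to be typed by hand: swapping each (user, item) pair gives the (item, user) pair of the same edge. A minimal sketch in plain Python:

```python
# (user, item) pairs of the comment_on relation, as listed above.
comment_on = [(0, 0), (1, 0), (2, 0), (1, 1), (1, 2), (3, 1), (4, 1), (4, 2)]

# Swap each pair to obtain the (item, user) pairs of commented_by.
commented_by = [(item, user) for user, item in comment_on]

print(commented_by)
```

Because the order of the pairs is preserved, the k-th commented_by edge is the reverse of the k-th comment_on edge.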

### 1. Generate a heterogeneous DGL graph

In [10]:
# Use two bipartite graphs to build this heterograph
co_g = dgl.bipartite([(0,0),(1,0),(2,0),(1,1),(1,2),(3,1),(4,1),(4,2)], 'user', 'comment_on', 'item')
cb_g = dgl.bipartite([(0,0),(0,1),(0,2),(1,1),(2,1),(1,3),(1,4),(2,4)], 'item', 'commented_by', 'user')

graph = dgl.hetero_from_relations([co_g, cb_g])

### 2. Before going further, let's check the basic data structures of a DGL graph

In [11]:
# These two attributes are lists of node type names and edge type names
print(graph.ntypes)
print(graph.etypes)

['item', 'user']
['comment_on', 'commented_by']


In [12]:
# This is a list of (source node type, edge type, destination node type) tuples
print(graph.canonical_etypes)

[('user', 'comment_on', 'item'), ('item', 'commented_by', 'user')]


In [13]:
# Now check the integer ID assigned to each node type
print(graph.get_ntype_id('user'))
print(graph.get_ntype_id('item'))

1
0


In [14]:
# Get the number of nodes and edges of each type
print(graph.number_of_nodes('user'))
print(graph.number_of_nodes('item'))
print(graph.number_of_edges('comment_on'))
print(graph.number_of_edges('commented_by'))

5
3
8
8


In [15]:
# Get the source nodes, destination nodes, and edge IDs of one edge type
graph.all_edges(form='all', order='srcdst', etype='comment_on')

(tensor([0, 1, 1, 1, 2, 3, 4, 4]),
 tensor([0, 0, 1, 2, 0, 1, 1, 2]),
 tensor([0, 1, 3, 4, 2, 5, 6, 7]))

In [16]:
graph.all_edges(form='all', order='srcdst', etype='commented_by')

(tensor([0, 0, 0, 1, 1, 1, 2, 2]),
 tensor([0, 1, 2, 1, 3, 4, 1, 4]),
 tensor([0, 1, 2, 3, 5, 6, 4, 7]))
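The third tensor in each result holds edge IDs, which explains why it is not simply 0 to 7. The sketch below assumes that edge IDs follow insertion order and that `order='srcdst'` sorts by source and then destination; both assumptions are consistent with the output above, and the code reproduces it for comment_on in plain Python.

```python
# comment_on pairs in the order they were passed to the graph constructor.
comment_on = [(0, 0), (1, 0), (2, 0), (1, 1), (1, 2), (3, 1), (4, 1), (4, 2)]

# Assumption: the k-th pair added gets edge ID k.
indexed = [(u, v, eid) for eid, (u, v) in enumerate(comment_on)]

# Sort by (source, destination) while carrying the original IDs along.
indexed.sort(key=lambda t: (t[0], t[1]))

src, dst, eid = (list(col) for col in zip(*indexed))
print(src)
print(dst)
print(eid)
```

The edge-ID list is what allows per-edge features, which are stored in ID order, to be matched with the sorted endpoints.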

In [17]:
# get a slice of the graph
graph['comment_on']

Graph(num_nodes={'user': 5, 'item': 3},
      num_edges={('user', 'comment_on', 'item'): 8},
      metagraph=[('user', 'item')])

In [18]:
graph['commented_by']

Graph(num_nodes={'item': 3, 'user': 5},
      num_edges={('item', 'commented_by', 'user'): 8},
      metagraph=[('item', 'user')])

### 3. Set features of nodes and edges

Note that `nodes` and `edges` are views implemented in C++ for speed, so printing them does not show their contents. However, each node or edge type has a `data` attribute, a dict-like container that holds its features.

In [19]:
# Nodes and edges are views
print(graph.nodes)
print(graph.edges)

<dgl.view.HeteroNodeView object at 0x134a55f98>
<dgl.view.HeteroEdgeView object at 0x134d44438>


In [20]:
# Before any feature is assigned, the 'data' container of each node type is empty.
print(graph.nodes['user'].data)
print(graph.nodes['item'].data)

{}
{}


In [21]:
print(graph.edges['comment_on'].data)
print(graph.edges['commented_by'].data)

{}
{}


#### We have 5 user nodes, 3 item nodes, and 8 edges in each direction
<p>
<font color='red'>Note: when setting features, the first dimension must equal the number of nodes/edges of that type. Otherwise, you will get an error such as "DGLError: Expect number of features to match number of nodes (len(u)). Got 2 and 5 instead."</font>

In [22]:
graph.nodes['user'].data['nu_f_1'] = torch.ones(5,2)
graph.edges['comment_on'].data['eco_f_1'] = torch.ones(8,2)

In [23]:
graph.nodes['user'].data['nu_f_1']

tensor([[1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.]])

In [24]:
graph.edges['comment_on'].data['eco_f_1']

tensor([[1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.]])

### 4. Message-passing and reduction

#### Here we use hand-designed features to demonstrate DGL's built-in functions

- User nodes: 2-d features whose values equal the node index, e.g. node 0 is [0,0], node 1 is [1,1], and so on.
- Item nodes: 2-d features whose values equal the negated node index, e.g. node 0 is [0,0], node 1 is [-1,-1], and so on.
- Edges: 2-d features for both comment_on and commented_by, with values equal to 0.1 times the edge index, e.g. edge 0 is [0,0], edge 1 is [0.1,0.1], and so on.

In [25]:
# delete the previous dummy data; this cell can only be run once because the keys are gone afterwards
graph.nodes['user'].data.pop('nu_f_1')
graph.edges['comment_on'].data.pop('eco_f_1')

tensor([[1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.]])

In [26]:
user_feats = np.ones([5,2]) * np.arange(5).reshape(5,1)
graph.nodes['user'].data['u'] = torch.from_numpy(user_feats).float()
graph.nodes['user'].data

{'u': tensor([[0., 0.],
        [1., 1.],
        [2., 2.],
        [3., 3.],
        [4., 4.]])}

In [27]:
item_feats = np.ones([3,2]) * np.arange(3).reshape(3,1) * -1
graph.nodes['item'].data['i'] = torch.from_numpy(item_feats).float()
graph.nodes['item'].data

{'i': tensor([[-0., -0.],
        [-1., -1.],
        [-2., -2.]])}

In [28]:
edge_feats = np.ones([8,2]) * np.arange(8).reshape(8,1) * 0.1
graph.edges['comment_on'].data['e'] = torch.from_numpy(edge_feats).float()
graph.edges['commented_by'].data['e'] = torch.from_numpy(edge_feats).float()
graph.edges['comment_on'].data

{'e': tensor([[0.0000, 0.0000],
        [0.1000, 0.1000],
        [0.2000, 0.2000],
        [0.3000, 0.3000],
        [0.4000, 0.4000],
        [0.5000, 0.5000],
        [0.6000, 0.6000],
        [0.7000, 0.7000]])}

### 4.1 Pass features from users to items and aggregate them by their mean

<img src='./images/XY-Test-Data.png' width=30%>

In our example, this means:
- average the U0, U1, and U2 features, and add the result to the I0 feature;
- average the U1, U3, and U4 features, and add the result to the I1 feature;
- average the U1 and U4 features, and add the result to the I2 feature.

DGL has a built-in function, `update_all`, that does message passing and aggregation in one step.
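The averaging itself can be traced by hand before handing it to DGL. The sketch below uses only plain Python: each edge delivers a copy of its source user's feature to the destination item, and each item then takes the element-wise mean of what it received. The `mailbox` dictionary is a name chosen for this illustration.

```python
# (user, item) pairs of the comment_on relation.
comment_on = [(0, 0), (1, 0), (2, 0), (1, 1), (1, 2), (3, 1), (4, 1), (4, 2)]

# User u has the 2-d feature [u, u], as designed above.
user_feats = [[float(u), float(u)] for u in range(5)]

# Message step: collect a copy of the source user's feature at each item.
mailbox = {}
for user, item in comment_on:
    mailbox.setdefault(item, []).append(user_feats[user])

# Reduce step: element-wise mean of the messages received by each item.
u_avg = {
    item: [sum(col) / len(msgs) for col in zip(*msgs)]
    for item, msgs in sorted(mailbox.items())
}

for item, value in u_avg.items():
    print(item, [round(v, 4) for v in value])
```

The three means are 1, 8/3, and 2.5, one per item.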

In [32]:
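# Step 1: copy users' features to items as messages, average them, and store the result in the item field "u_avg"
# Note: the recorded output comes from a session in which Step 2 had already been run once
# (the cell counter jumps from 28 to 32), so 'i' already includes one 'u_avg' at this point.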
graph.update_all(fn.copy_u('u', 'm'), fn.mean('m', 'u_avg'), etype='comment_on')
graph.nodes['item'].data

{'i': tensor([[1.0000, 1.0000],
        [1.6667, 1.6667],
        [0.5000, 0.5000]]), 'u_avg': tensor([[1.0000, 1.0000],
        [2.6667, 2.6667],
        [2.5000, 2.5000]])}

In [33]:
# Step 2: add the aggregate features to each item node
graph.nodes['item'].data['i'] = graph.nodes['item'].data['i'] + graph.nodes['item'].data['u_avg']
graph.nodes['item'].data

{'i': tensor([[2.0000, 2.0000],
        [4.3333, 4.3333],
        [3.0000, 3.0000]]), 'u_avg': tensor([[1.0000, 1.0000],
        [2.6667, 2.6667],
        [2.5000, 2.5000]])}

### 4.2 What about including edge features?
Above we passed only the users' features to the item nodes. What if we also want to combine the edge features with the users' features and pass the result to the item nodes?

This task requires three steps:
1. Pass the users' features to the "comment_on" edges and add them to the edges' own features;
2. Pass the combined edge features to the item nodes and average them;
3. Add the averaged features to each item's own features.
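The first two steps can be checked by hand in plain Python. Because both columns of every feature are identical in this example, the sketch tracks a single scalar per node and per edge; that simplification and the variable names are assumptions made for brevity. Step 3 is then an ordinary addition to the item features.

```python
# (user, item) pairs of comment_on; the position in the list is the edge ID.
comment_on = [(0, 0), (1, 0), (2, 0), (1, 1), (1, 2), (3, 1), (4, 1), (4, 2)]

user_feats = [float(u) for u in range(5)]  # user u carries the value u
edge_feats = [0.1 * k for k in range(8)]   # edge k carries the value 0.1 * k

# Step 1: on every edge, add the source user's feature to the edge feature.
u_add_e = [user_feats[u] + edge_feats[k] for k, (u, _) in enumerate(comment_on)]

# Step 2: each item averages the combined features of its incoming edges.
incoming = {}
for k, (_, item) in enumerate(comment_on):
    incoming.setdefault(item, []).append(u_add_e[k])
u_avg = {item: sum(vals) / len(vals) for item, vals in sorted(incoming.items())}

print([round(v, 4) for v in u_add_e])
print({item: round(v, 4) for item, v in u_avg.items()})
```

Up to floating-point rounding, the per-item averages come out as 1.1, 3.1333, and 3.05.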

In [34]:
# Step 1: add the source users' features to the "comment_on" edge features
graph.apply_edges(fn.u_add_e('u', 'e', 'u_add_e'), etype='comment_on')
graph.edges['comment_on'].data

{'e': tensor([[0.0000, 0.0000],
        [0.1000, 0.1000],
        [0.2000, 0.2000],
        [0.3000, 0.3000],
        [0.4000, 0.4000],
        [0.5000, 0.5000],
        [0.6000, 0.6000],
        [0.7000, 0.7000]]), 'u_add_e': tensor([[0.0000, 0.0000],
        [1.1000, 1.1000],
        [2.2000, 2.2000],
        [1.3000, 1.3000],
        [1.4000, 1.4000],
        [3.5000, 3.5000],
        [4.6000, 4.6000],
        [4.7000, 4.7000]])}

In [35]:
# Step 2: pass "comment_on" edge features to Item nodes and average them
graph.update_all(fn.copy_e('u_add_e', 'm'), fn.mean('m', 'u_avg'), etype='comment_on')
graph.nodes['item'].data

{'i': tensor([[2.0000, 2.0000],
        [4.3333, 4.3333],
        [3.0000, 3.0000]]), 'u_avg': tensor([[1.1000, 1.1000],
        [3.1333, 3.1333],
        [3.0500, 3.0500]])}

In [36]:
# Step 3: add the aggregate features to each item node
graph.nodes['item'].data['i'] = graph.nodes['item'].data['i'] + graph.nodes['item'].data['u_avg']
graph.nodes['item'].data

{'i': tensor([[3.1000, 3.1000],
        [7.4667, 7.4667],
        [6.0500, 6.0500]]), 'u_avg': tensor([[1.1000, 1.1000],
        [3.1333, 3.1333],
        [3.0500, 3.0500]])}

### Take-home exercise: Reverse message passing

Please do the reverse message passing, that is, pass the item nodes' features to the user nodes in the same ways demonstrated above:
1. Pass only the items' features to the user nodes;
2. Add the edges' features to the items' features, and then pass the result to the user nodes.