# Sparse vs Dense Graph Representations



This notebook implements the GraphSAGE model with both dense and sparse data representations for node classification on the `ogbn-products` dataset from the Open Graph Benchmark (OGB). (Spoiler: the dense representation will overwhelm the GPU unless you have at least 50 GB of usable GPU memory. The sparse representation takes about half as much, roughly 24 GB in the run recorded below.)

Two GraphSAGE classes are defined, one for sparse and one for dense data. The code uses pynvml to monitor GPU memory usage during training and testing.

You choose between the sparse and dense representation by toggling the `use_sparse` variable. Matching versions of the train and test functions are defined and selected for the chosen representation. These functions report the loss, the number of correct predictions, and GPU memory usage.

Once trained, the model is tested, and the predictions, true labels, and memory usage are printed. Finally, pynvml is shut down to free its resources.

## Overview of Graph Representations in PyTorch Geometric
### Sparse Graphs
In this notebook, the sparse representation is an adjacency matrix stored in a sparse format. An adjacency matrix is a square matrix whose entries indicate whether pairs of vertices are adjacent in the graph. For large graphs, the adjacency matrix is usually highly sparse, meaning that most of its entries are zero, so in practice only the non-zero entries are stored to save memory and computation. This is particularly important for large-scale graphs such as social networks and citation networks, where the number of nodes can run into the millions.

### Dense Graphs
The representation this notebook calls dense passes the graph as an edge list (`edge_index`): pairs of nodes that have a direct edge between them. It is a simple and explicit way to represent a graph. The list itself stores only existing edges, but message passing over it typically builds one message per edge before aggregating, and that intermediate tensor can become very large on graphs with many edges.
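To make the two storage formats concrete, here is a minimal pure-Python sketch on a toy four-node graph. It is an illustration only: the helper name `edge_list_to_csr` is hypothetical, and PyTorch Geometric's own sparse tensor type keeps more metadata than the two lists built here.

```python
def edge_list_to_csr(num_nodes, edges):
    """Convert (source, target) pairs into CSR arrays grouped by target node.

    Returns (rowptr, col): the sources of the edges entering node i are
    col[rowptr[i]:rowptr[i + 1]]. Grouping by target mirrors a transposed
    adjacency matrix, which is the layout the name `adj_t` suggests.
    """
    # Count incoming edges per node
    counts = [0] * num_nodes
    for _, dst in edges:
        counts[dst] += 1

    # Prefix sums give the start offset of every row
    rowptr = [0]
    for count in counts:
        rowptr.append(rowptr[-1] + count)

    # Fill the column array row by row
    col = [0] * len(edges)
    cursor = rowptr[:-1].copy()
    for src, dst in edges:
        col[cursor[dst]] = src
        cursor[dst] += 1
    return rowptr, col


# Toy graph (hypothetical data): four nodes, four directed edges
edges = [(0, 1), (0, 2), (1, 2), (3, 2)]

# Edge-list layout: one row of sources, one row of targets
edge_index = [[s for s, _ in edges], [d for _, d in edges]]

# Compressed sparse row layout of the same graph
rowptr, col = edge_list_to_csr(4, edges)

# A fully materialized adjacency matrix would need num_nodes ** 2 entries
dense_entries = 4 ** 2

print("edge_index:", edge_index)
print("rowptr:", rowptr, "col:", col)
print("dense entries:", dense_entries, "stored edges:", len(edges))
```

Both layouts store the same four edges; the difference that matters for memory is how the layers consume them, not the size of the index arrays themselves.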

### Differences in GraphSAGE Classes
##### Input Data Format:

`GraphSAGE_sparse`: Takes an adjacency matrix as input. The forward method accepts `adj_t`, a sparse tensor representation of the adjacency matrix.

`GraphSAGE_dense`: Takes an edge list as input. The forward method accepts `edge_index`, which holds the indices of the source and target nodes for each edge.
##### Memory Efficiency:

`GraphSAGE_sparse`: More memory-efficient for large, sparse graphs, as it works only with the non-zero elements of the adjacency matrix.

`GraphSAGE_dense`: Can need far more memory on large graphs, but is suitable for smaller graphs.
##### Computational Efficiency:

`GraphSAGE_sparse`: Can be computationally efficient for certain operations because aggregation can be expressed as a sparse matrix product.

`GraphSAGE_dense`: May involve more work and memory traffic, as it handles every edge in the edge list individually before aggregating.
##### Code Adaptability:

The two classes are otherwise identical: the same `SAGEConv` layers accept either input. Keeping both makes it easy to switch the GraphSAGE model between data formats and storage requirements.

In [None]:
#Uninstall the current CUDA version
!apt-get --purge remove cuda nvidia* libnvidia-*
!dpkg -l | grep cuda- | awk '{print $2}' | xargs -n1 dpkg --purge
!apt-get remove cuda-*
!apt autoremove
!apt-get update

In [2]:

#Download CUDA 11.6
!wget  --no-clobber https://developer.download.nvidia.com/compute/cuda/11.6.0/local_installers/cuda-repo-ubuntu1804-11-6-local_11.6.0-510.39.01-1_amd64.deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
#install CUDA kit dpkg
!dpkg -i cuda-repo-ubuntu1804-11-6-local_11.6.0-510.39.01-1_amd64.deb
!sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda-repo-ubuntu1804-11-6-local/7fa2af80.pub
!apt-get update
!apt-get install cuda-11-6

--2023-10-18 22:28:42--  https://developer.download.nvidia.com/compute/cuda/11.6.0/local_installers/cuda-repo-ubuntu1804-11-6-local_11.6.0-510.39.01-1_amd64.deb
Resolving developer.download.nvidia.com (developer.download.nvidia.com)... 152.199.20.126
Connecting to developer.download.nvidia.com (developer.download.nvidia.com)|152.199.20.126|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2681108236 (2.5G) [application/x-deb]
Saving to: ‘cuda-repo-ubuntu1804-11-6-local_11.6.0-510.39.01-1_amd64.deb’


2023-10-18 22:28:51 (270 MB/s) - ‘cuda-repo-ubuntu1804-11-6-local_11.6.0-510.39.01-1_amd64.deb’ saved [2681108236/2681108236]

--2023-10-18 22:28:51--  https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
Reusing existing connection to developer.download.nvidia.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 2942 (2.9K) [application/x-deb]
Saving to: ‘cuda-repo-ubuntu1804_10.0.130-1_amd6

In [3]:
!pip uninstall torch -y

Found existing installation: torch 2.0.1+cu118
Uninstalling torch-2.0.1+cu118:
  Successfully uninstalled torch-2.0.1+cu118


In [4]:
!pip install torch==1.13.0 -f https://data.pyg.org/whl/torch-1.13.0+cu116.html

Looking in links: https://data.pyg.org/whl/torch-1.13.0+cu116.html
Collecting torch==1.13.0
  Downloading torch-1.13.0-cp310-cp310-manylinux1_x86_64.whl (890.1 MB)
Collecting nvidia-cuda-runtime-cu11==11.7.99 (from torch==1.13.0)
  Downloading nvidia_cuda_runtime_cu11-11.7.99-py3-none-manylinux1_x86_64.whl (849 kB)
Collecting nvidia-cudnn-cu11==8.5.0.96 (from torch==1.13.0)
  Downloading nvidia_cudnn_cu11-8.5.0.96-2-py3-none-manylinux1_x86_64.whl (557.1 MB)
Collecting nvidia-cublas-cu11==11.10.3.66 (from torch==1.13.0)
  Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)

In [None]:
!pip install psutil

In [6]:
# Find the CUDA version PyTorch was installed with
!python -c "import torch; print(torch.version.cuda)"

11.7


In [7]:
# PyTorch version
!python -c "import torch; print(torch.__version__)"

1.13.0+cu117


In [8]:

!pip install pyg-lib torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-1.13.0+cu117.html
!pip install torch-geometric

Looking in links: https://data.pyg.org/whl/torch-1.13.0+cu117.html
Collecting pyg-lib
  Downloading https://data.pyg.org/whl/torch-1.13.0%2Bcu117/pyg_lib-0.3.0%2Bpt113cu117-cp310-cp310-linux_x86_64.whl (2.4 MB)
Collecting torch-scatter
  Downloading torch_scatter-2.1.2.tar.gz (108 kB)
  Preparing metadata (setup.py) ... done
Collecting torch-sparse
  Downloading torch_sparse-0.6.18.tar.gz (209 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: torch-scatter, torch-sparse
  error: subprocess-exited-with-error
  
  × python 

In [18]:
!pip install ogb
!pip install pynvml

Collecting pynvml
  Downloading pynvml-11.5.0-py3-none-any.whl (53 kB)
Installing collected packages: pynvml
Successfully installed pynvml-11.5.0


In [11]:
import torch
import torch.nn.functional as F
import torch_geometric.transforms as T
from torch_geometric.nn import SAGEConv
from torch_geometric.transforms import ToSparseTensor
from ogb.nodeproppred import PygNodePropPredDataset
import numpy as np
import pynvml

We use pynvml to get the GPU memory usage. The function below can be called anywhere in the code; in this example, it is called after the epoch. When I troubleshoot memory issues, a simple way to profile the code is to call this function in several spots, as one would with debugging print statements.


In [2]:
# Initialize NVML
pynvml.nvmlInit()

# Function to get the current GPU memory usage using pynvml
def get_gpu_memory_usage():
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # 0 is GPU index
    info = pynvml.nvmlDeviceGetMemoryInfo(handle)
    return info.used / 1024 / 1024  # convert to MB
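
The print-statement style of profiling can be wrapped in a small helper. This is a sketch, not part of the original notebook: `MemoryLog` and its methods are hypothetical names, and the reader is passed in as an argument so the same class works with `get_gpu_memory_usage` or any other zero-argument callable that returns megabytes. The demo uses canned readings so it runs without a GPU.

```python
class MemoryLog:
    """Record labelled memory readings and the change since the previous one."""

    def __init__(self, read_mb):
        self.read_mb = read_mb  # zero-argument callable returning usage in MB
        self.records = []       # list of (label, used_mb, delta_mb)

    def mark(self, label):
        """Take a reading now and store it under the given label."""
        used = self.read_mb()
        previous = self.records[-1][1] if self.records else used
        self.records.append((label, used, used - previous))
        return used

    def report(self):
        """Return one formatted line per reading."""
        return [
            f"{label}: {used:.1f} MB ({delta:+.1f} MB)"
            for label, used, delta in self.records
        ]


# Stand-in reader with made-up values, used here instead of a real GPU query
fake_readings = iter([1000.0, 1500.0, 1250.0])
log = MemoryLog(lambda: next(fake_readings))

log.mark("start")
log.mark("after forward")
log.mark("after backward")

for line in log.report():
    print(line)
```

In the notebook, one would construct `MemoryLog(get_gpu_memory_usage)` and call `mark` after the forward pass, the backward pass, and the optimizer step.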

Definition of the GraphSAGE model for sparse data. These layers use adjacency matrices.

In [3]:
class GraphSAGE_sparse(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, dropout=0.2):
        super().__init__()
        self.dropout = dropout
        self.conv1 = SAGEConv(in_dim, hidden_dim)
        self.conv2 = SAGEConv(hidden_dim, hidden_dim)
        self.conv3 = SAGEConv(hidden_dim, out_dim)
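    def forward(self, x, adj_t):
        x = self.conv1(x, adj_t)
        x = F.elu(x)
        # Pass training=self.training so dropout is disabled in eval mode
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.conv2(x, adj_t)
        x = F.elu(x)
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.conv3(x, adj_t)
        # Return log-probabilities over the classes
        return torch.log_softmax(x, dim=-1)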

Definition of the GraphSAGE model for dense data. These layers use edge indices.

In [4]:
class GraphSAGE_dense(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, dropout=0.2):
        super().__init__()
        self.dropout = dropout
        self.conv1 = SAGEConv(in_dim, hidden_dim)
        self.conv2 = SAGEConv(hidden_dim, hidden_dim)
        self.conv3 = SAGEConv(hidden_dim, out_dim)
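    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index)
        x = F.elu(x)
        # Pass training=self.training so dropout is disabled in eval mode
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.conv2(x, edge_index)
        x = F.elu(x)
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = self.conv3(x, edge_index)
        # Return log-probabilities over the classes
        return torch.log_softmax(x, dim=-1)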

Train and test functions for the sparse representation. In this simple example, there is only one epoch.



In [5]:
def train_sparse(model, data, optimizer):
    model.train()
    data = data.to('cuda')

    optimizer.zero_grad()
    out = model(data.x, data.adj_t)  # Use adj_t for sparse tensor representation
    # The model already returns log-probabilities, so apply the NLL loss directly
    loss = F.nll_loss(out, data.y.squeeze(1).long())
    loss.backward()
    optimizer.step()

    memory_usage = get_gpu_memory_usage()  # Getting memory usage with pynvml

    correct = (out.argmax(dim=1) == data.y.squeeze(1)).sum().item()
    return loss.item(), correct, memory_usage

@torch.no_grad()
def test_sparse(model, data):
    model.eval()
    data = data.to('cuda')

    out = model(data.x, data.adj_t)  # Use adj_t for sparse tensor representation
    pred = out.argmax(dim=1)

    memory_usage = get_gpu_memory_usage()  # Getting memory usage with pynvml

    return pred.cpu(), data.y.cpu(), memory_usage

Training/Test functions for dense. Again, we opt for one epoch in this example.

In [6]:
def train_dense(model, data, optimizer):
    model.train()
    data = data.to('cuda')

    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    # The model already returns log-probabilities, so apply the NLL loss directly
    loss = F.nll_loss(out, data.y.squeeze(1).long())
    loss.backward()
    optimizer.step()

    memory_usage = get_gpu_memory_usage()  # Getting memory usage with pynvml

    correct = (out.argmax(dim=1) == data.y.squeeze(1)).sum().item()
    return loss.item(), correct, memory_usage

@torch.no_grad()
def test_dense(model, data):
    model.eval()
    data = data.to('cuda')

    out = model(data.x, data.edge_index)
    pred = out.argmax(dim=1)

    memory_usage = get_gpu_memory_usage()  # Getting memory usage with pynvml

    return pred.cpu(), data.y.cpu(), memory_usage

Choose whether to use sparse or dense data

In [12]:
# Choose either sparse or dense
use_sparse = True  # Set to True for sparse, False for dense

if use_sparse:
    # Load the dataset
    dataset = PygNodePropPredDataset(name='ogbn-products', transform=T.ToSparseTensor())
    data = dataset[0]
    model = GraphSAGE_sparse(dataset.num_features, 128, dataset.num_classes)
    train = train_sparse
    test = test_sparse
else:
    # Load the dataset
    dataset = PygNodePropPredDataset(name='ogbn-products')
    data = dataset[0]
    model = GraphSAGE_dense(dataset.num_features, 128, dataset.num_classes)
    train = train_dense
    test = test_dense



In our scenario, opting for a sparse representation should work fine. Opting for a dense representation of the data will lead to an `OutOfMemoryError`.

This error occurs when the GPU runs out of memory during training, here because of the large memory requirement of the dense representation. In the recorded run, the failed allocation was 46.09 GiB, which exceeds the GPU's total capacity.
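
The 46.09 GiB figure can be reproduced with a back-of-the-envelope calculation. It rests on assumptions that the notebook does not state: that `data.edge_index` holds 123,718,280 directed edges (each undirected edge of the dataset stored in both directions), that node features have 100 float32 dimensions, and that the first `SAGEConv` layer materializes one feature vector per edge before aggregating.

```python
# Assumed dataset figures, not printed anywhere in the notebook
NUM_DIRECTED_EDGES = 123_718_280  # assumed number of columns in edge_index
NUM_FEATURES = 100                # assumed input feature dimension
BYTES_PER_FLOAT32 = 4


def per_edge_message_gib(num_edges, num_features, bytes_per_value=BYTES_PER_FLOAT32):
    """Size in GiB of a [num_edges, num_features] tensor of per-edge messages."""
    return num_edges * num_features * bytes_per_value / 2**30


gib = per_edge_message_gib(NUM_DIRECTED_EDGES, NUM_FEATURES)
print(f"Per-edge message tensor: {gib:.2f} GiB")
```

Under these assumptions the result rounds to 46.09 GiB, the same size as the failed allocation, which suggests the per-edge message tensor is what exhausts the GPU.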

One effective way to mitigate such memory issues is mini-batching. Mini-batching splits the dataset into smaller, manageable batches, which reduces the amount of memory needed at any one time. The Section 8.7 notebook applies mini-batching to the same dense representation without exhausting GPU memory, and so avoids the `OutOfMemoryError`.
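
The idea behind mini-batching a graph can be sketched in plain Python. This is an illustration of the general technique (batches of seed nodes, each with a capped number of sampled neighbors), not the code of the Section 8.7 notebook; the function names and the toy graph are hypothetical.

```python
import random


def iter_seed_batches(num_nodes, batch_size):
    """Yield consecutive batches of seed node ids."""
    for start in range(0, num_nodes, batch_size):
        yield list(range(start, min(start + batch_size, num_nodes)))


def sample_neighbors(neighbors, seeds, max_neighbors, rng):
    """Keep at most max_neighbors incoming edges for every seed node."""
    kept = []
    for target in seeds:
        candidates = neighbors.get(target, [])
        if len(candidates) > max_neighbors:
            candidates = rng.sample(candidates, max_neighbors)
        kept.extend((source, target) for source in candidates)
    return kept


# Toy graph (hypothetical data): node id -> list of neighbor ids
neighbors = {0: [1, 2, 3, 4], 1: [0], 2: [0, 3], 3: [0, 2], 4: [0]}
rng = random.Random(0)

batches = [
    (seeds, sample_neighbors(neighbors, seeds, max_neighbors=2, rng=rng))
    for seeds in iter_seed_batches(num_nodes=5, batch_size=2)
]

for seeds, batch_edges in batches:
    print(f"seeds={seeds} edges={len(batch_edges)}")
```

Each batch carries at most `batch_size * max_neighbors` edges, so the per-edge intermediate tensors are bounded by the batch configuration rather than by the size of the whole graph.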

In [13]:
model = model.to('cuda')
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Training and testing the model
loss, correct, train_memory = train(model, data, optimizer)
print(f"Loss: {loss}, Correct: {correct}, Memory Used: {train_memory} MB")

pred, true_labels, test_memory = test(model, data)
print(f"Predictions: {pred}, True Labels: {true_labels}, Memory Used: {test_memory} MB")

# Shutdown NVML
pynvml.nvmlShutdown()

Loss: 3.849055767059326, Correct: 47747, Memory Used: 24667.4375 MB
Predictions: tensor([10,  4,  4,  ...,  8,  4,  4]), True Labels: tensor([[0],
        [1],
        [2],
        ...,
        [8],
        [2],
        [4]]), Memory Used: 24667.4375 MB
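
To put the `Correct: 47747` figure in context, it helps to convert it to an accuracy and compare it with chance level. The node and class counts below are assumptions about `ogbn-products` that the notebook does not print. Note also that the train function counts correct predictions over all nodes, using the output computed before the optimizer step.

```python
# Value taken from the printed output above
correct = 47_747

# Assumed dataset figures, not printed anywhere in the notebook
NUM_NODES = 2_449_029
NUM_CLASSES = 47

accuracy = correct / NUM_NODES
chance = 1 / NUM_CLASSES

print(f"Accuracy: {accuracy:.2%}")
print(f"Chance level: {chance:.2%}")
```

Under these assumptions, both numbers are close to 2%, which is what one would expect from a model that has not yet learned anything after a single training step.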
