First, let's import the necessary libraries and prepare a toy dataset. In a real scenario, you would fetch content feature vectors for users and items from your MongoDB; here, we simulate them with small NumPy arrays. Each user and item is represented by a feature vector (e.g., topics of interest for users and content topics for courses). We also define a set of user-item interactions, which become the edges of the graph. This toy data is enough to demonstrate the full model pipeline:

In [None]:
!pip install torch numpy matplotlib networkx scikit-learn

In [None]:
import numpy as np
import networkx as nx
import torch
from torch import nn
import torch.optim as optim
import matplotlib.pyplot as plt

In [None]:
# Use GPU if available, otherwise fallback to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

# Sample data parameters
num_users = 5
num_items = 10
feature_dim = 3   # Dimension of content feature vectors for users/items (example)

# Example content-based feature vectors for each user (e.g., preference scores for 3 topics)
user_content = np.array([
    [1.0, 0.2, 0.1],   # User 0: strong in topic0
    [0.1, 0.9, 0.2],   # User 1: strong in topic1
    [0.7, 0.7, 0.1],   # User 2: interest in topic0 & topic1
    [0.0, 0.1, 0.9],   # User 3: strong in topic2
    [0.2, 0.8, 0.8]    # User 4: interest in topic1 & topic2
])

# Example content-based feature vectors for each item (e.g., content distribution over 3 topics)
item_content = np.array([
    [1.0, 0.0, 0.0],    # Item 0: pure topic0 content
    [0.8, 0.2, 0.0],    # Item 1: mostly topic0
    [0.0, 1.0, 0.0],    # Item 2: pure topic1 content
    [0.1, 0.9, 0.0],    # Item 3: mostly topic1
    [0.0, 0.0, 1.0],    # Item 4: pure topic2 content
    [0.0, 0.4, 0.8],    # Item 5: mostly topic2
    [0.6, 0.6, 0.0],    # Item 6: mixed topic0 & topic1
    [0.0, 0.7, 0.7],    # Item 7: mixed topic1 & topic2
    [0.5, 0.0, 0.5],    # Item 8: mixed topic0 & topic2
    [0.33, 0.33, 0.33]  # Item 9: balanced across all topics
])

# Define user-item interactions (edges in the bipartite graph)
# Each tuple (u, i) means user u has interacted with item i (e.g., viewed or liked the course)
interactions = [
    (0, 0), (0, 1),        # User 0 interacted with Item 0 and 1
    (1, 2), (1, 3),        # User 1 with Item 2 and 3
    (2, 0), (2, 2), (2, 6),# User 2 with Item 0, 2, 6
    (3, 4), (3, 5),        # User 3 with Item 4, 5
    (4, 3), (4, 5), (4, 7) # User 4 with Item 3, 5, 7
]

# Build neighbor lists (adjacency) for each user and item, for graph propagation
user_neighbors = {u: [] for u in range(num_users)}
item_neighbors = {i: [] for i in range(num_items)}
for (u, i) in interactions:
    user_neighbors[u].append(i)
    item_neighbors[i].append(u)

# Degree (number of neighbors) for normalization in GCN propagation
deg_user = {u: len(user_neighbors[u]) for u in user_neighbors}
deg_item = {i: len(item_neighbors[i]) for i in item_neighbors}

# No explicit conversion to PyTorch is needed here: the model converts the
# NumPy feature arrays to tensors when it initializes its embeddings.


Next, let's visualize the user-item interaction graph to understand the structure. We create a bipartite graph where one set of nodes are users (U0–U4) and the other set are items (I0–I9), with edges showing interactions:

In [None]:
# Visualize the bipartite user-item graph
G = nx.Graph()
# Add user nodes and item nodes
user_nodes = [f"U{u}" for u in range(num_users)]
item_nodes = [f"I{i}" for i in range(num_items)]
G.add_nodes_from(user_nodes, bipartite=0)
G.add_nodes_from(item_nodes, bipartite=1)
# Add edges between users and items
for (u, i) in interactions:
    G.add_edge(f"U{u}", f"I{i}")

# Compute positions for a bipartite layout (users on left, items on right)
pos = {}
# Users at x = -1, spread out vertically
for idx, u in enumerate(user_nodes):
    pos[u] = (-1.0, idx)
# Items at x = 1, spread out vertically
for idx, i in enumerate(item_nodes):
    pos[i] = (1.0, idx)

# Draw the graph
plt.figure(figsize=(6, 4))
nx.draw_networkx_nodes(G, pos, nodelist=user_nodes, node_color='lightblue', node_shape='s', label='Users')
nx.draw_networkx_nodes(G, pos, nodelist=item_nodes, node_color='lightgreen', node_shape='o', label='Items')
nx.draw_networkx_edges(G, pos, edge_color='gray')
nx.draw_networkx_labels(G, pos, font_size=8)
plt.axis('off')
plt.title("User-Item Interaction Graph")
plt.legend()
plt.show()


User-item bipartite graph visualization. Squares represent users U0–U4 and circles represent items I0–I9. An edge between a user and an item means that the user interacted with that item (e.g., viewed or liked a course). Items I8 and I9 have no edges because no user in the toy data interacted with them. Such a graph is the basis for graph-based collaborative filtering models like LightGCN.

In the code above, we used NetworkX to draw the graph, placing users on the left (x = -1) and items on the right (x = 1) for clarity. This structure shows how information will flow in the GNN: user nodes aggregate messages from their item neighbors, and item nodes aggregate messages from their user neighbors.

## LightGCN Model Definition
Now, we'll define the LightGCN model. LightGCN is a simplified Graph Convolutional Network for recommendation that uses only neighborhood aggregation and omits feature transformations and nonlinear activations. It learns embeddings for users and items by propagating information over the user-item interaction graph. Importantly, we will integrate content-based features by initializing the node embeddings with the given feature vectors (instead of random initialization). Key points of LightGCN:

- Each user and item has an embedding vector that is updated through multiple propagation layers.
- In each layer, a node's embedding is replaced by a normalized sum of its neighbors' embeddings from the previous layer. The contribution of item i to user u (and of user u to item i) is scaled by 1/sqrt(|N(u)| * |N(i)|), where N(·) is the set of neighbors.
- The final embedding for a node is the sum (or average) of its embeddings from every layer, including the initial layer.
- The model predicts the affinity of a user for an item as the inner product of their final embeddings.
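
To make the normalization factor concrete, the sketch below recomputes the node degrees from the toy interaction list and prints the weight 1/sqrt(deg(u) * deg(i)) attached to each edge. The helper name `edge_weights` is hypothetical and only serves this illustration; the interaction list is copied from the data cell above.

```python
import math
from collections import Counter

# Toy interactions copied from the data cell above: (user, item) pairs
interactions = [
    (0, 0), (0, 1), (1, 2), (1, 3), (2, 0), (2, 2), (2, 6),
    (3, 4), (3, 5), (4, 3), (4, 5), (4, 7),
]

def edge_weights(edges):
    """Return {(u, i): 1 / sqrt(deg(u) * deg(i))} for every edge (hypothetical helper)."""
    deg_u = Counter(u for u, _ in edges)  # number of items per user
    deg_i = Counter(i for _, i in edges)  # number of users per item
    return {(u, i): 1.0 / math.sqrt(deg_u[u] * deg_i[i]) for u, i in edges}

weights = edge_weights(interactions)
for (u, i), w in sorted(weights.items()):
    print(f"U{u} - I{i}: weight = {w:.3f}")
```

For example, the edge U0-I0 gets weight 1/sqrt(2 * 2) = 0.5 because both nodes have two neighbors, so nodes with many connections contribute less per edge.
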
Let's implement this step by step in PyTorch:

In [None]:
class LightGCN(nn.Module):
    def __init__(self, num_users, num_items, embed_dim, num_layers, user_features=None, item_features=None):
        """
        LightGCN model initialization.
        - num_users, num_items: number of user and item nodes.
        - embed_dim: embedding vector dimension.
        - num_layers: number of propagation layers in GCN.
        - user_features, item_features: optional initial features for users and items (numpy arrays).
        """
        super(LightGCN, self).__init__()
        self.num_users = num_users
        self.num_items = num_items
        self.embed_dim = embed_dim
        self.num_layers = num_layers

        # Define trainable embeddings for users and items, randomly initialized (Xavier) by default.
        self.user_embedding = nn.Embedding(num_users, embed_dim)
        self.item_embedding = nn.Embedding(num_items, embed_dim)
        nn.init.xavier_uniform_(self.user_embedding.weight)
        nn.init.xavier_uniform_(self.item_embedding.weight)

        # If content features are provided and match embed_dim, use them as initial embeddings.
        # On a dimensionality mismatch the random initialization above is kept.
        if user_features is not None and user_features.shape[1] == embed_dim:
            with torch.no_grad():
                self.user_embedding.weight.copy_(torch.as_tensor(user_features, dtype=torch.float32))
        if item_features is not None and item_features.shape[1] == embed_dim:
            with torch.no_grad():
                self.item_embedding.weight.copy_(torch.as_tensor(item_features, dtype=torch.float32))

    def propagate(self, user_neighbors, item_neighbors, deg_user, deg_item):
        """
        Perform K-layer propagation to compute final embeddings for all users and items.
        - user_neighbors, item_neighbors: dictionaries of adjacency lists.
        - deg_user, deg_item: dictionaries of node degrees for normalization.
        Returns:
        - user_emb_final, item_emb_final: final embeddings after K propagation layers.
        """
        # Initial embeddings (layer 0)
        user_emb = self.user_embedding.weight    # shape: [num_users, embed_dim]
        item_emb = self.item_embedding.weight    # shape: [num_items, embed_dim]

        # Store embeddings at each layer (including initial)
        user_emb_layers = [user_emb]
        item_emb_layers = [item_emb]

        # Neighborhood aggregation for each layer
        for layer in range(1, self.num_layers + 1):
            # New containers for this layer's embeddings
            new_user_emb = torch.zeros_like(user_emb)
            new_item_emb = torch.zeros_like(item_emb)
            prev_user_emb = user_emb_layers[-1]  # user embeddings from previous layer
            prev_item_emb = item_emb_layers[-1]  # item embeddings from previous layer

            # Aggregate neighbor embeddings for each user
            for u in range(self.num_users):
                if len(user_neighbors[u]) == 0:
                    # If a user has no neighbors (no interactions), skip (remains zero)
                    continue
                # Sum the embeddings of all neighbor items (from prev layer)
                # using normalized weight 1/sqrt(deg(u) * deg(item))
                total = torch.zeros(self.embed_dim, device=user_emb.device)
                for i in user_neighbors[u]:
                    norm_factor = (deg_user[u] * deg_item[i]) ** 0.5  # sqrt(|N(u)| * |N(i)|)
                    total += prev_item_emb[i] / norm_factor
                new_user_emb[u] = total

            # Aggregate neighbor embeddings for each item
            for i in range(self.num_items):
                if len(item_neighbors[i]) == 0:
                    continue
                total = torch.zeros(self.embed_dim, device=user_emb.device)
                for u in item_neighbors[i]:
                    norm_factor = (deg_user[u] * deg_item[i]) ** 0.5
                    total += prev_user_emb[u] / norm_factor
                new_item_emb[i] = total

            # Append this layer's embeddings to the list
            user_emb_layers.append(new_user_emb)
            item_emb_layers.append(new_item_emb)

        # Combine embeddings from all layers (including initial layer 0).
        # Here the layer embeddings are summed; the LightGCN paper uses uniform weights
        # 1/(K+1), i.e. the mean, which differs only by a constant factor.
        user_emb_final = torch.zeros_like(user_emb)
        item_emb_final = torch.zeros_like(item_emb)
        for emb in user_emb_layers:
            user_emb_final += emb
        for emb in item_emb_layers:
            item_emb_final += emb
        # (Optionally divide by (num_layers + 1) to take the mean instead; for fixed embeddings
        # this rescales every score by the same constant and leaves the ranking unchanged.)

        return user_emb_final, item_emb_final

    def forward(self, user_indices, item_indices, user_neighbors, item_neighbors, deg_user, deg_item):
        """
        Forward pass to predict scores for given user-item pairs.
        - user_indices, item_indices: tensors of indices for users and items to score.
        Returns:
        - scores: tensor of predicted scores for each (user, item) pair.
        """
        # Get final embeddings after propagation
        user_emb_final, item_emb_final = self.propagate(user_neighbors, item_neighbors, deg_user, deg_item)
        # Select the embeddings for the specified user and item indices
        u_emb = user_emb_final[user_indices]    # shape: [batch_size, embed_dim]
        i_emb = item_emb_final[item_indices]    # shape: [batch_size, embed_dim]
        # Compute dot-product similarity for each pair (this is the predicted score)
        scores = torch.sum(u_emb * i_emb, dim=1)  # inner product along embedding dimension
        return scores


In [None]:
# Instantiate the LightGCN model
embed_dim = feature_dim  # we'll use the same dimension as our feature vectors for embedding
num_layers = 2           # number of propagation layers in GCN
model = LightGCN(num_users, num_items, embed_dim, num_layers, user_features=user_content, item_features=item_content)
model = model.to(device)
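
The per-node loops in `propagate` are easy to follow but scale poorly. As a cross-check of the same update rule, here is a minimal matrix-form sketch. `propagate_dense` is a hypothetical helper, not part of the model above, and it assumes a dense interaction matrix, which is only practical at toy scale; larger graphs would need a sparse adjacency.

```python
import torch

def propagate_dense(user_emb, item_emb, interactions, num_layers):
    """Matrix-form LightGCN propagation (illustrative; dense adjacency, toy scale only)."""
    num_users, num_items = user_emb.shape[0], item_emb.shape[0]
    # R[u, i] = 1 if user u interacted with item i
    R = torch.zeros(num_users, num_items, dtype=user_emb.dtype, device=user_emb.device)
    for u, i in interactions:
        R[u, i] = 1.0
    # Symmetric normalization: R_hat[u, i] = R[u, i] / sqrt(deg(u) * deg(i))
    deg_u = R.sum(dim=1, keepdim=True).clamp(min=1.0)  # clamp avoids division by zero
    deg_i = R.sum(dim=0, keepdim=True).clamp(min=1.0)  # for nodes without neighbors
    R_hat = R / torch.sqrt(deg_u * deg_i)

    user_sum, item_sum = user_emb.clone(), item_emb.clone()  # layer-0 contribution
    u_k, i_k = user_emb, item_emb
    for _ in range(num_layers):
        # Both updates read the previous layer, as in the loop-based version
        u_k, i_k = R_hat @ i_k, R_hat.t() @ u_k
        user_sum = user_sum + u_k
        item_sum = item_sum + i_k
    return user_sum, item_sum

# Tiny example: 2 users, 2 items, 1 propagation layer
u0 = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
i0 = torch.tensor([[1.0, 1.0], [2.0, 0.0]])
users_out, items_out = propagate_dense(u0, i0, [(0, 0), (1, 0), (1, 1)], num_layers=1)
print(users_out)
print(items_out)
```

Rows of `R` that are all zero (nodes without interactions) receive no messages, so their output is just their initial embedding, which matches the `continue` branches in the loop version.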


## Training the LightGCN Model
We will train the model using Bayesian Personalized Ranking (BPR) loss, which is a common objective for implicit feedback recommenders like ours. BPR loss encourages the model to score observed (user, item) pairs higher than unobserved pairs. In practice, for each known interaction (positive example), we sample a random item that the user has not interacted with (negative example), and train the model to rank the positive item higher than the negative. We also include L2 regularization on the embedding parameters to prevent overfitting.
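
As a small worked example of the objective, the sketch below evaluates the BPR loss for a single (positive, negative) pair in plain Python. The function name `bpr_loss` and the example scores are made up for illustration; the training loop computes the same quantity on PyTorch tensors.

```python
import math

def bpr_loss(score_pos, score_neg):
    """BPR loss for one (positive, negative) pair: -log(sigmoid(score_pos - score_neg))."""
    diff = score_pos - score_neg
    # -log(sigmoid(x)) == log(1 + exp(-x)); the two branches keep exp() from overflowing
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))

# Positive ranked above, tied with, and below the negative item
for pos, neg in [(2.0, 0.0), (0.0, 0.0), (0.0, 2.0)]:
    print(f"pos={pos}, neg={neg}: loss={bpr_loss(pos, neg):.4f}")
```

When both scores are equal the loss is log 2 (about 0.693); it shrinks toward zero as the positive item is scored increasingly higher than the negative one, and grows roughly linearly when the order is reversed.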

Let's set up the training loop:

In [None]:
# Training setup
optimizer = optim.Adam(model.parameters(), lr=0.01)  # Adam optimizer for gradient descent
num_epochs = 50
reg_lambda = 1e-4  # regularization strength

# Prepare for training
model.train()  # set model to training mode
epoch_losses = []  # to record the training loss each epoch

# Training loop
for epoch in range(1, num_epochs+1):
    total_loss = 0.0
    count = 0
    # Shuffle interactions for each epoch (stochastic order)
    np.random.shuffle(interactions)
    for (u, i) in interactions:
        # Skip if a user has no negative candidate (not likely in real data unless user interacted with all items)
        if len(user_neighbors[u]) == num_items:
            continue
        # Sample a negative item (j) that user u has NOT interacted with
        j = np.random.randint(0, num_items)
        while j in user_neighbors[u]:
            j = np.random.randint(0, num_items)
        # Create tensors for the user and item indices
        # We'll score (u,i) as positive and (u,j) as negative in one forward pass by duplicating u
        user_idx = torch.tensor([u, u], dtype=torch.long, device=device)
        item_idx = torch.tensor([i, j], dtype=torch.long, device=device)

        # Forward pass: get scores for (u,i) and (u,j)
        optimizer.zero_grad()              # reset gradients from previous step
        scores = model(user_idx, item_idx, user_neighbors, item_neighbors, deg_user, deg_item)
        # scores is a tensor of length 2: [score(u,i), score(u,j)]
        score_pos = scores[0]
        score_neg = scores[1]

        # Compute BPR loss: -log(sigmoid(score_pos - score_neg))
        diff = score_pos - score_neg
        loss = -torch.nn.functional.logsigmoid(diff)  # numerically stable form of -log(sigmoid(diff))

        # Add L2 regularization on user and item embeddings (to discourage large weights)
        reg_loss = reg_lambda * (model.user_embedding.weight.norm(2).pow(2) +
                                 model.item_embedding.weight.norm(2).pow(2))
        total_batch_loss = loss + reg_loss

        # Backpropagation
        total_batch_loss.backward()  # compute gradients
        optimizer.step()             # update model parameters

        total_loss += loss.item()    # accumulate the BPR loss (for monitoring)
        count += 1

    # Record average loss for this epoch
    avg_loss = total_loss / (count if count > 0 else 1)
    epoch_losses.append(avg_loss)
    # Print progress occasionally
    if epoch % 10 == 0 or epoch == 1:
        print(f"Epoch {epoch}/{num_epochs}, average BPR loss = {avg_loss:.4f}")
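
The rejection loop used above for negative sampling works well when each user has interacted with only a few items, but it can spin for a long time if a user has seen almost everything. One possible alternative, sketched here under the assumption that building a candidate list per draw is affordable, samples directly from the unseen items. `sample_negative` is a hypothetical helper, and the fixed seed is only there to make the draw repeatable.

```python
import random

def sample_negative(user_items, num_items, rng):
    """Draw one item index the user has not interacted with, or None if there is none."""
    seen = set(user_items)
    candidates = [j for j in range(num_items) if j not in seen]
    return rng.choice(candidates) if candidates else None

rng = random.Random(0)  # fixed seed so that repeated runs draw the same negatives
neg = sample_negative([0, 2, 6], num_items=10, rng=rng)
print(neg)  # some item index other than 0, 2, or 6
```

Each draw costs time proportional to `num_items`, so for large catalogs the rejection loop is usually the cheaper choice as long as interactions are sparse.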


In [None]:
# Plot the training loss curve
plt.figure(figsize=(6,4))
plt.plot(range(1, len(epoch_losses)+1), epoch_losses, marker='o')
plt.title("Training Loss Curve (BPR loss)")
plt.xlabel("Epoch")
plt.ylabel("Average BPR Loss")
plt.grid(True)
plt.show()


In [None]:
# Switch to evaluation mode for inference
model.eval()

# Get final embeddings for all users and items after training
user_emb_final, item_emb_final = model.propagate(user_neighbors, item_neighbors, deg_user, deg_item)
user_emb_final = user_emb_final.cpu().detach().numpy()
item_emb_final = item_emb_final.cpu().detach().numpy()

# Use PCA to reduce embedding dimension to 2 for visualization
from sklearn.decomposition import PCA
all_emb = np.vstack([user_emb_final, item_emb_final])  # combine user and item embeddings
if all_emb.shape[1] > 2:
    # Fit PCA on all embeddings and transform
    emb_2d = PCA(n_components=2).fit_transform(all_emb)
else:
    # If embeddings are already 2D or 1D, we can pad or take them as is
    if all_emb.shape[1] == 1:
        emb_2d = np.hstack([all_emb, all_emb])  # duplicate the single dimension
    else:
        emb_2d = all_emb

# Split back into user and item embeddings
user_emb_2d = emb_2d[:num_users]
item_emb_2d = emb_2d[num_users:]

# Plot the 2D embeddings
plt.figure(figsize=(6,6))
plt.scatter(user_emb_2d[:,0], user_emb_2d[:,1], c='blue', label='Users')
plt.scatter(item_emb_2d[:,0], item_emb_2d[:,1], c='green', label='Items')
# Annotate points with their IDs (U0, I0, etc.)
for u_idx in range(num_users):
    plt.annotate(f"U{u_idx}", (user_emb_2d[u_idx,0], user_emb_2d[u_idx,1]), color='blue')
for i_idx in range(num_items):
    plt.annotate(f"I{i_idx}", (item_emb_2d[i_idx,0], item_emb_2d[i_idx,1]), color='green')
plt.title("User and Item Embeddings in 2D Space")
plt.legend()
plt.show()


In [None]:
# Function to get top-N recommendations for a given user
def recommend_for_user(user_id, N=3):
    model.eval()
    # Get all item scores for this user
    user_indices = torch.tensor([user_id] * num_items, device=device)
    item_indices = torch.tensor(list(range(num_items)), device=device)
    with torch.no_grad():  # no gradient needed for inference
        scores = model(user_indices, item_indices, user_neighbors, item_neighbors, deg_user, deg_item)
    scores = scores.cpu().numpy()
    # Rank items by descending score and keep the top N the user hasn't interacted with
    ranked_items = np.argsort(-scores)
    # Cast NumPy integers to plain ints so the returned list prints cleanly
    recommended = [int(item) for item in ranked_items if item not in user_neighbors[user_id]][:N]

    return recommended, scores

# Example: recommendations for User 2
user_id = 2
topN = 3
# Rank every item the user has not interacted with; the first topN are the recommendations
rec_items, all_item_scores = recommend_for_user(user_id, N=num_items)
print(f"Top {topN} recommended items for User {user_id}: {rec_items[:topN]}")
recommended_item_scores = np.array([all_item_scores[item_id] for item_id in rec_items])

# Convert scores to relative weights using softmax
# softmax(x_i) = exp(x_i) / sum_j exp(x_j)
# The softmax is applied only to the candidate (not yet interacted) items, so the values are
# relative to those candidates rather than to all items. They are not calibrated probabilities.
exp_scores = np.exp(np.asarray(recommended_item_scores) - np.max(recommended_item_scores))  # subtract max for numerical stability
probabilities = exp_scores / np.sum(exp_scores)

# Convert probabilities to percentages
percentages = probabilities * 100

print("Candidate items with relative score share (softmax):")
for i, item_id in enumerate(rec_items):
    print(f"Item {item_id}: {percentages[i]:.2f}%")
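
Because softmax normalizes over whatever scores it is given, the percentages above depend on which items are included. The sketch below illustrates this with made-up scores (they are not outputs of the trained model): the same item gets a larger share when the normalization covers only the candidate items than when it covers all items.

```python
import math

def softmax(scores):
    """Softmax with the usual max-subtraction for numerical stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for four items; the first two are assumed already seen by the user
all_scores = [3.0, 2.5, 1.0, 0.5]
candidates = all_scores[2:]

share_all = softmax(all_scores)
share_candidates = softmax(candidates)
print(f"item 2 among all items:  {share_all[2]:.3f}")
print(f"item 2 among candidates: {share_candidates[0]:.3f}")
```

The ranking of the candidates is the same in both cases; only the scale of the reported shares changes.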

