# Project Overview
This project investigates matrix completion through both baseline methods and a novel feature-enhanced approach.  
- I reproduced and tuned **GCMC** and **Funk-SVD** as baselines, extending them with visualization and feature-aware adaptations.  
- I proposed and implemented the **Feature Node GCMC** model, which introduces feature nodes into a tripartite graph and redesigns message passing, making side information an integral part of the propagation process.  
- Together, this work covers both the replication of existing methods and my own design and improvement of graph-based matrix completion models.  

---

# Repository Structure

## GCMC and SVD
This folder documents my baseline exploration and extensions.

- **GCMC**
  - Reproduced the original implementation and performed hyperparameter tuning.  
  - Developed a workflow to map prediction results back to user–item indices.  
  - Built a recommendation system with visualizations on top of the recovered indices.  

- **SVD**
  - Implemented Funk-SVD for the matrix completion task.  
  - Extended Funk-SVD by integrating user and item side features into the factorization process.  
  - Evaluated the effect of feature-aware factorization on matrix completion performance.  
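
To make the two baseline ideas above concrete (Funk-SVD factorization, and mapping predictions back to the original user–item IDs), here is a minimal NumPy sketch. It is not the code in `svd_ml100k.py`: the toy ratings, the helper names `build_index`, `funk_svd` and `recommend`, and the small factor count are all assumptions chosen for illustration, and side features are left out.

```python
import numpy as np

def build_index(raw_ids):
    """Map arbitrary raw IDs to contiguous indices 0..n-1 and back."""
    uniq = sorted(set(raw_ids))
    to_idx = {raw: i for i, raw in enumerate(uniq)}
    return to_idx, uniq  # uniq[i] recovers the raw ID for index i

def funk_svd(triples, n_users, n_items, factors=8, lr=0.01, reg=0.02,
             epochs=300, seed=42):
    """Plain Funk-SVD: SGD on observed (user_idx, item_idx, rating) triples."""
    rng = np.random.default_rng(seed)
    P = rng.normal(0.0, 0.1, (n_users, factors))  # user factors
    Q = rng.normal(0.0, 0.1, (n_items, factors))  # item factors
    mu = float(np.mean([r for _, _, r in triples]))  # global mean rating
    for _ in range(epochs):
        for k in rng.permutation(len(triples)):
            u, i, r = triples[k]
            err = r - (mu + P[u] @ Q[i])
            pu = P[u].copy()  # keep the old user vector for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return mu, P, Q

# Hypothetical toy ratings as (raw_user_id, raw_item_id, rating).
raw = [(196, 242, 3.0), (196, 302, 4.0), (22, 242, 1.0),
       (22, 377, 1.0), (244, 302, 5.0), (244, 51, 2.0)]

user_to_idx, idx_to_user = build_index(u for u, _, _ in raw)
item_to_idx, idx_to_item = build_index(i for _, i, _ in raw)
triples = [(user_to_idx[u], item_to_idx[i], r) for u, i, r in raw]

mu, P, Q = funk_svd(triples, len(idx_to_user), len(idx_to_item))

def recommend(raw_user, top_k=2):
    """Score unseen items for one user and report them by raw item ID."""
    u = user_to_idx[raw_user]
    scores = mu + Q @ P[u]
    seen = {i for uu, i, _ in triples if uu == u}
    ranked = [i for i in np.argsort(-scores) if i not in seen]
    return [(idx_to_item[i], float(scores[i])) for i in ranked[:top_k]]

print(recommend(196))
```

The model only ever sees contiguous indices; the `idx_to_user` and `idx_to_item` lists are what allow predictions to be reported in terms of the original dataset IDs.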

## Feature Node GCMC
This folder contains my main contribution.

- Redesigned the data modeling schema by introducing **feature nodes**, constructing a tripartite user–item–feature graph.  
- Modified the message passing mechanism in graph convolution so that features directly participate in propagation.  
- Designed and implemented the **Feature Node GCMC** model as the core improvement of this project.  
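
The tripartite idea can be illustrated with a small NumPy sketch. It is a simplification under stated assumptions, not the model in this folder: it uses a single unweighted adjacency matrix (no separate channel per rating level), symmetric degree normalization, and hypothetical toy edges; `build_tripartite_adjacency` and `propagate` are names invented for this example.

```python
import numpy as np

def build_tripartite_adjacency(n_users, n_items, n_feats,
                               user_item, user_feat, item_feat):
    """Dense symmetric adjacency over [users | items | feature nodes]."""
    n = n_users + n_items + n_feats
    item_off, feat_off = n_users, n_users + n_items
    A = np.zeros((n, n))
    for u, i in user_item:   # observed interactions
        A[u, item_off + i] = A[item_off + i, u] = 1.0
    for u, f in user_feat:   # user attribute edges
        A[u, feat_off + f] = A[feat_off + f, u] = 1.0
    for i, f in item_feat:   # item attribute edges
        A[item_off + i, feat_off + f] = A[feat_off + f, item_off + i] = 1.0
    return A

def propagate(A, H):
    """One propagation step: D^{-1/2} A D^{-1/2} H (isolated nodes stay zero)."""
    deg = A.sum(axis=1)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    A_norm = inv_sqrt[:, None] * A * inv_sqrt[None, :]
    return A_norm @ H

# Hypothetical toy graph: 3 users, 4 items, 3 feature nodes.
n_users, n_items, n_feats = 3, 4, 3
user_item = [(0, 0), (0, 1), (1, 0), (2, 3)]
user_feat = [(0, 0), (1, 0), (2, 1)]          # e.g. an occupation bucket
item_feat = [(0, 2), (1, 2), (2, 2), (3, 2)]  # e.g. a genre

A = build_tripartite_adjacency(n_users, n_items, n_feats,
                               user_item, user_feat, item_feat)
n = n_users + n_items + n_feats
H1 = propagate(A, np.eye(n))  # one-hot inputs expose each neighbor's weight
print(H1[0].round(3))         # user 0 aggregates from items 0, 1 and feature 0
```

Because feature nodes sit in the same adjacency matrix as users and items, a user's updated representation receives messages from its feature nodes in the same step as from its rated items, rather than having features concatenated afterwards.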



---

# Usage
The following Google Colab notebook cells show how to run each model.  
You can adjust the arguments (e.g., number of epochs, hidden dimensions, latent factors) according to your experimental needs.  
Each folder also provides supporting scripts for visualization and evaluation of the recommendation results.



In [2]:
# Mount Google Drive to access files in Colab
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# change the working directory
import os
os.chdir('/content/drive/MyDrive/GCMC and SVD')
print(os.getcwd())  # confirm the current working directory

/content/drive/MyDrive/GCMC and SVD


In [None]:
# GCMC without side features (stdout and stderr are written to ml_100k_testing.txt)
!python train.py -d ml_100k --accum stack -do 0.7 -nleft -nb 2 -e 1000 --testing > ml_100k_testing.txt 2>&1

In [None]:
# GCMC with side features
!python train.py -d ml_100k --accum stack -do 0.7 -nleft -nb 2 -e 1000 --features --feat_hidden 10 --testing

In [None]:
# change the working directory
import os
os.chdir('/content/drive/MyDrive/GCMC and SVD/SVD')
print(os.getcwd())  # confirm the current working directory

/content/drive/MyDrive/GCMC and SVD/SVD


In [None]:
# Funk-SVD without side features (point --data_dir at your copy of ml_100k)
!python svd_ml100k.py \
  --data_dir "/content/drive/MyDrive/master project/data/ml_100k" \
  --split 1 \
  --factors 64 \
  --lr 0.01 \
  --reg 0.02 \
  --epochs 20 \
  --seed 42

In [None]:
# Funk-SVD with side features (point --data_dir at your copy of ml_100k)
!python svd_ml100k_feat.py --data_dir "/content/drive/MyDrive/master project/data/ml_100k" --split 1 \
  --factors 64 --lr 0.01 --reg 0.02 --epochs 20 --seed 42 \
  --use_features 1 --feat_reg 0.01

In [None]:
# change the working directory
import os
os.chdir('/content/drive/MyDrive/Feature Node GCMC')
print(os.getcwd())  # confirm the current working directory

In [None]:
# Feature Node GCMC
!python train-6.py -d ml_100k -f --feature_nodes -ac stack -do 0.7 -nb 2 -e 1000 -t
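
When adjusting arguments across several runs, it can help to generate the command lines rather than edit them by hand. The sketch below only reuses flags that appear in the GCMC cells above; the helper name `gcmc_command`, the dropout grid, and the log file names are assumptions made for illustration.

```python
import shlex

def gcmc_command(dropout, epochs=1000, features=False):
    """Build a train.py command line from the flags used in the cells above."""
    args = ["python", "train.py", "-d", "ml_100k", "--accum", "stack",
            "-do", str(dropout), "-nleft", "-nb", "2", "-e", str(epochs)]
    if features:
        args += ["--features", "--feat_hidden", "10"]
    args.append("--testing")
    return shlex.join(args)

# Hypothetical dropout grid; each run gets its own (hypothetical) log file.
for do in (0.5, 0.6, 0.7):
    log = f"ml_100k_do{do}.txt"
    print(f"{gcmc_command(do)} > {log} 2>&1")
```

Each printed line can be pasted into a Colab cell with a leading `!`.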