# Text-GCN Training Pipeline

This notebook walks through the complete Text-GCN training pipeline for text classification.

## Overview
Text-GCN builds a heterogeneous graph containing:
- **Document nodes**: Each text document is a node
- **Word nodes**: Each word in the vocabulary is a node
- **Edges**: Document-word edges are weighted by TF-IDF; word-word edges are weighted by pointwise mutual information (PMI)

A Graph Convolutional Network (GCN) is then trained on this graph to classify the document nodes.
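The graph described above can be sketched on a toy corpus. This is a minimal illustration, not the code in `main.py`: the function names `build_text_graph` and `normalize_adjacency`, the whitespace tokenizer, the smoothed IDF formula, and the window size of 3 are all assumptions made for the example. It keeps only positive-PMI word pairs and adds self-loops, which is the usual Text-GCN setup.

```python
import math
from collections import Counter
from itertools import combinations

import numpy as np


def build_text_graph(docs, window_size=3):
    """Build a toy Text-GCN adjacency matrix (hypothetical helper).

    Nodes 0..len(docs)-1 are documents; the remaining nodes are words.
    Returns (adj, vocab).
    """
    # Assumption: lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({w for tokens in tokenized for w in tokens})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_docs = len(docs)
    adj = np.eye(n_docs + len(vocab))  # self-loop on every node

    # Document-word edges weighted by TF-IDF (smoothed IDF is an assumption).
    doc_freq = Counter(w for tokens in tokenized for w in set(tokens))
    for d, tokens in enumerate(tokenized):
        for w, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log((1 + n_docs) / (1 + doc_freq[w])) + 1.0
            j = n_docs + word_id[w]
            adj[d, j] = adj[j, d] = tf * idf

    # Word-word edges weighted by PMI, estimated from sliding windows.
    windows = []
    for tokens in tokenized:
        if not tokens:
            continue
        n_starts = max(len(tokens) - window_size + 1, 1)
        windows.extend(set(tokens[s:s + window_size]) for s in range(n_starts))
    word_count = Counter(w for win in windows for w in win)
    pair_count = Counter(
        pair for win in windows for pair in combinations(sorted(win), 2)
    )
    for (a, b), count in pair_count.items():
        pmi = math.log(count * len(windows) / (word_count[a] * word_count[b]))
        if pmi > 0:  # keep only positively associated word pairs
            i, j = n_docs + word_id[a], n_docs + word_id[b]
            adj[i, j] = adj[j, i] = pmi
    return adj, vocab


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} used by GCN layers."""
    inv_sqrt_deg = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]


docs = ["graph neural networks", "graph convolution for text"]
adj, vocab = build_text_graph(docs)
adj_hat = normalize_adjacency(adj)
print(adj.shape)  # (8, 8): 2 documents + 6 distinct words
```

A dense NumPy matrix is used here only for readability; on a real corpus the adjacency matrix is large and mostly zeros, so a sparse format would be the more realistic choice.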

## Pipeline Steps
1. Configuration & Setup
2. Data Loading & Graph Construction
3. Model Creation
4. Training Loop
5. Evaluation
6. Results Analysis

## Memory Management
⚠️ This notebook explicitly frees large intermediate objects (`del` followed by `gc.collect()`) to keep peak memory down on large datasets.
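The cleanup pattern looks roughly like the sketch below. The function `summarize_and_free` and the array it allocates are hypothetical stand-ins for a large intermediate such as a co-occurrence matrix; they are not taken from this project's code.

```python
import gc

import numpy as np


def summarize_and_free(n=2000):
    """Compute a small summary of a large array, then release the array."""
    big = np.zeros((n, n), dtype=np.float32)  # hypothetical large intermediate
    row_sums = big.sum(axis=1)                # keep only the small result

    del big        # drop the last reference so the memory can be reclaimed
    gc.collect()   # also collect any unreachable reference cycles
    return row_sums


row_sums = summarize_and_free()
print(row_sums.shape)  # (2000,)
```

In CPython, `del` on the last reference already frees the array; `gc.collect()` matters mainly when objects are kept alive by reference cycles.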

In [None]:
from google.colab import drive

drive.mount("/content/drive")

# Project directory on Google Drive; reused below instead of repeating the literal path.
PATH = "/content/drive/MyDrive/OMSCS/CS7643/final_project/TextGCN"
%cd {PATH}

---
## 1. Configuration & Setup

In [None]:
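# The wheel index in the URL must match the installed torch version and build (CPU or CUDA);
# check torch.__version__ and adjust the URL if it differs.
!pip install torch-scatter torch-sparse torch-cluster torch-geometric -f https://data.pyg.org/whl/torch-2.8.0+cpu.html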

In [None]:
!pip install -r ./requirements.txt

In [None]:
!python -u main.py --num_epochs 20