Text-Level-GNN

An implementation of the paper: Text Level Graph Neural Network for Text Classification (https://arxiv.org/pdf/1910.02356.pdf)

Features:

  • Dynamic edge weights instead of static edge weights
  • All documents are built from one big shared graph, instead of each document having its own independent structure
  • Public edge sharing, achieved by computing edge statistics during dataset construction and masking during training (a mechanism only roughly described in the paper, without further detail; a minimal sketch follows this list)
  • Flexible argument controls and early stopping features
  • Detailed explanations about intermediate operations
  • The number of parameters in this model is close to the parameter count reported in the paper
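
To make the public-edge idea concrete, here is a minimal sketch of how an edge vocabulary could be built over the training corpus, with word pairs seen fewer than min_freq times mapped to one shared public edge. The function and variable names are illustrative and do not necessarily match the code in train.py.

    # Hypothetical sketch (names are illustrative, not the repo's actual API):
    # build a shared edge vocabulary over the training corpus, mapping
    # word-pair edges seen fewer than min_freq times to a single public edge.
    from collections import Counter

    def build_edge_vocab(tokenized_docs, p=3, min_freq=2):
        counts = Counter()
        for doc in tokenized_docs:
            for i, w in enumerate(doc):
                # neighbours within a window of p words to the left and right
                for j in range(max(0, i - p), min(len(doc), i + p + 1)):
                    if j != i:
                        counts[(w, doc[j])] += 1
        edge2id = {'<public>': 0}  # rare edges all share index 0
        for pair, c in counts.items():
            if c >= min_freq:
                edge2id[pair] = len(edge2id)
        return edge2id

At training time, each document's edges then index into one shared edge-weight table, with rare or unseen word pairs falling back to the public edge.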

File structure:

+---embeddings\
|             +---glove.6B.50d.txt
|             +---glove.6B.100d.txt
|             +---glove.6B.200d.txt
|             +---glove.6B.300d.txt
+---train.py
+---r52-test-all-terms.txt
+---r52-train-all-terms.txt
+---r8-test-all-terms.txt
+---r8-train-all-terms.txt

Since the original links no longer work, I provide both the original links and the corresponding dataset files in this repository for anyone else looking for the R8 and R52 datasets.

https://www.cs.umb.edu/~smimarog/textmining/datasets/r8-train-all-terms.txt => r8-train-all-terms.txt
https://www.cs.umb.edu/~smimarog/textmining/datasets/r8-test-all-terms.txt => r8-test-all-terms.txt
https://www.cs.umb.edu/~smimarog/textmining/datasets/r52-train-all-terms.txt => r52-train-all-terms.txt
https://www.cs.umb.edu/~smimarog/textmining/datasets/r52-test-all-terms.txt => r52-test-all-terms.txt
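
The embeddings\ folder listed above is expected to contain pre-trained GloVe vectors. As a rough illustration of the file format (one word followed by its vector per line), a file such as glove.6B.300d.txt could be loaded as shown below; the actual loading code in train.py may differ.

    # Minimal GloVe loader sketch; the function name is illustrative.
    import numpy as np

    def load_glove(path='embeddings/glove.6B.300d.txt'):
        vocab, vectors = [], []
        with open(path, encoding='utf-8') as f:
            for line in f:
                word, *values = line.rstrip().split(' ')
                vocab.append(word)
                vectors.append(np.asarray(values, dtype=np.float32))
        return vocab, np.stack(vectors)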

Environment:

  • Python 3.7.4
  • PyTorch 1.5.1 + CUDA 10.1
  • Pandas 1.0.5
  • NumPy 1.19.0

The model runs successfully on an RTX 2070, RTX 2080 Ti, and RTX 3090. However, memory consumption is large enough that a smaller batch size, shorter MAX_LENGTH, or smaller embedding_size is required on the RTX 2070.

Usage:

  • Linux:
    • OMP_NUM_THREADS=1 python train.py --cuda=0 --embedding_size=300 --p=3 --min_freq=2 --max_length=70 --dropout=0 --epoch=300
  • Windows:
    • python train.py --cuda=0 --embedding_size=300 --p=3 --min_freq=2 --max_length=70 --dropout=0 --epoch=300

Result:

I only tested the model on the R8 dataset and was unable to reach the figures reported in the paper, despite some hyperparameter tuning. The closest run I could get is:

Train Accuracy    Validation Accuracy    Test Accuracy
99.91%            95.7%                  96.2%

with embedding_size=300, p=3, 70 <= max_length <= 150, and dropout=0. Since the experimental settings are not clearly stated in the paper, I assumed they also used a learning rate decay mechanism. I also added a warm-up mechanism to pretrain the model, but the model converged quickly and did not actually need the warm-up.
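
For reference, here is a minimal sketch of a linear warm-up followed by exponential decay, assuming an Adam optimizer; the warm-up length and decay factor are placeholder values, and the schedule actually implemented in train.py may differ.

    # Illustrative learning-rate schedule: linear warm-up, then exponential decay.
    import torch

    def make_lr_lambda(warmup_epochs=5, decay=0.95):
        def lr_lambda(epoch):
            if epoch < warmup_epochs:
                return (epoch + 1) / warmup_epochs   # linear warm-up
            return decay ** (epoch - warmup_epochs)  # exponential decay afterwards
        return lr_lambda

    model = torch.nn.Linear(300, 8)                  # stand-in for the GNN model
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=make_lr_lambda())

    for epoch in range(300):
        # ... train for one epoch ...
        scheduler.step()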