# Notebook to train models and evaluate datasets

This notebook provides instructions for training the models and evaluating them on the datasets.

## Download and Install requirements
Make sure to download the repo via `git clone https://github.com/fanminshi/atcs-lstm`
and `cd atcs-lstm`. This notebook **assumes** that you are in the `atcs-lstm` folder.
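
A quick way to confirm the working directory before running the cells is to look for the files the notebook relies on. The sketch below is a minimal check, assuming `project/train.py` and `requirements.txt` sit at the repo root (both are referenced by cells in this notebook). The helper name `in_repo_root` is hypothetical.

```python
from pathlib import Path

# Files this notebook invokes relative to the repo root.
EXPECTED_FILES = ("project/train.py", "requirements.txt")


def in_repo_root(path="."):
    """Return True if every expected file exists under `path`."""
    root = Path(path)
    return all((root / name).is_file() for name in EXPECTED_FILES)


if not in_repo_root():
    print("Not in the repo root: cd into the cloned folder first.")
```

If the message is printed, change into the cloned folder and rerun the check before installing anything.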

In [1]:

# Install the project and its dependencies
! pip install -e .
! pip install -r requirements.txt

Obtaining file:///home/fanmin/atcs-lstm
  Preparing metadata (setup.py) ... done
Collecting pytorch-lightning (from atcs-lstm==0.0.0)
  Using cached pytorch_lightning-2.2.2-py3-none-any.whl.metadata (21 kB)
Using cached pytorch_lightning-2.2.2-py3-none-any.whl (801 kB)
Installing collected packages: pytorch-lightning, atcs-lstm
  Running setup.py develop for atcs-lstm
Successfully installed atcs-lstm-0.0.0 pytorch-lightning-2.2.2


## Train Models

In [2]:
# Saves model checkpoints and logs under the ./models folder inside the root dir atcs-lstm.
# Train avg embedding model
! python project/train.py --encoder=avg --save_dir='./models'

Global seed set to 1234
unique token len 37211
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 4070 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name     | Type               | Params
------------------------------------------------
0 | encoder  | BaselineEncoder    | 11.2 M
1 | mlp      | Sequential         | 879 K 
2 | accuracy | MulticlassAccuracy | 0     
------------------------------------------------
879 K     Trainable params
11.2 M    Non-tra
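
The parameter counts in the summary above can be sanity-checked by hand. The sketch below is a reconstruction under assumptions, not code from the repo: it assumes 300-dimensional frozen word embeddings over the 37211-token vocabulary, and a classifier that maps a 4 × 300 feature vector through two 512-unit hidden layers to 3 classes. Under those assumptions the totals round to the 11.2 M and 879 K reported in the table.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


VOCAB_SIZE = 37211   # "unique token len" from the training log
EMBED_DIM = 300      # assumption: 300-d pretrained word vectors
HIDDEN = 512         # assumption: MLP hidden width
NUM_CLASSES = 3      # assumption: three NLI labels

# Embedding table: one vector per token.
embedding = VOCAB_SIZE * EMBED_DIM

# Assumed MLP: (4 * EMBED_DIM) -> HIDDEN -> HIDDEN -> NUM_CLASSES.
mlp = (linear_params(4 * EMBED_DIM, HIDDEN)
       + linear_params(HIDDEN, HIDDEN)
       + linear_params(HIDDEN, NUM_CLASSES))

print(f"embedding: {embedding:,}")  # 11,163,300
print(f"mlp: {mlp:,}")              # 879,107
```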

In [3]:
# Train lstm model
! python project/train.py --encoder=lstm --save_dir='./models'
# ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
# ┃   Runningstage.testing    ┃                           ┃
# ┃          metric           ┃       DataLoader 0        ┃
# ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
# │       test_accuracy       │    0.8037459254264832     │
# │         test_loss         │    0.5357282757759094     │
# └───────────────────────────┴───────────────────────────┘

Global seed set to 1234
unique token len 37211
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 4070 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
Missing logger folder: ./models/lstm
  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name     | Type               | Params
------------------------------------------------
0 | encoder  | LSTMEncoder        | 30.4 M
1 | mlp      | Sequential         | 4.5 M 
2 | accuracy | MulticlassAccuracy | 0     
------------------------------------------------
23.7 M 

In [6]:
# Train bi-lstm model
! python project/train.py --encoder=bi-lstm --save_dir='./models'



Global seed set to 1234
unique token len 37211
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 4070 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
  rank_zero_warn(f"Checkpoint directory {dirpath} exists and is not empty.")
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name     | Type               | Params
------------------------------------------------
0 | encoder  | LSTMEncoder        | 49.7 M
1 | mlp      | Sequential         | 8.7 M 
2 | accuracy | MulticlassAccuracy | 0     
------------------------------------------------
47.2 M    Trainable params
11.2 M    Non-tra
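
The encoder sizes reported for the LSTM variants are consistent with a standard LSTM parameter count. The sketch below assumes a single-layer LSTM with 300-dimensional inputs and a hidden size of 2048 per direction (neither value is stated in this notebook), plus a 37211 × 300 embedding table. It follows the PyTorch `nn.LSTM` convention of two bias vectors per gate.

```python
def lstm_params(input_size, hidden_size, bidirectional=False):
    """Parameter count of a single-layer LSTM, PyTorch-style.

    Each direction has 4 gates, each with input weights, recurrent
    weights, and two bias vectors.
    """
    per_direction = 4 * (input_size * hidden_size
                         + hidden_size * hidden_size
                         + 2 * hidden_size)
    return per_direction * (2 if bidirectional else 1)


EMBEDDING = 37211 * 300  # assumption: 300-d embedding table

for name, bidir in (("lstm", False), ("bi-lstm", True)):
    total = EMBEDDING + lstm_params(300, 2048, bidirectional=bidir)
    print(f"{name}: {total / 1e6:.1f} M")
```

Under these assumptions the totals round to 30.4 M and 49.7 M, matching the `LSTMEncoder` rows in the two summaries above.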

In [5]:
# Train bi-lstm model with max pooling
! python project/train.py --encoder=bi-lstm-max-pool --save_dir='./models'

Global seed set to 1234
Using the latest cached version of the dataset since stanfordnlp/snli couldn't be found on the Hugging Face Hub
Found the latest cached dataset configuration 'plain_text' at /home/fanmin/.cache/huggingface/datasets/stanfordnlp___snli/plain_text/0.0.0/cdb5c3d5eed6ead6e5a341c8e56e669bb666725b (last modified on Wed Apr 17 17:24:15 2024).
unique token len 37211
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
You are using a CUDA device ('NVIDIA GeForce RTX 4070 Ti') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
Missing logger folder: ./models/bi-lstm-max-pool
  rank_zero_warn(f"Checkpoint directory {dirpath
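
The four training cells differ only in the `--encoder` value, so they can also be driven from a single loop. The sketch below builds the same commands with `subprocess`. The helper names `build_train_command` and `train_all` are hypothetical; the script path and flags are the ones used in the cells above.

```python
import subprocess
import sys

# Encoder names taken from the training cells in this notebook.
ENCODERS = ["avg", "lstm", "bi-lstm", "bi-lstm-max-pool"]


def build_train_command(encoder, save_dir="./models"):
    """Return the argv list for one training run."""
    return [sys.executable, "project/train.py",
            f"--encoder={encoder}", f"--save_dir={save_dir}"]


def train_all(run=False):
    """Print each command; execute it only when run=True."""
    for encoder in ENCODERS:
        cmd = build_train_command(encoder)
        print(" ".join(cmd))
        if run:
            # check=True stops the loop if a run fails.
            subprocess.run(cmd, check=True)


train_all(run=False)  # dry run; pass run=True to start training
```

The dry run is the default so that the loop can be inspected before committing to four full training runs.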

## SentEval
Before running the cell below, run `atcs-lstm/SentEval/data/downstream/get_transfer_data.bash` from inside `atcs-lstm/SentEval/data/downstream/`.
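
To avoid a failed evaluation run, it can help to check that the download script has populated the data folder. The sketch below only tests whether the directory contains at least one subfolder, on the assumption that the script unpacks each task into its own subfolder; the exact folder names are not listed here, so none are assumed. The helper name `has_downstream_data` is hypothetical.

```python
from pathlib import Path

# Path relative to the repo root, as given above.
DOWNSTREAM_DIR = Path("SentEval/data/downstream")


def has_downstream_data(data_dir=DOWNSTREAM_DIR):
    """Return True if `data_dir` exists and holds at least one subfolder."""
    data_dir = Path(data_dir)
    return data_dir.is_dir() and any(p.is_dir() for p in data_dir.iterdir())


if not has_downstream_data():
    print(f"No task data under {DOWNSTREAM_DIR}; run get_transfer_data.bash first.")
```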

In [None]:
# Evaluate the avg embedding checkpoint on the SentEval tasks
! python project/train.py --senteval --checkpt=/home/fanmin/atcs-lstm/models/avg-epoch=19-val_loss=0.66-val_accuracy=0.72.ckpt

{'MR': {'devacc': 76.84, 'acc': 76.89, 'ndev': 10662, 'ntest': 10662}, 'CR': {'devacc': 46.54, 'acc': 36.24, 'ndev': 3775, 'ntest': 3775}, 'MPQA': {'devacc': 78.06, 'acc': 68.77, 'ndev': 10606, 'ntest': 10606}, 'SUBJ': {'devacc': 90.73, 'acc': 91.31, 'ndev': 10000, 'ntest': 10000}, 'SST2': {'devacc': 79.24, 'acc': 80.29, 'ndev': 872, 'ntest': 1821}, 'TREC': {'devacc': 73.24, 'acc': 83.0, 'ndev': 5452, 'ntest': 500}, 'MRPC': {'devacc': 73.58, 'acc': 72.87, 'f1': 81.38, 'ndev': 4076, 'ntest': 1725}, 'SICKEntailment': {'devacc': 81.0, 'acc': 78.49, 'ndev': 500, 'ntest': 4927}}
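
The raw dictionary is hard to scan, so a short summary can help when comparing checkpoints. The sketch below copies the `acc` and `ntest` values from the output above and computes an unweighted (macro) mean and a test-size-weighted (micro) mean. Averaging over test accuracy rather than `devacc` is a choice made here for illustration, not something the training script reports.

```python
# Test accuracy and test-set size per task, copied from the output above.
results = {
    "MR": (76.89, 10662),
    "CR": (36.24, 3775),
    "MPQA": (68.77, 10606),
    "SUBJ": (91.31, 10000),
    "SST2": (80.29, 1821),
    "TREC": (83.0, 500),
    "MRPC": (72.87, 1725),
    "SICKEntailment": (78.49, 4927),
}


def macro_average(res):
    """Unweighted mean of per-task accuracy."""
    return sum(acc for acc, _ in res.values()) / len(res)


def micro_average(res):
    """Mean accuracy weighted by each task's test-set size."""
    total = sum(n for _, n in res.values())
    return sum(acc * n for acc, n in res.values()) / total


for task, (acc, n) in results.items():
    print(f"{task:<15} {acc:6.2f}  (n={n})")
print(f"macro: {macro_average(results):.2f}")
print(f"micro: {micro_average(results):.2f}")
```

To summarise another checkpoint, replace the values in `results` with those from its own SentEval output.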