SPC

This repository contains the official code for the AAAI 2024 Oral paper "Structured Probabilistic Coding".

Highlights

Structured Probabilistic Coding (SPC) is a supervised representation learning method: an encoder-only probabilistic coding framework with structured regularization from the target space.

By learning compact, informative representations of the input that are relevant to the target task, SPC enhances the generalization ability of pre-trained language models for better language understanding.

Experimental results on 12 natural language understanding tasks demonstrate that SPC effectively improves the performance of pre-trained language models on both classification and regression.
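
For intuition, the training objective can be pictured as a task loss plus two regularizers whose weights correspond to the var_weight (beta) and clu_weight (gamma) hyperparameters exposed in the scripts below. The PyTorch sketch that follows is only a rough illustration under assumed design choices (a Gaussian latent posterior and a simple class-centroid term); it is not this repo's implementation nor the exact formulation in the paper.

# Illustrative sketch of an encoder-only probabilistic coding objective.
# Not the official SPC implementation; design choices here are assumptions.
import torch
import torch.nn.functional as F
from torch import nn
from transformers import AutoModel

class SPCSketch(nn.Module):
    def __init__(self, backbone="ptms/roberta-base", latent_dim=128, num_labels=4):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(backbone)
        hidden = self.encoder.config.hidden_size
        self.mu = nn.Linear(hidden, latent_dim)        # posterior mean
        self.logvar = nn.Linear(hidden, latent_dim)    # posterior log-variance
        self.classifier = nn.Linear(latent_dim, num_labels)

    def forward(self, input_ids, attention_mask, labels, beta=0.1, gamma=0.1):
        h = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state[:, 0]
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        task_loss = F.cross_entropy(self.classifier(z), labels)
        # Variational regularizer (information-bottleneck style): KL(q(z|x) || N(0, I)).
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        # Structured regularizer from the target space: pull each sample toward the
        # centroid of its class in latent space (a simple clustering-style stand-in).
        centroids = torch.stack([
            z[labels == c].mean(0) if (labels == c).any() else torch.zeros_like(z[0])
            for c in range(self.classifier.out_features)
        ])
        clu_loss = F.mse_loss(z, centroids[labels])
        return task_loss + beta * kl + gamma * clu_loss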

News

  • [TODO]: Release the model checkpoints of SPC.
  • [Mar 2024]: Added support for the multi-task version of SPC.
  • [Feb 2024]: Code is available on GitHub.
  • [Dec 2023]: Paper is available on arXiv.
  • [Dec 2023]: Paper is accepted by AAAI 2024 (Oral).

Quick Start

  1. Clone the repository
git clone https://github.com/zerohd4869/SPC.git
cd ./SPC
  2. Download the data and pre-trained model parameters

Download the 12 datasets mentioned in the paper from here, and extract the files into the /SPC/data/ directory. This repo already contains 7 of these datasets by default, so this step is optional.

Download the roberta-base model parameters from here and place them in the /SPC/ptms/roberta-base/ directory. SPC is a backbone-agnostic representation learning method: you can choose an appropriate backbone model and initialization checkpoint for your task or dataset.
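
Alternatively, the backbone weights can be fetched programmatically. The snippet below is one possible way, assuming the transformers library from the requirements and the default /SPC/ptms/roberta-base/ target directory:

# Optional: download roberta-base from the Hugging Face hub into the expected directory.
from transformers import AutoModel, AutoTokenizer

AutoTokenizer.from_pretrained("roberta-base").save_pretrained("ptms/roberta-base")
AutoModel.from_pretrained("roberta-base").save_pretrained("ptms/roberta-base")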

  3. Install dependencies
# env: Python 3.7.16, Tesla A100 80GB
pip install -r spc_requirements.txt
  4. Run examples

For classification:

# EmojiEval dataset
nohup bash script/run_train_emojieval.sh >  spc_roberta_emojieval.out &

# EmotionEval dataset
nohup bash script/run_train_emotioneval.sh >  spc_roberta_emotioneval.out &

# HatEval dataset
nohup bash script/run_train_hateval.sh >  spc_roberta_hateval.out &

# IronyEval dataset
nohup bash script/run_train_ironyeval.sh >  spc_roberta_ironyeval.out &

# OffensEval dataset
nohup bash script/run_train_offenseval.sh >  spc_roberta_offenseval.out &

# SentiEval dataset
nohup bash script/run_train_sentieval.sh >  spc_roberta_sentieval.out &

# StanceEval dataset
nohup bash script/run_train_stanceeval.sh >  spc_roberta_stanceeval.out &

# ISEAR dataset
nohup bash script/run_train_isear.sh >  spc_roberta_isear.out &

# MELD dataset
nohup bash script/run_train_meld.sh >  spc_roberta_meld.out &

# GoEmotions dataset
nohup bash script/run_train_goemotions.sh >  spc_roberta_goemotions.out &

For regression:

# STS-B dataset
nohup bash script/run_train_stsb.sh >  spc_roberta_stsb.out &

# CLAIRE dataset
nohup bash script/run_train_claire.sh >  spc_roberta_claire.out &

Additional Recipes

Apply SPC to a new task/dataset

  1. Data preparation and loading script. Download the new dataset (take NewDataset as an example) and place the unzipped files in the /SPC/data/ directory. Add the label information of this dataset to the dictionary file SPC/data/task2label.json. Then, referring to the template /SPC/datasets/new_dataset_script.py, write the corresponding reading script for the dataset (see the sketch after this list) and place the file in the /SPC/datasets/ directory. Also, add the dataset and task information to the file SPC/task.py at the corresponding location.

  2. Refer to the Quick Start section above to write the corresponding sh script and run it.
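
As a rough illustration of what such a reading script might contain, here is a minimal sketch based on the Hugging Face datasets loading-script interface. The file layout (data/NewDataset/*.tsv), column order, and label names are hypothetical; adapt them to the actual template in /SPC/datasets/new_dataset_script.py.

# new_dataset_script.py -- hypothetical loading script for NewDataset (adapt to the repo's template).
import csv
import datasets

class NewDataset(datasets.GeneratorBasedBuilder):
    def _info(self):
        return datasets.DatasetInfo(
            features=datasets.Features({
                "text": datasets.Value("string"),
                "label": datasets.ClassLabel(names=["negative", "positive"]),  # hypothetical label set
            })
        )

    def _split_generators(self, dl_manager):
        # Assumes the unzipped files live under /SPC/data/NewDataset/.
        return [
            datasets.SplitGenerator(name=datasets.Split.TRAIN, gen_kwargs={"filepath": "data/NewDataset/train.tsv"}),
            datasets.SplitGenerator(name=datasets.Split.VALIDATION, gen_kwargs={"filepath": "data/NewDataset/dev.tsv"}),
            datasets.SplitGenerator(name=datasets.Split.TEST, gen_kwargs={"filepath": "data/NewDataset/test.tsv"}),
        ]

    def _generate_examples(self, filepath):
        with open(filepath, encoding="utf-8") as f:
            for idx, row in enumerate(csv.reader(f, delimiter="\t")):
                yield idx, {"text": row[0], "label": row[1]}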

When training SPC, the primary hyperparameters to tune, along with their suggested ranges, are as follows:

var_weight (beta): [0.01, 0.1, 1, 10]
clu_weight (gamma): [0.01, 0.1, 1, 10]

weight_decay: [0, 0.001]
dropout: [0, 0.2]
normalize_flag: [False, True]

Other hyperparameters, such as epochs, patience, warmup_ratio, bs (batch size), and max_length, can be adjusted based on experimental conditions and specific task requirements.
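
If you want to automate this tuning, one simple option is a small grid-search driver like the hypothetical sketch below. It assumes the run script picks up VAR_WEIGHT and CLU_WEIGHT from the environment, which may differ from how the scripts in script/ are actually parameterized (you may instead need to edit the values inside the .sh files).

# grid_search.py -- hypothetical sweep over the SPC regularization weights.
# Assumes the training script reads VAR_WEIGHT / CLU_WEIGHT from the environment;
# otherwise, edit the corresponding values inside script/*.sh for each run.
import itertools
import os
import subprocess

for var_weight, clu_weight in itertools.product([0.01, 0.1, 1, 10], repeat=2):
    env = dict(os.environ, VAR_WEIGHT=str(var_weight), CLU_WEIGHT=str(clu_weight))
    log = f"spc_roberta_emojieval_b{var_weight}_g{clu_weight}.out"
    with open(log, "w") as f:
        subprocess.run(["bash", "script/run_train_emojieval.sh"],
                       env=env, stdout=f, stderr=subprocess.STDOUT, check=True)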

Apply SPC to all tasks in a multi-task paradigm

# 6 tasks/datasets in TweetEval
nohup bash script/run_train_mtl_tweeteval.sh >  spc_roberta_mtl_tweeteval.out &

Citation

If you are interested in this work and want to use the code in this repo, please star this repo and cite the paper as:

@inproceedings{hu2024structured,
  title={Structured Probabilistic Coding},
  author={Dou Hu and Lingwei Wei and Yaxin Liu and Wei Zhou and Songlin Hu},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2024}
}