amzn/sto-transformer

Source code of paper "Transformer Uncertainty Estimation with Hierarchical Stochastic Attention" (AAAI 2022)

1. Introduction

This repository releases the source code of the stochastic models proposed in the paper "Transformer Uncertainty Estimation with Hierarchical Stochastic Attention", accepted at AAAI 2022. We implement stochastic transformer models for the following two NLP tasks:

  1. Sentiment Analysis (code/IMDB);
  2. Linguistic Acceptability (code/CoLA);
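At the core of both models is attention whose weights are sampled rather than computed deterministically, which is what makes repeated forward passes yield different predictions. Below is a minimal NumPy sketch of Gumbel-softmax-sampled attention to illustrate the idea; the function names and shapes are illustrative only and are not the repository's API:

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Draw a soft, differentiable-style sample from a categorical
    distribution by adding Gumbel noise to the logits, then softmaxing."""
    rng = rng if rng is not None else np.random.default_rng()
    gumbel = -np.log(-np.log(rng.uniform(1e-9, 1.0, size=logits.shape)))
    y = (logits + gumbel) / tau
    y = y - y.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)

def stochastic_attention(q, k, v, tau=1.0, rng=None):
    """Scaled dot-product attention whose weights are stochastic samples."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)            # (n_q, n_k) attention logits
    weights = gumbel_softmax(scores, tau=tau, rng=rng)
    return weights @ v, weights
```

Lower temperatures `tau` push the sampled weights toward one-hot selections; each call produces a different attention pattern, which is the source of the model's predictive uncertainty.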

2. Environment

We implemented the model with PyTorch 1.8.1 and Python 3.7.6. Set up the experimental environment with the following steps:

conda create -n pytorch_latest_p37 python=3.7 anaconda  # create the virtual environment
source activate pytorch_latest_p37                      # activate the environment
sh setup.sh                                             # install all dependent packages

3. Datasets

We run experiments on two datasets:

  1. IMDB: https://pytorch.org/text/_modules/torchtext/datasets/imdb.html
  2. CoLA: https://nyu-mll.github.io/CoLA/
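The CoLA distribution ships as tab-separated files; assuming the standard four-column layout (source code, acceptability label, optional author annotation, sentence), a minimal reader looks like this. The helper name is mine, not the repository's:

```python
import csv
import io

def read_cola_tsv(lines):
    """Parse CoLA-style TSV rows into (sentence, label) pairs.
    Assumed columns: source, label (0/1), annotation, sentence."""
    rows = []
    for cols in csv.reader(lines, delimiter="\t"):
        source, label, _annotation, sentence = cols
        rows.append((sentence, int(label)))
    return rows

sample = io.StringIO("gj04\t1\t\tThe book was read by John.\n")
print(read_cola_tsv(sample))  # [('The book was read by John.', 1)]
```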

4. Quickstart Training

4.1 Sentiment Analysis

  1. Preprocessing
python code/IMDB/Run.py --mode=pre --model_name=IMDB --model_type=tf-sto --exp_name=default --job_id=123456 --debug=0
  2. Train & test N_RUN times with uncertainty
python code/IMDB/Run.py --mode=uncertain-train-test --model_name=IMDB --model_type=tf-sto --exp_name=single_t1  --debug=0

More details can be found at code/IMDB/README.md.
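Conceptually, the uncertain-train-test mode runs the stochastic model multiple times and aggregates the resulting predictions into a class estimate plus an uncertainty score. A hedged sketch of one common aggregation (mean probability and predictive entropy; names are illustrative, not the repository's):

```python
import numpy as np

def predictive_uncertainty(prob_runs):
    """Aggregate N stochastic forward passes.
    prob_runs: array-like of shape (n_runs, n_examples, n_classes).
    Returns the argmax of the mean probabilities and the predictive
    entropy of the mean as a per-example uncertainty score."""
    probs = np.asarray(prob_runs, dtype=float)
    mean = probs.mean(axis=0)                          # average over runs
    entropy = -(mean * np.log(mean + 1e-12)).sum(-1)   # predictive entropy
    return mean.argmax(-1), entropy

# Two runs that agree -> low entropy; runs that disagree -> high entropy.
runs = [[[0.9, 0.1], [0.4, 0.6]],
        [[0.8, 0.2], [0.6, 0.4]]]
pred, unc = predictive_uncertainty(runs)
```

Here the second example's two runs cancel out to a near-uniform mean, so its entropy is higher than the first example's, flagging it as uncertain.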

4.2 Linguistic Acceptability

  1. Download the CoLA dataset from the repository (https://github.com/pranavajitnair/CoLA)
  2. To train and validate the model, run:
python train.py --model_type sto_transformer  --inference True  --sto_transformer True --model_name dual --dual True

More details can be found at code/CoLA/README.md.
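CoLA is conventionally evaluated with the Matthews correlation coefficient (MCC), which is robust to the dataset's label imbalance. For reference, a minimal stdlib implementation (this is the standard metric definition, not necessarily the exact code train.py uses):

```python
import math

def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels:
    (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

print(matthews_corrcoef([1, 1, 0, 0], [1, 0, 0, 0]))  # ≈ 0.577
```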

Reference

@inproceedings{pei2022transformer,
    title={Transformer uncertainty estimation with hierarchical stochastic attention},
    author={Pei, Jiahuan and Wang, Cheng and Szarvas, Gy{\"o}rgy},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={36},
    number={10},
    pages={11147--11155},
    year={2022}
}

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
