
Sparse Gaussian Process Attention

This is example code for the paper "Calibrating Transformers via Sparse Gaussian Processes" (ICLR 2023).

This code implements SGPA on the CIFAR10 and IMDB datasets.

To use this code, simply run train_cifar.py or train_imdb.py (e.g. python train_cifar.py).

The IMDB dataset can be downloaded here.

Dependencies:

  • Python - 3.8
  • PyTorch - 1.10.2
  • numpy - 1.22.4
  • einops - 0.4.1
  • pandas - 1.4.3
  • transformers - 4.18.0
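
A matching environment can be set up with pip, for example (a sketch based on the versions listed above; the exact PyTorch build, e.g. CUDA vs. CPU, should be chosen to match your system):

pip install torch==1.10.2 numpy==1.22.4 einops==0.4.1 pandas==1.4.3 transformers==4.18.0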

The ECE/MCE values reported in the paper are computed according to this script. Note that in this script, ECE/MCE are computed from the differences between the predicted probabilities and the labels for all classes (not just the max-probability class); a minimal sketch of this classwise computation follows.
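
Since the linked script is not reproduced here, the sketch below illustrates the classwise idea in numpy. The function name classwise_ece_mce, the 15 equal-width bins, and the bin-mass weighting are illustrative assumptions, not necessarily the exact choices of the referenced script.

import numpy as np

def classwise_ece_mce(probs, labels, n_bins=15):
    # probs:  (N, K) array of predicted class probabilities
    # labels: (N,)   array of integer class labels
    # Calibration gaps are measured for every class's predicted
    # probability, not only for the max-probability class.
    N, K = probs.shape
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece, mce = 0.0, 0.0
    for k in range(K):
        p_k = probs[:, k]                  # predicted probability of class k
        y_k = (labels == k).astype(float)  # 1.0 where the true label is k
        for i in range(n_bins):
            if i == 0:  # first bin includes its left edge (probability 0)
                in_bin = (p_k >= edges[i]) & (p_k <= edges[i + 1])
            else:
                in_bin = (p_k > edges[i]) & (p_k <= edges[i + 1])
            if not in_bin.any():
                continue
            gap = abs(p_k[in_bin].mean() - y_k[in_bin].mean())
            ece += gap * in_bin.mean() / K  # weight by bin mass, average over classes
            mce = max(mce, gap)             # worst gap over all bins and classes
    return ece, mce

Here probs would be, for example, the softmax outputs on the test set stacked into an (N, K) array.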

Citing the paper (BibTeX)

@inproceedings{chen2023calibrating,
  title = {Calibrating Transformers via Sparse Gaussian Processes},
  author = {Chen, Wenlong and Li, Yingzhen},
  booktitle = {International Conference on Learning Representations},
  year = {2023}
}
