How are Prompts Different in Terms of Sensitivity?

This repository contains the code for the paper "How are Prompts Different in Terms of Sensitivity?". It includes code to estimate sensitivity, compute gradient-based saliency scores, and run sensitivity-aware decoding.

Abstract: In-context learning (ICL) has become one of the most popular learning paradigms. While there is a growing body of literature focusing on prompt engineering, there is a lack of systematic analysis comparing the effects of prompt techniques across different models and tasks. To address this, we present a comprehensive prompt analysis based on the sensitivity of a function. Our analysis reveals that sensitivity is an unsupervised proxy for model performance, as it exhibits a strong negative correlation with accuracy. We use gradient-based saliency scores to empirically demonstrate how different prompts affect the relevance of input tokens to the output, resulting in different levels of sensitivity. Furthermore, we introduce sensitivity-aware decoding which incorporates sensitivity estimation as a penalty term in the standard greedy decoding. We show that this approach is particularly helpful when information in the input is scarce. Our work provides a fresh perspective on the analysis of prompts, and contributes to a better understanding of the mechanism of ICL.
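To make the notion of sensitivity in the abstract more concrete, here is a minimal sketch of one way to estimate it: perturb an input with small random noise and count how often the predicted label changes. This is an illustration under stated assumptions, not necessarily the estimator used in the paper or the notebooks. The names `estimate_sensitivity` and `toy_classifier`, the Gaussian noise model, and the default parameters are all hypothetical.

```python
import random


def estimate_sensitivity(predict, x, noise_scale=0.1, n_samples=200, seed=0):
    """Return the fraction of noisy copies of x whose label differs from predict(x).

    Assumption: x is a list of floats (e.g. a flattened input representation)
    and predict maps such a list to a discrete label.
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    base_label = predict(x)
    flips = 0
    for _ in range(n_samples):
        noisy = [v + rng.gauss(0.0, noise_scale) for v in x]
        if predict(noisy) != base_label:
            flips += 1
    return flips / n_samples


def toy_classifier(x):
    # Hypothetical stand-in for a model: label 1 if the features sum to > 0.
    return 1 if sum(x) > 0 else 0


near_boundary = [0.01, -0.005]    # tiny noise can flip the label
far_from_boundary = [2.0, 1.5]    # the label is stable under small noise

high = estimate_sensitivity(toy_classifier, near_boundary)
low = estimate_sensitivity(toy_classifier, far_from_boundary)
print(f"near boundary: {high:.2f}, far from boundary: {low:.2f}")
```

No labels are needed for this estimate, which is what makes sensitivity usable as an unsupervised signal.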

Contact person: Sheng Lu

UKP Lab: https://www.ukp.tu-darmstadt.de/

TU Darmstadt: https://www.tu-darmstadt.de/

Getting Started

Prepare the environment:

pip install -r requirements.txt

Inference

See inference.ipynb for the inference code.

Saliency scores

See saliency.ipynb for the calculation of gradient-based saliency scores (Simonyan et al., 2013; Li et al., 2016; Yin and Neubig, 2022).
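For orientation, the following is a minimal sketch of one common gradient-based saliency variant, input × gradient, on a toy model small enough to differentiate by hand. It is not the code in saliency.ipynb or lm_saliency.py. The toy model (sum-pooled token embeddings followed by a linear softmax classifier) and the function name `input_x_gradient` are assumptions made for illustration.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def input_x_gradient(embeddings, weights, target):
    """Per-token input x gradient saliency for a toy model.

    Toy model (an assumption): logits = weights @ sum(embeddings),
    and the scored quantity is log p(target).
    """
    dim = len(embeddings[0])
    pooled = [sum(e[d] for e in embeddings) for d in range(dim)]
    logits = [sum(w[d] * pooled[d] for d in range(dim)) for w in weights]
    probs = softmax(logits)
    # d log p(target) / d pooled = w_target - sum_k p_k * w_k.
    # Sum pooling passes this gradient unchanged to every token embedding.
    grad = [
        weights[target][d] - sum(p * w[d] for p, w in zip(probs, weights))
        for d in range(dim)
    ]
    # Dot each token embedding with the gradient, then L1-normalise.
    raw = [sum(e[d] * grad[d] for d in range(dim)) for e in embeddings]
    norm = sum(abs(r) for r in raw) or 1.0
    return [r / norm for r in raw]


embeddings = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.1]]  # three toy "tokens"
weights = [[2.0, 0.0], [0.0, 2.0]]                 # two output classes
scores = input_x_gradient(embeddings, weights, target=0)
print(scores)
```

A positive score marks a token that pushes the model towards the target class, a negative score one that pushes away from it. With a real language model the gradient would come from automatic differentiation rather than a closed form.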

Sensitivity-aware decoding

See sensitivity_aware_decoding.ipynb for the implementation of sensitivity-aware decoding.
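The core idea, greedy decoding with a sensitivity penalty, can be sketched for a single decoding step as below. This is a simplified illustration, not the notebook's implementation. The function `sensitivity_aware_step`, the `penalty` weight, and the way per-token sensitivities are supplied as a ready-made dictionary are assumptions, and the numbers are made up.

```python
import math


def sensitivity_aware_step(log_probs, sensitivities, penalty=1.0):
    """Pick the token that maximises log-probability minus a sensitivity penalty.

    Assumption: both arguments are dicts over the same candidate tokens.
    With penalty=0 this reduces to standard greedy decoding.
    """
    if log_probs.keys() != sensitivities.keys():
        raise ValueError("log_probs and sensitivities must cover the same tokens")
    return max(log_probs, key=lambda tok: log_probs[tok] - penalty * sensitivities[tok])


# Made-up values for illustration only.
log_probs = {"yes": math.log(0.55), "no": math.log(0.45)}
sensitivities = {"yes": 0.6, "no": 0.1}

greedy_choice = sensitivity_aware_step(log_probs, sensitivities, penalty=0.0)
penalised_choice = sensitivity_aware_step(log_probs, sensitivities, penalty=1.0)
print(greedy_choice, penalised_choice)
```

In this toy case the slightly more probable token is also the more sensitive one, so the penalty changes the decision. How sensitivity is estimated per candidate and how the penalty is weighted are defined in the notebook and the paper.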

Evaluation scores

See evaluation_scores.csv and evaluation_scores - greedy_decoding.csv for the full evaluation scores.
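As a hedged sketch of how such a file might be analysed, the snippet below computes a Pearson correlation between two columns of a CSV. The column names `sensitivity` and `accuracy` are hypothetical, so check the actual headers in the CSV files first. The demo rows are made up and are not taken from the repository's data.

```python
import csv
import io
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


def correlation_from_csv(handle, x_col, y_col):
    """Read two numeric columns from an open CSV handle and correlate them."""
    rows = list(csv.DictReader(handle))
    xs = [float(row[x_col]) for row in rows]
    ys = [float(row[y_col]) for row in rows]
    return pearson(xs, ys)


# Made-up rows for illustration only; column names are hypothetical.
demo = io.StringIO("sensitivity,accuracy\n0.10,0.90\n0.30,0.70\n0.50,0.50\n")
r = correlation_from_csv(demo, "sensitivity", "accuracy")
print(f"Pearson r = {r:.2f}")
```

To run this on a real file, pass `open("evaluation_scores.csv", newline="")` as the handle and substitute the real column names.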

Citation

Please use the following citation:

@article{lu2023prompts,
  title={How are Prompts Different in Terms of Sensitivity?},
  author={Lu, Sheng and Schuff, Hendrik and Gurevych, Iryna},
  journal={arXiv preprint arXiv:2311.07230},
  year={2023}
}

Disclaimer

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.

UKPLab/naacl2024-prompt-sensitivity

Folders and files

Latest commit

History

Repository files navigation

How are Prompts Different in Terms of Sensitivity?

Getting Started

Inference

Saliency scores

Sensitivity-aware decoding

Evaluation scores

Citation

Disclaimer

About

Resources

License

Stars

Watchers

Forks

Languages