
Supervised Knowledge Makes Large Language Models Better In-context Learners


Linyi Yang*1,2  Shuibai Zhang*1  Zhuohao Yu*3
Guangsheng Bao2  Yidong Wang1,3  Jindong Wang4  Ruochen Xu4  Wei Ye1
Xing Xie4  Weizhu Chen4  Yue Zhang†1,2

*: Co-first Authors  †: Corresponding Author


1 School of Engineering, Westlake University, 2 Westlake Institute for Advanced Study,
3 Peking University, 4 Microsoft

Overview

This is the official repository for Supervised Knowledge Makes Large Language Models Better In-context Learners.

Paper: Supervised Knowledge Makes Large Language Models Better In-context Learners

Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering. The recent progress in large-scale generative models has further expanded their use in real-world language applications. However, the critical challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored. While previous in-context learning research has focused on enhancing models to adhere to users' specific instructions and quality expectations, and to avoid undesired outputs, little to no work has explored the use of task-Specific fine-tuned Language Models (SLMs) to improve LLMs' in-context learning during the inference stage. Our primary contribution is the establishment of a simple yet effective framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks. Using our proposed plug-in method, enhanced versions of Llama 2 and ChatGPT surpass their original versions regarding generalizability and factuality. We offer a comprehensive suite of resources, including 16 curated datasets, prompts, model checkpoints, and LLM outputs across 9 distinct tasks. Our empirical analysis sheds light on the advantages of incorporating discriminative models into LLMs and highlights the potential of our methodology in fostering more reliable LLMs.
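
To make the plug-in idea concrete, here is a minimal Python sketch of SuperContext-style inference: a fine-tuned discriminative SLM produces a label and a confidence score, and both are inserted into the prompt before the LLM answers. This is an illustrative sketch only, not the repository's exact implementation; the roberta-base checkpoint, the label names, and the prompt wording below are placeholder assumptions. The actual prompts used for each task are provided in this repository.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Placeholder checkpoint; the released SLM checkpoints may use different identifiers.
SLM_NAME = "roberta-base"
LABELS = ["negative", "positive"]  # assumed binary sentiment task for illustration

tokenizer = AutoTokenizer.from_pretrained(SLM_NAME)
slm = AutoModelForSequenceClassification.from_pretrained(SLM_NAME, num_labels=2)

def slm_predict(text: str):
    """Run the discriminative SLM and return its predicted label and confidence."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = slm(**inputs).logits.softmax(dim=-1).squeeze(0)
    conf, idx = probs.max(dim=-1)
    return LABELS[idx.item()], conf.item()

def build_prompt(text: str) -> str:
    """Insert the SLM's prediction and confidence into the LLM prompt as supervised knowledge."""
    label, conf = slm_predict(text)
    return (
        f"Input: {text}\n"
        f"A task-specific model predicts the label '{label}' with confidence {conf:.2f}.\n"
        "Considering this hint, answer with 'positive' or 'negative' and briefly explain."
    )

print(build_prompt("The movie was a pleasant surprise."))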


This repository contains:

  • The code for implementing SuperContext
  • The prompts used for each task
  • The model weights of pre-trained language models
  • The code and configs for fine-tuning models (a minimal fine-tuning sketch follows this list)
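
As an illustration of the fine-tuning step, here is a minimal, hypothetical sketch of training a task-specific discriminative SLM with Hugging Face transformers. The datasets, checkpoints, and hyperparameters in the actual configs of this repository may differ; roberta-base and SST-2 are used below purely as placeholders.

from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          TrainingArguments, Trainer)

# Placeholder task: binary sentiment classification on GLUE SST-2.
dataset = load_dataset("glue", "sst2")
tokenizer = AutoTokenizer.from_pretrained("roberta-base")

def tokenize(batch):
    # Tokenize the input sentences; labels are carried through unchanged.
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

encoded = dataset.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

args = TrainingArguments(
    output_dir="slm-sst2",            # placeholder output directory
    per_device_train_batch_size=32,   # placeholder hyperparameters
    num_train_epochs=3,
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=encoded["train"],
    eval_dataset=encoded["validation"],
)
trainer.train()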

News

  • [2024/01/23] SuperContext was accepted to ICLR 2024!
  • [2023/12/27] Our paper was featured in Hugging Face's Daily Papers selection on Twitter.

Contributing

We welcome contributions to SuperContext. If you'd like to contribute, please follow these steps:

  1. Fork the repository.
  2. Create a new branch with your changes.
  3. Submit a pull request with a clear description of your changes.

Citation

@inproceedings{yang2024supervised,
  title={Supervised Knowledge Makes Large Language Models Better In-context Learners},
  author={Linyi Yang and Shuibai Zhang and Zhuohao Yu and Guangsheng Bao and Yidong Wang and Jindong Wang and Ruochen Xu and Wei Ye and Xing Xie and Weizhu Chen and Yue Zhang},
  booktitle={The Twelfth International Conference on Learning Representations (ICLR 2024)},
  year={2024}
}

License

The model weights of LLaMA-based models follow the LLaMA license. See MODEL_LICENSE.

The rest of this repo is under Apache License 2.0. See LICENSE.
