
KATE GPT3

This repository contains the source code for our paper "What Makes Good In-Context Examples for GPT-3?".


1. Introduction

Fine-tuning GPT-3 requires hundreds of GPUs just to load the 175B-parameter model, which is prohibitively expensive and time-consuming for ordinary research labs. Moreover, storing large fine-tuned checkpoints requires huge storage space.

To tackle these challenges, we propose KATE (Knn-Augmented in-conText Example selection), a non-parametric selection approach that retrieves in-context examples according to their semantic similarity to the test sample.

On several natural language understanding and generation tasks, the proposed method improves GPT-3's performance over the random-sampling baseline by a significant margin.
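
To make the idea concrete, here is a minimal sketch of KATE's selection step in Python. It is an illustration rather than the repository's actual implementation, and it assumes the sentence-transformers package and the "roberta-large-nli-stsb-mean-tokens" encoder referenced later in this README:

import numpy as np
from sentence_transformers import SentenceTransformer

def select_in_context_examples(train_questions, test_question, k=5):
    # Embed the training pool and the test sample with a pre-trained sentence encoder.
    encoder = SentenceTransformer("roberta-large-nli-stsb-mean-tokens")
    train_emb = encoder.encode(train_questions, convert_to_numpy=True)
    test_emb = encoder.encode([test_question], convert_to_numpy=True)[0]
    # Rank training examples by cosine similarity to the test sample.
    sims = (train_emb @ test_emb) / (
        np.linalg.norm(train_emb, axis=1) * np.linalg.norm(test_emb) + 1e-12
    )
    return np.argsort(-sims)[:k]  # indices of the k most similar training examples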

2. Inference Demonstration

To reproduce our TriviaQA result in the paper, please perform the following steps:

a. Download and unzip demo_GPT3_KATE_TriviaQA_data.zip.

b. Move the unzipped files "trivia_qa_train_78785_dev_full_dev.tsv" and "trivia_qa_train_78785_dev_full_train.tsv" into the directory "./inference/dataset/".

c. Move the unzipped files "trivia_qa_train_78785_dev_full_random_0.dat" and "trivia_qa_train_78785_dev_full_roberta-large-nli-stsb-mean-tokens_cosine_mean.dat" into the directory "./inference/kNN_pretraining/".

d. To run inference on the TriviaQA dataset, execute the following commands:

export GPT3_KEY=*** # replace *** with your GPT-3 API key
cd inference
chmod 755 run.sh
./run.sh
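
For reference, a single inference call roughly looks like the sketch below. This is illustrative only (run.sh drives the repository's actual scripts); it uses the legacy openai<1.0 Completion API, and the Q/A prompt format and engine choice are assumptions, not necessarily the repository's exact ones:

import os
import openai

openai.api_key = os.environ["GPT3_KEY"]  # set by the export command above

def answer_with_context(context_examples, test_question):
    # Prepend the retrieved (question, answer) pairs, then append the test question.
    prompt = ""
    for q, a in context_examples:
        prompt += f"Q: {q}\nA: {a}\n\n"
    prompt += f"Q: {test_question}\nA:"
    response = openai.Completion.create(
        engine="davinci",  # hypothetical engine choice, for illustration
        prompt=prompt,
        max_tokens=16,
        temperature=0.0,
        stop="\n",
    )
    return response["choices"][0]["text"].strip()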

3. kNN Retrieval Demonstration

As shown in the inference demonstration above, the key prerequisite for inference is having the retrieved in-context examples ready. We have precomputed the indices of these training examples and stored them in a ".dat" file.

To reproduce the "trivia_qa_train_78785_dev_full_roberta-large-nli-stsb-mean-tokens_cosine_mean.dat" file above, please run the following commands:

cd inference
chmod 755 trivia_qa.sh
./trivia_qa.sh

To run the retrieval code, you need to install Sentence-BERT (the sentence-transformers package) in addition to the packages listed in requirements.txt.
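
The retrieval step itself boils down to embedding both splits, ranking the training examples for each dev question, and saving the resulting index matrix. The sketch below illustrates this under one loud assumption: the repository's actual ".dat" layout is defined by trivia_qa.sh and its underlying scripts, and pickling the index matrix here is only a stand-in:

import pickle
import numpy as np
from sentence_transformers import SentenceTransformer

def precompute_knn_indices(train_questions, dev_questions, out_path, k=64):
    encoder = SentenceTransformer("roberta-large-nli-stsb-mean-tokens")
    train_emb = encoder.encode(train_questions, convert_to_numpy=True)
    dev_emb = encoder.encode(dev_questions, convert_to_numpy=True)
    # Normalize rows so a dot product equals cosine similarity.
    train_emb /= np.linalg.norm(train_emb, axis=1, keepdims=True)
    dev_emb /= np.linalg.norm(dev_emb, axis=1, keepdims=True)
    sims = dev_emb @ train_emb.T                # shape (n_dev, n_train)
    indices = np.argsort(-sims, axis=1)[:, :k]  # k nearest training examples per dev question
    with open(out_path, "wb") as f:
        pickle.dump(indices, f)  # stand-in for the repo's ".dat" format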

3.1 Which Pre-trained Sentence Encoder?

In the paper, we study three sentence encoders for retrieval (a loading sketch follows this list):

a. For the kNN_{roberta} results, you can use the pre-trained RoBERTa-large model from Hugging Face directly.

b. For the kNN_{nli} results, you can use the SentenceTransformer model "roberta-large-nli-mean-tokens".

c. For the kNN_{nli+stsb} results, you can use the SentenceTransformer model "roberta-large-nli-stsb-mean-tokens" (the model named in the ".dat" file above).
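
For reference, here is how each encoder can be loaded; the model names for (b) and (c) follow the sentence-transformers naming used in the ".dat" filename above:

from transformers import AutoModel, AutoTokenizer
from sentence_transformers import SentenceTransformer

# (a) kNN_{roberta}: plain pre-trained RoBERTa-large from Hugging Face.
roberta_tokenizer = AutoTokenizer.from_pretrained("roberta-large")
roberta_model = AutoModel.from_pretrained("roberta-large")

# (b) kNN_{nli}: RoBERTa-large fine-tuned on NLI data.
nli_encoder = SentenceTransformer("roberta-large-nli-mean-tokens")

# (c) kNN_{nli+stsb}: RoBERTa-large fine-tuned on NLI and then STS benchmark data.
nli_stsb_encoder = SentenceTransformer("roberta-large-nli-stsb-mean-tokens")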

4. Reproducibility

Sections 2 and 3 together demonstrate how to reproduce the TriviaQA results shown in the paper.

Below, we provide the links to all preprocessed datasets and saved kNN similarity indices, from which you can reproduce all the results shown in the paper.

4.1 Preprocessed Datasets

All preprocessed datasets mentioned in the paper can be found and downloaded via this link.

4.2 Saved kNN Indices based on Pre-trained Encoders

You can compute and save the kNN similarity indices by following the procedure demonstrated in Section 3.

Once you have gone through the procedure in Section 3 and obtained a ".dat" file, you can verify your kNN similarity indices against our precomputed ones, available via this link.
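
Assuming the pickled index matrix from the Section 3 sketch above (the actual ".dat" layout may differ), the comparison can be as simple as:

import pickle
import numpy as np

# File names here are placeholders for your computed file and the downloaded reference.
with open("my_indices.dat", "rb") as f:
    mine = pickle.load(f)
with open("reference_indices.dat", "rb") as f:
    reference = pickle.load(f)
print("identical:", np.array_equal(mine, reference))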

If you are not able to produce the kNN ".dat" files or the results shown in the paper, please don't hesitate to reach out to me via email or open an issue on this GitHub repository.
