Code for NAACL 2024 paper: MDR: Model-Specific Demonstration Retrieval at Inference Time for In-Context Learning.
cd MDR
bash install.sh
Follow the instructions in UPRISE to download the pre-trained retriever and the pre-constructed demonstration pool.
After downloading, encode the demonstration pool with the demonstration encoder:
bash ./scripts/gen_demonstration_embeds.sh
Download demonstration_pool_GPTNeo.json to ./demonstration_pools. Then run the provided shell script to evaluate MDR on different tasks with GPTNeo-2.7B and get to know the demonstration retrieval process:
bash ./scripts/run_GPTNeo_2.7B.sh
You can change the variable DEMONSTRATION_POOL to path_to_demonstration_pool (the pool downloaded from UPRISE) to see how MDR calculates the eigenvalue and loss for each sample in the test dataset given a specific inference model.
Customize your scripts to support different tasks and models based on the parameters:
LLM: the name of the LLM to use (in Hugging Face format);
DEMONSTRATION_POOL: since the calculation of eigenvalue and loss has a one-to-one correspondence with the model, you should create a separate demonstration pool file for each model (just copy the downloaded demonstration pool file and rename it);
TASKS: MDR supports 20+ datasets; specify the names of the tasks to evaluate according to the task definitions in ./DPR/dpr/utils/tasks.py.
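The per-model copy of the demonstration pool described above can be sketched as a small shell helper. This is only an illustration: the function name make_model_pool, the output naming scheme (slashes in the model name replaced by underscores), and the default output directory are assumptions, not part of the MDR scripts.

```shell
# Hypothetical helper, not shipped with the MDR repository.
# Copies a downloaded demonstration pool so that each inference model
# gets its own pool file.
#   $1: path to the pool file downloaded from UPRISE
#   $2: model name in Hugging Face format (e.g. EleutherAI/gpt-neo-2.7B)
#   $3: output directory (assumed default: ./demonstration_pools)
make_model_pool() (
  src="$1"
  llm="$2"
  out_dir="${3:-./demonstration_pools}"

  # Assumed naming scheme: replace "/" so the model name fits in a file name.
  tag=$(printf '%s' "$llm" | tr '/' '_')
  dst="${out_dir}/demonstration_pool_${tag}.json"

  mkdir -p "$out_dir" || exit 1

  # Leave an existing per-model pool untouched instead of overwriting it.
  if [ ! -e "$dst" ]; then
    cp "$src" "$dst" || exit 1
  fi

  # Print the resulting path so it can be assigned to DEMONSTRATION_POOL.
  printf '%s\n' "$dst"
)
```

The printed path could then be used as the value of DEMONSTRATION_POOL in a copy of the provided run script, for example by calling make_model_pool path_to_demonstration_pool EleutherAI/gpt-neo-2.7B, with the same model name set as LLM.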
This repository is built using the UPRISE codebase.