D2Controller

Code and data for "Dynamic Demonstrations Controller for In-Context Learning".

Pilot Experiments

Increasing the number of demonstrations does not necessarily improve model performance.
The optimal k-shot setting differs across datasets and models.

Method

Based on the above analyses, we propose a method named D2Controller:

D2Controller not only boosts ICL performance but also saves time and space during LLM inference.
We apply D2Controller to LLMs of eight different sizes and obtain an average relative improvement of 5.4% across ten datasets. We also extend D2Controller to previous ICL models and obtain competitive results.
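The README does not spell out the selection procedure, so the following is only a minimal sketch of what dynamically choosing k could look like, assuming each candidate k is scored by the average accuracy of several in-context example groups on a small set of evaluation examples. The function `select_k` and the callback `score_group` are hypothetical names for illustration and are not taken from `select_k.py`.

```python
from statistics import mean
from typing import Callable, Dict, Sequence


def select_k(
    candidate_ks: Sequence[int],
    groups_per_k: int,
    score_group: Callable[[int, int], float],
) -> int:
    """Return the candidate k whose in-context example groups score best on average.

    score_group(k, g) is a hypothetical callback: it is assumed to build the
    g-th group of in-context examples for a k-shot prompt, run the LLM on the
    evaluation examples, and return the accuracy.
    """
    avg_by_k: Dict[int, float] = {}
    for k in candidate_ks:
        scores = [score_group(k, g) for g in range(groups_per_k)]
        avg_by_k[k] = mean(scores)
    # Highest average wins; ties go to the smaller k, which keeps prompts shorter.
    return min(avg_by_k, key=lambda k: (-avg_by_k[k], k))


# Toy accuracies standing in for real LLM runs (made-up numbers).
toy_scores = {1: [0.60, 0.64], 2: [0.70, 0.72], 4: [0.66, 0.70]}
best_k = select_k([1, 2, 4], 2, lambda k, g: toy_scores[k][g])
print(best_k)
```

Breaking ties toward the smaller k is a design choice of this sketch: a shorter prompt costs less at inference time, which matches the time and space savings mentioned above.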

Environment Setup

Create a new virtual environment with Python 3.9.16

conda create --name d2controller python=3.9.16

Install the dependencies listed in requirements.txt

pip install -r requirements.txt

LLMs

Please download the LLMs (except GPT-3) from Hugging Face. The LLMs used in our paper are gpt2-medium, gpt2-large, gpt2-xl, Cerebras-GPT-2.7B, Cerebras-GPT-6.7B, opt-13b, and opt-30b. Put the model files under the llm/ directory. For example:


llm/
  |--gpt2-medium/
      |--config.json
      |--merges.txt
      |--pytorch_model.bin
      |--tokenizer.json
      |--vocab.json
  |--gpt2-large/
      |--...
      ...
  ...
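Before launching the scripts, it can help to confirm that each model directory is complete. The check below is a small sketch, not part of the repository: the file list is copied from the gpt2-medium example above, and other models may ship a different set of files (for example sharded weights), so treat `EXPECTED_FILES` and the helper `missing_model_files` as assumptions to adjust.

```python
from pathlib import Path
from typing import Dict, List

# File names taken from the gpt2-medium example layout; this is an assumption
# for other models, which may use different or additional files.
EXPECTED_FILES = [
    "config.json",
    "merges.txt",
    "pytorch_model.bin",
    "tokenizer.json",
    "vocab.json",
]


def missing_model_files(llm_root: str, model_names: List[str]) -> Dict[str, List[str]]:
    """Map each model name to the expected files that are absent under llm_root."""
    root = Path(llm_root)
    report: Dict[str, List[str]] = {}
    for name in model_names:
        missing = [f for f in EXPECTED_FILES if not (root / name / f).is_file()]
        if missing:
            report[name] = missing
    return report


if __name__ == "__main__":
    # An empty dict means every listed model has all expected files.
    print(missing_model_files("llm", ["gpt2-medium", "gpt2-large"]))
```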

Running Code

Preprocess

Run the bash script do_preprocess.sh to transform the original dataset files

bash scripts/do_preprocess.sh

Select k-shot setting

To obtain the selected $k$-shot settings, run the script run_selectk.sh

bash scripts/run_selectk.sh

For the OPT-30B model, we use the script run_selectk_multi.sh to obtain results

bash scripts/run_selectk_multi.sh

Note that you need to set the names and directories of the datasets and models in the script.

Obtain ICL results

To obtain ICL results based on the selected $k$-shot setting, run the script run_icl.sh

bash scripts/run_icl.sh

Find Oracle k-shot setting

To obtain the Oracle $k$-shot setting on the validation set, run the script run_oracle_k.sh

bash scripts/run_oracle_k.sh

GPT-3

We will release the code and scripts for GPT-3 later.

Results

Main Results

Model	Parameters	Setting	SST-2	SST-5	DBPedia	MR	CR	MPQA	Subj	AGNews	RTE	CB	AVG
GPT-2	0.3B	Default	58.1±13.1	24.1±7.4	60.6±7.2	54.2±10.6	50.6±0.4	59.6±15.8	53.4±5.3	48.7±8.5	51.3±1.7	48.6±6.4	50.9
GPT-2	0.3B	D2Controller	74.1±9.3	31.6±8.6	60.6±7.2	53.8±7.0	67.7±11.4	57.1±9.7	53.8±4.2	48.7±8.5	48.7±2.9	48.6±6.4	54.5
GPT-2	0.3B	Oracle	74.1±9.3	31.6±8.6	60.6±7.2	56.0±9.9	67.7±11.4	64.5±16.0	58.6±12.8	49.4±18.4	51.3±1.7	50.0±9.2	56.4
GPT-2	0.8B	Default	71.8±12.1	37.8±6.8	63.4±6.0	71.1±15.6	80.5±11.4	65.8±11.3	59.9±12.2	65.6±17.2	53.1±3.4	37.1±14.5	60.6
GPT-2	0.8B	D2Controller	65.9±15.2	37.5±5.1	63.4±6.0	71.1±15.6	80.5±11.4	70.5±5.2	69.4±12.4	65.6±17.2	53.1±3.4	47.5±3.2	62.4
GPT-2	0.8B	Oracle	71.8±12.1	39.6±5.1	63.4±6.0	71.1±15.6	80.5±11.4	74.5±8.8	69.4±12.4	65.6±17.2	53.8±4.4	49.3±3.7	63.9
GPT-2	1.5B	Default	70.3±6.6	35.4±8.4	82.0±2.0	52.0±3.8	52.0±3.2	66.7±8.2	57.3±10.5	78.2±6.7	53.1±1.7	52.9±6.3	60.0
GPT-2	1.5B	D2Controller	81.3±5.4	35.4±8.4	82.0±2.0	72.2±13.9	66.2±16.7	83.9±1.5	64.1±11.3	78.2±6.7	53.1±2.9	52.9±6.3	67.0
GPT-2	1.5B	Oracle	81.3±5.4	40.6±5.4	82.0±2.0	72.2±13.9	66.2±16.7	83.9±1.5	64.1±11.3	81.3±7.5	53.1±2.9	57.9±9.8	68.2
Cerebras-GPT	2.7B	Default	65.5±13.8	28.4±4.3	81.8±1.4	65.1±11.2	85.8±4.2	64.2±11.6	69.3±14.4	69.5±3.2	48.1±1.1	52.5±9.5	63.0
Cerebras-GPT	2.7B	D2Controller	77.3±7.7	34.3±4.8	81.8±1.4	76.0±7.7	87.4±1.5	81.6±2.1	74.2±7.6	77.3±4.1	48.0±1.1	54.6±2.7	69.3
Cerebras-GPT	2.7B	Oracle	80.7±9.1	34.3±4.8	81.8±1.4	76.0±7.7	87.4±1.5	82.9±3.0	74.2±7.6	77.3±4.1	49.6±2.3	55.7±5.0	70.0
Cerebras-GPT	6.7B	Default	83.4±8.5	38.3±1.8	87.0±2.4	88.0±1.1	89.0±3.1	75.2±10.3	72.0±14.5	79.2±2.4	52.3±2.3	52.5±8.0	71.7
Cerebras-GPT	6.7B	D2Controller	82.0±11.3	39.5±3.7	87.0±2.4	86.8±1.9	90.5±0.9	83.8±3.3	79.2±12.5	80.2±1.5	52.8±2.5	57.9±7.2	74.0
Cerebras-GPT	6.7B	Oracle	88.6±2.7	43.6±1.6	87.0±2.4	88.0±1.1	90.6±2.8	83.8±3.3	79.2±12.5	80.2±1.5	53.4±1.7	57.9±3.0	75.2
OPT	13B	Default	81.2±6.7	43.3±4.6	92.3±2.1	87.8±2.7	91.4±3.3	75.0±6.7	79.1±12.7	81.9±2.9	54.4±4.2	58.9±8.1	74.5
OPT	13B	D2Controller	90.2±5.8	43.3±4.6	92.3±2.1	87.8±2.7	91.3±2.1	72.0±9.4	91.6±2.0	82.6±1.5	55.8±3.1	58.9±8.1	76.6
OPT	13B	Oracle	90.9±3.7	48.0±2.8	92.3±2.1	91.8±0.6	93.3±1.2	78.6±7.3	91.6±2.0	82.6±1.5	55.8±3.1	73.2±12.4	79.8
OPT	30B	Default	92.3±1.3	40.9±1.8	91.7±3.7	91.8±2.1	87.3±3.3	78.8±6.2	76.1±4.9	78.7±3.6	63.0±3.1	60.0±8.2	76.1
OPT	30B	D2Controller	92.3±1.3	42.0±2.8	91.7±3.7	93.4±1.1	87.3±2.7	85.7±3.8	83.4±8.6	76.7±4.5	61.6±2.8	60.0±8.2	77.4
OPT	30B	Oracle	92.8±1.6	45.2±3.1	91.7±3.7	93.4±1.1	87.7±3.9	85.7±3.8	83.4±8.6	78.7±3.6	63.0±3.1	60.0±8.2	78.1
GPT-3	175B	Default	94.0±1.4	47.7±0.6	90.2±2.8	94.1±0.6	91.4±0.0	84.4±0.6	71.1±2.2	86.9±1.4	60.4±5.3	70.5±13.9	79.1
GPT-3	175B	D2Controller	94.0±1.4	48.4±0.6	90.2±2.8	95.5±0.8	93.0±2.3	84.4±0.6	87.3±4.7	86.9±1.4	66.6±3.0	73.2±2.5	82.0
GPT-3	175B	Oracle	94.1±0.0	48.4±0.6	90.2±2.8	95.5±0.3	93.6±2.8	86.5±2.5	87.3±4.7	86.9±1.4	69.7±1.4	73.2±2.5	82.6
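The README does not say how the 5.4% average relative improvement is computed. One reading that reproduces it from the table above is to take the relative gain of D2Controller over Default on the AVG column for each of the eight models, then average those gains. The snippet below is a worked check under that assumption; the numbers are copied from the AVG column, and `relative_gain` is a helper introduced here for illustration.

```python
# AVG column of the Main Results table: (Default, D2Controller) per model.
avg_scores = {
    "GPT-2 0.3B": (50.9, 54.5),
    "GPT-2 0.8B": (60.6, 62.4),
    "GPT-2 1.5B": (60.0, 67.0),
    "Cerebras-GPT 2.7B": (63.0, 69.3),
    "Cerebras-GPT 6.7B": (71.7, 74.0),
    "OPT 13B": (74.5, 76.6),
    "OPT 30B": (76.1, 77.4),
    "GPT-3 175B": (79.1, 82.0),
}


def relative_gain(default: float, controlled: float) -> float:
    """Relative improvement of `controlled` over `default`, in percent."""
    return (controlled - default) / default * 100.0


gains = {model: relative_gain(d, c) for model, (d, c) in avg_scores.items()}
mean_gain = sum(gains.values()) / len(gains)

for model, gain in gains.items():
    print(f"{model}: {gain:.1f}%")
print(f"mean relative gain: {mean_gain:.1f}%")  # prints 5.4%
```

The per-model gains range from about 1.7% (OPT 30B) to about 11.7% (GPT-2 1.5B), so the average hides a fairly wide spread across model sizes.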

D2Controller is beneficial to other ICL models

Method	GPT-2 0.3B	GPT-2 0.8B	GPT-2 1.5B	Cerebras-GPT 2.7B	Cerebras-GPT 6.7B
KATE	66.7	69.4	67.7	71.6	77.6
KATE+D2Controller	68.8	70.5	69.4	74.7	77.9
GlobalE	59.5	67.7	69.8	-	-
GlobalE+D2Controller	61.5	68.7	71.6	-	-
Contextual Calibration	59.5	64.2	63.9	67.2	72.5
Contextual Calibration+D2Controller	60.8	66.6	65.4	68.7	73.5
kNN Prompting	74.8	76.0	77.3	77.8	79.0
kNN Prompting+D2Controller	75.8	77.1	78.2	78.1	79.7

D2Controller can achieve competitive results on a small number of in-context example groups

IICScore is effective in selecting evaluation examples

Method	GPT-2 0.3B	GPT-2 0.8B	GPT-2 1.5B	Cerebras-GPT 2.7B	Cerebras-GPT 6.7B
Random	54.1	59.2	63.5	68.0	72.9
D2Controller-ED	54.4	59.2	64.0	67.1	72.6
D2Controller-Cos	54.9	59.3	62.2	68.3	72.4
D2Controller	54.5	62.4	66.9	69.3	74.0

Dynamically selecting k performs better than using the maximum k

Setting	GPT-2 0.3B	GPT-2 0.8B	GPT-2 1.5B	Cerebras-GPT 2.7B	Cerebras-GPT 6.7B
kmax-shot setting	54.1	58.7	66.0	65.4	73.0
D2Controller	54.5	62.4	66.9	69.3	74.0

Citation

If you find this helpful, please cite our paper.

@misc{zhao2023dynamic,
      title={Dynamic Demonstrations Controller for In-Context Learning}, 
      author={Fei Zhao and Taotian Pang and Zhen Wu and Zheng Ma and Shujian Huang and Xinyu Dai},
      year={2023},
      eprint={2310.00385},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
