This example walks you through the basic usage of PromptBench. We hope it helps you get familiar with the APIs so that you can use them in your own projects later.

First, a single unified import, `import promptbench as pb`, brings in the whole package.

In [1]:
!pip install promptbench

Collecting promptbench
  Obtaining dependency information for promptbench from https://files.pythonhosted.org/packages/b7/d1/4fa888eb75c86e722aeddae53b574005649bb6f279f77ac0c5db83773732/promptbench-0.0.3-py3-none-any.whl.metadata
  Downloading promptbench-0.0.3-py3-none-any.whl.metadata (15 kB)
Collecting autocorrect==2.6.1 (from promptbench)
  Downloading autocorrect-2.6.1.tar.gz (622 kB)
  Preparing metadata (setup.py) ... done
Collecting accelerate==0.25.0 (from promptbench)
  Obtaining dependency information for accelerate==0.25.0 from https://files.pythonhosted.org/packages/f7/fc/c55e5a2da345c9a24aa2e1e0f60eb2ca290b6a41be82da03a6d4baec4f99/accelerate-0.25.0-py3-none-any.whl.metadata
  Downloading accelerate-0.25.0-py3-none-any.whl.metadata (18 kB)
Collecting datasets>=2.15.0 (from promptbench)
  Obtaining dependency informat

In [2]:
import promptbench as pb


## Load dataset

First, PromptBench makes it easy to load datasets.

In [3]:
# print all supported datasets in promptbench
print('All supported datasets: ')
print(pb.SUPPORTED_DATASETS)

# load a dataset, gsm8k, for instance.
# if the dataset is not available locally, it will be downloaded automatically.
dataset_name = "gsm8k"
dataset = pb.DatasetLoader.load_dataset(dataset_name)

# print the first 5 examples
dataset[:5]

All supported datasets: 
['sst2', 'cola', 'qqp', 'mnli', 'mnli_matched', 'mnli_mismatched', 'qnli', 'wnli', 'rte', 'mrpc', 'mmlu', 'squad_v2', 'un_multi', 'iwslt2017', 'math', 'bool_logic', 'valid_parentheses', 'gsm8k', 'csqa', 'bigbench_date', 'bigbench_object_tracking', 'last_letter_concat', 'numersense', 'qasc', 'bbh', 'drop', 'arc-easy', 'arc-challenge']


[{'content': "Janet’s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
  'label': '18'},
 {'content': 'A robe takes 2 bolts of blue fiber and half that much white fiber.  How many bolts in total does it take?',
  'label': '3'},
 {'content': 'Josh decides to try flipping a house.  He buys a house for $80,000 and then puts in $50,000 in repairs.  This increased the value of the house by 150%.  How much profit did he make?',
  'label': '70000'},
 {'content': 'James decides to run 3 sprints 3 times a week.  He runs 60 meters each sprint.  How many total meters does he run a week?',
  'label': '540'},
 {'content': "Every day, Wendi feeds each of her chickens three cups of mixed chicken feed, containing seeds, mealworms and vegetables to help keep them healthy.  She giv

## Load models

Then, you can easily load LLMs via PromptBench.

In [5]:
# print all supported models in promptbench
print('All supported models: ')
print(pb.SUPPORTED_MODELS)

# load a model, gpt-3.5-turbo, for instance.
# If the model is an OpenAI or PaLM model, you need to provide openai_key/palm_key.
# If the model is llama or vicuna, you need to provide the model dir.
model = pb.LLMModel(model='gpt-3.5-turbo',
                    api_key='sk-',  # truncated placeholder; supply your own key
                    max_new_tokens=150)

All supported models: 
['google/flan-t5-large', 'llama2-7b', 'llama2-7b-chat', 'llama2-13b', 'llama2-13b-chat', 'llama2-70b', 'llama2-70b-chat', 'phi-1.5', 'phi-2', 'palm', 'gpt-3.5-turbo', 'gpt-4', 'gpt-4-1106-preview', 'gpt-3.5-turbo-1106', 'gpt-4-0125-preview', 'gpt-3.5-turbo-0125', 'vicuna-7b', 'vicuna-13b', 'vicuna-13b-v1.3', 'google/flan-ul2', 'gemini-pro', 'mistralai/Mistral-7B-v0.1', 'mistralai/Mistral-7B-Instruct-v0.1', 'mistralai/Mixtral-8x7B-v0.1', 'mistralai/Mixtral-8x7B-Instruct-v0.1', '01-ai/Yi-6B', '01-ai/Yi-34B', '01-ai/Yi-6B-Chat', '01-ai/Yi-34B-Chat', 'baichuan-inc/Baichuan2-7B-Base', 'baichuan-inc/Baichuan2-13B-Base', 'baichuan-inc/Baichuan2-7B-Chat', 'baichuan-inc/Baichuan2-13B-Chat']


You can use different prompt engineering methods to generate predictions.

In [9]:
# load method
# print all methods and their supported datasets
print('All supported methods: ')
print(pb.SUPPORTED_METHODS)
print('Supported datasets for each method: ')
print(pb.METHOD_SUPPORT_DATASET)

# load a method, CoT, for instance.
method = pb.PEMethod(method='CoT',
                     dataset=dataset_name,
                     verbose=True,  # if True, print the detailed prompt and response
                     )





All supported methods: 
['CoT', 'ZSCoT', 'least_to_most', 'generated_knowledge', 'expert_prompting', 'emotion_prompt', 'baseline']
Supported datasets for each method: 
{'CoT': ['gsm8k', 'csqa', 'bigbench_date', 'bigbench_object_tracking'], 'ZSCoT': ['gsm8k', 'csqa', 'bigbench_date', 'bigbench_object_tracking'], 'expert_prompting': ['gsm8k', 'csqa', 'bigbench_date', 'bigbench_object_tracking'], 'emotion_prompt': ['gsm8k', 'csqa', 'bigbench_date', 'bigbench_object_tracking'], 'least_to_most': ['gsm8k', 'last_letter_concat'], 'generated_knowledge': ['csqa', 'numersense', 'qasc'], 'baseline': ['gsm8k', 'csqa', 'bigbench_date', 'bigbench_object_tracking', 'last_letter_concat', 'numersense', 'qasc']}


<promptbench.prompts.prompt.Prompt at 0x34f8c3d60>
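Not every method supports every dataset, so it can help to check the mapping before constructing a method. The sketch below copies part of the `METHOD_SUPPORT_DATASET` output printed above into a plain dictionary so that it runs without PromptBench installed; `methods_for` is a hypothetical helper, not part of the PromptBench API.

```python
# Abridged copy of the METHOD_SUPPORT_DATASET output shown above (three of the seven methods).
METHOD_SUPPORT_DATASET = {
    "CoT": ["gsm8k", "csqa", "bigbench_date", "bigbench_object_tracking"],
    "least_to_most": ["gsm8k", "last_letter_concat"],
    "generated_knowledge": ["csqa", "numersense", "qasc"],
}


def methods_for(dataset_name, support=METHOD_SUPPORT_DATASET):
    """Hypothetical helper: list the methods whose supported datasets include dataset_name."""
    return [method for method, datasets in support.items() if dataset_name in datasets]


print(methods_for("gsm8k"))  # ['CoT', 'least_to_most']
print(methods_for("qasc"))   # ['generated_knowledge']
```

With the real package, the same lookup should work by passing `pb.METHOD_SUPPORT_DATASET` as the `support` argument.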

In [11]:
!pip install llmlingua


Collecting llmlingua
  Obtaining dependency information for llmlingua from https://files.pythonhosted.org/packages/6e/3e/221fe46a3338f2babdb2082ee42df88fcaa8ea0e639e832cbb1b93c5923a/llmlingua-0.2.2-py3-none-any.whl.metadata
  Downloading llmlingua-0.2.2-py3-none-any.whl.metadata (17 kB)
Downloading llmlingua-0.2.2-py3-none-any.whl (30 kB)
Installing collected packages: llmlingua
Successfully installed llmlingua-0.2.2


In [15]:
from llmlingua import PromptCompressor

llm_lingua = PromptCompressor(
    model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
    use_llmlingua2=True, # Whether to use llmlingua-2
    device_map="mps",  # Apple Silicon GPU backend; use "cuda" or "cpu" on other hardware
)
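The `device_map="mps"` setting above only works on Apple Silicon. If the notebook may run on other machines, the device can be chosen at runtime. This is a minimal sketch assuming PyTorch is the backend; `pick_device` is a hypothetical helper, and it falls back to `"cpu"` when PyTorch is not installed.

```python
def pick_device():
    """Hypothetical helper: return "cuda", "mps", or "cpu" based on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so only the CPU is usable
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # attribute is missing on older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(device)
```

The result can then be passed as `device_map=device` when constructing `PromptCompressor`.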

In [102]:
# data to compare
data = dataset[1]

In [107]:
custom_prompt_template = """
Q: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there
will be 21 trees. How many trees did the grove workers plant today?
A: There are 15 trees originally. Then there were 21 trees after some more were planted. So there must have
been 21 - 15 = 6. The answer is 6.
Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?
A: There are originally 3 cars. 2 more cars arrive. 3 + 2 = 5. The answer is 5.
Q: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?
A: Originally, Leah had 32 chocolates. Her sister had 42. So in total they had 32 + 42 = 74. After eating 35, they
had 74 - 35 = 39. The answer is 39.
Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did
Jason give to Denny?
A: Jason started with 20 lollipops. Then he had 12 after giving some to Denny. So he gave Denny 20 - 12 = 8.
The answer is 8.
Q: Shawn has five toys. For Christmas, he got two toys each from his mom and dad. How many toys does he
have now?
A: Shawn started with 5 toys. If he got 2 toys each from his mom and dad, then that is 4 more toys. 5 + 4 = 9.
The answer is 9.
Q: There were nine computers in the server room. Five more computers were installed each day, from monday
to thursday. How many computers are now in the server room?
A: There were originally 9 computers. For each of 4 days, 5 more computers were added. So 5 * 4 = 20
computers were added. 9 + 20 is 29. The answer is 29.
Q: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How many golf
balls did he have at the end of wednesday?
A: Michael started with 58 golf balls. After losing 23 on tuesday, he had 58 - 23 = 35. After losing 2 more, he
had 35 - 2 = 33 golf balls. The answer is 33.
Q: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?
A: Olivia had 23 dollars. 5 bagels for 3 dollars each will be 5 x 3 = 15 dollars. So she has 23 - 15 dollars left. 23
- 15 is 8. The answer is 8.

"""

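# compress the few-shot prompt to roughly 300 tokens,
# forcing newlines and question marks to be kept
compressed_custom_prompt = llm_lingua.compress_prompt(
    custom_prompt_template,
    target_token=300,
    force_tokens=['\n', '?'],
)["compressed_prompt"]

# append the question placeholder and fill it with the selected example
prompts = pb.Prompt([compressed_custom_prompt + "{content}"])
input_text = pb.InputProcess.basic_format(prompts[0], data)
label = data['label']  # reference answer for this example
raw_pred = model(input_text)
print(compressed_custom_prompt)
print(raw_pred)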




huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)



 15 trees grove workers plant trees today After
 21 trees How many trees?
 15 trees originally 21 after more planted
 21 - 15 = 6. 6.
 3 cars parking lot 2 more arrive how many?
 originally 3 2 3 + 2 = 5. 5.
 Leah 32 chocolates sister 42 ate 35 how many pieces left?
 Leah 32 sister 42 32 + 42 = 74 After
 74 - 35 = 39 39.
 Jason 20 lollipops gave Denny Now 12 lollipops How many lollipops
 Denny?
 started 20 20 - 12 = 8.
 8.
 Shawn five toys two toys each from mom dad How many toys
 now?
 5 2 each 4 more toys 5 + 4 = 9.
 9.
 nine computers server room Five more installed each monday
 thursday How many computers now?
 originally 9 computers 5 added 5 * 4 = 20
 9 + 20 is 29. 29.
 Michael 58 golf balls lost 23 balls wednesday lost 2 more many
end wednesday?
 Michael started 58 balls 23 tuesday 58 - 23 35 losing 2
 35 - 2 33 balls answer 33.
 Olivia $23 bought five bagels $3 money left?
 23 dollars 5 bagels 3 dollars 3 15 dollars left
 - 15 8. 8.


To make a robe, it takes 2 bolts of blue f
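The cell above stores `label` but never compares it with `raw_pred`. Because the few-shot examples end with "The answer is N.", one way to score a response is to pull out the number that follows that phrase. This is a rough sketch, not PromptBench's own answer extraction: `extract_final_number` is a hypothetical helper, and `sample_pred` is a made-up response used only for illustration.

```python
import re


def extract_final_number(text):
    """Return the number after the last 'answer is', else the last number in text, else None."""
    cleaned = text.replace(",", "")  # treat 70,000 as 70000
    tagged = re.findall(r"answer is\s*\$?(-?\d+(?:\.\d+)?)", cleaned, flags=re.IGNORECASE)
    if tagged:
        return tagged[-1]
    numbers = re.findall(r"-?\d+(?:\.\d+)?", cleaned)
    return numbers[-1] if numbers else None


# Made-up response for illustration; the real raw_pred comes from the model call above.
sample_pred = "It takes 2 bolts of blue fiber and 1 bolt of white fiber. 2 + 1 = 3. The answer is 3."
print(extract_final_number(sample_pred))         # 3
print(extract_final_number(sample_pred) == "3")  # True ('3' is the label of dataset[1])
```

Comparing strings keeps the check consistent with the dataset, whose labels are stored as strings such as `'3'` and `'70000'`.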

In [104]:
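# Note: passing a plain list such as [data] fails with the AttributeError shown
# at the end of the output, apparently because the first argument must provide
# `extract_answer`. The loaded `dataset` object is the likely intended argument.
results = method.test([data],
                      model,
                      )

results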

  0%|          | 0/1 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
  0%|          | 0/1 [00:01<?, ?it/s]


Q: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are done, there
will be 21 trees. How many trees did the grove workers plant today?
A: There are 15 trees originally. Then there were 21 trees after some more were planted. So there must have
been 21 - 15 = 6. The answer is 6.
Q: If there are 3 cars in the parking lot and 2 more cars arrive, how many cars are in the parking lot?
A: There are originally 3 cars. 2 more cars arrive. 3 + 2 = 5. The answer is 5.
Q: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in total?
A: Originally, Leah had 32 chocolates. Her sister had 42. So in total they had 32 + 42 = 74. After eating 35, they
had 74 - 35 = 39. The answer is 39.
Q: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did
Jason give to Denny?
A: Jason started with 20 lollipops. Then he had 12 after giving some to Denny. So he gave Denny 20 - 




AttributeError: 'list' object has no attribute 'extract_answer'
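The error message suggests that the evaluation code calls `extract_answer` on the object passed as the dataset, which a plain Python list does not have. The sketch below reproduces the same failure without PromptBench. `ToyDataset`, `evaluate`, and `fake_model` are hypothetical stand-ins and do not reflect PromptBench's actual implementation.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset object; not the PromptBench class."""

    def __init__(self, records):
        self.records = records

    def __iter__(self):
        return iter(self.records)

    def extract_answer(self, raw_pred):
        # Toy parsing rule: keep the last whitespace-separated token, minus a trailing period.
        return raw_pred.split()[-1].rstrip(".")


def evaluate(dataset, model):
    """Hypothetical evaluation loop that asks the dataset object to parse each response."""
    correct = 0
    total = 0
    for record in dataset:
        pred = dataset.extract_answer(model(record["content"]))
        correct += pred == record["label"]
        total += 1
    return correct / total


def fake_model(prompt):
    """Stand-in for an LLM call; always returns the same made-up response."""
    return "The answer is 3."


record = {
    "content": "A robe takes 2 bolts of blue fiber and half that much white fiber.  "
               "How many bolts in total does it take?",
    "label": "3",
}

print(evaluate(ToyDataset([record]), fake_model))  # 1.0

try:
    evaluate([record], fake_model)  # a plain list has no extract_answer method
except AttributeError as err:
    print(err)  # 'list' object has no attribute 'extract_answer'
```

In the notebook, the analogous fix would be to pass the object returned by `pb.DatasetLoader.load_dataset` instead of `[data]`.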