`llmpebase` is a unified platform that supports experiments on prompt engineering in large language models (LLMs). The codebase is designed to be easy to use and to ensure fair comparisons across methods: with the components it provides, one can easily implement a new prompt engineering algorithm and apply it to a task for evaluation.
The structure of `llmpebase` is:

```
.
├── configs        # Configuration files to be used
├── examples       # Implemented examples
└── llmpebase      # The source code of `llmpebase`
    ├── datasets   # Datasets
    ├── models     # LLMs, prompting, and thought structures
    ├── extractor  # Extracts the result from the model output
    └── evaluator  # Evaluates the result by comparing it with the ground truth
```
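These components compose into a standard evaluate-on-a-dataset loop. Below is a self-contained toy sketch of that flow; every name in it is an illustrative placeholder, not `llmpebase`'s actual API.

```python
# Toy sketch of the pipeline formed by the components above.
# NOTE: all names are illustrative placeholders, not llmpebase's API.
from dataclasses import dataclass


@dataclass
class Sample:  # one dataset item (llmpebase/datasets)
    question: str
    answer: str


def generate(prompt: str) -> str:
    # Stand-in for an LLM call (llmpebase/models); a real backend goes here.
    return "6 * 7 = 42. The answer is 42."


def extract(output: str) -> str:
    # Stand-in extractor (llmpebase/extractor); see the regex sketch below.
    return output.rsplit(" ", 1)[-1].rstrip(".")


def evaluate(result: str, ground_truth: str) -> bool:
    # Stand-in evaluator (llmpebase/evaluator): exact string match.
    return result == ground_truth


dataset = [Sample("What is 6 * 7?", "42")]
accuracy = sum(
    evaluate(extract(generate(s.question)), s.answer) for s in dataset
) / len(dataset)
print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 1.00
```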
The supported datasets cover three task categories.

Mathematical problems:
- GSM8K
- SVAMP
- AQUA-RAT
- MATH
- TheoremQA
- Game of 24
Multi-task reasoning:
- MMLU
- BBH
Commonsense reasoning:
- CSQA (CommonsenseQA)
Supported LLMs:
- GPT
- Llama
- Llama2
- Claude
- Falcon
Prompting approaches (illustrated with a sketch after this list):
- Few-shot prompting
- Chain-of-Thought (CoT) prompting
- Zero-shot prompting
- BoT (Boosting of Thoughts) prompting [1]
- TR (Thought Rollback) prompting [2]
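To make the differences concrete, here is a toy sketch of the simpler prompt styles for an arithmetic question. The prompt texts are illustrative only and are not copied from `llmpebase`'s templates.

```python
# Toy prompt styles (illustrative; not llmpebase's actual templates).
question = "A farmer has 15 sheep and buys 8 more. How many sheep are there now?"

demo_q = "Q: If there are 3 cars and each car has 4 wheels, how many wheels are there?"

# Few-shot (standard) prompting: demonstrations show only final answers.
few_shot = f"{demo_q}\nA: The answer is 12.\n\nQ: {question}\nA:"

# Chain-of-Thought (CoT) prompting: demonstrations include reasoning steps.
cot = (
    f"{demo_q}\n"
    "A: Each car has 4 wheels, so there are 3 * 4 = 12 wheels. The answer is 12.\n\n"
    f"Q: {question}\nA:"
)

# Zero-shot prompting: no demonstrations at all.
zero_shot = f"Q: {question}\nA:"

# Zero-shot CoT adds a reasoning trigger (Kojima et al., 2022).
zero_shot_cot = f"Q: {question}\nA: Let's think step by step."
```

BoT [1] and TR [2] build on these basics by iterating over structured collections of thoughts; see the references at the end of this README.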
Thought structures (a toy sketch follows the list):
- Chain
- Tree
- Graph
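As a rough illustration of how these structures differ (this is a toy sketch, not `llmpebase`'s actual classes): a chain keeps a single reasoning path, a tree lets each thought branch into several candidates, and a graph additionally allows edges across branches.

```python
# Toy sketch of thought structures (not llmpebase's actual classes).
from dataclasses import dataclass, field


@dataclass
class Thought:
    text: str  # one intermediate reasoning step
    children: list["Thought"] = field(default_factory=list)

    def add(self, text: str) -> "Thought":
        child = Thought(text)
        self.children.append(child)
        return child


root = Thought("Read the question.")

# Chain: every thought has exactly one child -> a single reasoning path.
step = root
for text in ["Identify the quantities.", "Compute the result."]:
    step = step.add(text)

# Tree: a thought may branch into several candidate next steps, which a
# search procedure can explore and prune.
algebra = root.add("Try an algebraic approach.")
cases = root.add("Try enumerating cases.")

# Graph: thoughts may also link across branches, which requires a general
# edge list rather than parent/child pointers alone.
edges = [(algebra, cases)]  # a cross-branch edge
```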
Extractors:
- LLM-based extractor
- Regex-based extractor
Evaluators (a combined extractor/evaluator sketch follows below):
- Regex-based evaluator
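A regex-based extractor and evaluator pair can be sketched as follows (toy code, not `llmpebase`'s implementation): the extractor pulls the final number out of the model's free-form output, and the evaluator compares it with the ground truth under a small numeric tolerance.

```python
# Toy regex-based extractor and evaluator (not llmpebase's implementation).
import math
import re


def extract_result(output: str) -> str | None:
    """Return the last number mentioned in the model output, if any."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", output.replace(",", ""))
    return numbers[-1] if numbers else None


def evaluate(result: str | None, ground_truth: str) -> bool:
    """Tolerance-based match for numbers, exact match for plain text."""
    if result is None:
        return False
    try:
        return math.isclose(float(result), float(ground_truth), rel_tol=1e-4)
    except ValueError:
        return result.strip().lower() == ground_truth.strip().lower()


output = "Each car has 4 wheels, so 3 cars have 3 * 4 = 12 wheels. The answer is 12."
print(extract_result(output))                   # 12
print(evaluate(extract_result(output), "12"))   # True
```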
Anyone can run the examples under `examples/` of `llmpebase` by following three steps:
1. (Optional) To use the ChatGPT or Claude APIs, obtain the `OPENAI_API_KEY`, `OPENAI_ORGAN_KEY`, and `ANTHROPIC_KEY` values and set them in a `.env` file under the root directory.
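   For reference, a `.env` file uses the standard dotenv `KEY=value` format; the values below are placeholders only:

   ```
   OPENAI_API_KEY=...
   OPENAI_ORGAN_KEY=...
   ANTHROPIC_KEY=...
   ```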
2. Download the code from GitHub and install `llmpebase` by running:

   ```
   $ pip install .
   ```
3. Run the examples:

   ```
   $ python examples/ChainOfThought/ChainOfThought.py -c configs/GSM8K/Standard_chatgpt.yml -b LLMPEBASE
   $ python examples/ChainOfThought/ChainOfThought.py -c configs/GSM8K/CoT_chatgpt.yml -b LLMPEBASE
   $ python examples/ChainOfThought/ChainOfThought.py -c configs/GSM8K/ZeroCoT_chatgpt.yml -b LLMPEBASE
   ```

   These commands run standard few-shot prompting, CoT prompting, and zero-shot CoT prompting on GSM8K, respectively.
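   The `-c` flag selects a YAML configuration file. Purely as an illustration of the idea (the field names below are assumptions, not the repository's actual schema), such a file pairs a dataset, an LLM, and a prompting approach:

   ```yaml
   # Illustrative only: these field names are assumptions, not the real schema.
   data:
     data_name: GSM8K            # which dataset to evaluate on
   model:
     model_name: gpt-3.5-turbo   # which LLM backend to query
   prompt:
     prompt_type: CoT            # which prompting approach to apply
   ```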
See the documents under `examples/` for more details.
[1] Sijia Chen, Baochun Li, and Di Niu. Boosting of Thoughts: Trial-and-Error Problem Solving with Large Language Models. ICLR 2024. See `examples/BoTReasoning`.
[2] Sijia Chen and Baochun Li. Toward Adaptive Reasoning in Large Language Models with Thought Rollback. ICML 2024. See `examples/ThoughtRollback`.