prompto
is a Python library which facilitates processing of experiments of Large Language Models (LLMs) stored as jsonl files. It automates asynchronous querying of LLM API endpoints and logs progress.
prompto
derives from the Italian word "pronto" which means "ready" (or "hello" when answering the phone). It could also mean "I prompt" in Italian (if "promptare" was a verb meaning "to prompt").
A pre-print for this work is available on arXiv. If you use this library, please see the citation below. For the experiments in the pre-print, see the system demonstration examples.
The benefit of asynchronous querying is that it allows for multiple requests to be sent to an API without having to wait for the LLM's response, which is particularly useful to fully utilise the rate limits of an API. This is especially useful when an experiment file contains a large number of prompts and/or has several models to query. Asynchronous programming is simply a way for programs to avoid getting stuck on long tasks (like waiting for an LLM response from an API) and instead keep running other things at the same time (to send other queries).
With prompto
, you are able to define your experiments of LLMs in a jsonl or csv file where each line/row contains the prompt and any parameters to be used for a query of a model from a specific API. The library will process the experiment file and query models and store results. You are also able to query multiple models from different APIs in a single experiment file and prompto
will take care of querying the models asynchronously and in parallel.
The library is designed to be extensible and can be used to query different models.
For more details on the library, see the documentation where you can find information on how to set up an experiment file, how to run experiments, how to configure environment variables, how to specify rate limits for APIs and to use parallel processing and much more.
See below for installation instructions and quickstarts for getting started with prompto
.
prompto
can also be used as an evaluation tool for LLMs. In particular, it has functionality to automatically conduct an LLM-as-judge evaluation on the outputs of models and/or apply a scorer
function (e.g. string matching, regex, or any custom function applied to some output) to outputs. For details on how to use prompto
for evaluation, see the evaluation docs.
The library supports querying several APIs and models. The following APIs are currently supported are:
- OpenAI (
"openai"
) - Azure OpenAI (
"azure-openai"
) - Anthropic (
"anthropic"
) - Gemini (
"gemini"
) - Vertex AI (
"vertexai"
) - Huggingface text-generation-inference (
"huggingface-tgi"
) - Ollama (
"ollama"
) - A simple Quart API for running models from
transformers
locally ("quart"
)
Our aim for prompto
is to support more APIs and models in the future and to make it easy to add new APIs and models to the library. We welcome contributions to add new APIs and models to the library. We have a contribution guide and a guide on how to add new APIs and models to the library in the docs.
To install the library, you can use pip
:
pip install prompto
Note: This only installs the base dependencies required for prompto
. There are also extra group dependencies depending on the models that you'd like to query. For example, if you'd like to query models from the OpenAI and Gemini API, you can install the extra dependencies by running:
pip install prompto"[openai,gemini]"
To install all the dependencies for all the models, you can run:
pip install prompto[all]
You might also want to set up a development environment for the library. To do this, please refer to the development environment setup guide in our contribution guide.
prompto
derives from the Italian word "pronto" which means "ready" and could also mean "I prompt" in Italian (if "promptare" was a verb meaning "to prompt").
The library has functionality to process experiments and to run a pipeline which continually looks for new experiment jsonl files in the input folder. Everything starts with defining a pipeline data folder which contains:
└── data
└── input: contains the jsonl files with the experiments
└── output: contains the results of the experiments runs.
When an experiment is ran, a folder is created within the output folder with the experiment name
as defined in the jsonl file but removing the `.jsonl` extension.
The results and logs for the experiment are stored there
└── media: contains the media files for the experiments.
These files must be within folders of the same experiment name
as defined in the jsonl file but removing the `.jsonl` extension
When using the library, you simply pass in the folder you would like to use as the pipeline data folder and the library will take care of the rest.
The main command line interface for running an experiment is the prompto_run_experiment
command (see the commands doc for more details). This command will process a single experiment file and query the model for each prompt in the file. The results will be stored in the output folder of the experiment. To see all arguments of this command, run:
prompto_run_experiment --help
See the examples for examples of how to use the library with different APIs/models. Each example contains an experiment file which contains prompts for the model(s) and a walkthrough on how to run the experiment.
The following is an example of an experiment file which contains two prompts for two different models from the OpenAI API:
{"id": 0, "api": "openai", "model_name": "gpt-4o", "prompt": "How does technology impact us?", "parameters": {"n": 1, "temperature": 1, "max_tokens": 100}}
{"id": 1, "api": "openai", "model_name": "gpt-3.5-turbo", "prompt": "How does technology impact us?", "parameters": {"n": 1, "temperature": 1, "max_tokens": 100}}
To run this example, first install the library and create the following folder structure in your working directory from where you'll run this example:
├── data
│ └── input
│ └── openai.jsonl
├── .env
where openai.jsonl
contains the above two prompts and the .env
file contains the following:
OPENAI_API_KEY=<YOUR-OPENAI-KEY>
You are then ready to run the experiment with the following command:
prompto_run_experiment --file data/input/openai.jsonl --max-queries 30
This will:
- Create subfolders in the
data
folder (in particular, it will createmedia
(data/media
) andoutput
(data/media
) folders) - Create a folder in the
output
folder with the name of the experiment (the file name without the.jsonl
extension * in this case,openai
) - Move the
openai.jsonl
file to theoutput/openai
folder (and add a timestamp of when the run of the experiment started) - Start running the experiment and sending requests to the OpenAI API asynchronously which we specified in this command to be 30 queries a minute (so requests are sent every 2 seconds) * the default is 10 queries per minute
- Results will be stored in a "completed" jsonl file in the output folder (which is also timestamped)
- Logs will be printed out to the console and also stored in a log file (which is also timestamped)
The resulting folder structure will look like this:
├── data
│ ├── input
│ ├── media
│ ├── output
│ │ └── openai
│ │ ├── DD-MM-YYYY-hh-mm-ss-completed-openai.jsonl
│ │ ├── DD-MM-YYYY-hh-mm-ss-input-openai.jsonl
│ │ └── DD-MM-YYYY-hh-mm-ss-log-openai.txt
├── .env
The completed experiment file will contain the responses from the OpenAI API for the specific model in each prompt in the input file in data/output/openai/DD-MM-YYYY-hh-mm-ss-completed-openai.jsonl
where DD-MM-YYYY-hh-mm-ss
is the timestamp of when the experiment file started to be processed.
For a more detailed walkthrough on using prompto
with the OpenAI API, see the openai
example.
{"id": 0, "api": "gemini", "model_name": "gemini-1.5-flash", "prompt": "How does technology impact us?", "safety_filter": "none", "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 100}}
{"id": 1, "api": "gemini", "model_name": "gemini-1.0-pro", "prompt": "How does technology impact us?", "safety_filter": "few", "parameters": {"candidate_count": 1, "temperature": 1, "max_output_tokens": 100}}
To run this example, first install the library and create the following folder structure in your working directory from where you'll run this example:
├── data
│ └── input
│ └── gemini.jsonl
├── .env
where gemini.jsonl
contains the above two prompts and the .env
file contains the following:
GEMINI_API_KEY=<YOUR-GEMINI-KEY>
You are then ready to run the experiment with the following command:
prompto_run_experiment --file data/input/openai.jsonl --max-queries 30
As with the above example, the resulting folder structure will look like this:
├── data
│ ├── input
│ ├── media
│ ├── output
│ │ └── gemini
│ │ ├── DD-MM-YYYY-hh-mm-ss-completed-gemini.jsonl
│ │ ├── DD-MM-YYYY-hh-mm-ss-input-gemini.jsonl
│ │ └── DD-MM-YYYY-hh-mm-ss-log-gemini.txt
├── .env
The completed experiment file will contain the responses from the Gemini API for the specified model in each prompt in the input file in data/output/gemini/DD-MM-YYYY-hh-mm-ss-completed-gemini.jsonl
where DD-MM-YYYY-hh-mm-ss
is the timestamp of when the experiment file started to be processed.
For a more detailed walkthrough on using prompto
with the Gemini API, see the gemini
example.
The library has a few key classes:
Settings
: this defines the settings of theexperiment pipeline which stores the paths to the relevant data folders and the parameters for the pipeline.Experiment
: this defines all the variables related to a single experiment. An 'experiment' here is defined by a particular JSONL file which contains the data/prompts for each experiment. Each line in this file is a particular input to the LLM which we will obtain a response for. An experiment can be processed by calling theExperiment.process()
method which will query the model and store the results in the output folder.ExperimentPipeline
: this is the main class for running the full pipeline. The pipeline can be ran using theExperimentPipeline.run()
method which will continually check the input folder for new experiments to process.AsyncAPI
: this is the base class for querying all APIs. Each API/model should inherit from this class and implement thequery
method which will (asynchronously) query the model's API and return the response. When running an experiment, theExperiment
class will call this method for each experiment to send requests asynchronously.
When a new model is added, you must add it to the API
dictionary which is in the apis
module. This dictionary should map the model name to the class of the model. For details on how to add a new model, see the guide on adding new APIs and models.
@article{chan2024prompto,
title={Prompto: An open source library for asynchronous querying of LLM endpoints},
author={Chan, Ryan Sze-Yin and Nanni, Federico and Brown, Edwin and Chapman, Ed and Williams, Angus R and Bright, Jonathan and Gabasova, Evelina},
journal={arXiv preprint arXiv:2408.11847},
year={2024}
}