BatchWizard is a powerful CLI tool for managing OpenAI batch processing jobs. It lets you upload files, create batch jobs, check their status, and download the results. The tool uses asynchronous processing to handle multiple jobs concurrently.
You can install BatchWizard using pipx for an isolated environment or directly via pip.
pipx install batchwizard
pip install batchwizard
Ensure you have pipx or pip installed on your system. For pipx, follow the installation instructions in the pipx documentation.
BatchWizard provides a command-line interface (CLI) for managing batch jobs. Here are some example commands:
To process input files or directories:
batchwizard process <input_paths>... [--output-directory OUTPUT_DIR] [--max-concurrent-jobs NUM] [--check-interval SECONDS]
You can provide multiple input paths, which can be individual JSONL files or directories containing JSONL files.
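To illustrate what mixing files and directories means in practice, here is a minimal sketch of how a list of input paths could be expanded into individual JSONL files. The function name collect_jsonl_files is hypothetical and this is not BatchWizard's actual implementation; in particular, whether BatchWizard searches directories recursively is not stated here, so the sketch only looks at direct children.

```python
from pathlib import Path
from typing import Iterable, List


def collect_jsonl_files(input_paths: Iterable[str]) -> List[Path]:
    """Expand a mix of files and directories into a sorted list of JSONL files.

    Hypothetical helper for illustration; not part of BatchWizard's API.
    """
    found = set()
    for raw in input_paths:
        path = Path(raw)
        if path.is_dir():
            # Assumption: only direct children of a directory are considered.
            found.update(p for p in path.glob("*.jsonl") if p.is_file())
        elif path.is_file() and path.suffix == ".jsonl":
            found.add(path)
        # Paths that do not exist or are not JSONL files are silently skipped.
    return sorted(found)
```

Using a set means a file passed both directly and through its parent directory is only counted once.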
Let's say you have a file named batchinput.jsonl with the following content:
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "system", "content": "You are a helpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "gpt-4o-mini", "messages": [{"role": "system", "content": "You are an unhelpful assistant."},{"role": "user", "content": "Hello world!"}],"max_tokens": 1000}}
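If you would rather generate such a file than write it by hand, the following sketch builds request lines with the same fields as the example above. The helper names build_request and to_jsonl are made up for this illustration and are not part of BatchWizard.

```python
import json
from typing import Dict, List


def build_request(custom_id: str, system_prompt: str, user_prompt: str,
                  model: str = "gpt-4o-mini", max_tokens: int = 1000) -> Dict:
    """Build one batch request with the same shape as the example lines above."""
    return {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ],
            "max_tokens": max_tokens,
        },
    }


def to_jsonl(requests: List[Dict]) -> str:
    """Serialize requests as JSONL: one JSON object per line."""
    return "".join(json.dumps(request) + "\n" for request in requests)


requests = [
    build_request("request-1", "You are a helpful assistant.", "Hello world!"),
    build_request("request-2", "You are an unhelpful assistant.", "Hello world!"),
]
jsonl_text = to_jsonl(requests)
```

Writing jsonl_text to batchinput.jsonl, for example with pathlib.Path("batchinput.jsonl").write_text(jsonl_text, encoding="utf-8"), should give a file equivalent to the one shown above. Each custom_id should be unique within a file so results can be matched back to requests.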
To process this file using BatchWizard:
- First, ensure your OpenAI API key is set:
batchwizard configure --set-key YOUR_API_KEY
- Then, run the process command:
batchwizard process /path/to/batchinput.jsonl --output-directory /path/to/output
This command will:
- Upload the batchinput.jsonl file to OpenAI
- Create a batch job
- Monitor the job status
- Download the results to the specified output directory when complete
You can also process multiple files or directories:
batchwizard process /path/to/file1.jsonl /path/to/directory_with_jsonl_files /path/to/file2.jsonl
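When several files are processed at once, the --max-concurrent-jobs and --check-interval options suggest a pattern of bounded concurrency with periodic polling. The sketch below shows one common way to express that pattern with asyncio. It is an illustration under stated assumptions, not BatchWizard's actual code: run_job is a stand-in that only sleeps and makes no API calls, and the default values are arbitrary.

```python
import asyncio
from typing import List


async def run_job(path: str, check_interval: float) -> str:
    """Stand-in for upload -> create batch -> poll status -> download.

    No OpenAI API calls are made; the job simply "completes" after two polls.
    """
    for _ in range(2):
        await asyncio.sleep(check_interval)  # wait between status checks
    return f"{path}: completed"


async def process_all(paths: List[str], max_concurrent_jobs: int = 5,
                      check_interval: float = 0.01) -> List[str]:
    """Run one job per path, with at most max_concurrent_jobs active at a time."""
    semaphore = asyncio.Semaphore(max_concurrent_jobs)

    async def limited(path: str) -> str:
        async with semaphore:  # blocks while the concurrency limit is reached
            return await run_job(path, check_interval)

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(limited(p) for p in paths))


results = asyncio.run(
    process_all(["file1.jsonl", "file2.jsonl", "file3.jsonl"], max_concurrent_jobs=2)
)
```

A semaphore keeps the number of in-flight jobs bounded without the caller having to group the files into fixed-size chunks.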
To list recent batch jobs:
batchwizard list-jobs [--limit NUM] [--all]
To cancel a specific batch job:
batchwizard cancel <job_id>
To download results for a completed batch job:
batchwizard download <job_id> [--output-file FILE_PATH]
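Once results are downloaded, you will usually want to match each response back to its custom_id. The sketch below assumes the downloaded file is the unmodified output file of the OpenAI Batch API, where each line carries custom_id, response (with status_code and body), and error fields; if BatchWizard post-processes the file, adjust accordingly. The function name read_batch_results is hypothetical.

```python
import json
from typing import Dict, Optional


def read_batch_results(jsonl_text: str) -> Dict[str, Optional[str]]:
    """Map each custom_id to the assistant's reply, or None if the request failed.

    Assumes OpenAI Batch API output lines for /v1/chat/completions requests.
    """
    replies: Dict[str, Optional[str]] = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        response = record.get("response") or {}
        if record.get("error") or response.get("status_code") != 200:
            replies[record["custom_id"]] = None
            continue
        choices = response.get("body", {}).get("choices", [])
        replies[record["custom_id"]] = (
            choices[0]["message"]["content"] if choices else None
        )
    return replies
```

Keying on custom_id rather than line position matters because batch output is not guaranteed to preserve the order of the input file.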
To set the OpenAI API key:
batchwizard configure --set-key YOUR_API_KEY
To show the current configuration:
batchwizard configure --show
To reset the configuration to default values:
batchwizard configure --reset
BatchWizard supports the following commands:
- process: Process batch jobs from input files or directories.
- configure: Manage BatchWizard configuration.
- list-jobs: List recent batch jobs.
- cancel: Cancel a specific batch job.
- download: Download results for a completed batch job.
For detailed information on each command, use the --help option:
batchwizard <command> --help
- Flexible Input: Process individual JSONL files or entire directories containing JSONL files.
- Asynchronous Processing: Efficiently handle multiple batch jobs concurrently.
- Rich UI: Display progress and job status using a rich, interactive interface.
- Flexible Configuration: Easily manage API keys and other settings.
- Job Management: List, cancel, and download results for batch jobs.
- Error Handling: Robust error handling and informative error messages.
We welcome contributions to BatchWizard! To contribute, follow these steps:
- Fork the repository.
- Create a new branch: git checkout -b feature/your-feature-name
- Make your changes and commit them: git commit -m 'Add some feature'
- Push to the branch: git push origin feature/your-feature-name
- Open a pull request.
To run tests, use pytest:
pytest --cov=batchwizard tests/
Ensure your code passes all tests and meets the coding standards before opening a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.
For any questions or feedback, feel free to open an issue on the GitHub repository.