Lumigator is an open-source platform built by Mozilla.ai to guide users through the process of selecting the right language model for their needs. Currently, it supports evaluating summarization tasks using sequence-to-sequence models like BART and BERT and causal architectures like GPT and Mistral, but we will be expanding to other machine learning tasks and use cases.
See the example notebook for a walkthrough of the platform API.
**Note:** Lumigator is in the early stages of development. It is missing important features and documentation. You should expect breaking changes in the core interfaces and configuration structures as development continues.
- Understanding Evaluation
- Installing Lumigator
  - Building
  - Using/Testing
- Using Lumigator
  - Platform Examples
  - Lumigator API
  - Offline Evaluations with lm-buddy
- Extending Lumigator
| Model Type | Model | via HuggingFace | via API |
|---|---|---|---|
| seq2seq | facebook/bart-large-cnn | X | |
| seq2seq | longformer-qmsum-meeting-summarization | X | |
| seq2seq | mrm8488/t5-base-finetuned-summarize-news | X | |
| seq2seq | Falconsai/text_summarization | X | |
| causal | gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo-0125 | | X |
| causal | open-mistral-7b | | X |
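For the seq2seq rows above, "via HuggingFace" means the weights can be pulled directly from the HuggingFace Hub. A minimal sketch using the `transformers` library (illustrative only; this is not Lumigator's own inference path, and the generation parameters are arbitrary examples):

```python
# Load one of the HuggingFace-hosted summarization models listed above.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "Mozilla.ai is building Lumigator, an open-source platform that helps "
    "users pick the right language model for their use case by evaluating "
    "candidate models on tasks such as summarization."
)
# Generate a short summary of the toy article.
print(summarizer(article, max_length=40, min_length=10, do_sample=False))
```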
Lumigator evaluates summarization with the following metrics:
- ROUGE (Recall-Oriented Understudy for Gisting Evaluation) compares a model-generated summary against a reference summary, producing a family of scores from 0 to 1 that measure the statistical (n-gram) similarity of the two texts.
- METEOR computes the harmonic mean of unigram precision and recall, with recall weighted higher and with support for stemming and synonym matching.
- BERTScore generates embeddings of the ground-truth reference and the model output and compares their cosine similarity.
Check this link for a list of pros and cons of each metric and examples of how they work.
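As a quick illustration, all three metrics are available through HuggingFace's `evaluate` library (the same library Lumigator uses); here is a minimal sketch on a toy prediction/reference pair:

```python
# Compute ROUGE, METEOR, and BERTScore on a toy example with HuggingFace evaluate.
import evaluate

predictions = ["the cat sat on the mat"]
references = ["a cat was sitting on the mat"]

rouge = evaluate.load("rouge")
meteor = evaluate.load("meteor")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=predictions, references=references))
print(meteor.compute(predictions=predictions, references=references))
# BERTScore needs a language (or an explicit model) to select its embedding model.
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```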
Lumigator is a Python-based FastAPI web app with REST API endpoints that provide access to services for serving and evaluating large language models, available as safetensors artifacts hosted on HuggingFace or in local stores (with HuggingFace access being the first primary focus), and for tracking the lifecycle of a model in the backend database (Postgres). It consists of:
- a FastAPI-based web app that includes HuggingFace's `evaluate` library for those metrics
- a Ray cluster to run offline evaluation jobs using `evaluator`; the `evaluator` module runs inference against different kinds of models, accessible locally or via APIs, and runs evaluation with HuggingFace's `evaluate` library or lm-evaluation-harness (see the sketch after this list)
- artifact management (S3 in the cloud, LocalStack locally)
- a database to track platform-level tasks and dataset metadata
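To make the Ray piece concrete, here is an illustrative sketch (not Lumigator's actual `evaluator` code) of how an offline evaluation job can be dispatched to a Ray cluster:

```python
# Illustrative sketch: dispatch an offline evaluation job to a Ray cluster.
import ray

ray.init()  # connects to a running cluster, or starts a local one

@ray.remote
def evaluate_summaries(predictions: list[str], references: list[str]) -> dict:
    # Hypothetical job body: compute ROUGE with HuggingFace's evaluate library.
    import evaluate
    rouge = evaluate.load("rouge")
    return rouge.compute(predictions=predictions, references=references)

result = ray.get(
    evaluate_summaries.remote(
        ["the cat sat on the mat"],
        ["a cat was sitting on the mat"],
    )
)
print(result)
```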
You can build and run the project locally with `docker-compose` on Mac or Linux, or deploy it into a distributed environment using Kubernetes Helm charts.
- Docker
  - On Mac, Docker Desktop >= 4.3 and docker-compose >= 1.28.
  - On Linux, please also follow the post-installation steps.
- System Python (that is: no version manager, such as pyenv, should be active).
```
git clone git@github.com:mozilla-ai/lumigator.git
make start-lumigator
```
- The REST API should be available at http://localhost:8000. (If you need to change the port, you can do so in the `docker-compose` file.)
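Once the stack is up, a quick smoke test from Python (FastAPI serves an OpenAPI schema at `/openapi.json` and interactive docs at `/docs` by default; the exact Lumigator routes are listed in that schema):

```python
# Smoke-test the local deployment by listing the available REST endpoints.
import requests

resp = requests.get("http://localhost:8000/openapi.json")
resp.raise_for_status()
print(sorted(resp.json()["paths"]))
```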
To run Lumigator with external services, fill in the required values in the `docker-compose.external.yaml` file. Once that's done, you can start Lumigator with:

```
make start-lumigator-external-services
```
```
git clone git@github.com:mozilla-ai/lumigator.git
make local-up
```

For more on `docker-compose`, see the local install documentation.

- To shut down the app, run `make local-down`.
We use `uv` to manage dependencies. Each project under `.../lumigator/lumigator/python/mzai/` is an independent `uv` Python project to isolate dependencies. Sub-projects are linked together using editable Python package installs.
For each project, here are some handy `uv` commands to work with the repo:
Change directory to the project you want to work on (e.g. `lumigator/lumigator/python/mzai/backend`).

Grab dependencies:

```
uv sync
```

Run tests:

```
uv run pytest
```

Add dependencies to a given project (make sure to commit the updated `uv.lock` file):

```
uv add <package>
```

Run the test suite across all projects:

```
make test
```
Run the app locally via docker compose:

```
make local-up
make local-logs    # gets the logs from docker compose
make local-down    # shuts it down
```
The `docker-compose` setup described in the corresponding README needs several environment variables to work appropriately.
If the S3 storage service is used, the endpoint, key, and secret are needed. The LocalStack implementation used locally also requires an authentication token.
| Environment variable name | Default value | Description |
|---|---|---|
| AWS_ENDPOINT_URL | "" | Endpoint URL for the S3 data storage service. |
| AWS_ACCESS_KEY_ID | "" | Key for the S3 data storage service. |
| AWS_SECRET_ACCESS_KEY | "" | Secret for the S3 data storage service. |
| AWS_DEFAULT_REGION | "" | Default region for the S3 service. |
| LOCALSTACK_AUTH_TOKEN | "" | Authentication token for the LocalStack service. |
| S3_BUCKET | lumigator-storage | Name of the S3 bucket used for data storage. |
Models from Mistral or OpenAI can be used via their APIs instead of being instantiated within Lumigator. In this case, the corresponding API key is needed.
| Environment variable name | Default value | Description |
|---|---|---|
| MISTRAL_API_KEY | "" | Key for Mistral API models. |
| OPENAI_API_KEY | "" | Key for OpenAI API models. |
Lumigator uses a database to store its structured data. It needs a database user, a password and a default database.
| Environment variable name | Default value | Description |
|---|---|---|
| POSTGRES_HOST | "" | Host where the Postgres DB is available. Currently pointing at `services.postgres`. |
| POSTGRES_PORT | "" | Port where the Postgres DB is available (usually 5432). |
| POSTGRES_USER | "" | Database user holding the Lumigator structured data. Needs to match `postgres.environment.POSTGRES_USER`. |
| POSTGRES_DB | "" | Database name holding the Lumigator structured data. Needs to match `postgres.environment.POSTGRES_DB`. |
The Ray cluster used for computing allows several settings through the following variables.
| Environment variable name | Default value | Description |
|---|---|---|
| RAY_DASHBOARD_PORT | "" | Port for accessing the Ray dashboard (usually 8265). |
| RAY_WORKER_GPUS | "" | Number of GPUs available for worker nodes. |
| RAY_WORKER_GPUS_FRACTION | "" | Fraction of the available GPUs used by worker nodes. |
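As a rough sketch of how variables like these are typically consumed in a Python backend (illustrative only; this is not Lumigator's actual settings code), `pydantic-settings` reads matching environment variables case-insensitively. The example values in the comments are common LocalStack defaults, not Lumigator requirements:

```python
# Illustrative sketch: reading the variables above with pydantic-settings.
from pydantic_settings import BaseSettings

class StorageSettings(BaseSettings):
    aws_endpoint_url: str = ""        # e.g. http://localhost:4566 for LocalStack
    aws_access_key_id: str = ""       # e.g. "test" for LocalStack
    aws_secret_access_key: str = ""   # e.g. "test" for LocalStack
    aws_default_region: str = ""      # e.g. "us-east-1"
    s3_bucket: str = "lumigator-storage"

class RaySettings(BaseSettings):
    ray_dashboard_port: int = 8265    # assumed defaults for illustration
    ray_worker_gpus: float = 0
    ray_worker_gpus_fraction: float = 1.0

print(StorageSettings().s3_bucket)  # picks up S3_BUCKET from the environment if set
```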