Configuration
PaperBanana uses a YAML configuration file with CLI flag overrides.
The default configuration lives at configs/config.yaml:

```yaml
vlm:
  provider: gemini
  model: gemini-2.0-flash

image:
  provider: google_imagen
  model: gemini-3-pro-image-preview

pipeline:
  num_retrieval_examples: 10
  refinement_iterations: 3
  output_resolution: "2k"

reference:
  path: data/reference_sets

output:
  dir: outputs
  save_iterations: true
  save_metadata: true
```

To use a custom configuration file, pass it with `--config`:

```bash
paperbanana generate \
  --input method.txt \
  --caption "Overview" \
  --config my_config.yaml
```

You only need to include the settings you want to override. Unspecified settings fall back to defaults.
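
As an illustration of a partial override, a `my_config.yaml` along these lines would change only the output settings and leave everything else at its default. This is a sketch: the keys come from the default configuration, but the directory name `my_outputs` is a made-up value.

```yaml
# my_config.yaml: list only the keys you want to change
output:
  dir: my_outputs          # hypothetical directory name
  save_iterations: false   # keep only the final image
```

With this file, the `vlm`, `image`, `pipeline`, and `reference` sections would still take their default values.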

| Key | Description | Default |
|---|---|---|
| `vlm.provider` | VLM provider name | `gemini` |
| `vlm.model` | VLM model identifier | `gemini-2.0-flash` |

The VLM handles all text generation tasks: retrieval scoring, planning, styling, and critique.

| Key | Description | Default |
|---|---|---|
| `image.provider` | Image generation provider | `google_imagen` |
| `image.model` | Image generation model | `gemini-3-pro-image-preview` |


| Key | Description | Default |
|---|---|---|
| `pipeline.num_retrieval_examples` | Number of reference examples to retrieve | `10` |
| `pipeline.refinement_iterations` | Visualizer-Critic refinement rounds | `3` |
| `pipeline.output_resolution` | Target output resolution | `"2k"` |

`num_retrieval_examples`: Higher values give the Planner more context but increase prompt length and API cost. The maximum is 13 (the full reference set). Values between 5 and 10 work well for most inputs.

`refinement_iterations`: Each iteration runs one Visualizer + Critic cycle. The default of 3 comes from the paper. Setting it to 1 produces faster but lower-quality output, and values above 3 show diminishing returns.
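
To make the trade-off concrete, a quick-draft override might look like the sketch below. The specific values are illustrative choices drawn from the ranges described above, not recommended presets.

```yaml
# Quick-draft settings: trades output quality for speed and lower API cost
pipeline:
  num_retrieval_examples: 5   # low end of the 5-10 range
  refinement_iterations: 1    # a single Visualizer + Critic cycle
```

For final figures, removing this override restores the defaults of 10 examples and 3 iterations.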

| Key | Description | Default |
|---|---|---|
| `reference.path` | Path to reference dataset directory | `data/reference_sets` |

Point this at a custom reference set if you've curated your own examples.
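
For example, an override pointing at a custom set could be as short as the fragment below. The path `data/my_reference_set` is hypothetical, and the layout expected inside that directory is not described here.

```yaml
reference:
  path: data/my_reference_set   # hypothetical path to your curated examples
```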

| Key | Description | Default |
|---|---|---|
| `output.dir` | Base output directory | `outputs` |
| `output.save_iterations` | Save intermediate iteration images | `true` |
| `output.save_metadata` | Save run metadata JSON | `true` |

Setting `save_iterations: false` reduces disk usage if you only need the final output.
CLI flags override config file values. For example:

```bash
# Config file says 3 iterations, but this runs 1
paperbanana generate \
  --input method.txt \
  --caption "Overview" \
  --config my_config.yaml \
  --iterations 1
```

| Variable | Description |
|---|---|
| `GOOGLE_API_KEY` | Gemini API key. Read from `.env` if present. |