v0.1.0-alpha — first public release

This is the first public release of Gemma Multimodal Fine-Tuner, and it's an alpha.

Why this exists

I wanted to fine-tune Gemma on audio + text on my Mac Studio, on data that didn't fit on my Mac — and discovered nothing did all three at once:

MLX-LM / Unsloth / axolotl either don't do audio, don't run on Apple Silicon, or assume your dataset fits on local disk.
Renting an H100 to LoRA a 2B model felt absurd. Copying a terabyte of GCS data to a laptop, more so.

So this toolkit does the thing I needed: text, image, and audio LoRA on Gemma 3n / Gemma 4, MPS-native, streaming from GCS / BigQuery. As far as I know it's the only Apple-Silicon-native path for Gemma audio fine-tuning.
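For a rough sense of why LoRA on a small model is desktop-sized work, here is the standard LoRA arithmetic: a rank-r adapter beside a d_out x d_in weight adds r x (d_in + d_out) trainable parameters. The layer shape and rank below are made-up illustration values, not Gemma's real dimensions and not this toolkit's defaults.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight the adapter sits beside."""
    return d_in * d_out


# Hypothetical square projection layer; NOT Gemma's actual shape.
d_in, d_out, rank = 2048, 2048, 16
added = lora_param_count(d_in, d_out, rank)
frozen = full_param_count(d_in, d_out)
ratio = added / frozen
print(f"adapter params: {added:,} ({ratio:.2%} of the frozen weight)")
```

Only the adapter weights receive gradients and optimizer state, which is where most of the memory saving comes from.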

What works today

✅ Text-only LoRA (instruction or completion on local CSV) — the most-tested path.
✅ Image + text LoRA (captioning / VQA on local CSV) — works, with offline + gated smoke tests.
✅ Audio + text LoRA — works on Apple Silicon.
✅ GCS / BigQuery streaming for datasets that don't fit locally.
✅ Interactive wizard for system check, LoRA config, model, and dataset selection.
✅ Export to merged HF / SafeTensors via gemma_tuner/scripts/export.py.
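The streaming item is the easiest one to picture in code. This is a minimal sketch of the general idea, not the toolkit's actual loader: read rows lazily from any file-like object and hand them out in fixed-size batches, so memory use is bounded by the batch size rather than the dataset size. `iter_batches` and the column names are hypothetical, and the in-memory `io.StringIO` stands in for whatever a real GCS or BigQuery reader would return.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterator, List, TextIO


def iter_batches(source: TextIO, batch_size: int) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of up to batch_size CSV rows without reading the whole source."""
    reader = csv.DictReader(source)  # pulls lines from source on demand
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a remote object; column names are made up for illustration.
fake_remote = io.StringIO(
    "audio_path,text\n"
    "a.wav,hello\n"
    "b.wav,world\n"
    "c.wav,again\n"
)
batches = list(iter_batches(fake_remote, batch_size=2))
print([len(b) for b in batches])
```

Collecting the batches into a list here is only for the demo; a training loop would consume the generator one batch at a time.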

Why it's alpha

APIs and config schema will change. Profiles, config.ini keys, and CLI flags are not yet stable.
Tested primarily on my own hardware (Apple Silicon, unified memory). Other configurations are likely to surface rough edges.
Image + audio paths have lighter test coverage than text. The gated multimodal smoke workflow is manual, not on every PR.
The wizard's vision memory estimator is a heuristic and is documented as unvalidated.
Gemma 4 support requires a separate requirements-gemma4.txt stack — expect dependency friction.
Expect bugs in edge cases around long prompts, unusual CSV schemas, and very large streamed shards.
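Because config.ini keys may be renamed between alpha releases, any script of your own that reads the file is safer if it tolerates missing keys. A sketch using Python's standard `configparser`; the section and key names (`lora`, `rank`, `alpha`, `training`, `dropout`) and the fallback values are invented for illustration and are not the toolkit's schema.

```python
import configparser

# Invented example content; NOT the toolkit's real config.ini schema.
SAMPLE = """
[lora]
rank = 16
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE)

# fallback= keeps the script running if a key (or a whole section) is missing.
rank = parser.getint("lora", "rank", fallback=8)
alpha = parser.getint("lora", "alpha", fallback=2 * rank)
dropout = parser.getfloat("training", "dropout", fallback=0.0)
print(rank, alpha, dropout)
```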

Install & try it

See the README for install, profile setup, and the wizard walkthrough.

Feedback

Issues and PRs welcome at github.com/mattmireles/gemma-tuner-multimodal. If you hit a crash, the bootstrap log and your profile are the most useful things to attach. That said, this is a side quest for me, so hopefully this doesn't get too popular lol.
