Models

| Key | Flavor | Params | Autoencoder | Hardware | Max Duration |
| --- | --- | --- | --- | --- | --- |
| `small` | ARC | 433 M | SAME-S | CPU | 120 s |
| `medium` | ARC | 1.4 B | SAME-L | GPU (CUDA) | 380 s |
| `small-rf` / `medium-rf` | RF | 433 M / 1.4 B | SAME-S / SAME-L | CPU / GPU | 120 s / 380 s |
| `same-s` / `same-l` | Autoencoder | 266 M / 1.7 B | n/a | CPU / GPU | n/a |

ARC checkpoints are post-trained for 8-step inference at cfg_scale=1. RF checkpoints are rectified-flow base models intended for LoRA training; they run at cfg_scale=7 with roughly 50 steps. Both ARC and RF checkpoints bundle their autoencoder. A standalone SAME checkpoint reuses the cached full checkpoint when one is available.
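The table and the sampler settings above can be summarized as a small lookup. The following is only a sketch: `ModelSpec`, `MODELS`, `SAMPLER_DEFAULTS`, and `generation_settings` are hypothetical names chosen for illustration, not part of the application's API, and the RF entry assumes that cfg_scale=7 and roughly 50 steps describe sampling from an RF checkpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """One row of the model table (hypothetical structure)."""
    flavor: str
    params: str
    autoencoder: str
    hardware: str
    max_duration_s: int


# Values copied from the table above.
MODELS = {
    "small": ModelSpec("ARC", "433 M", "SAME-S", "CPU", 120),
    "medium": ModelSpec("ARC", "1.4 B", "SAME-L", "GPU (CUDA)", 380),
    "small-rf": ModelSpec("RF", "433 M", "SAME-S", "CPU", 120),
    "medium-rf": ModelSpec("RF", "1.4 B", "SAME-L", "GPU", 380),
}

# Sampler settings per flavor, as described in the paragraph above.
# The RF step count is approximate ("roughly 50 steps").
SAMPLER_DEFAULTS = {
    "ARC": {"steps": 8, "cfg_scale": 1.0},
    "RF": {"steps": 50, "cfg_scale": 7.0},
}


def generation_settings(key: str, duration_s: float) -> dict:
    """Return sampler settings for a model key, rejecting out-of-range durations."""
    spec = MODELS[key]
    if duration_s <= 0 or duration_s > spec.max_duration_s:
        raise ValueError(
            f"{key} supports durations up to {spec.max_duration_s} s, got {duration_s}"
        )
    return {"model": key, "duration_s": duration_s, **SAMPLER_DEFAULTS[spec.flavor]}
```

For example, `generation_settings("small", 30)` yields 8 steps at cfg_scale 1.0, while requesting 200 s from `small` raises a `ValueError` because the table caps it at 120 s.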

Loading

Nothing downloads at startup. Local-only mode is on by default: a model loads at the first CREATE that needs it, resolving local folders first, then the Hugging Face cache, with a one-time download only after explicit consent. The Settings → Models panel shows every engine's readiness, registers any checkpoint already on disk through a native folder picker, and lists every model location with its size and a one-click open-in-Explorer action.
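The resolution order described above (local folders, then the Hugging Face cache, then a consented download) can be sketched as follows. This is an illustration under stated assumptions, not the application's code: `resolve_checkpoint`, `ask_consent`, and `download` are hypothetical names, and the cache lookup is simplified to a flat `cache_dir / name` check, whereas the real Hugging Face cache uses a nested `models--<org>--<name>/snapshots/<revision>` layout.

```python
from pathlib import Path
from typing import Callable, Iterable


def resolve_checkpoint(
    name: str,
    local_dirs: Iterable[Path],
    cache_dir: Path,
    ask_consent: Callable[[str], bool],
    download: Callable[[str], Path],
) -> Path:
    """Locate a checkpoint lazily; only reach the network with explicit consent."""
    # 1. Registered local folders win.
    for folder in local_dirs:
        candidate = Path(folder) / name
        if candidate.exists():
            return candidate

    # 2. Fall back to the cache (flat lookup here; a simplification).
    cached = Path(cache_dir) / name
    if cached.exists():
        return cached

    # 3. Download only after the user agrees.
    if not ask_consent(name):
        raise PermissionError(f"download of {name!r} was not approved")
    return download(name)
```

Calling this function at the first CREATE rather than at startup gives the lazy behavior described above: nothing is touched until a model is actually needed.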

Placement

User Guide §21.2 covers which files each model needs, the exact folder tree with download links, and where the T5Gemma text encoder lives.

Engines

The magenta-rt2-nvidia sidecar adds Magenta RealTime 2 text-to-music, and Suno adds cloud generation. Both appear in the same Generate surface as the local engines. See User Guide §27 and §26.
