v3.3.2 - New Inference Engine (pre-release)
This is a development build introducing a complete re-architecture of the inference engine. We are transitioning away from ONNX Runtime as the primary backend to a more modular engine supporting native Safetensors and GGUF.
This build serves as the first technical preview on the roadmap toward Amuse 4.0.
Technical Pivot: From ONNX to Native Weights
The legacy ONNX implementation required significant conversion overhead and lacked immediate compatibility with the latest research. The new engine changes this:
- Format Support: Added native loaders for `.safetensors` and `.gguf`.
- SOTA Integration: Provides the foundation to run FLUX.2, Z-Image, and LTX-2 without waiting for ONNX-specific optimizations or model conversions.
- Quantization: Automatic quantization to `bfloat16`, `float8`, or `NF4` data types. GGUF support adds finer bit-depth control (4-bit, 5-bit, 8-bit, etc.), significantly improving VRAM management for high-parameter models on consumer hardware.
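To make the VRAM effect of the data types above concrete, here is a rough back-of-the-envelope sketch. It is not part of Amuse. The function name and the 12B parameter count are hypothetical, and the bit widths are nominal: real quantized files also store scales and block metadata, so actual sizes will be somewhat larger.

```python
# Nominal bits per weight for the data types named in the release notes.
# Real quantized formats add per-block scales and metadata, so treat the
# results as lower bounds for the weights alone, not as measurements.
BITS_PER_WEIGHT = {
    "bfloat16": 16,
    "float8": 8,
    "NF4": 4,
}


def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 1024**3


if __name__ == "__main__":
    params = 12_000_000_000  # hypothetical 12B-parameter model
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name:>8}: {weight_memory_gib(params, bits):5.1f} GiB")
```

Under these assumptions, a 12B-parameter model needs roughly 22 GiB for its weights at `bfloat16` and under 6 GiB at 4 bits, which is the difference between not fitting and fitting on many consumer GPUs. Activations and other runtime buffers come on top of this.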
What to Expect in this Dev Build
- New Backend Implementation: This is a ground-up rebuild. Expect instability as we port features over.
- Breaking Changes: Model loading logic has been overhauled. Existing ONNX models from your v3.0 install need to be updated to their Safetensors equivalents.
- Performance: Initial focus is on extensibility and compatibility. Performance tuning for specific AMD/NPU/GPU targets is ongoing.
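If you are unsure which files in an existing model folder are already in a format this build can load, the sketch below checks file contents rather than extensions. It is an illustration only and does not reflect how Amuse itself loads models. The function name `sniff_model_format` is hypothetical. It relies on two published format details: GGUF files begin with the ASCII bytes `GGUF`, and Safetensors files begin with an 8-byte little-endian header length followed by a JSON header.

```python
import json
import struct
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file
MAX_HEADER_BYTES = 100 * 1024 * 1024  # sanity cap before reading a header


def sniff_model_format(path: Path) -> str:
    """Return 'gguf', 'safetensors', or 'unknown' based on file contents."""
    with open(path, "rb") as f:
        head = f.read(8)
        if head[:4] == GGUF_MAGIC:
            return "gguf"
        if len(head) < 8:
            return "unknown"
        # Safetensors: 8-byte little-endian header size, then a JSON object.
        (header_len,) = struct.unpack("<Q", head)
        file_size = path.stat().st_size
        if header_len == 0 or header_len > min(file_size - 8, MAX_HEADER_BYTES):
            return "unknown"
        try:
            header = json.loads(f.read(header_len))
        except ValueError:  # covers invalid UTF-8 and invalid JSON
            return "unknown"
        return "safetensors" if isinstance(header, dict) else "unknown"
```

ONNX files are serialized protobuf messages with no fixed magic bytes, so this sketch reports them as `unknown` rather than trying to identify them. Any file reported as `unknown` should be treated as not loadable by this build until confirmed otherwise.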
Installation
This dev build is NOT compatible with previous versions of Amuse or with existing ONNX models. A full uninstall of Amuse 3.0 is REQUIRED.
1. Installer Version
Best for a standard Windows setup with automatic shortcuts.
- New Install & Upgrade:
  - Download and run `Amuse_vX.X.X.exe`.
  - Follow the on-screen instructions.
2. Standalone Version
Best for custom drive locations.
New Install:
- Download and extract `Amuse_vX.X.X.zip` to your preferred folder.
- Run `Amuse.exe`.
Note: A fast SSD with plenty of free space is highly recommended, as model downloads can be large.
Device Compatibility
The new engine uses a Python-based compute backend. Please note the following hardware requirements for this dev release:
- NVIDIA: Supports all CUDA compatible devices (CUDA 12.8).
- AMD: Supports Radeon 7000 series GPUs only (ROCm 7.2.1).
- Intel: Not Supported.
- Legacy Hardware: If you are using an incompatible device or older AMD hardware, it is highly recommended to stay on v3.2.0 until broader support is ported to the new engine.
Developer's Note
Only install this version if you are interested in testing development builds. As I am a solo developer with very limited testing resources and only a few devices, things may not work, and features may be missing. Each dev build may require re-installs and other manual processes. Please only install if this is okay; otherwise, wait until a stable release is ready.
Full Changelog: v3.2.0...v3.3.2