
LiquidONNX

ONNX export and inference tools for LFM2 models.

1. Supported Models

Family               Quantization formats
LFM2.5, LFM2         fp32, fp16, q4, q8
LFM2.5-VL, LFM2-VL   fp32, fp16, q4, q8
LFM2-MoE             fp32, fp16, q4, q4f16

2. Installation

git clone https://github.com/Liquid4All/onnx-export.git
cd onnx-export
uv sync

# For GPU inference support
uv sync --extra gpu

# For development (testing, benchmarking)
uv sync --extra dev
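
To confirm which execution providers onnxruntime can use in your environment (for example, after installing the gpu extra), a quick check like the following works. This is a generic onnxruntime snippet, not part of this project's CLI:

import onnxruntime as ort

# List the execution providers compiled into the installed onnxruntime build.
providers = ort.get_available_providers()
print("Available providers:", providers)

# CUDAExecutionProvider is listed only when onnxruntime-gpu and a CUDA runtime are present.
print("GPU available:", "CUDAExecutionProvider" in providers)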

3. Export

3.1 LFM2 Text Models

# All precisions
uv run lfm2-export LiquidAI/LFM2.5-1.2B-Instruct --precision

3.2 LFM2-VL Vision-Language Models

# All precisions
uv run lfm2-vl-export LiquidAI/LFM2.5-VL-1.6B --precision

# Conv2d vision format (alternative to default tiled)
uv run lfm2-vl-export LiquidAI/LFM2.5-VL-1.6B --vision-format conv2d

3.3 LFM2-MoE Mixture of Experts

# All precisions
uv run lfm2-moe-export LiquidAI/LFM2-MoE-8B-A1B --precision
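
After an export finishes, a quick structural check with the onnx and onnxruntime packages is a useful sanity test before inference. This is a minimal sketch, not part of the export CLI; the path mirrors the inference examples in section 4, and the input names depend on the chosen precision and export:

# Minimal sketch: validate an exported graph and inspect its inputs.
import onnx
import onnxruntime as ort

model_path = "LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx"  # adjust to your export
onnx.checker.check_model(model_path)  # raises if the graph is malformed

session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
for inp in session.get_inputs():
    print(inp.name, inp.shape, inp.type)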

4. Inference

All inference commands provide interactive multi-turn chat with streaming output. They automatically detect CUDA availability and fall back to CPU if needed.
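
The detect-CUDA-then-fall-back pattern can be approximated with plain onnxruntime calls. A minimal sketch of that behavior (not the project's internal implementation):

import onnxruntime as ort

def make_session(model_path: str, force_cpu: bool = False) -> ort.InferenceSession:
    # Prefer CUDA when available; otherwise (or when forcing CPU, as --cpu does) use CPU only.
    providers = ["CPUExecutionProvider"]
    if not force_cpu and "CUDAExecutionProvider" in ort.get_available_providers():
        providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ort.InferenceSession(model_path, providers=providers)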

4.1 Text Generation

# Interactive chat (starts conversation loop)
uv run lfm2-infer --model LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx

# Single prompt (non-interactive)
uv run lfm2-infer --model LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx \
    --prompt "Explain quantum computing"

# Force CPU execution
uv run lfm2-infer --model LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx --cpu

4.2 Vision-Language

# Single image analysis
uv run lfm2-vl-infer --model LFM2.5-VL-1.6B-ONNX \
    --images photo.jpg \
    --prompt "What do you see in this image?"

# Multi-image comparison (up to 2 images)
uv run lfm2-vl-infer --model LFM2.5-VL-1.6B-ONNX \
    --images image1.jpg image2.jpg \
    --prompt "Compare these two images"

# Text-only (no images)
uv run lfm2-vl-infer --model LFM2.5-VL-1.6B-ONNX \
    --prompt "Hello, how are you?"

Note: VL inference requires the model directory path (not a single .onnx file) since it loads multiple components: embed_tokens.onnx, embed_images.onnx, and decoder.onnx.
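
A minimal sketch of loading those components as separate onnxruntime sessions (file names come from the note above; the directory layout and session I/O are assumptions about the export, not the CLI's internals):

import os
import onnxruntime as ort

model_dir = "LFM2.5-VL-1.6B-ONNX"  # directory passed to --model

def load(name: str) -> ort.InferenceSession:
    # Assumes an onnx/ subfolder mirroring the text-model layout; adjust if your export differs.
    return ort.InferenceSession(os.path.join(model_dir, "onnx", name))

embed_tokens = load("embed_tokens.onnx")
embed_images = load("embed_images.onnx")
decoder = load("decoder.onnx")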

4.3 MoE

# Interactive chat
uv run lfm2-moe-infer --model LFM2-MoE-8B-A1B-ONNX/onnx/model_q4.onnx

# Force CPU (when model does not fit VRAM)
uv run lfm2-moe-infer --model LFM2-MoE-8B-A1B-ONNX/onnx/model_q4.onnx --cpu

5. Testing

Tests verify ONNX exports against PyTorch reference models.

# Install dev dependencies
uv sync --extra dev

# LFM2 text model tests
uv run pytest tests/test_lfm2/test_decoder.py -v -k "q4"

# LFM2-VL vision-language tests
uv run pytest tests/test_lfm2_vl/test_decoder.py -v -k "450M"
uv run pytest tests/test_lfm2_vl/test_vision_encoder.py -v

# LFM2-MoE tests
uv run pytest tests/test_lfm2_moe/test_decoder.py -v
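
The core of such a parity test is a tolerance check between the PyTorch reference logits and the ONNX logits for the same input. A minimal sketch of that comparison step (tolerances and shapes are illustrative, not the suite's actual values):

import numpy as np

def assert_logits_match(ref_logits: np.ndarray, onnx_logits: np.ndarray,
                        rtol: float = 1e-3, atol: float = 1e-3) -> None:
    # Fails with a detailed report if the two outputs diverge beyond tolerance.
    np.testing.assert_allclose(onnx_logits, ref_logits, rtol=rtol, atol=atol)

# Dummy data standing in for reference and ONNX outputs of the same prompt.
ref = np.random.default_rng(0).standard_normal((1, 8, 4096)).astype(np.float32)
assert_logits_match(ref, ref + 1e-5)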

Benchmarking compares the ONNX export against the original PyTorch model:

# Text model benchmark
uv run lfm2-bench --model LiquidAI/LFM2.5-1.2B-Instruct \
    --onnx LFM2.5-1.2B-Instruct-ONNX/onnx/model_q4.onnx
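
For a rough, standalone latency measurement of any exported model (distinct from lfm2-bench, which drives full generation), a simple timing loop over session.run can be used. The feed dictionary is a placeholder you would build from the model's actual inputs:

import time
import onnxruntime as ort

def average_run_time(session: ort.InferenceSession, feeds: dict, iters: int = 20) -> float:
    # One warm-up run, then average wall-clock time over the remaining iterations.
    session.run(None, feeds)
    start = time.perf_counter()
    for _ in range(iters):
        session.run(None, feeds)
    return (time.perf_counter() - start) / iters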

6. Pre-exported Models

6.1 LiquidAI

Text models:

Vision-Language:

6.2 onnx-community

Text models:

Specialized:

Vision-Language:

MoE:

Note: The onnx-community models are exported using Transformers.js tooling with a different export pipeline. This project aims to produce compatible graph structures and file naming conventions to ensure interoperability with Transformers.js and other ONNX consumers.

7. Acknowledgements

Special thanks to Joshua Lochner for his work on Transformers.js and the onnx-community models, which inspired and informed this project's ONNX export approach.

8. License

See LICENSE for details.
