Quick Start
Gaurav14cs17 edited this page Jun 21, 2026 · 1 revision
Get started with FlashOptim in minutes.
from flashoptim import FlashOptim, PTQuantizer
# Load your model
model = FlashOptim("pretrained/model.pth")
# Apply INT8 quantization
quantizer = PTQuantizer(dtype="int8", calibration_samples=500)
quantized = quantizer.quantize(model, calibration_data="data/calibration/")
# Export
quantized.export("optimized/model_int8.onnx")
print(f"Size reduction: {quantized.compression_ratio}x")

from flashoptim import FlashOptim, UnstructuredPruner
model = FlashOptim("pretrained/model.pth")
pruner = UnstructuredPruner(sparsity=0.5, method="magnitude")
pruned = pruner.prune(model)
pruned.export("optimized/model_pruned.onnx")

# Quantize
flashoptim quantize --config configs/flashoptim_quantize_int8.yaml
# Prune
flashoptim prune --config configs/flashoptim_prune_unstructured.yaml
# Benchmark the result
flashoptim benchmark --model optimized/model_int8.onnx --device cpu

from flashoptim import AutoOptimizer
optimizer = AutoOptimizer(
    model_path="pretrained/model.pth",
    target="edge",
    constraints={"latency_ms": 10}
)
result = optimizer.run()
result.export("optimized/model_auto.onnx")

- Quantization Guide — Deep dive into PTQ and QAT
- Pruning Guide — Structured and unstructured pruning
- Distillation Guide — Knowledge transfer techniques
- NAS Guide — Automated architecture search
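
For intuition before opening the guides above, here is a minimal sketch of the two ideas the Python examples rely on: INT8 quantization and magnitude pruning at 50% sparsity. It is plain Python and does not use or reproduce FlashOptim's implementation. The function names (`quantize_int8`, `dequantize`, `magnitude_prune`), the symmetric per-tensor scaling scheme, and the sample weights are all assumptions made for illustration.

```python
# Conceptual sketch only: NOT FlashOptim's implementation.
# All names and the sample weights below are hypothetical.

def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step, then clamp to the INT8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [v * scale for v in q]

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

weights = [0.80, -0.05, 0.30, -1.27, 0.02, 0.64]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
pruned = magnitude_prune(weights, sparsity=0.5)

print(q)       # [80, -5, 30, -127, 2, 64]
print(pruned)  # [0.8, 0.0, 0.0, -1.27, 0.0, 0.64]
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost of storing a weight in 8 bits instead of 32. A real toolkit would typically also use calibration data to choose scales for activations, which this sketch leaves out.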
FlashOptim — Model optimization toolkit | PyPI | MIT License