chisel is a model compression toolkit for PyTorch and Hugging Face Transformers that helps you sculpt leaner, faster models for efficient inference at the edge.
It exposes a set of composable compression passes, including quantization, pruning, and low-rank decomposition, that can be applied individually or orchestrated together into a compression pipeline via a simple YAML workflow. Under the hood, chisel builds on Microsoft's Olive for workflow orchestration. Each pass is designed to be modular, with first-class support for PyTorch and Transformers models.
chisel also provides built-in evaluators to measure the impact of each compression step on model size and accuracy, so you always know the cost of every cut.
Longer term, chisel aims to support export to edge runtimes such as ONNX and ExecuTorch, bringing the full compression-to-deployment pipeline under one roof.
chisel is not yet published on PyPI. Install the latest from source:
```bash
uv add git+https://github.com/icnatspell/chisel
```

Or clone the repository and run `uv sync` (see Development below).
chisel is driven by Olive workflow configs. The `chisel run` command is a thin wrapper around `olive run` that ensures chisel's passes and evaluators are registered before Olive starts:
```bash
chisel run --config path/to/workflow.yaml
```

A minimal structured-pruning config looks like:

```yaml
input_model:
  type: HfModel
  model_path: microsoft/resnet-50
  task: image-classification
passes:
  pruning:
    type: TorchPruningPass
    config:
      pruning_ratio: 0.10
      importance: lamp
      global_pruning: true
engine:
  output_dir: outputs/pruned
```

See `examples/` for complete end-to-end workflows including evaluation and knowledge-distillation fine-tuning.
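For intuition about the `importance: lamp` and `global_pruning: true` settings in the config above, here is a minimal pure-Python sketch. It assumes `lamp` refers to the LAMP (layer-adaptive magnitude-based pruning) score and uses the per-weight, unstructured formulation for simplicity. It is not chisel's implementation, which prunes structurally; the function names `lamp_scores` and `global_prune_mask` and the toy weights are hypothetical.

```python
from typing import Dict, List


def lamp_scores(weights: List[float]) -> List[float]:
    """Return the LAMP score of each weight in one layer, in input order.

    score(w) = w^2 / (sum of v^2 over all weights v in the same layer
                      whose magnitude is at least that of w)
    """
    # Indices sorted by ascending magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    scores = [0.0] * len(weights)
    tail = 0.0  # running sum of squares, accumulated from the largest weight down
    for i in reversed(order):
        sq = weights[i] ** 2
        tail += sq
        scores[i] = sq / tail if tail > 0 else 0.0
    return scores


def global_prune_mask(
    layers: Dict[str, List[float]], ratio: float
) -> Dict[str, List[bool]]:
    """Build a keep-mask that drops the `ratio` lowest-scoring weights overall.

    "Global" here means all layers share one ranking instead of each layer
    losing the same fraction of its weights.
    """
    scored = [
        (score, name, idx)
        for name, w in layers.items()
        for idx, score in enumerate(lamp_scores(w))
    ]
    scored.sort(key=lambda t: t[0])
    n_prune = int(len(scored) * ratio)
    mask = {name: [True] * len(w) for name, w in layers.items()}
    for _, name, idx in scored[:n_prune]:
        mask[name][idx] = False
    return mask


# Toy weights (hypothetical): two layers with very different scales.
example_layers = {"conv1": [0.1, -2.0, 0.5], "fc": [10.0, 0.3]}
example_mask = global_prune_mask(example_layers, ratio=0.2)
print(example_mask)
# → {'conv1': [True, True, True], 'fc': [True, False]}
```

In this toy case a plain global magnitude ranking would remove the `0.1` weight in `conv1`. The LAMP ranking removes the `0.3` weight in `fc` instead, because it is small relative to the rest of its own layer. The largest weight in every layer always scores 1.0, which is what keeps a single shared ranking from emptying out a whole layer at moderate pruning ratios.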
| Example | Model | What it shows |
|---|---|---|
| `examples/hf/microsoft-resnet-50/` | `microsoft/resnet-50` | Prune → Eval → Fine-tune (Plain, KD) |
| `examples/torch/torchvision-resnet-50/` | `torchvision.models.resnet50` | Prune → Eval → Fine-tune (Plain, KD) |
```bash
git clone https://github.com/icnatspell/chisel.git && cd chisel && uv sync
```

```bash
just check  # lint, format, type-check
just test   # pytest with coverage
just build  # build sdist + wheel
```

Run `just` with no arguments to see all tasks.

```bash
just hooks  # install hooks and run on all files
```

Lint, type-check, and tests run on every push and pull request via GitHub Actions (`.github/workflows/ci.yml`). Coverage must stay at or above 80%.