[Help Wanted] Add ONNX Runtime inference path with PyTorch fallback #12

@Oshgig

Description

Overview

scripts/export_model.py exports PyTorch models to ONNX, but the API still loads .pth checkpoints directly. ONNX Runtime would provide faster inference and lower memory usage.
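A minimal sketch of the intended checkpoint-resolution order, assuming the ONNX export sits next to the PyTorch checkpoint with the same stem (the helper name and path layout are illustrative, not from the codebase):

```python
from pathlib import Path


def resolve_checkpoint(base: Path) -> tuple[Path, str]:
    """Prefer an ONNX export over a PyTorch checkpoint.

    `base` is the checkpoint path without suffix, e.g. models/unet.
    Returns the chosen file plus a backend tag ("onnx" or "torch").
    """
    onnx_path = base.with_suffix(".onnx")
    pth_path = base.with_suffix(".pth")
    if onnx_path.exists():
        return onnx_path, "onnx"   # fast path: ONNX Runtime
    if pth_path.exists():
        return pth_path, "torch"   # fallback: PyTorch eager mode
    raise FileNotFoundError(f"no checkpoint found for {base}")
```

`_load_model()` would then dispatch on the tag: `onnxruntime.InferenceSession(str(path))` for `"onnx"`, the existing `torch` loading path for `"torch"`.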

Scope

  • Update _load_model() in src/climatevision/inference/pipeline.py to prefer .onnx files when present
  • Fall back to PyTorch .pth if ONNX file is missing
  • Verify ONNX input/output tensor signatures match per analysis type:
    • UNet: (N, n_channels, 256, 256) → (N, n_classes, 256, 256)
    • Siamese: (N, C, 256, 256) × 2 → (N, 2, 256, 256)
  • Add latency benchmark logging (compare PyTorch vs ONNX inference time)
  • Update scripts/export_model.py to export all analysis-specific models
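The signature check in the third bullet could be sketched as a small spec table plus a comparator, where -1 marks a dimension left symbolic (batch size, channel count). The spec layout and names are hypothetical; the concrete shapes come from the bullets above (only the two architectures listed there are shown):

```python
# Expected ONNX tensor signatures per analysis type.
# -1 matches any size (dynamic batch / channel dimensions).
SIGNATURES = {
    "unet": {
        "inputs": [(-1, -1, 256, 256)],
        "output": (-1, -1, 256, 256),
    },
    "siamese": {
        "inputs": [(-1, -1, 256, 256), (-1, -1, 256, 256)],  # two branches
        "output": (-1, 2, 256, 256),
    },
}


def shapes_match(actual: tuple, expected: tuple) -> bool:
    """Compare a concrete tensor shape against a spec entry."""
    return len(actual) == len(expected) and all(
        e == -1 or a == e for a, e in zip(actual, expected)
    )
```

At load time the shapes reported by the ONNX session (`session.get_inputs()` / `session.get_outputs()`) would be run through `shapes_match` before the model is accepted.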

Acceptance Criteria

  • When .onnx checkpoint exists, API loads it via ONNX Runtime
  • Latency difference is logged per request
  • Fallback to PyTorch works seamlessly when ONNX export is absent
  • All 3 analysis types supported
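Per-request latency logging could look like the following sketch, built on `time.perf_counter` (the logger name and wrapper are illustrative, not existing project code):

```python
import logging
import time

logger = logging.getLogger("climatevision.inference")  # name is illustrative


def timed(backend: str, infer, *args):
    """Run one inference call and log its wall-clock latency."""
    start = time.perf_counter()
    result = infer(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    logger.info("inference backend=%s latency_ms=%.2f", backend, elapsed_ms)
    return result, elapsed_ms
```

Called once per request with whichever backend was loaded, the logged `latency_ms` values for `onnx` vs `torch` give the comparison the second criterion asks for.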

Resources

  • src/climatevision/inference/pipeline.py — _load_model() function
  • scripts/export_model.py — existing ONNX export script
  • references/model-architectures.md — tensor shape specs

Difficulty: Intermediate
Labels: help wanted, backend, mlops, performance
