feat: safetensors runtime loading + HF-aware resolver#28

Merged
ziv-lazarov-nagish merged 2 commits into nagish from feat/safetensors-runtime
Apr 26, 2026

Conversation


ziv-lazarov-nagish (Contributor) commented Apr 20, 2026

Summary

First of a 5-PR stack (layers 1–4 split PR #22 into reviewable layers; layer 5 adds --dry-run on top). This PR adds the runtime-only foundation: bin.py can now load models from safetensors directories and falls back to .ckpt for backward compatibility. No HF or publish features are advertised here — the huggingface_hub import is behind a guarded try/except ImportError, and the [hf] / [publish] optional-dependency groups land in the later stack layers.

  • bin.py: new resolve_model_path() with priority MODEL_PATH env > HF_MODEL_REPO env > baked-in sign_language_segmentation/dist/2026/. load_model() prefers model.safetensors + config.json and falls back to .ckpt.
  • pyproject.toml: add [build-system], add safetensors to base deps, swap .ckpt → .safetensors in package-data, drop slim_checkpoint script (still runnable via python -m).
  • .gitignore: ignore *.safetensors artifacts.
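The resolver-plus-fallback behaviour described above can be sketched roughly as follows. This is a minimal sketch, not the PR's actual bin.py: the helper names `_load_safetensors`/`_load_checkpoint` and the `snapshot_download` call on the HF path are assumptions; only the env-var priority, the safetensors-first preference, and the `.ckpt` fallback come from the PR description.

```python
import os
from pathlib import Path

# Baked-in default location, taken from the PR description.
DEFAULT_DIST = Path("sign_language_segmentation/dist/2026")

def resolve_model_path() -> str:
    """Priority: MODEL_PATH env > HF_MODEL_REPO env > baked-in dist directory."""
    if path := os.environ.get("MODEL_PATH"):
        return path
    if repo := os.environ.get("HF_MODEL_REPO"):
        # The real PR keeps huggingface_hub behind a guarded import; using
        # snapshot_download here is an assumption about the HF code path.
        from huggingface_hub import snapshot_download
        return snapshot_download(repo)
    return str(DEFAULT_DIST)

def load_model(model_path: str):
    """Prefer model.safetensors + config.json; fall back to a .ckpt file."""
    root = Path(model_path)
    if (root / "model.safetensors").exists() and (root / "config.json").exists():
        return _load_safetensors(root)      # hypothetical safetensors loader
    ckpt = next(root.glob("*.ckpt"), None)  # backward-compatible fallback
    if ckpt is None:
        raise FileNotFoundError(f"no model.safetensors or .ckpt under {root}")
    return _load_checkpoint(ckpt)           # hypothetical .ckpt loader
```

With this shape, existing `.ckpt`-based installs keep working unchanged, and a safetensors directory wins whenever both files are present.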

Review-comment fix shipped in this layer

  • High memory usage (#3): @lru_cache + HF_MODEL_REVISION staleness — added a revision: str = "" kwarg to load_model(...) so it's part of the cache key. segment_pose reads os.environ.get("HF_MODEL_REVISION", "") and forwards it. A mid-process env change now invalidates the cache entry instead of silently returning a stale model. Non-HF paths default to revision="" and behave exactly as before.
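The revision-in-cache-key fix can be illustrated with a toy sketch. Only the `revision=""` kwarg and the env-forwarding in `segment_pose` come from the PR; the stand-in return value and function bodies below are illustrative.

```python
import os
from functools import lru_cache

@lru_cache(maxsize=None)
def load_model(model_path: str, device: str = "cpu", revision: str = ""):
    # revision participates in the lru_cache key, so a changed
    # HF_MODEL_REVISION creates a fresh cache entry instead of
    # silently reusing a stale model.
    return {"path": model_path, "device": device, "revision": revision}  # stand-in

def segment_pose(model_path: str, device: str = "cpu"):
    # Forward the env var so mid-process changes reach the cache key.
    revision = os.environ.get("HF_MODEL_REVISION", "")
    return load_model(model_path, device, revision=revision)
```

Calling `segment_pose` twice with the same revision returns the cached object; changing HF_MODEL_REVISION between calls yields a distinct one, matching the test plan below.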

Out of scope (reverted)

  • datasets/dgs/dataset.py formatting churn — not in this PR
  • datasets/annotation_platform/sync.py --model_path default change — not in this PR

Stack

5-PR stack — layers 1–4 split #22, layer 5 adds --dry-run:

  1. feat/safetensors-runtime → nagish (this PR)
  2. feat/publish-utils-and-card → this PR (PR #29: feat: publish utils — conversion, versioning, model card)
  3. feat/publish-hf-ops-and-eval → #29 (PR #31: feat: publish HF ops + evaluation helpers)
  4. feat/publish-cli → #31 (PR #30: feat: publish CLI orchestrator)
  5. feat/publish-dry-run → #30 (PR #32: feat: --dry-run flag on publish CLI)

The stack builds toward replacing #22; #22 stays open until explicitly closed.

Test plan

  • uv run --extra dev pytest sign_language_segmentation/tests — 61 passed
  • Local round-trip: MODEL_PATH pointing at a directory with model.safetensors + config.json loads successfully (verified against sign_language_segmentation/dist/2026/)
  • load_model(dir, device, revision="x") vs load_model(dir, device, revision="y") returns distinct instances; same revision returns the cached instance (cache key includes revision as intended)

- `bin.py`: new `resolve_model_path()` picks MODEL_PATH > HF_MODEL_REPO > baked-in
  `dist/2026/`. `load_model()` prefers `model.safetensors` + `config.json` and
  falls back to `.ckpt` for backward compatibility.
- `load_model(..., revision="")` makes HF_MODEL_REVISION part of the lru_cache
  key so mid-process env changes invalidate the cached instance instead of
  silently returning a stale model.
- `pyproject.toml`: add `[build-system]`, add `safetensors` to base deps, swap
  `.ckpt` for `.safetensors` in package-data, drop `slim_checkpoint` script
  (still runnable via `python -m`).
- `.gitignore`: ignore `*.safetensors` artifacts.

The `huggingface_hub` import stays behind a guarded `try/except ImportError`;
no new runtime dep is advertised in this PR.
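A guarded import of that shape typically looks like the sketch below (the `HF_AVAILABLE` flag name is illustrative, not from the PR):

```python
# Optional dependency: the HF code path is only taken when the package
# is importable; a plain `pip install` without the later [hf] extra
# simply leaves the flag False.
try:
    from huggingface_hub import snapshot_download  # noqa: F401
    HF_AVAILABLE = True
except ImportError:
    snapshot_download = None
    HF_AVAILABLE = False
```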
AmitMY (Contributor) commented Apr 22, 2026

Generally good! But this PR, if merged to main, would break it. Can you please manually convert the current ckpt to safetensors, so we can still have the model here? (More important for main.)

@ziv-lazarov-nagish (Contributor, Author)

> Generally good! But this PR, if merged to main, would break it. Can you please manually convert the current ckpt to safetensors, so we can still have the model here? (More important for main.)

.ckpt files are still supported (see here); do you still want me to convert to safetensors?

Comment thread on pyproject.toml:

  [tool.setuptools.package-data]
 -sign_language_segmentation = ["**/*.json", "**/*.ckpt", "**/*.yaml"]
 +sign_language_segmentation = ["**/*.json", "**/*.safetensors", "**/*.yaml"]

Convert to safetensors; otherwise the ckpt file will not be installed on pip install.

package-data now selects `**/*.safetensors`, so the bundled `best.ckpt` would not have been installed on `pip install`. Convert it in-place (bf16 state dict + hparams JSON) so the baked-in default keeps working.

safetensors/ckpt loads produce identical outputs (max abs diff = 0).
ziv-lazarov-nagish merged commit 4b3d2ea into nagish on Apr 26, 2026
2 checks passed