feat: exact weight-byte accounting from safetensors metadata

Part of #52

## Goal
Compute byte-accurate model weight size before load by reading the safetensors header, with the existing analytical estimate (`ModelProfile::total_param_bytes`) as a fallback.

## Why
The most accurate weight number is what is actually on disk. `model.safetensors.index.json` carries `metadata.total_size` (exact sum of all tensor bytes), and a single-file `model.safetensors` carries a header with each tensor's dtype + shape + byte offsets. mlxcel already opens the index (`parse_shard_index` in `src/lib/mlxcel-core/src/weights.rs`) but extracts only shard filenames and **discards `total_size`**. The analytical estimate (`build_profile_from_json`) is only "few % accurate" and currently lives only in the distributed path.

## Scope / implementation
- Add a function (e.g. `weights::weight_footprint_bytes(model_dir) -> Option<u64>`) that:
  - prefers `metadata.total_size` from `model.safetensors.index.json` when present;
  - else, for a single `model.safetensors`, reads the 8-byte little-endian header length and the JSON header that follows it, then sums per-tensor byte sizes (dtype itemsize × shape product) **without loading tensor data**;
  - returns `None` when neither is available (caller falls back to analytical).
- Reuse / extend the existing index parser in `weights.rs` rather than adding a second JSON reader.
- Provide a single accessor in the estimation module so callers don't each pick between exact/analytical.
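
A minimal sketch of the single-file path described above, assuming the standard safetensors layout (an 8-byte little-endian length `N`, then `N` bytes of UTF-8 JSON, then tensor data). The names `read_header_json`, `dtype_itemsize`, `tensor_bytes`, and the `MAX_HEADER_LEN` cap are hypothetical and do not exist in mlxcel today. JSON parsing is deliberately left out, since the issue asks for the existing parser in `weights.rs` to be reused; a real implementation would also need to skip the special `__metadata__` key when iterating header entries.

```rust
use std::io::{self, Read};

/// Assumed sanity cap on the JSON header size (hypothetical value: 100 MiB).
const MAX_HEADER_LEN: u64 = 100 * 1024 * 1024;

/// Reads only the header: an 8-byte little-endian length `N` followed by
/// `N` bytes of UTF-8 JSON. Tensor data after the header is never read.
pub fn read_header_json<R: Read>(mut reader: R) -> io::Result<String> {
    let mut len_buf = [0u8; 8];
    reader.read_exact(&mut len_buf)?;
    let header_len = u64::from_le_bytes(len_buf);
    if header_len > MAX_HEADER_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "header too large"));
    }
    let mut json = vec![0u8; header_len as usize];
    reader.read_exact(&mut json)?;
    String::from_utf8(json).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

/// Bytes per element for the safetensors dtype strings known to this sketch.
/// Unknown dtypes yield `None` so the caller can fall back to the estimate.
pub fn dtype_itemsize(dtype: &str) -> Option<u64> {
    match dtype {
        "BOOL" | "U8" | "I8" | "F8_E4M3" | "F8_E5M2" => Some(1),
        "U16" | "I16" | "F16" | "BF16" => Some(2),
        "U32" | "I32" | "F32" => Some(4),
        "U64" | "I64" | "F64" => Some(8),
        _ => None,
    }
}

/// dtype itemsize × shape product, with overflow checked.
/// An empty shape is a scalar and counts as one element.
pub fn tensor_bytes(dtype: &str, shape: &[u64]) -> Option<u64> {
    let elems = shape.iter().try_fold(1u64, |acc, &d| acc.checked_mul(d))?;
    elems.checked_mul(dtype_itemsize(dtype)?)
}
```

Returning `None` on an unknown dtype or on overflow keeps the function consistent with the `Option<u64>` contract above: anything the exact path cannot account for falls through to the analytical estimate instead of producing a wrong number.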

## Integration (required for completion — not a standalone helper)
- Wire the exact footprint into the shared estimator used by `--recommend-quant` (`src/execution/quant_advisor.rs`) and the new estimator (sub-issue D), so the analytical path is only a fallback.
- At minimum, `--recommend-quant`'s "Model size" line must reflect exact bytes when a safetensors header is available.
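
One possible shape for the shared accessor, sketched under assumptions: `WeightBytes`, `resolve_weight_bytes`, and `model_size_line` are hypothetical names, and the output format of the "Model size" line is illustrative only, not the current `--recommend-quant` output. The point is that callers receive a single value that records whether it came from the exact or the analytical path.

```rust
/// Where a weight-size figure came from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WeightBytes {
    /// Read from safetensors metadata (index `total_size` or file header).
    Exact(u64),
    /// Analytical fallback, e.g. `ModelProfile::total_param_bytes`.
    Estimated(u64),
}

impl WeightBytes {
    pub fn bytes(self) -> u64 {
        match self {
            Self::Exact(b) | Self::Estimated(b) => b,
        }
    }

    pub fn is_exact(self) -> bool {
        matches!(self, Self::Exact(_))
    }
}

/// Prefers the exact footprint; the analytical closure runs only when
/// the exact path returned `None`.
pub fn resolve_weight_bytes(
    exact: Option<u64>,
    analytical: impl FnOnce() -> u64,
) -> WeightBytes {
    match exact {
        Some(b) => WeightBytes::Exact(b),
        None => WeightBytes::Estimated(analytical()),
    }
}

/// Illustrative rendering of the "Model size" line (format is an assumption).
pub fn model_size_line(w: WeightBytes) -> String {
    let gib = w.bytes() as f64 / (1u64 << 30) as f64;
    let tag = if w.is_exact() { "exact" } else { "estimated" };
    format!("Model size: {:.2} GiB ({})", gib, tag)
}
```

Taking the analytical estimate as a closure means the profile is only built when it is actually needed, which matches the intent that the analytical path is a fallback.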

## Acceptance criteria
- Exact byte count returned for both sharded (`index.json` / `total_size`) and single-file (`model.safetensors` header) layouts; `None` triggers the analytical fallback path.
- Unit tests with fixture headers / index JSON (sharded + single-file + missing).
- One real-model smoke check: exact footprint is within a few % of `ModelProfile::total_param_bytes` and ≤ the on-disk weight file size.
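
The smoke check in the last criterion could be expressed as a small predicate. This is a sketch only: `smoke_check` and `relative_diff` are hypothetical helpers, and the tolerance is a parameter because the issue says "a few %" without fixing a number. The upper bound holds because the weight files also contain the length prefix and the JSON header, so the summed tensor bytes cannot exceed the file size.

```rust
/// Relative difference of `estimate` from `exact`, as a fraction of `exact`.
pub fn relative_diff(exact: u64, estimate: u64) -> f64 {
    if exact == 0 {
        return if estimate == 0 { 0.0 } else { f64::INFINITY };
    }
    (exact as f64 - estimate as f64).abs() / exact as f64
}

/// Checks both halves of the criterion:
/// 1. exact tensor bytes <= on-disk weight file size (sum over shards), and
/// 2. exact and analytical figures agree within `tolerance` (e.g. 0.05).
pub fn smoke_check(
    exact: u64,
    analytical: u64,
    on_disk: u64,
    tolerance: f64,
) -> Result<(), String> {
    if exact > on_disk {
        return Err(format!(
            "exact footprint {} exceeds on-disk size {}",
            exact, on_disk
        ));
    }
    let diff = relative_diff(exact, analytical);
    if diff > tolerance {
        return Err(format!(
            "exact {} vs analytical {} differ by {:.1}%",
            exact,
            analytical,
            diff * 100.0
        ));
    }
    Ok(())
}
```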

feat: exact weight-byte accounting from safetensors metadata #53
