feat: wrap MLX runtime memory APIs (active/peak/limit) in FFI #55

Part of #52

## Goal
Expose MLX's runtime memory APIs through the C++ FFI so mlxcel can measure actual resident/peak memory and set allocator limits.

## Why
There is currently **no** binding for `get_active_memory` / `get_peak_memory` / `get_cache_memory` / `set_memory_limit` / `set_cache_limit` / `reset_peak_memory` (confirmed absent from `src/lib/mlx-cpp` and the FFI). Without these bindings, the estimator cannot be validated against ground truth, peak usage cannot be observed, and the allocator cannot be capped to fail fast instead of thrashing or running out of memory.

## Scope / implementation
- Add C++ wrappers in the mlx-cpp FFI bridge and re-export typed wrappers from `mlxcel-core` (e.g. a new `memory` module, or extend `hardware.rs`):
  - `active_memory() -> u64`, `peak_memory() -> u64`, `cache_memory() -> u64`
  - `set_memory_limit(bytes)`, `set_cache_limit(bytes)`, `reset_peak_memory()`, optionally `clear_cache()`.
- Match the existing FFI conventions in `src/lib/mlxcel-core/src/ffi*` and the `mlx-cpp` crate (keep the raw `ffi` module private; re-export typed wrappers).
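A minimal sketch of the layering described above, assuming a new `memory` module. The inner `ffi` module here is a stand-in built on Rust atomics so the example compiles on its own; in the real crate it would hold the bridge declarations that forward to the C++ shim. The raw function names follow the MLX names listed under "Why". Returning the previous limit from the setters is an assumption about the underlying API, not something this issue specifies.

```rust
// Hypothetical `memory` module: private raw layer, public typed wrappers.

/// Stand-in for the private raw bridge. ASSUMPTION: the real module would
/// declare functions implemented by the C++ shim, not Rust atomics.
mod ffi {
    use std::sync::atomic::{AtomicU64, Ordering};

    static ACTIVE: AtomicU64 = AtomicU64::new(0);
    static PEAK: AtomicU64 = AtomicU64::new(0);
    static CACHE: AtomicU64 = AtomicU64::new(0);
    static MEMORY_LIMIT: AtomicU64 = AtomicU64::new(u64::MAX);
    static CACHE_LIMIT: AtomicU64 = AtomicU64::new(u64::MAX);

    pub fn get_active_memory() -> u64 {
        ACTIVE.load(Ordering::SeqCst)
    }
    pub fn get_peak_memory() -> u64 {
        PEAK.load(Ordering::SeqCst)
    }
    pub fn get_cache_memory() -> u64 {
        CACHE.load(Ordering::SeqCst)
    }
    // ASSUMPTION: setters hand back the previously configured limit.
    pub fn set_memory_limit(bytes: u64) -> u64 {
        MEMORY_LIMIT.swap(bytes, Ordering::SeqCst)
    }
    pub fn set_cache_limit(bytes: u64) -> u64 {
        CACHE_LIMIT.swap(bytes, Ordering::SeqCst)
    }
    pub fn reset_peak_memory() {
        PEAK.store(0, Ordering::SeqCst)
    }
    pub fn clear_cache() {
        CACHE.store(0, Ordering::SeqCst)
    }
}

/// Bytes currently held by live arrays.
pub fn active_memory() -> u64 {
    ffi::get_active_memory()
}

/// High-water mark of active memory since the last reset.
pub fn peak_memory() -> u64 {
    ffi::get_peak_memory()
}

/// Bytes held in the allocator's reuse cache.
pub fn cache_memory() -> u64 {
    ffi::get_cache_memory()
}

/// Caps allocator usage; returns the previous limit in bytes.
pub fn set_memory_limit(bytes: u64) -> u64 {
    ffi::set_memory_limit(bytes)
}

/// Caps the reuse cache; returns the previous limit in bytes.
pub fn set_cache_limit(bytes: u64) -> u64 {
    ffi::set_cache_limit(bytes)
}

/// Resets the peak counter so the next measurement starts fresh.
pub fn reset_peak_memory() {
    ffi::reset_peak_memory()
}

/// Drops cached buffers (the optional wrapper from the list above).
pub fn clear_cache() {
    ffi::clear_cache()
}
```

Keeping `ffi` private means callers only ever see `u64` byte counts, so the bridge's integer types can change without touching the public surface.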

## Integration (required for completion — not standalone)
- Call `active_memory()` immediately after a successful model load and log "resident after load: X" (gated behind the estimator/inspect path or a debug log).
- Provide a `set_memory_limit` hook the preflight (sub-issue D) can use to fail fast.
- At minimum, the post-load resident measurement is logged in a real `mlxcel generate` run.
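The post-load measurement could be wired roughly as follows. This is a sketch, not the actual load path: `load_and_report`, `resident_after_load_line`, and `format_bytes` are hypothetical names, and the loader and the memory reader are passed in as closures so the example does not depend on the real FFI. The only behavior taken from this issue is that the measurement happens right after a successful load and produces a "resident after load: X" line.

```rust
/// Formats a byte count using binary units (hypothetical helper).
pub fn format_bytes(bytes: u64) -> String {
    const UNITS: [&str; 5] = ["B", "KiB", "MiB", "GiB", "TiB"];
    let mut value = bytes as f64;
    let mut unit = 0;
    // Divide down until the value fits the largest sensible unit.
    while value >= 1024.0 && unit < UNITS.len() - 1 {
        value /= 1024.0;
        unit += 1;
    }
    if unit == 0 {
        format!("{bytes} B")
    } else {
        format!("{value:.2} {}", UNITS[unit])
    }
}

/// Builds the log line; the raw byte count is kept for exact comparison
/// against the estimator.
pub fn resident_after_load_line(active_bytes: u64) -> String {
    format!(
        "resident after load: {} ({} bytes)",
        format_bytes(active_bytes),
        active_bytes
    )
}

/// Runs `load`, and only if it succeeds reads active memory and returns
/// the log line alongside the loaded value. A failed load is returned
/// unchanged and no measurement is taken.
pub fn load_and_report<T, E>(
    load: impl FnOnce() -> Result<T, E>,
    active_memory: impl FnOnce() -> u64,
) -> Result<(T, String), E> {
    let model = load()?;
    let line = resident_after_load_line(active_memory());
    Ok((model, line))
}
```

In the real load path the second argument would be the typed `active_memory` wrapper, and the returned line would go to whichever logger gates the estimator/inspect output.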

## Acceptance criteria
- Functions are callable from Rust and return plausible non-zero values on Apple Silicon after allocating an array (FFI smoke test).
- After `reset_peak_memory()` followed by a known allocation, `peak_memory()` reflects that allocation.
- Wired into the load path so a real `mlxcel generate` logs resident-after-load.
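The first two criteria could be expressed as a reusable check along these lines. Everything here is illustrative: the `MemoryStats` trait, `FakeAllocator`, and `check_peak_reflects` are hypothetical, and the fake assumes that resetting the peak sets it to zero and that each allocation raises it to at least the current active total. On Apple Silicon the same check would run against the real wrappers, with the closure allocating and evaluating an actual array.

```rust
/// Minimal view of the memory counters needed by the smoke test.
pub trait MemoryStats {
    fn active_memory(&self) -> u64;
    fn peak_memory(&self) -> u64;
    fn reset_peak_memory(&mut self);
}

/// In-process fake used in place of the MLX allocator.
#[derive(Default)]
pub struct FakeAllocator {
    active: u64,
    peak: u64,
}

impl FakeAllocator {
    pub fn alloc(&mut self, bytes: u64) {
        self.active += bytes;
        // Peak is the running maximum of active memory.
        self.peak = self.peak.max(self.active);
    }

    pub fn free(&mut self, bytes: u64) {
        self.active = self.active.saturating_sub(bytes);
    }
}

impl MemoryStats for FakeAllocator {
    fn active_memory(&self) -> u64 {
        self.active
    }
    fn peak_memory(&self) -> u64 {
        self.peak
    }
    // ASSUMPTION: a reset zeroes the peak counter.
    fn reset_peak_memory(&mut self) {
        self.peak = 0;
    }
}

/// Resets the peak, performs an allocation of `bytes`, then verifies that
/// active memory is non-zero and the peak is at least `bytes`.
pub fn check_peak_reflects<M: MemoryStats>(
    mem: &mut M,
    bytes: u64,
    allocate: impl FnOnce(&mut M, u64),
) -> Result<(), String> {
    mem.reset_peak_memory();
    allocate(mem, bytes);
    let active = mem.active_memory();
    let peak = mem.peak_memory();
    if active == 0 {
        return Err("active_memory() is zero after allocation".to_string());
    }
    if peak < bytes {
        return Err(format!("peak_memory() = {peak}, expected at least {bytes}"));
    }
    Ok(())
}
```

Checking `peak >= bytes` rather than equality leaves room for allocator rounding and for memory that was already active before the reset.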
