
Headless/no-GPU environments crash on MLX import (NSRangeException) instead of failing gracefully #3148

@jrp2014


Summary

In environments where no Metal device is visible (headless/sandboxed/virtualized/macOS automation sessions),
MLX initialization aborts the process with an uncaught Objective-C exception instead of returning
a recoverable Python error.

This makes downstream tooling fail hard, including quality/lint/test pipelines that do not need GPU execution.

Reproduction

  1. Run in a session with no visible Metal device.
  2. Execute:
     python -c "import mlx.core as mx; print(mx.default_device())"

(Equivalent failure also occurs during import mlx via dependency probes.)

Actual behavior

The process is terminated by SIGABRT (return code -6) after an uncaught exception similar to:

NSRangeException: -[__NSArray0 objectAtIndex:]: index 0 beyond bounds for empty array

Expected behavior

  • No hard abort.
  • Either:
    1. raise a clear Python exception (e.g., RuntimeError: No Metal device available), or
    2. allow a documented CPU/no-op mode for import-time checks.
  • Error path should be machine-detectable so CI tools can handle it gracefully.
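From the caller's side, the first option could look like the sketch below. It assumes the requested behavior exists, i.e. that importing MLX without a Metal device raises an ordinary Python exception; today the process aborts before any `except` clause can run. `RuntimeError` is the example from this report, not a confirmed MLX exception type, and `optional_import` is a hypothetical helper name.

```python
import importlib


def optional_import(name):
    """Import `name`, returning (module, None) or (None, reason).

    Assumption: a missing Metal device surfaces as RuntimeError (the example
    in this issue) rather than as a process abort. ImportError covers the
    case where the package is not installed at all.
    """
    try:
        return importlib.import_module(name), None
    except (ImportError, RuntimeError) as exc:
        # A recoverable failure: report it instead of crashing the caller.
        return None, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    mx, reason = optional_import("mlx.core")
    if mx is None:
        print(f"MLX unavailable, skipping GPU-dependent checks ({reason})")
```

The returned reason string gives CI tooling something machine-detectable to log or match on.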

Why this matters

Hard aborts break non-inference workflows (lint/type/test/packaging checks) when MLX is installed
but no Metal device is available. A recoverable error would allow callers to skip MLX-dependent
runtime tests without crashing the whole process.
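Until such an error exists, one possible workaround is to run the reproduction command in a child interpreter so that the abort cannot take down the calling process. This is a minimal sketch, not part of MLX: `probe` and `MLX_PROBE` are hypothetical names, and the signal reporting relies on the POSIX convention that `subprocess` returns a negative code when the child is killed by a signal.

```python
import signal
import subprocess
import sys

# The reproduction command from this report.
MLX_PROBE = "import mlx.core as mx; print(mx.default_device())"


def probe(code=MLX_PROBE, timeout=60):
    """Run `code` in a child interpreter and return (ok, detail)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out"
    if proc.returncode == 0:
        return True, proc.stdout.strip()
    if proc.returncode < 0:
        # POSIX: a negative return code means the child was killed by a signal.
        return False, f"killed by {signal.Signals(-proc.returncode).name}"
    return False, f"exited with status {proc.returncode}"


if __name__ == "__main__":
    ok, detail = probe()
    print("MLX usable:" if ok else "MLX not usable:", detail)
```

A test suite could call `probe()` once at collection time and skip MLX-dependent tests when it returns `False`, at the cost of one extra interpreter start.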

Suggested fix

  • Guard the zero-device path before indexing into device arrays. load_device() in mlx/backend/metal/device.cpp is the primary fix site (empty-device guard + graceful error path).
  • Convert this failure path to a typed Python exception instead of process termination.
  • Optionally provide an env flag to skip Metal probing at import time in CI/headless contexts.

The main code path is:

Related existing report: Issue #2691.
