[Bug] Misleading "Unable to read the model" error when intermediate IR is truncated during INT4 export #1707

@t8

Environment

Verified on the latest tooling supported by optimum-intel main:

  • optimum-intel: 1.27.0.dev0+190d59f (main HEAD, also released 1.27.0)
  • transformers: 5.0.0 (latest within the <5.1 constraint set by optimum-intel main)
  • openvino: 2026.1.0
  • nncf: 3.1.0
  • torch: 2.11.0
  • Python: 3.12.3, Linux x86_64

Description

When optimum-cli export openvino --weight-format int4 ... runs against a
large model and the temp directory backing the FP16 intermediate IR fills up,
the FP16 write is silently truncated by ENOSPC and the subsequent NNCF
int4 reload of the IR fails. The error never mentions disk space.

The exact OpenVINO error depends on how much of the XML/BIN pair survived
the truncation. Reproduced on the stack listed above by calling
OVBaseModel.load_model on truncated files:

| Truncation | Error |
|---|---|
| XML truncated to 0 bytes | Unable to read the model: "..." Please check that model format: ".xml" is supported and the model is correct. Available frontends: onnx jax tflite paddle ir tf pytorch |
| XML truncated mid-tag (e.g. 50%) | Check 'res.status == pugi::status_ok' failed at src/frontends/ir/src/input_model.cpp:222: Error parsing element attribute at offset N |
| XML almost intact (e.g. 99%) | Check 'res.status == pugi::status_ok' failed at src/frontends/ir/src/input_model.cpp:222: Start-end tags mismatch at offset N |
| BIN truncated (any non-zero loss) | Check 'm_weights->size() >= offset + size' failed at src/core/xml_util/src/xml_deserialize_util.cpp:907: Incorrect weights in bin file! |

In each case, the user has no signal that disk is the cause; they have to
inspect the temp directory by hand to discover that the XML/BIN pair was
truncated.
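
Until the message itself improves, the signatures in the table above can at least be matched mechanically. The following is a rough sketch, not part of optimum-intel: the substrings come from the errors listed above, while the function name and the labels are hypothetical. The same messages can also come from causes other than truncation, so a match is only a hint to check disk space.

```python
from typing import Optional

# Substrings copied from the error table above; the labels are our own wording.
TRUNCATION_SIGNATURES = [
    ("Unable to read the model", "XML empty or unreadable"),
    ("Error parsing element attribute", "XML cut off mid-tag"),
    ("Start-end tags mismatch", "XML cut off near the end"),
    ("Incorrect weights in bin file", "BIN shorter than the XML expects"),
]


def guess_truncation(error_text: str) -> Optional[str]:
    """Return a hint if the error text matches a known truncation signature."""
    for needle, label in TRUNCATION_SIGNATURES:
        if needle in error_text:
            return label
    return None  # no known signature; the failure may be unrelated to disk space
```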

Steps to reproduce

The realistic path is to run the actual export on a host with insufficient
free space at TMPDIR:

TMPDIR=/tmp \
optimum-cli export openvino \
    --model meta-llama/Llama-3.1-70B-Instruct \
    --weight-format int4 \
    --task text-generation-with-past \
    /path/to/output

A self-contained reproducer that does not require downloading a
70B-parameter model is attached at
upstream/04_optimum_intel_disk_full_error/reproducer.py
in our project repo. It exports a tiny model, truncates the resulting
model.xml (or .bin) at several levels, and calls
OVBaseModel.load_model to surface each error path.
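
For readers without access to that file, the truncation step can be approximated as below. This is a minimal sketch rather than the attached reproducer: truncated_copy is a hypothetical helper name, and it only simulates a short write by copying a file and cutting the copy down to a fraction of its size.

```python
import os
import shutil
from pathlib import Path


def truncated_copy(src: Path, dst: Path, keep_fraction: float) -> int:
    """Copy src to dst, then shrink dst to keep_fraction of the original size.

    Returns the size of dst in bytes after truncation.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0.0 and 1.0")
    shutil.copyfile(src, dst)
    new_size = int(src.stat().st_size * keep_fraction)
    os.truncate(dst, new_size)  # drops everything past new_size
    return new_size
```

Copying a known-good model.xml with keep_fraction set to 0.0, 0.5, and 0.99 should correspond to the three XML cases described earlier; the source file is left untouched so each level starts from the intact IR.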

Expected behavior

Either (a) the export aborts earlier with a clear "not enough free space at
TMPDIR" message, or (b) the read failure includes the file sizes and free
disk space at the path so the user can diagnose immediately.

Where this happens in the source

optimum/commands/export/openvino.py:455-462 creates a TemporaryDirectory()
that honours TMPDIR and falls back to /tmp. There is no free-space
precheck.
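
To see which directory an export would actually use and how much room it has, the standard library is enough. A small sketch, assuming the default TemporaryDirectory() behaviour (no explicit dir argument), in which case the parent directory is whatever tempfile.gettempdir() returns; the helper name is hypothetical:

```python
import shutil
import tempfile
from typing import Tuple


def temp_dir_free_bytes() -> Tuple[str, int]:
    """Return the default temp directory and its free space in bytes."""
    # gettempdir() consults TMPDIR, TEMP and TMP before platform defaults.
    # Its result is cached after the first call in a process.
    tmp = tempfile.gettempdir()
    return tmp, shutil.disk_usage(tmp).free


if __name__ == "__main__":
    path, free = temp_dir_free_bytes()
    print(f"temp dir: {path}, free: {free / 1e9:.1f} GB")
```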

The FP16 IR is then written by main_export through
optimum/exporters/openvino/convert.py (_save_model, then
openvino.save_model). _main_quantize re-reads it via
model_cls.from_pretrained(output, ...), which lands in
OVBaseModel.load_model at
optimum/intel/openvino/modeling_base.py:369-376
(line numbers against upstream main at HEAD 190d59f):

if isinstance(file_name, str):
    file_name = Path(file_name)
model = (
    core.read_model(file_name.resolve(), file_name.with_suffix(".bin").resolve())
    if not file_name.suffix == ".onnx"
    else convert_model(file_name)
)

There is no integrity check and no exception handling around core.read_model,
so any failure (truncation, permission, anything else) re-emits the verbatim
OpenVINO error.
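
As an illustration of what a cheap integrity check could look like, here is a sketch that runs before the read. It is not existing optimum-intel or OpenVINO code. It assumes the IR XML ends with a closing </net> root tag, which is why the tag is a parameter, and quick_ir_check is a hypothetical name. It cannot detect a partially truncated .bin, since that requires the weight offsets stored in the XML.

```python
from pathlib import Path
from typing import List


def quick_ir_check(xml_path: Path, closing_tag: bytes = b"</net>") -> List[str]:
    """Return a list of human-readable problems; an empty list means no obvious damage."""
    problems = []
    bin_path = xml_path.with_suffix(".bin")
    for path in (xml_path, bin_path):
        if not path.is_file():
            problems.append(f"{path.name}: missing")
        elif path.stat().st_size == 0:
            problems.append(f"{path.name}: empty (0 bytes)")

    # Only inspect the tail of the XML; the file may be large.
    if xml_path.is_file():
        size = xml_path.stat().st_size
        if size > 0:
            with xml_path.open("rb") as f:
                f.seek(max(0, size - 64))
                tail = f.read().rstrip()
            if not tail.endswith(closing_tag):
                problems.append(
                    f"{xml_path.name}: does not end with {closing_tag.decode()}"
                )
    return problems
```

A non-empty result would be a reasonable trigger for reporting file sizes and free disk space instead of passing the files to core.read_model.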

Proposed Solution

A minimal patch in OVBaseModel.load_model that catches the read failure and
adds disk-space + file-size context to the error:

import shutil
...
if isinstance(file_name, str):
    file_name = Path(file_name)
if file_name.suffix == ".onnx":
    model = convert_model(file_name)
else:
    xml_path = file_name.resolve()
    bin_path = file_name.with_suffix(".bin").resolve()
    try:
        model = core.read_model(xml_path, bin_path)
    except Exception as e:
        xml_size = xml_path.stat().st_size if xml_path.is_file() else -1
        bin_size = bin_path.stat().st_size if bin_path.is_file() else -1
        try:
            free_space = shutil.disk_usage(xml_path.parent).free
        except OSError:
            free_space = -1
        raise RuntimeError(
            f"OpenVINO failed to read {xml_path} "
            f"(xml size={xml_size} bytes, bin size={bin_size} bytes, "
            f"free space at parent dir={free_space} bytes). "
            f"If file sizes look truncated, the intermediate IR write may have "
            f"run out of disk space; check that TMPDIR (or the export output "
            f"directory) has sufficient free space. "
            f"Original error: {e}"
        ) from e

Happy to open a PR. The matching branch with this patch is at
t8/optimum-intel:improve-corrupt-ir-error-message
(commit 9f40fff,
+24 −5 in one file). Branch is rebased on main HEAD 190d59f.

A follow-up improvement (separate PR if the first is welcome) would be to
add a free-space precheck in optimum/commands/export/openvino.py shortly
after TemporaryDirectory() is created: log
shutil.disk_usage(temporary_directory.name).free and warn when it is less
than ~2× the model's HF-config-implied FP16 size.
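
A rough sketch of that precheck, under stated assumptions: the parameter count is supplied by the caller (how to derive it from the HF config is left open here), FP16 is taken as 2 bytes per parameter, and both function names are hypothetical. For scale, a 70B-parameter model comes to about 140 GB at FP16, so the 2× threshold would be about 280 GB.

```python
import shutil
from typing import Tuple


def fp16_size_bytes(num_parameters: int) -> int:
    """Approximate FP16 weight size: 2 bytes per parameter."""
    return num_parameters * 2


def has_enough_space(
    path: str, num_parameters: int, safety_factor: float = 2.0
) -> Tuple[bool, int, int]:
    """Compare free space at path against safety_factor times the FP16 size.

    Returns (ok, needed_bytes, free_bytes).
    """
    needed = int(fp16_size_bytes(num_parameters) * safety_factor)
    free = shutil.disk_usage(path).free
    return free >= needed, needed, free
```

Returning the numbers along with the verdict lets the caller decide whether to warn or abort, and lets the log line include both values.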
