Address external path references in loaded models#28709
Conversation
There was a problem hiding this comment.
Pull request overview
Hardens external data loading against malicious models by (1) adding a backstop path-validation call in GetExtDataFromTensorProto/LoadExtDataToTensorFromTensorProto so direct callers outside Graph::Resolve are also protected, and (2) explicitly rejecting ORT in-memory address markers (*/_ORT_MEM_ADDR_/*, */_ORT_NATIVE_ENDIAN_MEM_ADDR_/*) when they appear in dense initializers, sparse initializers, or sparse Constant-node attributes loaded from a protobuf. The in-memory markers are ORT-internal sentinels that cause reinterpret_cast<void*>(offset)-based dereferences; allowing them from an .onnx protobuf is a memory-safety hazard. ORT-format flatbuffer loading and the ModelEditor API use different Graph constructors and are unaffected.
Changes:
- Introduce
ValidateExternalFilePathForTensorand call it fromGetExtDataFromTensorProto/LoadExtDataToTensorFromTensorProto; rename/refactor the sparse-only validator toValidateSparseSubTensorExternalDataPath. - In
Graph::Graph(GraphProto ctor),ORT_ENFORCEagainst in-memory address markers on dense initializers, sparse initializers (values + indices), and Constant-node sparse-tensor attributes. - Add direct loader tests for absolute and directory-escaping paths, plus model-load tests that reject in-memory markers on dense and sparse initializers; update one optimizer test to expect
OnnxRuntimeExceptioninstead offilesystem_error.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| onnxruntime/core/framework/tensorprotoutils.cc | Adds ValidateExternalFilePathForTensor backstop and renames the sparse helper to ValidateSparseSubTensorExternalDataPath. |
| onnxruntime/core/graph/graph.cc | Rejects in-memory address markers on dense initializers, sparse initializers, and sparse Constant attributes in the GraphProto ctor. |
| onnxruntime/test/framework/tensorutils_test.cc | Adds direct GetExtDataFromTensorProto tests for absolute and escaping external paths. |
| onnxruntime/test/ir/graph_test.cc | Adds model-load tests verifying in-memory markers are rejected on dense and sparse (values/indices) initializers. |
| onnxruntime/test/optimizer/initializer_test.cc | Updates expected exception type for an invalid external directory to OnnxRuntimeException. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tianleiwu
left a comment
There was a problem hiding this comment.
Security Hardening Review
Strong defense-in-depth PR. The layered approach — Graph ctor as the gatekeeper for untrusted protobufs, with validation helpers enforcing path constraints at data-loading callsites — is well-designed.
Positives:
- Adding path validation inside
GetExtDataFromTensorProtoandLoadExtDataToTensorFromTensorProtoensures callers outsideGraph::Resolve(training checkpoints, custom-op initializers) are protected. - Rejecting in-memory markers at the protobuf entry point (Graph ctor) is the correct architectural choice — prevents attacker-controlled pointers from ever reaching data-loading code.
- Renaming to
ValidateSparseSubTensorExternalDataPathmakes scope and intent clearer. - The trust-boundary comments (ORT-format flatbuffer = trusted, protobuf = untrusted) are valuable for future maintainers.
- Test coverage is thorough: sparse values/indices markers, dense initializer markers, absolute paths, and escaping paths.
One minor suggestion inline.
This pull request strengthens security checks around loading external tensor data in ONNX Runtime, particularly to prevent malicious models from referencing unsafe file paths or in-memory address markers that could lead to arbitrary file access or unsafe memory dereferencing. The changes introduce stricter validation for external data paths and add explicit rejections for ORT in-memory address markers found in model protobufs, along with new and improved regression tests to verify this behavior.
Security hardening for external data loading:
ValidateExternalFilePathForTensorto enforce that external data paths are validated for all code paths loading external data (including those outsideGraph::Resolve), rejecting absolute or directory-escaping paths and passing through only trusted in-memory markers. This is now called inGetExtDataFromTensorProtoandLoadExtDataToTensorFromTensorPrototo ensure defense-in-depth. [1] [2]ValidateSparseSubTensorExternalDataPath, clarifying the handling of in-memory markers and ensuring only legitimate file paths are accepted.SparseTensorProtoToDenseTensorPrototo use the new sparse sub-tensor validation for both values and indices.Model loading and graph construction protections:
Graph::Graph, added explicit rejection of ORT in-memory address markers in sparse tensor attributes and initializers when loading from a protobuf, preventing attackers from crafting models that could cause unsafe memory access during sparse-to-dense conversion or initializer resolution. [1] [2] [3]Expanded and improved testing: