Add a new operator attribute type ORT_OP_ATTR_BYTES to the ORT C API #25300

Open, wants to merge 3 commits into base: main

Conversation

wcy123
Contributor

@wcy123 wcy123 commented Jul 7, 2025

Description

Add a new operator attribute type ORT_OP_ATTR_BYTES to the ONNX
Runtime C API.

Motivation and Context

PR #24887 allows plugin EPs to interface with ORT through a binary-stable interface.

It is an important feature for a plugin EP to generate an EP context model as specified by
https://onnxruntime.ai/docs/execution-providers/EP-Context-Design.html

The EP needs to read and write the ep_cache_context attribute of type ORT_OP_ATTR_STRING as specified by

.Attr(
    "ep_cache_context",
    "payload of the execution provider context if embed_mode=1, or path to the context file if embed_mode=0.",
    AttributeProto::STRING,
    OPTIONAL_VALUE)

The current implementation of ReadOpAttr treats an attribute of type ORT_OP_ATTR_STRING as a null-terminated string.

case OrtOpAttrType::ORT_OP_ATTR_STRING: {
  const auto& s = attr->s();
  if (len < s.size() + 1) {
    ret = OrtApis::CreateStatus(OrtErrorCode::ORT_INVALID_ARGUMENT,
                                "Size of data not large enough to hold the string.");
  } else {
    char* output_c = reinterpret_cast<char*>(data);
    for (char c : s) {
      *output_c++ = c;
    }
    *output_c = '\0';
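
To illustrate why the terminator is a problem for binary payloads, here is a minimal standalone sketch. It is not ORT code: `ReadAsCString` and `VisibleLengthAsCString` are hypothetical names that only mimic the copy-then-terminate behavior of the excerpt above, and the sketch assumes the caller recovers the length from the terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the ORT_OP_ATTR_STRING branch above:
// copies the payload and appends '\0'. Returns false if the buffer is too small.
bool ReadAsCString(const std::string& payload, char* data, std::size_t len) {
  if (len < payload.size() + 1) {
    return false;
  }
  std::memcpy(data, payload.data(), payload.size());
  data[payload.size()] = '\0';
  return true;
}

// Length seen by a caller that relies on the terminator (strlen) to find the end.
std::size_t VisibleLengthAsCString(const std::string& payload) {
  std::vector<char> buffer(payload.size() + 1);
  const bool ok = ReadAsCString(payload, buffer.data(), buffer.size());
  assert(ok);
  (void)ok;
  // strlen stops at the first zero byte, which may sit inside the payload.
  return std::strlen(buffer.data());
}
```

With a 7-byte payload that contains a zero byte at offset 4 (for example an ELF-like header), `VisibleLengthAsCString` reports 4, so a consumer that trusts the terminator would silently lose the rest of the data.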

It is very common for ep_cache_context to be a sequence of bytes, as specified by ONNX
https://github.com/ankane/onnxruntime-1/blob/95843a5dbc3100062be88bcb0d06fd36877f3f77/onnxruntime/core/protobuf/onnx-ml.proto#L140

This commit adds a new operator attribute type ORT_OP_ATTR_BYTES to the ONNX Runtime C API, which allows plugin EPs to read and write ep_cache_context as a sequence of bytes.
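
As a rough sketch of what length-based reading looks like (this is not the code in this PR; `ReadAsBytes` and `ReadAll` are hypothetical names, and the "query size, then read" convention is an assumption made for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical reader, NOT the implementation in this PR: copies the payload
// verbatim, appends no terminator, and always reports the required size.
bool ReadAsBytes(const std::string& payload, void* data, std::size_t len,
                 std::size_t* out_size) {
  *out_size = payload.size();
  if (len < payload.size()) {
    return false;  // buffer too small; the caller can retry with *out_size bytes
  }
  if (!payload.empty()) {
    std::memcpy(data, payload.data(), payload.size());
  }
  return true;
}

// Two-call pattern: ask for the size first, then read into an exact-size buffer.
std::vector<unsigned char> ReadAll(const std::string& payload) {
  std::size_t needed = 0;
  ReadAsBytes(payload, nullptr, 0, &needed);
  std::vector<unsigned char> buffer(needed);
  const bool ok = ReadAsBytes(payload, buffer.data(), buffer.size(), &needed);
  assert(ok);
  (void)ok;
  return buffer;
}
```

Because the length travels separately from the data, embedded zero bytes survive the round trip and the buffer needs no extra byte for a terminator.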

@wcy123 wcy123 changed the title from "Add a new operator attribute type ORT_OP_ATTR_BYTES to the ONNX" to "Add a new operator attribute type ORT_OP_ATTR_BYTES to the ORT C API" on Jul 7, 2025
@HectorSVC HectorSVC requested review from Copilot and skottmckay July 7, 2025 16:08
@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

Azure Pipelines successfully started running 5 pipeline(s).

@HectorSVC HectorSVC requested a review from adrianlizarraga July 7, 2025 16:10
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Adds support for a new ORT_OP_ATTR_BYTES attribute type in the ONNX Runtime C API, enabling plugin execution providers to read and write raw byte sequences for operator attributes.

  • Introduces ORT_OP_ATTR_BYTES to the OrtOpAttrType enum.
  • Updates CreateOpAttr (in standalone_op_invoker.cc) to serialize raw bytes into the AttributeProto.
  • Extends ReadOpAttr (in custom_ops.cc) to deserialize byte attributes back into a caller‐provided buffer.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| include/onnxruntime/core/session/onnxruntime_c_api.h | Added new enum value ORT_OP_ATTR_BYTES. |
| onnxruntime/core/session/standalone_op_invoker.cc | Handled ORT_OP_ATTR_BYTES in attribute creation. |
| onnxruntime/core/session/custom_ops.cc | Implemented reading logic for ORT_OP_ATTR_BYTES. |

Comments suppressed due to low confidence (2)

include/onnxruntime/core/session/onnxruntime_c_api.h:275

  • Add a brief comment above this enum entry to explain that ORT_OP_ATTR_BYTES represents raw binary data rather than a null‐terminated string.
  ORT_OP_ATTR_BYTES,

onnxruntime/core/session/standalone_op_invoker.cc:351

  • Add or extend unit tests to cover the new ORT_OP_ATTR_BYTES branch in both writer (CreateOpAttr) and reader (ReadOpAttr) to validate correct handling of raw byte sequences.
    case OrtOpAttrType::ORT_OP_ATTR_BYTES:

@adrianlizarraga
Contributor

Hi, I don't think ORT_OP_ATTR_BYTES corresponds to an ONNX AttributeType: https://github.com/onnx/onnx/blob/09dcfe787e85070f9057d566c0cba4646e435cc0/onnx/onnx.proto#L132

Is there a reason why we can't require the EPContext ep_cache_context to always be null-terminated as specified by the ONNX spec for string attributes?

@wcy123
Contributor Author

wcy123 commented Jul 8, 2025

> Hi, I don't think ORT_OP_ATTR_BYTES corresponds to an ONNX AttributeType: https://github.com/onnx/onnx/blob/09dcfe787e85070f9057d566c0cba4646e435cc0/onnx/onnx.proto#L132
>
> Is there a reason why we can't require the EPContext ep_cache_context to always be null-terminated as specified by the ONNX spec for string attributes?

Thank you for the clarification. I noticed that the type of the s field in onnx.proto is bytes, but the comment describes it as a UTF-8 string. This aligns with the ONNX IR documentation, which also states that it should be a UTF-8 string.

However, in practice, the ep_cache_context attribute is used to store artifacts from AI graph compilers, which are typically in binary formats such as tar or ELF. This is the case for the VitisAI EP, and I believe other EPs likely follow a similar pattern—please correct me if I’m mistaken.

To work around this mismatch, we could base64-encode the binary data to ensure it conforms to UTF-8, though this would increase the size of the serialized model.
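
To put a number on that size increase: standard padded base64 emits 4 output characters for every 3 input bytes, so the encoded payload is about one third larger. A small sketch of the arithmetic (the helper names are made up for this example):

```cpp
#include <cassert>
#include <cstddef>

// Size of standard padded base64 output for n input bytes: 4 * ceil(n / 3).
constexpr std::size_t Base64EncodedSize(std::size_t n) {
  return 4 * ((n + 2) / 3);
}

// Relative growth of the payload after encoding, e.g. about 0.33 for large n.
double Base64Overhead(std::size_t n) {
  assert(n > 0);
  return static_cast<double>(Base64EncodedSize(n)) / static_cast<double>(n) - 1.0;
}
```

For a 1 MiB artifact (1048576 bytes), `Base64EncodedSize` gives 1398104 bytes, before any line breaks a particular encoder might add.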

If ep_cache_context must strictly be a UTF-8 string, then models compiled with the previous VitisAI EP (which stored raw binary) would no longer be compatible and would require updates.

Would it be possible to clarify whether s is intended to support arbitrary binary data, or if we should enforce UTF-8 encoding going forward?

If we look at the sample model in EP-Context-Design.html, it seems that ep_cache_context is a sequence of bytes, not a UTF-8 string.

wcy123 and others added 3 commits July 7, 2025 19:23
Add a new operator attribute type ORT_OP_ATTR_BYTES to the ONNX
Runtime C API.

PR microsoft#24887 allows plugin-EPs to interface with ORT using a binary stable
interface.

It is an important feature for a plugin EP to generate an EP context
model as specified by
https://onnxruntime.ai/docs/execution-providers/EP-Context-Design.html

The EP needs to read and write the `ep_cache_context` attribute of type
`ORT_OP_ATTR_STRING` as specified by
https://github.com/microsoft/onnxruntime/blob/5fdd4e4f2a2b6705a9a49a378a3b3496805067ee/onnxruntime/core/graph/contrib_ops/contrib_defs.cc#L3301-L3305

The current implementation of `ReadOpAttr` treats an attribute of type
`ORT_OP_ATTR_STRING` as a null-terminated string.
https://github.com/microsoft/onnxruntime/blob/5fdd4e4f2a2b6705a9a49a378a3b3496805067ee/onnxruntime/core/session/custom_ops.cc#L437

It is very common for `ep_cache_context` to be a sequence of bytes, as
specified by ONNX
https://github.com/ankane/onnxruntime-1/blob/95843a5dbc3100062be88bcb0d06fd36877f3f77/onnxruntime/core/protobuf/onnx-ml.proto#L148

This commit adds a new operator attribute type `ORT_OP_ATTR_BYTES` to
the ONNX Runtime C API, which allows plugin EPs to read and write
`ep_cache_context` as a sequence of bytes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@wcy123 wcy123 force-pushed the add-new-op-attr-type-ORT_OP_ATTR_BYTES branch from 53f9099 to ae7c541 Compare July 8, 2025 02:23
@skottmckay
Contributor

Does it have to be an attribute vs. say a uint8_t tensor input with a constant initializer providing the data?

@wcy123
Contributor Author

wcy123 commented Jul 8, 2025

> Does it have to be an attribute vs. say a uint8_t tensor input with a constant initializer providing the data?

According to

.Attr(
    "ep_cache_context",
    "payload of the execution provider context if embed_mode=1, or path to the context file if embed_mode=0.",
    AttributeProto::STRING,
    OPTIONAL_VALUE)

I believe it should be a string attribute, even though the proto type of the field is bytes rather than string. The inputs of the EPContext node are the inputs of the fused node.

  RETURN_IF_ERROR(model_editor_api.CreateNode("EPContext", "com.microsoft", fused_node_name,
                                              input_names.data(), input_names.size(),
                                              output_names.data(), output_names.size(),
                                              attributes.data(), attributes.size(),
                                              &ep_context_nodes[i]));
}
