
[IR] Implement __buffer__ for tensors #2241

Draft · wants to merge 22 commits into main

Conversation

@justinchuby (Collaborator) commented Apr 29, 2025

Users can now call memoryview() on tensors and avoid a copy when saving tensors to external data. The TorchTensor class for PyTorch tensors gets this behavior automatically because it inherits from the TensorBase class.

Update: I was able to save some memory, but the overall time it takes to save seems to be longer. Needs more investigation.

Before | After

[image: before/after comparison]
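
As a rough illustration of the mechanism (a minimal sketch under assumed names, not the PR's actual code): on Python 3.12+, a class that implements __buffer__ (PEP 688) can be passed to memoryview() directly.

# Minimal sketch (assumed names, not the onnxscript implementation): exposing a
# tensor's data through the Python-level buffer protocol (PEP 688, Python 3.12+)
# so that memoryview(tensor) yields a view of the underlying array without a copy.
import numpy as np


class SketchTensorBase:
    def numpy(self) -> np.ndarray:
        # Subclasses return their data as a numpy array.
        raise NotImplementedError

    def __buffer__(self, flags: int, /) -> memoryview:
        array = self.numpy()
        if not array.flags.c_contiguous:
            # A copy is unavoidable here; contiguous inputs stay zero-copy.
            array = np.ascontiguousarray(array)
        # Delegate to numpy's own buffer support (ndarray.__buffer__ exists on 3.12+).
        return array.__buffer__(flags)


class SketchTensor(SketchTensorBase):
    def __init__(self, array: np.ndarray) -> None:
        self._array = array

    def numpy(self) -> np.ndarray:
        return self._array


tensor = SketchTensor(np.arange(4, dtype=np.float32))
with memoryview(tensor) as view:  # no copy of the float32 data
    assert view.nbytes == 16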

@Copilot Copilot AI (Contributor) left a comment

Pull Request Overview

This PR adds a new tofile method to enable writing tensor content as bytes to a file-like object. It updates the protocol definition in _protocols.py and provides concrete implementations in _core.py for tensors with different data layouts.

  • Added the tofile method definition in the tensor protocol.
  • Implemented two tofile methods in _core.py: one for invoking tobytes() for specially handled dtypes and another for writing raw byte content.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
onnxscript/ir/_protocols.py | Added tofile method signature to the tensor protocol.
onnxscript/ir/_core.py | Implemented tofile methods to handle tensor serialization, including handling of non-native dtypes and raw data writing.
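
A hedged sketch of what such a protocol addition could look like (names and signatures here are illustrative assumptions; the actual definitions are in the files listed above):

# Illustrative sketch only; the real protocol lives in onnxscript/ir/_protocols.py.
from typing import BinaryIO, Protocol


class TensorLike(Protocol):
    def tobytes(self) -> bytes:
        """Return the tensor content as little-endian bytes."""
        ...

    def tofile(self, file: BinaryIO) -> None:
        """Write the tensor content as bytes to an open binary file object."""
        ...


class RawDataTensor:
    """Sketch implementation that already holds serialized raw bytes."""

    def __init__(self, raw: bytes) -> None:
        self._raw = raw

    def tobytes(self) -> bytes:
        return self._raw

    def tofile(self, file: BinaryIO) -> None:
        # Raw content can be streamed out without re-encoding.
        file.write(self._raw)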

codecov bot commented Apr 29, 2025

❌ 8 Tests Failed:

Tests completed: 5859 | Failed: 8 | Passed: 5851 | Skipped: 3238
View the top 3 failed test(s) by shortest run time
onnxscript.ir._core_test.TensorTest::test_init_with_non_native_numpy_dtype_8_float4e2m1
Stack Traces | 0.002s run time
..../test/lib/python3.12.../site-packages/parameterized/parameterized.py:620: in standalone_func
    return func(*(a + p.args), **p.kwargs, **kw)
onnxscript/ir/_core_test.py:65: in test_init_with_non_native_numpy_dtype
    np.testing.assert_array_equal(tensor, array.view(dtype.numpy()))
E   AssertionError: 
E   Arrays are not equal
E   
E   (shapes (1,), (2,) mismatch)
E    ACTUAL: array([49], dtype=uint8)
E    DESIRED: array([0.5, 1.5], dtype='float4_e2m1fn')
onnxscript.ir._io_test.IOFunctionsTest::test_save_with_external_data_does_not_modify_model
Stack Traces | 0.002s run time
onnxscript/ir/_io_test.py:80: in test_save_with_external_data_does_not_modify_model
    _io.save(model, path, external_data=external_data_file, size_threshold_bytes=0)
onnxscript/ir/_io.py:84: in save
    model = _external_data.unload_from_model(
onnxscript/ir/external_data.py:378: in unload_from_model
    external_tensors = convert_tensors_to_external(
onnxscript/ir/external_data.py:289: in convert_tensors_to_external
    _write_external_data(sorted_tensors, external_data_infos, path)
onnxscript/ir/external_data.py:180: in _write_external_data
    with memoryview(tensor) as view:
E   TypeError: memoryview: a bytes-like object is required, not 'Tensor'
onnxscript.ir._io_test.IOFunctionsTest::test_save_with_external_data_invalidates_obsolete_external_tensors
Stack Traces | 0.002s run time
onnxscript/ir/_io_test.py:118: in test_save_with_external_data_invalidates_obsolete_external_tensors
    _io.save(model, path, external_data=external_data_file, size_threshold_bytes=0)
onnxscript/ir/_io.py:84: in save
    model = _external_data.unload_from_model(
onnxscript/ir/external_data.py:378: in unload_from_model
    external_tensors = convert_tensors_to_external(
onnxscript/ir/external_data.py:289: in convert_tensors_to_external
    _write_external_data(sorted_tensors, external_data_infos, path)
onnxscript/ir/external_data.py:180: in _write_external_data
    with memoryview(tensor) as view:
E   TypeError: memoryview: a bytes-like object is required, not 'Tensor'
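
The repeated TypeError is simply memoryview rejecting an object that does not (yet) implement the buffer protocol; a standalone illustration (not onnxscript code):

# Standalone illustration of the failure above: memoryview() only accepts
# objects that implement the buffer protocol.
class PlainTensor:
    def tobytes(self) -> bytes:
        return b"\x00\x01"


try:
    memoryview(PlainTensor())
except TypeError as exc:
    print(exc)  # memoryview: a bytes-like object is required, not 'PlainTensor'

memoryview(PlainTensor().tobytes())  # fine: bytes implements the buffer protocol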


@justinchuby justinchuby marked this pull request as draft April 29, 2025 04:25
@justinchuby (Collaborator, Author) commented:

WIP:

before

Profile at /home/justinchu/dev/onnxruntime-genai/src/python/py/models/builder.py:3659

24.217 <module>  builder.py:1
└─ 24.217 create_model  builder.py:3440
   ├─ 20.500 LlamaModel.save_model  builder.py:517
   │  ├─ 19.334 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │  │  └─ 19.313 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:332
   │  │     └─ 19.312 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:229
   │  │        └─ 19.257 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │  │           ├─ 12.405 LazyTensor.tobytes  ../../onnxscript/onnxscript/ir/_core.py:974
   │  │           │  ├─ 11.173 TorchTensor.tobytes  ../../onnxscript/onnxscript/ir/tensor_adapters.py:101
   │  │           │  │  ├─ 6.319 Tensor.contiguous  <built-in>
   │  │           │  │  └─ 4.854 [self]  ../../onnxscript/onnxscript/ir/tensor_adapters.py
   │  │           │  ├─ 0.949 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:934
   │  │           │  │  └─ 0.949 tensor_func  builder.py:555
   │  │           │  │     └─ 0.770 <lambda>  builder.py:811
   │  │           │  │        └─ 0.770 Tensor.to  <built-in>
   │  │           │  └─ 0.283 [self]  ../../onnxscript/onnxscript/ir/_core.py
   │  │           ├─ 6.542 BufferedWriter.write  <built-in>
   │  │           └─ 0.259 [self]  ../../onnxscript/onnxscript/ir/external_data.py
   │  └─ 0.943 remove  <built-in>
   ├─ 2.672 LlamaModel.make_model  builder.py:2277
   │  └─ 2.487 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
   │        [83 frames hidden]  transformers, importlib, torch, netwo...
   └─ 0.811 LlamaModel.save_processing  builder.py:452
      └─ 0.571 AutoTokenizer.from_pretrained  transformers/models/auto/tokenization_auto.py:819
            [4 frames hidden]  transformers, <built-in>

@justinchuby (Collaborator, Author) commented:

For some reason, ndarray.tofile is really slow:

90.469 <module>  builder.py:1
└─ 90.469 create_model  builder.py:3440
   ├─ 86.475 LlamaModel.save_model  builder.py:517
   │  ├─ 85.124 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │  │  └─ 85.102 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:331
   │  │     └─ 85.101 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:228
   │  │        └─ 85.100 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │  │           └─ 84.740 LazyTensor.tofile  ../../onnxscript/onnxscript/ir/_core.py:126
   │  │              ├─ 83.748 ndarray.tofile  <built-in>
   │  │              └─ 0.992 LazyTensor.numpy  ../../onnxscript/onnxscript/ir/_core.py:1000
   │  │                 └─ 0.987 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:964
   │  │                    └─ 0.987 tensor_func  builder.py:555
   │  └─ 1.124 remove  <built-in>
   └─ 2.852 LlamaModel.make_model  builder.py:2277
      └─ 2.674 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
            [13 frames hidden]  transformers, importlib, torch
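
For comparison, the two write paths being profiled look roughly like this (an informal sketch, not a rigorous benchmark; file names are placeholders):

# Informal comparison of the two write paths profiled above: ndarray.tofile
# versus handing the array's existing buffer to BufferedWriter.write.
import time

import numpy as np

array = np.random.default_rng(0).random((4096, 4096))  # ~128 MB, C-contiguous

with open("via_tofile.bin", "wb") as f:
    start = time.perf_counter()
    array.tofile(f)  # numpy serializes the raw data to the file itself
    print("ndarray.tofile:      ", time.perf_counter() - start)

with open("via_memoryview.bin", "wb") as f:
    start = time.perf_counter()
    f.write(memoryview(array))  # write the existing buffer; no extra bytes object
    print("write(memoryview(a)):", time.perf_counter() - start)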

@justinchuby (Collaborator, Author) commented:

With memoryview

40.849 <module>  builder.py:1
└─ 40.849 create_model  builder.py:3440
   ├─ 37.499 LlamaModel.save_model  builder.py:517
   │  └─ 37.234 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │     └─ 37.213 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:331
   │        └─ 37.212 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:228
   │           └─ 37.208 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │              ├─ 30.889 LazyTensor.__buffer__  ../../onnxscript/onnxscript/ir/_core.py:114
   │              │  ├─ 29.473 ascontiguousarray  <built-in>
   │              │  └─ 1.085 LazyTensor.numpy  ../../onnxscript/onnxscript/ir/_core.py:1003
   │              │     └─ 1.083 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:967
   │              │        └─ 1.083 tensor_func  builder.py:555
   │              │           └─ 0.898 <lambda>  builder.py:811
   │              │              └─ 0.895 Tensor.to  <built-in>
   │              └─ 6.211 BufferedWriter.write  <built-in>
   ├─ 2.289 LlamaModel.make_model  builder.py:2277
   │  └─ 2.067 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
   │        [26 frames hidden]  transformers, importlib, torch
   └─ 0.810 LlamaModel.save_processing  builder.py:452
      └─ 0.602 AutoTokenizer.from_pretrained  transformers/models/auto/tokenization_auto.py:819
            [4 frames hidden]  transformers, <built-in>
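
The write loop implied by these frames is essentially the following (simplified sketch; the real _write_external_data in onnxscript/ir/external_data.py also handles offsets, padding, and tensor metadata):

# Simplified sketch of the memoryview-based write loop seen in the frames above.
from typing import BinaryIO, Iterable


def write_external_data(tensors: Iterable, data_file: BinaryIO) -> None:
    for tensor in tensors:
        # memoryview(tensor) goes through tensor.__buffer__, so no intermediate
        # bytes copy is created before the write; the context manager releases
        # the buffer promptly afterwards.
        with memoryview(tensor) as view:
            data_file.write(view)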

@justinchuby (Collaborator, Author) commented:

Calling contiguous() on the PyTorch tensor first:

14.954 <module>  builder.py:1
└─ 14.954 create_model  builder.py:3440
   ├─ 11.296 LlamaModel.save_model  builder.py:517
   │  ├─ 10.178 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │  │  └─ 10.150 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:331
   │  │     └─ 10.149 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:228
   │  │        └─ 10.148 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │  │           ├─ 6.712 BufferedWriter.write  <built-in>
   │  │           ├─ 3.135 LazyTensor.__buffer__  ../../onnxscript/onnxscript/ir/_core.py:114
   │  │           │  └─ 3.120 LazyTensor.numpy  ../../onnxscript/onnxscript/ir/_core.py:1003
   │  │           │     └─ 3.116 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:967
   │  │           │        └─ 3.116 tensor_func  builder.py:555
   │  │           │           ├─ 2.923 <lambda>  builder.py:811
   │  │           │           │  ├─ 2.227 Tensor.contiguous  <built-in>
   │  │           │           │  └─ 0.657 Tensor.to  <built-in>
   │  │           │           └─ 0.179 <lambda>  builder.py:1068
   │  │           │              └─ 0.179 Parameter.to  <built-in>
   │  │           └─ 0.301 [self]  ../../onnxscript/onnxscript/ir/external_data.py
   │  ├─ 0.909 remove  <built-in>
   │  └─ 0.204 collect  <built-in>
   ├─ 2.429 LlamaModel.make_model  builder.py:2277
   │  ├─ 2.237 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
   │  │     [83 frames hidden]  transformers, importlib, torch, trito...
   │  └─ 0.186 LlamaModel.make_layer  builder.py:2264
   │     └─ 0.168 LlamaModel.make_attention  builder.py:1632
   ├─ 0.938 LlamaModel.save_processing  builder.py:452
   │  ├─ 0.683 AutoTokenizer.from_pretrained  transformers/models/auto/tokenization_auto.py:819
   │  │     [4 frames hidden]  transformers, <built-in>
   │  └─ 0.255 PreTrainedTokenizerFast.save_pretrained  transformers/tokenization_utils_base.py:2356
   │        [2 frames hidden]  transformers, <built-in>
   └─ 0.201 AutoConfig.from_pretrained  transformers/models/auto/configuration_auto.py:1013
         [18 frames hidden]  transformers, huggingface_hub, reques...

@justinchuby justinchuby changed the title from "[IR] Implement tofile for tensors" to "[IR] Implement __buffer__ for tensors" Apr 29, 2025
@justinchuby justinchuby marked this pull request as ready for review April 29, 2025 04:55
@justinchuby justinchuby added the module: IR Intermediate representation label Apr 29, 2025
@justinchuby justinchuby requested a review from Copilot April 29, 2025 05:02
@Copilot Copilot AI (Contributor) left a comment

Pull Request Overview

This PR implements the buffer protocol for tensor classes to allow zero-copy access via memoryview, which helps improve performance when saving external tensor data. Key changes include:

  • Updating tensor_adapters.py to ensure tensors are contiguous by calling .contiguous() before conversion.
  • Replacing tobytes() with memoryview() in external_data.py for more efficient data access.
  • Adding __buffer__ implementations in both the TensorBase and ExternalTensor classes in _protocols.py and _core.py.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File | Description
onnxscript/ir/tensor_adapters.py | Enforces contiguous tensor format before numpy conversion to accelerate common usage.
onnxscript/ir/external_data.py | Uses memoryview(tensor) for writing external data and adjusts the release order accordingly.
onnxscript/ir/_protocols.py & _core.py | Introduces and implements the buffer protocol in tensor base classes for zero-copy support.
Comments suppressed due to low confidence (3)

onnxscript/ir/tensor_adapters.py:84

  • [nitpick] Add tests for the scenario where self.raw is already contiguous to verify that calling .contiguous() does not introduce an unnecessary copy or performance overhead.
self.raw: torch.Tensor = self.raw.contiguous()

onnxscript/ir/external_data.py:180

  • [nitpick] Ensure test coverage for writing external tensor data using memoryview, especially for tensors with non-trivial memory layouts or edge-case sizes.
data_file.write(memoryview(tensor))

onnxscript/ir/_core.py:131

  • [nitpick] Consider refactoring this code to remove the type ignore comment by ensuring that the array passed to memoryview is correctly typed, which may help prevent potential runtime issues.
return memoryview(array)  # type: ignore[arg-type]

@justinchuby (Collaborator, Author) commented Apr 29, 2025

Gemma, after:

Profile at /home/justinchu/dev/onnxruntime-genai/src/python/py/models/builder.py:3659

228.397 <module>  builder.py:1
└─ 228.397 create_model  builder.py:3440
   ├─ 220.427 Gemma3Model.save_model  builder.py:517
   │  └─ 218.916 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │     └─ 218.713 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:331
   │        └─ 218.707 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:228
   │           └─ 218.686 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │              ├─ 163.483 LazyTensor.__buffer__  ../../onnxscript/onnxscript/ir/_core.py:113
   │              │  └─ 163.434 LazyTensor.numpy  ../../onnxscript/onnxscript/ir/_core.py:1002
   │              │     └─ 163.417 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:966
   │              │        └─ 163.415 tensor_func  builder.py:555
   │              │           └─ 163.352 <lambda>  builder.py:811
   │              │              ├─ 149.903 Tensor.contiguous  <built-in>
   │              │              └─ 11.592 Tensor.to  <built-in>
   │              ├─ 52.574 BufferedWriter.write  <built-in>
   │              └─ 2.614 [self]  ../../onnxscript/onnxscript/ir/external_data.py
   ├─ 3.872 Gemma3Model.make_model  builder.py:2277
   │  └─ 3.249 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
   └─ 3.501 Gemma3Model.save_processing  builder.py:452
      └─ 2.687 AutoTokenizer.from_pretrained  transformers/models/auto/tokenization_auto.py:819
            [2 frames hidden]  transformers

before

208.994 <module>  builder.py:1
└─ 208.974 create_model  builder.py:3440
   ├─ 198.222 Gemma3Model.save_model  builder.py:517
   │  ├─ 195.303 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │  │  └─ 194.712 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:332
   │  │     └─ 194.692 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:229
   │  │        └─ 194.075 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │  │           ├─ 147.473 LazyTensor.tobytes  ../../onnxscript/onnxscript/ir/_core.py:974
   │  │           │  ├─ 108.904 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:934
   │  │           │  │  └─ 108.904 tensor_func  builder.py:555
   │  │           │  │     └─ 108.836 <lambda>  builder.py:811
   │  │           │  │        ├─ 95.261 Tensor.contiguous  <built-in>
   │  │           │  │        └─ 11.653 Tensor.to  <built-in>
   │  │           │  └─ 38.567 TorchTensor.tobytes  ../../onnxscript/onnxscript/ir/tensor_adapters.py:101
   │  │           ├─ 43.618 BufferedWriter.write  <built-in>
   │  │           └─ 2.668 [self]  ../../onnxscript/onnxscript/ir/external_data.py
   │  └─ 2.672 remove  <built-in>
   ├─ 5.073 Gemma3Model.make_model  builder.py:2277
   │  └─ 4.644 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
   │        [9 frames hidden]  transformers, importlib
   └─ 4.332 Gemma3Model.save_processing  builder.py:452
      └─ 3.677 AutoTokenizer.from_pretrained  transformers/models/auto/tokenization_auto.py:819
            [4 frames hidden]  transformers

@justinchuby justinchuby marked this pull request as draft April 29, 2025 05:22
@justinchuby (Collaborator, Author) commented Apr 29, 2025

Removing a call to contiguous():

156.458 <module>  builder.py:1
└─ 156.444 create_model  builder.py:3440
   ├─ 145.216 Gemma3Model.save_model  builder.py:517
   │  └─ 143.432 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │     └─ 142.773 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:330
   │        └─ 142.755 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:227
   │           └─ 142.395 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │              ├─ 86.049 LazyTensor.__buffer__  ../../onnxscript/onnxscript/ir/_core.py:113
   │              │  └─ 85.990 LazyTensor.numpy  ../../onnxscript/onnxscript/ir/_core.py:1002
   │              │     ├─ 65.119 TorchTensor.numpy  ../../onnxscript/onnxscript/ir/tensor_adapters.py:79
   │              │     │  └─ 65.111 Tensor.contiguous  <built-in>
   │              │     └─ 20.863 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:966
   │              │        └─ 20.857 tensor_func  builder.py:555
   │              │           └─ 20.808 <lambda>  builder.py:811
   │              │              └─ 20.778 Tensor.to  <built-in>
   │              ├─ 53.200 BufferedWriter.write  <built-in>
   │              └─ 3.141 [self]  ../../onnxscript/onnxscript/ir/external_data.py
   ├─ 5.426 Gemma3Model.make_model  builder.py:2277
   │  └─ 4.780 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
   │        [15 frames hidden]  transformers, importlib, torch
   └─ 4.430 Gemma3Model.save_processing  builder.py:452
      └─ 3.778 AutoTokenizer.from_pretrained  transformers/models/auto/tokenization_auto.py:819
            [5 frames hidden]  transformers, <built-in>

@justinchuby (Collaborator, Author) commented:

301.946 <module>  builder.py:1
└─ 301.946 create_model  builder.py:3440
   ├─ 293.662 Gemma3Model.save_model  builder.py:517
   │  ├─ 283.700 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │  │  └─ 283.546 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:331
   │  │     └─ 283.541 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:228
   │  │        └─ 283.519 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │  │           ├─ 166.181 LazyTensor.__buffer__  ../../onnxscript/onnxscript/ir/_core.py:113
   │  │           │  └─ 166.122 LazyTensor.numpy  ../../onnxscript/onnxscript/ir/_core.py:1005
   │  │           │     └─ 166.098 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:969
   │  │           │        └─ 166.097 tensor_func  builder.py:555
   │  │           │           └─ 166.019 <lambda>  builder.py:811
   │  │           │              ├─ 145.567 Tensor.contiguous  <built-in>
   │  │           │              └─ 18.497 Tensor.to  <built-in>
   │  │           └─ 114.390 BufferedWriter.write  <built-in>
   │  └─ 9.655 remove  <built-in>
   ├─ 3.822 Gemma3Model.make_model  builder.py:2277
   │  └─ 3.252 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
   └─ 3.100 Gemma3Model.save_processing  builder.py:452

@justinchuby (Collaborator, Author) commented:

Without the builder's contiguous() call; with a conditional TorchTensor contiguous() and a numpy contiguous() check:

181.161 <module>  builder.py:1
└─ 181.137 create_model  builder.py:3440
   ├─ 169.306 Gemma3Model.save_model  builder.py:517
   │  ├─ 166.007 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │  │  └─ 165.417 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:331
   │  │     └─ 165.407 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:228
   │  │        └─ 165.068 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │  │           ├─ 96.420 LazyTensor.__buffer__  ../../onnxscript/onnxscript/ir/_core.py:113
   │  │           │  └─ 96.343 LazyTensor.numpy  ../../onnxscript/onnxscript/ir/_core.py:1005
   │  │           │     ├─ 78.057 TorchTensor.numpy  ../../onnxscript/onnxscript/ir/tensor_adapters.py:79
   │  │           │     │  └─ 78.047 Tensor.contiguous  <built-in>
   │  │           │     └─ 18.281 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:969
   │  │           │        └─ 18.280 tensor_func  builder.py:555
   │  │           │           └─ 18.217 <lambda>  builder.py:811
   │  │           │              └─ 18.213 Tensor.to  <built-in>
   │  │           ├─ 65.679 BufferedWriter.write  <built-in>
   │  │           └─ 2.952 [self]  ../../onnxscript/onnxscript/ir/external_data.py
   │  └─ 2.976 remove  <built-in>
   ├─ 5.284 Gemma3Model.make_model  builder.py:2277
   │  └─ 4.782 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
   │        [13 frames hidden]  transformers, importlib, torch
   └─ 5.020 Gemma3Model.save_processing  builder.py:452
      └─ 4.356 AutoTokenizer.from_pretrained  transformers/models/auto/tokenization_auto.py:819
            [5 frames hidden]  transformers, <built-in>
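
A sketch of what the "conditional contiguous" idea might look like in the adapter (hypothetical code; the actual change is in onnxscript/ir/tensor_adapters.py):

# Hypothetical sketch of a conditional contiguous() call: only copy when the
# torch tensor is not already contiguous.
import numpy as np
import torch


class TorchTensorSketch:
    def __init__(self, raw: torch.Tensor) -> None:
        self.raw = raw

    def numpy(self) -> np.ndarray:
        tensor = self.raw
        if not tensor.is_contiguous():
            tensor = tensor.contiguous()  # copy only when actually needed
        # force=True also handles tensors that require grad or live off-CPU.
        return tensor.numpy(force=True)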

@justinchuby (Collaborator, Author) commented:

144.380 <module>  builder.py:1
└─ 144.363 create_model  builder.py:3440
   ├─ 134.189 Gemma3Model.save_model  builder.py:517
   │  ├─ 131.956 save  ../../onnxscript/onnxscript/ir/_io.py:40
   │  │  └─ 131.508 unload_from_model  ../../onnxscript/onnxscript/ir/external_data.py:332
   │  │     └─ 131.495 convert_tensors_to_external  ../../onnxscript/onnxscript/ir/external_data.py:229
   │  │        └─ 131.284 _write_external_data  ../../onnxscript/onnxscript/ir/external_data.py:157
   │  │           ├─ 78.125 LazyTensor.__buffer__  ../../onnxscript/onnxscript/ir/_core.py:113
   │  │           │  └─ 77.989 LazyTensor.numpy  ../../onnxscript/onnxscript/ir/_core.py:1005
   │  │           │     ├─ 63.879 TorchTensor.numpy  ../../onnxscript/onnxscript/ir/tensor_adapters.py:79
   │  │           │     │  └─ 63.866 Tensor.contiguous  <built-in>
   │  │           │     └─ 14.103 LazyTensor._evaluate  ../../onnxscript/onnxscript/ir/_core.py:969
   │  │           │        └─ 14.103 tensor_func  builder.py:555
   │  │           │           └─ 13.997 <lambda>  builder.py:811
   │  │           │              └─ 13.993 Tensor.to  <built-in>
   │  │           ├─ 50.105 BufferedWriter.write  <built-in>
   │  │           └─ 3.019 memoryview.__exit__  <built-in>
   │  └─ 1.938 remove  <built-in>
   ├─ 5.209 Gemma3Model.save_processing  builder.py:452
   │  └─ 4.430 AutoTokenizer.from_pretrained  transformers/models/auto/tokenization_auto.py:819
   │        [5 frames hidden]  transformers, <built-in>
   └─ 3.634 Gemma3Model.make_model  builder.py:2277
      └─ 3.054 AutoModelForCausalLM.from_pretrained  transformers/models/auto/auto_factory.py:452
            [9 frames hidden]  transformers, importlib

@justinchuby (Collaborator, Author) commented:

[image]

if not _IS_LITTLE_ENDIAN:
    # Need to copy because we are returning the underlying data directly
    array = array.view(array.dtype.newbyteorder("<")).copy()
return array.__buffer__(flags)

Check failure — Code scanning / lintrunner — MYPY/attr-defined Error: "ndarray[Any, Any]" has no attribute "__buffer__". To disable, use # type: ignore[attr-defined]
array = self.numpy()
assert array.data.c_contiguous, "Bug: The array should be contiguous"
assert self.dtype.itemsize == array.itemsize, "Bug: The itemsize should match"
return array.__buffer__(flags)

Check failure — Code scanning / lintrunner — MYPY/attr-defined Error: "ndarray[Any, dtype[Any]]" has no attribute "__buffer__". To disable, use # type: ignore[attr-defined]
Labels
module: IR Intermediate representation