Skip to content

Fix model serialization with external data in current directory#17311

Merged
tianleiwu merged 1 commit into
mainfrom
tlwu/fix_ort_load_onnx_with_ext_data_in_current_dir
Aug 28, 2023
Merged

Fix model serialization with external data in current directory#17311
tianleiwu merged 1 commit into
mainfrom
tlwu/fix_ort_load_onnx_with_ext_data_in_current_dir

Conversation

@tianleiwu
Copy link
Copy Markdown
Contributor

@tianleiwu tianleiwu commented Aug 28, 2023

Description

When original model has external data in current directory, saving the optimized model will raise File not found exception during looking for external data file under root directory "/". This fix will look under current directory for this case.

I manually tested an extra case and it is working: Original model with external data in root directory ("/"), and save optimized to current directory.

BTW, there is another bug found: when "session.optimized_model_external_initializers_min_size_in_bytes" is set a large value, some tensor is still pointed to the original external data file. Add a TODO in unit test for this bug. Possible solution: load external data into memory before saving model.

Motivation and Context

@kunal-vaishnavi
Copy link
Copy Markdown
Contributor

Is the default threshold value too large in these cases?

ORT transformer optimizer:

external_data_file_threshold: int = 1024,

sess_options.add_session_config_entry(
"session.optimized_model_external_initializers_min_size_in_bytes", str(external_data_file_threshold)
)

ORT test cases:

so.add_session_config_entry("session.optimized_model_external_initializers_min_size_in_bytes", "100")

so.add_session_config_entry("session.optimized_model_external_initializers_min_size_in_bytes", "100")

so.add_session_config_entry("session.optimized_model_external_initializers_min_size_in_bytes", "100")

@tianleiwu
Copy link
Copy Markdown
Contributor Author

tianleiwu commented Aug 28, 2023

Is the default threshold value too large in these cases?

The default value 1024 is fine since it is also used in onnx API:
https://github.com/onnx/onnx/blob/708a58abb7474a7886cace6a2bb7aa057ee0498a/onnx/__init__.py#L284

@tianleiwu tianleiwu merged commit ee9d046 into main Aug 28, 2023
@tianleiwu tianleiwu deleted the tlwu/fix_ort_load_onnx_with_ext_data_in_current_dir branch August 28, 2023 23:06
@natke natke added the triage:approved Approved for cherrypicks for release label Sep 1, 2023
Lafi7e pushed a commit that referenced this pull request Sep 1, 2023
When original model has external data in current directory, saving the
optimized model will raise File not found exception during looking for
external data file under root directory "/". This fix will look under
current directory for this case.

I manually tested an extra case and it is working: Original model with
external data in root directory ("/"), and save optimized to current
directory.

BTW, there is another bug found: when
"session.optimized_model_external_initializers_min_size_in_bytes" is set
a large value, some tensor is still pointed to the original external
data file. Add a TODO in unit test for this bug. Possible solution: load
external data into memory before saving model.
snnn pushed a commit that referenced this pull request Sep 7, 2023
Cherry-pick 2nd round for 1.16.0 release.
PR List:

#17201
#17270
#17311
#17315
#17320
#17326
#17355
#17227
#17380
#17386
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024
…osoft#17311)

When original model has external data in current directory, saving the
optimized model will raise File not found exception during looking for
external data file under root directory "/". This fix will look under
current directory for this case.

I manually tested an extra case and it is working: Original model with
external data in root directory ("/"), and save optimized to current
directory.

BTW, there is another bug found: when
"session.optimized_model_external_initializers_min_size_in_bytes" is set
a large value, some tensor is still pointed to the original external
data file. Add a TODO in unit test for this bug. Possible solution: load
external data into memory before saving model.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

triage:approved Approved for cherrypicks for release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants