Skip to content

[On-Device-Training] Upgrade Flatbuffers to Support 2GB+ Checkpoints.#19770

Merged
AdamLouly merged 17 commits intomainfrom
adamlouly/update_ckpt_flatbuffers
Mar 14, 2024
Merged

[On-Device-Training] Upgrade Flatbuffers to Support 2GB+ Checkpoints.#19770
AdamLouly merged 17 commits intomainfrom
adamlouly/update_ckpt_flatbuffers

Conversation

@AdamLouly
Copy link
Contributor

Description

Modifications to support 2GB+ checkpoint & Upgrading Flatbuffers

Motivation and Context

This PR includes changes that will make ort handle 2GB+ checkpoints.
To do that we need to upgrade flatbuffers to 23.5.9 - google/flatbuffers#7945

  • Modified the commitHash and the hash for the new version
  • Removed the patch for rust generator's unused variable warning as it is no longer producing this - Check it out here
  • Updated the VerifyField calls with alignment values that were introduced in the new version.

@edgchen1
Copy link
Contributor

edgchen1 commented Mar 6, 2024

A few places that may need updating:

It is possible to use another flatc as well, e.g., from a separate installation. Note that ONNX Runtime uses
FlatBuffers 1.12.

#flatbuffers 1.11.0 does not have flatbuffers::IsOutRange, therefore we require 1.12.0+
FetchContent_Declare(
flatbuffers
URL ${DEP_URL_flatbuffers}
URL_HASH SHA1=${DEP_SHA1_flatbuffers}
PATCH_COMMAND ${ONNXRUNTIME_FLATBUFFERS_PATCH_COMMAND}
FIND_PACKAGE_ARGS 1.12.0...<2.0.0 NAMES Flatbuffers
)

"flatbuffers": "^1.12.0",

"flatbuffers": "^1.12.0",

@fs-eire would the onnxruntime-web flatbuffers dependency need to be updated as well?

@AdamLouly AdamLouly marked this pull request as ready for review March 6, 2024 19:47
@AdamLouly AdamLouly requested a review from a team as a code owner March 6, 2024 19:47
@AdamLouly AdamLouly requested review from a team and askhade March 6, 2024 19:47
@AdamLouly AdamLouly requested a review from skottmckay March 8, 2024 03:08
@sumitsays
Copy link
Contributor

From DirectML perspective, it looks good to me.

@fs-eire
Copy link
Contributor

fs-eire commented Mar 8, 2024

ONNX Runtime Web CI failed. This is because this PR does partially upgrade for flatbuffer for ort-web.

Please either revert this change for ort-web, or perform a full upgrade

  • Use npm install flatbuffers@^24.3.7 to install so that packages-lock.json will be refreshed with dependencies version
  • Follow instructions in js/web/onnxjs/ort-schema/flatbuffers/README.md to generate new flatbuffer files for v24.3.7 and update the files in js/web/onnxjs/ort-schema/flatbuffers/

We don't necessarily need this change for ort-web because only WebGL is using this flatbuffer library as dependencies, and we do not have plan to support on device training for WebGL backend. For Wasm/CPU/WebGPU, this library is not needed.

@AdamLouly
Copy link
Contributor Author

ONNX Runtime Web CI failed. This is because this PR does partially upgrade for flatbuffer for ort-web.

Please either revert this change for ort-web, or perform a full upgrade

  • Use npm install flatbuffers@^24.3.7 to install so that packages-lock.json will be refreshed with dependencies version
  • Follow instructions in js/web/onnxjs/ort-schema/flatbuffers/README.md to generate new flatbuffer files for v24.3.7 and update the files in js/web/onnxjs/ort-schema/flatbuffers/

We don't necessarily need this change for ort-web because only WebGL is using this flatbuffer library as dependencies, and we do not have plan to support on device training for WebGL backend. For Wasm/CPU/WebGPU, this library is not needed.

Thanks for clarifying, in this case we can just revert ort-web upgrade.

snnn
snnn previously approved these changes Mar 14, 2024
Copy link
Contributor

@snnn snnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you!

Copy link
Contributor

@snnn snnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In file included from /mnt/vss/_work/1/s/include/onnxruntime/core/graph/graph.h:24, you need to add some guards to disable external warnings. Like what we do in https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/graph/onnx_protobuf.h . Otherwise the web pipelines won't pass

Copy link
Contributor

@pranavsharma pranavsharma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewing for admin.

@AdamLouly
Copy link
Contributor Author

Thanks for the review everyone!

@AdamLouly AdamLouly merged commit 3255813 into main Mar 14, 2024
@AdamLouly AdamLouly deleted the adamlouly/update_ckpt_flatbuffers branch March 14, 2024 23:36
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this file generated?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants