exir: add flatbuffer Program serialization for performance #16691
chizkiyahu wants to merge 1 commit into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16691
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit e89436b with merge base 3b16295.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
This PR introduces a FlatBuffers-based serializer to achieve significant performance improvements in program serialization (36.3× speedup, 61.6% memory reduction). The new serializer converts Python Program objects directly to binary format, bypassing the previous JSON intermediate step while maintaining robustness through a fallback mechanism.
Changes:
- New FlatBuffers serializer module at exir/_serialize/_flatbuffer_program.py with direct Python-to-binary conversion
- Updated default serialization path in _program.py to use the new FlatBuffer serializer with JSON fallback for robustness
- Comprehensive unit tests covering roundtrip serialization, path equivalence validation, and error handling
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| exir/_serialize/_flatbuffer_program.py | Implements direct Python-to-FlatBuffer serialization logic, avoiding JSON intermediate representation |
| exir/_serialize/_program.py | Integrates the new FlatBuffer serializer as the default path with JSON fallback for error cases |
| exir/_serialize/test/test_flatbuffer_program.py | Tests for the new serializer including roundtrip validation and alignment verification |
| exir/_serialize/test/TARGETS | Adds test target for the new FlatBuffer program tests |
| exir/_serialize/TARGETS | Registers the new _flatbuffer_program.py module in the build system |
|
@pytorchbot label ciflow/trunk |
|
To add these label(s) (ciflow/trunk) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page). This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows. |
Introduce the flatbuffer serializer in
exir/_serialize/_flatbuffer_program.py
and use it by default, delivering ~36.3x average speedup and ~61.6%
average memory reduction across ~600 models,
with JSON fallback and error logging for robustness.
Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
Change-Id: If3eda942162f99fb53bba739d4e97a14c60974fa
92918a7 to e89436b
|
@pytorchbot label "partner: arm" |
|
@pytorchbot label "release notes: exir" |
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
    builder: Any = flatbuffers.Builder(0)
The FlatBuffers Builder is initialized with size 0, which relies on automatic buffer growth. For large programs, this could cause multiple reallocations. Consider providing an initial size estimate based on program complexity or document this design decision.
Suggested change:

    - builder: Any = flatbuffers.Builder(0)
    + # Provide a non-zero initial size to reduce FlatBuffers buffer reallocations.
    + # This is a conservative heuristic based on program structure; the builder
    + # will still grow automatically if needed.
    + estimated_initial_size = 1024
    + estimated_initial_size += 256 * len(program.execution_plan)
    + estimated_initial_size += 512 * len(program.constant_buffer)
    + estimated_initial_size += 256 * len(program.backend_delegate_data)
    + estimated_initial_size += 256 * len(program.segments)
    + if program.mutable_data_segments is not None:
    +     estimated_initial_size += 256 * len(program.mutable_data_segments)
    + if program.named_data is not None:
    +     estimated_initial_size += 128 * len(program.named_data)
    + builder: Any = flatbuffers.Builder(estimated_initial_size)
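To see why the initial size can matter, here is a rough model of a builder that doubles its backing buffer whenever it runs out of space. The doubling policy and the `growth_steps` helper are assumptions made for illustration, not a statement about the exact behavior of the flatbuffers package; the point is only that every growth step copies the bytes written so far.

```python
def growth_steps(initial_size: int, needed_size: int) -> int:
    """Count doublings needed to grow from initial_size to at least needed_size.

    Assumes a doubling policy in which an empty buffer first grows to 1 byte.
    """
    size = initial_size
    steps = 0
    while size < needed_size:
        size = max(1, size * 2)
        steps += 1
    return steps


one_mib = 1 << 20
print(growth_steps(0, one_mib))     # 21 growth steps from an empty buffer
print(growth_steps(1024, one_mib))  # 10 growth steps from a 1 KiB estimate
```

Under this model a small non-zero estimate removes about half of the reallocations for a 1 MiB output, but the saved copies are the cheap early ones, so profiling is needed to tell whether the heuristic is worth its complexity.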
    except Exception as exc:
        logger.error(
            f"Failed to serialize Program to flatbuffer; trying JSON fallback due to: {exc}"
The error logging uses an f-string that directly includes the exception object. Consider using exception chaining with 'from exc' to preserve the full traceback when re-raising or handling errors, which would make debugging easier if the fallback also fails.
Suggested change:

    - except Exception as exc:
    -     logger.error(
    -         f"Failed to serialize Program to flatbuffer; trying JSON fallback due to: {exc}"
    + except Exception:
    +     logger.exception(
    +         "Failed to serialize Program to flatbuffer; trying JSON fallback."
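The practical difference between the two logging calls can be checked with the standard library alone. The sketch below uses a hypothetical logger name and message; it captures the output of `logger.error` with an f-string and of `logger.exception` for the same exception.

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("fallback_demo")  # hypothetical logger name
logger.setLevel(logging.ERROR)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

try:
    raise ValueError("bad field")
except Exception as exc:
    # Only the formatted message is recorded; the traceback is lost.
    logger.error(f"serialize failed: {exc}")
error_only = stream.getvalue()

# Reset the capture buffer before the second attempt.
stream.seek(0)
stream.truncate(0)

try:
    raise ValueError("bad field")
except Exception:
    # Still logged at ERROR level, but the active traceback is appended.
    logger.exception("serialize failed")
with_traceback = stream.getvalue()

print(error_only)
print(with_traceback)
```

`logger.exception` is meant to be called from inside an `except` block, which is the case in the fallback wrapper.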
    def _program_to_flatbuffer_with_fallback(
        program: Program,
        *,
        constant_tensor_alignment: int,
        delegate_alignment: Optional[int],
    ) -> _FlatbufferResult:
        """
        Serializes the Program into a FlatBuffer, with a JSON fallback for robustness.

        The FlatBuffer serialization path is the preferred fast path, offering
        significantly better runtime performance and lower memory usage in benchmarks.
        The JSON path is retained solely as a fallback to ensure robustness in cases
        where FlatBuffer serialization fails.
        """
        try:
            return _program_to_flatbuffer(
                program,
                constant_tensor_alignment=constant_tensor_alignment,
                delegate_alignment=delegate_alignment,
            )
        except Exception as exc:
            logger.error(
                f"Failed to serialize Program to flatbuffer; trying JSON fallback due to: {exc}"
            )
            return _program_json_to_flatbuffer(
                _program_to_json(program),
                constant_tensor_alignment=constant_tensor_alignment,
                delegate_alignment=delegate_alignment,
            )
The new fallback mechanism in _program_to_flatbuffer_with_fallback is not covered by tests. Consider adding a test that triggers the fallback path to ensure it works correctly when the direct flatbuffer serialization fails.
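A test for the fallback path could look roughly like the sketch below. All names here (`fast_serialize`, `slow_serialize`, `serialize_with_fallback`) are hypothetical stand-ins, and the serializers are passed in as parameters so the example is self-contained. In the real test one would more likely patch `_program_to_flatbuffer` with `unittest.mock.patch` and assert on the result of `_program_to_flatbuffer_with_fallback`.

```python
import unittest
from unittest import mock


def fast_serialize(program: bytes) -> bytes:
    """Stand-in for the direct flatbuffer path (hypothetical)."""
    return b"fast:" + program


def slow_serialize(program: bytes) -> bytes:
    """Stand-in for the JSON-based path (hypothetical)."""
    return b"slow:" + program


def serialize_with_fallback(program, fast=fast_serialize, slow=slow_serialize):
    # Same shape as the wrapper under review: try the fast path first,
    # and fall back to the slow path on any exception.
    try:
        return fast(program)
    except Exception:
        return slow(program)


class FallbackTest(unittest.TestCase):
    def test_fast_path_used_by_default(self):
        self.assertEqual(serialize_with_fallback(b"p"), b"fast:p")

    def test_fallback_on_failure(self):
        # A mock that raises forces the wrapper onto the fallback path.
        failing = mock.Mock(side_effect=RuntimeError("boom"))
        self.assertEqual(serialize_with_fallback(b"p", fast=failing), b"slow:p")
        failing.assert_called_once_with(b"p")
```

The cases can be run with `python -m unittest` once saved to a test module. A fuller version would also assert that the failure is logged, for example with `assertLogs`.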
    """Converts a Program dataclass into binary flatbuffer data.

    Unlike _program_json_to_flatbuffer(), this does not use JSON or invoke
    flatc to build the binary.
The documentation for _program_to_flatbuffer lacks important details about the parameters and return value. Consider documenting what constant_tensor_alignment and delegate_alignment control, their valid values (must be powers of 2), and what _FlatbufferResult contains.
Suggested change:

    - flatc to build the binary.
    + flatc to build the binary.
    +
    + Args:
    +     program: In-memory :class:`Program` instance to be serialized.
    +     constant_tensor_alignment: Optional byte alignment to use for
    +         constant tensor buffers in the generated flatbuffer. When
    +         provided, this value must be a positive power of two (for
    +         example, 1, 2, 4, 8, ...). If ``None``, an appropriate
    +         alignment is selected based on the schema and target
    +         platform defaults.
    +     delegate_alignment: Optional byte alignment to use for data
    +         associated with backend delegates (for example, delegate
    +         blobs or scratch buffers) in the generated flatbuffer. When
    +         provided, this value must be a positive power of two. If
    +         ``None``, an appropriate alignment is selected based on the
    +         schema and target platform defaults.
    +
    + Returns:
    +     _FlatbufferResult: A container with the serialized flatbuffer
    +         representation of ``program`` and associated metadata required
    +         for consumption, such as resource file information and
    +         schema-specific alignment details.
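The power-of-two requirement mentioned in the suggested docstring is easy to check and is what makes bit-mask padding arithmetic valid. The helpers below are hypothetical and only illustrate the constraint; they are not taken from the PR.

```python
def is_power_of_two(value: int) -> bool:
    """True for 1, 2, 4, 8, ...; False for zero, negatives, and other values."""
    return value > 0 and (value & (value - 1)) == 0


def padding_for(offset: int, alignment: int) -> int:
    """Bytes of padding needed so that offset becomes a multiple of alignment."""
    if not is_power_of_two(alignment):
        raise ValueError(f"alignment must be a positive power of two, got {alignment}")
    # For powers of two, (-offset) & (alignment - 1) equals (-offset) % alignment.
    return (-offset) & (alignment - 1)


print(padding_for(13, 16))  # 3 bytes of padding bring offset 13 up to 16
print(padding_for(32, 16))  # 0, the offset is already aligned
```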
Thanks @chizkiyahu for this optimization.
I'd like to explore another alternative.
It looks like exir/_serialize/_flatbuffer_program.py is hand-written, which means it has to be kept in sync with the schema.
One approach is to use flatc + autogeneration:
- Run flatc with the object API:

      flatc --python --gen-object-api -o exir/_serialize/generated/ program.fbs

  This emits exir/_serialize/generated/program_generated.py, which is checked in.
- A generator script (generate.py) that reads program.fbs and emits exir/_serialize/generated/convert_program.py (also checked in):

      def convert_program(val: Program) -> ProgramT:
          result = ProgramT()
          result.version = val.version
          result.executionPlan = [convert_execution_plan(x) for x in val.execution_plan]
          result.constantBuffer = [convert_buffer(x) for x in val.constant_buffer]
          ...
          return result

- A thin wrapper:

      def _program_to_flatbuffer(program: Program, ...) -> _FlatbufferResult:
          program_t = convert_program(program)
          builder = flatbuffers.Builder(0)
          program_t.Pack(builder)
          builder.Finish(...)
          return _FlatbufferResult(data=bytes(builder.Output()), ...)

- A CI check to make sure that (1) the generated files match the schema and (2) the output matches the JSON path.
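As a rough illustration of the generator idea, the sketch below emits converter source from a dataclass definition. It is deliberately simplified: it reads a hypothetical `DemoProgram` dataclass rather than program.fbs, copies fields directly instead of recursing into nested types, and follows the camelCase attribute names shown in the comment above. `snake_to_camel` and `emit_converter` are made-up names.

```python
import dataclasses
from typing import List


@dataclasses.dataclass
class DemoProgram:  # hypothetical stand-in, not the real exir Program
    version: int
    execution_plan: List[int]
    constant_buffer: List[bytes]


def snake_to_camel(name: str) -> str:
    """Convert snake_case to camelCase, e.g. execution_plan -> executionPlan."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def emit_converter(cls: type, target: str) -> str:
    """Emit Python source for a function that copies each dataclass field."""
    lines = [
        f"def convert_{cls.__name__.lower()}(val):",
        f"    result = {target}()",
    ]
    for field in dataclasses.fields(cls):
        lines.append(f"    result.{snake_to_camel(field.name)} = val.{field.name}")
    lines.append("    return result")
    return "\n".join(lines) + "\n"


print(emit_converter(DemoProgram, "DemoProgramT"))
```

A CI job could regenerate this output and fail when it differs from the checked-in file, which is one way to implement the first check in the list.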
|
@zingo I no longer work on this project, best of luck! |
|
@mergennachin It includes the requested changes. Thanks! |
Title
Introduce flatbuffer serializer as default for program serialization
Summary
Add a FlatBuffers-based serializer at exir/_serialize/_flatbuffer_program.py and make it the default serializer for program artifacts. The new serializer achieves large efficiency gains across our model suite while retaining robustness via a JSON fallback and detailed error logging.
Before
python -> json -> binary
After
python -> binary
Performance
Measured across ~600 models: ~36.3x average speedup and ~61.6% average memory reduction.
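The PR does not show its benchmark script. One plausible way to obtain speedup and memory-reduction figures of this kind is sketched below with standard-library tools; the two serializers are toy stand-ins (one going through a JSON string, one writing bytes directly), and `measure` is a hypothetical helper.

```python
import json
import time
import tracemalloc
from typing import Callable, Tuple


def measure(fn: Callable[[], object]) -> Tuple[float, int]:
    """Return (elapsed seconds, peak traced bytes) for a single call to fn."""
    tracemalloc.start()
    start = time.perf_counter()
    fn()
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak


payload = list(range(10_000))


def via_json() -> bytes:
    # Stand-in for python -> json -> binary.
    return json.dumps(payload).encode()


def direct() -> bytes:
    # Stand-in for python -> binary.
    return b"".join(x.to_bytes(4, "little") for x in payload)


t_json, m_json = measure(via_json)
t_direct, m_direct = measure(direct)
# Speedup is old time / new time; memory reduction is 1 - new peak / old peak.
print(f"speedup: {t_json / t_direct:.1f}x, memory reduction: {1 - m_direct / m_json:.1%}")
```

Tracing allocations slows the timed call, so a more careful harness would time and trace in separate runs and average over repetitions and models.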
Tests
Unit tests added at: exir/_serialize/test/test_flatbuffer_program.py
cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai