
exir: add flatbuffer Program serialization for performance #16691

Closed
chizkiyahu wants to merge 1 commit into pytorch:main from chizkiyahu:exir-flatbuffer-serialize-fastpath
Conversation

@chizkiyahu
Contributor

@chizkiyahu chizkiyahu commented Jan 19, 2026

Title

Introduce flatbuffer serializer as default for program serialization

Summary

Add a FlatBuffers-based serializer at exir/_serialize/_flatbuffer_program.py and make it the default serializer for program artifacts. The new serializer achieves large efficiency gains across our model suite while retaining robustness via a JSON fallback and detailed error logging.

Before

python -> json -> binary

After

python -> binary

Performance

Measured across ~600 models:

  • ~36.3× average speedup (serialize latency).
  • ~61.6% average reduction in memory usage during serialization.
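
The two figures above (serialize latency and memory use during serialization) can be measured with the standard library alone. The sketch below is a toy stand-in, not the PR's benchmark: `serialize_via_json` and `serialize_direct` are hypothetical functions that mimic the "python -> json -> binary" and "python -> binary" paths on a plain list of integers. Note that `tracemalloc` itself slows execution, so a real benchmark would probably time and trace in separate runs.

```python
import json
import struct
import time
import tracemalloc
from typing import Callable, List, Tuple


def serialize_via_json(values: List[int]) -> bytes:
    """Hypothetical two-step path: Python -> JSON text -> binary."""
    text = json.dumps(values)
    return struct.pack(f"<{len(values)}q", *json.loads(text))


def serialize_direct(values: List[int]) -> bytes:
    """Hypothetical one-step path: Python -> binary."""
    return struct.pack(f"<{len(values)}q", *values)


def measure(
    fn: Callable[[List[int]], bytes], values: List[int]
) -> Tuple[float, int, bytes]:
    """Return (elapsed seconds, peak traced bytes, output) for one call."""
    tracemalloc.start()
    start = time.perf_counter()
    out = fn(values)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak, out


if __name__ == "__main__":
    data = list(range(200_000))
    t_json, m_json, out_json = measure(serialize_via_json, data)
    t_direct, m_direct, out_direct = measure(serialize_direct, data)
    # Both paths must produce identical bytes for the comparison to be fair.
    assert out_json == out_direct
    print(f"speedup: {t_json / t_direct:.1f}x")
    print(f"peak memory reduction: {100 * (1 - m_direct / m_json):.1f}%")
```

Averaging such per-model ratios over a model suite would give numbers of the same kind as those reported above; the exact methodology used for the ~600 models is not stated in this thread.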

Tests

Unit tests added at:

  • exir/_serialize/test/test_flatbuffer_program.py

cc @freddan80 @per @zingo @oscarandersson8218 @digantdesai

Copilot AI review requested due to automatic review settings January 19, 2026 12:09
@pytorch-bot

pytorch-bot Bot commented Jan 19, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16691

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit e89436b with merge base 3b16295 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jan 19, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR introduces a FlatBuffers-based serializer to achieve significant performance improvements in program serialization (36.3× speedup, 61.6% memory reduction). The new serializer converts Python Program objects directly to binary format, bypassing the previous JSON intermediate step while maintaining robustness through a fallback mechanism.

Changes:

  • New FlatBuffers serializer module at exir/_serialize/_flatbuffer_program.py with direct Python-to-binary conversion
  • Updated default serialization path in _program.py to use the new FlatBuffer serializer with JSON fallback for robustness
  • Comprehensive unit tests covering roundtrip serialization, path equivalence validation, and error handling

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File | Description
exir/_serialize/_flatbuffer_program.py | Implements direct Python-to-FlatBuffer serialization logic, avoiding JSON intermediate representation
exir/_serialize/_program.py | Integrates the new FlatBuffer serializer as the default path with JSON fallback for error cases
exir/_serialize/test/test_flatbuffer_program.py | Tests for the new serializer including roundtrip validation and alignment verification
exir/_serialize/test/TARGETS | Adds test target for the new FlatBuffer program tests
exir/_serialize/TARGETS | Registers the new _flatbuffer_program.py module in the build system


@chizkiyahu
Contributor Author

@pytorchbot label ciflow/trunk

@pytorch-bot

pytorch-bot Bot commented Jan 19, 2026

To add these label(s) (ciflow/trunk) to the PR, please first approve the workflows that are awaiting approval (scroll to the bottom of this page).

This helps ensure we don't trigger CI on this PR until it is actually authorized to do so. Please ping one of the reviewers if you do not have access to approve and run workflows.

Introduce the flatbuffer serializer in
exir/_serialize/_flatbuffer_program.py and use it by default,
delivering ~36.3x average speedup and ~61.6% average memory reduction
across ~600 models, with JSON fallback and error logging for robustness.

Signed-off-by: Chizkiyahu Raful <chizkiyahu.raful@arm.com>
Change-Id: If3eda942162f99fb53bba739d4e97a14c60974fa
@chizkiyahu chizkiyahu force-pushed the exir-flatbuffer-serialize-fastpath branch from 92918a7 to e89436b Compare January 19, 2026 12:11
@chizkiyahu
Contributor Author

@pytorchbot label "partner: arm"

@pytorch-bot pytorch-bot Bot added the partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm label Jan 19, 2026
@chizkiyahu
Contributor Author

@pytorchbot label "release notes: exir"

@pytorch-bot pytorch-bot Bot added the release notes: exir Changes to any dialects and passes on these dialects, such as memory planning label Jan 19, 2026
@zingo
Collaborator

zingo commented Jan 19, 2026

Hi @SS-JIA / @swolchok, here is one more that is outside of the Arm folders and that we don't want to review ourselves :)

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.



Comment on lines +643 to +644
builder: Any = flatbuffers.Builder(0)

Copilot AI Jan 20, 2026

The FlatBuffers Builder is initialized with size 0, which relies on automatic buffer growth. For large programs, this could cause multiple reallocations. Consider providing an initial size estimate based on program complexity or document this design decision.

Suggested change
-    builder: Any = flatbuffers.Builder(0)
+    # Provide a non-zero initial size to reduce FlatBuffers buffer reallocations.
+    # This is a conservative heuristic based on program structure; the builder
+    # will still grow automatically if needed.
+    estimated_initial_size = 1024
+    estimated_initial_size += 256 * len(program.execution_plan)
+    estimated_initial_size += 512 * len(program.constant_buffer)
+    estimated_initial_size += 256 * len(program.backend_delegate_data)
+    estimated_initial_size += 256 * len(program.segments)
+    if program.mutable_data_segments is not None:
+        estimated_initial_size += 256 * len(program.mutable_data_segments)
+    if program.named_data is not None:
+        estimated_initial_size += 128 * len(program.named_data)
+    builder: Any = flatbuffers.Builder(estimated_initial_size)
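
To get a rough feel for how much the reallocation concern above matters, the sketch below counts growth steps for a buffer that doubles its capacity whenever it runs out of space. The doubling policy (starting from 1 byte when empty) is an assumption made for illustration; it is not taken from this PR and should be checked against the FlatBuffers Python source before drawing conclusions. `count_reallocations` is a hypothetical helper.

```python
def count_reallocations(initial_size: int, final_size: int) -> int:
    """Count how many growth steps a doubling buffer needs to hold final_size bytes."""
    if initial_size < 0 or final_size < 0:
        raise ValueError("sizes must be non-negative")
    capacity = initial_size
    grows = 0
    while capacity < final_size:
        # Assumed policy: an empty buffer grows to 1 byte, otherwise it doubles.
        capacity = 1 if capacity == 0 else capacity * 2
        grows += 1
    return grows


if __name__ == "__main__":
    one_mib = 1 << 20
    print(count_reallocations(0, one_mib))        # 21
    print(count_reallocations(1024, one_mib))     # 10
    print(count_reallocations(one_mib, one_mib))  # 0
```

Under this model the number of reallocations grows only logarithmically with the final size, so an initial-size heuristic saves a handful of growth steps rather than an unbounded number. Whether that is worth the extra code is a judgment call for the reviewers.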

Comment on lines +446 to +448
except Exception as exc:
logger.error(
f"Failed to serialize Program to flatbuffer; trying JSON fallback due to: {exc}"
Copilot AI Jan 20, 2026

The error logging uses an f-string that directly includes the exception object. Consider using exception chaining with 'from exc' to preserve the full traceback when re-raising or handling errors, which would make debugging easier if the fallback also fails.

Suggested change
-    except Exception as exc:
-        logger.error(
-            f"Failed to serialize Program to flatbuffer; trying JSON fallback due to: {exc}"
+    except Exception:
+        logger.exception(
+            "Failed to serialize Program to flatbuffer; trying JSON fallback."

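
The practical difference between the two logging calls in the suggestion above is that `logger.exception` appends the active traceback to the log record, while `logger.error` with an f-string records only the exception's message. The sketch below demonstrates this with the standard `logging` module; `log_failure` and the logger names are hypothetical and exist only for this illustration.

```python
import io
import logging


def log_failure(use_exception: bool) -> str:
    """Raise an error, log it one of two ways, and return the captured log text."""
    stream = io.StringIO()
    logger = logging.getLogger(f"serialize_demo.{use_exception}")
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        try:
            raise ValueError("bad field")
        except Exception as exc:
            if use_exception:
                # Logs at ERROR level and appends the current traceback.
                logger.exception("Failed to serialize; trying fallback.")
            else:
                # Logs only the formatted message, without a traceback.
                logger.error(f"Failed to serialize; trying fallback due to: {exc}")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


if __name__ == "__main__":
    print(log_failure(use_exception=False))
    print(log_failure(use_exception=True))
```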
Comment on lines +426 to +454
def _program_to_flatbuffer_with_fallback(
    program: Program,
    *,
    constant_tensor_alignment: int,
    delegate_alignment: Optional[int],
) -> _FlatbufferResult:
    """
    Serializes the Program into a FlatBuffer, with a JSON fallback for robustness.

    The FlatBuffer serialization path is the preferred fast path, offering
    significantly better runtime performance and lower memory usage in benchmarks.
    The JSON path is retained solely as a fallback to ensure robustness in cases
    where FlatBuffer serialization fails.
    """
    try:
        return _program_to_flatbuffer(
            program,
            constant_tensor_alignment=constant_tensor_alignment,
            delegate_alignment=delegate_alignment,
        )
    except Exception as exc:
        logger.error(
            f"Failed to serialize Program to flatbuffer; trying JSON fallback due to: {exc}"
        )
        return _program_json_to_flatbuffer(
            _program_to_json(program),
            constant_tensor_alignment=constant_tensor_alignment,
            delegate_alignment=delegate_alignment,
        )
Copilot AI Jan 20, 2026

The new fallback mechanism in _program_to_flatbuffer_with_fallback is not covered by tests. Consider adding a test that triggers the fallback path to ensure it works correctly when the direct flatbuffer serialization fails.
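
One way such a test could look is sketched below. It does not use the PR's real functions: `serialize_with_fallback` is a hypothetical, simplified wrapper that takes the fast and slow serializers as arguments so they can be replaced by `unittest.mock.Mock` objects. In the actual code base the equivalent would more likely patch `_program_to_flatbuffer` to raise, which is an assumption about how the module is structured.

```python
import logging
import unittest
from typing import Callable
from unittest import mock

logger = logging.getLogger("fallback_demo")
Serializer = Callable[[dict], bytes]


def serialize_with_fallback(program: dict, fast: Serializer, slow: Serializer) -> bytes:
    """Hypothetical wrapper: try the fast path, fall back to the slow path on any error."""
    try:
        return fast(program)
    except Exception:
        logger.exception("Fast path failed; trying fallback.")
        return slow(program)


class FallbackTest(unittest.TestCase):
    def test_fallback_used_when_fast_path_raises(self) -> None:
        fast = mock.Mock(side_effect=RuntimeError("boom"))
        slow = mock.Mock(return_value=b"slow-bytes")
        # assertLogs also verifies that the failure was actually logged.
        with self.assertLogs(logger, level="ERROR") as captured:
            result = serialize_with_fallback({"version": 0}, fast, slow)
        self.assertEqual(result, b"slow-bytes")
        fast.assert_called_once_with({"version": 0})
        slow.assert_called_once_with({"version": 0})
        self.assertIn("trying fallback", captured.output[0])

    def test_fast_path_used_when_it_succeeds(self) -> None:
        fast = mock.Mock(return_value=b"fast-bytes")
        slow = mock.Mock(return_value=b"slow-bytes")
        self.assertEqual(serialize_with_fallback({}, fast, slow), b"fast-bytes")
        slow.assert_not_called()


if __name__ == "__main__":
    unittest.main()
```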

"""Converts a Program dataclass into binary flatbuffer data.

Unlike _program_json_to_flatbuffer(), this does not use JSON or invoke
flatc to build the binary.
Copilot AI Jan 20, 2026

The documentation for _program_to_flatbuffer lacks important details about the parameters and return value. Consider documenting what constant_tensor_alignment and delegate_alignment control, their valid values (must be powers of 2), and what _FlatbufferResult contains.

Suggested change
-    flatc to build the binary.
+    flatc to build the binary.
+    Args:
+        program: In-memory :class:`Program` instance to be serialized.
+        constant_tensor_alignment: Optional byte alignment to use for
+            constant tensor buffers in the generated flatbuffer. When
+            provided, this value must be a positive power of two (for
+            example, 1, 2, 4, 8, ...). If ``None``, an appropriate
+            alignment is selected based on the schema and target
+            platform defaults.
+        delegate_alignment: Optional byte alignment to use for data
+            associated with backend delegates (for example, delegate
+            blobs or scratch buffers) in the generated flatbuffer. When
+            provided, this value must be a positive power of two. If
+            ``None``, an appropriate alignment is selected based on the
+            schema and target platform defaults.
+    Returns:
+        _FlatbufferResult: A container with the serialized flatbuffer
+            representation of ``program`` and associated metadata required
+            for consumption, such as resource file information and
+            schema-specific alignment details.
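
The power-of-two requirement mentioned in the suggested docstring is what makes alignment arithmetic reducible to bit masks. The sketch below shows the usual check and padding computation; the helper names are hypothetical and are not claimed to match anything in `exir/_serialize`.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ...; False for zero, negatives, and everything else."""
    return n > 0 and (n & (n - 1)) == 0


def padding_for(offset: int, alignment: int) -> int:
    """Number of padding bytes needed to move offset up to the next aligned boundary."""
    if not is_power_of_two(alignment):
        raise ValueError(f"alignment must be a positive power of two, got {alignment}")
    # For a power-of-two alignment, masking with (alignment - 1) equals modulo.
    return (-offset) & (alignment - 1)


def align_up(offset: int, alignment: int) -> int:
    """Smallest multiple of alignment that is >= offset."""
    return offset + padding_for(offset, alignment)


if __name__ == "__main__":
    print(padding_for(13, 16))  # 3
    print(align_up(13, 16))     # 16
    print(align_up(16, 16))     # 16
```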

@mergennachin mergennachin requested a review from lucylq January 21, 2026 02:39
Contributor

@mergennachin mergennachin left a comment

Thanks @chizkiyahu for this optimization.

I'd like to explore another alternative.

It looks like you have hand-written the exir/_serialize/_flatbuffer_program.py file, which means it needs to be kept in sync with the schema.

One approach is to use flatc + autogeneration.

flatc --python --gen-object-api -o exir/_serialize/generated/ program.fbs

This emits exir/_serialize/generated/program_generated.py, which is checked in.

  1. A generator script (generate.py) that reads program.fbs and emits exir/_serialize/generated/convert_program.py (also checked in):

      def convert_program(val: Program) -> ProgramT:
          result = ProgramT()
          result.version = val.version
          result.executionPlan = [convert_execution_plan(x) for x in val.execution_plan]
          result.constantBuffer = [convert_buffer(x) for x in val.constant_buffer]
          ...
          return result

  2. A thin wrapper:

      def _program_to_flatbuffer(program: Program, ...) -> _FlatbufferResult:
          program_t = convert_program(program)
          builder = flatbuffers.Builder(0)
          program_t.Pack(builder)
          builder.Finish(...)
          return _FlatbufferResult(data=bytes(builder.Output()), ...)

  3. A CI check to make sure that (1) the generated files match the schema and (2) the output matches the JSON path.
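
The CI check in the last step could be sketched roughly as follows. This is an illustration under assumptions, not the project's CI: `generated_file_is_current` compares a checked-in file against freshly generated text, `outputs_match` compares two serialization paths byte for byte, and the toy generator and serializers in `run_demo` are hypothetical placeholders for the real generate.py and serializer functions.

```python
import tempfile
from pathlib import Path
from typing import Callable


def generated_file_is_current(checked_in: Path, regenerate: Callable[[], str]) -> bool:
    """True if the checked-in file equals what the generator would emit now."""
    return checked_in.read_text(encoding="utf-8") == regenerate()


def outputs_match(
    program: object,
    candidate: Callable[[object], bytes],
    reference: Callable[[object], bytes],
) -> bool:
    """True if the candidate serializer produces the same bytes as the reference path."""
    return candidate(program) == reference(program)


def run_demo() -> bool:
    """Exercise both checks with toy stand-ins for the generator and serializers."""
    with tempfile.TemporaryDirectory() as tmp:
        checked_in = Path(tmp) / "convert_program.py"
        checked_in.write_text("VERSION = 1\n", encoding="utf-8")
        fresh = generated_file_is_current(checked_in, lambda: "VERSION = 1\n")
        stale = generated_file_is_current(checked_in, lambda: "VERSION = 2\n")
    same = outputs_match(
        {"version": 0},
        lambda p: repr(p).encode(),
        lambda p: repr(p).encode(),
    )
    return fresh and not stale and same


if __name__ == "__main__":
    print("checks passed" if run_demo() else "checks failed")
```

A byte-for-byte comparison is the strictest form of the second check. If the two paths can legitimately differ in padding or field order, a round-trip comparison of the deserialized programs may be a better fit.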

@swolchok
Contributor

@zingo I no longer work on this project, best of luck!

@chizkiyahu
Contributor Author

@mergennachin
I’ve opened a new PR: #17333

It includes the requested changes.

Thanks!


Labels

ciflow/trunk CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. partner: arm For backend delegation, kernels, demo, etc. from the 3rd-party partner, Arm release notes: exir Changes to any dialects and passes on these dialects, such as memory planning


6 participants