Binary size skill #17988
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,72 @@ | ||
| --- | ||
| name: binary-size | ||
| description: Analyze and reduce ExecuTorch binary size. Use when investigating binary size, running size tests, or optimizing the runtime for size-constrained deployments. | ||
| --- | ||
|
|
||
| # Binary Size | ||
|
|
||
| ## Start from the `main` branch of executorch | ||
| Ask the user where the executorch repo is. | ||
|
|
||
| ```bash | ||
| git checkout main && git pull | ||
| ``` | ||
|
|
||
| ## Build and measure baseline | ||
| ```bash | ||
| conda activate executorch | ||
| bash test/build_size_test.sh | ||
| strip -o /tmp/size_test_stripped cmake-out/test/size_test | ||
| strip -o /tmp/size_test_all_ops_stripped cmake-out/test/size_test_all_ops | ||
| ls -la /tmp/size_test_stripped /tmp/size_test_all_ops_stripped | ||
| ``` | ||
|
|
||
| Produces two binaries: | ||
| - `cmake-out/test/size_test` — ExecuTorch runtime without operator implementations | ||
| - `cmake-out/test/size_test_all_ops` — ExecuTorch runtime with portable ops | ||
|
|
||
| ## Analyze with bloaty | ||
| ```bash | ||
| bloaty cmake-out/test/size_test -d symbols -n 30 # by symbol | ||
| bloaty cmake-out/test/size_test -d sections # by ELF section | ||
| bloaty <after> -- <before> # diff two builds | ||
| nm -S --size-sort -r <binary> | head -30 # symbol sizes, largest first | ||
| strings <binary> | less # string literals in .rodata | ||
| ``` | ||
|
|
||
| Note: `bloaty -d compileunits` requires debug info (`-g`). The Release build does not include it. | ||
|
|
||
| ## Key build flags | ||
| Set by `test/build_size_test.sh`: | ||
| - `CMAKE_BUILD_TYPE=Release` | ||
| - `EXECUTORCH_OPTIMIZE_SIZE=ON` — enables `-Os`, `-fno-exceptions`, `-fno-rtti`, unwind table suppression | ||
| - `CXXFLAGS="-fno-exceptions -fno-rtti -Wall -Werror"` | ||
|
|
||
| ## Constraints | ||
| - Use **CMake** to build (not Buck) | ||
| - **C++17 minimum** language standard | ||
| - Must build on **GCC 9** (CI uses `executorch-ubuntu-22.04-gcc9-nopytorch`) and **Clang 12** — avoid compiler-specific flags or pragmas without version guards | ||
| - Do not regress existing functionality — run tests for modified files | ||
| - Do not change build flags in `build_size_test.sh` for size reductions | ||
| - Do not increase latency in the core runtime | ||
|
|
||
| ## Where to look for size reductions | ||
| - `.text`: look for large functions, template bloat, duplicate instantiations | ||
| - `.rodata`: verbose error messages, format strings, embedded file paths (`__FILE__`) | ||
| - `.eh_frame`: should already be suppressed when `EXECUTORCH_OPTIMIZE_SIZE=ON` | ||
| - Static init functions (`nm -S <binary> | grep GLOBAL__sub_I`): use `constexpr` constructors to constant-initialize static arrays | ||
| - Logging strings: `ET_LOG_ENABLED=0` in Release eliminates format strings; ensure it propagates to consumers via `PUBLIC` compile definitions on cmake targets | ||
| - Inline header functions: watch for compile-define mismatches between library and consumer TUs (e.g. `ET_LOG_ENABLED` set in library but not in consumer) | ||
|
|
||
| ## For each change | ||
| 1. Create a branch: `git checkout -b binary-size-<N>` | ||
| 2. Implement, rebuild, measure stripped sizes | ||
| 3. Create a separate PR — one logical change per PR | ||
| 4. Record results in `binary-size-<N>.md`: | ||
|
|
||
| | Binary | This change (N vs N-1) | Cumulative (N vs main) | | ||
| |---|---|---| | ||
| | `size_test` (stripped) | -X | -Y | | ||
| | `size_test_all_ops` (stripped) | -X | -Y | | ||
|
|
||
| 5. Update the CI size threshold in `.github/workflows/pull.yml` if sizes decrease | ||
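Steps 2 and 4 above come down to comparing stripped file sizes. A minimal sketch of that comparison follows, assuming both the baseline and the changed binary have already been stripped to files on disk. `size_delta` is a hypothetical helper, not something defined in the ExecuTorch repo, and the paths in the usage comment are placeholders.

```bash
# Hypothetical helper: print the size difference in bytes (after - before).
size_delta() {
  local before="$1" after="$2"
  local before_bytes after_bytes
  before_bytes=$(wc -c < "$before")
  after_bytes=$(wc -c < "$after")
  # Arithmetic expansion also tolerates the padding some wc builds emit.
  echo $(( after_bytes - before_bytes ))
}

# Usage (placeholder paths):
#   size_delta /tmp/size_test_stripped_main /tmp/size_test_stripped
```

A negative result means the change shrank the binary, which matches the `-X` / `-Y` convention in the results table.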
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -475,10 +475,8 @@ jobs: | |||||||||
| output=$(ls -la cmake-out/test/size_test) | ||||||||||
| arr=($output) | ||||||||||
| size=${arr[4]} | ||||||||||
|
Comment on lines 475 to 477
|
||||||||||
| - # threshold=48120 on devserver with gcc9 | ||||||||||
| - # todo(lfq): update once binary size is below 50kb. | ||||||||||
| - # Note: using gcc9-nopytorch container with pinned nightly PyTorch | ||||||||||
| - threshold="63785" | ||||||||||
| + # Current CI size: 48008 (gcc9-nopytorch, 2026-03-06) | ||||||||||
| + threshold="48500" | ||||||||||
|
Comment on lines +478 to +479
|
||||||||||
| - # Current CI size: 48008 (gcc9-nopytorch, 2026-03-06) | |
| - threshold="48500" | |
| + # Current CI size: 48008 (gcc9-nopytorch, 2026-03-06); leave ~2KB headroom to avoid CI flakiness | |
| + threshold="50000" |
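The trade-off in this suggestion (a tight threshold versus headroom) is easier to reason about with the check written out. This is a rough sketch of a size gate, not the workflow's actual code: the workflow parses `ls -la` output, whereas this version reads the byte count with `wc -c`, and `check_size` is a hypothetical name.

```bash
# Hypothetical size gate: fail when the binary exceeds the threshold (in bytes).
check_size() {
  local binary="$1" threshold="$2"
  local size
  # wc -c on stdin prints only the byte count; $(( )) strips any padding.
  size=$(( $(wc -c < "$binary") ))
  if [ "$size" -gt "$threshold" ]; then
    echo "FAIL: $binary is $size bytes, over threshold $threshold"
    return 1
  fi
  echo "OK: $binary is $size bytes, within threshold $threshold"
}

# Usage (path taken from the diff above; threshold is whichever value is adopted):
#   check_size cmake-out/test/size_test 50000
```

With the reported size of 48008 bytes, a threshold of 48500 leaves 492 bytes of slack, while 50000 leaves 1992 bytes.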
This “Key build flags” list is incomplete vs `test/build_size_test.sh` (it also sets `-Wno-int-in-bool-context` and `-DET_HAVE_PREAD=0` via `COMMON_CXXFLAGS`). Consider updating this section to reflect the full flags actually used, and clarify which flags come from `EXECUTORCH_OPTIMIZE_SIZE` vs explicit `CXXFLAGS`.