Skip to content

[lld][Webassembly] Avoid a signed overflow on large sections#178287

Merged
sbc100 merged 1 commit intollvm:mainfrom
FatihBAKIR:wasm-ld-crash-fix
Feb 10, 2026
Merged

[lld][Webassembly] Avoid a signed overflow on large sections#178287
sbc100 merged 1 commit intollvm:mainfrom
FatihBAKIR:wasm-ld-crash-fix

Conversation

@FatihBAKIR
Copy link
Contributor

@FatihBAKIR FatihBAKIR commented Jan 27, 2026

wasm sections sizes are specified as u32s, and thus can be as large as 4GB. wasm-ld currently stores the offset into a section as an int32_t which overflows on large sections and results in a crash. This change makes it a int64_t to accommodate any valid wasm section and allow catching even larger sections instead of wrapping around.

This PR fixes the issue by storing the offset as a int64_t, as well as adding extra checks to handle un-encodeable sections to fail instead of producing garbage wasm binaries, and also adds lit tests to make sure it works. I confirmed the test fails on main but passes with this fix.

Fixes: #178286

@github-actions
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Jan 27, 2026

@llvm/pr-subscribers-lld-wasm

@llvm/pr-subscribers-lld

Author: Fatih BAKIR (FatihBAKIR)

Changes

wasm sections sizes are specified as u32s, and thus can be as large as 4GB. wasm-ld currently stores the offset into a section as an int32_t which overflows on large sections and results in a crash. This change makes it a uint32_t to accommodate any valid wasm section.

This PR fixes the issue by storing the offset as a uint32_t, and also adds a lit test to make sure it works. I confirmed the test fails on main but passes with this fix.

Fixes #178286


Full diff: https://github.com/llvm/llvm-project/pull/178287.diff

2 Files Affected:

  • (added) lld/test/wasm/large-section.test (+34)
  • (modified) lld/wasm/InputChunks.h (+1-1)
diff --git a/lld/test/wasm/large-section.test b/lld/test/wasm/large-section.test
new file mode 100644
index 0000000000000..6f399000d9c89
--- /dev/null
+++ b/lld/test/wasm/large-section.test
@@ -0,0 +1,34 @@
+# RUN: split-file %s %t
+# RUN: llvm-mc -filetype=obj -triple=wasm32-unknown-unknown %t/chunk1.s -o %t/chunk1.o
+# RUN: llvm-mc -filetype=obj -triple=wasm32-unknown-unknown %t/chunk2.s -o %t/chunk2.o
+# --no-gc-sections to prevent the linker from optimizing the chunk away, otherwise it produces a tiny output
+# RUN: wasm-ld --no-entry --no-gc-sections %t/chunk1.o %t/chunk2.o -o %t/combined.wasm
+# RUN: llvm-readobj --sections %t/combined.wasm | FileCheck %s
+
+# Just making sure the linker doesn't crash for now and it has the combined, gigantic section, may need a better check
+# CHECK: Size: 2348810260
+
+# A 2GB + some extra bytes of data to make sure we go over 2G 
+#--- chunk1.s
+.section .data.chunk1,"",@
+.globl chunk1_start
+.type chunk1_start,@object
+chunk1_start:
+  .int32 0xAAAAAAAA
+  .int32 0xBBBBBBBB
+  .zero 2214592504
+  .int32 0xCCCCCCCC
+  .int32 0xDDDDDDDD
+.size chunk1_start, 2214592512
+
+#--- chunk2.s
+.section .data.chunk2,"",@
+.globl chunk2_start
+.type chunk2_start,@object
+chunk2_start:
+  .int32 0x11111111
+  .int32 0x22222222
+  .zero 134217712
+  .int32 0x44444444
+  .int32 0x55555555
+.size chunk2_start, 134217728
diff --git a/lld/wasm/InputChunks.h b/lld/wasm/InputChunks.h
index 1fe78d76631f1..462fa766081e6 100644
--- a/lld/wasm/InputChunks.h
+++ b/lld/wasm/InputChunks.h
@@ -97,7 +97,7 @@ class InputChunk {
 
   // After assignAddresses is called, this represents the offset from
   // the beginning of the output section this chunk was assigned to.
-  int32_t outSecOff = 0;
+  uint32_t outSecOff = 0;
 
   uint8_t sectionKind : 3;
 

@sbc100
Copy link
Collaborator

sbc100 commented Jan 27, 2026

Maybe "large section" -> "large data sections" in the PR title?

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you prefix the PR title with [lld][Webassembly]?

lgtm!

Although i makes me wonder if we have a test for when the 4gb limit is reached? Do we have a reasonable error in that case? I would guess not :-/

lgtm to this change as it though since it fixes the immediate issue.

@FatihBAKIR FatihBAKIR changed the title [wasm-ld] Avoid a signed overflow on large sections [lld][Webassembly] Avoid a signed overflow on large sections Jan 27, 2026
@FatihBAKIR
Copy link
Contributor Author

Updated the title with the prefix, but since the member is used in all section types (including debug info, function etc), I believe it applies to all types, and we have experienced it with debug info sections too.

@dschuff
Copy link
Member

dschuff commented Jan 27, 2026

Yeah, we should probably have better coverage of this kind of thing for custom sections too, since huge custom sections are even more common than huge data sections. Like, test with 2-4G custom sections, maybe multiple of them, etc.

Actually, thinking about this test a little more... why does this test result in a big data sections? We shouldn't be generating any data segments at all for huge runs of zeros, should we? Or do we not have that optimization in lld?

@FatihBAKIR
Copy link
Contributor Author

FatihBAKIR commented Jan 27, 2026

For catching even larger sections I think we should change the member to an [u]int64_t and check we aren't going over the 4G mark.

For the full zeros thing, I defensively added the non-zero int32_ts to the beginning and the end of the sections to avoid the linker from optimizing it away, not sure if it recognizes it though.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 27, 2026

Oh sorry, I was thinking the InputChunk was only used to data segments... but if its used to custom sections too then this change LGTM

@dschuff
Copy link
Member

dschuff commented Jan 27, 2026

Interesting... actually maybe we could also just write a test like this one, but with .section .debug_foo instead of a data section.

wasm sections sizes are specified as u32s, and thus can be as large as
4GB. wasm-ld currently stores the offset into a section as an int32_t
which overflows on large sections and results in a crash. This change
makes it a int64_t to accommodate any valid wasm section.

This change also catches section size overflows early and fail instead
of wrapping and producing a corrupt wasm binary.
@FatihBAKIR
Copy link
Contributor Author

I added 2 more tests, one for checking debug sections specifically, and one that goes even above 4GB and expects the linker to fail instead of silently producing a broken wasm binary.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the extra tests and error checks!

@sbc100
Copy link
Collaborator

sbc100 commented Feb 10, 2026

Can you update the PR description which still mentions uint32_t?

@FatihBAKIR
Copy link
Contributor Author

Oops, forgot the PR description, updated

@sbc100 sbc100 merged commit c703f5a into llvm:main Feb 10, 2026
10 checks passed
@github-actions
Copy link

@FatihBAKIR Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR.

Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues.

How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!

@llvm-ci
Copy link

llvm-ci commented Feb 10, 2026

LLVM Buildbot has detected a new failure on builder lldb-x86_64-debian running on lldb-x86_64-debian while building lld at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/162/builds/40738

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: functionalities/data-formatter/root-reference-children/TestRootReferenceChildren.py (616 of 3388)
PASS: lldb-api :: commands/expression/ir-interpreter-phi-nodes/TestIRInterpreterPHINodes.py (617 of 3388)
PASS: lldb-api :: functionalities/breakpoint/breakpoint_command/TestBreakpointCommandsFromPython.py (618 of 3388)
PASS: lldb-api :: macosx/duplicate-archive-members/TestDuplicateMembers.py (619 of 3388)
PASS: lldb-shell :: Settings/TestChildDepthTruncation.test (620 of 3388)
PASS: lldb-api :: commands/expression/dollar-in-variable/TestDollarInVariable.py (621 of 3388)
PASS: lldb-api :: functionalities/memory/cache/TestMemoryCache.py (622 of 3388)
PASS: lldb-api :: tools/lldb-server/TestGdbRemoteFork.py (623 of 3388)
PASS: lldb-api :: lang/c/struct_types/TestStructTypes.py (624 of 3388)
UNRESOLVED: lldb-api :: python_api/run_locker/TestRunLocker.py (625 of 3388)
******************** TEST 'lldb-api :: python_api/run_locker/TestRunLocker.py' FAILED ********************
Script:
--
/usr/bin/python3 /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./lib --env LLVM_INCLUDE_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/include --env LLVM_TOOLS_DIR=/home/worker/2.0.1/lldb-x86_64-debian/build/./bin --arch x86_64 --build-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex --lldb-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/worker/2.0.1/lldb-x86_64-debian/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/lldb --compiler /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/clang --dsymutil /home/worker/2.0.1/lldb-x86_64-debian/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./bin --lldb-obj-root /home/worker/2.0.1/lldb-x86_64-debian/build/tools/lldb --lldb-libs-dir /home/worker/2.0.1/lldb-x86_64-debian/build/./lib --cmake-build-type Release -t /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/python_api/run_locker -p TestRunLocker.py
--
Exit Code: -11

Command Output (stdout):
--
lldb version 23.0.0git (https://github.com/llvm/llvm-project.git revision c703f5a1632973dd6eade473614dfbed1b088d9e)
  clang revision c703f5a1632973dd6eade473614dfbed1b088d9e
  llvm revision c703f5a1632973dd6eade473614dfbed1b088d9e
"can't evaluate expressions when the process is running."

--
Command Output (stderr):
--
Change dir to: /home/worker/2.0.1/lldb-x86_64-debian/llvm-project/lldb/test/API/python_api/run_locker
runCmd: settings clear --all

output: 

runCmd: settings set symbols.enable-external-lookup false

output: 

runCmd: settings set target.inherit-tcc true

output: 

runCmd: settings set target.disable-aslr false

output: 

runCmd: settings set target.detach-on-error false

output: 

runCmd: settings set target.auto-apply-fixits false

@omjavaid
Copy link
Contributor

This is failing on 32 bit Arm Linux:

https://lab.llvm.org/staging/#/builders/160/builds/1189

lld/test/wasm/large-debug-section.test
lld/test/wasm/large-section.test
lld/test/wasm/section-too-large.test

The failure got masked because bot was already failing.

@omjavaid
Copy link
Contributor

Looking at the logs tests are failing on ARM 32-bit with OOM errors possibly hitting the 32-bit process address space limit around 3GB. Not sure if we should just mark these UNSUPPORTED on 32-bit or if theres something off with llvm-mc allocating the full 2.3GB for .zero 2214592504 directives - seems like it shouldnt need to materialize all that in memory but maybe im missing something about wasm object format requirements

@sbc100 since you merged this - any thoughts on whether this is expected behavior or if theres a better approach here? Should I just add UNSUPPORTED for 32-bit targets or is the memory allocation worth looking into

omjavaid added a commit that referenced this pull request Feb 17, 2026
@omjavaid
Copy link
Contributor

This is failing on 32 bit Arm Linux:

https://lab.llvm.org/staging/#/builders/160/builds/1189

lld/test/wasm/large-debug-section.test lld/test/wasm/large-section.test lld/test/wasm/section-too-large.test

The failure got masked because bot was already failing.

@FatihBAKIR I am going revert this change temporarily to unblock the buildbot.

@FatihBAKIR
Copy link
Contributor Author

I think we should mark the tests as unavailable on arm32 until we have a proper fix for out of memory issues. I wouldn't mind the revert if it was reverting a regression, but it would just remove tests that exercise an existing bug.

@sbc100
Copy link
Collaborator

sbc100 commented Feb 17, 2026

I guess these tests are fundamentally using more memory than is available on 32-bit systems?

Do we have any way to express that certain tests require this much memory?

I guess the memory requirements are unavoidable since we are creating a file which needs to be >2gb in size and we use mmap for the output file.

manasij7479 pushed a commit to manasij7479/llvm-project that referenced this pull request Feb 18, 2026
…8287)

wasm sections sizes are specified as u32s, and thus can be as large as
4GB. wasm-ld currently stores the offset into a section as an int32_t
which overflows on large sections and results in a crash. This change
makes it a int64_t to accommodate any valid wasm section and allow
catching even larger sections instead of wrapping around.

This PR fixes the issue by storing the offset as a int64_t, as well as
adding extra checks to handle un-encodeable sections to fail instead of
producing garbage wasm binaries, and also adds lit tests to make sure it
works. I confirmed the test fails on `main` but passes with this fix.

Fixes: llvm#178286
manasij7479 pushed a commit to manasij7479/llvm-project that referenced this pull request Feb 18, 2026
@teresajohnson
Copy link
Contributor

I see that this was reverted, but I noticed in the meantime that the tests are leaving some very large files on my local disk. I'm not sure what the policy is for this, but ideally a few tests wouldn't leave multiple multi-gig files. Here are currently the largest files on my disk, all from a single client's test directory, and I have multiple clients in use. Can the tests clean up after themselves?

2.2G ./llvm/llvm_m5_build/tools/lld/test/wasm/Output/large-section.test.tmp/combined.wasm
2.2G ./llvm/llvm_m5_build/tools/lld/test/wasm/Output/large-debug-section.test.tmp/combined.wasm
2.1G ./llvm/llvm_m5_build/tools/lld/test/wasm/Output/section-too-large.test.tmp/chunk2.o
2.1G ./llvm/llvm_m5_build/tools/lld/test/wasm/Output/section-too-large.test.tmp/chunk1.o
2.1G ./llvm/llvm_m5_build/tools/lld/test/wasm/Output/large-section.test.tmp/chunk1.o
2.1G ./llvm/llvm_m5_build/tools/lld/test/wasm/Output/large-debug-section.test.tmp/debug1.o

sbc100 pushed a commit that referenced this pull request Feb 25, 2026
wasm sections sizes are specified as u32s, and thus can be as large as
4GB. wasm-ld currently stores the offset into a section as an int32_t
which overflows on large sections and results in a crash. This change
makes it a int64_t to accommodate any valid wasm section and allow
catching even larger sections instead of wrapping around.

This PR fixes the issue by storing the offset as a int64_t, as well as
adding extra checks to handle un-encodeable sections to fail instead of
producing garbage wasm binaries, and also adds lit tests to make sure it
works. I confirmed the test fails on main but passes with this fix.

This is the same as #178287 but
deletes the temporary files the tests create and requires the tests run
on a 64-bit platform to avoid OOM issues due to the large binaries it
creates.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 25, 2026
…ns (#183225)

wasm sections sizes are specified as u32s, and thus can be as large as
4GB. wasm-ld currently stores the offset into a section as an int32_t
which overflows on large sections and results in a crash. This change
makes it a int64_t to accommodate any valid wasm section and allow
catching even larger sections instead of wrapping around.

This PR fixes the issue by storing the offset as a int64_t, as well as
adding extra checks to handle un-encodeable sections to fail instead of
producing garbage wasm binaries, and also adds lit tests to make sure it
works. I confirmed the test fails on main but passes with this fix.

This is the same as llvm/llvm-project#178287 but
deletes the temporary files the tests create and requires the tests run
on a 64-bit platform to avoid OOM issues due to the large binaries it
creates.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

wasm-ld crash when linking large binaries

7 participants