Summary
On synth v0.11.1 (6f7e3ac8), after the #167 relocation fix, R_ARM_THM_CALL resolves to symbol + 4 instead of symbol. Every call therefore branches one instruction-pair past the callee's entry point — skipping its first 32 bits of code. The object links cleanly (no ld errors), but the linked program is incorrect.
This is the same call-lowering path #167 fixed (the garbage 0xC00000 addend is gone), but a +4 addend remains.
Minimal reproduction
(module
(func $callee (param i32) (result i32) local.get 0 i32.const 7 i32.add)
(func $caller (export "caller") (param i32) (result i32)
local.get 0 call $callee i32.const 2 i32.mul)
(export "callee" (func $callee)))
synth compile mini.wat --target cortex-m4f --all-exports --relocatable -o mini.o
arm-zephyr-eabi-ld mini.o -e caller -Ttext=0x08000000 -o mini.elf
arm-zephyr-eabi-nm mini.elf | grep -iE 'caller|callee'
# 08000000 T callee <-- callee entry
# 08000008 T caller
Raw-Thumb disasm of the linked caller (objdump won't decode without $t mapping symbols — see #170 — so dump .text to a flat binary):
arm-zephyr-eabi-objcopy -O binary --only-section=.text mini.elf t.bin
arm-zephyr-eabi-objdump -D -b binary -marm -Mforce-thumb --adjust-vma=0x08000000 t.bin
# 8000008: f7ff fffc bl 0x8000004 <-- lands at callee+4, should be 0x8000000
The +4 is visible before linking too:
arm-zephyr-eabi-objdump -dr mini.o
# 8: f000 f800 bl 4 <callee+0x4> <-- placeholder already encodes callee+0x4
# 8: R_ARM_THM_CALL func_0
Real-module confirmation
merged.both.wasm (z_impl_k_sem_give + gale_k_sem_give_decide + 5 imports), linked with --defsym func_N=<real names> (workaround for #173). All 6 calls in z_impl_k_sem_give are +4:
| call site | reloc → symbol | symbol addr | linked BL target |
|---|---|---|---|
| 0x90 | func_0 | 0x10a | 0x10e |
| 0x98 | func_1 | 0x10e | 0x112 |
| 0xc6 | func_6 (gale_k_sem_give_decide) | 0x000 | 0x004 |
| 0xee | func_2 | 0x112 | 0x116 |
| 0xf2 | func_3 | 0x114 | 0x118 |
| 0x102 | func_4 | 0x116 | 0x11a |
For example, the call to gale_k_sem_give_decide (at 0x0: movs r4,#1; movs r5,#0; …) lands at 0x4, skipping the first two instructions.
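A quick way to double-check the table is to subtract each symbol address from the linked BL target. The sketch below just transcribes the six rows above into a hypothetical `CALLS` list (the name and layout are mine, not synth output) and confirms that the offset is a uniform +4:

```python
# Rows transcribed from the table above:
# (call site, reloc symbol, symbol address, linked BL target)
CALLS = [
    (0x90,  "func_0", 0x10A, 0x10E),
    (0x98,  "func_1", 0x10E, 0x112),
    (0xC6,  "func_6", 0x000, 0x004),  # gale_k_sem_give_decide
    (0xEE,  "func_2", 0x112, 0x116),
    (0xF2,  "func_3", 0x114, 0x118),
    (0x102, "func_4", 0x116, 0x11A),
]

def call_offsets(calls):
    """Map each call site to (linked target - symbol address), in bytes."""
    return {site: target - addr for site, _sym, addr, target in calls}

if __name__ == "__main__":
    for site, delta in call_offsets(CALLS).items():
        print(f"call @ {site:#06x}: target - symbol = {delta:+d}")
```

A constant offset across all call sites, regardless of which symbol is targeted, points at the placeholder encoding rather than at any per-symbol addend.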
Likely cause
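The 0xF800 "zero-offset" Thumb-BL placeholder adopted in the #167 fix (f000 f800 in the objdump output above) doesn't net to a 0 displacement under R_ARM_THM_CALL. Thumb BL computes target = (P + 4) + (offset << 1). Assuming the object uses REL-style relocations, the linker takes the addend A from the displacement already encoded in the placeholder and writes (S + A) - P (ignoring the Thumb bit). With A = 0 the branch therefore lands at (P + 4) + (S - P) = S + 4, which is the residual +4 seen here. A correctly-zeroed placeholder for this relocation should make ld resolve to S exactly, i.e. the encoded displacement must pre-compensate (A = -4, which as far as I know is the f7ff fffe encoding GNU as emits for an unresolved bl) so the net is S, not S + 4.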
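The arithmetic can be sketched in a few lines of Python. This is a model under stated assumptions, not synth's or ld's actual code: it assumes REL-style relocations (the addend is read from the placeholder), uses the R_ARM_THM_CALL formula ((S + A) | T) - P from the ARM ELF ABI with the Thumb bit ignored, and the helper names `decode_thumb_bl` and `linked_target` are hypothetical.

```python
def decode_thumb_bl(hw1, hw2):
    """Signed byte displacement encoded in a Thumb-2 BL (encoding T1)."""
    s = (hw1 >> 10) & 1
    imm10 = hw1 & 0x3FF
    j1 = (hw2 >> 13) & 1
    j2 = (hw2 >> 11) & 1
    imm11 = hw2 & 0x7FF
    i1 = 1 - (j1 ^ s)  # I1 = NOT(J1 XOR S)
    i2 = 1 - (j2 ^ s)  # I2 = NOT(J2 XOR S)
    raw = (s << 24) | (i1 << 23) | (i2 << 22) | (imm10 << 12) | (imm11 << 1)
    return raw - (1 << 25) if s else raw  # sign-extend from 25 bits

def bl_target(p, hw1, hw2):
    """Address a BL located at p branches to: (p + 4) + displacement."""
    return p + 4 + decode_thumb_bl(hw1, hw2)

def linked_target(s, p, placeholder):
    """Where the BL lands after a REL R_ARM_THM_CALL is applied (Thumb bit ignored)."""
    a = decode_thumb_bl(*placeholder)  # REL: addend comes from the placeholder
    displacement = (s + a) - p         # value the linker writes into the BL
    return p + 4 + displacement

if __name__ == "__main__":
    S, P = 0x08000000, 0x08000008      # callee and call site from the mini repro
    print(hex(bl_target(P, 0xF7FF, 0xFFFC)))          # the linked BL seen in t.bin
    print(hex(linked_target(S, P, (0xF000, 0xF800)))) # placeholder synth emits
    print(hex(linked_target(S, P, (0xF7FF, 0xFFFE)))) # placeholder with A = -4
```

Under this model the f000 f800 placeholder decodes to A = 0 and resolves to callee+4, matching the linked `f7ff fffc`, while a placeholder encoding A = -4 resolves to the callee's entry point.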
Impact
Blocker for executing any synth arm output containing calls: every call enters its callee 4 bytes late. Found while pushing the cross-language-LTO-via-wasm route to a linkable image — with #167 + the #173 workaround the gale seam object links, but would crash/misbehave on hardware because z_impl_k_sem_give calls gale_k_sem_give_decide+4 (and k_spin_lock+4, etc.).
Environment
- synth v0.11.1 (6f7e3ac8), arm backend, target cortex-m4f
- Zephyr SDK 1.0.1 binutils 2.43.1 (arm-zephyr-eabi-ld/objdump/objcopy)