Merge develop to next/kelvin/409 #821

Merged
pkova merged 38 commits into next/kelvin/409 from develop
May 21, 2025

Conversation

@pkova
Collaborator

@pkova pkova commented May 21, 2025

No description provided.

ngzax and others added 30 commits April 11, 2024 12:58
consolidates re-initializing the jet dashboard
hashtables into a public method for reuse.
I removed the ivory pill from the serf in #750, but I missed this code
path in _mars_do_boot. This code only ever runs if the process has been
killed during initial boot and we have to replay. The older ivory pill
being installed breaks the structural sharing for jet registrations,
causing ships that crash during boot to perform extremely poorly until
they get melded.
This is just wrong, the macro means "INT" "LONG" "POINTER" == 32 bits. It was
muna here since the beginning to work around a very peculiar issue, see next
commit for details.
All vere versions since the introduction of the zig build system in vere-v3.2
have had a misconfigured openssl build. This was eventually caught by simply
doing +https://facebook.com in the dojo on mac or linux aarch64, which
segfaulted the binary instantly. Facebook uses TLS 1.3 with the
TLS_CHACHA20_POLY1305_SHA256 cipher suite, exercising the vendored assembly
file poly1305-armv8.S. We had mistakenly defined the macro __ILP32__ for this
translation unit, which means integers, longs and pointers are 32 bits. That is
obviously wrong on a 64-bit aarch64 target. Fixing this bug led to a more
insidious problem, however.

The obvious fix of removing the __ILP32__ macro fixed the facebook problem on
linux-aarch64. On macos-aarch64 the fix caused an immediate segfault in the
macos loader (dyld) when starting the vere binary.

The zig build system shells out to the LLVM linker LLD in all cases except
Mach-O. Inspecting the vere binary and the operation of the zig Mach-O linker
made it clear that the segfault in the loader happens because the zig Mach-O
linker emits a rebase into the read-only __TEXT section of the vere binary.
Running the build with --verbose-link, grabbing the final zig link command and
switching the linker out for the macos native ld produced a vere binary that
was completely fine. In other words, this is a bug in the zig Mach-O linker.

Further examination revealed that the incorrectly rebased symbol was
_OPENSSL_armcap_P. This is a constant ARMV7_NEON on macos-aarch64 so we work
around the zig linker bug by not using the symbol at all.
pkova and others added 8 commits May 16, 2025 19:19
…815)

All vere versions since the introduction of the zig build system in
vere-v3.2 have had a misconfigured openssl build. This was eventually
caught by simply doing `+https://facebook.com` in the dojo on mac or
linux aarch64, which segfaulted the binary instantly. Facebook uses
TLS 1.3 with the TLS_CHACHA20_POLY1305_SHA256 cipher suite, exercising
the vendored assembly file `poly1305-armv8.S`. We had mistakenly defined
the macro `__ILP32__` for this translation unit, which means integers,
longs and pointers are 32 bits. That is obviously wrong on a 64-bit
aarch64 target. Fixing this bug led to a more insidious problem, however.

The obvious fix of removing the `__ILP32__` macro fixed the facebook
problem on linux-aarch64. On macos-aarch64 the fix caused an immediate
segfault in the macos loader (dyld) when starting the vere binary.

The zig build system shells out to the LLVM linker LLD in all cases
except Mach-O. Inspecting the vere binary and the operation of the zig
Mach-O linker made it clear that the segfault in the loader happens
because the zig Mach-O linker emits a rebase into the read-only
`__TEXT` section of the vere binary. Running the build with
`--verbose-link`, grabbing the final `zig link` command and switching
the linker out for the macos native `ld` produced a vere binary that
was completely fine. In other words, this is a bug in the zig Mach-O
linker.

Further examination revealed that the incorrectly rebased symbol was
`_OPENSSL_armcap_P`. This is a constant `ARMV7_NEON` on macos-aarch64 so
we work around the zig linker bug by not using the symbol at all.
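
To illustrate the `__ILP32__` point above, here is a minimal sketch in C of how a build could check that the data-model macro agrees with the sizes the compiler actually uses. It is not taken from vere or openssl; the function names `data_model_name`, `ilp32_macro_is_consistent` and `require_consistent_data_model` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: name the data model implied by the given sizes
 * (in bytes) of int, long and pointers. */
const char *data_model_name(size_t int_size, size_t long_size, size_t ptr_size)
{
    if (int_size == 4 && long_size == 4 && ptr_size == 4) return "ILP32";
    if (int_size == 4 && long_size == 8 && ptr_size == 8) return "LP64";
    if (int_size == 4 && long_size == 4 && ptr_size == 8) return "LLP64";
    return "other";
}

/* Hypothetical check: returns 1 unless __ILP32__ is defined while the
 * compiler's real int/long/pointer sizes are not all 32 bits. */
int ilp32_macro_is_consistent(void)
{
#if defined(__ILP32__)
    const char *model =
        data_model_name(sizeof(int), sizeof(long), sizeof(void *));
    return strcmp(model, "ILP32") == 0;
#else
    return 1; /* macro not defined, so there is nothing to contradict */
#endif
}

/* Hypothetical guard that aborts a debug build on a mismatch. */
void require_consistent_data_model(void)
{
    assert(ilp32_macro_is_consistent());
}
```

Forcing `-D__ILP32__` onto a translation unit compiled for a 64-bit target would be expected to make `ilp32_macro_is_consistent()` return 0, which is the kind of mismatch described in the commit message. Note that a C-level check like this would not by itself cover an assembly file such as `poly1305-armv8.S`, where the macro only steers the preprocessor.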
#766 was reverted due to issues fetching dependencies in CI, which seem
to be related to both unreliable mirrors (from the likes of GNU) and
ziglang/zig#19878 which shows that `zigfetch` fails to fetch files when
they need TLS via proxy connections.

To resolve these issues, I've simply swapped dependency mirrors for ones
that sidestep both.

Note: CI is still unreliable, but seems to work "most" of the time.
We'll have to see how it performs "at scale" for our repository and
revert if necessary once again.
Mostly resolves #816: zlib is still being refetched for some reason.
This PR adds `melt`, an on-loom variant of `meld` originally implemented
by @ngzax. The original commits have been cherry-picked into this repo,
and the implementation has been touched up to work with current vere.
This branch works correctly on a fake ship, more testing is needed.
@pkova pkova requested a review from a team as a code owner May 21, 2025 10:38
@pkova pkova merged commit 0c55ec9 into next/kelvin/409 May 21, 2025
5 checks passed
5 participants