llvm_jit: DLLExport storage on jit_table_at globals (#2802 cache-hit fix)#2811
Merged
Conversation
…fix) The cached `.jitted_scripts/<ns>/<hash>.dll` bakes function-pointer addresses into a few global initializers at codegen, then `ResolveExternVisitor` walks the AST after every DLL load and overwrites each slot via `set_glob_address` with the current-session address — that handoff is what keeps cache-hit reloads safe across ASLR shifts. `add_table_linkage` on the `jit_table_at/find/erase` globals only set `LLVMExternalLinkage`, no DLLExport storage class. On Windows that leaves the symbol out of the DLL export table, so `set_glob_address` (which uses GetProcAddress to find the slot, then writes through the pointer) finds nothing and silently returns false. The codegen-time address — valid in the session that wrote the DLL — survives unchanged into every subsequent session, where ASLR has placed the runtime DLL somewhere else. The JIT'd code loads the stale pointer and crashes inside the table operation. Fix: mirror `set_public_linkage`'s DLLExport linkage + DLLExportStorageClass combo on the table-accessor globals, matching what `save_address_global` already does for builtin function pointers (which are reliably re-resolved). Defense in depth: `set_glob_address` now panics with a precise diagnostic when the named global is missing from the DLL's export table, instead of returning false silently. Every caller in `ResolveExternVisitor` is fixing up a slot that MUST be writable, so a missing export is always a codegen linkage bug. The panic makes a future regression of this exact shape fail loud at compile-time instead of crashing in JIT'd code after an unrelated ASLR shift. `tests/jit_tests/dll_cache.das` probe now exercises a `table[k]` operation (was `x + 1`) so `jit_table_at_tString` lands in the emitted globals. Paired with the new panic, this gives CI a hard-fail signal for the linkage regression — verified: reverting `add_table_linkage` and re-running the test produces `set_glob_address: missing DLL export ... — codegen-side linkage bug, see #2802` and dastest exits non-zero. `LLVM_JIT_CODEGEN_VERSION` bumped 0x09 → 0x0a so previously-written cached DLLs miss the cache on the first run after this lands and get GC'd. Stacks under PR #2808 (which adds the `-jit-no-cache` opt-out flag as a permanent escape hatch — still useful for users who want to skip the cache entirely). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes a Windows-specific crash on JIT DLL cache hits by ensuring JIT-emitted table helper globals are exported from cached DLLs (so their baked addresses can be patched on reload), and by making missing exports fail loudly instead of silently producing stale pointers.
Changes:
- Export
jit_table_at/find/eraseglobals from JIT DLLs on Windows by applying DLLExport linkage + storage class. - Make
DLLHandle.set_glob_addresspanic with a precise diagnostic when a required exported global symbol is missing. - Bump
LLVM_JIT_CODEGEN_VERSIONto invalidate previously cached DLLs, and update the DLL cache test probe to exercisetable[k]so the relevant globals are emitted.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/jit_tests/dll_cache.das | Updates the probe program to use table["a"] so jit_table_at_* globals are emitted and the regression is caught by the new panic. |
| modules/dasLLVM/daslib/llvm_macro.das | Bumps the JIT codegen version to force cache miss/GC of old cached DLLs. |
| modules/dasLLVM/daslib/llvm_jit_common.das | Ensures JIT table helper globals are marked for DLL export (linkage + storage class). |
| modules/dasLLVM/daslib/llvm_dll_utils.das | Converts silent “missing export” behavior into a hard failure with a targeted panic message. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
CI failure on Linux ARM Release / darwin Release shows the panic firing on `count` (and others) — LLVM globalopt legitimately deletes unused globals, make_call has a documented `if (global_var == null) return` guard for the same case (d901e74). set_glob_address can't distinguish "optimizer removed" from "exported-but-missing" without the LLVM module state, which is gone by the time ResolveExternVisitor runs. Revert to the original silent-skip behavior. The DLLExport linkage change on add_table_linkage is the real bug fix and stays — that prevents the missing-export case at codegen time, so set_glob_address doesn't need to catch it at runtime. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes the Windows JIT DLL cache crash documented at #2802. Follow-up to #2808 (now merged) — that added the
-jit-no-cacheopt-out flag, this is the actual cache-side fix so the flag stays a "skip the cache entirely" convenience rather than a workaround for broken behavior.What was broken
The cached
.jitted_scripts/<ns>/<hash>.dllbakes runtime function-pointer addresses into a few global initializers at codegen, thenResolveExternVisitorwalks the AST after every DLL load and overwrites each slot viaset_glob_addresswith the current-session address — that handoff is what keeps cache-hit reloads safe across ASLR shifts.add_table_linkageon thejit_table_at/find/eraseglobals only setLLVMExternalLinkage, no DLLExport storage class. On Windows that leaves the symbol out of the DLL export table, soset_glob_address(GetProcAddress + write-through-pointer) finds nothing and silently returns false. The codegen-time address — valid in the session that wrote the DLL — survives unchanged into every subsequent session, where ASLR has placed the runtime DLL somewhere else. The JIT'd code loads the stale pointer and crashes inside the first table operation.dumpbin /exportson a pre-fix cached DLL confirms`jit`table_at`tString` globis missing whilesave_address_global's builtin function-pointer globals (which useset_public_linkage→ DLLExport) are present.Fix
add_table_linkagemirrorsset_public_linkage's DLLExport linkage + DLLExportStorageClass combo. Post-fixdumpbin /exportsshowsjit_table_at_*in the exports table.LLVM_JIT_CODEGEN_VERSIONbumped0x09→0x0aso previously-written cached DLLs miss on first run after this lands and get GC'd.Test
tests/jit_tests/dll_cache.dasprobe now exercisestable[k](wasx + 1) sojit_table_at_tStringlands in the emitted globals, exercising the codegen + fixup path the fix targets. The cache-behavior assertions are unchanged.Verification on Windows
daslang -jit dasProfile/tests/dict.das -- --nomain(the original Windows JIT DLL cache: stale absolute-address bake survives across runs → AV on cache hit #2802 crash repro), cache cleared before session 1, both exit 0 with"dictionary", 0.013941000, 10.jit_tests/suite under-jit: 229 passed, 8 pre-existingmissing prerequisite 'UnitTest'failures (build infra, unrelated).Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com