Conversation
Reverts #5826. The corrected cache paths exposed a latent bug: the `_build` cache restores beam files but not the Rustler-built NIF (`prql.so`), because `_build/test/lib/prql/priv` is a symlink to `priv/` outside the cached path. After the cache hit, `mix compile` sees beam files as up-to-date and skips Rustler, leaving the test runtime with no NIF to load. Reverting restores the prior state where the cache silently never restored — slow but correct. A maintainer can redesign caching with attention to the priv/ symlink and Rust-source-aware cache keys. Closes #5830 Co-Authored-By: Claude <noreply@anthropic.com>
|
A separate ci-fix session landed on the same root cause but went forward instead of back. Posting the diff here so a maintainer can pick between the two; happy with whichever lands. The forward fix is also small: cache
--- a/.github/workflows/test-elixir.yaml
+++ b/.github/workflows/test-elixir.yaml
@@ -68,19 +68,29 @@ jobs:
# Step: Define how to cache the `_build` directory. After the first run,
# this speeds up tests runs a lot. This includes not re-compiling our
# project's downloaded deps every run.
+ #
+ # Also cache `priv/`: Rustler builds the NIF via cargo into the workspace
+ # `target/` and then copies it to `priv/native/prql.so`. `_build` only
+ # holds Elixir artifacts (and a symlink to `priv`), so without caching
+ # `priv` the restored cache is missing the .so but mix still considers
+ # the project compiled and skips the rebuild.
+ #
+ # Cargo.lock is in the key because the .so depends on Rust deps; an
+ # update there should invalidate the build cache.
- name: Cache compiled build
id: cache-build
uses: actions/cache@v4
env:
- cache-name: cache-compiled-build
+ cache-name: cache-compiled-build-v2
with:
- path: prqlc/bindings/elixir/_build
+ path: |
+ prqlc/bindings/elixir/_build
+ prqlc/bindings/elixir/priv
key:
${{ runner.os }}-mix-${{ env.cache-name }}-${{
- hashFiles('**/mix.lock') }}
+ hashFiles('**/mix.lock', '**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-mix-${{ env.cache-name }}-
- ${{ runner.os }}-mix-
Trade-off vs. the revert here: the revert is a one-line round-trip that gets CI green tonight and leaves caching for later. The forward fix keeps caching working: the elixir job currently spends ~2 min on the Rust build, so reverting concedes that on every run, including for the deps cache half (which was always correct on its own; #5826 just bundled it with the broken
Source for the priv/native location: the successful PR run for #5826 logged
|
do a full fix and link from here please |
|
Forward fix opened as #5832 — caches |
Problem
After #5826 merged at 08:44 UTC and ran on `main` (cache miss → fresh build → cache saved), the next run on `main` (the 11:04 UTC nightly, run 24955057135) hit the cache and failed:

Root cause
The `_build` cache from #5826 doesn't include the Rustler-built NIF. By default Mix symlinks `_build/<env>/lib/<app>/priv` to `priv/` in the project root, and `priv/` sits outside the cached path. The saved cache is only ~288 KB (beam files only; a real NIF would be multi-MB).

On a cache hit:
1. `_build/` is restored (beam files + dangling `priv` symlink)
2. `priv/native/prql.so` is gone
3. `mix compile` sees beam files as up-to-date → skips Rustler → no rebuild
4. `mix test` tries to load the NIF → ENOENT

The `deps` cache is fine on its own, but #5826 added them as a pair and the `_build` half is the broken one.

Solution
Revert #5826. This restores the prior state where the cache silently never restored: slow but correct. A proper re-add needs either:
- `prqlc/bindings/elixir/priv` included in the cache paths, plus a cache key sensitive to Rust source (`native/**`, the prqlc crate sources) so a stale `prql.so` doesn't get reused when Rust changes; or
- the `_build` cache dropped and the `deps` cache kept, accepting a fresh ~1.5 min Rust rebuild every run.

Leaving that design call to a maintainer.
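For whoever picks this up, the first option (cache `priv` and key on Rust source) might look roughly like the step below. This is only a sketch under assumptions, not a tested workflow: the `-v3` cache name is hypothetical, and the two Rust globs (`prqlc/bindings/elixir/native/**` and `prqlc/prqlc/src/**`) are guesses at where the NIF crate and the prqlc crate sources live, so they would need checking against the repository layout.

```yaml
# Hypothetical sketch of a source-aware build cache; not the merged workflow.
- name: Cache compiled build
  id: cache-build
  uses: actions/cache@v4
  env:
    # Assumed new name so that older, NIF-less caches are never matched.
    cache-name: cache-compiled-build-v3
  with:
    # Cache `priv` alongside `_build`, since `_build` only holds a symlink
    # to it and the Rustler-built `prql.so` lives under `priv/native/`.
    path: |
      prqlc/bindings/elixir/_build
      prqlc/bindings/elixir/priv
    # The last two globs are assumed paths for the Rust sources.
    key: ${{ runner.os }}-mix-${{ env.cache-name }}-${{ hashFiles('**/mix.lock', '**/Cargo.lock', 'prqlc/bindings/elixir/native/**', 'prqlc/prqlc/src/**') }}
    # No `restore-keys` here: a prefix match could restore a `prql.so`
    # built from older Rust sources, which is the stale-NIF case to avoid.
```

Leaving out `restore-keys` trades some cache hits for correctness: any change to the hashed files forces a full rebuild instead of reusing a partially matching cache.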
Testing
The verification is the next nightly (or any `tests` run on `main` after this lands) showing test-elixir green again. This is the same posture as pre-#5826, since this restores the exact prior workflow.

Closes #5830 (automated triage)