Merged
Pull request overview
This PR addresses Linux native device breakage caused by the Debian/Ubuntu t64 transition renaming libaio.so.1 → libaio.so.1t64, by making the native-device loader self-heal at runtime and by updating build/test infrastructure to remove now-redundant symlink workarounds.
Changes:
- Update NativeStorageDevice’s DllImport resolver to load the native library via an absolute path and (on Linux) auto-repair a missing libaio.so.1 by creating a local compat symlink to libaio.so.1t64, with improved diagnostics.
- Add a C++ libaio symbol-version pinning header and set RPATH=$ORIGIN on libnative_device.so to allow colocated dependency resolution.
- Remove Docker/CI symlink workaround steps and relax Docker image validation to accept either libaio.so.1 or libaio.so.1t64.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/docker-tests/validate_docker_images.py | Accept either libaio SONAME in image validation. |
| libs/storage/Tsavorite/cs/src/core/Device/NativeStorageDevice.cs | Absolute-path native load + libaio t64 auto-repair + diagnostics. |
| libs/storage/Tsavorite/cc/src/device/libaio_compat.h | New header to force specific versioned libaio symbols at link time. |
| libs/storage/Tsavorite/cc/src/device/file_linux.h | Include the new libaio compatibility header. |
| libs/storage/Tsavorite/cc/src/CMakeLists.txt | Add $ORIGIN RPATH and disable new dtags (DT_RPATH behavior). |
| libs/server/Resp/Vector/VectorManager.cs | No-op initialization/recovery when vector feature disabled. |
| Dockerfile | Remove build-time global libaio symlink workaround. |
| Dockerfile.ubuntu | Remove build-time global libaio symlink workaround. |
| .github/workflows/ci.yml | Remove Ubuntu libaio workaround step. |
| .github/workflows/nightly.yml | Remove Ubuntu libaio workaround step. |
bffadf8 to 180bf91
…nsition)
The t64 ABI transition renamed libaio.so.1 to libaio.so.1t64, breaking
libnative_device.so which has a hard DT_NEEDED of libaio.so.1. Fix the
problem in three places so both Docker and non-Docker users on t64 hosts
get a working native device without manual intervention.
1) libaio_compat.h (new) pins the libaio entry points to specific symbol
versions at link time:
     io_setup     @LIBAIO_0.4
     io_destroy   @LIBAIO_0.4
     io_getevents @LIBAIO_0.4   (userspace ring fast path)
     io_submit    @LIBAIO_0.1
Older libaio-dev marked LIBAIO_0.4 as the default version so a plain
link picked these up automatically. On t64 (libaio1t64-dev) the default
is gone and libaio.h has no .symver redirects for x86_64, so a fresh
link produces UNVERSIONED references that at runtime resolve to the
slower LIBAIO_0.1 io_getevents - which always syscalls and blocks -
causing NativeStorageDevice probe/TryComplete paths to hang. With
libaio_compat.h included first, any future rebuild on any distro
reproduces the correct versioned bindings.
2) CMakeLists.txt sets RPATH=$ORIGIN (via INSTALL_RPATH +
BUILD_WITH_INSTALL_RPATH + --disable-new-dtags) so libnative_device.so
searches its own directory for dependencies. This enables the managed
loader's fallback (below).
3) NativeStorageDevice.ImportResolver resolves NativeLibraryPath to an
absolute path (fixing a latent bug where the relative path bypassed
.NET's runtimes/ probing) and, on Linux, catches DllNotFoundException
referencing libaio.so.1, locates libaio.so.1t64 in standard multiarch
paths, and drops a compat symlink next to libnative_device.so. The
symlink creation tolerates the race where multiple processes start
simultaneously and another process has already created a usable
symlink. If repair still fails, the loader throws a descriptive
DllNotFoundException explaining the t64 transition and offering three
remediation options. This path is primarily for non-Docker users
(developers running dotnet GarnetServer on their own Debian 13 /
Ubuntu 24.04 machines).
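The repair step in item 3 can be sketched as follows. This is a minimal Python illustration of the logic, not the C# code in NativeStorageDevice.cs; the function names and the `MULTIARCH_DIRS` search list are assumptions made for the example, since the directories the real loader probes are not listed here.

```python
import os
from pathlib import Path
from typing import Iterable, Optional

# Hypothetical search list; the directories probed by the real loader are not shown in this PR text.
MULTIARCH_DIRS = (
    "/usr/lib/x86_64-linux-gnu",
    "/usr/lib/aarch64-linux-gnu",
    "/lib/x86_64-linux-gnu",
)


def find_t64(search_dirs: Iterable[str]) -> Optional[Path]:
    """Return the first existing libaio.so.1t64 in search_dirs, or None."""
    for directory in search_dirs:
        candidate = Path(directory) / "libaio.so.1t64"
        if candidate.exists():
            return candidate
    return None


def ensure_compat_symlink(native_dir: str,
                          search_dirs: Iterable[str] = MULTIARCH_DIRS) -> bool:
    """Create native_dir/libaio.so.1 -> libaio.so.1t64.

    Returns True if a usable libaio.so.1 exists in native_dir afterwards.
    """
    link = Path(native_dir) / "libaio.so.1"
    if link.exists():  # exists() follows symlinks, so a dangling link counts as missing
        return True
    target = find_t64(search_dirs)
    if target is None:
        return False
    try:
        os.symlink(target, link)
    except FileExistsError:
        # Another process created the entry first; fall through and check
        # whether what is there now resolves to a real file.
        pass
    except OSError:
        # Read-only filesystem, missing permissions, and similar failures.
        return False
    return link.exists()
```

Treating `FileExistsError` as a non-fatal outcome and re-checking the link afterwards is what makes concurrent startups safe in this sketch: whichever process loses the race still ends up with a usable symlink.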
Also:
- VectorManager.Initialize() and ResumePostRecovery() now early-return
when IsEnabled is false. Vector Set preview is off by default; there
is no reason these paths should touch storage when the feature is
disabled.
- Dockerfile and Dockerfile.ubuntu still install libaio1t64 and
pre-create the libaio.so.1 -> libaio.so.1t64 symlink at build time
for maximum robustness (works on read-only filesystems and under
restrictive seccomp profiles that block symlink(2)). The managed
loader fallback is belt-and-braces for non-Docker users.
(Dockerfile.alpine and Dockerfile.azurelinux ship libaio.so.1
natively. Dockerfile.chiseled uses a restricted runtime image and
was not changed - it already stages libaio.so.1 from a build stage.)
- .github/workflows/ci.yml and nightly.yml drop the ubuntu-latest
libaio pre-step; the managed ImportResolver now handles repair
automatically on any host.
- validate_docker_images.py accepts either libaio.so.1 or
libaio.so.1t64 when checking library presence.
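As an illustration of the relaxed check in the last bullet, the sketch below accepts either SONAME. It is not the code in validate_docker_images.py; `has_libaio` and `ACCEPTED_LIBAIO_SONAMES` are hypothetical names, and the input is assumed to be a list of library file paths already collected from the image.

```python
from typing import Iterable

# Either name satisfies the check: pre-t64 images ship libaio.so.1,
# t64 images ship libaio.so.1t64.
ACCEPTED_LIBAIO_SONAMES = ("libaio.so.1", "libaio.so.1t64")


def has_libaio(library_paths: Iterable[str]) -> bool:
    """Return True if any path's file name is an accepted libaio SONAME."""
    names = {path.rsplit("/", 1)[-1] for path in library_paths}
    return any(soname in names for soname in ACCEPTED_LIBAIO_SONAMES)
```

The match is on the exact file name, so a fully versioned file name alone would not pass unless its SONAME link is also present.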
The bundled libnative_device.so has been rebuilt against the above
sources with '-O3 -g -DNDEBUG' (project Release defaults). Verified via
objdump -T that io_* references are correctly versioned.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
180bf91 to 8af8d4d
- CMakeLists.txt: fix FNATIVE_DEVICE_HEADERS typo so file_linux.h and
libaio_compat.h are actually associated with the native_device target
(cosmetic, does not affect compiled binary).
- NativeStorageDevice: wrap Directory.GetCurrentDirectory() in a
TryGetCurrentDirectory helper so a deleted/inaccessible CWD cannot
block native library resolution when the library exists in the
assembly or AppContext directory.
- NativeStorageDevice.BuildLibaioDiagnostic: expand architecture mapping
(x64, Arm64, Arm) with a null fallback that emits a distro-agnostic
fix instruction, and correct the remediation advice to suggest a
valid DeviceType value ('RandomAccess') instead of the non-existent
'Managed'.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
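The TryGetCurrentDirectory change in the commit above can be illustrated with a short Python sketch. It mirrors the idea only: `try_get_cwd` and `resolve_native_library` are hypothetical names, and the order in which the real resolver probes the assembly, AppContext, and current directories is an assumption.

```python
import os
from typing import List, Optional


def try_get_cwd() -> Optional[str]:
    """Return the current directory, or None if it was deleted or is inaccessible."""
    try:
        return os.getcwd()
    except OSError:
        return None


def resolve_native_library(relative_path: str,
                           base_dirs: List[Optional[str]]) -> Optional[str]:
    """Return the absolute path of the first existing candidate.

    Base directories that are None (for example a failed CWD lookup)
    are skipped instead of aborting the whole search.
    """
    for base in base_dirs:
        if base is None:
            continue
        candidate = os.path.abspath(os.path.join(base, relative_path))
        if os.path.isfile(candidate):
            return candidate
    return None
```

A caller would pass something like `[assembly_dir, app_context_dir, try_get_cwd()]`, so an unusable working directory simply drops out of the candidate list.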
vazois approved these changes on Apr 16, 2026
The t64 ABI transition renamed libaio.so.1 to libaio.so.1t64, breaking libnative_device.so, which has a hard DT_NEEDED entry for libaio.so.1. Previously we worked around this with system-wide symlinks in every Dockerfile and CI workflow.
Fix this properly in the loader itself:
- CMakeLists.txt now sets RPATH=$ORIGIN (via INSTALL_RPATH + BUILD_WITH_INSTALL_RPATH + --disable-new-dtags) so libnative_device.so searches its own directory for dependencies. This lets a managed-side compat symlink next to the native library satisfy the dynamic linker without any LD_LIBRARY_PATH contortions from the caller.
- libaio_compat.h (new) pins the libaio entry points to the specific symbol versions that enable libaio's userspace fast paths:
     io_setup     @LIBAIO_0.4
     io_destroy   @LIBAIO_0.4
     io_getevents @LIBAIO_0.4   (userspace ring fast path)
     io_submit    @LIBAIO_0.1
  Older libaio-dev marked LIBAIO_0.4 as the default version, so a plain link picked these up automatically. On t64 (libaio1t64-dev) the default is gone and libaio.h has no .symver redirects for x86_64, so a fresh link produces unversioned references that resolve at runtime to the slower LIBAIO_0.1 io_getevents, which always syscalls and blocks. This caused NativeStorageDevice probe/TryComplete paths to hang.
- NativeStorageDevice.ImportResolver now resolves NativeLibraryPath to an absolute path (fixing a latent bug where the relative path bypassed .NET's runtimes/ probing) and, on Linux, catches a DllNotFoundException referencing libaio.so.1, locates libaio.so.1t64 in the standard multiarch paths, and drops a compat symlink next to libnative_device.so. The symlink creation tolerates the race where multiple processes start simultaneously and another process has already created a usable symlink. If repair still fails, the loader throws a descriptive DllNotFoundException explaining the t64 transition and offering three remediation options.
- VectorManager.Initialize() and ResumePostRecovery() now return early when IsEnabled is false. Vector Set preview is off by default; there is no reason these paths should touch storage when the feature is disabled.
With the loader + build fixes in place, remove the now-redundant workarounds:
- Dockerfile and Dockerfile.ubuntu: drop the ln -sf libaio.so.1 line. (Dockerfile.alpine and Dockerfile.azurelinux ship libaio.so.1 natively. Dockerfile.chiseled uses a restricted runtime and was not touched.)
- .github/workflows/ci.yml and nightly.yml: drop the ubuntu-latest libaio pre-step; the managed ImportResolver now handles repair automatically and the test suite actually exercises the repair path.
- validate_docker_images.py: accept either libaio.so.1 or libaio.so.1t64, since on glibc images the former is now only materialized lazily (on first native device init).
The bundled libnative_device.so has been rebuilt against the above sources with '-O3 -g -DNDEBUG' (project Release defaults). Verified via objdump -T that io_* references are correctly versioned.
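A check like the objdump -T verification mentioned above can be expressed as a small comparison against the pinned versions listed in this description. The sketch below is an assumption-laden illustration: `find_pin_mismatches` is a hypothetical helper, and it expects the symbol-to-version mapping to have been extracted already (parsing objdump output is deliberately left out).

```python
from typing import Dict, List, Optional

# Pinned versions as listed in this PR description.
EXPECTED_PINS: Dict[str, str] = {
    "io_setup": "LIBAIO_0.4",
    "io_destroy": "LIBAIO_0.4",
    "io_getevents": "LIBAIO_0.4",
    "io_submit": "LIBAIO_0.1",
}


def find_pin_mismatches(observed: Dict[str, Optional[str]]) -> List[str]:
    """Compare observed symbol versions against EXPECTED_PINS.

    `observed` maps symbol name to version string; None (or a missing key)
    means the reference is unversioned or absent.
    """
    problems = []
    for symbol, expected in EXPECTED_PINS.items():
        actual = observed.get(symbol)
        if actual != expected:
            found = actual or "unversioned or missing"
            problems.append(f"{symbol}: expected {expected}, found {found}")
    return problems
```

An empty result means every io_* reference carries the intended version; an unversioned io_getevents is the case that led to the hang described earlier.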