feat(pixi-build-cmake): track exact build inputs from ninja instead of broad globs #5972

Merged
baszalmstra merged 3 commits into prefix-dev:main from baszalmstra:cmake-input-globs-via-ninja
May 5, 2026
Conversation

Contributor

@baszalmstra baszalmstra commented Apr 28, 2026

Description

The pixi-build-cmake backend currently reports a coarse set of glob patterns as the build's inputs:

**/*.{c,cc,cxx,cpp,h,hpp,hxx}
**/*.{cmake,cmake.in}
**/CMakeFiles.txt

Pixi expands those globs against the source tree to decide whether the build cache is still valid. There are two problems with this:

  • Glob traversal is slow. On every cache check pixi has to walk the whole project tree, applying gitignore-style matching to every file under it. For projects with vendored dependencies, large asset directories, or node_modules-shaped subtrees, the walk dominates the time spent deciding whether anything actually changed. This is the main reason for the change.
  • The set is too wide. Any unrelated .cpp or .h somewhere in the tree (a vendored dep, a sibling experiment, a file the user copied in for reference) is treated as a build input. Editing it triggers a rebuild that does not need to happen, and lockfile churn can include files CMake never looked at.

CMake and Ninja already know exactly which files the build consumed: declared sources, transitively included headers, the CMakeLists.txt chain, and anything an include() pulled in via CMAKE_MODULE_PATH. We have that information for free after every successful build.

This PR has the backend ask Ninja directly which files participated in the build, after the build runs. The exact set is forwarded to pixi's caching layer. Cache validation now reads a small enumerated list of files instead of walking the project tree.

How the input set is computed

Three Ninja sub-commands cover the build graph:

| Source | What it gives us |
| --- | --- |
| ninja -t inputs all | Declared translation units (the .cc / .cpp files that targets list as their sources). These do not show up in -t deps. |
| ninja -t deps | The transitive header chain, read from the depfile DB the compiler emitted during the build. |
| ninja -t targets all | CMakeLists.txt and the *.cmake files that CMake registers on its regen rule (they appear as <path>: phony). |
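
As a rough illustration of the last row, here is a minimal sketch of pulling the `<path>: phony` entries out of `ninja -t targets all` output. The function name `phony_absolute_paths` is hypothetical and not taken from the PR, and the sketch assumes each output line has the shape `<target>: <rule>`. Keeping only absolute paths is what drops Ninja's synthetic phony aliases such as `all`.

```rust
use std::path::{Path, PathBuf};

/// Collect the absolute paths that appear as `<path>: phony` entries.
/// Hypothetical helper; assumes one `<target>: <rule>` pair per line.
fn phony_absolute_paths(targets_output: &str) -> Vec<PathBuf> {
    targets_output
        .lines()
        // Split on the last ": " so a drive-letter colon stays in the path.
        .filter_map(|line| line.rsplit_once(": "))
        .filter(|(_, rule)| rule.trim() == "phony")
        .map(|(path, _)| Path::new(path))
        // Synthetic aliases like `all` are relative names, so they drop out.
        .filter(|path| path.is_absolute())
        .map(Path::to_path_buf)
        .collect()
}
```

Splitting from the right is a deliberate choice here: it keeps the helper usable for paths that themselves contain a colon.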

Anything outside the project root is dropped (system headers, conda environment files, files in unrelated source trees). Anything under <project>/.pixi/ is also dropped, since pixi's own cache and conda envs are tracked through the environment hash, not the input set.
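
The two filters above could look roughly like the sketch below. `project_inputs` is a hypothetical name, and the sketch assumes the candidate paths are already absolute and normalized (no symlink or `..` resolution is attempted here, which the real code may well do).

```rust
use std::path::{Path, PathBuf};

/// Keep only paths inside `project_root`, drop anything under
/// `<project_root>/.pixi/`, and return the rest relative to the root.
/// Hypothetical helper; assumes inputs are absolute and normalized.
fn project_inputs(project_root: &Path, candidates: &[PathBuf]) -> Vec<PathBuf> {
    candidates
        .iter()
        // `strip_prefix` compares whole components, so `/work/proj-other`
        // is not mistaken for a child of `/work/proj`.
        .filter_map(|path| path.strip_prefix(project_root).ok())
        // Also component-wise: `.pixi/...` is dropped, `.pixirc` is kept.
        .filter(|relative| !relative.starts_with(".pixi"))
        .map(Path::to_path_buf)
        .collect()
}
```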

file(GLOB CONFIGURE_DEPENDS …)

If a CMakeLists.txt collects sources via file(GLOB … CONFIGURE_DEPENDS …), the backend recovers the original glob patterns from CMake's VerifyGlobs.cmake and forwards them to pixi. Adding a new file matching one of those patterns invalidates the build cache and triggers a reconfigure, the same way it does for an in-tree cmake --build cycle.

Plain file(GLOB) without CONFIGURE_DEPENDS follows CMake's own semantics: only the files that matched at configure time are tracked, so adding a new file does not invalidate the cache. This mirrors the documented footgun of plain file(GLOB) inside CMake itself, where new files are not picked up until you re-run CMake by hand. Users who want auto-detection should use CONFIGURE_DEPENDS, and they get correct invalidation in exchange.
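
To make the CONFIGURE_DEPENDS recovery more concrete, here is a hedged sketch of reading patterns back out of VerifyGlobs.cmake. It assumes the generated file contains lines of the form `file(GLOB NEW_GLOB ... "<absolute pattern>")` or `file(GLOB_RECURSE NEW_GLOB ... "<absolute pattern>")`, with the glob expression as the last quoted argument. That layout is a CMake implementation detail rather than a documented interface, and `recover_globs` is a hypothetical name, not the function in this PR.

```rust
use std::path::Path;

/// Recover glob patterns from the text of a `VerifyGlobs.cmake` file,
/// relative to `source_root`, in gitignore-style syntax.
/// Hypothetical helper; the line layout it expects is an assumption.
fn recover_globs(verify_globs: &str, source_root: &str) -> Vec<String> {
    let mut patterns = Vec::new();
    for line in verify_globs.lines().map(str::trim) {
        let recursive = line.starts_with("file(GLOB_RECURSE NEW_GLOB");
        if !recursive && !line.starts_with("file(GLOB NEW_GLOB") {
            continue;
        }
        // Assumption: the glob expression is the last double-quoted argument.
        let Some(expr) = line.rsplit('"').nth(1) else { continue };
        // Patterns outside the source root are not project inputs.
        let Ok(relative) = Path::new(expr).strip_prefix(source_root) else { continue };
        let Some(relative) = relative.to_str() else { continue };
        let pattern = if recursive {
            // GLOB_RECURSE also matches in subdirectories: insert `**`.
            match relative.rsplit_once('/') {
                Some((dir, file)) => format!("{dir}/**/{file}"),
                None => format!("**/{relative}"),
            }
        } else {
            relative.to_string()
        };
        patterns.push(pattern);
    }
    patterns
}
```

Under these assumptions, a recursive glob over `lib/*.cpp` becomes `lib/**/*.cpp`, which in gitignore-style matching also covers files directly in `lib/`.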

Why not the CMake File API?

Both the File API (cmakeFiles-v1, codemodel-v2) and Ninja-based extraction give the same set of declared sources and CMake files. The File API is mildly cleaner to parse, but it needs a query-marker injection step before configure, and it does not expose discovered headers, so -t deps is still required. Crucially, neither approach exposes the original glob patterns; for those we have to read VerifyGlobs.cmake either way. So this PR sticks with Ninja, which requires no pre-configure setup.

Fallback

If any step fails (build dir wiped, Ninja exits non-zero, CMakeCache.txt unparseable), the backend logs a warning and falls back to a coarse glob set.
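
A minimal sketch of that fallback shape, assuming the exact-input computation surfaces its failure as a `Result`. The `InputSet` enum and both function names are hypothetical illustrations, not the types used in the PR; the glob list is the original coarse set with the CMakeLists.txt typo corrected, as described in the commit message.

```rust
/// Hypothetical description of what the backend reports as build inputs.
#[derive(Debug, PartialEq)]
enum InputSet {
    /// Exact files recovered from Ninja.
    Exact(Vec<String>),
    /// Coarse glob patterns used when recovery fails.
    Globs(Vec<String>),
}

/// The coarse glob set, with `CMakeLists.txt` spelled correctly.
fn fallback_globs() -> Vec<String> {
    [
        "**/*.{c,cc,cxx,cpp,h,hpp,hxx}",
        "**/*.{cmake,cmake.in}",
        "**/CMakeLists.txt",
    ]
    .iter()
    .map(|glob| glob.to_string())
    .collect()
}

/// Use the exact set when available; otherwise warn and fall back.
fn inputs_or_fallback(exact: Result<Vec<String>, String>) -> InputSet {
    match exact {
        Ok(files) => InputSet::Exact(files),
        Err(reason) => {
            eprintln!("warning: could not query ninja for inputs ({reason}); using broad globs");
            InputSet::Globs(fallback_globs())
        }
    }
}
```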

How Has This Been Tested?

  • 13 unit tests in the new crates/pixi_build_cmake/src/inputs.rs cover the parsers, path canonicalization (relative -t deps paths, Windows backslashes, mixed slashes), the .pixi/ filter, the absolute-path filter for -t targets all (drops Ninja's synthetic phony aliases), and VerifyGlobs.cmake extraction including GLOB_RECURSE translation to gitignore-style ** patterns.
  • End-to-end tested against examples/pixi-build/cpp-sdl with PIXI_BUILD_BACKEND_OVERRIDE. The cache went from broad globs to two exact entries: ["src/main.cc", "CMakeLists.txt"].
  • Tested against a synthetic multi-file project (multiple TUs, a project header included from each, a sub-CMakeLists.txt via add_subdirectory, an include()'d helper from CMAKE_MODULE_PATH). All 8 expected files were captured, nothing else.
  • Tested file(GLOB) and file(GLOB CONFIGURE_DEPENDS) together: cold build captured the recovered src_configure_deps/*.cc pattern alongside the resolved files. After adding a new file matching that pattern, the next pixi install correctly invalidated the cache and rebuilt; a third invocation reported "up-to-date" again.

AI Disclosure

  • This PR contains AI-generated content.
    • I have tested any AI-generated content in my PR.
    • I take responsibility for any AI-generated content in my PR.

Tools: Claude Code

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added sufficient tests to cover my changes.

@baszalmstra baszalmstra force-pushed the cmake-input-globs-via-ninja branch 2 times, most recently from 61e3d30 to 4980ebb on April 28, 2026 16:44
…file list

Query ninja's view of the build graph after the build runs to recover the
exact set of input files instead of returning broad globs. Two ninja
invocations cover everything:

  * 'ninja -t deps' — sources and the transitive header chain (loaded
    from the depfile DB the compiler emits).
  * 'ninja -t targets all' — surfaces CMakeLists.txt and *.cmake files
    that CMake registers as inputs to its regen rule (they appear as
    '<path>: phony' entries).

Both ninja calls run in parallel via std::thread::scope so neither can
block on a full pipe buffer waiting for the other. Source root comes from
CMakeCache.txt's CMAKE_HOME_DIRECTORY entry; paths outside it are dropped
(those are tracked by pixi's environment hash).

If anything fails (build dir missing, ninja exits non-zero, parse error)
we fall back to the original broad globs. The fallback also fixes a typo:
'**/CMakeFiles.txt' was meant to be '**/CMakeLists.txt'.
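
For illustration, reading the source root out of CMakeCache.txt can be sketched as below. It relies on the cache file's `NAME:TYPE=VALUE` entry format; the function name `source_root_from_cache` is hypothetical and the sketch does no validation of the path it returns.

```rust
/// Find the `CMAKE_HOME_DIRECTORY` entry in the text of a CMakeCache.txt.
/// Hypothetical helper; entries are expected as `NAME:TYPE=VALUE`.
fn source_root_from_cache(cache_text: &str) -> Option<&str> {
    cache_text.lines().find_map(|line| {
        // Split at the first '=' so a value like `C:/src` stays intact.
        let (key, value) = line.split_once('=')?;
        let (name, _entry_type) = key.split_once(':')?;
        (name == "CMAKE_HOME_DIRECTORY").then_some(value.trim())
    })
}
```

Comment lines (starting with `//` or `#`) contain no `NAME:TYPE=` prefix matching the key, so they fall through without special handling.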
@baszalmstra baszalmstra force-pushed the cmake-input-globs-via-ninja branch from 4980ebb to 3854460 on April 28, 2026 17:09
Bundle (source_dir, build_dir) into a `Layout` struct that owns the
relativize / relativize-glob rules, and wrap the ninja invocation in a
small `Ninja` struct so the parallel-query plumbing isn't repeated. Each
parser now takes `&str` (decoded once via `str::lines()`, which handles
both `\n` and `\r\n`) instead of `&[u8]` with manual `\r` trimming.
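
A small sketch of what a `&str`-based parser for `ninja -t deps` output might look like, showing why `str::lines()` removes the need for manual `\r` trimming. It assumes each target is printed as an unindented header line followed by indented dependency paths; `parse_deps` is a hypothetical name, and the returned paths are left relative to the build directory.

```rust
use std::collections::BTreeSet;

/// Collect the dependency paths listed in `ninja -t deps` output.
/// Hypothetical helper; assumes dependencies are the indented lines.
fn parse_deps(deps_output: &str) -> BTreeSet<&str> {
    deps_output
        // `lines()` strips both "\n" and "\r\n" terminators.
        .lines()
        // Header lines (`<object>: #deps ...`) are not indented.
        .filter(|line| line.starts_with(' ') || line.starts_with('\t'))
        .map(str::trim)
        .filter(|path| !path.is_empty())
        .collect()
}
```

Returning a set deduplicates headers that several translation units include.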
@baszalmstra baszalmstra requested a review from ruben-arts May 1, 2026 08:10
Contributor

@hunger hunger left a comment

I was thinking about doing something like this just this weekend :-)

I had opted for the file path thing though: that works independently of the build backend configured, but indeed has the disadvantage of not listing headers (at least not reliably; many projects list them in their sources, though, to make them show up in IDEs).

@baszalmstra baszalmstra merged commit 8167f04 into prefix-dev:main May 5, 2026
40 checks passed