Skip to content

Track #embed file dependencies in CMake and enable ccache depend mode#100411

Merged
alexey-milovidov merged 10 commits intomasterfrom
cmake-embed-deps
Mar 29, 2026
Merged

Track #embed file dependencies in CMake and enable ccache depend mode#100411
alexey-milovidov merged 10 commits intomasterfrom
cmake-embed-deps

Conversation

@alexey-milovidov
Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov commented Mar 23, 2026

The C++ #embed directive includes file contents at compile time,
but the build system does not automatically track these as dependencies.
When an embedded file (e.g., play.html) changes, the .cpp file that
embeds it is not recompiled.

Fix by adding OBJECT_DEPENDS properties to source files that use #embed,
and enabling ccache depend mode so that ccache also correctly invalidates
when #embed-ed files change (ccache's direct mode parser does not recognize
#embed).

Changelog category (leave one):

  • Build/Testing/Packaging Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Track #embed file dependencies in CMake and enable ccache depend mode for correct rebuilds.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

alexey-milovidov and others added 3 commits March 23, 2026 03:04
The C++ `#embed` directive includes file contents at compile time,
but the build system does not automatically track these as
dependencies. When an embedded file (e.g., `play.html`) changes,
the .cpp file that embeds it is not recompiled.

Fix this by adding `OBJECT_DEPENDS` properties to all source files
that use `#embed`, covering: HTML pages, JavaScript files, XML
configs, and NLP data files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ccache's direct mode uses its own source parser to find included
files, but it does not recognize C23 `#embed` directives. This
means changes to files included via `#embed` (HTML pages, JS
libraries, XML configs) do not invalidate the ccache, leading to
stale build artifacts.

Enable depend mode (available since ccache 4.0), which uses the
compiler's dependency file (.d) instead. The compiler correctly
lists `#embed` files in its dependency output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Mar 23, 2026

Workflow [PR], commit [0b63b69]

Summary:


AI Review

Summary

This PR fixes stale rebuild behavior for C23 #embed inputs by adding OBJECT_DEPENDS for embedded resources and enabling ccache depend mode for compatible versions. The latest PR head also addresses earlier coverage gaps noted in prior bot feedback (FrequencyHolder and attachInformationSchemaTables). I did not find additional high-confidence correctness, safety, performance, or compatibility issues in the current diff.

Missing context
  • ⚠️ No CI logs or benchmark reports were provided in this review context.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh Bot added the pr-build Pull request with build/testing/packaging improvement label Mar 23, 2026
Comment thread src/CMakeLists.txt Outdated
set_source_files_properties(Server/KeeperDashboardRequestHandler.cpp PROPERTIES OBJECT_DEPENDS
"${CMAKE_SOURCE_DIR}/programs/keeper/dashboard.html")

# Note: Common/FrequencyHolder.cpp embeds files from contrib/nlp-data/ via #embed,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#embed dependency tracking is still incomplete for src/Common/FrequencyHolder.cpp.

Right now this PR adds OBJECT_DEPENDS for most #embed users, but explicitly skips NLP blobs. That means when contrib/nlp-data/*.zst changes in a checkout where those files exist, FrequencyHolder.cpp can still be left stale and not recompiled.

Please add conditional dependencies instead of skipping entirely, e.g. guard with if (EXISTS ...) and then call set_source_files_properties(Common/FrequencyHolder.cpp PROPERTIES OBJECT_DEPENDS "...charset.zst;...tonality_ru.zst").

Copy link
Copy Markdown
Member

@azat azat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Comment thread src/CMakeLists.txt Outdated
"${CMAKE_SOURCE_DIR}/programs/keeper/dashboard.html")

# Note: Common/FrequencyHolder.cpp embeds files from contrib/nlp-data/ via #embed,
# but that submodule is not always checked out, so we cannot add OBJECT_DEPENDS for it.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can do this only of this file exists? (i.e. submodules has been initialized)

@rschu1ze
Copy link
Copy Markdown
Member

While at it, would you please remove the comments

/// Note: CMake doesn't recognize changes in #embed-ed files. If you change any of these files, you will need to
/// make a scratch build.

which I introduced with #99075?

Comment thread src/CMakeLists.txt
set_source_files_properties(Server/IndexRequestHandler.cpp PROPERTIES OBJECT_DEPENDS
"${CMAKE_SOURCE_DIR}/programs/server/index.html")

set_source_files_properties(Server/KeeperDashboardRequestHandler.cpp PROPERTIES OBJECT_DEPENDS
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still leaves one #embed consumer untracked: src/Storages/System/attachInformationSchemaTables.cpp embeds multiple files under src/Storages/System/information_schema/*.sql, but there is no matching OBJECT_DEPENDS entry.

As a result, edits to these SQL templates can still reuse stale objects and skip recompilation. Please add set_source_files_properties(Storages/System/attachInformationSchemaTables.cpp PROPERTIES OBJECT_DEPENDS "...") with the embedded SQL files.

alexey-milovidov and others added 5 commits March 26, 2026 22:47
…ng deps

- Remove "CMake doesn't recognize changes in #embed-ed files" comments
  from all files, since this PR now tracks those dependencies.
- Add conditional `OBJECT_DEPENDS` for `FrequencyHolder.cpp` NLP blobs
  (guarded by `if (EXISTS ...)` for when the submodule is not checked out).
- Add `OBJECT_DEPENDS` for `attachInformationSchemaTables.cpp` which was
  also missing dependency tracking for its embedded SQL files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Mar 29, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.20% 84.10% -0.10%
Functions 24.60% 24.60% +0.00%
Branches 76.70% 76.70% +0.00%

Changed lines: PR changed-lines coverage: N/A (no coverable changed lines found in .info). · Uncovered code

Full report · Diff report

@alexey-milovidov alexey-milovidov merged commit 9ddc576 into master Mar 29, 2026
152 of 153 checks passed
@alexey-milovidov alexey-milovidov deleted the cmake-embed-deps branch March 29, 2026 15:46
@robot-ch-test-poll robot-ch-test-poll added the pr-synced-to-cloud The PR is synced to the cloud repo label Mar 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-build Pull request with build/testing/packaging improvement pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants