Skip to content

Vendor CityHash as an internal part of the library#533

Merged
slabko merged 2 commits into
masterfrom
merge-cityhash
Jul 2, 2026
Merged

Vendor CityHash as an internal part of the library#533
slabko merged 2 commits into
masterfrom
merge-cityhash

Conversation

@slabko

@slabko slabko commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Vendor CityHash as an internal part of the library

ClickHouse's wire protocol checksums compressed blocks with a specific
frozen CityHash128 variant. It is not interchangeable with upstream or
system CityHash: a different implementation produces different hashes and
the server rejects blocks with "Checksum doesn't match: corrupted data".

Treating it as a swappable contrib/ dependency was therefore misleading
and unsafe. Move it out of contrib/ and into the library proper:

  • Move contrib/cityhash/cityhash/ -> clickhouse/cityhash/.
  • Wrap the code in a clickhouse::cityhash namespace to prevent symbol
    clashes when consumers link their own CityHash; update call sites in
    compressed.cpp, lowcardinality.cpp and types.cpp accordingly.
  • Drop the WITH_SYSTEM_CITYHASH option, which could silently break wire
    compatibility, along with the Findcityhash machinery it required.
  • Compile CityHash directly into the clickhouse target (Bazel and CMake)
    instead of as a separate library.
  • Remove the cityhash Bazel target and move it to clikchouse target itself.

slabko added 2 commits July 2, 2026 18:17
ClickHouse's wire protocol checksums compressed blocks with a specific
frozen CityHash128 variant. It is not interchangeable with upstream or
system CityHash: a different implementation produces different hashes and
the server rejects blocks with "Checksum doesn't match: corrupted data".

Treating it as a swappable contrib/ dependency was therefore misleading
and unsafe. Move it out of contrib/ and into the library proper:

- Move contrib/cityhash/cityhash/ -> clickhouse/cityhash/.
- Wrap the code in a clickhouse::cityhash namespace to prevent symbol
  clashes when consumers link their own CityHash; update call sites in
  compressed.cpp, lowcardinality.cpp and types.cpp accordingly.
- Drop the WITH_SYSTEM_CITYHASH option, which could silently break wire
  compatibility, along with the Findcityhash machinery it required.
- Compile CityHash directly into the clickhouse target (Bazel and CMake)
  instead of as a separate library.
@slabko slabko changed the title Merge cityhash Vendor CityHash as an internal part of the library Jul 2, 2026
@slabko slabko mentioned this pull request Jul 2, 2026
1 task
@slabko slabko marked this pull request as ready for review July 2, 2026 16:47
@slabko slabko requested a review from mzitnik as a code owner July 2, 2026 16:47
@slabko

slabko commented Jul 2, 2026

Copy link
Copy Markdown
Contributor Author

@BYVoid, if you have a moment, can check if this change is ok 2f1783a

@BYVoid

BYVoid commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

LGTM

@joe-clickhouse joe-clickhouse left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@slabko slabko merged commit cf5c40b into master Jul 2, 2026
78 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants