Conversation
17d92e9 to
500c05b
Compare
theory
reviewed
May 23, 2026
serprex
commented
May 23, 2026
serprex
commented
May 23, 2026
serprex
commented
May 23, 2026
3 tasks
b4bc592 to
9ee9486
Compare
theory
reviewed
May 27, 2026
theory
reviewed
May 27, 2026
theory
reviewed
May 27, 2026
b9e53d5 to
8c62cc3
Compare
theory
requested changes
May 27, 2026
Collaborator
theory
left a comment
There was a problem hiding this comment.
SO close! Happy to help resolve some of these. Would also like to squash it into a single commit with a nice description.
bad039f to
8548ff7
Compare
serprex
commented
May 28, 2026
theory
reviewed
May 28, 2026
9214615 to
300dfc5
Compare
The clickhouse-cpp library's use of [RAII] and C++ exceptions conflicted
with PostgreSQL's memory management and `PG_TRY`/`setjmp`/`longjmp`
patterns. Analysis of these conflicts lead to the conclusion that
continuing to use the C++ library was unsafe.
Instead, replace the vendored clickhouse-cpp library with the new
headers-only [clickhouse-c] client, also vendored. This package provides
an interface for memory management functions, the `chc_alloc` struct,
mapped here to the Postgres `palloc` family of functions, to ensure
consistent memory handling. It also supports fetching results by block,
rather than all at once, for reduced memory consumption.
This change results in greatly reduced build time without the need to
compile the vendored C++ library, and reduces the size of the resulting
shared library ~75%.
The new library supports all the same features as clickhouse-cpp, but at
a lower level, such that we must handle the conversion of values from
the ClickHouse binary wire format. Most of the conversions are fairly
straightforward, with a few exceptions:
* Encoding and decoding Decimal values
* Encoding parameters requires special quoting and escaping
* Connecting via TCP or TLS requires lower-level socket configuration
However, it also means we no longer rely on clickhouse-cpp's vendored
dependencies (absl, cityhash, lz4, and zstd). Instead, we simply import
the appropriate headers to support compression and encryption and
require the necessary libraries (`liblz4` and `libzstd`). In other
words, no more vendoring aside from the headers.
Take advantage of the consistent use of PostgreSQL's memory management by
creating memory contexts specific to each operation for easy cleanup.
Replace the `src/binary.cpp` and `src/convert.c` files with the
`src/binary` directory with various responsibilities split by `.c` file
for clearer organization:
* `binary.c` - core glue for the driver
* `binary_internal.h` - Private state for the driver
* `connection.c` - TCP with and without TLS connections
* `convert.c` - Binary value conversion, mostly unchanged from
`src/convert.c`
* `decode.c` - Convert ClickHouse wire column values to Postgres
Datums; used for `SELECT`
* `encode.c` - Convert Postgres Datums to ClickHouse binary values and
append to columns; used for `INSERT`
* `insert.c` - Handles `INSERT` process: prepare, insert (flush), and
finalization, along with appending values to ClickHouse columns
* `select.c` - Handles `SELECT` process: simple query, fetching rows
(pump) and cleanup; returns results by block, rather than buffering
all the results at once
Preserve all previous behaviors with three exceptions:
* UInt16 values were previously incorrectly cast to `int16` instead of
`int32`. Fixed here, along with the tests, bringing the behavior in
line with the http driver.
* Bool values cast to `bool`.
* Improved some messages and added additional query context to error
messages.
Update the `Dockerfile` to require `liblz4` and `libzstd`.
Vastly simplify `Makefile`, now that we no longer need to manage a
vendored C++ library that uses `cmake`. Instead we simply add all the C
files to `OBJS` and pull the vendored submodule if it's not already
present (it ships with release packages). We also add the no-longer
vendored `lz4` and `zstd` dependencies to `PG_LDFLAGS`. We also do away
with `clang-format`, which we'd used only for the C++ files, and teach
`pg_bsd_indent` and `clang-tidy` to find the new `*.c` files in
`src/binary`.
Update the GitHub workflow to eliminate caching, since we no longer
build `clickhouse-cpp`, and to install the `lz4` and `zstd`
dependencies. We also remove the separate static and dynamic tests,
since there is just the static build now.
[RAII]: https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization
[clickhouse-c]: https://github.com/ClickHouse/clickhouse-c
Co-authored-by: David E. Wheeler <david.wheeler@clickhouse.com>
This was referenced May 29, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Resolves #241 and #254.