Mount a ClickHouse server as a read-only POSIX filesystem. Browse databases
with ls, stream tables with cat, slice with head/tail/grep — no
client library, no SQL boilerplate.
Website: https://donge.github.io/clickfs
Status: MVP / experimental. Read-only. Single-node HTTP only. Tested on Linux (libfuse3) and macOS (macFUSE).
Sometimes you just want to grep a table. clickhouse-client -q "SELECT * FROM db.t FORMAT TSV" | grep ... works, but breaks shell muscle memory: no tab
completion, no find, no piping into tools that expect file arguments.
clickfs mounts the server so the shell is the client:

    clickfs mount http://localhost:8123 /mnt/ch &
    ls /mnt/ch/db/default/
    cat /mnt/ch/db/default/events/all.tsv | head -n 100
    wc -l /mnt/ch/db/default/events/2024-01-15.tsv
    cat /mnt/ch/db/default/events/.schema

Everything is streamed: no temp files, no buffering of full result sets.
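
Because a table appears as an ordinary file, any language's file API can consume it, not just shell tools. The following is a minimal sketch in Python, assuming the file is in TSVWithNames format (first line is the header). The helper name `iter_rows` is hypothetical and not part of clickfs; values are returned as raw strings, so ClickHouse backslash escapes such as `\t` are left undecoded.

```python
import csv
from typing import Dict, Iterator


def iter_rows(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, reading the file strictly front to back."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE: TSV has no quoting, so quote characters are plain data.
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            yield row
```

With a mount at /mnt/ch as in the example above, `iter_rows("/mnt/ch/db/default/events/all.tsv")` would walk the table row by row. The reader only moves forward, which fits a streamed file.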

    /mnt/ch/
    └── db/
        └── <database>/
            └── <table>/
                ├── .schema        # SHOW CREATE TABLE output
                ├── README.md      # AI/agent-friendly summary (regenerated on open)
                ├── all.tsv        # full table, TSVWithNames
                └── <part_id>.tsv  # one file per partition_id

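
The layout above implies a simple mapping from a path to what should be queried. The sketch below shows one way to express that mapping in Python. It is illustrative only: `Target` and `classify` are hypothetical names, and the choice to let `all.tsv` win over a partition named `all` is an assumption, not documented clickfs behaviour.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    database: str
    table: str
    kind: str                          # "schema", "readme", "all", or "partition"
    partition_id: Optional[str] = None


def classify(rel_path: str) -> Optional[Target]:
    """Map a path relative to the mount root onto a file kind, or None."""
    parts = [p for p in rel_path.split("/") if p]
    # Only db/<database>/<table>/<file> names a file; anything else is a directory or invalid.
    if len(parts) != 4 or parts[0] != "db":
        return None
    _, database, table, name = parts
    fixed = {".schema": "schema", "README.md": "readme", "all.tsv": "all"}
    if name in fixed:
        return Target(database, table, fixed[name])
    if name.endswith(".tsv") and len(name) > len(".tsv"):
        return Target(database, table, "partition", name[: -len(".tsv")])
    return None
```

For example, `classify("db/default/events/2024-01-15.tsv")` yields a partition target with `partition_id="2024-01-15"`.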
Partitions are listed dynamically from system.parts (active only).
all.tsv and per-partition files are streamed lazily on read().
README.md is synthesized on each open() from 5 concurrent
sub-queries (DESCRIBE, aggregate stats from system.parts,
system.columns, COUNT(), and a 5-row sample). It contains
Stats, Schema, Columns, Sample, Example queries,
and Files — everything an agent needs to understand a table
without writing SQL. Failed sub-queries degrade to _(unavailable)_
so a missing privilege never black-holes the file.
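
The degrade-on-failure behaviour can be sketched as follows. This is a simplified illustration in Python, not the clickfs implementation: `build_readme` is a hypothetical name, each section is represented by a plain callable standing in for a sub-query, and the `## Title` heading layout is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

UNAVAILABLE = "_(unavailable)_"


def build_readme(sections: Dict[str, Callable[[], str]]) -> str:
    """Run every section concurrently; a failure blanks only that section."""
    with ThreadPoolExecutor(max_workers=max(1, len(sections))) as pool:
        futures = {title: pool.submit(fn) for title, fn in sections.items()}
    # Leaving the `with` block waits for all submitted calls to finish.
    out = []
    for title, fut in futures.items():
        try:
            body = fut.result()
        except Exception:
            # e.g. a missing privilege on system.parts
            body = UNAVAILABLE
        out.append(f"## {title}\n\n{body}\n")
    return "\n".join(out)
```

Catching the error per section, rather than around the whole build, is what keeps one denied query from emptying the file.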

    curl -fsSL https://donge.github.io/clickfs/install.sh | sh

Downloads the latest prebuilt binary to ~/.local/bin/clickfs. For a system-wide install:

    curl -fsSL https://donge.github.io/clickfs/install.sh | sudo sh -s -- --prefix /usr/local

Linux needs a fuse3 package (sudo apt install fuse3 or sudo yum install fuse3); macOS needs macFUSE (brew install --cask macfuse).

    cargo install clickfs

Or with cargo-binstall for prebuilt binaries (no compile):

    cargo binstall clickfs

To build from a source checkout:

    cargo install --path .
    # or
    cargo build --release && ./target/release/clickfs --help

The general invocation is:

    clickfs mount <URL> <MOUNTPOINT> [options]

Common options:

| Flag | Default | Description |
|---|---|---|
| `--user` | `default` | ClickHouse user (or `CLICKFS_USER` env) |
| `--password` | (empty) | Password (or `CLICKFS_PASSWORD` env, recommended) |
| `--allow-other` | off | Let other UIDs see the mount |
| `--no-auto-unmount` | off | Keep the mount alive after the process exits (Linux) |
| `--query-timeout` | `60` | Server-side query timeout (seconds) |
| `--max-result-bytes` | `1073741824` | Per-query byte cap (1 GiB) |
| `--cache-ttl-ms` | `2000` | Metadata cache TTL (or `CLICKFS_CACHE_TTL_MS`); `0` disables |
| `--no-compression` | off | Disable HTTP gzip (default sends `Accept-Encoding: gzip` and `enable_http_compression=1`) |
| `--insecure` | off | Skip TLS cert + hostname verification (dev only) |
| `--ca-bundle <PATH>` | (unset) | Extra PEM CA bundle to trust on top of system roots |

Example:

    CLICKFS_PASSWORD=secret clickfs mount \
      https://clickhouse.example.com:8443 \
      /mnt/ch \
      --user analyst

The process runs in the foreground; Ctrl-C unmounts and exits.

    clickfs umount /mnt/ch

Or use the platform tool: `fusermount -u /mnt/ch` or `umount /mnt/ch`.

Each open file handle is one streaming SELECT. The kernel must read
contiguously from offset 0 forward — random seeks return EIO. This works
fine for cat, head, tail -c +N (after stream reposition), wc, grep,
and any pipeline. It will not work for mmap or random-access editors.
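
The rule "each handle is one forward-only stream" can be sketched in a few lines of Python. This is a deliberately simplified model, not the clickfs code: `SequentialHandle` is a hypothetical name, an iterator of byte chunks stands in for the streaming HTTP response, and every non-contiguous offset is rejected, whereas clickfs itself may tolerate some repositioning.

```python
import errno
from typing import Iterable


class SequentialHandle:
    """Serve read(offset, size) from a forward-only stream of byte chunks."""

    def __init__(self, chunks: Iterable[bytes]) -> None:
        self._chunks = iter(chunks)   # stand-in for the streaming SELECT response
        self._buf = b""               # bytes received but not yet handed out
        self._pos = 0                 # the only offset that can be served next

    def read(self, offset: int, size: int) -> bytes:
        if offset != self._pos:
            # Earlier bytes are gone and later ones have not arrived yet.
            raise OSError(errno.EIO, "non-sequential read")
        while len(self._buf) < size:
            chunk = next(self._chunks, None)
            if chunk is None:         # stream exhausted: short read, then EOF
                break
            self._buf += chunk
        data, self._buf = self._buf[:size], self._buf[size:]
        self._pos += len(data)
        return data
```

Tools such as cat and grep always ask for the next offset, so they never hit the error path; an editor that jumps around the file does.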
For the common "show me the latest N rows" use case, all.tsv advertises
a virtual EOF at 2^63 and supports tail-mode: when a reader pread()s
deep inside that pseudo-EOF window (as tail -n N, less +G, and many
log viewers do), clickfs transparently issues a one-shot
SELECT * FROM <tbl> ORDER BY <pk> DESC LIMIT N FORMAT TabSeparatedWithNames
and serves the (row-reversed) result from an in-memory buffer pinned to
the end of the file. The header line is preserved.
- Default: enabled, N = 10000. Tune with `--tail-rows N` / `CLICKFS_TAIL_ROWS`, or disable with `--no-tail`.
- The ORDER BY column list comes from `system.tables.primary_key`, falling back to `sorting_key`, then `tuple()` for engines without a key (Memory/Log/StripeLog).
- Multi-column keys correctly apply `DESC` per column.
- The buffer is per-fd; reads strictly outside the materialized window still return `EIO` with a debug-level reverse-seek hint.

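
The query construction and row reversal described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names are hypothetical, the table name is interpolated unquoted (real code would need identifier quoting), the `tuple()` fallback is emitted without `DESC`, and every row is assumed to end with a newline.

```python
from typing import List, Optional


def split_key(expr: str) -> List[str]:
    """Split a key expression on top-level commas only."""
    parts, depth, cur = [], 0, ""
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(cur.strip())
            cur = ""
        else:
            cur += ch
    if cur.strip():
        parts.append(cur.strip())
    return parts


def tail_query(table: str, primary_key: Optional[str],
               sorting_key: Optional[str], n: int) -> str:
    """Build the one-shot 'latest N rows' query, DESC applied per key column."""
    cols = split_key(primary_key or sorting_key or "")
    order = ", ".join(f"{c} DESC" for c in cols) if cols else "tuple()"
    return (f"SELECT * FROM {table} ORDER BY {order} "
            f"LIMIT {n} FORMAT TabSeparatedWithNames")


def reverse_rows(tsv: bytes) -> bytes:
    """Keep the header line first and flip the data rows back to ascending order."""
    lines = tsv.splitlines(keepends=True)
    return b"".join(lines[:1] + lines[1:][::-1])
```

Splitting only on top-level commas matters for keys such as `toDate(ts), cityHash64(a, b)`, where a naive split would break the second expression in two.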
FOPEN_DIRECT_IO is set on data files so the kernel page cache does not
attempt readahead beyond the current position.
Every mutating operation (write, mkdir, unlink, rename, truncate,
chmod, chown, create, ...) returns EROFS. v1 is intentionally
read-only; mutation requires careful design around ClickHouse's
non-transactional model.
Uses tracing; configure via RUST_LOG:

    RUST_LOG=clickfs=debug clickfs mount http://localhost:8123 /mnt/ch
    RUST_LOG=clickfs::stream=trace,clickfs::driver=debug clickfs mount ...

All logs go to stderr.

See docs/ARCHITECTURE.md for the full design, plus:

- docs/path-mapping.md: path → query plan
- docs/streaming-read.md: async → sync bridge
- docs/query-construction.md: SQL builders
- docs/observability.md: logging conventions
- docs/traits.md: internal interfaces

- HTTP/HTTPS protocol only (no native TCP)
- Read-only
- TSV output only (TSVWithNames for all.tsv)
- Strict-sequential reads per fd; backwards seeks return EIO unless they land inside the tail-mode materialization window (see "Reads are strict-sequential" above)
- Metadata listings (db/table/partition/existence) are cached for `--cache-ttl-ms` (default 2000 ms); `.schema` and data streams are always fresh
- Stream cancellation issues `KILL QUERY` to the server in addition to dropping the HTTP connection; queries are also bounded by `max_execution_time=60`
- Single ClickHouse server; no cluster / replica routing

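
The metadata caching rule (entries live for the TTL, and a TTL of 0 turns caching off) can be sketched like this. It is a minimal Python illustration, not the clickfs cache: `TtlCache` is a hypothetical name and the injectable `clock` parameter exists only to make the behaviour easy to exercise.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Cache values per key for ttl_ms milliseconds; ttl_ms == 0 disables caching."""

    def __init__(self, ttl_ms: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_ms / 1000.0
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, load: Callable[[], Any]) -> Any:
        if self._ttl <= 0:
            return load()                       # caching disabled: always query
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]                       # still fresh
        value = load()                          # missing or expired: reload
        self._entries[key] = (now, value)
        return value
```

With the default of 2000 ms, two `ls` calls on the same directory within two seconds would trigger one listing query in this model, and a third call after the TTL would trigger another.
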
- Unit tests: `cargo test --bin clickfs` (56 cases, no FUSE needed)
- End-to-end: `tests/e2e.sh`, 48 cases against a real ClickHouse + mounted FUSE. See tests/README.md.

    CLICKFS_PASSWORD='secret' tests/e2e.sh --build

License: MIT OR Apache-2.0