Distributed storage system for HPC/AI workloads. Kiseki (軌跡 — trajectory) manages encrypted, content-addressed data across Slingshot, InfiniBand, RoCEv2, and TCP fabrics with S3 and NFS gateways, per-shard Raft consensus, and client-side caching on compute-node NVMe.
- Encryption-native — all data encrypted at rest and in transit (AES-256-GCM, FIPS 140-2/3 via aws-lc-rs). No plaintext past the gateway boundary.
- Multi-tenant isolation — per-tenant encryption keys, zero-trust admin boundary, HMAC-keyed chunk IDs, scoped audit trail
- HPC fabric-first — Slingshot CXI, InfiniBand, RoCEv2 with automatic failover to TCP+TLS. GPU-direct for NVIDIA and AMD.
- Client-side caching — two-tier (L1 memory + L2 NVMe) cache with staging API for training datasets. Slurm and Lattice integration.
- Raft per shard — independent consensus groups, dynamic membership, persistent log, automatic split at threshold
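
To illustrate why HMAC-keyed chunk IDs help with tenant isolation: with a plain content hash, two tenants storing identical bytes would get identical chunk IDs, which reveals that they hold the same data. Keying the hash with a per-tenant secret removes that correlation while keeping IDs deterministic within a tenant. The sketch below uses only Python's standard library; the `chunk_id` helper and the literal keys are hypothetical and are not Kiseki's actual derivation.

```python
import hashlib
import hmac


def chunk_id(tenant_key: bytes, chunk: bytes) -> str:
    """Return a hex chunk ID keyed by a per-tenant secret (illustrative only)."""
    return hmac.new(tenant_key, chunk, hashlib.sha256).hexdigest()


# Hypothetical per-tenant keys; a real deployment would fetch these from a KMS.
tenant_a = b"tenant-a-secret"
tenant_b = b"tenant-b-secret"
data = b"hello kiseki"

id_a = chunk_id(tenant_a, data)
id_b = chunk_id(tenant_b, data)

# Same tenant and same bytes: the ID is stable, so content addressing still works.
assert id_a == chunk_id(tenant_a, data)
# Different tenants and same bytes: the IDs differ, so nothing correlates across tenants.
assert id_a != id_b
print(len(id_a))  # 64 hex characters (SHA-256 digest)
```
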

Clients          FUSE mount / Python SDK / C FFI / S3 / NFS
                     │
Gateways         S3 (SigV4) ─── NFS (Kerberos/AUTH_SYS)
                     │
Data Plane       Composition → Log (Raft per shard) → Chunk (EC on devices)
                     │
Control Plane    Tenancy · IAM · Quota · Federation · Compliance · Advisory
                     │
Key Management   Internal · Vault · AWS KMS · Azure Key Vault · GCP Cloud KMS
                     │
Transports       CXI / InfiniBand / RoCEv2 / TCP+TLS

18 Rust crates, 74K+ lines, 721 tests, 26 e2e tests, 31 ADRs, 76 invariants.
Integrates with Lattice (workload scheduling), Pact (node configuration), and OpenCHAMI (boot infrastructure).
Download pre-built binaries from the latest release:
# Server (storage nodes) — pick your arch
curl -LO https://github.com/witlox/kiseki/releases/latest/download/kiseki-server-x86_64.tar.gz
tar xzf kiseki-server-x86_64.tar.gz -C /usr/local/bin/
# Client (compute nodes) — pick your arch
curl -LO https://github.com/witlox/kiseki/releases/latest/download/kiseki-client-x86_64.tar.gz
tar xzf kiseki-client-x86_64.tar.gz -C /usr/local/bin/
# Admin CLI (workstation)
curl -LO https://github.com/witlox/kiseki/releases/latest/download/kiseki-server-x86_64.tar.gz
tar xzf kiseki-server-x86_64.tar.gz -C /usr/local/bin/ kiseki-admin

| Server binaries | Client binaries |
|---|---|
| kiseki-server-x86_64 (server + admin CLI) | kiseki-client-x86_64 (CLI + libkiseki_client + header) |
| kiseki-server-aarch64 | kiseki-client-aarch64 |
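
For scripted installs, the "pick your arch" step can be automated. The following is a minimal sketch that builds a download URL from the component name and machine type. It assumes the aarch64 archives follow the same `.tar.gz` naming as the x86_64 ones shown above; the `asset_url` helper is hypothetical and not part of Kiseki.

```python
import platform

BASE = "https://github.com/witlox/kiseki/releases/latest/download"

# Architectures listed in the binaries table.
SUPPORTED_ARCHES = {"x86_64", "aarch64"}


def asset_url(component: str, machine: str) -> str:
    """Build the release URL for 'server' or 'client' on the given machine type."""
    if component not in ("server", "client"):
        raise ValueError(f"unknown component: {component}")
    if machine not in SUPPORTED_ARCHES:
        raise ValueError(f"no pre-built binary listed for: {machine}")
    # Assumption: every archive is named kiseki-<component>-<arch>.tar.gz.
    return f"{BASE}/kiseki-{component}-{machine}.tar.gz"


if __name__ == "__main__":
    try:
        # platform.machine() reports e.g. "x86_64" or "aarch64" on Linux.
        print(asset_url("client", platform.machine()))
    except ValueError as err:
        print(err)
```
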
Or use Docker:
docker pull ghcr.io/witlox/kiseki:latest

# Start the full stack (server + Jaeger + Vault + Keycloak)
docker compose up -d
# Admin dashboard
open http://localhost:9090/ui
# Create a bucket and write an object
curl -X PUT http://localhost:9000/my-bucket
curl -X PUT http://localhost:9000/my-bucket/hello.txt -d "hello kiseki"
# Check cluster status
kiseki-admin --endpoint http://localhost:9090 status

# Server admin (embedded in kiseki-server binary)
kiseki-server status # Cluster health summary
kiseki-server maintenance on # Enable read-only mode
kiseki-server shard list # List all shards
# Remote admin (from workstation)
kiseki-admin --endpoint http://node:9090 status
kiseki-admin --endpoint http://node:9090 nodes
kiseki-admin --endpoint http://node:9090 events --severity error --hours 1
# Client (compute nodes)
kiseki-client stage --dataset /training/imagenet
kiseki-client stage --status
kiseki-client cache --stats

| Feature | Description |
|---|---|
| S3 Gateway | PUT/GET/HEAD/DELETE, bucket CRUD, multipart, SigV4 auth |
| NFS Gateway | NFSv3 + NFSv4.2, AUTH_SYS/Kerberos, per-export config |
| FUSE Mount | POSIX read/write/mkdir/symlink, nested directories |
| Client Cache | L1 (memory) + L2 (NVMe), pinned/organic/bypass modes |
| Staging API | Pre-fetch datasets, Slurm prolog/epilog, Lattice integration |
| Erasure Coding | 4+2, 8+3, degraded reads, automatic repair |
| Raft Consensus | Per-shard groups, mTLS, persistent log (redb), dynamic membership |
| Transports | CXI, InfiniBand, RoCEv2, TCP+TLS with automatic failover |
| GPU-Direct | NVIDIA cuFile + AMD ROCm for zero-copy training data loading |
| Encryption | AES-256-GCM, HKDF-SHA256, FIPS via aws-lc-rs, crypto-shred |
| KMS Providers | Internal, HashiCorp Vault, AWS KMS, Azure Key Vault, GCP Cloud KMS |
| Authentication | mTLS, SPIFFE, S3 SigV4, NFS Kerberos, OIDC/JWT (RS256/ES256) |
| Observability | Prometheus metrics, structured tracing, OpenTelemetry/Jaeger |
| Admin UI | Web dashboard (HTMX + Chart.js), 3-hour metric history, alerts |
| Federation | Async cross-site replication, data residency enforcement |
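
The two erasure coding profiles in the table can be compared with simple arithmetic. A k+m scheme stores k data fragments plus m parity fragments, so raw usage is (k+m)/k times the logical size. Assuming an MDS code such as Reed-Solomon (the table does not say which code is used), any m fragments can be lost without losing data. The `ec_profile` helper below is hypothetical and only illustrates the calculation.

```python
from fractions import Fraction


def ec_profile(k: int, m: int) -> dict:
    """Summarize a k+m erasure coding scheme (k data + m parity fragments)."""
    return {
        "fragments": k + m,
        "tolerated_losses": m,  # holds for MDS codes, assumed here
        "overhead": Fraction(k + m, k),
    }


for k, m in [(4, 2), (8, 3)]:
    p = ec_profile(k, m)
    print(f"{k}+{m}: {p['fragments']} fragments, "
          f"survives {p['tolerated_losses']} losses, "
          f"{float(p['overhead']):.3f}x raw storage")
# 4+2: 6 fragments, survives 2 losses, 1.500x raw storage
# 8+3: 11 fragments, survives 3 losses, 1.375x raw storage
```

Under these assumptions, 8+3 tolerates one more lost fragment than 4+2 at lower storage overhead, at the cost of spreading each stripe across more devices.
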
Full documentation at witlox.github.io/kiseki — or build locally:
mdbook serve # http://localhost:3000

| Section | Contents |
|---|---|
| User Guide | Getting started, S3, NFS, FUSE, Python SDK, client cache |
| Administration | Deployment, configuration, monitoring, backup, key management |
| Architecture | System design, bounded contexts, data flow, encryption, Raft |
| Security | Security model, STRIDE analysis, authentication, tenant isolation |
| API Reference | gRPC, REST, CLI, environment variables |
| Decisions | 31 Architecture Decision Records |
# Build
cargo build --workspace
# Test (721 unit + integration tests)
cargo test --workspace --exclude kiseki-acceptance
# BDD acceptance tests
cargo test -p kiseki-acceptance
# E2e tests (requires Docker)
docker compose up -d
cd tests/e2e && pytest -ra
# Lint
cargo fmt --check && cargo clippy -- -D warnings

Apache-2.0. See LICENSE.
