feat: add Large File Support (LFS) subsystem#134
Conversation
Add transparent large-file offloading for Kafka via an S3-backed proxy. When enabled, oversized records are automatically replaced with compact envelope references pointing to S3 objects, keeping Kafka partitions lean while supporting arbitrarily large payloads. Core components: - pkg/lfs: envelope codec, S3 client, checksum, consumer/producer/resolver - cmd/lfs-proxy: standalone LFS proxy binary with HTTP ingest, SASL, TLS, Swagger UI, and ops tracker - cmd/proxy/lfs_*: feature-flagged LFS module merged into unified proxy - internal/console: LFS consumer, S3 browser, and metrics in web console - pkg/operator: LFS proxy resource management for Kubernetes operator - deploy/helm: Helm templates for lfs-proxy deployment and monitoring Client SDKs (Java, Python, JavaScript, browser): - lfs-client-sdk/: multi-language SDKs for producing/consuming LFS records Iceberg processor addon: - addons/processors/iceberg-processor: reads LFS envelopes and sinks to Apache Iceberg tables Also includes: - Broker fix: send error response instead of dropping TCP connection on handler errors (fixes "fetching message: EOF" with older Fetch versions) - E2E test suite for LFS proxy, SDK, and iceberg processor - CI/Helm/Docker updates for LFS proxy build and deployment Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
HTML report showing the 212 files included in the clean LFS PR and the 5,912 files excluded (node_modules, build artifacts, personal configs, docs deferred to separate PRs). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
novatechflow
left a comment
There was a problem hiding this comment.
thank @kamir - awesome! Large file support is a true issue on plain Kafka and provider-based ones!
…gitignore Enforce strict separation: core platform PR must not contain demo scripts, deployment guides, staging infra, or spring-boot demo targets. Only core LFS additions remain (build-sdk, docker-build-lfs-proxy, test-lfs-proxy-broker). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
I just noticed that this PR still duplicates the whole kafka proxy logic in cmd/lfs-proxy/*.go |
|
pkg/lfs looks like it is quite bloated. There are four different ways to resolve an LFS envelope to its payload: Consumer.Unwrap(), Consumer.UnwrapEnvelope(), Resolver.Resolve(), and Record.Value(). Reducing that and some other things to a streamlined interface would also make it easier to use in e.g. the iceberg processor. This and the duplication mentioned above seem like quick wins before we merge it. |
novatechflow
left a comment
There was a problem hiding this comment.
thats great! thanks @kamir
good points. @kamir, thoughts? |
|
I started working on the comments. Here is the plan:
|
… into single method
Addresses reviewer feedback (klaudworks): pkg/lfs had four different ways
to resolve an LFS envelope. This removes the redundancy between Unwrap()
and UnwrapEnvelope() by consolidating into a single Unwrap() that returns
(*Envelope, []byte, error). Callers that don't need the envelope metadata
simply ignore the first return value.
Before: Unwrap() -> ([]byte, error) — discards envelope
UnwrapEnvelope() -> (*Envelope, []byte, error) — preserves envelope
After: Unwrap() -> (*Envelope, []byte, error) — always available
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Addresses reviewer feedback (klaudworks): cmd/lfs-proxy/ duplicated ~30% of the unified proxy's infrastructure (TCP listener, Kafka protocol handling, connection management, health checks). LFS is now exclusively a feature-flag on the unified proxy, enabled with KAFSCALE_PROXY_LFS_ENABLED=true. The unified proxy provides partition-aware routing (vs round-robin) and consumer group support. Removed: - cmd/lfs-proxy/ (5,733 lines of Go source + tests) - deploy/docker/lfs-proxy.Dockerfile - deploy/helm/kafscale/templates/lfs-proxy-*.yaml (6 Helm templates) - test/e2e/lfs_proxy_*.go (e2e tests that exec'd the standalone binary) - CI jobs: build-lfs-proxy, e2e-lfs-proxy - Makefile targets: docker-build-lfs-proxy, test-lfs-proxy-broker Kept: - cmd/proxy/lfs_*.go (unified proxy LFS module) - pkg/lfs/ (shared LFS library) - api/lfs-proxy/openapi.yaml (API spec) - lfs-client-sdk/ (all client SDKs) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove unused 'Any' import in envelope.py - Add explanatory comment to empty except clause in producer.py Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 123 new tests covering all LFS code paths in the unified proxy: - lfs_http_test.go (70 tests): HTTP handlers, validation, CORS, multipart sessions - lfs_test.go (53 tests): record encoding, compression, headers, metrics, histogram All 171 cmd/proxy tests pass (including pre-existing proxy tests). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@klaudworks - pls review too |
Summary
Transparent large-file offloading for Kafka via an S3-backed proxy. When enabled, oversized records are automatically replaced with compact envelope references pointing to S3 objects, keeping Kafka partitions lean while supporting arbitrarily large payloads.
What changed (reviewer feedback addressed)
After initial submission, two concerns were raised by @klaudworks:
cmd/lfs-proxy/duplicated proxy infrastructure — Removed entirely (5,733 lines). LFS is now exclusively a feature-flag on the unified proxy (KAFSCALE_PROXY_LFS_ENABLED=true), which provides partition-aware routing and consumer group support.pkg/lfshad redundant envelope resolution methods — ConsolidatedUnwrap()andUnwrapEnvelope()into a singleUnwrap() -> (*Envelope, []byte, error)method. Callers that don't need envelope metadata simply ignore the first return value.Additionally:
Commit history
feb7e4f8ebdbdb84fc96d604Consumer.Unwrap()andUnwrapEnvelope()into single method4cc089fec688877f623f09f5Core Go packages
pkg/lfs— envelope codec, S3 client, checksum, consumer/producer/resolvercmd/proxy/lfs_*.go— feature-flagged LFS module in the unified proxyinternal/console— LFS consumer, S3 browser, and metrics in web consolepkg/operator— LFS proxy Kubernetes resource managementDeployment
deploy/helm— Helm values for LFS config (proxy.lfs.enabled)deploy/docker-compose— local development compose (unified proxy with LFS)Client SDKs
lfs-client-sdk/java/(Maven, envelope codec, producer, consumer, resolver)lfs-client-sdk/python/(envelope, producer, resolver)lfs-client-sdk/js/(Node.js SDK with tests)lfs-client-sdk/js-browser/(TypeScript browser SDK)Addons
addons/processors/iceberg-processor/reads LFS envelopes and sinks to Apache Iceberg tablesBug fix included
Test coverage
171 tests in
cmd/proxy/— all passing:lfs_http_test.go(70 tests) — HTTP handlers (/lfs/produce,/lfs/download,/lfs/uploads), request validation, CORS, multipart upload sessions, error responseslfs_test.go(53 new tests) — record batch encoding, varint codec, CRC32 checksums, compression (none/gzip/snappy), Kafka header manipulation, S3 put/get/delete, Prometheus metrics, histogram bucketspkg/lfs/tests — envelope encode/decode, consumer unwrap (consolidated API), resolver with checksum validation, S3 client streaminglfs-client-sdk/tests — Java (JUnit), JavaScript (Jest), Python (pytest)File count
~170 files changed (down from 212 after removing the standalone lfs-proxy).
Test plan
go vet ./...— cleango test ./...— all packages pass🤖 Generated with Claude Code