feat: Add iceberg-lance module with Lance columnar format support #15580
Open
fightBoxing wants to merge 2 commits into apache:main
added 2 commits
March 10, 2026 19:12
- Add LANCE enum to FileFormat
- Add lance module configuration in settings.gradle and build.gradle
- Implement core lance module:
  - Lance.java: main entry with ReadBuilder/WriteBuilder/DataWriteBuilder
  - LanceSchemaUtil: Iceberg Schema <-> Arrow Schema conversion
  - LanceValueWriters/Readers: type-specific value writers and readers
  - LanceFileAppender: FileAppender implementation with metrics collection
  - LanceIterable: CloseableIterable with projection support
  - LanceMetrics: metrics builder and collector
  - LanceUtil: configuration constants and utilities
- Implement data layer adapters:
  - GenericLanceReader/Writer for GenericRecord support
- Add comprehensive test suite (60 test cases, 8 test classes)
- Add Lance format design document

- Apply Google Java Format via spotlessApply
- Replace new ArrayList<>() with Lists.newArrayList() (checkstyle)
- Add .hasMessage() to assertThatThrownBy assertions (checkstyle)
- Fix Javadoc HTML entity escaping for <-> symbols
- Remove unused imports
Summary
Add a new iceberg-lance module to support Lance columnar data format in Apache Iceberg. Lance is a modern columnar format optimized for ML/AI workloads with native vector search, O(1) random access, and zero-copy Arrow integration.
Changes
Modified Files
- api/.../FileFormat.java: Added LANCE("lance", true) enum value
- settings.gradle: Registered lance module
- build.gradle: Added iceberg-lance project configuration with Arrow dependencies
New Module: iceberg-lance (12 core files + 8 test files)
Core Classes
- Lance.java: WriteBuilder, ReadBuilder, DataWriteBuilder (follows Parquet/ORC pattern)
- LanceSchemaUtil.java
- LanceValueWriters.java
- LanceValueReaders.java
- LanceFileAppender.java: FileAppender implementation with Metrics collection
- LanceIterable.java: CloseableIterable implementation with column projection support
- LanceMetrics.java: MetricsCollector for rowCount/columnSizes/valueCounts/nullCounts/bounds
- LanceUtil.java
Data Layer Integration
- GenericLanceReader.java: Record reader adapter
- GenericLanceWriter.java: Record writer adapter
Tests (60 test cases, all passing)
- TestFileFormatLance
- TestLanceSchemaUtil
- TestLanceValueReadersWriters
- TestLanceMetrics
- TestLanceDataWriter
- TestLanceDataReader
- TestLanceReadProjection
- TestLanceUtil
Architecture Design
Why Lance in Iceberg?
Extension Architecture
Type Mapping (Iceberg ↔ Arrow ↔ Lance)
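As a rough illustration of what a two-way type mapping of this kind involves, here is a minimal, dependency-free sketch. It is not the PR's LanceSchemaUtil: the class name TypeMappingSketch, the method names, and the string labels for Arrow types are all hypothetical, and the pairs shown are only the conventional Arrow equivalents of a few Iceberg primitives. The actual mapping in the PR may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a bidirectional Iceberg <-> Arrow primitive type lookup. */
public class TypeMappingSketch {
  private static final Map<String, String> ICEBERG_TO_ARROW = new LinkedHashMap<>();
  private static final Map<String, String> ARROW_TO_ICEBERG = new LinkedHashMap<>();

  // Registers one pair in both directions so the two maps cannot drift apart.
  private static void register(String icebergType, String arrowType) {
    ICEBERG_TO_ARROW.put(icebergType, arrowType);
    ARROW_TO_ICEBERG.put(arrowType, icebergType);
  }

  static {
    // Assumed pairs; the Arrow labels are descriptive strings, not Arrow API calls.
    register("boolean", "Bool");
    register("int", "Int(32, signed)");
    register("long", "Int(64, signed)");
    register("float", "FloatingPoint(SINGLE)");
    register("double", "FloatingPoint(DOUBLE)");
    register("string", "Utf8");
    register("binary", "Binary");
    register("date", "Date(DAY)");
  }

  /** Returns the Arrow label for an Iceberg primitive, or throws if it is not mapped. */
  static String toArrow(String icebergType) {
    String arrowType = ICEBERG_TO_ARROW.get(icebergType);
    if (arrowType == null) {
      throw new UnsupportedOperationException("Unsupported Iceberg type: " + icebergType);
    }
    return arrowType;
  }

  /** Returns the Iceberg primitive for an Arrow label, or throws if it is not mapped. */
  static String toIceberg(String arrowType) {
    String icebergType = ARROW_TO_ICEBERG.get(arrowType);
    if (icebergType == null) {
      throw new UnsupportedOperationException("Unsupported Arrow type: " + arrowType);
    }
    return icebergType;
  }

  public static void main(String[] args) {
    for (String icebergType : ICEBERG_TO_ARROW.keySet()) {
      System.out.println(icebergType + " <-> " + toArrow(icebergType));
    }
  }
}
```

Failing loudly on an unmapped type, rather than falling back to a default, keeps an unsupported column from being written with the wrong physical representation.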
Design Principles
- Follows the iceberg-parquet and iceberg-orc module structure
- InternalData, no impact on existing functionality
- LanceIterable supports projection reads with fieldId matching
Implementation Roadmap
CI Checks
All checks pass locally:
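Returning to the projection behavior listed under Design Principles: the idea of matching projected columns by field ID rather than by name can be sketched without any Iceberg or Lance dependency. The class FieldIdProjectionSketch, its Field type, and both methods below are hypothetical names for illustration only and do not come from the PR; how LanceIterable actually resolves columns is not shown in this description.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of projection by field ID instead of by column name. */
public class FieldIdProjectionSketch {
  /** Minimal column descriptor: a stable field ID plus a (renamable) name. */
  static final class Field {
    final int id;
    final String name;

    Field(int id, String name) {
      this.id = id;
      this.name = name;
    }
  }

  /**
   * For each projected field, finds the position of the file column with the same ID.
   * A position of -1 means the file has no column with that ID.
   */
  static int[] resolvePositions(List<Field> fileSchema, List<Field> projection) {
    Map<Integer, Integer> positionById = new HashMap<>();
    for (int i = 0; i < fileSchema.size(); i++) {
      positionById.put(fileSchema.get(i).id, i);
    }
    int[] positions = new int[projection.size()];
    for (int i = 0; i < projection.size(); i++) {
      positions[i] = positionById.getOrDefault(projection.get(i).id, -1);
    }
    return positions;
  }

  /** Builds a projected row; columns missing from the file are filled with null. */
  static Object[] project(Object[] fileRow, int[] positions) {
    Object[] projected = new Object[positions.length];
    for (int i = 0; i < positions.length; i++) {
      projected[i] = positions[i] >= 0 ? fileRow[positions[i]] : null;
    }
    return projected;
  }

  public static void main(String[] args) {
    List<Field> fileSchema =
        Arrays.asList(new Field(1, "id"), new Field(2, "name"), new Field(3, "embedding"));
    // Field 2 was renamed after the file was written; field 4 was added later.
    List<Field> projection = Arrays.asList(new Field(2, "full_name"), new Field(4, "score"));

    int[] positions = resolvePositions(fileSchema, projection);
    Object[] row = project(new Object[] {7L, "Ada", "vec"}, positions);
    System.out.println(Arrays.toString(row));
  }
}
```

Matching on IDs means a column renamed in the table schema still resolves to the data written under its old name, and a column added after the file was written reads as null instead of failing.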