fix: Add complex types testing for lance #17769

Merged
voonhous merged 2 commits into apache:master from rahil-c:rahil/hudi-lance-complex-type-testing
Jan 8, 2026

@rahil-c rahil-c commented Jan 2, 2026

Describe the issue this Pull Request addresses

Issue: #17665

Add complex-type testing for the Lance file format integration. Currently we have only basic primitive-type tests in TestHoodieSparkLanceWriter, but we also need to test types such as decimal, timestamp, struct, array, and map.

Note that the map type is currently not supported in the default version of the Lance file writer. Running the test testReadMapType fails with the following:

Caused by: java.lang.RuntimeException: LanceError(Schema): Unsupported data type: Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false), /Users/rahil/workplace/lance/rust/lance-core/src/datatypes.rs:171:31
	at com.lancedb.lance.file.LanceFileWriter.writeNative(Native Method)
	at com.lancedb.lance.file.LanceFileWriter.write(LanceFileWriter.java:119)
	at org.apache.hudi.io.lance.HoodieBaseLanceWriter.flushBatch(HoodieBaseLanceWriter.java:214)
	at org.apache.hudi.io.lance.HoodieBaseLanceWriter.close(HoodieBaseLanceWriter.java:136)
	... 6 more

For now, the map test will stay disabled.
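
The stack trace above shows the writer rejecting a schema that contains a Map field. As a rough illustration only, a test could check a schema for nested map types up front and skip itself instead of failing at write time. The sketch below uses a hypothetical `TypeNode` descriptor and a hypothetical `containsMap` helper; neither is part of Hudi, Spark, Arrow, or Lance, and a real test would walk the actual schema class instead.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MapTypeGuardSketch {

    /** Hypothetical, minimal type descriptor; NOT Spark's DataType or Arrow's Field. */
    public static final class TypeNode {
        final String kind;              // e.g. "int32", "utf8", "struct", "list", "map"
        final List<TypeNode> children;  // nested types; empty for primitives

        public TypeNode(String kind, TypeNode... children) {
            this.kind = kind;
            this.children = Collections.unmodifiableList(Arrays.asList(children));
        }
    }

    /** Returns true if the type itself, or anything nested inside it, is a map. */
    public static boolean containsMap(TypeNode type) {
        if ("map".equals(type.kind)) {
            return true;
        }
        for (TypeNode child : type.children) {
            if (containsMap(child)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Same shape as the failing field in the trace: map of utf8 keys to int32 values.
        TypeNode mapOfStringToInt =
            new TypeNode("map", new TypeNode("utf8"), new TypeNode("int32"));
        TypeNode schemaWithMap =
            new TypeNode("struct", new TypeNode("int32"), mapOfStringToInt);
        TypeNode schemaWithoutMap =
            new TypeNode("struct", new TypeNode("int32"),
                new TypeNode("list", new TypeNode("utf8")));

        System.out.println("with map: " + containsMap(schemaWithMap));
        System.out.println("without map: " + containsMap(schemaWithoutMap));
    }
}
```

A guard like this would make the reason for skipping explicit in the test itself, which may be easier to revisit once the writer supports maps.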

Summary and Changelog

Added complex-type tests in TestHoodieSparkLanceReader, which round-trip data through the HoodieSparkLanceWriter.

Impact

none

Risk Level

low

Documentation Update

Contributor's checklist

  • Read through contributor's guide
  • Enough context is provided in the sections above
  • Adequate tests were added if applicable

@rahil-c rahil-c force-pushed the rahil/hudi-lance-complex-type-testing branch from 49ee9b7 to 27a2ddd Compare January 2, 2026 15:46
@github-actions github-actions bot added the size:M PR with lines of changes in (100, 300] label Jan 2, 2026
@rahil-c rahil-c marked this pull request as ready for review January 2, 2026 15:54

rahil-c commented Jan 2, 2026

feedback: move tests to the data source level, and consolidate tests since we don't need one test per type.

List<InternalRow> expectedRows = new ArrayList<>();
expectedRows.add(createRow(1, new Object[]{}, new Object[]{})); // Empty arrays
expectedRows.add(createRow(2, new Object[]{42}, new Object[]{"Alice"})); // Single element
expectedRows.add(createRow(3, new Object[]{1, 2, 3, 4, 5}, new Object[]{"Bob", "Charlie", "David"})); // Multi-element
Member

Let's add an extra test case to check that null elements are read back correctly, since Spark's ArrayData supports null elements. It would be good to verify that Lance preserves them.

expectedRows.add(createRow(4, new Object[]{1, null, 3}, new Object[]{"A", null, "C"}));
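
To make the intent of the null-element row concrete, here is a minimal, self-contained sketch of the comparison such a test relies on: deep, element-wise equality that treats a preserved `null` differently from a substituted default. Everything in it is an assumption for illustration. `createRow` only mirrors the call shape in the snippet above, and `roundTrip` is an in-memory stand-in for a write-then-read cycle, not the Lance writer or reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NullElementRoundTripSketch {

    /** Hypothetical helper mirroring the createRow(id, ints, strings) call shape. */
    public static Object[] createRow(int id, Object[] ints, Object[] strings) {
        return new Object[]{id, ints, strings};
    }

    /** The four cases discussed in the review: empty, single, multi, and null elements. */
    public static List<Object[]> expectedRows() {
        List<Object[]> rows = new ArrayList<>();
        rows.add(createRow(1, new Object[]{}, new Object[]{}));
        rows.add(createRow(2, new Object[]{42}, new Object[]{"Alice"}));
        rows.add(createRow(3, new Object[]{1, 2, 3, 4, 5}, new Object[]{"Bob", "Charlie", "David"}));
        rows.add(createRow(4, new Object[]{1, null, 3}, new Object[]{"A", null, "C"}));
        return rows;
    }

    /** Stand-in for write-then-read: deep-copies every row. NOT a real file round trip. */
    public static List<Object[]> roundTrip(List<Object[]> rows) {
        List<Object[]> out = new ArrayList<>();
        for (Object[] row : rows) {
            out.add(deepCopy(row));
        }
        return out;
    }

    private static Object[] deepCopy(Object[] src) {
        Object[] dst = new Object[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (src[i] instanceof Object[]) ? deepCopy((Object[]) src[i]) : src[i];
        }
        return dst;
    }

    /** Compares rows position by position; nested arrays and nulls are compared deeply. */
    public static boolean rowsMatch(List<Object[]> expected, List<Object[]> actual) {
        if (expected.size() != actual.size()) {
            return false;
        }
        for (int i = 0; i < expected.size(); i++) {
            if (!Arrays.deepEquals(expected.get(i), actual.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Object[]> expected = expectedRows();
        System.out.println("round trip matches: " + rowsMatch(expected, roundTrip(expected)));
    }
}
```

The point of the fourth row is that a reader which silently replaced `null` with a default such as `0` or an empty string would fail this kind of comparison, which a test using only non-null elements could not detect.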

@the-other-tim-brown
Contributor

feedback: move tests to the data source level, and consolidate tests since we don't need one test per type.

@rahil-c I tried migrating the code to the datasource tests, but that suite is all in Scala, and making the code compatible with the datasource APIs required a large number of changes. We'll just leave the tests where they are.

@voonhous voonhous left a comment

LGTM

hudi-bot commented Jan 8, 2026

CI report:

Bot commands
@hudi-bot supports the following commands:
  • @hudi-bot run azure: re-run the last Azure build

voonhous commented Jan 8, 2026

Azure + GitHub CI is green, merging this in.

@voonhous voonhous merged commit e030185 into apache:master Jan 8, 2026
138 of 141 checks passed
PavithranRick pushed a commit to PavithranRick/hudi that referenced this pull request Jan 8, 2026
* fix: Add complex types testing for lance

* add null element test

---------

Co-authored-by: Timothy Brown <tim@onehouse.ai>