test: add unit tests for utility classes by LuciferYang · Pull Request #416 · lance-format/lance-spark

LuciferYang · 2026-04-10T14:28:44Z

Summary

Add unit tests for BlobUtils, LargeVarCharUtils, Utils, and VectorUtils covering all reachable branches in both Spark and Arrow field-checking methods
Tests cover null inputs, missing/wrong metadata, case-insensitive value matching, type-check isolation (non-String for LargeVarChar, non-numeric for Vector), and edge cases like empty version lists and malformed metadata
UtilsTest helper builds ZonedDateTime directly instead of re-implementing the production micros-to-Instant conversion, keeping test logic independent from source logic

Test coverage details

Class	Methods tested	Tests
`BlobUtils`	`isBlobSparkField`, `isBlobArrowField`	12
`LargeVarCharUtils`	`isLargeVarCharSparkField`, `isLargeVarCharArrowField`, `createPropertyKey`	14
`Utils`	`parseVersion`, `findVersion`	11
`VectorUtils`	`isVectorField`, `getVectorDimension`, `isVectorArrowField`, `getVectorArrowDimension`, `shouldBeFixedSizeList`, `createVectorSizePropertyKey`	25

Test plan

mvn test -pl lance-spark-base_2.12 -Dtest="org.lance.spark.utils.*Test" — 108 tests pass

hamersaw

It looks like the majority of these tests are essentially "x=10; assertTrue(isXTen(x))". It doesn't feel like there is a great deal of added value, the regressions would occur if upstream began setting X to something else, which we really can't test. I'm wondering if there's a better way to detect these kinds of issues.

Addresses review feedback on PR lance-format#416: positive-match tests that used the util's own constants on both sides ("x=KEY; assertTrue(isKey(x))") were pure tautologies. - Drop the `*Valid` tests from BlobUtilsTest and LargeVarCharUtilsTest; the `CaseInsensitive` tests already cover the positive path with a non-tautological literal input. - Replace constant references in test inputs with string literals (`"lance-encoding:blob"`, `"arrow:large-var-char"`, `"arrow.fixed-size-list.size"`) so a rename of the production constant breaks the test. - Fix `testCreateVectorSizePropertyKey` to use a hardcoded expected literal instead of concatenating the constant. Upstream-convention drift remains an integration-test concern (BaseBlobCreateTableTest, SchemaConverterTest, LanceArrowUtilsSuite), noted in the updated class javadocs.

LuciferYang · 2026-04-20T03:07:03Z

@hamersaw Fair point — pushed 5b8bb14 trimming this:

Dropped the *Valid tests from BlobUtilsTest and LargeVarCharUtilsTest — they were the literal "x=KEY; assertTrue(isKey(x))" tautology you called out. The CaseInsensitive cases already cover the positive path with non-tautological input (and would catch an accidental .equals regression).
Replaced constant references in test inputs with string literals ("lance-encoding:blob", "arrow:large-var-char", "arrow.fixed-size-list.size"). A rename of the production constant now breaks the test — so the wire-format contract is pinned on the test side, which it wasn't before.
Fixed testCreateVectorSizePropertyKey the same way (was concatenating the constant into the expected value).

58 tests total, all pass locally. Javadoc on each class now notes that upstream-convention drift is the integration layer's job — BaseBlobCreateTableTest, SchemaConverterTest, and the LanceArrow* suites already round-trip through real lance data and would catch that.

Let me know if this is the direction you had in mind, or if you'd rather drop the unit tests entirely and lean on the integration coverage.

Addresses review feedback on PR lance-format#416: positive-match tests that used the util's own constants on both sides ("x=KEY; assertTrue(isKey(x))") were pure tautologies. - Drop the `*Valid` tests from BlobUtilsTest and LargeVarCharUtilsTest; the `CaseInsensitive` tests already cover the positive path with a non-tautological literal input. - Replace constant references in test inputs with string literals (`"lance-encoding:blob"`, `"arrow:large-var-char"`, `"arrow.fixed-size-list.size"`) so a rename of the production constant breaks the test. - Fix `testCreateVectorSizePropertyKey` to use a hardcoded expected literal instead of concatenating the constant. Upstream-convention drift remains an integration-test concern (BaseBlobCreateTableTest, SchemaConverterTest, LanceArrowUtilsSuite), noted in the updated class javadocs.

github-actions Bot added the chore Features related to test, build, style improvements label Apr 10, 2026

hamersaw reviewed Apr 13, 2026

View reviewed changes

LuciferYang added 3 commits April 20, 2026 11:10

test: add unit tests for utility classes

4c9d0e4

fix test name

e5d61e1

LuciferYang force-pushed the add-utils-unit-tests branch from 5b8bb14 to 75ec172 Compare April 20, 2026 03:26

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: add unit tests for utility classes#416

test: add unit tests for utility classes#416
LuciferYang wants to merge 3 commits intolance-format:mainfrom
LuciferYang:add-utils-unit-tests

LuciferYang commented Apr 10, 2026

Uh oh!

hamersaw left a comment

Uh oh!

LuciferYang commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

LuciferYang commented Apr 10, 2026

Summary

Test coverage details

Test plan

Uh oh!

hamersaw left a comment

Choose a reason for hiding this comment

Uh oh!

LuciferYang commented Apr 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants