API: Add Unit tests for StructProjection class #15984

Open
Govindarajan-D wants to merge 2 commits into apache:main from Govindarajan-D:test-struct-projection

Govindarajan-D commented Apr 15, 2026

Added 5 unit tests that improve the reliability of the StructProjection class.

  1. Subset projection from a flat schema
  2. Nested projection across multiple levels
  3. Optional field projection (schema evolution)
  4. Map projection with nested value
  5. List projection

Code coverage increased from 0% to 82% (based on the JaCoCo report).
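
For readers unfamiliar with the idea behind item 1, a subset projection can be illustrated with a small standalone sketch. This is not Iceberg's StructProjection API and not the test code in this PR: the class and method names (SubsetProjectionSketch, positionMap, project) and the sample row are hypothetical, and matching fields by ID is assumed here as the general approach.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in that illustrates projecting a flat row by field ID.
// It does not use or reproduce Iceberg's StructProjection.
final class SubsetProjectionSketch {

  // For each projected field ID, find its position in the data schema.
  public static int[] positionMap(int[] dataFieldIds, int[] projectedFieldIds) {
    Map<Integer, Integer> positionById = new HashMap<>();
    for (int i = 0; i < dataFieldIds.length; i++) {
      positionById.put(dataFieldIds[i], i);
    }

    int[] positions = new int[projectedFieldIds.length];
    for (int i = 0; i < projectedFieldIds.length; i++) {
      Integer pos = positionById.get(projectedFieldIds[i]);
      if (pos == null) {
        // This sketch rejects unknown IDs; it does not model optional fields.
        throw new IllegalArgumentException("Cannot find field ID " + projectedFieldIds[i]);
      }
      positions[i] = pos;
    }
    return positions;
  }

  // Read the projected values out of a data row using the position map.
  public static Object[] project(Object[] row, int[] positions) {
    Object[] projected = new Object[positions.length];
    for (int i = 0; i < positions.length; i++) {
      projected[i] = row[positions[i]];
    }
    return projected;
  }

  public static void main(String[] args) {
    int[] dataFieldIds = {10, 20, 30};   // e.g. id, name, city (hypothetical)
    int[] projectedFieldIds = {30, 10};  // subset, in a different order

    int[] positions = positionMap(dataFieldIds, projectedFieldIds);
    Object[] projected = project(new Object[] {1L, "alice", "Berlin"}, positions);

    System.out.println(Arrays.toString(projected)); // prints [Berlin, 1]
  }
}
```

A test for this kind of behavior would typically assert both the size of the projected row and the value at each projected position.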

github-actions bot added the API label Apr 15, 2026
public void testSubsetProjection() {
Schema dataSchema =
new Schema(
Types.NestedField.required(10, "id", Types.LongType.get()),
Contributor

Could you please import Types.NestedField, or even statically import Types.NestedField.required? This would make the code much shorter.

Contributor Author

Hi @pvary. Thanks for your review. I used Types.NestedField explicitly because that seems to be the convention. If you check StructProjection, the class these test cases were written for, it doesn't use the full import, and the same seems to be the case in all files.

Let me know your thoughts!

Contributor

Here is the result of my investigation:

$ git grep NestedField|grep import|wc -l
     711
$ git grep NestedField.required|grep import|wc -l
     333

The whole class is about creating schemas. We should hide as much boilerplate as possible to be able to read the important parts.


@Test
public void testNestedProjection() {
Types.StructType dataCoordinateStruct =
Contributor

Import Types.StructType

Types.NestedField.required(10, "street", Types.StringType.get()),
Types.NestedField.required(20, "coordinates", dataCoordinateStruct));

Schema dataSchema =
Contributor

You can create a schema like this:

      new Schema(
          required(14, "content", Types.IntegerType.get()),
          required(1, "path", Types.StringType.get()),
          required(
              8,
              "partition_summaries",
              ListType.ofRequired(
                  9,
                  StructType.of(
                      required(10, "contains_null", Types.BooleanType.get()),
                      required(11, "contains_nan", Types.BooleanType.get()),
                      required(12, "lower_bound", Types.StringType.get()),
                      required(13, "upper_bound", Types.StringType.get())))));

Contributor Author

@pvary, I am planning to write some more tests for this file, especially negative tests for partial projection. I will include a nested List in addition to partial projection.

Contributor

The code is quite chatty, so I think it would be much more readable if we created the whole Schema in a single expression. Also, the structs are not reused, so we don't really need them as separate variables.

pvary commented Apr 16, 2026

Thanks @Govindarajan-D for the PR. I left some minor formatting comments.
