Let's say we have an Iceberg schema:
Schema schema = new Schema(
    Types.NestedField.required(0, "id", Types.LongType.get()),
    Types.NestedField.optional(3, "location", Types.StructType.of(
        Types.NestedField.required(1, "lat", Types.FloatType.get()),
        Types.NestedField.required(2, "long", Types.FloatType.get())
    ))
);
Now suppose someone wants to do a nested projection using the following projection schema:
Schema latOnly = new Schema(
    Types.NestedField.optional(3, "location", Types.StructType.of(
        Types.NestedField.required(1, "lat", Types.FloatType.get())
    ))
);
If the data row is:
{
  "id": 10001,
  "location": null
}
Then what is the expected projected value for the projection schema latOnly? Should location.lat be set to null even though the field is declared as required in Types.NestedField.required(1, "lat", Types.FloatType.get())?
I think the current StructProjection does not handle this case correctly: it throws a NullPointerException when projecting the nested required field if the parent struct is null.
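To make the failure mode concrete, here is a minimal, self-contained sketch that does not use Iceberg at all. It models a struct as a plain Object[] and a projection as a list of source positions. The class and method names (NullSafeProjectionSketch, projectNaive, projectNullSafe) and the sample coordinates are hypothetical and only for illustration. The null-safe variant shows one possible answer to the question above, namely that a null optional parent projects to a null struct, so the required nested field is never read. It is not a claim about how StructProjection is or should be implemented.

```java
import java.util.Arrays;

// Hypothetical, simplified model: a struct is an Object[] and a nested
// projection is a list of positions inside the parent struct.
// This is NOT Iceberg's StructProjection, only an illustration.
class NullSafeProjectionSketch {

  // Naive nested projection: dereferences the parent struct unconditionally,
  // so a null parent struct causes a NullPointerException.
  static Object[] projectNaive(Object[] row, int structPos, int[] nestedPositions) {
    Object[] parent = (Object[]) row[structPos];
    Object[] projected = new Object[nestedPositions.length];
    for (int i = 0; i < nestedPositions.length; i++) {
      projected[i] = parent[nestedPositions[i]]; // throws NPE when parent is null
    }
    return projected;
  }

  // Null-safe variant (assumed semantics): if the optional parent struct is
  // null, the projected struct is null and no nested field is accessed.
  static Object[] projectNullSafe(Object[] row, int structPos, int[] nestedPositions) {
    Object[] parent = (Object[]) row[structPos];
    if (parent == null) {
      return null;
    }
    Object[] projected = new Object[nestedPositions.length];
    for (int i = 0; i < nestedPositions.length; i++) {
      projected[i] = parent[nestedPositions[i]];
    }
    return projected;
  }

  public static void main(String[] args) {
    // Rows laid out as (id, location), where location is (lat, long).
    Object[] withLocation = {10001L, new Object[] {37.77f, -122.42f}};
    Object[] nullLocation = {10001L, null};
    int[] latOnly = {0}; // keep only "lat" from the location struct

    System.out.println(Arrays.toString(projectNullSafe(withLocation, 1, latOnly)));
    System.out.println(Arrays.toString(projectNullSafe(nullLocation, 1, latOnly)));

    try {
      projectNaive(nullLocation, 1, latOnly);
    } catch (NullPointerException e) {
      System.out.println("naive projection threw NullPointerException");
    }
  }
}
```

Under this interpretation the "required" constraint on lat only applies when the parent location struct is itself present, which avoids having to produce a null value for a required field.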
This is related to the broken unit tests from this PR.