Contributor

@pvary pvary commented Oct 16, 2025

No description provided.

Comment on lines +153 to +155
Map<String, String> properties = table == null ? ImmutableMap.of() : table.properties();
MetricsConfig metricsConfig =
table == null ? MetricsConfig.getDefault() : MetricsConfig.forTable(table);
Contributor Author

We need to handle a null table, as a few use cases for FileAppenderFactory use it without an actual table.
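
The null-table fallback in the snippet above can be sketched in isolation. This is only an illustration under stated assumptions: `TableLike`, `MetricsSettings`, and `NullTableFallback` are hypothetical stand-ins invented here so the example runs on its own; they are not the Iceberg `Table` or `MetricsConfig` classes.

```java
import java.util.Map;

// Hypothetical stand-in for a table; NOT the Iceberg Table interface.
interface TableLike {
  Map<String, String> properties();
}

// Hypothetical stand-in for a metrics configuration.
final class MetricsSettings {
  private static final MetricsSettings DEFAULT = new MetricsSettings("default");
  private final String source;

  private MetricsSettings(String source) {
    this.source = source;
  }

  static MetricsSettings getDefault() {
    return DEFAULT;
  }

  static MetricsSettings forTable(TableLike table) {
    // A real implementation would read the table's properties here.
    return new MetricsSettings("table");
  }

  String source() {
    return source;
  }
}

final class NullTableFallback {
  private NullTableFallback() {}

  // Falls back to an empty map when the caller has no table.
  static Map<String, String> propertiesOf(TableLike table) {
    return table == null ? Map.of() : table.properties();
  }

  // Falls back to the shared default settings when the caller has no table.
  static MetricsSettings metricsOf(TableLike table) {
    return table == null ? MetricsSettings.getDefault() : MetricsSettings.forTable(table);
  }
}
```

Resolving both values once, at construction time, means the rest of the factory never has to repeat the null check.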

import org.apache.iceberg.util.CharSequenceSet;
import org.apache.iceberg.util.CharSequenceWrapper;

class SortedPosDeleteWriter<T> implements FileWriter<PositionDelete<T>, DeleteWriteResult> {
Contributor Author

This non-public class is not used, so I removed it.

STRUCT_FIELD);

protected abstract FileAppender<T> writeAndGetAppender(List<Record> records) throws Exception;
protected abstract DataFile writeAndGetDataFile(List<Record> records) throws Exception;
Contributor Author

@pvary pvary Oct 17, 2025

Because of this change, we need to commit Spark 3.4/3.5 and Flink 2.0/1.20 together, as all of them use this as a base test. The other Spark and Flink versions are clean backports.

appender.addAll(records);
} finally {
appender.close();
List<Record> records, File file, FileFormat fileFormat, Schema schema) throws IOException {
Contributor Author

Instead of taking the factory as an input, we create the factory internally using the schema information.
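
The shape of that change can be sketched roughly as follows. Every type here (`SimpleSchema`, `RowWriterFactory`, `WriteHelper`) is a hypothetical stand-in made up for illustration and is not part of Iceberg; the point is only that the helper accepts a schema and builds its factory itself, so callers no longer construct one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a schema: an ordered list of column names.
final class SimpleSchema {
  final List<String> columns;

  SimpleSchema(List<String> columns) {
    this.columns = List.copyOf(columns);
  }
}

// Hypothetical stand-in for a writer factory bound to one schema.
final class RowWriterFactory {
  private final SimpleSchema schema;

  RowWriterFactory(SimpleSchema schema) {
    this.schema = schema;
  }

  // Renders one row as comma-separated "column=value" pairs in schema order.
  String write(List<String> row) {
    if (row.size() != schema.columns.size()) {
      throw new IllegalArgumentException("Row width does not match schema");
    }
    List<String> parts = new ArrayList<>();
    for (int i = 0; i < row.size(); i++) {
      parts.add(schema.columns.get(i) + "=" + row.get(i));
    }
    return String.join(",", parts);
  }
}

final class WriteHelper {
  private WriteHelper() {}

  // Callers pass only the schema; the factory is an internal detail.
  static List<String> writeAll(List<List<String>> rows, SimpleSchema schema) {
    RowWriterFactory factory = new RowWriterFactory(schema);
    List<String> out = new ArrayList<>();
    for (List<String> row : rows) {
      out.add(factory.write(row));
    }
    return out;
  }
}
```

With this arrangement, each engine-specific test only needs to supply records and a schema, which keeps the helper's signature identical across modules.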

.collect(Collectors.toList());
}

private static EncryptedOutputFile encrypt(OutputFile out) {
Contributor Author

Made the one in FileHelpers public and used it everywhere.

new GenericFileWriterFactory.Builder()
.dataSchema(table.schema())
.dataFileFormat(format)
.writerProperties(writeProperties)
Contributor Author

The sad truth is that this test does not check whether the bloom filters are actually configured or created.

@pvary pvary requested review from nastra and stevenzwu October 17, 2025 08:29
return DEFAULT;
}

public static MetricsConfig getPositionDelete() {
Contributor

nit: maybe forPositionDelete()?

Contributor Author

I was debating the same. getPositionDelete won narrowly, because of getDefault.
But looking at it with fresh eyes, forPositionDelete seems better.

Updated.

@pvary pvary merged commit 03f2af8 into apache:main Oct 17, 2025
42 checks passed
Contributor Author

pvary commented Oct 17, 2025

Merged to main.
Thanks for the review @nastra!
