Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Core: Use properties while initializing default HadoopFileIO for Hadoop catalog. #9283

Merged
merged 1 commit into from
Dec 19, 2023

Conversation

agrawalreetika
Copy link
Contributor

@agrawalreetika agrawalreetika commented Dec 12, 2023

Make HadoopFileIO as default for Hadoop Catalog to initialize HadoopFileIO catalog configs similar to NessieCatalog

@github-actions github-actions bot added the core label Dec 12, 2023
@agrawalreetika
Copy link
Contributor Author

@nastra, Could you please help me with review on this?

@ajantha-bhat
Copy link
Member

@agrawalreetika: Previously also HadoopFileIO was default for HadoopCatalog.

I think only problem was properties was not considered for the default HadoopFileIO initilaization.

I expect PR title to be

Core: Use properties while initializing default HadoopFileIO for Hadoop catalog.

And a simple PR description.

@agrawalreetika agrawalreetika changed the title Set HadoopFileIO as default for HadoopCatalog Core: Use properties while initializing default HadoopFileIO for Hadoop catalog. Dec 12, 2023
@agrawalreetika
Copy link
Contributor Author

Thank you @ajantha-bhat for the details. I have made the changes in the Title and description.

@yingsu00
Copy link

PR message "to Initialise" -> to initialize

catalog.initialize("hadoop", catalogProps);
FileIO fileIO = catalog.newTableOps(tableIdent).io();

Assertions.assertThat(fileIO.properties().get("warehouse")).isEqualTo("/hive/testwarehouse");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Assertions.assertThat(fileIO.properties().get("warehouse")).isEqualTo("/hive/testwarehouse");
Assertions.assertThat(fileIO.properties()).containsEntry("warehouse", "/hive/testwarehouse");

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is generally better, because if the assertion ever fails, it will show the content of properties()

FileIO fileIO = catalog.newTableOps(tableIdent).io();

Assertions.assertThat(fileIO.properties().get("warehouse")).isEqualTo("/hive/testwarehouse");
Assertions.assertThat(fileIO.properties().get("io.manifest.cache-enabled")).isEqualTo("true");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Assertions.assertThat(fileIO.properties().get("io.manifest.cache-enabled")).isEqualTo("true");
Assertions.assertThat(fileIO.properties()).containsEntry("io.manifest.cache-enabled", "true");

@@ -110,13 +110,18 @@ public void initialize(String name, Map<String, String> properties) {

this.catalogName = name;
this.warehouseLocation = LocationUtil.stripTrailingSlash(inputWarehouseLocation);

if (conf == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I just saw this change now. Why do we need this? I think this should be reverted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nastra If Hadoop Configuration is not set then subsequent would have NullPointer Excpetion. While adding the Tests I noticed we don't have conf check in HadoopCatalog the way we have it in HiveCatalog

I can explicitly initialize conf in my Tests if you think this check is not required here. Let me know.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conf always needs to be set explicitly. This is because the HadoopCatalog (and a few others) implement Configurable

Copy link
Contributor

@nastra nastra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to revert the change that creates a conf when it was null

Copy link
Contributor

@amogh-jahagirdar amogh-jahagirdar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, thanks @agrawalreetika !

@amogh-jahagirdar amogh-jahagirdar merged commit c2018f8 into apache:main Dec 19, 2023
41 checks passed
@amogh-jahagirdar
Copy link
Contributor

Merged, thanks @agrawalreetika and thanks for reviews @nastra @ajantha-bhat !

lisirrx pushed a commit to lisirrx/iceberg that referenced this pull request Jan 4, 2024
geruh pushed a commit to geruh/iceberg that referenced this pull request Jan 26, 2024
devangjhabakh pushed a commit to cdouglas/iceberg that referenced this pull request Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants