Skip to content

Spark: Add register Truncate UDF#3708

Merged
rdblue merged 1 commit intoapache:masterfrom
Timzhang01:registerTruncate
Jan 20, 2022
Merged

Spark: Add register Truncate UDF#3708
rdblue merged 1 commit intoapache:masterfrom
Timzhang01:registerTruncate

Conversation

@Timzhang01
Copy link
Contributor

@Timzhang01 Timzhang01 commented Dec 10, 2021

When writing to the partition table, we need to register the truncate transform function in Spark to specify it during sort.

issue: #3707

cc/ @rdblue @aokolnychyi

@github-actions github-actions bot added the spark label Dec 10, 2021
import org.junit.Test;
import org.junit.runners.Parameterized;

public class TestPartitionedWritesAsSelect extends SparkCatalogTestBase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use SparkTestBaseWithCatalog instead. No need to run this with multiple catalogs since this is testing direct registration with Spark.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing out, I will understand the latest code

AssertHelpers.assertThrows("should registered temporary function" +
"nor a permanent function registered in the database 'default'",
AnalysisException.class,
"Undefined function: 'iceberg_bucket8'",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is necessary to test the case where you reference a function that has not been registered.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review. This code is really redundant

}

@Test
public void testInsertAsSelectWithTruncateFailure() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why test these failure cases? I think we test these elsewhere and it doesn't make sense to fail if something gets better later. I'd remove this and the test above.

Copy link
Contributor

@rdblue rdblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, I think this is fine to have if anyone wants to call the truncate functions from Spark. I'd fix a couple of tests before merging though.

@rdblue
Copy link
Contributor

rdblue commented Dec 18, 2021

@xiaotianzhang01 do you want to take a look at test failures and update this? It would be good to get it into the next release!

@Timzhang01
Copy link
Contributor Author

@rdblue Thank you for your comment. Sorry, the reply is a bit late, I have updated the pr and successfully ran ut locally

@Timzhang01 Timzhang01 requested a review from rdblue December 30, 2021 02:33
@Timzhang01
Copy link
Contributor Author

@rdblue Do you have time to take a look?

@rdblue rdblue merged commit 76fabd8 into apache:master Jan 20, 2022
@rdblue
Copy link
Contributor

rdblue commented Jan 20, 2022

Thanks, @xiaotianzhang01!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants