Conversation

@jinhai-cloud
Contributor

What changes were proposed in this pull request?

Add a check on the number of partitions returned from the metastore by calling Hive#getNumPartitionsByFilter, and add the SQLConf spark.sql.hive.metastorePartitionLimit with a default value of 100_000.
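Conceptually, the proposed guard can be sketched as follows. This is an illustrative stand-in, not the actual patch: `countByFilter` models the extra `Hive#getNumPartitionsByFilter` call, and `fetchByFilter` models the existing partition fetch.

```scala
// Hypothetical sketch of the check added to Shim#getPartitionsByFilter.
// A limit of -1 disables the check, matching the proposed SQLConf semantics.
def getPartitionsWithLimit[P](
    countByFilter: () => Int,     // stand-in for Hive#getNumPartitionsByFilter
    fetchByFilter: () => Seq[P],  // stand-in for getPartitionsByFilterMethod
    limit: Int): Seq[P] = {
  val numPartitions = countByFilter()
  if (limit >= 0 && numPartitions > limit) {
    throw new IllegalArgumentException(
      s"The query would load $numPartitions metastore partitions, " +
        s"which exceeds spark.sql.hive.metastorePartitionLimit=$limit")
  }
  fetchByFilter()
}
```

The point of the shape above is that the count is checked before any partition objects are materialized on the driver, which is where the OOM occurs.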

Why are the changes needed?

In the method Shim#getPartitionsByFilter, when the filter is empty or the Hive table has a large number of partitions, calling getAllPartitionsMethod or getPartitionsByFilterMethod can result in a Driver OOM.

Does this PR introduce any user-facing change?

No

How was this patch tested?

This change is already covered by existing tests

@AmplabJenkins

Can one of the admins verify this patch?

@maropu
Member

maropu commented Nov 10, 2020

Could you add tests? btw, does hive have a similar config?

@maropu maropu changed the title [SPARK-33187][SQL] Add a check on the number of returned metastore pa… [SPARK-33187][SQL] Add a check on the number of returned partitions in the HiveShim#getPartitionsByFilter method Nov 10, 2020
@wangyum
Member

wangyum commented Nov 10, 2020

Is it a good design if the number of partitions is larger than 100000?

@jinhai-cloud
Contributor Author

Is it a good design if the number of partitions is larger than 100000?

User queries generally do not exceed 100000 partitions. It's just that the filter can't always prune partitions effectively.
For example, when the expression is of a different type; see SPARK-15287: https://issues.apache.org/jira/browse/SPARK-15287.
Presto also has a parameter to limit the number of partitions in HiveMetadata#getPartitionsAsList.


@jinhai-cloud
Contributor Author

Could you add tests? btw, does hive have a similar config?

I'll think about adding a test.
HiveMetaStore#get_partitions_by_filter also calls the method checkLimitNumberOfPartitionsByFilter, and the default value of HiveConf#METASTORE_LIMIT_PARTITION_REQUEST is -1.
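For reference, the server-side semantics can be modeled roughly like this (an illustrative simplification, not the actual Hive code): a limit of -1 means unlimited, so the check only fires when the config is explicitly set.

```scala
// Rough model of Hive's checkLimitNumberOfPartitionsByFilter behavior:
// METASTORE_LIMIT_PARTITION_REQUEST defaults to -1 (no limit), so the check
// is a no-op unless an administrator sets a non-negative value.
def exceedsMetastoreLimit(requestedPartitions: Int, limitPartitionRequest: Int): Boolean =
  limitPartitionRequest > -1 && requestedPartitions > limitPartitionRequest
```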

Member

@wangyum left a comment

Could you verify whether hive.metastore.limit.partition.request works? If it works, we can add it to the documentation. We have added a Hadoop config before: mapreduce.fileoutputcommitter.algorithm.version.

```scala
.version("3.1.0")
.intConf
.checkValue(_ >= -1, "The maximum must be a positive integer, -1 to follow the Hive config.")
.createWithDefault(100000)
```
Member

Have you considered keeping the default behavior as it is but allowing it to be configurable? Changing it means we'll now need to make two HMS calls (one additional getNumPartitionsByFilter), which I'm not sure is desirable (I have seen HMS perform very badly in production before).

Contributor Author

@sunchao Thank you for your response. I think this is a reasonable maximum, and Presto also has a parameter to limit the number of partitions in HiveMetadata#getPartitionsAsList, with a default value of 100_000.

Member

Yea, the default value 100_000 looks fine to me. My main question is whether we need to make that the default and double the HMS calls. Presto doesn't seem to call getNumPartitionsByFilter; it streams through a partition iterator and stops once the threshold is reached.
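The streaming alternative could be sketched like this (`takeWithinLimit` is a hypothetical helper, not actual Presto or Spark code): because the iterator is consumed lazily, at most limit + 1 partitions are ever loaded, and no separate count call to HMS is needed.

```scala
// Presto-style alternative: lazily consume at most limit + 1 partitions and
// fail once the threshold is crossed, without a second metastore round trip.
def takeWithinLimit[P](partitions: Iterator[P], limit: Int): Seq[P] = {
  val loaded = partitions.take(limit + 1).toSeq
  if (loaded.size > limit) {
    throw new IllegalStateException(
      s"More than $limit partitions matched the filter; please refine it")
  }
  loaded
}
```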

@jinhai-cloud
Contributor Author

jinhai-cloud commented Dec 3, 2020

Could you verify is hive.metastore.limit.partition.request works? if it works, we can add it to document. We have added a Hadoop config before: mapreduce.fileoutputcommitter.algorithm.version.

@wangyum It does not take effect on the client side.

@github-actions github-actions bot added the SQL label Dec 3, 2020
@jinhai-cloud
Contributor Author

@sunchao @wangyum Can you help me verify this patch?

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Mar 28, 2021
@github-actions github-actions bot closed this Mar 29, 2021
