[Improvement]: Table partition files list performance issue #2635

Open
link3280 opened this issue Mar 13, 2024 · 3 comments
Comments

@link3280
Contributor

link3280 commented Mar 13, 2024

Search before asking

  • I have searched in the issues and found no similar issues.

What would you like to be improved?

Currently, the table partition files API can be stuck for a very long time if the table has a large number of files (e.g. over 100K). The root cause is that AMS reads all file entries to compute partition statistics, instead of filtering the entries by partition.

This may be due to a limitation of the Iceberg Java API, which cannot read the partitions metadata table directly. But hopefully we can find a workaround or push the Iceberg community to solve this problem.

How should we improve?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Subtasks

No response

Code of Conduct

@link3280
Contributor Author

link3280 commented Apr 11, 2024

I propose to align PartitionBaseInfo with the Iceberg partitions metadata table, which contains the following columns:

+-------------------------------+--------------------+--------------------------------------------+--+
|           col_name            |     data_type      |                  comment                   |
+-------------------------------+--------------------+--------------------------------------------+--+
| partition                     | struct<dt:string>  |                                            |
| spec_id                       | int                |                                            |
| record_count                  | bigint             | Count of records in data files             |
| file_count                    | int                | Count of data files                        |
| position_delete_record_count  | bigint             | Count of records in position delete files  |
| position_delete_file_count    | int                | Count of position delete files             |
| equality_delete_record_count  | bigint             | Count of records in equality delete files  |
| equality_delete_file_count    | int                | Count of equality delete files             |
+-------------------------------+--------------------+--------------------------------------------+--+

That would fix the performance issue because we would no longer have to iterate over all the entries to count files. For a large table whose partitions each contain around 1K files, the complexity would drop from millions of file entries to thousands of partition rows.
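To make the cost difference concrete, here is a small self-contained model (not Arctic or Iceberg code; the names and data are made up for illustration). It derives per-partition file counts the slow way, by touching every file entry, and the fast way, from rows that are already aggregated per partition, the shape the partitions metadata table would provide:

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionStatsDemo {

    // O(total files): one pass over every file entry, as described above.
    public static Map<String, Long> countByScanningEntries(String[] entryPartitions) {
        Map<String, Long> counts = new HashMap<>();
        for (String p : entryPartitions) {
            counts.merge(p, 1L, Long::sum);
        }
        return counts;
    }

    // O(number of partitions): counts arrive pre-aggregated, one row per partition.
    public static Map<String, Long> countFromPartitionRows(Map<String, Long> partitionRows) {
        return new HashMap<>(partitionRows);
    }

    public static void main(String[] args) {
        // 3 partitions x 4 files = 12 entries to scan, but only 3 aggregated rows.
        String[] entries = {
            "dt=2024-03-01", "dt=2024-03-01", "dt=2024-03-01", "dt=2024-03-01",
            "dt=2024-03-02", "dt=2024-03-02", "dt=2024-03-02", "dt=2024-03-02",
            "dt=2024-03-03", "dt=2024-03-03", "dt=2024-03-03", "dt=2024-03-03",
        };
        Map<String, Long> slow = countByScanningEntries(entries);

        Map<String, Long> partitionRows = new HashMap<>();
        partitionRows.put("dt=2024-03-01", 4L);
        partitionRows.put("dt=2024-03-02", 4L);
        partitionRows.put("dt=2024-03-03", 4L);
        Map<String, Long> fast = countFromPartitionRows(partitionRows);

        System.out.println(slow.equals(fast)); // prints "true"
    }
}
```

Both paths produce identical counts; only the number of items inspected differs, which is the whole point of the proposal.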

However, the downside is that we would have to drop the partition-level commit time and storage size, which are calculated from the entries.
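As a rough sketch of this direction, the partitions metadata table can be instantiated through the Iceberg Java API along these lines (whether this fully avoids the entry scan internally is exactly the open question here; the sketch assumes iceberg-core and iceberg-data on the classpath, and `table` is an already-loaded `Table` handle):

```java
import org.apache.iceberg.MetadataTableType;
import org.apache.iceberg.MetadataTableUtils;
import org.apache.iceberg.Table;
import org.apache.iceberg.data.IcebergGenerics;
import org.apache.iceberg.data.Record;
import org.apache.iceberg.io.CloseableIterable;

public class PartitionsTableSketch {
    // Sketch only: reads the "partitions" metadata table instead of
    // iterating every file entry ourselves.
    public static void printPartitionStats(Table table) throws Exception {
        Table partitionsTable =
            MetadataTableUtils.createMetadataTableInstance(table, MetadataTableType.PARTITIONS);
        try (CloseableIterable<Record> rows = IcebergGenerics.read(partitionsTable).build()) {
            for (Record row : rows) {
                // One row per partition: partition, spec_id, record_count, file_count, ...
                System.out.println(row);
            }
        }
    }
}
```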

@majin1102 @zhoujinsong @baiyangtx WDYT?

@huyuanfeng2018
Contributor

@link3280
Perhaps we can expect to get this from the Iceberg metadata; this information is stored in the latest Iceberg release.
apache/iceberg#8502

@link3280
Contributor Author

@link3280 Perhaps we can expect to get this from the Iceberg metadata; this information is stored in the latest Iceberg release. apache/iceberg#8502

Cool! Then we could still keep the partition storage size.
