Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ManifestFile Stats in snapshot summary. #10246

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

nk1506
Copy link
Contributor

@nk1506 nk1506 commented Apr 29, 2024

Currently snapshot summary doesn't have statistics related to Manifest Files.
This change is adding two new summary fields "total-data-manifest-files" and "total-delete-manifest-files". There is a plan to help engines with better planning estimation in terms of manifest reading.

Old Snapshot Summary

{
  "added-data-files": "1",
  "added-files-size": "389",
  "added-records": "1",
  "changed-partition-count": "1",
  "spark.app.id": "local-1714372186602",
  "total-data-files": "1",
  "total-delete-files": "0",
  "total-equality-deletes": "0",
  "total-files-size": "389",
  "total-position-deletes": "0",
  "total-records": "1"
}

New Snapshot Summary

{
  "added-data-files": "1",
  "added-files-size": "389",
  "added-records": "1",
  "changed-partition-count": "1",
  "spark.app.id": "local-1714372186602",
  "total-data-files": "1",
  "total-delete-files": "0",
  "total-equality-deletes": "0",
  "total-files-size": "389",
  "total-position-deletes": "0",
  "total-records": "1",
  "total-data-manifest-files": "1",
  "total-delete-manifest-files": "1"
}

cc: @ajantha-bhat , @nastra

@@ -190,6 +190,7 @@ public List<ManifestFile> apply(TableMetadata base, Snapshot snapshot) {
List<ManifestFile> apply = Lists.newArrayList();
Iterables.addAll(apply, newManifestsWithMetadata);
apply.addAll(keptManifests);
apply.forEach(summaryBuilder::addedManifestStats);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add manifestStats with any newManifest.

@@ -263,6 +273,12 @@ void addTo(ImmutableMap.Builder<String, String> builder) {
setIf(removedDeleteFiles > 0, builder, REMOVED_DELETE_FILES_PROP, removedDeleteFiles);
setIf(addedRecords > 0, builder, ADDED_RECORDS_PROP, addedRecords);
setIf(deletedRecords > 0, builder, DELETED_RECORDS_PROP, deletedRecords);
setIf(totalDataManifestFiles > 0, builder, TOTAL_DATA_MANIFEST_FILES, totalDataManifestFiles);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are zero counts these stats will be not added to summary.

@@ -156,6 +156,8 @@ public List<ManifestFile> apply(TableMetadata base, Snapshot snapshot) {
manifests.addAll(snapshot.allManifests(ops.io()));
}

manifests.forEach(summaryBuilder::addedManifestStats);
Copy link
Member

@jbonofre jbonofre Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stupid question: do we have a rough estimate on this change in terms of manifest append performance ? I guess it's not important, but just wondering.

Copy link
Contributor

@Fokko Fokko Apr 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also my main question.

My train of thought: You will need to read the manifest-list in any situation. The number of manifest can vary widely:

  • If FastAppends are used frequently, there will be many small manifests that you want to bundle into batches.
  • If MergeAppends are used, the manifests are rather hefty (8 megabytes by default, set using commit.manifest.target-size-bytes).

With the knowledge from the summary, you could spin up executors before reading the manifest-list, but this can be difficult since you would also need to know the sizes of the manifest to do some effective planning.

The downside is that we add extra information to the metadata-JSON, which can also grow in size when there are many commits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Fokko for feedback.

With the knowledge from the summary, you could spin up executors before reading the manifest-list, but this can be difficult since you would also need to know the sizes of the manifest to do some effective planning.

With manifest stats, It can help with better planning in terms of manifest file scans. Without these stats in metadata json either someone will have to assume or do one more IO with manifestList.

The downside is that we add extra information to the metadata-JSON, which can also grow in size when there are many commits.

I agree it adds some extra bytes to metadataJson. But I think here we have options to optimize with expire snapshots. But to avoid an extra IO for planning I dont find any other way.

WDYT ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stupid question: do we have a rough estimate on this change in terms of manifest append performance ? I guess it's not important, but just wondering.

It is only going to add cost with metadata json size. Adding it to snapshot summary is inline operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nk1506 I think the main question (at least from me) is essentially:

Is information on just the total number of data or delete manifests actually helpful for being able to help with whatever planning estimation you're trying to do?

As @Fokko said you'll probably want to know manifest sizes (in bytes) of the files involved in planning since there can be variance depending on the strategy used for the append. I think with just determining based on number of files (and not including sizes, which means you have to read the manifest anyways) there would be a lot of overprovisioning and underprovisioning (if you're trying to a distributed planning).

But definitely open to hearing more, and especially if you are able to share any data points on how this metric helped ; that would give us some more confidence that this is useful and can generalize.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally speaking, I'm in favor of adding metrics in manifest list if it helps engines during planification.
I don't see how total-data-manifest-files and total-delete-manifest-files would be helpful for the planner, but you know better than me here 😄
I would understand straight if the planner would have actually the location of the delete manifest files, as he would be able to "bypass" reading them. So maybe this PR is the first on a series where the next ones would add concrete location. Just thinking loud again.

To summarize: ok with this change generally speaking even if I don't see how it would help the planner 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amogh-jahagirdar , I agree we should also consider Manifest Size for better estimation. But don't you think we can solve this problem by rewrite_manifests.

For any engine if it is running compaction first and then doing GC. Do you think Manifest size(say every file is of ~8MB) will be relevant ?

Please share your thoughts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But definitely open to hearing more, and especially if you are able to share any data points on how this metric helped ; that would give us some more confidence that this is useful and can generalize.

+1 on this, how would knowing these 2 new stats help with planning?

  • Knowing how many can help you skip any of the manifests during planning?
  • or knowing such will help make manifest rewrite decision?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback.

Regarding the usages of manifest counts for planning here is my feedback:

  1. Having Manifest counts in advance helps to plan the parallelism. Like spark is doing after reading from ManifestList.
  2. How it will help with SnapshotSummary ?

Engine like Spark doesn't get any benefits from these stats. Since it's parallelism is dynamic with runtime in nature.

But other engines like Dremio which decides it's parallelism(during compiletime) in advance. Providing these stats will help for better parallelism.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a discussion with @nk1506 to understand it better.

Dremio (or query engines that uses CBO) need to estimate the cost of the query plan.
parallelism is one of the factor for cost estimation.

Query has planning and execution phase.
So, during planning phase we would like to know how many manifests exist to estimate the parallelism required for reading manifests without doing an actual IO of manifest list (as we want planning phase to be as fast as possible).
We currently estimate the parallelism of data files IO by reading the stats that exist in the snapshot summary. But the stats related to manifest count is missing in snapshot summary and we are unable to estimate. Hence, the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants