[Cherry-pick to branch-1.2] [#10163] improvement(optimizer): improve recommender log messaging (#10121) #10192
Merged
jerryshao merged 1 commit into branch-1.2 from Mar 4, 2026
…10121)

### What changes were proposed in this pull request?

Improve log messaging for optimizer [Recommender#recommendForStrategyType](https://github.com/apache/gravitino/blob/969f7be707334b0a11a64846fbf014ca062d3a4a/maintenance/optimizer/src/main/java/org/apache/gravitino/maintenance/optimizer/recommender/Recommender.java#L134C15-L134C39) execution.

### Why are the changes needed?

Currently there are several issues when the `recommendForStrategyType` method runs:

1. It's not clear why there is no output for candidate tables that don't match the triggering criteria.
2. [CompactionJobContext](https://github.com/apache/gravitino/blob/main/maintenance/optimizer/src/main/java/org/apache/gravitino/maintenance/optimizer/recommender/handler/compaction/CompactionJobContext.java) produces verbose output for large tables with many partitions.

Here are sample log messages from running `recommendForStrategyType` with multiple test identifiers for the `compaction` strategy type:

#### Without this patch:

```
2026-02-19 22:40:12.753 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForOneStrategy(Recommender.java:183)] - Recommend strategy compactionSmallPartitionFiles for identifiers [generic.ad.ads_hourly_onsite_insertion_data, generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid, generic.ad.ads_hourly_engagement_sampled_table_iceberg_insertion_id, generic.ad.ads_hourly_l1_sampled_data_table, generic.ad.ads_hourly_video_engagement_sampled_table_iceberg]
2026-02-19 22:40:14.505 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForOneStrategy(Recommender.java:201)] - Recommend strategy compactionSmallPartitionFiles for identifier generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid score: 45548
2026-02-19 22:40:15.859 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.job.NoopJobSubmitter.submitJob(NoopJobSubmitter.java:43)] - NoopJobSubmitter submitJob: template=compaction, identifier=generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid, jobExecuteContext=CompactionJobContext(name=generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid, jobOptions={}, jobTemplateName=compaction, columns=[org.apache.gravitino.client.GenericColumn@f434c029, org.apache.gravitino.client.GenericColumn@510cb662, org.apache.gravitino.client.GenericColumn@3f404cd3, org.apache.gravitino.client.GenericColumn@8fce398, ...
2026-02-19 22:40:15.875 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForStrategyType(Recommender.java:158)] - Submit job for strategy compactionSmallPartitionFiles with context CompactionJobContext(name=generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid, jobOptions={}, jobTemplateName=compaction, columns=[org.apache.gravitino.client.GenericColumn@f434c029, org.apache.gravitino.client.GenericColumn@510cb662, org.apache.gravitino.client.GenericColumn@3f404cd3, org.apache.gravitino.client.GenericColumn@8fce398, ...
```

#### With this patch:

```
2026-02-19 22:26:16.679 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForOneStrategy(Recommender.java:183)] - Recommend strategy compactionSmallPartitionFiles for identifiers [generic.ad.ads_hourly_onsite_insertion_data, generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid, generic.ad.ads_hourly_engagement_sampled_table_iceberg_insertion_id, generic.ad.ads_hourly_l1_sampled_data_table, generic.ad.ads_hourly_video_engagement_sampled_table_iceberg]
2026-02-19 22:26:18.154 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForOneStrategy(Recommender.java:191)] - Skip strategy compactionSmallPartitionFiles for identifier generic.ad.ads_hourly_onsite_insertion_data because strategy handler trigger condition is not met
2026-02-19 22:26:18.444 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForOneStrategy(Recommender.java:205)] - Recommend strategy compactionSmallPartitionFiles for identifier generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid score: 45548
2026-02-19 22:26:19.010 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForOneStrategy(Recommender.java:191)] - Skip strategy compactionSmallPartitionFiles for identifier generic.ad.ads_hourly_engagement_sampled_table_iceberg_insertion_id because strategy handler trigger condition is not met
2026-02-19 22:26:19.181 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForOneStrategy(Recommender.java:191)] - Skip strategy compactionSmallPartitionFiles for identifier generic.ad.ads_hourly_l1_sampled_data_table because strategy handler trigger condition is not met
2026-02-19 22:26:19.736 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForOneStrategy(Recommender.java:191)] - Skip strategy compactionSmallPartitionFiles for identifier generic.ad.ads_hourly_video_engagement_sampled_table_iceberg because strategy handler trigger condition is not met
2026-02-19 22:26:19.736 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.job.NoopJobSubmitter.submitJob(NoopJobSubmitter.java:43)] - NoopJobSubmitter submitJob: template=compaction, identifier=generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid, jobExecuteContext=CompactionJobContext{name=generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid, jobTemplateName='compaction', jobOptions={}, columnCount=5041, partitioningCount=4, partitionCount=100}
2026-02-19 22:26:19.743 INFO [main] [org.apache.gravitino.maintenance.optimizer.recommender.Recommender.recommendForStrategyType(Recommender.java:158)] - Submit job for strategy compactionSmallPartitionFiles with context CompactionJobContext{name=generic.ad.ads_daily_conversion_sampled_yellow_box_iceberg_userid, jobTemplateName='compaction', jobOptions={}, columnCount=5041, partitioningCount=4, partitionCount=100}
```

Fix: #10163

### Does this PR introduce _any_ user-facing change?

1. Adds extra INFO logs to the `gravitino-optimizer` log file for candidates that do not match the trigger condition.
2. Removes redundant fields from the `CompactionJobContext#toString` method to improve readability.

### How was this patch tested?

- [x] Manual verification with the `./gravitino-optimizer.sh -type recommend_strategy_type` CLI command

Co-authored-by: Roman Horilyi <rhorilyi@pinterest.com>
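
For readers who have not seen the diff, here is a minimal sketch of the idea behind the `toString` change: report collection sizes instead of dumping every element. The class `CompactionJobContextSketch`, its constructor, and its field types are hypothetical stand-ins for illustration only; the actual `CompactionJobContext` in Gravitino may be structured differently. Only the output format is taken from the "With this patch" log lines.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the real Gravitino class: field names and types are assumptions.
public class CompactionJobContextSketch {
  private final String name;
  private final String jobTemplateName;
  private final Map<String, String> jobOptions;
  private final List<String> columns;       // assumed: one entry per table column
  private final List<String> partitioning;  // assumed: one entry per partition transform
  private final List<String> partitions;    // assumed: one entry per selected partition

  public CompactionJobContextSketch(
      String name,
      String jobTemplateName,
      Map<String, String> jobOptions,
      List<String> columns,
      List<String> partitioning,
      List<String> partitions) {
    this.name = name;
    this.jobTemplateName = jobTemplateName;
    this.jobOptions = jobOptions;
    this.columns = columns;
    this.partitioning = partitioning;
    this.partitions = partitions;
  }

  // Summarize large collections by size so a single log line stays short,
  // mirroring the format shown in the "With this patch" log output.
  @Override
  public String toString() {
    return "CompactionJobContext{name=" + name
        + ", jobTemplateName='" + jobTemplateName + "'"
        + ", jobOptions=" + jobOptions
        + ", columnCount=" + columns.size()
        + ", partitioningCount=" + partitioning.size()
        + ", partitionCount=" + partitions.size()
        + "}";
  }

  public static void main(String[] args) {
    CompactionJobContextSketch ctx =
        new CompactionJobContextSketch(
            "db.tbl",
            "compaction",
            Map.of(),
            List.of("a", "b", "c"),
            List.of("day(ts)"),
            List.of("ts_day=2026-02-18", "ts_day=2026-02-19"));
    System.out.println(ctx);
  }
}
```

Logging counts keeps the message size constant regardless of table width or partition count, at the cost of hiding the individual column and partition values; those would need a separate DEBUG-level log if they are ever required.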
Cherry-pick Information:
branch-1.2