[HUDI-7624] Fixing index tagging duration #11035
Merged
danny0405 reviewed Apr 17, 2024
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metrics/HoodieMetrics.java
yihua force-pushed the fixIndexDurationMetrics1 branch from 244e2a2 to b4458f1 (May 13, 2024 19:32)
github-actions bot added the size:M label (PR with lines of changes in (100, 300]) and removed the size:S label (PR with lines of changes in (10, 100]), May 13, 2024
yihua force-pushed the fixIndexDurationMetrics1 branch from e0d1d60 to 074845c (May 14, 2024 01:40)
danny0405 reviewed May 14, 2024
...lient/hudi-client-common/src/main/java/org/apache/hudi/table/action/HoodieWriteMetadata.java
...lient-common/src/main/java/org/apache/hudi/table/action/commit/BaseCommitActionExecutor.java (Outdated)
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java (Outdated)
...nt/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/BaseWriteHelper.java
nsivabalan force-pushed the fixIndexDurationMetrics1 branch from cba3f6e to 13176f2 (May 14, 2024 08:42)
nsivabalan force-pushed the fixIndexDurationMetrics1 branch from 13176f2 to 205d863 (May 14, 2024 09:07)
github-actions bot added the size:S label (PR with lines of changes in (10, 100]) and removed the size:M label (PR with lines of changes in (100, 300]), May 14, 2024
danny0405 approved these changes May 14, 2024
yihua added a commit that referenced this pull request May 15, 2024
yihua added a commit that referenced this pull request May 15, 2024
yihua added a commit that referenced this pull request May 15, 2024
Change Logs
The index lookup duration we currently emit is wrong. We measure the time before and after the tag() call, but tag() is lazy, so the actual lookup has not been triggered by the time we compute and emit the duration. Within tag(), only partitioner instantiation and a few other minor steps run on the driver, and the index duration was measuring just that.
We also confirmed this from our production metrics:
Duration from stream sync for one batch of ingest: 34 mins
Delta commit duration: 32 mins
Index lookup duration (buggy): 3.6 mins
This patch fixes that. It introduces a metric named "pre_write.lookup.duration", which measures the time from the start of the write to the completion of building the workload profile. The entire DAG is triggered only when the workload profile is built, so this duration cannot be split up any further.
The previous buggy metrics are also removed.
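The timing pitfall described above can be sketched without Spark. This is a minimal illustration, not Hudi's code: `tag` here is a hypothetical stand-in that returns a deferred computation, `ManualClock` is a made-up deterministic clock, and the 5 ms setup cost and 10,000 ms lookup cost are invented numbers. It only shows why stopping the timer right after a lazy call undercounts, and why timing through the step that forces evaluation does not.

```java
import java.util.function.Supplier;

public class LazyTimingSketch {

    /** Hypothetical manual clock so the example is deterministic. */
    static final class ManualClock {
        private long nowMs = 0;
        void advance(long ms) { nowMs += ms; }
        long now() { return nowMs; }
    }

    /**
     * Stand-in for a lazy tag(): does a little eager setup and returns
     * a deferred lookup that only runs when get() is called.
     */
    static Supplier<String> tag(ManualClock clock) {
        clock.advance(5); // eager driver-side setup (invented cost)
        return () -> {
            clock.advance(10_000); // the real lookup (invented cost)
            return "tagged-records";
        };
    }

    /** Buggy pattern: the timer stops before the deferred work has run. */
    static long buggyDuration(ManualClock clock) {
        long start = clock.now();
        Supplier<String> tagged = tag(clock);
        long duration = clock.now() - start;
        tagged.get(); // the lookup happens here, outside the timed window
        return duration;
    }

    /** Fixed pattern: the timer stops after the step that forces evaluation. */
    static long fixedDuration(ManualClock clock) {
        long start = clock.now();
        Supplier<String> tagged = tag(clock);
        tagged.get(); // stand-in for the step that triggers the deferred work
        return clock.now() - start;
    }

    public static void main(String[] args) {
        System.out.println("buggy = " + buggyDuration(new ManualClock())); // buggy = 5
        System.out.println("fixed = " + fixedDuration(new ManualClock())); // fixed = 10005
    }
}
```

In this sketch the fixed measurement necessarily includes the setup time as well as the lookup, which mirrors why the new metric covers everything up to the point where the deferred work is forced.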
Testing:
Tested manually. I injected a 10-second delay into the bloom index check function (which runs on the executor); the resulting durations from the metrics are:
hudi_trips_cow.commit.duration: value = 13886
hudi_trips_cow.pre_write.lookup.duration: value = 11411
hudi_trips_cow.index.lookup.duration: value = 987
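As a quick sanity check on these numbers, the sketch below tests which reported durations are large enough to contain the injected 10-second delay. It assumes the metric values are in milliseconds, which the description does not state; the class and method names are hypothetical.

```java
public class ManualTestCheck {

    // Assumption: metric values are in milliseconds (not stated in the PR text).
    static final long INJECTED_DELAY_MS = 10_000;

    /** True if a reported duration is long enough to include the injected delay. */
    static boolean coversInjectedDelay(long metricValue) {
        return metricValue >= INJECTED_DELAY_MS;
    }

    public static void main(String[] args) {
        long commit = 13886;         // hudi_trips_cow.commit.duration
        long preWriteLookup = 11411; // hudi_trips_cow.pre_write.lookup.duration
        long indexLookup = 987;      // hudi_trips_cow.index.lookup.duration

        System.out.println("commit covers delay: " + coversInjectedDelay(commit));
        System.out.println("pre_write.lookup covers delay: " + coversInjectedDelay(preWriteLookup));
        System.out.println("index.lookup covers delay: " + coversInjectedDelay(indexLookup));
    }
}
```

Under that assumption, only the commit and pre_write.lookup values can account for the delay; the index.lookup value is too small to have included it.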
Impact
Emits a correct metric value for the pre-write duration.
Risk level (write none, low medium or high below)
low