
[HUDI-92] Provide reasonable names for Spark DAG stages in Hudi.#1289

Merged

vinothchandar merged 1 commit into apache:master from prashantwason:pw_dag_names on Jul 19, 2020

Conversation

@prashantwason
Member

What is the purpose of the pull request

Hudi DAG stages do not have names, so the Spark History Server UI shows these stages with only the Hudi Java filename and line number.

This change provides descriptive names for the stages, which makes it easier to visualize the Hudi DAG.

Brief change log

  1. All code locations that use JavaSparkContext DAG-creation methods (e.g. parallelize) have been updated to set a descriptive name for the job with the following API:
    jsc.setJobGroup(title, description);

  2. The title is the class name, so it is easy to identify; the description is the activity for that stage.

  3. Unit test code has been updated to write Spark event logs, so that even the unit tests can be visualized in a locally installed Spark History Server UI.

  4. The unit tests need to be run as follows:

mvn test -DSPARK_EVLOG_DIR=/path/for/spark/event/log

Once the unit tests complete, the Spark History Server shows the application logs for all completed unit tests. The unit tests themselves can be identified by the test-class-name as the application name.
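As a sketch of the call-site pattern described above (the class name and the title/description strings below are illustrative, not the exact ones in this PR):

```java
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical call site: label the upcoming Spark jobs before triggering a DAG.
public class ExampleActionExecutor {
  public JavaRDD<String> loadPartitions(JavaSparkContext jsc, List<String> partitions) {
    // Title = class name (easy to identify); description = the activity for this stage.
    jsc.setJobGroup(this.getClass().getSimpleName(), "Loading partitions to process");
    return jsc.parallelize(partitions, partitions.size());
  }
}
```

Any jobs triggered on this thread after the setJobGroup call show up in the Spark UI under that group name and description.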

Verify this pull request

This pull request is already covered by existing tests.

  1. All existing unit tests pass.
  2. If the SPARK_EVLOG_DIR property is not set, Spark event logging is not enabled (this is the default behavior).
  3. Manually verified the change by running unit tests locally and observing the application logs on the locally installed Spark History Server.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@n3nash
Contributor

n3nash commented Jan 29, 2020

@bvaradar @vinothchandar can you guys help review this ?

@n3nash
Contributor

n3nash commented Jan 29, 2020

@prashantwason I think you tried another approach using aspects to make the code look cleaner, right? Could you please briefly describe that approach (pros and cons) here so reviewers are also aware of it?

@n3nash n3nash self-requested a review January 29, 2020 19:00
@prashantwason
Member Author

A DAG stage name and description can be set using the JavaSparkContext.setJobGroup(...) method. The same name/description is used for all stages launched from the same thread until it is updated (another call to setJobGroup) or cleared (clearJobGroup).

In this PR, I am using the class name as the stage name and a textual description derived from the method logic. Hudi classes have very descriptive names, so this works well.

There are two ways this may be done:

  1. Manually (this PR), by adding code to set the name/description before any DAG stages are started.
  2. Using Java AOP to automatically find code locations matching some pattern and augment them with the call to setJobGroup.

To use the AOP approach, we can create a separate AspectJ file containing the Pointcuts (code locations to augment) and Advices (code to insert). A separate AspectJ compiler can then change the class bytecode at runtime to add the Advices.

Pros of AOP approach:

  1. Does not require any change in current code
  2. Also covers future code automatically
  3. Easy to undo (just don't run the AspectJ compiler as part of build)
  4. Can be extended to more use-cases like automating Metrics.

Cons of AOP approach:

  1. Requires AspectJ and its compiler to be integrated into the Hudi build chain
  2. The Advice cannot be dynamic. Hence we cannot provide descriptions to the DAG stages (we can still use the class name as the DAG stage name).

Since the code has a manageable number of places where a DAG is created, I prefer the simpler manual approach. It also ends up documenting the code.
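For comparison, the AOP alternative discussed above might look roughly like the following AspectJ aspect. This is a hypothetical sketch: the pointcut pattern and class names are illustrative, not from this PR, and it shows the limitation noted in the cons: the advice can only derive a generic name (the enclosing class), not a per-stage description.

```java
import org.apache.spark.api.java.JavaSparkContext;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

// Hypothetical aspect: sets a job group before every JavaSparkContext.parallelize(..)
// call site. Because the advice is not dynamic, only the enclosing class name is
// available as a label; a meaningful per-stage description cannot be generated.
@Aspect
public class DagNamingAspect {

  @Around("call(* org.apache.spark.api.java.JavaSparkContext.parallelize(..)) && target(jsc)")
  public Object nameStage(ProceedingJoinPoint pjp, JavaSparkContext jsc) throws Throwable {
    // Label with the class that contains the call site.
    jsc.setJobGroup(pjp.getSourceLocation().getWithinType().getSimpleName(), "");
    return pjp.proceed();
  }
}
```

Weaving this into the Hudi build would require integrating the AspectJ compiler (or load-time weaver) into the build chain, which is the first con listed above.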

@vinothchandar
Member

@prashantwason this is a great contribution for anyone debugging hudi writing... Can you post some screenshots for how upsert/bulk_insert dags now show up on the UI?

also @n3nash if you want to review this, feel free to grab this from me

@prashantwason
Member Author

Spark History Server Screenshots

List of tests

Event timeline

TestHoodieClientCopyOnWrite

@vinothchandar
Member

@prashantwason this is so awesome! Started reviewing this ..

@vinothchandar
Member

vinothchandar commented Feb 6, 2020

The unit tests need to be run as follows:
mvn test -DSPARK_EVLOG_DIR=/path/for/spark/event/log

lets add this to the README, under a new section Running Tests

@vinothchandar left a comment

Made one pass and left some general comments to clarify the detail/description text..

Few high level questions

  1. It seems like you are only covering cases where an RDD is getting created? Is it possible to change the job group between stages? For example, we have quite a few stages in HoodieBloomIndex, and I wonder if we can show them all: https://cwiki.apache.org/confluence/display/HUDI/def~bloom-index
  2. Related to 1, if the answer is no, then how does a typical bulk_insert, upsert, or insert DAG look? I am wondering if, say, the entire set of bloom index stages gets named "Obtain file ranges as range pruning is enabled", since that was the last call we made to set the name...

@prashantwason
Member Author

  1. It seems like you are only covering cases where an RDD is getting created? Is it possible to change the job group between stages?

The setJobGroup() description applies to the thread and is used until it is either updated or removed. So we need to label each stage, and they should then show up correctly.

We can label RDD creations as well as operations on them.
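Since the label is thread-scoped, it can simply be reset before each stage on the same thread. A minimal sketch of that pattern (the group names and descriptions below are illustrative, loosely modeled on the bloom-index example in this thread):

```java
import java.util.Arrays;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class StageLabelingSketch {

  static void run(JavaSparkContext jsc) {
    // Label the first stage before triggering it.
    jsc.setJobGroup("HoodieBloomIndex", "Obtain file ranges, if range pruning is enabled");
    JavaRDD<Integer> keys = jsc.parallelize(Arrays.asList(1, 2, 3));
    keys.count(); // this job appears in the UI under the label above

    // Update the label on the same thread before the next stage.
    jsc.setJobGroup("HoodieBloomIndex", "Compute candidate key/file comparisons");
    keys.map(k -> k * 2).count(); // this job appears under the new label
  }
}
```

Each action inherits whichever job group was most recently set on its calling thread, so consecutive stages get distinct names as long as the label is refreshed between them.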

@vinothchandar
Member

So we need to label each stage and they should show up correctly.

if we could do that, and if you can post the upsert() dag for example, that would be great..

@vinothchandar
Member

@prashantwason Wondering if you have some screenshots for the upsert DAG.. (I can try running the PR locally if not)

@prashantwason
Member Author

prashantwason commented Feb 24, 2020 via email

@vinothchandar
Member

@prashantwason still driving this? Can I help get this moving along?

@prashantwason
Member Author

@vinothchandar Yep, I would like this to move forward. Let me revive this, as it seems there are merge conflicts now.

We don't have this deployed yet, so the only DAG screenshots I can provide are from unit tests. On your end, can you point me to specific unit tests that exercise the DAGs you are interested in? I can generate the screenshots from those.

@vinothchandar
Member

TestMergeOnReadTable or TestClientCopyOnWriteStorage, etc., which run a full upsert DAG for COW and MOR, are good starting points.. but really, we need to run an upsert with a real job to ensure these names also show up in real deployments.

@vinothchandar
Member

@prashantwason this will be a good candidate to fast track into the next release. are you still working on this?

@vinothchandar vinothchandar changed the title [HUDI-92] Provide reasonable names for Spark DAG stages in Hudi. [WIP] [HUDI-92] Provide reasonable names for Spark DAG stages in Hudi. May 5, 2020
@nsivabalan
Contributor

@prashantwason: Once you update the PR, do let @lamber-ken know that it's ready for review.
@lamber-ken : would you mind reviewing this PR.

@prashantwason
Member Author

@lamber-ken I have updated the PR and added screenshots from the Spark History Server UI for some of the MOR table operations. Please have a look.

@prashantwason
Member Author

List of executed tests - the name is made up of the test and the method
Screenshot (1)

Various DAGs
Screenshot (2)
Screenshot (3)
Screenshot (4)
Screenshot (5)

@codecov-commenter

codecov-commenter commented May 30, 2020

Codecov Report

Merging #1289 into master will increase coverage by 0.00%.
The diff coverage is 20.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1289   +/-   ##
=========================================
  Coverage     18.19%   18.19%           
  Complexity      856      856           
=========================================
  Files           348      348           
  Lines         15344    15359   +15     
  Branches       1523     1523           
=========================================
+ Hits           2792     2795    +3     
- Misses        12195    12207   +12     
  Partials        357      357           
| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| .../org/apache/hudi/client/CompactionAdminClient.java | 0.00% <0.00%> (ø) | 0.00 <0.00> (ø) |
| ...c/main/java/org/apache/hudi/table/HoodieTable.java | 39.53% <0.00%> (-0.63%) | 22.00 <0.00> (ø) |
| ...e/hudi/table/action/clean/CleanActionExecutor.java | 10.52% <0.00%> (-0.17%) | 5.00 <0.00> (ø) |
| ...ction/compact/HoodieMergeOnReadTableCompactor.java | 0.00% <0.00%> (ø) | 0.00 <0.00> (ø) |
| ...on/rollback/MergeOnReadRollbackActionExecutor.java | 0.00% <0.00%> (ø) | 0.00 <0.00> (ø) |
| ...che/hudi/table/action/rollback/RollbackHelper.java | 0.00% <0.00%> (ø) | 0.00 <0.00> (ø) |
| ...able/action/savepoint/SavepointActionExecutor.java | 0.00% <0.00%> (ø) | 0.00 <0.00> (ø) |
| ...n/java/org/apache/hudi/index/HoodieIndexUtils.java | 47.61% <100.00%> (+2.61%) | 3.00 <1.00> (ø) |
| .../org/apache/hudi/index/bloom/HoodieBloomIndex.java | 59.00% <100.00%> (+0.41%) | 16.00 <0.00> (ø) |
| ...he/hudi/table/action/commit/UpsertPartitioner.java | 55.71% <100.00%> (+0.31%) | 15.00 <0.00> (ø) |

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5a0d3f1...ed384e6. Read the comment docs.

@n3nash n3nash changed the title [WIP] [HUDI-92] Provide reasonable names for Spark DAG stages in Hudi. [HUDI-92] Provide reasonable names for Spark DAG stages in Hudi. Jun 12, 2020
@n3nash
Contributor

n3nash commented Jun 12, 2020

@vinothchandar could you take a look at the screenshots and see if that provides what you were looking for ?

@vinothchandar
Member

@n3nash this has been since reassigned.. feel free to grab this review, I have few others queued up before I can get to this.

@n3nash
Contributor

n3nash commented Jul 8, 2020

@prashantwason can you rebase and push please ? I can then merge this

@prashantwason
Member Author

@n3nash I have rebased the changes. Build is green.

@vinothchandar vinothchandar merged commit b71f25f into apache:master Jul 19, 2020
vinishjail97 pushed a commit to vinishjail97/hudi that referenced this pull request Mar 24, 2025