[#972 ] fix(tez): Add output mapOutputByteCounter metrics #1016

Merged
merged 5 commits into from
Jul 18, 2023

Conversation

Contributor

@bin41215 bin41215 commented Jul 17, 2023

What changes were proposed in this pull request?

Add output mapOutputByteCounter metrics

Why are the changes needed?

This metric is added to fix reducing the number of tasks.

Fix: #972

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Unit tests.

@bin41215 bin41215 changed the title [MINOR]Fix(tez) add output mapOutputByteCounter metrics [#972 ] Fix(tez) add output mapOutputByteCounter metrics Jul 17, 2023
@jerqi jerqi changed the title [#972 ] Fix(tez) add output mapOutputByteCounter metrics [#972 ] fix(tez): Add output mapOutputByteCounter metrics Jul 17, 2023
@jerqi
Contributor

jerqi commented Jul 17, 2023

Could you polish your pull request description? Could you fill in the correct information instead of the default template text?

@codecov-commenter

codecov-commenter commented Jul 17, 2023

Codecov Report

Merging #1016 (6d97906) into master (4b11370) will increase coverage by 1.16%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##             master    #1016      +/-   ##
============================================
+ Coverage     53.65%   54.81%   +1.16%     
- Complexity     2520     2521       +1     
============================================
  Files           382      362      -20     
  Lines         21647    19289    -2358     
  Branches       1798     1798              
============================================
- Hits          11614    10573    -1041     
+ Misses         9328     8083    -1245     
+ Partials        705      633      -72     
Impacted Files Coverage Δ
...ez/runtime/library/common/sort/impl/RssSorter.java 57.95% <ø> (ø)
.../runtime/library/common/sort/impl/RssUnSorter.java 63.63% <ø> (ø)
...library/common/sort/buffer/WriteBufferManager.java 86.85% <100.00%> (+0.12%) ⬆️

... and 21 files with indirect coverage changes

@bin41215
Contributor Author

bin41215 commented Jul 17, 2023 via email

Configuration conf = new Configuration();
FileSystem localFs = FileSystem.getLocal(conf);
Path workingDir = new Path(System.getProperty("test.build.data",
System.getProperty("java.io.tmpdir", "/tmp")),
Contributor

Could you use org.junit.jupiter.api.io.TempDir instead of /tmp? A TempDir is cleaned up automatically, and in some machine environments we may not have permission to write to /tmp.

Contributor Author

> Could you use org.junit.jupiter.api.io.TempDir instead of /tmp? A TempDir is cleaned up automatically, and in some machine environments we may not have permission to write to /tmp.

This code needs an org.apache.hadoop.fs.Path, but TempDir only supports java.nio.file.Path and java.io.File.

Contributor

You can use similar code as below.

public void test(@TempDir File tmpDir) throws IOException {
    Path workingDir = new Path(System.getProperty("test.build.data",
        System.getProperty("java.io.tmpdir", tmpDir.toString())));
}

Contributor Author

> tmpDir.toString()

Thanks, I'll modify it later.
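
For reference, here is a minimal JDK-only sketch of the fallback idea discussed in this thread. `WorkingDirExample` and `resolveBaseDir` are hypothetical names, and `Files.createTempDirectory` merely stands in for the directory that JUnit's `@TempDir` would inject. One caveat: the JVM normally defines `java.io.tmpdir`, so a default passed to `System.getProperty("java.io.tmpdir", ...)` is rarely used. The sketch therefore falls back to the temporary directory directly when the property is unset.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper, not part of the PR: picks the base directory for a test.
final class WorkingDirExample {

  // Returns the value of the given system property if it is set,
  // otherwise the absolute path of the supplied temporary directory.
  static String resolveBaseDir(String propertyName, File tmpDir) {
    return System.getProperty(propertyName, tmpDir.getAbsolutePath());
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for the directory that @TempDir would inject in a JUnit 5 test.
    File tmpDir = Files.createTempDirectory("rss-test").toFile();
    tmpDir.deleteOnExit();

    // Uses test.build.data when defined, else the temporary directory.
    System.out.println(resolveBaseDir("test.build.data", tmpDir));
  }
}
```

In a Hadoop-based test, the resulting string could then be passed to the `org.apache.hadoop.fs.Path(String)` constructor to obtain the `Path` the test needs.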

HashPartitioner.class.getName());
conf.setStrings(TezRuntimeFrameworkConfigs.LOCAL_DIRS, workingDir.toString());
OutputContext outputContext = OutputTestHelpers.createOutputContext(conf, workingDir);
TezCounter mapOutputByteCounter = outputContext.getCounters().findCounter(TaskCounter.OUTPUT_BYTES);
Contributor

Could we add some assertions to verify the resulting metric values?
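
As an illustration of the kind of assertion being requested, here is a self-contained sketch. `ByteCounter` and `RecordingWriter` are hypothetical stand-ins for the Tez counter and the buffer manager. Counting raw key and value bytes is an assumption, since the thread does not show how the real code accounts for serialization overhead.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;

final class OutputBytesCounterExample {

  // Hypothetical stand-in for a counter such as TaskCounter.OUTPUT_BYTES.
  static final class ByteCounter {
    private final AtomicLong value = new AtomicLong();

    void increment(long delta) {
      value.addAndGet(delta);
    }

    long getValue() {
      return value.get();
    }
  }

  // Hypothetical writer that records the size of every record it accepts.
  static final class RecordingWriter {
    private final ByteCounter outputBytes;

    RecordingWriter(ByteCounter outputBytes) {
      this.outputBytes = outputBytes;
    }

    void addRecord(byte[] key, byte[] value) {
      // Assumption: the metric is the sum of raw key and value lengths.
      outputBytes.increment((long) key.length + value.length);
    }
  }

  // Writes each {key, value} pair and returns the final counter value.
  static long writeAndCount(String[][] records) {
    ByteCounter counter = new ByteCounter();
    RecordingWriter writer = new RecordingWriter(counter);
    for (String[] record : records) {
      writer.addRecord(
          record[0].getBytes(StandardCharsets.UTF_8),
          record[1].getBytes(StandardCharsets.UTF_8));
    }
    return counter.getValue();
  }

  public static void main(String[] args) {
    long actual = writeAndCount(new String[][] {{"k1", "v1"}, {"key2", "value2"}});
    long expected = 2 + 2 + 4 + 6; // byte lengths of the ASCII keys and values
    if (actual != expected) {
      throw new IllegalStateException("expected " + expected + " but got " + actual);
    }
    System.out.println(actual); // prints 14
  }
}
```

In the real test, the equivalent check would presumably compare `mapOutputByteCounter.getValue()` against the number of bytes known to have been written.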

Configuration conf = new Configuration();
FileSystem localFs = FileSystem.getLocal(conf);
Path workingDir = new Path(System.getProperty("test.build.data",
System.getProperty("java.io.tmpdir", "/tmp")),
Contributor

ditto.

HashPartitioner.class.getName());
conf.setStrings(TezRuntimeFrameworkConfigs.LOCAL_DIRS, workingDir.toString());
OutputContext outputContext = OutputTestHelpers.createOutputContext(conf, workingDir);
TezCounter mapOutputByteCounter = outputContext.getCounters().findCounter(TaskCounter.OUTPUT_BYTES);
Contributor

ditto.

@jerqi jerqi requested review from zhengchenyu and lifeSo July 17, 2023 16:11
@jerqi
Contributor

jerqi commented Jul 17, 2023

@lifeSo @zhengchenyu Could you help review this PR?

@zhengchenyu
Collaborator

I see that only mapOutputByteCounter was added. Do you plan to add other counters?
I have no problem with the code. Can we add a unit test that compares the counter generated by RssSorter with the counter generated by DefaultSort?

@zhengchenyu
Collaborator

LGTM +1

@jerqi
Contributor

jerqi commented Jul 18, 2023

You should run the following command to format the files:

mvn spotless:apply -Pspark3 -Pspark2 -Ptez -Pmr -Phadoop2.8

@lifeSo
Collaborator

lifeSo commented Jul 18, 2023

> Could you polish your pull request description? Could you fill in the correct information instead of the default template text?

OK.

Collaborator

@lifeSo lifeSo left a comment

ok, good!

Contributor

@jerqi jerqi left a comment

LGTM, thanks @bin41215 @lifeSo @zhengchenyu, merged to master.

@jerqi jerqi merged commit 976ae9a into apache:master Jul 18, 2023
30 checks passed
Development

Successfully merging this pull request may close these issues.

[Subtask] Tez client add missed metric
5 participants