Skip to content
This repository was archived by the owner on Jul 10, 2024. It is now read-only.

SUBMARINE-308. Support PyTorch with TonY runtime#145

Closed
pingsutw wants to merge 3 commits intoapache:masterfrom
pingsutw:SUBMARINE-308
Closed

SUBMARINE-308. Support PyTorch with TonY runtime#145
pingsutw wants to merge 3 commits intoapache:masterfrom
pingsutw:SUBMARINE-308

Conversation

@pingsutw
Copy link
Copy Markdown
Member

@pingsutw pingsutw commented Jan 8, 2020

What is this PR for?

Support PyTorch with TonY runtime

What type of PR is it?

[Bug Fix]

Todos

  • - Task

What is the Jira issue?

https://issues.apache.org/jira/browse/SUBMARINE-308

How should this be tested?

https://travis-ci.org/pingsutw/hadoop-submarine/builds/634462654

Screenshots (if appropriate)

Screenshot from 2020-01-09 06-31-04

Questions:

  • Does the licenses files need update? No
  • Is there breaking changes for older versions? No
  • Does this needs documentation? No

@jojochuang
Copy link
Copy Markdown
Contributor

Change looks good. Test failure appears unrelated.

@pingsutw
Copy link
Copy Markdown
Member Author

pingsutw commented Jan 9, 2020

@jojochuang thanks for the review, I re-push the patch to trigger Travis ci

Copy link
Copy Markdown

@adamantal adamantal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @pingsutw. Looks good, had some minor nits.

I saw that you already used the patch against a YARN cluster, and it worked which is good. May I ask you whether you plan adding some unit testing to this issue?

}

LOG.info("Starting Tony runtime..");

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a log entry here about the type of the framework?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

tonyConf = YarnUtils.tonyConfFromClientContext(
(ParametersHolder) parameters);
} catch (Exception e) {
throw new RuntimeException("Failed to create tony conf from client context", e);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe throw a SubmarineException here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// Resources for PS & Worker
if (parameters.getPsResource() != null) {
if (parameters.getOptionValue(CliConstants.PS_RES) != null) {
Resource resource = getResource(parameters, CliConstants.PS_RES);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Resource resource = getResource(parameters, CliConstants.PS_RES);
Resource psResource = getResource(parameters, CliConstants.PS_RES);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
if (parameters.getWorkerResource() != null) {
if (parameters.getOptionValue(CliConstants.WORKER_RES) != null) {
Resource resource = getResource(parameters, CliConstants.WORKER_RES);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Resource resource = getResource(parameters, CliConstants.WORKER_RES);
Resource workerResource = getResource(parameters, CliConstants.WORKER_RES);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


private static Resource getResource(Parameter parametersHolder, String option)
throws ParseException, YarnException {
String ResourceStr =
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
String ResourceStr =
String resourceStr =

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

throws ParseException, YarnException {
String ResourceStr =
parametersHolder.getOptionValue(option);
if (ResourceStr == null) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (ResourceStr == null) {
if (resourceStr == null) {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@pingsutw
Copy link
Copy Markdown
Member Author

Thanks @adamantal for the review.
I commit a new patch and add uint test for PyTorch

@pingsutw
Copy link
Copy Markdown
Member Author

Will merge if no more comments

@asfgit asfgit closed this in 74cf91e Jan 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants