SUBMARINE-321. Add JobManager and SubmitterManager components #135
Thanks @jiwq for the contribution. It looks good to me in general. I have several comments; please check them.
 * The replica spec for TFJob. It contains replicas, restart policy and pod template.
 * The template describe the running instance for task.
 */
public class TFReplicaSpec {
 * TensorFlow: Chief, Ps, Worker, Evaluator
 * PyTorch: Master, Worker
 */
private Map<String, JobTaskSpec> taskSpecs;
Thanks @tangzhankun for the review.
/**
 * The job task name, the range is [Chief, Ps, Worker, Evaluator, Master]
 */
private String name;
It LGTM. +1. Thanks for your contribution. @jiwq
@jiwq,
Thanks for the contributions.
What is this PR for?
The JobManager is responsible for caching jobs and scheduling them to a submitter.
The SubmitterManager helps load the different submitter plugins.
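A rough sketch of how the two components described above could fit together is shown here. It is not the code from this PR: the `JobSubmitter` interface, its method names, the `register` call, and the `job_N` id format are all hypothetical, and plugins are registered by hand rather than loaded dynamically. Submission is also synchronous, whereas the real JobManager may queue jobs before handing them to a submitter.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical plugin contract; the interface in the PR may look different. */
interface JobSubmitter {
  /** A short key identifying the backend, e.g. "k8s" (assumed naming). */
  String getSubmitterType();

  /** Hands the job spec to the backend. The spec is a plain String here for brevity. */
  void submitJob(String jobSpec);
}

/** Keeps the available submitters, keyed by type. */
class SubmitterManager {
  private final Map<String, JobSubmitter> submitters = new ConcurrentHashMap<>();

  /** Assumption: explicit registration stands in for real plugin loading. */
  void register(JobSubmitter submitter) {
    submitters.put(submitter.getSubmitterType(), submitter);
  }

  Optional<JobSubmitter> getSubmitter(String type) {
    return Optional.ofNullable(submitters.get(type));
  }
}

/** Caches submitted jobs and forwards each one to the matching submitter. */
class JobManager {
  private final SubmitterManager submitterManager;
  private final Map<String, String> jobs = new ConcurrentHashMap<>();
  private final AtomicInteger counter = new AtomicInteger();

  JobManager(SubmitterManager submitterManager) {
    this.submitterManager = submitterManager;
  }

  /** Caches the job, forwards it to the submitter, and returns the generated job id. */
  String submitJob(String submitterType, String jobSpec) {
    JobSubmitter submitter = submitterManager.getSubmitter(submitterType)
        .orElseThrow(() -> new IllegalArgumentException(
            "No submitter registered for type: " + submitterType));
    String jobId = "job_" + counter.incrementAndGet();
    jobs.put(jobId, jobSpec);
    submitter.submitJob(jobSpec);
    return jobId;
  }

  /** Returns the cached spec, or null if the id is unknown. */
  String getJob(String jobId) {
    return jobs.get(jobId);
  }
}
```

Keeping the submitter lookup behind `SubmitterManager` means the `JobManager` never needs to know which backends exist, which is one plausible reason for splitting the two components.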
What type of PR is it?
Feature
Todos
[ ] Hook REST server and JobManager
What is the Jira issue?
https://issues.apache.org/jira/browse/SUBMARINE-321
How should this be tested?
Screenshots (if appropriate)
Questions: