Adding Environment URL field in Phase model #2392

Merged: 10 commits merged into Cloud-CV:master on Dec 23, 2019

@krtkvrm
Member

krtkvrm commented Jul 17, 2019

Added an environment_url field to the ChallengePhase model. It stores the repository URL of the environment that the RL worker will use.
The worker can fetch the environment_url from api/challenges/phase/environment/<phase-slug>.

Signed-off-by: vkartik97 <3920286+vkartik97@users.noreply.github.com>
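
To make the worker side of this description concrete, here is a minimal sketch of how a worker might build the endpoint URL and read the field from the response. Only the path api/challenges/phase/environment/<phase-slug> and the field name environment_url come from the description above. The host name, the phase slug, the JSON response shape, and both helper names are assumptions made for illustration, not the code in this PR.

```python
import json
from urllib.parse import quote

# Path taken from the PR description; everything else below is assumed.
ENVIRONMENT_PATH = "api/challenges/phase/environment/"


def environment_endpoint(base_url: str, phase_slug: str) -> str:
    """Build the endpoint URL for a phase slug (hypothetical helper)."""
    # Percent-encode the slug so unexpected characters cannot alter the path.
    return f"{base_url.rstrip('/')}/{ENVIRONMENT_PATH}{quote(phase_slug, safe='')}"


def parse_environment_url(response_body: str) -> str:
    """Read environment_url from a JSON body (the response shape is assumed)."""
    payload = json.loads(response_body)
    url = payload.get("environment_url")
    if not url:
        raise ValueError("response does not contain an environment_url")
    return url


# Example with a placeholder host, slug, and a hand-written response body.
endpoint = environment_endpoint("https://evalai.example.org", "demo-phase-1")
repo_url = parse_environment_url('{"environment_url": "https://github.com/example/env"}')
```

The HTTP request itself is left out on purpose, since the authentication scheme the worker uses is not described in this thread.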
@krtkvrm
Member Author

krtkvrm commented Jul 17, 2019

@yashdusing
Contributor

Hey @vkartik97! I read your blog and was wondering if you tried this?
Also, you mentioned a problem with GPU orchestration. Could you please elaborate?

@krtkvrm
Member Author

krtkvrm commented Jul 18, 2019

@yashdusing

> Hey @vkartik97! I read your blog and was wondering if you tried this?

Yes, this is already enabled on EC2 P2, but I wanted to share GPU resources among the environment containers across multiple pods.

> Also, you mentioned a problem with GPU orchestration. Could you please elaborate?

GPU orchestration on P2 works as expected; the only problem is the sharing extension. I tried it on a single-node minikube cluster, and it did not behave as expected.
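
For context on why sharing needs an extension at all: with the NVIDIA device plugin, Kubernetes exposes GPUs as the extended resource nvidia.com/gpu, which is requested under limits in whole units and is not shared between containers. The sketch below builds such a Pod manifest as a plain Python dict. The helper name, pod name, and image are hypothetical, and the sharing extension discussed above is not shown.

```python
def gpu_pod_manifest(name: str, image: str, gpus: int) -> dict:
    """Build a minimal Pod manifest that requests whole GPUs (hypothetical helper)."""
    # The stock device plugin only accepts whole GPUs, so reject fractions early.
    if isinstance(gpus, bool) or not isinstance(gpus, int) or gpus < 1:
        raise ValueError("nvidia.com/gpu must be a positive whole number")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested under limits, in whole units.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Placeholder pod name and image; each such pod claims one GPU exclusively.
manifest = gpu_pod_manifest("rl-env", "example/rl-environment:latest", 1)
```

Two pods built this way on a single-GPU node would not run side by side, which is the limitation the sharing experiment was trying to work around.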

@yashdusing
Contributor

@vkartik97 I was wondering whether this can be used on top of k8s? It has fractional GPU support and also runs on top of k8s.

@krtkvrm
Member Author

krtkvrm commented Jul 30, 2019

@yashdusing Thanks a lot for researching this topic.
I did try ray-project at the end of Phase 2 and ran into an issue similar to this one: [ray] ray misuse gpu in docker container. It came up while I was running TensorFlow training on nvidia-docker, one of the dependencies for Kubernetes GPU scheduling. I believe the issue is caused by nvidia-docker, but I need to investigate further before drawing any conclusion.
Again, thanks for working on this :)

Signed-off-by: vkartik97 <3920286+vkartik97@users.noreply.github.com>
@codecov-io

codecov-io commented Jul 30, 2019

Codecov Report

Merging #2392 into master will decrease coverage by 0.05%.
The diff coverage is 53.84%.

@@            Coverage Diff             @@
##           master    #2392      +/-   ##
==========================================
- Coverage   72.74%   72.68%   -0.06%     
==========================================
  Files          82       82              
  Lines        5316     5327      +11     
==========================================
+ Hits         3867     3872       +5     
- Misses       1449     1455       +6
| Impacted Files | Coverage Δ | |
|---|---|---|
| apps/challenges/views.py | 100% <ø> (ø) | ⬆️ |
| apps/jobs/urls.py | 100% <ø> (ø) | ⬆️ |
| apps/jobs/views.py | 100% <ø> (ø) | ⬆️ |
| scripts/workers/remote_submission_worker.py | 37.02% <53.84%> (+0.3%) | ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7adb66c...3f412f1.
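
The report above quotes a decrease of 0.05% in its summary and -0.06% in the diff table. A short worked calculation from the Hits and Lines rows shows one reading that is consistent with both figures. The truncation to two decimals is an assumption about how the numbers were displayed, not something the report states, and the helper name is hypothetical.

```python
from fractions import Fraction


def coverage_hundredths(hits: int, lines: int) -> int:
    """Coverage in hundredths of a percent, truncated rather than rounded."""
    return hits * 10000 // lines


# Hits and Lines are copied from the coverage diff in the report.
master = coverage_hundredths(3867, 5316)  # master column
pr = coverage_hundredths(3872, 5327)      # #2392 column

# Difference of the two displayed (already truncated) percentages.
displayed_delta = pr - master

# Exact difference computed first, then truncated to hundredths of a percent.
exact_drop = (Fraction(3867, 5316) - Fraction(3872, 5327)) * 10000
truncated_drop = int(exact_drop)

print(master / 100, pr / 100, displayed_delta / 100, truncated_drop / 100)
```

Under that assumption the columns come out as 72.74 and 72.68, subtracting the displayed values gives -0.06, and truncating the exact difference gives 0.05.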

@stale

stale bot commented Sep 28, 2019

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the inactivity label Sep 28, 2019
@stale

stale bot commented Oct 5, 2019

This pull request has been automatically closed as there is no further activity. Thank you for your contributions.

scripts/seed.py Outdated
@@ -175,11 +175,15 @@ def create_challenge_phases(challenge, number_of_phases=1):
challenge_phases = []
Member

Can you please open up a new PR with the changes in this file?

Member Author

@krtkvrm krtkvrm Dec 23, 2019

Done: #2533

@RishabhJain2018 RishabhJain2018 merged commit dcbc4f4 into Cloud-CV:master Dec 23, 2019
4 participants