yan-973 test nifty cuda docker image #137

Closed
wants to merge 1,766 commits into from
Conversation

calgray
Collaborator

@calgray calgray commented Apr 11, 2022

Minor daliuge-engine Docker argument changes for better CUDA compatibility.

pritchardn and others added 30 commits February 4, 2022 15:08
Liu 221 - Generate template.palette in DALiuGE
@coveralls

coveralls commented Apr 11, 2022

Coverage Status

Coverage decreased (-2.8%) to 64.027% when pulling 9d711c7 on yan-973 into 001fa1d on master.

@calgray calgray self-assigned this Apr 19, 2022
@@ -2,6 +2,7 @@
DOCKER_OPTS="\
--shm-size=2g --ipc=shareable \
--rm \
--gpus=all \
Collaborator Author

@calgray calgray Apr 19, 2022

As a sanity check, could someone test that this doesn't cause an unknown argument error in an environment without nvidia-docker?

Contributor

I'll give it a go. This is my first time using these build/run scripts based on Docker so it might take a bit

Contributor

Running Engine development version in background...
docker run -td --shm-size=2g --ipc=shareable --rm --gpus=all --name daliuge-engine -v /var/run/docker.sock:/var/run/docker.sock -p 5555:5555 -p 6666:6666 -p 8000:8000 -p 8001:8001 -p 8002:8002 -p 9000:9000 --user 1000:1000  --group-add 134 -v /home/rtobar/dlg/workspace/settings/passwd:/etc/passwd -v /home/rtobar/dlg/workspace/settings/group:/etc/group -v /home/rtobar/dlg:/home/rtobar/dlg --env DLG_ROOT=/home/rtobar/dlg  icrar/daliuge-engine:yan-973
51a0d17da1384a237d5271cf067bc0d3b37cedd0a05d3222f6d699895cda750c
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Contributor

(and indeed without the --gpus=all option it starts)

Collaborator Author

@calgray calgray May 17, 2022

Thanks. I think the best workaround in this case may be to wrap the arg based on the presence of nvidia-docker, e.g.

$([[ $(nvidia-docker version) ]] && echo '--gpus=all' || echo '')

or

$(if command -v nvidia-docker >/dev/null 2>&1; then echo '--gpus=all'; else echo ''; fi)
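The conditional argument suggested above could be wrapped in a small helper so the run script stays readable. This is only a sketch, not the change that was merged: `gpu_opt` is a hypothetical name, and it assumes that finding `nvidia-docker` on `PATH` is a good enough proxy for a host where `--gpus=all` will be accepted. It uses `command -v`, which searches `PATH` and works in plain POSIX `sh` as well as bash.

```shell
# Hypothetical helper (not part of the PR): print --gpus=all only when the
# probe command is found on PATH. The probe defaults to nvidia-docker, on the
# assumption that its presence implies a usable NVIDIA container runtime.
gpu_opt() {
    if command -v "${1:-nvidia-docker}" >/dev/null 2>&1; then
        echo '--gpus=all'
    fi
}

# The command substitution expands to nothing on hosts without the probe
# command, so the remaining options are passed to docker unchanged.
DOCKER_OPTS="--shm-size=2g --ipc=shareable --rm $(gpu_opt)"
```

The optional argument exists only so the check can be exercised on a machine without any NVIDIA tooling, for example `gpu_opt sh` versus `gpu_opt some-missing-command`.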

@calgray calgray requested review from awicenec and rtobar April 19, 2022 02:57
@awicenec awicenec force-pushed the master branch 2 times, most recently from eb80e22 to 06cd2c3 Compare May 19, 2022 12:50
@calgray
Collaborator Author

calgray commented May 20, 2022

rebased to #157

7 participants