running hvd-tensorflow in docker pulled from official docker hub, got " Read -1, expected 131072, errno = 1" #653

Closed
yyht opened this issue Nov 26, 2018 · 7 comments

@yyht yyht commented Nov 26, 2018

[image]
What does this message mean? Did I do something wrong?
[image]
The process was still running, though.

@byan23 byan23 commented Nov 27, 2018

See this part of the docs: https://github.com/uber/horovod/blob/master/docs/docker.md
"If you don't run your container in privileged mode, you may see the following message:"

So you can simply ignore it, or try adding "--privileged" to your docker run command.
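
For illustration, here is a minimal Python sketch of the second option: it assembles a `docker run` argument list that includes the `--privileged` flag. The helper names, the image name, and the training command in the usage comment are hypothetical placeholders, not something taken from this thread or the Horovod docs.

```python
import subprocess
from typing import List, Sequence


def build_docker_run(image: str, command: Sequence[str],
                     privileged: bool = True) -> List[str]:
    """Return the argument list for `docker run`, optionally privileged."""
    argv = ["docker", "run"]
    if privileged:
        # --privileged gives the container extended privileges on the host.
        argv.append("--privileged")
    argv.append(image)
    argv.extend(command)
    return argv


def run_container(image: str, command: Sequence[str],
                  privileged: bool = True) -> int:
    """Start the container and return its exit code (requires Docker)."""
    return subprocess.run(build_docker_run(image, command, privileged)).returncode


# Hypothetical usage (image and script names are placeholders):
# run_container("my-horovod-image", ["python", "train.py"])
```

Passing `privileged=False` builds the same command without the flag, which matches the "simply ignore it" option.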

@tgaddair tgaddair added the question label Nov 28, 2018

@tgaddair tgaddair (Collaborator) commented Nov 28, 2018

Hey @yyht, did the suggestion given by @byan23 solve your issue?

@yyht yyht (Author) commented Nov 30, 2018

Yes, thanks @tgaddair @byan23. Horovod is excellent work and helps me a lot with large-scale training.

@yyht yyht closed this Nov 30, 2018

@tgaddair tgaddair (Collaborator) commented Nov 30, 2018

Thanks @yyht, glad it's working out for you!

@MichaelX99 MichaelX99 commented Dec 11, 2018

I am trying to run Horovod inside a Docker container on AWS SageMaker, which requires the container to run without privileges. The message is effectively a warning, since training does work, but it clutters up the log files. Is there some way to catch these warnings so they are not written to STDOUT?

@alsrgv alsrgv (Member) commented Dec 14, 2018

cc @karakusc, do you know if SageMaker allows using privileged containers? Or is it possible to use nvidia-docker? I don't seem to get the "Read" error with it.

@karakusc karakusc (Collaborator) commented Dec 14, 2018

@alsrgv @MichaelX99 SageMaker does not allow running privileged containers, but you can set up a filter in CloudWatch to suppress these warnings.
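
Where the training command is launched from a wrapper process you control, the same idea can be sketched in Python: drop lines that match the message from the issue title before they reach the log. This is only a sketch under assumptions: the pattern is taken from the title of this issue, the function names are hypothetical, and it assumes the message arrives on the child process's stdout or stderr. It is not a CloudWatch filter.

```python
import re
import subprocess
import sys
from typing import Iterable, Iterator, Sequence

# Pattern based on the message in the issue title; the byte count may vary.
READ_WARNING = re.compile(r"Read -1, expected \d+, errno = 1")


def drop_read_warnings(lines: Iterable[str]) -> Iterator[str]:
    """Yield every line except those containing the 'Read -1' message."""
    for line in lines:
        if not READ_WARNING.search(line):
            yield line


def run_filtered(command: Sequence[str]) -> int:
    """Run a command, forwarding its output minus the filtered lines."""
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so both streams are filtered
        text=True,
    )
    for line in drop_read_warnings(proc.stdout):
        sys.stdout.write(line)
    return proc.wait()
```

A wrapper like this only hides the lines; it does not change what causes them, so training behaves the same as before.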
