SINGA-429 update dockerfile for cuda cudnn update #437

Merged
merged 7 commits into from Mar 10, 2019

dcslin (Member) commented Mar 1, 2019

Hi team,

Please find below the changes for the CUDA and cuDNN version update.

  • Install cmake 3.14 manually. Reason: according to my testing and an online search, cmake 3.10 (the version distributed via apt-get install) cannot detect CUDA 10 properly. This is fixed in newer versions.

Kindly let me know your comments. Thank you.
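
The cmake requirement described above could be guarded with a small version check before the build starts. This is a minimal sketch, not part of the PR: the helper name `version_ge` is hypothetical, the 3.14.0 threshold is simply the version this PR installs (the exact minimum that detects CUDA 10 is not stated in the thread), and `sort -V` assumes GNU coreutils as shipped in Ubuntu.

```shell
# Hypothetical helper: succeeds when version $1 is >= version $2.
version_ge() {
    # sort -V orders version strings; if the smaller of the two is $2,
    # then $1 is at least $2.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Assumed threshold: the version installed manually in this PR.
REQUIRED_CMAKE="3.14.0"

if command -v cmake >/dev/null 2>&1; then
    # "cmake --version" prints "cmake version X.Y.Z" on its first line.
    found=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$found" "$REQUIRED_CMAKE"; then
        echo "cmake $found is new enough"
    else
        echo "cmake $found is older than $REQUIRED_CMAKE" >&2
    fi
fi
```

A check like this could run early in an image build so that an outdated cmake is reported before compilation begins.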

dcslin (Member, Author) commented Mar 1, 2019

Also, please advise: do we need cuDNN 7.5, as mentioned in https://issues.apache.org/jira/browse/SINGA-429 ?

The official Docker image from NVIDIA currently offers only cuDNN 7.4.2:
https://gitlab.com/nvidia/cuda/blob/ubuntu18.04/10.0/devel/cudnn7/Dockerfile
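
To confirm which cuDNN version an image actually ships, one option is to read the version macros from the header. The sketch below assumes the cuDNN 7.x layout, where `cudnn.h` defines `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL`. The function name and the `/usr/include/cudnn.h` path are assumptions for illustration.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCH parsed from a cuDNN 7.x header.
cudnn_version_from_header() {
    header="$1"
    # Each macro appears as a line of the form: #define CUDNN_MAJOR 7
    major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" {print $3}' "$header")
    minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" {print $3}' "$header")
    patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {print $3}' "$header")
    printf '%s.%s.%s\n' "$major" "$minor" "$patch"
}

# Assumed header location inside the image; adjust if it differs.
if [ -r /usr/include/cudnn.h ]; then
    cudnn_version_from_header /usr/include/cudnn.h
fi
```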

dcslin changed the title from "update dockerfile for cuda cudnn update" to "SINGA-429 update dockerfile for cuda cudnn update" on Mar 1, 2019

nudles (Member) commented Mar 3, 2019

Please add the commands for compiling and installing SINGA.
When users pull the image and run the container, they should be able to run
import singa without errors.
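
The requirement above could be expressed as a smoke test run inside the container. This is a sketch only: the helper name `can_import` and the `PYTHON` override variable are hypothetical, and it assumes a `python3` interpreter is on the PATH in the image.

```shell
# Hypothetical helper: succeeds when the given Python module can be imported.
# PYTHON is an assumed override for the interpreter; it defaults to python3.
can_import() {
    "${PYTHON:-python3}" -c "import $1" >/dev/null 2>&1
}

if can_import singa; then
    echo "singa imports cleanly"
else
    echo "singa is not importable in this environment" >&2
fi
```

Running a check of this kind as the last step of an image build would make the build fail early if the SINGA installation is incomplete.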

dcslin (Member, Author) commented Mar 5, 2019

Hi @nudles, this is almost done. I have been stuck on a weird behaviour in cmake (version > 3.12) and am investigating it:
make reports a syntax error in the generated makefile build/python/CMakeFiles/_singa_wrap.dir/flags.make

nudles (Member) commented Mar 7, 2019

How about MKL-DNN? Can we install it in the Docker image and let SINGA use it?

dcslin (Member, Author) commented Mar 7, 2019

Hi @nudles, I am working on MKL, but I am trying to make it optional, since for src/model/operation/*, MKLDNN actually "overwrites" the CUDA implementation.

nudles (Member) commented Mar 7, 2019

MKLDNN is called when the code is running on CPU; cudnn is called when it is running on GPU.
There should be no conflict.

dcslin (Member, Author) commented Mar 8, 2019

Hi @nudles, this is basically done:

  • python3, cuda9, devel
  • python3, cuda10, devel
  • mkldnn enabled

However, it is not yet ready to merge, because it depends on:

  1. A workaround for the cmake issue (the generated makefile has a syntax error). I am still experimenting to understand the root cause.
    dcslin@a871dac
  2. The mkldnn PR:
    Integrated MKL-dnn for operation in model #431

# ENV CMAKE_INCLUDE_PATH /usr/local/cuda/include:${CMAKE_INCLUDE_PATH}
# ENV CMAKE_LIBRARY_PATH /usr/local/cuda/lib64:${CMAKE_LIBRARY_PATH}
# config ssh service
&& mkdir /var/run/sshd \
# install mkldnn
RUN git clone https://github.com/intel/mkl-dnn.git /tmp/mkl-dnn \
Member

It would be better to pin the version, e.g., by downloading https://github.com/intel/mkl-dnn/archive/v1.0-pc.zip
Otherwise, when the GitHub repo is updated with API changes, the compilation of SINGA may fail.
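
One way to pin the source, sketched under assumptions: the tag `v1.0-pc` and the archive URL pattern come from the comment above, while the helper names `mkldnn_archive_url` and `fetch_mkldnn` and the use of `curl` are illustrative choices, not what the PR necessarily ended up doing.

```shell
# Tag taken from the review comment; change it deliberately when upgrading.
MKLDNN_TAG="v1.0-pc"

# Hypothetical helper: build the archive URL for a given tag.
mkldnn_archive_url() {
    printf 'https://github.com/intel/mkl-dnn/archive/%s.zip\n' "$1"
}

# Hypothetical helper: download the pinned archive.
# $1: tag, $2: output file. Requires curl and network access.
fetch_mkldnn() {
    curl -fsSL "$(mkldnn_archive_url "$1")" -o "$2"
}

echo "pinned source: $(mkldnn_archive_url "$MKLDNN_TAG")"
```

Keeping the tag in a single variable means a later upgrade is an explicit one-line change rather than a side effect of upstream commits.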

Member Author

Thank you, you are right. I have fixed it.

libgoogle-glog-dev \
sudo \
&& apt-get clean \
&& apt-get autoremove \
Member

Do you need to install icc and icpc to compile MKL-DNN?

Member Author

Hi, I will review this. For now, with the "manual fixing cmake file" workaround, I can run the mkldnn test.

@nudles nudles merged commit 5eea07f into apache:master Mar 10, 2019
2 participants