Upgrade to CuDNN 7 and CUDA 9 #12052

Closed
tpankaj opened this Issue Aug 4, 2017 · 170 comments


tpankaj commented Aug 4, 2017

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows Server 2012
  • TensorFlow installed from (source or binary): binary
  • TensorFlow version (use command below): 1.3.0-rc1
  • Python version: 3.5.2
  • Bazel version (if compiling from source): N/A
  • CUDA/cuDNN version: CUDA V8.0.44, CuDNN 6.0
  • GPU model and memory: Nvidia GeForce GTX 1080 Ti, 11 GB
  • Exact command to reproduce: N/A

Describe the problem

Please upgrade TensorFlow to support CUDA 9 and CuDNN 7. Nvidia claims this will provide a 2x performance boost on Pascal GPUs.

shivaniag commented Aug 4, 2017
Contributor

@tfboyd do you have any comments on this?

tfboyd commented Aug 5, 2017
Member

cuDNN 7 is still in preview mode and is being worked on. We just moved to cuDNN 6.0 with 1.3, which should go final in a couple of weeks. You can download TensorFlow 1.3.0rc2 if you are interested in that. I have not compiled with cuDNN 7 or CUDA 9 yet. I have heard CUDA 9 is not easy to install on all platforms and only select install packages are available. When the libraries are final we will start the final evaluation. NVIDIA has also just started sending patches to the major ML platforms to support aspects of these new libraries, and I suspect there will be additional work.

Edit: I meant to say CUDA 9 is not easy to install on all platforms and instead said cuDNN. I also changed "sure there will be work" to "I suspect there will be additional work". The rest of my silly statement I left, e.g. I did not realize cuDNN 7 went live yesterday.
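
For anyone who wants to try the release candidate mentioned above, a minimal install sketch (assuming the 1.3.0rc2 GPU wheel is published on PyPI for your platform, and that CUDA 8.0 and cuDNN 6.0 are already on the library path):

pip install --upgrade tensorflow-gpu==1.3.0rc2
python -c "import tensorflow as tf; print(tf.__version__)"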

@tfboyd tfboyd self-assigned this Aug 5, 2017

tfboyd commented Aug 5, 2017
Member

Not saying how you should read the website, but the 2x faster on Pascal claim looks to be part of the CUDA 8 release. I suppose it depends on how you read the site. NVIDIA has not mentioned to us that CUDA 9 is going to speed up Pascal by 2x (on everything), and while anything is possible, I would not expect that to happen.

https://developer.nvidia.com/cuda-toolkit/whatsnew

The site is a little confusing, but I think the section you are quoting is nested under CUDA 8. I only mention this so you do not have unrealistic expectations for their release. For Volta there should be some great gains from what I understand, and I think (I do not know for sure) people are just getting engineering samples of Volta to start high-level work to get ready for the full release.

sclarkson commented Aug 5, 2017

@tfboyd cuDNN 7 is no longer in preview mode as of yesterday. It has been officially released for both CUDA 8.0 and CUDA 9.0 RC.

tfboyd commented Aug 5, 2017
Member

Ahh I missed that. Thanks @sclarkson and sorry for the wrong info.

theflofly commented Aug 5, 2017
Contributor

I will certainly try it, because CUDA 9 finally supports gcc 6 and Ubuntu 17.04 comes with it.

tfboyd commented Aug 5, 2017
Member

(This comment has been minimized.)

ppwwyyxx commented Aug 5, 2017
Contributor

Speaking of methods to be added, group convolution from cuDNN 7 would be an important feature for the vision community.

tfboyd commented Aug 5, 2017
Member

(This comment has been minimized.)

tfboyd commented Aug 5, 2017
Member

(This comment has been minimized.)

4F2E4A2E commented Aug 6, 2017
Contributor

I am trying to get cuDNN 7 with CUDA 8/9 running. CUDA 8 is not supported by the GTX 1080 Ti - at least the installer says so ^^

I am having a lot of trouble getting it all running together. I want to point out this great article that sums up what I have already tried: https://nitishmutha.github.io/tensorflow/2017/01/22/TensorFlow-with-gpu-for-windows.html

The CUDA examples are working via Visual Studio in both setup combinations.
Here is the output of deviceQuery.exe, which was compiled using Visual Studio:

PS C:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\bin\win64\Release> deviceQuery.exe
C:\ProgramData\NVIDIA Corporation\CUDA Samples\v9.0\bin\win64\Release\deviceQuery.exe Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 1080 Ti"
  CUDA Driver Version / Runtime Version          9.0 / 9.0
  CUDA Capability Major/Minor version number:    6.1
  Total amount of global memory:                 11264 MBytes (11811160064 bytes)
  (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
  GPU Max Clock rate:                            1683 MHz (1.68 GHz)
  Memory Clock rate:                             5505 Mhz
  Memory Bus Width:                              352-bit
  L2 Cache Size:                                 2883584 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
  Device supports Unified Addressing (UVA):      Yes
  Supports Cooperative Kernel Launch:            No
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 9.0, NumDevs = 1, Device0 = GeForce GTX 1080 Ti
Result = PASS

@tfboyd do you have any link confirming the cuDNN update from Nvidia?
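
A quick sanity check of which CUDA toolkit and driver are actually installed before digging further (a sketch; the two commands ship with the CUDA toolkit and the NVIDIA driver respectively and run fine from the same PowerShell session used above):

nvcc --version
nvidia-smi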


tpankaj commented Aug 7, 2017

@4F2E4A2E 1080 Ti definitely supports CUDA 8.0. That's what I've been using with TensorFlow for the past several months.

colmantse commented Aug 7, 2017

Hi all, so I have a GTX 1080 Ti with CUDA 8.0. I am trying to install tensorflow-gpu; do I go for cuDNN 5.1, 6.0 or 7.0?

tfboyd commented Aug 7, 2017
Member

(This comment has been minimized.)

colmantse commented Aug 7, 2017

Thanks, I tried with cuDNN 6.0 but it doesn't work, I guess because of my dummy tf-gpu installation. cuDNN 5.1 works for me with Python 3.6.

4F2E4A2E commented Aug 8, 2017
Contributor

@tpankaj Thank you! I've got it running with CUDA 8 and cuDNN 5.1.

cancan101 commented Aug 8, 2017
Contributor

Here is the full set of features in cuDNN 7:

Key Features and Enhancements

This cuDNN release includes the following key features and enhancements.

Tensor Cores
Version 7.0.1 of cuDNN is the first to support the Tensor Core operations in its implementation. Tensor Cores provide highly optimized matrix multiplication building blocks that do not have an equivalent numerical behavior in the traditional instructions; therefore, their numerical behavior is slightly different.

cudnnSetConvolutionMathType, cudnnSetRNNMatrixMathType, and cudnnMathType_t
The cudnnSetConvolutionMathType and cudnnSetRNNMatrixMathType functions enable you to choose whether or not to use Tensor Core operations in the convolution and RNN layers respectively, by setting the math mode to either CUDNN_TENSOR_OP_MATH or CUDNN_DEFAULT_MATH. Tensor Core operations perform parallel floating point accumulation of multiple floating point products. Setting the math mode to CUDNN_TENSOR_OP_MATH indicates that the library will use Tensor Core operations. The default is CUDNN_DEFAULT_MATH, which indicates that the Tensor Core operations will be avoided by the library. The default mode is a serialized operation whereas the Tensor Core is a parallelized operation; therefore, the two might result in slightly different numerical results due to the different sequencing of operations. The library falls back to the default math mode when Tensor Core operations are not supported or not permitted.

cudnnSetConvolutionGroupCount
A new interface that allows applications to perform convolution groups in the convolution layers in a single API call.

cudnnCTCLoss
cudnnCTCLoss provides a GPU implementation of the Connectionist Temporal Classification (CTC) loss function for RNNs. The CTC loss function is used for phoneme recognition in speech and handwriting recognition.

CUDNN_BATCHNORM_SPATIAL_PERSISTENT
The CUDNN_BATCHNORM_SPATIAL_PERSISTENT function is a new batch normalization mode for cudnnBatchNormalizationForwardTraining and cudnnBatchNormalizationBackward. This mode is similar to CUDNN_BATCHNORM_SPATIAL; however, it can be faster for some tasks.

cudnnQueryRuntimeError
The cudnnQueryRuntimeError function reports error codes written by GPU kernels when executing cudnnBatchNormalizationForwardTraining and cudnnBatchNormalizationBackward with the CUDNN_BATCHNORM_SPATIAL_PERSISTENT mode.

cudnnGetConvolutionForwardAlgorithm_v7
This new API returns all algorithms sorted by expected performance (using internal heuristics). These algorithms are output similarly to cudnnFindConvolutionForwardAlgorithm.

cudnnGetConvolutionBackwardDataAlgorithm_v7
This new API returns all algorithms sorted by expected performance (using internal heuristics). These algorithms are output similarly to cudnnFindConvolutionBackwardDataAlgorithm.

cudnnGetConvolutionBackwardFilterAlgorithm_v7
This new API returns all algorithms sorted by expected performance (using internal heuristics). These algorithms are output similarly to cudnnFindConvolutionBackwardFilterAlgorithm.

CUDNN_REDUCE_TENSOR_MUL_NO_ZEROS
The MUL_NO_ZEROS function is a multiplication reduction that ignores zeros in the data.

CUDNN_OP_TENSOR_NOT
The OP_TENSOR_NOT function is a unary operation that takes the negative of (alpha*A).

cudnnGetDropoutDescriptor
The cudnnGetDropoutDescriptor function allows applications to get dropout values.

tfboyd commented Aug 11, 2017
Member

Alright, I am thinking about starting a new issue that is more of a "blog" of CUDA 9 RC + cuDNN 7.0. I have a TF build "in my hand" that is patched together but is CUDA 9 RC and cuDNN 7.0, and I want to see if anyone is interested in trying it. I also need to make sure there is not some weird reason why I cannot share it. There are changes that need to be made to some upstream libraries that TensorFlow uses, but you will start to see PRs coming in from NVIDIA in the near future. I and the team were able to test CUDA 8 + cuDNN 6 on Volta and then CUDA 9 RC + cuDNN 7 on Volta (V100) with FP32 code. I only do Linux builds and Python 2.7, but if all/any of you are interested I would like to try and involve the community more than we did with cuDNN 6.0. It might not be super fun, but I want to offer, as well as try to make this feel more like we are in this together vs. me just feeding information. I also still want to build out lists of what features we are working on but not promising for cuDNN 7 (and 6.0). @cancan101 thank you for the full list.
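
For reference, once the CUDA 9 RC and cuDNN 7 patches are in place, the from-source route on Linux looks roughly like the sketch below (the install paths are assumptions; ./configure also prompts for the same values interactively):

export TF_NEED_CUDA=1
export TF_CUDA_VERSION=9.0
export TF_CUDNN_VERSION=7
export CUDA_TOOLKIT_PATH=/usr/local/cuda-9.0
export CUDNN_INSTALL_PATH=/usr/local/cuda-9.0
./configure
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
pip install /tmp/tensorflow_pkg/tensorflow-*.whl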

Froskekongen commented Aug 11, 2017

@tfboyd: I would be grateful for instructions on using CUDA 9.0 RC + cuDNN 7.0. I am using a weird system myself (Ubuntu 17.10 beta with TF 1.3, CUDA 8.0, cuDNN 6.0 and gcc-4.8), and upgrading to CUDA 9 and cuDNN 7 would actually be nice compiler-wise.

tfboyd commented Aug 11, 2017
Member

(This comment has been minimized.)

theflofly commented Aug 12, 2017
Contributor

@tfboyd: I am interested; how will you share it? A branch?

tanmayb123 commented Aug 12, 2017

@tfboyd I'd definitely be very interested as well. Thanks!

tfboyd commented Aug 14, 2017
Member

(This comment has been minimized.)

tfboyd commented Aug 22, 2017
Member

Instructions and a binary to play with, if you like Python 2.7. I am going to close this issue, as I will update the issue I created to track the effort. @tanmayb123 @Froskekongen

#12474

@tfboyd tfboyd closed this Aug 22, 2017

apacha commented Sep 6, 2017

I just tried installing pre-compiled tensorflow-gpu 1.3.0 for Python 3.6 on Windows x64 with the cuDNN 7.0 library and CUDA 8.0, and at least for me everything seems to work. I'm not seeing any exceptions or issues.
Is this to be expected? Is cuDNN 7.0 backwards-compatible with cuDNN 6.0? Could this lead to any issues?

tfboyd commented Sep 6, 2017
Member

@apacha I am a little surprised it worked. I have seen the error before in my testing where the TensorFlow binary cannot find cuDNN, because it looks for it by name and the *.so files include 6.0/7.0 in their names. It is remotely possible you still have cuDNN 6 in your path. I do not like making guesses about your setup, but if I were making a bet I would say it is still using cuDNN 6.

As to whether cuDNN 7 is backwards compatible, aside from TensorFlow being compiled to look for a specific version, I do not know.

Finally, it is not a big deal. The cuDNN 7 PRs are almost approved/merged, and the pre-compiled binary will likely move to cuDNN 7 as of 1.5.

UPDATE on progress to CUDA 9 RC and cuDNN 7:

  • PRs from NVIDIA are nearly approved
  • Eigen change has been approved and merged
  • FP16 testing has started in earnest on V100 (Volta)
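
On Linux, a quick way to check which cuDNN sonames the dynamic loader can actually see (a sketch; the paths assume the default /usr/local/cuda layout):

ls -l /usr/local/cuda/lib64/libcudnn*
ldconfig -p | grep libcudnn
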
apacha commented Sep 6, 2017

@tfboyd just for the sake of completeness: I was using cuDNN 5 previously, and since I had to update for TensorFlow 1.3, I just jumped to cuDNN version 7 to give it a shot. I've explicitly deleted cudnn64_5.dll and there is no cudnn64_6.dll in my CUDA installation path. Maybe it's Windows magic. :-P

Note one thing, though: I am still using CUDA 8.0, not 9.0.

tfboyd commented Sep 6, 2017
Member

@apacha It might be Windows magic. I did not want to sound judgmental, as I had no idea. I think Windows magic is possible, as the cuDNN calls should not have changed and thus backwards compatibility seems likely. For the Linux builds, TensorFlow is looking for specific files (or that is what it looks like when I get errors) and is very unhappy if it does not find cudnnblahblah.6.so. Thanks for the update and specifics.

RemiMorin commented Sep 13, 2017

Is there a branch / tag we can check out and try? I started a brand-new installation on Ubuntu 17... the new gcc forces CUDA 9, and the cuDNN that fits with it is 7... you see where I'm heading.
I can for sure hack my setup in many places (or start it from scratch again with Ubuntu 16), but I'm so close, and the fix is said to be close... why make a big jump into the past if I can make a small jump into the future!

tfboyd commented Sep 13, 2017
Member

(This comment has been minimized.)

jshin49 commented Sep 27, 2017

@tfboyd Is this still an issue? I noticed that CUDA 9.0 was released just today.

Tweakmind commented Dec 9, 2017

@goodmangu, I will work on the MacOS build over the weekend.

@nasergh, Did you install cuDNN?

Here is my guide for cuDNN including source and docs to test the install:

Download cuDNN 7.0.4 files

You must log into your Nvidia developer account in your browser

Check Each Hash

cd $HOME/Downloads
md5sum cudnn-9.0-linux-x64-v7.tgz && \
md5sum libcudnn7_7.0.4.31-1+cuda9.0_amd64.deb && \
md5sum libcudnn7-dev_7.0.4.31-1+cuda9.0_amd64.deb && \
md5sum libcudnn7-doc_7.0.4.31-1+cuda9.0_amd64.deb

Output should be:

fc8a03ac9380d582e949444c7a18fb8d cudnn-9.0-linux-x64-v7.tgz
e986f9a85fd199ab8934b8e4835496e2 libcudnn7_7.0.4.31-1+cuda9.0_amd64.deb
4bd528115e3dc578ce8fca0d32ab82b8 libcudnn7-dev_7.0.4.31-1+cuda9.0_amd64.deb
04ad839c937362a551eb2170afb88320 libcudnn7-doc_7.0.4.31-1+cuda9.0_amd64.deb

Install cuDNN 7.0.4 and libraries

tar -xzvf cudnn-9.0-linux-x64-v7.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
sudo dpkg -i libcudnn7_7.0.4.31-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.0.4.31-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.0.4.31-1+cuda9.0_amd64.deb

Verifying cuDNN

Ubuntu 17.10 ships with version 7+ of the GNU compilers.
CUDA is not compatible with gcc versions higher than 6.
The error returned is:

#error -- unsupported GNU version! gcc versions later than 6 are not supported!

Fix - Install Version 6 and create symbolic links in CUDA bin directory:

sudo apt-get install gcc-6 g++-6
sudo ln -sf /usr/bin/gcc-6 /usr/local/cuda/bin/gcc
sudo ln -sf /usr/bin/g++-6 /usr/local/cuda/bin/g++

Now build mnistCUDNN to test cuDNN

cp -r /usr/src/cudnn_samples_v7/ $HOME
cd $HOME/cudnn_samples_v7/mnistCUDNN
make clean && make
./mnistCUDNN

If cuDNN is properly installed, you will see:

Test passed!
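
Two optional sanity checks after the steps above: the first prints the cuDNN version reported by the installed header, and the second asks TensorFlow which devices it can see (a sketch; the second line assumes a GPU build of TensorFlow is already installed):

grep -A 2 "#define CUDNN_MAJOR" /usr/local/cuda/include/cudnn.h
python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"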


nasergh commented Dec 9, 2017

Dear @Tweakmind,
your way works, thanks for your help (I was trying to install TensorFlow for more than 3 weeks!!!).
The problem is I installed it on Python 3.6 and now I have a problem with the PIL package:

Traceback (most recent call last):
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/utils/data_utils.py", line 551, in get
    inputs = self.queue.get(block=True).get()
  File "/home/pc2/anaconda3/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/home/pc2/anaconda3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/utils/data_utils.py", line 391, in get_index
    return _SHARED_SEQUENCES[uid][i]
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/preprocessing/image.py", line 761, in __getitem__
    return self._get_batches_of_transformed_samples(index_array)
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/preprocessing/image.py", line 1106, in _get_batches_of_transformed_samples
    interpolation=self.interpolation)
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/preprocessing/image.py", line 345, in load_img
    raise ImportError('Could not import PIL.Image. '
ImportError: Could not import PIL.Image. The use of `array_to_img` requires PIL.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 7, in <module>
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/models.py", line 1227, in fit_generator
    initial_epoch=initial_epoch)
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/engine/training.py", line 2115, in fit_generator
    generator_output = next(output_generator)
  File "/home/pc2/venv/lib/python3.6/site-packages/keras/utils/data_utils.py", line 557, in get
    six.raise_from(StopIteration(e), e)
  File "<string>", line 3, in raise_from
StopIteration: Could not import PIL.Image. The use of `array_to_img` requires PIL.

I tried to install pillow but it doesn't help.
I also tried to install PIL, but:

UnsatisfiableError: The following specifications were found to be in conflict:
  - pil -> python 2.6*
  - python 3.6*


Tweakmind commented Dec 9, 2017

@nasergh What do you get with:

pip install pillow

Mine looks like:

~$ pip install pillow
Requirement already satisfied: pillow in ./anaconda3/lib/python3.6/site-packages
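
One hedged guess based on the traceback above: Keras is running from /home/pc2/venv, so pillow needs to be installed with that environment's pip rather than the system or conda one. Something like:

/home/pc2/venv/bin/pip install pillow
/home/pc2/venv/bin/python -c "import PIL.Image; print('PIL OK')"
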
Tweakmind commented Dec 9, 2017

@nasergh, I need to crash but I'll check in when I get up.

Tweakmind commented Dec 9, 2017

@goodmangu, I won't be able to do the Mac build over the weekend as I don't have access to my 2012 Mac Pro. Hopefully, you're good with Ubuntu for now. I know it works well for me. I should have it back next weekend.

meetshah1995 commented Dec 10, 2017

@Tweakmind - Thanks! Have you seen any performance boost with CUDA 9 and cuDNN 7?

Also, I think some of the steps mentioned by @Tweakmind are redundant; you need either:

tar -xzvf cudnn-9.0-linux-x64-v7.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

or

sudo dpkg -i libcudnn7_7.0.4.31-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-dev_7.0.4.31-1+cuda9.0_amd64.deb
sudo dpkg -i libcudnn7-doc_7.0.4.31-1+cuda9.0_amd64.deb

whatever1983 commented Dec 13, 2017

@gunan
CUDA 9.1.85 was just released moments ago along with cuDNN 7.0.5, with nvcc compiler bug fixes. I wonder if it lets Win10 users compile TensorFlow 1.4.1? It is about time.

gunan commented Dec 14, 2017
Member

From our correspondence with NVIDIA, I don't think 9.1 fixed this issue.
However, we have workarounds. First, we need this PR to be merged into Eigen:
https://bitbucket.org/eigen/eigen/pull-requests/351/win-nvcc/diff

Then we will update our Eigen dependency, which should fix all our builds for CUDA 9.

4F2E4A2E commented Dec 15, 2017
Contributor

The PR was declined, but it seems to have been merged manually. Do we have to wait for an Eigen release, or is it built from the sources?

gunan added a commit to gunan/tensorflow that referenced this issue Dec 15, 2017

@gunan gunan closed this in #15405 Dec 15, 2017

gunan added a commit that referenced this issue Dec 15, 2017

hadaev8 commented Dec 15, 2017

Cool, then it will be on Nightly pip?
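
Once the change lands in a nightly build, trying it should look roughly like this (a sketch; whether a given nightly is built against CUDA 9 depends on when the toolchain switch is actually merged):

pip install --upgrade tf-nightly-gpu
python -c "import tensorflow as tf; print(tf.__version__)"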

nasergh commented Dec 16, 2017

@Tweakmind
I tried to rebuild TensorFlow using Python 2.7, but in the bazel build I get this error.
I also installed numpy, but no change.

bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
ERROR: /home/gh2/Downloads/tensorflow/util/python/BUILD:5:1: no such package '@local_config_python//': Traceback (most recent call last):
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 310
		_create_local_python_repository(repository_ctx)
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 274, in _create_local_python_repository
		_get_numpy_include(repository_ctx, python_bin)
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 257, in _get_numpy_include
		_execute(repository_ctx, [python_bin, "-c",..."], <2 more arguments>)
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 76, in _execute
		_python_configure_fail("\n".join([error_msg.strip() if ... ""]))
	File "/home/gh2/Downloads/tensorflow/third_party/py/python_configure.bzl", line 37, in _python_configure_fail
		fail(("%sPython Configuration Error:%...)))
Python Configuration Error: Problem getting numpy include path.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
**ImportError: No module named numpy**
Is numpy installed?
 and referenced by '//util/python:python_headers'
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted: Loading failed
INFO: Elapsed time: 10.826s
FAILED: Build did NOT complete successfully (26 packages loaded)
    currently loading: tensorflow/core ... (3 packages)
    Fetching http://mirror.bazel.build/.../~ooura/fft.tgz; 20,338b 5s
    Fetching http://mirror.bazel.build/zlib.net/zlib-1.2.8.tar.gz; 19,924b 5s
    Fetching http://mirror.bazel.build/.../giflib-5.1.4.tar.gz; 18,883b 5s
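
The "ImportError: No module named numpy" above means the Python interpreter that ./configure selected cannot import numpy. A sketch of one way to fix it, assuming the system python2.7 is the interpreter chosen during ./configure:

python2.7 -m pip install numpy wheel
python2.7 -c "import numpy; print(numpy.__version__)"
./configure
bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package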

masasys commented Dec 16, 2017

It seems that OSX is excluded from version 7.0.5 of cuDNN. Does anyone know the details?

eeilon79 commented Dec 16, 2017

I still can't get tensorflow-gpu to work in Windows 10 (with CUDA 9.0.176 and cudnn 7.0).
I've uninstalled both tensorflow and tensorflow-gpu and reinstalled them (with the --no-cache-dir to ensure downloading of most recent version with the eigen workaround). When I install both, my GPU is not recognized:

InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'random_uniform_1/sub': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0 ]. Make sure the device specification refers to a valid device.

When I install just tensorflow-gpu it complains about a missing dll:

ImportError: Could not find 'cudart64_80.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Download and install CUDA 8.0 from this URL: https://developer.nvidia.com/cuda-toolkit

Which is weird because my CUDA version is 9.0, not 8.0, and is recognized (deviceQuery test passed).
My python version is 3.6.3. I'm trying to run this code in Spyder (3.2.4) in order to test tensorflow-gpu.
What did I miss?
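
The cudart64_80.dll message suggests the installed tensorflow-gpu wheel was built against CUDA 8.0 rather than 9.0. A quick way to check what is on PATH and which devices TensorFlow enumerates, from a Windows command prompt (a sketch; the DLL names assume CUDA 9.0 and cuDNN 7):

where cudart64_90.dll
where cudnn64_7.dll
python -c "from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"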


hadaev8 commented Dec 16, 2017

I'm trying to build from source with bazel on Win 7 and get this error:

No toolchain for cpu 'x64_windows'

Can anyone build a whl?

Tweakmind commented Dec 16, 2017

@hadaev8, I need a lot more information to help. I can work on a whl, but it will have heavy dependencies and not Win7; once I solve MacOS, I will solve Win10. In any case, post your details.

@eeilon79, I need to recreate this under Win10. I'm currently focused on MacOS now that Ubuntu is solved. I will come back to Win10.

Tweakmind commented Dec 16, 2017

@nasergh, is there a requirement for Python 2.7?

Tweakmind commented Dec 16, 2017

With CUDA 8.0 and cuDNN 6.0, this is how I installed TensorFlow from source for CUDA GPU and AVX2 support in Win10:

Requirements:

* Windows 10 64-Bit
* Visual Studio 15 C++ Tools
* NVIDIA CUDA® Toolkit 8.0
* NVIDIA cuDNN 6.0 for CUDA 8.0
* Cmake
* Swig

Install Visual Studio Community Edition Update 3 w/Windows Kit 10.0.10240.0
Follow instructions at: https://github.com/philferriere/dlwin (Thank you Phil)

Create a Virtual Drive N: for clarity
I suggest creating a directory off C: or your drive of choice and creating N: based on these instructions (2GB min):
https://technet.microsoft.com/en-us/library/gg318052(v=ws.10).aspx

Install Cuda 8.0 64-bit
https://developer.nvidia.com/cuda-downloads (Scroll down to Legacy)

Install cuDNN 6.0 for Cuda 8.0
https://developer.nvidia.com/rdp/cudnn-download
Put the cuda folder from the zip on N:\ and rename it cuDNN-6

Install CMake
https://cmake.org/files/v3.10/cmake-3.10.0-rc5-win64-x64.msi

Install Swig (swigwin-3.0.12)
https://sourceforge.net/projects/swig/files/swigwin/swigwin-3.0.12/swigwin-3.0.12.zip

cntk-py36

activate cntk-py36
pip install https://cntk.ai/PythonWheel/GPU/cntk-2.2-cp36-cp36m-win_amd64.whl
python -c "import cntk; print(cntk.__version__)"
conda install pygpu
pip install keras

Remove old tensorflow in Tools if it exists

move tensorflow tensorflow.not
git clone --recursive https://github.com/tensorflow/tensorflow.git
cd C:\Users\%USERNAME%\Tools\tensorflow\tensorflow\contrib\cmake
Edit CMakeLists.txt

Comment out these:

# if (tensorflow_OPTIMIZE_FOR_NATIVE_ARCH)
#   include(CheckCXXCompilerFlag)
#   CHECK_CXX_COMPILER_FLAG("-march=native" COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
#   if (COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
#     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
#   endif()
# endif()

Add these:

if (tensorflow_OPTIMIZE_FOR_NATIVE_ARCH)
  include(CheckCXXCompilerFlag)
  CHECK_CXX_COMPILER_FLAG("-march=native" COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
  if (COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
  else()
    CHECK_CXX_COMPILER_FLAG("/arch:AVX2" COMPILER_OPT_ARCH_AVX_SUPPORTED)
    if(COMPILER_OPT_ARCH_AVX_SUPPORTED)
      set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:AVX2")
    endif()
  endif()
endif()

mkdir build & cd build

"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64\vcvars64.bat"

cmake .. -A x64 -DCMAKE_BUILD_TYPE=Release ^
-DSWIG_EXECUTABLE=N:/swigwin-3.0.12/swig.exe ^
-DPYTHON_EXECUTABLE=N:/Anaconda3/python.exe ^
-DPYTHON_LIBRARIES=N:/Anaconda3/libs/python36.lib ^
-Dtensorflow_ENABLE_GPU=ON ^
-DCUDNN_HOME="n:\cuDNN-6" ^
-Dtensorflow_WIN_CPU_SIMD_OPTIONS=/arch:AVX2

-- Building for: Visual Studio 14 2015
-- Selecting Windows SDK version 10.0.14393.0 to target Windows 10.0.16299.
-- The C compiler identification is MSVC 19.0.24225.1
-- The CXX compiler identification is MSVC 19.0.24225.1
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test COMPILER_OPT_ARCH_NATIVE_SUPPORTED
-- Performing Test COMPILER_OPT_ARCH_NATIVE_SUPPORTED - Failed
-- Performing Test COMPILER_OPT_ARCH_AVX_SUPPORTED
-- Performing Test COMPILER_OPT_ARCH_AVX_SUPPORTED - Success
-- Performing Test COMPILER_OPT_WIN_CPU_SIMD_SUPPORTED
-- Performing Test COMPILER_OPT_WIN_CPU_SIMD_SUPPORTED - Success
-- Found CUDA: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0 (found suitable version "8.0", minimum required is "8.0")
-- Found PythonInterp: C:/Users/%USERNAME%/Anaconda3/python.exe (found version "3.6.3")
-- Found PythonLibs: C:/Users/%USERNAME%/Anaconda3/libs/python36.lib (found version "3.6.3")
-- Found SWIG: C:/Users/%USERNAME%/Tools/swigwin-3.0.12/swig.exe (found version "3.0.12")
-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/%USERNAME%/Tools/tensorflow/tensorflow/contrib/cmake/build

MSBuild /p:Configuration=Release tf_python_build_pip_package.vcxproj

Tweakmind commented Dec 16, 2017

With CUDA 8.0 and cuDNN 6.0, this is how I installed TensorFlow from source with CUDA GPU and AVX2 support on Win10:

Requirements:

* Windows 10 64-Bit
* Visual Studio 2015 C++ Tools
* NVIDIA CUDA® Toolkit 8.0
* NVIDIA cuDNN 6.0 for CUDA 8.0
* Cmake
* Swig

Install Visual Studio Community Edition Update 3 w/Windows Kit 10.0.10240.0
Follow instructions at: https://github.com/philferriere/dlwin (Thank you Phil)

Create a Virtual Drive N: for clarity
I suggest creating a directory off C: or your drive of choice and creating N: based on these instructions (2GB min):
https://technet.microsoft.com/en-us/library/gg318052(v=ws.10).aspx

Install Cuda 8.0 64-bit
https://developer.nvidia.com/cuda-downloads (Scroll down to Legacy)

Install cuDNN 6.0 for Cuda 8.0
https://developer.nvidia.com/rdp/cudnn-download
Put the cuda folder from the zip on N:\ and rename it to cuDNN-6
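
Optionally, before going further, you can sanity-check that the CUDA 8.0 runtime and the unpacked cuDNN DLL are where the later steps expect them. This is a minimal sketch, assuming the default CUDA 8.0 install path and the N:\cuDNN-6 layout above; cudart64_80.dll and cudnn64_6.dll are the standard DLL names for CUDA 8.0 and cuDNN 6 on Windows:

import ctypes, os

# Assumed locations; adjust if CUDA or cuDNN live elsewhere on your machine.
cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin"
cudnn_bin = r"N:\cuDNN-6\bin"

# Loading the DLLs directly confirms they exist and are 64-bit builds.
ctypes.WinDLL(os.path.join(cuda_bin, "cudart64_80.dll"))
ctypes.WinDLL(os.path.join(cudnn_bin, "cudnn64_6.dll"))
print("CUDA 8.0 runtime and cuDNN 6 DLLs load fine")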

Install CMake
https://cmake.org/files/v3.10/cmake-3.10.0-rc5-win64-x64.msi

Install Swig (swigwin-3.0.12)
https://sourceforge.net/projects/swig/files/swigwin/swigwin-3.0.12/swigwin-3.0.12.zip

Set up a Python 3.6 conda environment named cntk-py36 (for example, conda create -n cntk-py36 python=3.6), then:

activate cntk-py36
pip install https://cntk.ai/PythonWheel/GPU/cntk-2.2-cp36-cp36m-win_amd64.whl
python -c "import cntk; print(cntk.__version__)"
conda install pygpu
pip install keras
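
The cmake invocation below points at a 64-bit Python 3.6 (python36.lib), so it is worth confirming the active interpreter matches before building. A minimal check using only the standard library:

import struct, sys

print(sys.version)                 # expect a 3.6.x release
print(struct.calcsize("P") * 8)    # expect 64 for a 64-bit interpreter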

Move the old tensorflow checkout in Tools out of the way if it exists, then clone fresh:

move tensorflow tensorflow.not
git clone --recursive https://github.com/tensorflow/tensorflow.git
cd C:\Users\%USERNAME%\Tools\tensorflow\tensorflow\contrib\cmake
Edit CMakeLists.txt

Comment out these:

# if (tensorflow_OPTIMIZE_FOR_NATIVE_ARCH)
#   include(CheckCXXCompilerFlag)
#   CHECK_CXX_COMPILER_FLAG("-march=native" COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
#   if (COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
#     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
#   endif()
# endif()

Add these:

if (tensorflow_OPTIMIZE_FOR_NATIVE_ARCH)
  include(CheckCXXCompilerFlag)
  CHECK_CXX_COMPILER_FLAG("-march=native" COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
  if (COMPILER_OPT_ARCH_NATIVE_SUPPORTED)
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=native")
  else()
    CHECK_CXX_COMPILER_FLAG("/arch:AVX2" COMPILER_OPT_ARCH_AVX_SUPPORTED)
    if(COMPILER_OPT_ARCH_AVX_SUPPORTED)
      set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /arch:AVX2")
    endif()
  endif()
endif()

mkdir build & cd build

"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin\amd64\vcvars64.bat"

cmake .. -A x64 -DCMAKE_BUILD_TYPE=Release ^
-DSWIG_EXECUTABLE=N:/swigwin-3.0.12/swig.exe ^
-DPYTHON_EXECUTABLE=N:/Anaconda3/python.exe ^
-DPYTHON_LIBRARIES=N:/Anaconda3/libs/python36.lib ^
-Dtensorflow_ENABLE_GPU=ON ^
-DCUDNN_HOME="n:\cuDNN-6" ^
-Dtensorflow_WIN_CPU_SIMD_OPTIONS=/arch:AVX2

-- Building for: Visual Studio 14 2015
-- Selecting Windows SDK version 10.0.14393.0 to target Windows 10.0.16299.
-- The C compiler identification is MSVC 19.0.24225.1
-- The CXX compiler identification is MSVC 19.0.24225.1
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe
-- Check for working C compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe
-- Check for working CXX compiler: C:/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/x86_amd64/cl.exe -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test COMPILER_OPT_ARCH_NATIVE_SUPPORTED
-- Performing Test COMPILER_OPT_ARCH_NATIVE_SUPPORTED - Failed
-- Performing Test COMPILER_OPT_ARCH_AVX_SUPPORTED
-- Performing Test COMPILER_OPT_ARCH_AVX_SUPPORTED - Success
-- Performing Test COMPILER_OPT_WIN_CPU_SIMD_SUPPORTED
-- Performing Test COMPILER_OPT_WIN_CPU_SIMD_SUPPORTED - Success
-- Found CUDA: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v8.0 (found suitable version "8.0", minimum required is "8.0")
-- Found PythonInterp: C:/Users/%USERNAME%/Anaconda3/python.exe (found version "3.6.3")
-- Found PythonLibs: C:/Users/%USERNAME%/Anaconda3/libs/python36.lib (found version "3.6.3")
-- Found SWIG: C:/Users/%USERNAME%/Tools/swigwin-3.0.12/swig.exe (found version "3.0.12")
-- Configuring done
-- Generating done
-- Build files have been written to: C:/Users/%USERNAME%/Tools/tensorflow/tensorflow/contrib/cmake/build

MSBuild /p:Configuration=Release tf_python_build_pip_package.vcxproj
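
After the MSBuild step completes, the resulting pip package can be installed and smoke-tested. A minimal sketch, assuming the wheel produced by the build above has been pip-installed into the active environment (the exact wheel path and filename depend on your build):

import tensorflow as tf
from tensorflow.python.client import device_lib

print(tf.__version__)
print("Built with CUDA:", tf.test.is_built_with_cuda())

# Should list the CPU plus a GPU entry for your card.
for d in device_lib.list_local_devices():
    print(d.name, d.device_type)

# Run a trivial op to confirm the session actually initializes the GPU path.
with tf.Session() as sess:
    print(sess.run(tf.constant("GPU build OK")))
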
@hadaev8


hadaev8 commented Dec 16, 2017

@Tweakmind
Python 3.6, TensorFlow latest from master, CUDA 9.0, cuDNN 7.0.5 for CUDA 9.0, Bazel and SWIG downloaded today.

@argman


argman commented Dec 17, 2017

@Tweakmind, do you build with master, or something else?

@hadaev8


hadaev8 commented Dec 17, 2017

@Tweakmind
Could you build on Windows with CUDA 9 and cuDNN 7 and share the .whl?

@alc5978


alc5978 commented Dec 21, 2017

@Tweakmind

Aren't you going to try building on Win 10 with CUDA 9 and cuDNN 7?

Thanks for your expertise!

@whatever1983


whatever1983 commented Dec 24, 2017

@hadaev8 @alc5978
pip install -U tf-nightly-gpu now gives a Win10 build dated 20171221, based on the TF 1.5 beta with CUDA 9.0 and cuDNN 7.0.5. I ran it last night and it works. Now we should move on to CUDA 9.1 for the (up to 12x) faster CUDA kernel launches. TensorFlow's Windows support is slow and anemic; stable official builds should be offered ASAP. I would like TensorFlow 1.5 stable to be released with CUDA 9.1, by the end of January please.
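
For anyone wanting to confirm what a given tf-nightly-gpu wheel is actually using, a quick hedged check (TF 1.x-era API) is:

import tensorflow as tf

print(tf.__version__)                # e.g. a 1.5.0-dev2017xxxx string for a nightly
print(tf.test.is_built_with_cuda())  # True for the GPU build
print(tf.test.gpu_device_name())     # e.g. /device:GPU:0 once CUDA 9 + cuDNN 7 load correctly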

@arunmandal53


arunmandal53 commented Jan 1, 2018

Go to http://www.python36.com/install-tensorflow141-gpu/ for step-by-step installation of TensorFlow with CUDA 9.1 and cuDNN 7.0.5 on Ubuntu, and to http://www.python36.com/install-tensorflow-gpu-windows for step-by-step installation of TensorFlow with CUDA 9.1 and cuDNN 7.0.5 on Windows.

@tonmoyborah


tonmoyborah commented Jan 24, 2018

It's 2018, almost the end of January, and installation of TF with CUDA 9.1 and cuDNN 7 on Windows 10 is still not supported?

@tfboyd


Member

tfboyd commented Jan 24, 2018

1.5 is at RC with CUDA 9 + cuDNN 7 and should go GA in the next few days. (CUDA 9.1 went GA in December and requires another device driver upgrade, which is disruptive to many users. The current plan is to keep the default build on CUDA 9.0.x and keep upgrading to newer cuDNN versions.)

I opened an issue to discuss CUDA 9.1.

The 12x kernel launch speedup is more nuanced than the headline number: the top end of 12x applies to ops with a lot of arguments, and the disruption to users is high due to the device driver upgrade. I hope to have a "channel" testing 9.1 in the near future and to figure out how to deal with this paradigm.
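
As a rough way to see whether kernel-launch overhead even matters for a particular workload (and hence whether a faster launch path in CUDA 9.1 would show up at all), one can time a graph of many tiny ops against a graph of a few large ones; only launch-bound graphs stand to benefit. A minimal sketch, not a rigorous benchmark:

import time
import tensorflow as tf

def time_chain(num_ops, size):
    # Build a chain of num_ops tiny element-wise ops over a tensor of `size` floats.
    tf.reset_default_graph()
    x = tf.random_normal([size])
    for _ in range(num_ops):
        x = x + 1.0
    with tf.Session() as sess:
        sess.run(x)  # warm-up: graph setup, first kernel launches
        start = time.time()
        for _ in range(10):
            sess.run(x)
        return (time.time() - start) / 10

# Many tiny ops: runtime dominated by per-op launch overhead.
print("1000 tiny ops:", time_chain(1000, 10))
# A few large ops: runtime dominated by actual GPU compute.
print("10 large ops :", time_chain(10, 10 ** 7))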

@ViktorM


ViktorM commented Jan 24, 2018

I hope it will finally be CUDA 9.1, not 9.0.

@Magicfeng007


Magicfeng007 commented Feb 4, 2018

I hope it will finally be CUDA 9.1, not 9.0, too.

@alc5978


alc5978 commented Feb 4, 2018

I'm sure it will finally be CUDA 9.1, not 9.0, won't it? :)

@tfboyd


Member

tfboyd commented Feb 5, 2018

@ViktorM @Magicfeng007 @alc5978
The 9.1 thread is here if you want to follow along, although it is basically closed. If you could list why you want 9.1, along with your setup/configuration, that would be useful. A benchmark you ran showing the perf boost would also be useful in understanding the immediate need. In meetings with NVIDIA, we both agreed there was not an immediate need to make 9.1 the default, which would then force people to upgrade their drivers again.

@meghashyam0046


meghashyam0046 commented Feb 10, 2018

If anybody is still facing problems such as Keras with the TensorFlow backend not using the GPU, just follow the instructions on this page. It is up to date and works correctly:
https://research.wmz.ninja/articles/2017/01/configuring-gpu-accelerated-keras-in-windows-10.html
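
For a quick confirmation that Keras is running on the TensorFlow backend and that the backend sees the GPU, a small sketch (device_lib is an internal but commonly used TensorFlow helper):

import keras
from tensorflow.python.client import device_lib

print(keras.backend.backend())  # should print "tensorflow"

# GPU devices visible to the TensorFlow backend; an empty list means CPU-only.
print([d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"])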

@alc5978


alc5978 commented Feb 24, 2018

Hi all,
Today I installed tensorflow-gpu 1.6.0rc1 on Win10 with CUDA 9.0 and the cuDNN 7.0.5 library, following http://www.python36.com/install-tensorflow-using-official-pip-pacakage/

Everything seems OK.

@ashokpant


ashokpant commented Mar 6, 2018

I created a script covering the NVIDIA GPU prerequisites (CUDA 9.0 and cuDNN 7.0) for the latest TensorFlow (v1.5+); here is the link.
