
OpenCL support #22

Open
outlace opened this Issue Nov 9, 2015 · 517 comments


outlace commented Nov 9, 2015

I understand TensorFlow only supports CUDA. What would need to be done to add in OpenCL support?

nmabhinandan commented Nov 9, 2015

It's strange that Google ditched open OpenCL for proprietary CUDA.
im-just-saying

Contributor

ebrevdo commented Nov 9, 2015

At the very least, the Eigen library would have to support OpenCL.

@vrv vrv referenced this issue Nov 9, 2015

Closed

Could port to OpenCL? #28

Contributor

bhack commented Nov 9, 2015

👍

@keveman keveman added the cuda label Nov 9, 2015

@jamesliu96

👍

@alexatknit

👍

dhess commented Nov 11, 2015

thumbs up and all that.

gujunli commented Nov 11, 2015

I would be interested in expanding TensorFlow with OpenCL, as we have already released OpenCL Caffe: https://github.com/amd/OpenCL-caffe. Hopefully it can be integrated in a lightweight way? Is anyone interested in working together on this?

Contributor

bhack commented Nov 11, 2015

@gujunli Nice to see AMD here. /cc @naibaf7 @lunochod

nmabhinandan commented Nov 11, 2015

Would be great.


sasadep commented Nov 11, 2015

👍

Contributor

bhack commented Nov 15, 2015

/cc @lukeiwanski for Eigen/OpenCL/SYCL

ankdesh commented Nov 16, 2015

@gujunli Certainly would be interested in contributing. Please let me know when you plan to start.

Contributor

lukeiwanski commented Nov 25, 2015

Hi all,

Here at Codeplay we are looking into running Eigen's tensors on GPUs using SYCL (a modern C++ layer on top of OpenCL). From what we have gathered so far, the GPU tensor design is very closely coupled with CUDA, and it will require interface changes to support another programming model, in particular a SYCL and OpenCL 1.2 version.

If anyone is interested in digging deeper / helping out, we are most certainly interested in contributing.

Thanks,
Luke

Contributor

bhack commented Nov 25, 2015

@lukeiwanski Thank you for the feedback. I think that @benoitsteiner worked on the tensor extension part of Eigen.

jszuppe commented Dec 6, 2015

👍 I can help write some OpenCL/SYCL code if someone makes a plan, divides the work into tasks, etc. I recommend using Boost.Compute as a wrapper for OpenCL (it makes running kernels, testing, and templating easier).


ieee8023 commented Dec 7, 2015

+1


armish commented Dec 7, 2015

👍

Contributor

lukeiwanski commented Dec 8, 2015

Hi all,

Just to keep you posted: we are still investigating how we can change the Eigen interface to better fit the SYCL/OpenCL 1.2 programming model.
Once we come up with a reasonable approach that targets heterogeneous programming models (not only OpenCL/SYCL), we will create a proposal.

Thanks,
Luke

gujunli commented Dec 8, 2015

Please keep me updated. I developed OpenCL Caffe for AMD. I am also looking at TensorFlow.

Thanks,
Junli

Contributor

bhack commented Dec 9, 2015

/cc @ptillet @gongzg Is there any interest in this from Intel? I really hope that we don't fragment OpenCL here like in Caffe, where we have an AMD fork, unmerged Intel PRs, another semi-unofficial AMD PR, and a long-staging user PR (plus two old abandoned OpenCL efforts). Anyone interested in the history can take a look at the BVLC/caffe#2610 comments.

gongzg commented Dec 17, 2015

@bhack We do have interest in this. Thanks for letting me know. If there is a proposal for Eigen's OpenCL/SYCL implementation, we will see what we can do from the Intel side.

@benoitsteiner benoitsteiner self-assigned this Dec 23, 2015

@ZirconCode

👍

Contributor

bhack commented Jan 1, 2016

An interesting initiative at https://github.com/ptillet/isaac, even if here we rely on the Eigen tensor extension.

DanMcLaughlin commented Jan 19, 2016

I also would like to contribute. @benoitsteiner, can you organize it?

Contributor

bhack commented Jan 19, 2016

This was included in the roadmap but also tagged as a contribution, so some direction/bootstrapping could be really useful.

gujunli commented Jan 19, 2016

I can contribute to organizing it. Who is responsible for OpenCL support in TensorFlow now?

Thanks a lot.
Junli

Junli Gu--谷俊丽
Coordinated Science Lab
University of Illinois at Urbana-Champaign

DanMcLaughlin commented Jan 19, 2016

I just assumed Benoit because he self-assigned the feature, but I think you've got it, Junli! Maybe start with an email or forum thread of interested parties?

Member

martinwicke commented Jan 19, 2016

@benoitsteiner knows more about interested parties that may not have shown up in this thread (or this issue). I'd wait for him to coordinate to make sure we avoid duplicating work.

Contributor

MikalaiDrabovich commented Jan 19, 2016

I'm interested. Is there any roadmap?

Contributor

hsaputra commented Jan 19, 2016

Is there a list of the CUDA-dependent libraries that TensorFlow relies on?

This would help us see whether we could have immediate OpenCL alternatives.

naibaf7 commented Jan 19, 2016

@hsaputra
There are clFFT and clBLAS (or alternatively ViennaCL). The random number generator is a bit more tricky (there is no cuRAND equivalent); either use a CPU generator and transfer the numbers to the GPU, or use another existing kernel for RNG.

The biggest pitfall will again be efficient convolution implementations (something like cuDNN).

There is experience with such issues here:
BVLC/caffe#2610
BVLC/caffe#2195
https://github.com/amd/OpenCL-caffe
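The kernel-RNG option mentioned above can be sketched in a few lines. This is an illustrative sketch in Python, not TensorFlow or OpenCL code; it uses Thomas Wang's 32-bit integer hash (brought up again later in this thread) as a stateless, counter-based generator, which matches the access pattern an OpenCL kernel would use (each work item derives its value independently from its index):

```python
# Illustrative sketch: a stateless counter-based RNG of the kind usable
# inside an OpenCL kernel, prototyped in plain Python. Thomas Wang's
# 32-bit integer hash maps (work-item index, per-launch seed) to a
# pseudo-random value, so no RNG state has to live on the GPU.

MASK32 = 0xFFFFFFFF

def wang_hash(x: int) -> int:
    """Thomas Wang's 32-bit integer hash (each step is invertible)."""
    x = ((x ^ 61) ^ (x >> 16)) & MASK32
    x = (x * 9) & MASK32
    x = (x ^ (x >> 4)) & MASK32
    x = (x * 0x27D4EB2D) & MASK32
    x = (x ^ (x >> 15)) & MASK32
    return x

def uniform01(index: int, seed: int) -> float:
    """Map a work-item index plus a per-launch seed to a float in [0, 1)."""
    return wang_hash(index ^ ((seed * 0x9E3779B9) & MASK32)) / 2**32

# Each "work item" computes its own value independently.
samples = [uniform01(i, seed=42) for i in range(8)]
```

Because the hash is bijective on 32-bit integers, distinct indices never collide for a fixed seed; the quality is far below a counterpart like cuRAND's generators, but it is often good enough for dropout-style uses.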

Contributor

bhack commented Jan 19, 2016

TensorFlow uses the tensor extension upstreamed to Eigen, so I think OpenCL/SYCL support in Eigen is needed. See this thread.

Contributor

hsaputra commented Jan 20, 2016

Thanks @naibaf7. Yeah, I don't think there is a viable alternative to cuDNN for OpenCL right now.
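For context on why cuDNN is the sticking point: the operation itself is easy to state, and a reference implementation is tiny. Below is a hypothetical Python sketch (not framework code) of the direct 2-D "valid" convolution that cuDNN-class libraries optimize; the hard part an OpenCL port faces is matching the tiled, vectorized kernels, not writing the operation:

```python
# Reference direct 2-D "valid" convolution (cross-correlation, as deep
# learning frameworks define it). Writing this is trivial; the challenge
# is making it fast on GPU hardware.
def conv2d_valid(img, ker):
    H, W = len(img), len(img[0])
    kh, kw = len(ker), len(ker[0])
    return [[sum(img[i + di][j + dj] * ker[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(W - kw + 1)]
            for i in range(H - kh + 1)]

out = conv2d_valid([[1, 2, 3],
                    [4, 5, 6],
                    [7, 8, 9]],
                   [[1, 0],
                    [0, 1]])  # 2x2 diagonal kernel -> 2x2 output
```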

VincentSC commented Jan 21, 2016

The website http://opencl.org was created to support open source porting projects just like these! We're currently installing all the necessary tools on the website and have space for repositories at https://github.com/OpenCL/ - later on we'll add build servers to test on several types of hardware, and we can provide our expertise on how to write code that runs at full speed on numerous kinds of hardware.

We're launching a porting initiative for GEGL next week, but we're happy to support you too.

choongng commented Jan 9, 2018

@AlphasCodes @znmeb I know the TF team prefers to keep this thread TF-only; we're happy to host the PlaidML-specific conversation over on the PlaidML project. That said, we do hope to eventually support TensorFlow itself, as well as non-OpenCL platforms (e.g. Apple's Metal for iOS, which currently exists in prototype form).

https://github.com/plaidml/plaidml

ghost commented Jan 10, 2018

@choongng Thanks for the information; I edited my message accordingly.

@znmeb The AMD A12-9800E iGPU should be GCN v3.

The main and only reason for me to do the benchmarks/tests was to find an answer to my question: stay with AMD or switch to Nvidia for my deep learning adventure?

And the answer is: I really like the open-source approach of AMD, but I will likely switch to Nvidia due to two factors. First, the deep learning software stack (e.g. TensorFlow) is much more mature for Nvidia. Second, the selection of graphics cards for my very specific needs (must fit into a Dan A4-SFX case and must be very, very silent / almost noiseless under full load for hours) is very limited or even non-existent on the AMD side.

sohnryang commented Jan 15, 2018

Are Intel GPUs supported? I think my Iris Pro could speed up the age-long training a bit.

mirh commented Jan 15, 2018

Discuss the lack of Intel GPU (or AMD CPU) support here: codeplaysoftware/computecpp-sdk#78

codeplaysoftware/computecpp-sdk#82

@ghost ghost referenced this issue Jan 31, 2018

Closed

Feature: AMD GPU Support #86

maxmilne commented Feb 5, 2018

Just trying to get a sense of the state of this issue. Am I right to say that this repo:

https://github.com/lukeiwanski/tensorflow

...built with ComputeCpp, is the current best option for building TensorFlow with general AMD GPU support? And if so, is there any benchmark evidence that this build provides a speedup over the CPU?

briansp2020 commented Feb 5, 2018

Depends on what you mean by "general AMD GPU support". If you mean really old dGPUs or APUs, I don't know. But if you have something newer (2nd-gen GCN or later), hipTensorFlow (v1.0.1) running on ROCm was working pretty well.

maxmilne commented Feb 6, 2018

@briansp2020 Ah yes, I have seen AMD's work on ROCm. Unfortunately they only support Linux, and it doesn't even seem like support for any other OS is on their roadmap. I'm hoping for something that supports Windows.

briansp2020 commented Feb 6, 2018

@mjmax Is there any GPU-accelerated TensorFlow package available for Windows? I thought that if you wanted GPU-accelerated deep learning, Linux was the only choice. If TensorFlow were ported to OpenCL, would that make it easier to port to Windows? I'm not sure why TensorFlow is not available on Windows with GPU acceleration when CUDA is supported there.

I guess this is now off topic, but if anyone knows of a TensorFlow and/or PyTorch build for Windows that is GPU accelerated, I'd like to know about it as well...

maxmilne commented Feb 6, 2018

@briansp2020 As far as I know, TensorFlow already supports Nvidia GPU acceleration on Windows.

mirh commented Feb 6, 2018

CL TensorFlow is already a mess on Linux; don't expect anything any time soon.
If you want to accelerate stuff there, there's only PlaidML.
(And please, we are already at 500 comments... let's try to only post if really, really necessary.)

naibaf7 commented Feb 6, 2018

@mirh OpenCL Caffe does work on Windows. Sure, it's not TensorFlow in terms of features, but it's pretty solid for software that has to be deployed everywhere.

LifeIsStrange commented Feb 10, 2018

What about replacing the OpenCL port with the HIP port backed by AMD?

https://github.com/ROCmSoftwarePlatform/hiptensorflow

keryell commented Feb 10, 2018

Haha! @LifeIsStrange Life is very strange indeed... Are you working for the HIP marketing team at AMD? :-)
Please look at the subject of this issue: "OpenCL support".

This means it is about the Khronos standard https://en.wikipedia.org/wiki/OpenCL (and the other SYCL standard from the OpenCL Khronos working group, which appears at the end of the "Overview" section).

Of course there is a world outside of this issue, but it is... outside! :-)

Please try not to inconsiderately increase the entropy of the universe by adding random posts to this already too-lengthy discussion... :-)
This comment applies to some other posters here too, not only you, by the way.
This is a GitHub issue for solving a technical problem: getting TensorFlow running on devices that support the OpenCL standard; it is not a Facebook page about how people like or dislike tool A or B. :-)
But please feel free to send some git commits related to this issue that we can look at...

XVilka commented Feb 12, 2018

There is a fork of TensorFlow supporting OpenCL: https://github.com/hughperkins/tf-coriander

And of course @benoitsteiner's work: https://github.com/benoitsteiner/tensorflow-opencl

IMHO, it is ridiculous that mainstream TF still hasn't merged their work.

VincentSC commented Feb 12, 2018

Is the focus here on getting it to run as long as it is OpenCL, or on making it actually run faster? I'd prefer that this not become a holy war, and that we focus on getting it to run fast on several GPUs. LifeIsStrange's focus is on getting it to work on AMD GPUs, and then HIP makes good sense. For others the focus is to make it work on Intel GPUs or Android, and then OpenCL makes much more sense. GPU languages are a mess, so please keep it practical.

If I read some of the comments here, performance is an issue with the OpenCL ports. But unfortunately I cannot find many benchmarks around. Are there more benchmarks than this one? https://github.com/AlphasCodes/DeepLearning/blob/master/Tensorflow_Benchmarks.md

cathalgarvey commented Feb 12, 2018

VincentSC commented Feb 12, 2018

Comparing only two numbers is no information - who cares if OpenCL on Nvidia runs at half speed if it runs at 4x speed on other GPUs?

I think we'd need these benchmarks:

  1. CUDA on NV GPUs (reference benchmarks)
  2. https://github.com/hughperkins/tf-coriander on AMD, Nvidia and Intel GPUs
  3. https://github.com/benoitsteiner/tensorflow-opencl on AMD, Nvidia and Intel GPUs
  4. https://github.com/lukeiwanski/tensorflow on AMD, Nvidia and Intel GPUs

The reference benchmarks are easy to find. We have some high-end GPUs here, so we only need a place to put the numbers in (with links to build docs).
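A minimal harness for collecting such numbers could look like the sketch below. This is a hypothetical outline in Python: the builds listed above would each be timed through their own APIs, and here a pure-Python matrix multiply stands in for the real workload.

```python
# Hypothetical micro-benchmark harness. The workload is passed in as a
# callable, so the same harness could wrap a CUDA, SYCL, or Coriander
# build; a naive pure-Python matmul stands in for it here.
import time

def matmul(a, b):
    """Naive matrix multiply, used as a stand-in workload."""
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

def benchmark(fn, *args, warmup=2, repeats=5):
    """Return best-of-N wall-clock seconds for fn(*args)."""
    for _ in range(warmup):          # exclude one-off setup costs
        fn(*args)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

n = 32
a = [[float(i + j) for j in range(n)] for i in range(n)]
secs = benchmark(matmul, a, a)
gflops = 2 * n**3 / secs / 1e9       # 2*n^3 flops per square matmul
```

Reporting best-of-N rather than the mean makes the numbers less sensitive to background load, which matters when comparing runs collected on different machines.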

samscott1989 commented Feb 22, 2018

OpenCL support must become real.

CUDA is too limited, and Nvidia doesn't want to share it.
CUDA only works on Nvidia GPUs.
That is a dead end for TensorFlow
if another "TensorFlow" comes out with broader support than TensorFlow,
and if TensorFlow still only supports CUDA on Windows.
You have to realize TensorFlow is not the only choice.

@briansp2020

This comment has been minimized.

Show comment
Hide comment
@briansp2020

briansp2020 Feb 22, 2018

Why is OpenCL better than HIP? I think OpenCL has failed to gain traction, and supporting OpenCL at this point in time is probably counterproductive and a waste of resources for the whole community/industry. I'd rather see TensorFlow support HIP directly and let the compiler/tools/libraries take care of portability.

Isn't it better for software to support one language/programming model?


@mirh


mirh commented Feb 22, 2018

Software has to support what it has to support to cover every use case.
HIP is all bells and whistles (at least on paper) if you have supported hardware. But there aren't just "newer AMD and NVIDIA cards" in this world.

Now please, for the love of god, complain here about any problem with that.
And here for everybody else interested in the continuation of this issue.

@Hoeze


Hoeze commented Feb 23, 2018

I thought SPIR-V would directly replace CUDA as a cross-hardware alternative:
http://alphanew.net/index.php?section=alphanew&site=overview&lang=eng&newsID=111

Why does Google still rely on CUDA?

@tugrul512bit


tugrul512bit Mar 8, 2018

Can these help?

OpenCL random number generation (Thomas Wang's hash):

uint wang_hash(uint seed)
{
    seed = (seed ^ 61) ^ (seed >> 16);
    seed *= 9;
    seed = seed ^ (seed >> 4);
    seed *= 0x27d4eb2d;
    seed = seed ^ (seed >> 15);
    return seed;
}

// seed each work-item's slot from its global id
void wang_rnd_0(__global unsigned int * intSeeds, int id)
{
    intSeeds[id] = wang_hash(id);
}

// advance the seed and map it to a float in [0, 1]
float wang_rnd(__global unsigned int * intSeeds, int id)
{
    uint maxint = 0;
    maxint--;                               // maxint = UINT_MAX
    uint rndint = wang_hash(intSeeds[id]);
    intSeeds[id] = rndint;
    return ((float)rndint) / (float)maxint;
}

// initialize each thread's own random number seed
__kernel void rnd_0(__global unsigned int * intSeeds)
{
    int id = get_global_id(0);
    wang_rnd_0(intSeeds, id);
}

// get a new random value on each thread
__kernel void rnd_1(__global unsigned int * intSeeds)
{
    int id = get_global_id(0);
    float randomFloat = wang_rnd(intSeeds, id);
}

OpenCL SHA3 hashing (forgot who wrote this):

https://gist.github.com/tugrul512bit/c8170f74846e36e350607664f12c525c


@tensorflowbutler


Member

tensorflowbutler commented Mar 25, 2018

Please remove the assignee, as this issue is inviting external contributions. Otherwise, remove the contributions welcome label. Thank you.

@tensorflowbutler


Member

tensorflowbutler commented Apr 8, 2018

Please remove the assignee, as this issue is inviting external contributions. Otherwise, remove the contributions welcome label. Thank you.

@wis

wis commented Apr 10, 2018

It is in Google's interest to support OpenCL:
by making one specific vendor's hardware a dependency of your software, you force yourself to pay more for hardware, whereas market competition lowers costs.
Google has been about commodity hardware since the very beginning, which was and still is crucial to its success (market dominance); lower data-center operating costs enabled the revolutionary, generous, essentially free offerings like Gmail (storage space) and Google Photos (storage space and auto-tagging).

@znmeb


znmeb commented Apr 10, 2018

@wesamco No, it isn't necessarily in Google's interest. They make their own hardware: the TPU (Tensor Processing Unit). They can bypass OpenCL and CUDA/cuDNN and have the board run raw TensorFlow code.

@VincentSC


VincentSC commented Apr 11, 2018

raw TensorFlow code.

There is no such thing - it's not like unprocessed food. TPUs need their own DNN library that handles the different types of calls.

It seems it's time to compress the above discussion into one list again:

  • CodePlay is working on a SYCL backend
  • Hugh Perkins is working on tf-coriander
  • AMD is working on a HIP backend
  • PlaidML only supports CPUs at the moment.
  • The status of support for Intel GPUs is unclear.

So choose a project you like and start supporting it. Maybe each of the groups can give a status update on their project?

Do understand that OpenCL has been transformed from a full language into a language definition / hardware specification whose kernels are represented in SPIR-V, which can then run on top of a platform such as OpenCL drivers and, later, also Vulkan drivers. So by supporting SYCL, you also support OpenCL.

@mirh


mirh commented Apr 11, 2018

Perfect sum-up, but PlaidML does run on GPUs too.
It's just that at the moment it is a backend for Keras, not TensorFlow, so it's somewhat off-topic here.

@lukeiwanski


Contributor

lukeiwanski commented Apr 12, 2018

Hi all,
@VincentSC thanks for the great sum-up of the different efforts!

So choose a project you like and start supporting them. Maybe each of the groups can give a status-update on their project?

The SYCL approach now supports a variety of platforms/devices. The ones I can mention are:

  • AMD GPUs (FirePro W8100, R9 Nano and R9 380 series) - instructions available here or here
  • ARM Mali (HiKey 960) - instructions available here
  • Intel GPUs (Skylake series) with the Intel NEO OpenCL driver

As for AMD, at the moment the GPUs mentioned above are using the AMDGPU-Pro 17.40-xxx drivers with legacy OpenCL enabled.
I don't see any obvious reason why other series would not work (assuming SPIR / SPIR-V is supported, that is).

The main platform we are focusing on is Linux - however, we have ongoing efforts to enable Windows in the future. We have no plans to support OSX in the near future. I know, sad face.

Our focus is on improving performance for CNNs. Current performance is unoptimized and nowhere near where we see it ending up. That said, we are already beating CPU performance for most models on different targets.

In order to speed up the development cycle and reduce the overall compilation time of TensorFlow (as well as improve portability), we are working on Eigen, BLAS and DNN libraries.
These libraries aim to solve the performance issue as well as build up an ecosystem of portable libraries that can be easily integrated with complex projects like TensorFlow.

Below are the performance graphs we can share at present. They are taken from my fork https://github.com/lukeiwanski/tensorflow/tree/dev/amd_gpu at 271093b.
The benchmark used is https://github.com/tensorflow/benchmarks

[cpuvssycl graph]
The graph is normalised to Intel i7-4790K results.

We are slowly upstreaming changes to Eigen; once that happens, we will follow with TensorFlow.

Hope that helps,
Luke

@llhe


Contributor

llhe commented Jul 17, 2018

For deep learning inference on mobile devices with GPU/OpenCL support, you can check out MACE, which is optimized for Adreno, Mali and PowerVR GPUs. Here are some benchmark results.

@yogeshrbtk


yogeshrbtk commented Jul 30, 2018

@keryell @benoitsteiner, which versions of TensorFlow and triSYCL are required for integration? I am having trouble building TensorFlow (1.9) with the latest triSYCL release.

@keryell


keryell commented Jul 30, 2018

Unfortunately the latest TensorFlow is using more advanced features than the current triSYCL can cope with, so you have to use ComputeCpp, currently the only fully compliant SYCL implementation...

@HackInvent


HackInvent Aug 3, 2018

TensorFlow is backed by Google Brain, and Google has a partnership with NVIDIA, so I guess we shouldn't expect TensorFlow to support OpenCL.
A big OpenCL community effort is needed.


@mancoast


mancoast commented Aug 8, 2018

OpenCL support please!
