
OpenCL support #22

Open
outlace opened this issue Nov 9, 2015 · 541 comments

Comments


@outlace outlace commented Nov 9, 2015

I understand TensorFlow only supports CUDA. What would need to be done to add in OpenCL support?


@nmabhinandan nmabhinandan commented Nov 9, 2015

It's strange that Google ditched open OpenCL for proprietary CUDA.

Contributor

@ebrevdo ebrevdo commented Nov 9, 2015

At the very least, the Eigen library would have to support OpenCL.

Contributor

@bhack bhack commented Nov 9, 2015

👍

@keveman keveman added the cuda label Nov 9, 2015

@jamesliu96 jamesliu96 commented Nov 10, 2015

👍


@alexatknit alexatknit commented Nov 10, 2015

👍


@dhess dhess commented Nov 11, 2015

thumbs up and all that.


@gujunli gujunli commented Nov 11, 2015

I would be interested in expanding TensorFlow with OpenCL, as we have already released OpenCL Caffe: https://github.com/amd/OpenCL-caffe. Hopefully it can be integrated in a lightweight way. Is anyone interested in working together on this?

Contributor

@bhack bhack commented Nov 11, 2015

@gujunli Nice to see AMD here. /cc @naibaf7 @lunochod


@nmabhinandan nmabhinandan commented Nov 11, 2015

would be great.


@sasadep sasadep commented Nov 11, 2015

👍

Contributor

@bhack bhack commented Nov 15, 2015

/cc @lukeiwanski for Eigen/OpenCL/SYCL


@ankdesh ankdesh commented Nov 16, 2015

@gujunli Certainly would be interested in contributing. Please let me know when you plan to start.

Contributor

@lukeiwanski lukeiwanski commented Nov 25, 2015

Hi all,

Here at Codeplay we are looking into running Eigen's tensor module on GPUs using SYCL (a modern C++ layer on top of OpenCL). From what we have gathered so far, the GPU tensor design is very closely coupled with CUDA, and supporting another programming model, in particular SYCL and OpenCL 1.2, will require interface changes.

If anyone is interested in digging deeper / helping out, we are most certainly interested in contributing.

Thanks,
Luke
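
(For context, since SYCL may be unfamiliar: below is a minimal vector-add sketch of what "a modern C++ layer on top of OpenCL" looks like in practice. It assumes a SYCL 1.2.x implementation such as ComputeCpp is installed; the kernel name vec_add is arbitrary, and this is purely illustrative - it is not Codeplay's Eigen integration.)

```cpp
// Minimal SYCL 1.2.x vector-add sketch (illustrative only).
#include <CL/sycl.hpp>
#include <iostream>
#include <vector>

int main() {
  const size_t n = 1024;
  std::vector<float> a(n, 1.0f), b(n, 2.0f), c(n, 0.0f);

  cl::sycl::queue q{cl::sycl::default_selector{}};
  {
    cl::sycl::buffer<float, 1> buf_a(a.data(), cl::sycl::range<1>(n));
    cl::sycl::buffer<float, 1> buf_b(b.data(), cl::sycl::range<1>(n));
    cl::sycl::buffer<float, 1> buf_c(c.data(), cl::sycl::range<1>(n));

    q.submit([&](cl::sycl::handler& cgh) {
      auto A = buf_a.get_access<cl::sycl::access::mode::read>(cgh);
      auto B = buf_b.get_access<cl::sycl::access::mode::read>(cgh);
      auto C = buf_c.get_access<cl::sycl::access::mode::write>(cgh);
      // Ordinary C++ lambda; the SYCL runtime maps it onto the OpenCL device.
      cgh.parallel_for<class vec_add>(
          cl::sycl::range<1>(n),
          [=](cl::sycl::id<1> i) { C[i] = A[i] + B[i]; });
    });
  }  // buffers go out of scope: results are copied back to the host vectors

  std::cout << c[0] << std::endl;  // prints 3
  return 0;
}
```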

Contributor

@bhack bhack commented Nov 25, 2015

@lukeiwanski Thank you for the feedback. I think that @benoitsteiner worked on the tensor extension part of Eigen.


@jszuppe jszuppe commented Dec 6, 2015

👍 I can help write some OpenCL/SYCL code if someone makes a plan, divides the work into tasks, etc. I recommend using Boost.Compute as a wrapper for OpenCL (it makes running kernels, testing, and templating easier).
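
(A minimal sketch of why Boost.Compute is convenient, assuming Boost.Compute and an OpenCL runtime are installed; it just copies a vector to the default device, takes the square root of each element there, and copies it back. Purely illustrative, not part of any TensorFlow plan.)

```cpp
#include <boost/compute/core.hpp>
#include <boost/compute/algorithm/copy.hpp>
#include <boost/compute/algorithm/transform.hpp>
#include <boost/compute/container/vector.hpp>
#include <boost/compute/functional/math.hpp>
#include <vector>

namespace compute = boost::compute;

int main() {
  // Pick the default OpenCL device and set up a context and queue for it.
  compute::device gpu = compute::system::default_device();
  compute::context ctx(gpu);
  compute::command_queue queue(ctx, gpu);

  std::vector<float> host(1024, 2.0f);
  compute::vector<float> device_vec(host.size(), ctx);

  // Copy to the device, take the square root of each element on the device,
  // then copy the results back to the host - no hand-written kernel strings.
  compute::copy(host.begin(), host.end(), device_vec.begin(), queue);
  compute::transform(device_vec.begin(), device_vec.end(), device_vec.begin(),
                     compute::sqrt<float>(), queue);
  compute::copy(device_vec.begin(), device_vec.end(), host.begin(), queue);
  return 0;
}
```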


@ieee8023 ieee8023 commented Dec 7, 2015

+1


@armish armish commented Dec 7, 2015

👍

Contributor

@lukeiwanski lukeiwanski commented Dec 8, 2015

Hi all,

Just to keep you posted, we are still investigating how we can change the Eigen interface to better fit the SYCL/OpenCL 1.2 programming model.
Once we come up with a reasonable approach that targets heterogeneous programming models (not only OpenCL/SYCL), we will create a proposal.

Thanks,
Luke


@gujunli gujunli commented Dec 8, 2015

Please keep me updated. I developed OpenCL Caffe for AMD. I am also looking at TensorFlow.

Thanks,
Junli

Contributor

@bhack bhack commented Dec 9, 2015

/cc @ptillet @gongzg Is there any interest in this from Intel? I really hope that we don't fragment OpenCL here like in Caffe, where we have an AMD fork, unmerged Intel PRs, another semi-unofficial AMD PR, and a long-staging user PR (plus two old abandoned OpenCL efforts). If somebody is interested in the history, take a look at the BVLC/caffe#2610 comments.


@gongzg gongzg commented Dec 17, 2015

@bhack We do have interest in this. Thanks for letting me know. If there is a proposal for Eigen's OpenCL/SYCL implementation, we will see what we can do from the Intel side.

@benoitsteiner benoitsteiner self-assigned this Dec 23, 2015

@ZirconCode ZirconCode commented Dec 23, 2015

👍

Contributor

@bhack bhack commented Jan 1, 2016

An interesting initiative at https://github.com/ptillet/isaac, even if here we rely on the Eigen tensor extension.


@DanMcLaughlin DanMcLaughlin commented Jan 19, 2016

I would also like to contribute. @benoitsteiner, can you organize it?


@sohnryang sohnryang commented Aug 27, 2018

@Makhaon Me too. I can't afford to buy a machine with an NVIDIA graphics card.


@busukxuan busukxuan commented Aug 27, 2018

Besides the above two posts, I'd like to add that AMD's Vega GPUs (including the ones inside Raven Ridge APUs) can now do FP16 at twice the FLOPS of FP32, so if TF could support them (through OpenCL) it would really help people on tighter budgets. A lot of these people are students, and if they start their DNN journey with TF, they will probably stick with TF down the road and tell others about it; it's a great way to help expand this project.


@FelixSchwarz FelixSchwarz commented Aug 27, 2018

I think this thread is mostly meaningless for developers (too much noise - and I'll add some more ;-), but I think many comments are missing the point:
If you want to run TensorFlow with AMD cards, OpenCL IS NOT what you are looking for - please head over to https://github.com/ROCmSoftwarePlatform/ and install the ROCm stack. AFAIK AMD's current strategy is based on ROCm instead of OpenCL for TensorFlow/PyTorch.

Generic OpenCL was too much maintenance and did not give enough performance benefits to be worthwhile for AMD. Therefore this ticket is only interesting if you are running (e.g.) an ARM platform that supports OpenCL only.

(Disclaimer: I am just an outsider with no real insight into TensorFlow development, so maybe the information above is completely wrong and misleading. Feel free to bash me if you know better.)


@ghost ghost commented Aug 27, 2018

Just a thought: what about LLVM with the new GPU offload support? That would put a good level of abstraction between TensorFlow and CUDA-specific code.
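
(A minimal sketch of what that could look like with Clang/LLVM's OpenMP target offload - assuming a Clang build with offloading enabled, compiled with something like `clang++ -fopenmp -fopenmp-targets=<target-triple>`. Purely illustrative; it says nothing about how TensorFlow's kernels would actually be ported.)

```cpp
#include <cstdio>
#include <vector>

int main() {
  const int n = 1 << 20;
  std::vector<float> x(n, 1.0f), y(n, 2.0f);
  float* px = x.data();
  float* py = y.data();

  // Offload the loop to an accelerator if one is available; otherwise the
  // OpenMP runtime falls back to running it on the host.
  #pragma omp target teams distribute parallel for map(to: px[0:n]) map(tofrom: py[0:n])
  for (int i = 0; i < n; ++i) {
    py[i] += 2.0f * px[i];
  }

  std::printf("%f\n", py[0]);  // prints 4.0
  return 0;
}
```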


@mirh mirh commented Aug 27, 2018

What about all of you reading just 10 posts above and noticing there is already a fork by lukeiwanski/codeplaysoftware that you can try?
(Also, my hat's off to Xiaomi for, for once, contributing a serious kind of open-source effort.)


@fantesykikachu fantesykikachu commented Sep 4, 2018

@FelixSchwarz Just so you are aware, ROCm uses OpenCL: it is AMD's userspace OpenCL driver on Linux (which is why it doesn't support Windows). In case you are not familiar with how AMD's driver ecosystem on Linux works: there are the kernel-side drivers AMDGPU and AMDKFD (the latter is now getting merged into AMDGPU), and then there are the userspace drivers RadeonSI (for OpenGL), RADV/AMDVLK (for Vulkan), and ROCm (for OpenCL).


@XVilka XVilka commented Sep 15, 2018

Judging by the dynamics of this bug and the other forks, Google has zero interest in this and will never implement it in the official repository. I would vote for closing (or locking) this issue so as not to give anyone false hope.


@klokik klokik commented Sep 15, 2018


@tamusjroyce tamusjroyce commented Nov 28, 2018

There is a TensorRT that supports the Movidius Pi Hat. And that Movidius Pi Hat is Google's $45 "AIY Vision Kit". Google links to Target to buy it.

This doesn't have any ties to CUDA or Nvidia? It says it uses an Intel chip. At its heart, maybe the chip is an FPGA? Does anyone know anything more about it?


@znmeb znmeb commented Nov 28, 2018

I know quite a bit about the big Movidius unit - it's inference-only, and it runs pre-compiled TensorFlow or Caffe models. IIRC they all run in 16-bit mode.

The Movidius chip itself is much more powerful but you have to be a qualified partner to get the SDK.


@filips123 filips123 commented Jan 9, 2019

Is there any update? This issue is over 3 years old.


@mirh mirh commented Jan 9, 2019

YES THERE IS JUST LOOK AT THE LAST HANDFUL OF POSTS.


@XVilka XVilka commented Jan 10, 2019

@filips123 No, there are no updates, and there never will be in any foreseeable future - the probability of that is lower than that of an alien invasion or of finding a way to travel back in time.


@lppier lppier commented Jan 10, 2019

This Intel initiative, PlaidML, works reasonably well and is worth checking out:
https://github.com/plaidml/plaidml
It runs on OpenCL, or on Metal on Mac. It works with the AMD GPUs in MacBook Pros, which is what I was looking for.
Meanwhile, could you guys help vote for PyTorch support in PlaidML? plaidml/plaidml#63


@mirh mirh commented Jan 10, 2019

PlaidML is certainly all nice and dandy (I, for one, somehow got more performance out of an NVIDIA GPU on OpenCL than with TF's own CUDA)...
But it's a backend for Keras, in complete replacement of TensorFlow - which, you know, is the repo we are discussing this in.
(As far as I understand, the latest TF versions can export models directly to Keras, so there's that.)

Anyway, for the fourth damn time: if you want a recent solution on OpenCL, something still being actively developed (and also the thing with an actual chance of being merged here for real one day), there's just the Codeplay stack.
Again:
https://developer.codeplay.com/computecppce/latest/tensorflow-overview
https://github.com/Rbiessy/tensorflow/tree/dev/amd_gpu


@lppier lppier commented Jan 11, 2019

My apologies, I had not realised there was no TensorFlow support. My assuming brain thought that Keras GPU support == TensorFlow support.


@iperov iperov commented Feb 18, 2019

PlaidML is super cool. It works with Keras.
Of course I had to port some TF code to pure Keras in order for it to work on the PlaidML backend (for example tf.image.ssim),
but as a result my code works on both NVIDIA and AMD cards.

Also, PlaidML is heaven for researchers. It automatically generates the gradient for any function you write in the "Tile" language, and it runs on your GPU at about 80% of the speed of TensorFlow.

So I cannot understand why ML researchers are still using PyTorch. Let's boost ML science with Intel's PlaidML?


@Degerz Degerz commented Feb 26, 2019

@iperov Care to know why practically no one uses PlaidML?

  1. It runs pitifully slow on AMD's OpenCL implementations compared to TensorFlow's CUDA backend, so there goes at least half the reason to use it. The performance is so bad that using TensorFlow on CPUs is competitive with, or even outright beats, their hardware running PlaidML.

  2. Nobody is interested in maintaining their specialized Tile programming language, which only someone like a pure maths professor would concoct, so PlaidML's code quality just goes down the drain, and no serious programmer in their right mind would want to deal with overly clever code.

  3. This pretty much ties into #2, but ever since Intel bought out Vertex.AI, they don't care about PlaidML anymore. Intel's solution for GPU-accelerated machine learning is a new compiler specifically for deep learning, now known as nGraph, which targets TensorFlow, PyTorch, and other deep learning frameworks as a backend. There is no reason for them to keep developing PlaidML as their intermediary when they have nGraph.

People use PyTorch for other reasons, such as maintainability and other features. To sum it up, PlaidML is Intel's tool, and they probably don't intend for it to play any role in the final parts of their plans. nGraph's current Intel GPU backend is based on OpenCL 2.1, of which only Intel has a conformant implementation, so Intel only exists to look out for themselves rather than purely for the betterment of machine learning. As Intel develops nGraph further, I can't see them continuing to base their GPU backend on OpenCL 2.1 alone, since many deep learning frameworks have templated kernels which are not compatible with the separate-source programming models of OpenCL, Metal, or Vulkan, so it's probably only for experimentation purposes. Intel's final GPU backend is probably going to be based either on SYCL 2.2 or on something else entirely, like OpenMP, and maybe they'll even bring a vendor-specific solution.

As for AMD, who cares? OpenCL is irrelevant to them, and they're finally showing some results with their work on HIP.
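
(To make the "templated kernels vs. separate-source OpenCL" point above concrete, here is a minimal, purely illustrative C++ sketch - the names axpy and kAxpyFloatKernel are made up for this example and are not from any of the frameworks mentioned. Single-source models such as CUDA, HIP, or SYCL let one kernel template cover every element type, whereas OpenCL 1.x kernels live in separately compiled source strings, so each type needs its own spelled-out kernel.)

```cpp
#include <cstddef>
#include <vector>

// Single-source style (CUDA/HIP/SYCL): the kernel body is ordinary C++, so one
// template covers float, double, half, custom Eigen scalar types, and so on.
template <typename T>
void axpy(T a, const std::vector<T>& x, std::vector<T>& y) {
  for (std::size_t i = 0; i < x.size(); ++i) y[i] += a * x[i];
}

// Separate-source style (OpenCL 1.x): the kernel is a string compiled at run
// time by the OpenCL driver, so every element type needs its own spelled-out
// kernel (or macro tricks), which is why template-heavy frameworks port badly.
static const char* kAxpyFloatKernel = R"CLC(
__kernel void axpy_float(float a, __global const float* x, __global float* y) {
  size_t i = get_global_id(0);
  y[i] += a * x[i];
}
)CLC";

int main() {
  std::vector<float> xf(4, 1.0f), yf(4, 0.0f);
  std::vector<double> xd(4, 1.0), yd(4, 0.0);
  axpy(2.0f, xf, yf);  // one template, two instantiations
  axpy(2.0, xd, yd);
  (void)kAxpyFloatKernel;  // would be handed to clCreateProgramWithSource at run time
  return 0;
}
```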


@talregev talregev commented Feb 26, 2019

What about all the GPUs inside ARM machines, like mobile phones, the Raspberry Pi, ODROID boards, etc.?
Don't they support OpenCL?
Google should care about getting TensorFlow onto GPUs on Android.
The biggest neural-network training libraries run only on NVIDIA GPUs, which just makes NVIDIA GPUs more and more expensive (because people and companies buy them only for professional neural-network training), and then Google will lose more money that way.


@iperov iperov commented Feb 26, 2019

@Degerz which planet did you come from?
How can you compare tf-CPU and an AMD GPU?
An AMD GPU on PlaidML is ~30x faster than tf-CPU.

  1. It runs pitifully slow on AMD's OpenCL implementations compared to TensorFlow's CUDA backend, so there goes at least half the reason to use it.

In my deepfakes tests OpenCL is slower by only 20%, and in some mini networks OpenCL is 20% FASTER.

My project DeepFaceLab has many users who have been waiting for AMD support. Many people were delighted when deepfakes could finally be trained on AMD cards.
Also, PlaidML is the only Keras backend that supports AMD/Intel HD out of the box.
If a new AMD backend for Keras appears, of course my project will switch to it.
PyTorch has no future.

What is there to maintain in PlaidML? Ops are auto-differentiable; there is nothing to maintain.

  Tile programming language, which only someone like a pure maths professor would concoct

Machine learning was invented by professors of mathematics, wasn't it?


@Degerz Degerz commented Feb 26, 2019

@talregev What about ARM or Broadcom? The former probably has a subpar OpenCL implementation, and the latter doesn't even officially provide OpenCL drivers! It's not Google's responsibility to create and maintain a competent compute stack for hardware vendors.

@iperov You realize that training neural nets with embedding layers on PlaidML is painful, right? PlaidML also has a bunch of other limitations, such as not being all that well suited to DenseNets, and the fact that its computation graphs are static - and does PlaidML even work well with RNNs?

As for your project, don't worry about it. You'll move on to something better like TensorFlow, since AMD will soon offer a native GPU backend for it once MIOpen gets upstreamed - their GPU-accelerated library of primitives for deep neural networks, similar to their competitor's cuDNN library - both of which will leave PlaidML in the dust in terms of performance. Who cares about Intel iGPUs anyway? If Intel is truly committed to delivering high-performance deep learning on their future discrete graphics hardware, then they'll offer a single-source option just like the others (AMD/HIP and Nvidia/CUDA) did before them.

  PyTorch has no future.

Envy much? PyTorch is ~10x more popular than PlaidML, the newest techniques in DL are implemented easily on PyTorch, it has tons of different contributors, and it is actively developed by Facebook - all while Intel hasn't contributed to PlaidML in nearly a month.

  What is there to maintain in PlaidML? Ops are auto-differentiable; there is nothing to maintain.

So I take it from you that PlaidML shouldn't receive any new fixes or features going forward? If you don't see the value in improving code, then there's no point in convincing you to acknowledge PlaidML's glaring flaws.

  Machine learning was invented by professors of mathematics, wasn't it?

That doesn't mean we have to take up whatever programming language they make up, especially in the case of Tile, where elegance is clearly favoured over readability. It's no wonder so many potential contributors are scared away from contributing.


@unoexperto unoexperto commented Feb 26, 2019

Jesus, I wish you guys would STFU and get back to work instead. I'll have to unsubscribe from this ticket because it's unbearable to get emails full of flame wars. Too bad the maintainers do not mute the thread.

@gunan @caisq @sanjoy Could you please do something about it?

@tensorflow tensorflow locked as too heated and limited conversation to collaborators Feb 26, 2019
@rthadur rthadur removed the cuda label Mar 19, 2019