Support MPSCNN (MetalPerformanceShaders) on iOS #7958
Comments
Thanks for filing this issue, @cancan101! I think the commentary on #4846 and #3001 describes the current situation well. I'm marking this as "contributions welcome". |
@tatatodd I might be interested in looking at this. Any suggestions for how to tackle it? Is there a concept of a GPU device when running on iOS? BNNS is simpler to think about, as it uses the Accelerate framework, which runs on the CPU. |
@petewarden knows the space pretty well, and might have some suggestions. |
CC-ing @keveman who's interested in this area. |
I would like to contribute to this as well. Is there any movement, or at least a Slack, for interested people? |
I am putting together a skeletal framework for calling Metal from TF. I would hold off until that is upstreamed. Once that is done, there will be room for a lot of contributions in building the repertoire of TF operations that can be run using Metal. @s1ddok and @cancan101, does that sound like a plan? |
One should keep in mind that a lot of things are lacking in the MPS framework, so we will have to provide custom implementations for the absent layers. Also, you gain maximum performance when using |
Yes. I am counting on contributors like you to add the missing pieces :) |
You may take a look at this; it is a C++ wrapper around Metal. Not around MPS, but still. https://github.com/naleksiev/mtlpp |
You don't really do it. MPS decreases it every time that image is being encoded, but what we need to do is to set that P.S. We will of course need to decrease that number in custom layers, but those are details. |
@s1ddok thanks for the pointers. The C++ wrapper around Metal is great. We do really want to call the convolution kernels in MPS, though. |
@sschaetz good to see this. The goal is, of course, to shuffle the data only once and then encode everything in one pass. Apple has a good example of how to implement Inception_v3, so basically all the ideas for getting the best performance can be taken from there. There is also a rendering library called |
@keveman any updates on your framework for calling Metal? |
@cancan101 No significant updates, but coming soon. |
http://caffe2.ai/docs/mobile-integration.html#null__performance-considerations makes the following claim:
|
And here is the PR adding the MPSCNN functionality to Caffe2: facebookarchive/caffe2#215 |
@cancan101 small correction per @ajtulloch: Metal is faster on the iPhone 6s and above, and NNPACK is faster on the rest of the devices. |
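The rule of thumb in the comment above could be expressed as a small backend selector. This is only a sketch: the function name `choose_backend` and the returned strings are hypothetical, and it assumes Apple-style hardware identifier strings, where `iPhone8,1` corresponds to the iPhone 6s. It is not how Caffe2 or TensorFlow actually pick a backend.

```python
import re


def choose_backend(hardware_id: str) -> str:
    """Return "metal" for iPhone 6s-class devices and newer, else "nnpack".

    hardware_id is a machine identifier such as "iPhone8,1" (assumed format).
    Unknown or non-iPhone identifiers fall back to the CPU path.
    """
    match = re.fullmatch(r"iPhone(\d+),(\d+)", hardware_id)
    if match is None:
        return "nnpack"
    major = int(match.group(1))
    # The iPhone 6s and 6s Plus report major version 8; the iPhone 6 reports 7.
    return "metal" if major >= 8 else "nnpack"
```

Falling back to the CPU path for unrecognized identifiers is a conservative choice here; a real integration would more likely query Metal feature support directly.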
Hey @keveman, I wrote the C2 mobile stuff so really interested in the TF approach. How are you thinking of structuring the TF integration? It'd be cool if we can reuse kernel sources and stuff. Some decisions/hacks/notes I made that I was kind of unsure about, so I wondered how you'd approach them:
|
@Yangqing That's what I noticed too. Using Metal on iPhone 6s and above was significantly faster. |
@ajtulloch Thanks for your detailed comments. It looks like the C2 implementation is further along than what I have, but it would be great to share code if possible. I haven't looked at the C2 implementation in detail yet, but I am thinking about all the points you bring up here. Especially, I am super frustrated by the
|
|
At @xmartlabs we have been working on a framework for running neural nets on iOS on top of Metal that supports TensorFlow models. Maybe someone is interested in using it, at least until TF supports Metal. |
Would it make sense to take the same approach to adapt TF models to run on Caffe2? |
@bryant1410 Thanks for the note. The Bender project is awesome! Some of the code, especially the shaders, can be shared between the implementations. I'll keep you posted. |
Core ML might be an interesting abstraction to both this and BNNS: #10468. |
@keveman @ajtulloch
Last but not least, while being Swift based, Forge has some nice ideas in it too for building a framework on top of MPS. |
@keveman Do you have your framework somewhere to try? Even if it is WIP, it may be easier to build on the same foundation. |
@ofirbb interesting.
|
Are there any updates on supporting MPSCNN in TensorFlow or TensorFlow Lite? I am new to machine learning, so I can probably only describe the problem from a user's perspective. We are currently comparing accuracy and performance between TF-Lite and Metal (MPSCNN), and it seems Metal works better on iOS (Mobilenet_v1). Since we have to support iOS 10, Core ML is not an option. I don't like the Metal approach we used, though: it required us to extract the weights and biases from the frozen TF graph, and we also had to write the Metal code ourselves to build the inference. I just thought it would be nice if TF-Lite could provide comparable performance, so that there would be no need for us to dig into Apple's MPSCNN APIs :) Thank you for all your efforts! |
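One concrete step in the manual approach described above is reordering convolution weights. TensorFlow stores `conv2d` filters as [height, width, in_channels, out_channels], while MPSCNN convolution kernels, to my understanding of Apple's documentation, expect [out_channels, height, width, in_channels]. A minimal sketch of that reordering on a flat buffer follows; the helper name `hwio_to_ohwi` is hypothetical, and the code deliberately has no TensorFlow or Metal dependency.

```python
def hwio_to_ohwi(flat, height, width, in_ch, out_ch):
    """Reorder a flat row-major HWIO weight buffer into flat OHWI order."""
    if len(flat) != height * width * in_ch * out_ch:
        raise ValueError("buffer length does not match the given shape")
    out = [0.0] * len(flat)
    for y in range(height):
        for x in range(width):
            for c in range(in_ch):
                for k in range(out_ch):
                    # Source index in [H, W, I, O] row-major order.
                    src = ((y * width + x) * in_ch + c) * out_ch + k
                    # Destination index in [O, H, W, I] row-major order.
                    dst = ((k * height + y) * width + x) * in_ch + c
                    out[dst] = flat[src]
    return out


# A 1x1 kernel with 2 input channels and 3 output channels.
weights = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
print(hwio_to_ohwi(weights, 1, 1, 2, 3))  # [0.0, 3.0, 1.0, 4.0, 2.0, 5.0]
```

In practice the same permutation is a single transpose in NumPy; the explicit loops are only meant to make the index arithmetic visible.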
This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you. |
This issue was closed because it has been inactive for 1 year. |
Related to: #3001
Take advantage of the MPSCNN (Metal Performance Shaders) framework from Apple.
See blog post for a comparison of BNNS to MPSCNN (and associated code).
TL;DR: BNNS is faster for smaller networks but slower for bigger networks.
Related: #4846