This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Implement a model serving framework #1873

Closed
futurely opened this issue Apr 17, 2016 · 9 comments

Comments

@futurely

As @tqchen suggested in soumith/convnet-benchmarks#101 (comment), MXNet should provide a model serving framework to compete with https://github.com/tensorflow/serving.

@revilokeb

@futurely @piiswrong a straightforward way to deploy MXNet models into production environments would indeed be highly welcome.
I have had very good experiences with an open source API server called DeepDetect (http://www.deepdetect.com/, https://github.com/beniz/deepdetect), which I use heavily to deploy models in my commercial production environments. It currently supports Caffe and XGBoost, with partial support for TensorFlow on its way (my experience so far relates only to Caffe). Would this be a route to go down for MXNet?

@jordan-green-zz

@futurely @revilokeb @piiswrong Hey guys,
Where did you end up with this? Model serving and management is an area of focus for me, and I'd be keen to spend some dev hours on a compatible solution.

@piiswrong
Contributor

No one is doing it yet. An easy solution is to use AWS Lambda, but it doesn't support GPUs and doesn't do batching.

You are welcome to work on it. Please propose a design and we can discuss it.
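
To make the batching point concrete: a serving layer usually groups concurrent requests so the model runs once per batch rather than once per request. The following is a minimal sketch under stated assumptions, not something MXNet or AWS Lambda provides. `MicroBatcher`, `predict_batch`, `max_batch_size`, and `max_wait_s` are hypothetical names, and `predict_batch` is assumed to map a list of inputs to a list of outputs of the same length.

```python
import queue
import threading
import time
from concurrent.futures import Future


class MicroBatcher:
    """Groups individual requests into batches before calling the model.

    Hypothetical helper for illustration only.
    """

    def __init__(self, predict_batch, max_batch_size=8, max_wait_s=0.01):
        self.predict_batch = predict_batch    # assumed: list in, list out
        self.max_batch_size = max_batch_size  # upper bound on batch size
        self.max_wait_s = max_wait_s          # how long to wait for a batch to fill
        self._requests = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def submit(self, item):
        """Enqueue one input and return a Future for its output."""
        future = Future()
        self._requests.put((item, future))
        return future

    def _loop(self):
        while True:
            # Block until at least one request arrives.
            batch = [self._requests.get()]
            deadline = time.monotonic() + self.max_wait_s
            # Keep collecting until the batch is full or the wait expires.
            while len(batch) < self.max_batch_size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self._requests.get(timeout=remaining))
                except queue.Empty:
                    break
            items = [item for item, _ in batch]
            try:
                outputs = self.predict_batch(items)
                for (_, future), output in zip(batch, outputs):
                    future.set_result(output)
            except Exception as exc:
                # Propagate a model failure to every caller in the batch.
                for _, future in batch:
                    future.set_exception(exc)


# Example: a stand-in "model" that doubles each input in the batch.
batcher = MicroBatcher(lambda xs: [2 * x for x in xs], max_batch_size=4)
futures = [batcher.submit(i) for i in range(4)]
print([f.result(timeout=2) for f in futures])  # [0, 2, 4, 6]
```

The trade-off is latency versus throughput: a larger `max_wait_s` fills batches more reliably but delays the first request in each batch.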

@beniz

beniz commented Dec 30, 2016

@jordan-green you may be interested in opening an issue for MXNet prediction support at https://github.com/beniz/deepdetect, as it already supports Caffe, XGBoost and TensorFlow. It may not be implemented immediately, though I believe it is not too difficult. If you can help a bit with it, even better: it will happen faster.

@zihaolucky
Member

Excited to see this!!

@jordan-green-zz

Hi all, my current gut feeling is that this piece of functionality may be best provided as a standalone project, under a compatible and permissive license (most likely Apache), so as to benefit other frameworks as well.

It would seem that outside of TF Serving, there's not a lot out there. DeepDetect looks interesting, @beniz; however, it appears to be under the GPL license. Can you please confirm?

Lambda / OpenWhisk

Lambda would almost certainly be a great option if it had GPU support, and Amazon will almost certainly provide this in the near future, whether via a different class of Lambda or via their new Elastic GPU offering (which may be slightly less suited here than the former). This is of course not an open source solution, and as such may not be ideal. This had me thinking about other options for implementing a simple, serverless method for hosting inference models, and I think OpenWhisk may suit here.

GPU Compatibility

I can't find validation that OpenWhisk works on GPUs; however, its generic action invocation appears to run an arbitrary binary via Alpine Linux, which I've used with CUDA in the past with some success. I'll spin up an OpenWhisk VM on my GPU box and report back as to whether or not GPUs are accessible; it's not immediately obvious to me why they shouldn't be.

Simplification

From there, I think using the amalgamation script(s) within MXNet to produce a single 'runnable' object may be a good way to give users a simple deployment process. This will obviously need performance testing.

MXNet Integration

I think this could prove to be a powerful tool for many ML frameworks, with MXNet serving as the foundation in places. Perhaps this would best be its own project/repository, mirrored within and closely integrating with MXNet? Thoughts on this are much appreciated.

Please let me know your thoughts, and once I've validated some of the moving pieces, particularly GPU support on OpenWhisk, I'll knock together a design proposal for further discussion.

@beniz

beniz commented Jan 3, 2017

DD is under the LGPL; please see https://github.com/beniz/deepdetect/blob/master/COPYING.

@eric-haibin-lin
Member

@kevinthesun

@kevinthesun
Contributor

@yuruofeifei and I are working on MXNet model serving. It is still at an early stage. In the current phase, it creates an HTTP endpoint and lets developers fully customize their preprocessing and postprocessing functions for inference. More powerful features will be added in future stages.
https://github.com/yuruofeifei/mms
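
For readers skimming the thread, the shape described above (an HTTP endpoint with user-supplied preprocess and postprocess functions) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the mms API. `ModelService`, `make_handler`, `serve`, and the JSON payload layout are all hypothetical, and the `predict` callable is a stand-in for a real MXNet predictor.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Callable, Optional


class ModelService:
    """Chains user-supplied preprocess -> predict -> postprocess callables.

    Hypothetical class for illustration; not part of MXNet or mms.
    """

    def __init__(
        self,
        predict: Callable[[Any], Any],
        preprocess: Optional[Callable[[Any], Any]] = None,
        postprocess: Optional[Callable[[Any], Any]] = None,
    ):
        self.predict = predict
        # Default hooks pass data through unchanged.
        self.preprocess = preprocess or (lambda x: x)
        self.postprocess = postprocess or (lambda y: y)

    def handle(self, body: bytes) -> bytes:
        """Decode a JSON request body, run the pipeline, encode the result."""
        payload = json.loads(body.decode("utf-8"))
        result = self.postprocess(self.predict(self.preprocess(payload)))
        return json.dumps(result).encode("utf-8")


def make_handler(service: ModelService):
    """Build a request handler class bound to one ModelService."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            try:
                response, status = service.handle(self.rfile.read(length)), 200
            except Exception as exc:
                # Report bad input or model errors as a JSON error message.
                response = json.dumps({"error": str(exc)}).encode("utf-8")
                status = 400
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(response)))
            self.end_headers()
            self.wfile.write(response)

    return Handler


def serve(service: ModelService, host: str = "127.0.0.1", port: int = 8080):
    """Start a blocking HTTP server that answers POST requests."""
    HTTPServer((host, port), make_handler(service)).serve_forever()


# Stand-in "model" that sums its inputs; a real deployment would call
# an MXNet predictor inside `predict` instead.
service = ModelService(
    predict=lambda xs: sum(xs),
    preprocess=lambda payload: [float(v) for v in payload["data"]],
    postprocess=lambda y: {"prediction": y},
)
```

Calling `serve(service)` would start the endpoint. Keeping the pipeline in `handle` separate from the HTTP handler means the custom hooks can be exercised without opening a socket.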

@tqchen tqchen closed this as completed Oct 19, 2017
9 participants