
Batch Prediction using GPUs with local runner #251

Closed
jlewi opened this issue Feb 15, 2018 · 10 comments
@jlewi
Contributor

jlewi commented Feb 15, 2018

Using GPUs for batch prediction could be really valuable.

There are a couple of different ways we could support this.

  1. If we use a framework like Spark or Flink running on a K8s cluster with GPU nodes, then the workers should be able to use the GPUs directly.

  2. If we run Spark/Flink/Dataflow external to K8s, such that the workers don't have direct access to GPUs, then we could deploy TFServing on a K8s cluster with GPUs and have the workers send batches of requests to the model for inference (see the sketch below).
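A rough sketch of the worker side of option 2, assuming TF Serving's standard gRPC Predict API; the host, port, model name, and tensor names ("my_model", "inputs", "outputs") are placeholders, not anything this issue prescribes.

```python
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

# TF Serving endpoint exposed by the GPU-backed K8s deployment (placeholder).
channel = grpc.insecure_channel("tf-serving.example.com:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

def predict_batch(examples):
    """Send one batch of inputs to the remote model and return its outputs."""
    request = predict_pb2.PredictRequest()
    request.model_spec.name = "my_model"
    request.model_spec.signature_name = "serving_default"
    # Pack the whole batch into a single tensor so the GPU sees one large
    # request instead of many small ones.
    request.inputs["inputs"].CopyFrom(tf.make_tensor_proto(examples))
    response = stub.Predict(request, timeout=30.0)
    return tf.make_ndarray(response.outputs["outputs"])
```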

@jlewi
Contributor Author

jlewi commented Mar 22, 2018

Some things to investigate:

  • TFServing currently doesn't support multiple GPUs, but you can potentially work around this by running one container per GPU.

  • NVIDIA has TensorRT to optimize nets for GPUs.
    • But this only supports a subset of TF graphs.

  • NVIDIA has GRE (GPU REST Engine).

@yixinshi
Member

/assign yixinshi

@ankushagarwal
Contributor

With the new integration between TensorRT and TensorFlow 1.7, TensorRT optimizes compatible sub-graphs and lets TensorFlow execute the rest.
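A hedged sketch of that TF 1.7 integration, where the converter lived in tensorflow.contrib.tensorrt; the frozen-graph path and output node name below are placeholders.

```python
import tensorflow as tf
import tensorflow.contrib.tensorrt as trt  # TF 1.7-era TF-TRT integration

# Load a frozen GraphDef (path is a placeholder).
frozen_graph_def = tf.GraphDef()
with tf.gfile.GFile("model/frozen_graph.pb", "rb") as f:
    frozen_graph_def.ParseFromString(f.read())

# TensorRT rewrites the compatible subgraphs; TensorFlow executes the rest.
trt_graph_def = trt.create_inference_graph(
    input_graph_def=frozen_graph_def,
    outputs=["logits"],                  # output node names (placeholder)
    max_batch_size=64,
    max_workspace_size_bytes=1 << 30,
    precision_mode="FP16")
```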

@jlewi
Contributor Author

jlewi commented Apr 9, 2018

@yixinshi How's this going?

@yixinshi
Member

I think I will start with a container image of TF Serving as the base image for batch prediction, then optimize it to use TensorRT and/or GRE.

@jlewi jlewi changed the title Batch Prediction using GPUs Batch Prediction using GPUs with local runner Apr 30, 2018
@jlewi
Contributor Author

jlewi commented Apr 30, 2018

P1 because we'd like to have this in our 0.2 release. However, we will probably only support the local Beam runner for GPUs in 0.2 (see the sketch below).
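A minimal sketch of what that 0.2 scope implies, assuming the Apache Beam Python SDK with the DirectRunner; the DoFn body and file patterns are placeholders, not the actual kubeflow/batch-predict code.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


class PredictDoFn(beam.DoFn):
    """Placeholder DoFn: load the model once per bundle, then run inference."""

    def start_bundle(self):
        # In a real pipeline this would load the saved model onto the local GPU.
        self.model = None

    def process(self, element):
        # Replace with real inference against self.model.
        yield element


# The DirectRunner executes the whole pipeline in-process on the local
# machine, so the DoFn can see that machine's GPUs directly.
options = PipelineOptions(["--runner=DirectRunner"])
with beam.Pipeline(options=options) as p:
    (p
     | "ReadExamples" >> beam.io.ReadFromTFRecord("data/examples*.tfrecord")
     | "Predict" >> beam.ParDo(PredictDoFn())
     | "WriteResults" >> beam.io.WriteToText("output/predictions"))
```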

@jlewi
Contributor Author

jlewi commented Jun 5, 2018

@yixinshi What is the likelihood this will make 0.2?

@bhack
Copy link

bhack commented Jul 1, 2018

Is TensorFlow Serving compilable with TensorRT? tensorflow/serving#864

@jlewi
Contributor Author

jlewi commented Aug 20, 2018

@bhack Don't know.

@jlewi
Contributor Author

jlewi commented Aug 20, 2018

@yixinshi Can we close this issue? I think batch prediction with GPUs and the local runner is working now.

@jlewi jlewi closed this as completed Sep 3, 2018
yanniszark pushed a commit to arrikto/kubeflow that referenced this issue Feb 15, 2021
elenzio9 pushed a commit to arrikto/kubeflow that referenced this issue Oct 31, 2022