python model server #15

Closed
BrentDorsey opened this issue Dec 7, 2017 · 12 comments


BrentDorsey commented Dec 7, 2017

@aronchick / @jlewi - I enjoyed the BoF: Machine Learning on Kubernetes talk this evening, thanks for organizing it!

It would be great if kubeflow could include a Python ML model container with support for:

  • conda environment.yml requirements management
    • I use conda to build production ML containers that support multiple model types including
      • scikit-learn
      • theano
      • xgboost gradient boosted trees
  • gRPC and REST prediction APIs with Swagger documentation
  • horizontal pod autoscaling
  • prometheus application and system metrics
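For the conda requirements-management point above, an environment.yml covering those model types might look something like this (a minimal sketch; the environment name, pinned Python version, and the serving-layer packages are assumptions, not part of the original request):

```yaml
# environment.yml -- hypothetical conda spec for a Python model container
name: model-server
channels:
  - conda-forge
dependencies:
  - python=3.6
  - scikit-learn
  - theano
  - xgboost
  - flask     # assumption: backing a REST prediction API
  - grpcio    # assumption: backing a gRPC prediction API
```

A container build would then typically run `conda env create -f environment.yml` in its Dockerfile so the image is reproducible from the spec.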
@yuvipanda
Contributor

For building user images, we (at Project Jupyter) have built http://github.com/jupyter/repo2docker. https://github.com/jupyterhub/binderhub (deployed at mybinder.org) is a JupyterHub service that lets you dynamically build & run images based on a git repo (Heroku style!). Perhaps that would be useful to have?
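For flavor, repo2docker is driven from the command line roughly like this (the repository URL is a placeholder; see the repo2docker README for authoritative usage):

```
# Build an image from a git repo's declared dependencies
# (environment.yml, requirements.txt, etc.) and launch Jupyter in it.
jupyter-repo2docker https://github.com/<user>/<repo>
```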


jlewi commented Dec 7, 2017

@BrentDorsey We definitely need containers suitable for serving a variety of models. I think a lot of companies have already developed this piece. For example, Seldon has defined a prediction API with implementations for various model types.

So hopefully we can figure out how to leverage one or more of the existing solutions rather than developing a new one.

@yuvipanda I'd love to understand BinderHub more. One of the problems we have right now is that if you're developing on-cluster (e.g., via JupyterLab, thank you for that) and writing a program (e.g., TF), we need to build a container from within the cluster. Right now I don't have a good solution for building Docker images from within the cluster. It sounds like BinderHub might solve this problem.

/cc @cliveseldon

@BrentDorsey
Author

@jlewi - I like Pipeline.AI's prediction API; of course I'm biased, as I'm a Pipeline.AI committer.

@yuvipanda - I'd also love to understand more about binderhub.


jlewi commented Dec 7, 2017

@BrentDorsey Do you have a pointer to Pipeline.AI's prediction server? I had trouble finding it in the repo.


BrentDorsey commented Dec 7, 2017 via email

@ukclivecox
Contributor

Hi,

I work for Seldon, referenced above by @jlewi

We have pushed our seldon-core product to github https://github.com/SeldonIO/seldon-core

Its goals are to allow generic machine learning models to be deployed on Kubernetes. It would be great to get some feedback on how this contributes to this issue and, more generally, to the deployment side of Kubeflow.

@aronchick
Contributor

Thanks so much for letting us know, @cliveseldon! I think the overlap here is really strong, and would love to collaborate/include Seldon-core.

Do you have any thoughts on whether it should stay monolithic, or be broken down into component parts? It does a lot of different things, and I was just curious about your philosophy.

@ukclivecox
Contributor

Thanks @aronchick. We're focused just on serving, not training. The core of what we provide is allowing runtime graphs to be specified via a CRD. Those graphs may contain containers that are microservices exposing ML predictions, but also routers (e.g., A/B routing), combiners (e.g., ensembling), and transformers (e.g., feature transforms), which may also be microservices inside containers.

At present this functionality is handled by three components that run in Kubernetes: an operator to handle the CRD, an engine to handle the request/response service mesh described by your graph, and an API front end that exposes these to the outside world as gRPC and REST. All of this is quite tightly coupled to the CRD. So I suppose these parts would be similar to the TensorFlow Serving part of the present Kubeflow, except aiming to allow more complex graphs and generic ML runtime tools (thus the discussion in this issue).
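For flavor, a runtime graph with a single model in a SeldonDeployment resource looks roughly like the following (a sketch only: the apiVersion and exact field names have varied across seldon-core versions, and the image name is a placeholder):

```yaml
apiVersion: machinelearning.seldon.io/v1alpha2   # version is an assumption
kind: SeldonDeployment
metadata:
  name: example
spec:
  predictors:
  - name: default
    replicas: 1
    componentSpecs:
    - spec:
        containers:
        - name: classifier
          image: my-registry/my-model:0.1        # placeholder model image
    graph:
      name: classifier
      type: MODEL
      endpoint:
        type: REST
```

More complex graphs chain additional nodes (ROUTER, COMBINER, TRANSFORMER) under `graph` in the same way.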

There are also Prometheus and Grafana (for metrics and a dashboard) and Kafka (and thus ZooKeeper) in the current Helm chart, but they're not core and could be replaced/ignored.

@aronchick
Contributor

Oh, for sure, but unless I misunderstand, Seldon-core also handles packaging of the model for serving? If it's limited to serving, then I totally agree with you. When I was referring to components, it was more about the different stages of a pipeline.

@ukclivecox
Contributor

Yes, we provide a generic API for prediction models to handle requests/responses. At a high level we want to make it easy to add runtime models, so either the data scientist can do it themselves (dockerize their model and expose a REST/gRPC endpoint that respects the API), or we plan to provide some automated wrappers to make this easy.

So to be clear, packaging of the model is separate so could be done outside the project.
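The wrapper contract described above can be sketched in plain Python: any class exposing a predict method over request data could be dockerized behind a REST/gRPC shim that respects the prediction API. The class and method names below are illustrative assumptions, not Seldon's exact API:

```python
# Hypothetical model wrapper: a plain Python class with a predict()
# method, the kind of object a generic REST/gRPC shim could serve.
class MeanModel:
    """Toy 'model' that predicts the per-row mean of its inputs."""

    def predict(self, X, feature_names=None):
        # X is a list of feature rows; return one prediction per row.
        return [sum(row) / len(row) for row in X]

if __name__ == "__main__":
    model = MeanModel()
    print(model.predict([[1.0, 2.0, 3.0], [4.0, 6.0]]))  # -> [2.0, 5.0]
```

An automated wrapper would then only need to deserialize the request payload into rows, call `predict`, and serialize the result.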


jlewi commented Dec 23, 2017

I spent some time looking at it this morning. The Seldon prediction graph and CRD look super cool and should be pretty easy to incorporate.

I think we would do something very similar to what we do for the TfJob operator.

We can create a ksonnet package to deploy the SeldonDeployment CRD. At that point I believe users should just be able to create SeldonDeployments and they would work. We could also deploy the Grafana and Prometheus pieces, but it might make sense to make that a separate ksonnet package or include it in the core ksonnet package.
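Assuming the package follows the same pattern as the other Kubeflow ksonnet packages, installation might look roughly like this (registry path, package name, and prototype name are all assumptions):

```
ks registry add kubeflow github.com/kubeflow/kubeflow/tree/master/kubeflow
ks pkg install kubeflow/seldon        # hypothetical package name
ks generate seldon seldon             # hypothetical prototype name
ks apply default -c seldon
```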

/cc @cliveseldon @aronchick


jlewi commented Mar 7, 2018

Seldon is now available as part of Kubeflow. So I'm closing this issue.

@jlewi jlewi closed this as completed Mar 7, 2018