
Add doc for TF Serving #1165

Merged
merged 5 commits into kubeflow:master on Jul 18, 2018

Conversation

@lluunn (Contributor) commented Jul 10, 2018

/cc @jlewi
/cc @ynqa



@k8s-ci-robot requested a review from @jlewi on July 10, 2018, 21:46
@k8s-ci-robot (Contributor):

@lluunn: GitHub didn't allow me to request PR reviews from the following users: ynqa.

Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @jlewi
/cc @ynqa

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lluunn (Contributor, Author) commented Jul 11, 2018

/retest

@lluunn (Contributor, Author) commented Jul 11, 2018

Tests timed out.

@lluunn (Contributor, Author) commented Jul 11, 2018

#1184

@lluunn (Contributor, Author) commented Jul 11, 2018

/retest

@jlewi (Contributor) left a comment

Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @lluunn and @jlewi)


docs_dev/tf_serving.md, line 3 at r1 (raw file):

# TF Serving in Kubeflow

Status:

add authors and list yourself


docs_dev/tf_serving.md, line 21 at r1 (raw file):

#### HTTP proxy
The [http proxy](https://github.com/kubeflow/kubeflow/tree/master/components/k8s-model-server/http-proxy)

You should explain why we have this given TF 1.8 supports it. And also mention current thinking about the plans going forward.
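For context on this thread: since TF 1.8, TF Serving itself exposes a REST predict endpoint (`POST /v1/models/<name>:predict`). A minimal sketch of building the JSON body that endpoint expects; the model name and input values below are illustrative assumptions, not part of the doc under review:

```python
import json

def predict_body(instances):
    # TF Serving's REST API (TF 1.8+) accepts POST
    # /v1/models/<model_name>:predict with a JSON object whose
    # "instances" key lists the input tensors, one entry per example.
    return json.dumps({"instances": instances})

# Hypothetical input: a single 3-element feature vector.
body = predict_body([[1.0, 2.0, 3.0]])
```

A client would POST `body` to e.g. `http://<host>:8501/v1/models/mnist:predict` (8501 is the REST port commonly used by TF Serving; the gRPC port is typically 8500).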


docs_dev/tf_serving.md, line 29 at r1 (raw file):

Check out gcr.io/kubeflow-images-public registry for other versions.

#### Istio integration

I'd start a monitoring section and put this in it. Can you also add information about Prometheus integration and liveness/health checks?
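To illustrate the liveness/health checks being asked for here: a minimal sketch of a health endpoint such as the proxy could expose for a Kubernetes livenessProbe (`httpGet`, `path: /healthz`). The `/healthz` path and handler are hypothetical, not the actual proxy code:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    # Hypothetical /healthz endpoint; a Kubernetes livenessProbe
    # polling it would restart the pod on non-200 responses.
    def do_GET(self):
        if self.path == "/healthz":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet

# To serve: HTTPServer(("", 8080), HealthHandler).serve_forever()
```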

@jlewi (Contributor) left a comment

Reviewable status: 0 of 1 files reviewed, 4 unresolved discussions (waiting on @lluunn and @jlewi)


docs_dev/tf_serving.md, line 35 at r1 (raw file):

See [detail](https://github.com/kubeflow/kubeflow/blob/master/components/k8s-model-server/istio-integration.md).

## Usage

Let's also add a section about Seldon. It could be as simple as noting the open question of how to combine TF Serving with Seldon's model management features. I think there might be some open issues about this.

@lluunn (Contributor, Author) left a comment

Reviewable status: 0 of 1 files reviewed, 4 unresolved discussions (waiting on @jlewi)


docs_dev/tf_serving.md, line 3 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

add authors and list yourself

Done.


docs_dev/tf_serving.md, line 21 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

You should explain why we have this given TF 1.8 supports it. And also mention current thinking about the plans going forward.

Done.


docs_dev/tf_serving.md, line 29 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

I'd start a monitoring section and put this in it. Can you also add information about Prometheus integration and liveness/health checks?

Done.


docs_dev/tf_serving.md, line 35 at r1 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Let's also add a section about Seldon. It could be as simple as noting the open question of how to combine TF Serving with Seldon's model management features. I think there might be some open issues about this.

Done.

@lluunn (Contributor, Author) commented Jul 12, 2018

Thank you, PTAL @jlewi


## SeldonIO
[SeldonIO](https://github.com/SeldonIO/seldon-core) is a platform for serving ML
models. It's an open question on how should we integrate TF Serving with
Contributor:

... on how should we -> Its an open question how we should

Contributor Author:

done

Seldon to get nice features like routing and model management.

## Next steps
- [kubeflow/kubeflow#1000](https://github.com/kubeflow/kubeflow/issues/1000): Support logging request and response
Contributor:

How about autoscaling?

Contributor Author:

There is a benchmark issue below, which should cover it? I added some wording.

TF 1.8). Going forward, we will still maintain and support this http proxy as
it has the ability to provide additional features such as:
- exporting metrics, [kubeflow/kubeflow#1036](https://github.com/kubeflow/kubeflow/issues/1036).
- request logging. See
Contributor:

Are we planning on doing request logging in the http proxy? Why can't we do that in TF Serving?
Would we have to also proxy gRPC requests in order to get that functionality when gRPC is used?

Contributor Author:

I think it's easier to make progress if we do it in our http proxy compared to TF Serving.

We can discuss this in thread #1000.

Contributor:

Can you add a link to the issue in the doc?

Contributor Author:

It's there

@jlewi (Contributor) left a comment

Reviewable status: 0 of 1 files reviewed, 4 unresolved discussions (waiting on @jlewi and @lluunn)


docs_dev/tf_serving.md, line 64 at r2 (raw file):

Previously, lluunn (Lun-Kai Hsu) wrote…

There is a benchmark issue below, which should cover it? I added some wording.

Autoscaling seems different from benchmarking. Do we have an issue about autoscaling? Some of the questions are how do we take advantage of K8s autoscaling? Do we have the right metrics to trigger K8s autoscaling?


docs_dev/tf_serving.md, line 49 at r3 (raw file):

See [detail](https://github.com/kubeflow/kubeflow/blob/master/components/k8s-model-server/istio-integration.md).

We are working on surfacing those metrics and integrating with Prometheus

I thought we were only planning on using Prometheus for application, TF Serving specific metrics?

@lluunn (Contributor, Author) left a comment

Reviewable status: 0 of 1 files reviewed, 4 unresolved discussions (waiting on @jlewi and @lluunn)


docs_dev/tf_serving.md, line 64 at r2 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

Autoscaling seems different from benchmarking. Do we have an issue about autoscaling? Some of the questions are how do we take advantage of K8s autoscaling? Do we have the right metrics to trigger K8s autoscaling?

I see. Filed an issue and linked it.


docs_dev/tf_serving.md, line 49 at r3 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…

I thought we were only planning on using Prometheus for application, TF Serving specific metrics?

I think we can do all of it there, so we don't have to rely on Istio?

@lluunn (Contributor, Author) commented Jul 16, 2018

@jlewi PTAL, thanks!

@lluunn (Contributor, Author) commented Jul 16, 2018

Error: No module named py.

@lluunn (Contributor, Author) commented Jul 16, 2018

#1218

@lluunn (Contributor, Author) commented Jul 16, 2018

/retest

1 similar comment
@lluunn (Contributor, Author) commented Jul 16, 2018

/retest

@jlewi (Contributor) left a comment

Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @jlewi)


docs_dev/tf_serving.md, line 49 at r3 (raw file):

https://github.com/kubeflow/kubeflow/issues/1036

Do you mean we can do that in our proxy? Why would we want to build a custom proxy to collect telemetry metrics when this is exactly what Istio does?

We'd like to be able to do this for all Kubeflow microservices, not just TF Serving.

@lluunn (Contributor, Author) commented Jul 17, 2018

/retest

1 similar comment
@lluunn (Contributor, Author) commented Jul 17, 2018

/retest

@lluunn (Contributor, Author) commented Jul 18, 2018

/retest

@jlewi (Contributor) left a comment

Reviewable status: 0 of 1 files reviewed, 2 unresolved discussions (waiting on @jlewi)


docs_dev/tf_serving.md, line 49 at r3 (raw file):

Previously, jlewi (Jeremy Lewi) wrote…
https://github.com/kubeflow/kubeflow/issues/1036

Do you mean we can do that in our proxy? Why would we want to build a custom proxy to collect telemetry metrics when this is exactly what Istio does?

We'd like to be able to do this for all Kubeflow microservices, not just TF Serving.

I think the text is fine, but we should continue to discuss whether it's better to use Istio or not.

@jlewi (Contributor) commented Jul 18, 2018

/lgtm
/approve

@k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlewi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit f5801b1 into kubeflow:master Jul 18, 2018
saffaalvi pushed a commit to StatCan/kubeflow that referenced this pull request Feb 11, 2021
* tf serving md

* fix

* address comments

* address review

* comment
3 participants