New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TF serving support request logging #1229
Conversation
/hold |
Please update the PR description to describe the approach being taken in this PR to solve the request logging problem. |
time.sleep(10) | ||
if not os.path.isfile(target_file): | ||
continue | ||
print('found the file\n') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use logging not print.
@@ -0,0 +1 @@ | |||
google-cloud-bigquery |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need bigquery?
import os | ||
import time | ||
|
||
from google.cloud import bigquery |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to write our own logging agent as opposed to using fluentd?
For example, https://github.com/kaizenplatform/fluent-plugin-bigquery
@@ -19,10 +19,12 @@ | |||
from itertools import repeat | |||
import base64 | |||
import logging | |||
import grpc | |||
import random |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do it in the http server proxy we won't get this for gRPC requests; right?
Are you emitting the logs in json format? |
Using fluentd now, PTAL |
/hold cancel It's working now |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 5 files reviewed, 10 unresolved discussions (waiting on @jlewi and @lluunn)
components/k8s-model-server/request-logging.md, line 3 at r3 (raw file):
# Request logging for TF Serving It currently supports streaming to BigQuery.
Is this still true even though we are using fluentd?
I'd suggest putting this into the existing dev doc on serving that you have. User facing documentation can be added to the website.
components/k8s-model-server/request-logging.md, line 16 at r3 (raw file):
Modify bigquery dataset and schema in
fluent.conf
.
How does fluentd.conf get passed to the agent?
components/k8s-model-server/fluentd-logger/Dockerfile, line 1 at r3 (raw file):
FROM fluent/fluentd:v1.2-debian
Can you add a comment? Are we building our own fluentd agent because we want to use BigQuery?
kubeflow/examples/fluent.conf, line 1 at r3 (raw file):
<source>
I think this belongs in the serving package not the examples package.
kubeflow/examples/fluent.conf, line 16 at r3 (raw file):
auth_method application_default # change these!
Is this comment valid?
How would user actually supply this? Would it be via a ConfigMap?
kubeflow/examples/fluent.conf, line 17 at r3 (raw file):
# change these! project TBD
Can we make the configmap containing conf a prototype and make these parameters?
You can just use ksonnet to build the string and then add in the variables e.g.
local config = ".....
project " + params.project
ksonnet supports raw multiline strings. I forget what the syntax is.
Another approach would be to use environment variables in the conf
https://docs.fluentd.org/v0.12/articles/faq#how-can-i-use-environment-variables-to-configure-parameters-dynamically?
And then specify those environment variables on the pod.
kubeflow/examples/prototypes/tf-serving-with-request-log.jsonnet, line 18 at r3 (raw file):
// Change this! local gcpSecretName = "TBD";
Why is this TBD? Why not make it a parameter that defaults to "user-gcp-sa" which is the default secret we set up.
This is great. Can you update the PR description to explain the design choices; i.e. the fact that we are using fluentd and BigQuery? |
Some of the comment are regarding how we present the prototype:
I think the current libsonnet is already messy (filed #1264 ), so I choose option 2 here. WDYT? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 5 files reviewed, 10 unresolved discussions (waiting on @lluunn and @jlewi)
components/k8s-model-server/request-logging.md, line 3 at r3 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
Is this still true even though we are using fluentd?
I'd suggest putting this into the existing dev doc on serving that you have. User facing documentation can be added to the website.
Yes, I only installed bigquery plugin now.
I will add this info to website under Guides: TF serving
components/k8s-model-server/request-logging.md, line 16 at r3 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
How does fluentd.conf get passed to the agent?
it sits locally with jsonnet prototype. The prototype imports it as configmap
components/k8s-model-server/fluentd-logger/Dockerfile, line 1 at r3 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
Can you add a comment? Are we building our own fluentd agent because we want to use BigQuery?
Yes. https://github.com/fluent/fluentd-docker-image#3-customize-dockerfile-to-install-plugins-optional
kubeflow/examples/fluent.conf, line 16 at r3 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
Is this comment valid?
How would user actually supply this? Would it be via a ConfigMap?
jsonnet imports this as a configmap
kubeflow/examples/prototypes/tf-serving-with-request-log.jsonnet, line 18 at r3 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
Why is this TBD? Why not make it a parameter that defaults to "user-gcp-sa" which is the default secret we set up.
Done.
components/k8s-model-server/request-logger/logging_worker.py, line 27 at r1 (raw file):
obsolete
obsolete
components/k8s-model-server/request-logger/logging_worker.py, line 50 at r1 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
Use logging not print.
obsolete
components/k8s-model-server/request-logger/requirements.txt, line 1 at r1 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
Why do you need bigquery?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 5 files reviewed, 6 unresolved discussions (waiting on @jlewi and @lluunn)
components/k8s-model-server/fluentd-logger/Dockerfile, line 3 at r4 (raw file):
FROM fluent/fluentd:v1.2-debian # Fluentd image with plugin installed.
Nit move comment to top of file
kubeflow/examples/fluent.conf, line 1 at r3 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
I think this belongs in the serving package not the examples package.
Can we move this to serving package please?
The prototype belongs in the tf-serving package regardless of how you generate it; i.e. you can still copy the whole jsonnet. Telling a user to change certain lines in a text file is much less convenient then doing
Since these parameters are required we should try to expose them as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 4 files reviewed, 5 unresolved discussions (waiting on @jlewi)
kubeflow/examples/fluent.conf, line 1 at r3 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
Can we move this to serving package please?
Done.
kubeflow/examples/fluent.conf, line 16 at r3 (raw file):
Previously, lluunn (Lun-Kai Hsu) wrote…
jsonnet imports this as a configmap
Done.
kubeflow/examples/fluent.conf, line 17 at r3 (raw file):
Previously, jlewi (Jeremy Lewi) wrote…
Can we make the configmap containing conf a prototype and make these parameters?
You can just use ksonnet to build the string and then add in the variables e.g.local config = ".....
project " + params.projectksonnet supports raw multiline strings. I forget what the syntax is.
Another approach would be to use environment variables in the conf
https://docs.fluentd.org/v0.12/articles/faq#how-can-i-use-environment-variables-to-configure-parameters-dynamically?And then specify those environment variables on the pod.
Done.
Remove the conf file and make it in the prototype. project etc can be set via |
test failure: #1266 |
}, | ||
// Http proxy | ||
{ | ||
name: "mnist-http-proxy", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove mnist in the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
}, | ||
], | ||
}, | ||
// Logging container. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets file an issue to create an admission controller to inject the side car.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
}, | ||
// Logging container. | ||
{ | ||
name: "mnist-logging", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use mnist in the name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Sorry for the slow reply; GCP next. Minor comment about replacing the name "mnist" somewhere. The duplication of code with the other templates is unfortunate. I think the best option is probably to create an admission controller that can inject the relevant information and then just make that its own prototype. Lets file a follow on issue. |
Thanks for the review, PTAL |
/retest |
1 similar comment
/retest |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jlewi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
* tf serving request logging * use fluentd * add md * fix link * fix js lint * comment * make config in prototype * fix js * address comments
For #1000
/cc @jlewi
This change is