New TF Serving template #1589
/retest
/assign @kunmingg
/cc @kkasravi
I think we're headed toward having multiple prototypes; we already have 2, and this adds a 3rd. @kkasravi's recent prototype #1501 suggests to me (see comment) that we should define a base libsonnet file and then use the k8s libsonnet to modify it for each prototype. I considered having a basic prototype with all the logic in the .jsonnet file that people can use as a simple example so they don't have to learn the k8s lib, but even that seems like it would get pretty unruly. My initial thought is that we should have the following prototypes
So just supporting those 3 different use cases will lead to a lot of code duplication if we put all the code into the .jsonnet file.
I was thinking that we can use only this one jsonnet later. Here the jsonnet structure has two parts (before and after line 75):
WDYT?
One of the reasons for having parameters is to make it easy for users to figure out what they have to set. If we have separate prototypes for GCP and S3, then in the prototype for GCP we can explicitly define all the relevant parameters, e.g.
Then when the user invokes help on the component they can get a meaningful list of parameters that they could set.
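To illustrate the point above, a GCP-only prototype could declare just the parameters a GCP user cares about. This is a minimal sketch, not the prototype from this PR: the prototype name `io.ksonnet.pkg.tf-serving-gcp`, the volume name, and the inline `params` object are assumptions made for the example (a real ksonnet prototype would have its parameters injected by ksonnet rather than hard-coded). `gcpCredentialSecretName` is the name used in this PR's diff.

```jsonnet
// Hypothetical header for a GCP-specific prototype (names are illustrative).
// @apiversion 0.1
// @name io.ksonnet.pkg.tf-serving-gcp
// @description TensorFlow serving on GCP
// @shortDescription A TensorFlow serving deployment on GCP
// @param name string Name to give to each of the components
// @optionalParam namespace string kubeflow The namespace
// @optionalParam modelBasePath string gs://kubeflow-examples-data/mnist The model path
// @optionalParam gcpCredentialSecretName string "" Name of the k8s secret containing GCP credentials

// Stand-in for the values ksonnet would inject; note that no S3 parameters appear.
local params = {
  name: "mnist",
  namespace: "kubeflow",
  modelBasePath: "gs://kubeflow-examples-data/mnist",
  gcpCredentialSecretName: "",
};

// Mount the credential secret only when one is configured.
{
  volumes: if params.gcpCredentialSecretName != "" then [
    {
      name: "gcp-credentials",  // hypothetical volume name
      secret: { secretName: params.gcpCredentialSecretName },
    },
  ] else [],
}
```

With a header like this, the help output for the component would list only the generic and GCP parameters.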
But how about istio and request logs? They can co-exist with GCP credentials and S3. We can add both gcpSecretName and s3Enabled as flags in this prototype, and add better documentation. WDYT?
I think it's fine to include options for istio and request logs since those aren't mutually exclusive. We should introduce different prototypes for things that are exclusive; the choice of cloud is exclusive. You are likely either running on AWS or running on GCP. We already have
- 8 non-cloud-specific parameters
- S3 has 7 different parameters
- GCP has 1
The whole point of adding parameters is to make it easy for users to customize the variables they care about. IMO the ease of use starts to decrease if we provide so many parameters that the user can't figure out which ones they care about. A GCP user could reasonably be confused by the fact that there are a bunch of S3-related parameters and wonder why they have to set them. The situation will only get worse as we add support for other clouds (e.g. ACK, Azure). So I think having multiple prototypes is more tractable, and we can iterate on what belongs in a separate prototype vs. options in the existing prototypes. I think a good place to start would be to think about a prototype specific to GCP and add the parameters and options that make sense on GCP.
I like defining the separate prototypes as @jlewi outlined:
- TFServing using S3
- kubeflow/tf-serving/prototypes/tf-serving-aws.jsonnet
- TFServing using GCP
- kubeflow/tf-serving/prototypes/tf-serving-gcp.jsonnet
- TFServing using PVC
- kubeflow/tf-serving/prototypes/tf-serving-pvc.jsonnet
though istio and request logs would need to be added to all 3 prototypes via a mixin at the libsonnet level.
Tensorboard has
- Tensorboard using S3
- kubeflow/tensorboard/prototypes/tensorboard-aws.jsonnet
- Tensorboard using GCP
- kubeflow/tensorboard/prototypes/tensorboard-gcp.jsonnet
however in {tensorboard-aws.jsonnet, tensorboard-gcp.jsonnet} I neglected to include the params in the comments at the top of each file. I'll submit a PR for the tensorboard change.
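The "base plus mixin" layout suggested in this comment could look roughly like the following. It is a sketch in plain Jsonnet under stated assumptions: `base`, `s3Mixin`, and `requestLogMixin` are hypothetical names, the annotation key is a placeholder, and the real files would build these objects with the k8s libsonnet rather than by hand.

```jsonnet
// Hypothetical base shared by all prototypes (stand-in for a base libsonnet file).
local base = {
  params:: { name: "mnist", modelBasePath: "/mnt/models" },
  container:: {
    name: $.params.name,
    args: ["--model_base_path=" + $.params.modelBasePath],
    env: [],
  },
  deployment: {
    kind: "Deployment",
    metadata: { name: $.params.name },
    spec: { template: { spec: { containers: [$.container] } } },
  },
};

// Hypothetical S3 mixin: only appends cloud-specific environment variables.
local s3Mixin = {
  container+:: {
    env+: [{ name: "AWS_REGION", value: "us-west-1" }],
  },
};

// Hypothetical request-logging mixin; not tied to any cloud, so every
// prototype can apply it. The annotation key is a placeholder.
local requestLogMixin = {
  deployment+: {
    metadata+: { annotations+: { "example.com/request-logging": "true" } },
  },
};

// Each prototype composes the base with the mixins it needs.
{
  aws: base + s3Mixin + requestLogMixin,
  pvc: base + requestLogMixin,
}
```

Because `$.container` is late-bound, the S3 environment variable added by the mixin shows up inside the `aws` deployment without the base knowing about S3.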
// @optionalParam deployHttpProxy string false Whether to deploy http proxy
// @optionalParam modelBasePath string gs://kubeflow-examples-data/mnist The model path
// @optionalParam modelName string mnist The model name
// @optionalParam s3Enable string false Whether to enable S3
this could be moved into a file called kubeflow/tf-serving/prototypes/tf-serving-aws.jsonnet
} + params;

// Parameters that control S3 access. Need to set params.s3Enable to true
local s3params = {
If these were moved into the comments in kubeflow/tf-serving/prototypes/tf-serving-aws.jsonnet, they would have better locality of reference. For example:
// @apiversion 0.1
// @name io.ksonnet.pkg.tf-serving-template
// @description TensorFlow serving
// @shortDescription A TensorFlow serving deployment
// @param name string Name to give to each of the components
// @optionalParam namespace string kubeflow The namespace
// @optionalParam numGpus string 0 Number of gpus to use
// @optionalParam deployHttpProxy string false Whether to deploy http proxy
// @optionalParam modelBasePath string gs://kubeflow-examples-data/mnist The model path
// @optionalParam modelName string mnist The model name
// @optionalParam s3SecretName string "" Name of the k8s secrets containing S3 credentials
// @optionalParam s3SecretAccesskeyidKeyName string "AWS_ACCESS_KEY_ID" Name of the key in the k8s secret containing AWS_ACCESS_KEY_ID
// @optionalParam s3AwsRegion string "us-west-1" S3 region
// @optionalParam s3UseHttps string "true" Whether or not to use https
// @optionalParam s3VerifySsl string "true" Whether or not to verify https certificates for S3 connections
// @optionalParam s3Endpoint string "http://s3.us-west-1.amazonaws.com" URL for your s3-compatible endpoint
},
type: "ClusterIP",
},
}; // service
I would add a
service:: service
as described here
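For context, a hidden field such as `service:: service` is not emitted in the output by itself, but it stays reachable and overridable from a prototype. The snippet below is a minimal sketch of that pattern; `parts`, `patched`, and the `LoadBalancer` override are hypothetical and only mirror the `tfDeployment:: tfDeployment` and `all::` style used in this PR.

```jsonnet
local parts = {
  local service = {
    apiVersion: "v1",
    kind: "Service",
    metadata: { name: "mnist" },
    spec: { type: "ClusterIP" },
  },
  // Hidden (::) field: exposes the local so callers can reference or patch it.
  service:: service,
  // The list is built from self.service, so overrides are picked up.
  all:: [self.service],
};

// A prototype can patch the service before listing it.
local patched = parts { service+:: { spec+: { type: "LoadBalancer" } } };

{ items: patched.all }
```

Without the hidden field, `service` would remain a local and could not be modified from outside the object.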
},
],
env: []
+ if util.toBool(params.s3Enable) then s3Env else []
I would move this into deployment.mapContainers in the kubeflow/tf-serving/prototypes/tf-serving-aws.jsonnet file as described above
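The idea of applying the S3 environment in the AWS prototype instead of branching on `s3Enable` can be sketched as follows. The local `mapContainers` here is a plain-Jsonnet stand-in written for this example, an assumption about what such a helper does and not the k8s library's implementation; `s3Env` and `baseDeployment` are likewise illustrative.

```jsonnet
// Stand-in helper: apply f to every container in a Deployment's pod template.
local mapContainers(f) = {
  spec+: { template+: { spec+: { containers: std.map(f, super.containers) } } },
};

// Illustrative S3 environment; the PR builds its list from s3params.
local s3Env = [
  { name: "AWS_REGION", value: "us-west-1" },
];

// Cloud-agnostic deployment, as a base libsonnet might define it.
local baseDeployment = {
  kind: "Deployment",
  spec: { template: { spec: { containers: [
    { name: "mnist", env: [] },
  ] } } },
};

// The AWS prototype appends the S3 variables; no s3Enable flag is needed.
baseDeployment + mapContainers(function(c) c { env+: s3Env })
```

The conditional disappears because choosing the AWS prototype is itself the switch.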
httpProxyContainer,
] else [],
volumes: []
+ if gcpParams.gcpCredentialSecretName != "" then
move into a kubeflow/tf-serving/prototypes/tf-serving-gcp.jsonnet file
minor comment otherwise
/lgtm
}, // tfDeployment
tfDeployment:: tfDeployment,

all:: [
do we need this if we're generating the lists in the prototypes?
/retest
argo test failed in kfctl
/retest
argo test failed again, there might be an issue
still argo test failure, but I don't see how this PR is related to that.
I can see the argo pod up at 11:07:47, but it timed out at 07:03, so I am extending the wait time.
kubeflow/tf-serving/util.libsonnet
@@ -1,5 +1,7 @@
// Some useful routines.
{
  local k = import "k.libsonnet",

  // Convert a string to upper case.
  upper:: function(x) {
std library has std.asciiUpper(str)
http://jsonnet.org/ref/stdlib.html
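For reference, a minimal use of the standard-library function in place of a hand-written `upper` helper; the field name `envName` and the input string are only examples.

```jsonnet
// std.asciiUpper is part of the Jsonnet standard library,
// so no custom upper-casing routine is required.
{
  envName: std.asciiUpper("aws_access_key_id"),  // evaluates to "AWS_ACCESS_KEY_ID"
}
```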
/lgtm
@lluunn jfmt errors
Thanks @kkasravi
This looks great.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jlewi
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
not sure why the notebook release workflow is triggered...
/retest
* new template
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* review
* make timeout longer for deploy argo
* fix
For #1264
Move things into jsonnet instead of relying on import.
It's easier to customize.
/cc @jlewi