TFServing deployment should support GPUs #292

Closed
jlewi opened this issue Feb 24, 2018 · 1 comment

Comments

@jlewi
Contributor

jlewi commented Feb 24, 2018

Our TFServing component should have options to support serving with GPUs.

@jlewi
Contributor Author

jlewi commented Mar 7, 2018

/assign jlewi
/unassign @lluunn

I've started work on this.

@k8s-ci-robot k8s-ci-robot assigned jlewi and unassigned lluunn Mar 7, 2018
jlewi added a commit to jlewi/kubeflow that referenced this issue Mar 7, 2018
…clouds.

* To support GPUs and specific clouds we refactor the component to make
  it easy to override the parts we care about (e.g. container environment
  variables, resources, etc...).

* We do this by moving the things we care about up to the root of tf-serving.libsonnet.

* We rely on jsonnet late binding (http://jsonnet.org/docs/tutorial.html).
Late binding allows us to define dictionaries (e.g. params, tfServingContainer)
in tf-serving.libsonnet. We can then create manifests based on those
objects (e.g. tfDeployment). We can then override values (e.g. params) and
the derived objects (e.g. tfDeployment) will use the overridden values.

* We introduce a parameter "cloud" which allows us to control which "prototype" to use. We use this to apply cloud-specific customizations, such as setting
  the environment variables on AWS to use S3.

* Late binding also makes it possible to select an appropriate default image
  based on whether GPUs are being used or not while still allowing the
  user to override the images.

* We remove parameter definitions from the prototypes. The set of parameters
  ends up being conditional based on flags like cloud and GPUs, so it's
  not clear how scalable that was.

Related Issues:

Fix kubeflow#292
jlewi added a commit to jlewi/kubeflow that referenced this issue Mar 8, 2018
…clouds.

* To support GPUs and specific clouds we refactor the component to make
  it easy to override the parts we care about (e.g. container environment
  variables, resources, etc...).

* We do this by moving the things we care about up to the root of tf-serving.libsonnet.

* We rely on jsonnet late binding (http://jsonnet.org/docs/tutorial.html).
Late binding allows us to define dictionaries (e.g. params, tfServingContainer)
in tf-serving.libsonnet. We can then create manifests based on those
objects (e.g. tfDeployment). We can then override values (e.g. params) and
the derived objects (e.g. tfDeployment) will use the overridden values.

* We introduce a parameter "cloud" which allows us to control which "prototype" to use. We use this to apply cloud-specific customizations, such as setting
  the environment variables on AWS to use S3.

* Late binding also makes it possible to select an appropriate default image
  based on whether GPUs are being used or not while still allowing the
  user to override the images.

* We remove parameter definitions from the prototypes. The set of parameters
  ends up being conditional based on flags like cloud and GPUs, so it's
  not clear how scalable that was.

* Use camelCase not underscores for parameters.
  See kubeflow#303.

Related Issues:

Fix kubeflow#292

Update the test to work with the changes.

* Parameters are now camelCase. They also aren't parameters of the
  prototype so we can't set them in the call to generate.

* So we need to modify deploy to take a list of the parameters to set
  on the component.
k8s-ci-robot pushed a commit that referenced this issue Mar 8, 2018
…clouds (#387)

* Refactor the TFServing component to better support GPUs and specific clouds.

* To support GPUs and specific clouds we refactor the component to make
  it easy to override the parts we care about (e.g. container environment
  variables, resources, etc...).

* We do this by moving the things we care about up to the root of tf-serving.libsonnet.

* We rely on jsonnet late binding (http://jsonnet.org/docs/tutorial.html).
Late binding allows us to define dictionaries (e.g. params, tfServingContainer)
in tf-serving.libsonnet. We can then create manifests based on those
objects (e.g. tfDeployment). We can then override values (e.g. params) and
the derived objects (e.g. tfDeployment) will use the overridden values.

* We introduce a parameter "cloud" which allows us to control which "prototype" to use. We use this to apply cloud-specific customizations, such as setting
  the environment variables on AWS to use S3.

* Late binding also makes it possible to select an appropriate default image
  based on whether GPUs are being used or not while still allowing the
  user to override the images.

* We remove parameter definitions from the prototypes. The set of parameters
  ends up being conditional based on flags like cloud and GPUs, so it's
  not clear how scalable that was.

* Use camelCase not underscores for parameters.
  See #303.

Related Issues:

Fix #292

Update the test to work with the changes.

* Parameters are now camelCase. They also aren't parameters of the
  prototype so we can't set them in the call to generate.

* So we need to modify deploy to take a list of the parameters to set
  on the component.

* jsonnet format.
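
For readers unfamiliar with jsonnet late binding, the pattern described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of tf-serving.libsonnet: the image names, the env var value, the awsOverrides object, and the forCloud helper are all hypothetical, and the Deployment is trimmed rather than a complete manifest. Only the names params, tfServingContainer, tfDeployment, cloud, and numGpus echo the commit message and the camelCase convention it mentions.

```jsonnet
// Hypothetical sketch of the late-binding pattern; all values are illustrative.
local tfServing = {
  // Hidden fields (::) are not emitted in the output but can still be overridden.
  params:: {
    name: "model-server",
    numGpus: 0,
    defaultCpuImage: "example.com/tf-serving:cpu",  // assumed image name
    defaultGpuImage: "example.com/tf-serving:gpu",  // assumed image name
    // The default image is chosen from the final value of numGpus,
    // yet a user can still override modelServerImage directly.
    modelServerImage:
      if self.numGpus > 0 then self.defaultGpuImage else self.defaultCpuImage,
  },

  // Kept at the root of the object so it is easy to override.
  tfServingContainer:: {
    name: $.params.name,
    image: $.params.modelServerImage,
    env: [],
    resources:
      if $.params.numGpus > 0 then {
        limits: { "nvidia.com/gpu": $.params.numGpus },
      } else {},
  },

  // Derived manifest (trimmed); it picks up whatever the final
  // params and tfServingContainer turn out to be.
  tfDeployment: {
    apiVersion: "apps/v1",
    kind: "Deployment",
    metadata: { name: $.params.name },
    spec: {
      template: {
        spec: { containers: [$.tfServingContainer] },
      },
    },
  },
};

// Hypothetical cloud-specific customization layered on top of the base.
local awsOverrides = {
  tfServingContainer+:: {
    env+: [{ name: "AWS_REGION", value: "us-west-2" }],
  },
};

// Hypothetical helper that selects the variant from the "cloud" parameter.
local forCloud(cloud) =
  if cloud == "aws" then tfServing + awsOverrides else tfServing;

// Overriding only numGpus switches the derived container to the GPU image
// and adds the GPU resource limit, without touching tfDeployment itself.
forCloud("aws") { params+:: { numGpus: 1 } }
```

Because `$` and `self` are resolved against the final composed object, the override at the bottom changes the image and resources inside tfDeployment even though tfDeployment was defined before the override was applied.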
yanniszark pushed a commit to arrikto/kubeflow that referenced this issue Feb 15, 2021
Signed-off-by: YujiOshima <yuji.oshima0x3fd@gmail.com>
elenzio9 pushed a commit to arrikto/kubeflow that referenced this issue Oct 31, 2022