Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate the TfImage field #330

Closed
DjangoPeng opened this issue Jan 19, 2018 · 7 comments
Closed

Deprecate the TfImage field #330

DjangoPeng opened this issue Jan 19, 2018 · 7 comments

Comments

@DjangoPeng
Copy link
Member

In the current CRD, we have a TfImage field here for TensorFlow replicas. While the pod template has a Image field as well.

According to the discussion of TfImage here, we'd like to deprecate it.

/cc @jlewi @gaocegege @ScorpioCPH

@jlewi
Copy link
Contributor

jlewi commented Jan 20, 2018

There are currently two use cases for TFImage

  • Using it for TensorBoard
  • Using it for a vanilla parameter server

I'd be ok with changing how we handle both of these so that we no longer need TFImage.

For TensorBoard, I think it might make sense to launch TB separately. The original reason for making TB a replica was as a convenient way to tie lifetime of TB to lifetime of job. In practice, I think that might be less useful because if you forget to set TB in the TfJob you probably don't want to restart the job just to launch TB.

I suspect a better way to tie lifetime of TB to the job might be to just set the owner reference on the TB job.

The hope though is that TB moves to a model where data is stored in a DB and served by stateless webservers which would remove the need to launch individual TB instances for each results.

For default parameter servers, this is just sugar and we can find other (possibly) better ways to provide that sugar e.g ksonnet template a Docker image containing the code.

So I vote to get rid of it. Since it simplifies things.

@gaocegege
Copy link
Member

gaocegege commented Feb 16, 2018

I think we should wait until #347 is solved.

Now our pod template spec is not required because we have TfImage and TfPort, if we deprecate it, we should add validation and defaults.

@gaocegege
Copy link
Member

Assign to me since no one is interested in it and it is urgent. /assign @gaocegege

@gaocegege
Copy link
Member

/assign gaocegege

@jlewi
Copy link
Contributor

jlewi commented Mar 7, 2018

@gaocegege Is this still blocked? Any particular reason you think its P1 and not P2?

It will definitely be fixed as part of the V2 API. Does it make sense to fix in the v1alpha1 API?

@gaocegege
Copy link
Member

@jlewi I will fix it in v1alpha2, and it could be p2 since it will not block the main use case.

@gaocegege
Copy link
Member

Closed by #492

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants