all distributions of Kubeflow 1.1 have incorrect container images #1553
Comments
@jlewi you might want to be across this |
/priority p0 |
@thesuperzapper I would suggest breaking this issue up into specific issues for each application and assigning them to the application owners. Application owners are responsible for ensuring their manifests ship the right docker image release. |
I spot-checked the dates of the commits for two of the images:
The apps CD pipeline looks like it has entries to update the base_v3 entry for most manifests.
Likewise, there are auto-opened PRs to update the manifests.
So it looks to me like application owners didn't review the PRs and get them merged, and they missed the release train. |
@thesuperzapper Can you and the other proposed notebook lead fix this for notebooks? |
/assign @thesuperzapper |
For anyone who comes across this issue and needs a fix, add this to your images:
- name: gcr.io/kubeflow-images-public/admission-webhook
  newName: gcr.io/kubeflow-images-public/admission-webhook
  newTag: v1.1.0-g3ac3d08b
- name: gcr.io/kubeflow-images-public/centraldashboard
  newName: gcr.io/kubeflow-images-public/centraldashboard
  newTag: v1.1.0-g35d7484a
- name: gcr.io/kubeflow-images-public/jupyter-web-app
  newName: gcr.io/kubeflow-images-public/jupyter-web-app
  newTag: v1.1.0-gd3377cbd
- name: gcr.io/kubeflow-images-public/notebook-controller
  newName: gcr.io/kubeflow-images-public/notebook-controller
  newTag: v1.1.0-gd3377cbd
- name: gcr.io/kubeflow-images-public/kfam
  newName: gcr.io/kubeflow-images-public/kfam
  newTag: v1.1.0-g9f3bfd00
- name: gcr.io/kubeflow-images-public/profile-controller
  newName: gcr.io/kubeflow-images-public/profile-controller
  newTag: v1.1.0-ga49f658f
- name: gcr.io/kubeflow-images-public/pytorch-operator
  newName: gcr.io/kubeflow-images-public/pytorch-operator
  newTag: v1.1.0-gd596e904
- name: gcr.io/kubeflow-images-public/tf_operator
  newName: gcr.io/kubeflow-images-public/tf_operator
  newTag: v1.1.0-g92389064 |
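The override entries above follow a fixed name/newName/newTag pattern, so they could be generated from a plain list of image references instead of being typed by hand. This is only a minimal sketch, assuming the entries belong under a kustomize images: list; the function name kustomize_image_overrides is hypothetical and not part of kfctl or the manifests repository.

```python
def kustomize_image_overrides(image_refs):
    """Render kustomize `images:` entries that pin each image to its tag.

    Hypothetical helper: each item in image_refs is expected to be NAME:TAG.
    """
    lines = ["images:"]
    for ref in image_refs:
        # Split on the last colon so the registry path stays intact.
        name, _, tag = ref.rpartition(":")
        if not name or not tag:
            raise ValueError(f"expected NAME:TAG, got {ref!r}")
        lines.append(f"- name: {name}")
        lines.append(f"  newName: {name}")
        lines.append(f"  newTag: {tag}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Two of the corrected references listed in this issue.
    print(kustomize_image_overrides([
        "gcr.io/kubeflow-images-public/kfam:v1.1.0-g9f3bfd00",
        "gcr.io/kubeflow-images-public/tf_operator:v1.1.0-g92389064",
    ]), end="")
```

Splitting on the last colon is enough for the gcr.io references in this thread; a reference without an explicit tag raises an error rather than being pinned to a guessed value.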
@thesuperzapper ping? What's the plan for notebooks? |
@jlewi, other than fixing it for Kubeflow 1.2, we don't have a plan right now. Can I ask why we have chosen to use a branch like The current |
@thesuperzapper I'm not sure what you are referring to. We do have immutable tags for each specific patch release. We use branches, as opposed to immutable tags on master, because, following standard practice, we want to allow the releases and master to evolve independently in case we need to support fixes.
It is up to the KFDef owners to decide whether to pin to a specific tag or pull from the head of the branch depending on what they are optimizing for. |
@thesuperzapper @kimwnasptd what's the plan for getting the manifests on master upgraded? |
@jlewi I'll take a look at the current open PRs that update the controller and jupyter web app and test the latest built tags. If they work as expected then I'll move on with merging the corresponding PRs and closing the older ones. |
This is a good point. I think the last release didn't handle v1.1-branch assets well. We fixed a few issues in the past, such as the profile-controller and training images. It's the WG owners' responsibility to update them before the release. This definitely needs more collaboration. |
@kubeflow/wg-notebook-leads and @thesuperzapper Any update on this for notebooks? Do you have a plan for getting the notebook manifests updated before the release? |
@jlewi @Jeffwan for the controller we have #1570, which I propose to drop, since the changes were introduced from an irrelevant PR. So the controller should stay on the current version. For the web app we have #1531, which contains @thesuperzapper's changes for tolerations and affinity. Merging this PR alone will not suffice, since we will also need a follow-up PR to update the jupyter web app's config map in the manifests. If @thesuperzapper has the cycles to make the PR that updates the config map in the manifests, then we can merge #1531 and be ready for 1.2. If not, then let's stick to the current manifests and aim for an overall update in 1.3, which will include the updated jupyter web app code kubeflow/kubeflow#5310. |
Thanks @kimwnasptd for the update. Do you have a plan for how you will publish docker images going forward? |
Currently, this repository (both master and v1.1-branch) has incorrect images for some components of Kubeflow.
As far as I can see, this should affect all stacks at least partially, including: GCP, AWS, Azure, IBM, Vanilla K8S.
Most of these errors have occurred because the stacks have moved to the base_v3 versions of the components, whereas the maintainers/bot have only updated the base spec. But in some cases, clearly no one is maintaining the YAML for the component.
In terms of fixing this, we obviously need the owners of each section to fix their manifests.
Assuming we fix these issues and cherry-pick them into the v1.1-branch, there will still be an issue, as kfdef files target the branch directly rather than a tag, so users may not realise that if they rerun kfctl build ... they will get different manifests.
Related issues: #1496
admission-webhook
  current: gcr.io/kubeflow-images-public/admission-webhook:vmaster-gaf96e4e3
  correct: gcr.io/kubeflow-images-public/admission-webhook:v1.1.0-g3ac3d08b
centraldashboard
  current: gcr.io/kubeflow-images-public/centraldashboard:vmaster-gf39279c0
  correct: gcr.io/kubeflow-images-public/centraldashboard:v1.1.0-g35d7484a
jupyter-web-app
  current: gcr.io/kubeflow-images-public/jupyter-web-app:vmaster-gd9be4b9e
  correct: gcr.io/kubeflow-images-public/jupyter-web-app:v1.1.0-gd3377cbd
notebook-controller
  current: gcr.io/kubeflow-images-public/notebook-controller:vmaster-gf39279c0
  correct: gcr.io/kubeflow-images-public/notebook-controller:v1.1.0-gd3377cbd
kfam
  current: gcr.io/kubeflow-images-public/kfam:vmaster-gf3e09203
  correct: gcr.io/kubeflow-images-public/kfam:v1.1.0-g9f3bfd00
profile-controller
  current: gcr.io/kubeflow-images-public/profile-controller:vmaster-g34aa47c2
  correct: gcr.io/kubeflow-images-public/profile-controller:v1.1.0-ga49f658f
pytorch-operator
  current: gcr.io/kubeflow-images-public/pytorch-operator:vmaster-gd596e904
  correct: gcr.io/kubeflow-images-public/pytorch-operator:v1.1.0-gd596e904
tf_operator
  current: gcr.io/kubeflow-images-public/tf_operator:vmaster-ga2ae7bff
  correct: gcr.io/kubeflow-images-public/tf_operator:v1.1.0-g92389064
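Every incorrect image listed above carries a vmaster- tag, so a quick text scan of rendered manifests may help spot components that still need fixing. The following is a rough sketch, not an existing tool: the function name find_master_tagged_images is hypothetical, and it assumes the image and tag appear together on one line as IMAGE:TAG.

```python
import re

# Matches IMAGE:TAG where the tag was built from master, e.g. "vmaster-gaf96e4e3".
MASTER_IMAGE = re.compile(r"([\w./-]+):(vmaster-g[0-9a-f]+)")


def find_master_tagged_images(manifest_text):
    """Return sorted, de-duplicated (image, tag) pairs tagged vmaster-g<hash>."""
    return sorted(set(MASTER_IMAGE.findall(manifest_text)))


if __name__ == "__main__":
    sample = (
        "image: gcr.io/kubeflow-images-public/kfam:vmaster-gf3e09203\n"
        "image: gcr.io/kubeflow-images-public/kfam:v1.1.0-g9f3bfd00\n"
    )
    for image, tag in find_master_tagged_images(sample):
        print(f"{image} is pinned to {tag}")
```

This scan would miss manifests where the tag is set separately from the image name (for example through a newTag field), so those would need their own check.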