Create test jobs to validate that debs/rpms can be installed #821

Open
justaugustus opened this issue Jul 8, 2019 · 29 comments
Labels
area/release-eng Issues or PRs related to the Release Engineering subproject lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/release Categorizes an issue or PR as relevant to SIG Release.

Comments

@justaugustus
Member

Per @spiffxp's review comment:

Is there an open issue for this somewhere? Would rather we use an issue to track this rather than commented out code.

Creating an issue instead of having this as a code TODO.

# TODO(justaugustus): Add tests to verify installation of debs and rpms
# We comment out the current job, as the logic needs to be modified for it to be useful.
# The job has failed consistently since the 1.14.0 release:
# - https://testgrid.k8s.io/sig-release-misc#periodic-packages-install-deb
#
# Example:
#- name: periodic-kubernetes-e2e-packages-install-deb
#  interval: 24h
#  decorate: true
#  extra_refs:
#  - org: kubernetes
#    repo: kubeadm
#    base_ref: master
#    path_alias: k8s.io/kubeadm
#  spec:
#    containers:
#    - image: gcr.io/k8s-testimages/kubekins-e2e:latest-master
#      imagePullPolicy: Always
#      command:
#      - ./tests/e2e/packages/verify_packages_install_deb.sh

/assign
/area release-eng
/milestone v1.16
/sig release
/priority important-soon

@k8s-ci-robot k8s-ci-robot added this to the v1.16 milestone Jul 8, 2019
@k8s-ci-robot k8s-ci-robot added area/release-eng Issues or PRs related to the Release Engineering subproject sig/release Categorizes an issue or PR as relevant to SIG Release. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jul 8, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 6, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 5, 2019
@justaugustus
Member Author

/remove-lifecycle rotten
/unassign

@kubernetes/release-engineering -- If someone is interested in taking this, go for it!
/milestone v1.18

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Nov 28, 2019
@k8s-ci-robot k8s-ci-robot modified the milestones: v1.16, v1.18 Nov 28, 2019
@xmudrii
Member

xmudrii commented Nov 28, 2019

@justaugustus This might be an interesting one. Is it okay if I give it a try and ask for some initial guidance?

@justaugustus
Member Author

Sounds good, @xmudrii!
We can discuss next week.

/assign @xmudrii

@xmudrii
Member

xmudrii commented Dec 18, 2019

I want to start working on this, but I need a few pointers to get started.

  • What is this job expected to do?
  • In the kubeadm repo, there is still the script that the old job used. Should we re-use that script or build a new one?
    • The old script seems to cover only deb packages. We should eventually extend it (or add a new one) for rpm packages.
  • The testgrid link in the comment doesn't work anymore since it's old. Does anyone remember why the old job was failing?

@justaugustus
Member Author

@xmudrii -- Sorry I didn't get to this this week. Still working through the post-holiday queue. Will write something up for you next week.

@xmudrii
Member

xmudrii commented Mar 16, 2020

@justaugustus Reminder to take a look at this issue if you have some time. 🙂

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 14, 2020
@xmudrii
Member

xmudrii commented Jun 14, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 14, 2020
@justaugustus justaugustus added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Aug 6, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 4, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 2, 2021
@xmudrii
Member

xmudrii commented Feb 2, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 2, 2021
@LappleApple

@xmudrii Hey, checking in on this -- do your questions from December 2019 still stand/do you still need the help you requested back then? @sladyn98 is interested in this item as well.

/assign @sladyn98

@xmudrii
Member

xmudrii commented Apr 8, 2021

I discussed this issue with @justaugustus a few months ago, and I'll try to recap our conversation.

There are two ways to fix this issue: a simple way and a complete way.

The simple way means that we move the old script to this repo and make it work. The script can be found here: https://github.com/kubernetes/kubeadm/blob/master/tests/e2e/packages/verify_packages_install_deb.sh

The complete way means that we do something along these lines:

  • generate the packages
  • verify the packages
  • install the packages on a clean system
  • run e2e tests against the new cluster

The complete way is much more complicated and it's unclear what it should look like. We should start simple: port the old script and create a job that uses it (see the rough sketch below).
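
For context, a very rough sketch of what the simple way could boil down to on a clean Debian/Ubuntu system. This is only an illustration: the repository setup lines reflect the historical apt.kubernetes.io repo, and the package list is the usual kubeadm/kubelet/kubectl set, not the actual contents of the linked script.

set -euo pipefail

# Configure the (historical) Kubernetes apt repository on a clean system.
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
echo "deb https://apt.kubernetes.io/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
apt-get update

# The actual test: can the debs be installed at all?
apt-get install -y kubeadm kubelet kubectl
kubeadm version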

@sladyn98

So if I understand correctly, the steps we are likely to take are:
a) Move the .sh file to this repo: I see a verify-published.sh inside the hacks/packages folder, so would we be moving this script there as well?
b) Create a job that runs this .sh file: Would we be adding a step to the cloudbuild.yaml file which runs this after a certain step?

@xmudrii
Member

xmudrii commented Apr 11, 2021

a) Move the .sh file to this repo: I see a verify-published.sh inside the hacks/packages folder, so would we be moving this script there as well?

I would only add that we might want to slightly revisit the script. I see that it has a part for verifying stable packages. I think that this might be covered by verify-published.sh, and in that case, we should drop that part from that script.

b) Create a job that runs this .sh file: Would we be adding a step to the cloudbuild.yaml file which runs this after a certain step?

I'm not sure that we need to update cloudbuild.yaml. I think that, at least at the beginning, a periodic job should be enough. Here's what the periodic job for verifying published debs (verify-published.sh) looks like: https://github.com/kubernetes/test-infra/blob/675a42cb78d8eb6239cf4117e39a78f21c4e6891/config/jobs/kubernetes/release/release-config.yaml#L442-L463

@sladyn98

So, looking at the current script, it does not check the versions of the installed packages and does not check kubeadm, kubelet, kubectl, etc.

# TODO: Add logic to support checking the existence of all packages, not just kubeadm
#       The full list is: kubeadm, kubelet, kubectl, cri-tools, kubernetes-cni

So would we be better off keeping the check, or maybe adding it to verify-published.sh and then dropping it from the new script?
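
For reference, a minimal sketch of what that existence check could look like. The package names are taken from the TODO above; this assumes a Debian-based system where dpkg and dpkg-query are available.

packages="kubeadm kubelet kubectl cri-tools kubernetes-cni"
missing=0
for pkg in ${packages}; do
  if dpkg -s "${pkg}" >/dev/null 2>&1; then
    # Package is installed; print its version for the job log.
    echo "OK: ${pkg} $(dpkg-query -W -f='${Version}' "${pkg}")"
  else
    echo "MISSING: ${pkg}" >&2
    missing=1
  fi
done
exit "${missing}"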

@xmudrii
Member

xmudrii commented Apr 29, 2021

We've discussed this issue in the #release-management channel on Slack. I'll try to recap the most important points.

The original script from the kubeadm repo doesn't work anymore because we are no longer building debs and rpms for each CI build. Historically, we would build debs/rpms for each CI build and put them in a bucket, from where anyone could grab them. However, we stopped doing this a while ago, so it's not possible to use the original script at all.

Instead, we want to use kubepkg to build debs and rpms, and then test those packages. However:

  1. kubepkg is not complete yet (some functionality is missing), so it might not be possible to use it for building and testing packages
  2. kubepkg is currently not used anywhere, so we would end up testing packages that are not used at all. That doesn't make much sense.

Because of the reasons stated above, we've concluded that it makes sense to put this issue on hold until we start using packages built with kubepkg.

Also, @puerco proposed that instead of running this job as a periodic, we run it as part of the release process (e.g. when staging the release).

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Apr 29, 2021
@afbjorklund

afbjorklund commented Aug 29, 2022

This job should probably also verify that the installed software (from the packages) has the expected version?

Currently the kubernetes-cni 0.8.7 deb contains a repackaging of the 0.8.6 cni-plugins tgz, i.e. the wrong one:

https://apt.kubernetes.io/pool/kubernetes-cni_0.8.7-00_amd64_ca2303ea0eecadf379c65bad855f9ad7c95c16502c0e7b3d50edcb53403c500f.deb

https://github.com/containernetworking/plugins/releases/download/v0.8.6/cni-plugins-linux-amd64-v0.8.6.tgz
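
A rough sketch of how a job could catch this kind of mismatch without installing anything system-wide; this is only an illustration, assuming a Debian environment with curl and dpkg-deb available. The deb URL is the one linked above, and the bundled plugins report their version via --version.

set -euo pipefail
deb_url="https://apt.kubernetes.io/pool/kubernetes-cni_0.8.7-00_amd64_ca2303ea0eecadf379c65bad855f9ad7c95c16502c0e7b3d50edcb53403c500f.deb"
workdir="$(mktemp -d)"
curl -fsSL "${deb_url}" -o "${workdir}/kubernetes-cni.deb"

# Extract the package instead of installing it, then ask one of the
# bundled plugins which upstream version it was built from.
dpkg-deb -x "${workdir}/kubernetes-cni.deb" "${workdir}/rootfs"
"${workdir}/rootfs/opt/cni/bin/portmap" --version   # prints e.g. "CNI portmap plugin v0.8.6"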


@saschagrunert
Member

Failed again, for the 1.1.1 deb. It also contains 0.8.6:

https://apt.kubernetes.io/pool/kubernetes-cni_1.1.1-00_amd64_910d5920eab8d7cf9a66cc4b0faf998fb3ee909bd6ee0bd7d1c6605b6a82f98f.deb

Ugh, this may fix it: #2673

@afbjorklund

afbjorklund commented Oct 9, 2022

@saschagrunert: would it be possible to release new packages to apt, with the correct contents?

kubernetes-cni_0.8.6-00 (this one seems OK, no new release needed)
kubernetes-cni_0.8.7-01
kubernetes-cni_1.1.1-01

EDIT: These would be imaginary new packages, to replace the old ones with the wrong content:

$ apt list kubernetes-cni -a
Listing... Done
kubernetes-cni/kubernetes-xenial 1.1.1-00 amd64
kubernetes-cni/kubernetes-xenial 0.8.7-00 amd64
kubernetes-cni/kubernetes-xenial 0.8.6-00 amd64
kubernetes-cni/kubernetes-xenial 0.7.5-00 amd64
kubernetes-cni/kubernetes-xenial 0.6.0-00 amd64
kubernetes-cni/kubernetes-xenial 0.5.1-00 amd64
kubernetes-cni/kubernetes-xenial 0.3.0.1-07a8a2-00 amd64

@saschagrunert
Member

@saschagrunert: would it be possible to release new packages to apt, with the correct contents?

kubernetes-cni_0.8.6-00 (this one seems OK, no new release needed)
kubernetes-cni_0.8.7-01
kubernetes-cni_1.1.1-01

They should be automatically generated with the October patch releases. Not sure if we have to bump the package rev, though.

@afbjorklund

afbjorklund commented Oct 10, 2022

Not sure if we have to bump the package rev, though.

If you do a "stealth" update, the old ones might be used from the cache (depending on how people set up their mirroring, when the filenames are the same).

EDIT: My bad, the filename would change:
pool/kubernetes-cni_1.1.1-00_amd64_910d5920eab8d7cf9a66cc4b0faf998fb3ee909bd6ee0bd7d1c6605b6a82f98f.deb

Also, I don't think apt and yum will see it as an update if it has the same EVR?

But we can try it; something simple like /opt/cni/bin/bridge --version should do.
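
For example, a minimal sketch of that check on a host where the package is already installed (dpkg-query and the plugin path are assumed to be as shown elsewhere in this thread):

pkg_version="$(dpkg-query -W -f='${Version}' kubernetes-cni)"    # e.g. "1.1.1-00"
bin_version="$(/opt/cni/bin/bridge --version)"                   # e.g. "CNI bridge plugin v0.8.6"
echo "package: ${pkg_version}"
echo "binary:  ${bin_version}"

# Compare the upstream part of the package version (strip the -00 revision)
# against what the installed binary reports.
case "${bin_version}" in
  *"v${pkg_version%%-*}"*) echo "OK: versions match" ;;
  *) echo "MISMATCH: package and binary versions differ" >&2; exit 1 ;;
esac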

@saschagrunert
Member

@afbjorklund I'm wondering why we are at revision 01 for those packages; it should be 00 per:

revision = "00"

@afbjorklund

We are at 00; unfortunately the contents are 0.8.6. So I was thinking that 01 would be the next revision after that?

Normally the default Debian revision is "0", so I'm not sure where the extra zero came from to start with...

@afbjorklund

afbjorklund commented Dec 11, 2022

Still happening in Kubernetes 1.26.0:

anders@lima-k8s:~$ apt list | grep kubernetes-xenial

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

cri-tools/kubernetes-xenial,now 1.25.0-00 amd64 [installed]
docker-engine/kubernetes-xenial 1.11.2-0~xenial amd64
kubeadm/kubernetes-xenial,now 1.26.0-00 amd64 [installed]
kubectl/kubernetes-xenial,now 1.26.0-00 amd64 [installed]
kubelet/kubernetes-xenial,now 1.26.0-00 amd64 [installed]
kubernetes-cni/kubernetes-xenial,now 1.1.1-00 amd64 [installed]
rkt/kubernetes-xenial 1.29.0-1 amd64
anders@lima-k8s:~$ /usr/bin/crictl --version
crictl version v1.25.0
anders@lima-k8s:~$ /opt/cni/bin/portmap --version
CNI portmap plugin v0.8.6

Due to the packages not being bumped.
