Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

split coredns dependency to decouple kubeadm and kube-up #80749

Merged
merged 1 commit into from Jul 31, 2019

Conversation

yastij
Copy link
Member

@yastij yastij commented Jul 30, 2019

Signed-off-by: Yassine TIJANI ytijani@vmware.com

What type of PR is this?

/kind bug
/sig release
/sig architecture
/priority important-soon

What this PR does / why we need it: decouples kubeadm and kube-up and allows bumping coredns separately. Unblocks #78033

Which issue(s) this PR fixes: ref #80546

Special notes for your reviewer:

/assign @cblecker @neolit123

Does this PR introduce a user-facing change?:

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


Signed-off-by: Yassine TIJANI <ytijani@vmware.com>
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. sig/release Categorizes an issue or PR as relevant to SIG Release. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Jul 30, 2019
Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like a good way to fix the blockade in #78033
TL;DR: is that we are making a change in how kubeadm is doing coredns upgrades and also bumping the coredns version in kubeadm. but we are not bumping the version in kube-up and CI is failing.

it feels to me that the sync between the two deployers is much desired in this regard, yet it should not be forced, thus this PR.

/lgtm
@kubernetes/sig-cluster-lifecycle-pr-reviews
/assign @justaugustus
cc @MrHohn

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. labels Jul 30, 2019
@yastij
Copy link
Member Author

yastij commented Jul 30, 2019

/retest

@justaugustus
Copy link
Member

Just noting here that I don't want us to get in the habit of bifurcating versions for the same dependency.
However, reading through #78302 suggests that there are a few things unrelated to the CoreDNS version for kube-up.sh that would hold up that PR.

Let's get this merged to unblock the kubeadm work with a future goal to dedupe the dependency.

/lgtm
/approve

/hold
(@yastij -- can you remove the Fixes #80546 from the PR description? There's more work/discussion to be had there and I don't want that to close once this merges)

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 30, 2019
@cblecker
Copy link
Member

I'd echo @justaugustus, that this concerns me.

I'd like some way to make sure that when we get to release time, both officially supported install paths should install the same version of CoreDNS. Is that at least documented somewhere in policy?

I have a couple questions:

@neolit123
Copy link
Member

@justaugustus

Just noting here that I don't want us to get in the habit of bifurcating versions for the same dependency.

that is true, at the same time ideally kubeadm PRs should not be touching /cluster.
perhaps the alternative is to apply the upgrade logic and only then bump the versions in both deployers in a single PR to comply with the new verify/dep test.

However, reading through #78302 suggests that there are a few things unrelated to the CoreDNS version for kube-up.sh that would hold up that PR.

i see the same. will defer to the list of folks bellow [1].

@cblecker

Where are we testing the kubeadm deployment path? Do we have proper coverage? (#78033 (comment))

the test jobs here exercise both the deploy and deploy + upgrade paths for kubeadm using coredns:
https://k8s-testgrid.appspot.com/sig-cluster-lifecycle-kubeadm

If we are splitting the upgrade work into two PRs, when is the deadline for the kube-up path to go in?

[1] possibly @rajansandeep , @chrisohaver and @MrHohn can answer that.
i think the plan was - 1.16 for both deployers.

If we don't make this deadline, will we revert the kubeadm bump?

that's a good question. i personally think that we should not. there has been a few important bug fixes in coredns after 1.3.1 including scalability and it's time to move on. if kube-up is delayed maybe it can catch up next cycle.

to quote myself from earlier:

it feels to me that the sync between the two deployers is much desired in this regard, yet it should not be forced

@cblecker
Copy link
Member

cblecker commented Jul 31, 2019

Having it be enforced technically in such a way that creates unreviewably large PRs, I can see being a problem. But considering that all our presubmits use kube-up right now, having code merging into master that isn't being tested with all components doesn't seem right to me... especially letting a release go out without it.

The board of jobs you linked is all for periodic (ci-*) jobs, rather than presubmit (pull-*).

@cblecker
Copy link
Member

To quote my comment on the other PR (#78033 (comment)):

I'm concerned about any other possible regression (such as performance) that we aren't testing by moving kubeadm before kube-up.

@neolit123
Copy link
Member

But considering that all our presubmits use kube-up right now, having code merging into master that isn't being tested with all components doesn't seem right to me...especially letting a release go out without it.

a few of points:

  • AFAIK we don't have upgrade presubmits for kube-up either and these coredns PRs need that.
  • kubeadm presubmits that are always_run: true saw resistance in the past. maybe the work @BenTheElder is doing right now will make this happen.
  • the PR for kubeadm only touches /cmd/kubeadm, which is technically an addon modification for a certain deployer and it's up to the deployer maintainers to resolve regressions.
  • i was actually trying to completely disable kube-up based presubmits on the /cmd/kubeadm namespace, but run_if_changed makes this difficult.
  • enabling kubeadm presubmits for run_if_changed: /cmd/kubeadm` is viable today.

The board of jobs you linked is all for periodic (ci-) jobs, rather than presubmit (pull-).

these periodics are pretty well monitored.

@rajansandeep
Copy link
Contributor

@cblecker @neolit123 I will be updating the upgrade logic for CoreDNS in kube-up (soon) the same way the upgrade migration logic is underway for kubeadm.
Since this is done is separate PRs, it is better that the CoreDNS version is bumped up on the respective PRs.
This is mainly to avoid what happened in the last release, where I had initially updated the CoreDNS version and then we started to observe CI test failures (the PRs for the migration logic wasn’t merged yet), causing panic and eventually we had to revert back the CoreDNS version.

I foresee the split to update the version to happen only this time since this is a major change to how the upgrade in CoreDNS will be handled.
It is the intention to always try and ensure that the CoreDNS version always be in sync in kubeadm and kube-up.

@neolit123
Copy link
Member

neolit123 commented Jul 31, 2019

I'm concerned about any other possible regression (such as performance) that we aren't testing by moving kubeadm before kube-up.

that is true. we do not have performance tests in the kubeadm jobs, so some of the kube-up tests might catch that.

@chrisohaver do you have performance tests on the coredns repository side?
(i'm sure there should be some).

@cblecker
Copy link
Member

I foresee the split to update the version to happen only this time since this is a major change to how the upgrade in CoreDNS will be handled.
It is the intention to always try and ensure that the CoreDNS version always be in sync in kubeadm and kube-up.

Then perhaps we revert this change after this upgrade is done? Either way, the intent here is to update kube-up in short order, but just not the same exact PR. That addresses my main concern.. I just don't want to forget kube-up, or leave it to really really late in the cycle and then have to make tough decisions.

Do you have a time estimate on the kube-up version of this upgrade work @rajansandeep? Is there an PR open already (it's okay if not, but if there is I'd like to follow it)?

cc: @lachie83 -- Just to make sure this is on your radar.

@rajansandeep
Copy link
Contributor

@cblecker yes, I think it’s safe to say that this PR can be reverted after the upgrade in kubeadm(#78033) and Kube-up(#78302) is done.

I’ll most likely update the PR for kube-up tomorrow so that there is enough time to test out the upgrade/downgrade in the CI tests.

@cblecker
Copy link
Member

@rajansandeep Fantastic! All my concerns are more than addressed. Will follow along in those PRs.

/lgtm
/approve
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 31, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cblecker, justaugustus, yastij

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 31, 2019
@BenTheElder
Copy link
Member

sounds good to me

this problem [coordinating versions and testing ...] will be tricker as more things move out of the mono repo ... something to think about going forward 🤔

@k8s-ci-robot k8s-ci-robot merged commit 8c8c411 into kubernetes:master Jul 31, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.16 milestone Jul 31, 2019
@neolit123
Copy link
Member

this problem [coordinating versions and testing ...] will be tricker as more things move out of the mono repo ... something to think about going forward thinking

once kubeadm moves out of tree we are unlikely to enforce matching of the same coredns version as kube-up, unless the coredns maintainers prefer that.

then again, the hardcoded coredns addon in kubeadm itself will eventually move outside of kubeadm, and into the "cluster-addons" project, which will probably allow the users to install any version, but still have a sane default.

@BenTheElder
Copy link
Member

once kubeadm moves out of tree we are unlikely to enforce matching of the same coredns version as kube-up, unless the coredns maintainers prefer that.

right, that just means there may be even more spread on what we test and more places to need to go track down updates...

then again, the hardcoded coredns addon in kubeadm itself will eventually move outside of kubeadm, and into the "cluster-addons" project, which will probably allow the users to install any version, but still have a sane default.

that would be very nice I'm hopeful for consuming this sort of thing in kind as well :-)

@justaugustus justaugustus added this to Done (1.16) in SIG Release Aug 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/release Categorizes an issue or PR as relevant to SIG Release. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants