kubeadm: fix offline and air-gapped support #67397
9955c1b
to
6d1202e
Compare
@@ -56,9 +56,12 @@ func SetInitDynamicDefaults(cfg *kubeadmapi.InitConfiguration) error {
// This is the same logic as the API Server uses
ip, err := netutil.ChooseBindAddress(addressIP)
if err != nil {
return err
defaultBind := "127.0.0.1"
The only problem I see with this is that ChooseBindAddress can return more than one kind of error. We should really be returning a specific error type and only defaulting on the error type we expect; otherwise we might ignore an actual error.
that makes sense.
but given it returns a string should we just string compare for "no default routes", like the unit tests are doing?
I don't like comparing strings for errors; I'd prefer type checking as described in https://blog.golang.org/error-handling-and-go
ok, i will add the error type check.
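A minimal sketch of the type check being asked for, as opposed to matching on the error string. The `noRoutesError` type, the `chooseBindAddress` lookup, and the `fallback` parameter are hypothetical stand-ins for illustration only; they are not the real `netutil` API.

```go
package main

import (
	"errors"
	"fmt"
)

// noRoutesError is a hypothetical dedicated error type, standing in for the
// "no default routes" case discussed above.
type noRoutesError struct {
	files string
}

func (e *noRoutesError) Error() string {
	return fmt.Sprintf("no default routes found in %s", e.files)
}

// chooseBindAddress is a hypothetical lookup that can fail in more than one way.
func chooseBindAddress(haveRoutes, readable bool) (string, error) {
	if !readable {
		return "", errors.New("cannot read the routing table")
	}
	if !haveRoutes {
		return "", &noRoutesError{files: "/proc/net/route"}
	}
	return "10.0.0.5", nil
}

// resolveBindAddress falls back only on the expected error type and
// propagates every other error unchanged.
func resolveBindAddress(haveRoutes, readable bool, fallback string) (string, error) {
	addr, err := chooseBindAddress(haveRoutes, readable)
	if err != nil {
		switch err.(type) {
		case *noRoutesError:
			return fallback, nil
		default:
			return "", err
		}
	}
	return addr, nil
}

func main() {
	fmt.Println(resolveBindAddress(true, true, "0.0.0.0"))
	fmt.Println(resolveBindAddress(false, true, "0.0.0.0"))
	fmt.Println(resolveBindAddress(true, false, "0.0.0.0"))
}
```

With a string comparison, a reworded message would silently turn the expected case into a hard failure; the type switch keeps working as long as the type is returned.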
6d1202e
to
d7e294b
Compare
update:
case *netutil.NoRoutesError:
defaultBind := "127.0.0.1"
glog.Warningf("could not obtain advertise address for the API Server: %v; using: %s", err, defaultBind)
cfg.API.AdvertiseAddress = defaultBind
This will lead to an error. The network namespace inside the container running the apiserver will not be the same as the host's, so if we can't get a bind IP, init must fail. What we can do is try to keep AdvertiseAddress empty and, in the places where it matters, check that it holds a valid value.
@kad i was hesitant about adding 127.0.0.1.
i will give "" a try.
but please have a check that it is valid in the places where it is really used. e.g. with https://golang.org/pkg/net/#IP.IsGlobalUnicast
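As a rough illustration of the suggested check, the sketch below runs a few candidate values through `net.IP.IsGlobalUnicast` from the standard library. The helper name `isUsableAdvertiseAddress` is hypothetical and not part of kubeadm.

```go
package main

import (
	"fmt"
	"net"
)

// isUsableAdvertiseAddress is a hypothetical helper: it reports whether the
// string parses as an IP and is a global unicast address. Empty strings,
// 0.0.0.0 and loopback addresses are rejected.
func isUsableAdvertiseAddress(address string) bool {
	ip := net.ParseIP(address)
	return ip != nil && ip.IsGlobalUnicast()
}

func main() {
	for _, a := range []string{"", "0.0.0.0", "127.0.0.1", "192.168.1.10"} {
		fmt.Printf("%q usable: %v\n", a, isUsableAdvertiseAddress(a))
	}
}
```

Note that `IsGlobalUnicast` returns true for private ranges such as 192.168.0.0/16, so the check rejects unset or placeholder values without ruling out typical cluster addresses.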
d7e294b
to
032e629
Compare
@kad https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/apis/kubeadm/validation/validation.go#L303
The solution SGTM 👍
@neolit123 Here is an example of what I meant by checking the validity of an undetected AdvertiseAddress:
That's an error situation. If AdvertiseAddress is neither detected nor specified manually in the config, code that tries to use it (certificate creation, init) should fail.
i will make the warning clearer:
i will add the check to the cert phase. but it becomes a question of what command should fail, because things like
In print defaults I think this field should be set to empty if it is 0, to force the user to set a proper value there.
@@ -56,9 +56,17 @@ func SetInitDynamicDefaults(cfg *kubeadmapi.InitConfiguration) error {
// This is the same logic as the API Server uses
ip, err := netutil.ChooseBindAddress(addressIP)
if err != nil {
return err
switch err.(type) {
Shall we make it a private function, since it is used in both masterconfig and nodeconfig?
/assign @cheftako
032e629
to
7f35a8c
Compare
@kad @dixudx @xiangpengzhao updated and also included a patch for the air-gapped fix we talked about (on timeout, fallback to local client version).
/test pull-kubernetes-integration ^ needs a small fix:
WIP
/lgtm
a577e19
to
a4e98be
Compare
rebased after conflicts with master.
}
return nil, err
}
*valueToUpdate = ip.String()
The semantics of this signature are weird: you already return the ip, on which any consumer can call ip.String().
I'd either call that externally or return the string.
i will update this.
// including the offset e.g. `alpha.0.206`. This is done to comply with GCR image tags.
pre := v.PreRelease()
patch := v.Patch()
if len(pre) > 0 {
@sttts don't we have utils for this stuff somewhere?
// VerifyAPIServerBindAddress can be used to verify if a bind address for the API Server is 0.0.0.0,
// in which case this address is not valid and should not be used.
func VerifyAPIServerBindAddress(address string) error {
Are there negative tests for this?
tests are missing. will add them.
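A sketch of what table-driven negative cases for such a check might look like. `verifyBindAddress` is a hypothetical function written to match the doc comment above (reject 0.0.0.0); it is an assumption, not the implementation from this PR.

```go
package main

import (
	"fmt"
	"net"
)

// verifyBindAddress is a hypothetical check: it rejects values that do not
// parse as an IP and the unspecified addresses (0.0.0.0 and ::).
func verifyBindAddress(address string) error {
	ip := net.ParseIP(address)
	if ip == nil {
		return fmt.Errorf("cannot parse IP address: %q", address)
	}
	if ip.IsUnspecified() {
		return fmt.Errorf("cannot use %q as the bind address for the API Server", address)
	}
	return nil
}

func main() {
	cases := []struct {
		address string
		wantErr bool
	}{
		{"192.168.1.10", false}, // positive case
		{"0.0.0.0", true},       // the fallback value must be rejected
		{"::", true},            // IPv6 unspecified
		{"not-an-ip", true},     // unparseable input
		{"", true},              // empty input
	}
	for _, c := range cases {
		err := verifyBindAddress(c.address)
		if (err != nil) != c.wantErr {
			fmt.Printf("FAIL %q: err=%v, wantErr=%v\n", c.address, err, c.wantErr)
			continue
		}
		fmt.Printf("ok   %q\n", c.address)
	}
}
```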
Change the error output of getAllDefaultRoutes() so that it includes information on which files were probed for the IP routing tables even if such files are obvious. Introduce a new error type which can be used to figure out if this error is exactly of the "no routes" type.
a4e98be
to
d4eff6c
Compare
@neolit123 - Could you squash some of the commits?
1) Do not fail in case a bind address cannot be obtained

If netutil.ChooseBindAddress() fails looking up IP route tables it will fail with an error, in which case the kubeadm config code will hard stop.

This scenario is possible if the Linux user intentionally disables the WiFi from the distribution settings. In such a case the distro could empty files such as /proc/net/route and ChooseBindAddress() will return an error.

For improved offline support, don't error on such scenarios but instead show a warning. This is done by using the NoRoutesError type. Also default the address to 0.0.0.0.

While doing that, prevent some commands like `init`, `join` and also phases like `controlplane` and `certs` from using such an invalid address.

Add unit tests for the new function for address verification.

2) Fallback to local client version

If there is no internet, label versions fail and this breaks air-gapped setups unless the users pass an explicit version.

To work around that:
- Remain using 'release/stable-x.xx' as the default version.
- On timeout or any error different from status 404 return error
- On status 404 fallback to using the version of the client via kubeadmVersion()

Add unit tests for kubeadmVersion().

Co-authored-by: Alexander Kanevskiy <alexander.kanevskiy@intel.com>
d4eff6c
to
90df4b4
Compare
squashed two - kubeadm related. p.s. i do not agree with the kubernetes commit squash policies. ;)
/approve
LGTM, but I'd like @kad to do the final lgtm.
/lgtm
looks like a verify timeout. /test pull-kubernetes-verify
@kubernetes/sig-api-machinery-pr-reviews - this requires your approval.
/lgtm
Thanks @neolit123
@sttts could you please take a look at the api machinery change for approval.
/retest
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dixudx, kad, neolit123, sttts, timothysc The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Automatic merge from submit-queue (batch tested with PRs 67397, 68019). If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md. |
What this PR does / why we need it:
Change the error output of getAllDefaultRoutes() so that it includes
information on which files were probed for the IP routing tables
even if such files are obvious.
Introduce a new error type which can be used to figure out if this
error is exactly of the "no routes" type.
If netutil.ChooseBindAddress() fails looking up IP route tables
it will fail with an error in which case the kubeadm config
code will hard stop.
This scenario is possible if the Linux user intentionally disables
the WiFi from the distribution settings. In such a case the distro
could empty files such as /proc/net/route and ChooseBindAddress()
will return an error.
For improved offline support, don't error on such scenarios but instead
show a warning. This is done by using the NoRoutesError type.
Also default the address to 0.0.0.0.
While doing that, prevent some commands like `init`, `join` and also
phases like `controlplane` and `certs` from using such an invalid address.
If there is no internet, label versions fail and this breaks
air-gapped setups unless the users pass an explicit version.
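To work around that:
- Remain using 'release/stable-x.xx' as the default version.
- On timeout or any error different from status 404 return error
- On status 404 fallback to using the version of the client via kubeadmVersion()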
Add unit tests for kubeadmVersion().
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
refs kubernetes/kubeadm#1041
Special notes for your reviewer:
1st and 2nd commits fix offline support.
3rd commit fixes air-gapped support (as discussed in the linked issue)
the api-machinery change is only fmt.Errorf() related.
Release note:
/cc @kubernetes/sig-cluster-lifecycle-pr-reviews
/cc @kubernetes/sig-api-machinery-pr-reviews
/assign @kad
/assign @xiangpengzhao
/area UX
/area kubeadm
/kind bug