Initial (experimental) CoreOS support #1813

Merged: 4 commits into kubernetes:master, Feb 14, 2017


@aledbf aledbf commented Feb 7, 2017

replaces #1480

  • Detect CoreOS
  • Move key manifests to code, to tolerate read-only mounts
  • Run nodeup as a oneshot systemd service, rather than relying on cloud-init behaviour which varies across distros

Edit: tested with --bastion using weave and calico

Edit 2: this change requires nodeup with flag --install

Edit 3: also tested with kopeio-vxlan
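
For illustration, the "Detect CoreOS" step above could be sketched roughly as follows. This is a minimal sketch, not the detection code from this PR: it assumes the OS identifies itself with ID=coreos in os-release style content, and isCoreOS is a hypothetical helper name.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// isCoreOS reports whether os-release style content identifies CoreOS.
// Hypothetical helper for illustration; the ID value "coreos" is an assumption.
func isCoreOS(osRelease string) bool {
	scanner := bufio.NewScanner(strings.NewReader(osRelease))
	for scanner.Scan() {
		parts := strings.SplitN(strings.TrimSpace(scanner.Text()), "=", 2)
		if len(parts) != 2 || parts[0] != "ID" {
			continue
		}
		// Values may be quoted, e.g. ID="coreos".
		return strings.Trim(parts[1], `"'`) == "coreos"
	}
	return false
}

func main() {
	sample := "NAME=CoreOS\nID=coreos\n"
	fmt.Println(isCoreOS(sample))
}
```

In practice the content would be read from a file such as /etc/os-release; parsing a string keeps the sketch easy to test.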



@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) Feb 7, 2017
@chrislovecnm
Contributor

Should we use a feature flag for this? You can look at how I am using one in my rolling-update PR: https://github.com/kubernetes/kops/pull/1134/files#diff-7095872b64962f5aefa022d844e0dc40R283

@aledbf
Member Author

aledbf commented Feb 8, 2017

@chrislovecnm what do you mean? Add a feature flag to install on CoreOS?

@aledbf
Member Author

aledbf commented Feb 9, 2017

TODO:

  • change the default --volume-plugin-dir from /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ to /opt/kubernetes/kubelet-plugins/volume/exec/
    (/usr/libexec is read-only on CoreOS)
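
A minimal sketch of the TODO above, assuming the choice is driven by whether /usr is mounted read-only. The two paths come from the TODO; volumePluginDir and its readOnlyUsr parameter are hypothetical names, not code from this PR.

```go
package main

import "fmt"

const (
	// Default location named in the TODO; lives under /usr/libexec.
	defaultVolumePluginDir = "/usr/libexec/kubernetes/kubelet-plugins/volume/exec/"
	// Proposed writable location named in the TODO.
	writableVolumePluginDir = "/opt/kubernetes/kubelet-plugins/volume/exec/"
)

// volumePluginDir picks the value for the kubelet --volume-plugin-dir flag.
// readOnlyUsr is a hypothetical input meaning "/usr cannot be written to".
func volumePluginDir(readOnlyUsr bool) string {
	if readOnlyUsr {
		return writableVolumePluginDir
	}
	return defaultVolumePluginDir
}

func main() {
	fmt.Printf("--volume-plugin-dir=%s\n", volumePluginDir(true))
}
```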

@chrislovecnm
Contributor

@aledbf this is experimental and, in my mind, should only be used if the user deliberately turns on the feature flag.

Question for you: how do rolling updates work, or not work, with CoreOS?

@philk
Contributor

philk commented Feb 10, 2017

Testing this out (seems to be working, thanks!) and noticed something.

Without using kubelet-wrapper or providing a socat binary, this will still run into the socat issue, which breaks helm.

Contributor

@chrislovecnm chrislovecnm left a comment

Questions for you

@@ -227,6 +227,11 @@ func (d *dockerVersion) matches(arch Architecture, dockerVersion string, distro
}

func (b *DockerBuilder) Build(c *fi.ModelBuilderContext) error {
	if b.Distribution == distros.DistributionCoreOS {
		glog.Infof("Detected CoreOS; won't install Docker")
Contributor

So are we running rkt or docker?

Member Author

docker.

Member

It's preinstalled though (we can't really install complicated things on CoreOS)

		return nil
	}

	// TODO: Do we actually use the user anywhere?
Contributor

How do we figure this out?

Member

It's unused; we remove and verify nothing breaks. This PR takes the first step on that.

{
	pod, err := b.buildPod()
	if err != nil {
		return fmt.Errorf("error building kube-apiserver pod: %v", err)
Contributor

error building kube-apiserver pod

Kinda sounds like we failed to deploy a pod, rather than failing to build the struct we use to generate the API server YAML that kubelet consumes. Thoughts on rewording?

	return nil, fmt.Errorf("error building kube-apiserver flags: %v", err)
}

redirectCommand := []string{
Contributor

Can you add a TODO for setting this up so that logs are available from kubectl logs? We have another issue open to allow for that.

Contributor

To be clear, can you add a comment in the code to fix this so that logs are sent to Docker and kubectl logs works with these pods?

Member Author

@aledbf aledbf Feb 10, 2017

The redirect to file was removed

Member

We already have #1813 ; I think that is separate from this PR

@aledbf
Member Author

aledbf commented Feb 10, 2017

@philk the "problem" with kubelet-wrapper is that we need to use hyperkube, and that means kops for CoreOS ends up being different.

@aledbf aledbf force-pushed the coreos branch 2 times, most recently from db3e6e1 to 477ce0b Compare February 10, 2017 16:49
@philk
Contributor

philk commented Feb 10, 2017

@aledbf I don't disagree that it sucks to ship kubelet two different ways but shipping a (slightly) broken kubelet isn't great either. Do you know the progress on using kubelet in a container for all installs? (or has that idea been abandoned?)

@aledbf
Member Author

aledbf commented Feb 10, 2017

> Do you know the progress on using kubelet in a container for all installs? (or has that idea been abandoned?)

ping @justinsb @chrislovecnm

@justinsb justinsb modified the milestone: 1.5.2 Feb 11, 2017
justinsb and others added 3 commits February 11, 2017 13:57
* Detect CoreOS
* Move key manifests to code, to tolerate read-only mounts
* Misc refactorings so more code can be shared
* Change lots of ints to int32s in the models
* Run nodeup as a oneshot systemd service, rather than relying on
cloud-init behaviour which varies across distros
@aledbf aledbf force-pushed the coreos branch 3 times, most recently from 19ccaea to 6715bd5 Compare February 11, 2017 20:03
@justinsb justinsb mentioned this pull request Feb 11, 2017
@justinsb
Member

I opened #1861 to track the problems of #NoSocat, as it feels separable from this PR.

kubelet in a container has not made any progress AFAIK; my understanding is that it has caused problems in the past (metrics?) and so it isn't the default.

I don't think it's a huge deal to do, but I'm also not sure that socat should be the deciding factor...

Member

@justinsb justinsb left a comment

This LGTM, but we need a second LGTM as I authored some of it. @kris-nova ?

@@ -50,6 +52,9 @@ func main() {
	target := "direct"
	flag.StringVar(&target, "target", target, "Target - direct, cloudinit")

	install := false
	flag.BoolVar(&install, "install", install, "If true, will install a systemd unit instead of running directly")
Contributor

Is the --install flag too ambiguous here? At a glance I didn't really understand what this was doing. Should we consider a more verbose name?

}
if i == 0 {
	// We could also try to evaluate based on cwd
	s, err = os.Readlink("/proc/self/exe")
Contributor

Should we

  1. Check that file exists
  2. Check that it is in fact a symlink

Before just bailing out here?

Command: command,
FSRoot: flagRootFS,
}
err = i.Run()
Contributor

It doesn't look like we do anything if we get an error back from i.Run(). Can we at least log it, please?

Member

Actually, we logged and retried, using the same logic as before.


## CoreOS

CoreOS support is highly experimental. Please report any issues.
Contributor

@krisnova krisnova Feb 12, 2017

If this is super experimental should we have something in the software that suggests/warns about that?

Ideas

  1. log message that yells and says "HEY THIS IS EXPERIMENTAL" (Not literally)
  2. whatever flags we use to trigger the behavior could have the word experimental in them a la VENDOREXPERIMENT
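
As one possible shape for these ideas, the sketch below gates a warning on a feature flag whose name contains the word "experimental". The flag name ExperimentalCoreOS, the +/- list syntax, and featureEnabled are all hypothetical; nothing in this thread confirms how kops would spell such a flag.

```go
package main

import (
	"fmt"
	"strings"
)

// featureEnabled reports whether name is switched on in a comma-separated
// list such as "+ExperimentalCoreOS,-Other". Later entries win.
// Hypothetical helper and syntax, for illustration only.
func featureEnabled(flags, name string) bool {
	enabled := false
	for _, f := range strings.Split(flags, ",") {
		f = strings.TrimSpace(f)
		switch {
		case f == name || f == "+"+name:
			enabled = true
		case f == "-"+name:
			enabled = false
		}
	}
	return enabled
}

func main() {
	flags := "+ExperimentalCoreOS" // hypothetical flag value
	if featureEnabled(flags, "ExperimentalCoreOS") {
		fmt.Println("WARNING: CoreOS support is experimental; please report any issues")
	}
}
```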

Member

We don't have an easy way to know the image OS in kops, so I'm not sure how practical this is.

	return nil
}

func (b *KubectlBuilder) kubectlPath() string {
Contributor

Wondering if we should borrow some functionality from the which command here?

Member Author

Where can I find the referenced command?

Contributor

I am referring to the which command commonly found on *nix flavored systems. I think it ships in coreutils.

I think it can be done in Go, but I guess the real point I am getting at is:

Would it be helpful to look up the kubectl executable at runtime instead of assuming where it is based on the OS?

Member Author

@aledbf aledbf Feb 12, 2017

Member

We actually don't want to rely on the PATH here. We want to run our version of kubectl.

But in this case, this is the path where we're installing kubectl anyway, so we can't find it.
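
The two approaches discussed in this exchange can be contrasted in a short sketch. Both install paths below are assumptions chosen for illustration (the thread does not state them), and kubectlInstallPath is a hypothetical stand-in for the kubectlPath method under review; exec.LookPath is the standard-library equivalent of which.

```go
package main

import (
	"fmt"
	"os/exec"
)

// kubectlInstallPath returns the fixed location where a managed kubectl
// would be installed. Both paths are assumptions for illustration.
func kubectlInstallPath(coreOS bool) string {
	if coreOS {
		return "/opt/bin/kubectl" // assumed writable location on CoreOS
	}
	return "/usr/local/bin/kubectl" // assumed location on other distros
}

func main() {
	// Fixed path: we decide where our own kubectl goes, independent of PATH.
	fmt.Println("install path:", kubectlInstallPath(true))

	// which-style lookup: finds whatever kubectl happens to be on PATH,
	// and fails if none has been installed yet.
	if found, err := exec.LookPath("kubectl"); err != nil {
		fmt.Println("kubectl not on PATH:", err)
	} else {
		fmt.Println("found on PATH:", found)
	}
}
```

The lookup variant cannot help at install time, since the binary being looked up is the one about to be written.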

Member

Why can't you just use hyperkube on all platforms for kubectl?

Member

Interesting. What is the advantage of doing so @chancez?

Member

More containers and fewer binaries is always a good thing, IMO. Plus, you'll already be using hyperkube on each host, so it isn't additional overhead either. It'll also make updating the binary easier, and will handle verification.

Contributor

@krisnova krisnova left a comment

Can we please advertise the "experimental"ness of this PR a bit more before merging? See notes on including language in the logs, or in the flag itself.

LGTM as far as code is concerned. Nice PR!

@aledbf aledbf force-pushed the coreos branch 5 times, most recently from 494a064 to 6715bd5 Compare February 12, 2017 14:58
@justinsb
Member

I'm not really sure where we would advertise it, because kops doesn't know the OS when it is installing. I guess we could check the image owner and warn if it matches the well known coreos.com value, but we don't do this for the other images also...
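
The image-owner check floated here might look like the following minimal sketch. It assumes the image is given in owner/name form and uses the owner ID that appears in the --image example later in this thread; looksLikeCoreOSImage is a hypothetical helper, and kops does not necessarily implement any such check.

```go
package main

import (
	"fmt"
	"strings"
)

// coreOSImageOwner is the owner ID from the --image example later in this
// thread; treating it as "the" CoreOS owner is an assumption.
const coreOSImageOwner = "595879546273"

// looksLikeCoreOSImage reports whether image has the form "owner/name"
// with the assumed CoreOS owner. Hypothetical helper.
func looksLikeCoreOSImage(image string) bool {
	parts := strings.SplitN(image, "/", 2)
	return len(parts) == 2 && parts[0] == coreOSImageOwner
}

func main() {
	image := "595879546273/CoreOS-stable-1235.9.0-hvm"
	if looksLikeCoreOSImage(image) {
		fmt.Println("warning: CoreOS support is experimental")
	}
}
```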

@aledbf
Member Author

aledbf commented Feb 13, 2017

@justinsb @kris-nova anything else to add/change in this PR?

@aledbf
Member Author

aledbf commented Feb 14, 2017

@justinsb ping

@justinsb justinsb merged commit 1c78188 into kubernetes:master Feb 14, 2017
@justinsb
Member

Looks great - let's get some miles on it :-)

@ahasnaini

Is there any info available on how to deploy a k8s cluster on CoreOS? Currently it deploys on Debian. Which flags am I missing?

@aledbf
Member Author

aledbf commented Feb 23, 2017

@ahasnaini you need to change the image flag to something like --image 595879546273/CoreOS-stable-1235.9.0-hvm

@aledbf aledbf mentioned this pull request Feb 23, 2017
@ahasnaini

Thanks, that worked. I used the following command:
kops create cluster --name=xxx.xxx.xxxx --state=s3://kops1213 --zones=eu-west-1a --node-count=1 --node-size=t2.micro --master-size=t2.micro --dns-zone=xxx.xxx.xx

The EC2 instances started, but the DNS zone didn't get updated; without the image it did. When I logged on to the master instance, nothing was running under Docker. Do I need any more flags?

@vmrm

vmrm commented May 24, 2019

Hello, I can't find any info: does kops support CoreOS images on Google Cloud Platform? I tried to deploy one of the instance groups (ig) of an existing cluster. The VMs start, but it seems they aren't provisioned with kubelet, etc.


9 participants