FCOS as Kubernetes / OKD node #93

Open

jasonbrooks opened this issue Dec 12, 2018 · 35 comments

@jasonbrooks
Collaborator

jasonbrooks commented Dec 12, 2018

Serving as a clustered server node for running Kubernetes / OpenShift OKD is a primary use case for Fedora CoreOS. This use case requires that certain components, in certain versions, be present on the node, either via package layering or by inclusion in the base image.

Relying on package layering allows for more flexibility in components and component versions, and keeps the base image size small for users that don't wish to use FCOS as a kube/OKD node. However, the general thinking in the group has been that package layering should be an exception, rather than a rule, and leaving component choices to the user will likely lead to a poorer, more fragmented user experience.

Including certain components in the base image, such as cri-o and the kubelet, allows for a simpler, more controlled experience, and users can still override included packages via package layering if they prefer different versions. However, baking specific components into the image means choosing to include particular components and exclude others.

OKD or Kubernetes?

Do we want to focus on upstream Kubernetes, tracking a particular version, and include the cri-o, kubelet, and kubeadm packages for that version (I think kubeadm is our best bet for a suggested deployment strategy)? Do we track the latest Kubernetes version? There are typically three stable versions available at a time.
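
For illustration, a kubeadm-based bootstrap on such a node might look roughly like this (a sketch only; it assumes the kubelet, kubeadm, and cri-o are already on the host, and the placeholder values in angle brackets are not real):

```sh
# Sketch of a kubeadm bootstrap, assuming kubelet, kubeadm, and cri-o
# are already present on the node (via the base image or layering).
sudo systemctl enable --now crio kubelet

# Initialize a control-plane node against the cri-o socket.
sudo kubeadm init --cri-socket=/var/run/crio/crio.sock

# On each worker, join with the token printed by `kubeadm init`.
sudo kubeadm join <control-plane-ip>:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>
```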

Do we want to focus on OKD, tracking the latest version, and including the cri-o and origin packages for that version? Here, I think that openshift-installer is our best bet for an OKD deployment strategy.

It's possible that we could try to make a mixed approach work. OKD's origin-node package presents itself as kubelet.service on the host, so we could focus on OKD but also include a kubeadm package to deploy a Kubernetes cluster running on that origin-based kubelet. Or we could see if the reverse works: focus on Kubernetes, and maybe an OKD cluster would run on top of the upstream kubelet? In both cases, most of the components are delivered in containers, so it's really the kubelet and cri-o (or another runtime, if we prefer) that we'd be including in the image.

Whether we focus on OKD or Kubernetes (or neither), we're going to have people wanting an image that does the other thing. We're pointing to FCOS as the community half of the RHCOS coin, but if we don't support OKD directly, there will likely be a desire for a separate host or host image to serve that role. Likewise, if we focus on OKD, there'll likely be demand for a Kubernetes-focused version.

@jlebon
Member

jlebon commented Dec 13, 2018

Making sure we rope in some other Fedora maintainers for OKD & kube.
/cc @ingvagabund @jcajka

@jlebon
Member

jlebon commented Dec 14, 2018

Related: what's the state of kubelet-in-container upstream? I know OKD moved away from that, though is that true for Kubernetes as well? E.g. I see that the kubelet CLI has a --containerized switch. Would it be possible to ship OKD's kubelet on the host (as is done in RHCOS) to support both OKD and Kubernetes at that version, and otherwise for other versions, rely on a containerized kubelet?

@dustymabe
Member

Related: what's the state of kubelet-in-container upstream?

I'm pretty sure it's not officially supported, and it's probably never going to be considered stable because of the problems involved. This is why OKD moved away from it to begin with.

OKD or Kubernetes?

My personal opinion is OKD (OpenShift Origin). It's what I'm most familiar with, and there are a lot of "papercuts" that get handled for you with OpenShift, which makes it nice. However, is OKD the right decision for FCOS? I don't know. We've been intentionally vague on this point so far. Now is probably the right time to decide whether we should concentrate on one (probably, since we have finite resources) or try to do both.

@smekkley

At the moment, CoreOS only provides wrapper scripts to run kubernetes, etcd, and flanneld in rkt.
So as far as packaging goes, it is very agnostic about which orchestration tools to use. I think FCOS should follow this no-binary approach. In my opinion, even those wrapper scripts aren't necessary, as long as users can specify what to run at kickstart. Some documentation and example scripts outside the image would be enough.

In the case of Kubernetes, the kubelet can run in a rkt container outside the cluster while everything else runs in docker, if you use bootkube to set up Kubernetes. Upgrading the kubelet is a matter of upgrading the rkt image by bumping a variable in the wrapper script, as sketched below.
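
For context, a stripped-down version of that Container Linux wrapper-script pattern might look like this (a sketch only; the real kubelet-wrapper sets up many more mounts and flags, and the image URL and tag here are placeholders):

```sh
#!/bin/bash
# Sketch in the spirit of Container Linux's kubelet-wrapper: run the
# kubelet from a container image under rkt. Bumping KUBELET_IMAGE_TAG
# is the whole upgrade story. Image URL and tag are placeholders.
KUBELET_IMAGE_URL="${KUBELET_IMAGE_URL:-quay.io/coreos/hyperkube}"
KUBELET_IMAGE_TAG="${KUBELET_IMAGE_TAG:-v1.13.1_coreos.0}"

exec rkt run \
  --volume var-lib-kubelet,kind=host,source=/var/lib/kubelet \
  --mount volume=var-lib-kubelet,target=/var/lib/kubelet \
  --insecure-options=image \
  "docker://${KUBELET_IMAGE_URL}:${KUBELET_IMAGE_TAG}" \
  --exec=/hyperkube -- kubelet "$@"
```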

@ingvagabund

Kubernetes in Fedora is currently unmaintained (stuck at v1.10.3, built in May 2018). Origin seems active (built in December 2018 by @jcajka). I would not recommend maintaining both flavors, due to resource constraints. Plus, Origin brings a lot of nice extras that take playing with Kubernetes to a new level, and given the recent interest in Kubernetes in Fedora, OKD sounds like the better choice to me.

@jcajka
Contributor

jcajka commented Jan 2, 2019

IMHO, Origin should be the choice (as Origin should be a compatible drop-in replacement), if it is possible to provide it from the Fedora/CoreOS infrastructure. If, per @ingvagabund, kube is dead in Fedora, then Origin is in a state of clinical death (I'm the only active maintainer, and I'm mostly in it for multi-arch delivery). There is a possibility that 3.11 will be the last and only "full" distribution of Origin in Fedora, as the next release, AFAIK, might take a significant shift in delivery method(s) that will not be reproducible in Fedora with current resources.

That brings me to the question: are you folks (@dustymabe @smekkley @jlebon @jasonbrooks) in touch with the Origin upstream? IMO it would be best to collaborate with them on coming up with the right delivery method that we can implement and deploy on our infrastructure.

@ajeddeloh
Contributor

This also seems like something where systemd portable services could be used. Does anyone know if there have been any efforts to try running kubernetes in a portable service?
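
For concreteness, the workflow would presumably look something like this (purely hypothetical: no kubelet portable-service image exists as far as I know, and the .raw image names are invented):

```sh
# Hypothetical portable-service workflow for a kubelet (the .raw image
# names are invented for illustration; nobody ships these today).
sudo portablectl attach /var/lib/portables/kubelet_1.15.1.raw
sudo systemctl enable --now kubelet.service

# An upgrade would be detach-and-attach of a newer image:
sudo systemctl stop kubelet.service
sudo portablectl detach kubelet_1.15.1
sudo portablectl attach /var/lib/portables/kubelet_1.16.0.raw
sudo systemctl start kubelet.service
```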

@dustymabe
Member

This also seems like something where systemd portable services could be used. Does anyone know if there have been any efforts to try running kubernetes in a portable service?

I had the same thought.

@smekkley

smekkley commented Jan 6, 2019

It seems like this issue has to wait until a decision is made in #37.

@jcajka
Contributor

jcajka commented Jan 8, 2019

@dustymabe @smekkley Do I understand you correctly that Fedora CoreOS wants to build its own distribution of OKD/Kube, independent of the way upstream delivers and supports it, with its own installer and its own separate build/installation/distribution infrastructure? Is that a correct statement?

@dustymabe
Member

@jcajka my preference would be that we reuse as much of "upstream OKD" as possible. In practice, in the past, we have mostly just pulled OKD containers that were built by the OpenShift releng teams (i.e., not relying on Fedora packages). In RHCOS we are "baking in" the kubelet into the base OSTree. This provides a better user experience (you don't have to go grab your orchestrator from somewhere else on boot), but it obviously requires a more tightly integrated build process. In FCOS, we're still debating how best to pull this off. One thing that won't work is "re-implementing our own versions/processes" for everything.

@jcajka that's a long way of saying - we're still figuring things out, and could use help :)

@jcajka
Contributor

jcajka commented Jan 10, 2019

@dustymabe ah :), I will be happy to help as much as I can to plan it and make it happen.

@Conan-Kudo

I'm also happy to help with OKD package maintenance in Fedora alongside @jcajka where I can.

@bgilbert bgilbert added this to Proposed in Fedora CoreOS preview via automation Jan 22, 2019
@bgilbert bgilbert removed this from Proposed in Fedora CoreOS preview Feb 19, 2019
@bgilbert bgilbert added this to Proposed in Fedora CoreOS stable via automation Feb 19, 2019
@cgwalters
Member

I think it'd be great to have FCOS take over where Container Linux is today in the upstream Kubernetes community, particularly for people who want to try out the latest upstream Kubernetes and also track, e.g., the latest kernel/systemd features.

I would also like to see the combined FCOS/RHCOS team own a "build" that uses coreos-assembler with FCOS as a base plus the current OpenShift cri-o/kubelet, and test it. It wouldn't be a product, but it would be a good way to track, e.g., "did the latest Linux kernel break the kubelet?"

I don't think Red Hat will productize or support FCOS running OpenShift in production. But if people who want to "DIY" it with upstream Kubernetes choose FCOS, that's something it makes sense for us to at least enable at a basic level.

@DanyC97

DanyC97 commented May 16, 2019

But if people who want to "DIY" it with upstream Kubernetes choose FCOS,

@cgwalters I think folks who want to give OKD v4 a try do depend on an operating system with Ignition support, at least for the K8s control plane. For the compute nodes, I guess you could argue that RHEL / CentOS / Fedora can be used, although some legwork is needed to join the cluster, etc.

@cgwalters
Member

xref http://lists.openshift.redhat.com/openshift-archives/dev/2019-June/msg00016.html

@jasonbrooks
Collaborator Author

Worth watching: https://youtu.be/921NezIOJNw

@arzarif

arzarif commented Jul 23, 2019

Curious: was there a particular direction chosen with regard to this subject?

I've seen conflicting information regarding the inclusion of CRI-O as a base package. It's not included in the current pre-release build; however, I'm not sure whether that could change in the future. I thought there might be some context outside of this thread that I'm missing.

@lucab
Contributor

lucab commented Jul 24, 2019

For reference, Typhoon (a pure k8s distribution similar to Tectonic / openshift-installer) added initial support for FCOS in poseidon/typhoon#512.

@smekkley

smekkley commented Aug 5, 2019

For what it's worth, I've also confirmed that FCOS works with bootkube by running the kubelet in podman on bare metal. In the end, Typhoon is just a wrapper around bootkube.

I don't know if podman is going to stay in FCOS, but it can definitely be used for system containers (such as etcd and anything else you want outside the k8s cluster).
It's just a matter of slightly modifying the wrapper scripts that come from CoreOS/Flatcar Linux, as sketched below. This also effectively allows users to run whatever version of Kubernetes they want. I've naturally run the latest version (1.15.1).
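
The modified wrapper ends up being roughly a systemd unit around podman run, something like this sketch (the image, tag, and mount list are illustrative placeholders; a working unit needs the fuller mount list from the Typhoon PR referenced above):

```sh
# Sketch of a podman-based kubelet unit (illustrative; the image/tag and
# the mount list are placeholders -- a real unit needs many more mounts).
sudo tee /etc/systemd/system/kubelet.service >/dev/null <<'EOF'
[Unit]
Description=Kubelet via podman
Wants=network-online.target
After=network-online.target

[Service]
ExecStartPre=-/usr/bin/podman rm -f kubelet
ExecStart=/usr/bin/podman run --name kubelet \
  --privileged --pid host --network host \
  --volume /etc/kubernetes:/etc/kubernetes:ro \
  --volume /var/lib/kubelet:/var/lib/kubelet:rshared \
  k8s.gcr.io/hyperkube:v1.15.1 \
  /hyperkube kubelet --kubeconfig=/etc/kubernetes/kubeconfig
ExecStop=/usr/bin/podman stop kubelet
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl enable --now kubelet.service
```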

@bgilbert
Contributor

bgilbert commented Aug 5, 2019

I don't know if podman is going to stay in FCOS

It will.

@runiq

runiq commented Aug 6, 2019

[podman] can definitely be used for system containers (such as etcd and anything else you want outside the k8s cluster).

@smekkley Thanks, that's great to hear!

@smekkley smekkley mentioned this issue Aug 7, 2019
@cgwalters
Member

There's an immense amount of discussion scattered around on this.

One concrete sticking point seems to be the handling of the OpenShift installer, and particularly the bootstrap node, which currently uses both the kubelet and oc.

Given how central Kubernetes is to the ecosystem, we've been having some backchannel discussions around baking in a kubelet by default (maybe something like /usr/lib/coreos-kubelet/kubelet) that could be used for bootstrapping, or potentially directly for production workloads.
It's important to note that upstream Kubernetes has deprecated the containerized kubelet, and certainly for RHCOS/OpenShift 4 it was a very intentional decision to stop containerizing the kubelet.

The CLI issue would then still be a sticking point for the OpenShift installer, but I bet it wouldn't be too hard to change the installer to just invoke podman run --rm -ti openshift/cli instead. (It'd probably also just work to scrape the binary out of the cli container, as sketched below.)
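
Scraping the binary out would be a few lines with podman (a sketch; the image tag and the path to oc inside the image are assumptions):

```sh
# Sketch: copy `oc` out of the cli image instead of shipping it on the
# host. The image tag and the in-image path to oc are assumptions.
ctr=$(podman create openshift/cli:latest)
podman cp "$ctr":/usr/bin/oc /usr/local/bin/oc
podman rm "$ctr"
oc version
```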

@smekkley

@cgwalters
I think you're also misunderstanding the --containerized option here.
kubernetes/kubernetes#74148 (comment)

Running the kubelet in a pseudo-container is trivial, with or without support from the project.
The same goes for etcd and all the other system containers. People just need to run them in an environment where only the necessary dependencies exist, and a container happens to provide that.

To achieve a native deployment of the kubelet, the OS needs to provide the gluster client, the ceph client, and all sorts of other dependencies, and those dependencies are dynamic.

@Conan-Kudo

But this is what package layering is for. We can more or less trivially overlay the kubelet and anything else we need if an FCOS machine is targeted to be a k8s/OKD node, as sketched below.
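
The layering step itself is small (a sketch; the exact package names depend on what Fedora ships at the time):

```sh
# Sketch: layer the node components onto the FCOS host. Package names
# (cri-o, kubernetes-node) are examples of what Fedora has shipped.
sudo rpm-ostree install cri-o kubernetes-node
# Layered packages take effect in the new deployment after a reboot.
sudo systemctl reboot
```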

@jasonbrooks
Collaborator Author

As long as the cluster can handle these layering steps alongside the other management of the node, layering is fine. It might be simpler to have an OKD branch of the tree for OKD nodes.

@smekkley

As long as users have the freedom to choose the k8s version and to add/run binaries in a controlled manner, whether through a package manager or through containers as on CoreOS, I don't think the details matter. And it shouldn't break on OS upgrades.
In a PXE environment, an OS upgrade happens by replacing the vmlinuz and cpio.gz and rebooting, as sketched below.

@strigazi

strigazi commented Sep 24, 2019

The only issue I've had so far with F29AH and layering is that the release models of Kubernetes and Fedora packaging are somewhat incompatible. If I choose to install k8s from Fedora's rpm repos, I can only use what is available there; I can't select the version I want. I think the "detail" is important: some people might want to stay on a release for 8 months, while others might want to keep up with the latest stable.

Running k8s in atomic containers worked really well (as long as you "package" k8s in system containers). Running k8s with podman might be an option?

IMO, OKD branches sound good as long as they keep up with k8s releases.

@dustymabe dustymabe added the meeting topics for meetings label Sep 24, 2019
@smekkley

In that case, the specific package manager you've mentioned falls into the category of not giving users the freedom to choose the k8s version, which I don't think any of us wants here.

@strigazi

In that case, the specific package manager you've mentioned falls into the category of not giving users the freedom to choose the k8s version, which I don't think any of us wants here.

@smekkley

Which package manager are you referring to that none of us wants? I said that Fedora packages do not match the k8s release model.

@strigazi

strigazi commented Sep 29, 2019

Since the last meeting [0], I tried to run k8s 1.16 with podman (in FCOS 30), with atomic (in F29AH), and with the standalone binary.

  • for running in a container (either with podman or atomic), the patch mentioned above [1] removes the nsenter library. Mounts do not work properly with it (e.g. the serviceaccount tokens can't be mounted, so calico and flannel can't even start).
    -- If we want to run in a container we could wrap nsenter? (not sure if it would work) I'll investigate, but even the idea feels wrong
    UPDATE: with these mounts [2] posted by lucab above, it seems to work. I'm running conformance.
  • for just running the kubelet (just by dropping in the binary), both FCOS and F29AH are missing socat and ethtool, and without them port-forwarding doesn't work. If you try to bootstrap the node with kubeadm, the preflight check complains too.
    -- adding socat and ethtool to the OS image (Installed size: 1.8 M) would allow us to just use the hyperkube binary

[0] https://meetbot-raw.fedoraproject.org/teams/fedora_coreos_meeting/fedora_coreos_meeting.2019-09-25-16.30.html
[1] kubernetes/kubernetes@3b2a61d#diff-e0160f895346669c2e4495acfd858ea7L376
[2] https://github.com/poseidon/typhoon/pull/512/files#diff-080579587ebc63b482cddeadd3166349R68

@dustymabe
Member

dustymabe commented Sep 30, 2019

  • UPDATE: with these mounts [2] posted by lucab above, it seems to work. I'm running conformance.

Does that mean you got it to work with a workaround? Can you give a little more detail?

  • for just running the kubelet (just by dropping in the binary), both FCOS and F29AH are missing socat and ethtool, and without them port-forwarding doesn't work. If you try to bootstrap the node with kubeadm, the preflight check complains too.
    -- adding socat and ethtool to the OS image (Installed size: 1.8 M) would allow us to just use the hyperkube binary

Good to know! Thanks

@strigazi

strigazi commented Oct 2, 2019

  • UPDATE: with these mounts [2] posted by lucab above, it seems to work. I'm running conformance.

Does that mean you got it to work with a workaround? Can you give a little more detail?

I replaced all the atomic install commands with systemd units and podman: https://review.opendev.org/#/c/685749/
similar to poseidon: #93 (comment)

  • for just running the kubelet (just by dropping in the binary), both FCOS and F29AH are missing socat and ethtool, and without them port-forwarding doesn't work. If you try to bootstrap the node with kubeadm, the preflight check complains too.
    -- adding socat and ethtool to the OS image (Installed size: 1.8 M) would allow us to just use the hyperkube binary

Good to know! Thanks

@ajeddeloh ajeddeloh removed the meeting topics for meetings label Oct 9, 2019
@dghubble
Member

dghubble commented Feb 12, 2020

I see folks referencing the kubelet-with-podman example from Typhoon above, so this seems possibly relevant. Just wanted to report that, with the Docker log driver set to json-file as a workaround (for a k8s bug), it is possible to pass the CNCF conformance test suite with Fedora CoreOS nodes. The workaround is sketched below.
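
The workaround is a one-key daemon config (a sketch; it assumes Docker is the container runtime on the node):

```sh
# Sketch of the json-file log-driver workaround, assuming Docker is the
# container runtime on the node.
sudo tee /etc/docker/daemon.json >/dev/null <<'EOF'
{
  "log-driver": "json-file"
}
EOF
sudo systemctl restart docker
```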

@dustymabe
Member

thanks for the info @dghubble
