kubelet: enable crio runtime #235

Closed. Wants to merge 1 commit into base: master.


rphillips commented Sep 11, 2018

Enables the crio runtime

Depends on #234 and coreos-inc/tectonic-operators#457

openshift-ci-robot Sep 11, 2018

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rphillips
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: wking

If they are not already assigned, you can assign the PR to them by writing /assign @wking in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@@ -11,6 +11,9 @@ ExecStart=/usr/bin/hyperkube \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--rotate-certificates \
--container-runtime=remote \
--container-runtime-endpoint=unix:///var/run/crio/crio.sock \

aaronlevy Sep 11, 2018

Member

This would also need to align with a change in the pod-checkpointer daemonset: https://github.com/kubernetes-incubator/bootkube/blob/master/cmd/checkpoint/main.go#L21

And if tests pass without this change, I'd be worried (and we should make sure the checkpoint tests are actually running).

rphillips Sep 11, 2018

Thanks. I added the command line arg here.

Member

wking commented Sep 12, 2018

abhinavdahiya Sep 13, 2018

Member

Trying to test CRI-O:

We use the following flags on the kubelet right now:

 --cni-conf-dir=/etc/kubernetes/cni/net.d \
 --cni-bin-dir=/var/lib/cni/bin \

And our networking setup, using the network operator:

  • puts 10-flannel.conflist in /etc/kubernetes/cni/net.d
  • puts the CNI binaries in /var/lib/cni/bin

But when I switch the kubelet over to CRI-O, the CRI-O configuration has:

# The "crio.network" table contains settings pertaining to the
# management of CNI plugins.
[crio.network]

# network_dir is where CNI network configuration
# files are stored.
network_dir = "/etc/cni/net.d/"

# plugin_dir is where CNI plugin binaries are stored.
plugin_dir = "/usr/libexec/cni"

There seems to be a mismatch, and pods come up with the wrong networking:

4: cni0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 42:fa:29:92:56:47 brd ff:ff:ff:ff:ff:ff
    inet 10.88.0.1/16 scope global cni0
       valid_lft forever preferred_lft forever
    inet6 fe80::40fa:29ff:fe92:5647/64 scope link
       valid_lft forever preferred_lft forever
18: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
    link/ether f2:d6:e5:f2:11:06 brd ff:ff:ff:ff:ff:ff
    inet 10.2.1.0/32 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::f0d6:e5ff:fef2:1106/64 scope link
       valid_lft forever preferred_lft forever

@eparis @ashcrow @aaronlevy
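
The mismatch described in the comment above can be checked mechanically. The following is a minimal sketch, not part of the PR: it compares the two kubelet CNI flags with the `network_dir` and `plugin_dir` values from the quoted crio.conf. The function names are hypothetical, and the regex-based parsing only handles the simple `key = "value"` lines shown in the comment, not full TOML.

```python
import posixpath
import re

# Values copied from the comment above.
KUBELET_FLAGS = r"""
 --cni-conf-dir=/etc/kubernetes/cni/net.d \
 --cni-bin-dir=/var/lib/cni/bin \
"""

CRIO_CONF = """
[crio.network]
network_dir = "/etc/cni/net.d/"
plugin_dir = "/usr/libexec/cni"
"""


def kubelet_cni_dirs(flags):
    """Return (conf_dir, bin_dir) found in kubelet command-line flags."""
    found = dict(re.findall(r"--(cni-conf-dir|cni-bin-dir)=(\S+)", flags))
    return found.get("cni-conf-dir"), found.get("cni-bin-dir")


def crio_cni_dirs(conf):
    """Return (network_dir, plugin_dir) found in crio.conf text."""
    pattern = r'^\s*(network_dir|plugin_dir)\s*=\s*"([^"]*)"'
    found = dict(re.findall(pattern, conf, re.MULTILINE))
    return found.get("network_dir"), found.get("plugin_dir")


def _norm(path):
    # Ignore trailing slashes so "/etc/cni/net.d/" equals "/etc/cni/net.d".
    return posixpath.normpath(path) if path else None


def cni_mismatches(flags, conf):
    """List (name, kubelet_dir, crio_dir) rows where the two sides disagree."""
    names = ("config dir", "plugin dir")
    rows = zip(names, kubelet_cni_dirs(flags), crio_cni_dirs(conf))
    return [row for row in rows if _norm(row[1]) != _norm(row[2])]


if __name__ == "__main__":
    for name, kubelet_dir, crio_dir in cni_mismatches(KUBELET_FLAGS, CRIO_CONF):
        print(f"{name}: kubelet={kubelet_dir} crio={crio_dir}")
```

With the values from the comment, both the config directory and the plugin directory are reported as mismatched.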

abhinavdahiya referenced this pull request Sep 13, 2018

Closed

kubelet: enable crio #52

aaronlevy Sep 13, 2018

Member

There are some side discussions ongoing, but my general position regarding CNI plugins/configuration is:

  • If the OS needs to ship with a default set of read-only CNI plugin binaries / configuration, that's fine, but this should be assumed to be for non-OpenShift uses (e.g. running podman by hand).

  • All OpenShift CNI binaries and CNI configuration should be distributed and controlled by the network-operator, i.e. they should come from inside the cluster.

So, hypothetically, the OS ships with the runtime configured to use:
plugin_dir = /usr/libexec/bin
network_dir = /etc/cni/net.d

However, for OpenShift clusters, we provide kubelet configuration that points those settings at different locations (just as an example):
plugin_dir = /etc/kubernetes/cni/bin
network_dir = /etc/kubernetes/cni/net.d

And both of those dirs are empty until a network daemonset has been deployed (which sets up plugins/configuration).

abhinavdahiya Sep 13, 2018

Member

In an out-of-band discussion with @sjenning and @mrunalp, the plan going forward is:

  1. CRI-O on RHCOS defaults to using /etc/cni/net.d/ and /opt/cni/bin.
  2. Network Operator requires that /etc/cni/net.d/ and /opt/cni/bin be empty.
  3. Network Operator drops its configuration into /etc/cni/net.d/ and /opt/cni/bin.

RHCOS makes sure that items 1 and 2 are met.
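
Item 2 of the plan above amounts to a precondition that can be checked before anything is written. The sketch below is only an illustration under that reading: `populated_dirs` and `safe_to_install` are hypothetical helper names, and the thread does not say how the Network Operator or RHCOS actually enforce the requirement.

```python
import os

# Directories named in the plan above.
CNI_DIRS = ("/etc/cni/net.d", "/opt/cni/bin")


def populated_dirs(paths):
    """Return the paths that exist as directories and already contain entries."""
    return [p for p in paths if os.path.isdir(p) and os.listdir(p)]


def safe_to_install(paths=CNI_DIRS):
    """True when none of the CNI directories already holds files."""
    return not populated_dirs(paths)


if __name__ == "__main__":
    leftover = populated_dirs(CNI_DIRS)
    if leftover:
        print("refusing to install, directories not empty:", ", ".join(leftover))
    else:
        print("CNI directories are empty or absent")
```

A missing directory is treated the same as an empty one here; whether that is the right call depends on who is expected to create the directories.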

sjenning Sep 13, 2018

Contributor

In parallel, @mrunalp's team is getting the RHCOS compose to use a faster-moving repo for the cri-o and podman packages, which will have crio.conf values that match the kubelet defaults (/etc/cni/net.d/ and /opt/cni/bin).

@@ -11,6 +11,9 @@ ExecStart=/usr/bin/hyperkube \
--bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \
--kubeconfig=/var/lib/kubelet/kubeconfig \
--rotate-certificates \
--container-runtime=remote \
--container-runtime-endpoint=unix:///var/run/crio/crio.sock \
--runtime-request-timeout=10m \

wking Sep 13, 2018

Member

Motivation for this setting is here.
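
The three runtime flags added in the hunk above can be sanity-checked before a unit file is rolled out. This is a rough sketch rather than anything from the PR: the helper names are hypothetical, and `duration_seconds` assumes the timeout is a Go-style duration and only understands the `s`, `m` and `h` units.

```python
import re
from urllib.parse import urlparse

# Flags copied from the diff hunk above.
UNIT_FLAGS = r"""
--container-runtime=remote \
--container-runtime-endpoint=unix:///var/run/crio/crio.sock \
--runtime-request-timeout=10m \
"""

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def parse_flags(text):
    """Collect --name=value pairs into a dict."""
    return dict(re.findall(r"--([a-z-]+)=(\S+)", text))


def duration_seconds(value):
    """Convert a duration such as '10m' or '1h30m' to seconds (s/m/h only)."""
    parts = re.findall(r"(\d+)([smh])", value)
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"unsupported duration: {value!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def check_remote_runtime(flags):
    """Return a list of problems with the remote-runtime flags."""
    problems = []
    if flags.get("container-runtime") != "remote":
        problems.append("container-runtime is not 'remote'")
    endpoint = urlparse(flags.get("container-runtime-endpoint", ""))
    if endpoint.scheme != "unix" or not endpoint.path.startswith("/"):
        problems.append("container-runtime-endpoint is not a unix:// socket path")
    return problems


if __name__ == "__main__":
    flags = parse_flags(UNIT_FLAGS)
    print(check_remote_runtime(flags))
    print(duration_seconds(flags["runtime-request-timeout"]))
```

The check only looks at the shape of the flags; it does not test that the CRI-O socket exists or answers requests.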

Member

ashcrow commented Sep 14, 2018

#234 and coreos-inc/tectonic-operators#457 have been closed.

sjenning changed the title from "WIP: kubelet: enable crio runtime" to "kubelet: enable crio runtime" Sep 14, 2018

ashcrow Sep 14, 2018

Member

Changes look good.

crawford Sep 14, 2018

Member

/retest

abhinavdahiya Sep 14, 2018

Member

/hold

Networking needs work before merging

openshift-ci-robot Sep 14, 2018

@rphillips: The following test failed, say /retest to rerun them all:

Test name              Commit   Details  Rerun command
ci/prow/e2e-aws-smoke  91cc901  link     /test e2e-aws-smoke

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

sjenning Sep 14, 2018

Contributor

The change is happening in #251 now.

sjenning closed this Sep 14, 2018
