v1.9 backports 2020-12-14 #14405

Merged
merged 10 commits into v1.9 from pr/v1.9-backport-2020-12-14 on Dec 15, 2020

Conversation

@joestringer (Member) commented Dec 15, 2020

Skipped due to conflicts, PTAL:

Once this PR is merged, you can update the PR labels via:

    $ for pr in 14297 14313 14300 14286 14325 14299 14294 14375 14378; do contrib/backporting/set-labels.py $pr done 1.9; done

aditighag and others added 10 commits December 14, 2020 16:09
[ upstream commit 8917d33 ]

Signed-off-by: Aditi Ghag <aditi@cilium.io>
Signed-off-by: Joe Stringer <joe@cilium.io>
[ upstream commit 478f409 ]

On some occasions the metric `k8s_event_lag_seconds` could be reported
as an overflown value such as `9223372036854775807`. This commit fixes
that by checking whether the calculated value is negative and only
setting the metric for positive lag times.
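
A minimal sketch of the guard described above, assuming the Prometheus client library; the gauge variable, function, and package names are illustrative stand-ins, not Cilium's actual code:

    package metrics // illustrative package name

    import (
        "time"

        "github.com/prometheus/client_golang/prometheus"
    )

    // eventLag is a stand-in for the k8s_event_lag_seconds gauge.
    var eventLag = prometheus.NewGauge(prometheus.GaugeOpts{
        Name: "k8s_event_lag_seconds",
        Help: "Lag between Kube API server pod events and the agent.",
    })

    // reportEventLag only exports non-negative lag values; a negative
    // duration (e.g. due to clock skew) would otherwise be exported as
    // an overflown value such as 9223372036854775807.
    func reportEventLag(podEventTime time.Time) {
        if lag := time.Since(podEventTime); lag >= 0 {
            eventLag.Set(lag.Seconds())
        }
    }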

Fixes: 4e29130 ("pkg/endpoint: calculate Kube API-Server lag from pod events")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Joe Stringer <joe@cilium.io>
…rbage-collection

[ upstream commit 1ef686b ]

Signed-off-by: John Watson <johnw@planetscale.com>
Signed-off-by: Joe Stringer <joe@cilium.io>
[ upstream commit 7ba7edc ]

Our fromCIDR+toPorts tests have been flaking for a while. In particular,
the check on the number of policy verdicts found in the output of cilium
monitor sometimes fails. Those failures happen because we read cilium
monitor's output too soon, before it has a chance to drain the ring
buffer (see #12596 for a detailed analysis of the logs).

This commit fixes it by repeatedly checking the output against our
regular expression until it succeeds or times out, similarly to what is
already done in K8sDatapathConfig MonitorAggregation.
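
A simplified sketch of the retry-until-match pattern described above; the function and parameter names are hypothetical and do not correspond to the actual helpers in the Cilium test suite:

    package retry // illustrative

    import (
        "fmt"
        "regexp"
        "time"
    )

    // waitForVerdicts re-reads the monitor output until the expected
    // number of policy verdicts matches the regular expression, or the
    // timeout expires, instead of checking the output only once.
    func waitForVerdicts(getOutput func() string, re *regexp.Regexp, want int, timeout time.Duration) error {
        deadline := time.Now().Add(timeout)
        for {
            if len(re.FindAllString(getOutput(), -1)) >= want {
                return nil
            }
            if time.Now().After(deadline) {
                return fmt.Errorf("timed out waiting for %d policy verdicts", want)
            }
            time.Sleep(time.Second)
        }
    }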

Fixes: #12596
Fixes: 94eb491 ("test: Add policy test for ingress FromCIDR+ToPorts")
Fixes: 74be0b2 ("test: fromCIDR+toPorts host policy")
Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Joe Stringer <joe@cilium.io>
[ upstream commit 002bd9c ]

Suggested-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Paul Chaignon <paul@cilium.io>
Signed-off-by: Joe Stringer <joe@cilium.io>
[ upstream commit 431eda4 ]

As per the microk8s documentation (https://microk8s.io/docs/addons), add-ons
are enabled with `microk8s enable <addon>` - note the missing "." between
`microk8s` and the add-on command. Also tested this locally:

`microk8s enable cilium`

Signed-off-by: Philipp Gniewosz <philipp.gniewosz@posteo.de>
Signed-off-by: Joe Stringer <joe@cilium.io>
[ upstream commit d117d50 ]

Previously, a race condition between Kubernetes initialization logic and
the IPAM / address handling / NodeDiscovery logic could cause a crash on
startup in rare circumstances due to the k8s subsystem handling a policy
update event before the initialization was complete.

André pointed out that InitK8sSubsystem() occurs asynchronously to
initialize and begin handling k8s events, and this is initiated prior to
the allocation / initialization of the IP handling subsystem. Looking at
the stacktrace, we see that the "index out of range" error is triggered
from handling a policyAdd event, through the ObjectCache initialization,
through to the headerfile write / nodeconfig fetch logic, through to
logic which would serialize the node IP addressing information into the
node config file, subsequently to be used by other datapath
initialization logic:

    panic: runtime error: index out of range [3] with length 0

    goroutine 145 [running]:
    encoding/binary.bigEndian.Uint32(...)
        /usr/local/go/src/encoding/binary/binary.go:112
    github.com/cilium/cilium/pkg/byteorder.HostSliceToNetwork(...)
        /go/src/github.com/cilium/cilium/pkg/byteorder/byteorder.go:134 +0x1ee
    github.com/cilium/cilium/pkg/datapath/linux/config.(*HeaderfileWriter).WriteNodeConfig(...)
        /go/src/github.com/cilium/cilium/pkg/datapath/linux/config/config.go:108 +0x5baf
    ...
    github.com/cilium/cilium/pkg/datapath/loader.(*Loader).Reinitialize(...)
        /go/src/github.com/cilium/cilium/pkg/datapath/loader/base.go:187 +0x2c0
    github.com/cilium/cilium/daemon/cmd.(*Daemon).policyAdd(...)
        /go/src/github.com/cilium/cilium/daemon/cmd/policy.go:285 +0x1ff7
    github.com/cilium/cilium/daemon/cmd.(*PolicyAddEvent).Handle(...)
        /go/src/github.com/cilium/cilium/daemon/cmd/policy.go:217 +0x5c
    ...
    created by github.com/cilium/cilium/pkg/eventqueue.(*EventQueue).Run
        /go/src/github.com/cilium/cilium/pkg/eventqueue/eventqueue.go:240 +0x62

This occurs when converting the IPv4 loopback IP to network byte order,
and the index out of range implies that the IP address is not yet
initialized. The IP loopback address is initialized from
Daemon.allocateIPs() -> node.SetIPv4Loopback(...).

I had initially considered extending EventQueue to allow an
"initialization prerequisite" to be configured in the form of a channel.
This would require the daemon to create a channel for ip initialization
and either pass it down through the stack or pass it via the Daemon
object through to the datapath initialization logic and the endpoint add
logic (including host endpoint add), then into the endpoint EventQueue
initialization routines, which would keep a reference to the channel
around and wait on the close of the channel when EventQueue.Run() is
called. This would ensure that the "wait" is asynchronous, and hence
shouldn't slow down any of the ongoing initialization logic, however
it's a little more invasive than I'd like for a bugfix.

Looking a little closer around this logic, I realized that
loader.Reinitialize() consumes information from NodeDiscovery.
NodeDiscovery is initialized even later than allocateIPs(), and the
NodeDiscovery object has a convenient "Registered" channel which is
closed when the Daemon completes NodeDiscovery initialization. So, if we
wait on that channel, we could ensure all prerequisite IP addressing
information would be initialized prior to datapath generation.

From there, it's just a question of whether this path is guaranteed to
be handled asynchronously or whether there are some other paths that
call Reinitialize() that could potentially end up causing some kind of
deadlock. That does not appear possible upon initial inspection:

      File        Function     Line
    0 policy.go   <global>     286 if (err := d.Datapath().Loader().Reinitialize(d.ctx, d, d.mtuConfig.GetDeviceMTU(), d.Datapath(), d.l7Proxy); err != nil {
    1 policy.go   <global>     632 if (err := d.Datapath().Loader().Reinitialize(d.ctx, d, d.mtuConfig.GetDeviceMTU(), d.Datapath(), d.l7Proxy); err != nil {
    2 loader.go   <global>      34 Reinitialize(ctx context.Context, o BaseProgramOwner, deviceMTU int , iptMgr IptablesManager, p Proxy) error
    3 config.go   Handle        90 if (err := d.Datapath().Loader().Reinitialize(d.ctx, d, d.mtuConfig.GetDeviceMTU(), d.Datapath(), d.l7Proxy); err != nil {
    4 daemon.go   init         228 if (err := d.Datapath().Loader().Reinitialize(d.ctx, d, d.mtuConfig.GetDeviceMTU(), d.Datapath(), d.l7Proxy); err != nil {
    5 daemon.go   func         795 if (err := d.Datapath().Loader().Reinitialize(d.ctx, d, d.mtuConfig.GetDeviceMTU(), d.Datapath(), d.l7Proxy); err != nil {
    6 datapath.go Reinitialize 131 func (f *fakeLoader) Reinitialize(ctx context.Context, o datapath.BaseProgramOwner, deviceMTU int , iptMgr datapath.IptablesManager, p
                                   datapath.Proxy) error {
    7 base.go     Reinitialize 171 func (l *Loader) Reinitialize(ctx context.Context, o datapath.BaseProgramOwner, deviceMTU int , iptMgr datapath.IptablesManager, p
                                   datapath.Proxy) error {

0. policyAdd (From EventQueue)
1. policyDelete (From EventQueue)
2. type Loader interface
3. Daemon /config API (API only served after initialization is complete)
4. Daemon.init() (Occurs after StartDiscovery())
5. TriggerReloadWithoutCompile (Triggered from ipcache GC controller)
6. fakeLoader (unit tests)
7. Loader (function definition)

So, by blocking on node discovery when fetching LocalConfig(), we are
able to ensure that IP addressing configuration is properly initialized
prior to using it from datapath Reinitialize().
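
A simplified sketch of that approach; the type and field names below are modelled on the description above rather than copied from the Cilium source:

    package discovery // illustrative

    // LocalNodeConfiguration stands in for the datapath node config
    // that carries the node's IP addressing information.
    type LocalNodeConfiguration struct {
        // IP addressing fields populated during node discovery.
    }

    // NodeDiscovery is a simplified stand-in for the node discovery
    // subsystem described above.
    type NodeDiscovery struct {
        // Registered is closed once node discovery has completed,
        // i.e. after the IP addressing information is initialized.
        Registered  chan struct{}
        localConfig LocalNodeConfiguration
    }

    // LocalConfig blocks until node discovery has finished, so callers
    // such as the datapath Reinitialize() path never observe
    // uninitialized IP addressing information.
    func (n *NodeDiscovery) LocalConfig() *LocalNodeConfiguration {
        <-n.Registered
        return &n.localConfig
    }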

Fixes: #14245

Reported-by: Jorge Arco <jorge.arcoma@gmail.com>
Signed-off-by: Joe Stringer <joe@cilium.io>
[ upstream commit 26f318d ]

Support for ResourceQuotas (specifically for GKE) was added in #13878.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Joe Stringer <joe@cilium.io>
[ upstream commit b5c04ee ]

If a CCNP contains neither a spec nor specs, the embedded CNP is nil,
and accessing those fields will cause a nil pointer exception. To
prevent this we should check for this embedded CNP struct before
accessing its fields.
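
An illustrative nil guard; the struct shapes are simplified stand-ins for the real CRD types:

    package policy // illustrative

    import "fmt"

    // Rule is a placeholder for a policy rule.
    type Rule struct{}

    // CiliumNetworkPolicy may carry a single Spec or a list of Specs.
    type CiliumNetworkPolicy struct {
        Spec  *Rule
        Specs []*Rule
    }

    // CiliumClusterwideNetworkPolicy embeds a CNP that is nil when the
    // CCNP carries neither a spec nor specs.
    type CiliumClusterwideNetworkPolicy struct {
        *CiliumNetworkPolicy
    }

    // rulesOf checks the embedded CNP before accessing its fields.
    func rulesOf(ccnp *CiliumClusterwideNetworkPolicy) ([]*Rule, error) {
        if ccnp.CiliumNetworkPolicy == nil {
            return nil, fmt.Errorf("CCNP contains neither a spec nor specs")
        }
        rules := ccnp.Specs
        if ccnp.Spec != nil {
            rules = append(rules, ccnp.Spec)
        }
        return rules, nil
    }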

Fixes: 9d7082e ("policy: add k8s controller to daemon for clusterwide policies")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Joe Stringer <joe@cilium.io>
[ upstream commit 27cd5d2 ]

Similarly to a node name, a cluster name may contain dots (.). In such a
case, dots need to be replaced by dashes (-) so that the TLS server name
matches the certificate server's identity.

This patch is similar to the way node names containing dots are treated.
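
A minimal sketch of the substitution described above; the function and package names are illustrative:

    package tlsutil // illustrative

    import "strings"

    // tlsServerName replaces dots in a cluster name with dashes so that
    // the resulting TLS server name matches the certificate's identity,
    // in the same way node names containing dots are handled.
    func tlsServerName(clusterName string) string {
        return strings.ReplaceAll(clusterName, ".", "-")
    }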

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>
Signed-off-by: Joe Stringer <joe@cilium.io>
@joestringer requested a review from a team as a code owner on December 15, 2020 00:12
@joestringer added the backport/1.9 and kind/backports (This PR provides functionality previously merged into master.) labels on Dec 15, 2020
@joestringer (Member, Author) commented:

test-backport-1.9

@aditighag (Member) left a comment:

LGTM for my change.

@rolinh (Member) left a comment:

LGTM for my changes as well.

@pchaigno (Member) left a comment:

👍 for my PR.

@gandro (Member) left a comment:

Good for my commit!

@aanm (Member) left a comment:

LGTM for my changes

@joestringer merged commit 5251a49 into v1.9 on Dec 15, 2020
@joestringer deleted the pr/v1.9-backport-2020-12-14 branch on December 15, 2020 19:11
Labels
kind/backports This PR provides functionality previously merged into master.