1.2 backports 2018-08-28 #5407

Closed
nebril wants to merge 26 commits

Conversation

nebril
Member

@nebril nebril commented Aug 28, 2018

tgraf and others added 26 commits August 28, 2018 23:09
[ upstream commit 03273d6 ]

The existing code errored out when a port was already configured with a
redirect of a different kind. Add the ability to change the parser type.  A new
redirect will be recreated and a new port will be allocated. Existing
connections will receive an RST as the port is closed.

Fixes: cilium#5202

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 4f81ff6 ]

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 786f6c8 ]

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 77b62b3 ]

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit bb9cfd9 ]

As the state of the endpoint can change during regeneration, calling
e.getLogger() every time makes the logs less confusing to read.
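
A minimal sketch of why fetching the logging context on every call matters. The `endpoint` type, the `logPrefix` helper (standing in for `e.getLogger()`), and the state names are hypothetical and only illustrate the idea; they are not the actual Cilium code.

```go
package main

import "fmt"

// endpoint is a hypothetical, stripped-down stand-in for the real type.
type endpoint struct {
	id    int
	state string
}

// logPrefix plays the role of e.getLogger() in this sketch: it builds the
// logging context from the endpoint's state at the moment it is called.
func (e *endpoint) logPrefix() string {
	return fmt.Sprintf("endpointID=%d state=%s", e.id, e.state)
}

func main() {
	e := &endpoint{id: 1133, state: "waiting"}

	cached := e.logPrefix() // captured once, before the state changes
	e.state = "regenerating"

	// The cached context still reports the old state.
	fmt.Println(cached + " msg=\"compiling\"")
	// Asking again reports the state the endpoint is in now.
	fmt.Println(e.logPrefix() + " msg=\"compiling\"")
}
```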

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit fce1786 ]

This reverts commit c162b90.

Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 50f29e9 ]

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit adb3524 ]

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit c941ed4 ]

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 9234914 ]

Kubernetes 1.12 added the node.kubernetes.io/not-ready taint for nodes whose
runtime network is not ready. Since Cilium needs to be deployed on such
nodes so that it can set up the CNI configuration, the not-ready toleration
needs to be added to the DaemonSet.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
…resources

[ upstream commit d7e25ee ]

With debugging enabled, each compilation invokes clang two additional times and
llc one additional time to save the preprocessor state and to store the
assembly state in clear text. These invocations are very costly, so hide them
behind a flag.

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 267fec1 ]

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
…n the background

[ upstream commit 35377a1 ]

```
msg="Load 1-min: 0.77 5-min: 0.59 15min: 0.45" endpointID=20094 subsys=endpoint
msg="Memory: Total: 3006 Used: 1372 (45.64%) Free: 616 Buffers: 74 Cached: 943" endpointID=20094 subsys=endpoint
msg="Swap: Total: 0 Used: 0 (0.00%) Free: 0" endpointID=20094 subsys=endpoint
msg="NAME java STATUS S PID 22130 CPU: 40.62% MEM: 16.92% RSS: 508 VMS: 3273 Data: 0 Stack: 0 Locked: 0 Swap: 0" endpointID=20094 subsys=endpoint
msg="NAME kworker/1:0 STATUS S PID 22373 CPU: 3.67% MEM: 0.00% RSS: 0 VMS: 0 Data: 0 Stack: 0 Locked: 0 Swap: 0" endpointID=20094 subsys=endpoint
msg="NAME cilium-agent STATUS S PID 23779 CPU: 11.44% MEM: 3.46% RSS: 104 VMS: 1089 Data: 0 Stack: 0 Locked: 0 Swap: 0" endpointID=20094 subsys=endpoint
msg="NAME cilium-node-monitor STATUS S PID 24170 CPU: 1.70% MEM: 0.52% RSS: 15 VMS: 534 Data: 0 Stack: 0 Locked: 0 Swap: 0" endpointID=20094 subsys=endpoint
msg="NAME cilium-health STATUS S PID 24238 CPU: 3.06% MEM: 0.61% RSS: 18 VMS: 467 Data: 0 Stack: 0 Locked: 0 Swap: 0" endpointID=20094 subsys=endpoint
msg="NAME cilium-health STATUS S PID 24245 CPU: 4.63% MEM: 0.64% RSS: 19 VMS: 611 Data: 0 Stack: 0 Locked: 0 Swap: 0" endpointID=20094 subsys=endpoint
msg="NAME clang-3.8 STATUS R PID 24264 CPU: 32.38% MEM: 1.36% RSS: 40 VMS: 86 Data: 0 Stack: 0 Locked: 0 Swap: 0" endpointID=20094 subsys=endpoint
msg="NAME clang-3.8 STATUS R PID 24270 CPU: 30.31% MEM: 1.36% RSS: 40 VMS: 86 Data: 0 Stack: 0 Locked: 0 Swap: 0" endpointID=20094 subsys=endpoint
```

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit b7256a9 ]

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 154c91d ]

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 5342815 ]

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit e116ba6 ]

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 36a595c ]

No need to hold the overall controller manager lock for several operations.

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 460c376 ]

The Go docs specify that a time.Duration "represents the elapsed
time between two instants as an int64 nanosecond count."
However, TimeoutConfig's Ticker and Timeout fields violated those
semantics, as they were defined as time.Duration while storing
durations in seconds.

Fix the timeout handling in WaitForServiceEndpoints, which
converted timeouts into billions of seconds.

To prevent such bugs from occurring again, specify clearly that
Ticker and Timeout contain seconds, and change their type to
int64.

Signed-off-by: Romain Lenglet <romain@covalent.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit c2f3b28 ]

The following error was found regularly in logs. The event that triggers it is
not an error: it is normal for the controller to keep running while the
endpoint is marked to be disconnected, because the endpoint is first stopped
and removed before the controllers are removed. The log message can be safely
removed.

```
msg="before syncing policy maps in controller" containerID=4fb21d6b1d datapathPolicyRevision=0 endpointID=1133 error="lock failed: endpoint is in the process of being removed" ipv4=10.10.0.167 ipv6="f00d::a0a:0:0:46d" k8sPodName=default/spaceship-589d768cc4-ccc87 policyRevision=8
```

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 5779f6a ]

There is a small race window in which the endpoint may already have been
disconnected when another deletion is scheduled; that deletion then attempts
to delete the endpoint a second time, resulting in errors.
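
One common way to close a window like this is to make the state transition itself decide who performs the teardown. The sketch below assumes a hypothetical `endpoint` type with hypothetical `markDisconnected` and `delete` methods; it is not necessarily how the commit fixes the race.

```go
package main

import (
	"fmt"
	"sync"
)

// endpoint is a hypothetical stand-in used only for this sketch.
type endpoint struct {
	mu           sync.Mutex
	disconnected bool
	teardowns    int
}

// markDisconnected flips the state at most once and reports whether this
// caller was the one that performed the transition.
func (e *endpoint) markDisconnected() bool {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.disconnected {
		return false
	}
	e.disconnected = true
	return true
}

// delete tears the endpoint down only if this call won the transition, so
// a second, late deletion becomes a no-op instead of an error.
func (e *endpoint) delete() bool {
	if !e.markDisconnected() {
		return false
	}
	e.mu.Lock()
	e.teardowns++ // placeholder for the real cleanup work
	e.mu.Unlock()
	return true
}

func main() {
	ep := &endpoint{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ep.delete()
		}()
	}
	wg.Wait()
	fmt.Println(ep.teardowns) // 1
}
```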

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit a31587e ]

```
msg="NAME cilium-agent STATUS S PID 20446 CPU: 10.79% MEM: 3.40% CMDLINE: /usr/bin/cilium-agent --debug --auto-ipv6-node-routes --debug-verbose flow --ipv4-range 10.11.0.0/16 --kvstore-opt consul.address=192.168.33.11:8500 --kvstore consul --container-runtime=docker --container-runtime-endpoint=unix:///var/run/docker.sock -t vxlan --access-log=/var/log/cilium-access.log --fixed-identity-mapping=128=kv-store --fixed-identity-mapping=129=kube-dns --pprof RSS: 102 VMS: 1088 Data: 0 Stack: 0 Locked: 0 Swap: 0" endpointID=49112 subsys=endpoint
```

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit ec610ee ]

This avoids testing an unsupported up and downgrade path from 1.1 to 1.2.

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
… identity

[ upstream commit d001a33 ]

By waiting for the initial set of identities without creating the daemon
socket, Cilium would remain unavailable for kubelet to create endpoints,
breaking the etcd-operator integration.

Fixes: a6d3a5d ("identity: Wait for initial set of security identities before restoring endpoints")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
[ upstream commit 54816d2 ]

Signed-off-by: Romain Lenglet <romain@covalent.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
…tion

[ upstream commit 29f9581 ]

The agent status report acquires the CompilationLock in an attempt to detect
deadlocks. This lock can see a lot of contention, as it is held throughout
entire compilation cycles. At times when many endpoints need to be rebuilt,
waiting for this lock can take a minute or longer.

There is no point in acquiring this lock with the motivation to detect
deadlocks as this lock may be held for a long time.

Signed-off-by: Thomas Graf <thomas@cilium.io>
Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
@nebril nebril requested a review from a team as a code owner August 28, 2018 21:31
@@ -913,7 +913,7 @@ var _ = Describe("RuntimePolicies", func() {
// increment it by 1 again. We can wait for two policy revisions to happen.
// Once we have an API to expose DNS->IP mappings we can also use that to
// ensure the lookup has completed more explicitly
-timeout_s := 3 * fqdn.DNSPollerInterval / time.Second // convert to seconds
+timeout_s := int64(3 * fqdn.DNSPollerInterval / time.Second) // convert to seconds

don't use underscores in Go names; var timeout_s should be timeoutS

@nebril nebril added kind/backports This PR provides functionality previously merged into master. backport/1.2 labels Aug 28, 2018
@nebril
Member Author

nebril commented Aug 28, 2018

test-me-please

21:34:50 make[1]: go-bindata: Command not found

@tgraf
Member

tgraf commented Aug 28, 2018

test-me-please

@tgraf
Member

tgraf commented Aug 29, 2018

@nebril

Cilium image didn't update correctly
Expected
    <string>: cilium/cilium:v1.1.4
to equal
    <string>: docker.io/cilium/cilium:v1.1.4
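
The two strings in this failure name the same image and differ only in the explicit `docker.io/` registry prefix. A minimal sketch of a comparison that tolerates that difference is shown here; `normalizeImage` and `sameImage` are hypothetical helpers, and this is not necessarily how the test was eventually fixed.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeImage is a hypothetical helper: it drops an explicit
// "docker.io/" registry prefix so both spellings compare equal.
// Other registries and implicit "library/" paths are not handled.
func normalizeImage(ref string) string {
	return strings.TrimPrefix(ref, "docker.io/")
}

// sameImage reports whether two references match after normalization.
func sameImage(a, b string) bool {
	return normalizeImage(a) == normalizeImage(b)
}

func main() {
	fmt.Println(sameImage("cilium/cilium:v1.1.4", "docker.io/cilium/cilium:v1.1.4")) // true
	fmt.Println(sameImage("cilium/cilium:v1.1.1", "docker.io/cilium/cilium:v1.1.4")) // false
}
```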

@jrfastab
Contributor

jrfastab commented Aug 30, 2018

AFAIK there is no easy way to update someone else's PR. Here is the new PR: #5432.

The failure is due to the string comparison in the PR above, which I fixed with an updated backport patch:

commit d7dc45c8ea443bdc19956c7cf3535ca53c220043
Author: Thomas Graf <thomas@cilium.io>
Date:   Mon Aug 27 11:43:20 2018 +0200

    test: 1.1.4 is required to up- and downgrade from 1.2.0
    
    [ upstream commit ec610ee45934a6d0382424aad5759ebff1c22f7f ]
    
    This avoids testing an unsupported up and downgrade path from 1.1 to 1.2.
    
    Signed-off-by: Thomas Graf <thomas@cilium.io>
    Signed-off-by: Maciej Kwiek <maciej.iai@gmail.com>
    Signed-off-by: John Fastabend <john.fastabend@gmail.com>

diff --git a/test/helpers/cons.go b/test/helpers/cons.go
index feb1c75..3bf7134 100644
--- a/test/helpers/cons.go
+++ b/test/helpers/cons.go
@@ -161,7 +161,7 @@ const (
        KubectlPolicyNameLabel      = k8sConst.PolicyLabelName
        KubectlPolicyNameSpaceLabel = k8sConst.PolicyLabelNamespace
 
-       StableImage = "cilium/cilium:v1.1.1"
+       StableImage = "cilium/cilium:v1.1.4"
 
        configMap = "ConfigMap"
        daemonSet = "DaemonSet"

Per @aanm, we can also backport 1dbceb7, which would also fix the issue. I have a slight preference for not breaking it in the first place and just fixing up the original backported patch. Backporting #5282 might be a good idea in general, though, seeing as it applies easily.

@jrfastab
Contributor

test-me-please

@jrfastab
Contributor

16:47:36 [K8s-1.11]   kube-dns was not able to get into ready state
16:47:36 [K8s-1.11]   Expected
16:47:36 [K8s-1.11]       <*errors.errorString | 0xc420441980>: {
16:47:36 [K8s-1.11]           s: "Timeout reached: timed out waiting for pods with filter -l k8s-app=kube-dns to be ready",
16:47:36 [K8s-1.11]       }
16:47:36 [K8s-1.11]   to be nil

@jrfastab
Contributor

PR replaced by #5432

@jrfastab jrfastab closed this Aug 31, 2018
6 participants