Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.6 backports 2020-02-17 #10223

Merged
merged 14 commits into from
Feb 21, 2020
Merged

v1.6 backports 2020-02-17 #10223

merged 14 commits into from
Feb 21, 2020

Commits on Feb 20, 2020

  1. kvstore: Move Trace calls into backends

    [ upstream commit f868f97 ]
    
    We need to avoid using the package level functions but need to keep the
    tracing for debug. Moving the trace calls into the kvstore backends is
    verbose but is the simplest way to do this.
    
    Signed-off-by: Ray Bejjani <ray@isovalent.com>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    raybejjani authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    a45b318 View commit details
    Browse the repository at this point in the history
  2. kvstore: Replace package calls with .Client.function

    [ upstream commit 76c7fca ]
    
    We need to use different backends at the same time. Currently we assume a
    single connected backend instance by using the package level functions.
    We now direclty call these functions on the default client instead,
    allowing future changes to use non-global instances of the backends.
    
    Signed-off-by: Ray Bejjani <ray@isovalent.com>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    raybejjani authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    cf0e6a6 View commit details
    Browse the repository at this point in the history
  3. kvstore/allocator: Keep internal reference to backend

    [ upstream commit 913ed22 ]
    
    We previously only used single pkg/kvstore/allocator.kvstoreBackend
    instance system-wide. This also used a single, global, etcd/consul
    backend instance (in this case a backend to allocator.kvstoreBackend,
    itself the kvstore backend for the identity allocator).
    With clustermesh, cilium-agent needs to sync with remote etcd instances.
    This means that multiple etcd clients can exist, and that the identity
    allocator may use different ones. This is now possible by keeping
    an internal reference instead of using the global one.
    
    Signed-off-by: Ray Bejjani <ray@isovalent.com>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    raybejjani authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    aeeb572 View commit details
    Browse the repository at this point in the history
  4. clustermesh: Use remote kvstore backend

    [ upstream commit f28f6cd ]
    
    Clustermesh remote-cluster watching accidentally used the local caching
    identity allocator in cilium-agent. This meant that while ipcache and
    nodes were sycned correctly, identities were not. This was because the
    watch was called on the main identity allocator, adding itself to itself
    as a remote cluster. This resulted in two connections to the local etcd,
    and double entries when listing identities.
    
    A new allocator instance is created to use the remote cluster etcd
    backend. This is then added to the main allocators remote clusters list.
    
    Signed-off-by: Ray Bejjani <ray@isovalent.com>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    raybejjani authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    fe85efe View commit details
    Browse the repository at this point in the history
  5. bugtool: Dump NAT BPF maps entries with bpftool

    [ upstream commit 5e01fb6 ]
    
    If "cilium bpf nat list" fails, then we will be able to rely to
    the raw dumps when debugging.
    
    Signed-off-by: Martynas Pumputis <m@lambda.lt>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    brb authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    10836ce View commit details
    Browse the repository at this point in the history
  6. maps/nat: return NatEntry6 for NatKey6 NewValue() instead of NatEntry4

    [ upstream commit 8408128 ]
    
    When Cilium performs a connection tracking garbage collection it will
    also purge NAT entries from the NAT bpf map. To perform this action,
    Cilium will execute BPF map lookups from userspace by giving the memory
    location of a golang structure, previously NatEntry4 now NatEntry6. The
    kernel will then write into this memory location the value of the NAT
    entry being looked up. As the size of NatEntry4 is only 38 bytes, as
    oppose the NatEntry6 which is 50 bytes, the kernel would then write 12
    bytes in random memory locations of the Cilium agent. This could cause
    Cilium agent to crash or to have errors such as:
    
    ```
    msg="Cannot create CEP" ... error="CiliumEndpoint.cilium.io \"��\\x00\\x00\\x00\\x00\\x00\\x00-multi-node-headless-854b65674d-lj45r\" is invalid: metadata.name: Invalid value: \"��\\x00\\x00\\x00\\x00\\x00\\x00-multi-node-headless-854b65674d-lj45r\":
    ```
    
    Fixes: b9d2a0a ("maps/nat: Add NatKey{4,6} types")
    Signed-off-by: André Martins <andre@cilium.io>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    aanm authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    2f28d90 View commit details
    Browse the repository at this point in the history
  7. docs: fix link for Cilium-PR-Kubernetes-Upstream job

    [ upstream commit 5b9002d ]
    
    Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    tklauser authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    daf4b33 View commit details
    Browse the repository at this point in the history
  8. bpf: Fix space hack in Makefile

    [ upstream commit 4630078 ]
    
    Fix the space hack which stopped working with make v4.3 (works with
    v4.2 though):
    
        [..]
        " [-DENABLE_HOST_REDIRECT-DENABLE_IPV4-DENABLE_IPV6-DENABLE_NAT46]";
        clang -DENABLE_HOST_REDIRECT-DENABLE_IPV4-DENABLE_IPV6-DENABLE_NAT46
        -I/home/brb/sandbox/gopath/src/github.com/cilium/cilium/bpf/include
        -I/home/brb/sandbox/gopath/src/github.com/cilium/cilium/bpf
        -D__NR_CPUS__=8 -O2 -g -target bpf -emit-llvm -Wall -Werror
        -Wno-address-of-packed-member -Wno-unknown-warning-option -c bpf_lxc.c
        -o bpf_lxc.ll; llc -march=bpf -mcpu=probe -mattr=dwarfris -o /dev/null
        bpf_lxc.ll;  \
        fi
        In file included from <built-in>:323:
        <command line>:1:20: error: ISO C99 requires whitespace after the macro
        name [-Werror,-Wc99-extensions]
        #define ENABLE_IPV4-DHAVE_LPM_MAP_TYPE 1
    
    Signed-off-by: Martynas Pumputis <m@lambda.lt>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    brb authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    6999dd6 View commit details
    Browse the repository at this point in the history
  9. charts: Generate versions from VERSION file

    [ upstream commit 640b7a6 ]
    
    Use the top-of-tree VERSION file to generate the chart versions and
    update the pull policy using the following rules:
    
      * Set the helm chart versions to the VERSION in the file
      * If the VERSION file ends with ".90":
        - Set the cilium tag to 'latest'
        - Set the pullPolicy to 'Always'
      * If the VERSION file does not end with ".90":
        - Set the cilium tag to the VERSION in the file
        - Set the pullPolicy to 'IfNotPresent'
      * Set the managed-etcd version tag to the version specified at the top
        of this Makefile. This must be manually bumped, it does not appear
        to follow the standard Cilium docker image tag process.
    
    Signed-off-by: Joe Stringer <joe@cilium.io>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    joestringer authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    c7fcbe0 View commit details
    Browse the repository at this point in the history
  10. pkg/eventqueue: fix concurrent access of waitConsumeOffQueue

    [ upstream commit da2e652 ]
    
    spanStart of waitConsumeOffQueue could be read before being written in
    case the buffered event was executed before the execution of
    waitConsumeOffQueue.Start() in the modified lines of this commit. To fix
    this we should execute waitConsumeOffQueue.Start() even before the event
    is put into the queue. Although it does not give the correct span stat,
    the developer or user can derive it by subtracting the waitEnqueue span
    to retrieve the real waitConsumeOffQueue span.
    
    Fixes: add0d65 ("add eventqueue package")
    Signed-off-by: André Martins <andre@cilium.io>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    aanm authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    e8bfd59 View commit details
    Browse the repository at this point in the history
  11. pkg/endpoint: access endpoint state safely across go routines

    [ upstream commit 4b61dd5 ]
    
    getState function should only be called when the endpoint mutex is being
    held which was not always the case. Changed the function call so that it
    can be accessed concurrently for the data race detected bellow:
    
    ```
    ==================
    WARNING: DATA RACE
    Write at 0x00c000544f30 by goroutine 56:
      github.com/cilium/cilium/pkg/endpoint.(*Endpoint).SetStateLocked()
          /go/src/github.com/cilium/cilium/pkg/endpoint/endpoint.go:1641 +0xe9
      github.com/cilium/cilium/pkg/endpoint.(*Endpoint).LeaveLocked()
          /go/src/github.com/cilium/cilium/pkg/endpoint/endpoint.go:1386 +0x333
      main.(*Daemon).deleteEndpointQuiet()
          /go/src/github.com/cilium/cilium/daemon/endpoint.go:651 +0x3e4
      main.(*Daemon).regenerateRestoredEndpoints.func2()
          /go/src/github.com/cilium/cilium/daemon/state.go:388 +0x49
    
    Previous read at 0x00c000544f30 by goroutine 165:
      github.com/cilium/cilium/pkg/endpointmanager.releaseID()
          /go/src/github.com/cilium/cilium/pkg/endpoint/endpoint.go:1549 +0xb5
      github.com/cilium/cilium/pkg/endpointmanager.Remove.func1()
          /go/src/github.com/cilium/cilium/pkg/endpointmanager/manager.go:335 +0x88
    
    Goroutine 56 (running) created at:
      main.(*Daemon).regenerateRestoredEndpoints()
          /go/src/github.com/cilium/cilium/daemon/state.go:382 +0x916
      main.(*Daemon).initRestore()
          /go/src/github.com/cilium/cilium/daemon/state.go:454 +0xf3
      main.runDaemon()
          /go/src/github.com/cilium/cilium/daemon/daemon_main.go:1441 +0xa17
      main.NewDaemon()
          /go/src/github.com/cilium/cilium/daemon/daemon.go:894 +0x23a4
      main.runDaemon()
          /go/src/github.com/cilium/cilium/daemon/daemon_main.go:1396 +0x314
      main.glob..func1()
          /go/src/github.com/cilium/cilium/daemon/daemon_main.go:121 +0xbf
      github.com/cilium/cilium/vendor/github.com/spf13/cobra.(*Command).execute()
          /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:766 +0x8eb
      github.com/cilium/cilium/vendor/github.com/spf13/cobra.(*Command).ExecuteC()
          /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:850 +0x41b
      main.daemonMain()
          /go/src/github.com/cilium/cilium/vendor/github.com/spf13/cobra/command.go:800 +0x237
      main.main()
          /go/src/github.com/cilium/cilium/daemon/main.go:18 +0x2f
    
    Goroutine 165 (finished) created at:
      github.com/cilium/cilium/pkg/endpointmanager.Remove()
          /go/src/github.com/cilium/cilium/pkg/endpointmanager/manager.go:324 +0x121
      main.(*Daemon).deleteEndpointQuiet()
          /go/src/github.com/cilium/cilium/daemon/endpoint.go:614 +0x1a8
      main.(*Daemon).regenerateRestoredEndpoints.func2()
          /go/src/github.com/cilium/cilium/daemon/state.go:388 +0x49
    ==================
    ```
    
    Fixes: f71d87a ("endpointmanager: signal when work is done")
    Signed-off-by: André Martins <andre@cilium.io>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    aanm authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    61a2574 View commit details
    Browse the repository at this point in the history
  12. ipcache: Add probe to check for dump capability to support delete

    [ upstream commit 69a6fce ]
    
    Besides the ability to delete, the ipcache garbage collector also requires the
    ability to dump the table. Add this to the existing probe which feeds
    `SupportsDelete()`.
    
    	level=debug msg="Detected IPCache delete operation support: true" subsys=map-ipcache
    	level=debug msg="Detected IPCache dump operation support: true" subsys=map-ipcache
    
    Fixes: #10080
    
    Signed-off-by: Thomas Graf <thomas@cilium.io>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    tgraf authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    5e1a960 View commit details
    Browse the repository at this point in the history
  13. doc: Document L7 limitation in azure-cni chaining mode

    [ upstream commit 6b9559e ]
    
    Signed-off-by: Thomas Graf <thomas@cilium.io>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    tgraf authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    8b3849f View commit details
    Browse the repository at this point in the history
  14. bpf: Fix proxy redirection for egress programs

    [ upstream commit c58d749 ]
    
     * Handling of the proxy redirection was missing entirely in the to-container
       section for IPv6. Add it.
    
     * The to-container section case was assuming that any proxy redirection is
       indicated by ipv{46}_policy() returning a non-zero proxy port. This is no
       longer true since commit 830adba. Fix this by using a separate return code
       to indicate proxy redirection and treating the proxy port as optional.
    
    The above deficits lead to proxy redirection being ineffective when the setting
    EnableEndpointRoutes was set.
    
    Fixes: 830adba ("bpf: Support proxy using original source address and port.")
    Fixes: 25a80df ("bpf: Add to-container section to bpf_lxc")
    Fixes: #10105
    
    Signed-off-by: Thomas Graf <thomas@cilium.io>
    Signed-off-by: Maciej Kwiek <maciej@isovalent.com>
    tgraf authored and nebril committed Feb 20, 2020
    Configuration menu
    Copy the full SHA
    63ac0dc View commit details
    Browse the repository at this point in the history