Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More scalable kvstore interaction layer #2708

Merged
merged 4 commits into from Feb 7, 2018

Conversation

Projects
None yet
6 participants
@tgraf
Copy link
Member

commented Feb 5, 2018

No description provided.

@tgraf tgraf added the wip label Feb 5, 2018

@tgraf tgraf requested review from cilium/agent as code owners Feb 5, 2018

@tgraf

This comment has been minimized.

Copy link
Member Author

commented Feb 5, 2018

test-me-please

type AllocatorEventChan chan AllocatorEvent

// AllocatorEvent is an event sent over AllocatorEventChan
type AllocatorEvent struct {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

type name will be used as allocator.AllocatorEvent by other packages, and that stutters; consider calling this Event

}

// AllocatorEventChan is a channel to receive allocator events on
type AllocatorEventChan chan AllocatorEvent

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

type name will be used as allocator.AllocatorEventChan by other packages, and that stutters; consider calling this EventChan

return a.GetNoCache(key)
}

// Get returns the ID which is allocated to a key in the kvstore

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

comment on exported method Allocator.GetNoCache should be of the form "GetNoCache ..."


// AllocatorKey is the interface to implement in order for a type to be used as
// key for the allocator
type AllocatorKey interface {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

type name will be used as allocator.AllocatorKey by other packages, and that stutters; consider calling this Key

}

// AllocatorOption is the base type for allocator options
type AllocatorOption func(*Allocator)

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

type name will be used as allocator.AllocatorOption by other packages, and that stutters; consider calling this Option

// ControllerStatusConfiguration Configuration of controller
// swagger:model ControllerStatusConfiguration

type ControllerStatusConfiguration struct {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

exported type ControllerStatusConfiguration should have comment or be unexported

// ControllerStatus Status of a controller
// swagger:model ControllerStatus

type ControllerStatus struct {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

exported type ControllerStatus should have comment or be unexported

// ControllerStatuses Collection of controller statuses
// swagger:model ControllerStatuses

type ControllerStatuses []*ControllerStatus

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

exported type ControllerStatuses should have comment or be unexported


import (
"github.com/cilium/cilium/api/v1/models"
. "github.com/cilium/cilium/api/v1/server/restapi/policy"

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

should not use dot imports

// KVstoreConfiguration Configuration used for the kvstore
// swagger:model KVstoreConfiguration

type KVstoreConfiguration struct {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

exported type KVstoreConfiguration should have comment or be unexported

type AllocatorEventChan chan AllocatorEvent

// AllocatorEvent is an event sent over AllocatorEventChan
type AllocatorEvent struct {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

type name will be used as allocator.AllocatorEvent by other packages, and that stutters; consider calling this Event

}

// AllocatorEventChan is a channel to receive allocator events on
type AllocatorEventChan chan AllocatorEvent

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

type name will be used as allocator.AllocatorEventChan by other packages, and that stutters; consider calling this EventChan

return a.GetNoCache(key)
}

// Get returns the ID which is allocated to a key in the kvstore

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

comment on exported method Allocator.GetNoCache should be of the form "GetNoCache ..."


// AllocatorKey is the interface to implement in order for a type to be used as
// key for the allocator
type AllocatorKey interface {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

type name will be used as allocator.AllocatorKey by other packages, and that stutters; consider calling this Key

}

// AllocatorOption is the base type for allocator options
type AllocatorOption func(*Allocator)

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

type name will be used as allocator.AllocatorOption by other packages, and that stutters; consider calling this Option

// ControllerStatusConfiguration Configuration of controller
// swagger:model ControllerStatusConfiguration

type ControllerStatusConfiguration struct {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

exported type ControllerStatusConfiguration should have comment or be unexported

// ControllerStatus Status of a controller
// swagger:model ControllerStatus

type ControllerStatus struct {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

exported type ControllerStatus should have comment or be unexported

// ControllerStatuses Collection of controller statuses
// swagger:model ControllerStatuses

type ControllerStatuses []*ControllerStatus

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

exported type ControllerStatuses should have comment or be unexported


import (
"github.com/cilium/cilium/api/v1/models"
. "github.com/cilium/cilium/api/v1/server/restapi/policy"

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

should not use dot imports

// KVstoreConfiguration Configuration used for the kvstore
// swagger:model KVstoreConfiguration

type KVstoreConfiguration struct {

This comment has been minimized.

Copy link
@houndci-bot

houndci-bot Feb 5, 2018

exported type KVstoreConfiguration should have comment or be unexported

@tgraf tgraf referenced this pull request Feb 5, 2018

Closed

New kvstore layer #2613

@tgraf tgraf force-pushed the kvstore-2.1 branch from 784e0d1 to 481f854 Feb 5, 2018

@tgraf

This comment has been minimized.

Copy link
Member Author

commented Feb 5, 2018

test-me-please

@tgraf tgraf force-pushed the kvstore-2.1 branch from 481f854 to c86fb49 Feb 5, 2018

@tgraf

This comment has been minimized.

Copy link
Member Author

commented Feb 5, 2018

test-me-please

@tgraf tgraf force-pushed the kvstore-2.1 branch from c86fb49 to c036390 Feb 5, 2018

@tgraf tgraf requested review from cilium/bpf as code owners Feb 5, 2018

@tgraf tgraf force-pushed the kvstore-2.1 branch from c036390 to 79244e9 Feb 5, 2018

@tgraf

This comment has been minimized.

Copy link
Member Author

commented Feb 5, 2018

test-me-please

@tgraf

This comment has been minimized.

Copy link
Member Author

commented Feb 5, 2018

test-me-please

1 similar comment
@tgraf

This comment has been minimized.

Copy link
Member Author

commented Feb 6, 2018

test-me-please

@tgraf tgraf force-pushed the kvstore-2.1 branch from 5f97d7c to 1d07c77 Feb 6, 2018

@tgraf

This comment has been minimized.

Copy link
Member Author

commented Feb 6, 2018

test-me-please

@joestringer
Copy link
Contributor

left a comment

Some minor notes.

It'd be nice to get -o json support in the new operations if we're going to test their output, but since it's a debug option anyway it doesn't seem worth gating the change on. We can update #2083.

@@ -184,8 +184,7 @@ func LaunchAsEndpoint(owner endpoint.Owner, hostAddressing *models.NodeAddressin
}

// Give the endpoint a security identity
_, errLabelsAdd := owner.UpdateSecLabels(info.ContainerID, lbl, labels.Labels{})
if errLabelsAdd != nil {
if errLabelsAdd := ep.ModifyIdentityLabels(owner, lbl, labels.Labels{}); err != nil {

This comment has been minimized.

Copy link
@joestringer

joestringer Feb 7, 2018

Contributor

Is 'err' from the previous command intentionally checked here? or did you intend to check if errLabelsAdd != nil at the end of the line?

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

fixed

c.Assert(a.keyToID(path.Join(a.idPrefix, "10"), false), Equals, ID(10))
}

//func testParallelAllocator(c *C, maxID ID, allocatorName string, suffix string) {

This comment has been minimized.

Copy link
@joestringer

joestringer Feb 7, 2018

Contributor

There's still some commented-out tests here.

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

Yes, these tests are failing in ginkgo CI. I can run them for hours locally. I haven't figured out what exactly is different when ran on Jenkins. I'll add a comment explaining why they are commented out.

func setup(selectedBackend string, opts map[string]string) error {
module := getBackend(selectedBackend)
if module == nil {
return fmt.Errorf("unknown keyy-value store type %q. See cilium.link/err-kvstore for details", selectedBackend)

This comment has been minimized.

Copy link
@aanm

aanm Feb 7, 2018

Member

s/keyy/key

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

fixed

@@ -283,3 +263,161 @@ func GetReservedID(name string) NumericIdentity {
}
return IdentityUnknown
}

type globalIdentity struct {

This comment has been minimized.

Copy link
@aanm

aanm Feb 7, 2018

Member

what is a globalIdentity?

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

Added comment

@@ -26,6 +26,7 @@ CLI for interacting with the local Cilium Agent
* [cilium debuginfo](cilium_debuginfo.html) - Request available debugging information from agent
* [cilium endpoint](cilium_endpoint.html) - Manage endpoints
* [cilium identity](cilium_identity.html) - Manage security identities
* [cilium kvstore](cilium_kvstore.html) - Direct access to the kvstore

This comment has been minimized.

Copy link
@aanm

aanm Feb 7, 2018

Member

Don't forget to update #373

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

fixed

@tgraf tgraf force-pushed the kvstore-2.1 branch from 1d07c77 to e38f3d1 Feb 7, 2018

@tgraf

This comment has been minimized.

Copy link
Member Author

commented Feb 7, 2018

test-me-please

@raybejjani
Copy link
Contributor

left a comment

the only blocking thing is the missing Unlock. All the other comments are suggestions.

break
}

return fmt.Errorf("label %s not found", k)

This comment has been minimized.

Copy link
@raybejjani

raybejjani Feb 7, 2018

Contributor

need to release e.Mutex.Lock()

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

fixed

// endpoint will receive a new identity and will be regenerated. Both of these
// operations will happen in the background.
func (e *Endpoint) ModifyIdentityLabels(owner Owner, addLabels, delLabels labels.Labels) error {
e.Mutex.Lock()

This comment has been minimized.

Copy link
@raybejjani

raybejjani Feb 7, 2018

Contributor

We hold this for most of the function. In light of the accidental lock leak below, and since what we do after the unlock is so simple, how about:

defer e.runLabelsResolver(owner, rev)
e.Mutex.Lock()
defer e.Mutex.Unlock()

This would still run the code in order, but avoid managing the unlock. I'm not sure it's actually better (how do you determine rev here?) since we only have one rogue return.

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

fixed with a deferred Unlock

})
e.Mutex.RUnlock()

e.Mutex.RLock()

This comment has been minimized.

Copy link
@raybejjani

raybejjani Feb 7, 2018

Contributor

Is this RUnlock RLock pair on purpose? If so, could you leave a comment explaining why we have it and what behaviour we expect to support/happen

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

Looks like a side effect of a recent merge. Removed.

verified bool
}

type localKeyMap map[string]*localKey

This comment has been minimized.

Copy link
@raybejjani

raybejjani Feb 7, 2018

Contributor

we only use this type in the localKeys struct. Maybe just directly have the map there? I know we normally have a type to capture a map's type info, but this is a private type.

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

fixed

// must implement. Direct use of this interface is possible but will bypass the
// tracing layer.
type BackendOperations interface {
// BEGIN Obsolete API

This comment has been minimized.

Copy link
@raybejjani

raybejjani Feb 7, 2018

Contributor

Should these not be used? Could you include a note explaining what to do instead?

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

fixed

"github.com/cilium/cilium/common"
)

func deleteLegacyPrefixes() {

This comment has been minimized.

Copy link
@raybejjani

raybejjani Feb 7, 2018

Contributor

Could you add a comment explaining that this should no longer be used, if so, otherwise, perhaps explain the legacy structure we're replacing/deleting since it probably is no longer documented elsewhere (since we changed it).

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

fixed

session *concurrency.Session
lockPathsMU lock.Mutex
lockPaths map[string]*lock.Mutex
}

type EtcdLocker struct {
type etcdMutex struct {

This comment has been minimized.

Copy link
@raybejjani

raybejjani Feb 7, 2018

Contributor

I guess this was already like this but if it was type etcdMutex concurrency.Mutex I think it can avoid the pointer and the struct.

This comment has been minimized.

Copy link
@tgraf

tgraf Feb 7, 2018

Author Member

Agreed, this is a non trivial change though because the Locker interface currently uses Unlock() and so does concurrency.Mutex but with a different function signature. I'm happy if someone cleans this up after.

tgraf added some commits Jan 31, 2018

cli: Add "cilium kvstore (get|set|delete)" tooling
Signed-off-by: Thomas Graf <thomas@cilium.io>
kvstore: New kvstore abstraction API
 - Basic lockless operations:
    - Get, GetPrefix, ListPrefix, Set, Delete, DeletePrefix, CreateOnly
    - ListAndWatch, CreateLease, KeepAlive
 - Advanced operations (etcd: lockless, consul: locking required)
    - CreateIfExists
 - Unit tests automatically test etcd and consul backends in single run,
   no need for special make targets anymore
 - Abstracted encoding of keys to allow for binary keys

 - New generic allocator:
   - Maps keys to identifiers using lockless operations in the fast path
   - Uses TTLs to protect reference counting keys in case the node
     disappears
   - Utilizes garbage collector which uses distribtued locks to release
     unused identities
   - Local cache of all identities and keys for fast allocation and
     retrieval
   - Local reference counting if key->id mapping is used multiple times
     from a single node
   - Prepared to support lockless operations with etcd 3.3

Fixes: #915
Fixes: #916
Fixes: #2629

Signed-off-by: Thomas Graf <thomas@cilium.io>
ginkgo: Double time for monitor to pick up relevant messages
Signed-off-by: Thomas Graf <thomas@cilium.io>
tests: Fix up identity-list tests
* The test was assuming that the kvstore still knew about an identity from a previous test
* Do not depend on the exact formatting of the CLI

Signed-off-by: Thomas Graf <thomas@cilium.io>

@tgraf tgraf force-pushed the kvstore-2.1 branch from 0840dba to eb520e1 Feb 7, 2018

@tgraf

This comment has been minimized.

Copy link
Member Author

commented Feb 7, 2018

test-me-please

@tgraf tgraf merged commit ed527e7 into master Feb 7, 2018

2 of 3 checks passed

Cilium-Ginkgo-Tests Build started sha1 is merged.
Details
Cilium-Bash-Tests Build finished.
Details
hound 7 violations found.

@tgraf tgraf deleted the kvstore-2.1 branch Feb 7, 2018

@ianvernon
Copy link
Contributor

left a comment

Really stoked for this! Some comments that can be addressed as follow-up.

@@ -213,6 +210,11 @@ func (d *Daemon) StartEndpointBuilders(nRoutines int) {
}
}

// GetTunnelMode returns the path to the state directory

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

Documentation is incorrect

@@ -1013,6 +1015,10 @@ func NewDaemon(c *Config) (*Daemon, error) {
ni, n := node.GetLocalNode()
node.UpdateNode(ni, n, node.TunnelRoute, nil)

// This needs to be done after the node addressing has been configured
// as the node address is required as sufix

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

sufix --> suffix

@@ -59,7 +59,7 @@ func (ds *DaemonSuite) TestEndpointAdd(c *C) {

func (ds *DaemonSuite) TestUpdateSecLabels(c *C) {

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

Would be good to rename this unit test to reflect renaming of functions being tested.

@@ -0,0 +1,67 @@
// Copyright 2016-2017 Authors of Cilium

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

Copyright header should include 2018

func newGetIdentityIDHandler(d *Daemon) GetIdentityIDHandler { return &getIdentityID{} }

func (h *getIdentityID) Handle(params GetIdentityIDParams) middleware.Responder {
nid, err := policy.ParseNumericIdentity(params.ID)

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

Can you add a log message similar to line 32? I've found such messages to be quite helpful when debugging.

@@ -0,0 +1,23 @@
// Copyright 2016-2017 Authors of Cilium

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

Update copyright header for 2018

// registerBackend must be called by kvstore backends to register themselves
func registerBackend(name string, module backendModule) {
if _, ok := registeredBackends[name]; ok {
log.Panicf("backend with name '%s' already registered", name)

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

Is panic level warranted here? Seems extreme, as if it's already registered this function can just return?

log.WithFields(logrus.Fields{
fieldRev: lastRev,
fieldWatcher: w,
}).Debugf("Emiting %v event for %s=%v", event.Typ, event.Key, event.Value)

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

Emitting

return nil
}

// Init must be called to initialize the

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

incomplete documentation

return kvstore.Encode(gi.SortedList())
}

// PutKey() decides a globalIdentity from its string representation

This comment has been minimized.

Copy link
@ianvernon

ianvernon Feb 7, 2018

Contributor

decodes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.