Store container names in memdb #33886

Merged
merged 3 commits into moby:master on Jul 17, 2017

Conversation

7 participants
@aaronlehmann
Contributor

aaronlehmann commented Jun 30, 2017

Currently, names are maintained by a separate system called "registrar".
This means there is no way to atomically snapshot the state of
containers and the names associated with them.

We can add this atomicity and simplify the code by storing name
associations in the memdb. This removes the need for pkg/registrar, and
makes snapshots a lot less expensive because they no longer need to copy
all the names. This change also avoids some problematic behavior from
pkg/registrar where it returns slices which may be modified later on.

Note that while this change makes the snapshotting atomic, it doesn't
yet do anything to make sure containers are named at the same time that
they are added to the database. We can do that by adding a transactional
interface, either as a followup, or as part of this PR.

See #33863 and #33885
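
For illustration, a minimal sketch of how name associations could be kept in go-memdb; the table, index, and type names here are assumptions for the example, not necessarily what this change uses:

package main

import (
	"errors"
	"fmt"

	"github.com/hashicorp/go-memdb"
)

// nameAssociation maps a reserved name to the container ID that owns it.
type nameAssociation struct {
	Name        string
	ContainerID string
}

var errNameReserved = errors.New("name is reserved")

// schema defines a "names" table keyed by Name, with a secondary
// (non-unique) index on ContainerID so all names owned by a container
// can be found and released together.
var schema = &memdb.DBSchema{
	Tables: map[string]*memdb.TableSchema{
		"names": {
			Name: "names",
			Indexes: map[string]*memdb.IndexSchema{
				"id": {
					Name:    "id",
					Unique:  true,
					Indexer: &memdb.StringFieldIndex{Field: "Name"},
				},
				"containerid": {
					Name:    "containerid",
					Indexer: &memdb.StringFieldIndex{Field: "ContainerID"},
				},
			},
		},
	},
}

func main() {
	db, err := memdb.NewMemDB(schema)
	if err != nil {
		panic(err)
	}

	// Reserving a name: check and insert inside one write transaction.
	txn := db.Txn(true)
	if existing, err := txn.First("names", "id", "/web"); err != nil || existing != nil {
		txn.Abort()
		panic(errNameReserved)
	}
	if err := txn.Insert("names", &nameAssociation{Name: "/web", ContainerID: "abc123"}); err != nil {
		txn.Abort()
		panic(err)
	}
	txn.Commit()

	// A snapshot is just a read transaction: it sees a consistent view of
	// names (and, in the real schema, containers) without copying anything.
	view := db.Txn(false)
	if raw, _ := view.First("names", "id", "/web"); raw != nil {
		fmt.Println(raw.(*nameAssociation).ContainerID) // "abc123"
	}
}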

I still need to validate these changes and add test coverage.

cc @fabiokung @cpuguy83

Contributor

aaronlehmann commented Jun 30, 2017

Note that while this change makes the snapshotting atomic, it doesn't
yet do anything to make sure containers are named at the same time that
they are added to the database. We can do that by adding a transactional
interface, either as a followup, or as part of this PR.

Thinking about this a little bit, usually the only name we care about is container.Name. The only case where we have other names is when links are in use, and I don't think there's any point in adding those transactionally. So maybe all we need to do here is automatically reserve container.Name when a container is added to the database. We're already automatically unregistering the name when the container is deleted.

Edit: Actually, there's no point in doing this, because we want to pre-reserve the name before we create the container. Otherwise, we might go through all the work of creating the container, and then find out the name is taken when we try to insert it into the memdb. So I think the PR makes sense as-is, and we just have to accept that some names in the DB might not have corresponding containers yet (see #33883).
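
A rough sketch of the ordering described above, under the assumption of a simple name-reservation interface (all names here are illustrative, not the daemon's actual API):

// nameReserver is a stand-in for whatever the names database exposes.
type nameReserver interface {
	ReserveName(name, containerID string) error
	ReleaseName(name string) error
}

// createContainer sketches the ordering discussed above: reserve the name
// first so a conflict is caught before any expensive work, release it again
// if creation fails, and accept that between the two calls the name exists
// in the database without a corresponding container (see #33883).
func createContainer(db nameReserver, name, id string, build func() error) error {
	if err := db.ReserveName(name, id); err != nil {
		return err // name already taken; nothing was created
	}
	if err := build(); err != nil {
		db.ReleaseName(name)
		return err
	}
	return nil
}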

LK4D4 requested a review from cpuguy83 on Jul 6, 2017

aaronlehmann changed the title from "[WIP] Store container names in memdb" to "Store container names in memdb" on Jul 6, 2017

Contributor

aaronlehmann commented Jul 6, 2017

Rebased and added test coverage. PTAL

@fabiokung

I like this.

Other than the inline comments, the only remark I have is that this adds more lock contention to all container update operations.

Hopefully all name reservation operations are "quick" and don't hold the lock for long. Everything I could spot seemed to be O(1), but it may not be a bad idea to put this through some concurrency tests and try to measure lock contention with lots of names reserved and containers being modified in parallel.
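
As a sketch of the kind of contention test suggested here (in a _test.go file importing fmt, sync/atomic, and testing; newTestViewDB, ReserveName, and ReleaseName are placeholders for whatever the final API exposes):

func BenchmarkReserveReleaseParallel(b *testing.B) {
	db := newTestViewDB(b) // assumed test helper returning the memdb-backed view
	var counter int64
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			// Unique names per iteration so the benchmark measures lock
			// contention rather than reservation conflicts.
			name := fmt.Sprintf("/name-%d", atomic.AddInt64(&counter, 1))
			if err := db.ReserveName(name, "some-container-id"); err != nil {
				b.Error(err)
				return
			}
			if err := db.ReleaseName(name); err != nil {
				b.Error(err)
				return
			}
		}
	})
}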

Contributor

aaronlehmann commented Jul 10, 2017

I've added a new commit with the requested changes. I have mixed feelings about these changes. I think you are correct that adding these error checks and aborts is technically the right thing to do, but I'm concerned that it makes the code more complex and fragile. Since we can't use defer txn.Commit(), it will be much easier to leak the transaction lock in the future. I'm not sure there are any failure modes in practice that would cause Abort vs Commit to make a difference - the only paths that would allow us to commit a change partially seem to be internal memdb errors that shouldn't happen.

I'm not concerned about the lock contention. I think that's the price that has to be paid for keeping a consistent view of naming. All of these reservation and release operations should be trivial and only hold the lock very briefly.

Contributor

fabiokung commented Jul 10, 2017

(snip), but I'm concerned that it makes the code more complex and fragile.

This may be a slightly more robust way to abort on errors, and it avoids leaking the transaction lock: https://play.golang.org/p/KGszH6RfPP

The error handling could be extracted into a func commitOrAbort(err error), or something similar, and be used as defer one-liners (defer commitOrAbort(retErr)) as well.
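
One wrinkle with defer commitOrAbort(retErr) is that the argument would be evaluated when the defer statement runs (while retErr is still nil), so retErr needs to be a named return value captured by a closure, or passed as a pointer. A minimal sketch, with memDB, store, ErrNameReserved, and nameAssociation standing in for whatever the real code uses:

// ReserveName sketches the defer-based variant described above.
func (db *memDB) ReserveName(name, containerID string) (retErr error) {
	txn := db.store.Txn(true)
	// commitOrAbort reads the named return value when the function exits,
	// so the decision is based on the final error, not the value at the
	// time the defer statement ran.
	commitOrAbort := func() {
		if retErr != nil {
			txn.Abort()
			return
		}
		txn.Commit()
	}
	defer commitOrAbort()

	if existing, err := txn.First("names", "id", name); err != nil {
		return err
	} else if existing != nil {
		return ErrNameReserved
	}
	return txn.Insert("names", &nameAssociation{Name: name, ContainerID: containerID})
}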

Contributor

fabiokung commented Jul 10, 2017

Alternatively, you can also go the functional route with:

func ...() {
	withTx(func(tx boltdb.Tx) error {
		// ...
	})
}
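
Concretely, the functional route might look something like the sketch below; withTxn, memDB, store, and the "names" table are illustrative, and the store in this PR is memdb rather than boltdb:

// withTxn is a sketch of the functional wrapper: the callback only ever
// sees a transaction that this helper then commits or aborts, so callers
// cannot leak the write lock.
func (db *memDB) withTxn(cb func(*memdb.Txn) error) error {
	txn := db.store.Txn(true)
	if err := cb(txn); err != nil {
		txn.Abort()
		return err
	}
	txn.Commit()
	return nil
}

// Example caller: releasing a name can no longer forget to commit or abort.
func (db *memDB) ReleaseName(name string) error {
	return db.withTxn(func(txn *memdb.Txn) error {
		return txn.Delete("names", &nameAssociation{Name: name})
	})
}
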
Contributor

aaronlehmann commented Jul 10, 2017

I've taken the functional approach.

@@ -77,7 +76,7 @@ func (daemon *Daemon) reserveName(id, name string) (string, error) {
 }
 func (daemon *Daemon) releaseName(name string) {
-	daemon.nameIndex.Release(name)
+	daemon.containersReplica.ReleaseName(name)

@fabiokung

fabiokung Jul 10, 2017

Contributor

worth at least logging the error here?

@aaronlehmann

aaronlehmann Jul 10, 2017

Contributor

I'm not sure whether this is used in cases where the name may not be reserved. I didn't want to change the behavior of this part of the code, only the implementation.
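
For reference, a minimal sketch of what logging without otherwise changing behavior could look like (assuming logrus, which the daemon already uses, is in scope):

if err := daemon.containersReplica.ReleaseName(name); err != nil {
	logrus.WithError(err).Warnf("could not release container name %q", name)
}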

-			id, err := daemon.nameIndex.Get(name)
+	if err := daemon.containersReplica.ReserveName(name, id); err != nil {
+		if err == container.ErrNameReserved {
+			id, err := daemon.containersReplica.Snapshot().GetID(name)

@fabiokung

fabiokung Jul 10, 2017

Contributor

minor nit here: there's still a small chance of a race. If the name gets released after the attempt failed, id will be empty and the output will look weird. Not sure it matters much, since it's a corner case and only informational, but it may cause confusion if there's enough concurrency on the daemon to trigger it.

The proper fix would be to return the conflicting id from the Reserve call that failed (from the same memdb transaction).

@aaronlehmann

aaronlehmann Jul 10, 2017

Contributor

I agree, this really should be transactional. But I think that since the impact of the race is only missing information in the log message, it's not very important to fix right now.
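
A sketch of that transactional variant, reusing the assumed "names" table and nameAssociation type from the earlier sketch: the conflicting ID is looked up inside the same write transaction as the failed reservation, so it can never be stale or empty.

func (db *memDB) ReserveName(name, containerID string) (string, error) {
	txn := db.store.Txn(true)

	existing, err := txn.First("names", "id", name)
	if err != nil {
		txn.Abort()
		return "", err
	}
	if existing != nil {
		// The conflicting ID is read in the same transaction as the failed
		// reservation, so it cannot have been released in the meantime.
		txn.Abort()
		return existing.(*nameAssociation).ContainerID, ErrNameReserved
	}
	if err := txn.Insert("names", &nameAssociation{Name: name, ContainerID: containerID}); err != nil {
		txn.Abort()
		return "", err
	}
	txn.Commit()
	return "", nil
}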

@@ -55,7 +55,7 @@ func (daemon *Daemon) ContainerRename(oldName, newName string) error {
 	}
 	for k, v := range links {
-		daemon.nameIndex.Reserve(newName+k, v.ID)
+		daemon.containersReplica.ReserveName(newName+k, v.ID)

@fabiokung

fabiokung Jul 10, 2017

Contributor

Not necessarily related to the changes being made, but all these swallowed errors (here and on lines 69, 61 and 74) make me a bit nervous 😬

@@ -128,7 +128,6 @@ func (daemon *Daemon) cleanupContainer(container *container.Container, forceRemo
 		return errors.Wrapf(err, "unable to remove filesystem for %s", container.ID)
 	}
-	daemon.nameIndex.Delete(container.ID)

@fabiokung

fabiokung Jul 10, 2017

Contributor

Maybe I missed it, but I didn't see this being replaced by something else.

Missing an if err := daemon.containersReplica.ReleaseName(...); err != nil {...} here?

@aaronlehmann

aaronlehmann Jul 10, 2017

Contributor

The name is released by daemon.containersReplica.Delete(container) below.
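
For context, a sketch of how Delete can cover the name release inside a single write transaction (the containers table, containerid index, and Container type are assumptions for the example, not the PR's actual code):

func (db *memDB) Delete(c *Container) error {
	txn := db.store.Txn(true)

	// Drop every name reservation pointing at this container ID, then the
	// container itself, inside a single write transaction.
	if _, err := txn.DeleteAll("names", "containerid", c.ID); err != nil {
		txn.Abort()
		return err
	}
	if err := txn.Delete("containers", c); err != nil {
		txn.Abort()
		return err
	}
	txn.Commit()
	return nil
}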

Contributor

fabiokung commented Jul 11, 2017

LGTM

Contributor

cpuguy83 commented Jul 13, 2017

SGTM

Needs a rebase.

Contributor

cpuguy83 commented Jul 13, 2017

What do you think about also doing the same treatment for IDs, which are currently stored in a separate patricia trie for prefix matching (as a separate PR)?

aaronlehmann added some commits Jun 30, 2017

Store container names in memdb
Currently, names are maintained by a separate system called "registrar".
This means there is no way to atomically snapshot the state of
containers and the names associated with them.

We can add this atomicity and simplify the code by storing name
associations in the memdb. This removes the need for pkg/registrar, and
makes snapshots a lot less expensive because they no longer need to copy
all the names. This change also avoids some problematic behavior from
pkg/registrar where it returns slices which may be modified later on.

Note that while this change makes the *snapshotting* atomic, it doesn't
yet do anything to make sure containers are named at the same time that
they are added to the database. We can do that by adding a transactional
interface, either as a followup, or as part of this PR.

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>

container: Abort transactions when memdb calls fail
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>

container: Use wrapper to ensure commit/abort happens
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Contributor

aaronlehmann commented Jul 13, 2017

Rebased.

What do you think about also doing the same treatment for IDs, which are currently stored in a separate patricia trie for prefix matching (as a separate PR)?

This would be an excellent way to support ID lookups. memdb uses a prefix tree internally and supports prefix matching for lookups. The IDs are already stored in the container objects, so there would be no need to store anything else. We could just remove the other, now-redundant code.
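
As a sketch of what that could look like with go-memdb's built-in prefix queries (the containers table and Container type are placeholders for the daemon's actual types):

// containersByIDPrefix returns every container whose ID starts with the
// given prefix, using memdb's "<index>_prefix" lookup on a StringFieldIndex.
func containersByIDPrefix(db *memdb.MemDB, prefix string) ([]*Container, error) {
	view := db.Txn(false) // read-only snapshot
	iter, err := view.Get("containers", "id_prefix", prefix)
	if err != nil {
		return nil, err
	}
	var matches []*Container
	for obj := iter.Next(); obj != nil; obj = iter.Next() {
		matches = append(matches, obj.(*Container))
	}
	return matches, nil
}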

@cpuguy83

LGTM

Contributor

ehazlett commented Jul 17, 2017

LGTM

ehazlett merged commit 458f671 into moby:master on Jul 17, 2017

6 checks passed

dco-signed: All commits are signed
experimental: Jenkins build Docker-PRs-experimental 35662 has succeeded
janky: Jenkins build Docker-PRs 44278 has succeeded
powerpc: Jenkins build Docker-PRs-powerpc 4655 has succeeded
windowsRS1: Jenkins build Docker-PRs-WoW-RS1 15656 has succeeded
z: Jenkins build Docker-PRs-s390x 4354 has succeeded