-
Notifications
You must be signed in to change notification settings - Fork 350
Rewrite metadata store #66
Rewrite metadata store #66
Conversation
CRI Validation test result is not changed with this PR:
|
6a918d5
to
22d4d80
Compare
hack/verify-lint.sh
Outdated
@@ -24,6 +24,7 @@ for d in $(find . -type d -a \( -iwholename './pkg*' -o -iwholename './cmd*' \)) | |||
--exclude='.*_test\.go:.*error return value not checked.*\(errcheck\)$' \ | |||
--exclude='duplicate of.*_test.go.*\(dupl\)$' \ | |||
--exclude='.*/mock_.*\.go:.*\(golint\)$' \ | |||
--exclude='declaration of "err" shadows declaration at.*\(vetshadow\)' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing will introduce the possibility of programming errors with err
vars.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that the check is known to be error prone. It sometimes report error and sometimes not, and I don't think something like:
if err := func1(); err != nil {
}
if err := func2(); err != nil {
}
should be an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only with err? Or are you arguing for a different recommended pattern of use of variables in go? Should we always redeclare :=
go variables instead of assigning via =? What's the point of have var declaration and assignment as separate language features if we are only going to use declaration and assignment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only with err. I mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously err := foo
is just a shortcut for var err error =
In your above use case you are repetitivly stating var err error, when all you are really doing is assigning with an =. In your use case the cleaner way is to declare first then just use err = since you don't like using := sometimes and = other times, yes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my question is then, why is err a special use case of variable declaration and assignment? I understand the thought for keeping each test independent but the variable redeclaration doesn't make the test more independent than using assignment. Symmetry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Offline discussed. Keep the check and update the code.
When you refactor you really refactor! :-) Will review from the perspective of it being a complete replacement. |
40aba30
to
ea21c3c
Compare
Will review today. |
@yujuhong Thanks~ |
pkg/store/container/container.go
Outdated
}, nil | ||
} | ||
|
||
// LoadContainer loads the internal used container type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove 'used' or s/internal used/internally used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/store/container/container.go
Outdated
// TODO(random-liu): Add stop channel to get rid of stop poll waiting. | ||
} | ||
|
||
// NewContainer creates a internal used container type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove 'used' or s/internal used/internally used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
||
// NewStore creates a container store. | ||
func NewStore() *Store { | ||
return &Store{containers: make(map[string]Container)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that you pass around Container
instead of *Container
in the Store operations. Is this intentional to prevent callers from modifying the struct (which cannot work completely without deep copying)? This would lead to more memory allocation/copying, so just want to understand the use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I use Container
rather than *Container
because:
- The value copy could prevent callers from mutating the struct fields in some way, because most fields are primitive types.
- Most importantly, when caller using the api, usually they won't expect to be able to mutate fields of a returned value, but may expect to mutate a returned pointer.
As for the overhead, metadata is small, so there shouldn't be too much overhead. :)
} | ||
|
||
// Delete deletes the container status from disk atomically. | ||
func (m *statusStorage) Delete() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can't we delete m.status
in this function? It looks like Delete()
is only meaningful if we back the data on the disk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't matter whether we delete m.status
or not, because once the container is removed from container list, no one should be able to retrieve and access it.
Delete m.status
will only cause race condition between operations which already get the reference.
|
||
// NewStore creates an image store. | ||
func NewStore() *Store { | ||
return &Store{images: make(map[string]Image)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question about the rationale of storing Image
vs *Image
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with above.
pkg/store/image/image.go
Outdated
) | ||
|
||
// Image contains all resources associated with the image. All fields | ||
// are **read-only**. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that true? In Add()
, you merge RepoTags and RepoDigest and write it back to the store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update the comment to "All fields MUST not be mutated directly".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this change universally on the other read only comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/store/container/container.go
Outdated
) | ||
|
||
// Container contains all resources associated with the container. All fields of | ||
// Container are thread safe. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how fields can be thread-safe. I think you mean the methods to mutate the status are thread-safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, thanks. Will update the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// Store stores all sandboxes. | ||
type Store struct { | ||
lock sync.RWMutex | ||
sandboxes map[string]Sandbox |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question about storing pointers vs. structs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
Will rebase this and address comments. |
Moved to v0.2.0, because this PR mainly:
There is no new feature introduced. Since we don't have time to finish the containerd client switch and permanent network namespace work. Let's also punt this one to v0.2.0. |
3847f17
to
454280a
Compare
454280a
to
82f82a8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see questions and comments.
@@ -1,183 +0,0 @@ | |||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest rename from refactor metadata store to rewrite metadata store. Or merge the delete and create commits into one so we can see the diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You won't see the diff even after I merge those 2. Will change to "rewrite".
return nil, fmt.Errorf("failed to add container metadata %+v into store: %v", | ||
meta, err) | ||
if err := c.containerStore.Add(container); err != nil { | ||
return nil, fmt.Errorf("failed to add container %q into store: %v", id, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good.. like the pattern using new() (container, error) then calling add with container vs using .create(meta). But it's strange that you had to defer the container.Status.Delete for a container that never gets added to the store?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the storing of the status in the NewContainer() call but don't understand why the status is stored before the container. Is that to make sure the container always has status? maybe the container store.add could be changed to add container with status to make it more atomic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add
need to lock the container list. I don't think we want to lock the list and do some disk operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Added Container.Delete
function.
@@ -45,15 +45,16 @@ func (c *criContainerdService) ExecSync(ctx context.Context, r *runtime.ExecSync | |||
} | |||
}() | |||
|
|||
// Get container metadata from our container store. | |||
meta, err := c.containerStore.Get(r.GetContainerId()) | |||
// Get container from our container store. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like the comment but the naming might get confusing down the road over the two container stores, ours and containerd's.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a problem... :(
|
||
var containers []*runtime.Container | ||
for _, meta := range metas { | ||
containers = append(containers, toCRIContainer(meta)) | ||
for _, container := range containersInStore { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is why containerd came up with the name task/tasks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean here? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you refactored from range metas to range containersInStore presumably cause you didn't like the term meta... and didn't have a better option than just calling the data container again... :-)
if err := c.sandboxStore.Create(meta); err != nil { | ||
return nil, fmt.Errorf("failed to add sandbox metadata %+v into store: %v", | ||
meta, err) | ||
sandbox.CreatedAt = time.Now().UnixNano() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer delete for state of sandbox?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't maintain state of sandbox. We infer that directly from sandbox container state.
We could not do that for container because we also need exit code, finished timestamp etc. which we have to checkpoint ourselves. But for sandbox, we only care about whether it is running or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok yeah been having some debug issues on the update PR for sandboxes that have a messed up state (can't be stopped) then later on can't be created cause /pause isn't there.. the rootfs is messed up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll help debug your PR over the weekend. :)
pkg/store/container/container.go
Outdated
Metadata | ||
// Status stores the status of the container. | ||
Status StatusStorage | ||
// TODO(random-liu): Add containerd container client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean in the future, we could this here, after we start to use containerd client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kk
pkg/store/container/metadata.go
Outdated
) | ||
|
||
// NOTE(random-liu): | ||
// 1) Metadata is read-only. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? you mean can't be updated after create?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, will update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
pkg/store/container/metadata.go
Outdated
|
||
// NOTE(random-liu): | ||
// 1) Metadata is read-only. | ||
// 2) Metadata is checkpointed as containerd container label. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just confirming we'll use labels not annotations for all metadata so we can find the container by the metadata labels or... maybe some combination of labels and annotations...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
containerd only has label, there is no notion of annotation in containerd.
And we store metadata into label just for checkpoint, not for filtering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oci will have the annotations.. and I suspect containerd will let them pass through without providing first class support.. I was thinking about image labels.. my bad.
pkg/store/image/image.go
Outdated
) | ||
|
||
// Image contains all resources associated with the image. All fields | ||
// are **read-only**. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this change universally on the other read only comments?
Signed-off-by: Lantao Liu <lantaol@google.com>
Signed-off-by: Lantao Liu <lantaol@google.com>
Signed-off-by: Lantao Liu <lantaol@google.com>
Signed-off-by: Lantao Liu <lantaol@google.com>
82f82a8
to
f4df66e
Compare
@mikebrow Addressed comments. The code could be cleaned up more, especially when we start to use containerd client. We could do that in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Rewrite metadata store
Respect OCI image config default username
Previously, containerd only managed running container, so we have to do checkpoint to maintain complete lifecycle of a container and sandbox. That's why we introduced the metadata store, which was targeted to be a checkpoint store for container, sandbox and image if needed.
However, it seems that we should not rely on the metadata store now, because:
Update
function, which has some limitations:a)
Update
function is not flexible enough, e.g. for image, we needCreateOrUpdate
operation to create a metadata if it does not exsit, or update it if does.b)
Update
and checkpoint doesn't make sense to some in-memory data we need to associate with the metadata lifecycle, e.g. containerd container/image client, cni netns object, stop container channel etc.c) The whole metadata store logic is unnecessarily complex given that we don't need a general checkpoint solution now.
Thus I made this change. Relatively large, but help us avoid future pain.
Use container store as an example, the store is organized like this:
For sandbox, we don't need status.
For image, we don't event need a separate
Metadata
field, because containerd provided everything we need.