"podman manifest add" is not concurrent safe #14667
Comments
Hi @Romain-Geissler-1A, thanks for creating the issue. I have not played with the patch yet, but I think the manifest-list API needs a lock. Buildah has similar code for commit, where it gets the locker for the manifest list and holds it while adding an image to the list; I think we need to do the same here.

Caution: I have not played with the patch yet, but something like this should work:

diff --git a/libpod/runtime.go b/libpod/runtime.go
index 6c8a99846..d7a0be55d 100644
--- a/libpod/runtime.go
+++ b/libpod/runtime.go
@@ -86,7 +86,6 @@ type Runtime struct {
libimageRuntime *libimage.Runtime
libimageEventsShutdown chan bool
lockManager lock.Manager
-
// Worker
workerChannel chan func()
workerGroup sync.WaitGroup
@@ -1040,6 +1039,11 @@ func (r *Runtime) configureStore() error {
return nil
}
+// GetStore returns the runtime's storage.Store.
+func (r *Runtime) GetStore() storage.Store {
+ return r.store
+}
+
// LibimageRuntime ... to allow for a step-by-step migration to libimage.
func (r *Runtime) LibimageRuntime() *libimage.Runtime {
return r.libimageRuntime
diff --git a/pkg/domain/infra/abi/manifest.go b/pkg/domain/infra/abi/manifest.go
index 8b52c335c..30640a589 100644
--- a/pkg/domain/infra/abi/manifest.go
+++ b/pkg/domain/infra/abi/manifest.go
@@ -10,6 +10,7 @@ import (
"github.com/containers/common/libimage"
cp "github.com/containers/image/v5/copy"
"github.com/containers/image/v5/manifest"
+ "github.com/containers/common/libimage/manifests"
"github.com/containers/image/v5/pkg/shortnames"
"github.com/containers/image/v5/transports"
"github.com/containers/image/v5/transports/alltransports"
@@ -185,6 +186,12 @@ func (ir *ImageEngine) ManifestAdd(ctx context.Context, name string, images []st
if err != nil {
return "", err
}
+ locker, err := manifests.LockerForImage(ir.Libpod.GetStore(), manifestList.ID())
+ if err != nil {
+ return "", err
+ }
+ locker.Lock()
+ defer locker.Unlock()
addOptions := &libimage.ManifestListAddOptions{
All: opts.All,

Or (this one is what I would prefer, but it would require all callers to stop locking at a higher level, breaking the patch):

diff --git a/vendor/github.com/containers/common/libimage/manifest_list.go b/vendor/github.com/containers/common/libimage/manifest_list.go
index 4e8959004..db0a8b629 100644
--- a/vendor/github.com/containers/common/libimage/manifest_list.go
+++ b/vendor/github.com/containers/common/libimage/manifest_list.go
@@ -221,6 +221,12 @@ type ManifestListAddOptions struct {
// Add adds one or more manifests to the manifest list and returns the digest
// of the added instance.
func (m *ManifestList) Add(ctx context.Context, name string, options *ManifestListAddOptions) (digest.Digest, error) {
+ locker, err := manifests.LockerForImage(m.image.runtime.store, m.ID())
+ if err != nil {
+ return "", err
+ }
+ locker.Lock()
+ defer locker.Unlock()
if options == nil {
options = &ManifestListAddOptions{}
}
Ideally libimage should take care of that. If all callers of an API have to lock, it smells like something's missing.
`podman manifest add` uses `ManifestList.Add()`, but as of now `Add()` does not lock while adding instances to the list, causing race scenarios where storage is not reloaded and is overridden by another invocation of the command. The problem is solved in two steps:

* Add -> LockByInstance: acquire a filesystem lock keyed by instance ID, so other invocations wait until this invocation completes its write.
* Add -> LockByInstance -> reload: reload instance digests from storage just after acquiring the lock, to make sure we are not overriding any just-written instance.

Reproducer: containers/podman#14667 (comment)
Closes: containers/podman#14667

[NO NEW TESTS NEEDED] This needs integration tests, so it is hard to verify the race in CI.

Signed-off-by: Aditya R <arajan@redhat.com>
Created a PR and verified manually across a number of runs using the reproducer shared here: #14667 (comment)

[fl@fedora bin]$ ./verify.sh
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
@Romain-Geissler-1A Above PR in
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind bug
Description
The command "podman manifest add" is not concurrent safe. If you try to add different images to the same manifest in parallel, sometimes one image is "lost".
Steps to reproduce the issue:
Reproduced from inside a quay.io/podman/upstream container started with --privileged mode.

Build two dummy images, my.registry/my-namespace/my-image:aarch64 and my.registry/my-namespace/my-image:x86_64, using 2 distinct architectures, then run:

[root@5c9771d2427c /]# for i in {1..100}; do podman manifest create my-manifest-$i; podman manifest add my-manifest-$i containers-storage:my.registry/my-namespace/my-image:aarch64 & podman manifest add my-manifest-$i containers-storage:my.registry/my-namespace/my-image:x86_64 & done

Wait some time until the previous commands are finished.
Now inspect the manifests: some contain two images, others just one.
Describe the results you received:
Manifests are created with "random" number of images.
Describe the results you expected:
Manifest should always contain 2 images.
Additional information you deem important (e.g. issue happens only occasionally):
Output of podman version: