-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
progress bars: use github.com/vbauerster/mpb #581
Conversation
I'll spin up a Skopeo CI PR, where you can also test. |
de3c68b
to
f42f87e
Compare
f42f87e
to
711b2c1
Compare
8087313
to
762977b
Compare
Works for me, but need the skopeo change. |
b7e4147
to
67f0a25
Compare
Had two deadlocks in the code which are now fixed and Skopeo is green (see containers/skopeo#602). |
LGTM |
67f0a25
to
27dc728
Compare
(updated to latest master) |
27dc728
to
49da360
Compare
I've left a comment in the Skopeo's PR, I am seeing a problem when I try to pull |
5b9542e
to
e8eedb8
Compare
Use the github.com/vbauerster/mpb library for displaying the progress bars. It is actively maintained, is designed for multi-bar usage and just gets the job done. This will fix issues with single bars being displayed multiple times. Note that updating the bar's prefix does not work anymore. We're now logging those entries, for instance, when we skip a blob as it's already present. The output now looks as follows (e.g., when pulling jboss/wildfly): ``` Getting image source signatures Copying blob aeb7866da422 [======================================] 71.2 MiB / 71.2 MiB Copying blob 157601a0b538 [======================================] 10.3 MiB / 10.3 MiB Copying blob 642f4164f381 [======================================] 1.9 KiB / 1.9 KiB Copying blob bda512e97517 [======================================] 74.6 MiB / 74.6 MiB Copying blob 4cccaafdae21 [======================================] 170.9 MiB / 170.9 MiB Copying config 2602b48525 [======================================] 6.5KiB / 6.5KiB Writing manifest to image destination Storing signatures ``` Note that this change requires the following dependencies: * github.com/vbauerster/mpb v3.3.4 * github.com/mattn/go-isatty v0.0.4 * github.com/VividCortex/ewma v1.1.1 Once we migrate to using Go modules, we can start using mpb v4.x. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
e8eedb8
to
d3f7fcb
Compare
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m sorry about getting to this so late. I can do the smaller cleanups myself.
@vrothberg PTAL at least at the Progress.Wait
comment — am I missing anything?
destInfo, err := c.copyBlobFromStream(ctx, bytes.NewReader(configBlob), srcInfo, nil, false, true, bar) | ||
if err != nil { | ||
return err | ||
} | ||
if bar != nil { | ||
bar.SetTotal(0, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be 0 or len(configBlob)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 is intended. It has no impact on the displayed total size but it will trigger the bar to be completed.
canModifyBlob bool, isConfig bool, bar *pb.ProgressBar) (types.BlobInfo, error) { | ||
canModifyBlob bool, isConfig bool, bar *mpb.Bar) (types.BlobInfo, error) { | ||
c.progressWG.Add(1) | ||
defer c.progressWG.Done() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does c.progressWG
do anything we need? The synchronization of parallel layer copy AFAICS happens independently via copyGroup
, and the only effect of mpb.WithWaitGroups
seems to be that ic.c.progressPool.Wait()
blocks on c.progressWG
in addition to doing everything else it does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's intended as well. While testing this PR, we've experienced situations where the progress was still rendering although the copy operations have completed (including the one of the config). It's a background process with a specific refresh rate and the copier's WaitGroup makes sure that we progress only when it's done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the copier's WaitGroup makes sure that we progress only when it's done.
I can’t see how that works; the only use of progressWG
inside the code is p.uwg.Wait()
immediately at the start of Progress.Wait
, which AFAICS is immediately satisfied because copyGroup.Wait
ensured that all layer copies are done, and copyConfig
has returned.
Progress.Wait
does wait for the separate render process to finish, via pState.zeroWait
(“please exit”) and Progress.done
(“exited”); I can’t see that this depends on progressWG
in any way.
@@ -600,13 +580,31 @@ func (ic *imageCopier) copyUpdatedConfigAndManifest(ctx context.Context) ([]byte | |||
return nil, err | |||
} | |||
|
|||
ic.c.progressPool.Wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation of Progress.Wait
says
After this method has been called, there is no way to reuse *Progress instance.
The middle of copyUpdatedConfigAndManifest
is not a good place for such a call, because copyUpdatedConfigAndManifest
can be called multiple times per image (as well, as, strictly speaking, zero times if an error happens before). Either the pool should live across all of those calls (and in that case the “Writing manifest to image destination” message just below needs to happen in some way that does not interfere with the progressPool
), or it should be closed before even starting to deal with configs/manifests (using just a message, not progress bar, for the config, or maybe creating an one-use single-bar progress pool just for each attempt to copy a manifest).
(Similar in nature is a concern about attaching progressPool
to copier
vs. imageCopier
; right now that makes ~zero difference; longer-term after #400 is resurrected, copier
might live across copies of multiple single-arch images, whereas imageCopier
would carry state related to one of them.)
(Purely aesthethically it would be nice for the “closing” of the progress pool to happen (or, to be guaranteed to happen as a last-resort backstop) at the same level of the call stack that creates the progress pool.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, that's intended. We need to make sure the progress bar has finished rendering before printing, see the next line ic.c.Printf("Writing manifest to image destination\n")
. I understand it's not ideal when looking at the assumptions on the code, but I could not find any other way and am very hesitant to rewrite copy.go
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Similar in nature is a concern about attaching
progressPool
tocopier
vs.imageCopier
; right now that makes ~zero difference; longer-term after #400 is resurrected,copier
might live across copies of multiple single-arch images, whereasimageCopier
would carry state related to one of them.)
It's possible to add new bars dynamically and it will render correctly as long as nothing prints in between.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Purely aesthethically it would be nice for the “closing” of the progress pool to happen (or, to be guaranteed to happen as a last-resort backstop) at the same level of the call stack that creates the progress pool.)
I share that desire but mpb doesn't expose a Nuke()
API. It is possible though to nuke each bar indivually which would allow to work around that.
} | ||
} else { | ||
cld.destInfo, cld.diffID, cld.err = ic.copyLayer(ctx, srcLayer, bar) | ||
} | ||
data[index] = cld | ||
if bar != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bar
can never be nil
AFAICS, so drop this check. (The code would crash in bar.ProxyReader
anyway.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was one test where exactly this case happened while copying a config. It's really just a defense for corner cases.
destInfo, err := c.copyBlobFromStream(ctx, bytes.NewReader(configBlob), srcInfo, nil, false, true, bar) | ||
if err != nil { | ||
return err | ||
} | ||
if bar != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bar
can never be nil
AFAICS, so drop this check. (The code would crash in bar.ProxyReader
anyway.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was one test case where this happened. I could not figure out why and also had a look at the mpb code but figured to just add the check to defend from the corner case. I think it's related to non-TTY as the test case (i.e., TestCopyOCIRoundTrip
) worked when doing it manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second, working around crashes without understanding how they happen is not great for reliability. AFAICS the only way bar
can be nil
is if AddBar
returns nil
, which in turn can happen only if Progress.Wait
has been called.
… which probably points to #581 (comment) , calling Wait
and looping to copyUpdatedConfigAndManifest
as the underlying cause: the code tried schema2, failed, tried schema1, and copyConfig
did not use the bar
(and did not trigger the bar.ProgressReader
crash) because schema1 does not have a separate config object.
I’ll take a stab at fixing this; if you can remember what the failing situation was (or is it just straightforward make test-skopeo
failing?), that would be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely right, and I was/am not happy with it either. The failure is straightforward to reproduce with integration tests (i.e., TestCopyOCIRoundTrip
).
I think a more or less easy solution is to break out the parts printing (including related code) and move the Wait() call to the end of copyOneImage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, reproduced. I was not quite right about this happening with schema1 fallback: it is copying an OCI image and falling back to schema2. For the record, the reason it does not crash in bar.ProgressReader
is that creating the ProgressReader
does not access bar
, only non-empty reads do, and PutBlob
detects that the config is already present at the destination, so no reads happen.
I think I know how to deal with this…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I know how to deal with this…
Thank you so much for looking into this 🙏
All your comments are valid concerns, but I couldn't find any other way without refactoring parts of
You missed a lot of pain :^) The progress bars are always "rendered" in the background with a specific refresh rate, which makes them subject to race conditions with the code. This is why the "old" library had issues and mpb avoids such situations by integration a |
Use the github.com/vbauerster/mpb library for displaying the progress
bars. It is actively maintained, is designed for multi-bar usage and
just gets the job done.
This will fix issues with single bars being displayed multiple times.
Note that updating the bar's prefix does not work anymore. We're now
logging those entries, for instance, when we skip a blob as it's
already present.
The output now looks as follows (e.g., when pulling jboss/wildfly):
Signed-off-by: Valentin Rothberg rothberg@redhat.com