while pending:
    future = asyncio.wait(pending, return_when=asyncio.FIRST_COMPLETED)
    completed, pending = loop.run_until_complete(future)
    for task in completed:
        if self.bar:
I find the use of the BatchIterator unusual for downloading.
Say, for example, that we have to download 20 artifacts overall. The artifact sizes are 1 * 1 GB + 19 * 1 MB (they are yielded in this order by the artifact iterator).
With a batch size of 10, the first 10 downloads will be started in parallel. However, all of them need to terminate (including the 1 GB one) before the second batch is started. For most of the first batch's duration, only one download will be active while the remaining 10 queued downloads do not start.
I would expect pending to be filled with up to 10 downloads initially. Then it would be backfilled with one new download for each completed task, so that there are always 10 concurrent downloads. But perhaps I am missing something here...
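The backfilling behaviour described in this comment could look roughly like the sketch below. It is not the code from this PR: `fake_download`, `download_all`, and `simulate` are hypothetical names, and `asyncio.sleep` stands in for real network I/O. The only point is to show a window of `concurrent` tasks that is topped up whenever `asyncio.wait(..., return_when=asyncio.FIRST_COMPLETED)` returns.

```python
import asyncio
from itertools import islice


async def fake_download(name, seconds, finished):
    # Hypothetical stand-in for a real artifact download.
    await asyncio.sleep(seconds)
    finished.append(name)
    return name


async def download_all(coros, concurrent=10):
    """Run coroutines with at most `concurrent` in flight, backfilling as each completes."""
    coros = iter(coros)
    # Fill the window with up to `concurrent` downloads.
    pending = {asyncio.ensure_future(c) for c in islice(coros, concurrent)}
    results = []
    while pending:
        completed, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED)
        for task in completed:
            results.append(task.result())
        # Start one new download per completed one to keep the window full.
        for c in islice(coros, len(completed)):
            pending.add(asyncio.ensure_future(c))
    return results


def simulate():
    # One slow artifact followed by 19 fast ones, as in the example above
    # (durations are arbitrary and scaled down).
    finished = []
    durations = [0.3] + [0.01] * 19
    coros = (fake_download(i, d, finished) for i, d in enumerate(durations))
    asyncio.run(download_all(coros, concurrent=10))
    return finished
```

In this simulation the slow download (index 0) finishes last, because the 19 fast ones flow through the other nine slots while it is still running. With strict batches of 10, downloads 10 to 19 would not start until it had completed.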
Good catch @gmbnomis. You have identified a flaw. Using the BatchIterator here to support progressive progress reporting will still be useful, but the batch needs to be bigger (like the default: 1024). Then, within each batch, self.concurrent needs to be used differently to constrain the number of concurrent downloads. Will fix.
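One possible way to constrain concurrency inside a large batch is an `asyncio.Semaphore`. This is a sketch under assumptions, not the fix that was eventually made: `bounded`, `run_batch`, and `peak_concurrency` are hypothetical names, and the `concurrent` argument plays the role of self.concurrent.

```python
import asyncio


async def bounded(semaphore, coro):
    # Wait for a free slot before starting the wrapped coroutine.
    async with semaphore:
        return await coro


async def run_batch(batch, concurrent):
    """Await every coroutine in `batch`, at most `concurrent` at a time."""
    semaphore = asyncio.Semaphore(concurrent)
    # gather() returns results in the same order as `batch`.
    return await asyncio.gather(*(bounded(semaphore, c) for c in batch))


def peak_concurrency(jobs, concurrent):
    # Demo helper: run `jobs` dummy tasks and report how many overlapped at most.
    state = {"active": 0, "peak": 0}

    async def job(i):
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)
        state["active"] -= 1
        return i

    results = asyncio.run(run_batch([job(i) for i in range(jobs)], concurrent))
    return results, state["peak"]
```

Note that `gather()` creates one task per item in the batch up front, so a batch of 1024 means 1024 mostly idle tasks. The semaphore only limits how many of them are downloading at any moment.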
else:
    return


class Batch(Iterable, Sized):
Removing the Batch because it adds no value. Not sure what I was thinking :-/
Codecov Report
@@            Coverage Diff            @@
##           master   #3464      +/-   ##
=========================================
- Coverage   56.64%   0.24%   -56.4%
=========================================
  Files          59      59
  Lines        2415    2415
=========================================
- Hits         1368       6    -1362
- Misses       1047    2409    +1362
https://pulp.plan.io/issues/3582
Improvements:
ChangeSet.apply_and_drain() removed.
SizedIterator no longer used.
ContentIterator as it will be useful for other abstractions.