Mitigate a disk consumption issue during sync #7113
Conversation
```diff
      yield content

- async def batches(self, minsize=500):
+ async def batches(self, minsize=settings.MAX_CONCURRENT_CONTENT):
```
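For context, here is a minimal, self-contained sketch of the batching pattern this iterator implements (names and structure are hypothetical, not pulpcore's actual `batches` implementation): items are accumulated off a queue and only yielded once at least `minsize` have arrived, or the stream ends.

```python
import asyncio

async def produce(queue, items):
    # Feed items into the queue, then signal end-of-stream with None.
    for item in items:
        await queue.put(item)
    await queue.put(None)

async def batches(queue, minsize=500):
    # Accumulate items and yield them in batches of at least `minsize`,
    # flushing whatever remains when the stream ends.
    batch = []
    finished = False
    while not finished:
        item = await queue.get()
        if item is None:
            finished = True
        else:
            batch.append(item)
        if batch and (len(batch) >= minsize or finished):
            yield batch
            batch = []

async def main():
    queue = asyncio.Queue()
    producer = asyncio.ensure_future(produce(queue, range(12)))
    sizes = [len(b) async for b in batches(queue, minsize=5)]
    await producer
    return sizes

result = asyncio.run(main())
print(result)  # [5, 5, 2]
```

The larger `minsize` is, the longer downloaded items sit buffered before the downstream stage sees them, which is the behavior under discussion here.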
I think this probably makes sense, but I have some additional thoughts... I wonder if it's not sufficient.
- Maybe the `ArtifactSaver` stage should use an independently configurable batch size, such that it can be made lower without compromising the performance of the rest of the pipeline?
- Does it actually make sense to use `settings.MAX_CONCURRENT_CONTENT` here? The description of it (see "Allow batch size of artifacts processed during sync to be configured #7037") makes it sound very much like that would be the case. But currently that value is only used within the `ArtifactDownloader` stage, and maybe the meaning is distinct enough to warrant being independent anyway.
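One way to get both behaviors would be a stage-specific setting that falls back to the shared knob when unset. A hedged sketch of that pattern (the `ARTIFACT_SAVER_BATCH_SIZE` name is hypothetical, not a real pulpcore setting, and the values are illustrative):

```python
class Settings:
    # Shared tuning knob (illustrative value).
    MAX_CONCURRENT_CONTENT = 200
    # Hypothetical stage-specific override; None means "use the shared value".
    ARTIFACT_SAVER_BATCH_SIZE = None

settings = Settings()

def saver_batch_size(settings):
    # Prefer the stage-specific value; otherwise reuse the shared one.
    return settings.ARTIFACT_SAVER_BATCH_SIZE or settings.MAX_CONCURRENT_CONTENT

print(saver_batch_size(settings))  # 200
settings.ARTIFACT_SAVER_BATCH_SIZE = 25
print(saver_batch_size(settings))  # 25
```

This keeps a single default while still letting a disk-constrained deployment tune the saver's batching independently.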
So GenericDownloader's default is 200 and this iterator's is 500 - but I suspect that both of those numbers are in the category of "well this seems to be a reasonable number". I don't think the possible gain in fine-grained-optimization is worth the complexity of having them be separate. FWIW, I'd go with this implementation, unless/until there's some serious proof it's a bad idea.
The main question on my side is whether we might want the value to be significantly below 200 for "how many artifacts do we want to save at a time (how many should be batched up)". Maybe we do, and maybe there's a decent reason to keep the other value around 200, or at least a good reason not to drop it as low as we might want the batching before the ArtifactSaver stage.
But otherwise I agree with you, I'd love for it to be this simple.
The ArtifactSaver stage is acting as a bottleneck due to batching, and as a result artifacts downloaded by the ArtifactDownloader stage aren't being flushed out quickly enough. Using a default batch size of 500 is too much for some stages.

closes pulp#7064
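A rough way to see the disk pressure (assumed numbers, purely illustrative): artifacts that have been downloaded but not yet saved can accumulate until the saver's batch fills, so worst-case transient disk usage scales with the batch size.

```python
def transient_disk_mb(batch_size, avg_artifact_mb):
    # Worst case: a full batch of downloaded-but-unsaved artifacts on disk.
    return batch_size * avg_artifact_mb

print(transient_disk_mb(500, 10))  # 5000 MB buffered with a batch size of 500
print(transient_disk_mb(25, 10))   # 250 MB buffered with a batch size of 25
```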
I tested making the … @balasankarc Was there a particular reason for choosing the number 200? Is there any good reason for keeping these independent (e.g. reducing the batching even more while keeping the concurrency higher), or otherwise to not reduce …
Pretty sure it was 200 because that was the original hard-coded value: https://github.com/pulp/pulpcore/pull/7037/files#diff-28560cb4dba596314e3577becbd1974a015d47c08f8f17b38e6030a701999169L142 — and it's been that way basically Forever.
I agree with having just the one tuning parameter. 25 is probably kinder to our upstream remotes than 200 was as well. But if downloading defaults to 25-concurrent where it used to be 200-concurrent, will that be noticeable to users that aren't disk/memory constrained? Or at least, "noticeable enough" to leave the default at 200 and let the users who actually are tight on disk/memory change it?
@ggainey Remotes already had tighter default restrictions on the concurrent connections / downloads than this limit, so I doubt it would make any real difference. The batch size is the only place where there's any practical amount of additional overhead.
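That distinction can be demonstrated with a toy semaphore model: if the remote's connection limit is already below the batch size, lowering the batch default can't reduce effective download concurrency any further. A sketch (all numbers illustrative, not pulpcore's actual downloader code):

```python
import asyncio

async def download(sem, active, peaks):
    # The remote's connection limit caps concurrency regardless of how
    # many downloads the pipeline hands over in one batch.
    async with sem:
        active[0] += 1
        peaks.append(active[0])
        await asyncio.sleep(0)  # simulate I/O, yield to the event loop
        active[0] -= 1

async def main():
    sem = asyncio.Semaphore(5)  # e.g. a remote's connection limit
    active, peaks = [0], []
    # "Batch" 25 downloads at once; only 5 ever run concurrently.
    await asyncio.gather(*(download(sem, active, peaks) for _ in range(25)))
    return max(peaks)

result = asyncio.run(main())
print(result)  # 5
```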
Backport to 3.93: 💚 backport PR created. ✅ Backported as #7115. 🤖 @patchback