Enable throttling for publish command #29
This change introduces a --throttle option for the publish command. The option specifies the maximum number of repositories that can be published at one time. This may help avoid overloading the Pulp server with mass publishes and leave some worker capacity for other tasks.
I think this isn't the ideal place to implement it:
If you have a look in pubtools-pulplib, it already has throttling support built-in. It just happens to use a built-in value which the caller is currently not allowed to control:
pubtools/pulplib/_impl/client/client.py- self._task_executor = (
pubtools/pulplib/_impl/client/client.py- Executors.thread_pool(max_workers=self._REQUEST_THREADS)
pubtools/pulplib/_impl/client/client.py- .with_map(self._unpack_response)
pubtools/pulplib/_impl/client/client.py- .with_map(self._log_spawned_tasks)
pubtools/pulplib/_impl/client/client.py- .with_poll(poller, cancel_fn=poller.cancel)
pubtools/pulplib/_impl/client/client.py: .with_throttle(self._TASK_THROTTLE)
pubtools/pulplib/_impl/client/client.py- .with_retry(retry_policy=self._RETRY_POLICY)
pubtools/pulplib/_impl/client/client.py- )
I think rather than implementing a separate throttling loop which is only going to work for publish, we should leverage the above, so the appropriate course here would be:
- adjust pubtools-pulplib to make the throttle count a supported part of the API (e.g. it can be passed to constructor of the client)
- just make pubtools-pulp pass the appropriate value when creating the client
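To make the first step concrete, here is a minimal, standard-library-only sketch of a client whose throttle count is a constructor argument instead of a fixed class attribute. It is not the pubtools-pulplib code: `SketchClient`, `ThrottledExecutor`, `task_throttle`, the default of 200 and `peak_concurrency` are all hypothetical names and values chosen for illustration, and the semaphore-based executor only stands in for the `with_throttle` step quoted above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ThrottledExecutor:
    """Stand-in for a throttled executor: at most `limit` tasks in flight."""

    def __init__(self, limit, max_workers=4):
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self._slots = threading.BoundedSemaphore(limit)
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, fn, *args, **kwargs):
        # Block the caller while `limit` tasks are already in flight.
        self._slots.acquire()
        try:
            future = self._pool.submit(fn, *args, **kwargs)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot once the task has finished (successfully or not).
        future.add_done_callback(lambda _f: self._slots.release())
        return future

    def shutdown(self):
        self._pool.shutdown(wait=True)


class SketchClient:
    """Hypothetical client: the throttle is part of the constructor API."""

    DEFAULT_TASK_THROTTLE = 200  # hypothetical default, not the library's value

    def __init__(self, url, task_throttle=None):
        self.url = url
        if task_throttle is None:
            task_throttle = self.DEFAULT_TASK_THROTTLE
        self.task_throttle = task_throttle
        self._task_executor = ThrottledExecutor(task_throttle)

    def close(self):
        self._task_executor.shutdown()


def peak_concurrency(client, task_count):
    """Run `task_count` fake tasks and report how many overlapped at most."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def fake_task():
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(0.02)  # pretend to wait for a Pulp task
        with lock:
            state["running"] -= 1

    futures = [client._task_executor.submit(fake_task) for _ in range(task_count)]
    for future in futures:
        future.result()
    return state["peak"]
```

With this shape, a caller such as pubtools-pulp would only need to pass the desired value when constructing the client, for example `SketchClient("https://pulp.example.com", task_throttle=2)`.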
Also, what do you think of allowing all tasks to set the throttle rather than just publish? This was requested for publish being the most significant use-case, but in reality just about every task using Pulp can benefit from having the ability to throttle it. If it doesn't complicate the current issue much, I'd say it'd be worthwhile to add the option to all of the tasks here.
Actually, from reviewing how the code in pubtools-pulp is set up, I'd say we almost get this "for free" - pulp client creation is abstracted by PulpClientService and if you add --throttle argument there and pass it into pubtools-pulplib, I think every task is going to gain support for throttling at once, consistently.
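As a rough illustration of that idea, the sketch below registers a `--throttle` argument in one shared place and forwards it when the client is built, so every task using the shared helper picks it up. It does not reproduce the real PulpClientService: `add_client_args`, `client_from_args`, `FakeClient`, the `--pulp-url` flag and the `task_throttle` keyword are assumptions made for the example.

```python
import argparse


class FakeClient:
    """Hypothetical stand-in for the Pulp client; just records its arguments."""

    def __init__(self, url, **kwargs):
        self.url = url
        self.kwargs = kwargs


def add_client_args(parser):
    """Register the client-related arguments shared by every task."""
    group = parser.add_argument_group("Pulp environment")
    group.add_argument("--pulp-url", required=True, help="Pulp server URL")
    group.add_argument(
        "--throttle",
        type=int,
        default=None,
        help="maximum number of Pulp tasks to keep in flight at once",
    )
    return parser


def client_from_args(args, client_cls=FakeClient):
    """Build the client, forwarding the throttle only when it was given."""
    kwargs = {}
    if args.throttle is not None:
        # `task_throttle` is an assumed keyword, not a confirmed library API.
        kwargs["task_throttle"] = args.throttle
    return client_cls(args.pulp_url, **kwargs)
```

Leaving the keyword out when `--throttle` is absent keeps the client's own default in effect, so existing invocations would behave as before.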
Just to clarify, I only mean within pubtools-pulp itself here, I'm not proposing to make the same adjustment through all Pub commands.
@rohanpm
Yeah, in particular I would think that using a low throttle count for 'publish' and a high throttle count for 'associate' tasks, which usually take only a second or two, would make sense. This would probably have to be built into pubtools-pulplib since it knows which type of tasks are being created.
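One way such a per-type throttle could look is sketched below, using only the standard library. Nothing here exists in pubtools-pulplib: `PerTypeThrottle`, the task-type strings and the limits are hypothetical and only illustrate the "low for publish, high for associate" idea.

```python
import threading


class PerTypeThrottle:
    """Hypothetical helper: a separate concurrency limit per task type."""

    def __init__(self, limits, default):
        self._limits = dict(limits)
        self._default = default
        self._slots = {}
        self._lock = threading.Lock()

    def limit_for(self, task_type):
        """Return the configured limit, or the default for unknown types."""
        return self._limits.get(task_type, self._default)

    def slot(self, task_type):
        """Return the semaphore guarding `task_type`, creating it on first use."""
        with self._lock:
            if task_type not in self._slots:
                self._slots[task_type] = threading.BoundedSemaphore(
                    self.limit_for(task_type)
                )
            return self._slots[task_type]


# Example values only: few slow publishes, many quick associate tasks.
throttle = PerTypeThrottle({"publish": 2, "associate": 50}, default=10)
```

A caller would wrap each task in `with throttle.slot("publish"): ...`, so publishes queue behind their own small limit without holding back the quicker task types.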
Closing in favor of #32