Vastly speed up provider documentation publishing #33792

Merged
potiuk merged 2 commits into apache:main from potiuk:parallelize-docs-publishing
Aug 27, 2023

@potiuk potiuk commented Aug 27, 2023

Publishing documentation is quite slow when preparing documentation for multiple providers - mostly because we serialize the copying of directories and the checks for whether the directories are present. However, we can easily speed it up by parallelising the publishing per package.

This will speed up both CI and release-manager's docs publishing step (vastly).
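
A minimal sketch of the per-package parallelisation described above, not the actual change in this PR. The function names, the directory layout, and the choice of a thread pool are all assumptions made for illustration; the real implementation may use a different pool type and different paths.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def publish_package(package: str, build_root: Path, site_root: Path) -> str:
    """Copy one package's built docs into the site directory (hypothetical layout)."""
    source = build_root / package
    # The presence check happens per package, so it is parallelised as well.
    if not source.is_dir():
        return f"{package}: skipped (no built docs)"
    target = site_root / package
    # Remove any previously published copy so stale files do not linger.
    shutil.rmtree(target, ignore_errors=True)
    shutil.copytree(source, target)
    return f"{package}: published"


def publish_all(packages, build_root: Path, site_root: Path, parallelism: int = 4):
    """Publish every package, running the per-package copies concurrently."""
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # map() returns results in the same order as the input packages.
        return list(
            pool.map(lambda p: publish_package(p, build_root, site_root), packages)
        )
```

Each package writes only to its own target directory, so the copies do not interfere with each other. Threads are used here because the work is dominated by file I/O; a process pool would be an equally reasonable choice.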


potiuk commented Aug 27, 2023

I noticed that the docs publishing step was very slow on CI (> 3m on the main build), mainly because publishing for multiple providers was serialized.

Parallelising it should make the step about 2x faster on public runners (~1.5 m when all providers are built) and about 8x faster on self-hosted runners (~25 seconds or so).

Also, for the release manager of providers (cc: @eladkal), this should cut the waiting time for "publish-docs" from a few minutes to seconds.

@potiuk potiuk force-pushed the parallelize-docs-publishing branch from b9e9cfc to 4b44bad Compare August 27, 2023 06:10

potiuk commented Aug 27, 2023

Quite a few images were regenerated because I found and corrected a mistake in the help text of the common "--run-in-parallel" option: it mentioned Python, but the option is used to parallelize many things.


@eladkal eladkal left a comment

Nice!!!
This step indeed takes time

Co-authored-by: Elad Kalif <45845474+eladkal@users.noreply.github.com>
@potiuk potiuk merged commit 8227db3 into apache:main Aug 27, 2023
@potiuk potiuk deleted the parallelize-docs-publishing branch August 27, 2023 08:38

potiuk commented Aug 27, 2023

56 seconds instead of 3.5 minutes. Not bad :)
