[SPARK-46825][DOCS] Build Spark only once when building docs #44865

Closed

@nchammas (Contributor) commented Jan 24, 2024

What changes were proposed in this pull request?

As suggested here, this change improves the documentation build so that it builds Spark at most once, regardless of which API docs are requested in the build.

Why are the changes needed?

There is no need to build Spark multiple times when generating docs. In particular, building Scala and Python docs, or Scala and SQL docs, causes Spark to be built twice.

Fixing this problem saves us a couple of minutes.
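
To illustrate the "build at most once" idea, here is a minimal shell sketch. It is not the actual change in this PR: the function names, the guard flag, and the placeholder `echo` steps are all hypothetical, and `SKIP_SCALADOC` and `SKIP_SQLDOC` are assumed to follow the same naming pattern as the `SKIP_PYTHONDOC` and `SKIP_RDOC` variables used below.

```shell
# Hypothetical sketch, not the actual docs build code.
SPARK_BUILT=0   # guard flag, flipped after the first build
BUILD_COUNT=0   # number of real builds, tracked for illustration only

build_spark_once() {
  if [ "$SPARK_BUILT" -eq 1 ]; then
    return 0    # Spark was already built during this docs run
  fi
  echo "building Spark (placeholder for the real build command)"
  BUILD_COUNT=$((BUILD_COUNT + 1))
  SPARK_BUILT=1
}

build_api_docs() {
  # SKIP_SCALADOC and SKIP_SQLDOC are assumed variable names.
  if [ "${SKIP_SCALADOC:-0}" != "1" ]; then
    build_spark_once
    echo "generating Scala docs (placeholder)"
  fi
  if [ "${SKIP_PYTHONDOC:-0}" != "1" ]; then
    build_spark_once
    echo "generating Python docs (placeholder)"
  fi
  if [ "${SKIP_SQLDOC:-0}" != "1" ]; then
    build_spark_once
    echo "generating SQL docs (placeholder)"
  fi
}

build_api_docs
echo "Spark builds performed: $BUILD_COUNT"
```

Each doc generator asks for a build, but only the first request does any work, so requesting Scala and Python docs together no longer triggers a second build.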

Does this PR introduce any user-facing change?

No.

How was this patch tested?

I built the docs as follows on master as well as on this branch:

time SKIP_RDOC=1 SKIP_PYTHONDOC=1 bundle exec jekyll build

The time results before and after this change are as follows:

before
------
real    6m48.815s
user    23m17.943s
sys     1m29.578s

after
-----
real    4m10.672s
user    14m10.130s
sys     1m0.773s

That's a savings of about 2.5 minutes of wall-clock time (6m48.8s down to 4m10.7s, roughly 39%).
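
For anyone who wants to check the arithmetic, the `real` figures above can be converted to seconds and compared with a small helper. This is only a sketch: `to_seconds` is a hypothetical name, and it assumes durations in the `NmS.SSSs` form that `time` printed here.

```shell
# Convert a duration such as 6m48.815s into seconds (hypothetical helper).
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

before=$(to_seconds 6m48.815s)   # 408.815
after=$(to_seconds 4m10.672s)    # 250.672

# Report the absolute and relative wall-clock savings.
awk -v b="$before" -v a="$after" \
  'BEGIN { printf "saved %.3f s (%.1f%% of the original build)\n", b - a, (b - a) / b * 100 }'
```

The same helper works on the `user` and `sys` lines if you want to compare CPU time as well.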

Additionally, I diffed the generated _site/ dir across master and this branch and confirmed they are essentially identical except for some general SQL example files.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the DOCS label Jan 24, 2024
@HyukjinKwon (Member)

Merged to master.

@nchammas nchammas deleted the SPARK-46825-jekyll-build-spark-once branch January 24, 2024 14:46