From 2d95883ee0018aade5365b74841ff81cbc92dd95 Mon Sep 17 00:00:00 2001
From: Dennis Felsing
Date: Sat, 30 May 2026 02:39:34 +0000
Subject: [PATCH] ci: parallelize dev.materialize.com docs build and upload

The deploy-devsite step ran for ~40 minutes, ~33 of which were two
sequential `aws s3 sync` uploads of the Rust docs (~13k tiny HTML files,
~203 MiB each) at the aws CLI's default concurrency of 10, making them
request-latency bound rather than bandwidth bound.

This also fixes a latent bug: both `bin/doc` invocations wrote to the
same target dir, so the `--document-private-items` build clobbered the
public build before either was uploaded, and both `api/rust` and
`api/rust-private` ended up serving the private docs.

Changes:
- Build the public (`api/rust`) and private (`api/rust-private`) docsets
  into separate target dirs and run each build+upload pipeline in
  parallel.
- Bump `s3.max_concurrent_requests` to 100 so the many small files
  upload concurrently.
- Add `--delete` to the Rust syncs so each prefix is a clean mirror and
  the private pages previously mis-uploaded to `api/rust` are purged.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 ci/deploy/devsite.sh | 31 +++++++++++++++++++++++++++----
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/ci/deploy/devsite.sh b/ci/deploy/devsite.sh
index 9fd858a78f89b..e112a1cb42fe5 100755
--- a/ci/deploy/devsite.sh
+++ b/ci/deploy/devsite.sh
@@ -13,6 +13,12 @@
 
 set -euo pipefail
 
+# rustdoc emits ~13k tiny HTML files per docset, so the S3 syncs below are bound
+# by per-object request latency rather than bandwidth. The aws CLI defaults to
+# only 10 concurrent requests; bump it so the many small files upload in
+# parallel.
+aws configure set default.s3.max_concurrent_requests 100
+
 cargo about generate ci/deploy/licenses.hbs > misc/www/licenses.html
 aws s3 cp --recursive misc/www/ s3://materialize-dev-website/
 
@@ -20,10 +26,27 @@ aws s3 cp --recursive misc/www/ s3://materialize-dev-website/
 # We exclude all of these pages from search engines for SEO purposes. We don't
 # want to spend our crawl budget on these pages, nor have these pages appear
 # ahead of our marketing content.
-RUSTDOCFLAGS="--html-in-header $PWD/ci/deploy/noindex.html" bin/doc
-RUSTDOCFLAGS="--html-in-header $PWD/ci/deploy/noindex.html" bin/doc --document-private-items
-aws s3 sync --size-only target-xcompile/doc/ s3://materialize-dev-website/api/rust
-aws s3 sync --size-only target-xcompile/doc/ s3://materialize-dev-website/api/rust-private
+#
+# `api/rust` gets the public docs; `api/rust-private` gets the docs with private
+# items. We build each docset into its own target dir and build+upload them in
+# parallel. --delete makes each prefix a clean mirror (and purges private pages
+# previously uploaded to api/rust).
+export RUSTDOCFLAGS="--html-in-header $PWD/ci/deploy/noindex.html"
+
+(
+    CARGO_TARGET_DIR=target-xcompile bin/doc
+    aws s3 sync --size-only --delete target-xcompile/doc/ s3://materialize-dev-website/api/rust
+) &
+rust_pid=$!
+
+(
+    CARGO_TARGET_DIR=target-xcompile-private bin/doc --document-private-items
+    aws s3 sync --size-only --delete target-xcompile-private/doc/ s3://materialize-dev-website/api/rust-private
+) &
+rust_private_pid=$!
+
+wait "$rust_pid"
+wait "$rust_private_pid"
 
 bin/pydoc
 aws s3 sync --size-only --delete target/pydoc/ s3://materialize-dev-website/api/python
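
Note on the `wait` pattern: `wait PID` returns the exit status of that background job, so under `set -e` a failure in either pipeline fails the script. As written in the patch, a failing first `wait` exits the script while the second pipeline may still be running. The sketch below is one possible variant, not what the patch does, that reaps both jobs before deciding. `pipeline_ok` and `pipeline_fail` are hypothetical stand-ins for the two build+upload subshells.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for the public and private build+upload pipelines.
pipeline_ok()   { return 0; }
pipeline_fail() { return 3; }

( pipeline_ok ) &
ok_pid=$!

( pipeline_fail ) &
fail_pid=$!

# `wait PID` returns that job's exit status. Capturing it with `|| var=$?`
# keeps `set -e` from aborting before both jobs have been reaped.
ok_status=0
wait "$ok_pid" || ok_status=$?
fail_status=0
wait "$fail_pid" || fail_status=$?

echo "ok=$ok_status fail=$fail_status"
```

A real script would end with something like `[ "$ok_status" -eq 0 ] && [ "$fail_status" -eq 0 ]` so the step still fails when either pipeline does. The patch's shorter form fails fast instead, which is also reasonable for a CI step.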