Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: 🪨 improve incremental builds by linking to rocksdb #4496

Merged
merged 3 commits into from
May 30, 2024

Conversation

cratelyn
Copy link
Contributor

@cratelyn cratelyn commented May 29, 2024

first, the cool part: here are incremental build timings for cargo build -p cnidarium filtering for compilation units that took > 1.5 seconds to compile.

on my machine, this improved incremental build times by 72.92% ❤️

before 😩

warm-build-timing-without-rocksdb-linking

after 😍

warm-build-timing-with-rocksdb-linking

➕ changes

docs: 🪨 add notes on linking to prebuilt rocksdb

💭 background

librocksdb-sys(build) is a somewhat infamous source of lengthy build
times when engineers are working in the monorepo. this is a crate whose
build script, clones and builds the rocksdb database from source,
statically linking against the generated librocksdb.a file.

this interacts poorly with a Cargo workspace like ours, which contains a
number of different features. because features are additive, running
commands like cargo build --workspace and cargo build --package penumbra-app
can result in different feature sets being enabled for leaf
dependencies, mandating that the library be rebuilt (which means
rebuilding librocksdb.a once more
).

flake: 🔗 link against a precompiled rocksdb

this adds a ROCKSDB_LIB_DIR environment variable to the nix
development shell.

this environment variable points to the directory that
contains the librocksdb.a file, to be statically linked against the
librocksdb-sys crate.

a nixpkgs.rocksdb build input is added.

📐 build timing methodology

this shell script was used to generate build timings. in order to see
how this behaved on an incremental build, the script will clear the
cache, build the workspace, and then build a crate that depends on
librocksdb-sys.

 #!/usr/bin/env bash
 #
 # see https://doc.rust-lang.org/cargo/reference/timings.html for more info.
 set -euox pipefail

 # clean out any preëxisting build artifacts.
 cargo clean;

 # build once before we generate our report.
 cargo build --workspace

 # build a particular crate, recording the build timings.
 cargo build --package cnidarium --timings

 # hang on to the report generated.
 cp target/cargo-timings/cargo-timing.html .

@cratelyn cratelyn self-assigned this May 29, 2024
@cratelyn cratelyn added this to the Sprint 7 milestone May 29, 2024
@cratelyn cratelyn added A-tooling Area: developer tooling for building Penumbra itself A-docs Area: Documentation needs for the project labels May 29, 2024
@cratelyn cratelyn force-pushed the kate/link-system-rocksdb branch from 61ad631 to ed1e8c3 Compare May 29, 2024 17:16
@cratelyn cratelyn changed the base branch from main to kate/pin-rocksdb-crate-version May 29, 2024 17:19
@cratelyn cratelyn changed the title docs: 🪨 add notes on linking to prebuilt rocksdb rocksdb: 🪨 linking to prebuilt rocksdb May 29, 2024
@cratelyn cratelyn changed the title rocksdb: 🪨 linking to prebuilt rocksdb chore: 🪨 improve incremental builds by linking to rocksdb May 29, 2024
Copy link
Contributor Author

@cratelyn cratelyn May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc: @plaidfinch, a validator using nix. some additional effort would be needed to fetch the same version of RocksDB used implicitly by librocksdb-sys, before this is truly production-ready, but this greatly improved incremental builds on my machine.

hopefully, this is a good starting point for your own infrastructure. 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and cross-referencing #4495, which codifies that version explicitly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine with me to do a non-incremental build for production. I'm not using the flake in the Penumbra repo for production, for reasons just like this; I'm using a separate flake (see: https://github.com/starlingcyber/infra).

Copy link
Contributor Author

@cratelyn cratelyn May 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd like to note that these docs are best effort, as a user of the nix development shell myself. i did find that the version of rocksdb we use has aged enough to have run into problems when compiled with a recent version of clang/gcc, but this is the broad shape of how someone using a distro like Ubuntu would do this.

i'm eager to land the flake changes, below, so if these are causing problems i'm happy to drop them from this changeset and help land these docs separately.

cratelyn added 2 commits May 29, 2024 13:41
 #### 💭 background

NB: this commit is based on #4495.

`librocksdb-sys(build)` is a somewhat infamous source of lengthy build
times when engineers are working in the monorepo. this is a crate whose
build script, clones and builds the rocksdb database from source,
statically linking against the generated `librocksdb.a` file.

this interacts poorly with a Cargo workspace like ours, which contains a
number of different features. because features are additive, running
commands like `cargo build --workspace` and `cargo build --package penumbra-app`
can result in different feature sets being enabled for leaf
dependencies, mandating that the library be rebuilt (_which means
rebuilding librocksdb.a once more_).
this adds a `ROCKSDB_LIB_DIR` environment variable to the nix
development shell.

this environment variable points to the directory that
contains the `librocksdb.a` file, to be statically linked against the
`librocksdb-sys` crate.

a `nixpkgs.rocksdb` build input is added.

 #### 📐 build timing methodology

this shell script was used to generate build timings. in order to see
how this behaved on an incremental build, the script will clear the
cache, build the workspace, and _then_ build a crate that depends on
`librocksdb-sys`.

```sh
 #!/usr/bin/env bash
 #
 # see https://doc.rust-lang.org/cargo/reference/timings.html for more info.
 set -euox pipefail

 # clean out any preëxisting build artifacts.
 cargo clean;

 # build once before we generate our report.
 cargo build --workspace

 # build a particular crate, recording the build timings.
 cargo build --package cnidarium --timings

 # hang on to the report generated.
 cp target/cargo-timings/cargo-timing.html .
```
@cratelyn cratelyn force-pushed the kate/link-system-rocksdb branch from ed1e8c3 to a3f536a Compare May 29, 2024 17:41
@cratelyn cratelyn changed the base branch from kate/pin-rocksdb-crate-version to main May 29, 2024 17:41
@cratelyn cratelyn marked this pull request as ready for review May 29, 2024 17:51
@conorsch conorsch self-requested a review May 29, 2024 20:43
Copy link
Contributor

@conorsch conorsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although I am not using the nix setup myself. As a general rule, I think we should be careful to document abstruse setup steps like these as optional. Down the road, it would be grand to have a reusable nix-ish environment that's usable both on workstations and in CI. Thanks for driving toward that!

@cratelyn
Copy link
Contributor Author

LGTM, although I am not using the nix setup myself. As a general rule, I think we should be careful to document abstruse setup steps like these as optional. Down the road, it would be grand to have a reusable nix-ish environment that's usable both on workstations and in CI. Thanks for driving toward that!

192663f adds (Optional) to be a little more explicit about that. thanks for taking a look!

@cratelyn cratelyn merged commit ec8b4e0 into main May 30, 2024
13 checks passed
@cratelyn cratelyn deleted the kate/link-system-rocksdb branch May 30, 2024 06:22
cratelyn added a commit to penumbra-zone/galileo that referenced this pull request May 31, 2024
see penumbra-zone/penumbra#4496.

this is a replication of the seame change; adding rocksdb as a build
input, and setting an environment variable to point the penumbra
repository's `librocksdb-sys` dependency's build script at that
package's static library.

this spares our builds the task of compiling the rocksdb database from
source.
cratelyn added a commit to penumbra-zone/galileo that referenced this pull request May 31, 2024
see penumbra-zone/penumbra#4496.

this is a replication of the seame change; adding rocksdb as a build
input, and setting an environment variable to point the penumbra
repository's `librocksdb-sys` dependency's build script at that
package's static library.

this spares our builds the task of compiling the rocksdb database from
source.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-docs Area: Documentation needs for the project A-tooling Area: developer tooling for building Penumbra itself
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants