Bazel
Bazel, BuildBuddy & GitHub Actions have replaced Concourse as our primary test suite execution mechanism in 2021.
GitHub Actions is GitHub's automation/continuous integration feature. It allows workflows vaguely similar to Concourse pipelines to be defined in YAML and, when committed to the appropriate location in a repository, triggered by various events, such as new commits to a repo. However, unlike Concourse, a single workflow cannot "watch" multiple repos jointly for changes. For most workflows in rabbitmq-server, Bazel is used to build and test.
Bazel is a build and test tool. It is useful to us as it supports caching of test results and remote parallel execution. Once we merged the broker and all of the tier-1 plugin repositories into a monorepo, the set of tests naively triggered by every commit to the monorepo was simply too large to ignore. Unfortunately, Bazel does not have built-in Erlang support, but since it is extensible, we wrote rules_erlang.
BuildBuddy is a hosted Bazel Remote Build Execution service. Most of our actual execution of test cases occurs on BuildBuddy workers.
So, in graphical form, we have:
push new commit -> Test Workflow -> `bazel test //...` (Erlang 23) -> a_SUITE (executed by BuildBuddy)
-> b_SUITE (executed by BuildBuddy)
-> `bazel test //...` (Erlang 24) -> a_SUITE (executed by BuildBuddy)
-> b_SUITE (executed by BuildBuddy)
-> Test (Mixed Versions) Workflow -> ...
-> ...
Bazel rules used by RabbitMQ assume a number of developer tools available locally:
- Modern C++ compiler toolchain (`clang`, `g++`): comes from the `build-essential` package on Debian-based Linux and the Xcode command line tools on macOS
- `sha256sum`: comes from `coreutils` on Linux and the `sha3sum` formula via Homebrew on macOS
Bazel can be used to run tests locally, just like `make`. First install Bazelisk, a user-friendly launcher for `bazel` that will also respect the `.bazelversion` file in the repository. The Erlang and Elixir installations used will be picked up from your `PATH`, or they can be specified by exporting the `ERLANG_HOME` and `ELIXIR_HOME` environment variables.
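As a sketch, that setup might look like the following. The install paths here are hypothetical; substitute the locations of your own Erlang and Elixir builds:

```shell
# Hypothetical install locations -- substitute the paths of your own
# Erlang and Elixir builds (each must contain a bin/ directory).
export ERLANG_HOME="$HOME/kerl/26.2"
export ELIXIR_HOME="$HOME/elixirs/1.15"

# Bazelisk reads .bazelversion and fetches the matching Bazel release.
# Guarded so this line is a no-op if bazelisk is not installed yet.
command -v bazelisk >/dev/null && bazelisk version
```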
One should also copy `user-template.bazelrc` to `user.bazelrc` or `$HOME/.bazelrc`:
# rabbitmqctl wait shells out to 'ps', which is broken in the bazel macOS
# sandbox (https://github.com/bazelbuild/bazel/issues/7448)
# adding "--spawn_strategy=local" to the invocation is a workaround
build --spawn_strategy=local
# don't re-run flakes automatically on the local machine
build --flaky_test_attempts=1
build:buildbuddy --remote_header=x-buildbuddy-api-key=YOUR_API_KEY
# cross compile for linux (if on macOS) with rbe
build:rbe --host_cpu=k8
build:rbe --cpu=k8
Once the above is complete, you should be able to run some tests with
bazel test //deps/rabbit_common:all
So what is a test label? Bazel has a notion of repositories and packages, and the name of a target within that hierarchy is its label. So, for instance, the label for the `backing_queue_SUITE` for the `rabbit` application from `rabbitmq-server`, found at `deps/rabbit/test/backing_queue_SUITE.erl`, is `//deps/rabbit:backing_queue_SUITE`.
And, if you want to run that suite, you can do so with
bazel test //deps/rabbit:backing_queue_SUITE
The complete label is actually `@rabbitmq-server//deps/rabbit:backing_queue_SUITE`, but the head can be left off if `rabbitmq-server` is the current repository.
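If you are unsure which targets (and therefore labels) exist in a package, `bazel query` will list them. A sketch, assuming a checked-out `rabbitmq-server` workspace:

```shell
# List every target in the deps/rabbit package; each output line is a
# fully qualified label such as //deps/rabbit:backing_queue_SUITE.
bazel query '//deps/rabbit:all'

# Restrict the listing to test targets only.
bazel query 'tests(//deps/rabbit:all)'
```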
You can also run tests matching a label pattern. For instance, to run all of the tests for the `rabbit` application (this will take a while and consume all available local CPU cores!), do it with
# runs ALL RabbitMQ server core test suites
bazel test //deps/rabbit:all
To build everything in rabbitmq-server
, use
# builds ALL RabbitMQ server components and plugins
bazel build //...
Finally, to run all test suites in the repository:
# runs ALL test suites in the repository
bazel test //...
To know more about what `:all` or `//...` means, check this out.
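The difference between the two patterns can be sketched as follows (both commands assume a full workspace checkout):

```shell
# ':all' expands to every target in ONE package, i.e. only the targets
# defined in deps/rabbit/BUILD.bazel:
bazel test //deps/rabbit:all

# '...' recurses: every target in every package at or below the path.
bazel test //deps/...   # all broker and plugin packages under deps/
bazel test //...        # the entire repository
```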
To execute all tests in a plugin, say, `rabbitmq_shovel`:
:
# erlang.mk
cd deps/rabbitmq_shovel
gmake tests
# Bazel
bazel test //deps/rabbitmq_shovel:all
To execute one test in a plugin, say, `rabbitmq_auth_backend_oauth2`:
:
# erlang.mk
cd deps/rabbitmq_auth_backend_oauth2
gmake ct-unit
# Bazel
bazel test //deps/rabbitmq_auth_backend_oauth2:unit_SUITE
To execute one test in a test suite, say test case `test_successful_token_refresh` of group `basic_happy_path` of test suite `system_SUITE` of subproject `rabbitmq_auth_backend_oauth2`:
# erlang.mk
gmake -C deps/rabbitmq_auth_backend_oauth2 ct-system t=basic_happy_path:test_successful_token_refresh
# Bazel
bazel test //deps/rabbitmq_auth_backend_oauth2:system_SUITE --test_env FOCUS="-group basic_happy_path -case test_successful_token_refresh"
# to force a re-run of all suites, ignoring cached test results
bazel test --cache_test_results=no //deps/rabbitmq_auth_backend_oauth2:all
To consult Common Test logs after running a test suite (or all test suites):
# erlang.mk
cd deps/rabbitmq_auth_backend_oauth2
gmake ct-unit
open logs/index.html
# Bazel
bazel test //deps/rabbitmq_auth_backend_oauth2:unit_SUITE
bazel run test-logs //deps/rabbitmq_auth_backend_oauth2:unit_SUITE
To inspect node data directories after a test run:
# erlang.mk
cd deps/rabbitmq_auth_backend_oauth2
gmake ct-unit
# opens top level directory for all test run data
open logs
# Bazel
bazel test //deps/rabbit:maintenance_mode_SUITE
# opens test run data directory of the last run
bazel run test-node-data //deps/rabbit:maintenance_mode_SUITE
# erlang.mk
make -C deps/rabbitmq_amqp1_0 ct FULL=1 COVER=1
open deps/rabbitmq_amqp1_0/logs/index.html
In the browser, click on the test name, then on `Coverage log`.
# Bazel
bazel coverage //deps/rabbitmq_amqp1_0:all -t-
genhtml --output genhtml "$(bazel info output_path)/_coverage/_coverage_report.dat"
open genhtml/index.html
where `genhtml` is https://github.com/linux-test-project/lcov/blob/master/bin/genhtml and can be installed with `brew install lcov` on macOS.
# erlang.mk
gmake run-broker PLUGINS="rabbitmq_management rabbitmq_shovel rabbitmq_shovel_management rabbitmq_top" RABBITMQ_CONFIG_FILE=/path/to/rabbitmq.conf
# Bazel
bazel run broker RABBITMQ_ENABLED_PLUGINS="rabbitmq_management,rabbitmq_shovel,rabbitmq_shovel_management,rabbitmq_top" RABBITMQ_CONFIG_FILE=/path/to/rabbitmq.conf
# erlang.mk, from the directory used to run 'gmake run-broker'
./sbin/rabbitmq-diagnostics status
# Bazel, from the directory used to run 'bazel run broker'
bazel run rabbitmq-diagnostics status
# Running the CLIs through bazel is pretty slow. You can use the CLIs directly, once they are built:
./bazel-bin/broker-home/sbin/rabbitmqctl
# For even more convenience, just add this folder to your PATH (you may need to adjust it of course)
export PATH=$PATH:~/rabbitmq-server/bazel-bin/broker-home/sbin
# erlang.mk
gmake start-cluster NODES=5 TEST_TMPDIR="$HOME"/scratch/myrabbit
# Bazel
bazel run start-cluster NODES=5 TEST_TMPDIR="$HOME"/scratch/myrabbit
Stop the cluster:
# erlang.mk
gmake stop-cluster NODES=5 TEST_TMPDIR="$HOME"/scratch/myrabbit
# Bazel
bazel run stop-cluster NODES=5 TEST_TMPDIR="$HOME"/scratch/myrabbit
# erlang.mk
gmake package-generic-unix
gmake docker-image
# Bazel
bazel run //packaging/docker-image:rabbitmq
# erlang.mk
cd deps/rabbitmq_shovel
gmake xref
gmake dialyze
# Bazel
bazel test //deps/rabbitmq_shovel:xref
bazel test //deps/rabbitmq_shovel:dialyze
# to skip xref
bazel test //deps/rabbitmq_shovel:all --test_tag_filters="xref"
# to skip Dialyzer
bazel test //deps/rabbitmq_shovel:all --test_tag_filters="dialyze"
# to skip both xref and Dialyzer
bazel test //deps/rabbitmq_shovel:all --test_tag_filters="xref,dialyze"
# erlang.mk
gmake clean
# Bazel
bazel clean
# Bazel
bazel build "//:package-generic-unix"
To generate language server files for the `./deps` directories, run
bazel run //tools:symlink_deps_for_erlang_ls
To use Bazel and leverage remote build execution with BuildBuddy, you need to create a BuildBuddy account and fill in the token value in the `user.bazelrc` file. Then you can run tests with the `rbe-23` or `rbe-24` configurations active, such as `bazel test //... --config=rbe-24`.
When tests are run with (or without) RBE, the logs can be found under the `bazel-testlogs` directory. This directory mirrors the package structure of the repo, so the `backing_queue_SUITE` logs will be found in the `bazel-testlogs/deps/rabbit/backing_queue_SUITE` directory.
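As a sketch, after running the suite once locally you can inspect those logs like so (the exact file layout may vary with the Bazel version):

```shell
bazel test //deps/rabbit:backing_queue_SUITE

# test.log is the combined stdout of the run; additional outputs,
# such as the Common Test HTML logs, live under test.outputs/.
ls bazel-testlogs/deps/rabbit/backing_queue_SUITE/
```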
The RBE configuration is such that remote execution uses the `pivotalrabbitmq/rabbitmq-server-buildenv:linux-rbe` Docker image. The Dockerfile for that image can be found at https://github.com/rabbitmq/rabbitmq-ci/blob/main/docker/rabbitmq-server-buildenv/linux-rbe/Dockerfile. A Hush House pipeline watches that repo and will rebuild the image automatically when it changes. However, that is not enough for the change to propagate to RBE. After the image has been rebuilt, the nightly (also manually triggerable) https://github.com/rabbitmq/rbe-erlang-platform/actions/workflows/rbe_configs_gen.yaml GitHub Actions workflow must see the new image and create a corresponding PR. Only once that PR is merged will RBE pick up the new image.
By default, bazel does not stream test outputs, and typically there are many different tests running in parallel.
In fact, streaming output with the `--test_output=streamed` flag disables remote execution and runs tests in sequence.
Therefore, when tests are run in CI, most of the logs are not visible until you follow the link in the GitHub Actions log to the run in BuildBuddy. It will look something like:
INFO: Streaming build results to: https://app.buildbuddy.io/invocation/43e8a3a0-78db-444d-8316-737426d65de1
From there, you can see the results of the run and click through to the logs of a failing test. Even then, to see the Erlang Common Test HTML logs, you will have to scroll to the bottom of the page and download the `test.outputs__outputs.zip` file.
Furthermore, it's currently a limitation of BuildBuddy that if a test is sharded, it's not clear in the UI which log file goes with which shard, as they all have the same name. I've raised this with them and they have said that they will fix it.
Thankfully, these outputs are distinguishable on the machine that ran `bazelisk test`. When you see a failure in CI, remember that running the same suite from your local machine using remote execution is an option. Additionally, if the test is flaky, one can easily run 10 or 20 copies (or even 100) in parallel using this approach and a few more flags. For example,
bazel test //deps/rabbitmq_federation:exchange_SUITE --config=rbe-24 -t- --runs_per_test=10
The `-t-` flag tells Bazel to ignore cached test results. The `exchange_SUITE` is actually an interesting example, as it is also currently a sharded test.
When executed with the above flags, part of the output is:
//deps/rabbitmq_federation:exchange_SUITE PASSED in 144.6s
Stats over 60 runs: max = 144.6s, min = 26.7s, avg = 91.7s, dev = 31.8s
Which is correct since each of the 6 shards is run 10 times, for a total of 60 runs (and 60 parallelizable jobs).
Sometimes you might want to test a coordinated change between something like `osiris` and `rabbitmq-server`. In that case, you can clone `osiris` next to `rabbitmq-server` and add the additional `--override_repository rules_erlang~3.8.5~erlang_package~osiris=$PWD/../osiris` flag to `bazel` commands in `rabbitmq-server`. RBE still works, and local changes are honored. Unfortunately, at this point the `rules_erlang` version needs to be included in the flag, so you will need to keep this command up to date as we upgrade `rules_erlang`. To get the exact repository name to override, you can `ls bazel-rabbitmq-server/external/` and use the folder/link name as displayed there.
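Putting that together, a coordinated-change invocation might look like the following sketch (the `rules_erlang` version shown, 3.8.5, comes from the flag above and must match your checkout):

```shell
# Confirm the exact external repository name to override first:
ls bazel-rabbitmq-server/external/ | grep -i osiris

# Then point that repository at the local clone sitting next to
# rabbitmq-server, and run bazel commands as usual:
bazel test //deps/rabbit:all \
  --override_repository rules_erlang~3.8.5~erlang_package~osiris="$PWD/../osiris"
```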
You can use hot code reloading to quickly iterate on code changes - no need to restart the broker/cluster to check if things work as expected.
- Start a local RabbitMQ with `-c dbg`:
# single node
bazel run -c dbg broker
# cluster
bazel run -c dbg start-cluster
- Uncomment the `code_reload` section in `erlang_ls.config` and set the hostname to your local name (as used by RabbitMQ nodes). For example:
code_reload:
node: rabbit@mymachine
and use a text editor with LSP support (pretty much any editor). When you save a file, it will be reloaded by `erlang_ls`.
Starting with `erlang_ls` 0.50.0, you can configure multiple nodes for code reload, so you can develop against a cluster:
code_reload:
node: [rabbit-1@mymachine, rabbit-2@mymachine, rabbit-3@mymachine]
and have your local changes reloaded on all nodes immediately.
Here's a demo of what that looks like (in this case using `neovim` and `ToggleTerm`, but you can use it with other editors and ways of calling functions):
- Open a terminal with `rabbitmq-diagnostics remote_shell`
- Use a (neo)vim `autocmd` to automatically run a command in the terminal on buffer save (since reloading on 3 nodes takes a moment, I use an artificial delay before running the function)
- When the function is modified to return a new value and the buffer gets saved, a function is called. In this case I specifically use `rabbit_misc:append_rpc_all_nodes` to show that all 3 nodes return the new value.