This documentation is aimed at infrastructure developers. For Snabb infrastructure usage, see the Snabblab section in the Snabb manual.
Overview of Snabblab CI infrastructure
This repository contains the source code for managing the Snabb community's infrastructure, which provides developers with tools to build, test and benchmark Snabb software.
Under the hood, the Nix language is used.
The source code serves two purposes:
Server deployments using NixOps and Hydra. Relevant folders are
Motivations for Snabblab infrastructure
The following separate topics are all covered in this repository.
Snabblab is a group of servers with attached network cards on which Snabb can be developed and used. The cluster needs to be managed and deployed without too much hassle.
Snabb has unit and functional tests that require a specific setup and environment to run successfully.
It's critical that Snabb doesn't regress in performance throughout development.
Different Snabb applications integrate with other software, so an interesting set of software combinations has to be benchmarked.
- 10 different test cases.
- 5 versions of QEMU.
- 10 different guest VMs (Linux and DPDK).
- 16 combinations of Virtio-net options.
- 2 NUMA setups ("good" and "bad")
- 2 polling modes (engine "busy loop" and sleep/backoff)
- 2 error recovery modes (engine supervising apps vs process restart)
- 2 C libraries (glibc and musl)
- 3 CPUs (Sandy Bridge, Haswell, Skylake)
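To get a feel for the size of such a matrix, the option counts in the list above can simply be multiplied. The following is a minimal Python sketch, assuming every dimension is combined with every other one; the dictionary keys are labels chosen here for illustration only.

```python
from math import prod

# Number of options per dimension, copied from the list above.
dimensions = {
    "test cases": 10,
    "QEMU versions": 5,
    "guest VMs": 10,
    "Virtio-net option combinations": 16,
    "NUMA setups": 2,
    "polling modes": 2,
    "error recovery modes": 2,
    "C libraries": 2,
    "CPUs": 3,
}

# Full cross product of all dimensions (assumption: nothing is pruned).
total_scenarios = prod(dimensions.values())
print(total_scenarios)  # 384000
```

Even with only a handful of options per dimension, a full cross product reaches hundreds of thousands of scenarios, which is why the matrix has to be generated and executed automatically.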
Be familiar with:
existence of basic Nix datatype manipulation functions
Jobsets are the very core of Hydra. They define how and when a specific Nix expression is executed.
Jobsets are grouped into projects for easier separation of concerns.
For example, snabb/master means the master jobset of the snabb project.
The jobsets/snabb.nix expression is evaluated using the function inputs that the jobset configures.
The jobset configuration page defines:
- jobsets/snabb.nix in the input named snabblab (which fetches https://github.com/snabblab/snabblab-nixos.git into the Nix store).
- snabbSrc function input as https://github.com/snabbco/snabb.git, imported into the Nix store.
- nixpkgs available in the Nix search path, to be imported anywhere in the expression.
Once evaluation is triggered (every 300 seconds in this case), inputs are fetched and the whole Nix expression is evaluated. For each Nix derivation the hash is calculated and if it changes, the derivation is rebuilt.
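The rebuild rule described above can be illustrated with a toy model. This Python sketch is not how Nix or Hydra are implemented: it assumes a derivation can be represented as a plain dictionary of inputs, and the names derivation_hash, evaluate, and the revision strings are hypothetical.

```python
import hashlib
import json

def derivation_hash(inputs: dict) -> str:
    """Hash a canonical JSON encoding of the inputs (stand-in for Nix's hashing)."""
    encoded = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def evaluate(jobs: dict, store: dict) -> list:
    """Return the names of the jobs that had to be rebuilt in this evaluation."""
    rebuilt = []
    for name, inputs in jobs.items():
        digest = derivation_hash(inputs)
        if digest not in store:
            store[digest] = f"built {name}"  # placeholder for the real build
            rebuilt.append(name)
    return rebuilt

store = {}
jobs = {
    "snabb": {"snabbSrc": "rev-a", "nixpkgs": "rev-x"},
    "manual": {"snabbSrc": "rev-a", "nixpkgs": "rev-x", "format": "html"},
}
first = evaluate(jobs, store)    # nothing is in the store yet
second = evaluate(jobs, store)   # unchanged inputs, nothing to rebuild
jobs["snabb"]["snabbSrc"] = "rev-b"
third = evaluate(jobs, store)    # only the job whose inputs changed
print(first, second, third)
```

The point of the sketch is that the decision to rebuild depends only on the hash of the inputs, so an evaluation with unchanged inputs costs almost nothing.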
An example evaluation shows that all jobs still succeed. The "Inputs" tab shows which inputs were used in this specific evaluation. Because Nix is designed around referential transparency, the same inputs should always produce the same derivations.
Each job can also provide "build products", which define which files inside the resulting derivations are ready for download. Clicking on the manual job lists the files, one per manual format, contained in the Nix store path.
- Snabb binary
- Snabb manual
- Snabb tests (make test)
- Snabb, built not with the Nix expression but with the packages of a specific distribution (CentOS, OpenSUSE, Debian, Ubuntu, Fedora)
Note: after clicking on a specific jobset, the "Configuration" tab shows which inputs are used for the Nix expression.
The jobset will build all specified Snabb branches (
pairs). Additionally, you specify which
kernelVersions will be used. Using all these software versions, a big matrix
of combinations of inputs is computed and used to execute selected benchmarks.
benchmarkNames is a list of the benchmark names executed on the matrix.
The numTimesRunBenchmark input specifies how many times each benchmark is run.
nixpkgs points to a specific commit, pinning all software used.
Once all benchmarks are executed, a big CSV file is generated from the results.
Last but not least, reports is a list of report names; each one consumes the CSV and produces a nice report using R and markdown.
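How such a matrix expands into individual benchmark runs can be sketched in a few lines of Python. This is only an illustration of the idea, not the Nix code in the repository; all values below are hypothetical placeholders, and only the input names kernelVersions, benchmarkNames, and numTimesRunBenchmark are taken from the text.

```python
from itertools import product

# Hypothetical jobset inputs; the values are placeholders, not real configuration.
snabb_branches = ["master", "next"]
kernel_versions = ["3.18", "4.4"]        # kernelVersions
benchmark_names = ["basic", "iperf"]     # benchmarkNames
num_times_run_benchmark = 3              # numTimesRunBenchmark

# One entry per combination of inputs, repeated num_times_run_benchmark times.
runs = [
    {"snabb": snabb, "kernel": kernel, "benchmark": bench, "id": run_id}
    for snabb, kernel, bench in product(snabb_branches, kernel_versions, benchmark_names)
    for run_id in range(1, num_times_run_benchmark + 1)
]
print(len(runs))  # 2 * 2 * 2 * 3 = 24
```

Adding one more version to any input multiplies the number of runs, so the repeat count and the list of benchmarks are the main levers for keeping a jobset affordable.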
Under the hood of a specific benchmark (outputs)
The infrastructure behind a call to execute a benchmark consists of the jobset function outputs. It spans over 700 lines in the jobsets/snabb-matrix.nix file and the supporting lib/ folder, and begins by building all software used in the matrix.
Using sets of different (Snabb/Qemu/Dpdk/kernel) versions and names of benchmarks, a huge list of benchmark derivations is generated.
- name, just being the identifier of the benchmark
- checkPhase, in bash, executing the benchmark itself and writing output to stdout and a log file
- toCSV, taking the derivation result as input and extracting the benchmarking value out of it
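The three parts name, checkPhase, and toCSV can be pictured as a small record. The Python sketch below is an analogy only; the real definitions are Nix attributes, and the Benchmark class, the extract_score helper, the command string, and the log format are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Benchmark:
    name: str                     # identifier of the benchmark
    check_phase: str              # bash snippet that runs the benchmark
    to_csv: Callable[[str], str]  # extracts the measured value from the log

def extract_score(log: str) -> str:
    """Return the last field of the first log line that starts with 'score'."""
    for line in log.splitlines():
        if line.startswith("score"):
            return line.split()[-1]
    return ""

basic = Benchmark(
    name="basic",
    check_phase="./run-benchmark | tee benchmark.log",  # hypothetical command
    to_csv=extract_score,
)

sample_log = "starting\nscore 12.5\ndone\n"  # hypothetical log format
row = f"{basic.name},{basic.to_csv(sample_log)}"
print(row)  # basic,12.5
```

Keeping the extraction step separate from the execution step means the benchmark output can be re-parsed without running the benchmark again.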
provides an environment in which all Snabb tests/benchmarks are executed. All software and environment settings are configured for checkPhase to execute correctly. For some benchmarks/tests, ~/.test_env inside the chrooted environment is populated using the mkTestNixEnv function, which builds two qemu images (one plain NixOS and one with dpdk l2fwd running) and the corresponding initrd kernel fixtures.
Using all executed benchmarks, [mkBenchmarkCSV](https://github.com/snabblab/snabblab-nixos/blob/master/lib/benchmarks.nix#L200-217) generates one big CSV consisting of the input specifications and the measured benchmarking values.
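The aggregation step can be approximated as follows. This is a Python sketch of the general idea, not a port of mkBenchmarkCSV; the column names, the sample rows, and the make_benchmark_csv helper are hypothetical.

```python
import csv
import io

# Hypothetical results of executed benchmarks: inputs plus the measured value.
results = [
    {"benchmark": "basic", "snabb": "master", "kernel": "3.18", "id": 1, "score": 12.5},
    {"benchmark": "basic", "snabb": "next", "kernel": "3.18", "id": 1, "score": 13.1},
]

def make_benchmark_csv(rows: list) -> str:
    """Merge per-benchmark rows into one CSV document with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = make_benchmark_csv(results)
print(csv_text)
```

Storing the inputs next to each measured value is what lets a report group and compare results by any dimension of the matrix.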
Note: this is very WIP and not all servers are deployed using this workflow yet.
NixOps is used for provisioning the machines.
It uses an SQLite database (~/.nixops/deployments.nixops) to store state
about the provisioning. For example SSH keys, path to nix files, current
First, create a nixops deployment:
$ nixops create -d lab-production ./machines/lab.nix ./machines/lab-production.nix
The server needs a basic NixOS install running SSH with your public key configured.
Edit machines/lab-production.nix and add a new machine.
$ nixops deploy -d lab-production --include mymachine
Edit machines/lab-production.nix and add a new machine.
To bootstrap a Hetzner machine, we need to use an https://robot.your-server.de/ account:
$ HETZNER_ROBOT_USER= HETZNER_ROBOT_PASS= nixops deploy -d lab-production --include mymachine
Copy the generated Nix configuration into a separate file:
$ nixops export -d lab-production | ./convert_export.py > ./machines/lab-export.nix
A developer pushes a configuration change to Git, Hydra builds and tests it, and the servers are set up to update themselves automatically from Hydra. For each machine there is a separate channel that serves up that machine's software and configuration.
Testing Snabblab changes manually
Some changes in the repository may trigger massive rebuilds; for example, some benchmarks can take more than a day to execute.
For this reason, such changes should go to the