docs: Add initial LMAT instructions
Callisto13 committed Aug 2, 2022
1 parent 2dc51e8 commit 51cfe3d
Showing 3 changed files with 451 additions and 2 deletions.
261 changes: 259 additions & 2 deletions README.md
@@ -9,16 +9,273 @@ and run: mdtoc -inplace README.md
- [What they test](#what-they-test)
- [How they work](#how-they-work)
- [How to run...](#how-to-run)
- [Locally](#locally)
- [Locally (option 1)](#locally-option-1)
- [Tunables](#tunables)
- [Locally (option 2)](#locally-option-2)
- [Locally (option 3)](#locally-option-3)
- [In CI](#in-ci)
<!-- /toc -->

## What they test

The LMATS are the highest level suite for the Liquid Metal project. Thus they
ensure that the basic behaviour exposed to a user does what it should.

They ensure that the 2 key components of Liquid Metal ([flintlock][flintlock]
and [CAPMVM][capmvm]) work properly together.

They run daily as a GitHub Action. See [here][actions] for runs and results.

## How they work

This repo contains the infrastructure config and "trigger points" for running the
LMATS on a non-local (as in not on your computer) bare-metal environment.
The test code itself for now lives in [CAPMVM][capmvm-e2e].

There are 2 main parts to this repo:
- [`terraform/`][tf] which contains manifests for provisioning bare-metal infrastructure
and configuring flintlock.
- [`cmd/`][tool] which triggers the execution of the tests.

The sequence of events for a full run is:
- Terraform section...
  - Check capacity of Equinix for requested device types and elect metro with
    sufficient space
  - Generate SSH keys for use during infrastructure provisioning and later test
    execution
  - Create Equinix project
  - Create 1 host to act as the CAPI management cluster and network "hub"
  - Create 2 further hosts to run flintlock (_can be overridden_)
  - Bootstrap some rudimentary networking
  - Provision flintlock
  - Prepare the "management" host to run CAPI
- Test runner section (over SSH to the management host)...
  - Prepare configuration based on the output of the Terraform step and any
    action inputs
  - Clone CAPMVM on the management host
  - Change into the directory and run the e2e tests
- E2E section (streamed over SSH from the management host)...
  - Create a kind cluster
  - Initialise the cluster with required CAPI controllers
  - Generate a template for the CAPMVM workload
  - Apply the workload to the kind cluster
  - Ensure all supplied flintlock hosts have been used
  - Deploy an application to the workload cluster
- Teardown

## How to run...

### Locally
This system is primarily intended to be used by:
- CI (we cannot enable KVM in action runners, so we have to do a lot of infra
provisioning)
- People who do not want, do not have, or have totally borked their local flintlock /
general Liquid Metal environment on their own computer

It is possible, although not really advisable or necessary, to run it locally and
there are a few options for doing so.

### Locally (option 1)

To run the LMATS against non-local bare-metal infrastructure, first clone and
change into this repo:

```bash
git clone https://github.com/weaveworks-liquidmetal/liquid-metal-acceptance-tests
cd liquid-metal-acceptance-tests
```

Set the required environment variables:

```bash
export METAL_AUTH_TOKEN=
export METAL_ORG_ID=
```

_If you are a quicksilver team-member, or part of Weaveworks, these credentials
can be found in 1Pass. Ask Claudia if you are not sure where._
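
Since a full run takes a while, it can be worth failing early if either variable
is missing. A minimal sketch, assuming a hypothetical helper named `require_env`
(it is not part of this repo):

```bash
# Hypothetical helper: fail early if any named variable is missing from the
# exported environment (which is what make and terraform will actually see).
require_env() {
  local name missing=0
  for name in "$@"; do
    # printenv prints nothing for variables that are unset or not exported.
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example: `require_env METAL_AUTH_TOKEN METAL_ORG_ID && make all`.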

Call the Make command:

```bash
make all
```

This process is quite lengthy: expect it to take 10-20 mins. The test section
alone can take up to 5 mins to run (I am working on making that faster).

To work in steps, or to run the tests several times with the same infrastructure,
you can call the individual targets:

```bash
make tf-up
make e2e # add any flags here as E2E_ARGS="--foo bar" see 'Tunables' below for more
make tf-down
```
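
When repeating the tests against the same infrastructure, it is easy to forget
the teardown after a failure. The sketch below wraps the three targets above so
that `tf-down` always runs. The function name `lmats_repeat` and the `MAKE`
override are assumptions made for illustration, not something this repo provides.

```bash
# Hypothetical wrapper: provision once, run the tests N times, always tear down.
# MAKE can be overridden (e.g. with a stub for a dry run); it defaults to make.
lmats_repeat() {
  local runs="${1:-1}" make_cmd="${MAKE:-make}" status=0 i=1
  if "$make_cmd" tf-up; then
    while [ "$i" -le "$runs" ]; do
      # Stop at the first failing run but remember its exit code.
      "$make_cmd" e2e || { status=$?; break; }
      i=$((i + 1))
    done
  else
    status=$?
  fi
  # Tear down even if provisioning or a test run failed, so no hosts are left behind.
  "$make_cmd" tf-down || status=$?
  return "$status"
}
```

For example, `lmats_repeat 3` would provision once, run `make e2e` up to three
times, and then destroy the infrastructure.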

#### Tunables

The following configuration options/variables can be changed via the environment:
- `PROJECT_NAME`: change the name of the project to be created in Equinix (default:
  `"liquid-metal-acceptance-tests"`). Note that project names in Equinix are not
  unique, so if you wish to use an existing project, setting this will not work.
- `FLINTLOCK_VERSION`: change the version of flintlock used in the tests (default:
  [latest][flintlock-releases]).
- `DEVICE_COUNT`: change the number of bare-metal hosts which will run flintlock
  (default: `2`).
- `DEVICE`: change the type of Equinix devices (default: `c3.small.x86`).
- `E2E_ARGS`: append flags to the test command:
  - `-version`: the version of CAPMVM to use in the tests (if set, will override
    `-repo` and `-branch`). Must match exactly the tag name of the release, e.g. `v0.1.0`.
  - `-repo`: the URL to a repo (fork) of CAPMVM to use in the tests.
  - `-branch`: the name of a branch to use in the tests. Can be used in combination
    with `-repo` or alone to target a branch of the upstream repo.
  - These flags are properties of the test runner. For more information on how
    that works and what other flags are available, see the [tool readme][tool].

For example, to run the LMATS against version `v0.1.0` of Flintlock and against
a branch on my fork of CAPMVM:

```bash
export FLINTLOCK_VERSION=v0.1.0
make all E2E_ARGS="--repo https://github.com/Callisto13/cluster-api-provider-microvm --branch e2e"
```
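
If you script these runs, the flag string can be assembled from optional values.
The following is only a sketch: `e2e_args` is a hypothetical helper, and it
simply mirrors the documented rule that `-version` overrides `-repo` and
`-branch` rather than relying on the tool to ignore them.

```bash
# Hypothetical helper: print an E2E_ARGS value from optional version/repo/branch.
# Mirrors the documented rule that -version overrides -repo and -branch.
e2e_args() {
  local version="${1:-}" repo="${2:-}" branch="${3:-}" args=""
  if [ -n "$version" ]; then
    args="-version $version"
  else
    if [ -n "$repo" ]; then args="-repo $repo"; fi
    # Add a separating space only when something is already in args.
    if [ -n "$branch" ]; then args="${args:+$args }-branch $branch"; fi
  fi
  printf '%s\n' "$args"
}
```

For example: `make all E2E_ARGS="$(e2e_args "" https://github.com/Callisto13/cluster-api-provider-microvm e2e)"`.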

### Locally (option 2)

If you are not interested in running the tests against a bare-metal host so far
away, you can simply run the E2Es in CAPMVM without any of this. You won't need
to clone this repo, but you will need two others and will need to put a bit more
work into setting up.

_Note this will only be applicable to people running Linux._

First set up a flintlock server:

```bash
git clone https://github.com/weaveworks-liquidmetal/flintlock
cd flintlock
sudo ./hack/scripts/provision.sh --grpc-address 0.0.0.0:9090 --dev --insecure
# the script will ask you to confirm some choices
cd ..
```

Then clone CAPMVM:

```bash
git clone https://github.com/weaveworks-liquidmetal/cluster-api-provider-microvm
cd cluster-api-provider-microvm
```

Ensure you have the following installed:
- [kind](https://kind.sigs.k8s.io/)
- [docker](https://docs.docker.com/engine/install/ubuntu/)
- [kubectl](https://kubernetes.io/docs/tasks/tools/)
- [clusterctl](https://cluster-api.sigs.k8s.io/user/quick-start.html#install-clusterctl)

And run the tests 100% locally from the CAPMVM repo:

```bash
FL=$(hostname -I | awk '{print $1}') # should get the first private IP of your machine
export FLINTLOCK_HOSTS="$FL:9090"
make e2e
```
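
A malformed `FLINTLOCK_HOSTS` value is an easy mistake to make here. As a rough
sanity check, the sketch below verifies that every comma-separated entry looks
like `host:port`. The function `valid_flintlock_hosts` is hypothetical, and it
deliberately rejects anything with more than one colon (so no IPv6 literals).

```bash
# Hypothetical check: every comma-separated entry must look like host:port.
valid_flintlock_hosts() {
  local hosts="${1:-}" entry entries
  # One or more non-colon, non-space characters, a colon, then a numeric port.
  local re='^[^:[:space:]]+:[0-9]+$'
  [ -n "$hosts" ] || return 1
  IFS=',' read -ra entries <<< "$hosts"
  for entry in "${entries[@]}"; do
    [[ "$entry" =~ $re ]] || return 1
  done
  return 0
}
```

For example: `valid_flintlock_hosts "$FLINTLOCK_HOSTS" && make e2e`.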

More options/flags are available on the tests at this level, see their dedicated
[docs][capmvm-e2e] for more.

### Locally (option 3)

The last option is for those who have borked or just don't want to set up their
flintlock, but they perhaps want to iterate on a local version of CAPMVM. Here we
have a mix of both worlds, where you use the LMATS to provision flintlock on remote
Equinix hosts, and then tell the local E2Es where those hosts are.

Clone this repo:

```bash
git clone https://github.com/weaveworks-liquidmetal/liquid-metal-acceptance-tests
cd liquid-metal-acceptance-tests
```

Set the required environment variables:

```bash
export METAL_AUTH_TOKEN=
export METAL_ORG_ID=
```

Create the Equinix infrastructure:

```bash
make tf-up
# take note of the 'host_ips' in the terraform output
```

TODO: there is some additional networking needed here to ensure that CAPMVM can
access the load balancer address of the created workload cluster. I will add it
at some point. https://github.com/weaveworks-liquidmetal/liquid-metal-acceptance-tests/issues/5

Then clone CAPMVM:

```bash
git clone https://github.com/weaveworks-liquidmetal/cluster-api-provider-microvm
cd cluster-api-provider-microvm
```

Ensure you have the following installed:
- [kind](https://kind.sigs.k8s.io/)
- [docker](https://docs.docker.com/engine/install/ubuntu/)
- [kubectl](https://kubernetes.io/docs/tasks/tools/)
- [clusterctl](https://cluster-api.sigs.k8s.io/user/quick-start.html#install-clusterctl)

And run the tests locally:

```bash
# replace the ips here with the ones you noted from the terraform output
export FLINTLOCK_HOSTS="1.2.3.4:9090,5.6.7.8:9090"
make e2e
```
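
Rather than typing the list by hand, the noted IPs can be joined into the
expected format. This is a small sketch with a hypothetical helper,
`flintlock_hosts`; the port `9090` in the usage line is taken from the example
above and is assumed to be what the remote flintlock servers listen on.

```bash
# Hypothetical helper: join IPs into "ip:port,ip:port" for FLINTLOCK_HOSTS.
# Usage: flintlock_hosts PORT IP [IP...]
flintlock_hosts() {
  local port="$1" ip out=""
  shift
  for ip in "$@"; do
    # Prefix a comma only when out already holds an entry.
    out="${out:+$out,}$ip:$port"
  done
  printf '%s\n' "$out"
}
```

For example: `export FLINTLOCK_HOSTS="$(flintlock_hosts 9090 1.2.3.4 5.6.7.8)"`.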

More options/flags are available on the tests at this level, see their dedicated
[docs][capmvm-e2e] for more.

Don't forget to destroy the infrastructure when you are done:

```bash
make tf-down
```

### In CI

The LMATS will run every day automatically, but they can also be triggered manually
and configured to run with a combination of component versions.

_Note: this option is only available to members of Weaveworks._

Navigate to the [actions tab][actions].

Select the `Run workflow` dropdown on the right.

To run with the default settings, click the green `Run workflow` button.

Otherwise you can configure any/all of the below before triggering:
- `flintlock_version`: the version of flintlock to use in the tests.
- `capmvm_version`: the version of CAPMVM to use in the tests (if set will override
`capmvm_repo` and `capmvm_branch`). Must match exactly the tag name of the release, eg: `v0.1.0`.
- `capmvm_repo`: the URL to a repo (fork) of CAPMVM to use in the tests.
- `capmvm_branch`: the name of a branch to use in the tests. Can be used in combination
with `capmvm_repo` or alone to target a branch of the upstream repo.
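
The same inputs could in principle be supplied from a terminal with the GitHub
CLI instead of the web UI. The sketch below only prints the command so it can be
reviewed first. It assumes that `gh` is installed and authenticated, that you
have permission to trigger the workflow, and that the workflow file is named
`nightly_e2e.yml` (inferred from the actions link); `lmats_dispatch_cmd` is a
hypothetical helper.

```bash
# Hypothetical helper: print the gh command for a manual LMATS run.
# Arguments are key=value pairs using the input names listed above.
lmats_dispatch_cmd() {
  local input
  local cmd=(gh workflow run nightly_e2e.yml
    --repo weaveworks-liquidmetal/liquid-metal-acceptance-tests)
  for input in "$@"; do
    # -f passes one workflow input as key=value.
    cmd+=(-f "$input")
  done
  printf '%s\n' "${cmd[*]}"
}
```

For example, `lmats_dispatch_cmd flintlock_version=v0.1.0 capmvm_branch=e2e`
prints a command that sets two of the inputs and leaves the rest at their defaults.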

It can take up to 20 mins to provision the infra and run the tests. The result will
be posted in the `#team-quicksilver` Slack channel.

If anything goes wrong there is a step in the action to remove all the infra.
I will be exposing an option to keep things around if needed.

[flintlock]: https://github.com/weaveworks-liquidmetal/flintlock
[capmvm]: https://github.com/weaveworks-liquidmetal/cluster-api-provider-microvm
[capmvm-e2e]: https://github.com/weaveworks-liquidmetal/cluster-api-provider-microvm/test/e2e
[flintlock-releases]: https://github.com/weaveworks-liquidmetal/flintlock/releases
[tool]: /cmd
[tf]: /terraform
[actions]: https://github.com/weaveworks-liquidmetal/liquid-metal-acceptance-tests/actions/workflows/nightly_e2e.yml
100 changes: 100 additions & 0 deletions cmd/README.md
@@ -0,0 +1,100 @@
# Cmd

This is a small helper tool to run the Acceptance tests (LMATS) on remote
Equinix infrastructure.

This tool will change when we have built the scheduler component.

This tool will go away when I have figured out some more networking for the infra.

## What and why

There are 2 reasons it exists:
1. To save time on networking complexity during my initial stab at these tests,
I chose not to set it up so that the CAPI management cluster could be run
from outside the Equinix infra network.
_Technically_ it can be, since the flintlock servers are bound to a public
interface, but the next hurdle then would have been the control plane
load balancer address: I would have had to figure out a way to dynamically reserve
an IPv4 address and then ensure that it was allocated to the workload cluster.
This is not easy to do in Equinix.
Alternatively, I would have had to automate a VPN to route the private subnets
of the infra, which again is a pain. At some point I will get to solving these.
2. Until we develop the dynamic scheduler, we need to inject the individual
flintlock server IPs into any CAPMVM workload cluster template. This is a pain
to do with CAPI/clusterctl and naturally these IPs are not known ahead of time
(although I could do something with DNS I suppose? But then would I have to deal
with records not being updated in time for the test?). So the tests are built
to receive the IPs and then alter the template; this tool handles the extraction,
formatting and pass-through of the created infra IPs from the Terraform output
to the tests. See [here][capmvm-e2e] for more on how the e2es work.

So for now, the tests are triggered locally but actually run from within one of
the Equinix machines.

The sequence of events is as follows:
- The tool is built and called from the Makefile (`make e2e`)
- It processes and validates any given flags
- It parses the `../terraform/terraform.tfstate` file for the `outputs.host_ips`
and `outputs.management_ip`
- The `host_ips` are formatted ready for use as flintlock addresses by the tests
- The command to run over SSH is built from the `e2e.sh` template
- A connection to the `management_ip` is opened using the keys created by the terraform
  provisioning script
- The command is executed
  - Clone CAPMVM at the set version/repo/branch
  - `cd` and start tests
  - `cd ..` and remove the directory
- All output is streamed back in real time
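
To inspect by hand what the tool would read from the state file, something like
the following could be used. It is a sketch, not the tool's implementation: it
assumes `jq` is installed, that `host_ips` is a list of strings and
`management_ip` a single string, and that the flintlock servers listen on port
`9090`. The function name `tfstate_addresses` is hypothetical.

```bash
# Hypothetical helper: read the two outputs the tool uses from a state file.
# Assumes jq is installed and that host_ips is a list of strings.
tfstate_addresses() {
  local state="${1:-terraform/terraform.tfstate}"
  # Terraform stores each output under outputs.<name>.value in the state JSON.
  jq -r '
    "management=" + .outputs.management_ip.value,
    "flintlock=" + (.outputs.host_ips.value | map(. + ":9090") | join(","))
  ' "$state"
}
```

Run from the repo root after `make tf-up`, `tfstate_addresses` prints one
`management=` line and one `flintlock=` line.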

## How to use

The tool is most often called from the root Makefile:

```bash
make build-e2e # creates the binary
make e2e # executes the tool
```

The tool has various flags, none of which need to be set:

```
Usage of ./cmd/bin/e2e:
  -address string
        IP address of host to run SSH command on. (optional)
  -branch string
        Branch within CAPMVM repository to clone for tests. (optional)
  -command string
        Non-standard command to run on the target machine. (optional)
  -flintlock-hosts string
        Comma separated list of flintlock server addresses with ports. (optional)
  -private-key string
        Path to file containing private key for connection address. (optional) (default "keys/lm-ed")
  -repo string
        URL of non-default CAPMVM repository to clone for tests. (optional)
  -state-file string
        Path to terraform state file from which to derive host addresses. (optional) (default "terraform/terraform.tfstate")
  -user string
        User to run command as. (optional) (default "root")
  -version string
        Version of CAPMVM to test against. (optional)
```

These can be passed either to the binary directly:

```bash
./cmd/bin/e2e -repo foo
```

Or when calling the `make` command (preferred):

```bash
make e2e E2E_ARGS="-repo foo"
```

Some flags have an order of precedence:
- If `-version` is set, `-repo` and `-branch` will be ignored
- If `-flintlock-hosts` OR `-address` are set, the tool will not look up the
required connection/test info from the terraform output.

[capmvm-e2e]: https://github.com/weaveworks-liquidmetal/cluster-api-provider-microvm/test/e2e
