
This adds the automated performance test to SAF for use with minishift (#49)

The performance test deploys two pods: one that executes test logic and one that hosts Grafana for data analysis. Collectd-tg is used to pump simulated network data through the smart gateway and into the Prometheus database at user-specified rates, intervals, and sizes. Grafana dashboards are automatically generated to graphically display queries to the database over the duration of each test.
pleimer committed Aug 20, 2019
1 parent 59acd87 commit e185e0b2e4462fd42720af5649421866b36402f1
@@ -12,5 +12,5 @@ install:
script:
- cd deploy
- ./quickstart.sh
- cd ../tests
- ./smoketest.sh
- cd ../tests/smoketest
- ./smoketest.sh
@@ -0,0 +1,22 @@
#--- Build SAF performance test ---
FROM golang:1.12.7
WORKDIR /go/src/performance-test/

COPY ./dashboard.go ./main.go ./parser.go ./

RUN go get gopkg.in/yaml.v2 && \
    go get github.com/grafana-tools/sdk && \
    go build -o main && \
    mv main /tmp/

#--- Create performance test layer ---
FROM tripleomaster/centos-binary-collectd:current-tripleo-rdo
USER root

RUN yum install golang -y && \
    yum update -y && \
    yum clean all

COPY --from=0 /tmp/main /performance-test/exec/main
COPY grafana/apikey /performance-test/grafana/apikey
COPY deploy/scripts/unit-test.sh /performance-test/exec/unit-test.sh
@@ -0,0 +1,71 @@
# SAF Performance Test

## Introduction
The performance test provides an automated environment in which to run stress
tests on SAF locally using Minishift. Collectd-tg is used to simulate
extensive network traffic to pump through SAF. Because Minishift only supports a
single node at a time, this test demonstrates the limits of SAF in a constrained
environment. Test scenarios are manually configured in a YAML file and results
can be analyzed in a series of Grafana dashboards.

Two additional pods are deployed by the performance test: one that hosts a
grafana instance and one that executes the testing logic.

![A Performance Test Dashboard](images/dashboard.png)

## Configuring Tests

Individual tests are configured in the `deploy/config/test-configs.yaml` file.
Each test uses the following format:

```yaml
- metadata:
    name:
  spec:
    value-lists:
    hosts:
    plugins:
    interval:
    length:
    queries:
```

To run multiple tests in sequence, utilize the above format in additional list
entries within the config file. Each test generates a unique dashboard within
grafana and each query adds a new graph to its respective dashboard.

## Options

Option | Description
-------|------------
name | name of the test entry. This will be reflected in the dashboard title
value-lists | collectd-tg option
hosts | collectd-tg option
plugins | collectd-tg option
interval | collectd-tg option
length | number of seconds the test should run, expressed as an unsigned integer
queries | list of PromQL queries that will be graphed within the Grafana dashboard

More information about collectd-tg options can be found in the
[collectd-tg docs](https://collectd.org/documentation/manpages/collectd-tg.1.shtml)

## Example Test
```yaml
- metadata:
    name: SAF Performance Test 1
  spec:
    value-lists: 10000
    hosts: 5000
    plugins: 100
    interval: 1
    length: 900
    queries:
      - rate(sa_collectd_total_amqp_processed_message_count[10s])
      - sa_collectd_cpu_total
```
View the [performance test deployment instructions](deploy/) to launch
the performance test on Minishift.

Once each test completes, a new dashboard is written to Grafana in which
all of the queries are graphed. This can be seen by navigating to
`http://<grafana route URL>/dashboards` in a local browser.
@@ -0,0 +1,127 @@
package main

import (
	"errors"
	"io/ioutil"
	"log"
	"time"

	"github.com/grafana-tools/sdk"
)

// Dashboard bundles a grafana client with the board and panel template it manages
type Dashboard struct {
	client        *sdk.Client
	board         *sdk.Board
	grafUrl       string
	promUrl       string
	panelTemplate []byte
}

// findDsWithUrl searches for a datasource within a grafana instance based on URL
func findDsWithUrl(matchURL string, dsList []sdk.Datasource) (*sdk.Datasource, error) {
	for i := range dsList {
		if matchURL == dsList[i].URL {
			return &dsList[i], nil
		}
	}
	return nil, errors.New("data source url not found")
}

// Setup configures a dashboard object to talk to a grafana instance within a specified time period
func (d *Dashboard) Setup(title string, apiKey string, grafUrl string, start time.Time, end time.Time) {
	log.Print("Setting up grafana client")

	d.grafUrl = grafUrl
	d.client = sdk.NewClient(d.grafUrl, apiKey, sdk.DefaultHTTPClient)

	log.Printf("Creating dashboard %s", title)
	d.board = sdk.NewBoard(title)
	d.board.Timezone = "utc"
	d.board.Time = sdk.Time{
		From: start.Format("2006-01-02 15:04:05Z"),
		To:   end.Format("2006-01-02 15:04:05Z"),
	}
}

// NewPrometheusDs configures a prometheus data source within grafana that is compatible with
// the dashboard object
func (d *Dashboard) NewPrometheusDs(url string) error {
	d.promUrl = url
	existingDS, err := d.client.GetAllDatasources()
	if err != nil {
		return err
	}

	if _, err := findDsWithUrl(url, existingDS); err != nil {
		log.Print("Building new Prometheus datasource")
		newDs := sdk.Datasource{
			Name:      "Prometheus",
			Type:      "prometheus",
			URL:       url,
			Access:    "direct",
			BasicAuth: func() *bool { b := false; return &b }(),
			IsDefault: true,
			JSONData: map[string]string{
				"timeInterval": "1s",
			},
		}
		_, err := d.client.CreateDatasource(newDs)
		time.Sleep(5 * time.Second) // give grafana time to register the new datasource
		return err
	}

	log.Print("Datasource with duplicate URL found. Utilizing original datasource")
	return nil
}

// LoadPanelTemplate loads a pre-defined grafana panel template from a file named fn
func (d *Dashboard) LoadPanelTemplate(fn string) error {
	var err error
	d.panelTemplate, err = ioutil.ReadFile(fn)
	return err
}

// AddGraph generates a new graph within a grafana dashboard
func (d *Dashboard) AddGraph(title string, query string) error {
	log.Printf("Adding graph '%s'", title)
	row := d.board.AddRow(title)

	var graph sdk.Panel
	if err := graph.UnmarshalJSON(d.panelTemplate); err != nil {
		return err
	}

	// set the title before the panel is copied into the row so it takes effect
	graph.Title = title
	graph.AddTarget(&sdk.Target{
		Expr: query,
	})
	row.AddGraph(graph.GraphPanel)

	return nil
}

// Update updates the dashboard if it already exists in Grafana; otherwise it creates a new one.
// If the current dashboard is found to already exist, it is deleted and a new
// board of the same name is written. This is necessary because the API
// overwrite function cannot dynamically update the number of graphs
func (d *Dashboard) Update() error {
	res, err := d.client.SearchDashboards(d.board.Title, false)
	if err != nil {
		return err
	}
	if len(res) > 0 && res[0].Title == d.board.Title {
		log.Print("Overwriting existing dashboard")
		if _, err := d.client.DeleteDashboard(res[0].URI); err != nil {
			return err
		}
	} else {
		log.Print("Creating new dashboard")
	}
	return d.client.SetDashboard(*d.board, false)
}
@@ -0,0 +1,103 @@
# Setup and Deploy Performance Test

## Environment

- minishift v1.34.1
- docker v17.05+

### Setup
SAF must already be deployed on a local Minishift instance with the registry-route
addon enabled. A quick way to do this is to use the `quickstart.sh` script in the
`telemetry-framework/deploy/` directory, which runs the upstream version of SAF
(quickstart can also be used to deploy the downstream version):

```shell
$ minishift addons enable registry-route # Run BEFORE starting minishift
$ minishift start
$ eval $(minishift oc-env)
$ cd $WORKDIR/telemetry-framework/deploy/; ./quickstart.sh
```
More details about deploying SAF on Minishift can be found in the
[SAF deployment docs](../../../deploy/)

The Minishift registry needs to be configured so that a local docker image can
be pushed to it. To do this, a new account with admin privileges must be created
on Minishift. The default admin account cannot be used because it does not
provide a token with which docker can log in to the registry.

```shell
$ oc login -u developer -p passwd # create new user if it does not already exist
$ oc login -u system:admin
$ oc adm policy add-cluster-role-to-user cluster-admin developer # give user admin privileges
$ oc login -u developer -p passwd
$ oc project sa-telemetry # must use same project as SAF
```
## Build

Minishift does not ship an up-to-date version of docker and cannot execute
multistage builds. As a result, the performance test image must be built locally
with docker v17.05 or higher and pushed to the Minishift internal docker registry.

The Minishift registry must be registered as an insecure registry for the local
docker daemon to be able to push to it. On Fedora 30, this can be done like so:

```shell
$ echo { \"insecure-registries\" : [\"$(minishift openshift registry)\"] } \
| sudo tee /etc/docker/daemon.json # add -a if you wish to preserve other insecure registry configurations
$ sudo systemctl daemon-reload
$ sudo systemctl restart docker
$ docker login -u developer -p $(oc whoami -t) $(minishift openshift registry) # log in to registry
```
Check that docker is using the new registry - the address should match that
shown by `oc get routes -n default`
```shell
$ docker info
.
.
Insecure Registries:
docker-registry-default.192.168.42.121.nip.io
.
.
```
Create and push the image to the Minishift registry
```shell
$ cd $WORKDIR/telemetry-framework/tests/performance-test/
$ DOCKER_IMAGE="$(minishift openshift registry)/$(oc project -q)/performance-test:dev"
$ docker build -t $DOCKER_IMAGE .
$ docker push $DOCKER_IMAGE #sometimes this needs to be run more than once
```
Note: if an earlier version of the performance test image was previously
uploaded to the Minishift registry, the old image stream and its associated
containers must be deleted before pushing the new version, otherwise it will not
be properly updated. Refer to the steps in `performance-test/docker-push.sh` to
do that.
## Deploy
Ensure that all of the SAF pods are already marked running with `oc get pods`.
Next, launch the grafana instance for test results gathering. This only needs
to be done once:
```shell
$ cd $WORKDIR/telemetry-framework/tests/performance-test/deploy
$ ./grafana-launcher.sh
```
The grafana launcher script outputs a URL that can be used to log in to the
dashboard. This Grafana instance has all authentication disabled; if, in the
future, the performance test should report to an authenticated Grafana instance,
the test scripts must be modified. Once the Grafana instance is running, launch
the performance test OpenShift job:
```shell
$ ./performance-test.sh
```
This will run all of the tests specified in the `test-configs.yaml` file in
sequence. Monitor the performance test status by watching the job with
`oc get job performance-test-job -w`. The job will run for the sum of the lengths
of all of the tests in the test-config file. Logs can be viewed with
`oc logs -f performance-test-<unique-pod-id>`.
@@ -0,0 +1 @@
{"grafana-host":"grafana-sa-telemetry.192.168.42.143.nip.io","prometheus-host":"prometheus-sa-telemetry.192.168.42.143.nip.io"}
@@ -0,0 +1,23 @@
Interval 1

LoadPlugin cpu
LoadPlugin amqp1
LoadPlugin network

<Plugin "amqp1">
  <Transport "name">
    Host "qdr-white.sa-telemetry.svc.cluster.local"
    Port "5672"
    Address "collectd"
    <Instance "telemetry">
      Format JSON
      PreSettle true
    </Instance>
  </Transport>
</Plugin>

<Plugin network>
  <Listen "127.0.0.1">
    SecurityLevel None
  </Listen>
</Plugin>
@@ -0,0 +1,25 @@
- metadata:
    name: SAF Performance Test 1
  spec:
    value-lists: 10000
    hosts: 5000
    plugins: 100
    interval: 1
    length: 20
    queries:
      - rate(sa_collectd_total_amqp_processed_message_count[10s])
      - rate(sa_collectd_cpu_total[10s])
      - scrape_samples_scraped

- metadata:
    name: SAF Performance Test 2
  spec:
    value-lists: 10000
    hosts: 5000
    plugins: 100
    interval: 1
    length: 500
    queries:
      - rate(sa_collectd_total_amqp_processed_message_count[10s])
      - rate(sa_collectd_cpu_total[10s])
      - scrape_samples_scraped
