
Load Testing Plan

Pierre Bastianelli edited this page Jun 3, 2021 · 3 revisions

This document describes the load-testing procedure run on June 1st

Objectives

  • Verify that the system can handle the high load expected as the application submission deadline approaches
  • Collect performance data on queries and mutations
  • Assess which operations can be optimized

Tools

  • k6, an open-source load-testing toolkit
  • Node.js to orchestrate the k6 runs
  • an ad-hoc Helm chart to deploy load-testing data and run a load-testing job on the OpenShift platform

Testing plan

We split the load testing into two plans:

Queries: load test

  • Admin queries and Reporter queries were split into two separate load-testing scenarios, run sequentially.
  • Each VU (Virtual User) cycles through all the queries as fast as it can. k6 lets us specify how many VUs
    run for how long, and interpolates between the points we give.
// k6 options used for the queries load test
// (in k6, stage definitions live in the exported `options` object)
export const options = {
  stages: [
    {duration: '2m', target: 100},
    {duration: '2m', target: 100},
    {duration: '1m', target: 200},
    {duration: '1m', target: 200},
    {duration: '1m', target: 100},
    {duration: '2m', target: 100},
    {duration: '2m', target: 0}
  ]
};
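
k6 interpolates the VU count linearly between stage targets. As a sanity check on the plan above, the interpolation can be reproduced in plain Node (this helper is ours, not a k6 API):

```javascript
// Compute the linearly interpolated VU target at a given elapsed time
// for a k6-style stage list. Durations are in seconds here for simplicity.
function vusAt(stages, startVus, elapsed) {
  let t = 0;
  let prev = startVus;
  for (const { duration, target } of stages) {
    if (elapsed <= t + duration) {
      const frac = (elapsed - t) / duration;
      return Math.round(prev + (target - prev) * frac);
    }
    t += duration;
    prev = target;
  }
  return prev; // past the last stage
}

// The queries plan above, with durations converted to seconds:
const stages = [
  { duration: 120, target: 100 },
  { duration: 120, target: 100 },
  { duration: 60, target: 200 },
  { duration: 60, target: 200 },
  { duration: 60, target: 100 },
  { duration: 120, target: 100 },
  { duration: 120, target: 0 },
];

console.log(vusAt(stages, 0, 60));  // halfway through the first ramp-up → 50
console.log(vusAt(stages, 0, 270)); // halfway through the ramp to 200 → 150
```

So the plan ramps to 100 VUs, holds, spikes to 200 for the middle stretch, and ramps back down to 0 over roughly eleven minutes.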

Mutations: spike test

We identified the mutations that put the heaviest load on the system:

  • createApplicationMutation, which is called once per facility
  • updateFormResultMutation, which is called repeatedly while applicants fill out the form

Since the system allows only one application per facility, this required setting up a large number
of facilities ahead of the testing.
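
Because each facility accepts exactly one application, the pre-seeded facilities can be partitioned across VUs so that no two VUs compete for the same facility. A sketch in plain Node (in the real k6 script the VU and iteration numbers would come from k6's execution context; the mapping itself is our assumption about how the test data was partitioned):

```javascript
// Map (vu, iteration) to a unique facility index so 100 VUs × 10
// iterations cover exactly 1000 pre-seeded facilities with no overlap.
function facilityIndex(vu, iteration, iterationsPerVu) {
  // vu is 1-based (as in k6), iteration is 0-based
  return (vu - 1) * iterationsPerVu + iteration;
}

// VU 1 gets facilities 0..9, VU 2 gets 10..19, ..., VU 100 gets 990..999.
console.log(facilityIndex(1, 0, 10));   // 0
console.log(facilityIndex(100, 9, 10)); // 999
```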

// We have 1000 facilities.
// This scenario will start a spike of 100 VUs,
// each creating 10 applications and updating a form result
export const options = {
  scenarios: {
    mutations_spike: {
      vus: 100,
      iterations: 10,
      executor: 'per-vu-iterations'
    }
  }
};
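
The VU code for this scenario would POST the two mutations to the GraphQL endpoint. The schema isn't reproduced on this page, so the payload shapes below (operation fields, variable names) are illustrative assumptions only:

```javascript
// Illustrative GraphQL payload builders for the two heavy mutations.
// Field and variable names are assumptions, not the real schema.
function createApplicationPayload(facilityId) {
  return JSON.stringify({
    query: `mutation createApplicationMutation($input: CreateApplicationInput!) {
      createApplication(input: $input) { application { id } }
    }`,
    variables: { input: { facilityId } },
  });
}

function updateFormResultPayload(formResultId, formResult) {
  return JSON.stringify({
    query: `mutation updateFormResultMutation($input: UpdateFormResultInput!) {
      updateFormResult(input: $input) { formResult { id } }
    }`,
    variables: { input: { id: formResultId, formResultPatch: { formResult } } },
  });
}
```

In the k6 script itself, these bodies would be sent with k6's http.post against the GraphQL endpoint, with each VU targeting its own slice of the pre-seeded facilities.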

Artifacts

All artifacts used for the load testing can be found here:

Data Collected

We collected different metrics from the server and the database:

  • pg_stat_statements data
    psql -c "\copy (select * from pg_stat_statements) to stdout csv header" > stats.csv
  • PostgreSQL logs from Patroni
    data from /home/postgres/pgdata/pgroot/pg_log/postgresql-*.csv
  • k6 log output, as JSON, stored on an ad-hoc PVC
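
The exported stats.csv can then be post-processed to rank statements by total execution time. A minimal sketch (it assumes the pg_stat_statements columns query and total_time; PostgreSQL 13+ renames the latter to total_exec_time):

```javascript
// Rank pg_stat_statements rows by total_time. Naive CSV parsing:
// assumes no embedded commas/quotes, which is fine for a sketch.
function topQueries(csv, n) {
  const [header, ...rows] = csv.trim().split('\n');
  const cols = header.split(',');
  const qi = cols.indexOf('query');
  const ti = cols.indexOf('total_time');
  return rows
    .map((r) => r.split(','))
    .map((f) => ({ query: f[qi], totalTime: Number(f[ti]) }))
    .sort((a, b) => b.totalTime - a.totalTime)
    .slice(0, n);
}

// Synthetic sample in the same shape as the export above:
const sample = `query,calls,total_time
select * from application,120,834.2
select * from facility,4000,90.1
update form_result,9000,15023.7`;

console.log(topQueries(sample, 2).map((q) => q.query));
// → [ 'update form_result', 'select * from application' ]
```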

Data Monitored

  • Application health on an ad-hoc Sysdig dashboard
  • Pod health in the OCP 4 console
  • Infrastructure health, monitored by Platform Services

Analysis

...coming up...