Automation of Rosa Scaling Benchmark (#444): Part 2 #492

andyuk1986 · 2023-08-16T07:35:34Z

PR contains:

Created two new actions that run Prometheus queries, calculate the result numbers, and store them in a JSON file;
Besides the result JSON file, I am publishing the intermediate files with the metric numbers from before and after the benchmark execution, so that the correctness of the numbers can be debugged before merging the PR;
Added "continue-on-error": true to the benchmark execution steps, as Gatling sometimes reports a failure during report generation (if some of the conditions are not met) and the whole workflow then fails, even though the metrics can still be collected and calculated;
Fixed the if conditions on the action steps, as the steps were running regardless of whether the condition was met;

A 5-minute workflow run produces the following JSON:

{
  "Testing memory for creating sessions": {
    "500 MB Memory per pod in 3 node cluster handles:": "114831 Active User Sessions"
  },
  "Testing CPU usage for user logins": {
    "1vCPU per pod in 3 node cluster handles:": "45 User Logins Per Second"
  },
  "Testing CPU usage for client credential grants": {
    "1vCPU per pod in 3 node cluster handles:": "478 Client Credential Grants Per Second"
  }
}

Closes #444

ahus1 · 2023-08-16T14:25:17Z

@andyuk1986 - thank you for preparing calculation of the results.

I pushed some small code syntax changes; I hope they don't break anything while making the code a bit simpler to read. The # language=bash prefix instructs IntelliJ to apply syntax highlighting to the key that follows.

The calculations of the requests per second look good for both users and clients.

The calculation for memory is probably off: the CRITERIA_VALUE used here is the number of entities created in the realm. The correct value to use IMHO would be the number of sessions created. An approximation would be "users per second logging in" multiplied by "measurement time in seconds". The number of sessions is also available as a metric from Prometheus, so one could read it before and after the run, but that would be a bit more work. Please choose either way, and update the calculations.
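The two options described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the sample inputs are hypothetical, and the actual workflow may compute the value differently.

```python
def sessions_from_login_rate(logins_per_second: float, measurement_seconds: int) -> int:
    """Approximate the sessions created as login rate times measurement time."""
    return round(logins_per_second * measurement_seconds)


def sessions_from_counter(before: float, after: float) -> int:
    """Sessions created during the run, from a counter read before and after it."""
    if after < before:
        # A counter that went down was probably reset while the benchmark ran.
        raise ValueError("counter decreased; it may have been reset during the run")
    return round(after - before)


if __name__ == "__main__":
    # Hypothetical inputs: 45 logins per second over a 5 minute (300 s) run.
    print(sessions_from_login_rate(45, 300))
    # Hypothetical counter readings taken before and after the run.
    print(sessions_from_counter(1_000, 14_500))
```

The first variant needs no extra query but assumes a steady login rate; the second reads the real number at the cost of two additional Prometheus reads.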

This is the JSON I found in one of your runs:

{
  "Testing memory for creating sessions": {
    "500 MB Memory per pod in 3 node cluster handles:": "171997 Active User Sessions"
  },
  "Testing CPU usage for user logins": {
    "1vCPU per pod in 3 node cluster handles:": "47 User Logins Per Second"
  },
  "Testing CPU usage for client credential grants": {
    "1vCPU per pod in 3 node cluster handles:": "486 Client Credential Grants Per Second"
  }
}

I ask you to make the result a simple value so it can be used in reports. It shouldn't be quoted as it is a number, not a string.

Because of that, the naming changed a little, and I also tried to shorten it:

{
  "Memory usage for sessions": {
    "Active sessions per 500 MB memory per Pod in 3 Pod cluster": 171997
  },
  "CPU usage for user logins": {
    "User Logins per second per 1vCPU per Pod in a 3 Pod cluster": 47
  },
  "CPU usage for client credential grants": {
    "Client Credential Grants per second per 1vCPU per Pod in a 3 Pod cluster": 486
  }
}
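To illustrate why unquoted values help in reports, here is a minimal Python sketch that loads the JSON above and uses the numbers directly. The helper name flatten_report is hypothetical, and the reporting step itself is an assumption, not part of this PR.

```python
import json

# The shortened result format proposed above, with numeric values.
RESULT_JSON = """
{
  "Memory usage for sessions": {
    "Active sessions per 500 MB memory per Pod in 3 Pod cluster": 171997
  },
  "CPU usage for user logins": {
    "User Logins per second per 1vCPU per Pod in a 3 Pod cluster": 47
  },
  "CPU usage for client credential grants": {
    "Client Credential Grants per second per 1vCPU per Pod in a 3 Pod cluster": 486
  }
}
"""


def flatten_report(report: dict) -> dict:
    """Collapse the two-level report into a single metric-name -> value mapping."""
    return {
        metric: value
        for section in report.values()
        for metric, value in section.items()
    }


if __name__ == "__main__":
    metrics = flatten_report(json.loads(RESULT_JSON))
    for name, value in metrics.items():
        # Values arrive as numbers, so they can be formatted or compared directly.
        print(f"{name}: {value:,}")
```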

Please let me know if you have questions.

andyuk1986 · 2023-08-17T17:46:14Z

@ahus1 hi, thanks a lot for the review. I have updated the PR with the suggested changes.
Currently the new JSON file looks like this (example for benchmarks run for 5 minutes):

{
  "Memory usage for sessions": {
    "Active sessions per 500 MB memory per Pod in 3 Pod cluster": "143116"
  },
  "CPU usage for user logins": {
    "User Logins per second per 1vCPU per Pod in 3 Pod cluster": "37"
  },
  "CPU usage for client credential grants": {
    "Client Credential Grants per second per 1vCPU per Pod in 3 Pod cluster": "427"
  }
}

ahus1

Thank you for this contribution. It's great that we now have this automation to calculate the metrics at the click of a button within one hour.

I made two small changes:

Adding a |tonumber to the jq command so that the value is printed without quotes and can be processed in future steps without converting it from a string to a number.
Masking the value of OC_TOKEN so it doesn't show up in the console.
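For a consumer that still receives the earlier, quoted output, the same string-to-number conversion can be sketched in Python. This is an analogue of the jq tonumber step, not the code in the workflow; the helper names to_number and normalize_report are hypothetical.

```python
import json


def to_number(value):
    """Convert a numeric string such as "143116" to int, or to float if needed."""
    if isinstance(value, (int, float)):
        return value  # already a number, nothing to do
    text = str(value).strip()
    try:
        return int(text)
    except ValueError:
        return float(text)  # raises ValueError for non-numeric input


def normalize_report(report: dict) -> dict:
    """Return a copy of the two-level report with every value converted to a number."""
    return {
        section: {metric: to_number(value) for metric, value in metrics.items()}
        for section, metrics in report.items()
    }


if __name__ == "__main__":
    quoted = {
        "Memory usage for sessions": {
            "Active sessions per 500 MB memory per Pod in 3 Pod cluster": "143116"
        },
        "CPU usage for user logins": {
            "User Logins per second per 1vCPU per Pod in 3 Pod cluster": "37"
        },
    }
    print(json.dumps(normalize_report(quoted), indent=2))
```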

andyuk1986 changed the title ~~Retrieving Prometheus metrics and showing results.~~ Automation of Rosa Scaling Benchmark (#444): Part 2 Aug 16, 2023

andyuk1986 mentioned this pull request Aug 16, 2023

Investigate automation of ROSA scaling benchmark #444

Closed

keycloak deleted a comment from andyuk1986 Aug 16, 2023

Anna Manukyan and others added 2 commits August 17, 2023 13:32

Retrieving Prometheus metrics and showing results.

b73f8dd

Reviewing code changes

8f7e2a4

andyuk1986 force-pushed the prometheus_metrics branch 9 times, most recently from eb9b965 to 71e4808 Compare August 17, 2023 16:23

andyuk1986 force-pushed the prometheus_metrics branch 3 times, most recently from 3240be3 to 226727b Compare August 17, 2023 18:41

Fixed memory measurement and string format changes.

ba6b5e9

andyuk1986 force-pushed the prometheus_metrics branch from 226727b to ba6b5e9 Compare August 17, 2023 19:20

updating docs, Hide OC TOKEN in console

b5b696d

ahus1 force-pushed the prometheus_metrics branch from fd4a79b to b5b696d Compare August 17, 2023 21:20

ahus1 approved these changes Aug 17, 2023

ahus1 merged commit 73eb97c into keycloak:main Aug 17, 2023
2 checks passed
