Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add postgrest-loadtest based on vegeta #1812

Merged
merged 3 commits into from
Nov 27, 2021

Conversation

wolfgangwalther
Copy link
Member

@wolfgangwalther wolfgangwalther commented Apr 12, 2021

This is my take on #1609. As discussed in that PR, I ran a few tests with vegeta.

Compared to locust, as initially tested in #1609, I identified the following advantages:

  • Support for unix sockets out of the box
  • Running single-threaded is possible (needed for CI with limited no of cores)
  • Supports HTTP/2, once we need it
  • No boilerplate at all for tests - using the plain-text HTTP message format
  • Written in a compiled language for more req/s

I have only written one very basic test, so far. We are not doing too much with the results either, yet. However, I have added a nice framework around the test runner. We have:

  • postgrest-loadtest to run the loadtests in the current working tree
  • postgrest-loadtest-against <commit-ish> to run the loadtests twice for A/B testing. Once on the current working tree and once on the branch that is given by the first argument. It defaults to main, that means postgrest-loadtest-against will give you a test run comparing your current branch against main. This, with some fancy number crunching of results, could be a great tool for CI.
  • As a side-effect of implementing that, I learned about git worktree (quite awesome!) and implemented postgrest-with-git <commit-ish> <sub-command>. This will checkout commit-ish in a temporary directory and run the sub-command in it. This not only allows to compare loadtests on different branches, but also things like:
    • I am currently working on a fancy new nix feature (this one here) and some reviewer suggests a change to one of my open PRs. To make the change without doing anything to my current working tree, I just ran postgrest-with-git remove-setup-hs nix-shell. This started a new nix-shell on that branch. Made the change, pushed, exit, done. Ok, that's really what you would do with git worktree anyway - but I didn't know, so that was quite an awesome thing to do :D.
    • Remo is going on another refactor-spree? Not sure what that does to performance/tests/whatever? Ah, just a quick postgrest-with-git remo/untangle-the-world postgrest-test-spec-all will answer that.
    • Just developed some cool new nix tooling, like we always do, but it's not yet committed and you want to run this tool against a different branch? postgrest-with-git that-other-branch postgrest-my-new-tool has you covered, because even though it runs in the other branch, it still takes the postgrest- command from the current instance of nix-shell. Of course, you can do by just running nix-shell on one branch and then checking out the other branch. Once you have uncommitted changes, it becomes a bit complicated with possible merge conflicts, though... postgrest-with-git has you covered!

After writing this down, I am not sure anymore what the main feature of this PR was... ;)

All those tools are available behind nix-shell --arg loadtest true right now - although the only dependency is now vegeta. So we should just enable it by default, I guess?

There are a few smaller todos in code, but the bigger ones are (mostly taken from #1609):

  • Add to CI
  • Create human readable report
  • Validate consistency across runs in CI
  • Crunch the numbers, compare to a threshold and report back to the PR via GitHub Status API
  • Add lots of tests
  • Add proper bash completion

Any feedback more than welcome!

@monacoremo
Copy link
Member

Awesome!

So we should just enable it by default, I guess?

Yes, absolutely!

@steve-chavez
Copy link
Member

steve-chavez commented Apr 12, 2021

postgrest-loadtest-against
postgrest-with-git remo/untangle-the-world postgrest-test-spec-all

I'd love to have those commands available :D

Amazing work Wolfgang 💯 🥇

No boilerplate at all for tests - using the plain-text HTTP message format

I like how simple the HTTP test is. One thing though, would it be possible to add a random id(as a query param) that changes on every request? Similar to what I do here. This would be useful for future load tests we might add.

@wolfgangwalther
Copy link
Member Author

No boilerplate at all for tests - using the plain-text HTTP message format

I like how simple the HTTP test is. One thing though, would it be possible to add a random id(as a query param) that changes on every request? Similar to what I do here. This would be useful for future load tests we might add.

I guess this would be possible, but would require a bit of overhead around it. Not as simple anymore, for sure. What advantage would you expect when we did randomize the id?

@wolfgangwalther
Copy link
Member Author

I added a few tests regarding headers, because I wanted to know more about #1794 (comment).

@steve-chavez
Copy link
Member

What advantage would you expect when we did randomize the id?

Fairer tests I thought. Might not be that relevant on GET, because during my load tests on cloud, I noticed the same req/s with both same id vs random ids. Testing PATCH on different ids might make a difference in results(though maybe it's more of a postgresql concern, see here).

@wolfgangwalther
Copy link
Member Author

What advantage would you expect when we did randomize the id?

Fairer tests I thought. Might not be that relevant on GET, because during my load tests on cloud, I noticed the same req/s with both same id vs random ids. Testing PATCH on different ids might make a difference in results(though maybe it's more of a postgresql concern, see here).

Hm. I was about to argue that compared to the "whole-system-k6-tests" you're running, where PostgreSQL and it's performance is very important, we don't need to care about PostgreSQL in postgrest-loadtest. To a certain degree this is true, because when we're testing the efficiency of our haskell implementation, it wouldn't matter. But when we're looking at things like #1794 (comment), we obviously care about what PostgreSQL does.

So we should probably draw the line somewhere around the fact, where the performance problem can be fixed:

  • If we decided to do something about the GUC interface in general and improve performance, we should add some tests, e.g. with many headers, to prevent us from having a regression here.
  • Reading through the PATCH issue you've linked, I understand that the fix really is in to be done in the PostgreSQL configuration - not in our code. So for that specific case, I would not implement random ids, because we wouldn't test any possible regression here.

Adding randomness also potentially makes test runs less comparable, if we don't counter that with longer runtimes to balance out the random part. We need to keep in mind that the more tests we run with the loadtest, the longer we need to run the test to get proper results per test-case. I don't think we need to be super careful, but we shouldn't add every possible test-case either. The goal should be to have tests in place for all major paths through PostgREST - and then some additional tests for specific issues we've encountered and fixed to avoid regressions.

WDYT?

@steve-chavez
Copy link
Member

Adding randomness also potentially makes test runs less comparable, if we don't counter that with longer runtimes to balance out the random part. We need to keep in mind that the more tests we run with the loadtest, the longer we need to run the test to get proper results per test-case.

Very true. Let's keep random ids out until they're really needed.

The goal should be to have tests in place for all major paths through PostgREST - and then some additional tests for specific issues we've encountered and fixed to avoid regressions.

Yes, Fully agree.

@wolfgangwalther
Copy link
Member Author

Last night I felt brave and tried running postgrest-loadtest-against v7.0.1 to see where we're at. Had to made a few changes to support older versions, because earlier commits don't have environment variable support. That part is not pushed, yet, because I took a few shortcuts with hardcoded paths etc...

Anyway the result was... surprising. v7.0.1 seemed to be muuch faster than current HEAD. I wanted to see what was going on and wrote a small script to run postgrest-loadtest against every commit since v7.0.1 in order. To get an idea of how consistent the measurements are going to be, I made each run 3 times, with 1s, 5s and 25s duration respectively. I only ran one very simple GET request, nothing more. The numbers are given in hypothetical req/s (based on minimum latency). The higher, the better.

Here's the data
hash date 1s 5s 25s subject
222a530 2020-05-18T16:50:53-05:00 3543 3552 3671 Fix Dockerfile
67344c8 2020-05-19T13:59:33-05:00 3565 3572 3622 cabal: fix upload to hackage (#1532)
289bb66 2020-05-20T11:49:11-05:00 3543 3639 3610 Reduce size of the static docker image by patching the Nix openssl package (#1528)
e6874c8 2020-05-20T14:10:53-05:00 3637 3631 3668 Fix UNKNOWN being shown as git commit hash (#1533)
48c9ac3 2020-05-20T21:30:20-05:00 3553 3635 3675 Bump nixpkgs and simplify/clean up nix setup (#1529)
3a1844e 2020-05-21T14:47:07-05:00 3683 3651 3727 Fix the static build (#1534)
d4aba5c 2020-05-22T14:30:23-05:00 3719 3578 3669 Add notice about single Linux static executable
08186ea 2020-05-25T11:27:35-05:00 3724 3687 3682 Nixify CircleCI setup (#1535)
784ebe5 2020-05-26T10:43:14-05:00 3517 3579 3666 Allow aeson 1.5 (#1537)
69b09e3 2020-05-28T12:45:14-05:00 3552 3552 3656 Nixify io and memory tests (#1538)
0f0d617 2020-06-05T13:01:36-05:00 3437 3534 3582 Allow http status override through response.status guc (#1541)
8588a42 2020-06-05T13:40:03-05:00 3517 3556 3595 Fix broken links in .github dir
24064f8 2020-06-05T13:57:47-05:00 3549 3510 3529 github: CHANGELOG reminder in PR template
43d71e9 2020-06-24T19:00:02-05:00 3515 3546 3588 Allow schema cache reloading with NOTIFY (#1542)
a5bc293 2020-07-03T13:54:39-05:00 3462 3582 3621 add test cases for singular minimal
55b4f4f 2020-07-03T17:23:10-05:00 3438 3543 3602 Fix expired JWTs starting an empty transaction
343e41c 2020-07-08T14:36:40-05:00 3525 3548 3570 Location header improvements(#1475)
896b79f 2020-07-13T11:30:16-05:00 3557 3506 3539 refactor: separate reading file from parsing it
96a16a3 2020-07-13T11:30:16-05:00 3538 3522 3545 refactor: move loadDbUriFile/SecretFile to Config
0ff05ed 2020-07-13T11:30:16-05:00 3492 3542 3523 refactor: move parseSecret out of App.postgrest
e272ea4 2020-07-13T11:30:16-05:00 3478 3612 3632 refactor: move logStdout/corsPolicy to Middleware
e8b4e37 2020-07-13T11:30:16-05:00 3592 3536 3524 Allow config file reloading with SIGUSR2
1f6a824 2020-07-13T18:47:45-05:00 3597 3525 3572 Bump nixpkgs pin to unstable as of 2020-07-13
6b2767d 2020-07-16T12:09:40-05:00 3414 3535 3617 "Additionally allow ""bearer"" instead of only ""Bearer"" in Authorization… (#1558)"
1898479 2020-07-16T13:39:25-05:00 3574 3578 3589 Schema cache reload with zero downtime (#1559)
8e4687f 2020-07-16T16:05:57-05:00 3477 3572 3616 Return 405 Method not Allowed for GET of volatile RPC instead of 500 (#1560)
a3f4548 2020-08-14T21:12:58-05:00 3497 3588 3570 Allow base64-bytestring 1.2 and optparse-applicative 0.16
bd2160d 2020-08-14T21:12:58-05:00 3468 3581 3552 Make cabals bound more consistent
7af54c5 2020-09-09T19:08:16-05:00 3390 3590 3512 Upgrade nixpkgs to current master and add an upgrade checklist (#1579)
807dae1 2020-09-22T11:45:58-05:00 3501 3549 3582 Allow wai-extra 3.1
17f04a8 2020-09-27T16:04:47-05:00 3498 3560 3593 Allow hspec-wai/hspec-wai-json 0.11 (#1595)
d395bb6 2020-10-03T10:11:41-05:00 3551 3546 3537 Fix error messages on connection failure for postgres on localized Windows
e4e84e5 2020-10-05T13:47:53-05:00 3457 3612 3541 Upgrade nixpkgs and add Postgres 13 support
741f017 2020-10-05T13:47:53-05:00 3561 3555 3558 Use Postgres 13 in tests/docker-compose.yml
60398ad 2020-10-06T12:45:40-05:00 3469 3497 3558 Allow wai-middleware-static 0.9
e9efcc7 2020-10-06T14:21:46-05:00 3577 3604 3601 Add log-level config
f6b6abe 2020-10-06T14:21:46-05:00 3465 3606 3629 Add error log level
d9a608d 2020-10-06T14:21:46-05:00 3632 3611 3628 Add warn log level
ccad9eb 2020-10-06T14:21:46-05:00 3578 3597 3643 Change default log-level, from info to error
99ecc7a 2020-10-13T10:50:19-05:00 3493 3546 3597 Update stack.yaml.lock (#1616)
2eb8083 2020-10-14T09:10:47-05:00 3578 3625 3673 Simplify Nix CI jobs
5bb670b 2020-10-14T09:26:35-05:00 3462 3646 3582 fix appveyour build script
f5331d1 2020-10-14T09:26:35-05:00 3546 3641 3617 fix build error on windows
b21a0f6 2020-10-14T10:49:47-05:00 3515 3572 3604 fix travis build timeout swallowing error code
29858b8 2020-10-14T22:25:42-05:00 3501 3620 3645 improve cirrus build time by properly using cabal cache
719c4ab 2020-10-15T08:43:28-05:00 3558 3580 3629 Drop support for postgres 9.4
5f33f01 2020-10-16T10:24:41-05:00 3481 3558 3555 correct issue numbers in changelog
7809708 2020-10-16T11:15:20-05:00 3545 3616 3623 ci: run spec tests against all pg versions, even if one of them fails (#1628)
b7b66b6 2020-10-16T13:53:43-05:00 3496 3546 3557 ci: remove io tests from stack-test because of random segmentation faults (#1627)
7e3e19a 2020-10-17T15:01:06-05:00 3576 3705 3558 refactor: untangle addFiltersOrdersRanges
d1d0c67 2020-10-17T15:01:06-05:00 4095 4084 4131 perf: change Text queries to ByteString
eed0d3a 2020-10-22T12:05:25-05:00 3964 4067 4058 ci: add coveralls.io integration
06e8535 2020-10-26T17:49:51-05:00 3992 4030 4076 add tests to pass regular arrays properly to RPCs
18cc214 2020-10-26T17:49:51-05:00 4035 4011 4145 refactor moving findProc into ApiRequest to allow parsing parameters differently by proc
302d4e1 2020-10-26T17:49:51-05:00 3975 4036 4105 Allow calling variadic functions with repeated query params or JSON array in body
04eaeec 2020-10-29T18:53:24-05:00 3921 4051 4144 refactor: reorder PgVersion(95/96)Spec tests
a6696f3 2020-10-29T18:53:24-05:00 3951 4042 4046 refactor: add Hasql arrayColumn decoder
2798ced 2020-10-29T18:53:24-05:00 4048 3933 4116 Fix charset=utf-8 appending to binary output
8618ffa 2020-10-29T18:53:24-05:00 3969 4160 4058 perf: shortcut for proc with no variadic arg
4b4a622 2020-10-30T10:06:53-05:00 3940 4085 4078 fix non-variadic repeated param in variadic function
efc725f 2020-11-02T13:22:01-05:00 4014 4073 3991 run stylish-haskell v0.12.1.0
a7403fe 2020-11-03T10:43:06-05:00 4018 4031 4120 refactor: split InsertSpec into InsertSpec and UpdateSpec
122fea1 2020-11-03T10:43:06-05:00 3763 4020 4089 Add test to insert into a view and expect location header return
9f8d1af 2020-11-10T22:47:58-05:00 3904 4013 4055 nix: add postgrest-docker-login wrapper func
63849f1 2020-11-10T22:47:58-05:00 3838 4008 4146 circleci: make release job a machine
139f9b9 2020-11-10T22:47:58-05:00 3833 3991 4001 nix: fix postgrest-release-dockerhubdescription
5f3b581 2020-11-10T22:47:58-05:00 4031 4020 4046 appveyor: add nightly release
8fd9a9a 2020-11-10T22:47:58-05:00 3929 4021 4043 travisci: add nightly release
51b43f5 2020-11-10T22:47:58-05:00 4015 4038 4069 nix: change release scripts to use a version arg
5d140f6 2020-11-10T22:47:58-05:00 3938 4060 4047 circleci: add nightly release
814da05 2020-11-10T22:47:58-05:00 3988 4033 4036 cirrusci: add nightly release
b6d8d89 2020-11-10T22:47:58-05:00 4020 4086 4086 ci: add git sha with hours/minutes to nightly
3830887 2020-11-10T22:47:58-05:00 3784 4030 4073 ci: add nightly version to cabal file
ed8bf82 2020-11-13T14:01:58-05:00 3997 3961 4033 Update BACKERS.md
65968b5 2020-11-13T14:12:28-05:00 3922 4027 4097 circleci: fix nightly release
6e04fe7 2020-11-15T14:58:57-05:00 3877 4074 4065 Enable embedding through multiple layers of views recursively (#1625)
3f690ec 2020-11-17T18:49:02-05:00 3847 4046 4061 Removing single column limit from join table M2M mapping detection. (#1593)
ca7ffd0 2020-11-20T09:24:32-05:00 3979 4064 4045 Fix RPC return type handling for domains with composite base type
b091586 2020-11-20T09:24:32-05:00 3892 4005 4064 Refactor tests: renaming json table to prevent hiding of native json type
e08bb3a 2020-11-20T09:24:32-05:00 3932 4036 4047 Fix overloading of functions with unnamed arguments (specifically for prefer params=single-object)
2c52b96 2020-11-20T09:24:32-05:00 3993 4091 4047 Refactor tests: replace str with json QuasiQuoter where appropriate
d91f47e 2020-11-20T10:48:47-05:00 3901 4035 4022 Remove executable bit on .nix and .sql file
944efb3 2020-11-20T11:51:49-05:00 3958 4018 4029 Split StructureSpec into OpenApiSpec and OptionsSpec
0d5520d 2020-11-21T15:50:27-05:00 3893 4034 4054 Update README.md
698fac8 2020-11-22T14:48:48-05:00 3859 4056 4043 Add test to ignore functions unnamed arguments for rpc disambiguation, resolves #1638
dbf99c6 2020-11-22T18:21:06-05:00 3930 4035 4039 Add support for Prefer tx=rollback
3425352 2020-11-23T17:40:04+01:00 4055 4068 4063 Clean up postgrest --help output
b13e95a 2020-11-23T22:43:49+01:00 3841 4094 4048 Refactor tx-... config options to single tx-end option
8e58b56 2020-11-23T19:00:50-05:00 3911 4034 4098 refactor: Use hasql-dynamic on RPC/POST/PUT/PATCH
4bd5e6b 2020-11-23T19:00:50-05:00 5530 5812 5846 perf: enable prepared statements for GET
787973f 2020-11-23T19:00:50-05:00 5573 5683 5777 Add db-prepared-statements config
9adec12 2020-11-23T19:00:50-05:00 5603 5678 5766 cirrusci: add build timeout
07cb47e 2020-11-30T22:21:19+01:00 5542 5608 5726 Improve dev tools in the Nix environment (#1666)
f2cb917 2020-12-01T00:37:45+01:00 5442 5693 5692 Use cabal v2-exec to run io tests, include io tests in nix-shell by default; remove obsolete ncat dependency (#1673)
1a5f6cc 2020-12-01T00:42:21+01:00 5549 5577 5811 Add docker files to run nix dev environment on Windows
de5742f 2020-12-01T00:42:21+01:00 5513 5895 5781 Make nix the one and only dev environment; remove obsolete tooling
10a70d4 2020-12-01T22:13:41+01:00 5446 5690 5812 add autocompletion for postgrest-watch
609c9ae 2020-12-03T18:54:33+01:00 5623 5683 5772 Make transaction-rollback=true the default for test-suite (#1663)
7d6d015 2020-12-03T22:32:07+01:00 5552 5924 5766 improve error output for io-tests and with_tmp_db
d218a9e 2020-12-05T16:47:50+01:00 5717 5602 5690 dev: improve postgrest-watch experience
eb46cf2 2020-12-06T13:33:25+01:00 5549 5696 5769 added: cli option --dump-config prints loaded config and exits
8979442 2020-12-06T13:33:25+01:00 5548 5649 5742 dev: add colors to io tests
ab33759 2020-12-06T20:49:48+01:00 5296 5659 5756 ci: fix cirrus low memory error
ed58511 2020-12-06T21:59:03+01:00 5407 5579 5794 feat: renamed config options with prefixes; added aliases for old names
9254f11 2020-12-06T21:59:32+01:00 5543 5762 5837 fix: implement robust parsing of boolean config values
7069bb3 2020-12-07T20:09:18-05:00 4533 4776 4777 refactor: Change SET LOCAL gucs to set_config
11d62a8 2020-12-07T20:09:18-05:00 5431 5538 5641 Prepared statement for set_config
ebd474a 2020-12-09T14:20:00-05:00 5380 5580 5610 fix: retry connection on failed schema cache load (#1685)
093fd3c 2020-12-10T19:48:05+01:00 5569 5493 5554 fix: fix embedding through views with subqueries inside function calls
0c25f12 2020-12-10T19:55:42+01:00 5341 5542 5608 fix: fix output of RPCs returning scalar values with multiple rows
71fa879 2020-12-13T16:19:26+01:00 5355 5639 5693 Migrate io-tests from bash to pytest
a52c114 2020-12-13T22:13:05+01:00 5499 5412 5576 add test for spec idempotence
dbe3c16 2020-12-14T00:42:03+01:00 5369 5543 5584 run spec tests in parallel
fe09637 2020-12-17T17:05:03-05:00 5256 5502 5586 catch ReadTimeout in IO tests
bf141ca 2020-12-20T10:22:52+01:00 5391 5661 5595 feat: Added --dump-schema CLI option to dump JSON of dbStructure schema cache.
9cfc66a 2020-12-23T09:39:36-05:00 5368 5653 5524 upgrade nixpkgs and apply related fixes
f2f639e 2020-12-23T09:39:36-05:00 5403 5517 5591 update stack to lts-16.26
b7fc393 2020-12-23T19:39:40+01:00 5453 5573 5723 feat: Read config directly from environment variables
add326a 2020-12-23T19:39:40+01:00 5352 5497 5816 ci: skip postgrest-release-dockerhubdescription for nightly builds
69b459e 2020-12-23T19:39:40+01:00 5414 5489 5563 Use environment variables for configuration in developer tooling
2cbe1ba 2020-12-23T19:39:40+01:00 5400 5555 5580 feat: Add --example cli option to show example config file
56557c9 2020-12-23T19:39:40+01:00 5534 5560 5731 ci: remove broken travis coverage job
b5185de 2020-12-24T14:44:01+01:00 5364 5557 5557 "nix: add ""cd $rootdir"" to all shell scripts via checked-shell-script"
2425cdd 2020-12-24T16:09:25+01:00 5530 5526 5675 improve io-tests
96478ed 2020-12-28T13:23:12+01:00 5325 5530 5664 Add docs to all checked shell scripts in nix
8e02551 2020-12-29T16:07:07+01:00 5446 5484 5548 "Fix ""unused-imports"" warning for windows build"
2015688 2020-12-29T23:58:37+01:00 5353 5525 5527 Fix spec test concurrency issue related to tx=commit on simple_pk
c7f0d42 2020-12-30T13:22:25+01:00 3642 3744 3725 Add postgrest-coverage to show and upload hpc reports to codecov
568477b 2020-12-30T14:51:58+01:00 3683 3734 3805 Fix io tests not collecting coverage data when using run()
9354103 2020-12-30T16:38:13+01:00 3721 3733 3770 Fix postgrest-coverage throwing error when installed with nix-env
519550e 2020-12-30T17:52:15+01:00 3675 3759 3752 Add codecov to README.md
cc54135 2020-12-31T14:37:52+01:00 3768 3796 3817 Rename master branch to main; replace whiteList with isAllowed
af27884 2021-01-02T13:50:19+01:00 3674 3720 3823 Add basic error handling to --dump-schema
56a7bc5 2021-01-02T13:51:00+01:00 3763 3707 3726 nix: Fix build errors with stale .tix files present
94b516d 2021-01-02T13:51:00+01:00 3701 3739 3752 Add flag to postgrest.cabal to disable collecting coverage data
eed6015 2021-01-03T17:54:28+01:00 3596 3738 3715 refactor: Move addHasVariadic to SQL
6efa304 2021-01-03T17:54:28+01:00 3751 3758 3755 refactor: Move parseArg to SQL
1edbef0 2021-01-03T17:54:28+01:00 3612 3744 3862 Add STABLE RPCs in test fixtures to increase coverage
45ebaf0 2021-01-03T17:54:28+01:00 3767 3756 3745 cov: Add comment on SCHEMA in test fixtures
b8e6450 2021-01-03T17:54:28+01:00 3795 3812 3778 refactor: simplify addXRels
d173b7d 2021-01-03T17:54:28+01:00 3701 3728 3837 cov: remove unused fields from type Column
5223082 2021-01-03T17:54:28+01:00 3704 3703 3739 refactor: Combine relConstraint and relJunction in relLink
ab6f90a 2021-01-04T23:14:44+01:00 3688 3737 3842 nix: Add postgrest-with-* tools to run with temporary databases of different versions
97d8456 2021-01-14T16:40:29+01:00 3729 3726 3846 cov: Add io tests for basic cli commands and invalid config options
6dd1264 2021-01-14T16:40:29+01:00 3712 3747 3728 cov: Remove unused code
ba86b47 2021-01-14T16:40:29+01:00 3699 3762 3781 cov: refactor ApiRequest profile header detection
1503955 2021-01-19T13:49:40-05:00 3687 3783 3738 Correct openapi preparing statements by default
6746150 2021-01-19T13:49:40-05:00 3697 3770 3755 Correct --dump-schema swallowing error
9c005fc 2021-01-19T13:49:40-05:00 2253 2283 2293 feat: get configuration parameters from the db
125ea8f 2021-01-19T13:49:40-05:00 2273 2275 2288 feat: allow reloading config with NOTIFY
4344cc9 2021-01-22T15:56:08-05:00 2279 2288 2319 refactor: use optString and move overrideFrom
17af56a 2021-01-22T15:56:08-05:00 2291 2333 2301 refactor: config validation inside readAppConfig
6557f1f 2021-01-22T15:56:08-05:00 2270 2299 2334 correct NOTIFY config reload dying on error
c93e8f9 2021-01-25T18:38:46-05:00 2282 2287 2308 Correct hardcoded postgrest_test_authenticator (#1743)
0ddd676 2021-02-07T22:15:44+01:00 2279 2292 2321 nix: Add a script to generate a graph of Haskell imports (#1751)
e6973f9 2021-02-23T22:41:48+01:00 2211 2252 2263 refactor: App.hs and related changes (#1725)
b50b674 2021-02-25T20:03:38-05:00 2183 2262 2256 Update CYBERTEC logo and BACKERS
a7dab6d 2021-02-27T13:26:02-05:00 2170 2238 2248 cirrusci: update to FreeBSD 12.2
c3ccaf1 2021-02-28T13:02:37-05:00 2170 2247 2252 Allow lens-5.0
e4516ab 2021-03-05T19:03:15-05:00 2221 2265 2269 "Correct db settings to use ""_"" instead of ""-"""
6750a5c 2021-03-05T19:03:15-05:00 2199 2243 2242 Restrict db settings to current db and global
498e772 2021-03-05T19:03:15-05:00 2195 2235 2242 Reread in-db config when recoverying connection
d3a8b5f 2021-03-05T19:03:15-05:00 2202 2243 2249 Change db-load-guc-config to db-config
5baec48 2021-03-14T12:45:02-05:00 2207 2239 2231 circleci: update base image
3a466ae 2021-03-14T12:45:02-05:00 2202 2291 2240 tests: don't match JSON bodies literally
71f6061 2021-03-25T20:42:43-05:00 2156 2236 2269 Add Supabase as a sponsor
9f074ce 2021-03-26T10:13:38-05:00 2210 2220 2243 cirrusci: update FreeBSD pkg
65e7f9e 2021-04-06T11:46:12+02:00 2235 2234 2233 nix: Add a tool for checking Haskell imports and exports (#1768)
376beac 2021-04-06T17:07:14+02:00 2191 2252 2273 update contributing guidelines
9c79a41 2021-04-06T22:22:13-05:00 2202 2216 2277 Add Contributor Covenant Code of Conduct v2
7f11c1a 2021-04-08T01:45:13+02:00 2208 2279 2231 fix: Void functions return null instead of empty body (#1795)
67f555d 2021-04-08T14:33:18+02:00 2202 2219 2248 nix: Fix incorrect use of bash variable (#1797)
a5372e4 2021-04-09T18:48:00+02:00 2194 2249 2229 nix: show better error messages for missing arguments in hsie scripts (#1798)
9118a4a 2021-04-10T20:38:51+02:00 2206 2254 2243 fix random codecov/project drops in CI, resolves #1800
8c44410 2021-04-11T10:38:35+02:00 2234 2247 2258 remove misleadingly named emptyOnFalse and similar (#1803)
f99fd6c 2021-04-11T18:28:01+02:00 2234 2257 2228 refactor: Split up Types.hs and logically organize modules (#1793)
579a626 2021-04-12T19:03:16+02:00 2216 2246 2225 nix: move withtmpdb to separate file
fc791a6 2021-04-12T19:03:16+02:00 2212 2238 2269 nix: add withTmpDir to checked-shell-script
67ca814 2021-04-12T19:03:16+02:00 2208 2243 2237 nix: properly throw away tix files for postgrest-watch, resolves #1807
801e229 2021-04-12T19:17:47+02:00 2159 2216 2211 refactor: Carve CLI and Version modules out of Config (#1806)
91d0c73 2021-04-12T20:31:32+02:00 2217 2239 2256 add comment to Setup.hs
d40d704 2021-04-12T20:31:32+02:00 2196 2234 2251 remove unused static assets
c2c7bbe 2021-04-12T20:47:22+02:00 2207 2218 2204 refactor: remove obsoleted Private module (#1811)
1db32e2 2021-04-12T21:01:15+02:00 2182 2208 2212 nix: add postgrest-loadtest
e44537a 2021-04-12T21:01:15+02:00 2196 2244 2263 nix: small refactors
c502756 2021-04-13T17:26:19+02:00 2212 2202 2218 add todo
bbcf942 2021-04-13T17:26:33+02:00 2228 2235 2234 add a few header tests
4172a72b 2021-04-13T21:15:40+02:00 2170 2243 2259 legacy support

And here's how that looks:

loadtest

The commits with the major changes are (first number is position on x-axis):

  • 50: d1d0c67 perf: change Text queries to ByteString
  • 89: 4bd5e6b perf: enable prepared statements for GET
  • 105: 7069bb3 refactor: Change SET LOCAL gucs to set_config
  • 106: 11d62a8 Prepared statement for set_config
  • 127: c7f0d42 Add postgrest-coverage to show and upload hpc reports to codecov
  • 148: 9c005fc feat: get configuration parameters from the db

The first 2 are major improvements, but we knew about those before. The two set_config ones are funny - luckily we only had that regression for 1 commit and then introduced prepared statements, too! :)

The coverage-one is not critical. This drop in performance is not real (for production). That's just the effect of changing cabal's build flags to -O0 and enabling HPC coverage all the time to avoid rebuilds. This only happens in development. As a side-note: I think it's actually beneficial, because it seems like before that change the numbers were much noisier. I assume that vegeta on 1 core was running at it's limits at that point. By making PostgREST slower in general, vegeta will still be able to saturate it.

But.. the last one is really surprising. I have no idea why the addition of configuration parameters from the database slows things down so much. This is something we need to investigate.

@steve-chavez, could you do me a favor and test the following commits vs. the commit immediately before with your k6-setup?

  • c7f0d42 - to confirm that the drop related to coverage is really only a development thing. You'd need to make sure you run a production build, but I guess you do anyway.
  • 9c005fc - to see how much that drop is in "real" numbers. The hypothetical req/s are exactly that... hypothetical.

Another observation: The blue numbers (1s) are a bit jumpy, I feel. But there is not too big of a difference between orange and green it seems. So it should be enough to run each test-case for 5s and not 25s.

@steve-chavez
Copy link
Member

:O! Nice plotting work Wolfgang.

could you do me a favor and test the following commits vs. the commit immediately before with your k6-setup?

Sure! I'll create static binaries for those commits and report here.

Meanwhile, I've ran GETSingle on the latest nightly(d3a8b5f) — which already contains the coverage and in-db config changes — and v7.0.1.

Nightly(2071.345877/s):

    data_received..............: 20 MB  665 kB/s
    data_sent..................: 6.5 MB 217 kB/s
  ✓ dropped_iterations.........: 750    24.955574/s
  ✓ failed requests............: 0.00%  ✓ 0     ✗ 62251
    http_req_blocked...........: avg=8.52µs   min=1.79µs   med=3.3µs   max=55.83ms  p(90)=4.5µs   p(95)=5.94µs
    http_req_connecting........: avg=2.57µs   min=0s       med=0s      max=32.5ms   p(90)=0s      p(95)=0s
  ✓ http_req_duration..........: avg=7.5ms    min=795.97µs med=2.05ms  max=150.47ms p(90)=25.32ms p(95)=39ms
    http_req_receiving.........: avg=109.1µs  min=12.87µs  med=36.42µs max=41.28ms  p(90)=81.51µs p(95)=179.47µs
    http_req_sending...........: avg=327.89µs min=7.32µs   med=27.77µs max=47.78ms  p(90)=48.94µs p(95)=109.54µs
    http_req_tls_handshaking...: avg=0s       min=0s       med=0s      max=0s       p(90)=0s      p(95)=0s
    http_req_waiting...........: avg=7.06ms   min=753.68µs med=1.98ms  max=150.36ms p(90)=23.5ms  p(95)=37.03ms
    http_reqs..................: 62251  2071.345877/s
    iteration_duration.........: avg=7.64ms   min=889.53µs med=2.18ms  max=150.61ms p(90)=25.49ms p(95)=39.2ms
    iterations.................: 62251  2071.345877/s
    vus........................: 121    min=107 max=121
    vus_max....................: 121    min=107 max=121

v7.0.1(1347.378153/s):

    data_received..............: 12 MB  411 kB/s
    data_sent..................: 4.3 MB 141 kB/s
  ✗ dropped_iterations.........: 22213  734.732071/s
  ✓ failed requests............: 0.00%  ✓ 0     ✗ 40735
    http_req_blocked...........: avg=105.85µs min=1.47µs  med=3.79µs   max=166.33ms p(90)=6.57µs   p(95)=15.15µs
    http_req_connecting........: avg=50.66µs  min=0s      med=0s       max=165.71ms p(90)=0s       p(95)=0s
  ✓ http_req_duration..........: avg=207.25ms min=1.64ms  med=199.9ms  max=710.8ms  p(90)=351.09ms p(95)=412.99ms
    http_req_receiving.........: avg=891.97µs min=13.56µs med=35.72µs  max=174.84ms p(90)=324.3µs  p(95)=2.04ms
    http_req_sending...........: avg=12.83ms  min=7.95µs  med=29.88µs  max=272.68ms p(90)=49.77ms  p(95)=87.19ms
    http_req_tls_handshaking...: avg=0s       min=0s      med=0s       max=0s       p(90)=0s       p(95)=0s
    http_req_waiting...........: avg=193.52ms min=1.56ms  med=193.4ms  max=629.33ms p(90)=320.26ms p(95)=352.7ms
    http_reqs..................: 40735  1347.378153/s
    iteration_duration.........: avg=209.56ms min=1.8ms   med=200.23ms max=783.64ms p(90)=358.69ms p(95)=422.12ms
    iterations.................: 40735  1347.378153/s
    vus........................: 514    min=110 max=514
    vus_max....................: 514    min=110 max=514

Seems there's no regression.

@steve-chavez
Copy link
Member

steve-chavez commented Apr 14, 2021

c7f0d42 - to confirm that the drop related to coverage is really only a development thing

2061.35802/s - 2015688 - Fix spec test concurrency issue related to tx=commit on simple_pk

    data_received..............: 20 MB  655 kB/s
    data_sent..................: 6.5 MB 216 kB/s
  ✓ dropped_iterations.........: 925    30.716974/s
  ✓ failed requests............: 0.00%  ✓ 0     ✗ 62075
    http_req_blocked...........: avg=8.64µs   min=1.61µs   med=3.54µs  max=20.46ms  p(90)=4.83µs  p(95)=6.23µs
    http_req_connecting........: avg=2.79µs   min=0s       med=0s      max=20.34ms  p(90)=0s      p(95)=0s
  ✓ http_req_duration..........: avg=9.59ms   min=809.9µs  med=2.19ms  max=201.53ms p(90)=33.22ms p(95)=47.23ms
    http_req_receiving.........: avg=115.67µs min=12.3µs   med=37.03µs max=52.77ms  p(90)=85.91µs p(95)=191.4µs
    http_req_sending...........: avg=365.3µs  min=7.59µs   med=28.25µs max=47.23ms  p(90)=50.24µs p(95)=128.92µs
    http_req_tls_handshaking...: avg=0s       min=0s       med=0s      max=0s       p(90)=0s      p(95)=0s
    http_req_waiting...........: avg=9.11ms   min=765.52µs med=2.1ms   max=201.46ms p(90)=31.58ms p(95)=45.97ms
    http_reqs..................: 62075  2061.35802/s
    iteration_duration.........: avg=9.74ms   min=937.83µs med=2.33ms  max=202.04ms p(90)=33.4ms  p(95)=47.39ms
    iterations.................: 62075  2061.35802/s
    vus........................: 133    min=108 max=133
    vus_max....................: 133    min=108 max=133

2039.801516/s - c7f0d42 - Add postgrest-coverage to show and upload hpc reports to codecov

    data_received..............: 20 MB  649 kB/s
    data_sent..................: 6.4 MB 213 kB/s
  ✓ dropped_iterations.........: 1418   46.95822/s
  ✓ failed requests............: 0.00%  ✓ 0     ✗ 61596
    http_req_blocked...........: avg=8.6µs    min=1.58µs   med=3.23µs  max=43.54ms  p(90)=4.43µs  p(95)=5.74µs
    http_req_connecting........: avg=3.8µs    min=0s       med=0s      max=43.35ms  p(90)=0s      p(95)=0s
  ✓ http_req_duration..........: avg=10.8ms   min=820.23µs med=2.25ms  max=201.15ms p(90)=39.68ms p(95)=55.87ms
    http_req_receiving.........: avg=136.35µs min=11.61µs  med=35.39µs max=101.31ms p(90)=79.13µs p(95)=184.77µs
    http_req_sending...........: avg=580.15µs min=7.28µs   med=27.49µs max=60.75ms  p(90)=52.18µs p(95)=167.99µs
    http_req_tls_handshaking...: avg=0s       min=0s       med=0s      max=0s       p(90)=0s      p(95)=0s
    http_req_waiting...........: avg=10.09ms  min=771.72µs med=2.16ms  max=166.73ms p(90)=36.34ms p(95)=53.09ms
    http_reqs..................: 61596  2039.801516/s
    iteration_duration.........: avg=10.95ms  min=920.37µs med=2.38ms  max=201.32ms p(90)=39.83ms p(95)=56.05ms
    iterations.................: 61596  2039.801516/s
    vus........................: 139    min=108 max=139
    vus_max....................: 139    min=108 max=139

9c005fc - to see how much that drop is in "real" numbers. The hypothetical req/s are exactly that... hypothetical.

2062.206612/s - 6746150 - Correct --dump-schema swallowing error

    data_received..............: 20 MB  656 kB/s
    data_sent..................: 6.5 MB 216 kB/s
  ✓ dropped_iterations.........: 871    28.910524/s
  ✓ failed requests............: 0.00%  ✓ 0     ✗ 62129
    http_req_blocked...........: avg=17.24µs  min=1.59µs   med=3.45µs  max=35.34ms  p(90)=4.95µs  p(95)=12.29µs
    http_req_connecting........: avg=7.41µs   min=0s       med=0s      max=24.52ms  p(90)=0s      p(95)=0s
  ✓ http_req_duration..........: avg=8.32ms   min=822.77µs med=2.19ms  max=169.44ms p(90)=28.16ms p(95)=43.97ms
    http_req_receiving.........: avg=116.38µs min=12.6µs   med=38.48µs max=47.48ms  p(90)=88.56µs p(95)=192.07µs
    http_req_sending...........: avg=281.97µs min=7.51µs   med=28.07µs max=48.56ms  p(90)=50.5µs  p(95)=115.44µs
    http_req_tls_handshaking...: avg=0s       min=0s       med=0s      max=0s       p(90)=0s      p(95)=0s
    http_req_waiting...........: avg=7.92ms   min=759.4µs  med=2.1ms   max=152.55ms p(90)=26.89ms p(95)=43.01ms
    http_reqs..................: 62129  2062.206612/s
    iteration_duration.........: avg=8.47ms   min=919.07µs med=2.33ms  max=171.59ms p(90)=28.31ms p(95)=44.16ms
    iterations.................: 62129  2062.206612/s
    vus........................: 125    min=103 max=125
    vus_max....................: 125    min=103 max=125

2060.299624/s - 9c005fc - feat: get configuration parameters from the db

    data_received..............: 20 MB  655 kB/s
    data_sent..................: 6.5 MB 216 kB/s
  ✓ dropped_iterations.........: 910    30.175151/s
  ✓ failed requests............: 0.00%  ✓ 0     ✗ 62133
    http_req_blocked...........: avg=6.42µs   min=1.61µs   med=3.43µs  max=11.3ms   p(90)=4.75µs  p(95)=6.34µs
    http_req_connecting........: avg=1.56µs   min=0s       med=0s      max=8.92ms   p(90)=0s      p(95)=0s
  ✓ http_req_duration..........: avg=7.43ms   min=831.65µs med=2.06ms  max=166.89ms p(90)=22.17ms p(95)=36.46ms
    http_req_receiving.........: avg=115.21µs min=13.24µs  med=38.59µs max=40.69ms  p(90)=89.01µs p(95)=191.54µs
    http_req_sending...........: avg=380.04µs min=7.61µs   med=28.48µs max=52.2ms   p(90)=51.85µs p(95)=122.66µs
    http_req_tls_handshaking...: avg=0s       min=0s       med=0s      max=0s       p(90)=0s      p(95)=0s
    http_req_waiting...........: avg=6.93ms   min=773.79µs med=1.98ms  max=166.74ms p(90)=20.61ms p(95)=34.15ms
    http_reqs..................: 62133  2060.299624/s
    iteration_duration.........: avg=7.57ms   min=923.64µs med=2.19ms  max=167.55ms p(90)=22.35ms p(95)=36.62ms
    iterations.................: 62133  2060.299624/s
    vus........................: 119    min=106 max=119
    vus_max....................: 119    min=106 max=119

@wolfgangwalther
Copy link
Member Author

I have done a few further tests and checks:

  • When I disable the development / hpc flags, the coverage-drop is gone as expected. But the config-drop is still there - down to ~3000 hypo/s.
  • The same shows for all metrics I collect (mean, median, min). Actual req/s in my test-setup go down from ~2200 to ~1400 with the config commit. This does not seem to be an issue with the metrics I am using.

@wolfgangwalther
Copy link
Member Author

@steve-chavez In your setup: Which parts of the setup (postgrest, database, workers) are on the same machine? All three? Or are any of them going through a real network?

@steve-chavez
Copy link
Member

postgrest + database are on the same machine(a t3a.nano, low-end instance) the k6 runner is in another machine(t3a.medium). The two machines are under the same VPC(AWS's VLAN). So yep, the requests go through network.

@wolfgangwalther
Copy link
Member Author

Figured it out:

  • When I disable db-config, the regression is gone.
  • Because we're running the loadtest with the test fixtures schema, we also got a few settings of postgrest_test_authenticator. Namely:
    • ALTER ROLE postgrest_test_authenticator SET pgrst.db_prepared_statements = 'false';
    • ALTER ROLE postgrest_test_authenticator SET pgrst.db_pre_request = 'test.custom_headers';

Just by removing those two lines, I am up at 5200 hypo/s again (including the disabled dev flags).

Soo.... we should either disable db-config for those loadtests or use a completely different schema / user etc. right away.

@steve-chavez
Copy link
Member

steve-chavez commented Apr 14, 2021

ALTER ROLE postgrest_test_authenticator SET pgrst.db_prepared_statements = 'false';
ALTER ROLE postgrest_test_authenticator SET pgrst.db_pre_request = 'test.custom_headers';

Aha! Disabling the prepared statements should have the most impact. Btw, regarding pre-request, I also noted it impacts perf. My results with a simple prereq that did nothing but return false were 1844.326975/s(around 11% perf loss) on the same GETSingle test.

Edit: Added the 4 tests on #1812 (comment).

@wolfgangwalther
Copy link
Member Author

Aha! Disabling the prepared statements should have the most impact. Btw, regarding pre-request, I also noted it impacts perf. My results with a simple prereq that did nothing but return false were 1844.326975/s(around 11% perf loss) on the same GETSingle test.

Yes, enabling prepared statements gave me the biggest boost. That 11% drop is probably that extra round-trip on every request then.

Edit: Added the 4 tests on #1812 (comment).

Thanks. I'm glad that we're still running fast :)

@wolfgangwalther
Copy link
Member Author

@monacoremo could you have a look at my latest commit 2b71f53? I hope that will fix the "missing GHC" error I got in CI. What do you think about the way I did that - any nix-anti-pattern or something that I hit? ;)

@wolfgangwalther
Copy link
Member Author

@monacoremo another question for, maybe you already have an idea. I am now at a stage where I want to parse not only arguments, but options to those shell scripts (loadtest and changelog stuff, too). I thought I'd get my feet wet with the "do something with docs TODO" we have. I have looked at getopt and getopts so far. Not terribly convinced, yet. Do you know anything else that could help here? Or have any experience with either of the two?

@wolfgangwalther
Copy link
Member Author

wolfgangwalther commented Apr 14, 2021

A few runs of postgrest-loadtest in CI. It runs a comparison against the main branch, but since we don't have any source-code changes in this branch, we should expect the same result.

First run:

Running loadtest on "main"...

Requests      [total, rate, throughput]         5698, 1139.48, 1139.25
Duration      [total, attack, wait]             5.002s, 5.001s, 1.011ms
Latencies     [min, mean, 50, 90, 95, 99, max]  640.089µs, 873.787µs, 802.365µs, 1.131ms, 1.236ms, 1.513ms, 3.974ms
Bytes In      [total, mean]                     239316, 42.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:5698  
Error Set:

Running loadtest on HEAD...

Requests      [total, rate, throughput]         5549, 1109.72, 1109.38
Duration      [total, attack, wait]             5.002s, 5s, 1.562ms
Latencies     [min, mean, 50, 90, 95, 99, max]  592.531µs, 897.615µs, 827.983µs, 1.172ms, 1.265ms, 1.504ms, 4.553ms
Bytes In      [total, mean]                     233058, 42.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:5549  
Error Set:

Second run:

Running loadtest on "main"...

Requests      [total, rate, throughput]         5269, 1053.77, 1053.52
Duration      [total, attack, wait]             5.001s, 5s, 1.174ms
Latencies     [min, mean, 50, 90, 95, 99, max]  681.89µs, 945.223µs, 867.14µs, 1.232ms, 1.356ms, 1.679ms, 10.254ms
Bytes In      [total, mean]                     221298, 42.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:5269  
Error Set:

Running loadtest on HEAD...

Requests      [total, rate, throughput]         5934, 1186.62, 1186.39
Duration      [total, attack, wait]             5.002s, 5.001s, 980.151µs
Latencies     [min, mean, 50, 90, 95, 99, max]  653.65µs, 839.168µs, 772.683µs, 1.074ms, 1.176ms, 1.42ms, 2.3ms
Bytes In      [total, mean]                     249228, 42.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:5934  
Error Set:

Third run (after rebase). Now with cabal dev flag (before was without):

Running loadtest on "main"...

Requests      [total, rate, throughput]         5164, 1032.74, 1032.55
Duration      [total, attack, wait]             5.001s, 5s, 914.268µs
Latencies     [min, mean, 50, 90, 95, 99, max]  778.277µs, 966.802µs, 936.126µs, 1.07ms, 1.133ms, 1.573ms, 5.061ms
Bytes In      [total, mean]                     216888, 42.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:5164  
Error Set:

Running loadtest on HEAD...

Requests      [total, rate, throughput]         5179, 1035.65, 1035.47
Duration      [total, attack, wait]             5.002s, 5.001s, 887.333µs
Latencies     [min, mean, 50, 90, 95, 99, max]  789.899µs, 964.038µs, 934.12µs, 1.074ms, 1.143ms, 1.539ms, 2.88ms
Bytes In      [total, mean]                     217518, 42.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:5179  
Error Set:

Fourth run, with different request types. No report-per-test, yet:

Running loadtest on "main"...

Requests      [total, rate, throughput]         1423, 284.56, 284.45
Duration      [total, attack, wait]             5.003s, 5.001s, 1.948ms
Latencies     [min, mean, 50, 90, 95, 99, max]  540.562µs, 3.511ms, 1.574ms, 11.861ms, 20.627ms, 22.467ms, 25.492ms
Bytes In      [total, mean]                     1194427, 839.37
Bytes Out     [total, mean]                     25134, 17.66
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:855  201:142  204:426  
Error Set:

Running loadtest on HEAD...

Requests      [total, rate, throughput]         1422, 283.60, 283.49
Duration      [total, attack, wait]             5.016s, 5.014s, 2.027ms
Latencies     [min, mean, 50, 90, 95, 99, max]  539.206µs, 3.523ms, 1.577ms, 10.942ms, 20.755ms, 22.694ms, 25.117ms
Bytes In      [total, mean]                     1194334, 839.90
Bytes Out     [total, mean]                     25134, 17.68
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:854  201:142  204:426  
Error Set:

Fifth run, different request types, 60s:

Running loadtest on "main"...

Requests      [total, rate, throughput]         17596, 293.26, 293.25
Duration      [total, attack, wait]             1m0s, 1m0s, 1.711ms
Latencies     [min, mean, 50, 90, 95, 99, max]  595.925µs, 3.406ms, 1.447ms, 13.08ms, 20.727ms, 22.836ms, 29.857ms
Bytes In      [total, mean]                     14701228, 835.49
Bytes Out     [total, mean]                     311499, 17.70
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:10557  201:1760  204:5279  
Error Set:

Running loadtest on HEAD...

Requests      [total, rate, throughput]         17776, 296.26, 296.26
Duration      [total, attack, wait]             1m0s, 1m0s, 1.227ms
Latencies     [min, mean, 50, 90, 95, 99, max]  586.3µs, 3.371ms, 1.476ms, 10.708ms, 20.145ms, 22.258ms, 28.36ms
Bytes In      [total, mean]                     14849804, 835.39
Bytes Out     [total, mean]                     314685, 17.70
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:10665  201:1778  204:5333  
Error Set:

First run on GitHub Actions:

Running loadtest on "main"...

Requests      [total, rate, throughput]         25562, 425.96, 425.95
Duration      [total, attack, wait]             1m0s, 1m0s, 886.317µs
Latencies     [min, mean, 50, 90, 95, 99, max]  345.807µs, 2.346ms, 813.28µs, 8.284ms, 16.36ms, 16.722ms, 31.954ms
Bytes In      [total, mean]                     21493997, 840.86
Bytes Out     [total, mean]                     452412, 17.70
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:15338  201:2556  204:7668  
Error Set:

Done running on "main".


Running loadtest on HEAD...

Requests      [total, rate, throughput]         26252, 437.52, 437.52
Duration      [total, attack, wait]             1m0s, 1m0s, 744.813µs
Latencies     [min, mean, 50, 90, 95, 99, max]  348.607µs, 2.284ms, 792.471µs, 9.693ms, 15.928ms, 16.379ms, 22.443ms
Bytes In      [total, mean]                     22074011, 840.85
Bytes Out     [total, mean]                     464625, 17.70
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:15752  201:2625  204:7875  
Error Set:

Done running on HEAD.

Seems like the machines here are a bit faster.

@wolfgangwalther
Copy link
Member Author

wolfgangwalther commented Apr 14, 2021

@monacoremo another question for, maybe you already have an idea. I am now at a stage where I want to parse not only arguments, but options to those shell scripts (loadtest and changelog stuff, too). I thought I'd get my feet wet with the "do something with docs TODO" we have. I have looked at getopt and getopts so far. Not terribly convinced, yet. Do you know anything else that could help here? Or have any experience with either of the two?

I have just found https://argbash.io. Looks like we could integrate that into our checked-shell-script. This way we could provide an additional argument args or something that holds the argbash template. We could then add the --help option in checked-shell-script to that and let argbash do the rest. WDYT?

Edit: https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/tools/misc/argbash/default.nix ... yes!

@monacoremo
Copy link
Member

I have just found https://argbash.io.

Looks super promising! That would bring all our scripts to the next level!

I'm only familiar with optparse-applicative in Haskell and click in Python. But this could be awesome for our bash scripts!

@wolfgangwalther
Copy link
Member Author

To be able to run the loadtest consistently against different commits, we need to not only run the same requests, but also run them on the same schema. Assuming some PR makes a change and adds a loadtest to show the result - the same test, including fixtures, needs to be run against the main branch. To avoid a complicated procedure of needing to commit tests/fixtures first, etc. I was looking for a way to tell withTmpDb to load a different set of fixtures - not the spec-test fixtures.

To do this, I wanted to add CLI option to with_tmp_db and then felt the best way to do that, was to make it a checked-shell-script. Yes, this removes the ability to run it without nix, but I felt it was worth it, because:

  • There is still test/create_test_db
  • One could install postgrest-with-xxx via nix-env and then still run stack tests against it (or do it inside nix-shell)
  • It makes the script significantly shorter, because of features already included in checkedShellScript
  • It allows us to add the --fixtures CLI option easily

@monacoremo WDYT?


Now, I added a very small schema for the loadtest and also more requests to test. My first goal was to match all request handlers in App.hs. Did I get all of them? What main features, that would touch more (or possibly performance critical) code, should we add?

@monacoremo
Copy link
Member

monacoremo commented Apr 18, 2021

To do this, I wanted to add CLI option to with_tmp_db and then felt the best way to do that, was to make it a checked-shell-script. Yes, this removes the ability to run it without nix, but I felt it was worth it

Yes, I see how that would be nice - especially with the argbash integration now. test/create_test_db is a bit of a grown mess, would be good if we could phase it out actually...

I was thinking whether there could be a compromise - e.g. generating and checking in a with_tmp_db based on the nix version. Should be doable with a bit of sed (changing the shebang, removing nix-store refences, embedding the arg parsing, adding a comment that the file is generated etc) and we could check the consistency in CI as we do with style. Seems like a lot of extra work though. We should then also move the stack tests to with_tmp_db and kill create_test_db completely.

Or we just maintain two versions of the script, it's been relatively stable...

We should keep some version of the standalone script in any case I think, otherwise we seriously gimp any non-nix developers.

WDYT?

@wolfgangwalther
Copy link
Member Author

We should keep some version of the standalone script in any case I think, otherwise we seriously gimp any non-nix developers.

I don't think there is a reason to do that. I can see the reasoning for allowing to build with stack. But not without nix. All our dev environment is based on nix + we have the docker-compose setup for cases where that doesn't work natively. To be honest: The biggest hurdle for me going up to speed initially was the fact that I was not using nix. Once I did, everything was so much smoother.

Let's turn this around and create a postgrest-build-stack command in nix. And then we keep everything in nix.

@wolfgangwalther
Copy link
Member Author

I rebased this after the move to GitHub Actions. Right now, cachix/install-nix-action@v13 seems to have some problems, see e.g.:

https://github.com/PostgREST/postgrest/runs/4228318238?check_suite_focus=true

Makes all the nix based jobs fail randomly.


The load test job is supposed to upload a markdown table to the checks page of the PR. This is not working right now, because PRs from a forked repo only have read access through secrets.GITHUB_TOKEN. I'll need to pass the markdown file as an artifact to a trusted workflow and then upload it from there, as mentioned here: https://securitylab.github.com/research/github-actions-preventing-pwn-requests/.

wolfgangwalther added a commit to wolfgangwalther/postgrest that referenced this pull request Nov 24, 2021
wolfgangwalther added a commit to wolfgangwalther/postgrest that referenced this pull request Nov 24, 2021
wolfgangwalther added a commit to wolfgangwalther/postgrest that referenced this pull request Nov 24, 2021
wolfgangwalther added a commit that referenced this pull request Nov 24, 2021
@wolfgangwalther wolfgangwalther force-pushed the nix/loadtest branch 2 times, most recently from 18d4373 to 49093e0 Compare November 24, 2021 22:02
@wolfgangwalther
Copy link
Member Author

So the loadtest report seems to be uploaded successfully, according to the workflow:
https://github.com/PostgREST/postgrest/runs/4317690798?check_suite_focus=true

Unfortunately I have no idea where it ended up. Can't find it anywhere, yet. Will have to investigate later..

@monacoremo
Copy link
Member

Unfortunately I have no idea where it ended up. Can't find it anywhere, yet. Will have to investigate later..

Something turned up here!

Screenshot_20211125-230511_Chrome

https://github.com/PostgREST/postgrest/runs/4317691924?check_suite_focus=true

Need to check whether that went to the correct check run, but seems to work great otherwise. Very stable results also, awesome work!

@wolfgangwalther
Copy link
Member Author

Something turned up here!
[...]
Need to check whether that went to the correct check run, [...]

Ah nice find. The workflow that runs on workflow_run is running on the latest commit of the default branch - that's why the report went to that commits check run. I just pushed a fix for that, it should now target the correct commit.

Let's see what happens in a bit.

@wolfgangwalther
Copy link
Member Author

Ok, so that worked fine now, it shows up here. Right now the results are only added with a neutral conclusion. There is no comparison against a threshold, yet, and the pipeline would not fail because of a drop in performance.

We can improve the formatting of the results later on, too - and also write a more sophisticated python script to analyze the numbers in detail, e.g. separated by request. Right now we just a have one set of numbers for all kinds of requests mixed together.

My idea would be to merge this now, and then let this run a bit on all pull requests and pushes to main. We can then get a feel for the numbers and see what kind of threshold we would like to implement.

WDYT?

nix/tools/withTools.nix Outdated Show resolved Hide resolved
Copy link
Member

@monacoremo monacoremo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work! Great integration with our CI, will be very useful to have

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants