
Add integration testsuite #1678

Open · ehuss wants to merge 18 commits into master from gh-test
Conversation

@ehuss (Contributor) commented Dec 30, 2022

This adds an integration testsuite for testing triagebot's behavior. There are two parts to the testsuite:

  • github_client tests the GithubClient by sending requests to a local server and checking the responses.
  • server_test actually launches the triagebot executable, sets up some servers, launches PostgreSQL, injects webhooks, and checks the responses.

An overview of the changes here:

  • Some of the networking code was changed so that it can use hosts other than api.github.com, raw.githubusercontent.com, and the teams API.
  • The constructor for GithubClient was simplified a little since it was always used in the same way.
  • Added some features to help with testing, such as recording webhook events.

The comments and docs should give some overview of how things are set up and how to write and run tests.
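
To give a feel for the first style, here is a rough, hypothetical sketch of a github_client test. TestServer, the constructor shape, and the repository method are illustrative assumptions, not the PR's actual helpers; GITHUB_API_URL is one of the env vars the PR introduces to redirect traffic.

```rust
// Hypothetical sketch only: serve a canned api.github.com JSON response from
// a local server, point GithubClient at it, and assert on the parsed result.
#[tokio::test]
async fn repository_lookup() {
    // `TestServer` is an assumed helper that replays a recorded response.
    let server = TestServer::with_json(r#"{"full_name": "rust-lang/rust"}"#);
    // The PR makes the API host configurable; GITHUB_API_URL redirects
    // api.github.com traffic to the local server.
    std::env::set_var("GITHUB_API_URL", server.url());
    let gh = GithubClient::new("dummy-token".to_string()); // assumed simplified constructor
    let repo = gh.repository("rust-lang/rust").await.unwrap();
    assert_eq!(repo.full_name, "rust-lang/rust");
}
```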

@ehuss (Contributor, PR author) commented Dec 30, 2022

This is a little incomplete, as I wanted to get some feedback on whether this is something you would be willing to accept. This only includes a few tests, so it doesn't cover very much. Some questions to consider:

  • Is this worth it? I'm a bit on the fence, but I think it would make working on triagebot a bit easier, at least to have some more confidence when making changes.
  • Requiring the existence of PostgreSQL is a pain. I'm wondering if there is some alternative?
  • I hacked the Dockerfile to get it to support postgres. I expect it needs some cleanup.
    • Adding postgresql to the Docker image adds a fair amount of size (185MB?). It might be worth considering changing this so that the testing image is separate from the build image.
  • I could split the two kinds of tests into two separate integration tests, but I'm not sure if it is worth the hassle (particularly with sharing the common code).

@Mark-Simulacrum (Member) left a comment

I think the high-level feedback is that the overall direction is good, but we need to improve how the JSON files are kept and generated. One thought I had: it seems like we ought to be able to support something like --bless to generate them automatically? Doing so on CI (perhaps with a tarball artifact produced, or some other easy-ish way of using the results locally) would make iterating a little better.

Otherwise I'm worried that the bar for adding tests is pretty high if you need to figure out gh api and such.

README.md Outdated
> --url=http://127.0.0.1:8000/github-hook --secret somelongsekrit
> ```
>
> Where the value in `--secret` is the secret value you place in `GITHUB_WEBHOOK_SECRET` described below, and `--repo` is the repo you want to test against.
Mark-Simulacrum (Member):

Do you know what permissions are necessary for this? It seems like it might be a convenient way to test against actual GitHub from CI (too), but it's not clear whether the permissions model really makes that easy.

For CI at least we could probably set up our own forwarding if needed though.

ehuss (PR author):

IIRC, the default token generated by gh auth login will have full permissions. gh webhook just needs the ability to alter webhook settings on the repo, which that token should have. At least in my testing, it "just worked" without me needing to do anything special (other than signing up for the beta).

I'm not sure it would really be feasible to set it up on CI. I think it would need to start with creating a bot account, manually creating a token scoped to a single "test" repo, and adding that token to the workflow secrets. But then PR authors would have full write access to that repo, which could be dangerous.

I think it would be challenging to do in a secure way, but I haven't thought about it much. I don't know when it will be out of beta, either.

"PUT" => Method::PUT,
"DELETE" => Method::DELETE,
"PATCH" => Method::PATCH,
_ => panic!("unexpected HTTP method {s}"),
Mark-Simulacrum (Member):

FWIW it seems plausible that https://docs.rs/http/latest/http/method/struct.Method.html might work instead of this (and some other manually defined structs here). Though historically I've not been super happy with the http crate API for this kind of thing... so maybe hand rolling is better.

ehuss (PR author):

I didn't use it because:

  • I wanted an enum to make it easy to use Method::*. http uses associated consts.
  • I also wanted it to impl Copy to make it easy to use.
  • I had a custom STOP verb, which is not standard, and the http Method extensions weren't easy to use. I could easily switch that to use a different method, though.

I understand that it might be nicer to not use a custom http server, but I suspect adding anything else would have a pretty large build time and complexity hit.

I could trim a few things down, like removing custom header support. It isn't being used here, and I'm not sure it will ever be needed (I copied this from Cargo which does use it). However, it only adds a few lines so I figured it didn't hurt to keep it.
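
For reference, a minimal sketch of the kind of hand-rolled method type described above, under the constraints listed in the bullets (the actual definition in the PR may differ, and the real code panics on unknown methods rather than returning an error):

```rust
use std::str::FromStr;

/// A plain enum (rather than http::Method's associated consts) so that
/// `use Method::*` and pattern matching stay ergonomic, deriving Copy,
/// and including the non-standard STOP verb used by the test framework.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Method {
    GET,
    POST,
    PUT,
    DELETE,
    PATCH,
    /// Non-standard verb used only by the test framework.
    STOP,
}

impl FromStr for Method {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Ok(match s {
            "GET" => Method::GET,
            "POST" => Method::POST,
            "PUT" => Method::PUT,
            "DELETE" => Method::DELETE,
            "PATCH" => Method::PATCH,
            "STOP" => Method::STOP,
            _ => return Err(format!("unexpected HTTP method {s}")),
        })
    }
}
```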

@@ -0,0 +1,543 @@
//! `GithubClient` tests.
Mark-Simulacrum (Member):

This seems like an improvement over the status quo, but I am worried about the amount of JSON we're checking in here.

I have a few thoughts on alternatives:

  • Store the JSON elsewhere (at minimum: a separate directory would help to avoid losing track of the .rs files)
    • S3 seems like a possible candidate but is annoying to update, etc.
  • Don't try to make it possible to run this offline (basically: require CI or a GITHUB_TOKEN -- and "just" query the github API).
    • The dynamic nature of replies seems like it makes this quite annoying to do well.

Ultimately I'm not opposed to just checking in a bunch of JSON but I've always felt a little icky about that kind of thing. I'm not sure there are good alternatives though.

(I guess the "big" option is getting rid of our own client and using octocrab... which IIRC might be auto-generated at least in part? But that doesn't really solve the problem, just shifts it elsewhere).

ehuss (PR author):

Yea, the JSON files are a mess. I'll look into organizing them. I would prefer to avoid keeping them external, since I think that would make them more difficult to manage.

I'm not sure how feasible it is to make it run online (particularly on CI). Some of the CI complications are authentication and handling concurrent jobs. I also wanted to ensure this is as easy as possible for contributors to run.

ehuss (PR author):

I've been thinking about this off and on, and I'm feeling like the server tests are going to be a pain to maintain. They aren't too difficult to create, but if someone modifies some behavior in triagebot that perturbs them in any way, they are going to need to figure out how to recreate the tests, which I think is just going to be too painful.

I think I'm going to need to figure out a different strategy.

If I dropped the server tests from this PR, and left just the GithubClient and database tests, would you be more willing to merge it? I think those should be much more maintainable, and at least provide some increase in testing.

Mark-Simulacrum (Member):

In general, I'm happy to experiment; the main "blocker" for this PR (from my perspective) has been whether you feel happy with it. I would like to understand the tests as well, and feel OK updating them, but the main thing is that you're comfortable with them, since we can always just not run them or delete them.

So reducing scope seems fine.

//! At the end of the test, you should call `ctx.events.assert_eq()` to
//! validate that the correct HTTP actions were actually performed by
//! triagebot. If you are uncertain about what to put in there, just start
//! with an empty list, and the error will tell you what to add.
Mark-Simulacrum (Member):

In broad strokes this seems great -- I really like the assertions -- but I am (similarly to the github client) worried about maintaining and updating the JSON files.

@Mark-Simulacrum (Member):

> Is this worth it? I'm a bit on the fence, but I think it would make working on triagebot a bit easier, at least to have some more confidence when making changes.

Do you have a sense of how hard (time/complexity) adding new tests will be? Would it be reasonable to expect drive-by contributors to figure out the test framework, or will maintainers (e.g., me, you, etc.) need to write tests? Either seems workable, but it'll help gauge the level of complexity we can afford.

> Requiring the existence of PostgreSQL is a pain. I'm wondering if there is some alternative?

I agree it's a pain -- most triagebot features don't actually rely on the database, so I wonder if we could stub it out. Separately, I'd personally expect that adding an sqlite backend wouldn't actually be that hard (perf.rlo does something like this), but in production Postgres is much easier to manage (mostly because we can have an "instance" rather than worrying about files, and can log in to fix up the database concurrently with the service).

@apiraino (Contributor) commented Dec 30, 2022

I've used cargo insta for recording and comparing JSON replies from API endpoints; it's quite useful and easy to use. Could it be an alternative way to handle these JSON files?

@ehuss (Contributor, PR author) commented Dec 30, 2022

> Do you have a sense of how hard (time/complexity) adding new tests will be? Would it be reasonable to expect drive-by contributors to figure out the test framework, or will maintainers (e.g., me, you, etc.) need to write tests? Either seems workable, but it'll help gauge the level of complexity we can afford.

The GithubClient tests were relatively easy. Each one took just a few minutes to write.

The server tests were quite a bit more challenging, and took maybe 10 minutes to write each one (and that's for an easy one). So I do fear it might be difficult for people unfamiliar with it to contribute easily.

> most triagebot features don't actually rely on the database, so I wonder if we could stub it out.

I'll look into that, or maybe something like sqlite. I was reluctant to bring in another dependency. I would like to avoid some of the complexity here.

> One thought I had is that it seems like we ought to be able to support something like --bless to generate them automatically?

> I've used cargo insta for recording and comparing JSON replies from API endpoints; it's quite useful and easy to use. Could it be an alternative way to handle these JSON files?

I'll look into a more automated way to record the behavior, possibly using something like cargo insta to manage it. I especially didn't like the current structure where you have to both write an endpoint (.api_handler(…)) and then verify that those endpoints were hit (ctx.events.assert_eq(…)), since it essentially duplicates all of that. I think it is going to be challenging to make that completely automated, but it might be worth it.

@ehuss force-pushed the gh-test branch 2 times, most recently from dd97c88 to 1617f0d on February 5, 2023
.configure(self)
.build()
.unwrap(),
)
.await?;
rate_resp.error_for_status_ref()?;
ehuss (PR author):

Note: this call to error_for_status_ref() is a small, unrelated fix to fail early if the /rate_limit URL fails. Previously the status code was ignored, which could cause a confusing error.

@ehuss commented Feb 5, 2023

OK, I pushed a new version that takes a different approach of just using recorded playbacks. It is substantially easier to write tests, though I'm a little concerned about the possible fragility.

The PR is a little large, so let me recap the changes:

  • Lots of small changes scattered throughout to ensure that the remote URLs are configurable.
  • Lots of small changes to GithubClient to make it easier to record requests.
  • Added src/test_record.rs. These are functions for recording test data as JSON and saving them to disk when the TRIAGEBOT_TEST_RECORD environment variable is set.
  • There are two main styles of tests: tests/github_client/mod.rs, which focuses on GithubClient testing and is much lighter weight, and tests/server_test/mod.rs, which focuses on webhook tests and is much heavier weight. See those respective files for an introduction on how to write those tests.
    • I stuck the recorded JSON files in a separate directory for each test.
  • Both of those tests are compiled in the same integration test called tests/testsuite.rs.
  • The HTTP requests are recorded as JSON in a structure called an Activity (sketched just after this comment). When a test runs, the framework cycles through the activities in order, injecting webhooks and waiting for each expected HTTP request to come in.

I wanted to do a snapshot at this point and see what you think of this new approach.

I'm not sure how reliable these tests are going to be. If in the long run they end up being too fussy, then I think they can be removed. However, I should be around in case there are issues or people need help.
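
For illustration, the recorded Activity entries might look roughly like this. Variant and field names here are assumptions based on the description above; see src/test_record.rs in the PR for the real definition.

```rust
use serde::{Deserialize, Serialize};

/// Hypothetical shape of one recorded activity. The test framework replays
/// these in order: webhooks are injected into triagebot, and each expected
/// HTTP request is matched and answered with the recorded response.
#[derive(Serialize, Deserialize)]
#[serde(tag = "kind", rename_all = "snake_case")]
enum Activity {
    /// A webhook event to inject (e.g. "issues", "issue_comment").
    Webhook {
        event: String,
        payload: serde_json::Value,
    },
    /// An HTTP request triagebot is expected to make, plus the canned reply.
    /// Non-JSON response bodies are stored as a JSON string (per the commit
    /// message later in this thread).
    Request {
        method: String,
        path: String,
        response_code: u16,
        response_body: serde_json::Value,
    },
}
```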

@apiraino (Contributor) left a comment

@ehuss I left a few comments; I think nothing but nits.

I agree with @Mark-Simulacrum that it's worth investigating a sqlite DB for the testing use case.

src/main.rs Outdated
Comment on lines 245 to 247
if env::var_os("TRIAGEBOT_TEST_DISABLE_JOBS").is_some() {
return;
}
apiraino (Contributor):

The purpose of TRIAGEBOT_TEST_DISABLE_JOBS is explained in ./tests/server_test/mod.rs. Perhaps also include the same bit of documentation here? I would also add a short note in the README.md.

Also, would anything change if the env var check stayed outside of the task-spawning block (since tasks are disabled entirely when running tests)? Would that spare some cycles or change anything in the compiled code?

ehuss (PR author):

I shuffled the spawning code to separate functions to hopefully make the structure a little clearer (the run_server function was starting to get a little long in my opinion).

I'd rather not mention TRIAGEBOT_TEST_DISABLE_JOBS in README.md, since users should never need to know about it specifically. I did add some documentation to server_test/mod.rs discussing scheduled jobs.
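
As a sketch of the structure described here (the function name and exact checks are assumptions; a commit message further down also mentions skipping jobs while test recording is enabled):

```rust
// Hypothetical sketch: job spawning pulled out of `run_server` into its own
// function, with early returns when jobs are disabled for tests or when
// test recording is enabled (so background jobs don't pollute recordings).
fn spawn_scheduled_jobs() {
    if std::env::var_os("TRIAGEBOT_TEST_DISABLE_JOBS").is_some()
        || std::env::var_os("TRIAGEBOT_TEST_RECORD").is_some()
    {
        return;
    }
    // ... spawn the scheduled-job tasks here ...
}
```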

///
/// Returns `None` if recording is disabled.
fn record_dir() -> Option<PathBuf> {
    let Some(test_record) = std::env::var_os("TRIAGEBOT_TEST_RECORD") else { return None };
apiraino (Contributor):

It seems that record_dir() is only checked in the initial main(). Could the check for TRIAGEBOT_TEST_RECORD be moved there, instead of checking it in init() and record_event()? Would that help make it a bit more visible?

Also, I would mention it in the README.md (for increased awareness when someone wants to run tests).

ehuss (PR author):

I'm not entirely clear on this suggestion. Each time an event comes in, it needs to be recorded to disk, and in order to know where to record it, the code needs to read TRIAGEBOT_TEST_RECORD; it can't be checked in just one place. Unless the suggestion is to store the value somewhere and read that directly? That seems like it would make things a bit more difficult to use, since there would need to be some global state tracked somewhere.

I added a brief mention of recording in README.md.

Comment on lines 146 to 148
// TODO: This is a poor way to choose a TCP port, as it could already
// be in use by something else.
let triagebot_port = NEXT_TCP_PORT.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
apiraino (Contributor):

Does it test if the port is already taken?

Another approach could be what Wiremock does: allocate a pool of free ports at startup.

ehuss (PR author):

I'm not sure I'm clear on the suggestion about setting up a pool. Each triagebot process is independent and wouldn't be able to easily "take" a port from a pool (sending file descriptors over sockets exists, but that's not something I would want to bother with).

I'm not too worried about using up a lot of ports. I would only expect for there to ever be a few dozen tests, and there should be over 15,000 ports available starting at 50,000.

I added some code to pick a free port, though it may not be very reliable.
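
The usual trick for picking a free port looks something like the sketch below (an assumption about what the added code does; note the inherent race that makes the approach "not very reliable"):

```rust
use std::net::TcpListener;

/// Ask the OS for a free port by binding to port 0. There is an inherent
/// race: another process can grab the port between `drop(listener)` and
/// the triagebot process binding it, which is why this is not fully reliable.
fn pick_free_port() -> std::io::Result<u16> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let port = listener.local_addr()?.port();
    drop(listener); // release the port so the triagebot process can bind it
    Ok(port)
}
```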

@@ -4,7 +4,8 @@ use rust_team_data::v1::{Teams, ZulipMapping, BASE_URL};
use serde::de::DeserializeOwned;

async fn by_url<T: DeserializeOwned>(client: &GithubClient, path: &str) -> anyhow::Result<T> {
let url = format!("{}{}", BASE_URL, path);
let base = std::env::var("TEAMS_API_URL").unwrap_or(BASE_URL.to_string());
apiraino (Contributor):

Mention TEAMS_API_URL in the README, with its purpose and a suggested default value?

ehuss (PR author):

Can you say what should be mentioned? These environment variables (GITHUB_API_URL, GITHUB_RAW_URL, etc.) are intended only for testing, and should only ever be set automatically. Users shouldn't ever need to know about them.

apiraino (Contributor):

Yes, documenting for developers the env vars needed to run triagebot (new one: TEAMS_API_URL). I only half agree, but these are probably just unimportant nits, I guess :-)

This helps keep the `run_server` function a little smaller, and makes
the disabling code a little clearer. This also adds a check for whether
test recording is enabled, to help avoid jobs interfering with recording.
@ehuss commented Feb 6, 2023

I'd be happy to investigate adding sqlite support. What do you think about using something like sqlx instead of recreating the pooling and migration stuff? It seems like it already implements everything that would be needed. Or would you prefer to just copy the code over from rustc-perf?

@apiraino commented Feb 6, 2023

> What do you think about using something like sqlx

I am not familiar with how rustc-perf handles sqlite DB connections, but in my experience sqlx is a hefty dependency to pull in (in terms of compile time and maintenance chores). FWIW, past comments from Mark point toward keeping external dependencies to a minimum for triagebot. Just my two cents.

@ehuss commented Feb 6, 2023

Oh yea, I strongly agree with keeping deps to a minimum. I haven't used sqlx, and was mostly curious about it, since it supports async and handles pooling and migrations, among other things (and seems well maintained and often recommended). Copying from rustc-perf means pulling in the pooling code and the migration code, and needing to write every DB query twice. It's not huge to copy in, so I can take a crack at it and see how difficult it is.

@Mark-Simulacrum commented Feb 6, 2023

I have no direct objections to sqlx, but when I wrote the rustc-perf logic, I believe they encouraged having a database around for schema checking or similar at build time, which I really didn't want. It may have also been the case that they had a stale version of rusqlite in their dependencies, and we needed something newer?

In part, I also believe I ended up feeling like the pooling and other abstraction layer was less work than learning sqlx and then writing some amount of abstraction atop it (since there were still incompatibilities between the SQLite and Postgres backends, so raw sqlx alone wasn't enough).

It's quite possible these problems have been solved since then.

Not all API responses are JSON, which can make it a bit awkward to
differentiate which kind of response is JSON. This unifies the two
Activity variants so that there is only one, which simplifies things
a bit. Non-JSON responses are stored as a JSON string, under the
assumption that GitHub never responds with a bare JSON string.
@ehuss commented Feb 21, 2023

I have pushed an update that adds sqlite support. I essentially copied the code from rustc-perf. There is a Connection trait which abstracts the backends. The postgres side should be essentially the same as before. The tests will use postgres if it is available, and fall back to sqlite if not. There are also coverage tests for the Connection API which exercise both postgres and sqlite.

The way sqlite handles concurrency and locking is a bit different from Postgres, but for testing it should be fine.
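
A rough sketch of what that abstraction could look like; the trait and method names here are illustrative stand-ins, not the PR's actual definitions, and the real trait covers triagebot's full query surface:

```rust
use async_trait::async_trait;

/// Sketch of the backend abstraction: one trait, two implementations.
#[async_trait]
pub trait Connection: Send + Sync {
    /// One example operation; the real trait has a method per DB query.
    async fn record_username(&mut self, user_id: u64, username: &str) -> anyhow::Result<()>;
}

/// Stub backends standing in for the real postgres/sqlite implementations.
pub struct SqliteConnection;
pub struct PostgresConnection;

#[async_trait]
impl Connection for SqliteConnection {
    async fn record_username(&mut self, _id: u64, _name: &str) -> anyhow::Result<()> {
        Ok(()) // a real impl would execute the INSERT via rusqlite
    }
}

#[async_trait]
impl Connection for PostgresConnection {
    async fn record_username(&mut self, _id: u64, _name: &str) -> anyhow::Result<()> {
        Ok(()) // a real impl would execute the INSERT via tokio_postgres
    }
}

/// Tests pick Postgres when available, falling back to SQLite.
pub fn test_connection(postgres_available: bool) -> Box<dyn Connection> {
    if postgres_available {
        Box::new(PostgresConnection)
    } else {
        Box::new(SqliteConnection)
    }
}
```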

This can help avoid accidentally deleting something that hasn't been
checked in yet.
Apparently some environments roundtrip datetimes in Postgres
with different precision.