From 61cc907b6e0d21864328fab86b0484fead5f0e2c Mon Sep 17 00:00:00 2001
From: Marin Bareta
Date: Tue, 27 Aug 2024 16:34:33 +0200
Subject: [PATCH 1/7] Add automated testing page

---
 README.md                    |  4 ++
 recipes/automated-testing.md | 83 ++++++++++++++++++++++++++++++++++++
 2 files changed, 87 insertions(+)
 create mode 100644 recipes/automated-testing.md

diff --git a/README.md b/README.md
index 3d2aed3..8bd815f 100644
--- a/README.md
+++ b/README.md
@@ -21,6 +21,10 @@
 4. [LTI - Learning Tools Interoperability Protocol](./recipes/lti.md)
 5. [CircleCI Build Guide](./recipes/circleci-build-guide.md)
 
+## Guides
+
+1. [Automated Testing](./recipes/automated-testing.md)
+
 ## 🙌 Want to contribute?
 
 We are open to all kinds of contributions. If you want to:
diff --git a/recipes/automated-testing.md b/recipes/automated-testing.md
new file mode 100644
index 0000000..f7446e0
--- /dev/null
+++ b/recipes/automated-testing.md
@@ -0,0 +1,83 @@
+# Automated Testing
+
+## Types of Automated Tests
+
+There are different approaches to testing and depending on the level of the
+entry point, we can split tests into following categories.
+
+- **Unit Tests**
+- **Integration Tests**
+- **E2E Tests**
+- **Load/Performance Tests**
+- **Visual Tests**
+
+*Note that some people can call these tests by different names, but for Studion
+internal purposes, this should be considered the naming convention.*
+
+### Unit Tests
+
+These are the most isolated tests that we can write. They should take a specific
+function/service/helper/module and test its functionality. Unit tests will
+usually require mocked data, but since we're testing that specific input produces
+specific output, the mocked data set should be minimal.
+
+Unit testing is recommended for functions that contain a lot of logic and/or branching.
+It is convenient to test a specific function at the lowest level so if the logic
+changes, we can make minimal changes to the test suite and/or mocked data.
+
+
+### Integration Tests (API Tests)
+
+This is the broadest test category. With these tests, we want to make sure our
+API contract is valid and the API returns the expected data. That means we write
+tests for the publically available endpoints.
+
+**TODO**: do we want to add that we should run full backend for these type of tests?
+
+**TODO**: do we want to write anything about mocking the DB data/seeds?
+
+In these tests we should cover *at least* the following:
+- **authorization** - make sure only logged in users with correct role/permissions
+can access this endpoint
+- **success** - if we send correct data, the endpoint should return response that
+contains correct data
+- **failure** - if we send incorrect data, the endpoint should handle the exception
+and return appropriate error status
+
+If the endpoint contains a lot of logic where we need to mock a lot of different
+inputs, it might be a good idea to cover that logic with unit tests. Unit tests
+will require less overhead and will provide better performance while at the same
+time decoupling logic testing and endpoint testing.
+
+### E2E Tests
+
+These tests are executed within a browser environment (Playwright, Selenium, etc.).
+The purpose of these tests is to make sure that interacting with the application UI
+produces the expected result.
+
+Usually, these tests will cover a large portion of the codebase with least
+amount of code.
+Because of that, they can be the first tests to be added to a project that
+has no tests or has low test coverage.
+
+These tests should not cover all of the use cases because they are the slowest to
+execute. If we need to test edge cases, we should try to implement those at a
+lower level, like integration or unit tests.
+
+### Performance Tests
+
+These types of tests will reproduce a usual user scenario and then simulate a group
+of concurrent users and measure the server's response.
+
+They are typically used to stress test the infrastructure and measure the throughput
+of the application.
+
+
+### Visual Tests
+
+The type of test where test runner navigates to browser page, takes screenshot
+and then compares the future screenshots with the reference screenshot.
+
+These types of tests will cover a lot of ground with the least effort and
+can indicate a change in the app. The downside is that they're not very precise
+and the engineer needs to spend some time to determine the cause of the error.

From 6717639728372037ddcca7d1bd01909f23ab1b53 Mon Sep 17 00:00:00 2001
From: Marin Bareta
Date: Tue, 27 Aug 2024 16:40:28 +0200
Subject: [PATCH 2/7] Clean up

---
 recipes/automated-testing.md | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/recipes/automated-testing.md b/recipes/automated-testing.md
index f7446e0..af50533 100644
--- a/recipes/automated-testing.md
+++ b/recipes/automated-testing.md
@@ -2,7 +2,7 @@
 
 ## Types of Automated Tests
 
-There are different approaches to testing and depending on the level of the
+There are different approaches to testing, and depending on the level of the
 entry point, we can split tests into following categories.
 
 - **Unit Tests**
@@ -18,8 +18,8 @@ internal purposes, this should be considered the naming convention.*
 
 These are the most isolated tests that we can write. They should take a specific
 function/service/helper/module and test its functionality. Unit tests will
-usually require mocked data, but since we're testing that specific input produces
-specific output, the mocked data set should be minimal.
+usually require mocked data, but since we're testing the case when specific
+input produces specific output, the mocked data set should be minimal.
 
 Unit testing is recommended for functions that contain a lot of logic and/or branching.
 It is convenient to test a specific function at the lowest level so if the logic
@@ -36,7 +36,7 @@ tests for the publically available endpoints.
 
 **TODO**: do we want to write anything about mocking the DB data/seeds?
 
-In these tests we should cover *at least* the following:
+In these tests we should cover **at least** the following:
 - **authorization** - make sure only logged in users with correct role/permissions
 can access this endpoint
 - **success** - if we send correct data, the endpoint should return response that
@@ -57,20 +57,22 @@ produces the expected result.
 
 Usually, these tests will cover a large portion of the codebase with least
 amount of code.
-Because of that, they can be the first tests to be added to a project that
+Because of that, they can be the first tests to be added to existing project that
 has no tests or has low test coverage.
 
 These tests should not cover all of the use cases because they are the slowest to
-execute. If we need to test edge cases, we should try to implement those at a
-lower level, like integration or unit tests.
+run. If we need to test edge cases, we should try to implement those at a lower
+level (integration or unit tests).
 
 ### Performance Tests
 
-These types of tests will reproduce a usual user scenario and then simulate a group
-of concurrent users and measure the server's response.
+These types of tests will reproduce a typical user scenario and then simulate a
+group of concurrent users and then measure the server's response time and overall
+performance.
 
 They are typically used to stress test the infrastructure and measure the throughput
-of the application.
+of the application. They can expose bottlenecks and identify endpoints that need
+optimization.
 
 
 ### Visual Tests
@@ -78,6 +80,6 @@ of the application.
 The type of test where test runner navigates to browser page, takes screenshot
 and then compares the future screenshots with the reference screenshot.
 
-These types of tests will cover a lot of ground with the least effort and
-can indicate a change in the app. The downside is that they're not very precise
+These types of tests will cover a lot of ground with the least effort and can
+easily indicate a change in the app. The downside is that they're not very precise
 and the engineer needs to spend some time to determine the cause of the error.

From 881efee02c0c42ea8872881b0fc42b6bc8697ded Mon Sep 17 00:00:00 2001
From: Marin Bareta
Date: Wed, 28 Aug 2024 15:30:56 +0200
Subject: [PATCH 3/7] Split integration and API tests

---
 recipes/automated-testing.md | 96 +++++++++++++++++++++++++++++++++---
 1 file changed, 90 insertions(+), 6 deletions(-)

diff --git a/recipes/automated-testing.md b/recipes/automated-testing.md
index af50533..39eab47 100644
--- a/recipes/automated-testing.md
+++ b/recipes/automated-testing.md
@@ -2,11 +2,12 @@
 
 ## Types of Automated Tests
 
-There are different approaches to testing, and depending on the level of the
-entry point, we can split tests into following categories.
+There are different approaches to testing, and depending on boundaries of the
+test, we can split them into following categories:
 
 - **Unit Tests**
 - **Integration Tests**
+- **API Tests**
 - **E2E Tests**
 - **Load/Performance Tests**
 - **Visual Tests**
@@ -25,12 +26,26 @@ Unit testing is recommended for functions that contain a lot of logic and/or bra
 It is convenient to test a specific function at the lowest level so if the logic
 changes, we can make minimal changes to the test suite and/or mocked data.
 
+#### When to use
+- Test a unit that implements the business logic, that's isolated from side effects such as database interaction or HTTP request processing
+- Test function or class method with multiple input-output permutations
 
-### Integration Tests (API Tests)
+#### When **not** to use
+- To test unit that integrates different application layers, such as persistence layer (database) or HTTP layer (see "Integration Tests")
 
-This is the broadest test category. With these tests, we want to make sure our
-API contract is valid and the API returns the expected data. That means we write
-tests for the publically available endpoints.
+#### Best practices
+- Unit tests should execute fast (<50ms)
+- Use mocks and stubs through dependency injection (method or constructor injection)
+
+#### Antipatterns
+- Mocking infrastructure parts such as database I/O - instead, revert the control by using the `AppService`, `Command` or `Query` to integrate unit implementing business logic with the infrastructure layer of the application
+- Monkey-patching dependencies used by the unit - instead, pass the dependencies through the constructor or method, so that you can pass the mocks or stubs in the test
+
+
+### Integration Tests
+
+With these tests, we test the application API endpoints and assert that they are
+actually working as expected.
 
 **TODO**: do we want to add that we should run full backend for these type of tests?
 
@@ -43,12 +58,51 @@ can access this endpoint
 contains correct data
 - **failure** - if we send incorrect data, the endpoint should handle the exception
 and return appropriate error status
+- **successful change** - successful request should make the appropriate change
 
 If the endpoint contains a lot of logic where we need to mock a lot of different
 inputs, it might be a good idea to cover that logic with unit tests. Unit tests
 will require less overhead and will provide better performance while at the same
 time decoupling logic testing and endpoint testing.
 
+#### When to use
+- To verify the API endpoint performs authentication and authorization.
+- To verify user permissions for that endpoint.
+- To verify that invalid input is correctly handled.
+
+#### When **not** to use
+- For testing of specific function logic. We should use unit tests for those.
+
+#### Best practices
+- Test basic API functionality and keep the tests simple.
+- If the tested endpoint makes database changes, verify that the changes were
+actually made.
+
+#### Antipatterns
+
+### API Tests
+
+With these tests, we want to make sure our API contract is valid and the API
+returns the expected data. That means we write tests for the publically
+available endpoints.
+
+Depending on the project setup, API tests can be covered with integration tests.
+For example, if the application only has public APIs and more devs than QAs, it
+might be a better option to add API testing in integration tests.
+
+#### When to use
+- To make sure the API signature is valid.
+
+#### When **not** to use
+- To test application logic.
+
+#### Best practices
+- Write these tests with the tools which allow us to reuse the tests to write
+performance tests (K6).
+
+#### Antipatterns
+
+
 ### E2E Tests
 
 These tests are executed within a browser environment (Playwright, Selenium, etc.).
@@ -64,6 +118,18 @@ These tests should not cover all of the use cases because they are the slowest t
 run. If we need to test edge cases, we should try to implement those at a lower
 level (integration or unit tests).
 
+#### When to use
+- Test user interaction with the application UI.
+
+#### When **not** to use
+- For data validation.
+
+#### Best practices
+- Performance is key in these tests. We want to run tests as often as possible
+and good performance will allow that.
+
+#### Antipatterns
+
 ### Performance Tests
 
 These types of tests will reproduce a typical user scenario and then simulate a
@@ -74,6 +140,16 @@ They are typically used to stress test the infrastructure and measure the throug
 of the application. They can expose bottlenecks and identify endpoints that need
 optimization.
 
+#### When to use
+- To stress test infrastructure.
+- To measure how increased traffic affects load speeds and overall app performance.
+
+#### When **not** to use
+
+#### Best practices
+
+#### Antipatterns
+
 
 ### Visual Tests
 
@@ -83,3 +159,11 @@ and then compares the future screenshots with the reference screenshot.
 These types of tests will cover a lot of ground with the least effort and can
 easily indicate a change in the app. The downside is that they're not very precise
 and the engineer needs to spend some time to determine the cause of the error.
+
+#### When to use
+
+#### When **not** to use
+
+#### Best practices
+
+#### Antipatterns

From 499edbdb8bf3d499e38c33b9eca9aa2aa3f55403 Mon Sep 17 00:00:00 2001
From: Marin Bareta
Date: Mon, 2 Sep 2024 12:54:42 +0200
Subject: [PATCH 4/7] Add overarching Best Practices section, update other sections

---
 recipes/automated-testing.md | 72 +++++++++++++++++++++++++++++++++++-
 1 file changed, 71 insertions(+), 1 deletion(-)

diff --git a/recipes/automated-testing.md b/recipes/automated-testing.md
index 39eab47..4901016 100644
--- a/recipes/automated-testing.md
+++ b/recipes/automated-testing.md
@@ -1,5 +1,48 @@
 # Automated Testing
 
+## Testing best practices
+
+Writing tests can be hard because there are a lot of things that can be tested.
+
+Starting out can be overwhelming. But since writing tests is easy, we can write
+a lot of them in no time. Don't do this. Do not focus on code coverage number as
+it can lead to false sense of security. Remember that code with 100% test coverage
+can still have bugs.
+
+Focus on test quality and test performance. Make sure the test is not asserting
+unimportant things. Make sure the test is as quick as possible. Quick tests will
+be run often. Running tests often means more early bug detection which means less
+production errors.
+
+---
+
+Deal with flaky tests immediately. Flaky tests ruin test suite confidence. A failed
+test should raise alarm immediately. If the test suite contains flaky tests, disable
+them and refactor as soon as possible.
+
+---
+
+Be careful with tests that alter database state. We want to be able to run tests
+in parallel so do not write tests that depend on each other. Each test should be
+independent of the test suite.
+
+---
+
+Test for behavior and not implementation. Rather focus on writing tests that
+follow the business logic instead of programming logic. Avoid writing parts of
+the function implementation in the actual test assertion. This will lead to tight
+coupling of tests with internal implementation and the tests will have to be fixed
+each time the logic changes.
+
+---
+
+Writing quality tests is hard and it's easy to fall into common pitfalls of testing
+that the database update function actually updates the database. Start off simple
+and as the application grows in complexity, it will be easier to determine what
+should be tested more thoroughly. It is perfectly fine to have a small test suite
+that covers the critical code and the essentials. Small suites will run faster
+which means they will be run more often.
+
 ## Types of Automated Tests
 
 There are different approaches to testing, and depending on boundaries of the
@@ -31,7 +74,7 @@ changes, we can make minimal changes to the test suite and/or mocked data.
 - Test function or class method with multiple input-output permutations
 
 #### When **not** to use
-- To test unit that integrates different application layers, such as persistence layer (database) or HTTP layer (see "Integration Tests")
+- To test unit that integrates different application layers, such as persistence layer (database) or HTTP layer (see "Integration Tests") or performs disk I/O or communicates with external system
 
 #### Best practices
 - Unit tests should execute fast (<50ms)
@@ -72,13 +115,17 @@ time decoupling logic testing and endpoint testing.
 
 #### When **not** to use
 - For testing of specific function logic. We should use unit tests for those.
+- For testing third party services. We should assume they work as expected.
 
 #### Best practices
 - Test basic API functionality and keep the tests simple.
 - If the tested endpoint makes database changes, verify that the changes were
 actually made.
+- Assert that output data is correct.
 
 #### Antipatterns
+- Aiming for code coverage percentage number. An app with 100% code coverage can
+have bugs. Instead, focus on writing meaningful, quality tests.
 
 ### API Tests
 
@@ -127,6 +174,8 @@ level (integration or unit tests).
 #### Best practices
 - Performance is key in these tests. We want to run tests as often as possible
 and good performance will allow that.
+- Flaky tests should be immediately disabled and refactored. Flaky tests will
+cause the team to ignore or bypass the tests and these should be dealt with immediately.
 
 #### Antipatterns
 
@@ -140,15 +189,29 @@ They are typically used to stress test the infrastructure and measure the throug
 of the application. They can expose bottlenecks and identify endpoints that need
 optimization.
 
+Performance tests are supposed to be run on actual production environment since
+they test the performance of code **and** infrastructure. Keep in mind actual
+users when running performance tests. Best approach is to spin up a production
+clone and run tests against that environment.
+
 #### When to use
 - To stress test infrastructure.
 - To measure how increased traffic affects load speeds and overall app performance.
 
 #### When **not** to use
+- To test if the application works according to specs.
+- To test a specific user scenario.
 
 #### Best practices
+- These tests should mimic actual human user in terms of click frequency and page
+navigation.
+- There should be multiple tests that test different paths in the system, not a
+single performance test.
 
 #### Antipatterns
+- Running these tests locally or on an environment that doesn't match production
+in terms of infrastructure performance. (tests should be developed on a local
+instance, but the actual measurements should be performed live)
 
 
 ### Visual Tests
@@ -161,9 +224,16 @@ easily indicate a change in the app. The downside is that they're not very preci
 and the engineer needs to spend some time to determine the cause of the error.
 
 #### When to use
+- When we want to cover broad range of features.
+- When we want to increase test coverage with least effort.
+- When we want to make sure there are no changes in the UI.
 
 #### When **not** to use
+- To test a specific feature or business logic.
+- To test a specific user scenario.
 
 #### Best practices
+- Have deterministic seeds so the UI always renders the same output.
+- Add as many pages as possible but keep the tests simple.
 
 #### Antipatterns

From 32fcf441e46441e820fcfb53683baf97ab447da2 Mon Sep 17 00:00:00 2001
From: Miro Dojkic
Date: Thu, 5 Sep 2024 13:26:23 +0200
Subject: [PATCH 5/7] Rework quality over quantity section

---
 recipes/automated-testing.md | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/recipes/automated-testing.md b/recipes/automated-testing.md
index 4901016..ae7af9f 100644
--- a/recipes/automated-testing.md
+++ b/recipes/automated-testing.md
@@ -1,18 +1,17 @@
 # Automated Testing
+## Glossary
+**Confidence** - describes a degree to which passing tests guarantee that the app is working
+**Determinism** - describes how easy it is to determine where the problem is based on the failing test
 
 ## Testing best practices
 
-Writing tests can be hard because there are a lot of things that can be tested.
+### Quality over quantity
+Don't focus on achieving a specific code coverage percentage.
+While code coverage can help us identify uncovered parts of the codebase, it doesn't guarantee high confidence.
 
-Starting out can be overwhelming. But since writing tests is easy, we can write
-a lot of them in no time. Don't do this. Do not focus on code coverage number as
-it can lead to false sense of security. Remember that code with 100% test coverage
-can still have bugs.
-
-Focus on test quality and test performance. Make sure the test is not asserting
-unimportant things. Make sure the test is as quick as possible. Quick tests will
-be run often. Running tests often means more early bug detection which means less
-production errors.
+Instead, focus on identifying important paths of the application, especially from user's perspective.
+User can be a developer using a shared function, a user interacting with the UI, or a client using server app's JSON API.
+Write tests to cover those paths in a way that gives confidence that each path, and each separate part of the path works as expected.
 
 ---
 

From 7ef23a3c96312bd4a1157c97dcbe77a16519025b Mon Sep 17 00:00:00 2001
From: droguljic <1875821+droguljic@users.noreply.github.com>
Date: Tue, 26 Nov 2024 11:01:27 +0100
Subject: [PATCH 6/7] Revise the flaky tests section (#12)

---
 recipes/automated-testing.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/recipes/automated-testing.md b/recipes/automated-testing.md
index ae7af9f..fab499c 100644
--- a/recipes/automated-testing.md
+++ b/recipes/automated-testing.md
@@ -15,9 +15,12 @@ Write tests to cover those paths in a way that gives confidence that each path,
 
 ---
 
-Deal with flaky tests immediately. Flaky tests ruin test suite confidence. A failed
-test should raise alarm immediately. If the test suite contains flaky tests, disable
-them and refactor as soon as possible.
+Flaky tests that produce inconsistent results ruin confidence in the test suite, mask real issues, and are the source of frustration. The refactoring process to address the flakiness is crucial and should be a priority.
+To adequately deal with flaky tests it is important to know how to identify, fix, and prevent them:
+- Common characteristics of flaky tests include inconsistency, false positives and negatives, and sensitivity to dependency, timing, ordering, and environment.
+- Typical causes of the stated characteristics are concurrency, timing/ordering problems, external dependencies, non-deterministic assertions, test environment instability, and poorly written test logic.
+- Detecting flaky tests can be achieved by rerunning, running tests in parallel, executing in different environments, and analyzing test results.
+- To fix and prevent further occurrences of flaky tests the following steps can be taken, isolate tests, employ setup and cleanup routines, handle concurrency, configure a stable test environment, improve error handling, simplify testing logic, and proactively deal with typical causes of the flaky tests.
 
 ---
 

From 0d3898e336b2207fd02aad7b0f7bba48bab65cd5 Mon Sep 17 00:00:00 2001
From: Dino Bettini
Date: Thu, 5 Dec 2024 14:55:24 +0100
Subject: [PATCH 7/7] Clarify integration and API test distinction (#11)

* Clarify integration and API test distinction

- Removed the conflation of integration and API tests
- Added infrastructure and entry points integration subsections
- Added a note to API tests to mention the distinction from integration

* Update wording

Co-authored-by: Miro Dojkic

* Expand the glossary

Also reworded Entry Points slightly.

---------

Co-authored-by: Miro Dojkic
---
 recipes/automated-testing.md | 43 +++++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/recipes/automated-testing.md b/recipes/automated-testing.md
index fab499c..d4da7ef 100644
--- a/recipes/automated-testing.md
+++ b/recipes/automated-testing.md
@@ -1,7 +1,9 @@
 # Automated Testing
 ## Glossary
-**Confidence** - describes a degree to which passing tests guarantee that the app is working
-**Determinism** - describes how easy it is to determine where the problem is based on the failing test
+**Confidence** - describes a degree to which passing tests guarantee that the app is working.
+**Determinism** - describes how easy it is to determine where the problem is based on the failing test.
+**Use Case** - a potential scenario in which a system receives external input and responds to it. It defines the interactions between a role (user or another system) and a system to achieve a goal.
+**Combinatorial Explosion** - the fast growth in the number of combinations that need to be tested when multiple business rules are involved.
 
 ## Testing best practices
 
@@ -89,10 +91,26 @@ changes, we can make minimal changes to the test suite and/or mocked data.
 
 ### Integration Tests
 
-With these tests, we test the application API endpoints and assert that they are
-actually working as expected.
+With these tests, we test how multiple components of the system behave together.
 
-**TODO**: do we want to add that we should run full backend for these type of tests?
+#### Infrastructure
+
+Running the tests on test infrastructure should be preferred to mocking, unlike in unit tests. Ideally, a full application instance would be run, to mimic real application behavior as closely as possible.
+This usually includes running the application connected to a test database, inserting fake data into it during the test setup, and doing assertions on the current state of the database. This also means integration test code should have full access to the test infrastructure for querying.
+> [!NOTE]
+> Regardless of whether using raw queries or the ORM, simple queries should be used to avoid introducing business logic within tests.
+
+However, mocking can still be used when needed, for example when expecting side-effects that call third party services.
+
+#### Entry points
+
+Integration test entry points can vary depending on the application use cases. These include services, controllers, or the API. These are not set in stone and should be taken into account when making a decision. For example:
+- A use case that can be invoked through multiple different protocols can be tested separately from them, to avoid duplication. A tradeoff in this case is the need to write some basic tests for each of the protocols.
+- A use case that will always be invocable through a single protocol might benefit enough from only being tested using that protocol. E.g. an HTTP API route test might eliminate the need for a lower level, controller/service level test. This would also enable testing the auth layer integration within these tests, which might not have been possible otherwise depending on the technology used.
+
+Multiple approaches can be used within the same application depending on the requirements, to provide sufficient coverage.
+
+#### Testing surface
 
 **TODO**: do we want to write anything about mocking the DB data/seeds?
 
 In these tests we should cover **at least** the following:
@@ -114,13 +132,16 @@ time decoupling logic testing and endpoint testing.
 - To verify the API endpoint performs authentication and authorization.
 - To verify user permissions for that endpoint.
 - To verify that invalid input is correctly handled.
+- To verify the basic business logic is handled correctly, both in expected success and failure cases.
+- To verify infrastructure related side-effects, e.g. database changes or calls to third party services.
 
 #### When **not** to use
-- For testing of specific function logic. We should use unit tests for those.
+- For extensive testing of business logic permutations beyond fundamental scenarios. Integration tests contain more overhead to write compared to unit tests and can easily lead to a combinatorial explosion. Instead, unit tests should be used for thorough coverage of these permutations.
 - For testing third party services. We should assume they work as expected.
 
 #### Best practices
-- Test basic API functionality and keep the tests simple.
+- Test basic functionality and keep the tests simple.
+- Prefer test infrastructure over mocking.
 - If the tested endpoint makes database changes, verify that the changes were
 actually made.
 - Assert that output data is correct.
@@ -135,9 +156,11 @@ With these tests, we want to make sure our API contract is valid and the API
 returns the expected data. That means we write tests for the publically
 available endpoints.
 
-Depending on the project setup, API tests can be covered with integration tests.
-For example, if the application only has public APIs and more devs than QAs, it
-might be a better option to add API testing in integration tests.
+> [!NOTE]
+> As mentioned in the Integration Tests section, API can be the entry point to the integration tests, meaning API tests are a subtype of integration tests. However, when we talk about API tests here, we are specifically referring to the public API contract tests, which don't have access to the internals of the application.
+
+In the cases where API routes are covered extensively with integration tests, API tests might not be needed, leaving more time for QA to focus on E2E tests.
+However, in more complex architectures (e.g. integration tested microservices behind an API gateway), API tests can be very useful.
 
 #### When to use
 - To make sure the API signature is valid.
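
The series ends by separating unit tests (dependencies injected, no infrastructure) from integration tests (real test infrastructure, assertions on database state). The sketch below illustrates that split in Python; it is not taken from the repository. `discounted_price`, `UserRepository` and the `users` table are hypothetical names, and an in-memory SQLite database stands in for the "test database" the guide mentions, on the assumption that a real project would point at its own test instance.

```python
import sqlite3
import unittest


def discounted_price(price, is_member, rate_provider):
    """Hypothetical business rule; rate_provider is injected so tests can stub it."""
    if price < 0:
        raise ValueError("price must be non-negative")
    rate = rate_provider() if is_member else 0.0
    return round(price * (1 - rate), 2)


class UserRepository:
    """Hypothetical persistence layer, exercised against a real database connection."""

    def __init__(self, conn):
        self.conn = conn

    def rename(self, user_id, new_name):
        cur = self.conn.execute(
            "UPDATE users SET name = ? WHERE id = ?", (new_name, user_id)
        )
        self.conn.commit()
        return cur.rowcount == 1


class DiscountUnitTest(unittest.TestCase):
    # Unit level: no infrastructure, the dependency is a stub passed as an argument.
    def test_member_gets_discount(self):
        self.assertEqual(discounted_price(100.0, True, lambda: 0.2), 80.0)

    def test_non_member_pays_full_price(self):
        self.assertEqual(discounted_price(100.0, False, lambda: 0.2), 100.0)

    def test_negative_price_is_rejected(self):
        with self.assertRaises(ValueError):
            discounted_price(-1.0, True, lambda: 0.2)


class RenameIntegrationTest(unittest.TestCase):
    # Integration level: every test seeds its own database, so tests stay independent.
    def setUp(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        self.conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ana')")
        self.conn.commit()
        self.repo = UserRepository(self.conn)

    def tearDown(self):
        self.conn.close()

    def test_rename_changes_the_row(self):
        self.assertTrue(self.repo.rename(1, "Iva"))
        # A simple raw query for the assertion keeps business logic out of the test.
        row = self.conn.execute("SELECT name FROM users WHERE id = 1").fetchone()
        self.assertEqual(row[0], "Iva")

    def test_rename_unknown_user_reports_failure(self):
        self.assertFalse(self.repo.rename(42, "Iva"))
```

Saved as a module, this can be run with `python -m unittest`. The branching logic is covered at the unit level, where adding more permutations is cheap, while the integration tests stay small and only confirm that the change actually reaches the database.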