Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document best practices for writing tests for nodes and pipelines #3782

Merged
merged 52 commits into from
Apr 17, 2024

Conversation

AhdraMeraliQB
Copy link
Contributor

@AhdraMeraliQB AhdraMeraliQB commented Apr 5, 2024

Description

Addresses #1271

Added the page "Test a Kedro project" to the spaceflights tutorial. The page details the structure of writing a positive and negative unit test for a node function, and an integration test for (part of) a pipeline, with examples. It then discusses some best practices for testing, including the directory setup, naming conventions, and use of fixtures.

The page on automated testing was updated to reflect the test directory structure we use in our template projects, and to include instructions on locally installing the Kedro project before testing, resolving the issues seen around imports from the project into the tests - xref on Slack

These changes will need to be updated on starters (PR paused - pending docs approval) - kedro-org/kedro-starters#216

Development notes

See the built docs here

The example tests have been tested in a spaceflights project (both unrefined and refactored versions) and pass as expected. These docs have pulled inspiration from some of the internal pages linked on the original issue.

Developer Certificate of Origin

We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a Signed-off-by line in the commit message. See our wiki for guidance.

If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.

Checklist

  • Read the contributing guidelines
  • Signed off each commit with a Developer Certificate of Origin (DCO)
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the RELEASE.md file
  • Added tests to cover my changes
  • Checked if this change will affect Kedro-Viz, and if so, communicated that with the Viz team

Copy link
Member

@astrojuanlu astrojuanlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for putting this together @AhdraMeraliQB !

I gave a very quick first pass

docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
Copy link
Member

@astrojuanlu astrojuanlu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than a minor formatting comment and another one that probably should become an issue in itself, this looks great ⭐

pip install -e .
```

>**NOTE**: The option `-e` installs an editable version of your project, allowing you to make changes to the project files without needing to re-install them each time.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe let's use native MyST syntax

Suggested change
>**NOTE**: The option `-e` installs an editable version of your project, allowing you to make changes to the project files without needing to re-install them each time.
:::{note}
The option `-e` installs an editable version of your project, allowing you to make changes to the project files without needing to re-install them each time.
:::

https://myst-parser.readthedocs.io/en/latest/syntax/admonitions.html

(fenced admonitions should also work)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MyST syntax doesn't seem to render properly on RTD. Additionally, the syntax of >**NOTE** is currently consistent across the docs - I'm unsure of the motivation to change it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, we needed the colon_fence extension

@merelcht merelcht mentioned this pull request Apr 15, 2024
10 tasks
Copy link
Contributor

@ankatiyar ankatiyar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks @AhdraMeraliQB 💯

RELEASE.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved
docs/source/tutorial/test_a_project.md Outdated Show resolved Hide resolved

### Pipeline slicing

In the test `test_data_science_pipeline` we test the that outputs of the data science pipeline are those that are expected. However, as pipelines are not static, this test is not robust. Instead we should be specific with how we define the pipeline to be tested; we do this by using [pipeline slicing](../nodes_and_pipelines/slice_a_pipeline.md#slice-a-pipeline-by-running-specified-nodes) to specify the pipeline's start and end:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"we test the that outputs of the data science pipeline are those that are expected", at the moment we don't do this, but test that the pipeline runs without errors, that's why you needed to add the trivial assert statement. I think it would be better if we could indeed check some sort of output, even if it's just a logging statement.

don4get and others added 9 commits April 17, 2024 10:54
* Update kedro-catalog-0.19.json

Signed-off-by: Anthony De Bortoli <anthony.debortoli@protonmail.com>

* Update set_up_vscode.md

Signed-off-by: Anthony De Bortoli <anthony.debortoli@protonmail.com>

---------

Signed-off-by: Anthony De Bortoli <anthony.debortoli@protonmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Replace toposort with graphlib (built-in from Python 3.9)

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>

* Create toposort groups only when needed

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>

* Update RELEASE.md and graphlib version constraints

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>

* Remove mypy-toposort

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>

* Ensure that the suggest resume test has no node ordering requirement

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>

* Ensure stable toposorting by grouping and ungrouping the result

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>

---------

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Create toposort groups only when needed
* Ensure that the suggest resume test has no node ordering requirement
* Ensure stable toposorting by grouping and ungrouping the result
* Delay toposorting until pipeline.nodes is used
* Avoid using .nodes when topological order or new copy is unneeded
* Copy the nodes only if tags are provided
* Remove unnecessary condition in self.nodes

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Add project to robots.txt

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>

* Add EOF

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>

---------

Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Kedro need more uv

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* remove docker

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

---------

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* Kedro need more uv

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* remove docker

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

* fix broken type hint and resolve project path

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* fix type hint

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* remove duplicate logic

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* adding nok.py is definitely an accident

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* fix test

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* remove print

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* add test

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

---------

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
* double linkcheck limits

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>

* fix ratelimit

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

---------

Signed-off-by: Nok Lam Chan <nok.lam.chan@quantumblack.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
puneeter and others added 18 commits April 17, 2024 10:54
* Update omegaconf_config.py

Signed-off-by: Puneet Saini <99470400+puneeter@users.noreply.github.com>

* Update RELEASE.md

Signed-off-by: Puneet Saini <99470400+puneeter@users.noreply.github.com>

* add a more complicated test case

Signed-off-by: Nok <nok.lam.chan@quantumblack.com>

---------

Signed-off-by: Puneet Saini <99470400+puneeter@users.noreply.github.com>
Signed-off-by: Nok <nok.lam.chan@quantumblack.com>
Co-authored-by: Nok <nok.lam.chan@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Marcin Zabłocki <m.zablo@gmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
…ets optional dependencies (#3664)

* Update spaceflights tutorial and starter requirements

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* fix e2e tests

Signed-off-by: lrcouto <laurarccouto@gmail.com>

* Fix e2e tests by distinguishing `kedro-datasets` dependency for different python versions (#3802)

Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>

* Update docs/source/tutorial/tutorial_template.md

Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>

---------

Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: L. R. Couto <57910428+lrcouto@users.noreply.github.com>
Signed-off-by: Merel Theisen <merel.theisen@quantumblack.com>
Co-authored-by: Merel Theisen <49397448+merelcht@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ankita Katiyar <ankitakatiyar2401@gmail.com>
Co-authored-by: Juan Luis Cano Rodríguez <juan_luis_cano@mckinsey.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
#3812)

* Factor out transcoding helpers into a private module

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>

* Ensure node input/output validation doesn't allow transcoded self-loops

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>

* Updated release note to avoid github warning

Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>

---------

Signed-off-by: Ivan Danov <idanov@users.noreply.github.com>
Signed-off-by: Elena Khaustova <ymax70rus@gmail.com>
Co-authored-by: Elena Khaustova <ymax70rus@gmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Dmitry Sorokin <dmd40in@gmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: lrcouto <laurarccouto@gmail.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
This reverts commit 9582a22.

Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
@AhdraMeraliQB AhdraMeraliQB force-pushed the docs/add-testing-best-practices branch from f04b96c to a212baf Compare April 17, 2024 09:55
AhdraMeraliQB and others added 6 commits April 17, 2024 10:56
Signed-off-by: Ahdra Merali <90615669+AhdraMeraliQB@users.noreply.github.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Signed-off-by: Ahdra Merali <ahdra.merali@quantumblack.com>
Copy link
Member

@merelcht merelcht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thanks @AhdraMeraliQB

SequentialRunner().run(pipeline, catalog)

# Assert
assert successful_run_msg in caplog.text
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@AhdraMeraliQB AhdraMeraliQB merged commit 4a4ecdf into main Apr 17, 2024
10 checks passed
@AhdraMeraliQB AhdraMeraliQB deleted the docs/add-testing-best-practices branch April 17, 2024 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet