## DSCI 524 - Collaborative Software Development

### Lecture 3: Code reviews, functional testing and advice for testing complex things

#### 2020-03-02

## Lecture 3 learning objectives:
By the end of this lecture, students should be able to:

- [Perform a code review that uses inline comments and suggested code fixes.](#Code-reviews-using-in-line-comments-and-suggested-code-fixes)
- Define the following 3 types of testing:
    - unit testing
    - integration testing
    - regression testing
- Employ a workflow that optimizes for accurate code.
- Use `pytest` and `testthat` to run a project's entire test suite.
- Explain how `pytest` and `testthat` find the test functions when they are asked to run a project's entire test suite.
- [Write unit tests for complex objects (e.g., data frames, models, plots).](#Write-unit-tests-for-complex-objects)

## Code reviews using in-line comments and suggested code fixes

<img src ="https://help.github.com/assets/images/help/commits/hover-comment-icon.gif" width=800>

*Source: <https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/reviewing-proposed-changes-in-a-pull-request>*

### Exercise: do a code review:

We are each going to do our own code review of a pull request. I have set up a template GitHub repository so that you can easily generate a pull request to review.

#### Steps:

**When you are done with step #5, indicate so on the [sli.do](https://www.sli.do) poll (`#524-L03`).**

1. **Import** [this repository](https://github.com/ttimbers/review-my-pull-request) to obtain a copy of it for yourself (do not fork it).

2. Create a remote branch named `pr` (this will use GitHub Actions to create a pull request for you to review in this repository).

3. Click on the Pull Requests tab of your copy of the repository, click on the pull request titled "Report most accomplished pilots", and then click on "Files Changed". Next click on the `star-wars.Rmd` file. Review the file and observe the following problems with the R Markdown report that was submitted via the pull request:
  - The reasoning of the sentence on line 15
  - Incompatibility of the sentence on line 15 with the code in the code chunk named `table-of-most-accomplished-pilots`
  - Incorrect code in the code chunk named `table-of-most-accomplished-pilots` (unnested `film` instead of `starships`) leads to naming the wrong pilot as the most accomplished pilot on line 27
  - Incorrect code in the code chunk named `table-of-most-accomplished-pilots` (unnested `film` instead of `starships`) leads to the use of the wrong character's picture in the image that is sourced in the code chunk named `top-pilot` (it should be a picture of Han Solo; you could use this URL, for example: <https://i1.wp.com/twinfinite.net/wp-content/uploads/2015/11/harrison-ford-Mill_3222044i.jpg>).

4. Add comments and suggested changes using the `+` sign beside the line numbers (the first time you do this will trigger the start of your code review). Need help? See [GitHub's how-to on reviewing proposed changes in a pull request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/reviewing-proposed-changes-in-a-pull-request).

5. After you have made all the comments and suggested changes, add a general comment for the code review, select "Request Changes" and submit your code review.



**When you are done with step #5, indicate so on the [sli.do](https://www.sli.do) poll (`#524-L03`).**


### Exercise: Accept suggested changes from a code review:

#### Steps:

1. Practice accepting code changes that you provided as suggestions by revisiting the Pull Requests tab of your copy of the repository and clicking on the pull request titled "Report most accomplished pilots". Scroll through the pull request comments and find the code suggestions. Then click on the "Commit suggestion" button for each suggestion.

2. Click on the "Show all reviewers" link beside the red "Changes requested" text. Then click on the `...` beside the reviewer and click "Approve changes".

3. Finally click on the green buttons ("Merge Pull Request" & "Confirm merge") to merge the pull request.

**When you are done with step #3, indicate so on the [sli.do](https://www.sli.do) poll (`#524-L03`).**

### Discussion: 

Was there anything we should have done differently with that code review?

*Hint: if I didn't tell you that the top pilot was Han Solo, how would you have known that?*

## Some common types of testing

- unit testing
- integration testing
- regression testing
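
To make the distinction concrete, here is a minimal Python sketch of one test of each type. The functions `clean_names` and `count_names` are hypothetical, and the "earlier bug" in the regression test is a made-up scenario used only for illustration.

```python
def clean_names(names):
    """Strip whitespace and lowercase each name (hypothetical helper)."""
    return [name.strip().lower() for name in names]


def count_names(names):
    """Count occurrences of each cleaned name (hypothetical helper)."""
    counts = {}
    for name in clean_names(names):
        counts[name] = counts.get(name, 0) + 1
    return counts


def test_clean_names_unit():
    # Unit test: checks one function in isolation.
    assert clean_names([" Han ", "LEIA"]) == ["han", "leia"]


def test_count_names_integration():
    # Integration test: checks that the two functions work together.
    assert count_names(["Han", " han", "Leia"]) == {"han": 2, "leia": 1}


def test_empty_input_regression():
    # Regression test: pins down the behaviour for an input that (in this
    # made-up scenario) once caused a bug, so the bug cannot silently return.
    assert count_names([]) == {}
```

The same function can be covered by all three kinds of test; what differs is the question each test answers.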

## Write unit tests for complex objects 
(e.g., data frames, models, plots)

Writing unit tests for a single value, vector or list is fairly straightforward from what we have learned in 511, but what about more complex objects? How do we write tests when our functions return:

- data frames?
- plot objects?
- model objects?

### General guidelines for testing data frames

- Where possible, use functions designed specifically for this (e.g., `dplyr::all_equal` and `pandas.DataFrame.equals`).
- If not possible, test for equality of important values (e.g., specific columns) and attributes (e.g., shape, column names, column types, etc.) using the `expect_*` functions inside of `test_that` in R, or via assertions in Python.

TO DO: example of tests for a data frame
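
A minimal sketch of both guidelines in Python, assuming pandas is installed. The function `add_total` and its toy data are hypothetical. `pandas.testing.assert_frame_equal` is an addition not named in the guidelines above; unlike `DataFrame.equals`, it raises an error describing where the two frames differ.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def add_total(df):
    """Return a copy of df with a `total` column (hypothetical function under test)."""
    out = df.copy()
    out["total"] = out["a"] + out["b"]
    return out


def test_add_total():
    # Toy data created inside the test keeps it fast and self-contained.
    toy = pd.DataFrame({"a": [1, 2], "b": [10, 20]})
    expected = pd.DataFrame({"a": [1, 2], "b": [10, 20], "total": [11, 22]})
    result = add_total(toy)

    # Whole-frame comparison with functions designed for this purpose.
    assert result.equals(expected)
    assert_frame_equal(result, expected)

    # Attribute-level checks, useful when exact equality is too strict.
    assert result.shape == (2, 3)
    assert list(result.columns) == ["a", "b", "total"]
    assert pd.api.types.is_integer_dtype(result["total"])

    # The input data frame should not be modified.
    assert "total" not in toy.columns
```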

### General guidelines for testing plot objects

- Initial tests should be designed to test that plots have expected attributes (e.g., expected mark, correct mapping to axes, etc)
- Once a desired plot is generated, visual regression tests can be used to ensure that further code refactoring does not change the plot function. Tools for this exist for R in the [`vdiffr`](https://github.com/r-lib/vdiffr) package. AFAIK the Python tools in this space have only been developed/used for GUI & web apps; perhaps they could be used for plots as well? I have not yet tried (nor found any examples of anyone who has).

TO DO: example of tests for a plot object
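
A minimal sketch of attribute-level plot tests in Python. It assumes Matplotlib as the plotting library (the guidelines above do not name one), and `scatter_plot` is a hypothetical function under test. It covers only the first guideline (expected attributes), not visual regression testing.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so tests run without a display
import matplotlib.pyplot as plt
from matplotlib.collections import PathCollection


def scatter_plot(x, y):
    """Return an Axes holding a labelled scatter plot (hypothetical function under test)."""
    fig, ax = plt.subplots()
    ax.scatter(x, y)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    return ax


def test_scatter_plot():
    ax = scatter_plot([1, 2, 3], [4, 5, 6])

    # Expected mark: a scatter plot adds exactly one PathCollection.
    assert len(ax.collections) == 1
    assert isinstance(ax.collections[0], PathCollection)

    # Expected mapping of the data to the x and y axes.
    offsets = ax.collections[0].get_offsets()
    assert offsets[:, 0].tolist() == [1, 2, 3]
    assert offsets[:, 1].tolist() == [4, 5, 6]

    # Expected axis labels.
    assert ax.get_xlabel() == "x"
    assert ax.get_ylabel() == "y"

    plt.close(ax.figure)  # free the figure once the test is done
```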

### General guidelines for testing model objects

- Initial tests should be designed to test that models have expected attributes and results
- Only secondarily might you want to compare to existing methods (the rationale for this being second: what if their tests are wrong? Or worse, what if they don't have any!)

TO DO: example of tests for a model object
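
A minimal sketch of the first guideline in Python, assuming scikit-learn and NumPy are installed (the guidelines above do not name a modelling library). The function `fit_line` is hypothetical. The toy data come from a known line, so the expected results can be stated without relying on another implementation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def fit_line(x, y):
    """Fit a simple linear regression (hypothetical function under test)."""
    X = np.asarray(x, dtype=float).reshape(-1, 1)
    return LinearRegression().fit(X, np.asarray(y, dtype=float))


def test_fit_line():
    # Toy data generated from y = 2x + 1, so the correct answer is known.
    x = [0, 1, 2, 3]
    y = [1, 3, 5, 7]
    model = fit_line(x, y)

    # Expected attributes.
    assert isinstance(model, LinearRegression)
    assert model.coef_.shape == (1,)

    # Expected results; floats are compared with a tolerance, not with ==.
    assert np.isclose(model.coef_[0], 2.0)
    assert np.isclose(model.intercept_, 1.0)
    assert np.allclose(model.predict([[4.0]]), [9.0])
```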

### But I have another type of object? How do I test it?

If you don't know where to start writing tests for the object you plan to use or return in your function, try the following:
- make such an object and interactively explore it
- look at other packages that have functions that return the same kind of object; what do they test for?
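
As a small sketch of the first suggestion, the snippet below explores an object and then turns what was observed into assertions. A `collections.Counter` from the standard library is used purely as a stand-in for "such an object".

```python
from collections import Counter

obj = Counter("banana")  # stand-in for the object you want to test

# What is it, and which public attributes and methods does it expose?
print(type(obj))
public_names = [name for name in dir(obj) if not name.startswith("_")]
print(public_names)


def test_counter_object():
    # Observations made while exploring become assertions.
    assert isinstance(obj, dict)          # Counter is a dict subclass
    assert obj["a"] == 3                  # values are counts
    assert "most_common" in public_names  # a method that may deserve its own test
```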

## How testthat works:

To run all tests in an R package that uses `testthat`, run the following from the R console with the working directory being set as the package's root directory:

```
devtools::test()
```

This command is a shortcut for `testthat::test_dir()`, and it runs all the files that live in `tests/testthat/` that start with `test`.

*Source: [R Packages, Chapter 10](https://r-pkgs.org/tests.html)*

### Organizing tests for your R package:

Tests are organised hierarchically: **expectations** are grouped into **tests** which are organised in **files**:

- An **expectation** is the atom of testing. It describes the expected result of a computation: Does it have the right value and right class? Does it produce error messages when it should? An expectation automates visual checking of results in the console. Expectations are functions that start with `expect_`.

- A **test** groups together multiple expectations to test the output from a simple function, a range of possibilities for a single parameter from a more complicated function, or tightly related functionality from across multiple functions. This is why they are sometimes called **unit** as they test one unit of functionality. A test is created with `test_that()`.

- A **file** groups together multiple related tests. Files are given a human readable name with `context()`.

*Source: [R Packages, Chapter 10](https://r-pkgs.org/tests.html)*

## How pytest works:

To run all tests in a Python package that uses `pytest`, run the following from the command line with the working directory being set as the package's root directory:

```
poetry run pytest
```

> Note: because we are using Poetry to build our packages, we need to prefix the pytest command with `poetry run` so that the tests are run in our package's virtual environment.

This command runs a recursive search (downward from the directory where the command is run) for files named `test_*.py` or `*_test.py`, which are imported by their test package name. From these files, it runs the functions whose names are prefixed with `test`.
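
The naming rules above can be sketched in a few lines of Python. This is only an illustration of the default conventions, not `pytest`'s actual implementation; the helper names `is_test_file` and `is_test_function` and the file names are hypothetical.

```python
from fnmatch import fnmatch


def is_test_file(filename):
    """Mimic the default file-name rule: test_*.py or *_test.py."""
    return fnmatch(filename, "test_*.py") or fnmatch(filename, "*_test.py")


def is_test_function(name):
    """Mimic the default function-name rule: names prefixed with `test`."""
    return name.startswith("test")


# Hypothetical contents of a tests directory.
files = ["test_stats.py", "stats_test.py", "helper_stats.py", "stats.py"]
collected = [f for f in files if is_test_file(f)]
print(collected)
```

Under these rules a file such as `helper_stats.py` is not collected, and neither is a function in a test file whose name does not start with `test`.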

### General guidelines for test data and helper functions:

- Keep your tests fast by creating toy data or using built-in data (e.g., `mtcars`) that you can feed to your unit tests. If you create toy data, do this within the unit test code block/function that uses that data.
- If your tests need to be DRY'ed out in R, then:
    - put your helper functions in a file in the `tests/testthat` directory and prefix the filename with `helper` so that they will be run before the tests.
    - store your data as a file in the `tests` directory.
- If your tests need to be DRY'ed out in Python, then:
    - put your helper functions in a file in the `tests` directory that uses `helper` in the filename to distinguish them from your user-facing functions. Do not use `test` in the name for these functions.
    - either store your data as a file in the `tests` directory or use [pytest fixtures](https://www.tutorialspoint.com/pytest/pytest_fixtures.htm) to generate data in a function before the tests are run
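
A minimal sketch of the fixture approach in Python, assuming `pytest` is installed. The helper `make_toy_data` and the fixture `toy_data` are hypothetical names; `pytest` passes the fixture's return value to any test function that lists `toy_data` as a parameter.

```python
import pytest


def make_toy_data():
    """Build a small data set shared by several tests (hypothetical helper)."""
    return {"x": [1, 2, 3], "y": [2, 4, 6]}


@pytest.fixture
def toy_data():
    # pytest calls this before each test that requests `toy_data`.
    return make_toy_data()


def test_lengths_match(toy_data):
    assert len(toy_data["x"]) == len(toy_data["y"])


def test_y_is_double_x(toy_data):
    assert toy_data["y"] == [2 * value for value in toy_data["x"]]
```

Because the helper is named `make_toy_data` rather than `test_...`, it is not collected as a test itself.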