# Unit tests

Unit testing is a fundamental practice in software development where individual units or parts of code are tested to ensure they work correctly. These tests help identify any issues or bugs in the code early on, making it easier to fix them. By testing each unit separately, developers can isolate problems and ensure that each piece of code functions as intended. This approach enhances the reliability and stability of the code. 

## PyTest

There are various Python testing tools available, but this lesson focuses on the testing framework PyTest. PyTest is a popular framework that simplifies writing and executing tests in Python. It provides a simple syntax for writing tests, features for test discovery and organisation, and support for fixtures.

### Organising and naming 
PyTest discovers tests using predefined naming conventions, so tests are found and run automatically as long as your files, functions, and classes follow those conventions.
 
<br></br>

If you don't specify any tests, PyTest starts looking for them in the current working directory and its subdirectories. <br></br>
<br>`poetry run pytest`</br>
<br></br>
You can also tell PyTest to search specific folders or run specific test files by passing them as command-line arguments. Within those folders, PyTest looks for files named `test_*.py` or `*_test.py`, which it imports by their test package name.<br></br>
<br>`poetry run pytest tests/unit_tests/test_practice.py`</br>

<br></br>

From these files, PyTest then collects the following test items:
1. Functions that start with `test_`.
2. Methods that start with `test_` inside classes whose names start with `Test` and that don't have an `__init__` method.
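
As a rough illustration of these conventions, the sketch below shows what a file such as `test_practice.py` might contain. The `double` function and all test names are hypothetical and exist only for this example.

```python
# Hypothetical contents of a file named test_practice.py

def double(x):
    """Hypothetical function under test."""
    return 2 * x


def test_double_positive():
    # Collected: a module-level function whose name starts with `test_`
    assert double(2) == 4


class TestDouble:
    # Collected: the class name starts with `Test` and it defines no __init__
    def test_double_zero(self):
        assert double(0) == 0


def check_double_negative():
    # Not collected: the name does not start with `test_`
    assert double(-1) == -2
```

Running `poetry run pytest` against this file should report two tests, because `check_double_negative` does not match the naming convention.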


## The <font color=#26A5B8>assert</font> statement

The `assert` statement is a built-in feature of Python that checks whether a given condition is true. If the condition is true, nothing happens and the test passes; if it is false, an `AssertionError` is raised and the test fails.


Notice that the last line of the error message shows <font color="red">AssertionError:</font> with no message after it. That is because no message was supplied; `assert` accepts an optional error message as a second argument.

The basic syntax for using <font color=#26A5B8>assert</font> is <br></br>
`assert condition_being_tested, error_message_to_be_displayed`
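
A minimal sketch of this syntax in action is shown below. The `check_age` function is hypothetical; it only serves to show where the optional message ends up when the condition is false.

```python
def check_age(age):
    """Hypothetical helper that rejects negative ages."""
    # The text after the comma becomes the AssertionError message
    assert age >= 0, f"age must be non-negative, got {age}"
    return age


check_age(30)  # condition is true, so nothing happens

try:
    check_age(-5)  # condition is false, so AssertionError is raised
except AssertionError as error:
    print(f"AssertionError: {error}")
    # prints "AssertionError: age must be non-negative, got -5"
```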

### Let's have a go at writing a test for a simple function
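
One possible sketch of such a test is given here, assuming a hypothetical `add_numbers` function. In this notebook the code would go in a cell that starts with `%%ipytest -qq`; in a project it would live in a file such as `test_practice.py`.

```python
def add_numbers(a, b):
    """Hypothetical function under test."""
    return a + b


def test_add_numbers():
    # Each assert checks one expected input/output pair
    assert add_numbers(2, 3) == 5
    assert add_numbers(-1, 1) == 0
```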

In [None]:
# This import is needed to be able to run pytest in this notebook 
import ipytest
ipytest.autoconfig()

In [None]:
%%ipytest -qq


Let's have a look at what it looks like when a test fails

In [None]:
%%ipytest -qq


### Exercise
Let's have a go at writing unit tests for a function that calculates BMI.

Write a unit test to check that the function for calculating BMI works as expected.
Test it with valid inputs as well as boundary cases where either height or weight is zero.

|Height| Weight| BMI|
|-------|-------|----|
|1.75 | 70 | 22.86|
|1.6| 60 | 23.44|
|0|70|None|
|1.8| 0| None|
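
One way to approach this is sketched below. The `calculate_bmi` function is hypothetical, since the lesson does not show it: it assumes height in metres and weight in kilograms, rounds to two decimal places, and returns `None` when either input is zero, which matches the table above.

```python
def calculate_bmi(height, weight):
    """Hypothetical BMI function: weight (kg) divided by height (m) squared.

    Returns None when height or weight is zero, as in the table above.
    """
    if height == 0 or weight == 0:
        return None
    return round(weight / height**2, 2)


def test_calculate_bmi_valid_inputs():
    # Values taken from the first two rows of the table
    assert calculate_bmi(1.75, 70) == 22.86
    assert calculate_bmi(1.6, 60) == 23.44


def test_calculate_bmi_zero_inputs():
    # Boundary cases: a zero height or weight should give None
    assert calculate_bmi(0, 70) is None
    assert calculate_bmi(1.8, 0) is None
```

Splitting valid inputs and boundary cases into two tests means a failure report points directly at the kind of input that broke.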

Can you write a test to check that the following function for categorising blood pressure works as expected?

In [None]:
# Function that categorises blood pressure 
def categorise_blood_pressure(systolic, diastolic):
    if (systolic > 120 and systolic < 139) or (diastolic > 80 and diastolic < 89):
        return "Prehypertension"
    elif systolic <= 120 and diastolic <= 80:
        return "Normal"
    else:
        return "Abnormal" 
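
A possible starting point for this exercise is sketched below. The function is copied unchanged from the cell above so that the block runs on its own; the test names and input values are illustrative choices, not part of the lesson.

```python
def categorise_blood_pressure(systolic, diastolic):
    # Copied unchanged from the cell above so this sketch is self-contained
    if (systolic > 120 and systolic < 139) or (diastolic > 80 and diastolic < 89):
        return "Prehypertension"
    elif systolic <= 120 and diastolic <= 80:
        return "Normal"
    else:
        return "Abnormal"


def test_normal_reading():
    assert categorise_blood_pressure(110, 70) == "Normal"


def test_normal_upper_boundary():
    # 120/80 sits exactly on the boundary of the "Normal" branch
    assert categorise_blood_pressure(120, 80) == "Normal"


def test_prehypertension_reading():
    assert categorise_blood_pressure(130, 85) == "Prehypertension"


def test_abnormal_reading():
    assert categorise_blood_pressure(150, 95) == "Abnormal"
```

Testing values on and around each threshold (for example 139 or 89) is a useful next step, because boundary values are where comparison operators such as `<` and `<=` are easiest to get wrong.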

### Alternatives to the basic <font color=#26A5B8>assert</font> statement 

The Pandas library provides dedicated assertion functions, available from `pandas.testing`, for comparing its own data structures: DataFrames, Series, and Indexes. These functions give more detailed and specialised comparisons than a bare `assert`, which helps us check that our data is processed correctly and stays consistent throughout our data analysis pipelines.

`assert_frame_equal` 
<br></br>
`assert_series_equal`
<br></br>
`assert_index_equal`
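
The sketch below shows all three functions being called, using a small made-up DataFrame whose column names and values are purely illustrative. Each function passes silently when the two objects match and raises an `AssertionError` describing the difference when they do not.

```python
import pandas as pd
from pandas.testing import (
    assert_frame_equal,
    assert_index_equal,
    assert_series_equal,
)


def test_pandas_assertions():
    # Two separately built but identical DataFrames (illustrative data)
    result = pd.DataFrame({"patient_id": [1, 2], "age": [34, 58]})
    expected = pd.DataFrame({"patient_id": [1, 2], "age": [34, 58]})

    assert_frame_equal(result, expected)                 # whole DataFrame
    assert_series_equal(result["age"], expected["age"])  # a single column
    assert_index_equal(result.index, expected.index)     # row labels only
```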


Imagine we have some patient data that includes comorbidity and any treatment they are receiving.
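
A minimal sketch of how this might look is given below. The patient data, the column names (`patient_id`, `comorbidity`, `treatment`), and the `filter_cancer_patients` function are all hypothetical stand-ins, since the lesson's actual data is not shown here.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def filter_cancer_patients(df):
    """Hypothetical filter: keep rows where the comorbidity is Cancer."""
    return df.loc[df["comorbidity"] == "Cancer", :]


def test_filter_cancer_patients():
    # Small hand-made input so the expected output is easy to write down
    patients = pd.DataFrame(
        {
            "patient_id": [1, 2, 3],
            "comorbidity": ["Cancer", "Diabetes", "Cancer"],
            "treatment": ["Chemotherapy", "Insulin", "Radiotherapy"],
        }
    )
    # Filtering with .loc keeps the original row labels, hence index=[0, 2]
    expected = pd.DataFrame(
        {
            "patient_id": [1, 3],
            "comorbidity": ["Cancer", "Cancer"],
            "treatment": ["Chemotherapy", "Radiotherapy"],
        },
        index=[0, 2],
    )
    assert_frame_equal(filter_cancer_patients(patients), expected)
```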

In [None]:
# Write a function that filters for Cancer patients 
# HINT: Use df.loc


### Exercise 
A client has asked for the total number of two-week-wait (2WW) referrals across all CCGs for the week commencing 6 April 2020. Write a unit test to check that you have filtered your data correctly.

1. Read in the data from the dataset: https://raw.githubusercontent.com/carnall-farrar/python_club/master/data/referrals_oct19_dec20.csv

2. Take a subset of the data where the specialty is two-week-wait (2WW) cancer referrals and the week start is 2020-04-06

3. Write a test to check your function is correctly filtering for the desired specialty and time period.

4. Write another test to assess that the function calculating the sum of referrals is doing what we would expect.

In [None]:
import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/carnall-farrar/python_club/master/data/referrals_oct19_dec20.csv")

In [None]:
df.head()

In [None]:
def filter_2WW_and_date(df):
    mask = (df['specialty'] == '2WW') & (df['week_start'] == '2020-04-06')
    df = df.loc[mask, :]
    return df

def get_total_2WW_referals(df):
    total_referrals = df['referrals'].sum()
    return total_referrals

In [None]:
# get_total_2WW_referals()
filter_2WW_and_date(df)

In [None]:
# Write a unit test to check the filter function 
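
One possible sketch of these tests is shown below. The two functions are copied unchanged from the cell above so that the block runs on its own. The sample data is made up: it uses only the column names that appear in those functions, and `"Cardiology"` is a hypothetical specialty value chosen to give the filter something to exclude.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def filter_2WW_and_date(df):
    # Copied unchanged from the cell above
    mask = (df['specialty'] == '2WW') & (df['week_start'] == '2020-04-06')
    df = df.loc[mask, :]
    return df


def get_total_2WW_referals(df):
    # Copied unchanged from the cell above
    total_referrals = df['referrals'].sum()
    return total_referrals


def make_sample_referrals():
    """Small hand-made dataset (hypothetical values) for testing."""
    return pd.DataFrame(
        {
            "specialty": ["2WW", "2WW", "Cardiology", "2WW"],
            "week_start": ["2020-04-06", "2020-04-13", "2020-04-06", "2020-04-06"],
            "referrals": [10, 20, 30, 5],
        }
    )


def test_filter_2WW_and_date():
    result = filter_2WW_and_date(make_sample_referrals())
    # Only rows 0 and 3 match both the specialty and the week start
    expected = pd.DataFrame(
        {
            "specialty": ["2WW", "2WW"],
            "week_start": ["2020-04-06", "2020-04-06"],
            "referrals": [10, 5],
        },
        index=[0, 3],
    )
    assert_frame_equal(result, expected)


def test_get_total_2WW_referals():
    filtered = filter_2WW_and_date(make_sample_referrals())
    assert get_total_2WW_referals(filtered) == 15  # 10 + 5
```

Testing against a tiny hand-made DataFrame rather than the downloaded CSV keeps the tests fast and makes the expected result easy to verify by eye.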
