# M8: Final Practice on a Heart Disease Dataset

Congratulations, you have made it to the final module! Throughout the course, you have covered the fundamental knowledge and packages needed to apply Python in bioinformatics.

The aim of this module is to help you consolidate what you have learned. We will introduce a new dataset for you to analyse and explore. You'll be given a series of exercises designed to be completed independently, with minimal external assistance.

If you find yourself stuck, make sure to give it a proper attempt on your own first. If that doesn’t resolve the issue, revisit earlier modules to refresh your memory. And if you’re still unsure, feel free to use Google. In fact, knowing how to look up code or functions to accomplish a task is an important skill in itself. As a last resort, you can consult the solution sheet.

---
---

## Loading the Dataset

The dataset that we will be using is the *Heart Disease Dataset* from the UC Irvine Machine Learning Repository: https://archive.ics.uci.edu/dataset/45/heart+disease

This dataset is intended for a machine learning task: given a set of patient features, the goal is to predict whether the patient has been diagnosed with heart disease (the target variable). While building such a model is beyond the scope of this course, you will conduct some initial exploratory analyses to become familiar with the dataset and its contents.

These exploratory analyses are often an essential first step, regardless of whether your aim is to develop a machine learning model or to pursue other kinds of investigation.

---

### Exercise 1: Understanding the Dataset

*Take a few minutes to read through the dataset description on the website to familiarise yourself with its structure and the variables it contains.*

---

Great! Now that you've had a look at the dataset description, let's dive into the data itself.

Don't worry if you didn't understand everything — things should become clearer as you familiarise yourself with the dataset through practice.

At the top right-hand side of the dataset's website, you'll find a button labelled `IMPORT IN PYTHON`. Clicking on it will show you which package to install and how to load the dataset.


![Overview of the website](./pictures/websitescreenshot_overview.png)


![Import overview](./pictures/websitescreenshot_import.png)


```{admonition} Tip
:class: tip  

For the first part of the exercises, you mainly just need to follow the example code from the website.

Take your time going through the initial code: it will help you start exploring the dataset and understand what is provided and how to work with it.
```

---

### Exercise 2: Import the Dataset Using the UCI Package

*Install the required package (using any of the methods you've learned) and import it into your code.*

In [13]:
# Write your code here.

---

### Exercise 3: Explore the Structure of the Dataset Object

1. *Create a Heart Disease dataset object using `fetch_ucirepo(id=45)`.*
2. *Check the **type** of the dataset object you've created.*
3. *Use `dir()` on the object to list its attributes.*
4. *Try accessing `.metadata`, `.variables`, and `.data`. What kind of information do they contain? What type are they?*

*This will help you understand how the dataset is structured and how to navigate it.*

In [15]:
# Write your code here.

---

### Exercise 4: Summarise the Metadata

*Print the metadata information about the dataset. Try to answer the following questions based on what you find:*
1. *How many data points (patients) and how many features are there?*
2. *What are the names of the demographic features?*
3. *What is the name of the target variable?*
4. *Are there any missing values in the dataset? How can you identify them?*

In [17]:
# Write your code here.

---

### Exercise 5: Inspect Variable Details and Metadata

1. *Save the feature information from `dataset.variables` in a new variable.*
2. *Print the variable information and its type.*
3. *Which variables have missing values?*
4. *What is the unit of **resting blood pressure**?*
5. *How many categorical variables are there?*

In [19]:
# Write your code here.

---

### Exercise 6: Convert Data to DataFrame Format for Exploration

1. *Extract the feature values and the target values from the dataset into two separate variables. Print each of them.*
2. *Do the numbers of rows and columns make sense?*
3. *Are you able to understand what information they contain based on the metadata exploration? If not, revisit the website, metadata, and variable information to clarify.*
4. *Import the `pandas` library.*
5. *Combine the feature and target values into a single pandas DataFrame, where each row represents a patient and each column represents a feature, with the final column being the target variable.*
6. *Display the first five rows of the resulting DataFrame.*

```{dropdown} Tip
:class: tip

If you're unsure how best to combine the features and target, start by checking their types. You'll see that both are already pandas objects, so you'll need to use a pandas function to combine them into the required layout. This should only take one line of code.

Whenever you're unsure how to manipulate the data, your first step should always be to check its type. This will help you better understand what operations are available.
```
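
As a generic illustration of the tip above, here is a minimal sketch on made-up data (not the heart disease dataset). The names `toy_features` and `toy_target` are hypothetical stand-ins, and `pd.concat` is only one of several pandas functions that could do this job.

```python
import pandas as pd

# Hypothetical stand-ins for a features table and a target table
toy_features = pd.DataFrame({"height_cm": [170, 182, 165], "weight_kg": [68, 90, 55]})
toy_target = pd.DataFrame({"label": [0, 1, 0]})

# Step 1: check the types to see which operations are available
print(type(toy_features), type(toy_target))

# Step 2: both are DataFrames, so a pandas function can join them side by side
# (axis=1 places the columns next to each other; rows are aligned on the index)
toy_combined = pd.concat([toy_features, toy_target], axis=1)
print(toy_combined.head())
```

Because the rows are aligned on the index, this only works as intended when both tables share the same row index.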

In [21]:
# Write your code here.

---
---

## Data Cleaning

Now that we have everything in a single, tidy DataFrame, we need to make sure the data is properly cleaned before we begin analysing it.

Let's start with missing values. From the metadata, we already know that the features `ca` and `thal` contain missing values.

---

### Exercise 7: Investigating Missing Values in the Dataset

*To decide how to handle these missing values, we first need a more detailed understanding. For both `ca` and `thal`, print the following information:*
1. *The full column of values.*
2. *How many `NaN` values the column contains.*
3. *The number of unique values in the column, and what those values are.*
4. *The type of each feature (use the variable information DataFrame to check this).*
5. *Based on these results, think about what the best way to handle the missing data is.*

In [23]:
# Write your code here.

---

You should notice that `ca` contains integer values, while `thal` is categorical, with four and three unique values respectively (excluding `NaN`). For this reason, it wouldn’t make sense to replace the missing values with the mean. Instead, we could replace them with the mode (the most common value for that feature). However, since at most six of the 303 patients have missing values (four for `ca` and two for `thal`), we can simply remove the patients (rows) with any missing (`NaN`) values without losing much data.
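
To make the two options concrete, here is a small sketch on a made-up table. The column names `vessels` and `category` are hypothetical and only loosely mimic `ca` and `thal`.

```python
import numpy as np
import pandas as pd

# Toy table with one missing value in each column
toy = pd.DataFrame({
    "vessels": [0, 1, np.nan, 0, 2, 0],
    "category": [3, 7, 3, np.nan, 3, 6],
})

# Count the missing values per column
print(toy.isna().sum())

# Option 1: fill each column with its most common value (the mode)
toy_filled = toy.fillna(toy.mode().iloc[0])

# Option 2: drop every row that has at least one missing value
toy_dropped = toy.dropna()

print(toy_filled)
print(len(toy), "rows before,", len(toy_dropped), "rows after dropping")
```

Filling keeps all rows but invents values, whereas dropping keeps only observed values at the cost of fewer rows.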

---

### Exercise 8: Removing Rows with Missing Values

1. *Remove all rows (patients) that contain any `NaN` values.*
2. *Reset the row index after dropping the rows with missing data.*
3. *Print the number of NaN values in each column to confirm that none remain.*
4. *Print the shape of the DataFrame and check whether the number of rows makes sense.*

In [25]:
# Write your code here.

---

Great! Now you should have no missing values in your dataset.

A different issue that can sometimes occur is data being entered incorrectly, resulting in duplicate patients or duplicate features (i.e. rows or columns with exactly the same values). We need to check that this isn’t the case in our dataset.
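
As a rough sketch of what such a check can look like, the toy table below (invented for illustration) contains one repeated row and one repeated column. Transposing the table to reuse the row check for columns is one possible approach, not the only one.

```python
import pandas as pd

toy = pd.DataFrame({
    "a": [1, 2, 2, 3],
    "b": [5, 6, 6, 7],
    "a_copy": [1, 2, 2, 3],  # same values as column "a"
})

# duplicated() marks a row as True if an identical row appeared earlier
duplicate_rows = toy.duplicated()
print(duplicate_rows.sum(), "duplicate row(s)")

# Transposing turns columns into rows, so the same check works for columns
duplicate_cols = toy.T.duplicated()
print(duplicate_cols[duplicate_cols].index.tolist())
```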

---

### Exercise 9: Checking for Duplicate Rows and Columns

1. *Identify any duplicate rows.*
2. *Print the number of duplicate rows.*
3. *Write an `if` statement that prints and removes the duplicate rows, if any are found.*
4. *Repeat the same process for duplicate columns.*

In [27]:
# Write your code here.

---

As a final step, we want to ensure that the target variable (`num`) is in the correct format.

According to the website, it should take five possible values: 0, 1, 2, 3, and 4.
A value of 0 indicates no heart disease, while values 1–4 represent different categories of heart disease.

For your analyses, you may only be interested in whether the patient has the disease, not which category. This is often the case in binary classification tasks, or when you simply want to keep the analysis straightforward. In such cases, the target values need to be converted into binary:
- 0 → no heart disease  (False / 0)
- 1-4 → heart disease present (True / 1)
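
The mapping above can be sketched on a made-up Series (the name `toy_scores` is hypothetical). The same idea applies to a DataFrame column.

```python
import pandas as pd

toy_scores = pd.Series([0, 2, 0, 4, 1], name="score")

# Map 0 to 0 and any non-zero value to 1
toy_binary = toy_scores.apply(lambda value: 0 if value == 0 else 1)
print(toy_binary.tolist())
```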

---

### Exercise 10: Create a Binary Target Variable

1. *Check the type of `num` and its unique values, as you did in earlier exercises, to confirm that the data matches the description.*
2. *Create a new column at the end of the DataFrame called `heart_disease_binary`. This column should contain 0 if `num` is 0, and 1 otherwise. Use a lambda function to perform this transformation.*
3. *Print the final DataFrame to verify the result.*

In [29]:
# Write your code here.

---
---

## Exploratory Data Analysis (EDA)

Now that the dataset is clean, we can begin exploring the different features. Visualisation is a key part of **exploratory data analysis (EDA)**: it helps us detect patterns, identify outliers, and better understand the structure of our data — all of which can inform future modelling decisions.

We'll start by plotting the **individual features** to understand their distributions. Depending on the type of feature, different plots and analysis methods are needed. There are two main types:
- Numerical (quantities and measurements)
- Categorical (groups, types, labels)

Note that this type may not always align perfectly with the type Python assigns to a value. We will refer to whether the feature is numerical or categorical as its **feature type**, and the type you get from Python when using `type()` as its **Python type**.
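
The following sketch, on invented data with hypothetical column names, shows how the Python type and the feature type can diverge: `pain_code` is stored as integers, yet its values are labels rather than measurements.

```python
import pandas as pd

toy = pd.DataFrame({
    # A measurement: many distinct real values
    "cholesterol": [233.0, 286.5, 229.1, 250.3, 204.8, 236.4, 268.9, 354.2],
    # Integer codes standing in for categories
    "pain_code": [1, 4, 4, 3, 2, 4, 1, 3],
})

# Python/pandas types: one float column and one integer column
print(toy.dtypes)

# Number of distinct values per column: a small count hints at categories
print(toy.nunique())
```

Neither output settles the question on its own, which is why the variable descriptions remain the most reliable guide.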

**Numerical Features**

These represent measurable quantities, and they can be either:

| Type        | Description                         | Example from Dataset         | Visualisation         |
|-------------|-------------------------------------|------------------------------|-----------------------|
| Continuous  | Can take any value within a range   | `chol` (serum cholesterol), `oldpeak` (ST depression)           | Histogram             |
| Discrete    | Countable whole numbers             | `ca` (number of major vessels)     | Histogram or bar plot |

Even though both feature types are numeric, you might treat them differently in analysis. For example, **standardisation** usually makes sense for continuous variables but not for small-range discrete variables like `ca`, nor for integer-coded categorical variables like `slope` (slope of the peak exercise ST segment).

```{admonition} Continuous vs Discrete – Real vs Representation
:class: note

Some variables, like **age**, may appear **discrete** in a dataset (e.g. whole years),  
but they are **inherently continuous** — people can be 18.5 or 73.2 years old.  
Because it spans a wide range and behaves like a measurement, **age is typically analysed as continuous**.

In contrast, features like **number of vessels (`ca`)** are **truly discrete**:  
you can't have 2.5 vessels — it's either 2 or 3. These are counts and must be treated accordingly.

- Discrete features with **many unique values** (like age) are often analysed as continuous.  
- Truly discrete, **count-based** features are handled differently, especially in statistical models.
```

**Categorical Features**

These represent groups or labels. They can be:

| Type     | Description                           | Example from Dataset                                                                        | Visualisation  |
|----------|---------------------------------------|---------------------------------------------------------------------------------------------|----------------|
| Nominal  | No natural order                      | `sex` (biological sex), `cp` (chest pain type), `thal` (thalassemia: 3 = normal, 6 = fixed defect, 7 = reversible defect)                                              | Bar plot       |
| Ordinal  | Ordered categories                    | `slope` (slope of ST segment: up, flat, down) | Bar plot       |

Just because a feature is stored as an integer doesn't mean its **feature type** is numerical. Many **categorical variables** are encoded as integers (e.g., `sex`, `cp`, `slope`, `thal`), but these values represent **categories**, not measurements.

```{admonition} Identifiers
:class: note

Some biomedical datasets include identifiers (e.g., patient ID or record number). These are usually:
- Unique to each row
- Not useful for prediction
- Best excluded from plots and modelling
```

Always check the variable and its description to determine whether it is categorical or numerical. You can use several types of information to help guide your decision:

-	Use `.dtypes` on the DataFrame:
    - `float64` → likely continuous
    - `int64` → could be either discrete or categorical

-	Use `.nunique()` on the DataFrame:
    - Few unique values (e.g., 2–4) → probably categorical
    - Many unique values (30+) → likely continuous

-	Use the variable descriptions or metadata:
    - They state whether the variable is "Categorical" or "Integer"
    - Does it describe a **measurement** (e.g., blood pressure)? → Numerical
    - Does it describe a **group or type** (e.g., chest pain type)? → Categorical
    
If you are unsure, you can always try plotting both a histogram and a bar plot, and see which one better reflects the nature of the data.

---

### Exercise 11: Classify Each Variable by Type

*Identify the type of each feature. This will help guide your choice of visualisation, transformation, or statistical method later. Consider the following variable types:*
- **Continuous numerical** – real-valued measurements with many unique values
- **Discrete numerical** – count-based integers
- **Nominal categorical** – unordered categories
    - *Binary* – a special case with exactly two values
- **Ordinal categorical** – ordered categories
- **Identifier / metadata** – e.g., patient IDs (not used in modelling)

*(Not all types may be present in this dataset)*

*Create a list of categorical and numerical features and assign the feature names accordingly (you can ignore subtypes such as binary or ordinal for now).*

```{admonition} Target variables in EDA
:class: note
Although `heart_disease_binary` is a target variable, you can include it (and the non-binary version `num`) in the classification for exploratory analysis. Just keep in mind that it won’t be used as a feature during modelling.
```

In [31]:
# Write your code here.

---

Now that you've classified the type of each feature, you can start visualising them.

We'll begin with **categorical variables**, which are best visualised using **bar plots**. A bar plot shows how many times each category appears in the dataset. This helps you:
- Understand **class balance** (e.g., how many patients are male vs female)
- Identify **rare or dominant categories**
- Spot **potential modelling issues** like highly imbalanced features

Bar plots are particularly helpful for **binary** and **low-cardinality** features, that is, those with just a few unique values. They give you a quick overview of how the categories are distributed, which can inform later modelling choices.

---

### Exercise 12: Visualise Categorical Feature Distributions

1. *Import `matplotlib.pyplot` as `plt`, and import `seaborn`.*
2. *Create a function to generate a bar plot for a categorical feature. Include labels, a legend, and a title. You can use either `.countplot()` from `seaborn`, or extract the value counts from the DataFrame and use `.plot()` on them.*
3. *Loop through the categorical feature list to create a bar plot for each feature.*

In [34]:
# Write your code here.

---

Let's now do the same for the **continuous numerical features** you identified earlier. 

We will use **histograms** as they are ideal for:
- Understanding the **shape** of the distribution (e.g., symmetric, skewed, bimodal)
- Identifying **outliers** or unusual values
- Comparing **ranges and spread** between features

---

### Exercise 13: Visualising Numerical Feature Distributions

1. *Create a function to generate a histogram for a numerical feature — include labels, a legend (if appropriate), and a title. You can use either `.histplot()` from `seaborn` or `.hist()` from `matplotlib`.*
2. *Loop through the list of numerical features to create a histogram for each of them.*

```{dropdown} Tip
:class: tip

Use `bins=20` and `edgecolor='black'` with either `seaborn` or `matplotlib` to make your histograms easier to read and visually appealing.
```

In [36]:
# Write your code here.

---

### Exercise 14: Detecting Potential Outliers with Boxplots

Histograms are great for understanding overall shape, but they can sometimes hide extreme values (outliers). A boxplot shows the median, spread, and any unusually high or low points — which can help detect issues early.

In clinical data, outliers are often real and meaningful, such as a patient with very high cholesterol. However, it's still worth checking them!
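
To see what a boxplot treats as an "unusually high or low point", the sketch below applies the 1.5 × IQR rule, which is the default whisker rule in both `matplotlib` and `seaborn` boxplots. The values are invented for illustration.

```python
import pandas as pd

# Made-up measurements with one extreme value
toy_values = pd.Series([180, 195, 200, 210, 215, 220, 230, 240, 250, 560])

# Interquartile range: the spread of the middle 50% of the data
q1 = toy_values.quantile(0.25)
q3 = toy_values.quantile(0.75)
iqr = q3 - q1

# Points beyond these limits are drawn as individual markers in a boxplot
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

flagged = toy_values[(toy_values < lower_limit) | (toy_values > upper_limit)]
print(flagged.tolist())
```

A flagged value is only a candidate for inspection. Whether it is an error or a genuine clinical finding is a separate judgement.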

**Task:**
1. *Create a boxplot for each numerical feature (not stratified by group).*
2. *Use `sns.boxplot()` in a loop, one subplot per feature.*
3. *Add titles and axis labels.*
4. *Look for skewed distributions or values that might need special attention later.*

In [38]:
# Write your code here.

---

For now, you've only explored the distribution of features across the **entire dataset**. However, it's often useful to examine **subgroups** to understand how they differ. For example, male and female patients may exhibit different patterns in how diseases present. Let's investigate whether any feature distributions vary between male and female patients. This can help identify biological or physiological differences, highlight features that may benefit from sex-specific modelling, and determine whether standardisation should be stratified (i.e., performed within each subgroup rather than across the full dataset).

```{admonition} Unequal group sizes can affect your interpretations
:class: warning

In our dataset, around **68% of patients are male** and only **32% are female**. 

This imbalance can affect how we interpret feature distributions. For example, if a certain chest pain type appears more frequently in males, it might simply reflect the larger number of male patients — not a true difference in risk or symptoms.
```
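
One way to guard against this, sketched here on invented data with hypothetical column names, is to compare proportions within each group instead of raw counts.

```python
import pandas as pd

# Toy data: group "m" is twice as large as group "f"
toy = pd.DataFrame({
    "group": ["m"] * 8 + ["f"] * 4,
    "symptom": ["yes"] * 4 + ["no"] * 4 + ["yes"] * 2 + ["no"] * 2,
})

# Raw counts: "m" has more "yes" cases simply because the group is larger
counts = pd.crosstab(toy["group"], toy["symptom"])

# Row-wise proportions: each row sums to 1, so the groups become comparable
proportions = pd.crosstab(toy["group"], toy["symptom"], normalize="index")

print(counts)
print(proportions)
```

In this toy table the counts differ between the groups, while the proportions are identical.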

---

### Exercise 15: Compare Categorical Distributions by Sex

1. *Create a plot with as many subplots as there are categorical features.*
2. *Loop through each of the categorical features.*
3. *Within the loop: Create a bar plot for each feature, stratified by sex. You should show one bar per category **within each sex group**  - include axis labels, a legend, and a title. You can use either `.countplot()` from seaborn, or compute the counts manually and use `.plot()`.*

```{dropdown} Tip
:class: tip
- Use `enumerate(categorical_list)` when looping through the list to access both the index and the feature name. The index is useful for selecting the correct subplot, while the name is needed for labelling.
- Set `hue='sex'` in `.countplot()` to stratify the bars by sex.
- Alternatively, use `pd.crosstab()` or `df.groupby('sex')[feature].value_counts()` to compute subgroup counts if you prefer to use `.plot()`.
```

In [40]:
# Write your code here.

---

### Exercise 16: Compare Numerical Features by Sex

1. *Create a plot with as many subplots as there are numerical features.*
2. *Loop through each of the numerical features.*
3. *Within the loop: Plot a subplot of overlaid histograms for male and female patients for that feature. Include axis labels, a legend, and a title. You can use either `.histplot()` from `seaborn` or `.hist()` from `matplotlib`.*

```{dropdown} Tip
:class: tip  
- Set `hue='sex'` in `.histplot()` to have it **stratified** by sex.  
- If using `matplotlib`, create two histograms in the same subplot (one for each sex) by selecting the appropriate subset from the main dataframe.  
```
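
The `matplotlib` route described in the tip can be sketched as follows. The data and the names `group` and `value` are invented, and the use of shared bin edges is a choice made here so that the two histograms are directly comparable.

```python
import matplotlib.pyplot as plt
import pandas as pd

toy = pd.DataFrame({
    "group": ["a"] * 6 + ["b"] * 6,
    "value": [50, 52, 55, 57, 60, 61, 58, 62, 64, 66, 70, 71],
})

# Shared bin edges (50, 55, ..., 75) so both groups use the same bins
bin_edges = list(range(50, 80, 5))

fig, ax = plt.subplots()
for group_name in ["a", "b"]:
    # Select the rows belonging to one group, then plot that subset
    subset = toy.loc[toy["group"] == group_name, "value"]
    ax.hist(subset, bins=bin_edges, alpha=0.5, edgecolor="black",
            label=f"group {group_name}")

ax.set_xlabel("value")
ax.set_ylabel("count")
ax.set_title("Toy example: overlaid histograms by group")
ax.legend()
```

Setting `alpha` below 1 makes the bars semi-transparent so that overlapping regions stay visible. In a script, as opposed to a notebook, you would call `plt.show()` afterwards.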

In [42]:
# Write your code here.

---

### Exercise 17: Boxplots of Numerical Features by Sex

*In the previous exercise, you used **stratified histograms** to explore how the distribution of each numerical variable differs between males and females. Now, you'll use **boxplots** to visualise these differences. While histograms are better for understanding the full shape of a distribution (e.g., unimodal, skewed, bimodal), boxplots provide a cleaner summary of central tendency and variability — especially for comparing groups.*

**Task:**
1. *Create a plot with as many subplots as there are numerical features.*
2. *Loop through each of the numerical features.*
3. *Within the loop: Plot a subplot of a boxplot comparing the distribution for male and female patients for that feature — include labels, a legend, and a title. Use `.boxplot()` from `seaborn`, similar to how you used `.histplot()` previously.*

In [44]:
# Write your code here.

---

The next step after looking at the distributions of the features for the whole dataset or subgroups is to start exploring the relationship between two features. A **correlation matrix** is a powerful way to do this for two continuous features since it can be used to:
- Identify **linear relationships** (positive or negative) between features
- Detect **redundant features** (strong correlation between features means similar information)
- Understand which variables may interact in modelling
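
A tiny invented example may help with reading such a matrix: `x_doubled` is a rescaled copy of `x` (a redundant feature), while `y` tends to fall as `x` rises.

```python
import pandas as pd

toy = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "x_doubled": [2, 4, 6, 8, 10],  # carries the same information as x
    "y": [5, 3, 4, 1, 2],           # tends to fall as x rises
})

# Pearson correlation is the default method of .corr()
toy_corr = toy.corr()
print(toy_corr.round(2))
```

The entry for `x` and `x_doubled` is 1, the signature of a redundant pair, and the entry for `x` and `y` is negative. The diagonal is always 1 because every feature is perfectly correlated with itself.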

---

### Exercise 18: Exploring Relationships Between Numerical Features

1. *Create a new pandas DataFrame containing only the numerical features.*
2. *Use `.corr()` on the new DataFrame to compute the Pearson correlation matrix and save it in a new variable `corr_matrix`.*
3. *Plot the `corr_matrix` as a heatmap using `seaborn` - include annotation, labels, a legend, and a title.*
4. *Interpret the heatmap:*
   - *Which features are positively or negatively correlated?*
   - *Are any relationships strong ($|r| > 0.6$)?*
   - *Do any pairs look redundant or surprisingly independent?*

```{admonition} Interpreting correlations carefully
:class: note

Some correlations are expected — especially when features measure **closely related aspects** of the same thing. For example, resting **systolic** and **diastolic** blood pressure would naturally be correlated.

This isn't necessarily a problem, but when building models, including highly correlated features can cause issues like **multicollinearity**. It's a good reminder that interpreting correlation matrices also requires a bit of **domain knowledge**, not just statistical tools.
```

In [46]:
# Write your code here.

```{dropdown} Tip
:class: tip
- Set `annot=True` inside `seaborn.heatmap()` to display the correlation values on the plot.
- Set `cmap='coolwarm'` for an intuitive red–blue colour scheme.
```

---

We can't always trust the results from the Pearson correlation. Some of our continuous features are **right-skewed**, contain **outliers**, and might have **non-linear** relationships. These properties conflict with the linearity and approximate normality that Pearson correlation is usually assumed to rely on, which may lead to underestimating non-linear relationships or to high sensitivity to outliers.

Instead, you can compute the **Spearman correlation**, which is based on **ranks** rather than raw values. This makes it more **robust to outliers and skewness**, and better suited for detecting **monotonic** relationships, including non-linear ones.
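
The difference can be seen on a deliberately artificial example in which `y` always increases with `x` but not along a straight line. The data are invented purely for illustration.

```python
import pandas as pd

toy = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6]})
toy["y"] = toy["x"] ** 4  # rises with x every time, but far from a straight line

# Pearson works on the raw values, Spearman on their ranks
pearson_r = toy.corr(method="pearson").loc["x", "y"]
spearman_r = toy.corr(method="spearman").loc["x", "y"]

print("Pearson:", round(pearson_r, 3))
print("Spearman:", round(spearman_r, 3))
```

Spearman reports a perfect monotonic relationship (1.0) because the ranks of `x` and `y` agree exactly, while Pearson comes out lower because the points do not lie on a line.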

---

### Exercise 19: Spearman Correlation

1. *Use `.corr(method='spearman')` to compute the Spearman correlation matrix of the numerical dataframe.*
2. *Plot it as a heatmap (just like before).*
3. *Compare it to your Pearson matrix:*
   - *Which pairs change the most?*
   - *Are there any new strong relationships?*
   - *Do any associations disappear?*

*Seeing how these two methods differ gives you a deeper understanding of your data and can help you decide whether some features need **transformation** or whether **nonlinear methods** are more appropriate downstream.*

In [48]:
# Write your code here.

---

You've explored relationships between features. Now it's time to ask which features differ most between patients with and without heart disease. This is a crucial part of exploratory analysis and feature selection. You'll stratify patients based on the binary outcome: `heart_disease_binary` (0 = no disease, 1 = disease).

We're not only interested in **visualising** how distributions differ depending on whether a patient has heart disease, but also in determining whether these differences are statistically significant using **statistical tests**. Statistical tests return **p-values**. A **low p-value** (e.g., < 0.05) suggests that the feature is significantly associated with heart disease. If the result is not significant, the feature may not be useful on its own, or it might only show an effect when combined with other features.

- For **numerical features**, we typically use a **t-test**, which checks if the means of the feature differ significantly between patients with and without heart disease.

- If the feature distribution is **non-normal**, we use the **Mann–Whitney U test**, a non-parametric alternative.

- For **categorical features**, we use a **chi-squared test of independence**, which checks whether category counts differ significantly between the two patient groups.
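
To give a feel for what the two numerical tests return, here is a minimal sketch on two invented groups of measurements. It is not a substitute for reading the documentation, and the group names are hypothetical.

```python
from scipy.stats import mannwhitneyu, ttest_ind

# Two made-up groups of measurements that clearly differ
group_without = [1.0, 2.0, 3.0, 4.0, 5.0]
group_with = [6.0, 7.0, 8.0, 9.0, 10.0]

# Independent two-sample t-test: compares the group means
t_result = ttest_ind(group_without, group_with)

# Mann-Whitney U test: compares the groups using ranks
u_result = mannwhitneyu(group_without, group_with, alternative="two-sided")

# Both result objects expose the p-value as an attribute
print("t-test p-value:", t_result.pvalue)
print("Mann-Whitney U p-value:", u_result.pvalue)
```

Both p-values fall below 0.05 here, which is expected given that the two toy groups do not overlap at all.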

This part is a bit more advanced than what we've done so far — it will require you to look up and interpret functions yourself. This is an important skill to develop, since in real projects you'll often need to explore new functions or packages for specific tasks. A great starting point is to Google the function name along with keywords like *“python”* or the package name. In this case, check out the `scipy` package documentation:
https://docs.scipy.org/doc/scipy/index.html.

---

### Exercise 20: Explore Numerical Feature Relationships with the Target Variable

1. *For each numerical feature:*
    - *Create a boxplot grouped by `heart_disease_binary`.*
    - *Optionally: also create stratified histograms or violin plots.*
2. *Observe:*
    - *Which variables show clear separation between the two classes?*
    - *Are there noticeable shifts in distribution or outliers?*
    - *Which features might be informative for classification?*

*This step helps you identify potential predictors of heart disease, build intuition about what differences exist between groups, and sets the stage for later statistical testing or modelling.*

In [50]:
# Write your code here.

---

### Exercise 21: Statistical Testing of Numerical Feature Relationships with the Target Variable

1. *If needed, install `SciPy`.*
2. *Import `ttest_ind` and `mannwhitneyu` from `scipy.stats`.*
3. *Split the dataframe into patients **with** (`heart_disease_binary` = 1) and **without** (`heart_disease_binary` = 0) heart disease.*
4. *Look at the documentation for `ttest_ind` and `mannwhitneyu` online to understand:*
    - *Required input formats (e.g., two arrays of values)*
    - *How to extract the p-value from each test.*
5. *For each numerical feature, perform:*
    - *An **independent two-sample t-test***
    - *A **Mann–Whitney U test***
6. *Print and interpret the p-values:*
    - *Which features show statistically significant differences?*
    - *Do the conclusions differ between the t-test and Mann–Whitney U test?*

In [52]:
# Write your code here.

```{admonition} Reminder: Check the distribution shape
:class: tip

When results **disagree**, it often signals a violation of assumptions — such as **skewness** or **outliers** — which makes the **non-parametric test** more trustworthy. Use your earlier **histograms and boxplots** to decide which test is more appropriate.

- If a feature is **skewed** or contains **many outliers**, the **Mann–Whitney U test** is generally more reliable.
- If a feature is **symmetric** and approximately **normally distributed**, the **t-test** is typically suitable.

This ensures you're applying valid statistical assumptions and makes your results more reliable. Using **both tests** can also strengthen your conclusions — especially when dealing with non-normal data.

```

```{admonition} Takeaways
:class: tip

- Use **both tests** for robustness, especially with **non-normal** data.
- Several features (like `thalach` and `oldpeak`) show strong group separation — they may be useful for modelling heart disease.
- For features like `chol`, the **visualisations** and **test disagreement** suggest caution — it may require **transformation** or further exploration.
```

---

### Exercise 22: Exploring Categorical Feature Relationships with the Target Variable

*You've explored continuous variables in detail. Now it's time to analyse how **categorical features** relate to heart disease.*

1. *For each categorical feature:*
    - *Create a bar chart grouped by `heart_disease_binary` like you did to compare female vs male patients.*
2. *Observe:*
    - *Which variables show clear differences between the two classes?*
    - *Which features might be informative for classification?*

In [54]:
# Write your code here.

---

### Exercise 23: Statistical Testing of Categorical Feature Relationships with the Target Variable

1. *Import `chi2_contingency` from `scipy.stats`.*

2. *Consult the documentation for `chi2_contingency` to understand:*
    - *The required input format (typically a contingency table from `pd.crosstab()` or similar)*
    - *How to extract the p-value from the function output*

3. *For each categorical feature, construct a contingency table with `heart_disease_binary` and run a chi-squared test of independence.*

4. *Print and interpret the p-values:*
    - *Which features show statistically significant differences ($p<0.05$)?*
    - *How might these features inform modelling or further investigation?*
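
As a rough orientation for the steps above, the sketch below runs a chi-squared test of independence on an invented table. The column names `category` and `outcome` are hypothetical, and the counts were chosen so that the association is obvious.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Toy data: category "a" mostly has outcome 1, category "b" mostly outcome 0
toy = pd.DataFrame({
    "category": ["a"] * 20 + ["b"] * 20,
    "outcome": [1] * 18 + [0] * 2 + [1] * 3 + [0] * 17,
})

# Contingency table: rows are categories, columns are outcomes
table = pd.crosstab(toy["category"], toy["outcome"])

# The function returns the test statistic, the p-value, the degrees of
# freedom, and the counts expected if the two variables were independent
chi2_stat, p_value, dof, expected = chi2_contingency(table)

print(table)
print("p-value:", p_value, "degrees of freedom:", dof)
```

For a 2 × 2 table the test has one degree of freedom. The chi-squared approximation becomes unreliable when expected counts are very small, so it is worth glancing at `expected` before trusting the p-value.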

In [56]:
# Write your code here.

---

## Congratulations, you have made it to the end of the course! 

We're aware that this practice sheet hasn't covered everything you've learned during the course — for example, topics like `numpy` or creating classes. We didn't want it to get too long. But if you have some time left, try coming up with your own tasks! Think about what kinds of analyses you might want to carry out on this dataset, and how you could approach them.

You can always use Google to help you. Tools like ChatGPT or other large language models might also be useful if you get really stuck. But use them with care — relying on them too much can limit your learning and prevent you from developing your own problem-solving skills. Also, remember that their answers aren't always correct.

We hope you enjoyed the course and found it helpful! Please let us know if there's anything else you would have liked it to cover, if you spotted any errors, or if you have any general feedback.