# MN5813: Introduction to Data Visualisation (Exercises)

_This notebook provides exercises for basic data visualisations using Pandas. Exercises are designed to be completed in approximately 90 minutes by students who have little familiarity with the topics._

Note: This Jupyter Notebook was originally compiled by Alex Reppel (AR) based on conversations with [ClaudeAI](https://claude.ai/) *(version 3.5 Sonnet)*. For this year's materials, further revisions were made using [Claude Code](https://www.anthropic.com/claude-code) *(Sonnet 4.5)*, including updated documentation and git commit messages.

## Introduction

### Overview

1. **Basic line plot:** Begin with fundamental plotting concepts, similar to the demonstration but with slightly different requirements to check understanding
2. **Scatter plot analysis:** Build a more complex visualisation by mapping a third variable to colour
3. **Statistical visualisation:** Practise creating subplots and using different types of statistical plots from [seaborn](https://seaborn.pydata.org/)
4. **Customised visualisation:** Extend Example 1 with the additional customisation options shown in the demonstration
5. **Comparative analysis:** Apply the subplot concept shown in the demonstration

### Tips for success

- Review the demonstration notebook for examples and syntax
- Pay attention to plot customisation options
- Consider the best way to present the data clearly
- Don't forget to add proper labels and titles
- Use appropriate colour schemes

**Remember:** The goal is to create clear, informative visualisations that effectively communicate the data's story.

## Setup

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
# Set the style
plt.style.use("classic")

In [None]:
# Load the data
df = pd.read_csv("assets/data/data.csv")
# Parse the date column so it can be used as a time axis
df["date"] = pd.to_datetime(df["date"])

In [None]:
df.head()

## Exercise 1: Basic line plot

Using the business metrics dataset, create a line plot showing the daily conversion rate over time.

Requirements:

1. Set the figure size to 12x6
2. Add appropriate title and axis labels
3. Include gridlines with 30% transparency
4. Make the line dark blue with 70% opacity
5. Rotate x-axis labels by 45 degrees
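
Syntax reminder: the sketch below applies this kind of styling to a small synthetic series rather than to the business metrics dataset. The `demo` frame and its `value` column are made up for illustration, so the column names in `df` will need to be substituted.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in data; "date" and "value" are hypothetical column names
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=30, freq="D"),
    "value": rng.normal(100, 10, size=30),
})

fig, ax = plt.subplots(figsize=(12, 6))

# alpha controls opacity: 0.7 means 70% opaque
ax.plot(demo["date"], demo["value"], color="darkblue", alpha=0.7)

ax.set_title("Daily value over time (synthetic data)")
ax.set_xlabel("Date")
ax.set_ylabel("Value")

# Gridlines drawn at 30% opacity so they stay in the background
ax.grid(True, alpha=0.3)

# Rotate the x-axis tick labels by 45 degrees
ax.tick_params(axis="x", labelrotation=45)
```

The same calls are also available through the `plt.` interface (for example `plt.xticks(rotation=45)`); the object-oriented form is used here only as one possible style.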

In [None]:
# Your code here

## Exercise 2: Scatter plot analysis

Create a scatter plot to examine the relationship between number of visitors and revenue.

Requirements:

1. Set figure size to 10x6
2. Use different colours for points based on satisfaction scores (hint: use the `c` parameter of `plt.scatter`)
3. Add a colour bar to show the satisfaction scale
4. Include appropriate labels and title
5. Add gridlines
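
Syntax reminder: a minimal sketch of colour mapping with the `c` parameter, using synthetic arrays instead of the dataset. The variable names, the `viridis` colour map and the 1 to 5 score range are assumptions made for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in data (not the exercise dataset)
rng = np.random.default_rng(1)
x = rng.uniform(0, 100, size=50)
y = 2 * x + rng.normal(0, 10, size=50)
score = rng.uniform(1, 5, size=50)  # third variable, mapped to colour

fig, ax = plt.subplots(figsize=(10, 6))

# c takes one value per point; cmap decides how values become colours
points = ax.scatter(x, y, c=score, cmap="viridis")

# The colour bar is built from the object returned by scatter
cbar = fig.colorbar(points, ax=ax)
cbar.set_label("Score (1 to 5)")

ax.set_title("Synthetic x vs y, coloured by score")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.grid(True, alpha=0.3)
```

Keeping the return value of `scatter` matters here, because the colour bar needs it to know which values and colour map to display.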

In [None]:
# Your code here

## Exercise 3: Statistical visualisation

Create two subplots side by side:

1. A histogram showing the distribution of marketing spend
2. A box plot showing satisfaction scores across different days of the week

Requirements:

1. Use a figure size of 15x6
2. Add appropriate titles for each subplot and an overall figure title
3. Use different colours for each plot
4. Add gridlines to both plots
5. Include proper axis labels
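
Syntax reminder: one possible way to place a histogram and a seaborn box plot side by side, shown on synthetic data. The `demo` frame, its column names (`amount`, `score`, `day`) and the colours are hypothetical; the dataset's own columns will differ.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic stand-in data covering ten full weeks
rng = np.random.default_rng(2)
dates = pd.date_range("2024-01-01", periods=70, freq="D")
demo = pd.DataFrame({
    "amount": rng.gamma(2.0, 50.0, size=70),
    "score": rng.uniform(1, 5, size=70),
    "day": dates.day_name(),
})
day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# One row, two columns: plt.subplots returns the figure and both axes
fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(15, 6))

# Left: histogram drawn with matplotlib
ax_hist.hist(demo["amount"], bins=20, color="steelblue", edgecolor="white")
ax_hist.set_title("Distribution of amount")
ax_hist.set_xlabel("Amount")
ax_hist.set_ylabel("Frequency")
ax_hist.grid(True, alpha=0.3)

# Right: box plot drawn with seaborn on the second axes
sns.boxplot(data=demo, x="day", y="score", order=day_order,
            color="lightgreen", ax=ax_box)
ax_box.set_title("Score by day of week")
ax_box.set_xlabel("Day of week")
ax_box.set_ylabel("Score")
ax_box.grid(True, alpha=0.3)
ax_box.tick_params(axis="x", labelrotation=45)

# Overall title, then tidy the spacing
fig.suptitle("Synthetic distributions")
fig.tight_layout()
```

Passing `ax=` to the seaborn call is what directs the plot into a specific subplot; without it, seaborn draws on whichever axes is currently active.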

In [None]:
# Your code here

## Exercise 4: Customised visualisation

Create a plot showing the relationship between marketing spend and conversion rate, similar to Example 1 in the demonstration, but with additional customisation.

Requirements:

1. Create a scatter plot of marketing spend vs conversion rate
2. Colour the points based on revenue
3. Add a title and properly labelled axes
4. Include a colour bar with an appropriate label
5. Add a grid with 30% transparency
6. Format axis labels to show currency (£) for marketing spend and percentage for conversion rate
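
Syntax reminder: tick labels can be formatted with the classes in `matplotlib.ticker`. The sketch below uses synthetic arrays, a hypothetical helper named `pounds`, and an arbitrary 10x6 figure size. It also assumes rates are stored as fractions (0.05 for 5%); if the dataset stores them as percentages already, `xmax=100` would be the matching setting.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, PercentFormatter


def pounds(value, _pos=None):
    """Format a tick value as whole pounds with thousands separators."""
    return f"£{value:,.0f}"


# Synthetic stand-in data (not the exercise dataset)
rng = np.random.default_rng(3)
spend = rng.uniform(500, 5000, size=40)
rate = rng.uniform(0.01, 0.08, size=40)  # fractions, by assumption
revenue = spend * rng.uniform(2, 4, size=40)

fig, ax = plt.subplots(figsize=(10, 6))
points = ax.scatter(spend, rate, c=revenue, cmap="plasma", alpha=0.8)
fig.colorbar(points, ax=ax, label="Revenue (£)")

# Currency on the x-axis, percentages on the y-axis
ax.xaxis.set_major_formatter(FuncFormatter(pounds))
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0, decimals=0))

ax.set_title("Synthetic spend vs rate")
ax.set_xlabel("Spend (£)")
ax.set_ylabel("Rate (%)")
ax.grid(True, alpha=0.3)
```

Formatters change only how tick values are displayed; the underlying data and axis limits stay the same.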


In [None]:
# Your code here

## Exercise 5: Comparative analysis

Create a figure with two subplots comparing different business metrics.

Requirements:

1. Left subplot: Create a scatter plot of visitors vs revenue
2. Right subplot: Create a scatter plot of marketing spend vs revenue
3. Use the same scale for revenue on both plots
4. Add appropriate titles for each subplot and an overall figure title
5. Include gridlines on both plots
6. Add proper axis labels with units _(`£` for monetary values)_
7. Use different colours for each plot
8. Ensure proper spacing between subplots using `tight_layout()`
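
Syntax reminder: one way to give two subplots the same y-axis scale is the `sharey=True` argument of `plt.subplots`, sketched here on synthetic data. The variable names, colours and the 14x6 figure size are assumptions for illustration; setting identical limits with `set_ylim` on both axes would be an alternative.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in data (not the exercise dataset)
rng = np.random.default_rng(4)
visitors = rng.integers(100, 1000, size=40)
spend = rng.uniform(500, 5000, size=40)
revenue = 5 * visitors + 0.5 * spend + rng.normal(0, 300, size=40)

# sharey=True links the y-axis limits of the two subplots
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(14, 6), sharey=True)

ax_left.scatter(visitors, revenue, color="teal", alpha=0.7)
ax_left.set_title("Visitors vs revenue")
ax_left.set_xlabel("Visitors")
ax_left.set_ylabel("Revenue (£)")
ax_left.grid(True, alpha=0.3)

ax_right.scatter(spend, revenue, color="darkorange", alpha=0.7)
ax_right.set_title("Spend vs revenue")
ax_right.set_xlabel("Spend (£)")
ax_right.grid(True, alpha=0.3)

fig.suptitle("Synthetic comparison")
fig.tight_layout()
```

With a shared y-axis, matplotlib hides the tick labels on the right-hand subplot by default, so the y-axis label is set once on the left.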

In [None]:
# Your code here