# Seaborn and Matplotlib Practice
Welcome! This notebook covers essential plots using Seaborn and Matplotlib.

In [None]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

# Load example penguins dataset
df = sns.load_dataset("penguins")
df.head()

## 🔹 Seaborn: Histogram
Plot a histogram of flipper length by species. The `hue` parameter colours the bars by a categorical column and generates the legend.

In [None]:
# 💡 Hint: Use sns.histplot(data=df, x=?, hue=?, multiple="stack")

## 🔹 Seaborn: KDE
Overlay a KDE (kernel density estimation) plot of flipper length by species.

In [None]:
# 💡 Hint: Use sns.kdeplot(data=df, x=?, hue=?, multiple="stack")

In [None]:
# What do you think this code will do?

sns.displot(data=df, x="flipper_length_mm", hue="species", col="species")

In [None]:
# Load tips dataset
dftips = sns.load_dataset("tips")
dftips.head()

## 🔹 Seaborn: Boxplot
Compare total bill distribution by day.

In [None]:
# 💡 Hint: Use sns.boxplot(x='day', y='total_bill', data=dftips)
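A box plot summarises each group by its quartiles, so it can help to compute those numbers alongside the chart. This sketch uses a tiny invented frame (`tips_demo`, a hypothetical name) rather than the real tips data.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips data (values invented for illustration)
tips_demo = pd.DataFrame({
    "day": ["Thur", "Thur", "Thur", "Thur", "Fri", "Fri", "Fri"],
    "total_bill": [10.0, 12.0, 14.0, 30.0, 8.0, 16.0, 24.0],
})

fig, ax_box = plt.subplots()
sns.boxplot(data=tips_demo, x="day", y="total_bill", ax=ax_box)
plt.show()

# The box edges and centre line correspond to these quartiles per day
quartiles = (
    tips_demo.groupby("day")["total_bill"]
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
print(quartiles)
```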

## 🔹 Seaborn: Scatterplot
Explore the relationship between total bill and tip.

In [None]:
# 💡 Hint: Use sns.scatterplot(x='total_bill', y='tip', data=dftips)
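To put a number on the relationship you see in the scatterplot, you could pair it with a Pearson correlation from pandas. The frame below (`tips_demo`, a hypothetical name) is invented for illustration; with the real data, use `dftips`.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips data (values invented for illustration)
tips_demo = pd.DataFrame({
    "total_bill": [10.0, 15.0, 20.0, 25.0, 40.0],
    "tip": [1.5, 2.0, 3.5, 3.0, 6.0],
})

fig, ax_scatter = plt.subplots()
sns.scatterplot(data=tips_demo, x="total_bill", y="tip", ax=ax_scatter)
plt.show()

# Pearson correlation: close to +1 means tip rises steadily with the bill
r = tips_demo["total_bill"].corr(tips_demo["tip"])
print(round(r, 2))
```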

## 🔹 Seaborn: Barplot
Compare average tips by smoker status.

In [None]:
# 💡 Hint: Use sns.barplot(x='smoker', y='tip', data=dftips)
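By default `sns.barplot` draws the mean of `y` for each category, which is why it suits "average tips". The sketch below checks that against a pandas `groupby`, using an invented stand-in (`tips_demo`, a hypothetical name).

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips data (values invented for illustration)
tips_demo = pd.DataFrame({
    "smoker": ["No", "No", "No", "Yes", "Yes"],
    "tip": [1.0, 2.0, 3.0, 2.0, 4.0],
})

fig, ax_bar = plt.subplots()
sns.barplot(data=tips_demo, x="smoker", y="tip", ax=ax_bar)
plt.show()

# The bar heights should match the group means computed by pandas
bar_heights = sorted(patch.get_height() for patch in ax_bar.patches)
group_means = sorted(tips_demo.groupby("smoker")["tip"].mean())
print(bar_heights, group_means)
```

The thin lines on top of each bar are error bars showing the uncertainty around each mean.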

## 🔸 Matplotlib: Line Chart
###### Create a basic line chart using Matplotlib. The snippet below is the minimal code for a simple plot.

```
plt.plot([1, 2, 3, 4], [2, 4, 6, 8])
plt.title('Simple Line Chart')
plt.show()
```

###### Using the dataset below, let's create a line plot:

In [None]:
data = {
    "Date": pd.date_range(start="2025-01-01", periods=7, freq="D"),
    "Value": [100, 105, 98, 110, 107, 102, 115]
}
dfMatplotlib = pd.DataFrame(data)

In [None]:
# Plot it. The basic syntax is below; amend it and experiment to see what each element does.

# Challenge: how would you format the dates on the x-axis?

import matplotlib.pyplot as plt

# Create figure and axis
fig, ax = plt.subplots()

# Plot on the axis
ax.plot(dfMatplotlib["Date"], dfMatplotlib["Value"], marker='x')

# Titles and labels
ax.set_title("Your Chart Title Here")
ax.set_xlabel("Date")
ax.set_ylabel("Value")

# Rotate x-axis labels and add grid
ax.tick_params(axis='x', rotation=45)
ax.grid(True)

# Layout and display
plt.tight_layout()
plt.show()
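One possible answer to the date-formatting challenge uses `matplotlib.dates`. This is a sketch under assumptions: the frame is rebuilt here as `df_dates` (a hypothetical name) so the block runs on its own, and the `"%d %b"` label style is just one choice.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Same seven days and values as the dataset above, rebuilt so this runs alone
df_dates = pd.DataFrame({
    "Date": pd.date_range(start="2025-01-01", periods=7, freq="D"),
    "Value": [100, 105, 98, 110, 107, 102, 115],
})

fig, ax_dates = plt.subplots()
ax_dates.plot(df_dates["Date"], df_dates["Value"], marker="x")

# Place one tick per day and label it as day + short month, e.g. "01 Jan"
ax_dates.xaxis.set_major_locator(mdates.DayLocator(interval=1))
ax_dates.xaxis.set_major_formatter(mdates.DateFormatter("%d %b"))

ax_dates.tick_params(axis="x", rotation=45)
plt.tight_layout()
plt.show()
```

Try other `strftime` codes in `DateFormatter`, such as `"%Y-%m-%d"` or `"%a"`, to see how the labels change.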


## 🔸 Matplotlib: Subplots
###### Display multiple plots in one figure. The dataset below was built with help from ChatGPT to give us some data to play with.
###### Have a look and check that you understand the code. It uses only pandas and NumPy methods, and gives you an indication of what the two libraries can do together.

In [1]:
import pandas as pd
import numpy as np

# ✅ Set a random seed for reproducibility — ensures the same random values each time
np.random.seed(42)

# ✅ Generate a 14-day date range starting from 1st Jan 2025
dates = pd.date_range(start="2025-01-01", periods=14, freq="D")

# ✅ Simulate daily 'Lines of Code' using a Poisson distribution
#    Poisson is a good choice for count data like code lines written per day
lines_of_code = np.random.poisson(lam=150, size=14)

# ✅ Simulate daily 'Errors Encountered' using integers between 1 and 9
#    Keeps error values low and realistic for visualisation
errors_encountered = np.random.randint(1, 10, size=14)

# ✅ Optional: Simulate daily 'Cups of Coffee' (drawn independently here, so not actually correlated with coding effort 😄)
cups_of_coffee = np.random.randint(1, 6, size=14)  # 1 to 5 coffees a day

# ✅ Create a DataFrame to hold all values
df_python_dev = pd.DataFrame({
    "Date": dates,
    "Lines of Code": lines_of_code,
    "Errors Encountered": errors_encountered,
    "Cups of Coffee": cups_of_coffee
})

# ✅ Preview the data
df_python_dev.head()



| | Date | Lines of Code | Errors Encountered | Cups of Coffee |
|---|---|---|---|---|
| 0 | 2025-01-01 | 145 | 7 | 2 |
| 1 | 2025-01-02 | 159 | 2 | 4 |
| 2 | 2025-01-03 | 136 | 4 | 2 |
| 3 | 2025-01-04 | 154 | 9 | 2 |
| 4 | 2025-01-05 | 163 | 2 | 4 |


##### ✅ Python Developer Subplots – Ready to Edit

###### ✏️ Try changing:
###### - `color=` to any colour name (`'red'`, `'green'`, etc.)
###### - `marker=` to `'D'`, `'^'`, or `'+'`

###### What else can you change? How does this code work?

In [None]:
import matplotlib.pyplot as plt

# 📊 Create figure and 3 subplots (vertically stacked)
fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(10, 8), sharex=True)

# 🔹 Plot 1: Lines of Code
ax1.plot(df_python_dev["Date"], df_python_dev["Lines of Code"], marker='o', color='blue')
ax1.set_title("Daily Python Coding Activity")
ax1.set_ylabel("Lines of Code")
ax1.grid(True)

# 🔸 Plot 2: Errors Encountered
ax2.plot(df_python_dev["Date"], df_python_dev["Errors Encountered"], marker='x', color='orange')
ax2.set_title("Daily Errors Encountered")
ax2.set_ylabel("Errors")
ax2.grid(True)

# 🟤 Plot 3: Cups of Coffee
ax3.plot(df_python_dev["Date"], df_python_dev["Cups of Coffee"], marker='s', color='brown')
ax3.set_title("Cups of Coffee")
ax3.set_xlabel("Date")
ax3.set_ylabel("Cups")
ax3.tick_params(axis='x', rotation=45)
ax3.grid(True)

# ✅ Tidy up layout
plt.tight_layout()
plt.show()
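One way to see how the subplot code works is to notice that the three panels repeat the same steps. The sketch below is an alternative layout of the same idea, not a replacement for the cell above. It loops over the axes, and it uses a small random stand-in frame (`dev_demo`, a hypothetical name) so that it runs on its own.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_python_dev (random values, for illustration only)
rng = np.random.default_rng(0)
dev_demo = pd.DataFrame({
    "Date": pd.date_range(start="2025-01-01", periods=5, freq="D"),
    "Lines of Code": rng.poisson(lam=150, size=5),
    "Errors Encountered": rng.integers(1, 10, size=5),
    "Cups of Coffee": rng.integers(1, 6, size=5),
})

# (column, marker, colour) for each panel, top to bottom
panels = [
    ("Lines of Code", "o", "blue"),
    ("Errors Encountered", "x", "orange"),
    ("Cups of Coffee", "s", "brown"),
]

# plt.subplots(3, 1) returns an array of three Axes; sharex=True links their x-axes
fig, axes = plt.subplots(len(panels), 1, figsize=(10, 8), sharex=True)

for ax, (column, marker, colour) in zip(axes, panels):
    ax.plot(dev_demo["Date"], dev_demo[column], marker=marker, color=colour)
    ax.set_title(column)
    ax.set_ylabel(column)
    ax.grid(True)

# With a shared x-axis, only the bottom panel needs the date label and rotation
axes[-1].set_xlabel("Date")
axes[-1].tick_params(axis="x", rotation=45)

plt.tight_layout()
plt.show()
```

Adding a fourth metric then only needs one more entry in `panels`.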



## ✅ Summary: Seaborn and Matplotlib Practice

###### In this notebook, you:
###### - Practised creating visualisations using **Seaborn** and **Matplotlib**
###### - Explored common chart types including:
  ###### - Histograms and KDE plots
  ###### - Box plots
  ###### - Line plots
  ###### - Bar plots
  ###### - Scatter plots
  ###### - Subplots with multiple metrics
###### - Learned how to:
  ###### - Use `fig, ax = plt.subplots()` for better control
  ###### - Format date axes using `matplotlib.dates`
  ###### - Add labels, titles, and markers for clarity
  ###### - Simulate and visualise fun, relatable data (like Python developer habits!)

---

### ➡️ Next Step: Introduction to Plotly

###### **Plotly Express** is a modern charting library built for interactivity and ease-of-use.
###### 
###### ✅ Benefits:
###### - Hover tooltips, zoom, and pan are built-in
###### - Clean syntax like `px.line()`, `px.bar()`, `px.scatter()`
###### - Great for web-based dashboards or interactive exploration
###### 
###### We'll now create the same types of charts using **Plotly Express**, starting with:
###### - A line plot with dates
###### - A bar chart grouped by category
###### - A scatterplot with size and colour encoding
###### 
###### ➡️ Let’s dive into interactive charts with Plotly!
