# Exercise 1: Simple Data Loading and Basic Visualization

**Objective:** Load a CSV file, explore the data, and create a simple line plot.

**Skills Practiced:**
- Loading data with pandas
- Basic data exploration
- Creating a simple line plot with matplotlib

## Step 1: Setup and Import Libraries

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import os

# Plotting setup
pd.plotting.register_matplotlib_converters()
sns.set_theme(style="whitegrid")

# Create plots folder if it doesn't exist
os.makedirs('plots', exist_ok=True)
print("Setup Complete")

## Step 2: Load the Data

**Task:** Load the CSV file `datasets/May_Office_213.csv`

**Hints:**
- Use `pd.read_csv()`
- Set `encoding='ISO-8859-1'` if you encounter encoding errors
- Use `index_col="Datetime"` to set the datetime column as index
- Use `parse_dates=True` so that pandas parses the index column as dates
- Use `dayfirst=True` if dates are in DD.MM.YYYY format
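
To see what the date hints do in practice, here is a minimal sketch that reads a tiny in-memory CSV instead of the real file. The two sample rows and the `Temperature` column are made up for illustration; only the `Datetime` column name comes from the hints above.

```python
import io

import pandas as pd

# Hypothetical two-row sample; the real file's columns and values may differ.
sample_csv = (
    "Datetime,Temperature\n"
    "01.05.2023 00:00,21.5\n"
    "02.05.2023 00:00,22.0\n"
)

df_sample = pd.read_csv(
    io.StringIO(sample_csv),
    index_col="Datetime",  # use the Datetime column as the index
    parse_dates=True,      # parse the index as dates
    dayfirst=True,         # read 01.05.2023 as 1 May, not 5 January
)

print(df_sample.index)
```

With `dayfirst=True`, both rows should land in May. Without it, an ambiguous date such as `01.05.2023` may be read month-first.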

In [None]:
# Your code here
# Load the CSV file
filepath = "datasets/May_Office_213.csv"

df = pd.read_csv(
    filepath,
    encoding='ISO-8859-1',
    index_col="Datetime",
    parse_dates=True,
    dayfirst=True
)

# Display first few rows
df.head()
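
A quick sanity check after loading is to confirm that the index really is a `DatetimeIndex`. If date parsing fails silently, the index stays a plain index of strings and time-based plotting will not behave as expected. The sketch below uses two small made-up frames (`df_parsed` and `df_unparsed` are hypothetical names) to show the difference.

```python
import pandas as pd

# Hypothetical stand-in for a correctly loaded file.
df_parsed = pd.DataFrame(
    {"Temperature": [21.5, 22.0, 21.8]},
    index=pd.to_datetime(["2023-05-01", "2023-05-02", "2023-05-03"]),
)

# Hypothetical stand-in for a file whose dates were left as text.
df_unparsed = pd.DataFrame(
    {"Temperature": [21.5, 22.0, 21.8]},
    index=["01.05.2023", "02.05.2023", "03.05.2023"],
)

parsed_ok = isinstance(df_parsed.index, pd.DatetimeIndex)
unparsed_ok = isinstance(df_unparsed.index, pd.DatetimeIndex)

print("Parsed frame has a DatetimeIndex:", parsed_ok)
print("Unparsed frame has a DatetimeIndex:", unparsed_ok)
print("Parsed index is in time order:", df_parsed.index.is_monotonic_increasing)
```

The same `isinstance` check can be run on `df` right after `pd.read_csv()`.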

## Step 3: Explore the Data

**Task:** Check the data structure and display basic information

In [None]:
# Your code here
# Display first and last date
print("First date in data:", df.index.min())
print("Last date in data:", df.index.max())

# Show data types
print("\nData types:")
print(df.dtypes)

# Display basic statistics
print("\nBasic statistics:")
print(df.describe())
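
Beyond dtypes and summary statistics, it is often worth checking the shape of the data and counting missing values per column. The sketch below does this on a small made-up frame (`df_demo` and its columns are assumptions, not the contents of the real file).

```python
import pandas as pd

nan = float("nan")

# Hypothetical sample with two deliberately missing readings.
df_demo = pd.DataFrame(
    {
        "Temperature": [21.5, nan, 22.1],
        "Humidity": [40.0, 41.0, nan],
    },
    index=pd.date_range("2023-05-01", periods=3, freq="D"),
)

# Number of rows and columns
print("Shape (rows, columns):", df_demo.shape)

# Count of missing values in each column
missing_per_column = df_demo.isna().sum()
print("\nMissing values per column:")
print(missing_per_column)

# Compact overview: index type, column dtypes, non-null counts
df_demo.info()
```

Replacing `df_demo` with `df` applies the same checks to the loaded data.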

## Step 4: Clean Column Names

**Task:** Clean up column names by stripping surrounding whitespace and replacing runs of spaces and special characters with underscores

In [None]:
# Your code here
# Clean column names: strip spaces and replace special characters
df.columns = df.columns.str.strip().str.replace('[^A-Za-z0-9_]+', '_', regex=True)

# Display cleaned column names
print("Column names:")
print(df.columns.tolist())
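
To see what the cleaning expression does, the sketch below applies it to a few made-up column names (the `raw_columns` values are hypothetical and may not match the real file). Note that each run of disallowed characters becomes a single underscore, so a name ending in `)` or `]` ends in `_` after cleaning.

```python
import pandas as pd

# Hypothetical raw column names with spaces, units, and symbols.
raw_columns = pd.Index([" Temperature (°C)", "Humidity [%]", "CO2 (ppm) "])

# Same expression as in the exercise: strip outer whitespace, then replace
# every run of characters outside A-Z, a-z, 0-9, and _ with one underscore.
cleaned = raw_columns.str.strip().str.replace("[^A-Za-z0-9_]+", "_", regex=True)

for before, after in zip(raw_columns, cleaned):
    print(f"{before!r} -> {after!r}")
```

If the trailing underscores are unwanted, chaining `.str.strip('_')` after the replacement removes them, at the cost of different column names than the ones listed in Step 5.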

## Step 5: Create a Simple Line Plot

**Task:** Create a line plot showing temperature over time

**Requirements:**
- Plot temperature on the y-axis
- Use dates on the x-axis
- Add proper labels and title
- Use a figure size of (12, 6)
- Save the plot as `plots/temperature_plot.jpg`

In [None]:
# Your code here
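# Find the temperature column by a case-insensitive match on 'temp'
# Common names: 'Temperature', 'Temperature_C_', 'Temp', etc.
temp_candidates = [col for col in df.columns if 'temp' in col.lower()]
if not temp_candidates:
    raise KeyError(f"No temperature column found in: {df.columns.tolist()}")
temp_column = temp_candidates[0]
print(f"Using column: {temp_column}")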

# Create the plot
plt.figure(figsize=(12, 6))
plt.plot(df.index, df[temp_column], linewidth=2, color='blue')
plt.xlabel('Date', fontsize=12, weight='bold')
plt.ylabel('Temperature (°C)', fontsize=12, weight='bold')
plt.title('Temperature Over Time', fontsize=14, weight='bold')
plt.grid(True, alpha=0.3)
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig('plots/temperature_plot.jpg', format='jpg', dpi=150)
plt.show()
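
The same plot can also be written with matplotlib's object-oriented interface, which keeps explicit `fig` and `ax` handles and scales better once a figure has several subplots. This is an alternative sketch on synthetic data, not the required solution: the readings, the `demo` frame, and the temporary output folder are all made up for illustration.

```python
import os
import tempfile

import matplotlib

# Headless backend so the sketch also runs as a plain script;
# omit this line inside a notebook.
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic daily readings standing in for the real data.
demo = pd.DataFrame(
    {"Temperature": [21.0, 21.4, 22.1, 22.8, 22.3, 21.7]},
    index=pd.date_range("2023-05-01", periods=6, freq="D"),
)

fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(demo.index, demo["Temperature"], linewidth=2, color="blue")
ax.set_xlabel("Date", fontsize=12, weight="bold")
ax.set_ylabel("Temperature (°C)", fontsize=12, weight="bold")
ax.set_title("Temperature Over Time", fontsize=14, weight="bold")
ax.grid(True, alpha=0.3)
fig.autofmt_xdate(rotation=45)  # rotate the date labels
fig.tight_layout()

# Save into a temporary folder so the sketch does not overwrite real output.
out_path = os.path.join(tempfile.mkdtemp(), "temperature_plot.jpg")
fig.savefig(out_path, dpi=150)
plt.close(fig)
print("Saved to:", out_path)
```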

## Exercise Complete! ✅

**What you learned:**
- How to load CSV files with pandas
- How to explore and clean data
- How to create a simple line plot

**Next Steps:** Try Exercise 2 for more visualization practice!