# Introduction to Data Processing in Python using Windy Birthdays!

<div class = "alert alert-info" role = "alert" 
     style = "font-size: 1.7em; font-weight: bold; padding: 15px; margin: 10px 0; text-align: center; background-color: #d9edf7; 
    border-color: #bce8f1; color: #31708f; border-radius: 8px;">
    In this tutorial, we will be downloading meteorological data from Rame Head NCI so we can 
    assess wind and temperature measurements using Python.

</div>

### Step 1: Import the Python libraries you'll need -- basic setup

In [None]:
# Data Handling and Manipulation
import os
import numpy as np
import pandas as pd

# Data Visualisation and Plots
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns

# Suppress warnings (not errors)
import warnings
warnings.filterwarnings('ignore')

### Step 2: Download the 2-day txt file covering your most recent birthday from the archive using this [LINK](http://www.nci-ramehead.org.uk/weather/archive/)

<div class="alert alert-info" role="alert" 
     style="font-size: 1.1em; padding: 10px; margin: 10px 0; text-align: left;">
    
     Did you know?
     -------------
      • The Rame Head station was opened in May 1998 and is part of the National Coastwatch Institution (NCI). 
      • NCI Rame Head is one of 50 NCI stations operating around the British Isles.
      • Learn more here if interested:
        http://www.nci-ramehead.org.uk/weather/History_Vantage_Pro.htm
</div>

### Step 3: Load your txt file using pandas' `read_csv` function

In [None]:
# Let's give our Rame Head met txt file a short name so we can use it more easily later:
rame_file  = 'P3_Test0404_RameH.txt'

In [None]:
# Import our named text file (rame_file) as a dataframe (weather_df)
weather_df = pd.read_csv(rame_file, delimiter = '\t') # tab delimited

# Look inside your dataframe using 'print'
print(weather_df)

<div class="alert alert-info" role="alert" 
     style="font-size: 1.1em; padding: 10px; margin: 10px 0; text-align: left;">

       
     That doesn't look good!
    
       - First, open your birthday text file using Windows File Explorer (or Finder on a Mac). 
       - You'll see that Rame Head have tried to make the data they provide look nice by splitting header names over 
         two lines and adding a line of dashes before the numerical data starts.
       - Unfortunately, Python is going to struggle with this.
      
     There are a few ways around this. 
       - One neat option is to selectively import the columns we want by number (count). 
       - REMEMBER: Python always starts counting from zero, not one.
       - Column 0: 'Date'; Column 1: 'Time'; Column 2: 'Temp Out'; Column 7: 'Wind Speed'
       
    
</div>
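
Before reading the real file, it can help to look at the raw lines from Python itself. The sketch below uses a small made-up sample (the header wording and the values are hypothetical, chosen only to mimic the two-line header and dashed separator described above) to show which row the numbers start on and how counting columns from zero picks out columns 0, 1, 2 and 7. For a real file, `io.StringIO(sample_text)` would be swapped for `open(rame_file)`.

```python
import io

# Hypothetical sample mimicking the layout described above (all values are made up)
sample_text = (
    "                 Temp    Hi   Low   Out   Dew  Wind\n"
    "Date     Time     Out  Temp  Temp   Hum   Pt. Speed\n"
    "----------------------------------------------------\n"
    "4/04/25  00:10    9.1   9.2   9.0    85   6.7   5.4\n"
    "4/04/25  00:20    9.0   9.1   8.9    86   6.8   6.0\n"
)

# Read the raw lines without asking pandas to interpret them
with io.StringIO(sample_text) as f:
    raw_lines = [line.rstrip("\n") for line in f]

# Rows 0-2 are header text and dashes; row 3 is the first row of numbers
for row_number, line in enumerate(raw_lines[:4]):
    print(row_number, repr(line))

# Split the first data row on whitespace and pick columns 0, 1, 2 and 7
fields = raw_lines[3].split()
picked = [fields[i] for i in (0, 1, 2, 7)]
print(picked)
```

The three rows before the numbers are why `skiprows` is set to 3 in the next cell.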

In [None]:
# Column numbers (count from 0)
columns_to_read = [0, 1, 2, 7]   

# Read in data from text file to make dataframe (weather_df)
weather_df = pd.read_csv(rame_file,                        # - short name of your data file.
                         sep = r'\s+',                     # - this argument tells pandas that columns are separated by whitespace.
                         skiprows= 3,                      # - skips the first 3 rows which contain text and the separator line.
                         usecols = columns_to_read,        # - tells pandas to only import the columns we pre-defined (0, 1, 2, 7).
                         header  = None)                   # - prevents pandas from treating the first data row as a column name.

# Assign easy-to-understand names to imported columns
weather_df.columns = ['Date', 'Time', 'Temp', 'Wind']

# Display first few rows of your weather_df (dataframe)
print(weather_df.head())
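
A quick check worth making at this point is whether the temperature and wind columns were read as numbers rather than text. The sketch below is a minimal illustration on a made-up dataframe; the `'---'` placeholder for a missing reading is an assumption for the example, not something confirmed for the Rame Head files.

```python
import pandas as pd

# Made-up example: the third wind reading uses a hypothetical '---' placeholder
demo_df = pd.DataFrame({
    'Temp': ['9.1', '9.3', '9.2'],
    'Wind': ['5.4', '6.0', '---'],
})

# Convert text to numbers; anything that cannot be converted becomes NaN
for col in ['Temp', 'Wind']:
    demo_df[col] = pd.to_numeric(demo_df[col], errors = 'coerce')

# Check the column types and count the missing values per column
print(demo_df.dtypes)
print(demo_df.isna().sum())
```

If `weather_df.dtypes` already reports `float64` for `Temp` and `Wind`, no conversion is needed.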

### Step 4: Data Assessment

In [None]:
# How many unique days are in our file?
num_days = weather_df['Date'].nunique()
# Print the result
print(f"There are {num_days} days of data in the file, but we only want the birthday date.")

In [None]:
birthday = '4/04/25'

In [None]:
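# Filter 2-day dataframe for rows where 'Date' is 'birthday'
# (.copy() makes an independent dataframe, so adding a column below does not trigger a SettingWithCopyWarning)
bday_df = weather_df[weather_df['Date'] == birthday].copy()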
# Combine Date and Time into a single datetime column
bday_df['DTime'] = pd.to_datetime(bday_df['Date'] + ' ' + bday_df['Time'], 
                                 format = '%d/%m/%y %H:%M')

# Reset 1-day dataframe index to count from row zero again
bday_df.reset_index(drop = True, inplace = True)

# Show birthday data
print(bday_df.head(4))
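
To see what the filter and the datetime conversion do in isolation, here is a small sketch on made-up rows (the dates, times and wind values are hypothetical). It assumes the same day/month/two-digit-year and 24-hour time layout used in the cell above.

```python
import pandas as pd

# Made-up rows spanning two dates
demo_df = pd.DataFrame({
    'Date': ['4/04/25', '4/04/25', '5/04/25'],
    'Time': ['00:10', '13:30', '00:10'],
    'Wind': [5.4, 6.0, 4.2],
})

# Keep one date only; .copy() gives an independent dataframe we can safely modify
one_day = demo_df[demo_df['Date'] == '4/04/25'].copy()

# Join the two text columns and parse them with an explicit format string
one_day['DTime'] = pd.to_datetime(one_day['Date'] + ' ' + one_day['Time'],
                                  format = '%d/%m/%y %H:%M')

print(one_day)
```

Giving an explicit `format` stops pandas from guessing whether `4/04/25` is day-first or month-first.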

In [None]:
# Calculate the max and min wind speeds, and determine where they sit in the timeseries using pandas:
max_wind = bday_df['Wind'].max()
min_wind = bday_df['Wind'].min()

# Find the index (row label) of the max and min wind speeds
max_wind_ind = bday_df['Wind'].idxmax()
min_wind_ind = bday_df['Wind'].idxmin()

# Get the timestamps corresponding to the max and min wind speeds
time_max = bday_df['DTime'][max_wind_ind]
time_min = bday_df['DTime'][min_wind_ind]

# Print the results (showing the time of day only)
print(f'The Max wind speed on my birthday was {max_wind:.1f} m/s reached at {time_max.strftime("%H:%M")}')
print(f'The Min wind speed on my birthday was {min_wind:.1f} m/s reached at {time_min.strftime("%H:%M")}')
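
The difference between `max()` and `idxmax()` is easy to mix up, so here is a minimal sketch on a made-up series of wind speeds: `max()` returns the largest value, while `idxmax()` returns the row label where that value first occurs.

```python
import pandas as pd

# Made-up wind speeds; the largest value appears twice (rows 2 and 3)
demo_wind = pd.Series([3.2, 5.1, 7.5, 7.5, 4.0])

peak_value = demo_wind.max()      # the largest value itself
peak_index = demo_wind.idxmax()   # the row label of its first occurrence

print(peak_value, peak_index)
```

When the maximum is tied, only the first occurrence is reported, so the printed time is when the peak was first reached.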

### Step 5: Line plot of `air temperatures` and `wind speeds` from your birthday

In [None]:
# Figure 1: Plot your first timeseries!
fig1, ax = plt.subplots(figsize = (15, 5), dpi = 250)

# Plot the temp data using 's' square markers in red
plt.plot(bday_df['DTime'], bday_df['Temp'], linestyle = ':', marker = 's', markersize = 1, color = 'red', 
         label = 'Temperature (°C)')
# Plot the wind speed using 'o' round marker in blue
plt.plot(bday_df['DTime'], bday_df['Wind'], linestyle = ':', marker = 'o', markersize = 2, color = 'blue',
         label = 'Wind Speed (m/s)')

# Set y-axis limits
# ax.set_ylim([0 , 30])
# Set x-axis limits
ax.set_xlim([bday_df['DTime'].min(), bday_df['DTime'].max()])
# Set x-axis ticks for every 1 hour
ax.xaxis.set_major_locator(mdates.HourLocator(interval = 1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
# Rotate tick labels for better readability
plt.setp(ax.get_xticklabels(), rotation = 33, ha = 'center')

# Add gridlines 
ax.grid(True, color = 'silver', linestyle = ':', linewidth = 1)

# Add labels
plt.title(f"My Birthday Weather: {birthday}", fontsize = 11, weight = 'bold')
plt.xlabel('Time')
plt.legend()

# Show the plot
plt.show()

In [None]:
# Save your figure by uncommenting the next line of code
#fig1.savefig('Birthday_LinePlot.png', dpi = 300, bbox_inches = 'tight')

### Step 6: Boxplot of `air temperatures` and `wind speeds` from your birthday

In [None]:
# Set the overall style (do this first so it applies to the new figure)
sns.set_style('whitegrid')

# Create your figure and axis
fig2, ax = plt.subplots(figsize = (6, 6))

# Boxplot with Temp and Wind as separate boxes
ax = sns.boxplot(data = bday_df[['Temp', 'Wind']], width = 0.6, palette = "mako", 
                 flierprops = dict(marker = 'd', markersize = 3))

# Overlay a stripplot of individual data points
sns.stripplot(data = bday_df[['Temp', 'Wind']], palette = "flare", jitter = True, alpha = 0.3, size = 2)

# Gridlines
ax.grid(True, linestyle = ':', linewidth = 0.5, color = 'gray', axis = 'y')

# Add title
ax.set_title(f'Rame Head Meteorological Measurements: {birthday}', fontsize = 12, fontweight = 'bold')

# Show the plot
plt.show()

In [None]:
# Save your figure by uncommenting the next line of code
#fig2.savefig('Birthday_Boxplots.png', dpi = 300, bbox_inches = 'tight')

<div class="alert alert-info" role="alert" 
     style="font-size: 1.2em; padding: 10px; margin: 10px 0; text-align: center;">
    
    Well done on successfully using Python to plot and assess downloaded weather data from your special day!
</div>