# Validating Wind Data

This notebook walks through the steps used to validate wind data and to determine whether the Gill 2-axis sensor is accurate or is under-measuring wind speed.

First, import the packages that we will be using. One new package is xarray, which is designed for working with labeled multi-dimensional data such as netCDF and HDF5 files (common in satellite/NASA products) used in oceanography and other earth sciences.
We will also introduce matplotlib, a plotting library for producing graphs and figures.

### My steps
* Load the data sets
* Examine the data sets
* Convert the time column into datetime objects (timestamps)
* Set the time column as the index
* Create a time series plot
    * Resample and interpolate to 1H
        * Calculate the average and standard deviation
    * Merge the averaged ship data
* Create scatter plots with ship data on the x-axis and buoy data on the y-axis
    * Examine the time stamps for each comparison
    * Interpolate from the highest-frequency time series to the lowest-frequency one
    * Plot the comparisons
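
The hourly average and standard deviation step in the list above can be sketched as follows. This is a minimal illustration on synthetic data, not the notebook's own processing: the series `wind` and its values are made up, and `"60min"` is used as the hourly frequency string because it is accepted across pandas versions.

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute wind speeds (m/s); the values are illustrative only
index = pd.date_range("2024-06-10 00:00", periods=120, freq="1min")
wind = pd.Series(np.linspace(5.0, 10.95, 120), index=index, name="wind_speed")

# Hourly mean and standard deviation (one row per hour)
hourly = wind.resample("60min").agg(["mean", "std"])
print(hourly)
```

The result has one row per hour with a `mean` and a `std` column, which is a convenient shape for merging with averaged ship data later.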

In [1]:
import os
import numpy as np
import pandas as pd
import xarray as xr


import matplotlib.pyplot as plt
# Display figures inline in the notebook without an explicit plt.show()
%matplotlib inline

In [None]:
!pip install h5py

In [None]:
# Load the dataset
METBK1_D11 = "/home/jovyan/wind_comparison/Ship_Buoy_Comparison/d11_metbk1.csv"
METBK1_D11_data = pd.read_csv(METBK1_D11)

METBK1_D10 = "/home/jovyan/wind_comparison/Ship_Buoy_Comparison/d10_metbk1.csv"
METBK1_D10_data = pd.read_csv(METBK1_D10)

METBK2_D11 = "/home/jovyan/wind_comparison/Ship_Buoy_Comparison/d11_metbk2.csv"
METBK2_D11_data = pd.read_csv(METBK2_D11)

FDCHP1_D11 = "/home/jovyan/wind_comparison/Ship_Buoy_Comparison/d11_fdchp1.csv"
FDCHP1_D11_data = pd.read_csv(FDCHP1_D11)

Here we run a few processing steps: converting the time column to timestamps and setting it as the index, resampling and interpolating each data set to a common time step, matching the data sets on their time series, and finally calculating the wind speed magnitude from the eastward and northward wind components.

We apply these steps to every data set before graphing, so that the data are ready to plot.
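
To make the resample-and-interpolate step concrete, here is a small sketch on two made-up readings. It follows the same reindex-and-interpolate pattern as the `resample_interpolate` helper defined later in this notebook; the table `raw` and its values are hypothetical.

```python
import pandas as pd

# Two made-up readings at irregular times (values are illustrative only)
raw = pd.DataFrame(
    {
        "time": ["2024-06-10 00:00:30", "2024-06-10 00:02:30"],
        "wind_speed": [1.0, 3.0],
    }
)

# Vectorized timestamp conversion, then use the timestamps as the index
raw["time"] = pd.to_datetime(raw["time"])
raw = raw.set_index("time")

# Build the regular 1-minute grid, interpolate on the union of both sets of
# timestamps (weighted by elapsed time), and keep only the grid points
grid = raw.resample("1min").asfreq().index
combined = raw.index.union(grid)
on_grid = raw.reindex(combined).interpolate(method="index").reindex(grid)
print(on_grid)
```

Grid points that fall before the first real reading stay empty (NaN) because there is nothing to interpolate from, so it is worth checking the start of each resampled data set.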

In [None]:
# Convert the time column to timestamps (pd.to_datetime works on a whole column)
METBK1_D10_data["time (UTC)"] = pd.to_datetime(METBK1_D10_data["time (UTC)"])
METBK1_D11_data["time (UTC)"] = pd.to_datetime(METBK1_D11_data["time (UTC)"])

# Use the timestamps as the index
METBK1_D11_data = METBK1_D11_data.set_index(keys='time (UTC)')
METBK1_D10_data = METBK1_D10_data.set_index(keys='time (UTC)')


Similarly, by inspecting the attributes of the other data variables, we learn that the two remaining variables we need are the eastward wind velocity and the northward wind velocity.

There remains one more step before we can plot a comparison: calculating the vector wind speed so it can be compared with the scalar wind speed. From the Pythagorean theorem, the magnitude of a two-component vector is:

$$
\|{U}\| = \sqrt{u^{2} + v^{2}}
$$

where $\|{U}\|$ is the wind speed (the magnitude of the wind vector), $u$ is the eastward wind velocity, and $v$ is the northward wind velocity. So we can go ahead and calculate that:
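
Before applying the formula to the full data sets, a quick check on made-up numbers can confirm the calculation. The arrays `u` and `v` below are illustrative values, not measurements; `np.hypot` is NumPy's built-in for the same magnitude and is shown only as a cross-check.

```python
import numpy as np

# Made-up eastward (u) and northward (v) wind components in m/s
u = np.array([3.0, -6.0, 0.0])
v = np.array([4.0, 8.0, -2.5])

# Wind speed from the formula above
speed = np.sqrt(u**2 + v**2)

# np.hypot computes the same magnitude in one call
speed_hypot = np.hypot(u, v)

print(speed)        # [ 5.  10.   2.5]
print(speed_hypot)  # [ 5.  10.   2.5]
```

The sign of each component drops out, so the result is always a non-negative speed regardless of wind direction.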

In [None]:
wspd_METBK1_D10 = np.sqrt(METBK1_D10_data["northward_wind_velocity (m s-1)"]**2 + METBK1_D10_data["eastward_wind_velocity (m s-1)"]**2)

wspd_METBK1_D11 = np.sqrt(METBK1_D11_data["northward_wind_velocity (m s-1)"]**2 + METBK1_D11_data["eastward_wind_velocity (m s-1)"]**2)

# Add the calculated results
METBK1_D10_data["wspd_MET1_D10"] = wspd_METBK1_D10
METBK1_D11_data["wspd_MET1_D11"] = wspd_METBK1_D11

In [None]:
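def resample_interpolate(df, freq='1min'):
    """Resample and interpolate a datetime index dataframe to new frequency"""
    # Regular time grid at the requested frequency
    new_index = df.resample(freq).asfreq().index
    # Interpolate on the union of old and new timestamps, then keep only the grid
    tmp_index = df.index.union(new_index)
    new_df = df.reindex(tmp_index).interpolate('index').reindex(new_index)
    return new_df

# Deployment 10 is assumed to need the same processing as deployment 11
METBK1_D10_1min = resample_interpolate(METBK1_D10_data)
METBK1_D11_1min = resample_interpolate(METBK1_D11_data)

# Slice both deployments to the known comparison window
METBK1_D10_1min = METBK1_D10_1min.loc[slice('2024-06-10 00:00:00','2024-06-11 07:00:00')]
METBK1_D11_1min = METBK1_D11_1min.loc[slice('2024-06-10 00:00:00','2024-06-11 07:00:00')]

# .loc slicing includes both endpoints, so drop the final row
METBK1_D10_1min = METBK1_D10_1min.iloc[0:-1]
METBK1_D11_1min = METBK1_D11_1min.iloc[0:-1]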

In [None]:
# Now we can go ahead and plot the data
#slice('2024-06-10 00:00:00' , '2024-06-11 7:00:00') # Known time for MET comparison
fig, ax = plt.subplots(figsize=(12, 12))

# Create a one:one line
x = np.arange(0, 31, 1)
y = np.arange(0, 31, 1)

# Plot the one:one line
ax.plot(x, y, color="black", linewidth=2)


# Plot the comparison
ax.plot(METBK1_D11_1min["wspd_MET1_D11"], METBK1_D10_1min["wspd_MET1_D10"], marker='o', linestyle='', color="red", alpha=0.3) # Alpha controls transparency, 1=solid, 0=transparent

# Set some limits on the figure
ax.set_xlim((0,25))
ax.set_ylim((0,25))

# Add in title, axis labels, and grid lines
ax.set_title('Irminger Sea Deployment 10 & 11 METBK 1 Sensor S/N', fontsize=15)
ax.set_xlabel('2-axis Wind Speed (m/s) Deployment 11',fontsize=15)
ax.set_ylabel('2-axis Wind Speed (m/s) Deployment 10',fontsize=15)
ax.grid()

# Ship Adjusted Data

Here we import the ship data, resample it to 1 minute, and slice it to the time window used for the comparisons.
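
One detail of the slicing step is easy to miss: label-based `.loc` slicing on a datetime index includes both endpoints. The sketch below uses a synthetic 1-minute table (the name `df` and its contents are made up) to show why the notebook drops the final row after slicing.

```python
import pandas as pd

# Synthetic 1-minute table that extends past the comparison window
index = pd.date_range("2024-06-10 00:00", "2024-06-11 08:00", freq="1min")
df = pd.DataFrame({"wind_speed": 0.0}, index=index)

# Label slicing keeps both the 00:00 start and the 07:00 end
window = df.loc["2024-06-10 00:00:00":"2024-06-11 07:00:00"]
print(len(window))  # 1861 rows: 31 hours * 60 minutes + 1

# Dropping the last row leaves exactly 31 hours of 1-minute samples
trimmed = window.iloc[:-1]
print(len(trimmed), trimmed.index[-1])
```

Two data sets can only be plotted against each other point by point when they end up with the same number of rows, so the same window and the same trimming need to be applied to each.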

In [None]:
# Load in your ship data set
underway_june10_june11 = "/home/jovyan/wind_comparison/Ship_Buoy_Comparison/underway_june10_june11.csv"
underway_data = pd.read_csv(underway_june10_june11)

In [None]:
# Examine your data set
underway_data

In [None]:
# Set Date Time Index
underway_data["time"] = pd.to_datetime(underway_data["time"])

underway_data = underway_data.set_index(keys='time')

# Resample and interpolate with resample_interpolate(), defined earlier in this notebook
underway_data_1min = resample_interpolate(underway_data)

# Slice our data to the comparison window if the ship record extends beyond it
#underway_data_1min = underway_data_1min.loc[slice('2024-06-10 00:00:00','2024-06-11 07:00:00')]

In [None]:
underway_data_1min

In [None]:
# Now we can go ahead and plot the data
fig, ax = plt.subplots(figsize=(12, 12))

# Create a one:one line
x = np.arange(0, 31, 1)
y = np.arange(0, 31, 1)

# Plot the one:one line
ax.plot(x, y, color="black", linewidth=2)


# Plot the comparison
ax.plot(underway_data_1min["adj_wind_speed_starboard"], underway_data_1min["adj_wind_speed_port"], marker='o', linestyle='', color="red", alpha=0.3) # Alpha controls transparency, 1=solid, 0=transparent

# Set some limits on the figure
ax.set_xlim((2,17.5))
ax.set_ylim((2,17.5))

# Add in title, axis labels, and grid lines
ax.set_title('Irminger Sea 11 Ship Sensor Vaisala WXT520 Comparison', fontsize=15)
ax.set_xlabel('Vaisala WXT520 Wind Speed (m/s) Starboard',fontsize=15)
ax.set_ylabel('Vaisala WXT520 Wind Speed (m/s) Port',fontsize=15)
ax.grid()

# METBK1 vs Ship Sensors
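
The scatter plots in this section pair two separately resampled series point by point, which only works when both cover exactly the same timestamps. As a hedged sketch of one way to guarantee that, the example below matches two series on their shared timestamps and summarizes the difference. The series `ship` and `buoy` hold made-up values, and the mean bias and RMSE are suggested summaries rather than part of the original analysis.

```python
import numpy as np
import pandas as pd

# Made-up 1-minute series whose time ranges only partly overlap
ship_index = pd.date_range("2024-06-10 00:00", periods=6, freq="1min")
buoy_index = pd.date_range("2024-06-10 00:02", periods=6, freq="1min")
ship = pd.Series([8.0, 8.5, 9.0, 9.5, 10.0, 10.5], index=ship_index, name="ship")
buoy = pd.Series([8.5, 9.0, 9.5, 10.0, 10.5, 11.0], index=buoy_index, name="buoy")

# Keep only timestamps present in both series, and drop any gaps
paired = pd.concat([ship, buoy], axis=1, join="inner").dropna()

# A negative mean bias means the buoy reads lower than the ship on average
difference = paired["buoy"] - paired["ship"]
bias = difference.mean()
rmse = np.sqrt((difference ** 2).mean())
print(len(paired), bias, rmse)
```

Plotting `paired["ship"]` against `paired["buoy"]` then avoids length mismatches, and the bias gives a single number to go with the visual impression from the one-to-one line.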

In [None]:
# Now we can go ahead and plot the data
fig, ax = plt.subplots(figsize=(12, 12))

# Create a one:one line
x = np.arange(0, 31, 1)
y = np.arange(0, 31, 1)

# Plot the one:one line
ax.plot(x, y, color="black", linewidth=2)


# Plot the comparison
ax.plot(underway_data_1min["adj_wind_speed_port"], METBK1_D10_1min["wspd_MET1_D10"], marker='o', linestyle='', color="red", alpha=0.3) # Alpha controls transparency, 1=solid, 0=transparent

# Set some limits on the figure
ax.set_xlim((2,17.5))
ax.set_ylim((2,17.5))

# Add in title, axis labels, and grid lines
ax.set_title('Irminger Sea 11 METBK1 vs. Ship Sensor Comparison', fontsize=15)
ax.set_xlabel('Vaisala WXT520 Wind Speed (m/s) Ship Port Sensor',fontsize=15)
ax.set_ylabel('2-axis Wind Speed (m/s) METBK1 Deployment 10',fontsize=15)
ax.grid()

In [None]:
# Now we can go ahead and plot the data
fig, ax = plt.subplots(figsize=(12, 12))

# Create a one:one line
x = np.arange(0, 31, 1)
y = np.arange(0, 31, 1)

# Plot the one:one line
ax.plot(x, y, color="black", linewidth=2)


# Plot the comparison
ax.plot(underway_data_1min["adj_wind_speed_starboard"], METBK1_D10_1min["wspd_MET1_D10"], marker='o', linestyle='', color="red", alpha=0.3) # Alpha controls transparency, 1=solid, 0=transparent

# Set some limits on the figure
ax.set_xlim((2,17.5))
ax.set_ylim((2,17.5))

# Add in title, axis labels, and grid lines
ax.set_title('Irminger Sea 11 METBK1 Deployment 10 vs. Starboard Ship Sensor Comparison', fontsize=15)
ax.set_xlabel('Vaisala WXT520 Wind Speed (m/s) Ship Starboard Sensor',fontsize=15)
ax.set_ylabel('2-axis Wind Speed (m/s) METBK1 Deployment 10',fontsize=15)
ax.grid()

In [None]:
# Now we can go ahead and plot the data
fig, ax = plt.subplots(figsize=(12, 12))

# Create a one:one line
x = np.arange(0, 31, 1)
y = np.arange(0, 31, 1)

# Plot the one:one line
ax.plot(x, y, color="black", linewidth=2)


# Plot the comparison
ax.plot(underway_data_1min["adj_wind_speed_starboard"], METBK1_D11_1min["wspd_MET1_D11"], marker='o', linestyle='', color="red", alpha=0.3) # Alpha controls transparency, 1=solid, 0=transparent

# Set some limits on the figure
ax.set_xlim((2,17.5))
ax.set_ylim((2,17.5))

# Add in title, axis labels, and grid lines
ax.set_title('Irminger Sea 11 METBK1 Deployment 11 vs. Starboard Ship Sensor Comparison', fontsize=15)
ax.set_xlabel('Vaisala WXT520 Wind Speed (m/s) Ship Starboard Sensor',fontsize=15)
ax.set_ylabel('2-axis Wind Speed (m/s) METBK1 Deployment 11',fontsize=15)
ax.grid()

In [None]:
# Now we can go ahead and plot the data
fig, ax = plt.subplots(figsize=(12, 12))

# Create a one:one line
x = np.arange(0, 31, 1)
y = np.arange(0, 31, 1)

# Plot the one:one line
ax.plot(x, y, color="black", linewidth=2)


# Plot the comparison
ax.plot(underway_data_1min["adj_wind_speed_port"], METBK1_D11_1min["wspd_MET1_D11"], marker='o', linestyle='', color="red", alpha=0.3) # Alpha controls transparency, 1=solid, 0=transparent

# Set some limits on the figure
ax.set_xlim((2,17.5))
ax.set_ylim((2,17.5))

# Add in title, axis labels, and grid lines
ax.set_title('Irminger Sea 11 METBK1 Deployment 11 vs. Port Ship Sensor Comparison', fontsize=15)
ax.set_xlabel('Vaisala WXT520 Wind Speed (m/s) Ship Port Sensor',fontsize=15)
ax.set_ylabel('2-axis Wind Speed (m/s) METBK1 Deployment 11',fontsize=15)
ax.grid()

# METBK 2 vs Ship Sensors

In [None]:
METBK2_D11_data

In [None]:
# Calculate the magnitude
METBK2_D11_wspd = np.sqrt(METBK2_D11_data["northward_wind_velocity (m s-1)"]**2 + METBK2_D11_data["eastward_wind_velocity (m s-1)"]**2)

# Add wind speed to dataset
METBK2_D11_data["METBK2_D11_wspd"] = METBK2_D11_wspd


#METBK2_D11_data = METBK2_D11_data.set_index(keys='time (UTC)')


In [None]:
# Set Date Time Index
METBK2_D11_data["time"] = pd.to_datetime(METBK2_D11_data["time"])

METBK2_D11_data = METBK2_D11_data.set_index(keys='time')

# Resample and interpolate with resample_interpolate(), defined earlier in this notebook
METBK2_D11_data_1min = resample_interpolate(METBK2_D11_data)

# Slice our data to the comparison window
METBK2_D11_data_1min = METBK2_D11_data_1min.loc[slice('2024-06-10 00:00:00','2024-06-11 07:00:00')]

In [None]:
METBK2_D11_data_1min

In [None]:
# The data were already sliced above; .loc includes both endpoints, so drop the final row
METBK2_D11_data_1min = METBK2_D11_data_1min.iloc[0:-1]

In [None]:
# Now we can go ahead and plot the data
fig, ax = plt.subplots(figsize=(12, 12))

# Create a one:one line
x = np.arange(0, 31, 1)
y = np.arange(0, 31, 1)

# Plot the one:one line
ax.plot(x, y, color="black", linewidth=2)


# Plot the comparison
ax.plot(underway_data_1min["adj_wind_speed_port"], METBK2_D11_data_1min["METBK2_D11_wspd"], marker='o', linestyle='', color="red", alpha=0.3) # Alpha controls transparency, 1=solid, 0=transparent

# Set some limits on the figure
ax.set_xlim((2,17.5))
ax.set_ylim((2,17.5))

# Add in title, axis labels, and grid lines
ax.set_title('Irminger Sea Deployment 11 Ship vs. Buoy METBK 2 Sensor S/N', fontsize=15)
ax.set_xlabel('1 minute Resampled Ship Data (Port)',fontsize=15)
ax.set_ylabel('1 minute Resampled METBK2 D11 data',fontsize=15)
ax.grid()

In [None]:
# Now we can go ahead and plot the data
fig, ax = plt.subplots(figsize=(12, 12))

# Create a one:one line
x = np.arange(0, 31, 1)
y = np.arange(0, 31, 1)

# Plot the one:one line
ax.plot(x, y, color="black", linewidth=2)


# Plot the comparison
ax.plot(underway_data_1min["adj_wind_speed_starboard"], METBK2_D11_data_1min["METBK2_D11_wspd"], marker='o', linestyle='', color="red", alpha=0.3) # Alpha controls transparency, 1=solid, 0=transparent

# Set some limits on the figure
ax.set_xlim((2,17.5))
ax.set_ylim((2,17.5))

# Add in title, axis labels, and grid lines
ax.set_title('Irminger Sea Deployment 11 Ship vs. Buoy METBK 2 Sensor S/N', fontsize=15)
ax.set_xlabel('1 minute Resampled Ship Data (Starboard)',fontsize=15)
ax.set_ylabel('1 minute Resampled METBK2 D11 data',fontsize=15)
ax.grid()

# FDCHP vs Ship Sensors

In [None]:
FDCHP1_D11_data

In [None]:
# Set Date Time Index
FDCHP1_D11_data["time (UTC)"] = pd.to_datetime(FDCHP1_D11_data["time (UTC)"])

FDCHP1_D11_data = FDCHP1_D11_data.set_index(keys='time (UTC)')

# Resample and interpolate with resample_interpolate(), defined earlier in this notebook
FDCHP1_D11_data_1min = resample_interpolate(FDCHP1_D11_data)

In [None]:
underway_data_1min

In [None]:
# Slice our data to the comparison window, then drop the final row
# (.loc slicing includes both endpoints), matching the METBK processing above
FDCHP1_D11_data_1min = FDCHP1_D11_data_1min.loc[slice('2024-06-10 00:00:00','2024-06-11 07:00:00')]
FDCHP1_D11_data_1min = FDCHP1_D11_data_1min.iloc[0:-1]
FDCHP1_D11_data_1min

In [None]:
# Now we can go ahead and plot the data
fig, ax = plt.subplots(figsize=(12, 12))

# Create a one:one line
x = np.arange(0, 31, 1)
y = np.arange(0, 31, 1)

# Plot the one:one line
ax.plot(x, y, color="black", linewidth=2)


# Plot the comparison
ax.plot(underway_data_1min["adj_wind_speed_starboard"], FDCHP1_D11_data_1min["wind_speed"], marker='o', linestyle='', color="red", alpha=0.3) # Alpha controls transparency, 1=solid, 0=transparent

# Set some limits on the figure
ax.set_xlim((2,17.5))
ax.set_ylim((2,17.5))

# Add in title, axis labels, and grid lines
ax.set_title('Irminger Sea Deployment 11 Ship vs. Buoy FDCHP 1 Sensor S/N', fontsize=15)
ax.set_xlabel('1 minute Resampled Ship Data (Starboard)',fontsize=15)
ax.set_ylabel('1 minute Resampled FDCHP1 D11 data',fontsize=15)
ax.grid()