### What this script does
This script synthesizes and analyzes data from all instruments (in-situ and remote sensing) and combines them with data from three other stations in the Netherlands.

- loads CO₂ flux datasets from three stations (Veenkampen, Loobos, Amsterdam), converts µmol m⁻² s⁻¹ → ppm·m·s⁻¹, and plots time series.

- loads the 10-minute merged sonic/mast/radiometer dataset from a date-based folder; the LCL work that follows builds on it.
- plots multiple LCL estimates through the day:

    - black markers: LCL from temperature at 2 m (LCL column’s first element per row).

    - red markers: LCL computed from qv with T at 2 m (LCL_qv_T_2.0m).

    - green markers: Romps LCL already in km (LCL_romps_km).

    - orange line: CBL height from the parcel method (zi_parcel, converted to km).

    - shades periods where LCL ≤ CBL (suggesting a possible cloud layer).

- Sonic–mast comparisons: converts timestamps and makes two time-series scatter plots:

    - Temperature_K_2.99 vs Average_Temperature_Corr

    - qv_2.99m vs qv_sonic

- Merge MWR 10-min data: loads iwv_lwp_10min_avg.csv from the microwave radiometer folder and merges it into merged_data_10min (nearest-time join).

- Plot LWP: creates a presentation-style plot of LWP_Corrected vs time and saves it.

- Surface state from LW↑: derives T_srf, esat_srf, and qsat_srf.

- Load Cloud Radar products: loads 10-min rain CSV and the 10-min vertical dataset (Parquet), suffixes CR columns, merges to the main dataframe.

- Quality flags: adds Flag (|ΔT|>1 K) and Flag_Rain (rain>0) and plots a flagged temperature scatter.

- Outputs: several PNGs saved into the appropriate day folders.


- Loads vertical profiles: reads RH/AH 10-min profiles (profiles_data_10min_avg.parquet) and MWR vertical dataset (MWR_vertical_dataset_10min.parquet) for the chosen day.

- Fuse with mast data: for each profile timestamp, pulls same-time mast RH and qv at fixed heights and builds a combined row.

- Plot: makes side-by-side plots comparing radiometer vs mast RH (and later temperature) profiles at a chosen time index.

- Temperature profiles: loads vertical_temperature_profiles_10min_avg.parquet, combines with mast temperatures and VDSE, and plots.

- All-in merge: merges RH/AH combo, MWR vertical, and temperature combo into df_combined_all, then saves a figure.

- Classifies cloud type per time step using radiometer RH profile, LWP, two flags (instrument/rain), and a clearness index CSI.

- Saves the cloud classification to Excel and produces per-type profile plots (RH & temperature) into subfolders.

- Merges your per-timestamp profile bundle (df_combined_all) into the 10-min base table (merged_data_10min) on Time.

- Provides three plotting helpers:

    - plot_temperature_profiles(...): compares T and θᵥ from MWR, Cloud Radar (CR), and Mast.

    - plot_rh_profiles(...): compares RH (primary x-axis) and AH (secondary x-axis) for MWR, CR, and Mast.

    - plot_all_profiles(...): a 3-panel figure for T/θᵥ, RH/AH, and qᵥ side-by-side; also draws CBL height and LCL lines if present.


- Exports a 1-page PDF listing all columns in merged_data_10min.

- Computes basic bulk‐flux estimates (u_star, Cd, SHF_bulk, LHF_bulk) and checks sign agreement with measured SHF/LHF.

- Plots time series for SHF/LHF (measured vs bulk).

- Builds diurnal plots for temperatures, dry static energy, virtual dry static energy, qv, wind speed, radiation components, momentum fluxes, and CO₂.

- Computes/plots vertical gradients (lapse rates) of s_dry/c_p and s_v/c_p and marks zero crossings.

- LW↓ vs IWV: plots downwelling longwave radiation against integrated water vapor and saves the scatter plot.

- SW↓ vs LWP: plots downwelling shortwave irradiance against liquid water path (LWP) and saves it.

- LWP vs Time: Shows how LWP changes over time in a time-series plot.

- Temperature (2.99 m) vs Sonic: Compares mast temperature (at 2.99 m) with sonic temperature over time.

- Wind (2.99 m) vs Sonic: Compares mast wind speed with sonic wind speed and saves the time plot.

- Wind Speed Flags & Scatter: Marks and plots wind speed outliers (ΔWS > 1 m/s) and rain periods on a scatter plot.

- SEB Scatter Plots: Plots the four energy balance components (SHF, LHF, Net Radiation, G) and F_CO₂ versus time (filtered by quality flags).

- 30-min Flux Data: Loads the 30-minute averaged flux dataset for further comparisons.

- External Stations Data: Loads and cleans flux, soil, and radiation data from Loobos, Amsterdam, and Veenkampen reference sites:

    - SHF Comparison: Compares sensible heat flux (H) from all stations with your measured SHF.

    - LHF Comparison: Compares latent heat flux (LE) from all stations with your measured LHF.

    - CO₂ Flux Comparison: Compares CO₂ flux from all stations with your measured F_CO₂.

    - Soil Fluxes: Plots soil heat flux (G) probes from Loobos and Veenkampen alongside your measured G.

    - Net Radiation Comparison: Compares net radiation from all sites (with consistent sign convention) against your measured Net Rad.

    - Residual Ground Heat Flux: Computes residual G = −(H + LE + Net Rad) for each site and compares with your measured G.
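
The nearest-time join mentioned in the list above can be sketched with `pandas.merge_asof`. This is a minimal illustration rather than the notebook's actual code: the `Time` and `LWP` column names, the 5-minute tolerance, and the toy values are assumptions.

```python
import pandas as pd

# Toy 10-minute base table (stands in for merged_data_10min)
base = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime(["2024-05-23 00:00", "2024-05-23 00:10", "2024-05-23 00:20"]),
    "T_2m": [283.1, 283.0, 282.8],
})

# Toy MWR table with slightly offset timestamps (hypothetical column names)
mwr = pd.DataFrame({
    "Time": pd.to_datetime(["2024-05-23 00:01", "2024-05-23 00:09", "2024-05-23 00:31"]),
    "LWP": [1.0, 2.0, 3.0],
})

# Both tables must be sorted on their time keys for merge_asof
merged = pd.merge_asof(
    base.sort_values("TIMESTAMP"),
    mwr.sort_values("Time"),
    left_on="TIMESTAMP",
    right_on="Time",
    direction="nearest",
    tolerance=pd.Timedelta("5min"),  # rows without a match within 5 min get NaN
)
print(merged[["TIMESTAMP", "LWP"]])
```

Setting a tolerance keeps a distant MWR sample from being attached to a mast timestamp it does not really belong to; the third row here stays empty because its nearest MWR samples are 11 minutes away.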
  
#### Lines you should change before running

- other stations data folder:
    - data_vl = r"C:\path\to\your\Fluxes_Other_Stations_May_20_26"
    - vl_file  = os.path.join(data_vl, "Veenkampen_Fluxes.csv")
    - loo_file = os.path.join(data_vl, "Loobos_Fluxes.csv")
    - ams_file = os.path.join(data_vl, "Amsterdam_Fluxes.csv")

- 10-minute dataset folder and date of interest:
    - date_str  = "2024-05-23"           ← change date here
    - month_str = date_str[:7]
    - data_dir  = rf"C:\path\to\your\Sonic\{month_str}\{date_str}"
    - input_file = os.path.join(data_dir, "merged_data_10min.csv")


- Base directory containing CBL height outputs from the microwave radiometer:
    - cbl_dir = rf"C:\path\to\your\Microwave_radiometer\{month_str}\{date_str}"

- MWR (IWV/LWP) folder:
    - folder_path = rf"C:\path\to\your\Microwave_radiometer\{month_str}\{date_str}"

- Cloud Radar folder:
    - cloud_radar_dir = rf"C:\path\to\your\Cloud_radar\{month_str}\{date_str}"

- MWR vertical-profile folder:
    - folder_path = rf"C:\path\to\your\Microwave_radiometer\{month_str}\{date_str}"
    - Files expected in folder_path:
        - profiles_data_10min_avg.parquet
        - MWR_vertical_dataset_10min.parquet
        - vertical_temperature_profiles_10min_avg.parquet

- pdf_path = rf"C:\path\to\your\Master_Thesis\merged_columns_list.pdf"
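
All the day folders above follow the same `<instrument>\<YYYY-MM>\<YYYY-MM-DD>` pattern, so one way to avoid editing several lines is to derive them from a single base directory. The sketch below assumes that layout; `day_folder` and `base_dir` are hypothetical names that do not appear in the notebook.

```python
from pathlib import PureWindowsPath

def day_folder(base_dir, instrument, date_str):
    """Return base_dir/instrument/YYYY-MM/YYYY-MM-DD (hypothetical helper)."""
    month_str = date_str[:7]  # e.g. '2024-05'
    return PureWindowsPath(base_dir) / instrument / month_str / date_str

date_str = "2024-05-23"
base_dir = r"C:\path\to\your"  # placeholder, as in the list above

data_dir = day_folder(base_dir, "Sonic", date_str)
folder_path = day_folder(base_dir, "Microwave_radiometer", date_str)
cloud_radar_dir = day_folder(base_dir, "Cloud_radar", date_str)
print(data_dir)
```

`PureWindowsPath` is used here only so the example behaves the same on any operating system; for actual file access `pathlib.Path` would be the natural choice.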


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
import datetime
from matplotlib.lines import Line2D


!pip install pvlib
from scipy.special import lambertw

from matplotlib.dates import DateFormatter
from datetime import time
import pvlib
import seaborn as sns
from sklearn.linear_model import LinearRegression
import matplotlib.dates as mdates
from sklearn.metrics import mean_absolute_error, mean_squared_error
from numpy.polynomial.polynomial import Polynomial


In [None]:
# Constants
Cp = 1005  # Specific heat capacity of dry air at constant pressure (J/kg/K)
g = 9.81   # Acceleration due to gravity (m/s^2)

A=6.11*100 #Pa
beta=0.067 #K^-1
Ttrip=273.16 #K
epsilon=0.622
sigma=5.67e-8 #W*m^-2*K^-4
Lv = 2.5e6  # Latent heat of vaporization in J/kg

Rd=287.04
Rv=461.5

rho_atm=1.225 #kg/m^3
m_co2=0.044 #kg/mole molecular mass CO2
m_atm=0.028 #molecular mass atmosphere

In [None]:
# Saturation vapour pressure es (Pa) from temperature T (K), Tetens-type formula
def calculate_saturation_pressure(T):
    #es = A*np.exp(beta*(T+273.15-Ttrip))  # Saturation vapor pressure in Pa
    es=610.78*np.exp(17.2694*(T-Ttrip)/(T-35.86))
    return es

# Saturation specific humidity qs (g/kg) from es (Pa) and pressure p (hPa/mbar)
def calculate_saturation_specific_humidity(es, p):
    #qv = epsilon * e *1000 / (p*100) # Specific humidity in g/kg
    # Equivalent to epsilon*es/(p - (1 - epsilon)*es) with epsilon = Rd/Rv
    qs=es*1000/(p*100+(((Rv/Rd)-1)*(p*100-es)))
    return qs
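
A quick sanity check of the two formulas above, written as a standalone sketch with the same constants. At 20 °C and 1013.25 hPa, textbook values are roughly 2.3 kPa for the saturation vapour pressure and 14 to 15 g/kg for the saturation specific humidity; the function names here are illustrative and not part of the notebook.

```python
import math

RD = 287.04      # J/(kg K), dry-air gas constant (same value as Rd above)
RV = 461.5       # J/(kg K), water-vapour gas constant (same value as Rv above)
T_TRIP = 273.16  # K

def saturation_pressure_pa(T_kelvin):
    """Tetens-type saturation vapour pressure in Pa for T in K."""
    return 610.78 * math.exp(17.2694 * (T_kelvin - T_TRIP) / (T_kelvin - 35.86))

def saturation_specific_humidity_gkg(es_pa, p_hpa):
    """Saturation specific humidity in g/kg from es (Pa) and pressure (hPa)."""
    p_pa = p_hpa * 100.0
    return es_pa * 1000.0 / (p_pa + (RV / RD - 1.0) * (p_pa - es_pa))

es_20c = saturation_pressure_pa(293.15)
qs_20c = saturation_specific_humidity_gkg(es_20c, 1013.25)
print(f"es(20 degC) = {es_20c:.0f} Pa, qs = {qs_20c:.2f} g/kg")
```

Note the unit conventions: temperature in kelvin, vapour pressure in pascal, ambient pressure in hPa, and humidity in g/kg.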


In [None]:
###LCL
# Pressure at altitude z (m) in a dry-adiabatic atmosphere, given surface pressure
# P_surf and surface temperature T_surf (K); the result has the units of P_surf
def calculate_pressure(P_surf, z, T_surf):
    #return P_surf * np.exp(-g * z / (Rd * T))
    return P_surf * (1+(-g*z/(T_surf*Cp)))**(Cp/Rd)

# Saturation specific humidity (g/kg) from temperature T (K) and pressure p (hPa/mbar)
def calculate_saturation_specific_humidity_lcl(T, p):
    es = 610.78 * np.exp(17.2694 * (T - Ttrip) / (T - 35.86))
    qs = es * 1000 / (((Rv / Rd) * (p * 100 - es)) + es)
    # qs=Rd*es*1000/(Rv*p*100)
    return qs
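
The pressure formula in `calculate_pressure` is consistent with a short derivation under three assumptions: hydrostatic balance, the ideal gas law for dry air, and a temperature profile that falls at the dry-adiabatic lapse rate g/c_p from the surface value T_s. The symbols p_s and T_s stand for `P_surf` and `T_surf`.

```latex
\begin{aligned}
T(z) &= T_s - \frac{g}{c_p}\,z, \\
\frac{dp}{dz} &= -\rho g = -\frac{p\,g}{R_d\,T(z)}, \\
\ln\frac{p(z)}{p_s} &= -\frac{g}{R_d}\int_0^z \frac{dz'}{T_s - (g/c_p)\,z'}
  = \frac{c_p}{R_d}\,\ln\!\left(1 - \frac{g z}{c_p T_s}\right), \\
p(z) &= p_s\left(1 - \frac{g z}{c_p T_s}\right)^{c_p/R_d}.
\end{aligned}
```

With g = 9.81 m/s² and c_p = 1005 J/(kg K), the lapse rate g/c_p is close to the 9.8 K/km used for the temperature profile in the LCL cells.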

In [None]:
def lcl(p, T, rh=None, rhl=None, rhs=None, return_ldl=False, return_min_lcl_ldl=False):
    """
    Compute the lifting-condensation level (LCL) [m] (or deposition level LDL, or min(LCL,LDL))
    following Romps (2017).  Inputs must be in SI units:

      • p : pressure [Pa]  (positive)
      • T : temperature [K] (positive)
      • Exactly one of rh, rhl, rhs (each between 0 and 1)

    Returns a float (or NumPy array of floats) giving height in meters.

    Raises ValueError if more than one (or none) of {rh, rhl, rhs} is given,
    or if the inputs are out of range.
    """

    # 1) Input validation
    p_arr = np.asarray(p, dtype=float)
    T_arr = np.asarray(T, dtype=float)
    if np.any(p_arr <= 0):
        raise ValueError("Pressure p must be > 0 Pa.")
    if np.any(T_arr <= 0):
        raise ValueError("Temperature T must be > 0 K.")

    provided = [rh is not None, rhl is not None, rhs is not None]
    if sum(provided) != 1:
        raise ValueError(f"Exactly one of rh, rhl, rhs must be specified (you provided {sum(provided)}).")

    # If arrays are passed for rh/rhl/rhs, cast them to NumPy arrays
    rh_arr  = np.asarray(rh,  dtype=float) if rh  is not None else None
    rhl_arr = np.asarray(rhl, dtype=float) if rhl is not None else None
    rhs_arr = np.asarray(rhs, dtype=float) if rhs is not None else None

    for name, arr in (("rh", rh_arr), ("rhl", rhl_arr), ("rhs", rhs_arr)):
        if arr is not None and np.any((arr < 0) | (arr > 1)):
            raise ValueError(f"{name} must be between 0 and 1.")

    # 2) Physical constants (Romps 2017)
    Ttrip = 273.16   # K
    ptrip = 611.65   # Pa
    E0v   = 2.3740e6 # J/kg
    E0s   = 0.3337e6 # J/kg
    ggr   = 9.81     # m/s²
    rgasa = 287.04   # J/(kg·K)
    rgasv = 461      # J/(kg·K)
    cva   = 719      # J/(kg·K)
    cvv   = 1418     # J/(kg·K)
    cvl   = 4119     # J/(kg·K)
    cvs   = 1861     # J/(kg·K)
    cpa   = cva + rgasa
    cpv   = cvv + rgasv

    # 3) Helper functions for saturation vapor pressure (Pa)
    def pvstarl(Tval):
        # over liquid
        return ptrip * (Tval/Ttrip)**((cpv - cvl)/rgasv) * np.exp(
            (E0v - (cvv - cvl)*Ttrip)/rgasv * (1/Ttrip - 1/Tval)
        )

    def pvstars(Tval):
        # over ice
        return ptrip * (Tval/Ttrip)**((cpv - cvs)/rgasv) * np.exp(
            (E0v + E0s - (cvv - cvs)*Ttrip)/rgasv * (1/Ttrip - 1/Tval)
        )

    # 4) Compute vapor pressure pv from whichever RH was provided
    if rh_arr is not None:
        # If T ≥ Ttrip, use RH over liquid; else over ice
        pv = np.where(T_arr >= Ttrip,
                      rh_arr * pvstarl(T_arr),
                      rh_arr * pvstars(T_arr))
    elif rhl_arr is not None:
        pv = rhl_arr * pvstarl(T_arr)
    else:  # rhs_arr is not None
        pv = rhs_arr * pvstars(T_arr)

    # 5) If pv > p anywhere, set those points to NaN (no finite LCL)
    pv = np.where(pv > p_arr, np.nan, pv)

    # 6) Recompute all three humidity ratios at the ambient T:
    rhl_new = pv / pvstarl(T_arr)
    rhs_new = pv / pvstars(T_arr)
    rh_new  = np.where(T_arr >= Ttrip, rhl_new, rhs_new)

    # 7) Mixed‐gas parameters
    qv   = rgasa * pv / (rgasv * p_arr + (rgasa - rgasv) * pv)
    rgasm= (1 - qv)*rgasa + qv*rgasv
    cpm  = (1 - qv)*cpa + qv*cpv

    # 8) Dry‐adiabatic limit (if rh = 0, no condensation)
    dry_case = (rh_new == 0)
    lcl_dry = cpm * T_arr / ggr

    # 9) Liquid branch coefficients
    aL = -(cpv - cvl)/rgasv + cpm/rgasm
    bL = -(E0v - (cvv - cvl)*Ttrip) / (rgasv * T_arr)
    cL = (pv / pvstarl(T_arr)) * np.exp(-(E0v - (cvv - cvl)*Ttrip)/(rgasv * T_arr))

    # 10) Solid branch coefficients
    aS = -(cpv - cvs)/rgasv + cpm/rgasm
    bS = -(E0v + E0s - (cvv - cvs)*Ttrip)/(rgasv * T_arr)
    cS = (pv / pvstars(T_arr)) * np.exp(-(E0v + E0s - (cvv - cvs)*Ttrip)/(rgasv * T_arr))

    # 11) Evaluate LambertW on branch = –1
    W_L = lambertw(bL/aL * cL**(1.0/aL), k=-1).real
    W_S = lambertw(bS/aS * cS**(1.0/aS), k=-1).real

    lcl_liquid = cpm * T_arr / ggr * (1.0 - bL/(aL * W_L))
    lcl_solid  = cpm * T_arr / ggr * (1.0 - bS/(aS * W_S))

    # 12) Combine dry/condensing results
    lcl_all = np.where(dry_case, lcl_dry, lcl_liquid)
    ldl_all = np.where(dry_case, lcl_dry, lcl_solid)

    # 13) Handle return flags
    if return_ldl and return_min_lcl_ldl:
        raise ValueError("Cannot set both return_ldl and return_min_lcl_ldl to True.")
    elif return_ldl:
        return ldl_all.astype(float)
    elif return_min_lcl_ldl:
        return np.minimum(lcl_all, ldl_all).astype(float)
    else:
        return lcl_all.astype(float)
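
For a rough plausibility check of the `lcl` output, the well-known rule of thumb of about 125 m of LCL height per kelvin of dew-point spread can be used. The sketch below is only an approximation under that rule, using the Magnus formula for the dew point; it is not part of Romps (2017), and the function names are illustrative.

```python
import math

def dewpoint_c(T_c, rh):
    """Dew point (degC) from temperature (degC) and RH (0-1), Magnus formula."""
    a, b = 17.625, 243.04  # Magnus coefficients over liquid water
    gamma = math.log(rh) + a * T_c / (b + T_c)
    return b * gamma / (a - gamma)

def lcl_rule_of_thumb_m(T_k, rh):
    """Approximate LCL height (m): about 125 m per kelvin of dew-point spread."""
    T_c = T_k - 273.15
    return 125.0 * (T_c - dewpoint_c(T_c, rh))

approx = lcl_rule_of_thumb_m(300.0, 0.5)
print(f"Rule-of-thumb LCL: {approx:.0f} m")
# Compare with lcl(1e5, 300.0, rhl=0.5) from the cell above
```

The two estimates should agree to within a few percent for warm, moderately humid conditions; a much larger gap usually points to a unit mix-up (for example pressure in hPa instead of Pa, or RH in percent instead of a fraction).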

### Compare CO2 flux

In [None]:
#Edit this before running!!!
#Path to data from other stations:
data_vl = r"C:\path\to\your\Fluxes_Other_Stations_May_20_26"

vl_file=os.path.join(data_vl,'Veenkampen_Fluxes.csv')
vl=pd.read_csv(vl_file)
loo_file=os.path.join(data_vl,'Loobos_Fluxes.csv')
loo=pd.read_csv(loo_file)
ams_file=os.path.join(data_vl,'Amsterdam_Fluxes.csv')
ams=pd.read_csv(ams_file)

print(loo.columns)

In [None]:
# Clean each station's CO2 flux and convert it to ppm m s^-1
for station_df in (vl, loo, ams):  # Veenkampen, Loobos, Amsterdam
    # Convert Timestamp column to datetime format (unparseable values become NaT)
    station_df['Timestamp'] = pd.to_datetime(station_df['Timestamp'], format='%Y-%m-%d %H:%M:%S', errors='coerce')
    # Convert co2_flux column to numeric, coerce errors to NaN
    station_df['co2_flux'] = pd.to_numeric(station_df['co2_flux'], errors='coerce')

    # Drop rows with NaN values in co2_flux (if any)
    station_df.dropna(subset=['co2_flux'], inplace=True)

    # Convert µmol m-2 s-1 to ppm m s^-1 and add the result to the DataFrame
    station_df['F_CO2_ppm_ms'] = ((station_df['co2_flux'] / 1e6) / 22.414) * 1e6

# Plot CO2 flux vs time
plt.figure(figsize=(12, 6))
plt.plot(vl['Timestamp'], vl['F_CO2_ppm_ms'], marker='o', linestyle='-', color='b',label='Veenkampen')
plt.plot(loo['Timestamp'], loo['F_CO2_ppm_ms'], marker='x', linestyle='-', color='r',label='Loobos')
plt.plot(ams['Timestamp'], ams['F_CO2_ppm_ms'], marker='+', linestyle='-', color='g',label='Amsterdam')

plt.title('CO2 Flux vs Time')
plt.xlabel('Time')
plt.ylabel('CO2 Flux (ppm m s$^{-1}$)')
plt.grid(True)
plt.legend()
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
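
As a unit cross-check on the conversion, the sketch below derives the factor from the ideal gas law: a molar flux divided by the molar density of air (mol m⁻³) gives a mole fraction times a velocity. The cell above divides by 22.414, which is the molar volume at 0 °C and 1 atm in L/mol; in SI units the same quantity is 0.022414 m³/mol and would multiply the flux, so the two results may differ by roughly a factor of two. Which convention is intended is worth confirming. The function name and the default pressure and temperature are assumptions.

```python
R_GAS = 8.314462618  # J/(mol K), universal gas constant

def umol_flux_to_ppm_m_s(flux_umol_m2_s, pressure_pa=101325.0, temperature_k=273.15):
    """Convert a CO2 flux in umol m-2 s-1 to a kinematic flux in ppm m s-1."""
    # Molar density of air (mol m-3) from the ideal gas law
    molar_density = pressure_pa / (R_GAS * temperature_k)
    # (umol m-2 s-1) / (mol m-3) = (umol/mol) m s-1 = ppm m s-1
    return flux_umol_m2_s / molar_density

factor = umol_flux_to_ppm_m_s(1.0)
print(f"1 umol m-2 s-1 corresponds to {factor:.4f} ppm m s-1")
```

Passing the measured pressure and air temperature instead of the defaults would account for the actual air density at the site.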

### 10 min dataset

In [None]:
# 1) Date of interest (the month folder is derived from it):
date_str  = '2024-05-23'    # ← change date here
month_str = date_str[:7]    # '2024-05'
#Edit this before running!!!
# 2) Define the path to the directory containing the data files
data_dir  = rf"C:\path\to\your\Sonic\{month_str}\{date_str}"

input_file = os.path.join(data_dir, 'merged_data_10min.csv')

# 3) Load the data
merged_data_10min = pd.read_csv(input_file)

# Print the column names to verify the data is loaded correctly
print(merged_data_10min.columns)
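
Because the later cells index many columns by name, it can help to check up front that the loaded file contains them. The helper below is a hypothetical sketch; the `REQUIRED` list only names a few of the columns read by the LCL cells and would need extending.

```python
def missing_columns(columns, required):
    """Return the required column names that are absent, in the order given."""
    present = set(columns)
    return [name for name in required if name not in present]

# A few of the columns the LCL cells read (extend as needed)
REQUIRED = ["TIMESTAMP", "BP_mbar_Avg", "Temperature_K_2", "qv_2m", "qv_10m"]

example_header = ["TIMESTAMP", "BP_mbar_Avg", "qv_2m"]
print(missing_columns(example_header, REQUIRED))
# In the notebook: missing_columns(merged_data_10min.columns, REQUIRED)
```

An empty list means every required column is present; anything else names exactly what is missing before a `KeyError` appears deep inside a loop.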

### LCL

In [None]:
'''
# Filter data for the specific timestamp
#specific_timestamp = '2024-05-03 13:10:00'
#filtered_data = merged_data_10min[merged_data_10min['TIMESTAMP'] == specific_timestamp]
# Heights (in meters)
heights_m = np.array([2, 2.99, 4.47, 6.69, 10])

# Extracted qv and qs values (in g/kg) at the given heights for a specific timestamp
qv_values = np.array([filtered_data['qv_2m'].values[0], 
                      filtered_data['qv_2.99m'].values[0], 
                      filtered_data['qv_4.47m'].values[0], 
                      filtered_data['qv_6.69m'].values[0], 
                      filtered_data['qv_10m'].values[0]])

qs_values = np.array([filtered_data['qs_2m'].values[0], 
                      filtered_data['qs_2.99m'].values[0], 
                      filtered_data['qs_4.47m'].values[0], 
                      filtered_data['qs_6.69m'].values[0], 
                      filtered_data['qs_10m'].values[0]])

# Step 1: Linear fit (extrapolation) for qs
# Fit a linear model for qs vs height
qs_slope, qs_intercept = np.polyfit(heights_m, qs_values, 1)

# Create a linear function for qs
def qs_extrapolated(height):
    return qs_slope * height + qs_intercept

# Step 2: Assume constant qv (first value in qv_values array, i.e. the 2 m level)
qv_constant = qv_values[0]

# Step 3: Find the height where qv_constant = qs_extrapolated(height)
# Solve for height: qv_constant = qs_slope * height + qs_intercept
LCL_height = (qv_constant - qs_intercept) / qs_slope

# Plotting qv, qs and the extrapolated qs curve
plt.figure(figsize=(10, 6))

# Plot qv
plt.plot(qv_values, heights_m, label='qv (Specific Humidity)', marker='o', linestyle='--', color='blue')

# Plot qs
plt.plot(qs_values, heights_m, label='qs (Saturation Specific Humidity)', marker='o', linestyle='-', color='orange')

# Plot the extrapolated qs curve
extrapolated_heights = np.linspace(0, LCL_height + 50, 100)  # We extend the height range a bit for visualization
extrapolated_qs = qs_extrapolated(extrapolated_heights)
plt.plot(extrapolated_qs, extrapolated_heights, label='Extrapolated qs', linestyle='-', color='orange', alpha=0.6)

# Plot the LCL (intersection point)
plt.axhline(LCL_height, color='green', linestyle='--', label=f'LCL at {LCL_height:.2f} m')

# Annotations and labels
plt.title(f'qv and qs Profiles with Extrapolated LCL for {specific_timestamp}')
plt.xlabel('Specific Humidity (g/kg)')
plt.ylabel('Height (m)')
plt.grid()
plt.legend()

# Show the plot
plt.show()

print(f"Estimated LCL height: {LCL_height:.2f} meters")
'''

In [None]:
'''
P_surf = filtered_data['BP_mbar_Avg'].values[0]  # Surface pressure in mbar
T_surf = filtered_data['Temperature_K_2.99'].values[0]  # Surface temperature in Kelvin (e.g., from Temperature_K_2 column)
heights_km = np.linspace(0, 10, 100)  # Heights from 0 to 10 km
# Function to calculate saturation specific humidity (qs)
def calculate_saturation_specific_humidity(T, p):
    es = 610.78 * np.exp(17.2694 * (T - Ttrip) / (T - 35.86))
    qs = es * 1000 / (((Rv / Rd) * (p * 100 - es)) + es)
   # qs=Rd*es*1000/(Rv*p*100)
    return qs
# Function to calculate pressure at altitude z given surface pressure and temperature
def calculate_pressure(P_surf, z, T_surf):
    #return P_surf * np.exp(-g * z / (Rd * T))
    return P_surf * (1+(-g*z/(T_surf*Cp)))**(Cp/Rd)

# Arrays to store qs values
qs_values = []

# Calculate qs over height
for z in heights_km * 1000:  # Convert heights to meters
    # Temperature at height z (K), assuming a lapse rate of -9.8 K/km
    T_z = T_surf - 9.8 * (z / 1000)
    
    # Pressure at height z
    p_z = calculate_pressure(P_surf, z, T_surf)
    
    # Saturation specific humidity at height z
    qs_z = calculate_saturation_specific_humidity(T_z, p_z)
    
    # Store qs value
    qs_values.append(qs_z)

# Convert qs_values to a numpy array for plotting
qs_values = np.array(qs_values)

qv_extrapolated = np.interp(heights_km, heights_m / 1000, qv_values)  # heights_m converted to km



# Find the LCL: where qv and qs intersect
LCL_height = None
for i in range(len(heights_km)):
    if qv_extrapolated[i] >= qs_values[i]:  # Find the point where qv crosses qs
        LCL_height = heights_km[i]  # Height at which they intersect
        break

# Plotting qv and qs
plt.figure(figsize=(10, 6))
plt.plot(qv_extrapolated, heights_km, label='Extrapolated qv (Specific Humidity)', color='blue', linestyle='--', marker='o')
plt.plot(qs_values, heights_km, label='qs (Saturation Specific Humidity)', color='orange', linestyle='-', marker='o')

# Mark the LCL
if LCL_height is not None:
    plt.axhline(y=LCL_height, color='green', linestyle='--', label=f'LCL at {LCL_height:.2f} km')

# Annotations and labels
plt.title('Specific Humidity (qv) and Saturation Specific Humidity (qs)')
plt.xlabel('Specific Humidity (g/kg)')
plt.ylabel('Height (km)')
plt.grid()
plt.legend()
plt.show()

if LCL_height is not None:
    print(f"Estimated LCL height (Cloud Base Height): {LCL_height:.2f} km")
else:
    print("No LCL found where qv intersects qs.")
# Optional: Print the qs values at each height for inspection
for h, qs in zip(heights_km, qs_values):
    print(f"Height: {h:.2f} km, Saturation Specific Humidity: {qs:.4f} g/kg")
'''

In [None]:


measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])

# Heights in kilometers
heights_km = np.linspace(0, 10, 100)  # Heights from 0 to 10 km


# Prepare to calculate LCL for each timestamp in merged_data_10min
lcl_list = []

# Loop through each row in the DataFrame
for index, row in merged_data_10min.iterrows():
    # Extract surface pressure and temperatures for this timestamp
    P_surf = row['BP_mbar_Avg']  # Surface pressure in mbar
    T_surf = row['Temperature_K_2']  # Surface temperature in Kelvin

    # Extract qv values for this timestamp
    qv_values = np.array([
        row['qv_2m'],
        row['qv_2.99m'],
        row['qv_4.47m'],
        row['qv_6.69m'],
        row['qv_10m']
    ])  # Specific humidity in g/kg

    # Arrays to store qs values
    qs_values = []

    # Calculate qs over height
    for z in heights_km * 1000:  # Convert heights to meters
        # Temperature at height z (K), assuming a lapse rate of -9.8 K/km
        T_z = T_surf - 9.8 * (z / 1000)
        
        # Pressure at height z
        p_z = calculate_pressure(P_surf, z, T_surf)
        
        # Saturation specific humidity at height z
        qs_z = calculate_saturation_specific_humidity_lcl(T_z, p_z)
        
        # Store qs value
        qs_values.append(qs_z)

    # Convert qs_values to a numpy array for further processing
    qs_values = np.array(qs_values)

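    # Map qv onto the height grid (np.interp holds the end values constant
    # outside the measured 2-10 m range, so qv is constant above 10 m)
    qv_extrapolated = np.interp(heights_km, measured_heights / 1000, qv_values)

    # Find the LCL: lowest grid height where qv reaches qs
    LCL_height = np.nan  # stays NaN when qv never reaches qs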
    for i in range(len(heights_km)):
        if qv_extrapolated[i] >= qs_values[i]:  # Find the point where qv crosses qs
            LCL_height = heights_km[i]  # Height at which they intersect
            break

    # Append LCL height for this timestamp
    lcl_list.append(LCL_height)

# Convert LCL heights list to a DataFrame column
merged_data_10min['LCL_qv'] = lcl_list



# Convert the 'TIMESTAMP' column to datetime format for better plotting
merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])

# Plotting LCL over time
plt.figure(figsize=(12, 6))
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv'], marker='o', linestyle='-')
plt.title('Lifting Condensation Level (LCL) Over Time')
plt.xlabel('Timestamp')
plt.ylabel('LCL Height (km)')
plt.xticks(rotation=45)
plt.grid()
plt.tight_layout()
plt.show()
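
The per-row height loop above can also be written with NumPy array operations, which avoids the inner Python loop over 100 to 1000 grid points. The sketch below reuses the same formulas and constants; the function name `lcl_from_qv_km` and the test inputs are assumptions, and the result is still limited to the resolution of the height grid.

```python
import numpy as np

CP, G, RD, RV, T_TRIP = 1005.0, 9.81, 287.04, 461.5, 273.16

def lcl_from_qv_km(T_surf, P_surf_hpa, qv_levels, level_heights_m, heights_km):
    """First grid height (km) where qv >= qs along a dry adiabat; NaN if none."""
    z_m = heights_km * 1000.0
    T_z = T_surf - 9.8 * heights_km                                  # 9.8 K/km lapse rate
    p_z = P_surf_hpa * (1.0 - G * z_m / (T_surf * CP)) ** (CP / RD)  # hPa
    es = 610.78 * np.exp(17.2694 * (T_z - T_TRIP) / (T_z - 35.86))   # Pa
    qs = es * 1000.0 / ((RV / RD) * (p_z * 100.0 - es) + es)         # g/kg
    # np.interp holds the end values constant outside the measured levels
    qv = np.interp(heights_km, np.asarray(level_heights_m) / 1000.0, qv_levels)
    crossed = qv >= qs
    return float(heights_km[np.argmax(crossed)]) if crossed.any() else float("nan")

heights_km = np.linspace(0, 10, 1000)
levels_m = [2, 2.99, 4.47, 6.69, 10]
lcl_km = lcl_from_qv_km(290.0, 1013.0, [8.0] * 5, levels_m, heights_km)
print(f"LCL = {lcl_km:.2f} km")
```

In the notebook this could be applied row by row, for example with `merged_data_10min.apply(..., axis=1)`, while keeping the column names used in the loop.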


In [None]:

# Example measured heights in meters
measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # Measured heights
heights_km = np.linspace(0, 10, 1000)  # Heights in kilometers, ranging from 0 to 10 km

# Prepare to calculate LCL for each timestamp and each T_surf from different heights
lcl_dict = {f'LCL_qv_T_{h}m': [] for h in measured_heights}

# Loop through each row in the DataFrame
for index, row in merged_data_10min.iterrows():
    
    # Extract qv values for this timestamp (specific humidity)
    qv_values = np.array([
        row['qv_2m'],
        row['qv_2.99m'],
        row['qv_4.47m'],
        row['qv_6.69m'],
        row['qv_10m']
    ])  # Specific humidity in g/kg

    # Extract measured temperatures for this timestamp (in Kelvin)
    measured_temperatures = np.array([
        row['Temperature_K_2'],
        row['Temperature_K_2.99'],
        row['Temperature_K_4.47'],
        row['Temperature_K_6.69'],
        row['Temperature_K_10']
    ])  # Temperatures at corresponding heights

    # Loop over the temperature measured at each mast height, using each one as T_surf
    for t_idx, T_surf in enumerate(measured_temperatures):
        # Surface pressure for this timestamp (assumed constant across heights)
        P_surf = row['BP_mbar_Avg']  # Surface pressure in mbar

        # Calculate qs (saturation specific humidity) over height for this T_surf
        qs_values = []
        for z in heights_km * 1000:  # Convert heights from km to meters
            # Temperature at height z (K), assuming a lapse rate of -9.8 K/km
            T_z = T_surf - 9.8 * (z / 1000)

            # Pressure at height z
            p_z = calculate_pressure(P_surf, z, T_surf)

            # Saturation specific humidity at height z
            qs_z = calculate_saturation_specific_humidity_lcl(T_z, p_z)

            # Store qs value
            qs_values.append(qs_z)

        # Convert qs_values to a numpy array for further processing
        qs_values = np.array(qs_values)

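        # Map qv onto the height grid (np.interp holds the end values constant
        # outside the measured 2-10 m range, so qv is constant above 10 m)
        qv_extrapolated = np.interp(heights_km, measured_heights / 1000, qv_values)

        # Find the LCL: lowest grid height where qv reaches qs
        LCL_height = np.nan  # stays NaN when qv never reaches qs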
        for i in range(len(heights_km)):
            if qv_extrapolated[i] >= qs_values[i]:  # Find the point where qv crosses qs
                LCL_height = heights_km[i]  # Height at which they intersect
                break

        # Append LCL height for this T_surf (for the corresponding height)
        height_label = measured_heights[t_idx]
        lcl_dict[f'LCL_qv_T_{height_label}m'].append(LCL_height)

# Convert LCL heights list to new columns in the DataFrame
for height_label, lcl_values in lcl_dict.items():
    merged_data_10min[height_label] = lcl_values

# Convert the 'TIMESTAMP' column to datetime format for better plotting
merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])

# Plotting LCL over time for each surface temperature
plt.figure(figsize=(12, 6))

for height_label in measured_heights:
    plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min[f'LCL_qv_T_{height_label}m'],
             marker='o', linestyle='-', label=f'T_surf = {height_label}m')

# Format the x-axis to show time in HH:MM format
date_format = DateFormatter('%H:%M')
plt.gca().xaxis.set_major_formatter(date_format)

# Customize plot labels and title with larger fonts
plt.title('Lifting Condensation Level (LCL) Over Time for Different Surface Temperatures', fontsize=18)
plt.xlabel('Timestamp (UTC)', fontsize=16)
plt.ylabel('LCL Height (km)', fontsize=16)

# Rotate x-axis ticks for better readability
plt.xticks(rotation=45, fontsize=12)
plt.yticks(fontsize=12)

# Show legend
plt.legend(title='Surface Temperature Height', fontsize=12)

# Show the grid
plt.grid(True)

# Adjust layout to prevent overlap
plt.tight_layout()

# Display plot
plt.show()


In [None]:
# === Plot LCL profile for a specific timestamp (e.g., index 80) ===

index = 80  # Choose the timestamp row you want to plot
row = merged_data_10min.iloc[index]

# Use the 2.99 m temperature as T_surf
T_surf = row['Temperature_K_2.99']
P_surf = row['BP_mbar_Avg']

# Get qv values
qv_values = np.array([
    row['qv_2m'],
    row['qv_2.99m'],
    row['qv_4.47m'],
    row['qv_6.69m'],
    row['qv_10m']
])

# Calculate qs profile
qs_profile = []
for z in heights_km * 1000:
    T_z = T_surf - 9.8 * (z / 1000)  # Dry lapse rate
    p_z = calculate_pressure(P_surf, z, T_surf)
    qs_z = calculate_saturation_specific_humidity_lcl(T_z, p_z)
    qs_profile.append(qs_z)
qs_profile = np.array(qs_profile)

# Interpolate qv to full height range
qv_profile = np.interp(heights_km, measured_heights / 1000, qv_values)

# Find LCL
LCL_height = np.nan
for i in range(len(heights_km)):
    if qv_profile[i] >= qs_profile[i]:
        LCL_height = heights_km[i]
        break

# Plot nicely formatted LCL diagnostic
plt.figure(figsize=(8, 6))

# Plot extrapolated qv
plt.plot(qv_profile, heights_km, 'o', color='royalblue', markersize=4, label=r'$q_v$ (extrapolated)')

# Plot qs profile
plt.plot(qs_profile, heights_km, '-', color='darkorange', linewidth=2, label=r'$q_{sat}$ (saturation)')

# Plot LCL height
if not np.isnan(LCL_height):
    plt.axhline(LCL_height, color='seagreen', linestyle='--', linewidth=1.5, label=fr'LCL $\approx$ {LCL_height:.2f} km')

# Axes labels and title
plt.xlabel('Specific Humidity (g/kg)', fontsize=13)
plt.ylabel('Height (km)', fontsize=13)
plt.title(f'LCL Profile at {row["TIMESTAMP"]:%Y-%m-%d %H:%M:%S}', fontsize=14, weight='bold')

# Ticks and grid
plt.xticks(fontsize=11)
plt.yticks(fontsize=11)
plt.grid(True, linestyle='--', linewidth=0.5, alpha=0.7)

# Legend
plt.legend(fontsize=11, loc='upper right')

# Layout
plt.tight_layout()
plt.show()


In [None]:

# 1) Pick a row (timestamp) to plot
index = 80
row = merged_data_10min.iloc[index]

# 2) Extract mast‐level temperatures (K) at each height (m)
measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # in meters
measured_temperatures = np.array([
    row['Temperature_K_2'],
    row['Temperature_K_2.99'],
    row['Temperature_K_4.47'],
    row['Temperature_K_6.69'],
    row['Temperature_K_10']
])

# 3) Surface pressure for computing pressure profile
P_surf = row['BP_mbar_Avg']  # in mbar

# 4) Build measured qv at mast levels and interpolate onto heights_km
qv_values = np.array([
    row['qv_2m'],
    row['qv_2.99m'],
    row['qv_4.47m'],
    row['qv_6.69m'],
    row['qv_10m']
])
qv_profile = np.interp(heights_km, measured_heights / 1000.0, qv_values)

# 5) Colorblind‐friendly palette and distinct markers
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#9467bd', '#8c564b']
markers = ['o', 'v', 's', 'D', 'X']

# 6) Create a 10×6″ canvas
fig, ax = plt.subplots(figsize=(10, 6))

# 6a) Plot interpolated qv profile (small black circles)
ax.plot(
    qv_profile,
    heights_km,
    linestyle='None',
    marker='o',
    color='black',
    markersize=3,
    label=r'$q_v$ (interpolated)'
)

# 6b) Loop over each mast‐level temperature to compute qs(z)
for t_idx, T_surf in enumerate(measured_temperatures):
    height_m = measured_heights[t_idx]
    color = colors[t_idx]
    marker = markers[t_idx]

    qs_profile = []
    for z_km in heights_km:
        z_m = z_km * 1000.0
        T_z = T_surf - 9.8 * (z_m / 1000.0)
        p_z = calculate_pressure(P_surf, z_m, T_surf)
        qs_z = calculate_saturation_specific_humidity_lcl(T_z, p_z)
        qs_profile.append(qs_z)
    qs_profile = np.array(qs_profile)

    # 6c) Determine LCL crossing (where qv_profile ≥ qs_profile)
    LCL_height_km = np.nan
    for i_h, z_km in enumerate(heights_km):
        if qv_profile[i_h] >= qs_profile[i_h]:
            LCL_height_km = z_km
            break

    # 6d) Plot the saturation‐specific‐humidity curve (larger marker)
    ax.plot(
        qs_profile,
        heights_km,
        linestyle='None',
        marker=marker,
        markersize=3,
        color=color,
        label=fr'$q_{{sat}}$ (T$_{{{height_m:.2f}\,m}}$)'
    )

    # 6e) If LCL is found, draw a horizontal line (thicker)
    if not np.isnan(LCL_height_km):
        ax.axhline(
            LCL_height_km,
            color=color,
            linestyle=':',
            linewidth=2.0,
            label=fr'LCL T$_{{{height_m:.2f}\,m}} \approx {LCL_height_km:.2f}$ km'
        )

# 7) Final formatting

# 7a) Axis labels and title (larger, bold)
ax.set_xlabel('q$_v$ (g/kg)', fontsize=16, fontweight='bold')
ax.set_ylabel('Height (km)', fontsize=16, fontweight='bold')
ax.set_title(
    f'LCL Profiles at {row["TIMESTAMP"]:%Y-%m-%d %H:%M:%S}',
    fontsize=18,
    fontweight='bold'
)

# 7b) Tick parameters (major = 14 pt, thicker ticks)
ax.tick_params(axis='both', which='major', labelsize=14, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# 7c) Thicken spines to match other figures
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# 7d) Grid styling (major dashed, minor dotted)
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# 7e) Legend inside the plot area (e.g., upper right), larger font
legend = ax.legend(
    fontsize=12,
    title='Legend',
    title_fontsize=13,
    frameon=True,
    loc='upper right'
)
legend.get_frame().set_linewidth(1.5)

# 7f) Limit vertical axis to 0–4 km
ax.set_ylim(0, 4.0)

# 7g) Use identical tight_layout rect so axes occupy 78% width
plt.tight_layout(rect=[0, 0, 0.78, 1.0])

# 8) Save the figure in data_dir at 300 dpi
output_path = os.path.join(data_dir, 'LCL_qv.png')
plt.savefig(output_path, dpi=300, bbox_inches='tight')
print(f"Figure saved to: {output_path}")

plt.show()
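
The crossing search in step 6c (first height where the interpolated qv reaches the saturation curve) can also be written without an explicit loop. The sketch below is only an illustration: `first_crossing_height` and the small demo arrays are hypothetical and not part of the notebook.

```python
import numpy as np

def first_crossing_height(heights_km, qv_profile, qs_profile):
    """Return the lowest height where qv >= qs, or NaN if the curves never cross."""
    heights_km = np.asarray(heights_km, dtype=float)
    saturated = np.asarray(qv_profile, dtype=float) >= np.asarray(qs_profile, dtype=float)
    if not saturated.any():
        return np.nan
    # np.argmax returns the index of the first True in a boolean array
    return float(heights_km[np.argmax(saturated)])

# Hypothetical demo values (g/kg), not measured data
heights = np.array([0.0, 0.5, 1.0, 1.5])
qv = np.array([8.0, 8.0, 8.0, 8.0])
qs = np.array([12.0, 9.5, 7.5, 6.0])
lcl_km = first_crossing_height(heights, qv, qs)
```

As in the loop version, NaN values never count as a crossing, because any comparison with NaN evaluates to False.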


In [None]:
# Extract the IR20Dn values
lw_dn = merged_data_10min['IR20Dn']
Tcld = (lw_dn / sigma)**0.25  # Temperature in Kelvin

merged_data_10min['Tcld']=Tcld

# Add Tcld to the filtered data for reference
#print(merged_data_10min)
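
The cell above inverts the Stefan-Boltzmann law to turn the downwelling longwave flux into an equivalent temperature. A minimal standalone sketch follows, assuming the notebook's `sigma` is the Stefan-Boltzmann constant in W m⁻² K⁻⁴. `SIGMA`, `brightness_temperature`, and the `emissivity` parameter are hypothetical names introduced here for illustration.

```python
import numpy as np

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4 (assumed equal to the notebook's sigma)

def brightness_temperature(lw_flux, emissivity=1.0):
    """Invert LW = emissivity * SIGMA * T**4 for T (K); works on scalars and arrays."""
    lw_flux = np.asarray(lw_flux, dtype=float)
    return (lw_flux / (emissivity * SIGMA)) ** 0.25

# Round trip: a 285 K blackbody should map back to 285 K
flux_285 = SIGMA * 285.0 ** 4
t_back = float(brightness_temperature(flux_285))
```

With `emissivity=1.0` this reproduces the blackbody assumption used in the cell; a lower emissivity would give a higher temperature for the same flux.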

In [None]:
'''
# Filter data for the specific timestamp
#specific_timestamp = '2024-05-03 13:10:00'
#filtered_data = merged_data_10min[merged_data_10min['TIMESTAMP'] == specific_timestamp]

# Constants for lapse rate
lapse_rate = -9.8  # °C/km
max_height = 10.0  # km
heights_km = np.arange(0, max_height + 0.1, 0.1)  # Heights from 0 to 10 km in increments of 0.1 km

# Measured heights and temperatures (in Kelvin)
measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # Heights in meters
measured_temperatures = np.array([filtered_data['Temperature_K_2'].values[0],
                                   filtered_data['Temperature_K_2.99'].values[0],
                                   filtered_data['Temperature_K_4.47'].values[0],
                                   filtered_data['Temperature_K_6.69'].values[0],
                                   filtered_data['Temperature_K_10'].values[0]])  # Temperatures in Kelvin

# Extrapolate each measured temperature
extrapolated_temperatures = []
for temp, height in zip(measured_temperatures, measured_heights):
    # Calculate temperature at each height using lapse rate
    extrapolated_temp = temp + lapse_rate * (heights_km - height / 1000)  # Convert height from m to km
    extrapolated_temperatures.append(extrapolated_temp)

# Convert the list to an array for plotting
extrapolated_temperatures = np.array(extrapolated_temperatures)

# Initialize list to store LCL data
lcl_data = []

# Calculate LCL heights
for i in range(len(measured_temperatures)):
    # Find where the extrapolated temperature equals Tcld
    tcld_value = filtered_data['Tcld'].values[0]
    for j, temp in enumerate(extrapolated_temperatures[i]):
        if temp <= tcld_value:
            lcl_height_km = heights_km[j]  # Height in km where Tcld is reached
            lcl_data.append({'Measured Height (m)': measured_heights[i], 'LCL Height (km)': lcl_height_km})
            break

# Create DataFrame from LCL data
lcl_df = pd.DataFrame(lcl_data)

# Plotting
plt.figure(figsize=(10, 6))

# Plot extrapolated temperatures for each measured temperature
for i in range(len(measured_temperatures)):
    plt.plot(extrapolated_temperatures[i], heights_km, label=f'Extrapolated from {measured_heights[i]} m', linestyle='--')

# Plot measured temperatures
plt.scatter(measured_temperatures, measured_heights / 1000, color='orange', label='Measured Temperatures', zorder=5)

# Add vertical line for Tcld
plt.axvline(x=filtered_data['Tcld'].values[0], color='blue', linestyle='-', label=f'Tcld: {filtered_data["Tcld"].values[0]:.2f} K')

# Plot LCL heights
#plt.scatter(lcl_df['LCL Height (km)'], lcl_df['Measured Height (m)']/1000, color='red', label='LCL Heights', zorder=5)

# Annotations and legend
plt.title('Extrapolated Temperature Profiles vs Height and LCL Levels')
plt.xlabel('Temperature (K)')
plt.ylabel('Height (km)')
plt.grid()
plt.legend()
plt.show()

# Display LCL DataFrame
print(lcl_df)
'''

In [None]:
# Constants for lapse rate
lapse_rate = -9.8  # °C/km
max_height = 10.0  # km
heights_km = np.arange(0, max_height + 0.1, 0.01)  # Heights from 0 to just above 10 km in increments of 0.01 km

# Measured heights and temperatures (in Kelvin)
measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # Heights in meters

# Initialize a list to store LCL heights for each timestamp
lcl_list = []

# Loop through each row in the DataFrame
for index, row in merged_data_10min.iterrows():
    # Get Tcld for the current timestamp
    tcld_value = row['Tcld']
    
    # Measured temperatures for this row
    measured_temperatures = np.array([
        row['Temperature_K_2'],
        row['Temperature_K_2.99'],
        row['Temperature_K_4.47'],
        row['Temperature_K_6.69'],
        row['Temperature_K_10']
    ])  # Temperatures in Kelvin

    # Initialize a list for LCL heights for this row
    lcl_heights = []

    # Extrapolate each measured temperature
    for i in range(len(measured_temperatures)):
        temp = measured_temperatures[i]
        found_lcl = False
        for j, h in enumerate(heights_km):
            extrapolated_temp = temp + lapse_rate * (h - (measured_heights[i] / 1000))
            if extrapolated_temp <= tcld_value:
                lcl_heights.append(h)  # Height in km where Tcld is reached
                found_lcl = True
                break
        if not found_lcl:
            lcl_heights.append(np.nan)  # No LCL found for this temperature

    # Append the list of LCL heights for this timestamp
    lcl_list.append(lcl_heights)
    #print(lcl_list)
# Convert LCL heights list to a DataFrame column
#lcl_array = pd.DataFrame(lcl_list, columns=['LCL_2m', 'LCL_2.99m', 'LCL_4.47m', 'LCL_6.69m', 'LCL_10m'])

# Combine the LCL heights with the original DataFrame
#merged_data_10min = pd.concat([merged_data_10min, lcl_array], axis=1)

# Display the DataFrame with LCL heights
#print(merged_data_10min[['TIMESTAMP', 'LCL_2m', 'LCL_2.99m', 'LCL_4.47m', 'LCL_6.69m', 'LCL_10m']])#LCL in km
# Add the LCL heights as a new column to the DataFrame
merged_data_10min['LCL'] = lcl_list

# Display the DataFrame with LCL heights
print(merged_data_10min[['TIMESTAMP', 'LCL']])
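
Because the extrapolated profile is linear in height, the level where it reaches `Tcld` can also be solved in closed form instead of scanning the height grid. The sketch below is an alternative under the same 9.8 K/km lapse-rate assumption, not the notebook's method. `radiative_lcl_km` and its parameters are hypothetical names.

```python
import numpy as np

def radiative_lcl_km(temps_k, sensor_heights_m, t_cld_k,
                     lapse_k_per_km=9.8, max_height_km=10.0):
    """Height (km) where T(z) = T0 - lapse * (z - z0) equals t_cld_k."""
    temps_k = np.asarray(temps_k, dtype=float)
    z0_km = np.asarray(sensor_heights_m, dtype=float) / 1000.0
    h = z0_km + (temps_k - t_cld_k) / lapse_k_per_km
    h = np.where(h < 0.0, 0.0, h)               # already colder than Tcld at the ground
    h = np.where(h > max_height_km, np.nan, h)  # no crossing below the search ceiling
    return h

# Hypothetical demo values
demo = radiative_lcl_km([290.0, 289.9], [2.0, 10.0], 280.2)
```

The grid search returns the first grid level at or above the crossing, so the two approaches can differ by up to one grid step (0.01 km).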


In [None]:

# 1) Select the same timestamp as before
index = 80
row = merged_data_10min.iloc[index]

# 2) Extract mast heights (m) and surface temperatures (K)
measured_heights = np.array([2, 2.99, 4.47, 6.69, 10])  # meters
measured_temperatures = np.array([
    row['Temperature_K_2'],
    row['Temperature_K_2.99'],
    row['Temperature_K_4.47'],
    row['Temperature_K_6.69'],
    row['Temperature_K_10']
])

# 3) Compute brightness‐temperature‐equivalent for cloud base
T_cld = (row['IR20Dn'] / sigma) ** 0.25  # K

# 4) Define a height array from 0 to 4 km
heights_km = np.linspace(0, 4, 100)

# 5) Choose consistent, color‐blind‐friendly palette and markers
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#9467bd', '#8c564b']
markers = ['o', 'v', 's', 'D', 'X']

# 6) Create figure and axes with the same 10×6″ size as other plots
fig, ax = plt.subplots(figsize=(10, 6))

# 6a) Loop over each mast‐level surface temperature to plot dry‐adiabatic profile
for t_idx, T_surf in enumerate(measured_temperatures):
    height_label = measured_heights[t_idx]
    color = colors[t_idx]
    marker = markers[t_idx]

    # Dry‐adiabatic temperature profile: T(z) = T_surf − 9.8 K/km × z_km
    temp_profile = T_surf - 9.8 * heights_km

    # Find cloud‐base height where temp_profile ≤ T_cld
    cbh = np.nan
    for i, z_km in enumerate(heights_km):
        if temp_profile[i] <= T_cld:
            cbh = z_km
            break

    # Plot dry‐adiabatic profile with large markers
    ax.plot(
        temp_profile,
        heights_km,
        linestyle='None',
        marker=marker,
        markersize=3,
        color=color,
        label=fr'T$_{{{height_label:.2f}\,\mathrm{{m}}}}$ profile'
    )

    # Plot horizontal line at CBH if found
    if not np.isnan(cbh):
        ax.axhline(
            cbh,
            linestyle=':',
            color=color,
            linewidth=2.0,
            label=fr'CBH T$_{{{height_label:.2f}\,\mathrm{{m}}}} \approx {cbh:.2f}\,\mathrm{{km}}$'
        )

# 6b) Plot vertical line for brightness temperature T_cld
ax.axvline(
    T_cld,
    color='black',
    linestyle='--',
    linewidth=2.0,
    label=fr'$T_{{cld}}$ = {T_cld:.2f} K'
)

# 7) Final formatting

# 7a) Axis labels and title with larger fonts to match other figures
ax.set_xlabel('Temperature (K)', fontsize=16, fontweight='bold')
ax.set_ylabel('Height (km)', fontsize=16, fontweight='bold')
ax.set_title(
    f'Radiative CBH Profiles at {row["TIMESTAMP"]:%Y-%m-%d %H:%M:%S}',
    fontsize=18,
    fontweight='bold'
)

# 7b) Tick parameters: major ticks at 14 pt, thicker lines
ax.tick_params(axis='both', which='major', labelsize=14, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# 7c) Thicken spines (axis borders)
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# 7d) Grid styling: major dashed, minor dotted
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# 7e) Legend inside the plot area (upper right), with larger font
legend = ax.legend(
    fontsize=12,
    title='Profiles & CBH',
    title_fontsize=13,
    frameon=True,
    loc='upper right'
)
legend.get_frame().set_linewidth(1.5)

# 7f) Restrict vertical axis to 0–4 km (same as heights_km)
ax.set_ylim(0, 4.0)

# 7g) Use identical tight_layout rectangle so axes occupy 78% width
plt.tight_layout(rect=[0, 0, 0.78, 1.0])

# 8) Save figure in data_dir at 300 dpi
output_path = os.path.join(data_dir, 'LCL_Tcld.png')
plt.savefig(output_path, dpi=300, bbox_inches='tight')
print(f"Figure saved to: {output_path}")

plt.show()


In [None]:
# Plot the radiative LCL from every mast level, plus the qv-based LCL, over time
plt.figure(figsize=(10, 6))
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[0])
, marker='o', label='LCL T2m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[1])
, marker='o', label='LCL T2.99m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[2])
, marker='o', label='LCL T4.47m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[3])
, marker='o', label='LCL T6.69m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[4])
, marker='o', label='LCL T10m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv'], marker='o', linestyle='-',label='LCL from qv')

plt.xlabel('Timestamp')
plt.ylabel('LCL Height (km)')
plt.title('LCL from All Mast Levels Over Time')
plt.xticks(rotation=45)  # Rotate timestamp labels for better readability
plt.grid(True)
plt.legend()
plt.tight_layout()
plt.show()
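
The repeated `.apply(lambda x: x[i])` calls unpack the list-valued `LCL` column one level at a time. One possible alternative is to expand the column once into one column per mast level. In the sketch below, `expand_lcl_column` is a hypothetical helper, and the column names are taken from the commented-out line in the LCL cell.

```python
import numpy as np
import pandas as pd

LEVEL_NAMES = ['LCL_2m', 'LCL_2.99m', 'LCL_4.47m', 'LCL_6.69m', 'LCL_10m']

def expand_lcl_column(df, source='LCL', names=LEVEL_NAMES):
    """Return df with one extra column per mast level, built from a list-valued column."""
    expanded = pd.DataFrame(df[source].tolist(), index=df.index, columns=names)
    return df.join(expanded)

# Hypothetical two-row demo frame
demo = pd.DataFrame({
    'TIMESTAMP': pd.to_datetime(['2024-05-23 10:00', '2024-05-23 10:10']),
    'LCL': [[1.0, 1.1, 1.2, 1.25, 1.3], [0.9, np.nan, 1.0, 1.05, 1.1]],
})
wide = expand_lcl_column(demo)
```

After expansion, each level can be plotted directly, for example `plt.plot(wide['TIMESTAMP'], wide['LCL_2m'])`.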


In [None]:
# (Assume merged_data_10min is already loaded, with columns:
#    • “LCL”                → shape (n_times, 5) after vstack
#    • “LCL_qv_T_2.0m” … etc → length‐n_times each
#  And data_dir is defined.)

# 1) Stack radiative‐LCL and qv‐LCL into 2D arrays of shape (n_times, 5)
radiative_LCL = np.vstack(merged_data_10min["LCL"].values)  # (n_times, 5)
qv_LCL = np.column_stack([
    merged_data_10min["LCL_qv_T_2.0m"].values,
    merged_data_10min["LCL_qv_T_2.99m"].values,
    merged_data_10min["LCL_qv_T_4.47m"].values,
    merged_data_10min["LCL_qv_T_6.69m"].values,
    merged_data_10min["LCL_qv_T_10.0m"].values
])  # (n_times, 5)

# 2) Height labels and a single color for each subplot (you can re‐use colors if you like):
height_labels = ["2 m", "2.99 m", "4.47 m", "6.69 m", "10 m"]
subplot_colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#9467bd', '#8c564b']

# 3) Create a 1×5 grid of subplots, share x‐ and y‐limits for uniform appearance
fig, axes = plt.subplots(
    nrows=1,
    ncols=5,
    figsize=(18, 4),  # wide enough to accommodate five panels
    sharex=True,
    sharey=True
)

# 4) Determine maximum LCL (both methods) over all heights/times for a common axis range
all_vals = np.concatenate([radiative_LCL.flatten(), qv_LCL.flatten()])
vmax = np.nanmax(all_vals) * 1.05  # a little padding

for j, ax in enumerate(axes):
    x = radiative_LCL[:, j]
    y = qv_LCL[:, j]
    mask_both = np.isfinite(x) & np.isfinite(y)

    # 4a) Plot paired points (both methods valid) in one color
    ax.scatter(
        x[mask_both],
        y[mask_both],
        s=20,
        alpha=0.6,
        color=subplot_colors[j],
        marker='o',
        edgecolor='black'
    )

    # 4b) Plot “radiative‐only” points below y = 0.0 (if desired)
    mask_rad_only = np.isfinite(x) & ~np.isfinite(y)
    if mask_rad_only.sum() > 0:
        ax.scatter(
            x[mask_rad_only],
            np.full(mask_rad_only.sum(), -0.05),
            s=15,
            color='gray',
            marker='x',
            alpha=0.4
        )

    # 4c) Plot “qᵥ‐only” points left of x = 0.0 (if desired)
    mask_qv_only = ~np.isfinite(x) & np.isfinite(y)
    if mask_qv_only.sum() > 0:
        ax.scatter(
            np.full(mask_qv_only.sum(), -0.05),
            y[mask_qv_only],
            s=30,
            color='black',
            marker='+',
            alpha=0.2
        )

    # 5) Reference 1:1 line (drawn on every subplot)
    ax.plot([0, vmax], [0, vmax], linestyle='--', color='gray', linewidth=1.2)

    # 6) Formatting per subplot
    ax.set_xlim(-0.1, vmax)
    ax.set_ylim(-0.1, vmax)
    ax.set_aspect('equal', 'box')

    # Only label outer axes
    if j == 0:
        ax.set_ylabel("Thermo LCL (qᵥ) (km)", fontsize=12, fontweight='bold')
    ax.set_title(f"{height_labels[j]}", fontsize=12, fontweight='bold')
    ax.tick_params(labelsize=10)

# 7) Common X‐label (put below all subplots)
fig.text(0.5, -0.02, "Radiative LCL (km)", ha='center', fontsize=12, fontweight='bold')

# 8) Overall figure title
fig.suptitle("Radiative vs. Thermodynamic (qᵥ) LCL, by Start Height", fontsize=16, fontweight='bold')

plt.tight_layout(rect=[0, 0.05, 1, 0.95])

# 9) Save the figure in data_dir at 300 dpi
#save_path = os.path.join(data_dir, "LCL_scatter_each_height.png")
#plt.savefig(save_path, dpi=300, bbox_inches='tight')
#print(f"Figure saved to: {save_path}")

plt.show()
# 1) Reconstruct the 2D arrays if you haven’t already:
radiative_LCL = np.vstack(merged_data_10min["LCL"].values)  
qv_LCL = np.column_stack([
    merged_data_10min["LCL_qv_T_2.0m"].values,
    merged_data_10min["LCL_qv_T_2.99m"].values,
    merged_data_10min["LCL_qv_T_4.47m"].values,
    merged_data_10min["LCL_qv_T_6.69m"].values,
    merged_data_10min["LCL_qv_T_10.0m"].values
])

n_times, n_heights = radiative_LCL.shape
print(f"Total time steps: {n_times}, Heights: {n_heights}")

# 2) Build boolean masks
valid_rad  = np.isfinite(radiative_LCL)
valid_qv   = np.isfinite(qv_LCL)
mask_both  = valid_rad & valid_qv
mask_rad_only = valid_rad & (~valid_qv)
mask_qv_only  = (~valid_rad) & valid_qv

# 3) Count “both” per height
both_counts = np.sum(mask_both, axis=0)         # length = 5
rad_only_counts = np.sum(mask_rad_only, axis=0)
qv_only_counts  = np.sum(mask_qv_only, axis=0)

# 4) Display results
height_labels = ["2 m", "2.99 m", "4.47 m", "6.69 m", "10 m"]
print("\nPoint Counts by Height (radiative vs. qᵥ):")
for idx, label in enumerate(height_labels):
    print(f"  • {label}:")
    print(f"      – Both valid:     {both_counts[idx]}")
    print(f"      – Radiative-only: {rad_only_counts[idx]}")
    print(f"      – qᵥ-only:        {qv_only_counts[idx]}")

# 5) Totals across all heights
total_both = both_counts.sum()
total_rad_only = rad_only_counts.sum()
total_qv_only  = qv_only_counts.sum()

print("\nOverall Totals across all heights:")
print(f"  • Both valid (plotted normally): {total_both}")
print(f"  • Radiative-only (plotted at y=–0.05): {total_rad_only}")
print(f"  • qᵥ-only (plotted at x=–0.05):    {total_qv_only}")


In [None]:
# Create the plot with a larger size for better visibility
plt.figure(figsize=(12, 7))

# Plot LCL data with different markers but less intense black (alpha adjusted)
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[0]), 
         marker='o', linestyle='', color='black', alpha=0.5, label='LCL T2m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[1]), 
         marker='v', linestyle='', color='black', alpha=0.5, label='LCL T2.99m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[2]), 
         marker='s', linestyle='', color='black', alpha=0.5, label='LCL T4.47m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[3]), 
         marker='D', linestyle='', color='black', alpha=0.5, label='LCL T6.69m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[4]), 
         marker='x', linestyle='', color='black', alpha=0.5, label='LCL T10m')

# Now, we will add the LCL from qv with the same markers as the above plots but with a red color
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_2.0m'], 
         marker='o', linestyle='', color='red', alpha=0.5, label='LCL qv T2m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_2.99m'], 
         marker='v', linestyle='', color='red', alpha=0.5, label='LCL qv T2.99m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_4.47m'], 
         marker='s', linestyle='', color='red', alpha=0.5, label='LCL qv T4.47m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_6.69m'], 
         marker='D', linestyle='', color='red', alpha=0.5, label='LCL qv T6.69m')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_10.0m'], 
         marker='x', linestyle='', color='red', alpha=0.5, label='LCL qv T10m')

# Formatting the x-axis for a full day (00:00 to 23:59)
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))  # Format to display hours and minutes
plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=2))   # Set tick locator to display every 2 hours
plt.gcf().autofmt_xdate()  # Rotate time labels for better readability

# Set axis labels and title with larger fonts
plt.xlabel('Time (Full Day)', fontsize=14)
plt.ylabel('LCL Height (km)', fontsize=14)
plt.title('LCL at Various Heights Over Time (Full Day)', fontsize=16)

# Enhance the legend with better positioning
plt.legend(loc='upper left', fontsize=12)

# Add a grid with subtle enhancements
plt.grid(True, linestyle='--', linewidth=0.6, alpha=0.7)

# Adjust layout to prevent overlap
plt.tight_layout()

# Display the plot
plt.show()


In [None]:
# … after you load merged_data_10min via pd.read_csv(…):
lcl_list_new = []

for index, row in merged_data_10min.iterrows():
    # 1) Extract P_surf, T_surf for this timestamp
    P_surf_mbar = row['BP_mbar_Avg']
    T_surf       = row['Temperature_K_2']  # or whichever level you prefer
    
    # 2) Extract RH at surface
    #    (Adjust column name and /100.0 if needed.)
    RH_surf_pct = row['RH_E5567_Avg']  
    rh_surf = RH_surf_pct / 100.0

    # 3) Convert pressure to Pa
    p_surf = P_surf_mbar * 100.0

    # 4) Compute thermodynamic LCL (in meters) via Romps function
    #    We only need lcl(...), so leave return_ldl=False, return_min_lcl_ldl=False
    lcl_thermo_m = lcl(p_surf, T_surf, rh=rh_surf)

    # 5) If you prefer LCL in kilometers, do:
    lcl_thermo_km = float(lcl_thermo_m) / 1000.0

    # 6) Append to a list
    lcl_list_new.append(lcl_thermo_km)

# After the loop:
merged_data_10min['LCL_romps_km'] = lcl_list_new
print(merged_data_10min['LCL_romps_km'])
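
As a rough plausibility check on the Romps values, the LCL can be estimated from the dew-point depression with Espy's rule of thumb (about 125 m per K). The sketch below is an approximation only and is not part of the notebook: `dewpoint_c` and `lcl_espy_km` are hypothetical helpers, and the Magnus coefficients (17.625, 243.04 °C) are an assumed choice.

```python
import numpy as np

def dewpoint_c(temp_k, rh_fraction):
    """Dew point (deg C) from temperature (K) and RH (0-1) via the Magnus formula."""
    t_c = temp_k - 273.15
    gamma = np.log(rh_fraction) + 17.625 * t_c / (243.04 + t_c)
    return 243.04 * gamma / (17.625 - gamma)

def lcl_espy_km(temp_k, rh_fraction):
    """Approximate LCL height (km): 0.125 km per K of dew-point depression."""
    t_c = temp_k - 273.15
    return 0.125 * (t_c - dewpoint_c(temp_k, rh_fraction))

# Hypothetical demo: 20 deg C at 50 % relative humidity
lcl_demo_km = lcl_espy_km(293.15, 0.5)
```

Large disagreement between this estimate and `LCL_romps_km` would more likely point to a unit problem (RH in percent, pressure in mbar rather than Pa) than to physics.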

In [None]:
# Base directory containing daily CBL height outputs from the microwave radiometer
# Edit this path to match your local setup before running.
# Example: cbl_dir = r"C:\path\to\your\Microwave_radiometer\2024-05\2024-05-23"
cbl_dir = rf"C:\path\to\your\Microwave_radiometer\{month_str}\{date_str}"

cbl_file = os.path.join(cbl_dir, f'cbl_height_{date_str}.csv')

df_cbl = pd.read_csv(cbl_file, parse_dates=['Time'])
print(df_cbl)

In [None]:
# Ensure CBL timestamps are in datetime
df_cbl['Time'] = pd.to_datetime(df_cbl['Time'])

# Merge on timestamp (inner join to keep overlapping times only)
merged_data_10min = pd.merge(merged_data_10min, df_cbl, how='inner', left_on='TIMESTAMP', right_on='Time')

# Optional: remove the duplicate 'Time' column
merged_data_10min.drop(columns=['Time'], inplace=True)
print(merged_data_10min)
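
An inner join silently drops every timestamp that is missing from either table. Before merging, it can help to count how many rows would be lost. The sketch below uses the `indicator` option of `pd.merge`; `count_unmatched` and the demo frames are hypothetical.

```python
import pandas as pd

def count_unmatched(left, right, left_on='TIMESTAMP', right_on='Time'):
    """Count keys present in both tables, only in the left one, or only in the right one."""
    probe = pd.merge(left[[left_on]], right[[right_on]], how='outer',
                     left_on=left_on, right_on=right_on, indicator=True)
    return probe['_merge'].value_counts().to_dict()

# Hypothetical demo frames with partially overlapping timestamps
left_demo = pd.DataFrame({'TIMESTAMP': pd.to_datetime(
    ['2024-05-23 00:00', '2024-05-23 00:10', '2024-05-23 00:20'])})
right_demo = pd.DataFrame({'Time': pd.to_datetime(
    ['2024-05-23 00:10', '2024-05-23 00:20', '2024-05-23 00:30'])})
counts = count_unmatched(left_demo, right_demo)
```

In this demo, two timestamps match and one is unmatched on each side, so an inner join would keep two rows.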

In [None]:
# Create the plot with a larger size for better visibility
plt.figure(figsize=(12, 7))

# Plot LCL data with different markers but less intense black (alpha adjusted)
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[0]), marker='o',
        linestyle=' ', color='black', alpha=0.5, label='LCL T2m')# marker='o'
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[1]), 
      #   marker='v', linestyle='', color='black', alpha=0.5, label='LCL T2.99m')
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[2]), 
       #  marker='s', linestyle='', color='black', alpha=0.5, label='LCL T4.47m')
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[3]), 
        # marker='D', linestyle='', color='black', alpha=0.5, label='LCL T6.69m')
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL'].apply(lambda x: x[4]), 
         #marker='x', linestyle='', color='black', alpha=0.5, label='LCL T10m')

# Now, we will add the LCL from qv with the same markers as the above plots but with a red color
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_2.0m'], marker='o',
          linestyle='', color='red', alpha=0.5, label='LCL qv T2m')#marker='o',
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_2.99m'], 
       #  marker='v', linestyle='', color='red', alpha=0.5, label='LCL qv T2.99m')
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_4.47m'], 
        # marker='s', linestyle='', color='red', alpha=0.5, label='LCL qv T4.47m')
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_6.69m'], 
        # marker='D', linestyle='', color='red', alpha=0.5, label='LCL qv T6.69m')
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LCL_qv_T_10.0m'], 
         #marker='x', linestyle='', color='red', alpha=0.5, label='LCL qv T10m')

# Plot CBL height as a line (in meters → convert to km to match y-axis)
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['zi'] / 1000,
 #        color='blue', linewidth=2, label='CBL Height $z_i$')
# Plot Romps thermodynamic LCL
plt.plot(merged_data_10min['TIMESTAMP'],
         merged_data_10min['LCL_romps_km'], marker='o',alpha=0.5,
         color='green', linestyle='', linewidth=2, label='LCL (Romps)')
# Plot Parcel Method CBL height (in km) with a solid orange line
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['zi_parcel'] / 1000,
         color='orange', linestyle='-', linewidth=2, label='CBL Height $z_i$ (Parcel Method)')


plt.fill_between(merged_data_10min['TIMESTAMP'], 
                 merged_data_10min['zi_parcel'] / 1000, 
                 merged_data_10min['LCL'].apply(lambda x: x[0]),  # or another LCL source
                 where=(merged_data_10min['LCL'].apply(lambda x: x[0]) <= merged_data_10min['zi_parcel'] / 1000),
                 color='orange', alpha=0.1, label='LCL <= CBL (possible cloud layer)')


# Formatting the x-axis for a full day (00:00 to 23:59)
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))  # Format to display hours and minutes
plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=2))   # Set tick locator to display every 2 hours
plt.gcf().autofmt_xdate()  # Rotate time labels for better readability

# Set axis labels and title with larger fonts
plt.xlabel('Time (Full Day)', fontsize=20)
plt.ylabel('Height (km)', fontsize=20)
plt.xticks(fontsize=18)  # or any size you prefer
plt.yticks(fontsize=18)  # or any size you prefer

plt.title('LCL at Various Heights Over Time (Full Day)', fontsize=20)
# Enhance the legend with better positioning
plt.legend(loc='upper left', fontsize=14)

# Add a grid with subtle enhancements
plt.grid(True, linestyle='--', linewidth=0.6, alpha=0.7)

# Adjust layout to prevent overlap
plt.tight_layout()

# Display the plot
plt.show()
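
The shaded regions mark times where the 2 m LCL lies at or below the parcel-method CBL height. To report those periods as text rather than only as shading, the same condition can be turned into a list of start and end times. The sketch below is illustrative; `cloud_candidate_periods` and the demo values are hypothetical.

```python
import numpy as np
import pandas as pd

def cloud_candidate_periods(times, lcl_km, cbl_km):
    """Return (start, end) pairs for contiguous runs where LCL <= CBL (NaN never matches)."""
    times = list(pd.to_datetime(times))
    below = np.asarray(lcl_km, dtype=float) <= np.asarray(cbl_km, dtype=float)
    periods, start = [], None
    for i, flag in enumerate(below):
        if flag and start is None:
            start = i                                  # a run begins
        elif not flag and start is not None:
            periods.append((times[start], times[i - 1]))  # the run ended one step ago
            start = None
    if start is not None:                              # run continues to the last sample
        periods.append((times[start], times[len(below) - 1]))
    return periods

# Hypothetical demo values (km)
t_demo = pd.date_range('2024-05-23 10:00', periods=6, freq='10min')
periods = cloud_candidate_periods(t_demo, [1.2, 0.8, 0.7, 1.5, np.nan, 0.9], [1.0] * 6)
```

With the notebook's data, the inputs would be `merged_data_10min['TIMESTAMP']`, the first element of each `LCL` entry, and `zi_parcel / 1000`.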



### Sonic - Mast comparisons

In [None]:

# Convert the 'TIMESTAMP' column to datetime
merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])

# Plotting Temperature_K_2.99 vs time and Average_Temperature_Corr vs time as scatter plots
plt.figure(figsize=(12, 6))

# Scatter plot for Temperature_K_2.99
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Temperature_K_2.99'], label='Temperature_K_2.99', alpha=0.6, s=10, c='blue')

# Scatter plot for Average_Temperature_Corr
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Average_Temperature_Corr'], label='Average_Temperature_Corr', alpha=0.6, s=10, c='red')

# Adding labels and title
plt.xlabel('Time')
plt.ylabel('Temperature (K)')
plt.title('Temperature Comparison Mast - Sonic (10 min intervals)')
plt.legend()
plt.grid(True)

# Formatting the x-axis
plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=1))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m-%d %H:%M'))

plt.xticks(rotation=45)

# Tight layout
plt.tight_layout()
# Save the plot to a file, then show it
output_file = os.path.join(data_dir, 'temperature_comparison_plot_10min.png')
plt.savefig(output_file)
plt.show()

In [None]:
# Plotting qv_2.99m vs time and qv_sonic vs time as scatter plots
plt.figure(figsize=(12, 6))

# Scatter plot for qv_2.99m
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['qv_2.99m'], label='qv_2.99m', alpha=0.6, s=10, c='green')

# Scatter plot for qv_sonic
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['qv_sonic'], label='qv_sonic', alpha=0.6, s=10, c='purple')

# Adding labels and title
plt.xlabel('Time')
plt.ylabel('Specific Humidity (g/kg)')
plt.title('Specific Humidity Comparison Mast - Sonic (10 min averages)')
plt.legend()
plt.grid(True)

# Formatting the x-axis
plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=1))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m-%d %H:%M'))

plt.xticks(rotation=45)



# Tight layout
plt.tight_layout()

# Save the plot to a file
output_file = os.path.join(data_dir, 'specific_humidity_comparison_plot_10min.png')
plt.savefig(output_file)

# Show plot
plt.show()

In [None]:
#Edit this before running!!!
#MWR (IWV/LWP) folder: 
folder_path = rf"C:\path\to\your\Microwave_radiometer\{month_str}\{date_str}"
# Load the iwv_lwp_10min_avg.csv file
iwv_lwp_file = os.path.join(folder_path, 'iwv_lwp_10min_avg.csv')
iwv_lwp_10min_avg = pd.read_csv(iwv_lwp_file)

# Convert the 'TIMESTAMP' column in iwv_lwp_10min_avg to datetime if it's not already
iwv_lwp_10min_avg['TIMESTAMP'] = pd.to_datetime(iwv_lwp_10min_avg['TIMESTAMP'])

# Convert the 'TIMESTAMP' column in merged_data_10min to datetime if it's not already
merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])

# Merge the data on the 'TIMESTAMP' column, using the nearest match (no tolerance is set, so any time gap is accepted)
merged_data_10min = pd.merge_asof(iwv_lwp_10min_avg.sort_values('TIMESTAMP'),
                                  merged_data_10min.sort_values('TIMESTAMP'),
                                  on='TIMESTAMP', direction='nearest')
print(merged_data_10min)
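
`pd.merge_asof` with `direction='nearest'` pairs every left row with the closest right row, however far away in time it is. If matches should be limited to a 10-minute window, the `tolerance` argument does that. The sketch below shows the idea on toy data; `merge_nearest` and the demo frames are hypothetical.

```python
import pandas as pd

def merge_nearest(left, right, on='TIMESTAMP', tolerance='10min'):
    """Nearest-time join; right-hand values become NaN when no match lies within tolerance."""
    left = left.sort_values(on)
    right = right.sort_values(on)
    return pd.merge_asof(left, right, on=on, direction='nearest',
                         tolerance=pd.Timedelta(tolerance))

# Hypothetical demo frames
left_demo = pd.DataFrame({
    'TIMESTAMP': pd.to_datetime(['2024-05-23 00:00', '2024-05-23 00:10', '2024-05-23 01:00']),
    'a': [10, 20, 30],
})
right_demo = pd.DataFrame({
    'TIMESTAMP': pd.to_datetime(['2024-05-23 00:01', '2024-05-23 00:12']),
    'x': [1.0, 2.0],
})
joined = merge_nearest(left_demo, right_demo)
```

Here the 01:00 row has no right-hand sample within 10 minutes, so its `x` value is NaN instead of being filled from 00:12.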

In [None]:
# Create a larger figure
fig, ax = plt.subplots(figsize=(8, 6))

# 1) Plot LWP_Corrected with a bolder line
ax.plot(
    merged_data_10min['TIMESTAMP'],
    merged_data_10min['LWP_Corrected'],#_Corrected
    color='blue',
    linewidth=2.0,
    label='LWP_Corrected'
)

# 2) Thicken all spines (axis borders)
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# 3) Tick parameters for major & minor ticks
ax.tick_params(axis='both', which='major', labelsize=12, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# 4) Format x‐axis to show time in HH:MM
date_format = mdates.DateFormatter('%H:%M')
ax.xaxis.set_major_formatter(date_format)
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# 5) Labels and title with bold font
ax.set_title('LWP vs. Time', fontsize=18, fontweight='bold')
ax.set_xlabel('Time', fontsize=16, fontweight='bold')
ax.set_ylabel('LWP (g/m²)', fontsize=16, fontweight='bold')

# 6) Y‐axis tick font size
ax.tick_params(axis='y', labelsize=12)

# 7) Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# 8) Rotate x‐tick labels
plt.xticks(rotation=45)

# 9) (Optional) Legend inside plot
legend = ax.legend(fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# 10) Tight layout and save
plt.tight_layout()
file_path = os.path.join(folder_path, 'LWP_Corrected_vs_TIMESTAMP_report.png')
plt.savefig(file_path, dpi=300)
plt.show()

print(f"Plot saved to: {file_path}")

In [None]:
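# Surface temperature T_srf from upwelling longwave (IR20Up), then saturation vapour pressure and specific humidity at the surface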
merged_data_10min['T_srf']=(merged_data_10min['IR20Up']/sigma)**0.25
merged_data_10min['esat_srf']=calculate_saturation_pressure(merged_data_10min['T_srf'])
merged_data_10min['qsat_srf']=calculate_saturation_specific_humidity(merged_data_10min['esat_srf'],merged_data_10min['BP_mbar_Avg'])

print(merged_data_10min.columns)
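
The helpers `calculate_saturation_pressure` and `calculate_saturation_specific_humidity` are defined elsewhere in the notebook, so their exact formulas are not shown here. As an illustration of what such functions typically compute, the sketch below uses the Magnus approximation and the standard mixing formula. The function names, coefficients, and units (hPa, g/kg) are assumptions and may differ from the notebook's implementation.

```python
import numpy as np

def saturation_vapor_pressure_hpa(temp_k):
    """Saturation vapour pressure over water (hPa), Magnus approximation."""
    t_c = np.asarray(temp_k, dtype=float) - 273.15
    return 6.1094 * np.exp(17.625 * t_c / (t_c + 243.04))

def saturation_specific_humidity_gkg(temp_k, pressure_hpa):
    """Saturation specific humidity (g/kg) from temperature (K) and pressure (hPa)."""
    e_s = saturation_vapor_pressure_hpa(temp_k)
    return 1000.0 * 0.622 * e_s / (pressure_hpa - 0.378 * e_s)

# Hypothetical demo: 20 deg C surface at standard pressure
esat_0c = float(saturation_vapor_pressure_hpa(273.15))
qsat_20c = float(saturation_specific_humidity_gkg(293.15, 1013.25))
```

Since mbar and hPa are numerically identical, `BP_mbar_Avg` could be passed as `pressure_hpa` without conversion in this sketch.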

###  Load Cloud Radar Rain data

In [None]:
#Edit this before running!!!
# Define the file paths
#Cloud Radar folder: 
cloud_radar_dir = rf"C:\path\to\your\Cloud_radar\{month_str}\{date_str}"
rain_file = os.path.join(cloud_radar_dir, 'Rain_10min_Averages.csv')
cloud_radar_file = os.path.join(cloud_radar_dir, "cloud_radar_vertical_dataset_10min.parquet")

# Load the rain data
rain_data_10min = pd.read_csv(rain_file)

# Convert the 'TIMESTAMP' column in rain_data_10min to datetime if it's not already
rain_data_10min['TIMESTAMP'] = pd.to_datetime(rain_data_10min['TIMESTAMP'])

# Merge the data on the 'TIMESTAMP' column, using the nearest match (no tolerance is set, so any time gap is accepted)
merged_data_10min = pd.merge_asof(rain_data_10min.sort_values('TIMESTAMP'),
                                  merged_data_10min.sort_values('TIMESTAMP'),
                                  on='TIMESTAMP', direction='nearest')
#print(merged_data_10min)



cloud_radar_df = pd.read_parquet(cloud_radar_file)
cloud_radar_df["timestamp"] = pd.to_datetime(cloud_radar_df["timestamp"])

# Rename all columns except "timestamp" by appending "_CR"
cols_to_rename = {
    col: f"{col}_CR"
    for col in cloud_radar_df.columns
    if col != "timestamp"
}
cloud_radar_df = cloud_radar_df.rename(columns=cols_to_rename)
# Merge the existing merged_data_10min (which uses "TIMESTAMP") with cloud_radar_df (which uses "timestamp")
merged_data_10min = pd.merge(
    merged_data_10min,
    cloud_radar_df,
    left_on="TIMESTAMP",
    right_on="timestamp",
    how="left"
)

# 8) Drop the duplicate 'timestamp' column (if you only want to keep "TIMESTAMP")
merged_data_10min = merged_data_10min.drop(columns=["timestamp"])

# 9) Inspect the result
print(merged_data_10min.head())
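
The rain data is joined to the main dataframe by nearest timestamp. A minimal sketch of how `pd.merge_asof` behaves with and without a match window is shown below; the timestamps, the `T` column, and the values are made up for illustration. With `tolerance` set, rows whose nearest neighbour is further away than the window get `NaN` instead of a distant match.

```python
import pandas as pd

# Synthetic rain and mast tables (illustrative values only)
rain = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime(
        ["2024-05-01 10:00", "2024-05-01 10:10", "2024-05-01 11:00"]
    ),
    "Rain": [0.0, 0.2, 0.0],
})
mast = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime(
        ["2024-05-01 10:01", "2024-05-01 10:09", "2024-05-01 10:20"]
    ),
    "T": [285.0, 285.4, 285.9],
})

# Nearest match, but only if it lies within 10 minutes of the left timestamp
merged = pd.merge_asof(
    rain.sort_values("TIMESTAMP"),
    mast.sort_values("TIMESTAMP"),
    on="TIMESTAMP",
    direction="nearest",
    tolerance=pd.Timedelta("10min"),
)
print(merged)
```

The 11:00 rain record has no mast sample within 10 minutes, so its `T` is `NaN`; without `tolerance` it would silently pick up the 10:20 value.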

### Flags

In [None]:
# Calculate the temperature difference
merged_data_10min['Temp_Diff'] = merged_data_10min['Temperature_K_2.99'] - merged_data_10min['Average_Temperature_Corr']

# Flag rows where the mast and sonic temperatures differ by more than 1 K
merged_data_10min['Flag'] = merged_data_10min['Temp_Diff'].apply(lambda x: 1 if abs(x) > 1 else 0)
# Flag rows with any recorded rain
merged_data_10min['Flag_Rain'] = merged_data_10min['Rain'].apply(lambda x: 1 if x > 0 else 0)

# Display the first few rows of the DataFrame to verify
print(merged_data_10min[['TIMESTAMP', 'Temperature_K_2.99', 'Average_Temperature_Corr', 'Temp_Diff', 'Rain', 'Flag', 'Flag_Rain']].head())
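
The two flags above can also be computed without `apply`. The sketch below uses a small synthetic dataframe with the same column names and shows the vectorized form; as with the lambda version, missing values compare as `False` and therefore end up unflagged.

```python
import numpy as np
import pandas as pd

# Synthetic data with the same column names as the notebook (values are made up)
df = pd.DataFrame({
    "Temperature_K_2.99": [285.0, 286.5, 287.0, np.nan],
    "Average_Temperature_Corr": [285.3, 285.0, 288.2, 286.0],
    "Rain": [0.0, 0.1, 0.0, np.nan],
})

temp_diff = df["Temperature_K_2.99"] - df["Average_Temperature_Corr"]

# |dT| > 1 K -> 1, otherwise 0 (NaN differences stay 0)
df["Flag"] = (temp_diff.abs() > 1).astype(int)
# Any rain -> 1, otherwise 0 (NaN rain stays 0)
df["Flag_Rain"] = (df["Rain"] > 0).astype(int)
print(df[["Flag", "Flag_Rain"]])
```

If missing data should be treated as suspect rather than clean, a separate `isna()` flag would be needed; that choice is not made in the original cell.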

In [None]:
# Create the figure
plt.figure(figsize=(10, 6))

# Plot all temperature points
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Temperature_K_2.99'],
            label='Temperature_K_2.99 (mast)', alpha=0.4, color='dodgerblue')
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Average_Temperature_Corr'],
            label='T_sonic_corrected', alpha=0.4, color='green')

# Highlight flagged outliers
flagged_data = merged_data_10min[merged_data_10min['Flag'] == 1]
plt.scatter(flagged_data['TIMESTAMP'], flagged_data['Temperature_K_2.99'],
            color='red', label=r'Flagged Outliers ($|\Delta T| > 1$ K)', s=50)
plt.scatter(flagged_data['TIMESTAMP'], flagged_data['Average_Temperature_Corr'],
            color='red', s=50)#label='Flagged Outliers (Avg_Temp_Corr)', 

# Highlight rain-flagged points
flagged_rain_data = merged_data_10min[merged_data_10min['Flag_Rain'] == 1]
plt.scatter(flagged_rain_data['TIMESTAMP'], flagged_rain_data['Temperature_K_2.99'],
            color='black', marker='+', label='Flagged Rain', s=100)
plt.scatter(flagged_rain_data['TIMESTAMP'], flagged_rain_data['Average_Temperature_Corr'],
            color='black', marker='+', s=100)#label='Flagged Rain (Avg_Temp_Corr)'

# Format x-axis for time
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
plt.xticks(fontsize=18, rotation=45)
plt.yticks(fontsize=18)

# Labels and title
plt.xlabel('Time', fontsize=20)
plt.ylabel('Temperature (K)', fontsize=20)
plt.title('Temperature Data with Flagged Outliers', fontsize=20, weight='bold')

# Grid, legend, and layout
plt.grid(True, linestyle='--', alpha=0.7)
plt.legend(fontsize=14, loc='upper right')
plt.tight_layout()

# Save the plot
output_path = os.path.join(data_dir, 'temperature_scatter_flagged.png')
plt.savefig(output_path, dpi=300)

plt.show()

### Vertical Profiles

In [None]:
#Edit this before running!!!
# Define the folder path and file name
folder_path = rf"C:\path\to\your\Microwave_radiometer\{month_str}\{date_str}"
file_name = 'profiles_data_10min_avg.parquet'
file_path = os.path.join(folder_path, file_name)

rh_ah_vertical= pd.read_parquet(file_path)
# Display the first few rows of the dataframe
#print(rh_ah_vertical)

# 2) Load the “MWR_vertical_dataset_10min.parquet” in the same folder
mwr_path = os.path.join(folder_path, "MWR_vertical_dataset_10min.parquet")
mwr_vertical = pd.read_parquet(mwr_path)

print("MWR_vertical_dataset_10min:")
#print(mwr_vertical.head())

# 3) Ensure both DataFrames have a proper datetime column named "Time"
rh_ah_vertical["Time"] = pd.to_datetime(rh_ah_vertical["Time"])
mwr_vertical["Time"] = pd.to_datetime(mwr_vertical["Time"])


In [None]:
# Define mast heights and corresponding RH columns
mast_heights = [2, 2.99, 4.47, 6.69, 10]
rh_columns = ['RH_E5567_Avg', 'RH_E5568_Avg', 'RH_E5569_Avg', 'RH_E5570_Avg', 'RH_E5571_Avg']
qv_columns= ['qv_2m','qv_2.99m','qv_4.47m','qv_6.69m','qv_10m']
def combine_mast_and_radiometer_data(merged_data_10min, df_radiometer):
    # Initialize a list to store combined rows
    combined_data_list = []

    # Iterate through each row of the radiometer data
    for i, row in df_radiometer.iterrows():
        # Get the corresponding row from the mast data for the same timestamp
        timestamp = row['Time']
        mast_row = merged_data_10min[merged_data_10min['TIMESTAMP'] == timestamp]
        
        if not mast_row.empty:
            # Extract RH values from the mast row
            mast_rh_values = mast_row[rh_columns].values.flatten().tolist()
            mast_qv_values = mast_row[qv_columns].values.flatten().tolist()
            # Prepare the combined data
            combined_data = {
                'Time': timestamp,
                'Radiometer_Altitude': row['Altitude'],
                'Radiometer_RH_Profile': row['RH_Profile'],
                'Radiometer_AH_Profile': row['AH_Profile'],
                'Mast_Heights': mast_heights,
                'Mast_RH_Profile': mast_rh_values,
                'Mast_qv_Profile': mast_qv_values
            }
            
            # Append combined data to the list
            combined_data_list.append(combined_data)
    
    # Create a DataFrame from the combined list of dictionaries
    df_combined = pd.DataFrame(combined_data_list)

    return df_combined


df_combined = combine_mast_and_radiometer_data(merged_data_10min, rh_ah_vertical)
print(df_combined.head())  # View the combined data
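
The row-by-row loop above matches each radiometer profile to the mast row with the same timestamp. The same result can be sketched with a single merge: pack the mast columns into one list per row, then inner-join on time. The column names `RH_low` and `RH_high` and all values below are placeholders, not the notebook's sensor columns.

```python
import pandas as pd

mast_heights = [2, 2.99]            # placeholder subset of the mast levels
rh_columns = ["RH_low", "RH_high"]  # hypothetical column names

mast = pd.DataFrame({
    "TIMESTAMP": pd.to_datetime(["2024-05-01 10:00", "2024-05-01 10:10"]),
    "RH_low": [80.0, 78.0],
    "RH_high": [79.0, 77.5],
})
radiometer = pd.DataFrame({
    "Time": pd.to_datetime(["2024-05-01 10:10", "2024-05-01 10:20"]),
    "Altitude": [[0, 10, 25], [0, 10, 25]],
    "RH_Profile": [[81.0, 80.0, 79.0], [82.0, 81.0, 80.0]],
})

# One list of mast values per timestamp, plus the fixed height vector
mast_profiles = pd.DataFrame({
    "Time": mast["TIMESTAMP"],
    "Mast_Heights": [mast_heights] * len(mast),
    "Mast_RH_Profile": mast[rh_columns].to_numpy().tolist(),
})

# The inner join keeps only timestamps present in both tables
combined = radiometer.merge(mast_profiles, on="Time", how="inner")
print(combined)
```

Only the 10:10 timestamp appears in both tables, so `combined` has one row. This mirrors the `if not mast_row.empty` check in the loop version.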



In [None]:
# Plot the combined RH profiles
def plot_combined_profiles(time_index, df_combined):
    time = df_combined['Time'][time_index]
    
    # Extract the radiometer and mast profiles
    radiometer_altitude = df_combined['Radiometer_Altitude'][time_index]
    radiometer_rh_profile = df_combined['Radiometer_RH_Profile'][time_index]
    mast_heights = df_combined['Mast_Heights'][time_index]
    mast_rh_profile = df_combined['Mast_RH_Profile'][time_index]

    # Plot radiometer RH profile
    plt.figure(figsize=(6, 8))
    plt.plot(radiometer_rh_profile, radiometer_altitude, 'b-', label='Radiometer RH Profile')
    
    # Plot mast RH profile
    plt.plot(mast_rh_profile, mast_heights, 'ro-', label='Mast RH Profile')

    plt.xlabel('Relative Humidity (%)')
    plt.ylabel('Altitude (m)')
    plt.title(f'Combined RH Profiles at {time}')
    plt.legend()
    # Zoom in on the lowest 100 m (use [0, 1000] for the lower 1 km)
    #plt.ylim([0, 100])
    
    plt.show()

# Example: Plot the combined RH profiles for the first timestamp
plot_combined_profiles(0, df_combined)


In [None]:
merged_mwr = pd.merge(
    df_combined,
    mwr_vertical,
    on="Time",
    how="inner",
    suffixes=("", "_mwr")
)



print("Merged (all columns) – first 5 rows:")
print(merged_mwr.head())

In [None]:
# Define the new file path for temperature profiles
temperature_file_name = 'vertical_temperature_profiles_10min_avg.parquet'
temperature_file_path = f'{folder_path}\\{temperature_file_name}'

# Load the temperature profiles data
temperature_profiles_df = pd.read_parquet(temperature_file_path)
print(temperature_profiles_df)

In [None]:
temperature_columns = ['Temperature_K_2', 'Temperature_K_2.99', 'Temperature_K_4.47', 'Temperature_K_6.69', 'Temperature_K_10']
vdse_columns= ['Virtual_Dry_Static_Energy_2', 'Virtual_Dry_Static_Energy_2.99', 'Virtual_Dry_Static_Energy_4.47', 'Virtual_Dry_Static_Energy_6.69', 'Virtual_Dry_Static_Energy_10']
def combine_mast_and_temperature_data(merged_data_10min, df_temperature_profiles):
    # Initialize a list to store combined rows
    combined_data_temperature_list = []

    # Iterate through each row of the temperature profiles data
    for i, row in df_temperature_profiles.iterrows():
        # Get the corresponding row from the mast data for the same timestamp
        timestamp = row['Time']
        mast_row = merged_data_10min[merged_data_10min['TIMESTAMP'] == timestamp]
        
        if not mast_row.empty:
            # Extract temperature values from the mast row
            mast_temp_values = mast_row[temperature_columns].values.flatten().tolist()
            mast_vdse_values =mast_row[vdse_columns].values.flatten().tolist()
            # Prepare the combined data
            combined_data = {
                'Time': timestamp,
                'Temperature_Altitude': row['Altitude'],
                'Temperature_Profile': row['T_Profile'],  # Adjust this if necessary
                'Mast_Heights': mast_heights,
                'Mast_Temperature_Profile': mast_temp_values,
                'Mast_VDSE_Profile': mast_vdse_values
            }
            
            # Append combined data to the list
            combined_data_temperature_list.append(combined_data)
    
    # Create a DataFrame from the combined list of dictionaries
    df_combined_temperature = pd.DataFrame(combined_data_temperature_list)

    return df_combined_temperature


In [None]:
df_combined_temperature = combine_mast_and_temperature_data(merged_data_10min, temperature_profiles_df)
print(df_combined_temperature.head())  # View the combined data


In [None]:
def plot_combined_temperature_profiles(time_index, df_combined):
    time = df_combined['Time'][time_index]
    
    # Extract the radiometer and mast profiles
    radiometer_altitude = df_combined['Temperature_Altitude'][time_index]
    radiometer_temp_profile = df_combined['Temperature_Profile'][time_index]
    mast_heights = df_combined['Mast_Heights'][time_index]
    mast_temp_profile = df_combined['Mast_Temperature_Profile'][time_index]

    # Plot radiometer temperature profile
    plt.figure(figsize=(6, 8))
    plt.plot(radiometer_temp_profile, radiometer_altitude, 'b-', label='Radiometer Temperature Profile')
    
    # Plot mast temperature profile
    plt.plot(mast_temp_profile, mast_heights, 'ro-', label='Mast Temperature Profile')

    plt.xlabel('Temperature (K)')
    plt.ylabel('Altitude (m)')
    plt.title(f'Combined Temperature Profiles at {time}')
    plt.legend()
    # Zoom in on the lowest 100 m if needed (use [0, 1000] for the lower 1 km)
    #plt.ylim([0, 100])
    
    plt.show()

# Example: Plot the combined temperature profiles for the 11th timestamp (index 10)
plot_combined_temperature_profiles(10, df_combined_temperature)


In [None]:
# Combine merged_mwr (RH/AH profiles plus MWR vertical data) with df_combined_temperature on the 'Time' column
df_combined_all = pd.merge(
    merged_mwr,
    df_combined_temperature,
    on='Time',
    suffixes=('_RH', '_Temp')
)

# Display the combined DataFrame
print(df_combined_all.head())

In [None]:

def plot_combined_profiles(time_index, df_combined_all,data_dir):
    # Extract the time and profiles
    time = df_combined_all['Time'][time_index]
    
    # Relative Humidity Profiles
    radiometer_altitude_rh = df_combined_all['Radiometer_Altitude'][time_index]
    radiometer_rh_profile = df_combined_all['Radiometer_RH_Profile'][time_index]
    mast_heights_rh = df_combined_all['Mast_Heights_RH'][time_index]
    mast_rh_profile = df_combined_all['Mast_RH_Profile'][time_index]
    
    # Temperature Profiles
    radiometer_temp_profile = df_combined_all['Temperature_Profile'][time_index]
    mast_heights_temp = df_combined_all['Mast_Heights_Temp'][time_index]
    mast_temp_profile = df_combined_all['Mast_Temperature_Profile'][time_index]

    # Create figure and subplots
    fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(14, 8), sharey=True)
    
    # Plot RH profiles in the first subplot
    ax1.plot(radiometer_rh_profile, radiometer_altitude_rh, 'b-', label='Radiometer RH Profile')
    ax1.plot(mast_rh_profile, mast_heights_rh, 'ro-', label='Mast RH Profile')
    ax1.set_xlabel('Relative Humidity (%)')
    ax1.set_ylabel('Altitude (m)')
    ax1.set_title(f'RH Profiles at {time}')
    ax1.legend()
    ax1.grid(True)
    
    # Plot temperature profiles in the second subplot
    ax2.plot(radiometer_temp_profile, radiometer_altitude_rh, 'g--', label='Radiometer Temperature Profile')
    ax2.plot(mast_temp_profile, mast_heights_temp, 'mo-', label='Mast Temperature Profile')
    ax2.set_xlabel('Temperature (K)')
    ax2.set_title(f'Temperature Profiles at {time}')
    ax2.legend()
    ax2.grid(True)
    
    plt.tight_layout()
    
    # Create save path
    save_path = os.path.join(data_dir, 'example_vertical_combined_profiles_plot.png')
    
    # Save plot to a file
    plt.savefig(save_path)
    plt.show()

# Example: Plot the combined profiles for the 20th timestamp (index 19)
plot_combined_profiles(19, df_combined_all, data_dir)


In [None]:
print(df_combined_all) 

### Cloud Type

In [None]:
def classify_cloud_type(row):
    rh_profile = np.array(row['Radiometer_RH_Profile'])
    altitude_profile = np.array(row['Radiometer_Altitude'])
    lwp = row['LWP_Corrected']#_Corrected
    flag = row['Flag']
    flag_rain = row['Flag_Rain']
    csi = row['CSI']

    # Select RH in the 1.5-2.5 km layer around the assumed 2 km boundary-layer top
    # (currently only used by the commented-out Stratocumulus variant below)
    above_2km = (altitude_profile > 1500) & (altitude_profile <= 2500)
    rh_near_above_2km = rh_profile[above_2km]

    if flag != 0 or flag_rain != 0:
        return 'Flagged Observation'

    boundary_layer_altitude = 2000  # meters
    boundary_layer_indices = np.where(altitude_profile <= boundary_layer_altitude)
    rh_boundary_layer = rh_profile[boundary_layer_indices]
    mean_rh_boundary_layer = np.mean(rh_boundary_layer)
    rh_above_boundary_layer = rh_profile[altitude_profile > boundary_layer_altitude]

    if pd.notna(csi) and csi >= 0.7 and mean_rh_boundary_layer < 50 and lwp < 20:
        return 'Clear Sky'
    #elif mean_rh_boundary_layer > 80 and 50 <= lwp <= 300 and np.any(rh_above_boundary_layer < 60):
    # Stratocumulus: require CSI < 0.5 AND high BL humidity AND moderate LWP AND dry air above the BL (proxy for a capping inversion)
    elif (pd.notna(csi) and csi < 0.5 and mean_rh_boundary_layer > 80 and 50 <= lwp <= 300 and np.mean(rh_above_boundary_layer) < 60):
        return 'Stratocumulus'

    #elif (mean_rh_boundary_layer > 80 and 50 <= lwp <= 300 and np.mean(rh_near_above_2km) < 60):
     #   return 'Stratocumulus'
        #pd.notna(csi) and 0.3 <= csi < 0.7 and
    elif (60 <= mean_rh_boundary_layer <= 80 and 20 <= lwp < 50
    #elif (60 <= rh_profile[altitude_profile <= 2000].mean() <= 80
      #    and 20 <= lwp < 50
          and np.all(np.diff(rh_above_boundary_layer) <= 0)):
        return 'Shallow Cumulus'
    return 'Unclassified'


# Now apply this function to classify each time step

# Assuming 'df_combined_all' contains the profile data and 'merged_data_10min' contains LWP and SWR data
# Merge both dataframes on 'Time'
df_combined_all['Time'] = pd.to_datetime(df_combined_all['Time'])
# Rename the 'TIMESTAMP' column to 'Time' in the merged_data_10min dataframe
merged_data_10min.rename(columns={'TIMESTAMP': 'Time'}, inplace=True)

merged_data_10min['Time'] = pd.to_datetime(merged_data_10min['Time'])

# Merge on 'Time'
#df_merged = pd.merge(df_combined_all, merged_data_10min[['Time', 'LWP_Corrected']], on='Time')
df_merged = pd.merge(
    df_combined_all,
    merged_data_10min[['Time', 'LWP_Corrected', 'Flag', 'Flag_Rain','CSI']],#_Corrected
    on='Time'
)

# Apply the classification
df_merged['Cloud_Type'] = df_merged.apply(classify_cloud_type, axis=1)

# Output the classified data
print(df_merged[['Time', 'Cloud_Type', 'LWP_Corrected','CSI']])#_Corrected

# Define the full path to save the Excel file
output_path = os.path.join(data_dir, 'cloud_classification_3.xlsx')

# Save the DataFrame to the Excel file
df_merged.to_excel(output_path, index=False)

# Confirmation message
print(f"File saved to {output_path}")
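
To see which branch of the rules fires for a given profile, the thresholds used above can be exercised on synthetic data. The sketch below restates the same thresholds in a standalone function (the name `classify` and the test profiles are illustrative) and adds one assumption of its own: it returns `'Unclassified'` when a profile has no levels below or above 2 km, so that the mean of an empty slice is never taken.

```python
import numpy as np

def classify(rh, alt, lwp, csi, bl_top=2000.0):
    """Threshold-based cloud type from an RH profile, LWP (g/m^2) and CSI."""
    rh = np.asarray(rh, dtype=float)
    alt = np.asarray(alt, dtype=float)
    rh_bl = rh[alt <= bl_top]
    rh_above = rh[alt > bl_top]
    # Assumption: profiles missing either layer cannot be classified
    if rh_bl.size == 0 or rh_above.size == 0:
        return "Unclassified"
    mean_bl = np.nanmean(rh_bl)
    has_csi = csi is not None and not np.isnan(csi)
    if has_csi and csi >= 0.7 and mean_bl < 50 and lwp < 20:
        return "Clear Sky"
    if (has_csi and csi < 0.5 and mean_bl > 80
            and 50 <= lwp <= 300 and np.nanmean(rh_above) < 60):
        return "Stratocumulus"
    if (60 <= mean_bl <= 80 and 20 <= lwp < 50
            and np.all(np.diff(rh_above) <= 0)):
        return "Shallow Cumulus"
    return "Unclassified"

alt = np.array([0, 500, 1000, 1500, 2000, 2500, 3000])
clear = classify([40, 42, 45, 45, 40, 30, 20], alt, lwp=5, csi=0.9)
strato = classify([85, 88, 92, 95, 90, 50, 40], alt, lwp=120, csi=0.2)
cumulus = classify([65, 70, 75, 78, 72, 60, 45], alt, lwp=30, csi=0.6)
print(clear, strato, cumulus)
```

Rain and temperature flags are handled before these rules in the notebook's function and are left out here for brevity.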

In [None]:

def plot_all_cloud_profiles(df):
    # Create a base folder for all plots
    plots_dir = os.path.join(data_dir, "cloud_profiles_3")
    os.makedirs(plots_dir, exist_ok=True)

    # Filter out flagged and unclassified observations
    valid_df = df[~df['Cloud_Type'].isin(['Unclassified', 'Flagged Observation', 'Excluded (Flagged)'])]

    # Get unique valid cloud types
    cloud_types = valid_df['Cloud_Type'].unique()

    for cloud_type in cloud_types:
        # Create a subfolder for each cloud type
        folder_path = os.path.join(plots_dir, cloud_type.replace(" ", "_"))
        os.makedirs(folder_path, exist_ok=True)

        # Filter data for this cloud type
        cloud_df = valid_df[valid_df['Cloud_Type'] == cloud_type]

        for idx, row in cloud_df.iterrows():
            # Common metadata
            timestamp = pd.to_datetime(row['Time'])
            timestamp_str = timestamp.strftime('%Y%m%d_%H%M')
            lwp = row['LWP_Corrected']

            # === RH PROFILE PLOT ===
            rh_profile = np.array(row['Radiometer_RH_Profile'])
            altitude_profile = np.array(row['Radiometer_Altitude'])
            mast_rh_profile = np.array(row['Mast_RH_Profile'])
            mast_heights_rh = np.array(row['Mast_Heights_RH'])

            fig_rh, ax_rh = plt.subplots(figsize=(5, 6))
            ax_rh.scatter(rh_profile, altitude_profile, label='Radiometer RH Profile', color='b', marker='o')
            ax_rh.scatter(mast_rh_profile, mast_heights_rh, label='Mast RH Profile', color='g', marker='x')
            ax_rh.axhline(y=2000, color='r', linestyle='--', label='2 km Boundary Layer')
            ax_rh.set_xlabel('Relative Humidity (%)')
            ax_rh.set_ylabel('Altitude (meters)')
            ax_rh.set_title(f'{cloud_type} - RH Profile\nTime: {timestamp}')
            ax_rh.legend()
            ax_rh.annotate(f'LWP: {lwp:.2f} g/m²', xy=(0.05, 0.95), xycoords='axes fraction',
                           fontsize=10, bbox=dict(boxstyle='round', fc='white', edgecolor='black'))

            plt.tight_layout()
            rh_filename = f'{cloud_type.replace(" ", "_")}_{timestamp_str}_RH.png'
            plt.savefig(os.path.join(folder_path, rh_filename))
            plt.close(fig_rh)

            # === TEMPERATURE PROFILE PLOT ===
            temp_profile = np.array(row['Temperature_Profile'])
            mast_temp_profile = np.array(row['Mast_Temperature_Profile'])
            mast_heights_temp = np.array(row['Mast_Heights_Temp'])

            fig_temp, ax_temp = plt.subplots(figsize=(5, 6))
            ax_temp.scatter(temp_profile, altitude_profile, label='Radiometer Temp Profile', color='orange', marker='o')
            ax_temp.scatter(mast_temp_profile, mast_heights_temp, label='Mast Temp Profile', color='purple', marker='x')
            ax_temp.axhline(y=2000, color='r', linestyle='--', label='2 km Boundary Layer')
            ax_temp.set_xlabel('Temperature (K)')
            ax_temp.set_ylabel('Altitude (meters)')
            ax_temp.set_title(f'{cloud_type} - Temperature Profile\nTime: {timestamp}')
            ax_temp.legend()
            ax_temp.annotate(f'LWP: {lwp:.2f} g/m²', xy=(0.05, 0.95), xycoords='axes fraction',
                             fontsize=10, bbox=dict(boxstyle='round', fc='white', edgecolor='black'))

            plt.tight_layout()
            temp_filename = f'{cloud_type.replace(" ", "_")}_{timestamp_str}_Temp.png'
            plt.savefig(os.path.join(folder_path, temp_filename))
            plt.close(fig_temp)

            
    # Report where the plots were written
    print(f"\n✅ Cloud profile plots saved in: {plots_dir}")

#plot_all_cloud_profiles(df_merged)


### Vertical Profiles

In [None]:
# Add the combined profile columns (df_combined_all) to merged_data_10min on the 'Time' column
merged_data_10min = pd.merge(
    merged_data_10min,
    df_combined_all,
    on='Time',
    how="left"
)
# Do not drop 'Time' here: the plotting functions below read it for their titles
#merged_data_10min = merged_data_10min.drop(columns=["Time"])


In [None]:

def plot_temperature_profiles(
    time_index: int,
    merged_df,
    fontsize_labels=14,
    fontsize_ticks=12,
    fontsize_title=16
):
    """
    Plot temperature (T) and virtual potential temperature (Θv) profiles for a given
    row index in merged_df: MWR (solid lines), CR (dashed lines), and Mast temperature
    (green line with x markers). Also adjusts axis styling for report/PowerPoint.

    Parameters
    ----------
    time_index : int
        Integer index of the row to plot (0 <= time_index < len(merged_df)).
    merged_df : pandas.DataFrame
        Your merged_data_10min DataFrame that contains the following columns
        (array-valued unless noted):
          - 'Radiometer_Altitude'
          - 'Temperature_Profile'
          - 'Theta_v_Alternative'
          - 'temperature_altitudes_CR'
          - 'temperature_profile_CR'
          - 'theta_v_CR'
          - 'Mast_Heights_Temp'
          - 'Mast_Temperature_Profile'
          - 'Time' (scalar timestamp)
    fontsize_labels : int, optional
        Font size for x/y labels. Default is 14.
    fontsize_ticks : int, optional
        Font size for tick labels. Default is 12.
    fontsize_title : int, optional
        Font size for the plot title. Default is 16.
    """

    # 1) pull out that single row as a Series
    row = merged_df.iloc[time_index]

    # 2) extract MWR (“Radiometer”) arrays
    radiometer_alt  = row['Radiometer_Altitude']       # e.g. [0,10,25,…]
    radiometer_temp = row['Temperature_Profile']       # T_Profile from MWR
    radiometer_thetav=row['Theta_v_Alternative']
    # 3) extract CR (“Cloud Radar”) arrays
    cr_alt   = row['temperature_altitudes_CR']         # e.g. [0,20,40,…]
    cr_temp  = row['temperature_profile_CR']           # corresponding T
    cr_thetav = row['theta_v_CR']
    # 4) extract Mast‐measured temperature arrays
    mast_alt  = row['Mast_Heights_Temp']               # e.g. [2,2.99,4.47,…]
    mast_temp = row['Mast_Temperature_Profile']        # corresponding T values

    # 5) get the timestamp for the title
    timestamp = row['Time']
    
    # 6) Start figure
    plt.figure(figsize=(4, 6))

    # 7) Plot MWR Temperature (red solid line)
    plt.plot(
        radiometer_temp,
        radiometer_alt,
        color='red',
        linestyle='-',
       # marker='o',
        markersize=4,
        linewidth=1.5,
        label='T(K) MWR'
    )

    # 8) Plot CR Temperature (red dashed line)
    plt.plot(
        cr_temp,
        cr_alt,
        color='red',
        linestyle='--',
       # marker='o',
        markersize=4,
        linewidth=1.5,
        label='T(K) CR'
    )

    # 9) Plot Mast Temperature (green solid, x‐marker)
    plt.plot(
        mast_temp,
        mast_alt,
        color='green',
        linestyle='-',
        marker='x',
        markersize=6,
        linewidth=1.5,
        label='T(K) Mast'
    )

    # 10) Plot MWR Θv (orange solid line)
    plt.plot(
        radiometer_thetav,
        radiometer_alt,
        color='orange',
        linestyle='-',
       # marker='D',
        markersize=4,
        linewidth=1.2,
        label='Θv(K) MWR'
    )

    # 11) Plot CR Θv (orange dashed line)
    plt.plot(
        cr_thetav,
        cr_alt,
        color='orange',
        linestyle='--',
    #    marker='D',
        markersize=4,
        linewidth=1.2,
        label='Θv(K) CR'
    )

    # 12) Labels and title
    plt.xlabel('T, Θv (K)', fontsize=fontsize_labels)
    plt.ylabel('Altitude (m)', fontsize=fontsize_labels)
    plt.title(f'Profiles at {timestamp}', fontsize=fontsize_title)
    plt.ylim(-100,2000)
    # 13) Legend
    plt.legend(loc='best', fontsize=fontsize_ticks)

    # 14) Grid (light dashed)
    plt.grid(True, linestyle='--', alpha=0.4)

    # 15) Tick label sizes
    plt.xticks(fontsize=fontsize_ticks)
    plt.yticks(fontsize=fontsize_ticks)

    # 16) Thicken axis spines
    ax = plt.gca()
    for spine in ['left', 'bottom', 'right', 'top']:
        ax.spines[spine].set_linewidth(1.2)

    # 17) (Optional) invert y-axis if you want altitude increasing downward
    # plt.gca().invert_yaxis()

    plt.tight_layout()
    plt.show()


# ─────────────── Example of how to call it ───────────────
# Suppose you want the 10th ten‐minute timestamp in merged_data_10min:
# plot_temperature_profiles(9, merged_data_10min)
plot_temperature_profiles(38, merged_data_10min)
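
The Θv columns plotted above (`Theta_v_Alternative`, `theta_v_CR`) are precomputed upstream. For reference, a minimal sketch of the usual definition is given below: potential temperature from Poisson's equation, multiplied by a moisture factor. The function name, the 0.61 coefficient (a common approximation), and the inputs are assumptions for illustration; specific humidity is expected in kg/kg here, whereas the notebook's plots label qv in g/kg.

```python
import numpy as np

P0_HPA = 1000.0  # reference pressure, hPa
KAPPA = 0.2857   # R_d / c_p for dry air (approximate)

def virtual_potential_temperature(temp_k, qv_kg_per_kg, pressure_hpa):
    """Theta_v (K) from temperature (K), specific humidity (kg/kg), pressure (hPa).

    Neglects liquid water; uses the approximate factor (1 + 0.61 qv).
    """
    temp_k = np.asarray(temp_k, dtype=float)
    qv = np.asarray(qv_kg_per_kg, dtype=float)
    pressure_hpa = np.asarray(pressure_hpa, dtype=float)
    theta = temp_k * (P0_HPA / pressure_hpa) ** KAPPA
    return theta * (1.0 + 0.61 * qv)

# Synthetic example: three levels with decreasing pressure
theta_v = virtual_potential_temperature(
    temp_k=[290.0, 287.0, 284.0],
    qv_kg_per_kg=[0.008, 0.007, 0.006],
    pressure_hpa=[1010.0, 975.0, 940.0],
)
print(theta_v)
```

Applying this to the mast levels would need a pressure at each height, which the mast columns shown here do not provide directly.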

In [None]:
def plot_rh_profiles(
    time_index: int,
    merged_df,
    fontsize_labels=14,
    fontsize_ticks=12,
    fontsize_title=16
):
    """
    For a given integer row index in merged_df, plot humidity profiles for:
      • MWR RH (blue solid line) and CR RH (black dashed line)
      • Mast RH (green solid line with x-markers)
      • MWR AH (blue solid line) and CR AH (sky-blue dashed line) on a secondary x-axis

    merged_df must contain these columns (as array-like lists):
      - 'Radiometer_Altitude'
      - 'Radiometer_RH_Profile'
      - 'Radiometer_AH_Profile'
      - 'Mast_Heights_RH'
      - 'Mast_RH_Profile'
      - 'humidity_altitudes_CR'
      - 'rel_humidity_profile_CR'
      - 'abs_humidity_profile_CR'
      - 'TIMESTAMP' (or 'Time')
    """

    # 1) Extract the row
    row = merged_df.iloc[time_index]

    # 2) MWR (“Radiometer”) RH arrays
    radiometer_alt = row['Radiometer_Altitude']      # e.g. [0, 10, 25, …]
    radiometer_rh  = row['Radiometer_RH_Profile']    # e.g. [81.7, 79.9, …]
    radiometer_ah  = row['Radiometer_AH_Profile']      # e.g. [7.68, 7.62, …]

    # 3) CR (“Cloud Radar”) RH arrays
    cr_alt = row['humidity_altitudes_CR']            # e.g. [0, 20, 40, …]
    cr_rh  = row['rel_humidity_profile_CR']          # e.g. [75.2, 72.1, …]
    cr_ah  = row['abs_humidity_profile_CR']       # e.g. [5.1, 4.8, …]

    # 4) Mast RH arrays
    mast_alt = row['Mast_Heights_RH']                # e.g. [2, 2.99, 4.47, …]
    mast_rh  = row['Mast_RH_Profile']                # e.g. [94.0, 93.5, …]

    # 5) Timestamp for title
    timestamp = row.get('TIMESTAMP', row.get('Time', ''))

    # 6) Start the figure and primary axis (RH)
    fig, ax1 = plt.subplots(figsize=(4, 6))

    # 7) Plot RH on ax1 (bottom x‐axis)
    ax1.plot(
        radiometer_rh,
        radiometer_alt,
        color='blue',
        linestyle='-',
       # marker='o',
        #markersize=4,
        linewidth=1.5,
        label='RH MWR'
    )
    ax1.plot(
        cr_rh,
        cr_alt,
        color='black',
        linestyle='--',
        #marker='o',
        #markersize=4,
        linewidth=1.5,
        label='RH CR'
    )
    ax1.plot(
        mast_rh,
        mast_alt,
        color='green',
        linestyle='-',
        marker='x',
        markersize=6,
        linewidth=1.5,
        label='RH Mast'
    )

    # 8) Configure ax1 (RH axis)
    ax1.set_xlabel('RH (%)', fontsize=fontsize_labels)
    ax1.set_ylabel('Altitude (m)',          fontsize=fontsize_labels)
    ax1.tick_params(axis='x', labelsize=fontsize_ticks)
    ax1.tick_params(axis='y', labelsize=fontsize_ticks)
    ax1.grid(True, linestyle='--', alpha=0.4)
    for spine in ['left', 'bottom', 'right', 'top']:
        ax1.spines[spine].set_linewidth(1.2)

    # 9) Create secondary x‐axis (AH) sharing the same y‐axis
    ax2 = ax1.twiny()
    ax2.plot(
        radiometer_ah,
        radiometer_alt,
        color='blue',
        linestyle='-',
       # marker='s',
        #markersize=4,
        linewidth=1.2,
        label='AH MWR'
    )
    ax2.plot(
        cr_ah,
        cr_alt,
        color='skyblue',
        linestyle='--',
        #marker='v',
       # markersize=4,
        linewidth=1.2,
        label='AH CR'
    )

    # 10) Configure ax2 (AH axis)
    ax2.set_xlabel('AH (g/m³)', fontsize=fontsize_labels)
    ax2.tick_params(axis='x', labelsize=fontsize_ticks)
    for spine in ['left', 'bottom', 'right', 'top']:
        ax2.spines[spine].set_linewidth(1.2)
    plt.ylim(-100,3000)
    # 11) Title
    plt.title(f'Profiles at {timestamp}', fontsize=fontsize_title)

    # 12) Combine legends from both axes
    handles1, labels1 = ax1.get_legend_handles_labels()
    handles2, labels2 = ax2.get_legend_handles_labels()
    ax1.legend(handles1 + handles2, labels1 + labels2, loc='best', fontsize=fontsize_ticks)

    plt.tight_layout()
    plt.show()
    
plot_rh_profiles(68, merged_data_10min)
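
RH and AH share one figure above but are different quantities: AH is the water-vapour density, which follows from RH and temperature through the ideal gas law for vapour. The sketch below shows that conversion under stated assumptions: a Magnus-type saturation formula over water and a hypothetical function name. It is meant for sanity-checking magnitudes, not as the retrieval used by either instrument.

```python
import numpy as np

R_V = 461.5  # specific gas constant of water vapour, J kg^-1 K^-1

def absolute_humidity_g_m3(rh_percent, temp_k):
    """Absolute humidity (g/m^3) from RH (%) and temperature (K)."""
    rh = np.asarray(rh_percent, dtype=float)
    temp_k = np.asarray(temp_k, dtype=float)
    temp_c = temp_k - 273.15
    # Saturation vapour pressure over water in Pa (Magnus-type approximation)
    esat_pa = 611.2 * np.exp(17.62 * temp_c / (243.12 + temp_c))
    e_pa = rh / 100.0 * esat_pa
    # Ideal gas law for water vapour: rho_v = e / (R_v T), converted to g/m^3
    return 1000.0 * e_pa / (R_V * temp_k)

# Synthetic profile (illustrative values only)
ah = absolute_humidity_g_m3([80.0, 75.0, 60.0], [288.0, 286.0, 283.0])
print(ah)
```

Saturated air at 20 °C comes out near 17 g/m³ with this formula, which is a quick way to check the order of magnitude of the AH axis.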

In [None]:

def plot_qv_profiles(
    time_index: int,
    merged_df,
    fontsize_labels=14,
    fontsize_ticks=12,
    fontsize_title=16
):
    """
    For a given integer row index in merged_df, plot specific-humidity (qv) profiles for:
      • MWR qv (purple solid line)
      • CR qv  (black dashed line)
      • Mast qv (green solid line with x-markers)

    merged_df must contain these columns (as array-like lists):
      - 'Radiometer_Altitude'           (MWR altitudes)
      - 'Specific Humidity (qv)'        (MWR qv profile)
      - 'humidity_altitudes_CR'         (CR altitudes)
      - 'specific_humidity_CR'          (CR qv profile)
      - 'Mast_Heights_RH'               (mast altitudes)
      - 'Mast_qv_Profile'               (mast qv profile)
      - 'TIMESTAMP' (or 'Time')
    """

    # 1) Extract the selected row
    row = merged_df.iloc[time_index]

    # 2) MWR (“Radiometer”) qv arrays
    mwr_alt = row['Radiometer_Altitude']                       # e.g. [0, 10, 25, …]
    mwr_qv  = row['Specific Humidity (qv)']          # e.g. [7.27, 6.89, …]

    # 3) CR (“Cloud Radar”) qv arrays
    cr_alt = row['humidity_altitudes_CR']            # e.g. [0, 20, 40, …]
    cr_qv  = row['specific_humidity_CR']             # e.g. [6.14, 5.98, …]

    # 4) Mast qv arrays
    mast_alt = row['Mast_Heights_RH']              # e.g. [2, 2.99, 4.47, …]
    mast_qv  = row['Mast_qv_Profile']                # e.g. [7.12, 6.98, …]

    # 5) Timestamp for title
    timestamp = row.get('TIMESTAMP', row.get('Time', ''))

    # 6) Start the figure
    plt.figure(figsize=(4, 6))

    # 7) Plot MWR qv (purple solid line)
    plt.plot(
        mwr_qv,
        mwr_alt,
        color='purple',
        linestyle='-',
        #marker='o',
        #markersize=4,
        linewidth=1.5,
        label='qv MWR'
    )

    # 8) Plot CR qv (black dashed line)
    plt.plot(
        cr_qv,
        cr_alt,
        color='black',
        linestyle='--',
       # marker='o',
        #markersize=4,
        linewidth=1.5,
        label='qv CR'
    )

    # 9) Plot Mast qv (green solid, x‐marker)
    plt.plot(
        mast_qv,
        mast_alt,
        color='green',
        linestyle='-',
        marker='x',
        markersize=6,
        linewidth=1.5,
        label='qv Mast'
    )

    # 10) Labels and title
    plt.xlabel('qv (g/kg)', fontsize=fontsize_labels)
    plt.ylabel('Altitude (m)',             fontsize=fontsize_labels)
    plt.title(f'Profiles at {timestamp}', fontsize=fontsize_title)

    # 11) Legend
    plt.legend(loc='best', fontsize=fontsize_ticks)

    # 12) Grid (light dashed)
    plt.grid(True, linestyle='--', alpha=0.4)

    # 13) Tick label sizes
    plt.xticks(fontsize=fontsize_ticks)
    plt.yticks(fontsize=fontsize_ticks)
    plt.ylim(-100,3000)
    # 14) Thicken axis spines
    ax = plt.gca()
    for spine in ['left', 'bottom', 'right', 'top']:
        ax.spines[spine].set_linewidth(1.2)

    # 15) (Optional) invert y-axis if you want altitude increasing downward
    # plt.gca().invert_yaxis()

    plt.tight_layout()
    plt.show()
plot_qv_profiles(48, merged_data_10min)
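
Each plotting function above reads a fixed set of columns and raises a `KeyError` on the first one that is missing, which can happen when one of the earlier merges was skipped. A small pre-check, sketched below with a hypothetical helper name, lists every missing column at once before any plotting starts. The column list is the one read by `plot_qv_profiles`.

```python
import pandas as pd

def missing_columns(df, required):
    """Return the required column names that are absent from df, in order."""
    return [col for col in required if col not in df.columns]

# Columns read by plot_qv_profiles
QV_PLOT_COLUMNS = [
    "Radiometer_Altitude",
    "Specific Humidity (qv)",
    "humidity_altitudes_CR",
    "specific_humidity_CR",
    "Mast_Heights_RH",
    "Mast_qv_Profile",
]

# Synthetic one-row frame that lacks the cloud radar and MWR qv columns
demo = pd.DataFrame({
    "Radiometer_Altitude": [[0, 10]],
    "Mast_qv_Profile": [[7.1, 7.0]],
})

missing = missing_columns(demo, QV_PLOT_COLUMNS)
if missing:
    print(f"Cannot plot qv profiles, missing columns: {missing}")
```

The same helper could be called with the column lists from the other plotting docstrings.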


In [None]:

def plot_all_profiles(
    time_index: int,
    merged_df,
    alt_min=-10,
    alt_max=2000,
    fontsize_labels=20,
    fontsize_ticks=18,
    fontsize_title=20
):
    """
    Plot temperature/θv, RH/AH, and qv profiles side‐by‐side for a given time index.

    Parameters
    ----------
    time_index : int
        Integer index of the row in merged_df to plot.
    merged_df : pandas.DataFrame
        DataFrame containing all merged variables. Must include these columns (array‐like):
          - 'Radiometer_Altitude'
          - 'Temperature_Profile'
          - 'Theta_v_Alternative'
          - 'temperature_altitudes_CR'
          - 'temperature_profile_CR'
          - 'theta_v_CR'
          - 'Mast_Heights_Temp'
          - 'Mast_Temperature_Profile'
          - 'Mast_VDSE_Profile'
          - 'Radiometer_RH_Profile'
          - 'Radiometer_AH_Profile'
          - 'Mast_Heights_RH'
          - 'Mast_RH_Profile'
          - 'humidity_altitudes_CR'
          - 'rel_humidity_profile_CR'
          - 'abs_humidity_profile_CR'
          - 'Altitude'                     (MWR altitude for qv)
          - 'Specific Humidity (qv)'
          - 'specific_humidity_CR'
          - 'Mast_qv_Profile'
          - 'TIMESTAMP'  or 'Time'
    alt_min, alt_max : float, optional
        Lower and upper altitude limits (in meters) of the y-axis for all subplots.
        Defaults are -10 and 2000.
    fontsize_labels : int, optional
        Font size for x- and y-axis labels. Default is 20.
    fontsize_ticks : int, optional
        Font size for tick labels and legend text. Default is 18.
    fontsize_title : int, optional
        Font size for subplot titles. Default is 20.
    """

    # 1) Extract the single row by index
    row = merged_df.iloc[time_index]

    # 2) Timestamp string (for all subplot titles)
    timestamp = row.get('TIMESTAMP', row.get('Time', 'Unknown Time'))

    # ────────────────────────────────────────────────────────────────────────
    # 3) Pull out Temperature & Θᵥ profiles
    #    - MWR (“Radiometer”)
    mwr_alt       = row['Radiometer_Altitude']       # e.g. [0, 10, 25, …]
    mwr_temp      = row['Temperature_Profile']       # Temperature from MWR
    mwr_thetav    = row['Theta_v_Alternative']       # Θᵥ from MWR

    #    - CR (“Cloud Radar”)
    cr_alt_T      = row['temperature_altitudes_CR']  # e.g. [0, 20, 40, …]
    cr_temp       = row['temperature_profile_CR']    # Temperature from CR
    cr_thetav     = row['theta_v_CR']                # Θᵥ from CR

    #    - Mast (tower)
    mast_alt_T    = row['Mast_Heights_Temp']         # e.g. [2, 2.99, 4.47, …]
    mast_temp     = row['Mast_Temperature_Profile']  # Temperature from Mast
    mast_vdse     = row['Mast_VDSE_Profile']         # Virtual dry static energy from Mast
    # (the tower provides no direct Θᵥ; the mast virtual dry static energy
    # is plotted in its place under the legend label 'Θᵥ (Mast)')

    # ────────────────────────────────────────────────────────────────────────
    # 4) Pull out RH & AH profiles
    #    - MWR (“Radiometer”)
    mwr_alt_RH    = row['Radiometer_Altitude']       # same altitude vector
    mwr_rh        = row['Radiometer_RH_Profile']     # RH from MWR
    mwr_ah        = row['Radiometer_AH_Profile']     # AH from MWR

    #    - CR (“Cloud Radar”)
    cr_alt_RH     = row['humidity_altitudes_CR']     # e.g. [0, 20, 40, …]
    cr_rh         = row['rel_humidity_profile_CR']   # RH from CR
    cr_ah         = row['abs_humidity_profile_CR']   # AH from CR

    #    - Mast (tower)
    mast_alt_RH   = row['Mast_Heights_RH']           # e.g. [2, 2.99, 4.47, …]
    mast_rh       = row['Mast_RH_Profile']           # RH from Mast
    # (no Mast AH, but if needed, could compute from Mast qv & T)

    # ────────────────────────────────────────────────────────────────────────
    # 5) Pull out qᵥ profiles
    #    - MWR (“Radiometer”)
    mwr_alt_qv    = row['Radiometer_Altitude']                   # MWR altitude vector (from mwr_vertical)
    mwr_qv        = row['Specific Humidity (qv)']      # qᵥ (g/kg) from MWR

    #    - CR (“Cloud Radar”)
    cr_alt_qv     = row['humidity_altitudes_CR']      # same CR altitude vector
    cr_qv         = row['specific_humidity_CR']       # qᵥ (g/kg) from CR

    #    - Mast (tower)
    mast_alt_qv   = row['Mast_Heights_RH']            # mast humidity heights (same as for RH)
    mast_qv       = row['Mast_qv_Profile']            # qᵥ (g/kg) from Mast

    # ────────────────────────────────────────────────────────────────────────
    # 6) Open figure with 3 side‐by‐side subplots, sharing the y‐axis
    fig, (ax_T, ax_RH, ax_qv) = plt.subplots(
        nrows=1,
        ncols=3,
        figsize=(12, 6),
        sharey=True
    )

    # ────────────────────────────────────────────────────────────────────────
    # 7) Panel 1: Temperature & Θᵥ
    # ------------------------------------------------
   
    # Plot temperature: solid line (MWR), dashed line (CR), solid line with x markers (Mast)
    ax_T.plot(
        mwr_temp,
        mwr_alt,
        color='red',
        linestyle='-',
        #marker='o',
       # markersize=4,
        linewidth=1.5,
        label='T (MWR)'
    )
    ax_T.plot(
        cr_temp,
        cr_alt_T,
        color='red',
        linestyle='--',
       # marker='o',
       # markersize=4,
        linewidth=1.5,
        label='T (CR)'
    )
    ax_T.plot(
        mast_temp,
        mast_alt_T,
        color='green',
        linestyle='-',
        marker='x',
        markersize=6,
        linewidth=1.5,
        label='T (Mast)'
    )

    # Plot Θᵥ: solid line (MWR), dashed line (CR); the mast curve is virtual dry static energy
    ax_T.plot(
        mwr_thetav,
        mwr_alt,
        color='orange',
        linestyle='-',
        #marker='D',
       # markersize=4,
        linewidth=1.2,
        label='Θᵥ (MWR)'
    )
    ax_T.plot(
        cr_thetav,
        cr_alt_T,
        color='orange',
        linestyle='--',
        #marker='D',
        #markersize=4,
        linewidth=1.2,
        label='Θᵥ (CR)'
    )
    
    ax_T.plot(
        mast_vdse,
        mast_alt_T,
        color='black',
        linestyle='-',
        marker='x',
        markersize=6,
        linewidth=1.5,
        label='Θᵥ (Mast)'
    )
    # Configure Panel 1
    ax_T.set_xlabel('T, Θᵥ (K)', fontsize=fontsize_labels)
    ax_T.set_ylabel('Altitude (m)', fontsize=fontsize_labels)
    ax_T.tick_params(axis='both', labelsize=fontsize_ticks)
    ax_T.grid(True, linestyle='--', alpha=0.4)
    ax_T.set_ylim(alt_min, alt_max)
    ax_T.set_xlim(275,300)
    for spine in ax_T.spines.values():
        spine.set_linewidth(1.2)

   # ax_T.legend(loc='upper left', fontsize=fontsize_ticks - 2)

    # ────────────────────────────────────────────────────────────────────────
    # 8) Panel 2: RH (bottom x‐axis) & AH (top x‐axis)
    # ------------------------------------------------
    # RH on primary axis (ax_RH)
    ax_RH.plot(
        mwr_rh,
        mwr_alt_RH,
        color='blue',
        linestyle='-',
        #marker='o',
        #markersize=4,
        linewidth=1.5,
        label='RH (MWR)'
    )
    ax_RH.plot(
        cr_rh,
        cr_alt_RH,
        color='black',
        linestyle='--',
       # marker='o',
        #markersize=4,
        linewidth=1.5,
        label='RH (CR)'
    )
    ax_RH.plot(
        mast_rh,
        mast_alt_RH,
        color='green',
        linestyle='-',
        marker='x',
        markersize=6,
        linewidth=1.5,
        label='RH (Mast)'
    )

    ax_RH.set_xlabel('RH (%)', fontsize=fontsize_labels)
    ax_RH.set_title(f'Profiles at {timestamp}', fontsize=fontsize_title)
    ax_RH.tick_params(axis='both', labelsize=fontsize_ticks)
    ax_RH.grid(True, linestyle='--', alpha=0.4)
    ax_RH.set_ylim(alt_min, alt_max)
    ax_RH.set_xlim(45, 100)

    for spine in ax_RH.spines.values():
        spine.set_linewidth(1.2)

    # Create twin x‐axis for AH
    ax_AH = ax_RH.twiny()
    ax_AH.plot(
        mwr_ah,
        mwr_alt_RH,
        color='gray',
        linestyle='-',
        #marker='s',
        #markersize=4,
        linewidth=1.2,
        label='AH (MWR)'
    )
    ax_AH.plot(
        cr_ah,
        cr_alt_RH,
        color='skyblue',
        linestyle='--',
       # marker='v',
        #markersize=4,
        linewidth=1.2,
        label='AH (CR)'
    )
    ax_AH.set_xlabel('AH (g/m³)', fontsize=fontsize_labels)
    ax_AH.tick_params(axis='x', labelsize=fontsize_ticks)
    for spine in ax_AH.spines.values():
        spine.set_linewidth(1.2)

    # Combine the legend entries of the RH axis and its twin AH axis into one legend
    handles_RH, labels_RH = ax_RH.get_legend_handles_labels()
    handles_AH, labels_AH = ax_AH.get_legend_handles_labels()
    handles_all = handles_RH + handles_AH
    labels_all = labels_RH + labels_AH
    ax_RH.legend(handles_all, labels_all, loc='upper right', fontsize=fontsize_ticks - 2)

  
    # ────────────────────────────────────────────────────────────────────────
    # 9) Panel 3: Specific Humidity (qᵥ)
    # ------------------------------------------------
    ax_qv.plot(
        mwr_qv,
        mwr_alt_qv,
        color='purple',
        linestyle='-',
        #marker='o',
        #markersize=4,
        linewidth=1.5,
        label='qᵥ (MWR)'
    )
    ax_qv.plot(
        cr_qv,
        cr_alt_qv,
        color='black',
        linestyle='--',
        #marker='o',
        #markersize=4,
        linewidth=1.5,
        label='qᵥ (CR)'
    )
    ax_qv.plot(
        mast_qv,
        mast_alt_qv,
        color='green',
        linestyle='-',
        marker='x',
        markersize=6,
        linewidth=1.5,
        label='qᵥ (Mast)'
    )

    ax_qv.set_xlabel('qᵥ (g/kg)', fontsize=fontsize_labels)
    #ax_qv.set_title(f'Profiles at {timestamp}\nqᵥ', fontsize=fontsize_title)
    ax_qv.tick_params(axis='both', labelsize=fontsize_ticks)
    ax_qv.grid(True, linestyle='--', alpha=0.4)
    ax_qv.set_ylim(alt_min, alt_max)
    ax_qv.set_xlim(2, 10)

    #ax_qv.set_xlim(4,11)
    for spine in ax_qv.spines.values():
        spine.set_linewidth(1.2)

    ax_qv.legend(loc='upper right', fontsize=fontsize_ticks - 2)

    # ────────────────────────────────────────────────────────────────────────
    # 10) Mark the parcel-method CBL height and the LCL estimates on all three
    #     panels (when available). zi_parcel is in m; the LCL values are in km.
    zi_parcel = row.get('zi_parcel', None)
    lcl_qv = row.get('LCL_qv_T_2.0m', None)
    lcl_romps = row.get('LCL_romps_km', None)
    if zi_parcel is not None and not pd.isna(zi_parcel):
        for ax in [ax_T, ax_RH, ax_qv]:
            ax.axhline(zi_parcel, color='magenta', linestyle='--', linewidth=1.5, label='CBL (Parcel)')
    if lcl_qv is not None and not pd.isna(lcl_qv):
        for ax in [ax_T, ax_RH, ax_qv]:
            ax.axhline(lcl_qv * 1000, color='brown', linestyle='--', linewidth=1.5, label='LCL (qv)')
    if lcl_romps is not None and not pd.isna(lcl_romps):
        for ax in [ax_T, ax_RH, ax_qv]:
            ax.axhline(lcl_romps * 1000, color='green', linestyle='--', linewidth=1.5, label='LCL (Romps)')

    # 11) Rebuild the legends on every call so they include the horizontal lines
    ax_T.legend(loc='upper left', fontsize=fontsize_ticks - 8)
    ax_RH.legend(loc='upper right', fontsize=fontsize_ticks - 8)
    ax_AH.legend(loc='right', fontsize=fontsize_ticks - 8)
    ax_qv.legend(loc='upper right', fontsize=fontsize_ticks - 8)
    plt.tight_layout(rect=[0.1, 0.05, 1, 0.95])
    plt.show()
plot_all_profiles(38,merged_data_10min)
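
`plot_all_profiles` indexes the selected row with hard-coded column names, so a missing column only shows up as a `KeyError` partway through the function. A minimal sketch of an up-front check is given below. The helper name `missing_profile_columns` and the shortened `REQUIRED_PROFILE_COLUMNS` list are assumptions made for illustration and are not part of the notebook.

```python
import pandas as pd

# Hypothetical subset of the columns that plot_all_profiles reads;
# extend it with the full list from the docstring before real use.
REQUIRED_PROFILE_COLUMNS = [
    "Radiometer_Altitude",
    "Temperature_Profile",
    "Theta_v_Alternative",
    "Mast_Heights_Temp",
    "Mast_Temperature_Profile",
    "Mast_VDSE_Profile",
]


def missing_profile_columns(df: pd.DataFrame, required=REQUIRED_PROFILE_COLUMNS):
    """Return the required column names that are absent from df, in order."""
    return [col for col in required if col not in df.columns]


# Toy frame that lacks two of the expected columns.
demo = pd.DataFrame(
    {
        "Radiometer_Altitude": [[0, 10, 25]],
        "Temperature_Profile": [[288.0, 287.9, 287.7]],
        "Theta_v_Alternative": [[289.0, 289.0, 289.1]],
        "Mast_Heights_Temp": [[2, 2.99, 4.47]],
    }
)
missing = missing_profile_columns(demo)
print(missing)
```

Calling the check before plotting gives one readable list of absent columns instead of a failure on the first missing key.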


### Bulk relations

In [None]:
# Write the column names of merged_data_10min into a simple one-page PDF
column_names = merged_data_10min.columns.tolist()

fig, ax = plt.subplots(figsize=(8, len(column_names)*0.3))
ax.axis('off')

for i, col in enumerate(column_names):
    ax.text(0.01, 1 - (i+1)*0.03, col, fontsize=12, va='top')

# NOTE: edit this path before running
pdf_path = r"C:\path\to\your\Master_Thesis\merged_columns_list.pdf"

plt.savefig(pdf_path, format='pdf', bbox_inches='tight')
plt.close(fig)

print(f"Saved column list to {pdf_path}")

In [None]:
Cd = 10e-3  # fixed value (0.01); unused below, where the per-row 'Cd' column is applied
U10 = merged_data_10min['WS_ms_D15463_Avg']  # 10 m wind speed
# Friction velocity u* = (uw^2 + vw^2)^(1/4) and diagnosed drag coefficient Cd = (u*/U)^2
merged_data_10min['u_star'] = (merged_data_10min['uw_flux_corr']**2 + merged_data_10min['vw_flux_corr']**2)**0.25
merged_data_10min['Cd'] = (merged_data_10min['u_star'] / U10)**2
# Bulk sensible and latent heat fluxes (W/m^2); /1000 converts g/kg to kg/kg
merged_data_10min['SHF_bulk'] = (merged_data_10min['rho_air_Tv'] * Cp * merged_data_10min['Cd'] * U10
                                 * (merged_data_10min['T_srf'] - merged_data_10min['Dry_Static_Energy_10']))
merged_data_10min['LHF_bulk'] = (merged_data_10min['rho_air_Tv'] * Lv * merged_data_10min['Cd'] * U10
                                 * (merged_data_10min['qsat_srf'] - merged_data_10min['qv_10m']) / 1000)
print(merged_data_10min)
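
The cell above diagnoses a drag coefficient from the friction velocity, C_d = (u*/U)^2, and inserts it into the bulk formulas H = ρ c_p C_d U (T_srf − s_10/c_p) and LE = ρ L_v C_d U (q_sat,srf − q_v,10). The sketch below wraps the same arithmetic in a function and evaluates it for one hand-checkable sample. The function name `bulk_fluxes`, the sample inputs, and the constants `CP` and `LV` are assumptions; the notebook defines its own `Cp` and `Lv` earlier, and their values may differ.

```python
import numpy as np

CP = 1004.0   # J kg^-1 K^-1, assumed value (the notebook defines its own Cp)
LV = 2.5e6    # J kg^-1, assumed value (the notebook defines its own Lv)


def bulk_fluxes(rho, wind, uw, vw, t_srf, t_air, q_srf_gkg, q_air_gkg):
    """Bulk sensible and latent heat flux (W m^-2) with C_d diagnosed from u*."""
    u_star = (uw**2 + vw**2) ** 0.25          # friction velocity, m/s
    cd = (u_star / wind) ** 2                 # diagnosed drag coefficient
    shf = rho * CP * cd * wind * (t_srf - t_air)
    # humidity is given in g/kg, so divide by 1000 to get kg/kg
    lhf = rho * LV * cd * wind * (q_srf_gkg - q_air_gkg) / 1000.0
    return u_star, cd, shf, lhf


# Made-up sample values, chosen so the result is easy to verify by hand.
u_star, cd, shf, lhf = bulk_fluxes(
    rho=1.2, wind=4.0, uw=-0.16, vw=0.0,
    t_srf=295.0, t_air=293.0, q_srf_gkg=12.0, q_air_gkg=8.0,
)
print(f"u*={u_star:.2f} m/s, Cd={cd:.4f}, SHF={shf:.1f} W/m2, LHF={lhf:.1f} W/m2")
```

Because C_d is divided by U², both fluxes scale as u*²/U, so near-calm 10-minute intervals can produce very large bulk values and may deserve a wind-speed threshold.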

In [None]:
# Sign agreement between the bulk estimates and the measured fluxes.
# Rows where either flux is NaN compare as False and therefore count as disagreement.
merged_data_10min['SHF_sign_match'] = np.sign(merged_data_10min['SHF_bulk']) == np.sign(merged_data_10min['SHF'])
merged_data_10min['LHF_sign_match'] = np.sign(merged_data_10min['LHF_bulk']) == np.sign(merged_data_10min['LHF'])


# Calculate the separate Sign Agreement Percentages for SHF and LHF
shf_sign_agreement_percentage = merged_data_10min['SHF_sign_match'].mean() * 100
lhf_sign_agreement_percentage = merged_data_10min['LHF_sign_match'].mean() * 100

# Output the results
print(f"SHF Sign Agreement Percentage: {shf_sign_agreement_percentage:.2f}%")
print(f"LHF Sign Agreement Percentage: {lhf_sign_agreement_percentage:.2f}%")
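
In the comparison above, `np.sign(NaN) == np.sign(NaN)` evaluates to `False`, so every row with a missing flux lowers the agreement percentage. The sketch below restricts the comparison to rows where both series are present. The function name `sign_agreement_pct` and the toy data are assumptions for illustration only.

```python
import numpy as np
import pandas as pd


def sign_agreement_pct(a: pd.Series, b: pd.Series) -> float:
    """Percentage of rows where a and b share a sign, ignoring NaN rows."""
    valid = a.notna() & b.notna()
    if not valid.any():
        return float("nan")
    match = np.sign(a[valid]) == np.sign(b[valid])
    return float(match.mean() * 100)


# Toy data: one sign mismatch and one missing measurement.
demo = pd.DataFrame(
    {
        "SHF": [50.0, -10.0, np.nan, 20.0],
        "SHF_bulk": [40.0, 5.0, 30.0, 25.0],
    }
)
naive = (np.sign(demo["SHF_bulk"]) == np.sign(demo["SHF"])).mean() * 100
masked = sign_agreement_pct(demo["SHF_bulk"], demo["SHF"])
print(f"naive: {naive:.1f}%, NaN rows excluded: {masked:.1f}%")
```

Which of the two numbers to report depends on whether missing data should count against the bulk method; stating the number of valid rows next to the percentage avoids ambiguity.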



In [None]:
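# Plot measured and bulk sensible heat flux against time
# Convert 'Time' to datetime and rename it to 'TIMESTAMP' (skipped if this cell has already run)
if 'Time' in merged_data_10min.columns:
    merged_data_10min['Time'] = pd.to_datetime(merged_data_10min['Time'])
    merged_data_10min.rename(columns={'Time': 'TIMESTAMP'}, inplace=True)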


plt.figure(figsize=(10, 6))
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['SHF'], s=10, alpha=0.7,label='SHF')
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['SHF_bulk'], s=10, alpha=0.7,label='SHF_bulk')

# Format the x-axis to show time in HH:MM format and set major ticks
plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=2))  # Major tick every 2 hours
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))

# Rotate the x-axis labels for better readability
plt.gcf().autofmt_xdate()

plt.xlabel('Time')
plt.ylabel('Heat Flux (W/m^2)')
plt.title('SHF vs Time (averaged in 10 min intervals)')
plt.legend()
plt.grid(True)
#plt.tight_layout()
plt.savefig(os.path.join(data_dir, 'shf_bulk_plot_10min.png'), dpi=300)  # Save the plot as a PNG file

plt.show()

In [None]:
# Plot measured and bulk latent heat flux against time
# Ensure TIMESTAMP is in datetime format
merged_data_10min['TIMESTAMP'] = pd.to_datetime(merged_data_10min['TIMESTAMP'])
plt.figure(figsize=(10, 6))
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['LHF'], s=10, alpha=0.7,label='LHF')
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['LHF_bulk'], s=10, alpha=0.7,label='LHF_bulk')

# Format the x-axis to show time in HH:MM format and set major ticks
plt.gca().xaxis.set_major_locator(mdates.HourLocator(interval=2))  # Major tick every 2 hours
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))

# Rotate the x-axis labels for better readability
plt.gcf().autofmt_xdate()

plt.xlabel('Time')
plt.ylabel('Heat Flux (W/m^2)')
plt.title('LHF vs Time (averaged in 10 min intervals)')
plt.legend()
plt.grid(True)
#plt.tight_layout()
plt.savefig(os.path.join(data_dir, 'lhf_bulk_plot_10min.png'), dpi=300)  # Save the plot as a PNG file

plt.show()

### Diurnal plots

In [None]:
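# --- Keep only rows without rain (Flag_Rain == 0) and without a temperature flag (Flag == 0) ---
# .copy() makes filtered_df independent, so the column edits below do not raise SettingWithCopyWarning
filtered_df = merged_data_10min[(merged_data_10min.get('Flag_Rain', 0) == 0) & (merged_data_10min.get('Flag', 0) == 0)].copy()
filtered_df['TIMESTAMP'] = pd.to_datetime(filtered_df['TIMESTAMP'])
filtered_df.rename(columns={'TIMESTAMP': 'Time'}, inplace=True)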


# Set up the figure and axes
fig, ax = plt.subplots(figsize=(10, 6))

# Heights dictionary
heights = {
    'Temperature_K_2': '2 m',
    'Temperature_K_2.99': '2.99 m',
    'Temperature_K_4.47': '4.47 m',
    'Temperature_K_6.69': '6.69 m',
    'Temperature_K_10': '10 m'
}

# Plot temperatures at different mast heights with thicker lines
for col, label in heights.items():
    ax.plot(
        filtered_df['Time'],
        filtered_df[col],
        label=f'{label}',
        linewidth=1.5
    )

# Plot T_srf (from longwave radiation) with a bolder dashed line
ax.plot(
    filtered_df['Time'],
    filtered_df['T_srf'],
    label='Surface T$_{srf}$',
    color='black',
    linestyle='--',
    linewidth=2.0
)

# --- Formatting Spines (thicker borders) ---
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# --- Tick parameters ---
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# --- Format x-axis to show every 2 hours in HH:MM ---
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels for readability
plt.xticks(rotation=45)

# --- Labels and title with bold font ---
ax.set_title('Diurnal Variation of Air and Surface Temperature', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Temperature (K)', fontsize=20, fontweight='bold')

# --- Grid styling ---
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# --- Legend styling ---
legend = ax.legend(
    title='Measurement Height',
    fontsize=14,
    title_fontsize=14,
    frameon=True
)
legend.get_frame().set_linewidth(1.5)

# --- Y‐axis tick label size ---
ax.tick_params(axis='y', labelsize=18)


# --- Tight layout and display ---
plt.tight_layout()
output_filename = 'diurnal_temperature_plot.png'
output_path = os.path.join(folder_path, output_filename)
plt.savefig(output_path, dpi=300,bbox_inches='tight')

# Print confirmation
print(f"Plot saved to: {output_path}")
plt.show()

In [None]:
# Define heights and corresponding variable names
heights = ['2', '2.99', '4.47', '6.69', '10']
sdry_vars = [f'Dry_Static_Energy_{h}' for h in heights]
qv_vars = [f'qv_{h}m' for h in heights]
# Mapping of wind speed columns
wind_speed_cols = {
    'WS_ms_D15008_Avg': '2 m',
    'WS_ms_D15014_Avg': '4.47 m',
    'WS_ms_D15463_Avg': '10 m'
}

# 1) Diurnal Variation of Dry Static Energy (s_dry / c_p)
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

for h, var in zip(heights, sdry_vars):
    ax.plot(
        filtered_df['Time'],
        filtered_df[var],
        label=f'{h} m',
        linewidth=1.5
    )

# Title and labels
ax.set_title('Diurnal Variation of Dry Static Energy (s$_{dry}/c_p$)',
             fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Dry Static Energy (K)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick formatting
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis date formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend
legend = ax.legend(title='Measurement Height', fontsize=14, title_fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Rotate x‐tick labels
plt.xticks(rotation=45)

plt.tight_layout()

# Save figure
output_path = os.path.join(folder_path, 'dry_static_energy_diurnal.png')
plt.savefig(output_path, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {output_path}")

plt.show()


# ------------------------------------------------------------
# 2) Diurnal Variation of Specific Humidity (q_v)
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

for h, var in zip(heights, qv_vars):
    ax.plot(
        filtered_df['Time'],
        filtered_df[var],
        label=f'{h} m',
        linewidth=1.5
    )

# Title and labels
ax.set_title('Diurnal Variation of Specific Humidity (q$_v$)',
             fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Specific Humidity (g/kg)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick formatting
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis date formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend
legend = ax.legend(title='Measurement Height', fontsize=14, title_fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Rotate x‐tick labels
plt.xticks(rotation=45)

plt.tight_layout()

# Save figure
output_path = os.path.join(folder_path, 'specific_humidity_diurnal.png')
plt.savefig(output_path, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {output_path}")

plt.show()


# ------------------------------------------------------------
# 3) Diurnal Variation of Virtual Dry Static Energy (s_{dry,v}/c_p)
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

for h in heights:
    col = f'Virtual_Dry_Static_Energy_{h}'
    ax.plot(
        filtered_df['Time'],
        filtered_df[col],
        label=f'{h} m',
        linewidth=1.5
    )

# Title and labels
ax.set_title('Diurnal Variation of Virtual Dry Static Energy (s$_{dry,v}/c_p$)',
             fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Virtual Dry Static Energy (K)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick formatting
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis date formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend
legend = ax.legend(title='Measurement Height', fontsize=14, title_fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Rotate x‐tick labels
plt.xticks(rotation=45)

plt.tight_layout()

# Save figure
output_path = os.path.join(folder_path, 'virtual_dry_static_energy_diurnal.png')
plt.savefig(output_path, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {output_path}")

plt.show()


# ------------------------------------------------------------
# 4) Diurnal Variation of Wind Speed at Different Heights
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

for col, label in wind_speed_cols.items():
    ax.plot(
        filtered_df['Time'],
        filtered_df[col],
        label=f'{label}',
        linewidth=1.5
    )

# Title and labels
ax.set_title('Diurnal Variation of Wind Speed',
             fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Wind Speed (m/s)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick formatting
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis date formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend
legend = ax.legend(title='Measurement Height', fontsize=14, title_fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Rotate x‐tick labels
plt.xticks(rotation=45)

plt.tight_layout()

# Save figure
output_path = os.path.join(folder_path, 'wind_speed_diurnal.png')
plt.savefig(output_path, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {output_path}")

plt.show()
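
The four diurnal plots above repeat the same spine, tick, date-axis, grid, and legend settings. One way to reduce the repetition is a small styling helper, sketched below with synthetic data. The name `style_diurnal_axis` and the sine-wave series are assumptions for illustration; the styling values are copied from the cells above.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def style_diurnal_axis(ax, title, ylabel, legend_title="Measurement Height"):
    """Apply the shared presentation styling used by the diurnal plots."""
    ax.set_title(title, fontsize=20, fontweight="bold")
    ax.set_xlabel("Time", fontsize=20, fontweight="bold")
    ax.set_ylabel(ylabel, fontsize=20, fontweight="bold")
    for spine in ax.spines.values():
        spine.set_linewidth(1.5)
    ax.tick_params(axis="both", which="major", labelsize=18, width=1.5, length=6)
    ax.tick_params(axis="both", which="minor", width=1.0, length=4)
    ax.tick_params(axis="x", labelrotation=45)
    # HH:MM labels every 2 hours, minor ticks every 30 minutes
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
    ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
    ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))
    ax.grid(True, which="major", linestyle="--", linewidth=0.8, alpha=0.7)
    ax.grid(True, which="minor", linestyle=":", linewidth=0.5, alpha=0.5)
    legend = ax.legend(title=legend_title, fontsize=14, title_fontsize=14, frameon=True)
    legend.get_frame().set_linewidth(1.5)
    return legend


# Synthetic one-day series at 10-minute resolution (not real mast data).
times = pd.date_range("2024-06-01", periods=144, freq="10min")
wave = 5 * np.sin(np.linspace(0, 2 * np.pi, 144))
fig, ax = plt.subplots(figsize=(10, 6))
for height, offset in [("2 m", 0.0), ("10 m", -0.5)]:
    ax.plot(times, 288 + wave + offset, label=height, linewidth=1.5)
legend = style_diurnal_axis(ax, "Diurnal Variation of Temperature", "Temperature (K)")
fig.tight_layout()
```

With such a helper, each diurnal plot reduces to its plotting loop, one styling call, and the `savefig` line.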


In [None]:
# Define measurement heights in meters
height_levels = [2, 2.99, 4.47, 6.69, 10]
height_pairs = list(zip(height_levels[:-1], height_levels[1:]))

# Calculate vertical gradients (lapse rates) of s_{dry,v}/c_p between levels
for h1, h2 in height_pairs:
    col1 = f'Virtual_Dry_Static_Energy_{h1}'
    col2 = f'Virtual_Dry_Static_Energy_{h2}'
    lapse_col = f'Lapse_sdryv_{h1}-{h2}_Kperm'
    
    dz = h2 - h1  # height difference in meters
    filtered_df[lapse_col] = (filtered_df[col2] - filtered_df[col1]) / dz

# Plot the lapse rates
fig, ax = plt.subplots(figsize=(10, 6))

for h1, h2 in height_pairs:
    lapse_col = f'Lapse_sdryv_{h1}-{h2}_Kperm'
    ax.plot(filtered_df['Time'], filtered_df[lapse_col], label=f'{h1}-{h2} m', linewidth=1.5)

# Formatting
ax.set_title('Vertical Gradient of Virtual Dry Static Energy (ds$_{dry,v}$/dz)', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Gradient (K/m)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.legend(title='Layer', fontsize=14, title_fontsize=14, frameon=True)
plt.xticks(rotation=45)
plt.tight_layout()

# Save figure
output_path = os.path.join(folder_path, 'lapse_rate_sdryv_diurnal.png')
plt.savefig(output_path, dpi=300, bbox_inches='tight')
print(f"Plot saved to: {output_path}")

plt.show()

In [None]:
# Define measurement heights in meters
height_levels = [2, 2.99, 4.47, 6.69, 10]
height_pairs = list(zip(height_levels[:-1], height_levels[1:]))

# Calculate vertical gradients (lapse rates) of s_{dry,v}/c_p between levels
for h1, h2 in height_pairs:
    col1 = f'Virtual_Dry_Static_Energy_{h1}'
    col2 = f'Virtual_Dry_Static_Energy_{h2}'
    lapse_col = f'Lapse_sdryv_{h1}-{h2}_Kperm'
    
    dz = h2 - h1  # height difference in meters
    filtered_df[lapse_col] = (filtered_df[col2] - filtered_df[col1]) / dz

# Plot the lapse rates
fig, ax = plt.subplots(figsize=(10, 6))

for h1, h2 in height_pairs:
    lapse_col = f'Lapse_sdryv_{h1}-{h2}_Kperm'

    # Plot curve and grab its assigned color
    line, = ax.plot(filtered_df['Time'], filtered_df[lapse_col],
                    label=f'{h1}-{h2} m', linewidth=1.5)
    color = line.get_color()

    # ---- Find zero-crossings for this gradient ----
    grad = filtered_df[lapse_col].to_numpy()
    time_vals = filtered_df['Time'].to_numpy()
    sign = np.sign(grad)

    # zero-crossings = where sign changes between consecutive points
    zc_idx = np.where(sign[1:] * sign[:-1] < 0)[0] + 1
    zc_times = time_vals[zc_idx]

    # Draw vertical lines in the same color as the curve
    for t in zc_times:
        ax.axvline(t, color=color, linestyle='--', alpha=0.9, linewidth=1)

# Formatting
ax.set_title('Vertical Gradient of Virtual Dry Static Energy (ds$_{dry,v}$/dz)', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Gradient (K/m)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)

ax.legend(title='Layer', fontsize=14, title_fontsize=14, frameon=True)
plt.xticks(rotation=45)
plt.tight_layout()

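# Save figure (same filename as the previous cell, so this version with
# zero-crossing markers overwrites the earlier plot)
output_path = os.path.join(folder_path, 'lapse_rate_sdryv_diurnal.png')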
plt.savefig(output_path, dpi=300, bbox_inches='tight')
print(f"Plot saved to: {output_path}")

plt.show()
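
The zero-crossing search above marks the first sample after each sign change, so each vertical line can sit up to one 10-minute step after the actual crossing. The sketch below uses the same sign-change test and adds a linear interpolation between the two bracketing samples. The interpolation step and the name `zero_crossing_times` are additions for illustration, not part of the notebook, and the sample values are made up.

```python
import numpy as np
import pandas as pd


def zero_crossing_times(times: pd.Series, values: pd.Series) -> pd.DatetimeIndex:
    """Linearly interpolated times at which values changes sign."""
    v = values.to_numpy(dtype=float)
    t = pd.to_datetime(times).to_numpy()
    # index of the first sample after each strict sign change (NaNs never match)
    idx = np.where(np.sign(v[1:]) * np.sign(v[:-1]) < 0)[0] + 1
    # fraction of the interval at which the straight line through the
    # two bracketing samples reaches zero
    frac = v[idx - 1] / (v[idx - 1] - v[idx])
    dt = t[idx] - t[idx - 1]
    return pd.DatetimeIndex(t[idx - 1] + dt * frac)


# Made-up gradient samples at 10-minute spacing with two sign changes.
times = pd.Series(pd.date_range("2024-06-01 06:00", periods=4, freq="10min"))
grad = pd.Series([0.5, -0.5, -0.25, 0.75])
crossings = zero_crossing_times(times, grad)
print(crossings)
```

Samples that are exactly zero are skipped by the strict `< 0` test, which matches the behavior of the cell above.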



In [None]:
# 1) Longwave Radiation (LW) Over Time
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Plot downward, upward, and net LW with thicker lines
ax.plot(
    filtered_df['Time'],
    filtered_df['IR20Dn'],
    label='LW↓ (Downward)',
    color='blue',
    linewidth=2.0
)
ax.plot(
    filtered_df['Time'],
    filtered_df['IR20Up'],
    label='LW↑ (Upward)',
    color='orange',
    linewidth=2.0
)
ax.plot(
    filtered_df['Time'],
    filtered_df['NetRl'],
    label='LW$_{net}$ (Net)',
    color='green',
    linewidth=2.0
)

# Title and labels in bold
ax.set_title('Longwave Radiation (LW) vs. Time', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Radiation (W/m²)', fontsize=20, fontweight='bold')

# Thicken all spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters for major and minor ticks
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: show HH:MM every 2 hours, minor ticks every 30 minutes
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels for readability
plt.xticks(rotation=45)

# Grid styling: dashed major, dotted minor
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(fontsize=14, title='LW Components', title_fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Y‐axis tick label size
ax.tick_params(axis='y', labelsize=18)

plt.tight_layout()

# Save figure
lw_output = os.path.join(folder_path, 'longwave_radiation_diurnal.png')
plt.savefig(lw_output, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {lw_output}")

plt.show()


# ------------------------------------------------------------
# 2) Shortwave Radiation (SW) Over Time
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Plot downward, upward, and net SW with thicker lines
ax.plot(
    filtered_df['Time'],
    filtered_df['SR15D1Dn_Irr'],
    label='SW↓ (Downward)',
    color='purple',
    linewidth=2.0
)
ax.plot(
    filtered_df['Time'],
    filtered_df['SR15D1Up_Irr'],
    label='SW↑ (Upward)',
    color='red',
    linewidth=2.0
)
ax.plot(
    filtered_df['Time'],
    filtered_df['NetRs'],
    label='SW$_{net}$ (Net)',
    color='darkgreen',
    linewidth=2.0
)

# Title and labels in bold
ax.set_title('Shortwave Radiation (SW) vs. Time', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Radiation (W/m²)', fontsize=20, fontweight='bold')

# Thicken all spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(fontsize=14, title='SW Components', title_fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Y‐axis tick label size
ax.tick_params(axis='y', labelsize=18)

plt.tight_layout()

# Save figure
sw_output = os.path.join(folder_path, 'shortwave_radiation_diurnal.png')
plt.savefig(sw_output, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {sw_output}")

plt.show()
# ------------------------------------------------------------
# 3) Net Radiation Over Time
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Plot measured net radiation (Net_Radiation_10min) as a line
ax.plot(
    filtered_df['Time'],
    filtered_df['Net_Radiation_10min'],
    label='F$_{net}$ (Net)',

    color='red',
    linewidth=2.0
)
ax.plot(
    filtered_df['Time'],
    filtered_df['NetRl'],
    label='LW$_{net}$ (Net)',
    color='green',
    linewidth=2.0
)
ax.plot(
    filtered_df['Time'],
    filtered_df['NetRs'],
    label='SW$_{net}$ (Net)',
    color='blue',
    linewidth=2.0
)

# Title and labels in bold
ax.set_title('Net Radiation vs. Time', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Radiation (W/m²)', fontsize=20, fontweight='bold')

# Thicken all spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(fontsize=14, title='Components', title_fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Y‐axis tick label size
ax.tick_params(axis='y', labelsize=18)

plt.tight_layout()

# Save figure
net_output = os.path.join(folder_path, 'radiation_diurnal.png')
plt.savefig(net_output, dpi=300, bbox_inches='tight')
print(f"Plot saved to: {net_output}")

plt.show()
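
The radiation plots above repeat the same spine, tick, date-axis, and grid settings for every figure. A minimal sketch of how that styling could be collected in one helper is shown below. The function name `style_time_axis` and its parameters are hypothetical and do not appear elsewhere in the notebook; the numeric values are copied from the cells above.

```python
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def style_time_axis(ax, major_hours=2, minor_minutes=30):
    """Apply the shared time-series styling to a Matplotlib Axes (hypothetical helper)."""
    # Thicken all spines
    for spine in ax.spines.values():
        spine.set_linewidth(1.5)
    # Tick sizes for major and minor ticks
    ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
    ax.tick_params(axis='both', which='minor', width=1.0, length=4)
    # HH:MM labels every `major_hours` hours, minor ticks every `minor_minutes` minutes
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
    ax.xaxis.set_major_locator(mdates.HourLocator(interval=major_hours))
    ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=minor_minutes))
    # Rotate x tick labels
    ax.tick_params(axis='x', labelrotation=45)
    # Dashed major grid, dotted minor grid
    ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
    ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)
    return ax
```

With such a helper, each plotting cell would only need its `ax.plot(...)` calls, the title and axis labels, and one call to `style_time_axis(ax)` before `plt.tight_layout()`.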




In [None]:
# ------------------------------------------------------------
# Diurnal Variation of Momentum Flux (tau_xz, tau_yz, tau_xy)
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Plot tau_xz
ax.plot(
    filtered_df['Time'],
    filtered_df['tau_xz_corr'],
    marker='o',
    linestyle='-',
    markersize=4,
    linewidth=1.5,
    alpha=0.8,
    color='green',
    label='${\\tau}_{xz}$'
)

# Plot tau_yz
ax.plot(
    filtered_df['Time'],
    filtered_df['tau_yz_corr'],
    marker='s',
    linestyle='--',
    markersize=4,
    linewidth=1.5,
    alpha=0.8,
    color='red',
    label='${\\tau}_{yz}$'
)
# Plot tau_xy
ax.plot(
    filtered_df['Time'],
    filtered_df['tau_xy_corr'],
    marker='x',
    linestyle='--',
    markersize=4,
    linewidth=1.5,
    alpha=0.8,
    color='blue',
    label='${\\tau_{xy}}$'
)
# Title and labels in bold
ax.set_title('Diurnal Variation of Momentum Flux', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Shear stress (Pa)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(loc='upper left', fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Show and save
plt.tight_layout()
tau_path = os.path.join(folder_path, 'diurnal_tau.png')
plt.savefig(tau_path, dpi=300, bbox_inches='tight')
print(f"Plot saved to: {tau_path}")
plt.show()


In [None]:

# Sign agreement between measured and bulk-estimated fluxes
# (True where both have the same sign; rows containing NaN compare as False)
filtered_df['SHF_sign_match_again'] = np.sign(filtered_df['SHF_bulk']) == np.sign(filtered_df['SHF'])
filtered_df['LHF_sign_match_again'] = np.sign(filtered_df['LHF_bulk']) == np.sign(filtered_df['LHF'])


# Percentage of rows where the signs agree, computed separately for SHF and LHF
shf_sign_agreement_percentage = filtered_df['SHF_sign_match_again'].mean() * 100
lhf_sign_agreement_percentage = filtered_df['LHF_sign_match_again'].mean() * 100

# Output the results
print(f"SHF Sign Agreement Percentage: {shf_sign_agreement_percentage:.2f}%")
print(f"LHF Sign Agreement Percentage: {lhf_sign_agreement_percentage:.2f}%")

# 1) Diurnal Variation of Sensible Heat Flux (SHF)
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Plot measured SHF
ax.plot(
    filtered_df['Time'],
    filtered_df['SHF'],
    marker='o',
    linestyle='-',
    markersize=4,
    linewidth=1.5,
    alpha=0.8,
    color='darkorange',
    label='SHF (measured)'
)

# Plot bulk‐estimate SHF
ax.plot(
    filtered_df['Time'],
    filtered_df['SHF_bulk'],
    marker='s',
    linestyle='--',
    markersize=4,
    linewidth=1.5,
    alpha=0.8,
    color='gray',
    label='SHF (bulk estimate)'
)

# Title and labels in bold
ax.set_title('Diurnal Variation of Sensible Heat Flux', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Sensible Heat Flux (W/m²)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(loc='upper left', fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Show and save
plt.tight_layout()
shf_path = os.path.join(folder_path, 'diurnal_shf.png')
plt.savefig(shf_path, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {shf_path}")
plt.show()


# ------------------------------------------------------------
# 2) Diurnal Variation of Latent Heat Flux (LHF)
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Plot measured LHF
ax.plot(
    filtered_df['Time'],
    filtered_df['LHF'],
    marker='o',
    linestyle='-',
    markersize=4,
    linewidth=1.5,
    alpha=0.8,
    color='lightskyblue',
    label='LHF (measured)'
)

# Plot bulk‐estimate LHF
ax.plot(
    filtered_df['Time'],
    filtered_df['LHF_bulk'],
    marker='s',
    linestyle='--',
    markersize=4,
    linewidth=1.5,
    alpha=0.8,
    color='gray',
    label='LHF (bulk estimate)'
)

# Title and labels in bold
ax.set_title('Diurnal Variation of Latent Heat Flux', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Latent Heat Flux (W/m²)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(loc='upper left', fontsize=14,frameon=True)
legend.get_frame().set_linewidth(1.5)

# Show and save
plt.tight_layout()
lhf_path = os.path.join(folder_path, 'diurnal_lhf.png')
plt.savefig(lhf_path, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {lhf_path}")
plt.show()


# ------------------------------------------------------------
# 3) Combined Diurnal Variation of SHF and LHF
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Plot LHF (measured)
ax.plot(
    filtered_df['Time'],
    filtered_df['LHF'],
    marker='o',
    linestyle='-',
    markersize=4,
    linewidth=1.5,
    alpha=0.8,
    color='lightskyblue',
    label='LHF (measured)'
)

# Plot SHF (measured)
ax.plot(
    filtered_df['Time'],
    filtered_df['SHF'],
    marker='s',
    linestyle='-',
    markersize=4,
    linewidth=1.5,
    alpha=0.8,
    color='darkorange',
    label='SHF (measured)'
)

# Title and labels
ax.set_title('Diurnal Variation of SHF and LHF', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Heat Flux (W/m²)', fontsize=20, fontweight='bold')

# Thicken spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: every 2 hours
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(loc='upper left', fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Show and save
plt.tight_layout()
combined_path = os.path.join(folder_path, 'diurnal_shf_lhf_combined.png')
plt.savefig(combined_path, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {combined_path}")
plt.show()
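
The sign-agreement percentage at the top of this cell counts a row as a mismatch whenever either flux is NaN, because `np.sign(NaN)` never compares equal to anything. A minimal sketch of a variant that excludes rows with missing values before taking the mean is given below. The function name `sign_agreement_percentage` and the demo values are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd


def sign_agreement_percentage(measured, estimated):
    """Percentage of valid (non-NaN) rows where both series share the same sign."""
    measured = pd.Series(measured, dtype=float).reset_index(drop=True)
    estimated = pd.Series(estimated, dtype=float).reset_index(drop=True)
    # Only compare rows where both values are present
    valid = measured.notna() & estimated.notna()
    if not valid.any():
        return float('nan')
    match = np.sign(measured[valid]) == np.sign(estimated[valid])
    return float(match.mean() * 100)


# Made-up demo values: three valid rows, two of which agree in sign
demo = pd.DataFrame({
    'SHF':      [10.0, -5.0, 20.0, np.nan],
    'SHF_bulk': [12.0,  3.0, 15.0,  8.0],
})
print(f"{sign_agreement_percentage(demo['SHF'], demo['SHF_bulk']):.2f}%")  # 66.67%
```

Whether missing rows should count as mismatches or be excluded is a design choice; excluding them keeps data gaps from lowering the reported agreement.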

In [None]:
fig, ax = plt.subplots(figsize=(10, 6))

# Plot CO₂ concentration with markers and a bold line
ax.plot(
    filtered_df['Time'],
    filtered_df['PPM_CO2'],
    marker='o',
    markersize=4,
    linestyle='-',
    linewidth=1.5,
    color='green',
    label='CO₂ Concentration'
)

# Title and labels in bold
ax.set_title('CO₂ Concentration Over Time', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('CO₂ Concentration (ppm)', fontsize=20, fontweight='bold')

# Thicken all spines (axis borders)
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters for major & minor ticks
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: show HH:MM every 2 hours, minor ticks every 30 min
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels for readability
plt.xticks(rotation=45)

# Grid styling: dashed major, dotted minor
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# (Optional) If you want a legend, uncomment:
# legend = ax.legend(loc='upper right', fontsize=12, frameon=True)
# legend.get_frame().set_linewidth(1.5)

# Tight layout
plt.tight_layout()

# Save figure
co2_output = os.path.join(folder_path, 'co2_concentration_time_series.png')
plt.savefig(co2_output, dpi=300,bbox_inches='tight')
print(f"Plot saved to: {co2_output}")

plt.show()

In [None]:
# Plot CO2 flux from all four sites
plt.figure(figsize=(12, 6))

# Plot merged_data_10min with circles ('o')
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['F_CO2'], s=50, alpha=0.7, label='Warmenhuizen (10 min)', marker='o')

# Plot vl with crosses ('x')
plt.scatter(vl['Timestamp'], vl['F_CO2_ppm_ms'], s=50, alpha=0.7, label='Veenkampen (30 min)', marker='x')

plt.scatter(loo['Timestamp'], loo['F_CO2_ppm_ms'], s=50, alpha=0.7, label='Loobos (30min)',marker='+')

plt.scatter(ams['Timestamp'], ams['F_CO2_ppm_ms'], s=50, alpha=0.7,label='Amsterdam (30min)', marker='d')

plt.title('CO2 Flux Comparison')
plt.xlabel('Time')
plt.ylabel('CO2 Flux (ppm m s$^{-1}$)')
plt.legend()
plt.grid(True)
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
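
The station fluxes plotted above are expressed in ppm m s⁻¹. One way to relate a molar flux in µmol m⁻² s⁻¹ to a kinematic mixing-ratio flux is to divide by the molar density of air, p/(RT), in mol m⁻³. The sketch below assumes ideal-gas behaviour and uses hypothetical function names. It is a separate formulation from the fixed division by 22.414 used in the station-loading cell and does not reproduce those numbers: at 273.15 K and 101325 Pa it is equivalent to multiplying by about 0.0224.

```python
R_GAS = 8.314462618  # universal gas constant, J mol^-1 K^-1


def molar_density_air(pressure_pa=101325.0, temperature_k=273.15):
    """Ideal-gas molar density of air in mol m^-3."""
    return pressure_pa / (R_GAS * temperature_k)


def co2_flux_umol_to_ppm_ms(flux_umol_m2_s, pressure_pa=101325.0, temperature_k=273.15):
    """Convert a CO2 flux from umol m^-2 s^-1 to ppm m s^-1.

    ppm is umol CO2 per mol of air, so dividing the molar flux by the
    molar density of air (mol m^-3) leaves (umol/mol) * (m/s).
    """
    return flux_umol_m2_s / molar_density_air(pressure_pa, temperature_k)


# Example: a flux of -10 umol m^-2 s^-1 at 293.15 K and standard pressure
print(co2_flux_umol_to_ppm_ms(-10.0, temperature_k=293.15))
```

Passing the measured pressure and air temperature instead of the defaults would make the conversion follow the actual conditions at each station.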

### LW_dn vs IWV (integrated water vapor)

In [None]:
print(merged_data_10min.columns)

In [None]:
# Plotting IR20Dn (LW_dn) vs IWV
plt.figure(figsize=(10, 6))
plt.scatter(merged_data_10min['IWV'], merged_data_10min['IR20Dn'], alpha=0.5)
plt.xlabel('Integrated Water Vapor (kg/m^2)')
plt.ylabel('LW_dn radiation (W/m^2)')
plt.title('LW_dn vs Integrated Water Vapor')
plt.grid(True)
# Save the figure in the data_dir directory
save_path = os.path.join(data_dir, 'LW_dn_vs_IWV.png')
plt.savefig(save_path)

plt.show()

### SW_dn vs LWP

In [None]:
# Plotting SR15D1Dn_Irr vs LWP
plt.figure(figsize=(10, 6))
plt.scatter(merged_data_10min['LWP'], merged_data_10min['SR15D1Dn_Irr'], alpha=0.5)
plt.xlabel('Liquid Water Path (g/m^2)')
plt.ylabel('SW_dn Irradiance (W/m^2)')
plt.title('SW_dn vs Liquid Water Path')
plt.grid(True)

# Save the figure in the data_dir directory
save_path = os.path.join(data_dir, 'SW_dn_vs_LWP.png')
plt.savefig(save_path)
plt.show()

### LWP vs time

In [None]:
# Plotting LWP vs Time
plt.figure(figsize=(12, 6))
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['LWP'], marker='o', linestyle='-')
plt.xlabel('Time')
plt.ylabel('Liquid Water Path (g/m^2)')
plt.title('Liquid Water Path vs Time')
plt.grid(True)
plt.xticks(rotation=45)
plt.tight_layout()

# Save the figure in the data_dir directory
save_path = os.path.join(data_dir, 'LWP_vs_Time_10min.png')
plt.savefig(save_path)
plt.show()


### T at 2.99m, T_sonic

In [None]:
# Plotting Temperature_K_2.99 and Average_Temperature_Corr vs Time (T_srf trace is disabled)
plt.figure(figsize=(12, 6))
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['Temperature_K_2.99'], marker='o', linestyle='-', color='b', label='Temperature_K_2.99')
#plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['T_srf'], marker='o', linestyle='-', color='g', label='T_srf')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['Average_Temperature_Corr'], marker='o', linestyle='-', color='r', label='Average_Temperature_Corr')
plt.xlabel('Time')
plt.ylabel('Temperature (K)')
plt.title('Temperature Variations Over Time')
plt.legend()
plt.grid(True)
plt.xticks(rotation=45)
plt.tight_layout()

save_path = os.path.join(data_dir, 'Temperature_Variations_10min.png')
plt.savefig(save_path)
plt.show()

### Wind at 2.99 m, Wind speed from sonic

In [None]:
# Plotting mast wind speed (WS_ms_D15008_Avg) and sonic Wind_Speed vs Time
plt.figure(figsize=(12, 6))
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['WS_ms_D15008_Avg'], marker='o', linestyle='-', color='b', label='WS_ms_D15008_Avg')
plt.plot(merged_data_10min['TIMESTAMP'], merged_data_10min['Wind_Speed'], marker='x', linestyle='-', color='r', label='Wind_Speed_sonic')
plt.xlabel('Time')
plt.ylabel('Wind Speed (m/s)')
plt.title('Wind Speed vs Time')
plt.grid(True)
plt.legend()
plt.xticks(rotation=45)
plt.tight_layout()

# Save the figure in the data_dir directory
save_path = os.path.join(data_dir, 'Wind_Speed_Comparison_10min.png')
plt.savefig(save_path)
plt.show()

print(f"Plot saved successfully as '{save_path}'")


In [None]:
# Calculate the wind speed difference (mast minus sonic)
merged_data_10min['Windspeed_Diff'] = merged_data_10min['WS_ms_D15014_Avg'] - merged_data_10min['Wind_Speed']

# Flag rows where the absolute difference exceeds 1 m/s
merged_data_10min['Flag_windspeed'] = (merged_data_10min['Windspeed_Diff'].abs() > 1).astype(int)
# Flag rows where rain was recorded
merged_data_10min['Flag_Rain_windspeed'] = (merged_data_10min['Rain'] > 0).astype(int)

# Display the first few rows of the DataFrame to verify
print(merged_data_10min[['TIMESTAMP', 'WS_ms_D15014_Avg', 'Wind_Speed', 'Windspeed_Diff', 'Rain', 'Flag_windspeed', 'Flag_Rain_windspeed']].head())

In [None]:
# Create the figure
plt.figure(figsize=(10, 6))

# Plot all wind speed points
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['WS_ms_D15014_Avg'],
            label='WS_4.47m (mast)', alpha=0.4, color='royalblue')
plt.scatter(merged_data_10min['TIMESTAMP'], merged_data_10min['Wind_Speed'],
            label='Wind_Speed (sonic)', alpha=0.4, color='seagreen')

# Highlight flagged outliers
flagged_data = merged_data_10min[merged_data_10min['Flag_windspeed'] == 1]
plt.scatter(flagged_data['TIMESTAMP'], flagged_data['WS_ms_D15014_Avg'],
            color='red', label=r'Flagged Outliers ($|\Delta w_s| > 1$ m/s)', s=50)
plt.scatter(flagged_data['TIMESTAMP'], flagged_data['Wind_Speed'],
            color='red', s=50)# label='Flagged Outliers (Wind_Speed)',

# Highlight rain-flagged points
flagged_rain_data = merged_data_10min[merged_data_10min['Flag_Rain_windspeed'] == 1]
plt.scatter(flagged_rain_data['TIMESTAMP'], flagged_rain_data['WS_ms_D15014_Avg'],
            color='black', marker='+', label='Flagged Rain', s=100)# (WS_ms_D15014_Avg)
plt.scatter(flagged_rain_data['TIMESTAMP'], flagged_rain_data['Wind_Speed'],
            color='black', marker='+', s=100)# label='Flagged Rain (Wind_Speed)',

# Format x-axis
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
plt.xticks(fontsize=18, rotation=45)
plt.yticks(fontsize=18)

# Axis labels and title
plt.xlabel('Time', fontsize=20)
plt.ylabel('Wind Speed (m/s)', fontsize=20)
plt.title('Wind Speed Data with Flagged Outliers', fontsize=20, weight='bold')

# Legend, grid, and layout
plt.legend(fontsize=14, loc='upper left')
plt.grid(True, linestyle='--', alpha=0.7)
plt.tight_layout()

# Save the plot
output_path = os.path.join(data_dir, 'wind_speed_scatter_flagged.png')
plt.savefig(output_path, dpi=300)

plt.show()

In [None]:
print(merged_data_10min.columns)

In [None]:


# 1. Keep rows where the temperature and rain flags are both 0
#    (the wind-speed flags are currently not applied)
filtered_data = merged_data_10min[
    (merged_data_10min['Flag'] == 0) &
    (merged_data_10min['Flag_Rain'] == 0)
    # & (merged_data_10min['Flag_windspeed'] == 0)
    # & (merged_data_10min['Flag_Rain_windspeed'] == 0)
]

# 2. Extract the necessary columns for plotting
time = pd.to_datetime(filtered_data['TIMESTAMP'])  # Convert timestamp to datetime
SHF = filtered_data['SHF']
LHF = filtered_data['LHF']

Net_Rad = filtered_data['Net_Radiation_10min']

G = filtered_data['G']
F_CO2 = filtered_data['F_CO2']

# 3. Create the scatter plots
fig, axs = plt.subplots(2, 2, figsize=(15, 10))

# Scatter plot for SHF and LHF
axs[0, 0].scatter(time, SHF, color='red', alpha=0.5,label='SHF')
axs[0, 0].scatter(time, LHF, color='blue', alpha=0.5,label='LHF')

axs[0, 0].set_title('Sensible and Latent Heat Flux')
axs[0, 0].set_xlabel('Time')
axs[0, 0].set_ylabel('Heat Flux (W/m²)')
axs[0, 0].grid(True)
axs[0, 0].legend()  # Add legend for SHF and LHF

# Scatter plot for net radiation
axs[0, 1].scatter(time, Net_Rad, color='blue', alpha=0.5)
axs[0, 1].set_title('Net Radiation')
axs[0, 1].set_xlabel('Time')
axs[0, 1].set_ylabel('Net_Rad (W/m²)')
axs[0, 1].grid(True)

# Scatter plot for G
axs[1, 0].scatter(time, G, color='green', alpha=0.5)
axs[1, 0].set_title('Ground Heat Flux (G)')
axs[1, 0].set_xlabel('Time')
axs[1, 0].set_ylabel('G (W/m²)')
axs[1, 0].grid(True)

# Scatter plot for F_CO2
axs[1, 1].scatter(time, F_CO2, color='purple', alpha=0.5)
axs[1, 1].set_title('CO2 Flux (F_CO2)')
axs[1, 1].set_xlabel('Time')
axs[1, 1].set_ylabel('F_CO2 (μmol m⁻² s⁻¹)')
axs[1, 1].grid(True)

# 4. Adjust layout for better spacing
plt.tight_layout()
plt.suptitle('Scatter Plots of SEB components Over Time (Filtered by Flags)', fontsize=16, y=1.02)

# Create the full save path
save_path = os.path.join(data_dir, 'SEB_scatter_plots_10min.png')

# Save the figure at 300 dpi
plt.savefig(save_path, dpi=300, bbox_inches='tight')

# Show the scatter plots
plt.show()
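
The flag filter at the top of this cell keeps rows where selected quality flags are all zero. A minimal sketch of a reusable version is shown below, so the set of flags can be changed without editing the boolean expression. The helper name `unflagged` and the demo values are hypothetical; the column names follow the notebook.

```python
import pandas as pd


def unflagged(df, flag_columns):
    """Return a copy of the rows of df where every listed flag column equals 0."""
    mask = (df[list(flag_columns)] == 0).all(axis=1)
    return df.loc[mask].copy()


# Made-up demo values: only the first row passes both flags
demo = pd.DataFrame({
    'SHF':       [50.0, 60.0, 70.0],
    'Flag':      [0, 1, 0],
    'Flag_Rain': [0, 0, 1],
})
print(unflagged(demo, ['Flag', 'Flag_Rain']))
```

Adding `'Flag_windspeed'` or `'Flag_Rain_windspeed'` to the list would apply the wind-speed flags as well. Rows with a missing flag value are dropped, since NaN does not compare equal to 0.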


In [None]:

# Define the full path to the CSV file
fluxes_30min_file_path = os.path.join(data_dir, 'flux_data_30min.csv')

# Load the CSV file into a DataFrame
flux_data_30min = pd.read_csv(fluxes_30min_file_path)

# Convert the 'TIMESTAMP' column to datetime format if it's not already
flux_data_30min['TIMESTAMP'] = pd.to_datetime(flux_data_30min['TIMESTAMP'])


# Print the first few rows to confirm the data is loaded correctly
print(flux_data_30min.head())

In [None]:

# Define each path by joining the base folder (data_vl) with the filename
loobos_path                    = os.path.join(data_vl, 'Loobos_Fluxes.csv')
amsterdam_path                 = os.path.join(data_vl, 'Amsterdam_Fluxes.csv')
veenkampen_path                = os.path.join(data_vl, 'Veenkampen_Fluxes.csv')

loobos_soil_path               = os.path.join(data_vl, 'Loobos_Soil.csv')
veenkampen_soil_path           = os.path.join(data_vl, 'Veenkampen_Soil.csv')

veenkampen_net_radiation_path  = os.path.join(data_vl, 'Veenkampen_Meteorology.csv')
loobos_net_radiation_path      = os.path.join(data_vl, 'Loobos_Meteorology.csv')
amsterdam_net_radiation_path   = os.path.join(data_vl, 'Amsterdam_Meteorology.csv')

# Read the CSV files
loobos_data = pd.read_csv(loobos_path)
amsterdam_data = pd.read_csv(amsterdam_path)
veenkampen_data = pd.read_csv(veenkampen_path)

# Read the CSV files for soil data
loobos_soil_data = pd.read_csv(loobos_soil_path)
veenkampen_soil_data = pd.read_csv(veenkampen_soil_path)

# Read the CSV files for net rad data
veenkampen_net_rad = pd.read_csv(veenkampen_net_radiation_path)
loobos_net_rad = pd.read_csv(loobos_net_radiation_path)
amsterdam_net_rad = pd.read_csv(amsterdam_net_radiation_path)

# Drop the first data row (the second line of each file) from each DataFrame
loobos_data = loobos_data.drop(index=0).reset_index(drop=True)
amsterdam_data = amsterdam_data.drop(index=0).reset_index(drop=True)
veenkampen_data = veenkampen_data.drop(index=0).reset_index(drop=True)
loobos_soil_data=loobos_soil_data.drop(index=0).reset_index(drop=True)
veenkampen_soil_data=veenkampen_soil_data.drop(index=0).reset_index(drop=True)
veenkampen_net_rad=veenkampen_net_rad.drop(index=0).reset_index(drop=True)
loobos_net_rad=loobos_net_rad.drop(index=0).reset_index(drop=True)
amsterdam_net_rad=amsterdam_net_rad.drop(index=0).reset_index(drop=True)

# Keep only the specified columns
columns_to_keep = ['Timestamp','H', 'LE','co2_flux']
columns_to_keep_rad = ['Timestamp','SW_IN_1_1_1','SW_OUT_1_1_1','LW_IN_1_1_1','LW_OUT_1_1_1']

columns_to_keep_soil = ['Timestamp','G_1_1_1','G_2_1_1','G_3_1_1','G_4_1_1']
columns_to_keep_rad_loobos = ['Timestamp','SW_IN_1_1_1','SW_OUT_1_1_1','LW_IN_1_1_1','LW_OUT_1_1_1']
columns_to_keep_rad_ams=['Timestamp','Rnet']

loobos_data = loobos_data[columns_to_keep]
amsterdam_data = amsterdam_data[columns_to_keep]
veenkampen_data = veenkampen_data[columns_to_keep]
loobos_soil_data=loobos_soil_data[columns_to_keep_soil]
veenkampen_soil_data=veenkampen_soil_data[columns_to_keep_soil]
veenkampen_net_rad=veenkampen_net_rad[columns_to_keep_rad]
loobos_net_rad=loobos_net_rad[columns_to_keep_rad_loobos]
amsterdam_net_rad=amsterdam_net_rad[columns_to_keep_rad_ams]

# Convert 'H', 'LE', and 'co2_flux' columns to numeric
loobos_data['H'] = pd.to_numeric(loobos_data['H'], errors='coerce')
loobos_data['LE'] = pd.to_numeric(loobos_data['LE'], errors='coerce')
loobos_data['co2_flux'] = pd.to_numeric(loobos_data['co2_flux'], errors='coerce')
# Convert µmol m-2 s-1 to ppm m s^-1 and store the result in a new column
# (the / 1e6 and * 1e6 cancel, so this equals co2_flux / 22.414)
loobos_data['F_CO2_ppm_ms'] = ((loobos_data['co2_flux'] / 1e6) / 22.414) * 1e6

amsterdam_data['H'] = pd.to_numeric(amsterdam_data['H'], errors='coerce')
amsterdam_data['LE'] = pd.to_numeric(amsterdam_data['LE'], errors='coerce')
amsterdam_data['co2_flux'] = pd.to_numeric(amsterdam_data['co2_flux'], errors='coerce')
amsterdam_data['F_CO2_ppm_ms'] = ((amsterdam_data['co2_flux'] / 1e6 )/ 22.414) * 1e6  

veenkampen_data['H'] = pd.to_numeric(veenkampen_data['H'], errors='coerce')
veenkampen_data['LE'] = pd.to_numeric(veenkampen_data['LE'], errors='coerce')
veenkampen_data['co2_flux'] = pd.to_numeric(veenkampen_data['co2_flux'], errors='coerce')
veenkampen_data['F_CO2_ppm_ms'] = ((veenkampen_data['co2_flux'] / 1e6 )/ 22.414) * 1e6  

for col in columns_to_keep_soil[1:]:  # Convert to numeric excluding 'Timestamp'
    loobos_soil_data[col] = pd.to_numeric(loobos_soil_data[col], errors='coerce')
    
for col in columns_to_keep_soil[1:]:  # Convert to numeric excluding 'Timestamp'
    veenkampen_soil_data[col] = pd.to_numeric(veenkampen_soil_data[col], errors='coerce')

#veenkampen_net_rad['RN_1_1_1']=pd.to_numeric(veenkampen_net_rad['RN_1_1_1'],errors='coerce')
for col in columns_to_keep_rad[1:]:  # Convert to numeric excluding 'Timestamp'
    veenkampen_net_rad[col] = pd.to_numeric(veenkampen_net_rad[col], errors='coerce')
    
amsterdam_net_rad['Rnet']=pd.to_numeric(amsterdam_net_rad['Rnet'],errors='coerce')

for col in columns_to_keep_rad_loobos[1:]:  # Convert to numeric excluding 'Timestamp'
    loobos_net_rad[col] = pd.to_numeric(loobos_net_rad[col], errors='coerce')

# Convert the 'Timestamp' column to datetime format (if it's not already)
loobos_data['Timestamp'] = pd.to_datetime(loobos_data['Timestamp'])
amsterdam_data['Timestamp'] = pd.to_datetime(amsterdam_data['Timestamp'])
veenkampen_data['Timestamp'] = pd.to_datetime(veenkampen_data['Timestamp'])
loobos_soil_data['Timestamp'] = pd.to_datetime(loobos_soil_data['Timestamp'])
veenkampen_soil_data['Timestamp'] = pd.to_datetime(veenkampen_soil_data['Timestamp'])
veenkampen_net_rad['Timestamp'] = pd.to_datetime(veenkampen_net_rad['Timestamp'])
loobos_net_rad['Timestamp'] = pd.to_datetime(loobos_net_rad['Timestamp'])
amsterdam_net_rad['Timestamp'] = pd.to_datetime(amsterdam_net_rad['Timestamp'])

# Filter for rows with the specified date
date_filter = date_str

loobos_data = loobos_data[loobos_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]
amsterdam_data = amsterdam_data[amsterdam_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]
veenkampen_data = veenkampen_data[veenkampen_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]
loobos_soil_data = loobos_soil_data[loobos_soil_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]
veenkampen_soil_data = veenkampen_soil_data[veenkampen_soil_data['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]
veenkampen_net_rad = veenkampen_net_rad[veenkampen_net_rad['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]
loobos_net_rad = loobos_net_rad[loobos_net_rad['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]
amsterdam_net_rad=amsterdam_net_rad[amsterdam_net_rad['Timestamp'].dt.date == pd.to_datetime(date_filter).date()]

# Drop NaN values
loobos_data = loobos_data.dropna()
amsterdam_data = amsterdam_data.dropna()
veenkampen_data = veenkampen_data.dropna()
loobos_soil_data=loobos_soil_data.dropna()
veenkampen_soil_data=veenkampen_soil_data.dropna()
veenkampen_net_rad=veenkampen_net_rad.dropna()
loobos_net_rad=loobos_net_rad.dropna()
amsterdam_net_rad=amsterdam_net_rad.dropna()

# Reset index
loobos_data = loobos_data.reset_index(drop=True)
amsterdam_data = amsterdam_data.reset_index(drop=True)
veenkampen_data = veenkampen_data.reset_index(drop=True)
loobos_soil_data=loobos_soil_data.reset_index(drop=True)
veenkampen_soil_data=veenkampen_soil_data.reset_index(drop=True)
veenkampen_net_rad=veenkampen_net_rad.reset_index(drop=True)
loobos_net_rad=loobos_net_rad.reset_index(drop=True)
amsterdam_net_rad=amsterdam_net_rad.reset_index(drop=True)

# Net radiation computed as outgoing minus incoming: SW_OUT - SW_IN + LW_OUT - LW_IN
veenkampen_net_rad['Net_Rad'] = (veenkampen_net_rad['SW_OUT_1_1_1'] - veenkampen_net_rad['SW_IN_1_1_1']
                                 + veenkampen_net_rad['LW_OUT_1_1_1'] - veenkampen_net_rad['LW_IN_1_1_1'])
loobos_net_rad['Net_Rad'] = (loobos_net_rad['SW_OUT_1_1_1'] - loobos_net_rad['SW_IN_1_1_1']
                             + loobos_net_rad['LW_OUT_1_1_1'] - loobos_net_rad['LW_IN_1_1_1'])

# Display each flux DataFrame
print("Loobos Fluxes Data:")
print(loobos_data)

print("\nAmsterdam Fluxes Data:")
print(amsterdam_data)

print("\nVeenkampen Fluxes Data:")
print(veenkampen_data)

# Floor 'Timestamp' to 10-minute bins and average within each bin (Loobos soil)
loobos_soil_data['Timestamp'] = loobos_soil_data['Timestamp'].dt.floor('10min')
loobos_soil_data = loobos_soil_data.groupby('Timestamp').mean().reset_index()

# Same 10-minute binning for Veenkampen soil
veenkampen_soil_data['Timestamp'] = veenkampen_soil_data['Timestamp'].dt.floor('10min')
veenkampen_soil_data = veenkampen_soil_data.groupby('Timestamp').mean().reset_index()

# Same 10-minute binning for Veenkampen radiation
veenkampen_net_rad['Timestamp'] = veenkampen_net_rad['Timestamp'].dt.floor('10min')
veenkampen_net_rad = veenkampen_net_rad.groupby('Timestamp').mean().reset_index()

# Same 10-minute binning for Loobos radiation
loobos_net_rad['Timestamp'] = loobos_net_rad['Timestamp'].dt.floor('10min')
loobos_net_rad = loobos_net_rad.groupby('Timestamp').mean().reset_index()

# Same 10-minute binning for Amsterdam radiation
amsterdam_net_rad['Timestamp'] = amsterdam_net_rad['Timestamp'].dt.floor('10min')
amsterdam_net_rad = amsterdam_net_rad.groupby('Timestamp').mean().reset_index()

print("\nLoobos Soil Data:")
print(loobos_soil_data)

print("\nVeenkampen Soil Data:")
print(veenkampen_soil_data)

print("\nVeenkampen Net Rad Data:")
print(veenkampen_net_rad)

print("\nLoobos Net Rad Data:")
print(loobos_net_rad)

print("\nAmsterdam Net Rad Data:")
print(amsterdam_net_rad)
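
The station-loading cell above applies the same sequence to every file: read the CSV, drop the first data row, keep a few columns, coerce them to numeric, parse the timestamps, keep one day, and drop NaNs. A minimal sketch of that sequence as a single function is shown below. The name `load_station_csv` and the in-memory demo file are hypothetical; the demo assumes the dropped row holds units, which is a guess about the real files.

```python
import io
import pandas as pd


def load_station_csv(source, value_columns, day, time_column='Timestamp'):
    """Load one station file and return the cleaned rows for a single day."""
    df = pd.read_csv(source)
    # Drop the first data row (assumed here to be a units row)
    df = df.drop(index=0).reset_index(drop=True)
    # Keep only the timestamp and the requested value columns
    df = df[[time_column, *value_columns]].copy()
    # Coerce values to numeric; unparseable entries become NaN
    for col in value_columns:
        df[col] = pd.to_numeric(df[col], errors='coerce')
    df[time_column] = pd.to_datetime(df[time_column])
    # Keep the requested day, then drop incomplete rows
    df = df[df[time_column].dt.date == pd.to_datetime(day).date()]
    return df.dropna().reset_index(drop=True)


# Made-up demo file with a units row, one bad value, and a second day
demo_csv = io.StringIO(
    "Timestamp,H,LE,co2_flux\n"
    "-,W m-2,W m-2,umol m-2 s-1\n"
    "2024-05-01 00:00,10.5,20.1,-1.2\n"
    "2024-05-01 00:30,bad,21.0,-1.0\n"
    "2024-05-02 00:00,11.0,22.0,-0.8\n"
)
print(load_station_csv(demo_csv, ['H', 'LE', 'co2_flux'], '2024-05-01'))
```

In the notebook, a call such as `load_station_csv(loobos_path, ['H', 'LE', 'co2_flux'], date_str)` would stand in for the per-station blocks; the 10-minute flooring and averaging would still follow as a separate step.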

In [None]:
# 1) SHF vs Time for Multiple Stations + Measured SHF
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Scatter for Loobos, Amsterdam, Veenkampen
ax.scatter(
    loobos_data['Timestamp'],
    loobos_data['H'],
    label='Loobos H',
    color='green',
    s=40,
    alpha=0.7,
    marker='o'
)
ax.scatter(
    amsterdam_data['Timestamp'],
    amsterdam_data['H'],
    label='Amsterdam H',
    color='orange',
    s=40,
    alpha=0.7,
    marker='o'
)
ax.scatter(
    veenkampen_data['Timestamp'],
    veenkampen_data['H'],
    label='Veenkampen H',
    color='blue',
    s=40,
    alpha=0.7,
    marker='o'
)

# Scatter for Measured SHF
ax.scatter(
    time,
    SHF,
    label='SHF (Measured)',
    color='red',
    s=40,
    alpha=0.7,
    marker='x'
)

# Title and labels in bold
ax.set_title('SHF vs. Time', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('SHF [W/m²]', fontsize=20, fontweight='bold')

# Thicken all spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(loc='upper left', fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Tight layout and save
plt.tight_layout()
shf_stations_path = os.path.join(data_dir, 'SHF_vs_Stations.png')
plt.savefig(shf_stations_path, dpi=300, bbox_inches='tight')
print(f"Plot saved to: {shf_stations_path}")
plt.show()


# ------------------------------------------------------------
# 2) LHF vs Time for Multiple Stations + Measured LHF
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Scatter for Loobos, Amsterdam, Veenkampen
ax.scatter(
    loobos_data['Timestamp'],
    loobos_data['LE'],
    label='Loobos LE',
    color='green',
    s=40,
    alpha=0.7,
    marker='o'
)
ax.scatter(
    amsterdam_data['Timestamp'],
    amsterdam_data['LE'],
    label='Amsterdam LE',
    color='orange',
    s=40,
    alpha=0.7,
    marker='o'
)
ax.scatter(
    veenkampen_data['Timestamp'],
    veenkampen_data['LE'],
    label='Veenkampen LE',
    color='blue',
    s=40,
    alpha=0.7,
    marker='o'
)

# Scatter for Measured LHF
ax.scatter(
    time,
    LHF,
    label='LHF (Measured)',
    color='red',
    s=40,
    alpha=0.7,
    marker='x'
)

# Title and labels in bold
ax.set_title('LHF vs. Time', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('LHF [W/m²]', fontsize=20, fontweight='bold')

# Thicken all spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(loc='upper left', fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Tight layout and save
plt.tight_layout()
lhf_stations_path = os.path.join(data_dir, 'LHF_vs_Stations.png')
plt.savefig(lhf_stations_path, dpi=300, bbox_inches='tight')
print(f"Plot saved to: {lhf_stations_path}")
plt.show()


# ------------------------------------------------------------
# 3) CO₂ Flux vs Time for Multiple Stations + Measured F_CO2
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Scatter for Loobos, Amsterdam, Veenkampen
ax.scatter(
    loobos_data['Timestamp'],
    loobos_data['F_CO2_ppm_ms'],
    label='Loobos F_CO₂',
    color='green',
    s=40,
    alpha=0.7,
    marker='o'
)
ax.scatter(
    amsterdam_data['Timestamp'],
    amsterdam_data['F_CO2_ppm_ms'],
    label='Amsterdam F_CO₂',
    color='orange',
    s=40,
    alpha=0.7,
    marker='o'
)
ax.scatter(
    veenkampen_data['Timestamp'],
    veenkampen_data['F_CO2_ppm_ms'],
    label='Veenkampen F_CO₂',
    color='blue',
    s=40,
    alpha=0.7,
    marker='o'
)

# Scatter for Measured F_CO2
ax.scatter(
    time,
    F_CO2,
    label='F_CO₂ (Measured)',
    color='red',
    s=40,
    alpha=0.7,
    marker='x'
)

# Title and labels in bold
ax.set_title('CO₂ Flux vs. Time', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('CO₂ Flux [ppm · m s⁻¹]', fontsize=20, fontweight='bold')

# Thicken all spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Limit the y‐axis to ±1 ppm · m s⁻¹
plt.ylim(-1, 1)

# Grid styling
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(loc='upper left', fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Tight layout and save
plt.tight_layout()
co2_flux_path = os.path.join(data_dir, 'FCO2_vs_Stations.png')
plt.savefig(co2_flux_path, dpi=300, bbox_inches='tight')
print(f"Plot saved to: {co2_flux_path}")
plt.show()

In [None]:


plt.figure(figsize=(14, 6))


# Loobos soil heat fluxes in green, one marker shape per column
plt.scatter(loobos_soil_data['Timestamp'], loobos_soil_data['G_1_1_1'],
            label='Loobos G_1_1_1', color='green', alpha=0.5, s=30, marker='o')  # Circle
plt.scatter(loobos_soil_data['Timestamp'], loobos_soil_data['G_2_1_1'],
            label='Loobos G_2_1_1', color='green', alpha=0.5, s=30, marker='D')  # Diamond
plt.scatter(loobos_soil_data['Timestamp'], loobos_soil_data['G_3_1_1'],
            label='Loobos G_3_1_1', color='green', alpha=0.5, s=30, marker='v')  # Triangle down
plt.scatter(loobos_soil_data['Timestamp'], loobos_soil_data['G_4_1_1'],
            label='Loobos G_4_1_1', color='green', alpha=0.5, s=30, marker='^')  # Triangle up

# Veenkampen soil heat fluxes in blue, same marker shapes per column
plt.scatter(veenkampen_soil_data['Timestamp'], veenkampen_soil_data['G_1_1_1'],
            label='Veenkampen G_1_1_1', color='blue', alpha=0.5, s=30, marker='o')  # Circle
plt.scatter(veenkampen_soil_data['Timestamp'], veenkampen_soil_data['G_2_1_1'],
            label='Veenkampen G_2_1_1', color='blue', alpha=0.5, s=30, marker='D')  # Diamond
plt.scatter(veenkampen_soil_data['Timestamp'], veenkampen_soil_data['G_3_1_1'],
            label='Veenkampen G_3_1_1', color='blue', alpha=0.5, s=30, marker='v')  # Triangle down
plt.scatter(veenkampen_soil_data['Timestamp'], veenkampen_soil_data['G_4_1_1'],
            label='Veenkampen G_4_1_1', color='blue', alpha=0.5, s=30, marker='^')  # Triangle up

# Measured G values (uses the `time` and `G` variables from earlier cells)
plt.scatter(time, G, label='G (Measured)', color='red', s=30, alpha=0.5, marker='x')  # Cross

# Customize the plot
plt.title('Soil Fluxes vs. Time', fontsize=20)
plt.xlabel('Timestamp', fontsize=16)
plt.ylabel('Soil Flux [W/m²]', fontsize=16)
plt.xticks(rotation=45, fontsize=12)
plt.yticks(fontsize=12)
plt.grid(True, linestyle='--', alpha=0.5)  # Change grid style
plt.legend(fontsize=10)
plt.tight_layout()

# Show plot
plt.show()


In [None]:
# Net Radiation vs. Time
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Scatter for Veenkampen (negative Net_Rad)
ax.scatter(
    veenkampen_net_rad['Timestamp'],
    -veenkampen_net_rad['Net_Rad'],
    label='Veenkampen Net Rad',
    color='blue',
    alpha=0.5,
    s=40,
    marker='o'
)

# Scatter for Loobos (negative Net_Rad)
ax.scatter(
    loobos_net_rad['Timestamp'],
    -loobos_net_rad['Net_Rad'],
    label='Loobos Net Rad',
    color='green',
    alpha=0.5,
    s=40,
    marker='o'
)

# Scatter for Amsterdam (Rnet, assume already signed correctly)
ax.scatter(
    amsterdam_net_rad['Timestamp'],
    amsterdam_net_rad['Rnet'],
    label='Amsterdam Net Rad',
    color='orange',
    alpha=0.5,
    s=40,
    marker='o'
)

# Scatter for measured Net_Rad (negative)
ax.scatter(
    time,
    -Net_Rad,
    label='Net_Rad (Measured)',
    color='red',
    s=40,
    alpha=0.5,
    marker='x'
)

# Title and labels in bold
ax.set_title('Net Radiation vs. Time', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('Net Radiation [W/m²]', fontsize=20, fontweight='bold')

# Thicken all spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Y‐axis tick font size
ax.tick_params(axis='y', labelsize=18)

# Grid styling: dashed major, dotted minor
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(loc='upper left', fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Tight layout and save
plt.tight_layout()
output_path = os.path.join(data_dir, 'net_radiation_vs_time.png')
plt.savefig(output_path, dpi=300, bbox_inches='tight')
print(f"Plot saved to: {output_path}")
plt.show()


In [None]:
# Floor to 30-minute intervals and average the Veenkampen radiation data
veenkampen_net_rad['Timestamp'] = veenkampen_net_rad['Timestamp'].dt.floor('30min')
veenkampen_net_rad = veenkampen_net_rad.groupby('Timestamp').mean().reset_index()

# Merge with the Veenkampen flux data on 'Timestamp'
merged_data = pd.merge(veenkampen_data, veenkampen_net_rad, on='Timestamp', how='inner')

# G as the energy-balance residual of H, LE and Net_Rad
merged_data['G'] = -(merged_data['H'] + merged_data['LE'] + merged_data['Net_Rad'])  # RN_1_1_1
print(merged_data)

In [None]:
# Floor to 30-minute intervals and average the Loobos radiation data
loobos_net_rad['Timestamp'] = loobos_net_rad['Timestamp'].dt.floor('30min')
loobos_net_rad = loobos_net_rad.groupby('Timestamp').mean().reset_index()

# Merge with the Loobos flux data on 'Timestamp'
merged_data_loobos = pd.merge(loobos_data, loobos_net_rad, on='Timestamp', how='inner')

# G as the energy-balance residual of H, LE and Net_Rad
merged_data_loobos['G'] = -(merged_data_loobos['H'] + merged_data_loobos['LE'] + merged_data_loobos['Net_Rad'])
print(merged_data_loobos)

In [None]:
# Floor to 30-minute intervals and average the Amsterdam radiation data
amsterdam_net_rad['Timestamp'] = amsterdam_net_rad['Timestamp'].dt.floor('30min')
amsterdam_net_rad = amsterdam_net_rad.groupby('Timestamp').mean().reset_index()

# Merge with the Amsterdam flux data on 'Timestamp'
merged_data_amsterdam = pd.merge(amsterdam_data, amsterdam_net_rad, on='Timestamp', how='inner')

# G as the energy-balance residual; Rnet enters with the opposite sign to Net_Rad at the other stations
merged_data_amsterdam['G'] = -(merged_data_amsterdam['H'] + merged_data_amsterdam['LE'] - merged_data_amsterdam['Rnet'])
print(merged_data_amsterdam)
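
The three residual calculations above differ only in the sign applied to the net-radiation column. The sketch below shows that pattern in one place. It assumes, based on how the columns are negated when plotted, that `Net_Rad` (Veenkampen, Loobos) is stored with the opposite sign to `Rnet` (Amsterdam); this is an assumption, not something confirmed by the data files. The function name `residual_ground_heat_flux`, its `net_rad_positive_down` flag, and the numbers are hypothetical.

```python
def residual_ground_heat_flux(h, le, net_rad, net_rad_positive_down=True):
    """Return G = Rn - H - LE, with Rn taken positive toward the surface.

    Works on plain floats and, because it only uses arithmetic operators,
    on NumPy arrays or pandas Series as well.
    """
    # Flip the sign when the column is stored with the opposite convention
    rn = net_rad if net_rad_positive_down else -net_rad
    return rn - h - le


# Made-up values in W/m², for illustration only
g_amsterdam_style = residual_ground_heat_flux(h=100.0, le=150.0, net_rad=400.0)
g_veenkampen_style = residual_ground_heat_flux(
    h=100.0, le=150.0, net_rad=-400.0, net_rad_positive_down=False
)
print(g_amsterdam_style, g_veenkampen_style)
```

With the flag set per station, both calls reproduce the expressions used in the cells above: `-(H + LE - Rnet)` for Amsterdam and `-(H + LE + Net_Rad)` for Veenkampen and Loobos.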

In [None]:
# Ground Heat Flux (residual) vs. Time
# ------------------------------------------------------------
fig, ax = plt.subplots(figsize=(10, 6))

# Scatter for Veenkampen G (as residual)
ax.scatter(
    merged_data['Timestamp'],
    merged_data['G'],
    label='Veenkampen G (residual)',
    color='blue',
    s=40,
    alpha=0.7,
    marker='o'
)

# Scatter for Loobos G (as residual)
ax.scatter(
    merged_data_loobos['Timestamp'],
    merged_data_loobos['G'],
    label='Loobos G (residual)',
    color='green',
    s=40,
    alpha=0.7,
    marker='o'
)

# Scatter for Amsterdam G (as residual)
ax.scatter(
    merged_data_amsterdam['Timestamp'],
    merged_data_amsterdam['G'],
    label='Amsterdam G (residual)',
    color='orange',
    s=40,
    alpha=0.7,
    marker='o'
)

# Scatter for measured G values
ax.scatter(
    time,
    G,
    label='G (Measured)',
    color='red',
    s=40,
    alpha=0.7,
    marker='x'
)

# Title and labels in bold
ax.set_title('Ground Heat Flux vs. Time', fontsize=20, fontweight='bold')
ax.set_xlabel('Time', fontsize=20, fontweight='bold')
ax.set_ylabel('G [W/m²]', fontsize=20, fontweight='bold')

# Thicken all spines
for spine in ax.spines.values():
    spine.set_linewidth(1.5)

# Tick parameters
ax.tick_params(axis='both', which='major', labelsize=18, width=1.5, length=6)
ax.tick_params(axis='both', which='minor', width=1.0, length=4)

# X‐axis formatting: HH:MM every 2 hours, minor ticks every 30 min
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.xaxis.set_major_locator(mdates.HourLocator(interval=2))
ax.xaxis.set_minor_locator(mdates.MinuteLocator(interval=30))

# Rotate x‐tick labels
plt.xticks(rotation=45)

# Y‐axis tick font size
ax.tick_params(axis='y', labelsize=18)

# Grid styling: dashed major, dotted minor
ax.grid(True, which='major', linestyle='--', linewidth=0.8, alpha=0.7)
ax.grid(True, which='minor', linestyle=':', linewidth=0.5, alpha=0.5)

# Legend styling
legend = ax.legend(loc='upper left', fontsize=14, frameon=True)
legend.get_frame().set_linewidth(1.5)

# Tight layout and save
plt.tight_layout()
output_path = os.path.join(data_dir, 'ground_heat_flux_vs_time.png')
plt.savefig(output_path, dpi=300, bbox_inches='tight')
print(f"Plot saved to: {output_path}")
plt.show()


