This notebook defines a Python function, `vertical_transect`, that processes meteorological data, generates visualizations, and saves them as PNG images. Data loading, processing, plotting, and file saving are all combined in this one function. A summary of what the code does:

---

The `vertical_transect` function processes data from a KAZR (Ka-band ARM Zenith Radar) product and from interpolated sonde measurements. It generates one vertical transect figure per time window and saves each figure as a PNG image. The variables used are reflectivity, mean Doppler velocity, spectral width, liquid water path, and precipitation rate from the KAZR dataset, and temperature from the interpolated sonde dataset.

The function begins by building the paths to the data directories and opening all NetCDF files found there. It then works through a list of time windows, each given as an `(id, start, end)` tuple. The `CAO_date` argument (CAO most likely stands for cold-air outbreak here) is used only to name the output directory. For each time window in the list:

1. Subsets of data are extracted from the KAZR and sonde datasets based on the specified date range.
2. Various meteorological variables such as reflectivity, Doppler velocity, spectral width, liquid water path, precipitation rate, and temperature are extracted from the datasets.
3. Gaussian smoothing is applied to the reflectivity, Doppler velocity, and spectral width data. As written, the code uses `sigma=0`, which leaves the values unchanged.
4. Processed data variables are stored in separate lists for later use.
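
The effect of the smoothing step depends entirely on `sigma`. The sketch below uses a synthetic array (an assumption, standing in for a time-height reflectivity field) to illustrate that `scipy.ndimage.gaussian_filter` returns the values unchanged for `sigma=0` and reduces variability for a larger value. The choice `sigma=2` is purely illustrative and does not come from the notebook.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# Synthetic (time, height) field standing in for radar reflectivity
rng = np.random.default_rng(0)
field = rng.normal(loc=-10.0, scale=5.0, size=(60, 40))

unsmoothed = gaussian_filter(field, sigma=0)  # no filtering is applied
smoothed = gaussian_filter(field, sigma=2)    # illustrative value only

print(np.array_equal(unsmoothed, field))  # identical values for sigma=0
print(field.std(), smoothed.std())        # spread before and after smoothing
```

Note that this filter spreads missing values (NaN) into neighbouring points, which may matter for radar fields with clear-air gaps once a non-zero `sigma` is used.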

After processing the data for all date ranges, the function proceeds to generate visualizations for each date range:

1. A figure with four vertically stacked subplots sharing one time axis is created, and panel labels (a) to (d) are added.
2. Each variable gets its own color map and fixed color range (for example, `nipy_spectral` from -30 to 25 dBZ for reflectivity).
3. The panels show reflectivity with dashed temperature contours, Doppler velocity, spectral width, and liquid water path together with precipitation rate on a twin y-axis.
4. Custom datetime formatting is applied to the x-axis labels, and five evenly spaced tick positions are set, including the first and last time.
5. Each figure is saved as a PNG image under `Open-cells-Figures-Final/t_series/<CAO_date>/`, unless a file with the same name already exists there.
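
The file naming and skip-if-present logic from the last step can be sketched in isolation. This is a minimal sketch, not the notebook's code: the helper names `transect_file_name` and `should_write` are hypothetical, and a temporary directory stands in for the real output folder.

```python
import os
import tempfile
from datetime import datetime


def transect_file_name(start: str, end: str, cell_id: int) -> str:
    """Hypothetical helper: build '<start>_<end>_ID<n>.png' from ISO strings
    with nanosecond precision, as produced by str() on numpy datetime64[ns]."""
    def compact(stamp: str) -> str:
        # Drop the last three digits so %f (microseconds) can parse the fraction
        parsed = datetime.strptime(stamp[:-3], "%Y-%m-%dT%H:%M:%S.%f")
        return parsed.strftime("%Y%m%d-%H:%M:%S")
    return f"{compact(start)}_{compact(end)}_ID{cell_id}.png"


def should_write(directory: str, file_name: str) -> bool:
    """Hypothetical helper: create the directory if needed and report
    whether the target file is still missing."""
    os.makedirs(directory, exist_ok=True)
    return not os.path.exists(os.path.join(directory, file_name))


name = transect_file_name(
    "2020-03-13T08:12:28.000000000", "2020-03-13T08:34:48.000000000", 1
)
print(name)

with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "t_series", "Mar13-Mar14")
    print(should_write(out_dir, name))  # nothing has been saved yet
```

The colons in the file name follow the notebook's output. They are not allowed in file names on Windows, so a different time separator would be needed there.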

Finally, the function prints a success message indicating the completion of data processing and visualization generation.

---


In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr
import cartopy.crs as ccrs
import cartopy.feature as cf
import matplotlib.dates as mdates  # Importing module for date formatting
import matplotlib.ticker as ticker  # Importing ticker for customizing colorbar ticks
from scipy.ndimage import gaussian_filter
import glob
from datetime import datetime
import os

In [2]:
def vertical_transect(formatted_date:list, CAO_date:str):
    
    # path to data
    path1 = "/glade/work/noteng/masters-research/data/anx-data/otengn1/240475/"
    # kazr data
    data1 = "KAZR/anxarsclkazr1kolliasM1/"
    # get upper temperature from this
    data2 = "interpsonde/"
    # reading all data
    data = glob.glob(f"{path1}{data1}*.nc")
    data_sonde = glob.glob(f"{path1}{data2}*.nc")

    ds = xr.open_mfdataset(data)
    ds1 = xr.open_mfdataset(data_sonde)

    ds_subsets = []; ds1_subsets = []

    reflectivitys = []; doppler_velocitys = []; spectral_widths = []; lwps = []; precips = []; temps = []

    # Containers for the Gaussian-smoothed fields
    smoothed_reflectivitys = []; smoothed_doppler_velocitys = []; smoothed_spectral_widths = []

    # Select each time window and pull out the variables as DataArrays
    for entry in formatted_date:
        name, start_time, end_time = entry
        # kazr
        ds_subset = ds.sel(time=slice(start_time, end_time))
        # interpolated sonde (the temperature variable comes from this)
        ds1_subset = ds1.sel(time=slice(start_time, end_time))

        # getting variables from subset
        reflectivity = ds_subset['reflectivity_best_estimate']
        doppler_velocity = ds_subset['mean_doppler_velocity']
        spectral_width = ds_subset['spectral_width']
        lwp = ds_subset['mwr_lwp']
        precip = ds_subset['precip_mean']
        temp = ds1_subset['temp']

        # Apply Gaussian smoothing (sigma=0 returns the data unchanged; raise sigma to actually smooth)
        smoothed_reflectivity = gaussian_filter(reflectivity, sigma=0)
        smoothed_doppler_velocity = gaussian_filter(doppler_velocity, sigma=0)
        smoothed_spectral_width = gaussian_filter(spectral_width, sigma=0)

        ##################
        # append new results(variables)
        ds_subsets.append(ds_subset)
        ds1_subsets.append(ds1_subset)

        reflectivitys.append(reflectivity)
        doppler_velocitys.append(doppler_velocity)
        spectral_widths.append(spectral_width)
        lwps.append(lwp)
        precips.append(precip)
        temps.append(temp)

        smoothed_reflectivitys.append(smoothed_reflectivity)
        smoothed_doppler_velocitys.append(smoothed_doppler_velocity)
        smoothed_spectral_widths.append(smoothed_spectral_width)
        
###################################### plotting code
    for i in range(0, len(formatted_date)):
    # for i in range(0, 2):

        # fig, ax = plt.subplots(figsize=(10, 15), nrows=4, ncols=1, sharex=True)
        fig, ax = plt.subplots(figsize=(8, 15), nrows=4, ncols=1, sharex=True)
        ax = ax.flatten()
        
        
        # text = "Example Text"
        # ax[j].text(0.5, 2.05, j+1, fontsize=38, transform=ax[i].transAxes, ha='center', fontweight='bold')
            
        


        # adding labels to each subplot
        for j, v in enumerate(ax):
            labels = ["(a)", "(b)", "(c)", "(d)"]
            # Add text to the subplots
            ax[j].text(0.004, 0.97, labels[j], transform=ax[j].transAxes,
            fontsize=18, va='top', ha='left', fontweight='bold') 
            

        ### reflectivity
         # if i == 1:
       
        pcm = ax[0].pcolormesh(ds_subsets[i]['time'], reflectivitys[i]['height']/1000.0, smoothed_reflectivitys[i].T, 
        cmap='nipy_spectral', vmin=-30, vmax=25, zorder=5)
        

        # temperature
        cntr = ax[0].contour(temps[i]['time'][::2], temps[i]['height'][::2], np.transpose(temps[i].values)[::2, ::2], 
        np.round(np.arange(-100,320,10)), colors='k', linewidths=0.8, linestyles ="dashed", zorder=6)
        plt.clabel(cntr,inline=1, inline_spacing=8, fontsize=20, fmt='%i',manual=False)
        
        # add id labels
        ax[0].text(0.5, 1.1, f"ID-{i+1}", fontsize=35, transform=ax[0].transAxes, ha='center')
            
        

        ax[0].set_ylim(0, 7)
        # Set y-axis ticks
        ax[0].set_yticks(np.arange(1, 8))
        ax[0].tick_params(axis='y', labelsize=15)
        ax[0].set_ylabel('Height [km]', color='black', fontsize=15)  # Label for the left y-axis
        cbar = fig.colorbar(pcm, ax=ax[0])
        cbar.set_label('Reflectivity [dBZ]', fontsize=15)
        # Set colorbar ticks at an interval of 10 (adjust as needed)
        tick_locator = ticker.MultipleLocator(base=10)
        cbar.locator = tick_locator
        cbar.update_ticks()
        

        ### doppler velocity
        # if i == 1:
        # pcm = ax[1].pcolormesh(ds_subsets[i]['time'], reflectivitys[i]['height']/1000.0, smoothed_doppler_velocitys[i].T,
        # cmap='bwr', vmin=-3, vmax=1, zorder=5)
        pcm = ax[1].pcolormesh(ds_subsets[i]['time'], reflectivitys[i]['height']/1000.0, smoothed_doppler_velocitys[i].T,
        cmap='bwr', vmin=-3, vmax=1, zorder=5)
        ax[1].set_ylim(0, 7)
        # Set y-axis ticks
        ax[1].set_yticks(np.arange(1, 8))
        ax[1].tick_params(axis='y', labelsize=15)
        ax[1].set_ylabel('Height [km]', color='black', fontsize=15)  # Label for the left y-axis
        cbar = fig.colorbar(pcm, ax=ax[1], extend='both')
        cbar.set_label("Doppler\nVelocity [m s$^{-1}$]", fontsize=15)

        ### spectral width
        # if i == 2:
        pcm = ax[2].pcolormesh(ds_subsets[i]['time'], reflectivitys[i]['height']/1000.0, smoothed_spectral_widths[i].T,
                               cmap='Blues', vmin=0, vmax=1, zorder=5)
        ax[2].set_ylim(0, 7)
        # Set y-axis ticks
        ax[2].set_yticks(np.arange(1, 8))
        ax[2].tick_params(axis='y', labelsize=15)
        ax[2].set_ylabel('Height [km]', color='black', fontsize=15)  # Label for the left y-axis
        cbar = fig.colorbar(pcm, ax=ax[2], extend='max')
        cbar.set_label("Spectral\nWidth [m s$^{-1}$]", fontsize=15)


        ### liquid water path and precipitation rate
        # if i == 3:
        pcm = ax[3].scatter(lwps[i]['time'], np.divide(lwps[i].values, 1000), color='blue', zorder=5)
        ax[3].set_yticks(np.arange(0, round(np.nanmax(np.divide(lwps[i].values, 1000)))+0.1, 0.2))
        ax[3].spines['left'].set_color('blue')
        ax[3].tick_params(axis='y', colors='blue', labelsize=15)
        ax[3].set_ylabel("LWP [kg m$^{-2}$]", c = "blue", fontsize=15)
        cbar = fig.colorbar(pcm, ax=ax[3], extend='neither')
        # set colorbar to invisible
        cbar.ax.set_visible(False)
        ax[3].grid(axis="y", ls = '--', c = "blue",alpha = 0.5)
        #####################
        # ax[3].set_xlim(0, np.nanmax(np.divide(lwps[i].values, 1000)))
        # ax[3].set_xticks(np.arange(1, (lwps[i].values/1000), 0.2)
        ax[3].set_ylim(0, 2)
        ax[3].set_yticks(np.arange(0, 2.5, 0.5))
        # print(np.nanmin(lwps[i]/1000))

        # Create a twin y-axis for ax[3]
        ax31 = ax[3].twinx()
        ax31.scatter(lwps[i]['time'], precips[i], color='black')
        ax31.spines['right'].set_color('black')
        ax31.tick_params(axis='y', colors='black', labelsize=15)
        ax31.set_ylabel('Precip Rate [mm/hr]', color='black', fontsize=15)  # Label for the twin y-axis
        ax31.set_yticks(np.arange(0, round(np.max(precips[i].values))+0.1))
        ax31.grid(axis="y", ls = '--', c = "black",alpha = 0.5)
        ax31.set_ylim(0, 2)
        ax31.set_yticks(np.arange(0, 2.5, 0.5))


        # Custom datetime format
        custom_date_format = "%H:%M UTC\n%d %B %Y"  # Format: 04:00 UTC\n13 March 2020
        # Formatting the time labels
        ax[len(ax)-1].xaxis.set_major_formatter(mdates.DateFormatter(custom_date_format))
        # Including the first-date and last-date on the plot
        ax[len(ax)-1].set_xlim([ds_subsets[i]['time'].min(), ds_subsets[i]['time'].max()])
        # Set custom tick positions for the x-axis
        # num_ticks = 6  # Number of tick labels
        num_ticks = 5  # Number of tick labels
        tick_positions = np.linspace(0, len(ds_subsets[i]['time']) - 1, num_ticks, dtype=int)
        ax[len(ax)-1].set_xticks(ds_subsets[i]['time'][tick_positions])

        # Generate custom tick labels including the first and last dates
        tick_labels = [ds_subsets[i]['time'][pos].dt.strftime(custom_date_format).values for pos in tick_positions]
        tick_labels[0] = ds_subsets[i]['time'].min().dt.strftime(custom_date_format).values  # First date
        tick_labels[-1] = ds_subsets[i]['time'].max().dt.strftime(custom_date_format).values  # Last date
        ax[len(ax)-1].set_xticklabels(tick_labels ,rotation=0);  # You can adjust fontsize and rotation
        
        
        # convert the timestamps from '%Y-%m-%dT%H:%M:%S.%f' to '%Y%m%d-%H:%M:%S' format for the file name
        timestamp_start = str(ds_subsets[i]['time'].values[0]) # start time
        timestamp_end = str(ds_subsets[i]['time'].values[-1])  # end time
        
        timestamp_without_nanoseconds = timestamp_start[:-3]  # Remove last three digits
        timestamp_without_nanoseconds1 = timestamp_end[:-3]  # Remove last three digits
        
        timestamp_out_start = datetime.strptime(timestamp_without_nanoseconds, '%Y-%m-%dT%H:%M:%S.%f').strftime('%Y%m%d-%H:%M:%S')
        timestamp_out_end = datetime.strptime(timestamp_without_nanoseconds1, '%Y-%m-%dT%H:%M:%S.%f').strftime('%Y%m%d-%H:%M:%S')
        

        directory = f'Open-cells-Figures-Final/t_series/{CAO_date}'
        file_name = f"{timestamp_out_start}_{timestamp_out_end}_ID{i+1}.png"


         # Create the directory if it doesn't exist
        if not os.path.exists(directory):
            os.makedirs(directory)
            print(f"Directory '{directory}' created successfully.")

        dirr = sorted(os.listdir(directory))
        if file_name in dirr:
            # Skip figures that were already written in an earlier run
            plt.close()
        else:
            fig.savefig(os.path.join(directory, file_name), dpi=500)
            print(f"Figures in {file_name} executed!")
            plt.close()
    print(f'\U0001f600\U0001f600\U0001f600\U0001f600ALL {len(formatted_date)} CELLS IDENTIFIED SUCCESSFULLY!!!\U0001f600\U0001f600\U0001f600\U0001f600')

In [3]:
data = [
    (1, '2020-03-13 08:12:28', '2020-03-13 08:34:48'),
    (2, '2020-03-13 08:32:08', '2020-03-13 09:06:36'),
    (3, '2020-03-13 09:12:48', '2020-03-13 09:31:00'),
    (4, '2020-03-13 09:29:36', '2020-03-13 09:58:16'),
    (5, '2020-03-13 10:00:12', '2020-03-13 10:11:56'),
    (6, '2020-03-13 10:14:32', '2020-03-13 10:29:00'),
    (7, '2020-03-13 10:22:40', '2020-03-13 10:46:56'),
    (8, '2020-03-13 10:42:32', '2020-03-13 10:59:04'),
    (9, '2020-03-13 11:00:32', '2020-03-13 11:26:20'),
    (10, '2020-03-13 11:23:32', '2020-03-13 11:32:28'),
    (11, '2020-03-13 11:30:44', '2020-03-13 11:51:52'),
    (12, '2020-03-13 11:48:08', '2020-03-13 12:09:04'),
    (13, '2020-03-13 12:03:08', '2020-03-13 12:15:00'),
    (14, '2020-03-13 12:12:24', '2020-03-13 12:38:04'),
    (15, '2020-03-13 12:32:48', '2020-03-13 12:56:36'),
    (16, '2020-03-13 12:52:56', '2020-03-13 13:14:32'),
    (17, '2020-03-13 13:53:00', '2020-03-13 14:05:16'),
    (18, '2020-03-13 14:03:28', '2020-03-13 14:31:20'),
    (19, '2020-03-13 15:09:36', '2020-03-13 15:19:08'),
    (20, '2020-03-13 15:17:16', '2020-03-13 15:45:20'),
    (21, '2020-03-13 15:54:36', '2020-03-13 15:58:08'),
    (22, '2020-03-13 15:58:12', '2020-03-13 16:02:36'),
    (23, '2020-03-13 15:48:00', '2020-03-13 16:12:44'),
    (24, '2020-03-13 16:12:00', '2020-03-13 16:33:44'),
    (25, '2020-03-13 16:29:40', '2020-03-13 16:38:20'),
    (26, '2020-03-13 16:44:24', '2020-03-13 16:57:56'),
    (27, '2020-03-13 16:53:20', '2020-03-13 17:13:32'),
    (28, '2020-03-13 17:12:56', '2020-03-13 17:22:56'),
    (29, '2020-03-13 17:13:36', '2020-03-13 17:45:48'),
    (30, '2020-03-13 18:10:04', '2020-03-13 18:28:04'),
    (31, '2020-03-13 18:27:28', '2020-03-13 18:50:04'),
    (32, '2020-03-13 18:47:08', '2020-03-13 18:54:48'),
    (33, '2020-03-13 18:54:04', '2020-03-13 18:57:08'),
    (34, '2020-03-13 19:15:32', '2020-03-13 19:54:00'),
    (35, '2020-03-13 19:47:12', '2020-03-13 20:15:52'),
    (36, '2020-03-13 20:14:12', '2020-03-13 20:32:32'),
    (37, '2020-03-13 20:26:44', '2020-03-13 20:48:12'),
    (38, '2020-03-13 20:50:16', '2020-03-13 20:59:28'),
    (39, '2020-03-13 20:48:36', '2020-03-13 21:19:56'),
    (40, '2020-03-13 21:07:00', '2020-03-13 21:11:04'),
    (41, '2020-03-13 21:16:00', '2020-03-13 21:21:40'),
    (42, '2020-03-13 21:27:40', '2020-03-13 21:50:48'),
    (43, '2020-03-13 21:48:20', '2020-03-13 21:59:08'),
    (44, '2020-03-13 21:57:40', '2020-03-13 22:13:00'),
    (45, '2020-03-13 22:28:24', '2020-03-13 22:39:36'),
    (46, '2020-03-13 22:38:08', '2020-03-13 23:00:00'),
    (47, '2020-03-13 23:38:56', '2020-03-14 00:16:28'),
    (48, '2020-03-14 00:13:24', '2020-03-14 00:30:44'),
    (49, '2020-03-14 01:06:40', '2020-03-14 01:24:08'),
    (50, '2020-03-14 02:19:20', '2020-03-14 02:34:36'),
    (51, '2020-03-14 02:43:24', '2020-03-14 03:00:44'),
    (52, '2020-03-14 03:17:16', '2020-03-14 03:37:04'),
    (53, '2020-03-14 03:33:36', '2020-03-14 03:41:20'),
    (54, '2020-03-14 03:57:16', '2020-03-14 04:18:04'),
    (55, '2020-03-14 04:12:56', '2020-03-14 04:31:56'),
    (56, '2020-03-14 04:27:40', '2020-03-14 04:50:20'),
    (57, '2020-03-14 04:49:56', '2020-03-14 05:19:48'),
]





formatted_date = []

for entry in data:
    entry_id, start_str, end_str = entry
    start_dt = datetime.strptime(start_str, '%Y-%m-%d %H:%M:%S')
    end_dt = datetime.strptime(end_str, '%Y-%m-%d %H:%M:%S')
    
    formatted_start = start_dt.strftime('%Y-%m-%dT%H:%M:%S.%f000')
    formatted_end = end_dt.strftime('%Y-%m-%dT%H:%M:%S.%f000')
    
    formatted_date.append((entry_id, formatted_start, formatted_end))

In [4]:
vertical_transect(formatted_date=formatted_date, CAO_date='Mar13-Mar14')

Directory 'Open-cells-Figures-Final/t_series/Mar13-Mar14' created successfully.
Figures in 20200313-08:12:28_20200313-08:34:48_ID1.png executed!
Figures in 20200313-08:32:08_20200313-09:06:36_ID2.png executed!
Figures in 20200313-09:12:48_20200313-09:31:00_ID3.png executed!
Figures in 20200313-09:29:36_20200313-09:58:16_ID4.png executed!
Figures in 20200313-10:00:12_20200313-10:11:56_ID5.png executed!
Figures in 20200313-10:14:32_20200313-10:29:00_ID6.png executed!
Figures in 20200313-10:22:40_20200313-10:46:56_ID7.png executed!
Figures in 20200313-10:42:32_20200313-10:59:04_ID8.png executed!
Figures in 20200313-11:00:32_20200313-11:26:20_ID9.png executed!
Figures in 20200313-11:23:32_20200313-11:32:28_ID10.png executed!
Figures in 20200313-11:30:44_20200313-11:51:52_ID11.png executed!
Figures in 20200313-11:48:08_20200313-12:09:04_ID12.png executed!
Figures in 20200313-12:03:08_20200313-12:15:00_ID13.png executed!
Figures in 20200313-12:12:24_20200313-12:38:04_ID14.png executed!
Figur