**RADIOSONDE PROJECT CODE FILE:**

Author: Muntaha Pasha

**IMPORTS NEEDED**

All of the packages needed for this project are imported below.

In [1]:
import csv
import glob

import matplotlib.cm as cm
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.pyplot import figure
from windrose import WindroseAxes

**INGESTING RADIOSONDE DATA INTO DATAFRAME (TO-DO):**

This code has only been tested on one of the three required files (the Grand Junction data).

Code provided by Anderson Banihirwe on the NCAR Python Zulip group.

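Current Bugs: The last record in the text file is not ingested (?), and the header information does not appear to update from one record to the next; date, time, latitude, and longitude all remain the same. Likely cause: a block of data rows is only written out when the next 18-field header line is reached, so the final block is never appended. If each header precedes its own data rows, every block is also paired with the following record's header rather than its own.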

In [5]:
import pathlib
import pandas as pd
file = pathlib.Path("GJ_Station_72476/uadb_trh_72476.txt")

with file.open() as f:
    data = f.read().splitlines()

dfs = []
dtypes = {'unknown_column': 'int32', 'pressure': 'float32', 'altitude': 'float32', 'temperature': 'float32',
          'relative_humidity': 'float32', 'wind_direction': 'float32', 'wind_speed': 'float32'}

header_row_cols = ['station_id', 'year', 'month', 'day', 'hour', 'latitude', 'longitude', 'elevation']

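# Data rows collected for the record currently being read.
x = []
for entry in data:
    entry = entry.strip().split()
    if len(entry) == 7:
        # A 7-field line is a data row (one pressure level).
        x.append(entry)

    elif len(entry) == 18 and x:
        # An 18-field line is a header. The rows collected so far are written
        # out here, using the values taken from this header line.
        station_id = entry[2]
        year = int(entry[7])
        month = int(entry[8])
        day = int(entry[9])
        hour = entry[10]
        #time = f"{year}-{month}-{day}T{hour}"
        latitude = float(entry[12])
        longitude = float(entry[13])
        elevation = float(entry[14])
        temp_df = pd.DataFrame(x, columns=list(dtypes.keys())).astype(dtypes)
        # Repeat the header values once per data row so the two frames can be joined.
        header_df =  pd.DataFrame([[station_id, year, month, day, hour,  latitude, longitude, elevation]]*temp_df.shape[0],
                                  columns = header_row_cols)
        dfGJS = header_df.join(temp_df)
        dfs.append(dfGJS)
        x = []

# Note: rows collected after the final header line are never written out.
dfGJS = pd.concat(dfs)
dfGJS.head()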

Unnamed: 0,station_id,year,month,day,hour,latitude,longitude,elevation,unknown_column,pressure,altitude,temperature,relative_humidity,wind_direction,wind_speed
0,72476,1962,3,4,1100,39.12,251.47,1474.0,2,1000.0,140.0,-999.0,-999.0,260.0,10.3
1,72476,1962,3,4,1100,39.12,251.47,1474.0,2,850.0,1475.0,2.5,55.400002,280.0,6.2
2,72476,1962,3,4,1100,39.12,251.47,1474.0,3,749.0,-99999.0,-7.2,32.700001,-999.0,-999.0
3,72476,1962,3,4,1100,39.12,251.47,1474.0,2,700.0,3002.0,-12.5,49.700001,280.0,11.8
4,72476,1962,3,4,1100,39.12,251.47,1474.0,3,600.0,-99999.0,-22.799999,71.5,-999.0,-999.0
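
One way to avoid both bugs noted for this cell is to remember the most recent header and attach it to every data row as it is read, so nothing depends on a later header arriving. The following is a minimal sketch, assuming each 18-field header line comes before the 7-field data rows it describes (this has not been verified against the full file). `parse_header`, `parse_soundings`, and `fake_header` are hypothetical names, and the sample lines are made up; only the field positions and column names are taken from the cell above.

```python
import pandas as pd

DATA_DTYPES = {'unknown_column': 'int32', 'pressure': 'float32', 'altitude': 'float32',
               'temperature': 'float32', 'relative_humidity': 'float32',
               'wind_direction': 'float32', 'wind_speed': 'float32'}
HEADER_COLS = ['station_id', 'year', 'month', 'day', 'hour',
               'latitude', 'longitude', 'elevation']


def parse_header(tokens):
    """Extract the header fields, using the same positions as the cell above."""
    return [tokens[2], int(tokens[7]), int(tokens[8]), int(tokens[9]), tokens[10],
            float(tokens[12]), float(tokens[13]), float(tokens[14])]


def parse_soundings(lines):
    """Build one DataFrame in which every data row carries its own header values."""
    rows = []
    header = None  # most recent header seen (assumption: headers precede their data)
    for line in lines:
        tokens = line.strip().split()
        if len(tokens) == 18:
            header = parse_header(tokens)
        elif len(tokens) == 7 and header is not None:
            rows.append(header + tokens)
    df = pd.DataFrame(rows, columns=HEADER_COLS + list(DATA_DTYPES))
    return df.astype(DATA_DTYPES)


def fake_header(station_id, year, month, day, hour, lat, lon, elev):
    """Made-up 18-field header line; the 'x' fields are filler, not real UADB content."""
    tokens = ['x'] * 18
    tokens[2] = station_id
    tokens[7], tokens[8], tokens[9], tokens[10] = year, month, day, hour
    tokens[12], tokens[13], tokens[14] = lat, lon, elev
    return ' '.join(str(t) for t in tokens)


# Synthetic sample: two records, the second of which is the last in the "file".
sample = [
    fake_header('72476', 1962, 3, 4, '1100', 39.12, 251.47, 1474.0),
    "2 1000.0 140.0 -999.0 -999.0 260.0 10.3",
    "2 850.0 1475.0 2.5 55.4 280.0 6.2",
    fake_header('72476', 1962, 3, 4, '2300', 39.12, 251.47, 1474.0),
    "2 700.0 3002.0 -12.5 49.7 280.0 11.8",
]
df = parse_soundings(sample)
print(df[['day', 'hour', 'pressure']])
```

With the real file, `parse_soundings(data)` would take the list of lines already read in the cell above. Because the header is attached row by row, the final record is kept and the date/time changes whenever a new header line is read.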


**INGESTING RADIOSONDE DATA INTO MATRICES**

This is an alternative to the DataFrame method above.

I wrote this code because querying lists seemed simpler to me than working with a massive DataFrame.

IMPORTANT ASSUMPTIONS MADE: I am assuming each record aligns on date/time across the stations, with two radiosonde measurements per day.

First, let's create matrices out of each of the Radiosonde Files for each of the stations.

In [7]:
with open('LAN_Station_72576/uadb_trh_72576.txt') as f:
    matrixLan=[line.split() for line in f]
with open('GJ_Station_72476/uadb_trh_72476.txt') as f1:
    matrixGJ=[line.split() for line in f1]
with open('SLC_Station_72572/uadb_trh_72572.txt') as f2:
    matrixSLC=[line.split() for line in f2]

Now let's test some values to make sure each matrix read the information in correctly from its file.

In [11]:
#-- Testing --#
#matrixLan[0]
#matrixGJ[0]
#matrixSLC[0]

#-- Testing --#
#matrixLan[1][2]
#matrixGJ[1][2]
#matrixSLC[1][2]
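
Beyond spot-checking individual entries, counting how many rows have each number of fields gives a quick view of whether a file was split as expected. This is a minimal sketch; `row_length_counts` and `demo_matrix` are hypothetical names, and the demo rows are made up. The field counts of 7 (data rows) and 18 (header rows) come from the DataFrame cell earlier in this file.

```python
from collections import Counter


def row_length_counts(matrix):
    """Count how many rows have each number of whitespace-separated fields."""
    return Counter(len(row) for row in matrix)


# Tiny made-up matrix standing in for matrixLan / matrixGJ / matrixSLC.
demo_matrix = [
    ['x'] * 18,                                                     # header-like row
    ['2', '1000.0', '140.0', '-999.0', '-999.0', '260.0', '10.3'],  # data row
    ['2', '850.0', '1475.0', '2.5', '55.4', '280.0', '6.2'],        # data row
    [],                                                             # blank line
]
counts = row_length_counts(demo_matrix)
print(counts)
```

Running `row_length_counts(matrixLan)` on the real data would show whether any rows have an unexpected number of fields before the loops below index into them.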

Variable Prefixes Used:
    
Beginning with L - Lander Variables

Beginning with G - Grand Junction Variables

Beginning with S - Salt Lake City Variables

In [23]:
LSevPressureVals = []
LFourPressureVals = []
LThreePressureVals = []
LElevationVals = []
LTempFour = []
LTempThree = []
#Get the elevation at the 700 mb level. It is the third column of a data row, so with Python's 0-based indexing it is index 2.
for i in range(len(matrixLan)):
    if(matrixLan[i][1] == '700.0'):
        #Record the row index i of each 700 mb entry.
        LSevPressureVals.append(i)
        #TODO: Check for missing value
        LElevationVals.append(matrixLan[i][2])
    elif(matrixLan[i][1] == '400.0'):
        LFourPressureVals.append(i)
        #TODO: Check for missing value
        LTempFour.append(matrixLan[i][3])
    elif(matrixLan[i][1] == '300.0'):
        #Don't necessarily need to compute the second equation if this value is missing.
        LThreePressureVals.append(i)
        #TODO: Check for missing value
        LTempThree.append(matrixLan[i][3])

GSevPressureVals = []
GFourPressureVals = []
GThreePressureVals = []
GElevationVals = []
GTempFour = []
GTempThree = []
#Get the elevation at the 700 mb level. It is the third column of a data row, so with Python's 0-based indexing it is index 2.
for i in range(len(matrixGJ)):
    if(matrixGJ[i][1] == '700.0'):
        #Record the row index i of each 700 mb entry.
        GSevPressureVals.append(i)
        #TODO: Check for missing value
        GElevationVals.append(matrixGJ[i][2])
    elif(matrixGJ[i][1] == '400.0'):
        GFourPressureVals.append(i)
        #TODO: Check for missing value
        GTempFour.append(matrixGJ[i][3])
    elif(matrixGJ[i][1] == '300.0'):
        #Don't necessarily need to compute the second equation if this value is missing.
        GThreePressureVals.append(i)
        #TODO: Check for missing value
        GTempThree.append(matrixGJ[i][3])

SSevPressureVals = []
SFourPressureVals = []
SThreePressureVals = []
SElevationVals = []
STempFour = []
STempThree = []
#Get the elevation at the 700 mb level. It is the third column of a data row, so with Python's 0-based indexing it is index 2.
for i in range(len(matrixSLC)):
    if(matrixSLC[i][1] == '700.0'):
        #Record the row index i of each 700 mb entry.
        SSevPressureVals.append(i)
        #TODO: Check for missing value
        SElevationVals.append(matrixSLC[i][2])
    elif(matrixSLC[i][1] == '400.0'):
        SFourPressureVals.append(i)
        #TODO: Check for missing value
        STempFour.append(matrixSLC[i][3])
    elif(matrixSLC[i][1] == '300.0'):
        #Don't necessarily need to compute the second equation if this value is missing.
        SThreePressureVals.append(i)
        #TODO: Check for missing value
        STempThree.append(matrixSLC[i][3])

#---FIRST EQUATION CALCULATION---#
#DZ70D =  Z70SLC + Z70GJT - 2 * Z70LND
#Convert the 700 mb heights from strings to floats.
Z70SLC = [float(i) for i in SElevationVals]
Z70GJT = [float(i) for i in GElevationVals]
Z70LND = [float(i) for i in LElevationVals]

print("SLC Total Elevation Data Points: {}".format(len(Z70SLC)))
print("GJT Total Elevation Data Points: {}".format(len(Z70GJT)))
print("LAN Total Elevation Data Points: {}".format(len(Z70LND)))

#Pair the values by list position; zip stops at the shortest of the three lists.
#This is only valid if the three files are aligned on date/time.
DZ70D = [slc + gjt - (2 * lnd) for slc, gjt, lnd in zip(Z70SLC, Z70GJT, Z70LND)]

#Print the first ten results.
for value in DZ70D[:10]:
    print(value)

SLC Total Elevation Data Points: 48290
GJT Total Elevation Data Points: 47061
LAN Total Elevation Data Points: 28964
152.0
176.0
59.0
55.0
57.0
103.0
76.0
39.0
95.0
109.0
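
The three station counts printed above differ, so pairing values by list position will mix different dates. One possible approach is to key each 700 mb height by the date/time in its header and compute the first equation only for times present at all three stations. This is a minimal sketch, assuming header rows have 18 fields with year, month, day, and hour at indices 7 to 10 (as in the DataFrame cell) and that each header precedes its data rows. `heights_at_level`, `dz70d_by_time`, and `demo_header` are hypothetical names, and the demo heights are made up.

```python
MISSING_HEIGHT = -99999.0  # sentinel seen in the altitude column of the DataFrame output


def heights_at_level(matrix, level='700.0'):
    """Map (year, month, day, hour) -> height at `level`, skipping missing heights."""
    heights = {}
    stamp = None
    for row in matrix:
        if len(row) == 18:
            # Header row: remember its date/time for the data rows that follow.
            stamp = (int(row[7]), int(row[8]), int(row[9]), row[10])
        elif len(row) == 7 and stamp is not None and row[1] == level:
            z = float(row[2])
            if z != MISSING_HEIGHT:
                heights[stamp] = z
    return heights


def dz70d_by_time(slc, gjt, lnd):
    """DZ70D = Z70SLC + Z70GJT - 2 * Z70LND for times present at all three stations."""
    common = sorted(slc.keys() & gjt.keys() & lnd.keys())
    return {t: slc[t] + gjt[t] - 2 * lnd[t] for t in common}


def demo_header(year, month, day, hour):
    """Made-up 18-field header row; the 'x' fields are filler."""
    row = ['x'] * 18
    row[7], row[8], row[9], row[10] = str(year), str(month), str(day), hour
    return row


def demo_row(height):
    """Made-up 7-field data row at 700 mb with the given height."""
    return ['2', '700.0', str(height), '-12.5', '49.7', '280.0', '11.8']


# Synthetic matrices: only the 1100 sounding is usable at all three stations.
slc_demo = [demo_header(1962, 3, 4, '1100'), demo_row(3000.0),
            demo_header(1962, 3, 4, '2300'), demo_row(3020.0)]
gjt_demo = [demo_header(1962, 3, 4, '1100'), demo_row(3010.0),
            demo_header(1962, 3, 4, '2300'), demo_row(-99999.0)]
lnd_demo = [demo_header(1962, 3, 4, '1100'), demo_row(2990.0)]

result = dz70d_by_time(heights_at_level(slc_demo),
                       heights_at_level(gjt_demo),
                       heights_at_level(lnd_demo))
print(result)
```

On the real data the calls would be `heights_at_level(matrixSLC)`, `heights_at_level(matrixGJ)`, and `heights_at_level(matrixLan)`. Using a dictionary keyed by date/time means a missing sounding at one station drops that time from the result instead of shifting every later value.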


**END OF FILE NOTES:**

**TO-DO:**

1. Check for Missing Data
2. Make sure that every radiosonde file is aligned (2 entries per day, otherwise the code above does not work).
3. Figure out what to do with Header Information
4. Calculate Second Equation with T400 and T300
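
For the first item, the DataFrame output earlier in this file shows `-999.0` and `-99999.0` standing in for missing values. One possible way to handle them is to replace those sentinels with NaN so that pandas skips them in calculations. This is a minimal sketch; `mask_missing` is a hypothetical helper, the sentinel list is inferred from the displayed output only, and the demo rows copy three rows of that output.

```python
import numpy as np
import pandas as pd

# Sentinels inferred from the displayed output; the full files may use others.
MISSING_SENTINELS = [-999.0, -99999.0]


def mask_missing(df, columns):
    """Return a copy of df with sentinel values in `columns` replaced by NaN."""
    out = df.copy()
    out[columns] = out[columns].replace(MISSING_SENTINELS, np.nan)
    return out


demo = pd.DataFrame({
    'pressure': [1000.0, 850.0, 749.0],
    'altitude': [140.0, 1475.0, -99999.0],
    'temperature': [-999.0, 2.5, -7.2],
})
cleaned = mask_missing(demo, ['altitude', 'temperature'])
print(cleaned)
```

After masking, `cleaned.dropna(subset=['altitude'])` would keep only the rows with a usable height.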