## Lab 1 Part I: METARs
#### 9/7/2022


The following tutorial is the first of three parts of the Python portion of Lab 1.  In this part we will focus on how to work with METAR data in Python using the modules MetPy and Pandas.  As with every future lab, I will include a link to the documentation for each module that we introduce for the first time.
<br />
### Module Documentation
1. MetPy METAR Parsing Function: https://unidata.github.io/MetPy/latest/api/generated/metpy.io.parse_metar_file.html
2. Pandas DataFrame: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html
3. The datetime class from the datetime module: https://docs.python.org/3/library/datetime.html


<br /><br />

If you have any questions about the code below, feel free to reach out to me at mpvossen@uwm.edu; I am always willing to explain the code further. <br /> <br />


1. In most things we do in atmospheric science, we can save ourselves time by importing code that someone else has already written for us, called modules.  In the section below I load the Python modules we are going to need to complete Part I of the tutorial.


<br /><br />
4. We now have our data parsed out.  The data is now in what is called a Pandas DataFrame, which you can picture as a table of data like one you would see in a textbook.  The table has column names and row names that we can use to access various parts of the data.  With the way that MetPy structures its parser, the row names are the station names and the column names are the observation variable names.  This structure is useful because when working with METAR data we sometimes need the observations for a single location.  In the code below I use the pandas syntax to get O'Hare airport's observation rows from our sample file.  Multiple rows may appear, since O'Hare may report multiple observations during the hour that the data covers. <br />


In [1]:
#from the data reading capabilities of metpy (metpy.io) import the metar reading capability (parse_metar_file)
from metpy.io import parse_metar_file
#import the data storage for the metar data.  This package lays the data out in a table like format
import pandas as pd
#from the dates and time code(datetime), import the date and time reading capabilities (datetime).
from datetime import datetime
#from Python's input/output module (io) import the ability to read a string as a file.  This allows us to avoid downloading files, which speeds things up and keeps your file storage clean.
from io import StringIO
#import the module to download files from the internet
import requests
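
The comment on StringIO above says it lets us read a string as a file. As a minimal sketch of that idea (the sample text here is made up for illustration and is not lab data), the snippet below wraps an ordinary string so that it can be read line by line like an open file.

```python
from io import StringIO

# Two made-up lines of text standing in for the contents of a downloaded file
# (hypothetical sample text, not real lab data).
sample_text = "first line\nsecond line\n"

# Wrap the string so it can be read with the same methods as an open file.
fake_file = StringIO(sample_text)

# Read it back line by line, just as we would with a file opened by open().
lines = [line.strip() for line in fake_file]
print(lines)
```

Nothing is written to disk here, which is why this approach keeps your file storage clean.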

<br /><br />
2. In this part of the lab we will need to convert various values since the data is not always in the units we would like.  Below is a sample function for how to convert wind from knots to mph. <br />

In [2]:
#here the function is defined.  The def command says to define the function of the name convert_knots_to_mph with the input variable of the name value
def convert_knots_to_mph(value):
    
    #this line causes the function to return a value.  Here I'm returning the input variable divided by 0.868976
    return value / 0.868976
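
If you are wondering where 0.868976 comes from, the short sketch below rebuilds it from the definitions of the nautical mile (1.852 km) and the statute mile (1.609344 km). The constant names are my own and only serve this illustration.

```python
# Exact definitions of the two kinds of mile, in kilometers.
KM_PER_NAUTICAL_MILE = 1.852
KM_PER_STATUTE_MILE = 1.609344

# Number of knots that equal one mile per hour.
knots_per_mph = KM_PER_STATUTE_MILE / KM_PER_NAUTICAL_MILE


def convert_knots_to_mph(value):
    # Same function as in the cell above, repeated so this sketch runs on its own.
    return value / 0.868976


# The derived factor, rounded to six decimals, matches the constant in the function.
print(round(knots_per_mph, 6))

# A 3 knot wind expressed in mph.
print(round(convert_knots_to_mph(3.0), 6))
```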
    

<br /><br />In the section below, create a function to convert a temperature value from C to F.  Name the function convert_c_to_f. <br />

In [3]:
def convert_c_to_f(value):
    return (value * 9/5) + 32
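
One way to gain confidence in a conversion function is to test it against values you already know. The sketch below checks three well-known temperatures (freezing, boiling, and the point where the two scales meet); the dictionary name is my own choice for this illustration.

```python
import math


def convert_c_to_f(value):
    # Same function as in the cell above, repeated so this sketch runs on its own.
    return (value * 9/5) + 32


# Well-known Celsius temperatures and their Fahrenheit equivalents.
known_points = {0.0: 32.0, 100.0: 212.0, -40.0: -40.0}

for celsius, expected_f in known_points.items():
    # isclose allows for tiny floating point rounding differences.
    assert math.isclose(convert_c_to_f(celsius), expected_f)
    print(celsius, "C =", convert_c_to_f(celsius), "F")
```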

<br /><br />

3. Let's start off by accessing our METAR data.  Normally we would download the latest METAR data from the Unidata THREDDS server, but as of this week the server is down.  So instead I have downloaded a sample METAR file for you to open.  This code may take up to 10 seconds to run.  Note: If you wanted to get your own surface data you would go to https://thredds-test.unidata.ucar.edu/thredds/catalog/noaaport/text/metar/catalog.html<br /> <br />



In [4]:
#Here I create the variable that holds the time in UTC that we want the METAR data for.  The file that I have downloaded for you is for July 15th, 2022 1500 UTC
#datetime(year, month, day, hour)
file_time = datetime(2022,7,15,15)



#Here I build the string to tell metpy later where the data is located on the JupyterHub server
data_location = "/srv/data/shared_notebooks/Synoptic1-AtmSci360/Data/Lab_1/"

#Here I define the name of the METAR file we are going to parse
data_name = "sample_surface.txt"

#We now tell MetPy to parse out the METAR file. Here I concatenate the data_location and data_name variables to get the full data file name (data_location+data_name)
#Also, MetPy can only get the day of the month from the METAR, so we need to specify the month (file_time.month) and year (file_time.year)
#from the file time that we set before, or else it will assume the current month and year.
metar_data = parse_metar_file(data_location+data_name, month = file_time.month, year=file_time.year)

#below you can see that the data is parsed out and is now in a form that is similar to a table.  This is called a data frame.
#also in Jupyter you can display one variable by typing out the variable name like I did below. (Note: this does not work outside Jupyter)
#if you need to display multiple variables in a cell, you will need to use the print statement instead
metar_data

| station_id (index) | station_id | latitude | longitude | elevation | date_time | wind_direction | wind_speed | wind_gust | visibility | current_wx1 | ... | air_temperature | dew_point_temperature | altimeter | current_wx1_symbol | current_wx2_symbol | current_wx3_symbol | remarks | air_pressure_at_sea_level | eastward_wind | northward_wind |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MDPC | MDPC | 18.570000 | -68.370000 | 12.0 | 2022-07-14 15:00:00 | | | | 901.000 | | ... | | | | 0 | 0 | 0 | 2 9999 BKN022 30/23 Q1018 | | | |
| KVKS | KVKS | 32.220000 | -90.930000 | 32.0 | 2022-07-15 14:55:00 | 200.0 | 4.0 | | 16093.440 | | ... | 27.0 | 26.0 | 30.11 | 0 | 0 | 0 | A01 | 1019.79 | 1.368081e+00 | 3.758770e+00 |
| K27K | K27K | 38.233333 | -84.433333 | 289.0 | 2022-07-15 14:55:00 | 0.0 | 0.0 | | 16093.440 | | ... | 25.0 | 16.0 | 30.17 | 0 | 0 | 0 | AO1 | 1020.73 | -0.000000e+00 | -0.000000e+00 |
| PAKU | PAKU | 70.310000 | -149.580000 | 2.0 | 2022-07-15 14:45:00 | 330.0 | 3.0 | | 8046.720 | BR | ... | -1.0 | -1.0 | 29.90 | 10 | 0 | 0 | | 1012.84 | 1.500000e+00 | -2.598076e+00 |
| K1U7 | K1U7 | 42.250000 | -111.350000 | 1807.0 | 2022-07-15 14:45:00 | 90.0 | 3.0 | | 16093.440 | | ... | 19.0 | 17.0 | 30.30 | 0 | 0 | 0 | A01 | 1019.28 | -3.000000e+00 | -1.836970e-16 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| KFEW | KFEW | 41.130000 | -104.870000 | 1878.0 | 2022-07-15 14:58:00 | 180.0 | 4.0 | | 16093.440 | | ... | 26.0 | 12.0 | 30.32 | 0 | 0 | 0 | AO2A SLP178 | 1014.29 | -4.898587e-16 | 4.000000e+00 |
| FKKD | FKKD | 4.000000 | 9.720000 | 9.0 | 2022-07-15 15:00:00 | 230.0 | 7.0 | | 9999.000 | | ... | 29.0 | 23.0 | 29.83 | 0 | 0 | 0 | NOSIG | 1010.25 | 5.362311e+00 | 4.499513e+00 |
| HKNW | HKNW | -1.320000 | 36.820000 | 1679.0 | 2022-07-15 15:00:00 | 140.0 | 12.0 | | 9999.000 | | ... | 22.0 | 10.0 | 30.15 | 0 | 0 | 0 | | 1012.83 | -7.713451e+00 | 9.192533e+00 |
| CYXS | CYXS | 53.900000 | -122.680000 | 691.0 | 2022-07-15 15:00:00 | 350.0 | 10.0 | | 14484.096 | | ... | -3.0 | -13.0 | 29.99 | 0 | 0 | 0 | SLP194 | 1020.80 | 1.736482e+00 | -9.848078e+00 |
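
The parser call above pulls the month and year out of file_time. The sketch below shows what those attributes hold, and how the same datetime object can be turned into text with strftime. The file-name pattern is a hypothetical illustration only; check the THREDDS catalog for the names that are actually used there.

```python
from datetime import datetime

# Same time as in the cell above: July 15th, 2022 at 1500 UTC.
file_time = datetime(2022, 7, 15, 15)

# Each piece of the date is available as its own attribute.
print(file_time.year, file_time.month, file_time.day, file_time.hour)

# strftime builds a string from the date using format codes
# (%Y = 4 digit year, %m = 2 digit month, %d = 2 digit day, %H = 2 digit hour).
# NOTE: this file-name pattern is an assumption made for illustration.
file_name = file_time.strftime("metar_%Y%m%d_%H00.txt")
print(file_name)
```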


In [5]:
#set the site variable to a string of O'Hare's 4 letter identifier
site = "KORD"

#from the metar data frame (the metar_data variable) slice out the row (.loc[]) that has the index that is for the site we want (site) and save it to the variable station.
station = metar_data.loc[site]

#display the sliced data for O'Hare. The data may look different, but it is still set up the same way as the cell above.
station

| station_id (index) | station_id | latitude | longitude | elevation | date_time | wind_direction | wind_speed | wind_gust | visibility | current_wx1 | ... | air_temperature | dew_point_temperature | altimeter | current_wx1_symbol | current_wx2_symbol | current_wx3_symbol | remarks | air_pressure_at_sea_level | eastward_wind | northward_wind |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| KORD | KORD | 41.98 | -87.93 | 200.0 | 2022-07-15 14:51:00 | 0.0 | 0.0 | | 11265.408 | -RA | ... | 19.0 | 18.0 | 30.15 | 61 | 0 | 0 | AO2 RAE08B35 SLP206 P0005 60033 T01940178 53009 | 1020.94 | -0.0 | -0.0 |
| KORD | KORD | 41.98 | -87.93 | 200.0 | 2022-07-15 15:05:00 | 210.0 | 3.0 | | 9656.064 | -RA | ... | 19.0 | 18.0 | 30.15 | 61 | 10 | 0 | AO2 P0000 T01940183 | 1020.94 | 1.5 | 2.598076 |
| KORD | KORD | 41.98 | -87.93 | 200.0 | 2022-07-15 15:48:00 | 210.0 | 5.0 | | 6437.376 | RA | ... | 19.0 | 18.0 | 30.15 | 63 | 10 | 0 | AO2 P0003 | 1020.94 | 2.5 | 4.330127 |
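
Because a station can appear more than once in the table, what .loc[] hands back depends on how many rows match. The sketch below uses a tiny made-up table (the values are for illustration, not read from the lab file) to show the difference, plus one way to always get a table back.

```python
import pandas as pd

# A tiny made-up table indexed by station, with one station repeated.
toy = pd.DataFrame(
    {"wind_speed": [0.0, 3.0, 13.0], "air_temperature": [19.0, 19.0, 17.0]},
    index=pd.Index(["KORD", "KORD", "KMSN"], name="station_id"),
)

# Two rows match, so we get a DataFrame (a smaller table) back.
several_rows = toy.loc["KORD"]
print(type(several_rows).__name__, several_rows.shape)

# Only one row matches, so we get a Series (a single row) back,
# and picking a column from it gives a single number.
single_row = toy.loc["KMSN"]
print(type(single_row).__name__, single_row["wind_speed"])

# Passing the station inside a list always returns a DataFrame.
always_table = toy.loc[["KMSN"]]
print(type(always_table).__name__, always_table.shape)
```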


<br /> <br /> 
5. We can also parse out specific variables we want by using the syntax below. <br />

In [6]:
#from the data that only contains the METAR for KORD (station) slice out the column named "wind_speed" and save it to the variable station_wind.
#For columns we can just use the brackets and we don't need an indexer like the .loc[] that we needed before for the row.
station_wind = station["wind_speed"]

#display the variable that we saved the wind speed data from KORD to.
station_wind

station_id
KORD    0.0
KORD    3.0
KORD    5.0
Name: wind_speed, dtype: float64
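
The same bracket syntax can also pull out several columns at once if you give it a list of column names. The sketch below is a small stand-in for the KORD rows, with the wind and temperature values copied from the output above; the variable names are my own.

```python
import pandas as pd

# A small stand-in for the KORD rows, using values from the output above.
toy_station = pd.DataFrame(
    {
        "wind_speed": [0.0, 3.0, 5.0],
        "wind_direction": [0.0, 210.0, 210.0],
        "air_temperature": [19.0, 19.0, 19.0],
    },
    index=pd.Index(["KORD", "KORD", "KORD"], name="station_id"),
)

# One column name in the brackets returns a single column (a Series).
one_column = toy_station["wind_speed"]

# A list of column names in the brackets returns a smaller table (a DataFrame).
wanted = ["wind_speed", "wind_direction"]
two_columns = toy_station[wanted]
print(two_columns)
```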

<br /><br />
6. Finally, our data is in the standard METAR units.  One way we can convert these units is by using the functions we created earlier, as in the code below.
<br />

In [7]:
#using the convert_knots_to_mph function that I defined before to convert the wind speed for KORD (station_wind) from knots to mph and save the output from the function to the variable station_wind_mph.
station_wind_mph = convert_knots_to_mph(station_wind)
#display the station wind speed that resulted from the function above
station_wind_mph

station_id
KORD    0.000000
KORD    3.452339
KORD    5.753899
Name: wind_speed, dtype: float64
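
Notice that the function was written for a single value but still worked on the whole column. That is because pandas applies arithmetic such as division to every element of a Series. The sketch below repeats the idea with a stand-alone Series holding the three KORD wind speeds.

```python
import pandas as pd


def convert_knots_to_mph(value):
    # Same function as defined earlier, repeated so this sketch runs on its own.
    return value / 0.868976


# The three KORD wind speeds in knots.
wind_knots = pd.Series([0.0, 3.0, 5.0], name="wind_speed")

# The division inside the function is applied to every element of the Series.
wind_mph = convert_knots_to_mph(wind_knots)

# Round to one decimal place for a tidier display.
print(wind_mph.round(1).tolist())
```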


7. In the code section below, parse out the temperature (air_temperature), dewpoint (dew_point_temperature), pressure (air_pressure_at_sea_level), wind speed (wind_speed), wind direction (wind_direction), and cloud coverage (cloud_coverage) for Madison (KMSN).  Display the output so you can use it to answer question 6 in the lab.  Be sure to convert temperature and dewpoint to the appropriate units.


In [16]:
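#set the site2 variable to a string of Madison's 4 letter identifier
site2 = "KMSN"

#slice out the row for Madison from the METAR data frame
station2 = metar_data.loc[site2]

#pull out each variable needed to answer the lab question
station_temperature = station2["air_temperature"]
station_dewpoint = station2["dew_point_temperature"]
station_pressure = station2["air_pressure_at_sea_level"]
station_windspeed = station2["wind_speed"]
station_winddirection = station2["wind_direction"]
station_clouds = station2["cloud_coverage"]

#convert temperature and dewpoint from C to F using the function defined earlier
station_temperature_F = convert_c_to_f(station_temperature)
station_dewpoint_F = convert_c_to_f(station_dewpoint)

#display all of the values together
(station_temperature_F, station_dewpoint_F, station_pressure, station_windspeed, station_winddirection, station_clouds)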

(62.6, 60.8, 1017.62, 13.0, 180.0, 8)

<br /><br />

### You have now completed Part I of the Python portion of the lab.  Be sure to submit the fully rendered Jupyter Notebook on GitHub when you are finished.
