# GEOS 505: Problem Set 1

__Instructions__: Complete the two problems below. One is a coding problem; the other asks you to review and comment existing code. Where you are asked to provide descriptive text and answer questions, please do so via well-formatted Markdown cells below the problem.  

__Due Date and Time__: September 19, 2025 at 5:00 PM MT

__Turn In Via__: Commit and push your complete notebook to your personal GitHub repository for the class, and submit the URL for the notebook via Canvas. 

## Problem 1

### 1. Problem Background:

You are a new graduate student whose research examines the performance of 5-day precipitation forecasts at a number of SNOTEL sites in the Upper Boise River Basin. The forecasts are generated by a research center at a university in the western United States using the [Weather Research and Forecasting](https://www2.mmm.ucar.edu/wrf/users/) (WRF) model. Every day, at the top of every hour, the center runs a 360 hr (i.e., 15 day) WRF forecast for the entire western United States and puts the output you need in a freely available Amazon Web Services (AWS) S3 bucket.  

File pattern:
WRF_fcst_YYYYMMDD_HHz_VVV.nc

Where:
- YYYY is the year of the forecast date
- MM is the month of the forecast date
- DD is the day of the forecast
- HH is the hour the forecast was initiated (0-23)
- VVV is the valid hour of the forecast from the time of initiation in hours (0-360 hr in 3 hr increments). So 000 is the initial hour of the forecast, 003 would be the 3rd hour, 006 the 6th, and so forth.
- 'WRF_fcst_' is the prefix of the file name
- '.nc' is the file extension (this stands for NetCDF file - a real file format)
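The zero-padded fields in this pattern map directly onto Python's format specifications. The following is a minimal sketch, not part of the assignment; the function name `format_forecast_name` and its parameter names are hypothetical, chosen only for illustration.

```python
def format_forecast_name(year, month, day, init_hour, valid_hour):
    """Build one file name following WRF_fcst_YYYYMMDD_HHz_VVV.nc."""
    # :02d pads to two digits and :03d to three, so 6 -> '06' and 9 -> '009'
    return (
        f"WRF_fcst_{year:04d}{month:02d}{day:02d}"
        f"_{init_hour:02d}z_{valid_hour:03d}.nc"
    )


example_name = format_forecast_name(2025, 3, 1, 6, 9)
print(example_name)  # WRF_fcst_20250301_06z_009.nc
```

Using format specifications avoids separate branches for one-, two-, and three-digit values.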

### 2. Instructions: 

1. Write a function that takes as input the forecast year, month, day, and hour, as well as a maximum valid hour, and returns a list of file names that contain the forecast data.
2. Test your code for three days (2025-03-01, 2025-04-01, and 2025-05-01), for forecast hour 6, and a maximum valid hour of 120 hours. Print the results of the returned list to the screen and verify manually. 
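The manual verification in step 2 can be supplemented with a small automated check. The sketch below assumes the maximum valid hour is itself included in the list (so 120 hours in 3 hr steps gives 41 files); the helper names `valid_hours_in` and `covers_all_valid_hours` are hypothetical, and the two lists are stand-ins for whatever your function returns.

```python
import re


def valid_hours_in(filenames):
    """Pull the three-digit valid hour (VVV) out of each file name."""
    hours = []
    for name in filenames:
        match = re.search(r"_(\d{3})\.nc$", name)
        if match is None:
            raise ValueError(f"unexpected file name: {name}")
        hours.append(int(match.group(1)))
    return hours


def covers_all_valid_hours(filenames, max_valid_hour):
    """True if the names cover 0, 3, 6, ... through max_valid_hour, in order."""
    expected = list(range(0, max_valid_hour + 1, 3))
    return valid_hours_in(filenames) == expected


# Stand-in lists: one complete, one that stops short of hour 120
complete = [f"WRF_fcst_20250301_06z_{h:03d}.nc" for h in range(0, 121, 3)]
short = complete[:-1]

print(len(complete), covers_all_valid_hours(complete, 120))  # 41 True
print(len(short), covers_all_valid_hours(short, 120))        # 40 False
```

A check like this catches an off-by-one in the loop bounds that is easy to miss when reading a long printed list.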

### 3. Concepts Assessed:

- String handling
- For loops
- Basic functions

In [None]:
def write_filenames(year, month, day, hour, max_valid_hour):
    """
    Provide a list of forecast file names based on the inputs.

    Inputs:
    year = year of forecast date [YYYY]
    month = month of forecast date [MM]
    day = day of forecast [DD]
    hour = hour the forecast was initiated [HH] (0-23)
    max_valid_hour = the maximum valid hour of the forecast from the time of initiation [VVV] (0-360 in 3 hour increments)

    Returns: list of file names with file pattern WRF_fcst_YYYYMMDD_HHz_VVV.nc
    """
    #convert inputs to integers so they can be zero-padded and used in range()
    year = int(year)
    month = int(month)
    day = int(day)
    hour = int(hour)
    max_valid_hour = int(max_valid_hour)

    #loop to produce filenames
    filenames = [] #place to store filenames
    #valid hours every 3 hours from 0 through max_valid_hour
    #(+1 because range() excludes its stop value and the maximum valid hour should be included)
    for valid_hour in range(0, max_valid_hour + 1, 3):
        #compile each filename; :02d and :03d zero-pad to two and three digits
        filename = f'WRF_fcst_{year:04d}{month:02d}{day:02d}_{hour:02d}z_{valid_hour:03d}.nc'
        #add each filename to the list
        filenames.append(filename)

    return filenames

In [29]:
#2025-03-01

result = write_filenames(2025, 3, 1, 6, 120)
print(result)

['WRF_fcst_20250301_06z_000.nc', 'WRF_fcst_20250301_06z_003.nc', 'WRF_fcst_20250301_06z_006.nc', 'WRF_fcst_20250301_06z_009.nc', 'WRF_fcst_20250301_06z_012.nc', 'WRF_fcst_20250301_06z_015.nc', 'WRF_fcst_20250301_06z_018.nc', 'WRF_fcst_20250301_06z_021.nc', 'WRF_fcst_20250301_06z_024.nc', 'WRF_fcst_20250301_06z_027.nc', 'WRF_fcst_20250301_06z_030.nc', 'WRF_fcst_20250301_06z_033.nc', 'WRF_fcst_20250301_06z_036.nc', 'WRF_fcst_20250301_06z_039.nc', 'WRF_fcst_20250301_06z_042.nc', 'WRF_fcst_20250301_06z_045.nc', 'WRF_fcst_20250301_06z_048.nc', 'WRF_fcst_20250301_06z_051.nc', 'WRF_fcst_20250301_06z_054.nc', 'WRF_fcst_20250301_06z_057.nc', 'WRF_fcst_20250301_06z_060.nc', 'WRF_fcst_20250301_06z_063.nc', 'WRF_fcst_20250301_06z_066.nc', 'WRF_fcst_20250301_06z_069.nc', 'WRF_fcst_20250301_06z_072.nc', 'WRF_fcst_20250301_06z_075.nc', 'WRF_fcst_20250301_06z_078.nc', 'WRF_fcst_20250301_06z_081.nc', 'WRF_fcst_20250301_06z_084.nc', 'WRF_fcst_20250301_06z_087.nc', 'WRF_fcst_20250301_06z_090.nc', 'WRF_fc

In [30]:
#2025-04-01

result = write_filenames(2025, 4, 1, 6, 120)
print(result)

['WRF_fcst_20250401_06z_000.nc', 'WRF_fcst_20250401_06z_003.nc', 'WRF_fcst_20250401_06z_006.nc', 'WRF_fcst_20250401_06z_009.nc', 'WRF_fcst_20250401_06z_012.nc', 'WRF_fcst_20250401_06z_015.nc', 'WRF_fcst_20250401_06z_018.nc', 'WRF_fcst_20250401_06z_021.nc', 'WRF_fcst_20250401_06z_024.nc', 'WRF_fcst_20250401_06z_027.nc', 'WRF_fcst_20250401_06z_030.nc', 'WRF_fcst_20250401_06z_033.nc', 'WRF_fcst_20250401_06z_036.nc', 'WRF_fcst_20250401_06z_039.nc', 'WRF_fcst_20250401_06z_042.nc', 'WRF_fcst_20250401_06z_045.nc', 'WRF_fcst_20250401_06z_048.nc', 'WRF_fcst_20250401_06z_051.nc', 'WRF_fcst_20250401_06z_054.nc', 'WRF_fcst_20250401_06z_057.nc', 'WRF_fcst_20250401_06z_060.nc', 'WRF_fcst_20250401_06z_063.nc', 'WRF_fcst_20250401_06z_066.nc', 'WRF_fcst_20250401_06z_069.nc', 'WRF_fcst_20250401_06z_072.nc', 'WRF_fcst_20250401_06z_075.nc', 'WRF_fcst_20250401_06z_078.nc', 'WRF_fcst_20250401_06z_081.nc', 'WRF_fcst_20250401_06z_084.nc', 'WRF_fcst_20250401_06z_087.nc', 'WRF_fcst_20250401_06z_090.nc', 'WRF_fc

In [31]:
#2025-05-01

result = write_filenames(2025, 5, 1, 6, 120)
print(result)

['WRF_fcst_20250501_06z_000.nc', 'WRF_fcst_20250501_06z_003.nc', 'WRF_fcst_20250501_06z_006.nc', 'WRF_fcst_20250501_06z_009.nc', 'WRF_fcst_20250501_06z_012.nc', 'WRF_fcst_20250501_06z_015.nc', 'WRF_fcst_20250501_06z_018.nc', 'WRF_fcst_20250501_06z_021.nc', 'WRF_fcst_20250501_06z_024.nc', 'WRF_fcst_20250501_06z_027.nc', 'WRF_fcst_20250501_06z_030.nc', 'WRF_fcst_20250501_06z_033.nc', 'WRF_fcst_20250501_06z_036.nc', 'WRF_fcst_20250501_06z_039.nc', 'WRF_fcst_20250501_06z_042.nc', 'WRF_fcst_20250501_06z_045.nc', 'WRF_fcst_20250501_06z_048.nc', 'WRF_fcst_20250501_06z_051.nc', 'WRF_fcst_20250501_06z_054.nc', 'WRF_fcst_20250501_06z_057.nc', 'WRF_fcst_20250501_06z_060.nc', 'WRF_fcst_20250501_06z_063.nc', 'WRF_fcst_20250501_06z_066.nc', 'WRF_fcst_20250501_06z_069.nc', 'WRF_fcst_20250501_06z_072.nc', 'WRF_fcst_20250501_06z_075.nc', 'WRF_fcst_20250501_06z_078.nc', 'WRF_fcst_20250501_06z_081.nc', 'WRF_fcst_20250501_06z_084.nc', 'WRF_fcst_20250501_06z_087.nc', 'WRF_fcst_20250501_06z_090.nc', 'WRF_fc

## Problem 2

### 1. Problem Background

You just started grad school and inherited some code from the final chapter of a previous student's thesis. That last chapter was completed in about 1 month, and your advisor has said they want you to follow up on the student's precipitation analysis with a new version of the precipitation dataset that was just released. 

As you dig through their code to make it work, you realize it's not commented well at all. Additionally, the student is now at Goldman Sachs working on CliFi (climate finance) problems and, well, you can't afford their hourly rate to get some help! To get the data, you would need to modify the names of the files, because the revised dataset uses a different naming convention. The previous student's code to create a list of the names of files to download and analyze is below. 

### 2. Instructions

1. In your own words, what does the code do? Is it correct? 
2. Go through the old student's code and add comments describing what the code is doing. Comments should be thorough enough that they're helpful for you, but also the student that comes after you. 
3. Are there things you would change in the code for clarity (i.e., to make it more readable)? Make a list of these things, but do not implement them. 

### 3. Concepts Assessed:

- Peer review
- Commenting code
- String handling
- For loops
- Functions

In [None]:
def IsLeapYear(year): #return True if the input year is a leap year, otherwise False
    
    if (year % 4 == 0): #the year is evenly divisible by 4
        if (year % 500 == 0): #the year is also divisible by 500; NOTE: the Gregorian rule tests 400 here, so this misclassifies years such as 1500 and 1600
            return True #treated as a leap year
        elif (year % 100 == 0): #any other century year (divisible by 100)
            return False #is not a leap year
        else:
            return True #divisible by 4 but not by 100: a leap year
    else:
        return False #not divisible by 4: not a leap year
    
def ReturnFileList(file_base, file_ext, start_yr, end_yr): #function to create a list of filenames
    
    DiM = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31] #list of days per month for one year (days in month)

    year = start_yr #defining the starting year

    files = [] #initializing an empty list to store filenames
    
    for i in range(end_yr - start_yr + 1): #loop once per year from start_yr through end_yr (inclusive)
        for j in range(12): #loop through months; j runs 0-11, so j+1 is the month number
            if (j+1==2) and IsLeapYear(year): #if the month is February and the year is a leap year
                days_in_month = 29 #February has 29 days
            else:
                days_in_month = DiM[j] #otherwise use the standard number of days for this month
            
            for k in range(days_in_month): #loop through the days of the month; k+1 is the day number
                
                file_name = f'{file_base}-{year:4d}-{(j+1):02d}-{(k+1):02d}{file_ext}' #build the filename as base-YYYY-MM-DD plus extension
                files.append(file_name) #add the filename to the list
                
        year = year + 1 #advance to the next year before the next pass of the outer loop
    
    return files

In [33]:
StartYr = 1998
EndYr = 2000

file_list = ReturnFileList('precip','.nc',StartYr,EndYr)

print(str(len(file_list))+' files')
print(*file_list, sep='\n')


1096 files
precip-1998-01-01.nc
precip-1998-01-02.nc
precip-1998-01-03.nc
precip-1998-01-04.nc
precip-1998-01-05.nc
precip-1998-01-06.nc
precip-1998-01-07.nc
precip-1998-01-08.nc
precip-1998-01-09.nc
precip-1998-01-10.nc
precip-1998-01-11.nc
precip-1998-01-12.nc
precip-1998-01-13.nc
precip-1998-01-14.nc
precip-1998-01-15.nc
precip-1998-01-16.nc
precip-1998-01-17.nc
precip-1998-01-18.nc
precip-1998-01-19.nc
precip-1998-01-20.nc
precip-1998-01-21.nc
precip-1998-01-22.nc
precip-1998-01-23.nc
precip-1998-01-24.nc
precip-1998-01-25.nc
precip-1998-01-26.nc
precip-1998-01-27.nc
precip-1998-01-28.nc
precip-1998-01-29.nc
precip-1998-01-30.nc
precip-1998-01-31.nc
precip-1998-02-01.nc
precip-1998-02-02.nc
precip-1998-02-03.nc
precip-1998-02-04.nc
precip-1998-02-05.nc
precip-1998-02-06.nc
precip-1998-02-07.nc
precip-1998-02-08.nc
precip-1998-02-09.nc
precip-1998-02-10.nc
precip-1998-02-11.nc
precip-1998-02-12.nc
precip-1998-02-13.nc
precip-1998-02-14.nc
precip-1998-02-15.nc
precip-1998-02-16.nc
pr

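The first function determines whether the input year is a leap year, returning True if so and False otherwise. It is not fully correct: it tests divisibility by 500 where the Gregorian calendar rule uses 400, so it misclassifies century years such as 1500 and 1600. The 1998-2000 list above is unaffected because 2000 is divisible by both 400 and 500.
The second function creates a list of filenames (including date stamps) for each day from a starting year to an ending year, accounting for leap years by adding an additional day to February in each leap year. 
To make the code easier to understand, I would start each function with a description of its inputs and return value, and use more descriptive loop variables than i, j, and k.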
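One possible way to check the correctness question is to compare the inherited leap-year rule against Python's standard library. The sketch below is a cross-check only, not a rewrite of the assigned code: `inherited_is_leap_year` copies the rule from the code under review, `daily_file_list` is a hypothetical helper that builds the same kind of list from `datetime.date`, and the 1500-2100 range is an arbitrary choice for illustration.

```python
import calendar
from datetime import date, timedelta


def inherited_is_leap_year(year):
    """Copy of the leap-year rule from the inherited code, kept for comparison."""
    if year % 4 == 0:
        if year % 500 == 0:
            return True
        elif year % 100 == 0:
            return False
        return True
    return False


def daily_file_list(file_base, file_ext, start_yr, end_yr):
    """Hypothetical helper: one file name per calendar day, start_yr through end_yr."""
    files = []
    current = date(start_yr, 1, 1)
    last = date(end_yr, 12, 31)
    while current <= last:
        # isoformat() gives YYYY-MM-DD, matching the date stamp in the file names
        files.append(f"{file_base}-{current.isoformat()}{file_ext}")
        current += timedelta(days=1)
    return files


# Years where the inherited rule disagrees with calendar.isleap
mismatches = [
    y for y in range(1500, 2101)
    if inherited_is_leap_year(y) != calendar.isleap(y)
]
print(mismatches)  # [1500, 1600]

reference = daily_file_list("precip", ".nc", 1998, 2000)
print(len(reference))  # 1096
```

The reference list has the same length as the inherited output for 1998-2000, which is consistent with the rule only failing on century years outside that range.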