# GEOS 505: Problem Set 1

__Instructions__: Complete the two problems below. The first is a coding problem; in the second, you will review and comment existing code. Where you are asked to provide descriptive text and answer questions, please do so in well-formatted Markdown cells below the problem.

__Due Date and Time__: September 19, 2025 at 5:00 PM MT

__Turn In Via__: Commit and push your completed notebook to your personal GitHub repository for the class, and submit the URL for the notebook via Canvas.

## Problem 1

### 1. Problem Background:

As a new graduate student, you are examining the performance of 5-day precipitation forecasts at a number of SNOTEL sites in the Upper Boise River Basin. The forecasts are generated by a research center at a university in the western United States using the [Weather Research and Forecasting](https://www2.mmm.ucar.edu/wrf/users/) (WRF) model. The center runs a 360 hr (i.e., 15 day) WRF forecast for the entire western United States every day at the top of every hour, and puts the output you need in a freely available Amazon Web Services (AWS) S3 bucket.

File pattern: `WRF_fcst_YYYYMMDD_HHz_VVV.nc`

Where:
- YYYY is the year of the forecast date
- MM is the month of the forecast date
- DD is the day of the forecast date
- HH is the hour the forecast was initiated (0-23)
- VVV is the valid hour of the forecast from the time of initiation in hours (0-360 hr in 3 hr increments). So 000 is the initial hour of the forecast, 003 would be the 3rd hour, 006 the 6th, and so forth.
- 'WRF_fcst_' is the prefix of the file name
- '.nc' is the file extension (this stands for NetCDF file - a real file format)
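
As a minimal sketch of how this pattern maps onto Python string formatting, the snippet below builds a single file name with an f-string. The variable names and values are illustrative assumptions, not part of the assignment.

```python
# Illustrative values (assumptions chosen for this example only)
year, month, day = 2025, 3, 1
init_hour = 6    # HH: hour the forecast was initiated
valid_hour = 3   # VVV: hours since initiation

# :02d zero-pads to two digits, :03d zero-pads to three digits
file_name = f"WRF_fcst_{year}{month:02d}{day:02d}_{init_hour:02d}z_{valid_hour:03d}.nc"
print(file_name)  # WRF_fcst_20250301_06z_003.nc
```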

### 2. Instructions: 

1. Write a function that takes as input the forecast year, month, day, and hour, as well as a maximum valid hour, and returns a list of file names that contain the forecast data.
2. Test your code for three days (2025-03-01, 2025-04-01, and 2025-05-01), with forecast hour 6 and a maximum valid hour of 120. Print the returned list to the screen and verify it manually.
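
For the manual check in step 2, it may help to know how many file names to expect. The sketch below assumes valid hours start at 0 and advance in 3 hr steps up to and including the maximum; the variable names are illustrative.

```python
max_valid_hr = 120  # maximum valid hour from the instructions
step_hr = 3         # spacing between valid hours

# range() excludes its stop value, so add one step to include max_valid_hr
valid_hours = list(range(0, max_valid_hr + step_hr, step_hr))

print(len(valid_hours))                 # 41
print(valid_hours[0], valid_hours[-1])  # 0 120
```

Under these assumptions, each test date should produce 41 file names.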

### 3. Concepts Assessed:

- String handling
- For loops
- Basic functions

In [1]:
def FormatFileName(file_prefix, file_ext, year, mon, day, fcst_hour, beg_valid_hr, end_valid_hr):
    """Return the list of forecast file names for one forecast initiation time.

    Names follow <file_prefix>YYYYMMDD_HHz_VVV<file_ext>, with VVV running
    from beg_valid_hr through end_valid_hr (inclusive) in 3 hr steps.
    """
    # Error trap: the following assertions must pass
    assert beg_valid_hr % 3 == 0, "Error, beg_valid_hr must be a multiple of 3"
    assert end_valid_hr % 3 == 0, "Error, end_valid_hr must be a multiple of 3"
    assert (fcst_hour % 6 == 0) and (fcst_hour < 24) and (fcst_hour >= 0), "Forecast hour must be 0, 6, 12, or 18"

    # Part of the name shared by every file from this forecast
    fcst_filebase = f'{file_prefix}{year}{mon:02d}{day:02d}_{fcst_hour:02d}z_'

    # Append one file name per valid hour, stepping by 3 hr
    fcst_filenames = []
    valid_hr = beg_valid_hr
    while valid_hr <= end_valid_hr:
        fcst_filenames.append(f'{fcst_filebase}{valid_hr:03d}{file_ext}')
        valid_hr += 3

    return fcst_filenames
        
        

In [4]:
fcst_prefix = 'WRF_fcst_'
fcst_ext = '.nc'

test_filenames = FormatFileName(fcst_prefix,fcst_ext,2025,3,1,6,0,120)
print(*test_filenames, sep='\n')

In [None]:
test_filenames2 = FormatFileName(fcst_prefix,fcst_ext,2025,4,1,6,0,120)
print(*test_filenames2, sep='\n')

In [None]:
test_filenames3 = FormatFileName(fcst_prefix,fcst_ext,2025,5,1,6,0,120)
print(*test_filenames3, sep='\n')
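
Manual verification can be supplemented with a programmatic check. The sketch below is one possible approach, not part of the assignment: it uses a regular expression to confirm that each name matches the file pattern and that valid hours advance by 3. The names `FILE_RE` and `check_names` are hypothetical, and the sample list is built inline so the snippet runs on its own.

```python
import re

# Pattern for WRF_fcst_YYYYMMDD_HHz_VVV.nc; the group names are illustrative
FILE_RE = re.compile(r"WRF_fcst_(?P<date>\d{8})_(?P<init>\d{2})z_(?P<valid>\d{3})\.nc")

def check_names(names, step_hr=3):
    """Return True if every name matches the pattern and valid hours advance by step_hr."""
    matches = [FILE_RE.fullmatch(n) for n in names]
    if not all(matches):
        return False
    valid = [int(m.group("valid")) for m in matches]
    return all(b - a == step_hr for a, b in zip(valid, valid[1:]))

# Stand-in names built directly here so the example is self-contained
sample = [f"WRF_fcst_20250401_06z_{h:03d}.nc" for h in range(0, 121, 3)]
print(check_names(sample))                           # True
print(check_names(["WRF_fcst_2025041_06z_000.nc"]))  # False (date has only 7 digits)
```

The same helper could be applied to a returned list, for example `check_names(test_filenames2)`.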

## Problem 2

### 1. Problem Background

You just started grad school and inherited some code from the final chapter of a previous student's thesis. That last chapter was completed in about one month, and your advisor wants you to follow up on the student's precipitation analysis using a new version of the precipitation dataset that was just released.

As you dig through their code to make it work, you realize it's not commented well at all. Additionally, the student is now working at Goldman Sachs on CliFi (climate finance) problems and, well, you can't afford their hourly rate to get some help! To get the data, you need to modify the file names, because the revised dataset uses a different naming convention. The previous student's code to create a list of the names of files to download and analyze is below.

### 2. Instructions

1. In your own words, what does the code do? Is it correct? 
2. Go through the old student's code and add comments describing what the code is doing. Comments should be thorough enough that they're helpful for you, but also the student that comes after you. 
3. Are there changes you would make to the code for clarity (i.e., to make it more readable)? Make a list of these changes, but do not implement them.

### 3. Concepts Assessed:

- Peer review
- Commenting code
- String handling
- For loops
- Functions

In [None]:
# The function below takes a year as input and returns a boolean
# (True or False) indicating whether it is a leap year
def IsLeapYear(year):
    
    if (year % 4 == 0): # The year is divisible by 4
        if (year % 500 == 0): # BUG: tests divisibility by 500; the Gregorian rule uses 400
            return True
        elif (year % 100 == 0): # Divisible by 100 (and not caught above): NOT a leap year
            return False
        else: # Divisible by 4 but not by 100: a leap year
            return True
    else: # Not divisible by 4: not a leap year
        return False
    
# The function below takes as input a file base name, file extension, start
# year, and end year, and returns a list of daily file names (one per calendar
# day) from January 1 of the start year through December 31 of the end year
def ReturnFileList(file_base, file_ext, start_yr, end_yr):
    
    # Number of days in each month (January to December) of a non-leap year
    DiM = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

    # Set year equal to start year
    year = start_yr

    # Create an empty list to store the file names in
    files = []
    
    # Loop through the years 
    for i in range(end_yr - start_yr + 1):
        # Loop through the months
        for j in range(12):
            # If the month is February and it's a leap year, 
            # there are 29 days in the month
            if (j+1==2) and IsLeapYear(year):
                days_in_month = 29
            else:
                days_in_month = DiM[j]
            
            # Loop through the days in month
            for k in range(days_in_month):
                
                # Construct file name based on file base, year, month, day, and extension
                file_name = f'{file_base}-{year:4d}-{(j+1):02d}-{(k+1):02d}{file_ext}'
                
                # Append the file name to the list of files
                files.append(file_name)
                
        # Increment the year
        year = year + 1
    
    # Return the list
    return files
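
One way to approach the "Is it correct?" question is to compare the leap-year logic against the standard library. In the sketch below, the inherited logic is copied into a hypothetical helper, `leap_year_as_written`, so that the snippet runs on its own; `calendar.isleap` serves as the reference, and the range of years tested is an arbitrary choice.

```python
import calendar

def leap_year_as_written(year):
    """Copy of the inherited leap-year logic, reproduced so this example is self-contained."""
    if year % 4 == 0:
        if year % 500 == 0:
            return True
        elif year % 100 == 0:
            return False
        else:
            return True
    return False

# Years where the inherited logic disagrees with the standard library
mismatches = [y for y in range(1582, 2401)
              if leap_year_as_written(y) != calendar.isleap(y)]
print(mismatches)  # [1600, 2400]
```

Over this range the two agree for 1998 through 2000, so the test years used in this notebook would not expose the difference.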

In [None]:
StartYr = 1998
EndYr = 2000

file_list = ReturnFileList('precip','.nc',StartYr,EndYr)

print(str(len(file_list))+' files')
print(*file_list, sep='\n')
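
As a cross-check on the printed list, the same names could be generated by stepping through calendar days with `datetime`, which handles month lengths and leap years itself. This is a sketch of an alternative approach, not the previous student's method; `file_list_by_date` is a hypothetical name.

```python
from datetime import date, timedelta

def file_list_by_date(file_base, file_ext, start_yr, end_yr):
    """Hypothetical alternative: build one file name per calendar day using datetime."""
    names = []
    current = date(start_yr, 1, 1)
    last = date(end_yr, 12, 31)
    while current <= last:
        # Date objects accept strftime codes inside an f-string format spec
        names.append(f"{file_base}-{current:%Y-%m-%d}{file_ext}")
        current += timedelta(days=1)
    return names

check = file_list_by_date("precip", ".nc", 1998, 2000)
print(len(check))           # 1096 (365 + 365 + 366)
print(check[0], check[-1])  # precip-1998-01-01.nc precip-2000-12-31.nc
```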


Your Markdown answers go here...