# Project 1: Weather Analyzer

For this project, you are going to write code to analyze the high and low temperatures near Grand Rapids over the past 30 years. In this workspace, you should have two data files:

* `grr_high_low_94_24.csv.gz` contains the actual high and low temperatures measured at the airport from October 1994 through early February 2024.
* `alto_mi_data.csv.gz` contains estimated high and low temperatures in a city near Grand Rapids. This data is similar to, but not identical to, the data measured at the airport.

Both of these files have been compressed in `gzip` format to reduce the space consumed on the PrairieLearn servers. Please do not store uncompressed versions of these files in your PrairieLearn workspace. The file `grr_sample.csv` contains the first 500 lines of the data file uncompressed, so you can see the structure of the file.
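If you want to look at the compressed files themselves without writing an uncompressed copy to the workspace, one option is to read only the first few rows in memory. The sketch below assumes nothing about the column layout; the helper name `peek_rows` is hypothetical and not part of the project's starter code.

```python
import csv
import gzip
from itertools import islice

def peek_rows(filename: str, n: int = 5) -> list[list[str]]:
    """Return the first n CSV rows (header included) of a gzip-compressed file."""
    # 'rt' opens the compressed stream in text mode; nothing is written to disk.
    with gzip.open(filename, 'rt', newline='') as csv_file:
        # islice stops after n rows, so the rest of the file is never read.
        return list(islice(csv.reader(csv_file), n))
```

For example, `peek_rows('grr_high_low_94_24.csv.gz', 3)` would return the header row plus the first two data rows as lists of strings.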

**Important!** Take time to carefully check your work before submitting.
You are limited to *three* submissions per day. Do not use the PrairieLearn autograder
as a substitute for preparing your own test cases.

This three-submission-per-day limit is a policy, not a technical limit. In other words,
PrairieLearn will not prevent you from making additional submissions, but doing so may lower your score.


## Task 1

Some of the tasks below will require you to determine which of two dates comes earlier or later in the year (e.g., 14 April or 3 May). Complete the two functions below.

(Python does have a `datetime` module, but it assumes that every date has a year, which is not a valid assumption for the code you will write for this project. You are allowed to use `datetime` if you like, but
these functions can be implemented in five lines or fewer without it.)

In [9]:
#grade DO NOT REMOVE

# return True if m1/d1 comes earlier in the year than m2/d2, otherwise return False
# Do not use Python's datetime module

def is_date_before(m1: int, d1: int, m2: int, d2: int) -> bool:
    return d1 < d2 if m1 == m2 else m1 < m2


# return True if m1/d1 comes later in the year than m2/d2, otherwise return False
# Do not use Python's datetime module

def is_date_after(m1: int, d1: int, m2: int, d2: int) -> bool:
    return d1 > d2 if m1 == m2 else m1 > m2






In [10]:
# Use this block to test your code.
# (This block is not run by the auto-grader)
# Add more tests.
print(is_date_before(1, 1, 2, 1))
print(is_date_before(2, 5, 2, 17))
print(is_date_before(3, 4, 1, 15))


print(is_date_after(1, 1, 2, 1))
print(is_date_after(5, 3, 5, 6))
print(is_date_after(12, 1, 4, 13))



True
True
False
False
False
True
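
The test block asks for more tests. One possible approach, sketched here, is to check every pair of calendar dates against Python's built-in tuple ordering, since `(month, day)` tuples compare month first and then day. The two functions are repeated so the block runs on its own; `DAYS_IN_MONTH` and `ALL_DATES` are names introduced only for this illustration.

```python
def is_date_before(m1: int, d1: int, m2: int, d2: int) -> bool:
    return d1 < d2 if m1 == m2 else m1 < m2

def is_date_after(m1: int, d1: int, m2: int, d2: int) -> bool:
    return d1 > d2 if m1 == m2 else m1 > m2

# Days per month, using 29 for February so that leap days are covered.
DAYS_IN_MONTH = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
ALL_DATES = [(m, d)
             for m, days in enumerate(DAYS_IN_MONTH, start=1)
             for d in range(1, days + 1)]

# Tuple comparison serves as the reference answer for every pair of dates.
for a in ALL_DATES:
    for b in ALL_DATES:
        assert is_date_before(*a, *b) == (a < b)
        assert is_date_after(*a, *b) == (a > b)
print(f"checked {len(ALL_DATES) ** 2} date pairs")
```

Note that identical dates are included in the check, so both functions must return `False` when the two dates are equal.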


## Task 2

Write a function `all_low_above` that returns the list of all dates on which the low temperature was above the given threshold temperature. Dates should be strings in `mm/dd/yyyy` format.

The "starter code" in the function below shows how to read the csv data from a file that is compressed. You may either use this code as is or modify it if you like. 

**Important**: Please do not uncompress or modify the `.gz` files. The uncompressed files are large, and storing them would strain the PrairieLearn system.
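
The required `mm/dd/yyyy` format means month and day must be zero-padded. A minimal sketch of that step follows, assuming the raw date is a slash-separated month/day/year string; the helper name `normalize_date` is hypothetical.

```python
def normalize_date(raw: str) -> str:
    """Zero-pad a 'month/day/year' string to mm/dd/yyyy."""
    month, day, year = raw.split('/')
    # The :02d format spec pads single-digit months and days with a leading zero.
    return f"{int(month):02d}/{int(day):02d}/{int(year):04d}"

print(normalize_date('7/4/1999'))    # 07/04/1999
print(normalize_date('12/25/2001'))  # 12/25/2001
```

Dates that are already padded pass through unchanged, so the helper is safe to apply to every row.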

In [7]:
#grade DO NOT REMOVE
import gzip
import csv

def all_low_above(filename: str, threshold: int | float) -> list[str]:
  with gzip.open(filename, 'rt') as csv_file:
    f = csv.reader(csv_file)
    next(f, None) # skip the header row
    low_above_list = []
    for row in f:
        # row[5] holds the low temperature; skip rows where it is missing
        # rather than treating a blank as 0 degrees
        if row[5] and float(row[5]) > threshold:
            # row[2] holds the date; zero-pad month and day to get mm/dd/yyyy
            month, day, year = row[2].split('/')
            low_above_list.append(f"{int(month):02d}/{int(day):02d}/{year}")
    return low_above_list




In [8]:
# Use this block to test your code.
# (This block is not run by the auto-grader)
print(all_low_above('grr_high_low_94_24.csv.gz', 98))
print(all_low_above('grr_high_low_94_24.csv.gz', 80))
print(all_low_above('grr_high_low_94_24.csv.gz', 70))



[]
['07/14/1995']
['07/05/1995', '07/13/1995', '07/14/1995', '07/15/1995', '08/04/1995', '08/12/1995', '08/13/1995', '08/14/1995', '08/18/1995', '05/19/1996', '06/29/1996', '08/06/1996', '08/07/1996', '06/24/1997', '07/14/1997', '07/26/1997', '06/25/1998', '08/23/1998', '08/24/1998', '06/06/1999', '07/04/1999', '07/05/1999', '07/31/1999', '07/23/2001', '08/01/2001', '08/02/2001', '08/07/2001', '08/08/2001', '08/09/2001', '06/26/2002', '07/01/2002', '07/02/2002', '07/21/2002', '07/22/2002', '07/28/2002', '07/29/2002', '08/04/2002', '08/15/2003', '06/08/2004', '06/09/2004', '07/12/2004', '08/27/2004', '06/24/2005', '06/28/2005', '07/13/2005', '07/18/2005', '07/25/2005', '06/17/2006', '07/20/2006', '07/25/2006', '07/26/2006', '07/31/2006', '08/01/2006', '08/02/2006', '07/08/2007', '07/09/2007', '08/07/2007', '08/08/2007', '08/09/2007', '06/06/2008', '07/08/2008', '07/18/2008', '08/22/2008', '06/25/2009', '08/17/2009', '07/05/2010', '07/06/2010', '07/07/2010', '07/08/2010', '07/17/2010', '

## Task 3

Write a function `minima_for_day` that returns the list of years in which the lowest low temperature for the given month and day was observed. The return value is a list because that minimum may have been observed in several different years.
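
The tie-handling idea can be sketched on in-memory data before any file reading is involved: find the lowest value first, then collect every year that matches it. The function name `years_with_minimum` and the sample values are made up for illustration.

```python
def years_with_minimum(observations: list[tuple[int, float]]) -> list[int]:
    """Given (year, low) pairs for one calendar date, return the years tied for the lowest low."""
    if not observations:
        return []
    lowest = min(low for _, low in observations)
    # Keep every year whose low equals the minimum, preserving input order.
    return [year for year, low in observations if low == lowest]

sample = [(2001, 12), (2002, 9), (2003, 15), (2004, 9)]  # made-up values
print(years_with_minimum(sample))  # [2002, 2004]
```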

In [11]:
#grade DO NOT REMOVE
import gzip
import csv

def minima_for_day(filename: str, month: int, day: int) -> list[int]:
  with gzip.open(filename, 'rt') as csv_file:
    f = csv.reader(csv_file)
    next(f, None) # skip the header row
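    temp_cache = {} # maps each low temperature to the years it occurred on this date
    for row in f:
      # Compare month and day as integers; a substring test such as
      # "1/2" in row[2] would also match 1/23 and 11/2.
      m, d, year = (int(part) for part in row[2].split('/'))
      if (m, d) == (month, day):
        min_temp = int(row[5])
        temp_cache.setdefault(min_temp, []).append(year)
    # No matching rows (e.g. an invalid date) means there are no minima to report
    return temp_cache[min(temp_cache)] if temp_cache else []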

In [12]:
# Use this block to test your code.
# (This block is not run by the auto-grader)
print(minima_for_day('grr_high_low_94_24.csv.gz', 7, 12))
print(minima_for_day('grr_high_low_94_24.csv.gz', 2, 15))
print(minima_for_day('grr_high_low_94_24.csv.gz', 1, 23))


[2002]
[2015]
[2005, 2011]


## Task 4

Write a function `earliest_low_above` that returns the earliest calendar date (month and day), across all years, on which the low temperature was above `threshold` degrees. Dates should be strings in `mm/dd/yyyy` format.
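
"Earliest date in any year" means the year is ignored when comparing. One way to express that, shown here as a sketch on a made-up list of date strings, is to use `min` with a key that extracts only `(month, day)`. The name `earliest_calendar_date` is hypothetical.

```python
from typing import Optional

def earliest_calendar_date(dates: list[str]) -> Optional[str]:
    """Return the mm/dd/yyyy string whose (month, day) is earliest, ignoring the year."""
    def month_day(date: str) -> tuple[int, int]:
        month, day, _year = date.split('/')
        return int(month), int(day)
    # min() returns the first minimal element, so ties keep the earliest-listed date.
    return min(dates, key=month_day, default=None)

sample = ['07/14/1995', '06/30/2018', '06/30/2020', '08/01/1996']  # made-up input
print(earliest_calendar_date(sample))  # 06/30/2018
```

Returning `None` for an empty list is a choice made for this sketch only; the assignment does not say what should happen when no day qualifies.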



In [4]:
#grade DO NOT REMOVE
import gzip
import csv

def earliest_low_above(filename: str, threshold: int|float) -> str:
  with gzip.open(filename, 'rt') as csv_file:
    f = csv.reader(csv_file)
    next(f, None) # skip the header row
    # Sentinel date; it is returned unchanged if no row qualifies
    target = ['12', '31', '2100']
    for row in f:
        if float(row[5]) > threshold:
            current = row[2].split('/')
            # Compare (month, day) only; the strict < keeps the first year seen on ties
            if (int(current[0]), int(current[1])) < (int(target[0]), int(target[1])):
                target = current
    # Zero-pad month and day to produce mm/dd/yyyy
    return f"{int(target[0]):02d}/{int(target[1]):02d}/{target[2]}"



In [5]:
# Use this block to test your code.
# (This block is not run by the auto-grader)
print(earliest_low_above('grr_high_low_94_24.csv.gz', 80))
print(earliest_low_above('grr_high_low_94_24.csv.gz', 77))
print(earliest_low_above('grr_high_low_94_24.csv.gz', 43))
print(earliest_low_above('grr_high_low_94_24.csv.gz', 42))


07/14/1995
06/30/2018
01/05/2007
01/05/1998
