# Project 1: Weather Analyzer

For this project, you are going to write code to analyze the high and low temperatures near Grand Rapids over the past 30 years. In this workspace, you should have two data files:

* `grr_high_low_94_24.csv.gz` contains the actual high and low temperatures measured at the airport from October 1994 through early February 2024.
* `cassopolis_mi_data.csv.gz` contains estimated high and low temperatures in a city near Grand Rapids. This data is similar to, but not identical to, the data measured at the airport.

Both of these files have been compressed in `gzip` format to reduce the space consumed on the PrairieLearn servers. Please do not store uncompressed versions of these files in your PrairieLearn workspace. The file `grr_sample.csv` contains the first 500 lines of the data file uncompressed, so you can see the structure of the file.
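One way to look at the compressed data without storing an uncompressed copy is to stream it through `gzip.open` in text mode. The sketch below is illustrative only; the helper name `peek_rows` is hypothetical and not part of the project.

```python
import csv
import gzip
from itertools import islice


def peek_rows(filename: str, n: int = 5) -> list[list[str]]:
    """Return the first n CSV rows (header included) of a gzip-compressed file."""
    # 'rt' opens the archive in text mode, so rows are decompressed on the fly
    # and no uncompressed copy is written to disk.
    with gzip.open(filename, 'rt', newline='') as csv_file:
        return list(islice(csv.reader(csv_file), n))
```

Calling `peek_rows('grr_high_low_94_24.csv.gz', 3)` would return the header row and the first two data rows, which is enough to confirm which column holds which value.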

**Important!** Take time to carefully check your work before submitting. You are limited to *three* submissions per day. Do not use the PrairieLearn autograder as a substitute for preparing your own test cases.

This three-submission-per-day limit is a policy, not a technical limit: PrairieLearn will not prevent you from making additional submissions, but doing so may lower your score.


## Task 1

Some of the tasks below will require you to determine which of two dates comes earlier or later in the year (e.g., 14 April or 3 May). Complete the two functions below.

(Python does have a `datetime` module, but it assumes that every date has a year, which is not a valid assumption for the code you will write for this project. You may use `datetime` if you like, but these functions can be implemented in five lines or fewer without it.)
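Because Python compares tuples element by element, a month/day pair can be ordered without attaching a year. The following is a minimal sketch of that idea rather than the required solution; the names `comes_before` and `comes_after` are hypothetical.

```python
def comes_before(m1: int, d1: int, m2: int, d2: int) -> bool:
    # Tuples compare left to right: months first, then days if the months tie.
    return (m1, d1) < (m2, d2)


def comes_after(m1: int, d1: int, m2: int, d2: int) -> bool:
    return (m1, d1) > (m2, d2)


print(comes_before(4, 14, 5, 3))  # True: 14 April precedes 3 May
print(comes_after(4, 14, 4, 14))  # False: identical dates are neither before nor after
```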

In [4]:
#grade DO NOT REMOVE

# return True if m1/d1 comes earlier in the year than m2/d2, otherwise return False
# Do not use Python's datetime module

def is_date_before(m1: int, d1: int, m2: int, d2: int) -> bool:
    if m1 < m2:
        return True
    elif m1 == m2:
        if d1 < d2:
            return True
    return False
    
day_check = is_date_before(10, 25, 3, 26)
print(day_check)

# return True if m1/d1 comes later in the year than m2/d2, otherwise return False
# Do not use Python's datetime module

def is_date_after(m1: int, d1: int, m2: int, d2: int) -> bool:
    if m1 > m2:
        return True
    elif m1 == m2:
        if d1 > d2:
            return True
    return False



False


In [5]:
day_check = is_date_after(10, 27, 10, 26)
print(day_check)

True


In [6]:
# Use this block to test your code.
# (This block is not run by the auto-grader)
# Add more tests.
print(is_date_before(1, 1, 2, 1))

print(is_date_after(1, 1, 2, 1))


True
False


## Task 2

Write a function `all_low_above` that returns the list of all dates on which the low temperature was above the given threshold temperature. Dates should be strings in `mm/dd/yyyy` format.

The "starter code" in the function below shows how to read the csv data from a file that is compressed. You may either use this code as is or modify it if you like. 

**Important**: Please do not uncompress or modify the `.gz` files. The uncompressed files are large, and storing them would strain the PrairieLearn system.
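The `mm/dd/yyyy` format that Task 2 asks for needs two-digit months and days. Assuming the date column holds values such as `7/4/1999` (an assumption; check `grr_sample.csv`), the zero-padding could be sketched as follows. The helper name `pad_date` is hypothetical.

```python
def pad_date(raw: str) -> str:
    """Convert 'm/d/yyyy' to zero-padded 'mm/dd/yyyy'."""
    month, day, year = raw.split('/')
    # :02d pads single-digit numbers with a leading zero.
    return f"{int(month):02d}/{int(day):02d}/{year}"


print(pad_date('7/4/1999'))    # 07/04/1999
print(pad_date('10/25/2012'))  # 10/25/2012
```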

In [16]:
#grade DO NOT REMOVE
import gzip
import csv

def all_low_above(filename: str, threshold: int | float) -> list[str]:
    with gzip.open(filename, 'rt') as csv_file:
        f = csv.reader(csv_file)
        next(f, None)  # skip the header row
        dates = []
        for row in f:
            low_temp = int(row[5])  # the low temperature is read from column 5
            if low_temp > threshold:
                # the date is read from column 2; zero-pad it to mm/dd/yyyy
                month, day, year = row[2].split('/')
                dates.append(f"{int(month):02d}/{int(day):02d}/{year}")
        return dates



In [17]:
# Use this block to test your code.
# (This block is not run by the auto-grader)
# Add more tests.
print(all_low_above('grr_high_low_94_24.csv.gz', 76))

['07/13/1995', '07/14/1995', '06/25/1998', '07/04/1999', '08/01/2006', '08/02/2006', '06/19/2012', '07/04/2012', '07/07/2012', '07/17/2012', '07/19/2013', '09/10/2013', '06/30/2018', '07/19/2019', '06/15/2022']


## Task 3

Write a function `maxima_for_day` that returns the list of years in which the maximum temperature for the given month and day was observed. The return value is a list because the maximum may have been observed in several different years.
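The tie handling can be illustrated on a small in-memory list before touching the data file. This is a sketch under stated assumptions: the function name `years_with_max` and the sample readings are invented for illustration and do not come from the project data.

```python
def years_with_max(readings: list[tuple[int, int]]) -> list[int]:
    """Return every year whose reading equals the largest reading."""
    best = float('-inf')
    years: list[int] = []
    for year, temp in readings:
        if temp > best:
            # Strictly larger value: start a fresh list of years.
            best = temp
            years = [year]
        elif temp == best:
            # Tie with the current maximum: keep both years.
            years.append(year)
    return years


# Made-up (year, temperature) pairs for one calendar date.
sample = [(1995, 40), (2006, 55), (2010, 55), (2015, 31)]
print(years_with_max(sample))  # [2006, 2010]
```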

In [14]:
#grade DO NOT REMOVE
import gzip
import csv

def maxima_for_day(filename: str, month: int, day: int) -> list[int]:
    with gzip.open(filename, 'rt') as csv_file:
        f = csv.reader(csv_file)
        next(f, None)  # skip the header row
        max_value = float('-inf')
        years = []

        for row in f:
            month_row, day_row, year_row = (int(part) for part in row[2].split('/'))
            if month_row == month and day_row == day:
                # NOTE: column 5 is the column Task 2 reads as the low temperature.
                # If "maximum" means the daily high, read the high-temperature column instead.
                temp = int(row[5])
                if temp > max_value:
                    # new maximum: restart the list of years
                    max_value = temp
                    years = [year_row]
                elif temp == max_value:
                    # tie: this year shares the maximum
                    years.append(year_row)
        return years


In [15]:
maxima_for_day('grr_high_low_94_24.csv.gz', 1, 1)

[2006]

In [None]:
# Use this block to test your code.
# (This block is not run by the auto-grader)
# Add your own tests.

## Task 4

Write a function `earliest_low_below` that returns the earliest date in any year on which the low temperature was below `threshold` degrees. Dates should be strings in `mm/dd/yyyy` format.

**Only consider dates on or after 1 July.**


In [11]:
#grade DO NOT REMOVE
import gzip
import csv

def earliest_low_below(filename: str, threshold: int | float) -> str:
    with gzip.open(filename, 'rt') as csv_file:
        f = csv.reader(csv_file)
        next(f, None)  # skip the header row
        earliest = ''  # earliest qualifying date so far, as mm/dd/yyyy

        for row in f:
            month, day, year = (int(part) for part in row[2].split('/'))
            # NOTE: Task 2 reads the low temperature from column 5, not column 4;
            # check grr_sample.csv to confirm which column holds the low.
            if int(row[4]) < threshold and month >= 7:  # on or after 1 July
                date = f"{month:02d}/{day:02d}/{year}"
                # zero-padded mm/dd/yyyy strings sort by month, then day, then year
                if earliest == '' or date < earliest:
                    earliest = date
        return earliest

In [12]:
print(earliest_low_below('grr_high_low_94_24.csv.gz', 32))

11/10/2017


In [None]:
# Use this block to test your code.
# (This block is not run by the auto-grader)
# Add your own tests.