In [None]:
## Please make sure "project.py" is in your "lab5" folder.
## We will use the same API for both the lab and the project
import project
import datetime

----------------------------------
## Segment 2: Learning the API
### Task 2.1: Examine the `hurricanes` CSV file

The `project.py` file will allow you to access the dataset you'll use this week, `hurricanes.csv`. Start by looking at the hurricane dataset [here](https://github.com/msyamkumar/cs220-s22-projects/blob/main/lab-p5/hurricanes.csv) pulled from the [List of United States hurricanes](https://en.wikipedia.org/wiki/List_of_United_States_hurricanes) on Wikipedia.

Look at a hurricane in the dataset, such as Hurricane Baker, and briefly familiarize yourself with each of the columns. The data shows:
* name
* the date of formation
* the date of dissipation
* max wind speed (in MPH)
* damage (in US dollars)
* deaths

Often, we'll organize data by assigning numbers (called indexes) to different parts of the data (e.g., rows or columns in a table). In Computer Science, indexing typically starts with the number `0`; i.e., when you have a sequence of things, you'll start counting them from `0` instead of `1`. Thus, you should **ignore the numbers shown by GitHub to the left of the rows**. From the perspective of `project.py`, the indexes of Baker, Camille, and Eloise are 0, 1, and 2 respectively (and so on).

For example, consider this excerpt from `hurricanes.csv`:

<img src="https://github.com/msyamkumar/cs220-s22-projects/raw/main/lab-p5/table.png" width="240" alt="Hurricanes outlined with position and name: 1: Baker, 2: Camille, 3: Eloise, 4: Frederic, 5: Elena">

The **index** of Hurricane Eloise is 2, but its actual **location** in the table is 3.
Therefore, you must follow this convention for all the questions
asking for the value at a particular index.
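
To make the difference between index and location concrete, here is a small sketch that uses a plain Python list in place of the dataset. The `names` list is a hypothetical stand-in that holds the five hurricanes from the image above; it does not use `project.py`.

```python
# Hypothetical stand-in for the first five rows of hurricanes.csv
names = ["Baker", "Camille", "Eloise", "Frederic", "Elena"]

index = 2
location = index + 1  # location counts from 1, index counts from 0

print(names[index])  # Eloise
print(location)      # 3
```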


### Task 2.2: Explore the API
Use the inspection process we learned in Lab-P3 and Lab-P4 to learn more about the `project` API. In Lab-P4, we saw how to use `dir` and `help` to learn an API. Run the following cells to explore the API:

In [None]:
dir(project)

Spend some time reading about each of the six functions that don't begin with two underscores. For example, run this to learn about `count`:

In [None]:
help(project.count)

or alternatively, you could run the following to just see the function's documentation:

In [None]:
print(project.count.__doc__)
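
When the output of `dir` is long, it can help to hide the names that begin with two underscores. The sketch below is one way to do that; it uses the standard `math` module as a stand-in for `project`, and the variable name `public_names` is only illustrative.

```python
import math  # stand-in module; in the lab you would inspect `project` instead

# Keep only the names that do not start with two underscores
public_names = [name for name in dir(math) if not name.startswith("__")]
print(public_names)
```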

You may also open up the `project.py` file directly to learn about the functions provided. E.g., you might see this:

```python
def count():
    """This function will return the number of records in the dataset"""
    return len(__hurricane__)
```

You don't need to understand the code in the functions, but the strings in triple quotes (called *docstrings*) explain what each function does. As it turns out, all `project.count.__doc__` does is give you the docstring for the `count` function.
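
The same mechanism works for any function you write yourself. As a minimal sketch, the hypothetical function `double` below (not part of `project.py`) has a docstring that can be read back through `__doc__`:

```python
def double(x):
    """Return twice the value of x."""
    return 2 * x

# The docstring is stored on the function object itself
print(double.__doc__)  # Return twice the value of x.
```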

Try to learn about the other functions in `project.py` by using the `help` function. For example, you may try:


In [None]:
help(project.get_name)

In [None]:
# TODO: Try getting help for each of the functions.

Complete the following TODOs, and check your results against what you see in `hurricanes.csv`.

**Remember:** In Computer Science, we start indexing at 0. GitHub numbers the first data row as 2 (the row following the header row); ignore this.

In [None]:
# Get the name of the hurricane at index 0.
# This one is done for you.
project.get_name(0)

In [None]:
# TODO: Get the name of the hurricane at index 1.
# Your answer should be 'Camille'. Verify this in the CSV as well.


In [None]:
# Get the wind speed of the hurricane at index 2.
# This one is done for you.
project.get_mph(2)

In [None]:
# TODO: Get the wind speed of the hurricane at index 7.


In [None]:
# TODO: Get the damage of the hurricane at index 5.


Notice that the damage amount ends with a "B". In this dataset, "K" represents one thousand, "M" represents one million, and "B" represents one billion. For P5, you'll need to convert these strings to the appropriate ints (e.g., `"1.5K"` will become `1500`, `"2.55M"` will become `2550000`).

In [None]:
# Get the name of the hurricane at the end of the dataset.
# This one is done for you.
project.get_name(project.count() - 1)

In [None]:
# Try getting the name at project.count() instead. What happens? Why?


----------------------------------
## Segment 3: Working with strings

### Task 3.1: Indexing / slicing Strings

Stepping back from the Hurricane data, Tasks 3.1 and 3.2 introduce us to performing operations with strings. While this will be covered in more detail during Friday's lecture, we will cover the essentials now.

We can think of a string as a sequence of characters. For example, the string `my_str = 'hello_world!'` can be written as...

| Index  | 0    | 1    | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   |
| ------ | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| String | h    | e    | l    | l    | o    | _    | w    | o    | r    | l    | d    | !    |

... where we can then access specific characters of the string by an index, e.g. `my_str[0]` which returns `'h'` or `my_str[8]` which returns `'r'`.

Furthermore, we can "slice" strings -- that is, get a particular section of characters. For example,

- `my_str[1:5]` returns `'ello'`
- `my_str[:8]` returns `'hello_wo'`
- `my_str[5:]` returns `'_world!'`
- `my_str[:]` returns `'hello_world!'`

Try running this in the cell below.

In [None]:
my_str = 'hello_world!'
print("my_str[0] returns", my_str[0])
print("my_str[8] returns", my_str[8])
print("my_str[1:5] returns", my_str[1:5])
print("my_str[:8] returns", my_str[:8])
print("my_str[5:] returns", my_str[5:])
print("my_str[:] returns", my_str[:])

Notice that slicing is *inclusive* on the lower bound and *exclusive* on the upper bound. We can also leave out the lower bound to start from the beginning (e.g. `my_str[:6]`) or leave out the upper bound to go all the way to the end (e.g. `my_str[8:]`). Lastly, a negative index will count *backwards* from the *end* of the string.

In [None]:
print("my_str[-1] returns", my_str[-1])
print("my_str[-4:-1] returns", my_str[-4:-1])
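
One way to reason about negative indexes is that index `-k` refers to the same character as index `len(my_str) - k`. The sketch below checks that relationship on the same string; the variable `n` is introduced only for illustration.

```python
my_str = 'hello_world!'
n = len(my_str)  # 12 characters

# A negative index -k refers to the same character as index n - k
print(my_str[-1] == my_str[n - 1])           # True
print(my_str[-4:-1] == my_str[n - 4:n - 1])  # True
print(my_str[-4:-1])                         # rld
```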

**Your Turn!** Try slicing the below phone number! Can you extract the area code (first 3 digits), exchange code (middle 3 digits), and line number (last 4 digits) of the given phone number?

In [None]:
phone_number = "608-555-1234"
area_code = ???
exchange_code = ???
line_number = ???
print("area_code:", area_code)
print("exchange_code:", exchange_code)
print("line_number:", line_number)

In [None]:
# TODO: use slicing to extract just the last digit of the phone number
last_digit = ???
print("last digit:", last_digit)

### Task 3.2: Case-Sensitivity

Other helpful string functions include `upper` and `lower`. `upper` converts a string to all UPPERCASE letters, while `lower` converts a string to all lowercase letters.

In [None]:
print('helLO wOrLd'.upper())
print('helLO wOrLd'.lower())

If we want to see if the user typed in `cs220`, we should also accept `cS220`, `Cs220`, and `CS220`. We can use `upper` or `lower` to do that! 

In [None]:
my_class = input("What class are you in? ")
if my_class.upper() == "CS220": # notice we compare this to an uppercase CS220
    print("Right on!")
else:
    print("That must be some other class...")

**Your Turn!** Ask the user to type in their name. If their name matches your name, tell them so!  **This should be case-insensitive.**

In [None]:
my_name = input("What's your name? ")
if my_name.lower() == "franklin": # TODO: Check if they typed in your name! Use all lowercase.
    print("We share names!")
else:
    print("Oh well...")

### Task 3.3: Calculating Damage Costs

Task 2.2 showed us that damage costs are expressed in thousands, millions, and billions. It would be helpful to have code that converts such a string into an integer.

We can slice off the *last* character by using the slice `[:-1]` (that is, the entire string *up to* the last character).

Complete the code to print what the cost ends in (e.g. `K`, `M`, or `B`).

In [None]:
my_cost = "3.3B"
print("Cost Amount:", my_cost[:-1])
print("Ending In:  ", ???)

### Task 3.4: Extracting from a Date

Run the cell below, which prints the formation and dissipation dates of the first hurricane.

In [None]:
print(project.get_formed(0))
print(project.get_dissipated(0))

The dates are represented as a string in `mm/dd/yyyy` notation. Two digits are used to represent the month and day even when they can be represented with a single digit, that is, `'9/1/1950'` is represented as `'09/01/1950'`.

To extract the month, we could run the following code...

In [None]:
project.get_formed(0)[:2]

Notice, however, that this is the *string* `'08'`.

Write the code to get this as the *int* (e.g. `8`).
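
As a reminder of how the built-in `int` function treats digit strings, here is a small sketch on a made-up value (`"007"` is not from the dataset). Leading zeros are dropped in the conversion, and a string never compares equal to an int.

```python
text = "007"
number = int(text)   # leading zeros are ignored by int()

print(number)        # 7
print(type(number))  # <class 'int'>
print("08" == 8)     # False
```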

In [None]:
# TODO: Get the month of the first hurricane as an integer.

### Task 3.5: Helper Functions for Month, Day, and Year

The below functions will be useful in p5. Complete the TODOs for getting the month, day, and year as an int.

In [None]:
def get_month(date):
    """Returns the month when the date is in the 'mm/dd/yyyy' format"""
    return int(date[:2])

def get_day(date):
    """Returns the day when the date is in the 'mm/dd/yyyy' format"""
    pass  # TODO: Use string slicing to return the day

def get_year(date):
    """Returns the year when the date is in the 'mm/dd/yyyy' format"""
    pass  # TODO: Use string slicing to return the year

Write some test cases (e.g., `get_year("10/02/2022")`) to check if your functions are correct.
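
One possible style for such test cases is an `assert` statement, which stays silent when the check passes and raises an `AssertionError` when it fails. The sketch below tests only `get_month`, which is already complete above; the dates are made-up inputs, and printing the result and comparing it by eye works just as well.

```python
def get_month(date):
    """Returns the month when the date is in the 'mm/dd/yyyy' format"""
    return int(date[:2])

# Each assert passes silently if the function returns the expected int
assert get_month("10/02/2022") == 10
assert get_month("08/31/1950") == 8
print("get_month tests passed")
```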

In [None]:
# TODO Write a test case for get_month

# TODO Write a test case for get_day

# TODO Write a test case for get_year


### Task 3.6: Using Helper Functions

Using the helper functions you made above, complete the following...

**Hint:** You'll use these helper functions in combination with `project.get_formed(idx)` and `project.get_dissipated(idx)`!

Print the *day* that the hurricane at index `10` *formed*.

In [None]:
# TODO: Print the day that the hurricane at index 10 formed.
#       This should be 7
???(project.get_formed(???))

Print the *year* that the hurricane at index `7` *formed*.

In [None]:
# TODO: Print the year that the hurricane at index 7 formed.
#       This should be 2004

Print the *month* that the hurricane at index `2` *dissipated*.

In [None]:
# TODO: Print the month that the hurricane at index 2 dissipated.
#       This should be 9

----------------------------------
## Segment 4: Looping

### Task 4.1: `while` and `for` loops

Run the below code and observe the output.

In [None]:
i = 0
while i < 5:
    print(i)
    i += 1

Equivalently, we can use `for` and `range(n)`. The `range(n)` function returns a sequence of numbers, from `0` to `n` but not including `n`.

In [None]:
for i in range(5):
    print(i)
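
To see exactly which numbers `range(n)` produces, you can convert it to a list. This is a small illustrative sketch; the variable name `values` is arbitrary.

```python
values = list(range(5))
print(values)          # [0, 1, 2, 3, 4]
print(len(values))     # 5
print(list(range(0)))  # []
```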

Now, write the code that will print the numbers from 0 to 25 *inclusive* as both a `while` and `for` loop.

In [None]:
# TODO Write a while loop that prints the numbers from 0 to 25 inclusive.

In [None]:
# TODO Write a for loop that prints the numbers from 0 to 25 inclusive.

### Task 4.2: Print Hurricane Data

Print the index, name, and wind speed of each hurricane. Your output should show all the entries in the dataset.

In [None]:
# range takes in a number. We want to iterate over all entries.
# how do we get the number of entries without hardcoding it?
for idx in range(???):
    name = ???
    wind_speed = ???
    print(idx, name, wind_speed, sep='\t')

### Task 4.3: Filter Hurricanes by Speed

Print the names of all hurricanes with a speed under 80mph. There are 8 such hurricanes.

In [None]:
# TODO: Print the names of all hurricanes with a speed under 80mph.

### Task 4.4: Filter Hurricanes by Deaths

Print the names of all hurricanes with over 1000 deaths. There are 5 such hurricanes.

In [None]:
# TODO: Print the names of all hurricanes with over 1000 deaths.

### Task 4.5: Filter Hurricanes by Name

Print the names of all hurricanes that start with the letter "D". There are 12 such hurricanes, counting repeats.

In [None]:
# TODO: Print the names of all hurricanes that start with the letter "D".

### Task 4.6: Find the Fastest Hurricane

Print the name of the hurricane which has the fastest wind speed.

*Special Note*: `None` is a Python keyword which denotes *nothing*. At the beginning of this loop, by saying `fastest_hurr_idx = None`, we make no assumptions about what the fastest hurricane is. Inside the loop, if the `fastest_hurr_idx` is `None`, we know that is our first (and currently fastest) hurricane.

In [None]:
fastest_hurr_idx = None
max_speed = 0
for idx in range(???):
    current_speed = ???
    if fastest_hurr_idx is None or current_speed > max_speed:
        max_speed = ???
        fastest_hurr_idx = idx

if fastest_hurr_idx is not None:
    print(project.get_name(fastest_hurr_idx), 'had the fastest speed of', max_speed)
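
The `None` pattern described above is not specific to hurricanes. As a rough sketch, here it is applied to a plain list of made-up wind speeds (the `speeds` values and the `best_*` names are hypothetical and are not taken from `hurricanes.csv`):

```python
# Hypothetical wind speeds in MPH; not taken from hurricanes.csv
speeds = [85, 120, 75, 120, 100]

best_idx = None  # no candidate yet
best_speed = 0
for idx in range(len(speeds)):
    current = speeds[idx]
    # The first value always becomes the candidate; later ones must beat it
    if best_idx is None or current > best_speed:
        best_speed = current
        best_idx = idx

print(best_idx, best_speed)  # 1 120
```

Because the comparison uses `>` rather than `>=`, a tie keeps the earliest index.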

### Task 4.7: Find the Slowest Hurricane

Print the name of the hurricane which has the slowest wind speed.

In [None]:
slowest_hurr_idx = None
min_speed = 0
for idx in range(???):
    current_speed = ???
    if ??? or ???:
        min_speed = ???
        slowest_hurr_idx = ???

if slowest_hurr_idx is not None:
    print(project.get_name(slowest_hurr_idx), 'had the slowest speed of', min_speed)

### Task 4.8: Print Hurricanes Between

Given `start_year` and `end_year`, print the names of all hurricanes that *were formed* in between (inclusive).

In [None]:
def print_hurricanes_between(start_year, end_year):
    for i in range(project.count()):
        # TODO: Check if the year the hurricane formed is in range.
        # HINT: use get_year to get the year of the current hurricane
        if ???:
            print(project.get_name(i), "happened on", project.get_formed(i))

print_hurricanes_between(2017, 2021)

----------------------------------
## Segment 5: Working with the datetime module

The code below uses Python's [datetime module](https://docs.python.org/3/library/datetime.html), which will be used further in p5.

Execute the below function definition and its calls. It will calculate the number of days between 2 dates.

In [None]:
def get_number_of_days(start_date, end_date):
    """Gets the number of days between the start_date (in 'mm/dd/yyyy' format) and end_date 
    (in 'mm/dd/yyyy' format)"""
    # The second argument is a format string to tell the function how to process the date string
    day1 = datetime.datetime.strptime(start_date, '%m/%d/%Y') 
    day2 = datetime.datetime.strptime(end_date, '%m/%d/%Y')
    delta = day2 - day1
    return delta.days

In [None]:
print(get_number_of_days('02/21/2022', '02/23/2022'))
print(get_number_of_days('01/01/2021', '01/01/2022'))
print(get_number_of_days('04/20/2022', '08/12/2022'))

The function `get_number_of_days` uses the `datetime` module to calculate this for us in two steps:

1. Convert each date string into a `datetime` object (we'll talk about objects later in the semester) using `datetime.datetime.strptime`.

2. Subtract the objects `day2 - day1` and return the difference in days `delta.days`.
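
The two steps above can be looked at one at a time. This sketch reuses the first pair of dates from the calls above and prints the intermediate values so you can see what each step produces.

```python
import datetime

# Step 1: parse each 'mm/dd/yyyy' string into a datetime object
day1 = datetime.datetime.strptime('02/21/2022', '%m/%d/%Y')
day2 = datetime.datetime.strptime('02/23/2022', '%m/%d/%Y')
print(day1)          # 2022-02-21 00:00:00

# Step 2: subtracting two datetime objects gives a timedelta
delta = day2 - day1
print(type(delta))   # <class 'datetime.timedelta'>
print(delta.days)    # 2
```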

### Task 5.1: Calculating Hurricane Duration

We can calculate how long a hurricane lasts as the number of days between `project.get_formed(idx)` and `project.get_dissipated(idx)`. Complete the function to calculate this duration.

In [None]:
def get_hurricane_duration(hurricane_idx):
    # Calculate the duration between when the hurricane formed and dissipated.
    duration = get_number_of_days(???, ???)
    return duration

Test your code using the cells below. Hurricane Karen should last 11 days and Hurricane Cindy should last 6 days.

In [None]:
# Hurricane Karen
hurricane1_idx = 118
hurricane1_name = project.get_name(hurricane1_idx)
hurricane1_duration = get_hurricane_duration(hurricane1_idx)
print(hurricane1_name, 'lasts', hurricane1_duration, 'days.')

In [None]:
# Hurricane Cindy
hurricane2_idx = 90
hurricane2_name = project.get_name(hurricane2_idx)
hurricane2_duration = get_hurricane_duration(hurricane2_idx)
print(hurricane2_name, 'lasts', hurricane2_duration, 'days.')

### Task 5.2: Finding Hurricane with Longest Duration

Using an algorithm similar to Task 4.6 or 4.7, find the hurricane that has the longest duration.

In [None]:
# TODO: Use an algorithm similar to 4.6 or 4.7.
# HINT: Use the get_hurricane_duration function used in 5.1!

----------------------------------
You are now ready to work on [p5](https://github.com/msyamkumar/cs220-s22-projects/tree/main/p5)!
Remember to work on p5 only with your partner from this point on. Have fun!