# Hurricane Analysis Codecademy Project
This is my code/solution for Codecademy's Hurricane Analysis Project, which is part of the Data Scientist track.

Project goal: "To write several functions that organize and manipulate data about Category 5 Hurricanes, the strongest hurricanes as rated by their wind speed. Each one of these functions will use a number of parameters, conditionals, lists, dictionaries, string manipulation, and return statements."

In [1]:
# names of hurricanes
names = ['Cuba I', 'San Felipe II Okeechobee', 'Bahamas', 'Cuba II', 'CubaBrownsville', 'Tampico', 'Labor Day', 'New England', 'Carol', 'Janet', 'Carla', 'Hattie', 'Beulah', 'Camille', 'Edith', 'Anita', 'David', 'Allen', 'Gilbert', 'Hugo', 'Andrew', 'Mitch', 'Isabel', 'Ivan', 'Emily', 'Katrina', 'Rita', 'Wilma', 'Dean', 'Felix', 'Matthew', 'Irma', 'Maria', 'Michael']

# months of hurricanes
months = ['October', 'September', 'September', 'November', 'August', 'September', 'September', 'September', 'September', 'September', 'September', 'October', 'September', 'August', 'September', 'September', 'August', 'August', 'September', 'September', 'August', 'October', 'September', 'September', 'July', 'August', 'September', 'October', 'August', 'September', 'October', 'September', 'September', 'October']

# years of hurricanes
years = [1924, 1928, 1932, 1932, 1933, 1933, 1935, 1938, 1953, 1955, 1961, 1961, 1967, 1969, 1971, 1977, 1979, 1980, 1988, 1989, 1992, 1998, 2003, 2004, 2005, 2005, 2005, 2005, 2007, 2007, 2016, 2017, 2017, 2018]

# maximum sustained winds (mph) of hurricanes
max_sustained_winds = [165, 160, 160, 175, 160, 160, 185, 160, 160, 175, 175, 160, 160, 175, 160, 175, 175, 190, 185, 160, 175, 180, 165, 165, 160, 175, 180, 185, 175, 175, 165, 180, 175, 160]

# areas affected by each hurricane
areas_affected = [['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], ['Lesser Antilles', 'The Bahamas', 'United States East Coast', 'Atlantic Canada'], ['The Bahamas', 'Northeastern United States'], ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], ['The Bahamas', 'Cuba', 'Florida', 'Texas', 'Tamaulipas'], ['Jamaica', 'Yucatn Peninsula'], ['The Bahamas', 'Florida', 'Georgia', 'The Carolinas', 'Virginia'], ['Southeastern United States', 'Northeastern United States', 'Southwestern Quebec'], ['Bermuda', 'New England', 'Atlantic Canada'], ['Lesser Antilles', 'Central America'], ['Texas', 'Louisiana', 'Midwestern United States'], ['Central America'], ['The Caribbean', 'Mexico', 'Texas'], ['Cuba', 'United States Gulf Coast'], ['The Caribbean', 'Central America', 'Mexico', 'United States Gulf Coast'], ['Mexico'], ['The Caribbean', 'United States East coast'], ['The Caribbean', 'Yucatn Peninsula', 'Mexico', 'South Texas'], ['Jamaica', 'Venezuela', 'Central America', 'Hispaniola', 'Mexico'], ['The Caribbean', 'United States East Coast'], ['The Bahamas', 'Florida', 'United States Gulf Coast'], ['Central America', 'Yucatn Peninsula', 'South Florida'], ['Greater Antilles', 'Bahamas', 'Eastern United States', 'Ontario'], ['The Caribbean', 'Venezuela', 'United States Gulf Coast'], ['Windward Islands', 'Jamaica', 'Mexico', 'Texas'], ['Bahamas', 'United States Gulf Coast'], ['Cuba', 'United States Gulf Coast'], ['Greater Antilles', 'Central America', 'Florida'], ['The Caribbean', 'Central America'], ['Nicaragua', 'Honduras'], ['Antilles', 'Venezuela', 'Colombia', 'United States East Coast', 'Atlantic Canada'], ['Cape Verde', 'The Caribbean', 'British Virgin Islands', 'U.S. Virgin Islands', 'Cuba', 'Florida'], ['Lesser Antilles', 'Virgin Islands', 'Puerto Rico', 'Dominican Republic', 'Turks and Caicos Islands'], ['Central America', 'United States Gulf Coast (especially Florida Panhandle)']]

# damages (USD($)) of hurricanes
damages = ['Damages not recorded', '100M', 'Damages not recorded', '40M', '27.9M', '5M', 'Damages not recorded', '306M', '2M', '65.8M', '326M', '60.3M', '208M', '1.42B', '25.4M', 'Damages not recorded', '1.54B', '1.24B', '7.1B', '10B', '26.5B', '6.2B', '5.37B', '23.3B', '1.01B', '125B', '12B', '29.4B', '1.76B', '720M', '15.1B', '64.8B', '91.6B', '25.1B']

# deaths for each hurricane
deaths = [90,4000,16,3103,179,184,408,682,5,1023,43,319,688,259,37,11,2068,269,318,107,65,19325,51,124,17,1836,125,87,45,133,603,138,3057,74]

conversion = {"M": 1000000,
              "B": 1000000000}

## Project Requirements

Hurricanes, also known as cyclones or typhoons, are among the most powerful forces of nature on Earth. Due to climate change caused by human activity, the number and intensity of hurricanes have risen, calling for better preparation by the many communities that are devastated by them. As a concerned environmentalist, you want to look at data about the most powerful hurricanes that have occurred.

Begin by looking at the damages list. The list contains strings representing the total cost in USD($) caused by 34 category 5 hurricanes (wind speeds ≥ 157 mph, or 252 km/h) in the Atlantic region. For some of the hurricanes, damage data was not recorded ("Damages not recorded"), while the rest are written in the format "Prefix-B/M": a numeric prefix followed by B for billions (1000000000) or M for millions (1000000), as in "1.42B" or "100M".

Write a function that returns a new list of updated damages where the recorded data is converted to float values and the missing data is retained as "Damages not recorded".

Test your function with the data stored in damages.

In [2]:
# Function to convert damages list to float values

def convert_to_float(lst):
    fixed_list = []
    for item in lst:
        fixed_item = None  # stays None if the entry matches none of the cases below
        if item == 'Damages not recorded':
            fixed_item = item  # keep the placeholder for missing data
        elif 'B' in item:
            fixed_item = float(item.strip('B')) * conversion.get('B')
        elif 'M' in item:
            fixed_item = float(item.strip('M')) * conversion.get('M')
        fixed_list.append(fixed_item)
    return fixed_list

# Testing the function. Should be similar to damages but with float values
# Note: this rebinds damages, so run this cell only once. A second run would
# pass floats to the function and raise a TypeError.

damages = convert_to_float(damages)
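
To see the conversion rule in isolation, here is a minimal sketch that reads the final character of each entry as the unit suffix. The helper name `parse_damage` and the three-item `sample` list are illustrative assumptions, not part of the project code.

```python
# Multipliers for the unit suffixes used in the damages list
conversion = {"M": 1000000, "B": 1000000000}

def parse_damage(entry):
    """Hypothetical helper: convert one entry such as '1.42B' to a float."""
    suffix = entry[-1]
    if suffix in conversion:
        # Everything before the suffix is the numeric prefix
        return float(entry[:-1]) * conversion[suffix]
    # Entries without a known suffix (e.g. 'Damages not recorded') are returned unchanged
    return entry

sample = ['Damages not recorded', '100M', '1.42B']
converted = [parse_damage(entry) for entry in sample]
print(converted)
```

Looking up the multiplier by suffix means that supporting another unit would only require a new entry in `conversion`.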

Additional data collected on the 34 strongest Atlantic hurricanes are provided in a series of lists. The data includes:

- names: names of the hurricanes
- months: months in which the hurricanes occurred
- years: years in which the hurricanes occurred
- max_sustained_winds: maximum sustained winds (miles per hour) of the hurricanes
- areas_affected: list of different areas affected by each of the hurricanes
- deaths: total number of deaths caused by each of the hurricanes

The data is organized such that the data at each index, from 0 to 33, corresponds to the same hurricane.

For example, names[0] yields the “Cuba I” hurricane, which occurred in months[0] (October) years[0] (1924).
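
One way to see this index alignment is with `zip`, which pairs up the items found at the same position in each list. The sketch below uses only the first three entries of `names`, `months` and `years`; the `records` name is just for illustration.

```python
# First three entries of the project lists, copied here so the snippet runs on its own
names = ['Cuba I', 'San Felipe II Okeechobee', 'Bahamas']
months = ['October', 'September', 'September']
years = [1924, 1928, 1932]

# zip yields one tuple per index, so each tuple describes a single hurricane
records = list(zip(names, months, years))

for name, month, year in records:
    print('{} occurred in {} {}'.format(name, month, year))
```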

Write a function that constructs a dictionary made out of the lists, where the keys of the dictionary are the names of the hurricanes, and the values are dictionaries themselves containing a key for each piece of data (Name, Month, Year, Max Sustained Wind, Areas Affected, Damage, Death) about the hurricane.

Thus the key "Cuba I" would have the value: {'Name': 'Cuba I', 'Month': 'October', 'Year': 1924, 'Max Sustained Wind': 165, 'Areas Affected': ['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], 'Damage': 'Damages not recorded', 'Deaths': 90}.

Test your function on the lists of data provided.

In [3]:
# Function to create dictionary

def create_hurricane_dict(names, months, years, max_sustained_winds, areas_affected, damages, deaths):
    hurricane_data = {}
    # zip pairs up the items at each index, so every tuple describes one hurricane
    to_unpack = zip(names, months, years, max_sustained_winds, areas_affected, damages, deaths)
    for name, month, year, wind, area, damage, death in to_unpack:
        hurricane_data[name] = {'Name': name, 'Month': month, 'Year': year, 'Max Sustained Wind': wind, 'Areas Affected': area, 'Damages': damage, 'Deaths': death}
    return hurricane_data

# Testing the function
# The result gets its own name so that it does not overwrite the function

hurricane_data_by_name = create_hurricane_dict(names, months, years, max_sustained_winds, areas_affected, damages, deaths)

print(hurricane_data_by_name.get('Cuba I'))

{'Name': 'Cuba I', 'Month': 'October', 'Year': 1924, 'Max Sustained Wind': 165, 'Areas Affected': ['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], 'Damages': 'Damages not recorded', 'Deaths': 90}


In addition to organizing the hurricanes in a dictionary with names as the key, you want to be able to organize the hurricanes by year.

Write a function that converts the current dictionary of hurricanes to a new dictionary, where the keys are years and the values are lists containing a dictionary for each hurricane that occurred in that year.

For example, the key 1932 would yield the value: [{'Name': 'Bahamas', 'Month': 'September', 'Year': 1932, 'Max Sustained Wind': 160, 'Areas Affected': ['The Bahamas', 'Northeastern United States'], 'Damage': 'Damages not recorded', 'Deaths': 16}, {'Name': 'Cuba II', 'Month': 'November', 'Year': 1932, 'Max Sustained Wind': 175, 'Areas Affected': ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], 'Damage': 40000000.0, 'Deaths': 3103}].

Test your function on your hurricane dictionary.

In [4]:
# Creating the function

def group_hurricanes_by_year(names, months, years, max_sustained_winds, areas_affected, damages, deaths):
    hurricane_data = {}
    to_unpack = zip(names, months, years, max_sustained_winds, areas_affected, damages, deaths)
    for name, month, year, wind, area, damage, death in to_unpack:
        hurricane_in_year = {'Name': name, 'Month': month, 'Year': year, 'Max Sustained Wind': wind, 'Areas Affected': area, 'Damages': damage, 'Deaths': death}
        # setdefault creates an empty list the first time a year is seen, then we append to it
        hurricane_data.setdefault(year, []).append(hurricane_in_year)
    return hurricane_data

# The result gets its own name so that it does not overwrite the function
hurricane_data_by_year = group_hurricanes_by_year(names, months, years, max_sustained_winds, areas_affected, damages, deaths)
print(hurricane_data_by_year.get(1932))
        

[{'Name': 'Bahamas', 'Month': 'September', 'Year': 1932, 'Max Sustained Wind': 160, 'Areas Affected': ['The Bahamas', 'Northeastern United States'], 'Damages': 'Damages not recorded', 'Deaths': 16}, {'Name': 'Cuba II', 'Month': 'November', 'Year': 1932, 'Max Sustained Wind': 175, 'Areas Affected': ['Lesser Antilles', 'Jamaica', 'Cayman Islands', 'Cuba', 'The Bahamas', 'Bermuda'], 'Damages': 40000000.0, 'Deaths': 3103}]
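
The prompt asks for a conversion from the name-keyed dictionary, whereas the function above rebuilds the records from the original lists. Below is a possible sketch of the dictionary-based approach, assuming a hypothetical function name `by_year_from_names` and a trimmed-down sample dictionary with only three fields per hurricane.

```python
from collections import defaultdict

def by_year_from_names(hurricanes_by_name):
    """Hypothetical helper: regroup a name-keyed dictionary by the 'Year' field."""
    grouped = defaultdict(list)
    for record in hurricanes_by_name.values():
        grouped[record['Year']].append(record)
    # Convert back to a plain dict so missing years raise KeyError as usual
    return dict(grouped)

# Trimmed sample data (only three fields per hurricane) for illustration
sample = {
    'Bahamas': {'Name': 'Bahamas', 'Year': 1932, 'Deaths': 16},
    'Cuba II': {'Name': 'Cuba II', 'Year': 1932, 'Deaths': 3103},
    'Labor Day': {'Name': 'Labor Day', 'Year': 1935, 'Deaths': 408},
}

by_year = by_year_from_names(sample)
print([record['Name'] for record in by_year[1932]])
```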


You believe that knowing how often each area of the Atlantic is affected by these strong hurricanes is important for making preparations for future hurricanes.

Write a function that counts how often each area is listed as an affected area of a hurricane. Store and return the results in a dictionary where the keys are the affected areas and the values are counts of how many times the areas were affected.

Test your function on your hurricane dictionary.

In [5]:
def area_hurricane_frequency(hurricane_data):
    # Build the counts from the dictionary that was passed in,
    # rather than relying on the global areas_affected list
    area_frequency = {}

    for data in hurricane_data.values(): # To retrieve data for each hurricane name in the dictionary
        for area in data.get('Areas Affected'):
            # Start at 0 the first time an area is seen, then add 1 for this hurricane
            area_frequency[area] = area_frequency.get(area, 0) + 1
    return area_frequency

area_frequency = area_hurricane_frequency(hurricane_data_by_name)

print('The data below shows the frequency of hurricanes for each area:\n{}'.format(area_frequency))

area_list = [area for item in areas_affected for area in item] # List of names to verify that the function is correct
area_to_check = 'Jamaica'

print('The data above should show {} with {} hurricanes.'.format(area_to_check, area_list.count(area_to_check)))
            

The data below shows the frequency of hurricanes for each area:
{'Central America': 9, 'Mexico': 7, 'Cuba': 6, 'Florida': 6, 'The Bahamas': 7, 'Lesser Antilles': 4, 'United States East Coast': 3, 'Atlantic Canada': 3, 'Northeastern United States': 2, 'Jamaica': 4, 'Cayman Islands': 1, 'Bermuda': 2, 'Texas': 4, 'Tamaulipas': 1, 'Yucatn Peninsula': 3, 'Georgia': 1, 'The Carolinas': 1, 'Virginia': 1, 'Southeastern United States': 1, 'Southwestern Quebec': 1, 'New England': 1, 'Louisiana': 1, 'Midwestern United States': 1, 'The Caribbean': 8, 'United States Gulf Coast': 6, 'United States East coast': 1, 'South Texas': 1, 'Venezuela': 3, 'Hispaniola': 1, 'South Florida': 1, 'Greater Antilles': 2, 'Bahamas': 2, 'Eastern United States': 1, 'Ontario': 1, 'Windward Islands': 1, 'Nicaragua': 1, 'Honduras': 1, 'Antilles': 1, 'Colombia': 1, 'Cape Verde': 1, 'British Virgin Islands': 1, 'U.S. Virgin Islands': 1, 'Virgin Islands': 1, 'Puerto Rico': 1, 'Dominican Republic': 1, 'Turks and Caicos Islan
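
The same tally can also be expressed with `collections.Counter` from the standard library. This is only a sketch on the affected-area lists of the first three hurricanes; the names `sample_areas` and `area_counts` are illustrative.

```python
from collections import Counter

# Affected-area lists for the first three hurricanes in the data set
sample_areas = [
    ['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'],
    ['Lesser Antilles', 'The Bahamas', 'United States East Coast', 'Atlantic Canada'],
    ['The Bahamas', 'Northeastern United States'],
]

# Flatten the nested lists and let Counter tally each area
area_counts = Counter(area for areas in sample_areas for area in areas)

print(area_counts['The Bahamas'])
print(area_counts.most_common(1))
```

`Counter` is a `dict` subclass, so the result can be used wherever the plain frequency dictionary is expected.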

Write a function that finds the area affected by the most hurricanes, and how often it was hit.

Test your function on your affected area dictionary.

In [6]:
def most_hurricanes(area_data):
    highest_frequency = max(area_data.values())
    print('The highest number of hurricanes in a single area is {}.'.format(highest_frequency))
    # More than one area can share the highest count, so check every area
    for area, frequency in area_data.items():
        if frequency == highest_frequency:
            print('{} was affected by the most hurricanes.'.format(area))
            
most_hurricanes(area_frequency)

The highest number of hurricanes in a single area is 9.
Central America was affected by the most hurricanes.
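
If the result is needed later rather than only printed, the function could return it instead. Below is a minimal sketch, assuming a hypothetical name `most_affected_area` and a four-entry excerpt of the frequency dictionary. Note that `max` returns only the first area when several tie for the top count.

```python
def most_affected_area(area_data):
    """Hypothetical helper: return (area, count) for the most frequently hit area."""
    # max iterates over the keys and compares them by their counts
    area = max(area_data, key=area_data.get)
    return area, area_data[area]

# Excerpt of the frequency dictionary printed earlier
sample_frequency = {'Central America': 9, 'Mexico': 7, 'Cuba': 6, 'The Caribbean': 8}

top_area, top_count = most_affected_area(sample_frequency)
print('{} was hit {} times.'.format(top_area, top_count))
```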


Write a function that finds the hurricane that caused the greatest number of deaths, and how many deaths it caused.

Test your function on your hurricane dictionary.

In [7]:
def most_deaths(hurricane_data):
    hurricane_most_deaths = ''
    highest_deaths = 0  # named differently from the function to avoid confusion
    
    for name, data in hurricane_data.items():
        if data.get('Deaths') > highest_deaths:
            highest_deaths = data.get('Deaths')
            hurricane_most_deaths = name
            
    print('The hurricane with most deaths is {name} with {deaths} casualties.'.format(name=hurricane_most_deaths, deaths=highest_deaths))
    
most_deaths(hurricane_data_by_name)

The hurricane with most deaths is Mitch with 19325 casualties.


Just as hurricanes are rated by their wind speed, you want to try rating hurricanes based on other metrics.

Write a function that rates hurricanes on a mortality scale according to the following ratings, where the key is the rating and the value is the upper bound of deaths for that rating.

mortality_scale = {0: 0,
                   1: 100,
                   2: 500,
                   3: 1000,
                   4: 10000}

For example, a hurricane with a 1 mortality rating would have resulted in greater than 0 but less than or equal to 100 deaths. A hurricane with a 5 mortality rating would have resulted in greater than 10000 deaths.

Store the hurricanes in a new dictionary where the keys are mortality ratings and the values are lists containing a dictionary for each hurricane that falls into that mortality rating.

Test your function on your hurricane dictionary.

In [8]:
def mortality_rating(hurricane_data):
    # Upper bound of deaths for ratings 0 to 4; anything above the last bound is rated 5
    mortality_scale = {0: 0, 1: 100, 2: 500, 3: 1000, 4: 10000}
    mortality_data = {rating: [] for rating in range(6)}
    
    for name, data in hurricane_data.items():
        deaths = data.get('Deaths')
        rating = 5  # default when deaths exceed every upper bound
        # The scale is in ascending order, so the first bound that fits gives the rating
        for level, upper_bound in mortality_scale.items():
            if deaths <= upper_bound:
                rating = level
                break
        mortality_data[rating].append({name: data})
    
    return mortality_data

mortality_data = mortality_rating(hurricane_data_by_name)

print(mortality_data)

{0: [], 1: [{'Cuba I': {'Name': 'Cuba I', 'Month': 'October', 'Year': 1924, 'Max Sustained Wind': 165, 'Areas Affected': ['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], 'Damages': 'Damages not recorded', 'Deaths': 90}}, {'Bahamas': {'Name': 'Bahamas', 'Month': 'September', 'Year': 1932, 'Max Sustained Wind': 160, 'Areas Affected': ['The Bahamas', 'Northeastern United States'], 'Damages': 'Damages not recorded', 'Deaths': 16}}, {'Carol': {'Name': 'Carol', 'Month': 'September', 'Year': 1953, 'Max Sustained Wind': 160, 'Areas Affected': ['Bermuda', 'New England', 'Atlantic Canada'], 'Damages': 2000000.0, 'Deaths': 5}}, {'Carla': {'Name': 'Carla', 'Month': 'September', 'Year': 1961, 'Max Sustained Wind': 175, 'Areas Affected': ['Texas', 'Louisiana', 'Midwestern United States'], 'Damages': 326000000.0, 'Deaths': 43}}, {'Edith': {'Name': 'Edith', 'Month': 'September', 'Year': 1971, 'Max Sustained Wind': 160, 'Areas Affected': ['The Caribbean', 'Central America', 'Mexico', 'U
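
Because the upper bounds are sorted, the rating can also be looked up with the standard library's `bisect` module instead of a chain of comparisons. The following is a sketch under that assumption; the function name `rating_for` is hypothetical, and the death counts in the loop are taken from the `deaths` list.

```python
from bisect import bisect_left

# Upper bounds of deaths for ratings 0 to 4, in ascending order
upper_bounds = [0, 100, 500, 1000, 10000]

def rating_for(deaths):
    """Hypothetical helper: return the mortality rating (0-5) for a death count."""
    # bisect_left returns the index of the first bound that is >= deaths,
    # or len(upper_bounds) (here 5) when deaths exceeds every bound
    return bisect_left(upper_bounds, deaths)

for deaths in (90, 408, 3103, 19325):
    print(deaths, rating_for(deaths))
```

The index returned by `bisect_left` matches the rating directly because each bound is inclusive: a count equal to a bound lands on that bound's index rather than the next one.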

Write a function that finds the hurricane that caused the greatest damage, and how costly it was.

Test your function on your hurricane dictionary.

In [9]:
def most_damage(hurricane_data):
    hurricane_most_damage = ''
    highest_damage = 0
    
    for name, data in hurricane_data.items():
        damage = data.get('Damages')
        # Skip hurricanes whose damages were not recorded (stored as a string),
        # instead of hiding every possible error behind a bare except
        if isinstance(damage, str):
            continue
        if damage > highest_damage:
            highest_damage = damage
            hurricane_most_damage = name
            
    print('The hurricane with most damages is {name} amounting to {damage} dollars.'.format(name=hurricane_most_damage, damage=highest_damage))
    
most_damage(hurricane_data_by_name)

The hurricane with most damages is Katrina amounting to 125000000000.0 dollars.
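
The raw float is hard to read at this magnitude. One possible refinement, sketched here with a hypothetical `costliest` function and a three-entry sample, filters out unrecorded values up front and formats the amount with thousands separators.

```python
def costliest(hurricane_data):
    """Hypothetical helper: return (name, damages) for the costliest recorded hurricane."""
    # Keep only hurricanes whose damages are numeric
    recorded = {
        name: data['Damages']
        for name, data in hurricane_data.items()
        if isinstance(data['Damages'], (int, float))
    }
    name = max(recorded, key=recorded.get)
    return name, recorded[name]

# Trimmed sample with only the 'Damages' field, using values from the damages list
sample = {
    'Cuba I': {'Damages': 'Damages not recorded'},
    'Katrina': {'Damages': 125000000000.0},
    'Maria': {'Damages': 91600000000.0},
}

name, amount = costliest(sample)
# The ',.0f' format adds thousands separators and drops the decimal part
print('{} caused ${:,.0f} in damages.'.format(name, amount))
```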


Write a function that rates hurricanes on a damage scale according to the following ratings, where the key is the rating and the value is the upper bound of damage for that rating.

damage_scale = {0: 0,
                1: 100000000,
                2: 1000000000,
                3: 10000000000,
                4: 50000000000}

For example, a hurricane with a 1 damage rating would have resulted in damages greater than 0 USD but less than or equal to 100000000 USD. A hurricane with a 5 damage rating would have resulted in damages greater than 50000000000 USD (talk about a lot of money).

Store the hurricanes in a new dictionary where the keys are damage ratings and the values are lists containing a dictionary for each hurricane that falls into that damage rating.

Test your function on your hurricane dictionary.

In [11]:
def damage_rating(hurricane_data):
    # Upper bound of damages (USD) for ratings 0 to 4; anything above the last bound is rated 5
    damage_scale = {0: 0, 1: 100000000, 2: 1000000000, 3: 10000000000, 4: 50000000000}
    damage_data = {rating: [] for rating in range(6)}
    
    for name, data in hurricane_data.items():
        dmg = data.get('Damages')
        if dmg == 'Damages not recorded':
            rating = 0  # unrecorded damages are grouped with rating 0
        else:
            rating = 5  # default when damages exceed every upper bound
            # The scale is in ascending order, so the first bound that fits gives the rating
            for level, upper_bound in damage_scale.items():
                if dmg <= upper_bound:
                    rating = level
                    break
        damage_data[rating].append({name: data})
    
    return damage_data

damage_data = damage_rating(hurricane_data_by_name)

print(damage_data)

{0: [{'Cuba I': {'Name': 'Cuba I', 'Month': 'October', 'Year': 1924, 'Max Sustained Wind': 165, 'Areas Affected': ['Central America', 'Mexico', 'Cuba', 'Florida', 'The Bahamas'], 'Damages': 'Damages not recorded', 'Deaths': 90}}, {'Bahamas': {'Name': 'Bahamas', 'Month': 'September', 'Year': 1932, 'Max Sustained Wind': 160, 'Areas Affected': ['The Bahamas', 'Northeastern United States'], 'Damages': 'Damages not recorded', 'Deaths': 16}}, {'Labor Day': {'Name': 'Labor Day', 'Month': 'September', 'Year': 1935, 'Max Sustained Wind': 185, 'Areas Affected': ['The Bahamas', 'Florida', 'Georgia', 'The Carolinas', 'Virginia'], 'Damages': 'Damages not recorded', 'Deaths': 408}}, {'Anita': {'Name': 'Anita', 'Month': 'September', 'Year': 1977, 'Max Sustained Wind': 175, 'Areas Affected': ['Mexico'], 'Damages': 'Damages not recorded', 'Deaths': 11}}], 1: [{'San Felipe II Okeechobee': {'Name': 'San Felipe II Okeechobee', 'Month': 'September', 'Year': 1928, 'Max Sustained Wind': 160, 'Areas Affected'