# 1. Archive exploration
Description: This dataset explores the ecological impacts of hurricanes across the Yucatan Peninsula from 1851 to 2000, using computer modeling.

Date of Access: 10/24/25

Link: https://portal.edirepository.org/nis/mapbrowse?packageid=knb-lter-hfr.71.23

Citation:

Boose, E. and D. Foster. 2023. Ecological Impacts of Hurricanes Across the Yucatan Peninsula 1851-2000 ver 23. Environmental Data Initiative. https://doi.org/10.6073/pasta/f219113373913f2daf421732e28d3c38 (Accessed 2025-10-24).

# 2. Data loading and preliminary exploration

In [None]:
import pandas as pd

In [None]:
# Save the dataset URL for import.
url = 'https://pasta.lternet.edu/package/data/eml/knb-lter-hfr/71/23/ab0fe2bf4f3ad850371ccb9c69d78469'

hurricane = pd.read_csv(url)

In [None]:
# View the first 5 rows of the hurricane dataset
hurricane.head()

#### Obtain preliminary information and explore this data frame using pandas methods.

In [1]:
# Preliminary data exploration
print(hurricane.isna().sum())
print(hurricane.shape)
print(hurricane.dtypes)


# 3. Brainstorm
In this session we want to answer the following question:

*How many hurricanes with Saffir-Simpson category 5 have been registered and what was their duration?*

a. Individually, write down step-by-step instructions on how you would wrangle the df data frame to answer the question. Do not code anything yet. Remember: It’s okay if you don’t know how to code each step. The important thing is to have an idea of what you’d like to do.

1. Filter data frame for just ss = 5 (or groupby ss)

2. Count method to see how many (value_counts)

3. Make a duration column by computing end.date - start.date (check the data type first)

# 4. Data Wrangling

In [None]:
# Filter for category 5 hurricanes and save them as their own data frame.
# .copy() makes later column assignments act on an independent data frame.
cat5 = hurricane[hurricane['ss'] == 5].copy()

In [None]:
# Count the number of category 5 hurricanes.
# Alternative: cat5.ss.count() counts only the non-missing values in 'ss'.

# Simpler: len() returns the number of rows.
len(cat5)

There have been four hurricanes with Saffir-Simpson category 5.
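Step 2 of the brainstorm mentions `value_counts` as another way to count. The sketch below shows that approach on a small made-up data frame: only the column name `ss` mirrors the real dataset, and `toy`, `category_counts`, and `n_cat5` are hypothetical names used for illustration.

```python
import pandas as pd

# Made-up categories for illustration; not values from the hurricane dataset.
toy = pd.DataFrame({'ss': [1, 5, 3, 5, 2, 1]})

# Count the storms in every category at once, ordered by category.
category_counts = toy['ss'].value_counts().sort_index()
print(category_counts)

# Pull out the category 5 count, falling back to 0 if there are none.
n_cat5 = int(category_counts.get(5, 0))
print(n_cat5)
```

Compared with filtering first and calling `len`, this gives the count for every category in one pass, which may be handy when comparing categories.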

In [None]:
# What is the data type of the date column?
cat5['start.date']

In [None]:
# Changing date variables to be DateTime objects
cat5['start.date'] = pd.to_datetime(cat5['start.date'])
cat5['end.date'] = pd.to_datetime(cat5['end.date'])

In [None]:
# Creating a duration column.
cat5['duration'] = cat5['end.date'] - cat5['start.date']

In [None]:
cat5

**Interpretation:** Most category 5 hurricanes lasted one day; one of the four lasted less than a day.
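One way to check a statement like this without reading the table by eye is to tabulate the durations. The following is a minimal sketch on made-up dates, not on the real data; `toy`, `days`, and `duration_counts` are hypothetical names, and only the column names mirror the dataset.

```python
import pandas as pd

# Made-up dates for illustration; not values from the hurricane dataset.
toy = pd.DataFrame({
    'start.date': pd.to_datetime(['1900-09-01', '1920-10-03', '1950-08-10']),
    'end.date': pd.to_datetime(['1900-09-02', '1920-10-03', '1950-08-11']),
})

# Subtracting two datetime columns gives a timedelta column.
toy['duration'] = toy['end.date'] - toy['start.date']

# Whole days per storm, then how many storms share each duration.
days = toy['duration'].dt.days
duration_counts = days.value_counts().sort_index()
print(duration_counts)
```

A duration of 0 days in this sketch means the start and end fall on the same calendar date.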

# 5. Visualize Saffir-Simpson categories across time

In [None]:
import matplotlib.pyplot as plt

In [None]:
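# Assign the column directly so it takes the datetime dtype;
# assigning through .loc[:, ...] can leave it as object and break .dt below.
hurricane['start.date'] = pd.to_datetime(hurricane['start.date'])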

hurricane['year'] = hurricane['start.date'].dt.year

In [None]:
# Scatter plot of category against start date; keep the Axes to set the ticks.
ax = hurricane.plot(kind='scatter',
                    x='start.date',
                    y='ss',
                    xlabel='Start Date',
                    ylabel='Saffir-Simpson Category',
                    title='Saffir-Simpson Categories Across Time',
                    color='hotpink')
ax.set_yticks([1, 2, 3, 4, 5])