# Volcano Eruptions

Which volcanoes have experienced the longest eruptions?

The dataset `volcanic-eruptions.csv` includes the start and end dates for all volcanic eruptions since 1800, along with each volcano's identification number. For eruptions that are still ongoing as of December 2024, the end date is recorded as December 2024.

The dataset ignores eruptive pauses shorter than three months: activity on either side of such a pause counts as a single eruption. If an eruption resumes after more than three months of inactivity, it is classified as a new eruption.
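
To make the three-month rule concrete, here is a minimal sketch of how phases of activity might be grouped into eruptions. The phases are made up for illustration and the helper `group_into_eruptions` is hypothetical. The sketch also assumes that a pause of exactly three months stays within one eruption; the text above does not say how the dataset treats that boundary.

```python
import pandas as pd

def group_into_eruptions(intervals, max_pause_months=3):
    """Merge (start, end) activity phases that are separated by short pauses.

    Hypothetical helper: a pause longer than `max_pause_months`
    starts a new eruption; anything shorter is absorbed.
    """
    intervals = sorted(intervals)
    eruptions = [list(intervals[0])]
    for start, end in intervals[1:]:
        last_end = eruptions[-1][1]
        # Whole calendar months between the end of one phase and the start of the next
        pause = (start.year - last_end.year) * 12 + (start.month - last_end.month)
        if pause > max_pause_months:
            eruptions.append([start, end])          # long pause: new eruption
        else:
            eruptions[-1][1] = max(last_end, end)   # short pause: same eruption
    return [tuple(e) for e in eruptions]

# Made-up activity phases for a single volcano (month precision)
phases = [
    (pd.Timestamp("2000-01-01"), pd.Timestamp("2000-03-01")),
    (pd.Timestamp("2000-05-01"), pd.Timestamp("2000-08-01")),  # 2-month pause: merged
    (pd.Timestamp("2001-02-01"), pd.Timestamp("2001-04-01")),  # 6-month pause: new eruption
]
merged = group_into_eruptions(phases)
print(merged)
```

With these phases the first two are merged into one eruption (January to August 2000) and the third is counted separately.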

Note: Yasur volcano in Vanuatu is not included in this dataset because it has no clear eruption start date.

In [44]:
# FOR GOOGLE COLAB ONLY.
# Uncomment and run the code below. A dialog will appear to upload files.
# Upload 'volcanic-eruptions.csv' and 'volcano-list.csv'.

# from google.colab import files
# uploaded = files.upload()

In [1]:
import pandas as pd
import matplotlib.pyplot as plt

eruptions = pd.read_csv('volcanic-eruptions.csv')
eruptions.head(3)

Unnamed: 0,volcano_id,start_date,end_date
0,211020,07-1913,04-1944
1,211020,02-1864,11-1868
2,211020,12-1854,05-1855


### Additional dataset

The dataset `volcano-list.csv` provides detailed information about each volcano, including its name, country, latitude, longitude, and type.

In [2]:
volcanoes = pd.read_csv('volcano-list.csv')
volcanoes.head(3)

Unnamed: 0,volcano_id,volcano_name,country,volcanic_region_group,volcanic_region,volcano_landform,primary_volcano_type,activity_evidence,last_known_eruption,latitude,longitude,elevation_m,tectonic_setting,dominant_rock_type
0,210010,West Eifel Volcanic Field,Germany,European Volcanic Regions,Central European Volcanic Province,Cluster,Volcanic field,Eruption Dated,8300 BCE,50.17,6.85,600,Rift zone / Continental crust (>25 km),Foidite
1,210020,Chaine des Puys,France,European Volcanic Regions,Western European Volcanic Province,Cluster,Lava dome(s),Eruption Dated,4040 BCE,45.786,2.981,1464,Rift zone / Continental crust (>25 km),Basalt / Picro-Basalt
2,210030,Olot Volcanic Field,Spain,European Volcanic Regions,Western European Volcanic Province,Cluster,Volcanic field,Evidence Credible,Unknown,42.17,2.53,893,Intraplate / Continental crust (>25 km),Trachybasalt / Tephrite Basanite


### Project Ideas:

- Find the volcanoes that were erupting as of Dec 2024.

- Find the volcanoes that have had the longest volcanic eruptions. 

Hints:
- Use `pd.to_datetime`.

- Merge the volcanoes dataframe into eruptions.

- Before the merge, reduce the dataframes to the columns of interest.

- Use `df.sort_values`.
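
The hints above can be sketched end to end on toy data before applying them to the real files. All values below are invented for illustration, and the frame names `toy_eruptions` and `toy_volcanoes` are placeholders; only the column names mirror the two CSV files.

```python
import pandas as pd

# Toy stand-ins for the two CSV files (all values are invented)
toy_eruptions = pd.DataFrame({
    "volcano_id": [1, 1, 2],
    "start_date": ["03-1990", "01-2001", "06-1950"],
    "end_date":   ["09-1990", "01-2003", "06-1960"],
    "note": ["x", "y", "z"],           # extra column we do not need
})
toy_volcanoes = pd.DataFrame({
    "volcano_id": [1, 2],
    "volcano_name": ["Volcano A", "Volcano B"],
    "country": ["Country A", "Country B"],
    "elevation_m": [1000, 2000],       # extra column we do not need
})

# Parse the "MM-YYYY" strings into timestamps (first day of the month)
for col in ["start_date", "end_date"]:
    toy_eruptions[col] = pd.to_datetime(toy_eruptions[col], format="%m-%Y")

# Reduce both frames to the columns of interest before merging
e = toy_eruptions[["volcano_id", "start_date", "end_date"]]
v = toy_volcanoes[["volcano_id", "volcano_name", "country"]]

# Merge the volcano details into the eruptions
merged = e.merge(v, on="volcano_id", how="left")

# Compute each eruption's duration and sort, longest first
merged["duration_days"] = (merged["end_date"] - merged["start_date"]).dt.days
result = merged.sort_values("duration_days", ascending=False)
print(result[["volcano_name", "start_date", "end_date", "duration_days"]])
```

A left merge keeps every eruption row even if its `volcano_id` has no match in the volcano list, which makes missing names easy to spot as `NaN`.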

In [11]:
# YOUR CODE HERE (add additional cells as needed)

# import pandas as pd

# Load CSVs
# eruptions = pd.read_csv("volcanic-eruptions.csv")
# volcanoes = pd.read_csv("volcano-list.csv")

# Convert start_date and end_date to datetime
eruptions["start_date"] = pd.to_datetime(eruptions["start_date"], format="%m-%Y", errors="coerce")
eruptions["end_date"] = pd.to_datetime(eruptions["end_date"], format="%m-%Y", errors="coerce")

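# Ongoing eruptions are already recorded as 12-2024 in the dataset, so this fill
# only affects end dates that failed to parse (coerced to NaT above).
# Such rows are treated as running until the end of December 2024.
eruptions["end_date"] = eruptions["end_date"].fillna(pd.Timestamp("2024-12-31"))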

In [None]:
# Find volcanoes still erupting as of Dec 2024

active_filter = (eruptions['start_date'] <= '2024-12-31') & (eruptions['end_date'] >= '2024-12-01')
active_eruptions = eruptions[active_filter]

# `volcanoes` was already loaded from 'volcano-list.csv' in an earlier cell; reuse it here
active_volcanoes_df = pd.merge(
    active_eruptions[['volcano_id']], 
    volcanoes[['volcano_id', 'volcano_name', 'country']],
    on='volcano_id',
    how='left'
).drop_duplicates()

print(active_volcanoes_df)


    volcano_id           volcano_name           country
0       211040              Stromboli             Italy
1       211060                   Etna             Italy
2       221080               Erta Ale          Ethiopia
3       222120      Lengai, Ol Doinyo          Tanzania
4       223020            Nyamulagira          DR Congo
5       234010                  Heard         Australia
6       241040  Whakaari/White Island       New Zealand
7       243060                  Tofua             Tonga
8       243080              Home Reef             Tonga
9       251020                  Manam  Papua New Guinea
10      252010                Langila  Papua New Guinea
11      256010               Tinakula   Solomon Islands
12      261140                 Marapi         Indonesia
13      263250                 Merapi         Indonesia
14      263300                 Semeru         Indonesia
15      263340                  Raung         Indonesia
16      264180               Lewotobi         In

In [None]:
# Find the Longest Eruptions

# Calculate eruption duration in days
eruptions["duration_days"] = (eruptions["end_date"] - eruptions["start_date"]).dt.days

# Merge to get volcano names
eruptions_named = pd.merge(
    eruptions,
    volcanoes[["volcano_id", "volcano_name", "country"]],
    on="volcano_id"
)

# Sort by duration
longest_eruptions = eruptions_named.sort_values("duration_days", ascending=False)

# Show top 5 longest eruptions
print("Longest volcanic eruptions:")
print(longest_eruptions[["volcano_name", "country", "start_date", "end_date", "duration_days"]].head(5))

Longest volcanic eruptions:
     volcano_name        country start_date   end_date  duration_days
2932  Santa Maria      Guatemala 1922-06-01 2024-12-01          37439
1575       Dukono      Indonesia 1933-08-01 2024-12-01          33360
12      Stromboli          Italy 1934-02-01 2024-12-01          33176
3362       Sangay        Ecuador 1934-08-01 2011-03-01          27971
2782      Kilauea  United States 1823-02-01 1894-12-01          26236
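
Day counts in the tens of thousands are hard to read, and three of the five rows above end at the dataset's December 2024 cutoff, meaning those eruptions were still ongoing and their durations are lower bounds. A minimal sketch of adding an approximate duration in years and an ongoing flag, using the five rows printed above. The column names `duration_years` and `ongoing` are introduced here for illustration, and 365.25 days per year is an approximation.

```python
import pandas as pd

# The five rows from the printed output above
top5 = pd.DataFrame({
    "volcano_name": ["Santa Maria", "Dukono", "Stromboli", "Sangay", "Kilauea"],
    "end_date": pd.to_datetime(
        ["2024-12-01", "2024-12-01", "2024-12-01", "2011-03-01", "1894-12-01"]
    ),
    "duration_days": [37439, 33360, 33176, 27971, 26236],
})

# "12-2024" parses to the first day of December 2024
cutoff = pd.Timestamp("2024-12-01")

# Approximate duration in years (365.25 days per year)
top5["duration_years"] = (top5["duration_days"] / 365.25).round(1)

# An end date at the cutoff means the eruption had not ended yet
top5["ongoing"] = top5["end_date"] == cutoff
print(top5)
```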
