PROJECT TITLE:

Climate-Induced Disaster Pattern Analysis: Detecting Emerging Risks from Historical Events

PROBLEM STATEMENT:

In recent decades, natural disasters such as floods, storms, droughts, wildfires, and cyclones have become more frequent and more intense. Scientific studies suggest that these changes are closely linked to climate change and its cascading effects on weather systems. Even so, disaster management is often reactive rather than proactive, because clear insight into long-term patterns and emerging risks is lacking.

This project analyzes historical global disaster data to identify climate-induced trends, patterns, and emerging risks. By studying more than a century of disaster records (1900 to 2021), it aims to show how climate change is reshaping disaster frequency, geographic spread, and human impact. These insights can support early warning systems, policy formulation, and disaster preparedness strategies, ultimately reducing vulnerability and saving lives.
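One way to look at frequency trends is to count climate-related events per decade. The sketch below is only an illustration on made-up rows: the `Year` and `Disaster Subgroup` column names come from the dataset, while the `events` frame, the `CLIMATE_SUBGROUPS` set (including the "Hydrological" label), and the decade bucketing are assumptions, not the project's confirmed method.

```python
import pandas as pd

# Toy rows standing in for the real dataset (values are invented).
events = pd.DataFrame({
    "Year": [1900, 1904, 1985, 1987, 1989, 2010, 2015, 2019, 2019],
    "Disaster Subgroup": [
        "Climatological", "Meteorological", "Meteorological",
        "Geophysical", "Climatological", "Meteorological",
        "Meteorological", "Climatological", "Geophysical",
    ],
})

# Assumption: subgroups treated as climate-related for this sketch.
CLIMATE_SUBGROUPS = {"Climatological", "Meteorological", "Hydrological"}

# Keep only climate-related events, then bucket each year into its decade.
climate = events[events["Disaster Subgroup"].isin(CLIMATE_SUBGROUPS)].copy()
climate["Decade"] = (climate["Year"] // 10) * 10

# Number of climate-related events recorded in each decade.
events_per_decade = climate.groupby("Decade").size()
print(events_per_decade)
```

On the real data, the same grouping could be applied to `df` once it is loaded.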

In [3]:
import pandas as pd

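# Load the EM-DAT export; the file is a CSV despite the ".xlsx" in its name
df = pd.read_csv("1900_2021_DISASTERS.xlsx - emdat data.csv")

# Column dtypes and non-null counts
df.info()
print("\n")
# Summary statistics for the numeric columns
print(df.describe())
print("\n")
# Number of missing values per column
print("Missing values:\n", df.isnull().sum())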



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16126 entries, 0 to 16125
Data columns (total 45 columns):
 #   Column                      Non-Null Count  Dtype  
---  ------                      --------------  -----  
 0   Year                        16126 non-null  int64  
 1   Seq                         16126 non-null  int64  
 2   Glide                       1581 non-null   object 
 3   Disaster Group              16126 non-null  object 
 4   Disaster Subgroup           16126 non-null  object 
 5   Disaster Type               16126 non-null  object 
 6   Disaster Subtype            13016 non-null  object 
 7   Disaster Subsubtype         1077 non-null   object 
 8   Event Name                  3861 non-null   object 
 9   Country                     16126 non-null  object 
 10  ISO                         16126 non-null  object 
 11  Region                      16126 non-null  object 
 12  Continent                   16126 non-null  object 
 13  Location                    143

In [4]:
# Show first 10 rows
df.head(10)


Unnamed: 0,Year,Seq,Glide,Disaster Group,Disaster Subgroup,Disaster Type,Disaster Subtype,Disaster Subsubtype,Event Name,Country,...,No Affected,No Homeless,Total Affected,Insured Damages ('000 US$),Total Damages ('000 US$),CPI,Adm Level,Admin1 Code,Admin2 Code,Geo Locations
0,1900,9002,,Natural,Climatological,Drought,Drought,,,Cabo Verde,...,,,,,,3.221647,,,,
1,1900,9001,,Natural,Climatological,Drought,Drought,,,India,...,,,,,,3.221647,,,,
2,1902,12,,Natural,Geophysical,Earthquake,Ground movement,,,Guatemala,...,,,,,25000.0,3.350513,,,,
3,1902,3,,Natural,Geophysical,Volcanic activity,Ash fall,,Santa Maria,Guatemala,...,,,,,,3.350513,,,,
4,1902,10,,Natural,Geophysical,Volcanic activity,Ash fall,,Santa Maria,Guatemala,...,,,,,,3.350513,,,,
5,1903,6,,Natural,Geophysical,Mass movement (dry),Rockfall,,,Canada,...,,,23.0,,,3.479379,,,,
6,1903,12,,Natural,Geophysical,Volcanic activity,Ash fall,,Mount Karthala,Comoros (the),...,,,,,,3.479379,,,,
7,1904,3,,Natural,Meteorological,Storm,Tropical cyclone,,,Bangladesh,...,,,,,,3.479379,,,,
8,1905,5,,Natural,Geophysical,Mass movement (dry),Rockfall,,,Canada,...,,,18.0,,,3.479379,,,,
9,1905,3,,Natural,Geophysical,Earthquake,Ground movement,,,India,...,,,,,25000.0,3.479379,,,,


In [5]:
print("Dataset Shape:", df.shape)


Dataset Shape: (16126, 45)


In [6]:
print("Disaster Types:", df["Disaster Type"].unique())
print("\nNumber of Disaster Types:", df["Disaster Type"].nunique())


Disaster Types: ['Drought' 'Earthquake' 'Volcanic activity' 'Mass movement (dry)' 'Storm'
 'Flood' 'Epidemic' 'Landslide' 'Wildfire' 'Extreme temperature ' 'Fog'
 'Insect infestation' 'Impact' 'Animal accident' 'Glacial lake outburst']

Number of Disaster Types: 15
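The output above shows the label 'Extreme temperature ' with a trailing space, which would make an exact-match filter such as `== "Extreme temperature"` miss those rows. A minimal sketch of one possible fix, shown on a toy frame (the `sample` frame and the variant without the trailing space are hypothetical), is to strip whitespace from the column:

```python
import pandas as pd

# Toy stand-in for df["Disaster Type"]; one label carries a trailing space.
sample = pd.DataFrame(
    {"Disaster Type": ["Flood", "Extreme temperature ", "Extreme temperature", "Storm"]}
)

# Strip leading and trailing whitespace so near-duplicate labels collapse.
sample["Disaster Type"] = sample["Disaster Type"].str.strip()

cleaned_types = sorted(sample["Disaster Type"].unique())
print(cleaned_types)
```

In the notebook, the same `.str.strip()` call could be applied to `df["Disaster Type"]` before any filtering or grouping.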


In [7]:
# Percentage of missing values per column
missing_percent = (df.isnull().sum() / len(df)) * 100
# Show the ten columns with the most missing data
missing_percent.sort_values(ascending=False).head(10)


Column,Missing (%)
Aid Contribution,95.801811
Associated Dis2,95.615776
Disaster Subsubtype,93.321344
Insured Damages ('000 US$),93.203522
Local Time,93.160114
River Basin,92.0191
Glide,90.195957
OFDA Response,89.495225
No Homeless,84.931167
Appeal,84.069205
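Columns that are almost entirely empty add little to a trend analysis. The sketch below shows one way to drop them based on the missing-value percentage; the `toy` frame and the `MAX_MISSING_PCT` cutoff of 90 are assumptions chosen for illustration, and the right threshold for the real data is a judgment call.

```python
import numpy as np
import pandas as pd

# Toy frame: column names come from the dataset, values are invented.
toy = pd.DataFrame({
    "Year": [1900, 1902, 1903, 1904],
    "Glide": [np.nan, np.nan, np.nan, np.nan],
    "Appeal": [np.nan, "No", np.nan, np.nan],
    "Disaster Type": ["Drought", "Earthquake", "Storm", "Flood"],
})

# Hypothetical cutoff: drop columns with more than 90% missing values.
MAX_MISSING_PCT = 90.0

# Share of missing values per column, as a percentage.
missing_pct = toy.isnull().mean() * 100

# Columns above the cutoff, and the frame without them.
sparse_cols = missing_pct[missing_pct > MAX_MISSING_PCT].index.tolist()
trimmed = toy.drop(columns=sparse_cols)

print("Dropped:", sparse_cols)
print("Kept:", trimmed.columns.tolist())
```

Here `Glide` is dropped because it is entirely empty, while `Appeal` (75% missing in the toy data) stays under the cutoff and is kept.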
