# Data Collection

## EM-DAT: The International Disaster Database

#### The EM-DAT Public Table is a global disaster database maintained by CRED, tracking natural and technological disasters. It includes data on fatalities, affected populations, and economic damages, and is used for research and disaster management.

In [1]:
!pip install openpyxl

Collecting openpyxl
  Downloading openpyxl-3.1.5-py2.py3-none-any.whl.metadata (2.5 kB)
Collecting et-xmlfile (from openpyxl)
  Downloading et_xmlfile-2.0.0-py3-none-any.whl.metadata (2.7 kB)
Downloading openpyxl-3.1.5-py2.py3-none-any.whl (250 kB)
Downloading et_xmlfile-2.0.0-py3-none-any.whl (18 kB)
Installing collected packages: et-xmlfile, openpyxl
Successfully installed et-xmlfile-2.0.0 openpyxl-3.1.5

In [2]:
import pandas as pd

df1 = pd.read_excel("Datasets/emdat.xlsx")
print(df1.head())

          DisNo. Historic Classification Key Disaster Group Disaster Subgroup  \
0  1999-9388-DJI       No    nat-cli-dro-dro        Natural    Climatological   
1  1999-9388-SDN       No    nat-cli-dro-dro        Natural    Climatological   
2  1999-9388-SOM       No    nat-cli-dro-dro        Natural    Climatological   
3  2000-0001-AGO       No    tec-tra-roa-roa  Technological         Transport   
4  2000-0002-AGO       No    nat-hyd-flo-riv        Natural      Hydrological   

  Disaster Type Disaster Subtype External IDs Event Name  ISO  ...  \
0       Drought          Drought          NaN        NaN  DJI  ...   
1       Drought          Drought          NaN        NaN  SDN  ...   
2       Drought          Drought          NaN        NaN  SOM  ...   
3          Road             Road          NaN        NaN  AGO  ...   
4         Flood   Riverine flood          NaN        NaN  AGO  ...   

  Reconstruction Costs ('000 US$) Reconstruction Costs, Adjusted ('000 US$)  \
0            
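A quick way to see what the table covers is to count records per disaster group and type. The sketch below is illustrative only: `count_by` is a hypothetical helper, and the small `sample` frame stands in for `df1`, reusing the `Disaster Group` and `Disaster Type` values visible in the preview above.

```python
import pandas as pd

def count_by(df, columns):
    """Count records for each combination of values in `columns` (hypothetical helper)."""
    return (
        df.groupby(columns, dropna=False)   # keep rows whose key is NaN
          .size()
          .reset_index(name="Records")
          .sort_values("Records", ascending=False, ignore_index=True)
    )

# Stand-in frame mirroring two EM-DAT columns; values copied from the preview rows
sample = pd.DataFrame({
    "Disaster Group": ["Natural", "Natural", "Natural", "Technological", "Natural"],
    "Disaster Type": ["Drought", "Drought", "Drought", "Road", "Flood"],
})

summary = count_by(sample, ["Disaster Group", "Disaster Type"])
print(summary)
```

With the real data, the same call would be `count_by(df1, ["Disaster Group", "Disaster Type"])`, assuming those column names are unchanged in the downloaded file.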

## Kaggle Dataset: All Natural Disasters 1900-2021 / EOSDIS

#### This dataset, hosted on Kaggle, provides a record of natural disasters worldwide from 1900 to 2021, sourced from NASA's Earth Observing System Data and Information System (EOSDIS). It includes details such as disaster type, location, dates, and impacts (e.g., fatalities, affected populations, and economic damages), making it useful for analyzing historical disaster trends and impacts.

In [3]:
df2 = pd.read_csv("Datasets/EOSDIS.csv")
print(df2.head())

   Year   Seq Glide Disaster Group Disaster Subgroup      Disaster Type  \
0  1900  9002   NaN        Natural    Climatological            Drought   
1  1900  9001   NaN        Natural    Climatological            Drought   
2  1902    12   NaN        Natural       Geophysical         Earthquake   
3  1902     3   NaN        Natural       Geophysical  Volcanic activity   
4  1902    10   NaN        Natural       Geophysical  Volcanic activity   

  Disaster Subtype Disaster Subsubtype   Event Name     Country  ...  \
0          Drought                 NaN          NaN  Cabo Verde  ...   
1          Drought                 NaN          NaN       India  ...   
2  Ground movement                 NaN          NaN   Guatemala  ...   
3         Ash fall                 NaN  Santa Maria   Guatemala  ...   
4         Ash fall                 NaN  Santa Maria   Guatemala  ...   

  No Affected No Homeless Total Affected Insured Damages ('000 US$)  \
0         NaN         NaN            NaN     
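Both previews show overlapping column names (for example `Disaster Group` and `Disaster Type`), so it may help to compare the two schemas before combining the sources. The following is a minimal sketch; `compare_columns` is a hypothetical helper, and the two empty frames only imitate a few of the columns seen in the previews.

```python
import pandas as pd

def compare_columns(left, right):
    """Report columns shared by both frames and those unique to each, in original order."""
    left_cols, right_cols = set(left.columns), set(right.columns)
    return {
        "shared": [c for c in left.columns if c in right_cols],
        "left_only": [c for c in left.columns if c not in right_cols],
        "right_only": [c for c in right.columns if c not in left_cols],
    }

# Empty stand-ins carrying a handful of column names from the two previews
emdat_like = pd.DataFrame(columns=["DisNo.", "Disaster Group", "Disaster Type", "ISO"])
eosdis_like = pd.DataFrame(columns=["Year", "Seq", "Disaster Group", "Disaster Type", "Country"])

report = compare_columns(emdat_like, eosdis_like)
print(report)
```

On the loaded data this would be called as `compare_columns(df1, df2)`.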

## Global Disaster Alert and Coordination System (GDACS)

In [None]:
import requests
import pandas as pd

# GDACS API URL for latest events
GDACS_URL = "https://www.gdacs.org/gdacsapi/api/events/geteventlist/EVENTS4APP"

def fetch_gdacs_data():
    """Fetches disaster event data from GDACS API."""
    
    response = requests.get(GDACS_URL, timeout=30)  # avoid hanging on a stalled connection
    if response.status_code != 200:
        raise RuntimeError(f"GDACS API request failed with HTTP {response.status_code}")
    
    data = response.json()["features"]  # Extract event list
    
    # Extract relevant details
    events = []
    for event in data:
        properties = event["properties"]
        
        # Append event details to list
        events.append({
            "Event Name": properties["eventname"],
            "Country": properties["country"],
            "ISO": properties["iso3"],
            "Disaster Group": properties["eventtype"],  # GDACS event type code, e.g. EQ, TC
            "Latitude": event["geometry"]["coordinates"][1],  # Lat from GeoJSON
            "Longitude": event["geometry"]["coordinates"][0],  # Lon from GeoJSON
            "Start Year": properties["fromdate"][:4],  # Extract year from date
            "Start Month": properties["fromdate"][5:7],  # Extract month
            "Start Day": properties["fromdate"][8:10],  # Extract day
            "End Year": properties["todate"][:4] if properties["todate"] else None,
            "End Month": properties["todate"][5:7] if properties["todate"] else None,
            "End Day": properties["todate"][8:10] if properties["todate"] else None,
            # .get() yields None when the key is absent from the event properties
            "Magnitude": properties.get("magnitude"),
            "Losses": properties.get("economicloss"),
        })
    
    return pd.DataFrame(events)

# Fetch GDACS data
gdacs_df = fetch_gdacs_data()

# Display the first few rows
print(gdacs_df.head())

# Save to CSV (optional)
gdacs_df.to_csv("gdacs_disaster_data.csv", index=False)




  Event Name                  Country  ISO Disaster Group  Latitude  \
0                                                      EQ  -19.0936   
1                              Greece  GRC             EQ   36.6542   
2                                                      EQ  -11.4468   
3   VINCE-25  Cocos (Keeling) Islands                  TC  -18.7900   
4  TALIAH-25                                           TC  -14.8000   

   Longitude Start Year Start Month Start Day End Year End Month End Day  \
0  -172.6221       2025          02        05     2025        02      05   
1    25.6259       2025          02        05     2025        02      05   
2   117.2370       2025          02        05     2025        02      05   
3    83.9100       2025          02        01     2025        02      05   
4   106.0000       2025          02        02     2025        02      05   

  Magnitude Losses  
0      None   None  
1      None   None  
2      None   None  
3      None   None  
4      None
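The function above extracts year, month, and day by slicing the date strings at fixed positions, which silently produces wrong values if the format ever differs. A more defensive variant, sketched below, parses the timestamp first. It assumes the `fromdate` and `todate` values are ISO 8601 strings such as `2025-02-05T03:21:32`; that format is an assumption and the sample timestamps are made up. `split_date` is a hypothetical helper.

```python
from datetime import datetime
from typing import Optional

def split_date(value: Optional[str], prefix: str) -> dict:
    """Parse an ISO 8601 timestamp into zero-padded year/month/day strings.

    Returns None for every part when `value` is missing or empty, and raises
    ValueError on a malformed timestamp instead of returning a wrong slice.
    """
    if not value:
        return {f"{prefix} Year": None, f"{prefix} Month": None, f"{prefix} Day": None}
    parsed = datetime.fromisoformat(value)
    return {
        f"{prefix} Year": f"{parsed.year:04d}",
        f"{prefix} Month": f"{parsed.month:02d}",
        f"{prefix} Day": f"{parsed.day:02d}",
    }

# Made-up timestamps in the assumed format
start = split_date("2025-02-01T06:00:00", "Start")
end = split_date(None, "End")
print(start)
print(end)
```

Inside the event loop, the two dictionaries could be merged into the row with `{**split_date(properties["fromdate"], "Start"), **split_date(properties["todate"], "End")}`.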