# Data Exploration: Natural Disasters Worldwide

## Idea, Finding Data & Verification

![Datenpipeline1](../../imgs/datenpipeline1.png)

### Aggregated figures for Natural Disasters in EM-DAT

Link: https://data.humdata.org/dataset/emdat-country-profiles


In 1988, the **Centre for Research on the Epidemiology of Disasters (CRED)** launched the **Emergency Events Database (EM-DAT)**. EM-DAT was created with the initial support of the **World Health Organisation (WHO) and the Belgian Government**.

The main objective of the database is to **serve the purposes of humanitarian action at national and international levels**. The initiative aims to rationalise decision making for disaster preparedness, as well as provide an objective base for vulnerability assessment and priority setting.

EM-DAT contains essential core data on the **occurrence and effects of over 22,000 mass disasters in the world from 1900 to the present day**. The database is compiled from various sources, including UN agencies, non-governmental organisations, insurance companies, research institutes and press agencies.



In [None]:
import pandas as pd
data = pd.read_excel('../../data/emdat.xlsx', engine="openpyxl")

In [4]:
data

Unnamed: 0,Year,Country,ISO,Disaster Group,Disaster Subroup,Disaster Type,Disaster Subtype,Total Events,Total Affected,Total Deaths,"Total Damage (USD, original)","Total Damage (USD, adjusted)",CPI
0,#date +occurred,#country +name,#country +code,#cause +group,#cause +subgroup,#cause +type,#cause +subtype,#frequency,#affected +ind,#affected +ind +killed,,#value +usd,
1,1900,Cabo Verde,CPV,Natural,Climatological,Drought,Drought,1,,11000,,,3.077091
2,1900,India,IND,Natural,Climatological,Drought,Drought,1,,1250000,,,3.077091
3,1900,Jamaica,JAM,Natural,Hydrological,Flood,,1,,300,,,3.077091
4,1900,Japan,JPN,Natural,Geophysical,Volcanic activity,Ash fall,1,,30,,,3.077091
...,...,...,...,...,...,...,...,...,...,...,...,...,...
10338,2022,Yemen,YEM,Natural,Hydrological,Flood,Flash flood,1,3400,13,,,
10339,2022,South Africa,ZAF,Natural,Hydrological,Flood,,7,143119,562,3.164000e+09,,
10340,2022,Zambia,ZMB,Natural,Hydrological,Flood,,1,15000,3,,,
10341,2022,Zimbabwe,ZWE,Natural,Hydrological,Flood,,1,,,,,


## Data Exploration and Cleaning

![Datenpipeline2](../../imgs/datenpipeline2.png)

### Overview of the Data

In [None]:
# head() returns the first 5 rows
data.head()

How large is the dataset? How many rows and how many columns does it have?

In [None]:
data.shape

In [14]:
print(f'Number of rows: {data.shape[0]}')
print(f'Number of columns: {data.shape[1]}')

Number of rows: 10343
Number of columns: 13


The column names:

In [15]:
print(data.columns)

Index(['Year', 'Country', 'ISO', 'Disaster Group', 'Disaster Subroup',
       'Disaster Type', 'Disaster Subtype', 'Total Events', 'Total Affected',
       'Total Deaths', 'Total Damage (USD, original)',
       'Total Damage (USD, adjusted)', 'CPI'],
      dtype='object')


`info()` gives more information about the columns: the number of non-null values and the data type of each column.

In [16]:
print(data.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10343 entries, 0 to 10342
Data columns (total 13 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   Year                          10343 non-null  object 
 1   Country                       10343 non-null  object 
 2   ISO                           10343 non-null  object 
 3   Disaster Group                10343 non-null  object 
 4   Disaster Subroup              10343 non-null  object 
 5   Disaster Type                 10343 non-null  object 
 6   Disaster Subtype              8226 non-null   object 
 7   Total Events                  10343 non-null  object 
 8   Total Affected                7507 non-null   object 
 9   Total Deaths                  7318 non-null   object 
 10  Total Damage (USD, original)  3796 non-null   float64
 11  Total Damage (USD, adjusted)  3767 non-null   object 
 12  CPI                           10149 non-null  float64
dtypes
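The non-null counts reported by `info()` can be turned into explicit counts of missing values. The following is a minimal sketch on a small, made-up table (the `sample` frame and its values are illustrative assumptions, not taken from the dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the EM-DAT table; the values are made up for illustration.
sample = pd.DataFrame({
    "Country": ["India", "Japan", "Jamaica", "Chile"],
    "Disaster Subtype": ["Drought", "Ash fall", None, None],
    "Total Deaths": [1250000, 30, 300, np.nan],
})

# isna() marks missing cells, sum() counts them per column
missing_per_column = sample.isna().sum()

# The mean of the boolean mask is the share of missing values per column
missing_share = sample.isna().mean()

print(missing_per_column)
print(missing_share)
```

Applied to the real data, the same two calls would be `data.isna().sum()` and `data.isna().mean()`.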

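`describe()` shows basic summary statistics for columns with a numeric data type, i.e. `int` and `float`.

The method computes:
- the number of non-missing values (`count`)
- the mean
- the standard deviation
- the minimum and maximum
- the median (`50%`)
- the 25% and 75% quartiles

Only columns that pandas already treats as numeric are included, so columns still stored as `object` do not show up in the result.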

In [17]:
data.describe()

Unnamed: 0,"Total Damage (USD, original)",CPI
count,3796.0,10149.0
mean,1058358000.0,58.812266
std,6396224000.0,27.308606
min,2000.0,3.077091
25%,9988750.0,40.450409
50%,67000000.0,63.549547
75%,400000000.0,80.472285
max,210000000000.0,100.0
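To make the meaning of `count` concrete, here is a small sketch with an invented series that contains one missing value. The numbers are assumptions chosen so the statistics are easy to check by hand:

```python
import numpy as np
import pandas as pd

# Four entries, one of them missing (hypothetical values)
values = pd.Series([2.0, 4.0, np.nan, 6.0])

summary = values.describe()
print(summary)

# count reports the 3 non-missing values, not the 4 entries of the series;
# all other statistics are computed on those 3 values only.
print(len(values), summary["count"])
```

This is why the `count` row in the table above differs between columns: it reflects how many values are present in each column.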


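`.unique()` returns the distinct values of a column. For `Year`, the output shows that the first row holds a metadata tag (`#date +occurred`) instead of a year, which is why the column is stored as `object`.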

In [23]:
data['Year'].unique()

array(['#date +occurred', 1900, 1901, 1902, 1903, 1904, 1905, 1906, 1907,
       1908, 1909, 1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918,
       1919, 1920, 1921, 1922, 1923, 1924, 1925, 1926, 1927, 1928, 1929,
       1930, 1931, 1932, 1933, 1934, 1935, 1936, 1937, 1938, 1939, 1940,
       1941, 1942, 1943, 1944, 1945, 1946, 1947, 1948, 1949, 1950, 1951,
       1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960, 1961, 1962,
       1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973,
       1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984,
       1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995,
       1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006,
       2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
       2018, 2019, 2020, 2021, 2022], dtype=object)

### Data Cleaning: Removing the First Row of the DataFrame

In [24]:
data.index

RangeIndex(start=0, stop=10343, step=1)

In [27]:
data = data.drop(index=0)

In [32]:
data

Unnamed: 0,Year,Country,ISO,Disaster Group,Disaster Subroup,Disaster Type,Disaster Subtype,Total Events,Total Affected,Total Deaths,"Total Damage (USD, original)","Total Damage (USD, adjusted)",CPI
1,1900,Cabo Verde,CPV,Natural,Climatological,Drought,Drought,1,,11000,,,3.077091
2,1900,India,IND,Natural,Climatological,Drought,Drought,1,,1250000,,,3.077091
3,1900,Jamaica,JAM,Natural,Hydrological,Flood,,1,,300,,,3.077091
4,1900,Japan,JPN,Natural,Geophysical,Volcanic activity,Ash fall,1,,30,,,3.077091
5,1900,Turkey,TUR,Natural,Geophysical,Earthquake,Ground movement,1,,140,,,3.077091
...,...,...,...,...,...,...,...,...,...,...,...,...,...
10338,2022,Yemen,YEM,Natural,Hydrological,Flood,Flash flood,1,3400,13,,,
10339,2022,South Africa,ZAF,Natural,Hydrological,Flood,,7,143119,562,3.164000e+09,,
10340,2022,Zambia,ZMB,Natural,Hydrological,Flood,,1,15000,3,,,
10341,2022,Zimbabwe,ZWE,Natural,Hydrological,Flood,,1,,,,,
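As an alternative to dropping the tag row after loading, it can be skipped while reading the file. The sketch below uses an in-memory CSV as a stand-in for the Excel sheet (the `raw` text is a made-up miniature). `pd.read_excel` accepts the same `skiprows` parameter, but whether this fits the actual file layout is an assumption.

```python
import io
import pandas as pd

# Hypothetical CSV stand-in for the Excel sheet: header row, tag row, then data.
raw = io.StringIO(
    "Year,Country,Total Deaths\n"
    "#date +occurred,#country +name,#affected +ind +killed\n"
    "1900,India,1250000\n"
    "1900,Japan,30\n"
)

# skiprows=[1] skips the second line of the file (the tag row) and keeps the header
clean = pd.read_csv(raw, skiprows=[1])

print(clean)
print(clean.dtypes)
```

Because the tag row never enters the DataFrame, the numeric columns are parsed as numbers right away and need no later conversion.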


### Inspecting and Converting Data Types

In [33]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10342 entries, 1 to 10342
Data columns (total 13 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   Year                          10342 non-null  object 
 1   Country                       10342 non-null  object 
 2   ISO                           10342 non-null  object 
 3   Disaster Group                10342 non-null  object 
 4   Disaster Subroup              10342 non-null  object 
 5   Disaster Type                 10342 non-null  object 
 6   Disaster Subtype              8225 non-null   object 
 7   Total Events                  10342 non-null  object 
 8   Total Affected                7506 non-null   object 
 9   Total Deaths                  7317 non-null   object 
 10  Total Damage (USD, original)  3796 non-null   float64
 11  Total Damage (USD, adjusted)  3766 non-null   object 
 12  CPI                           10149 non-null  float64
dtypes

In [5]:
# Query the data type via the dtype attribute
data['Year'].dtype

dtype('O')

In [39]:
# Convert the data type
data["Year"] = pd.to_numeric(data["Year"])
data['Year'].dtype

dtype('int64')
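`pd.to_numeric` raises an error by default when a value cannot be parsed. If a column still contains stray text, `errors="coerce"` converts such values to `NaN` instead. A minimal sketch with an invented column (`raw_years` is hypothetical):

```python
import pandas as pd

# Hypothetical column mixing numbers with a stray text value
raw_years = pd.Series(["1900", 1901, "n/a", 1903], dtype=object)

# Default behaviour: an unparseable value raises a ValueError
try:
    pd.to_numeric(raw_years)
except ValueError as err:
    print(f"default behaviour raises: {err}")

# errors="coerce" turns values that cannot be parsed into NaN instead
years = pd.to_numeric(raw_years, errors="coerce")
print(years)
```

Coercing hides problems as well as solving them, so it is worth checking afterwards how many values became `NaN`.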

In [42]:
# Apply the conversion to all remaining integer and float columns
cols = ['Total Events', 'Total Affected', 'Total Deaths', 'Total Damage (USD, adjusted)']
for col in cols:
    data[col] = pd.to_numeric(data[col])

In [43]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10342 entries, 1 to 10342
Data columns (total 13 columns):
 #   Column                        Non-Null Count  Dtype  
---  ------                        --------------  -----  
 0   Year                          10342 non-null  int64  
 1   Country                       10342 non-null  object 
 2   ISO                           10342 non-null  object 
 3   Disaster Group                10342 non-null  object 
 4   Disaster Subroup              10342 non-null  object 
 5   Disaster Type                 10342 non-null  object 
 6   Disaster Subtype              8225 non-null   object 
 7   Total Events                  10342 non-null  int64  
 8   Total Affected                7506 non-null   float64
 9   Total Deaths                  7317 non-null   float64
 10  Total Damage (USD, original)  3796 non-null   float64
 11  Total Damage (USD, adjusted)  3766 non-null   float64
 12  CPI                           10149 non-null  float64
dtypes
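In the output above, columns with missing values become `float64` even though they hold whole numbers, because `NaN` is itself a float. One possible way to keep integers is pandas' nullable `Int64` dtype. The sketch below uses invented values and is only one option, not something the notebook itself does:

```python
import numpy as np
import pandas as pd

# Hypothetical death counts with one missing value
deaths = pd.Series([11000, np.nan, 300], name="Total Deaths")
print(deaths.dtype)  # float64, because NaN is a float

# "Int64" (capital I) is the nullable integer dtype; missing values become <NA>
deaths_int = deaths.astype("Int64")
print(deaths_int)
```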

### Overview of the Numeric Data

In [44]:
data.describe()

Unnamed: 0,Year,Total Events,Total Affected,Total Deaths,"Total Damage (USD, original)","Total Damage (USD, adjusted)",CPI
count,10342.0,10342.0,7506.0,7317.0,3796.0,3766.0,10149.0
mean,1995.349836,1.444885,1123573.0,3122.315,1058358000.0,1591526000.0,58.812266
std,21.97513,1.236398,9804163.0,72840.39,6396224000.0,8361409000.0,27.308606
min,1900.0,1.0,1.0,1.0,2000.0,2287.0,3.077091
25%,1986.0,1.0,1200.0,6.0,9988750.0,18280970.0,40.450409
50%,2000.0,1.0,11414.0,23.0,67000000.0,136037900.0,63.549547
75%,2011.0,1.0,116972.0,91.0,400000000.0,718804200.0,80.472285
max,2022.0,20.0,330000000.0,3700000.0,210000000000.0,252973400000.0,100.0


### Overview of the Object Data

In [7]:
# Distinct countries
countries = data['Country'].unique()
print(countries)

['#country +name' 'Cabo Verde' 'India' 'Jamaica' 'Japan' 'Turkey'
 'United States of America (the)' 'Azerbaijan' 'China' 'Guatemala'
 'Myanmar' 'Martinique' 'Soviet Union' 'Saint Vincent and the Grenadines'
 'Canada' 'Comoros (the)' 'Iran (Islamic Republic of)' 'Israel'
 'Niger (the)' 'Bangladesh' 'Greece' 'Taiwan (Province of China)'
 'Albania' 'Italy' 'Philippines (the)' 'Belgium' 'Chile' 'Colombia'
 'Hong Kong' 'Romania' 'France' 'Haiti' 'Morocco' 'Pakistan' 'Portugal'
 'Burkina Faso' 'Costa Rica' 'Algeria' 'Gambia (the)' 'Guinea-Bissau'
 'Mali' 'Mauritania' 'Senegal' 'Chad' 'Kazakhstan' 'Mexico' 'Tajikistan'
 'Indonesia' 'Peru' 'Tokelau' 'Puerto Rico' 'Anguilla' 'Argentina'
 'Germany Fed Rep' 'Ecuador' 'Bahamas (the)' 'Cuba' 'Egypt' 'Jordan'
 'Bulgaria' 'Guadeloupe' 'Saint Kitts and Nevis' 'Montserrat' 'Poland'
 'New Zealand' 'Dominica' 'Dominican Republic (the)' 'Nicaragua' 'Armenia'
 'Belize' 'Fiji' 'Honduras' 'Solomon Islands' 'Trinidad and Tobago'
 'El Salvador' 'Korea (the Rep

In [46]:
print(len(countries))

225


In [50]:
# Check whether a country occurs in the list
'Germany' in countries

True

In [51]:
# Find every country name that contains "german"
for country in countries:
    if 'german' in country.lower():
        print(country)

Germany Fed Rep
Germany Dem Rep
Germany
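The loop above can also be written as a vectorised string search on a Series. A minimal sketch, assuming a small made-up list of names (`names` is hypothetical and includes a missing value to show the `na` argument):

```python
import pandas as pd

# Hypothetical country names, including one missing entry
names = pd.Series(["Germany Fed Rep", "Germany Dem Rep", "Germany", "France", None])

# case=False ignores upper/lower case; na=False treats missing names as "no match"
mask = names.str.contains("german", case=False, na=False)
matches = names[mask].tolist()

print(matches)
```

The resulting boolean mask can be used directly to filter a DataFrame, for example to collect all rows that belong to any of the German entries.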


In [52]:
data['Disaster Group'].unique()

array(['Natural'], dtype=object)

`.value_counts()` shows how often each distinct value occurs in a column.

In [53]:
data['Disaster Subroup'].value_counts()

Hydrological      4442
Meteorological    3304
Geophysical       1355
Climatological    1148
Biological          93
Name: Disaster Subroup, dtype: int64

With the argument `normalize=True`, the counts are returned as relative frequencies, i.e. as shares of the total.

In [54]:
data['Disaster Subroup'].value_counts(normalize=True)

Hydrological      0.429511
Meteorological    0.319474
Geophysical       0.131019
Climatological    0.111004
Biological        0.008992
Name: Disaster Subroup, dtype: float64
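The relationship between the two outputs can be checked on a small example: the normalised values are the counts divided by the number of non-missing entries. The series below is invented for illustration:

```python
import pandas as pd

# Hypothetical subgroup labels
subgroups = pd.Series(["Hydrological", "Hydrological", "Meteorological", "Geophysical"])

counts = subgroups.value_counts()                # absolute frequencies
shares = subgroups.value_counts(normalize=True)  # relative frequencies, summing to 1

# Multiply by 100 and round to read the shares as percentages
percent = (shares * 100).round(1)

print(counts)
print(percent)
```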

In [55]:
data['Disaster Type'].value_counts()

Flood                    3802
Storm                    2746
Earthquake               1078
Drought                   774
Landslide                 640
Extreme temperature       557
Wildfire                  372
Volcanic activity         230
Insect infestation         92
Mass movement (dry)        47
Glacial lake outburst       2
Fog                         1
Animal accident             1
Name: Disaster Type, dtype: int64

In [56]:
data['Disaster Type'].value_counts(normalize=True)

Flood                    0.367627
Storm                    0.265519
Earthquake               0.104235
Drought                  0.074840
Landslide                0.061884
Extreme temperature      0.053858
Wildfire                 0.035970
Volcanic activity        0.022239
Insect infestation       0.008896
Mass movement (dry)      0.004545
Glacial lake outburst    0.000193
Fog                      0.000097
Animal accident          0.000097
Name: Disaster Type, dtype: float64


### Sorting DataFrames

DataFrames can be sorted by one or more columns.



In [57]:
data.sort_values(by="Total Affected")

Unnamed: 0,Year,Country,ISO,Disaster Group,Disaster Subroup,Disaster Type,Disaster Subtype,Total Events,Total Affected,Total Deaths,"Total Damage (USD, original)","Total Damage (USD, adjusted)",CPI
7790,2011,Italy,ITA,Natural,Hydrological,Landslide,,1,1.0,3.0,,,83.012674
6781,2007,China,CHN,Natural,Hydrological,Landslide,Landslide,1,1.0,33.0,,,76.518679
6300,2005,Spain,ESP,Natural,Climatological,Wildfire,Forest fire,1,1.0,11.0,2.050000e+09,2.844401e+09,72.071410
9866,2020,Taiwan (Province of China),TWN,Natural,Meteorological,Storm,Tropical cyclone,1,1.0,1.0,,,95.512967
6767,2007,Barbados,BRB,Natural,Geophysical,Earthquake,Ground movement,1,1.0,,,,76.518679
...,...,...,...,...,...,...,...,...,...,...,...,...,...
10293,2022,Puerto Rico,PRI,Natural,Meteorological,Storm,Tropical cyclone,1,,25.0,,,
10294,2022,Portugal,PRT,Natural,Climatological,Drought,Drought,1,,,,,
10324,2022,Uruguay,URY,Natural,Hydrological,Flood,,1,,,,,
10325,2022,United States of America (the),USA,Natural,Climatological,Drought,Drought,1,,,3.000000e+09,,
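The last rows of the output above all have a missing `Total Affected` value: by default `sort_values` places `NaN` at the end, regardless of sort direction. The `na_position` parameter changes this. A small sketch with made-up rows:

```python
import numpy as np
import pandas as pd

# Hypothetical rows, one with a missing value
sample = pd.DataFrame({
    "Country": ["Italy", "Portugal", "India"],
    "Total Affected": [1.0, np.nan, 330000000.0],
})

# Default: missing values are sorted to the end
default_order = sample.sort_values(by="Total Affected")

# na_position="first" moves them to the top instead
nan_first = sample.sort_values(by="Total Affected", na_position="first")

print(default_order)
print(nan_first)
```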


In [61]:
# The 10 rows with the highest number of people affected
data.sort_values(by="Total Affected", ascending=False).head(n=10)

Unnamed: 0,Year,Country,ISO,Disaster Group,Disaster Subroup,Disaster Type,Disaster Subtype,Total Events,Total Affected,Total Deaths,"Total Damage (USD, original)","Total Damage (USD, adjusted)",CPI
8610,2015,India,IND,Natural,Climatological,Drought,Drought,1,330000000.0,,3000000000.0,3429750000.0,87.469932
2721,1987,India,IND,Natural,Climatological,Drought,Drought,1,300000000.0,300.0,,,41.932736
5563,2002,India,IND,Natural,Climatological,Drought,Drought,1,300000000.0,,910722000.0,1371942000.0,66.381964
4482,1998,China,CHN,Natural,Hydrological,Flood,Riverine flood,3,241134300.0,3760.0,30998700000.0,51529310000.0,60.157411
3274,1991,China,CHN,Natural,Hydrological,Flood,Riverine flood,2,210232227.0,1835.0,8030000000.0,15976650000.0,50.260853
1319,1972,India,IND,Natural,Climatological,Drought,Drought,1,200000000.0,,100000000.0,647994400.0,15.432233
5778,2003,China,CHN,Natural,Hydrological,Flood,Riverine flood,6,155924986.0,662.0,15329640000.0,22580480000.0,67.888896
4107,1996,China,CHN,Natural,Hydrological,Flood,Riverine flood,2,154634000.0,3975.0,18914500000.0,32676160000.0,57.884706
7490,2010,China,CHN,Natural,Hydrological,Flood,Riverine flood,5,140194136.0,1911.0,18171000000.0,22580440000.0,80.472285
3618,1993,India,IND,Natural,Hydrological,Flood,,1,128000000.0,827.0,7000000000.0,13130350000.0,53.31162
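For a "top n" query like this one, `nlargest` is a compact alternative to sorting and then calling `head`. The sketch below compares both on an invented table:

```python
import numpy as np
import pandas as pd

# Hypothetical rows, one with a missing value
sample = pd.DataFrame({
    "Country": ["India", "China", "Italy", "Portugal"],
    "Total Affected": [330000000.0, 241134300.0, 1.0, np.nan],
})

# The two rows with the largest values in "Total Affected"
top2 = sample.nlargest(2, "Total Affected")

# The same result via sorting in descending order
top2_sorted = sample.sort_values(by="Total Affected", ascending=False).head(2)

print(top2)
```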


In [59]:
# Sorting by several columns is possible, each with its own sort direction
data.sort_values(by=["Disaster Type", "Total Affected"], ascending=[True, False]).head(n=10)

Unnamed: 0,Year,Country,ISO,Disaster Group,Disaster Subroup,Disaster Type,Disaster Subtype,Total Events,Total Affected,Total Deaths,"Total Damage (USD, original)","Total Damage (USD, adjusted)",CPI
8426,2014,Niger (the),NER,Natural,Biological,Animal accident,,1,5.0,12.0,,,87.366298
8610,2015,India,IND,Natural,Climatological,Drought,Drought,1,330000000.0,,3000000000.0,3429750000.0,87.469932
2721,1987,India,IND,Natural,Climatological,Drought,Drought,1,300000000.0,300.0,,,41.932736
5563,2002,India,IND,Natural,Climatological,Drought,Drought,1,300000000.0,,910722000.0,1371942000.0,66.381964
1319,1972,India,IND,Natural,Climatological,Drought,Drought,1,200000000.0,,100000000.0,647994400.0,15.432233
913,1965,India,IND,Natural,Climatological,Drought,Drought,1,100000000.0,1500000.0,100000000.0,859993800.0,11.62799
2086,1982,India,IND,Natural,Climatological,Drought,Drought,1,100000000.0,,,,35.612841
3753,1994,China,CHN,Natural,Climatological,Drought,Drought,2,88690000.0,,13755200000.0,25145840000.0,54.701693
5485,2002,China,CHN,Natural,Climatological,Drought,Drought,3,64560000.0,,1210000000.0,1822784000.0,66.381964
7271,2009,China,CHN,Natural,Climatological,Drought,Drought,2,60160000.0,,3600000000.0,4546959000.0,79.173803


### Indexing and Retrieving Data

The values of a column can be accessed with `<dataframe>['<column name>']`.

In [62]:
data['Year']

1        1900
2        1900
3        1900
4        1900
5        1900
         ... 
10338    2022
10339    2022
10340    2022
10341    2022
10342    2022
Name: Year, Length: 10342, dtype: int64

Further operations or methods can then be applied to the result:

In [63]:
data['Year'] + 10

1        1910
2        1910
3        1910
4        1910
5        1910
         ... 
10338    2032
10339    2032
10340    2032
10341    2032
10342    2032
Name: Year, Length: 10342, dtype: int64

In [64]:
data['Year'].mean()

1995.3498356217367

Several columns are selected by passing a list of column names:

In [65]:
data[['Year', 'Country', 'Disaster Type', 'Total Affected']]

Unnamed: 0,Year,Country,Disaster Type,Total Affected
1,1900,Cabo Verde,Drought,
2,1900,India,Drought,
3,1900,Jamaica,Flood,
4,1900,Japan,Volcanic activity,
5,1900,Turkey,Earthquake,
...,...,...,...,...
10338,2022,Yemen,Flood,3400.0
10339,2022,South Africa,Flood,143119.0
10340,2022,Zambia,Flood,15000.0
10341,2022,Zimbabwe,Flood,
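Rows and columns can also be selected together. The sketch below uses `.loc` (label-based) and `.iloc` (position-based) on a made-up frame whose index starts at 1, similar to the data after the first row was dropped:

```python
import pandas as pd

# Hypothetical frame whose index labels start at 1 instead of 0
sample = pd.DataFrame(
    {
        "Year": [1900, 1900, 1900],
        "Country": ["Cabo Verde", "India", "Jamaica"],
        "Total Deaths": [11000, 1250000, 300],
    },
    index=[1, 2, 3],
)

# .loc selects by label; the slice 1:2 includes both end points
by_label = sample.loc[1:2, ["Year", "Country"]]

# .iloc selects by position; position 0 is the row with label 1
first_row = sample.iloc[0]

print(by_label)
print(first_row)
```

The distinction matters here because labels and positions no longer coincide once a row has been removed without resetting the index.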


### Boolean Indexing

The selected data can also be filtered by passing a condition.


In [68]:
data[data['Country'] == 'Germany']

Unnamed: 0,Year,Country,ISO,Disaster Group,Disaster Subroup,Disaster Type,Disaster Subtype,Total Events,Total Affected,Total Deaths,"Total Damage (USD, original)","Total Damage (USD, adjusted)",CPI
3121,1990,Germany,DEU,Natural,Meteorological,Storm,,6,,64.0,4.440000e+09,9.208027e+09,48.218797
3289,1991,Germany,DEU,Natural,Meteorological,Storm,,1,,,5.000000e+06,9.948100e+06,50.260853
3431,1992,Germany,DEU,Natural,Geophysical,Earthquake,Ground movement,1,1525.0,1.0,5.000000e+07,9.655648e+07,51.783162
3432,1992,Germany,DEU,Natural,Hydrological,Flood,,1,,,3.010000e+07,5.812700e+07,51.783162
3582,1993,Germany,DEU,Natural,Hydrological,Flood,Riverine flood,1,100000.0,5.0,6.000000e+08,1.125458e+09,53.311620
...,...,...,...,...,...,...,...,...,...,...,...,...,...
9469,2019,Germany,DEU,Natural,Meteorological,Storm,Convective storm,1,,1.0,,,94.349092
9708,2020,Germany,DEU,Natural,Meteorological,Storm,Extra-tropical storm,1,33.0,,,,95.512967
9953,2021,Germany,DEU,Natural,Meteorological,Storm,Convective storm,2,604.0,1.0,,,100.000000
9952,2021,Germany,DEU,Natural,Hydrological,Flood,,1,1000.0,197.0,4.000000e+10,4.000000e+10,100.000000


In [69]:
data[data['Total Deaths'] >= 1000]

Unnamed: 0,Year,Country,ISO,Disaster Group,Disaster Subroup,Disaster Type,Disaster Subtype,Total Events,Total Affected,Total Deaths,"Total Damage (USD, original)","Total Damage (USD, adjusted)",CPI
1,1900,Cabo Verde,CPV,Natural,Climatological,Drought,Drought,1,,11000.0,,,3.077091
2,1900,India,IND,Natural,Climatological,Drought,Drought,1,,1250000.0,,,3.077091
6,1900,United States of America (the),USA,Natural,Meteorological,Storm,Tropical cyclone,1,,6000.0,3.000000e+07,9.749468e+08,3.077091
9,1902,China,CHN,Natural,Geophysical,Earthquake,Ground movement,1,,2500.0,,,3.200175
10,1902,Guatemala,GTM,Natural,Geophysical,Earthquake,Ground movement,1,,2000.0,2.500000e+07,7.812074e+08,3.200175
...,...,...,...,...,...,...,...,...,...,...,...,...,...
9996,2021,India,IND,Natural,Hydrological,Flood,,8,1324439.0,1520.0,3.200000e+09,3.200000e+09,100.000000
10150,2022,Afghanistan,AFG,Natural,Geophysical,Earthquake,Ground movement,2,368627.0,1064.0,,,
10226,2022,India,IND,Natural,Hydrological,Flood,,3,2101260.0,2098.0,,,
10278,2022,Pakistan,PAK,Natural,Hydrological,Flood,,1,33012865.0,1730.0,,,
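Several conditions can be combined with `&` (and) and `|` (or). Each condition needs its own parentheses because of operator precedence. A minimal sketch on an invented table:

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration
sample = pd.DataFrame({
    "Country": ["Germany", "Germany", "India", "India"],
    "Disaster Type": ["Flood", "Storm", "Flood", "Drought"],
    "Total Deaths": [197.0, 1.0, 1520.0, np.nan],
})

# Floods with at least 100 deaths; rows with a missing death count do not match
deadly_floods = sample[
    (sample["Disaster Type"] == "Flood") & (sample["Total Deaths"] >= 100)
]

# isin() tests membership in a list of values
floods_or_storms = sample[sample["Disaster Type"].isin(["Flood", "Storm"])]

print(deadly_floods)
print(floods_or_storms)
```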


How many people are affected by an earthquake on average?

In [70]:
data[data['Disaster Type'] == 'Earthquake']

Unnamed: 0,Year,Country,ISO,Disaster Group,Disaster Subroup,Disaster Type,Disaster Subtype,Total Events,Total Affected,Total Deaths,"Total Damage (USD, original)","Total Damage (USD, adjusted)",CPI
5,1900,Turkey,TUR,Natural,Geophysical,Earthquake,Ground movement,1,,140.0,,,3.077091
7,1901,Japan,JPN,Natural,Geophysical,Earthquake,Tsunami,1,24.0,18.0,,,3.077091
8,1902,Azerbaijan,AZE,Natural,Geophysical,Earthquake,Ground movement,1,17540.0,86.0,,,3.200175
9,1902,China,CHN,Natural,Geophysical,Earthquake,Ground movement,1,,2500.0,,,3.200175
10,1902,Guatemala,GTM,Natural,Geophysical,Earthquake,Ground movement,1,,2000.0,25000000.0,781207389.0,3.200175
...,...,...,...,...,...,...,...,...,...,...,...,...,...
10276,2022,Pakistan,PAK,Natural,Geophysical,Earthquake,Ground movement,1,,43.0,,,
10281,2022,Peru,PER,Natural,Geophysical,Earthquake,Ground movement,2,602.0,,,,
10285,2022,Philippines (the),PHL,Natural,Geophysical,Earthquake,Ground movement,2,650278.0,10.0,2272000.0,,
10291,2022,Papua New Guinea,PNG,Natural,Geophysical,Earthquake,Ground movement,1,1969.0,7.0,,,


In [71]:
data[data['Disaster Type'] == 'Earthquake']['Total Affected']

5             NaN
7            24.0
8         17540.0
9             NaN
10            NaN
           ...   
10276         NaN
10281       602.0
10285    650278.0
10291      1969.0
10317       140.0
Name: Total Affected, Length: 1078, dtype: float64

In [72]:
data[data['Disaster Type'] == 'Earthquake']['Total Affected'].mean()

233219.1548974943
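Two details affect how this number should be read. First, `mean()` skips rows where `Total Affected` is missing. Second, a row can aggregate several events (see `Total Events`), so the result is an average per row rather than per single earthquake. The sketch below contrasts both readings on invented numbers; it assumes that `Total Affected` is the sum over all events in a row, which the dataset title ("aggregated figures") suggests but which is not verified here.

```python
import numpy as np
import pandas as pd

# Hypothetical earthquake rows: the second row aggregates two events
quakes = pd.DataFrame({
    "Total Events": [1, 2, 1],
    "Total Affected": [100.0, 600.0, np.nan],
})

# Average per row; the row with the missing value is ignored: (100 + 600) / 2
mean_per_row = quakes["Total Affected"].mean()

# Average per event, using only rows where the number affected is known: 700 / 3
known = quakes.dropna(subset=["Total Affected"])
mean_per_event = known["Total Affected"].sum() / known["Total Events"].sum()

print(mean_per_row, mean_per_event)
```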

### Further Research Questions

- How many natural disasters occurred in Germany?
- In which year did the most natural disasters occur?
- Which countries are most severely affected by natural disasters?
- Which countries are affected by natural disasters but have comparatively few deaths?
- Which natural disasters are the deadliest?
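Questions like these can be approached with filtering and `groupby`. The following is a rough sketch on a tiny made-up table, not an analysis of the real data; it assumes that `Total Events` should be summed to count disasters, since each row may aggregate several events.

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration
sample = pd.DataFrame({
    "Year": [2020, 2021, 2021, 2021],
    "Country": ["Germany", "Germany", "Germany", "India"],
    "Disaster Type": ["Storm", "Storm", "Flood", "Flood"],
    "Total Events": [1, 2, 1, 8],
    "Total Deaths": [np.nan, 1.0, 197.0, 1520.0],
})

# Number of events in one country: filter the rows, then sum the event counts
events_germany = sample.loc[sample["Country"] == "Germany", "Total Events"].sum()

# Year with the most events: sum per year, then take the label of the maximum
events_per_year = sample.groupby("Year")["Total Events"].sum()
busiest_year = events_per_year.idxmax()

# Deaths per disaster type, largest first (missing values are skipped by sum)
deaths_per_type = (
    sample.groupby("Disaster Type")["Total Deaths"].sum().sort_values(ascending=False)
)

print(events_germany, busiest_year)
print(deaths_per_type)
```

On the real data, the same pattern would start from `data` instead of `sample`. For Germany, the separate entries `Germany Fed Rep` and `Germany Dem Rep` would have to be considered as well.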