There is always a lot of news going on, but one of the biggest, and most horrifying, stories from the last few weeks is the massive earthquake that struck Turkey and Syria, with the initial shock taking place on February 6th. The reports that I've been reading led me to wonder: How frequently do earthquakes occur? How strong are they normally? Was this one unusually strong?

I decided to dig into information about earthquakes. The [US Geological Survey](https://www.usgs.gov/), a scientific research agency that’s part of the Department of the Interior, [offers many tools and data sets for anyone interested in learning more about this subject](https://www.usgs.gov/programs/earthquake-hazards/data).

For this week’s data, I went to the USGS page that lets you [search their earthquake catalog](https://earthquake.usgs.gov/earthquakes/search/#{%22feed%22%3A%221437493916387%22%2C%22search%22%3A{%22id%22%3A%221437493916387%22%2C%22name%22%3A%22Search%20Results%22%2C%22isSearch%22%3Atrue%2C%22params%22%3A{%22producttype%22%3A%22losspager%22%2C%22orderby%22%3A%22time%22}}%2C%22listFormat%22%3A%22losspager%22%2C%22sort%22%3A%22newest%22%2C%22basemap%22%3A%22grayscale%22%2C%22autoUpdate%22%3Afalse%2C%22restrictListToMap%22%3Atrue%2C%22timeZone%22%3A%22utc%22%2C%22mapposition%22%3A[[-85%2C0]%2C[85%2C360]]%2C%22overlays%22%3A{%22plates%22%3Atrue}%2C%22viewModes%22%3A{%22map%22%3Atrue%2C%22list%22%3Atrue%2C%22settings%22%3Atrue%2C%22help%22%3Afalse}}). I chose to see earthquakes with a minimum magnitude of 2.5, between January 1st, 2000 and today (February 15, 2023). I asked to see all data, from all over the world. I chose the CSV output option, and ended up with the following URL:

https://earthquake.usgs.gov/fdsnws/event/1/query.csv?starttime=2000-01-01%2000:00:00&endtime=2023-02-15%2000:00:00&minmagnitude=2.5&orderby=time&producttype=losspager
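
The same URL can also be assembled in code rather than copied from the browser, which makes it easier to change the date range or minimum magnitude later. The following is a minimal sketch using only the standard library; the parameter names and values are taken from the URL above, while `BASE_URL` and `build_query_url` are names invented here for illustration.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URL above
BASE_URL = "https://earthquake.usgs.gov/fdsnws/event/1/query.csv"


def build_query_url(params):
    """Join the endpoint and the query parameters into one URL.

    quote_via=quote encodes spaces as %20 (not "+"), and safe=":" keeps
    the colons in the timestamps readable, matching the URL above.
    """
    return f"{BASE_URL}?{urlencode(params, quote_via=quote, safe=':')}"


params = {
    "starttime": "2000-01-01 00:00:00",
    "endtime": "2023-02-15 00:00:00",
    "minmagnitude": "2.5",
    "orderby": "time",
    "producttype": "losspager",
}

url = build_query_url(params)
print(url)
```

The resulting string can be passed directly to `pd.read_csv`, assuming the machine running the notebook has network access.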

That gives us a data set of more than 8,000 earthquakes from all over the world, for more than 20 years, which should be enough to ask and answer some interesting questions:

1. Read the downloaded CSV file

2. How many seismic events take place each year? In which of the last 20 years did we have the greatest number of such events?

3. What are common magnitudes? Looking only at the integer portion of the magnitudes, how often does each value occur?

4. How many seismic events took place in Turkey on February 6th? What was their mean magnitude? What was the median magnitude?

5. Are earthquakes common in Turkey? From the "place" column, extract the text following the final comma, and get the 30 most common places in the world with earthquakes.

6. Are serious earthquakes common in Turkey? Rerun the previous query, but only look for those with a magnitude of 5 or greater.

The learning goals for this week’s questions are to work with time information, work with strings, and handle some missing data.

In [1]:
# Import libraries
import pandas as pd
import numpy as np

In [6]:
# 1. Read the downloaded CSV
filepath = "/Users/tomioredein/data_analyst/pandas_port/pandas_portfolio/pandas_port/Earthquake/query.csv"
df = pd.read_csv(filepath, parse_dates=["time"])
df.head(10)


Unnamed: 0,time,latitude,longitude,depth,mag,magType,nst,gap,dmin,rms,...,updated,place,type,horizontalError,depthError,magError,magNst,status,locationSource,magSource
0,2023-02-14 13:16:50.827000+00:00,45.0998,23.2013,10.0,5.6,mww,197.0,27.0,1.217,0.6,...,2023-07-28T14:12:02.324Z,"0 km ESE of Lelești, Romania",earthquake,5.14,1.742,0.032,95.0,reviewed,us,us
1,2023-02-14 02:03:16.305000+00:00,-15.3552,167.541,119.646,5.2,mww,130.0,20.0,0.338,0.8,...,2023-04-22T21:33:07.040Z,"44 km ENE of Luganville, Vanuatu",earthquake,7.67,3.289,0.089,12.0,reviewed,us,us
2,2023-02-13 09:18:12.524000+00:00,-29.3698,-178.9718,354.0,6.1,mww,120.0,34.0,0.919,1.07,...,2023-04-22T21:33:05.040Z,"Kermadec Islands, New Zealand",earthquake,9.97,1.679,0.049,40.0,reviewed,us,us
3,2023-02-12 02:21:23.340000+00:00,19.149833,-155.370833,30.21,3.83,ml,49.0,180.0,,0.15,...,2023-04-22T21:33:01.040Z,"12 km ESE of Pāhala, Hawaii",earthquake,0.54,0.71,0.1112,39.0,reviewed,hv,hv
4,2023-02-11 15:31:26.636000+00:00,-35.3616,-15.5697,10.0,5.6,mww,43.0,85.0,21.229,0.63,...,2023-04-15T21:39:58.040Z,Tristan da Cunha region,earthquake,12.93,1.881,0.073,18.0,reviewed,us,us
5,2023-02-11 08:55:05.019000+00:00,3.5892,126.6923,25.0,5.9,mww,200.0,36.0,2.879,0.88,...,2023-04-15T21:39:57.040Z,"242 km SE of Sarangani, Philippines",earthquake,6.76,1.86,0.052,35.0,reviewed,us,us
6,2023-02-09 21:12:19.574000+00:00,-1.2752,67.5616,10.0,5.5,mww,61.0,38.0,5.71,0.6,...,2023-04-15T21:39:52.040Z,Carlsberg Ridge,earthquake,8.53,1.835,0.062,25.0,reviewed,us,us
7,2023-02-09 10:53:16.999000+00:00,-10.76,161.6593,35.0,5.5,mww,99.0,28.0,2.137,0.71,...,2023-04-15T21:39:51.040Z,"44 km SW of Kirakira, Solomon Islands",earthquake,7.2,1.833,0.065,23.0,reviewed,us,us
8,2023-02-09 09:09:16.562000+00:00,-55.715,-27.1882,35.0,5.4,mww,68.0,67.0,5.542,0.44,...,2023-04-15T21:39:50.040Z,South Sandwich Islands region,earthquake,10.26,1.885,0.083,14.0,reviewed,us,us
9,2023-02-08 15:16:16.982000+00:00,16.7337,-86.0574,10.0,5.5,mww,201.0,22.0,2.908,0.55,...,2023-04-15T21:39:48.040Z,north of Honduras,earthquake,6.18,1.777,0.034,84.0,reviewed,us,us
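
Since the learning goals include time handling and missing data, it can help to confirm what `parse_dates` produced and which columns have gaps before going further. The sketch below uses two made-up rows in a reduced version of the USGS column layout (the values are illustrative only, not real events), so it runs without the downloaded file.

```python
import io

import pandas as pd

# Two invented rows; the second one has no place name
sample_csv = io.StringIO(
    "time,latitude,longitude,depth,mag,place\n"
    '2023-01-02T03:04:05.678Z,10.0,20.0,10.0,5.1,"5 km N of Town A, Country A"\n'
    "2023-01-01T00:00:00.000Z,-15.0,160.0,35.0,4.2,\n"
)

sample = pd.read_csv(sample_csv, parse_dates=["time"])

# The trailing "Z" marks UTC, so the column becomes timezone-aware
print(sample["time"].dtype)

# Number of missing values in each column
missing_per_column = sample.isna().sum()
print(missing_per_column)
```

Running `df.isna().sum()` on the real data frame in the same way shows which columns (for example `dmin` or `place`) need care in later steps.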


In [27]:
# 2. Number of events per year, most active year first
df['time'].dt.year.value_counts()

time
2021    1023
2020     931
2019     848
2018     831
2014     755
2022     744
2015     693
2016     676
2013     610
2017     592
2023      90
2010      57
2012      20
2008      19
2009      18
2005      17
2001      15
2011      13
2002      11
2003      10
2004       8
2000       8
2006       5
2007       3
Name: count, dtype: int64
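
`value_counts` sorts by frequency, so the busiest year is simply the first row (2021 in the output above). To answer the question programmatically and restrict it to the last 20 years, something like the following sketch could be used. The timestamps are invented, and treating "the last 20 years" as 2003 onward is an assumption.

```python
import pandas as pd

# Invented timestamps standing in for df['time']
times = pd.Series(
    pd.to_datetime(
        ["2021-03-01", "2021-07-15", "2020-01-10", "2001-05-05"], utc=True
    )
)

# Events per year, in chronological order
per_year = times.dt.year.value_counts().sort_index()

# Keep only the last 20 years (assumed here to mean 2003 onward)
recent = per_year.loc[per_year.index >= 2003]

# Year with the greatest number of events
busiest_year = recent.idxmax()
print(busiest_year, recent.max())
```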

In [14]:
# 3. Share of events by the integer part of the magnitude (astype truncates the decimals)
df['mag'].astype(np.int8).value_counts(normalize=True).sort_index()

mag
2    0.001116
3    0.059794
4    0.197618
5    0.567547
6    0.155688
7    0.016747
8    0.001489
Name: proportion, dtype: float64
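
One caveat with `astype(np.int8)`: it raises an error if the column contains any missing values. A slightly more defensive variant, sketched here on invented magnitudes, drops the missing values first and uses `np.floor` to make the rounding direction explicit.

```python
import numpy as np
import pandas as pd

# Invented magnitudes, including one missing value
mags = pd.Series([2.5, 4.9, 5.0, 5.7, np.nan, 6.1])

# Drop missing values, then keep only the integer part
whole = np.floor(mags.dropna()).astype(int)

# Proportion of events at each integer magnitude
shares = whole.value_counts(normalize=True).sort_index()
print(shares)
```

For positive magnitudes, flooring and truncating give the same result, so this should match the output above whenever the column has no gaps.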

In [28]:
# All events recorded on February 6th, 2023 (partial-string lookup on the time index)
df.set_index('time').loc['2023-02-06']

time,latitude,longitude,depth,mag,magType,nst,gap,dmin,rms,net,...,place,type,horizontalError,depthError,magError,magNst,status,locationSource,magSource,place_extracted
2023-02-06 15:33:32.721000+00:00,38.19,38.1756,7.43,5.4,mwr,112.0,29.0,1.273,0.8,us,...,eastern Turkey,earthquake,5.06,1.918,0.08,15.0,reviewed,us,us,eastern Turkey
2023-02-06 15:14:34.402000+00:00,37.8791,37.7355,10.0,4.9,mwr,132.0,28.0,0.82,0.82,us,...,"13 km NE of Gölbaşı, Turkey",earthquake,5.22,1.804,0.063,24.0,reviewed,us,us,Turkey
2023-02-06 12:02:11.275000+00:00,38.0582,36.5114,8.516,6.0,mb,138.0,25.0,0.356,0.54,us,...,"4 km NNE of Göksun, Turkey",earthquake,4.32,3.954,0.024,647.0,reviewed,us,us,Turkey
2023-02-06 10:51:30.994000+00:00,38.248,38.1847,10.0,5.7,mb,96.0,42.0,0.95,0.97,us,...,"7 km SW of Yeşilyurt, Turkey",earthquake,4.74,1.773,0.047,158.0,reviewed,us,us,Turkey
2023-02-06 10:35:58.161000+00:00,38.0249,37.8023,10.0,5.8,mb,109.0,56.0,0.973,0.9,us,...,"9 km SW of Doğanşehir, Turkey",earthquake,4.5,1.797,0.051,141.0,reviewed,us,us,Turkey
2023-02-06 10:26:46.742000+00:00,38.0315,38.0984,10.0,6.0,mb,65.0,63.0,1.111,0.88,us,...,"12 km W of Çelikhan, Turkey",earthquake,4.28,1.605,0.081,57.0,reviewed,us,us,Turkey
2023-02-06 10:24:48.811000+00:00,38.0106,37.1962,7.432,7.5,mww,111.0,29.0,0.73,0.7,us,...,"Elbistan earthquake, Kahramanmaras earthquake ...",earthquake,5.09,3.709,0.041,58.0,reviewed,us,us,Kahramanmaras earthquake sequence
2023-02-06 02:03:37.341000+00:00,37.7712,37.9141,16.929,5.3,mb,154.0,35.0,0.792,0.61,us,...,Central Turkey,earthquake,4.28,3.347,0.052,122.0,reviewed,us,us,Central Turkey
2023-02-06 01:36:27.357000+00:00,36.9921,36.6832,10.0,5.6,mb,167.0,47.0,0.458,0.59,us,...,Turkey-Syria border region,earthquake,6.2,1.76,0.052,128.0,reviewed,us,us,Turkey-Syria border region
2023-02-06 01:28:15.784000+00:00,37.1893,36.8929,9.797,6.7,mww,194.0,19.0,0.254,0.68,us,...,"14 km E of Nurdağı, Turkey",earthquake,7.68,3.333,0.098,10.0,reviewed,us,us,Turkey


In [24]:
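# 4. Drop rows without a place name, since str.contains returns NaN for them
df = df.dropna(subset='place')
# Summarize the magnitudes of February 6th events whose place mentions Turkey
df.loc[df['place'].str.contains("Turkey")].set_index('time').loc['2023-02-06', 'mag'].describe()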

count    10.000000
mean      5.710000
std       0.481779
min       4.900000
25%       5.450000
50%       5.700000
75%       5.950000
max       6.700000
Name: mag, dtype: float64
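
In the `describe()` output, `mean` is the average (5.71) and the `50%` row is the median (5.70). If only those figures are needed, they can be requested directly with `agg`. The sketch below, on invented rows, also passes `na=False` to `str.contains`, which treats missing place names as non-matches and so avoids the separate `dropna` step.

```python
import pandas as pd

# Invented events; one has no place name and one falls on a different day
quakes = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2023-02-06 01:00",
                "2023-02-06 10:00",
                "2023-02-06 12:00",
                "2023-02-07 03:00",
            ],
            utc=True,
        ),
        "place": ["5 km E of Town A, Turkey", "eastern Turkey", None, "Town B, Turkey"],
        "mag": [6.0, 5.0, 4.0, 5.5],
    }
)

# Missing place names count as "no match" instead of NaN
in_turkey = quakes["place"].str.contains("Turkey", case=False, na=False)

# Compare calendar days by stripping the time of day
on_day = quakes["time"].dt.normalize() == pd.Timestamp("2023-02-06", tz="UTC")

summary = quakes.loc[in_turkey & on_day, "mag"].agg(["count", "mean", "median"])
print(summary)
```

Note that matching on the text "Turkey" only finds events whose `place` value contains that word; an event labeled differently, such as the "Kahramanmaras earthquake sequence" row shown earlier, would not be counted.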

In [25]:
df['place_extracted'] = df['place'].str.split(',').str[-1].str.strip()
# Alternative: df['place'].str.split(',').str.get(-1).str.strip()

top_places = df['place_extracted'].value_counts().head(30)

# Display the result
print(top_places)

place_extracted
Alaska                              913
Indonesia                           521
CA                                  397
Papua New Guinea                    367
Chile                               273
Japan                               270
Tonga                               222
South Sandwich Islands region       202
Philippines                         197
New Zealand                         183
Solomon Islands                     183
Vanuatu                             183
Oklahoma                            175
Hawaii                              170
Kermadec Islands region             147
Russia                              137
Mexico                              135
California                          130
Puerto Rico                         129
MX                                  106
Nevada                              102
Peru                                 90
Pacific-Antarctic Ridge              81
Fiji                                 78
Idaho                   
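
The output above lists both "CA" and "California", and both "MX" and "Mexico", so the same region is being counted under two labels. One possible fix, sketched here on invented place names, is to map the abbreviations to full names before counting. The `aliases` dictionary is a hypothetical mapping based on the assumption that "CA" means California and "MX" means Mexico in this data; it would need extending for any other abbreviations.

```python
import pandas as pd

# Invented place strings in the same style as the 'place' column
places = pd.Series(
    [
        "10 km N of Town A, CA",
        "3 km S of Town B, California",
        "Town C, MX",
        "Town D, Mexico",
        "Some Ridge",
    ]
)

# Hypothetical abbreviation-to-name mapping (an assumption, not from USGS docs)
aliases = {"CA": "California", "MX": "Mexico"}

# Text after the final comma, with abbreviations replaced by full names
region = places.str.split(",").str[-1].str.strip().replace(aliases)

counts = region.value_counts()
print(counts)
```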

In [26]:
# 6. Same as the previous query, restricted to events of magnitude 5 or greater
df.loc[df['mag'] >= 5, 'place'].str.split(',').str.get(-1).str.strip().value_counts().head(30)

place
Indonesia                           516
Papua New Guinea                    366
Chile                               270
Japan                               267
Alaska                              265
Tonga                               221
South Sandwich Islands region       202
Philippines                         197
Solomon Islands                     183
Vanuatu                             182
New Zealand                         169
Kermadec Islands region             146
Russia                              134
Mexico                              132
Peru                                 89
Pacific-Antarctic Ridge              81
Fiji                                 76
southeast of the Loyalty Islands     74
south of the Fiji Islands            73
Japan region                         72
Iran                                 71
China                                66
New Caledonia                        60
southern East Pacific Rise           60
Greece                            