**The objective of this project is to practice data manipulation with Python and to plot data on a map using geolocation data from a CSV file. The dataset for this project contains information about the earthquakes registered in Costa Rica from 1/31/2021 to 3/1/2021, with 40 data points describing date, time, magnitude, depth, location, latitude, and longitude.**

In [1]:
#Import the modules we are going to use
import os
import pandas as pd

In [2]:
#Instead of reading the data from the CSV file, let's retrieve it from the site itself using this URL: http://www.ovsicori.una.ac.cr/sistemas/sentidos_map/indexleqs.php
#pd.read_html returns a list of DataFrames, one for each table found on the page
table_data = pd.read_html("http://www.ovsicori.una.ac.cr/sistemas/sentidos_map/indexleqs.php")

In [3]:
#Take the first table from the list as our DataFrame
df = pd.DataFrame(data=table_data[0])
df.head(10)

Unnamed: 0,Fecha,Hora,Magnitud,Profundidad,Latitud,Longitud,Localizacion,Revisado,Autor,Mapa
0,,,,,,,,,,
1,2021-05-06,16:28:30,2.1,54.0,9.4277,-83.9117,3.2 km al Este de Rio Nuevo de San Jose,No,olDbg,mapa
2,2021-05-06,14:54:10,2.1,63.0,10.1848,-84.0883,2.0 km al Noroeste de Varablanca de Heredia,No,olDbg,mapa
3,2021-05-06,12:56:15,2.0,14.0,7.9804,-83.1524,30.7 km al Suroeste de Pavon de Puntarenas,No,olDbg,mapa
4,2021-05-06,08:30:34,2.9,26.0,8.8474,-82.9829,1.0 km al Noreste de San Vito de Puntarenas,No,olDbg,mapa
5,2021-05-06,07:38:38,2.6,98.0,10.1762,-83.8806,3.2 km al Suroeste de Guapiles de Limon,No,olDbg,mapa
6,2021-05-06,05:39:41,2.3,18.0,9.5127,-84.9676,16.5 km al Sureste de Cobano de Puntarenas,No,olDbg,mapa
7,2021-05-06,02:01:44,2.0,22.0,9.6131,-84.9445,12.7 km al Sureste de Cobano de Puntarenas,Si,-,mapa
8,2021-05-06,00:21:03,1.0,4.0,9.8978,-83.609,0.7 km al Este de Pavones de Cartago,Si,UNA:jprottiMwpM,mapa
9,2021-05-05,23:18:24,2.2,18.0,8.524,-82.8923,0.7 km al Noroeste de Canoas de Puntarenas,No,olDbg,mapa


In [4]:
#Drop the rows that contain missing values, such as the empty first row
df = df.dropna()
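
To make the effect of `dropna()` concrete, here is a minimal sketch on a made-up three-row frame (the values are illustrative, not taken from the site). By default a row is dropped if any of its values is missing, while `how="all"` drops only rows in which every value is missing, like the empty first row in the output above.

```python
import numpy as np
import pandas as pd

# Illustrative data: the first row is entirely empty, the last one lacks a depth
sample = pd.DataFrame({
    "Magnitud": [np.nan, 2.1, 2.9],
    "Profundidad": [np.nan, 54.0, np.nan],
})

dropped_any = sample.dropna()           # drops every row with at least one NaN
dropped_all = sample.dropna(how="all")  # drops only rows where all values are NaN

print(len(dropped_any), len(dropped_all))
```

If a later version of the source table had occasional gaps in a single column, the default behavior would silently discard those earthquakes too, so `how="all"` may be the safer choice in that case.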

In [5]:
#Drop the columns "Localizacion", "Revisado", "Autor" and "Mapa", since we are going to use the geopy library to find the location from the coordinates
df = df.drop(columns=["Localizacion", "Revisado","Autor","Mapa"])

In [6]:
#Confirm the columns were dropped
df.head(10)

Unnamed: 0,Fecha,Hora,Magnitud,Profundidad,Latitud,Longitud
1,2021-05-06,16:28:30,2.1,54.0,9.4277,-83.9117
2,2021-05-06,14:54:10,2.1,63.0,10.1848,-84.0883
3,2021-05-06,12:56:15,2.0,14.0,7.9804,-83.1524
4,2021-05-06,08:30:34,2.9,26.0,8.8474,-82.9829
5,2021-05-06,07:38:38,2.6,98.0,10.1762,-83.8806
6,2021-05-06,05:39:41,2.3,18.0,9.5127,-84.9676
7,2021-05-06,02:01:44,2.0,22.0,9.6131,-84.9445
8,2021-05-06,00:21:03,1.0,4.0,9.8978,-83.609
9,2021-05-05,23:18:24,2.2,18.0,8.524,-82.8923
10,2021-05-05,23:06:24,1.5,42.0,9.5142,-83.8732


In [7]:
#Let's rename the columns from Spanish to English
df = df.rename(columns={"Fecha":"Date","Hora":"Local Time","Magnitud":"Magnitude","Profundidad":"Deepth(KM)", "Latitud":"Latitude", "Longitud":"Longitude"})

In [8]:
#Confirm our columns were renamed correctly
df.head(10)

Unnamed: 0,Date,Local Time,Magnitude,Deepth(KM),Latitude,Longitude
1,2021-05-06,16:28:30,2.1,54.0,9.4277,-83.9117
2,2021-05-06,14:54:10,2.1,63.0,10.1848,-84.0883
3,2021-05-06,12:56:15,2.0,14.0,7.9804,-83.1524
4,2021-05-06,08:30:34,2.9,26.0,8.8474,-82.9829
5,2021-05-06,07:38:38,2.6,98.0,10.1762,-83.8806
6,2021-05-06,05:39:41,2.3,18.0,9.5127,-84.9676
7,2021-05-06,02:01:44,2.0,22.0,9.6131,-84.9445
8,2021-05-06,00:21:03,1.0,4.0,9.8978,-83.609
9,2021-05-05,23:18:24,2.2,18.0,8.524,-82.8923
10,2021-05-05,23:06:24,1.5,42.0,9.5142,-83.8732


In [9]:
#The geopy reverse() function accepts the coordinates as a single "latitude, longitude" string, so let's create a column called Coordinates by converting the latitude and longitude values to strings and concatenating them.
df["Coordinates"] = df["Latitude"].astype(str) + ", " + df["Longitude"].astype(str)

In [10]:
#Now we can see the column Coordinates holding both latitude and longitude as a single string
df.head()

Unnamed: 0,Date,Local Time,Magnitude,Deepth(KM),Latitude,Longitude,Coordinates
1,2021-05-06,16:28:30,2.1,54.0,9.4277,-83.9117,"9.4277, -83.9117"
2,2021-05-06,14:54:10,2.1,63.0,10.1848,-84.0883,"10.1848, -84.0883"
3,2021-05-06,12:56:15,2.0,14.0,7.9804,-83.1524,"7.9804, -83.1524"
4,2021-05-06,08:30:34,2.9,26.0,8.8474,-82.9829,"8.8474, -82.9829"
5,2021-05-06,07:38:38,2.6,98.0,10.1762,-83.8806,"10.1762, -83.8806"
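
The string column is one option among several: geopy's `reverse()` is also documented to accept a `(latitude, longitude)` tuple. The sketch below builds both forms side by side on two rows copied from the table above. The column name `LatLon` is a hypothetical one introduced only for this illustration.

```python
import pandas as pd

# Two rows copied from the table above
sample = pd.DataFrame({
    "Latitude": [9.4277, 10.1848],
    "Longitude": [-83.9117, -84.0883],
})

# String form, as used in this notebook
sample["Coordinates"] = (
    sample["Latitude"].astype(str) + ", " + sample["Longitude"].astype(str)
)

# Tuple form (hypothetical column "LatLon"): keeps the values as floats
sample["LatLon"] = list(zip(sample["Latitude"], sample["Longitude"]))

print(sample["Coordinates"].tolist())
print(sample["LatLon"].tolist())
```

The tuple form avoids a float-to-string round trip, while the string form is easier to read in hover text on a map.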


Let's use the geopy library to find, from the coordinates, the location where each earthquake was registered:
https://pypi.org/project/geopy/


In [11]:
#Importing the geopy library
from geopy.geocoders import Nominatim

In [12]:
#Create the Nominatim geocoder; user_agent identifies our application to the service
geolocator = Nominatim(user_agent="Earthquakes in CR")

In [13]:
#Reverse-geocode each coordinate string; this sends one request to Nominatim per row
df["Location"] = df["Coordinates"].apply(geolocator.reverse)

In [14]:
df.head(10)

Unnamed: 0,Date,Local Time,Magnitude,Deepth(KM),Latitude,Longitude,Coordinates,Location
1,2021-05-06,16:28:30,2.1,54.0,9.4277,-83.9117,"9.4277, -83.9117","(El Llano, Río Nuevo, Cantón Pérez Zeledón, Pr..."
2,2021-05-06,14:54:10,2.1,63.0,10.1848,-84.0883,"10.1848, -84.0883","(Varablanca, Cantón Heredia, Provincia Heredia..."
3,2021-05-06,12:56:15,2.0,14.0,7.9804,-83.1524,"7.9804, -83.1524",
4,2021-05-06,08:30:34,2.9,26.0,8.8474,-82.9829,"8.8474, -82.9829","(San Vito, Cantón Coto Brus, Provincia Puntare..."
5,2021-05-06,07:38:38,2.6,98.0,10.1762,-83.8806,"10.1762, -83.8806","(Alturas, Guápiles, Cantón Pococí, Provincia L..."
6,2021-05-06,05:39:41,2.3,18.0,9.5127,-84.9676,"9.5127, -84.9676",
7,2021-05-06,02:01:44,2.0,22.0,9.6131,-84.9445,"9.6131, -84.9445","(Provincia Puntarenas, Costa Rica, (9.13740299..."
8,2021-05-06,00:21:03,1.0,4.0,9.8978,-83.609,"9.8978, -83.609","(San Rafael, Pavones, Cantón Turrialba, Provin..."
9,2021-05-05,23:18:24,2.2,18.0,8.524,-82.8923,"8.524, -82.8923","(Calle La Gloria, La Gloria, Canoas, Cantón Co..."
10,2021-05-05,23:06:24,1.5,42.0,9.5142,-83.8732,"9.5142, -83.8732","(Río Nuevo, Cantón Pérez Zeledón, Provincia Sa..."
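
The blank Location cells in rows 3 and 6 are consistent with `reverse()` returning `None`, which geopy does when the service finds no match for a point. The remaining cells hold geopy `Location` objects rather than plain text. A minimal sketch of one way to turn the column into plain address strings while tolerating the missing results follows. The helper `to_address` is hypothetical, and the stand-in objects (with a shortened, illustrative address) replace real geopy results so that the example runs without network access.

```python
from types import SimpleNamespace

import pandas as pd


def to_address(location):
    """Hypothetical helper: return the address text, or None if nothing was found."""
    return location.address if location is not None else None


# Stand-ins for geopy results so the sketch runs offline:
# an object with an .address attribute, and None for "no match"
fake_results = pd.Series(
    [SimpleNamespace(address="San Vito, Cantón Coto Brus"), None],
    dtype=object,
)

addresses = fake_results.apply(to_address)
print(addresses.tolist())
```

In the notebook itself the same helper could be applied to `df["Location"]`. Since the public Nominatim service asks clients to stay at or below one request per second, wrapping `geolocator.reverse` with geopy's `RateLimiter` (from `geopy.extra.rate_limiter`) is worth considering for larger tables.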


Now that we have our dataset ready, let's create some visualizations using Plotly: https://plotly.com/python/mapbox-layers/

In [15]:
import plotly.express as px

In [16]:
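#Plot each earthquake as a blue point; hovering over a point shows its date, time, magnitude, depth and coordinates
fig = px.scatter_mapbox(df, lat="Latitude", lon="Longitude", hover_name="Date", hover_data=["Local Time", "Magnitude", "Deepth(KM)", "Coordinates"], color_discrete_sequence=["blue"])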
fig.update_layout(mapbox_style="open-street-map")
fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

In [20]:
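#Save the interactive map as a standalone HTML file (raw string so the backslashes in the Windows path are taken literally)
fig.write_html(r"C:\Users\jocerdas\OneDrive - Microsoft\Documents\Python\Python Projects\earthquakers\cr_earthquakes\cr_earthquakes.html")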

In [19]:
#Density heatmap of the same earthquakes, weighted by magnitude
fig2 = px.density_mapbox(df, lat="Latitude", lon="Longitude", z="Magnitude", radius=10, mapbox_style="open-street-map")
fig2.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig2.show()