# Geospatial data analysis for environment studies 
## Lecture 7: Animated global maps of Geospatial Data (Corona Vaccines and Earthquakes)
#### *Name*: Siraphop SAISA-ARD (Mag)  
#### *Student ID*: 21M51964


### Create a global animation of COVID vaccination cases per country
Using the same data as in Lecture 6, an animation of the national vaccination percentage can be created with Plotly. The first section reproduces the example from the lecture.

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import plotly.express as px
import plotly
import matplotlib.pyplot as plt

df_vac = pd.read_csv('../input/covid-world-vaccination-progress/country_vaccinations.csv',header=0)
df_vac.index = pd.to_datetime(df_vac['date'])

df_vac = df_vac.sort_index()
# df_vac.interpolate(method='linear', limit_direction='forward', inplace=True, axis=0)

fig = px.choropleth(df_vac,
                    locations='country',locationmode='country names', color='total_vaccinations_per_hundred',
                    hover_name='country',animation_frame='date', color_continuous_scale='rainbow', range_color=(0.0,50.0))
fig.update_layout(title_text='Vaccination percentage of the country total population', title_x=0.45)
fig.show()

# plotly.offline.plot(fig,filename='vaccinations.html')

#### Data interpolation

In the second section, the data are interpolated linearly in the forward direction. Each country's data are sorted and interpolated separately, and the results are then combined back into a single dataframe. It may look simple, but it took hours to find a working method. Grouping by country is necessary because the dataframe is sorted by its date index, so adjacent rows usually belong to different countries. Sorting by country alone does not work either, because the interpolation would run across the boundary between two countries. (I also tried sort_values followed by sort_index, which did not work, and considered stacking dataframes, which only made the problem more complicated.)
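The reason for grouping can be seen on a small toy table. This is only a sketch with made-up values: the column names `total`, `total_grouped` and `total_naive` are hypothetical and do not come from the vaccination dataset. It compares interpolating within each country against interpolating the whole column at once.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical values): two countries with gaps in a cumulative column
toy = pd.DataFrame({
    'country': ['A', 'A', 'A', 'B', 'B', 'B'],
    'date': pd.to_datetime(['2021-01-01', '2021-01-02', '2021-01-03'] * 2),
    'total': [10.0, np.nan, 30.0, np.nan, 2.0, np.nan],
})

# Sort by country and date, then interpolate within each country only
toy = toy.sort_values(['country', 'date'])
toy['total_grouped'] = toy.groupby('country')['total'].transform(
    lambda s: s.interpolate(method='linear', limit_direction='forward'))

# For comparison: interpolating the whole column mixes the two countries
toy['total_naive'] = toy['total'].interpolate(method='linear', limit_direction='forward')

print(toy)
```

In the naive column, the first row of country B is filled with a value between the last value of country A and the next value of country B, which is exactly the discontinuity problem described above. In the grouped column that row stays empty, because forward interpolation has no earlier value for country B to start from.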

The colorbar used here is Yellow-Orange-Red (YlOrRd) and the range is from 0 to 100%.

The quantity visualized here is the total vaccination percentage, the same as in the lecture. I believe it is the most sensible one to plot, because the race against the virus is about reaching a threshold total vaccination percentage.

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import plotly.express as px
import plotly
import matplotlib.pyplot as plt

df_vac_in = pd.read_csv('../input/covid-world-vaccination-progress/country_vaccinations.csv',header=0)
df_vac_in.index = pd.to_datetime(df_vac_in['date'])

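countries = np.unique(df_vac_in['country'])
frames = []

for country in countries:
    tmp = df_vac_in[df_vac_in['country']==country].copy()
    tmp = tmp.sort_index()  # sort_index() returns a new frame, so assign it back
    # interpolate only the numeric columns; text columns (country, vaccines, ...) are left as-is
    num_cols = tmp.select_dtypes(include='number').columns
    tmp[num_cols] = tmp[num_cols].interpolate(method='linear', limit_direction='forward', axis=0)
    frames.append(tmp)

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df_vac = pd.concat(frames).sort_index()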
fig = px.choropleth(df_vac,
                    locations='country',locationmode='country names', color='total_vaccinations_per_hundred',
                    hover_name='country',animation_frame='date', color_continuous_scale='YlOrRd', range_color=(0.0,100.0))
fig.update_layout(title_text='Vaccination percentage of the country total population (interpolated)', title_x=0.44)
fig.show()

This section visualizes the interpolation result. Changing the 'country' variable in the first line shows the raw and interpolated data for another country. Linear interpolation was chosen for its simplicity, on the assumption that the gaps are relatively short; it is also a common way to interpolate cumulative data.
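One detail behind the "short gap" assumption: pandas' `method='linear'` ignores the index and treats rows as equally spaced, while `method='time'` weights by the actual dates. The sketch below uses a hypothetical three-point series (not the vaccination data) to show the difference when the dates are irregular.

```python
import numpy as np
import pandas as pd

# Hypothetical cumulative series with irregular date spacing
s = pd.Series([0.0, np.nan, 9.0],
              index=pd.to_datetime(['2021-01-01', '2021-01-02', '2021-01-10']))

by_row = s.interpolate(method='linear')   # ignores the index, rows are equally spaced
by_time = s.interpolate(method='time')    # weights by the actual time gap

print(by_row.iloc[1])   # halfway between 0 and 9
print(by_time.iloc[1])  # 1 day out of 9 days between the known points
```

When records are daily and gaps are short, the two methods give nearly the same values, which is the situation assumed here.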

In [None]:
country = 'Japan'

tmp = pd.read_csv('../input/covid-world-vaccination-progress/country_vaccinations.csv',header=0)
tmp.index = pd.to_datetime(tmp['date'])
# select the country first, sort by date, then interpolate a separate copy
tmp = tmp[tmp['country']==country].sort_index()
tmp_interpo = tmp.copy()
tmp_interpo['total_vaccinations_per_hundred'] = tmp_interpo['total_vaccinations_per_hundred'].interpolate(
    method='linear', limit_direction='forward')

fig,ax = plt.subplots(figsize=(15,5))
ax.plot(tmp.index,tmp['total_vaccinations_per_hundred'],label='Raw data')
ax.plot(tmp_interpo.index,tmp_interpo['total_vaccinations_per_hundred'], linestyle = ':',label='Interpolated data')
plt.title("Interpolated total vaccination percentage in " + country)
plt.legend()
plt.show()

### Geo-scatter plot of a severe earthquake incident
Similarly, the first section is the example from the lecture. It visualizes recent global earthquakes in May 2021. We can see that there were no major incidents.
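The earthquake cells group events into hourly animation frames by formatting each timestamp with a fixed `:00:00` suffix. The sketch below shows that step in isolation on three made-up timestamps; the ISO format with a trailing `Z` is an assumption about how the times are stored in the CSV.

```python
import pandas as pd

# Hypothetical event times, assumed to be ISO 8601 strings in UTC
times = pd.to_datetime(pd.Series([
    '2021-05-01T00:12:34.567Z',
    '2021-05-01T00:48:02.100Z',
    '2021-05-01T01:05:10.000Z',
]))

# Minutes and seconds are replaced by zeros, so events in the same hour share a label
frames = times.dt.strftime('%Y-%m-%d %H:00:00')

print(frames.tolist())
print(frames.nunique())
```

Each distinct label becomes one frame when the column is passed as `animation_frame`, so the first two events here would appear together and the third in the next frame.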

In [None]:
df_quake = pd.read_csv('../input/202105-recent-earthquake/query.csv', header=0)
df_quake.index = pd.to_datetime(df_quake['time'])
df_quake['time'] = df_quake.index.strftime('%Y-%m-%d %H:00:00')
df_quake = df_quake.sort_index()
fig = px.scatter_geo(df_quake,
                     lat='latitude', lon='longitude', color='mag',
                     animation_frame='time', color_continuous_scale='jet', range_color=(1.0,9.0))
fig.update_layout(title_text='Global Earthquake in May 2021', title_x=0.5)
fig.show()

#### 2004 Indian Ocean earthquake and tsunami (a.k.a. Sumatra-Andaman earthquake)

On 26 December 2004, an earthquake of magnitude 9.1-9.3 occurred off the coast of northern Sumatra, Indonesia, at around 8 AM local time (about 01:00 UTC). It caused a tsunami with waves up to 30 m high that killed around 200,000 people in 14 affected countries. It is the most powerful earthquake recorded in the 21st century so far and the third most powerful since modern seismography began in 1900. The aftershocks can be observed for days after the main shock.

In [None]:
df_quake = pd.read_csv('../input/2004-great-earthquakefix/query (1).csv', header=0)
df_quake.index = pd.to_datetime(df_quake['time'])
df_quake['time'] = df_quake.index.strftime('%Y-%m-%d %H:00:00')
df_quake = df_quake.sort_index()
fig = px.scatter_geo(df_quake,
                     lat='latitude', lon='longitude', color='mag',
                     animation_frame='time', color_continuous_scale='jet', range_color=(1.0,10.0))
fig.update_layout(title_text='Indian Ocean Earthquake, 25-28 December 2004', title_x=0.5)
fig.show()

The effect may even have extended to Alaska on the other side of the Earth, where many small earthquakes can be observed. Because they appear both before and after the main earthquake, part of this activity is probably unrelated to it. The event attracted a lot of attention at the time, and the affected regions received donations totalling more than USD 14 billion.

In [None]:
import pandas as pd
import plotly.express as px

df_quake = pd.read_csv('../input/2004-global-earthquake/query (3).csv', header=0)
df_quake.index = pd.to_datetime(df_quake['time'])
df_quake['time'] = df_quake.index.strftime('%Y-%m-%d %H:00:00')
df_quake = df_quake.sort_index()
fig = px.scatter_geo(df_quake,
                     lat='latitude', lon='longitude', color='mag',
                     animation_frame='time', color_continuous_scale='jet', range_color=(1.0,10.0))
fig.update_layout(title_text='Global Earthquakes, 25-27 December 2004', title_x=0.5)
fig.show()

#### References
[1] Wikipedia. 2004 Indian Ocean earthquake and tsunami. https://en.wikipedia.org/wiki/2004_Indian_Ocean_earthquake_and_tsunami

[2] BBC. (2014). Indian Ocean tsunami: Then and now. https://www.bbc.com/news/world-asia-30034501

[3] The Standard. (2020). 26 December 2547 B.E. - Thailand crashed by a Tsunami (Thai). https://thestandard.co/onthisday26122547/

[4] Manager Online. (2005). News Overview 27/12/2004 - 2/1/2005 (Thai). https://mgronline.com/onlinesection/detail/9480000000537