
## Visualizing earthquakes 

Let's use Plotly for an exploratory analysis of a dataset that records various incidents around the world, in particular a large number of earthquakes that have occurred since 1965!

The dataset is available on Kaggle, where you can download it and find a description: https://www.kaggle.com/usgs/earthquake-database

## Beginning with the dataset

1. Import pandas and the different packages of Plotly

In [None]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.io as pio

pd.options.display.max_columns = None

2. Read the file and show the first rows as well as basic statistics about the dataset

In [None]:
dataset = pd.read_csv("earthquakes.csv")

dataset.head()


In [None]:
dataset.describe(include="all")

3. In the following, we will use the `Date` column a lot. To avoid problems with your visualizations, use pandas to convert this column to a datetime type:

In [None]:
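# Assign the converted column directly: with recent pandas versions,
# `dataset.loc[:, 'Date'] = ...` may keep the original object dtype
dataset['Date'] = pd.to_datetime(dataset['Date'])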
dataset = dataset.sort_values(by = 'Date')
dataset.head()
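
If some entries of `Date` cannot be parsed, the conversion may raise an error or leave the column with an unexpected type. As a minimal sketch, assuming the dates are month/day/year strings (the toy values below are made up for illustration and are not taken from the dataset), `errors="coerce"` turns unparseable entries into `NaT` so they can be inspected:

```python
import pandas as pd

# Toy data (hypothetical values, not from the real file)
toy = pd.DataFrame({"Date": ["01/02/1965", "01/04/1965", "not a date"]})

# errors="coerce" replaces anything that does not match the format with NaT
toy["Date"] = pd.to_datetime(toy["Date"], format="%m/%d/%Y", errors="coerce")

# Count the rows that failed to parse
n_failed = toy["Date"].isna().sum()
print(toy.dtypes)
print(f"{n_failed} row(s) could not be parsed")
```

Rows with `NaT` can then be fixed by hand or dropped with `dropna(subset=["Date"])`, depending on how many there are.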

## Exploring the number of observations over time 

4. Make a simple plot with plotly express to display the number of observed events as a function of time

In [None]:
fig = px.histogram(dataset, x = 'Date')
fig.show()

If the `Date` column has been converted to a datetime type, plotly handles it directly and chooses the bin size automatically, here so that events are counted by year. This makes the histogram easy to read.

In the following, you will create customized plots with plotly.graph_objects to display more detailed information about the number of events.

5. With plotly.graph_objects, create a histogram with a range slider such that you can visualize the daily number of events

In [None]:
fig = go.Figure(
    data = go.Histogram(
        x = dataset['Date'], nbinsx = dataset['Date'].nunique()),
    layout = go.Layout(
        title = go.layout.Title(text = "Number of observations per day", x = 0.5),
        xaxis = go.layout.XAxis(title = 'Date', rangeslider = go.layout.xaxis.Rangeslider(visible = True))
    )
)

fig.show()
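
Note that `nbinsx` only sets a maximum number of bins, so the bars are not guaranteed to match calendar days exactly. One possible alternative, sketched here on made-up toy data (the values and the `Magnitude` column are only for illustration), is to compute the daily counts with pandas first:

```python
import pandas as pd

# Toy data (hypothetical values): two events on Jan 2, none on Jan 3, one on Jan 4
toy = pd.DataFrame({
    "Date": pd.to_datetime(["1965-01-02", "1965-01-02", "1965-01-04"]),
    "Magnitude": [6.0, 5.8, 6.2],
})

# Resample on the date index with a one-day frequency;
# days without any event get a count of 0
daily_counts = toy.set_index("Date").resample("D").size()
print(daily_counts)
```

The resulting series could then be drawn with `go.Bar(x = daily_counts.index, y = daily_counts.values)` instead of `go.Histogram`.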

6. **Optional** We would like to check whether there is some seasonality in the number of events. Create new columns in the dataset holding the month and the year of each event. Then use plotly's `make_subplots` function to display the monthly number of observations within a given year, independently for each of the last five years (so there will be 5 independent histograms)

In [None]:
dataset.loc[:, 'Year'] = dataset['Date'].dt.year
dataset.loc[:, 'Month'] = dataset['Date'].dt.month
dataset.head()

In [None]:
last_years = [2012, 2013, 2014, 2015, 2016]

fig = make_subplots(rows = 5, cols = 1, shared_xaxes = True, vertical_spacing = 0.01)

for i, year in enumerate(last_years):
    # One histogram of monthly counts per year, stacked vertically
    fig.add_trace(
        go.Histogram(
            x = dataset.loc[dataset['Year'] == year, 'Month'],
            name = str(year)),
        row = i + 1,
        col = 1
    )

fig.update_layout(
    autosize=False,
    height=900)
fig.show()
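
To check the numbers behind these histograms, the monthly counts can also be tabulated with pandas. This is only a sketch on made-up toy dates, not on the real dataset:

```python
import pandas as pd

# Toy data (hypothetical dates)
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2015-01-10", "2015-01-20", "2015-03-05", "2016-01-15"])
})
toy["Year"] = toy["Date"].dt.year
toy["Month"] = toy["Date"].dt.month

# Rows are years, columns are months, cells are numbers of events;
# (year, month) pairs without any event are filled with 0
monthly_counts = toy.groupby(["Year", "Month"]).size().unstack(fill_value=0)
print(monthly_counts)
```

Comparing rows of this table makes it easier to judge whether a pattern repeats from one year to the next.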

7. **Optional** Now, we would like to let the user choose a specific year among the last five years and display the daily number of observations for that year only. Use the `Updatemenu` and `Button` classes together with `add_trace` to create an interactive visualization.

In [None]:
last_years = [2012, 2013, 2014, 2015, 2016]

fig = go.Figure()

for i, year in enumerate(last_years):
    # One trace per year; only the first year is visible initially
    year_dates = dataset.loc[dataset['Year'] == year, 'Date']
    fig.add_trace(
        go.Histogram(
            x = year_dates,
            nbinsx = year_dates.nunique(),
            visible = (i == 0)))


fig.update_layout(
        title = go.layout.Title(text = "Daily observations for one year", x = 0.5),
        showlegend = False)

fig.update_layout(
    updatemenus = [go.layout.Updatemenu(
        active = 0,
        buttons = [
                    go.layout.updatemenu.Button(
                        label = "2012",
                        method = "update",
                        args = [{"visible" : [True, False, False, False, False]}]),
                    go.layout.updatemenu.Button(
                            label = "2013",
                            method = "update",
                            args = [{"visible" : [False, True, False, False, False]}]),
                    go.layout.updatemenu.Button(
                            label = "2014",
                            method = "update",
                            args = [{"visible" : [False, False, True, False, False]}]),
                    go.layout.updatemenu.Button(
                            label = "2015",
                            method = "update",
                            args = [{"visible" : [False, False, False, True, False]}]),
                    go.layout.updatemenu.Button(
                            label = "2016",
                            method = "update",
                            args = [{"visible" : [False, False, False, False, True]}])
                ]
    )]
)
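
The five buttons differ only in their label and in the position of `True` in the visibility list, so they can also be generated in a loop. The following is a sketch using plain dictionaries in place of `go.layout.updatemenu.Button` objects; the helper name `visibility_mask` is hypothetical:

```python
last_years = [2012, 2013, 2014, 2015, 2016]

def visibility_mask(n_traces, active):
    """Return a list of n_traces booleans with True only at position `active`."""
    return [i == active for i in range(n_traces)]

# One button per year, each showing only the trace of that year
buttons = [
    dict(
        label=str(year),
        method="update",
        args=[{"visible": visibility_mask(len(last_years), i)}],
    )
    for i, year in enumerate(last_years)
]

print(buttons[1])
```

Such a list can be passed as `buttons = buttons` when building the update menu, which avoids editing five lists by hand if the set of years changes.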

## Focusing on earthquakes

8. Use plotly express to display the proportion of each type of event in the dataset

In [None]:
px.pie(dataset, names='Type')

From now on, we will focus only on earthquakes.

9. Create a new dataset containing only earthquakes and plot the distributions of their magnitudes and depths

In [None]:
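# .copy() gives an independent dataframe, which avoids chained-assignment warnings later on
earthquakes = dataset.loc[dataset['Type'] == 'Earthquake', :].copy()
earthquakes.head()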

In [None]:
px.histogram(earthquakes, x = 'Magnitude', nbins=50)

In [None]:
px.histogram(earthquakes, x = 'Depth')

10. Now, we'd like to explore how the magnitudes evolve over time:


- Create a dataframe containing the mean magnitudes per day
- Then use this dataset to display the mean magnitudes per day, with a range slider allowing the user to navigate among the dates
- **Optional** Add a reference line showing the value of the mean magnitude computed over the whole dataset


In [None]:
# groupby sorts the result by its keys, so the dates are already in chronological order
mean_mag_date = earthquakes.groupby('Date')['Magnitude'].mean().reset_index()
mean_mag_date.head()

| | Date | Magnitude |
|---|---|---|
| 0 | 1965-01-02 | 6.0 |
| 1 | 1965-01-04 | 5.8 |
| 2 | 1965-01-05 | 6.2 |
| 3 | 1965-01-08 | 5.8 |
| 4 | 1965-01-09 | 5.8 |


In [None]:
fig = go.Figure(
    data = go.Scatter(
        x = mean_mag_date['Date'],
        y = mean_mag_date['Magnitude']),
    layout = go.Layout(
        title = go.layout.Title(text = "Mean magnitude per day", x = 0.5),
        xaxis = go.layout.XAxis(title = 'Date', rangeslider = go.layout.xaxis.Rangeslider(visible = True))
    )
)

# Reference value: mean magnitude over all earthquakes (not the mean of the daily means)
mean_mag_date['Mean_Magnitude'] = earthquakes['Magnitude'].mean()

In [None]:
fig.add_trace(
    go.Scatter(
        x = mean_mag_date['Date'],
        y = mean_mag_date['Mean_Magnitude'])
)

fig.update_layout(showlegend = False)

fig.show()

To finish, let's visualize the distribution of the earthquakes around the world.

11. Use plotly express' `scatter_mapbox` to display the earthquakes on a map. Change the color of the markers depending on the value of the magnitude. Use the documentation and [this page](https://plotly.com/python/builtin-colorscales/) to find a suitable colorscale.

In [None]:
fig = px.scatter_mapbox(earthquakes, lat="Latitude", lon="Longitude", color="Magnitude", 
                        mapbox_style="open-street-map", zoom = 0.5, color_continuous_scale = 'Reds')
fig.show()

12. Let's animate the map! Add arguments to `scatter_mapbox` to create an animation displaying the earthquakes year by year

In [None]:
fig = px.scatter_mapbox(earthquakes, lat="Latitude", lon="Longitude", color="Magnitude", zoom = 0.5,
                        mapbox_style="open-street-map", color_continuous_scale = 'Reds', range_color = [5.0,10.0],
                       animation_frame = 'Year')
fig.show()

13. **Optional** Look for another function in plotly express that displays the *density of earthquakes* on a map (instead of each earthquake separately), and animate it to show the evolution year by year.

In [None]:
fig = px.density_mapbox(earthquakes, lat="Latitude", lon="Longitude", mapbox_style="open-street-map",
                       animation_frame = 'Year', zoom = 0.5, radius = 10)
fig.show()