# Earthquake Magnitude Analysis

Source: https://www.kaggle.com/usgs/earthquake-database/version/1

Significant Earthquakes, 1965-2016

Date, time, and location of all earthquakes with magnitude of 5.5 or higher

## Context 

The National Earthquake Information Center (NEIC) determines the 
location and size of all significant earthquakes that occur worldwide 
and disseminates this information immediately to national and 
international agencies, scientists, critical facilities, and the general 
public. The NEIC compiles and provides to scientists and to the public 
an extensive seismic database that serves as a foundation for scientific 
research through the operation of modern digital national and global 
seismograph networks and cooperative international agreements. The NEIC 
is the national data center and archive for earthquake information. 

## Content 

This dataset includes a record of the date, time, location, depth, 
magnitude, and source of every earthquake with a reported magnitude 5.5 
or higher since 1965. 

In [1]:
import pandas as pd
import numpy as np

We start by loading the data into a DataFrame.

In [2]:
df = pd.read_csv("earthquakes-1965-2016.csv")

In [3]:
df.columns.values

array(['Date', 'Time', 'Latitude', 'Longitude', 'Type', 'Depth',
       'Depth Error', 'Depth Seismic Stations', 'Magnitude',
       'Magnitude Type', 'Magnitude Error', 'Magnitude Seismic Stations',
       'Azimuthal Gap', 'Horizontal Distance', 'Horizontal Error',
       'Root Mean Square', 'ID', 'Source', 'Location Source',
       'Magnitude Source', 'Status'], dtype=object)

We are interested in the 23412 magnitudes, so we convert them to a NumPy array.

In [4]:
magnitudes = np.array(df.Magnitude)
dbn = len(magnitudes)
dbn

23412

Next, we build a list of dates.

In [5]:
datelist = list(df.Date)

Each date is a string.

In [6]:
datelist[0]

'01/02/1965'

To recover the month, day, and year, we slice the characters and convert them to integers. The dates are in MM/DD/YYYY format, so the first two characters are the month and the next two are the day.

In [17]:
month = int(datelist[0][0:2])
month

1

In [18]:
day = int(datelist[0][3:5])
day

2

In [19]:
year = int(datelist[0][6:10])
year

1965
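The slicing above relies on every string having exactly the MM/DD/YYYY layout. As a minimal sketch of a more defensive alternative, the standard library's `datetime.strptime` can validate the format while parsing. The helper name `parse_mdy` and the sample strings below are hypothetical and are not part of the dataset.

```python
from datetime import datetime


def parse_mdy(text):
    """Return (month, day, year) for an MM/DD/YYYY string, or None if it does not match."""
    try:
        parsed = datetime.strptime(text, "%m/%d/%Y")
    except ValueError:
        # The string does not follow the expected layout.
        return None
    return parsed.month, parsed.day, parsed.year


# Hypothetical sample values, including one malformed entry.
sample_dates = ["01/02/1965", "12/30/2016", "not a date"]
parsed_dates = [parse_mdy(s) for s in sample_dates]
print(parsed_dates)
```

Returning `None` for a malformed entry makes it possible to count or inspect bad rows instead of stopping at the first `ValueError`.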

We store these values in a NumPy array.

In [10]:
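# One row per earthquake; columns are month, day, year, magnitude
# (the dates are MM/DD/YYYY strings).
data = np.zeros((dbn, 4))
for i in range(dbn):
    data[i, 0] = int(datelist[i][0:2])   # month
    data[i, 1] = int(datelist[i][3:5])   # day
    data[i, 2] = int(datelist[i][6:10])  # year
    data[i, 3] = magnitudes[i]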
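The same array can also be built without assigning element by element. The sketch below is one possible alternative, not the notebook's method: it splits each date on `/` and stacks the result with the magnitudes. The names `sample_dates`, `sample_magnitudes`, and `sample_data` are hypothetical stand-ins for `datelist`, `magnitudes`, and `data`.

```python
import numpy as np

# Hypothetical stand-ins for datelist and magnitudes.
sample_dates = ["01/02/1965", "01/04/1965", "01/05/1965"]
sample_magnitudes = np.array([6.0, 5.8, 6.2])

# Split each "MM/DD/YYYY" string into its three integer fields.
date_parts = np.array([[int(part) for part in s.split("/")] for s in sample_dates])

# Append the magnitudes as a fourth column: month, day, year, magnitude.
sample_data = np.column_stack([date_parts, sample_magnitudes])
print(sample_data)
```

Because the magnitudes are floats, `np.column_stack` promotes the whole array to floating point, which matches the dtype that `np.zeros` produces in the loop version.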

In [11]:
data[0:10,:]

array([[1.000e+00, 2.000e+00, 1.965e+03, 6.000e+00],
       [1.000e+00, 4.000e+00, 1.965e+03, 5.800e+00],
       [1.000e+00, 5.000e+00, 1.965e+03, 6.200e+00],
       [1.000e+00, 8.000e+00, 1.965e+03, 5.800e+00],
       [1.000e+00, 9.000e+00, 1.965e+03, 5.800e+00],
       [1.000e+00, 1.000e+01, 1.965e+03, 6.700e+00],
       [1.000e+00, 1.200e+01, 1.965e+03, 5.900e+00],
       [1.000e+00, 1.500e+01, 1.965e+03, 6.000e+00],
       [1.000e+00, 1.600e+01, 1.965e+03, 6.000e+00],
       [1.000e+00, 1.700e+01, 1.965e+03, 5.800e+00]])

In [14]:
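# The first column holds the month and the second the day, so the header lists them in that order.
np.savetxt("earthquakes-1965-2016-clean.csv", data, fmt='%.1f', header="Month Day Year Magnitude", comments="")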
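With its default settings, `np.savetxt` separates values with spaces, so the saved file is space-delimited despite its `.csv` extension. As a minimal sketch, the round trip can be checked with `np.loadtxt`, skipping the one-line header. The temporary file name and the two sample rows below are hypothetical.

```python
import os
import tempfile

import numpy as np

# Two hypothetical rows: month, day, year, magnitude.
sample_data = np.array([[1, 2, 1965, 6.0], [1, 4, 1965, 5.8]])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample-clean.csv")
    np.savetxt(path, sample_data, fmt="%.1f",
               header="Month Day Year Magnitude", comments="")
    # skiprows=1 ignores the header line; whitespace is the default delimiter.
    reloaded = np.loadtxt(path, skiprows=1)

print(np.allclose(reloaded, sample_data))
```

Passing `delimiter=","` to both `np.savetxt` and `np.loadtxt` would produce a comma-separated file instead, if that is what downstream tools expect.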