# Tables and Maps

In [2]:
# first import required libraries and settings
import numpy as np
from datascience import *
%matplotlib inline
import obspy

First, let's read in some data from the [USGS Earthquake Records](https://earthquake.usgs.gov/data/). 
The file `last_month_earthquakes.csv` in this directory is a spreadsheet with information about every earthquake recorded in September 2017.

In [32]:
quakes = Table.read_table("last_month_earthquakes.csv")
coords = quakes.select('latitude', 'longitude', 'place')
# plot the first 100 earthquakes in the table
Marker.map_table(coords.take(np.arange(0, 100)))

The map above shows the first 100 earthquakes in our dataset. Run the cell below to input the number of earthquakes you want displayed. 

In [38]:
num_quakes = int(input("How many earthquakes do you want to see? "))
# np.arange(n) yields the indices 0..n-1, i.e. exactly n rows
Marker.map_table(coords.take(np.arange(num_quakes)))

How many earthquakes do you want to see? 250
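
The row indices handed to `take` come from `np.arange`. As a minimal sketch using only NumPy, the snippet below shows that `np.arange(n)` produces exactly `n` indices, and how a request might be clamped to the number of rows available. The helper `first_n_indices` and the row count of 1000 are assumptions made for illustration; they are not part of the datascience library or the dataset.

```python
import numpy as np

def first_n_indices(n_requested, n_rows):
    """Return row indices 0..k-1, where k is n_requested clamped to [0, n_rows].

    Hypothetical helper for illustration; not part of the datascience library.
    """
    k = max(0, min(n_requested, n_rows))
    return np.arange(k)

# Assumed table size of 1000 rows, chosen only for this example.
indices = first_n_indices(250, 1000)
print(len(indices), indices[0], indices[-1])  # 250 0 249
```

An array like `indices` could then be passed to `coords.take(...)` in place of the raw `np.arange` call.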


### Tables

Our earthquake data is stored in a tabular format. Tables can be represented in spreadsheets or, as in our case, as a data structure made up of labeled columns and rows, where every row has a value (possibly missing) for each column. We will use the `datascience` Python library to read in our data.

In [48]:
quakes.show(5)

time,latitude,longitude,depth,mag,magType,nst,gap,dmin,rms,net,id,updated,place,type,horizontalError,depthError,magError,magNst,status,locationSource,magSource
2017-09-30T03:10:29.824Z,59.3154,-153.663,106.9,2.5,ml,,,,0.46,ak,ak16951083,2017-09-30T03:16:22.957Z,"85km SE of Old Iliamna, Alaska",earthquake,,0.5,,,automatic,ak,ak
2017-09-30T02:51:27.750Z,37.5723,-118.843,0.35,1.13,md,9.0,220.0,0.02069,0.03,nc,nc72902026,2017-09-30T02:57:02.746Z,"14km SE of Mammoth Lakes, California",earthquake,0.98,0.47,0.19,8.0,automatic,nc,nc
2017-09-30T02:35:55.270Z,33.3262,-116.756,13.61,0.62,ml,24.0,63.0,0.08998,0.17,ci,ci38015256,2017-09-30T02:39:29.578Z,"9km N of Lake Henshaw, CA",earthquake,0.33,0.7,0.153,15.0,automatic,ci,ci
2017-09-30T02:34:06.370Z,33.3253,-116.774,12.64,0.6,ml,25.0,61.0,0.07952,0.22,ci,ci38015248,2017-09-30T02:37:42.940Z,"9km ESE of Palomar Observatory, CA",earthquake,0.46,0.96,0.122,16.0,automatic,ci,ci
2017-09-30T02:33:57.910Z,36.5837,-121.169,4.28,0.77,md,7.0,168.0,0.01946,0.36,nc,nc72902021,2017-09-30T02:54:02.740Z,"22km NE of Soledad, California",earthquake,2.14,2.26,0.22,4.0,automatic,nc,nc
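
To make the row and column structure concrete, here is a rough sketch that parses a three-row excerpt of the output above with Python's standard `csv` module. It is only an illustration of the idea of a table; it does not show how the `datascience` library stores its data internally, and the names `raw`, `rows`, and `columns` are made up for this example.

```python
import csv
import io

# A three-row excerpt of four columns, with values copied from the output above.
raw = """latitude,longitude,depth,mag
59.3154,-153.663,106.9,2.5
37.5723,-118.843,0.35,1.13
33.3262,-116.756,13.61,0.62
"""

# Row-oriented view: one dict per earthquake, keyed by column label.
rows = list(csv.DictReader(io.StringIO(raw)))

# Column-oriented view: one list of floats per column label.
columns = {label: [float(row[label]) for row in rows] for label in rows[0]}

print(columns["mag"])  # [2.5, 1.13, 0.62]
print(rows[1])         # the second earthquake as a label -> string mapping
```

Both views hold the same data. Reading down a column gives every value of one field, while reading across a row gives every field of one earthquake.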


Notice that our dataset has several numerical fields, such as `mag`, `depth`, and `nst`. We can compute summary statistics on these fields to learn something about the earthquakes. To access a column, call `.column("column name")` on a Table object.

In [91]:
# taking an average
np.mean(quakes.column('mag'))

nan

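Whoa! What happened there? The magnitude column contains some missing values, which are stored as `nan` ("not a number"), and any arithmetic that involves a `nan` produces `nan`. We can work around that by calling `np.nanmean` on the column, which ignores the missing entries.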

In [52]:
np.nanmean(quakes['mag'])

1.5682757222503803
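
The difference between the two functions can be seen on a tiny array. This is a minimal sketch with made-up magnitudes (the array `mags` is an assumption, not data from the file), showing that a single missing value is enough to turn the plain mean into `nan`.

```python
import numpy as np

# Hypothetical magnitudes; np.nan stands in for a missing entry.
mags = np.array([2.5, 1.13, np.nan, 0.62])

print(np.mean(mags))     # nan: a single NaN propagates through the sum
print(np.nanmean(mags))  # mean of the three values that are present

# np.nanmean gives the same result as dropping the NaNs by hand.
present = mags[~np.isnan(mags)]
manual_mean = present.sum() / present.size
print(manual_mean)
```

Note that `np.nanmean` divides by the number of values that are present, not by the length of the original array.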

We should also note that the dataset mixes several types of magnitude, so this average combines different scales and does not correspond to the original Richter and Gutenberg method for measuring local earthquakes. Instead, let's calculate the average local magnitude.

In [None]:
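# keep only the rows whose magType appears in the list of local-magnitude codes
local = quakes.where('magType', are.contained_in(['ML MI', "ml", 'mi']))
# sanity check: True if every remaining row has magType 'ml'
print(sum(local['magType'] == 'ml') == len(local['magType']))
# average the magnitudes, ignoring missing values
avg_local_mag = np.nanmean(local.column("mag"))
print("The average local magnitude is: {}".format(avg_local_mag))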
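
Conceptually, this kind of filter keeps the rows in which one column satisfies a condition and then summarizes another column over those rows. The sketch below reproduces the idea with plain NumPy boolean masks instead of the `where` method; the arrays `mag_type` and `mag` are made-up values for illustration, and the snippet makes no claim about how the `datascience` library implements `where`.

```python
import numpy as np

# Hypothetical columns for illustration; the real values come from the CSV file.
mag_type = np.array(["ml", "md", "ml", "mww", "md"])
mag = np.array([2.5, 1.13, 0.62, 5.6, np.nan])

# Boolean mask: True where the magnitude type is local magnitude.
is_local = mag_type == "ml"

# Indexing with the mask keeps only the matching rows.
local_mags = mag[is_local]
print(local_mags)
print(np.nanmean(local_mags))
```

The same mask can be reused to select matching entries from any other column of equal length.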

In [95]:
# mww is the code for moment magnitude
# https://earthquake.usgs.gov/data/comcat/data-eventterms.php#magType
moment = quakes.where('magType', are.equal_to("mww"))
avg_moment_mag = np.nanmean(moment.column("mag"))
print("The average moment magnitude is: {}".format(avg_moment_mag))

The average moment magnitude is: 5.514705882352941


In [94]:
# nst is number of stations that recorded the quake
print("The average number of stations that recorded each quake was: {}".format(np.nanmean(quakes.column('nst'))))

The average number of stations that recorded each quake was: 19.847359454855194


**Exercise: Use the .select notation to extract the latitude, longitude, and place columns**

Also use `.column` to access the depth column, and store the result in a variable called `depth`.

In [101]:
# your code here hint: .select('one thing', 'another', 'the last thing') is the syntax
# store the lat, lon, and place columns in a variable called selection

latitude,longitude,place
59.3154,-153.663,"85km SE of Old Iliamna, Alaska"
37.5723,-118.843,"14km SE of Mammoth Lakes, California"
33.3262,-116.756,"9km N of Lake Henshaw, CA"
33.3253,-116.774,"9km ESE of Palomar Observatory, CA"
36.5837,-121.169,"22km NE of Soledad, California"
33.9507,-116.919,"5km NW of Banning, CA"
59.7625,-136.692,"84km WNW of Skagway, Alaska"
34.6877,-120.263,"6km SSE of Los Alamos, CA"
59.7803,-136.685,"85km WNW of Skagway, Alaska"
33.9952,-118.675,"4km SSE of Malibu Beach, CA"


### Welcome to ObsPy

ObsPy is an open-source Python framework for processing seismological data.

In [102]:
from obspy import read
st = read('http://examples.obspy.org/RJOB_061005_072159.ehz.new')
print(st)

1 Trace(s) in Stream:
.RJOB..Z | 2005-10-06T07:21:59.850000Z - 2005-10-06T07:24:59.845000Z | 200.0 Hz, 36000 samples


In [109]:
st[0][1:5]

array([12, -4,  6, 26], dtype=int32)

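`st` is a `Stream` object that ObsPy created by reading the file. It contains a single `Trace`, `st[0]`, which holds the readings from one station, and indexing the trace as in `st[0][1:5]` returns the recorded samples as a NumPy array.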
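
The printed summary of the stream is internally consistent, which can be checked with a little arithmetic. The sketch below uses only the standard library and the numbers copied from that summary (start time, 200.0 Hz, 36000 samples) to work out when the last sample was recorded. The variable names are chosen for this example and are not ObsPy attributes.

```python
from datetime import datetime, timedelta

# Values copied from the printed Stream summary above.
sampling_rate_hz = 200.0
n_samples = 36000
start = datetime(2005, 10, 6, 7, 21, 59, 850000)

# Consecutive samples are 1 / sampling_rate seconds apart.
sample_spacing_s = 1.0 / sampling_rate_hz

# The last sample falls (n_samples - 1) spacings after the first one.
end = start + timedelta(seconds=(n_samples - 1) * sample_spacing_s)
print(end.isoformat())
```

The computed end time matches the end time in the summary, 07:24:59.845, which is one sample spacing (0.005 s) short of a full three minutes.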