# Fracking and Oklahoma Quakes analysis

In this analysis, I look at the dataset [Oklahoma Earthquakes and Saltwater Injection Wells](https://www.kaggle.com/ksuchris2000/oklahoma-earthquakes-and-saltwater-injection-wells) to see whether earthquakes in Oklahoma correlate with the use of [injection wells](https://en.wikipedia.org/wiki/Injection_well).

### Preface

In [1]:
from datetime import datetime
import pandas as pd
import altair as alt
alt.data_transformers.enable('data_server')

DataTransformerRegistry.enable('data_server')

http://www.ogs.ou.edu/pubsscanned/openfile/OF1_2014_Murray.pdf

### Preprocessing

Preprocessing does two things: it removes rows and columns that are not needed, based on the location of the well or quake, and it adds a Python `datetime` value to each record for the date of the event.

In [2]:
WELLS_PATH_RAW = 'data_fracking/raw/InjectionWells.csv'
QUAKES_PATH_RAW = 'data_fracking/raw/okQuakes.csv'
WELLS_PATH = 'data_fracking/processed/Wells.csv'
QUAKES_PATH = 'data_fracking/processed/Quakes.csv'

In [3]:
from scripts import preprocess
wells = preprocess.preprocess_wells()
quakes = preprocess.preprocess_quakes()
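The `scripts.preprocess` module is not shown in this notebook, so the following is only a minimal sketch of the kind of steps described above, not the actual implementation. The sample rows, the column names (`time`, `latitude`, `longitude`, `mag`, `net`), the bounding box values, and the function name `preprocess_quakes_sketch` are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical raw rows; these column names are assumptions, not the real CSV schema.
raw_quakes = pd.DataFrame({
    "time": ["2015-01-03T10:15:00Z", "2016-09-03T12:02:44Z", "2014-06-16T10:47:35Z"],
    "latitude": [36.1, 36.4, 34.0],
    "longitude": [-97.3, -96.9, -118.2],
    "mag": [3.1, 5.8, 2.9],
    "net": ["us", "us", "ci"],
})

# Approximate bounding box for Oklahoma (illustrative values only).
LAT_MIN, LAT_MAX = 33.6, 37.0
LON_MIN, LON_MAX = -103.0, -94.4


def preprocess_quakes_sketch(df):
    """Keep rows inside the bounding box, drop unused columns, parse dates."""
    in_box = (
        df["latitude"].between(LAT_MIN, LAT_MAX)
        & df["longitude"].between(LON_MIN, LON_MAX)
    )
    # Keep only the columns the plots need.
    out = df.loc[in_box, ["time", "latitude", "longitude", "mag"]].copy()
    # Convert the date strings into timezone-aware datetime values.
    out["time"] = pd.to_datetime(out["time"], utc=True)
    return out.sort_values("time").reset_index(drop=True)


quakes_demo = preprocess_quakes_sketch(raw_quakes)
```

Sorting by time at the end matters for the plots that follow, since a running count is only meaningful when the rows are in chronological order.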

First, I want to look at the total number of wells over time and the number of earthquakes over time.

In [4]:
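# After reset_index(), the 'index' column is the row number; it serves as a
# running well count only if the rows are already ordered by time.
wells_plot = alt.Chart(wells.reset_index()).mark_area().encode(
    x='time:T',
    y='index:Q',
).properties(width=800)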

quakes_plot = alt.Chart(quakes.reset_index()).mark_bar().encode(
    x='time:T',
    y='count()'
).properties(width=800, height=200)

wells_plot & quakes_plot
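The area chart above uses the row index as the total number of wells. A more explicit way to get that running total is sketched below, assuming each row is one well and `time` holds its date. The sample data, the placeholder well types, and the column name `total_wells` are hypothetical.

```python
import pandas as pd

# Hypothetical sample: one row per well, deliberately out of chronological order.
wells_demo = pd.DataFrame({
    "time": pd.to_datetime(["2012-05-01", "2010-03-15", "2011-07-20", "2010-03-15"]),
    "WellType": ["type_a", "type_b", "type_a", "type_b"],
})

# Sort chronologically, then number the rows 1..n to get a running total.
wells_sorted = wells_demo.sort_values("time", kind="stable").reset_index(drop=True)
wells_sorted["total_wells"] = range(1, len(wells_sorted) + 1)
```

With a column like this, the chart could encode `y='total_wells:Q'` and would no longer depend on the order in which the rows were loaded.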

Now let's take a look at where the wells are, and where the quakes are on a map.

In [5]:
ok_url = 'https://raw.githubusercontent.com/deldersveld/topojson/master/countries/us-states/OK-40-oklahoma-counties.json'
ok = alt.topo_feature(url=ok_url, feature='cb_2015_oklahoma_county_20m')
ok_map = alt.Chart(ok).mark_geoshape(fill='lightgrey',stroke='white')
ka_url = 'https://raw.githubusercontent.com/deldersveld/topojson/master/countries/us-states/KS-20-kansas-counties.json'
ka = alt.topo_feature(url=ka_url, feature='cb_2015_kansas_county_20m')
ka_map = alt.Chart(ka).mark_geoshape(fill='lightgrey',stroke='white')

In [6]:
ll = alt.topo_feature(url='https://raw.githubusercontent.com/kylepollina/Fracking_and_Oklahoma_Quakes/master/states.json', feature='us')
alt.Chart(ll).mark_geoshape().properties(width=800, height=500)

In [7]:
states_url = 'https://raw.githubusercontent.com/deldersveld/topojson/master/countries/united-states/us-albers.json'

states = alt.topo_feature(url=states_url, feature='us')

statemap = alt.Chart(states).mark_geoshape(
    fill = 'lightgrey',
    stroke = 'white'
).properties(
    width = 800,
    height = 400
)

In [8]:
quake_points = alt.Chart(quakes).mark_circle().encode(
    latitude='latitude',
    longitude='longitude',
)

well_points = alt.Chart(wells.reset_index()).mark_circle().encode(
    latitude='LAT',
    longitude='LONG',
    color=alt.value('red'),
)

In [9]:
statemap + quake_points + well_points

In [17]:
from ipyleaflet import Map, GeoJSON
import json

In [23]:
import urllib.request
map_url = 'https://raw.githubusercontent.com/deldersveld/topojson/master/countries/us-states/OK-40-oklahoma-counties.json'

with urllib.request.urlopen(map_url) as url:
    data = json.loads(url.read().decode())

In [24]:
m = Map(center=(50.6252978589571, 0.34580993652344), zoom=3)

In [25]:
m

Map(center=[50.6252978589571, 0.34580993652344], controls=(ZoomControl(options=['position', 'zoom_in_text', 'z…

In [11]:
# WellType is an unordered category, so use the nominal type (as in the line chart below).
alt.Chart(wells.reset_index()).mark_bar().encode(
    x='WellType:N',
    y='count()',
    color='WellType:N'
).properties(width=800, height=500)

In [12]:
well_type_plot = alt.Chart(wells).mark_line().encode(
    x='time:T',
    y='well_count:Q',
    color='WellType:N'
)

(well_type_plot + quakes_plot).properties(width=1000, height=500).interactive()
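The line chart above reads a `well_count` column that is presumably produced during preprocessing. One way such a per-type running count could be derived is sketched below. This is an assumption about what the column means, not the notebook's actual code, and the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample: one row per well with a placeholder type label.
wells_typed = pd.DataFrame({
    "time": pd.to_datetime(["2010-01-01", "2011-01-01", "2012-01-01", "2013-01-01"]),
    "WellType": ["type_a", "type_b", "type_a", "type_a"],
})

# Order by date, then count wells cumulatively within each type.
wells_by_type = wells_typed.sort_values("time").reset_index(drop=True)
wells_by_type["well_count"] = wells_by_type.groupby("WellType").cumcount() + 1
```

`cumcount()` numbers the rows within each group starting at zero, so adding one gives the number of wells of that type up to and including the current row.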

In [13]:
alt.Chart(wells).mark_bar().encode(
    x='time:T',
    y='count()',
).properties(width=800, height=500).interactive()