### (IN PROGRESS)

# Fracking and Oklahoma Quakes Analysis

In this analysis, I look at the dataset [Oklahoma Earthquakes and Saltwater Injection Wells](https://www.kaggle.com/ksuchris2000/oklahoma-earthquakes-and-saltwater-injection-wells) to see whether the earthquakes correlate with the use of [injection wells](https://en.wikipedia.org/wiki/Injection_well).

### Preface

In [1]:
from datetime import datetime
import pandas as pd
import altair as alt

alt.renderers.enable('altair_viewer')
alt.data_transformers.enable('data_server')

DataTransformerRegistry.enable('data_server')

http://www.ogs.ou.edu/pubsscanned/openfile/OF1_2014_Murray.pdf

### Preprocessing

Source: https://www.kaggle.com/ksuchris2000/oklahoma-earthquakes-and-saltwater-injection-wells

In [2]:
wells_data = pd.read_csv('data_fracking/InjectionWells.csv')

wells_data = wells_data.drop(columns=[
    'Operator ID', 'WellNumber', 'OrderNumbers', 'Sec', 'Twp', 'Rng', 'QQQQ',
    'ZONE', 'Unnamed: 18', 'Unnamed: 19', 'Unnamed: 20', 'API#'
])

# Drop outlier datapoints
wells_data = wells_data[wells_data['LONG'] < -95]
wells_data = wells_data[wells_data['LONG'] > -105]

wells_data['Approval Date'] = pd.to_datetime(wells_data['Approval Date'])
wells_data = wells_data.sort_values('Approval Date').reset_index(drop=True)
wells_data

Unnamed: 0,Operator,WellType,WellName,Approval Date,County,LAT,LONG,PSI,BBLS
0,XTO ENERGY INC,2R,HEWITT UNIT 22,1936-12-18,CARTER,34.199067,-97.399092,1100,3500
1,GATEWAY RESOURCES USA INC,2R,ANDY BROWN,1945-04-22,WASHINGTON,36.901903,-95.900888,,
2,WHITE MONTY & TERRY PRODUCTION,2D,"ROLLER, B. H.",1946-10-19,LINCOLN,35.511472,-96.767417,0,100
3,CIRCLE 9 RESOURCES LLC,2R,SCHOOL LAND 66,1947-03-18,PAWNEE,36.164978,-96.717249,,
4,CIRCLE 9 RESOURCES LLC,2R,SCHOOL LAND 66,1947-03-18,PAWNEE,36.167568,-96.722799,,
...,...,...,...,...,...,...,...,...,...
11056,CITATION OIL & GAS CORPORATION,2R,COX PENN SAND UNIT,2017-08-30,CARTER,34.373645,-97.399878,1500,999
11057,COMPLETE ENERGY SERVICES INC,CDW,SEILING SWD,2017-08-30,DEWEY,36.146649,-98.934932,1635,5000
11058,URBAN OIL & GAS GROUP LLC,2R,GLADYS LOVE,2017-08-30,MCCLAIN,34.919795,-97.421550,3500,500
11059,BROWER OIL & GAS CO INC,2D,REED,2017-08-30,OKMULGEE,35.475614,-95.911700,400,500


In [29]:
wells_plot = alt.Chart(wells_data.reset_index()).mark_area().encode(
    x=alt.X('Approval Date', type='temporal'),
    y=alt.Y('index', type='quantitative', title='Number of Wells'),
).properties(
    width=800, title='Total Wells Over Time'
)
wells_plot.show()

Displaying chart at http://localhost:18304/


In [3]:
quakes_data = pd.read_csv('data_fracking/okQuakes.csv')
quakes_data = quakes_data[quakes_data['type'] == 'earthquake']
quakes_data['time'] = pd.to_datetime(quakes_data['time'])
quakes_data = quakes_data.sort_values(by='time')
quakes_data

Unnamed: 0,time,latitude,longitude,depth,mag,magType,nst,gap,dmin,rms,...,updated,place,type,horizontalError,depthError,magError,magNst,status,locationSource,magSource
0,1973-03-17 07:43:05.500000+00:00,36.087000,-106.168000,6.00,4.50,mb,,,,,...,2014-11-06T23:21:10.078Z,New Mexico,earthquake,,,,,reviewed,us,us
1,1973-05-25 14:40:13.900000+00:00,33.917000,-90.775000,6.00,,,,,,,...,2014-11-06T23:21:12.859Z,Mississippi,earthquake,,,,,reviewed,s,us
2,1973-09-19 13:28:20.500000+00:00,37.160000,-104.594000,5.00,,,,,,,...,2014-11-06T23:21:20.295Z,Colorado,earthquake,,,,,reviewed,us,us
3,1973-09-23 03:58:54.900000+00:00,37.148000,-104.571000,5.00,4.20,mb,,,,,...,2014-11-06T23:21:20.346Z,Colorado,earthquake,,,,,reviewed,us,us
4,1974-02-15 13:33:49.200000+00:00,36.500000,-100.693000,24.00,4.50,mb,,,,,...,2014-11-06T23:21:22.859Z,Oklahoma,earthquake,,,,,reviewed,us,us
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
13949,2016-09-20 05:38:38.350000+00:00,36.373500,-96.818700,4.69,2.30,ml,,54.0,0.01800,0.16,...,2016-09-20T16:13:04.040Z,"4km NNW of Pawnee, Oklahoma",earthquake,0.90,1.80,0.059,38.0,reviewed,us,us
13950,2016-09-20 06:36:35.520000+00:00,36.412200,-96.882400,4.41,1.40,ml,,59.0,0.02500,0.18,...,2016-09-20T16:21:03.040Z,"10km NW of Pawnee, Oklahoma",earthquake,0.80,2.00,0.062,34.0,reviewed,us,us
13951,2016-09-20 16:01:08.610000+00:00,37.277167,-98.072667,5.85,2.43,ml,15.0,237.0,0.05185,0.04,...,2016-09-20T19:08:23.720Z,"4km WSW of Harper, Kansas",earthquake,0.38,0.33,0.179,17.0,reviewed,ismp,ismp
13952,2016-09-20 17:31:48.380000+00:00,36.939300,-97.896000,2.32,3.00,mb_lg,,37.0,0.05600,0.18,...,2016-09-20T17:43:43.040Z,"20km NW of Medford, Oklahoma",earthquake,1.00,3.70,0.076,45.0,reviewed,us,us


In [None]:
import matplotlib.pyplot as plt

In [5]:
states_url = 'https://raw.githubusercontent.com/kylepollina/Fracking_and_Oklahoma_Quakes/master/states.json'
states = alt.topo_feature(url=states_url, feature='us')

statemap = alt.Chart(states).mark_geoshape(
    fill='lightgrey',
    stroke='white'
).properties(
    width=800,
    height=400
)

statemap.show()

Displaying chart at http://localhost:18304/



In [None]:

quake_points = alt.Chart(quakes_data).mark_circle().encode(
    latitude='latitude',
    longitude='longitude'
)

well_points = alt.Chart(wells_data).mark_circle().encode(
    latitude='LAT',
    longitude='LONG',
    color=alt.value('red'),
).properties(width=1000,height=500)

(statemap + quake_points + well_points).show()

Displaying chart at http://localhost:18304/


## Mapping

Now let's take a look at where the wells are, and where the quakes are on a map.

In [44]:
states_url = 'https://raw.githubusercontent.com/kylepollina/Fracking_and_Oklahoma_Quakes/master/states.json'
# states_url = 'https://raw.githubusercontent.com/deldersveld/topojson/master/countries/united-states/us-albers.json'
states = alt.topo_feature(url=states_url, feature='us')

statemap = alt.Chart(states).mark_geoshape(
    fill='lightgrey',
    stroke='white'
).properties(
    width=800,
    height=400
).interactive()

statemap

Displaying chart at http://localhost:18304/



In [28]:
quake_points = alt.Chart(quakes_data).mark_circle().encode(
    latitude='latitude',
    longitude='longitude'
)

well_points = alt.Chart(wells_data.reset_index()).mark_circle().encode(
    latitude='LAT',
    longitude='LONG',
    color=alt.value('red'),
    tooltip='LONG'
).properties(width=1000,height=500)

Plotted on the map, the well data focuses mainly on wells within Oklahoma, and there are a lot of them: roughly 11,000 after the filtering above.
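
The quake table, by contrast, also lists events in other states (New Mexico, Mississippi, Colorado, and Kansas all appear in the preview above). One option is to clip the quakes to the same longitude window used for the wells before comparing the two layers. This is only a sketch under assumptions: the helper name `clip_to_longitude_window` and the small sample frame are made up for illustration, and only the `longitude` column name and the -105/-95 bounds come from the cells above.

```python
import pandas as pd


def clip_to_longitude_window(df, lon_col, lon_min=-105, lon_max=-95):
    """Keep rows whose longitude lies strictly inside (lon_min, lon_max)."""
    mask = (df[lon_col] > lon_min) & (df[lon_col] < lon_max)
    return df[mask].reset_index(drop=True)


# Illustrative rows only; coordinates copied from the quake preview above.
sample_quakes = pd.DataFrame({
    'place': ['New Mexico', 'Mississippi', 'Oklahoma'],
    'latitude': [36.087, 33.917, 36.5],
    'longitude': [-106.168, -90.775, -100.693],
})

clipped = clip_to_longitude_window(sample_quakes, 'longitude')
print(clipped['place'].tolist())
```

A longitude window alone does not isolate Oklahoma (parts of Kansas and Texas fall inside it), so a latitude bound or a filter on the `place` column may also be needed.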

## Looking at the Wells

Looking at the wells data, there are five different types of wells. Let's see whether any of them appear to correlate with the number of earthquakes.
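
One way to set this comparison up is to count approvals per year for each `WellType`, accumulate them into running totals, and line those up against yearly quake counts. The sketch below is an assumption about how that could look, not the analysis itself: `yearly_well_totals`, `yearly_quake_counts`, and the toy frames are hypothetical, and only the column names (`WellType`, `Approval Date`, `time`) come from the data above.

```python
import pandas as pd


def yearly_well_totals(wells):
    """Running total of approved wells per year, one column per WellType."""
    years = wells['Approval Date'].dt.year.rename('year')
    per_year = wells.groupby([years, 'WellType']).size().unstack(fill_value=0)
    return per_year.sort_index().cumsum()


def yearly_quake_counts(quakes):
    """Number of recorded quakes per calendar year."""
    years = quakes['time'].dt.year.rename('year')
    return quakes.groupby(years).size().rename('quakes')


# Toy data for illustration only; not taken from the real CSV files.
toy_wells = pd.DataFrame({
    'WellType': ['2R', '2D', '2R', 'CDW'],
    'Approval Date': pd.to_datetime(
        ['2014-01-05', '2014-06-01', '2015-03-10', '2016-08-30']),
})
toy_quakes = pd.DataFrame({
    'time': pd.to_datetime(
        ['2014-02-01', '2015-01-01', '2015-07-04',
         '2016-09-20', '2016-09-20', '2016-09-21'], utc=True),
})

totals = yearly_well_totals(toy_wells)
# Inner join keeps only the years present in both tables.
combined = totals.join(yearly_quake_counts(toy_quakes), how='inner')
combined['all_wells'] = totals.sum(axis=1)

print(combined)
print(combined['all_wells'].corr(combined['quakes']))
```

Two series that both trend upward will correlate strongly whatever the cause, so a coefficient like this is best read as descriptive rather than as evidence of a causal link.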

In [None]:
# well_type_plot = alt.Chart(wells).mark_line().encode(
#     x='time:T',
#     y='well_count:Q',
#     color='WellType:N'
# )
#
# (well_type_plot + quakes_plot).properties(width=800, height=400) & quakes_plot

In [None]:
# alt.Chart(wells).mark_bar().encode(
#     x='time:T',
#     y='count()',
# ).properties(width=800, height=500).interactive()