This Jupyter notebook estimates the cost of plugging Wyoming's newest wells (those drilled between 2011 and 2015). The estimate rests on two inputs: the relationship between well depth and well-plugging cost in historical reclamation projects, and the orphan rate for historical wells.

First, we'll import some of the basic packages.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Next, we'll import a CSV file of wells spudded between 2011 and 2015.

This CSV file is based on a database from the **Wyoming Oil and Gas Conservation Commission** that included information such as depth, type, spud date and status for every well, going back to 1900. 

In Excel, we filtered the wells based on spud date (date of first drilling). Then, we took only the depth information for those 5,125 wells and saved it as a CSV. We excluded wells that had no spud and/or depth information.

In [16]:
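import csv
# 'rb' is the Python 2 idiom for the csv module; under Python 3,
# open the file in text mode with newline='' instead.
with open('WellDepth20112015.csv', 'rb') as f:
    reader = csv.reader(f)
    # Each row is a list of strings; the depth is the first (only) column
    depth_list = list(reader)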
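
Because `csv.reader` yields every row as a list of strings, a stray header row or blank line would only surface later as a `ValueError` inside `float()`. The following is a minimal sketch of a pre-check, assuming the depth sits in the first column. The helper name `clean_depths` and the sample rows are hypothetical and not part of the original notebook.

```python
def clean_depths(rows):
    """Return depths as floats, skipping blank rows and non-numeric cells."""
    depths = []
    for row in rows:
        if not row:
            continue  # blank line in the CSV
        try:
            depths.append(float(row[0]))
        except ValueError:
            continue  # header text or a placeholder such as "n/a"
    return depths

# Hypothetical rows shaped like the output of list(csv.reader(f))
sample_rows = [["Depth"], ["8500"], [], ["12040.5"], ["n/a"]]
cleaned = clean_depths(sample_rows)
print(cleaned)
```

This helper returns plain floats rather than rows, so `singleTrial` as written (which indexes `x[0]`) would need a small adjustment before it could consume the result.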

Next, we'll write a function, **singleTrial**, to randomly select 3% of the 5,125 depth values, or 154 of them. We are selecting 3% because this is the lower bound of the typical orphan rate for wells that were spudded in 2010 or earlier.

For those 154 randomly selected wells, we'll estimate the cost using the formula we derived from a linear regression of plugging cost against well depth for historical reclamation projects in Wyoming. That formula is: **Cost Per Well = 12.3635 * AvgDepth - 2211.21**. In the code, the formula is applied to each sampled well's own depth.

**singleTrial** returns the total estimated plugging cost for all 154 of the potentially orphaned wells.
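
To make the arithmetic concrete, here is a small sketch that reproduces the sample size and applies the cost formula to a single well. The coefficients and the 5,125 well count come from the text above. The depth of 5,000 is a made-up illustrative value, and `est_cost` is a hypothetical helper name.

```python
SLOPE = 12.3635       # cost per unit of depth, from the regression above
INTERCEPT = -2211.21  # regression intercept, from the formula above

def est_cost(depth):
    """Estimated plugging cost for one well of the given depth."""
    return SLOPE * depth + INTERCEPT

# 3% of the 5,125 wells, rounded to a whole number of wells
sample_size = int(round(0.03 * 5125))

# Cost for a hypothetical well with a depth of 5,000
example_cost = est_cost(5000.0)

# Depth below which the fitted line yields a negative cost
break_even_depth = -INTERCEPT / SLOPE

print(sample_size)
print(round(example_cost, 2))
print(round(break_even_depth, 2))
```

One consequence of the negative intercept is that very shallow wells (depths below roughly 179 in the dataset's units) would receive a negative estimated cost, so the formula is best read as an approximation over the depth range covered by the historical projects.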

In [12]:
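import random
def singleTrial(depthlist):
    # Draw 154 wells (3% of 5,125) at random, without replacement
    depth_sample = random.sample(depthlist,154)
    # Each row from csv.reader is a list, so x[0] is the depth as a string
    est_well_cost = [12.3635*float(x[0]) - 2211.21 for x in depth_sample]
    # Total estimated plugging cost for the sampled wells
    total_est_cost = sum(est_well_cost)
    return total_est_cost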

Next, we'll write a function, **multipleTrials**, to run that single trial a given number of times and return the results for the total costs of reclamation for each trial in a list.

In [13]:
def multipleTrials(depthlist, numberoftrials):
    trials=[]
    for i in range(numberoftrials):
        trials.append(singleTrial(depthlist))
    return trials
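
Since each trial draws a fresh random sample, rerunning the notebook produces slightly different numbers. If repeatable results are wanted, one option is to pass a seeded `random.Random` instance through the trials. The following is a sketch under that assumption. The function names, the `seed` parameter, and the synthetic depths are illustrative and not part of the original notebook.

```python
import random

def single_trial(depths, sample_size, rng):
    """Total estimated cost for one random sample of wells."""
    sample = rng.sample(depths, sample_size)
    return sum(12.3635 * d - 2211.21 for d in sample)

def multiple_trials(depths, sample_size, n_trials, seed):
    """Run n_trials trials using a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [single_trial(depths, sample_size, rng) for _ in range(n_trials)]

# 100 synthetic depths, used only to demonstrate repeatability
synthetic_depths = [float(d) for d in range(1000, 11000, 100)]

run_a = multiple_trials(synthetic_depths, 3, 50, seed=42)
run_b = multiple_trials(synthetic_depths, 3, 50, seed=42)
print(run_a == run_b)
```

Using a dedicated `random.Random` object instead of the module-level `random.seed` keeps the simulation's randomness separate from any other code that also uses the `random` module.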

Next, we'll write a function, **summarizeTrials**, that looks at the estimated well-plugging costs for the simulated trials, computes the average, minimum and maximum values, and prints them.

In [14]:
# Summarize the trials: print the average, minimum and maximum total cost
def summarizeTrials(trialslist, numberoftrials):
    AVERAGE = sum(trialslist) / numberoftrials
    MIN = min(trialslist)
    MAX = max(trialslist)
    X = "AVERAGE is " + str(AVERAGE)
    Y = "MIN is " + str(MIN)
    Z = "MAX is " + str(MAX)
    # Single-argument print with parentheses works in Python 2 and Python 3
    print(X)
    print(Y)
    print(Z)
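
The minimum and maximum describe only the two most extreme trials. As a possible complement, the sketch below returns the summary as a dictionary and adds the sample standard deviation, which indicates how tightly the trials cluster around the average. The function name `summarize` and the toy input are hypothetical.

```python
import math

def summarize(trials):
    """Return the mean, min, max and sample standard deviation of the trials."""
    n = len(trials)
    mean = sum(trials) / float(n)
    # Sample variance (divides by n - 1), so at least two trials are needed
    variance = sum((t - mean) ** 2 for t in trials) / float(n - 1)
    return {
        "mean": mean,
        "min": min(trials),
        "max": max(trials),
        "stdev": math.sqrt(variance),
    }

# Toy input, chosen only so the result is easy to verify by hand
summary = summarize([10.0, 20.0, 30.0])
print(summary)
```

Returning the values instead of printing them also makes it possible to reuse the summary later, for example when plotting the distribution of trial totals.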

Finally, we run our simulation 1,000 times and see what the average estimated well-plugging cost is, as well as the minimum and maximum estimated costs.

In [15]:
number_of_trials = 1000
trials = multipleTrials(depth_list,number_of_trials)
summarizeTrials(trials, number_of_trials)

AVERAGE is 16753767.4455
MIN is 14715962.5467
MAX is 20427557.325
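
As a rough sanity check on these figures: because the cost formula is linear in depth, the expected total for 154 wells is 154 times the formula evaluated at the mean depth of the dataset. Working backwards from the reported average therefore gives an implied cost per well and an implied mean depth. This sketch assumes the simulated average is close to its expected value. It does not read the CSV, so the implied depth is an inference and not a measured statistic.

```python
SLOPE = 12.3635
INTERCEPT = -2211.21
N_WELLS = 154

# AVERAGE value from the simulation output above
reported_average = 16753767.4455

# Average estimated cost per sampled well
per_well = reported_average / N_WELLS

# Mean depth that would produce this per-well cost under the formula
implied_mean_depth = (per_well - INTERCEPT) / SLOPE

print(round(per_well, 2))
print(round(implied_mean_depth, 1))
```

If the mean of the depth column in `WellDepth20112015.csv` differs noticeably from the implied value, that would suggest a problem in the data loading or in the simulation code.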
