<p align="center">
    <img src="https://github.com/GeostatsGuy/GeostatsPy/blob/master/TCG_color_logo.png?raw=true" width="220" height="240" />

</p>

## Data Analytics Experiential Learning 

### Uncertainty Modeling with Bootstrap and Monte Carlo Simulation


#### Michael Pyrcz, Associate Professor, University of Texas at Austin 

##### [Twitter](https://twitter.com/geostatsguy) | [GitHub](https://github.com/GeostatsGuy) | [Website](http://michaelpyrcz.com) | [GoogleScholar](https://scholar.google.com/citations?user=QVZ20eQAAAAJ&hl=en&oi=ao) | [Book](https://www.amazon.com/Geostatistical-Reservoir-Modeling-Michael-Pyrcz/dp/0199731446) | [YouTube](https://www.youtube.com/channel/UCLqEr-xV-ceHdXXXrTId5ig)  | [LinkedIn](https://www.linkedin.com/in/michael-pyrcz-61a648a1)


### Exercise: Bootstrap for Subsurface Data Analytics in Python 

Let's make a simple bootstrap and Monte Carlo Simulation workflow to calculate an uncertainty model.

Here are some lectures that may be helpful:

* [Monte Carlo Simulation](https://youtu.be/Qb8TsSINpnU)

* [Bootstrap](https://youtu.be/wCgdoImlLY0)

#### The Problem: What is the uncertainty in Oil in Place?

We have 25 wells drilled in the reservoir with porosity and thickness information. The file is available here: [well_data](https://github.com/GeostatsGuy/GeoDataSets/blob/master/WellPorThick.csv).

Build an uncertainty model for oil in place. Provide the distribution labelled with P10, P50 and P90.
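
As a minimal sketch of the summary requested here, the three values can be read from an array of realizations with `np.percentile`. The array below is synthetic (drawn from a normal distribution with made-up parameters), not the well data, and the sketch assumes P10, P50 and P90 denote the 10th, 50th and 90th percentiles of the cumulative distribution.

```python
import numpy as np

# Synthetic stand-in for a set of oil in place realizations (NOT from the well data)
rng = np.random.default_rng(seed=73073)
oip_realizations = rng.normal(loc=3.0e6, scale=2.0e5, size=1000)

# Assumption: P10, P50 and P90 are the 10th, 50th and 90th percentiles
p10, p50, p90 = np.percentile(oip_realizations, [10, 50, 90])

print(f"P10 = {p10:.3e}, P50 = {p50:.3e}, P90 = {p90:.3e}")
```

The same three values can then be drawn as vertical lines on a histogram or cumulative distribution plot to label the distribution.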

#### Workflow Steps

1. Calculate the uncertainty in average porosity ($\phi$) and thickness ($th$) by bootstrap.

2. Assume known/constant reservoir area ($a$) = $1,000,000 m^2$ and oil saturation ($s_o$) = $1.0$

3. Apply the oil in place transfer function with the bootstrap and constant inputs. The transfer function is:

\begin{equation}
OIP = \overline{\phi} \cdot \overline{th} \cdot a \cdot s_o
\end{equation}

4. Visualize and summarize the results, including the distribution with P10, P50 and P90 labelled.
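
Steps 1 to 3 can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the exercise solution: the porosity and thickness values are hypothetical uniform draws rather than the contents of `WellPorThick.csv`, and the resampling uses `Generator.choice` instead of the `bootstrap` helper declared later in this notebook.

```python
import numpy as np

rng = np.random.default_rng(seed=73073)

# Hypothetical stand-in for 25 wells (NOT the WellPorThick.csv values)
por = rng.uniform(0.10, 0.20, size=25)     # porosity, fraction
thick = rng.uniform(15.0, 25.0, size=25)   # thickness, m

L = 1000                                   # number of bootstrap realizations
n = len(por)                               # number of data

# Step 1: draw n samples with replacement, L times, and average each resample
por_avg = rng.choice(por, size=(L, n), replace=True).mean(axis=1)
thick_avg = rng.choice(thick, size=(L, n), replace=True).mean(axis=1)

# Step 2: constant inputs from the problem statement
area = 1.0e6                               # reservoir area, m^2
so = 1.0                                   # oil saturation, fraction

# Step 3: transfer function, one OIP value (m^3) per realization
oip = por_avg * thick_avg * area * so

print(f"mean OIP = {oip.mean():.3e} m^3, st. dev. = {oip.std():.3e} m^3")
```

Note that this sketch resamples porosity and thickness independently, so any relationship between the two at the wells is ignored. Resampling whole wells (rows) would be one way to preserve it.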

#### Objective 

To provide a realistic, experiential learning opportunity. 

#### Import Packages

We will need some standard packages. These should have been installed with Anaconda 3.

In [1]:
import numpy as np                        # ndarrays for gridded data
import pandas as pd                       # DataFrames for tabular data
import matplotlib.pyplot as plt           # for plotting
from inspect import signature             # find number of arguments in a function
import random                             # random sampling

#### Declare Functions and Provide Code Snippets

Declare convenience functions and code snippets to help with the workflow construction.

In [2]:
# Load Data
#DataFrame = pd.read_csv('FileName.csv')
# E.g.: df = pd.read_csv('https://raw.githubusercontent.com/GeostatsGuy/GeoDataSets/master/WellPorThick.csv')

# Extract Column as a 1D Array
# 1d_ndarray = df['Feature_Name'].values
# E.g.: porosity = df['Por'].values

# Bootstrap to calculate uncertainty in a statistic
def bootstrap(zdata,nreal,stat):
    zreal = []                            # declare an empty list to store the bootstrap realizations
    for l in range(0,nreal):              # loop over the L bootstrap realizations
        samples = random.choices(zdata, k=len(zdata)) # n Monte Carlo simulations, sample with replacement
        zreal.append(stat(samples))       # calculate the realization of the statistic and append to list
    return np.array(zreal)                # return the realizations of the statistic as a 1D array
# 1d_ndarray = bootstrap(1d_ndarray,L,statistic)
# E.g.: por_avg_real = bootstrap(porosity,L,np.average)

# Make Constant Inputs, Array of Constants if No Uncertainty
# 1d_ndarray = np.full(number_realizations,value)
# E.g.: area = np.full(L,1000000.0)

# Combine Bootstrap Inputs into 1 Array, One Column per Input
# 2d_ndarray = np.column_stack((1d_ndarray,1d_ndarray,...,1d_ndarray))
# E.g.: inputs = np.column_stack((boot_avg_por,boot_avg_thick,area,so))

# Transfer function as a function
def OIP(avg_por,avg_thick,area,so):
    return avg_por*avg_thick*area*so

# Monte Carlo Simulation with input realizations and transfer function
def MCS(transfer_function, inputs):
    sig = signature(transfer_function)
    ninput = len(sig.parameters) 
    nreal = inputs.shape[0]
    if ninput != inputs.shape[1]:
        print('Function does not agree with the number of inputs!')
        return 0
    realizations = []
    for i in range(0,nreal):
        realizations.append(transfer_function( *(inputs[i].tolist()) ) )
    return np.array(realizations)
# 1d_ndarray = MCS(function,2d_ndarray)        # number of function arguments must equal the number of columns in inputs
# E.g.: OIP_real = MCS(transfer_function=OIP,inputs=inputs)
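
The `MCS` function reads `inputs.shape[1]`, so it needs a 2D array with one row per realization and one column per transfer function argument. The sketch below, with hypothetical input values chosen only for illustration, shows one way to build that layout and why joining 1D arrays end to end does not produce it.

```python
import numpy as np

def OIP(avg_por, avg_thick, area, so):
    """Oil in place transfer function, as declared in the notebook."""
    return avg_por * avg_thick * area * so

nreal = 5                                   # small number of realizations for display

# Hypothetical input realizations, for illustration only
avg_por = np.linspace(0.12, 0.16, nreal)    # average porosity, fraction
avg_thick = np.linspace(18.0, 22.0, nreal)  # average thickness, m
area = np.full(nreal, 1.0e6)                # constant area, m^2
so = np.full(nreal, 1.0)                    # constant oil saturation

# np.column_stack places each 1D array in its own column: shape (nreal, 4)
inputs = np.column_stack((avg_por, avg_thick, area, so))
print(inputs.shape)

# np.hstack concatenates 1D arrays end to end: shape (4 * nreal,), no columns
flat = np.hstack((avg_por, avg_thick, area, so))
print(flat.shape)

# One transfer function call per row, which is what MCS does internally
oip = np.array([OIP(*row) for row in inputs])
print(oip)
```

Because the transfer function here is a simple product, the row by row loop gives the same result as multiplying the four arrays directly. The loop form is still useful for transfer functions that are not vectorized.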

#### Insert Your Workflow Below

In [3]:
# Determine the number of realizations
L = 1000

# Load and extract the well data


#### Comments

This was a basic problem for uncertainty modeling with bootstrap and Monte Carlo simulation. Much more can be done!
  
I hope this was helpful,

*Michael*

#### The Author:

### Michael Pyrcz, Associate Professor, University of Texas at Austin 
*Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions*

With over 17 years of experience in subsurface consulting, research and development, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and geoscientists' impact in subsurface resource development. 


#### Want to Work Together?

I hope this content is helpful to those who want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.

* Want to invite me to visit your company for training, mentoring, project review, workflow design and / or consulting? I'd be happy to drop by and work with you! 

* Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!

* I can be reached at mpyrcz@austin.utexas.edu.

I'm always happy to discuss,

*Michael*

Michael Pyrcz, Ph.D., P.Eng. Associate Professor The Hildebrand Department of Petroleum and Geosystems Engineering, Bureau of Economic Geology, The Jackson School of Geosciences, The University of Texas at Austin