# Volumetrics: HCIP calculation

We'll implement the volumetric equation:

$$ V = A \times T \times G \times \phi \times N\!\!:\!\!G \times S_\mathrm{O} \times \frac{1}{B_\mathrm{O}} $$

## Gross rock volume 

$$ \mathrm{GRV} = A \times T $$

In [None]:
thick = 80            # metres
area = 20000 * 30000  # square metres

grv = thick * area
grv

Wouldn't it be cool if we could carry units around with our calculations? With [`pint`](https://pint.readthedocs.io/en/latest/index.html), we can!

In [None]:
import pint

u = pint.UnitRegistry()

thick = 80 * u.m
area = 20000 * u.m * 30000 * u.m
grv = thick * area
grv

By the way, `pint` can also do some really handy things, like parse strings:

In [None]:
import pint

u.Quantity("2,300cm^3")

### EXERCISE

Make a <b>function</b> that computes the GRV by rearranging the following lines of code:

    return grv
    """Compute GRV from thickness and area."""
    grv = thickness * area
    def calculate_grv(thickness, area):


In [None]:
# YOUR CODE HERE



In [None]:
def calculate_grv(thickness, area):
    """
    Compute GRV from thickness and area.
    
    Example                                       # <-- Add as example, if appropriate.
    >>> calculate_grv(thickness=80, area=100)     # <-- But don't add the doctest at first!
    8000
    """
    grv = thickness * area
    return grv

In [None]:
import doctest

doctest.testmod()

Now we can just call this function, instead of remembering the equation. (Admittedly, the equation is rather easy to remember in this case!)

In [None]:
calculate_grv(thick, area)

It works!

Now we need to compensate for the prospect not being a flat slab of rock &mdash; using the geometric factor. 

We will implement the equations implied by this diagram:

<html>
    <img src="http://subsurfwiki.org/images/6/66/Geometric_correction_factor.png" width=600>
</html>

In [None]:
top = input("What shape is the prospect? ")

In [None]:
top

In [None]:
height = 100 * u.m
ratio = thick / height

# Depending on time available, this could be part of the exercise:
if top == 'round':
    g = -0.6 * ratio + 1
elif top == 'flat':
    g = -0.3 * ratio + 1
else:
    g = 1

g

### EXERCISE

Turn the geometric factor into a function. <a title="Remember y = mx + b">HINT</a>

Can you reproduce the plot above?

In [None]:
def geometric_factor(thick, height, top):

    # Your code here.
    
    return g

In [None]:
import numpy as np

def geometric_factor(thick, height, top='slab'):
    """
    Compute geometric factor.
    """
    ratio = np.clip(thick / height, 0, 1)  # Ensure not more than 1.

    # One way:
    if top == 'round':
        g = -0.6 * ratio + 1
    elif top == 'flat':
        g = -0.3 * ratio + 1
    else:
        g = 1
     
    # Or, slightly better:
    f = {'round': -0.6, 'flat': -0.3}.get(top, 0)
    g = f * ratio + 1

    return g

In [None]:
geometric_factor(thick, height=100*u.m, top='round')

In [None]:
import matplotlib.pyplot as plt

thicknesses = np.arange(0, 1, 0.05)
heights = 1
x = thicknesses / heights

for t in ['round', 'flat', 'slab']:
    y = geometric_factor(thicknesses, heights, top=t)
    plt.plot(x, y)
plt.ylim(0, 1.05)

We'll carry on with a `g` of 1.

In [None]:
g = geometric_factor(thick, height=100*u.m, top='slab')

grv * g

💡 It's not that easy to write this function so that you can pass arrays to it, because you need a way to handle the `top` parameter. Give it a try sometime.
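One possible approach to the array problem is to look up the slope for each `top` value first, then let NumPy broadcasting do the rest. The sketch below uses plain numbers rather than `pint` quantities. The names `geometric_factor_array` and `SLOPES` are illustrative and not part of the notebook. The slopes themselves are the ones used in the cells above.

```python
import numpy as np

# Slopes of the lines g = m * ratio + 1, copied from the cells above.
# Any other shape (e.g. 'slab') gets a slope of 0, so g = 1.
SLOPES = {'round': -0.6, 'flat': -0.3}

def geometric_factor_array(thick, height, top):
    """Sketch: `top` may be a single string or a sequence of strings."""
    ratio = np.asarray(thick, dtype=float) / np.asarray(height, dtype=float)
    ratio = np.clip(ratio, 0, 1)  # Ensure not more than 1.
    tops = np.atleast_1d(top)
    slopes = np.array([SLOPES.get(t, 0.0) for t in tops])
    return slopes * ratio + 1

# Three prospects with the same thickness:height ratio but different shapes.
g_demo = geometric_factor_array([50, 50, 50], 100, ['round', 'flat', 'slab'])
```

The dictionary lookup happens once per prospect and the arithmetic stays vectorized, so the same function handles a single prospect or a whole array of them.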

## HC pore volume

We need:

- net:gross &mdash; the ratio of reservoir-quality rock thickness to the total thickness of the interval.
- porosity
- $S_\mathrm{O}$ &mdash; the oil saturation, or proportion of oil to total pore fluid.

In [None]:
netg = 0.5   # fraction
por = 0.24   # fraction
s_o = 0.8    # fraction

netg * por * s_o

We'll leave that as a fraction for now.

### EXERCISE

Turn this into a function.

In [None]:
def calculate_hcpv( ... ):  # Add the arguments.
    
    # YOUR CODE HERE
    
    return  # Don't forget to return something!

In [None]:
def calculate_hcpv(netg, por, s_o):
    """A function to compute the hydrocarbon pore volume."""
    hcpv = netg * por * s_o
    return hcpv

In [None]:
hcpv = calculate_hcpv(netg, por, s_o)

## Formation volume factor

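Oil shrinks when we produce it, especially if it has a high gas-oil ratio (GOR). The formation volume factor (FVF), or $B_\mathrm{O}$, is the ratio of the oil's volume at reservoir conditions (a reservoir barrel) to its volume at surface conditions (a stock-tank barrel, conventionally measured at 60 °F, about 15.6 °C, and 1 atm). Typically the FVF is between 1 (heavy oil) and 1.7 (high GOR).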

$B_\mathrm{O}$ is a function of the oil gravity, gas gravity, temperature, and the solution GOR. [Read about it.](https://petrowiki.spe.org/Oil_formation_volume_factor)
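To make that dependence concrete, here is a minimal sketch of one widely quoted empirical relationship, Standing's correlation, in oilfield units. The function name `standing_bo` and the input values are made up for illustration. Treat the coefficients as an assumption to verify against the page linked above before using them for real work.

```python
def standing_bo(rs, gas_gravity, oil_gravity, temp_f):
    """
    Estimate oil FVF (reservoir bbl per stock-tank bbl) with Standing's
    correlation, as commonly quoted.

    rs: solution GOR in scf/STB.
    gas_gravity: gas specific gravity (air = 1).
    oil_gravity: oil specific gravity (water = 1).
    temp_f: reservoir temperature in degrees Fahrenheit.
    """
    x = rs * (gas_gravity / oil_gravity) ** 0.5 + 1.25 * temp_f
    return 0.9759 + 0.00012 * x ** 1.2

# Illustrative inputs only; these are not from any real field.
bo_demo = standing_bo(rs=500, gas_gravity=0.8, oil_gravity=0.85, temp_f=180)
```

With these inputs the result falls inside the typical range of 1 to 1.7 quoted above, and it grows with both GOR and temperature.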

In [None]:
fvf = 1.1

### EXERCISE

For gas, $B_\mathrm{G}$ is $0.35 Z T / P$, where $Z$ is the correction factor, or gas compressibility factor. $T$ should be in kelvin and $P$ in kPa. $Z$ is usually between 0.8 and 1.2, but it can be as low as 0.3 and as high as 2.0.

Can you write a function to calculate $B_\mathrm{G}$?

In [None]:
def calculate_Bg( ... ):

    # YOUR CODE HERE
    


In [None]:
def calculate_Bg(T=273.15, P=101.325, Z=1, units='SI'):  # <-- Discussion about handling units.
    """
    Compute B_G from correction factor Z, temperature T (K),
    and pressure P (kPa).
    """
    Bg = 0.35 * Z * T / P
    
    # if units == 'SI':
    # etc.
    
    return Bg

In [None]:
x = calculate_Bg(T=293*u.K, P=1000*u.kPa)

assert abs(x.m - 0.10255) < 1e-9  # Attribute m gives magnitude only; avoid exact float comparison.

x
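Where does the 0.35 come from? A sketch of the reasoning, assuming standard conditions of 60 °F and 1 atm and $Z \approx 1$ at the surface: the real gas law gives $V \propto Z T / P$, so $B_\mathrm{G} = V_\mathrm{res} / V_\mathrm{sc} = (P_\mathrm{sc} / T_\mathrm{sc}) \, Z T / P$. The names below (`T_SC`, `P_SC`, `bg_from_gas_law`) are illustrative only.

```python
# Assumed standard conditions: 60 degF and 1 atm.
T_SC = (60 - 32) * 5 / 9 + 273.15   # kelvin, about 288.7
P_SC = 101.325                      # kPa

# The constant in B_G = constant * Z * T / P, in kPa/K.
bg_constant = P_SC / T_SC

def bg_from_gas_law(T, P, Z=1.0):
    """B_G from the real gas law; T in kelvin, P in kPa."""
    return bg_constant * Z * T / P

bg_demo = bg_from_gas_law(T=293, P=1000)
```

The constant comes out at about 0.351, which rounds to the 0.35 used above, so `bg_demo` differs from the rounded result only in the fourth decimal place.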

## Put it all together

Now we have the components of the volumetric equation:

In [None]:
hcip = grv * g * hcpv / fvf
hcip

Pint can convert to other units, e.g. Imperial barrels, for us.

In [None]:
hcip.to('imperial_barrel')

An Imperial barrel is about 43 US gallons ([Wikipedia](https://en.wikipedia.org/wiki/Barrel_(unit))), whereas an oil barrel is only 42 gallons. For more on conversion to bbl, BOE, etc., see [Barrel of oil equivalent](https://en.wikipedia.org/wiki/Barrel_of_oil_equivalent).
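To see what a 42-gallon barrel means in cubic metres, we can do the arithmetic by hand. This is a plain-Python sketch that does not use `pint`. It relies on the exact definitions of the US gallon (231 cubic inches) and the inch (0.0254 m), and the names `M3_PER_OIL_BARREL` and `m3_to_bbl` are illustrative.

```python
# Exact definitions: 1 US gallon = 231 cubic inches, 1 inch = 0.0254 m.
M3_PER_GALLON = 231 * 0.0254 ** 3
M3_PER_OIL_BARREL = 42 * M3_PER_GALLON   # about 0.159 cubic metres

def m3_to_bbl(volume_m3):
    """Convert cubic metres to 42-gallon oil barrels."""
    return volume_m3 / M3_PER_OIL_BARREL

# The single-valued inputs from the cells above, with a geometric factor of 1.
hcip_m3 = 80 * 20000 * 30000 * 0.5 * 0.24 * 0.8 / 1.1
hcip_gbbl = m3_to_bbl(hcip_m3) / 1e9     # billions of barrels
```

This gives an independent check on whatever the unit library reports for the same inputs.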

So let's define a custom unit:

In [None]:
u.define('oil_barrel = 42 gallon = bbl')

In [None]:
hcip.to('bbl')

In [None]:
hcip.to('Gbbl')

### EXERCISE

Can you write a function to compute the volume (i.e. the HCIP), given all the inputs?

Try to use the functions you have already written.

Make it possible to pass in the geometric factor, `g`; if it is passed in, just ignore the values for `height` and `top`.

In [None]:
# Put your code here.


    

In [None]:
def calculate_hcip(thickness, area, height, top, netg, por, s_o, fvf, g=None):
    """
    Calculate HCIP, given all the parameters.
    
    If geometric factor `g` is given, `height` and `top` are ignored.
    """
    grv = calculate_grv(thickness, area)
    
    if g is None:
        g = geometric_factor(thickness, height, top)

    grv *= g
    hcpv = calculate_hcpv(netg, por, s_o)
    return grv * hcpv / fvf

In [None]:
v = calculate_hcip(thick, area, height, top, netg, por, s_o, fvf).to('Gbbl')

assert abs(v.m - 26.35) < 0.01
assert v.u == 'gigaoil_barrel'

In [None]:
v = calculate_hcip(thick, area, None, None, netg, por, s_o, fvf, g=0.5)

assert abs(v.m - 2.095e9) < 1e6
assert v.u == 'meter ** 3'

## Monte Carlo simulation

We can easily draw randomly from distributions of properties:

- Normal: https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.normal.html
- Uniform: https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.uniform.html
- Lognormal: https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.lognormal.html

The normal distribution is probably familiar:

<img src="https://subsurfwiki.org/images/3/3a/Normal_distribution.png" width="500px" />

In [None]:
import numpy as np

por = np.random.normal(loc=0.15, scale=0.025, size=100)
por

In [None]:
import matplotlib.pyplot as plt

_ = plt.hist(por, bins='auto')  # Various bin calcs: sqrt, fd, stone, rice, etc.

In [None]:
import seaborn as sns
sns.set_style("whitegrid")

sns.histplot(por, kde=True)

We expect that our simple functions work on NumPy arrays:

In [None]:
netg = np.random.normal(loc=0.5, scale=0.1, size=100)

hcpv = calculate_hcpv(netg, por, s_o)
hcpv

In [None]:
_ = plt.hist(hcpv)

The histogram looks a bit ragged, but this is probably because of the relatively small number of samples.
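A rough way to see the effect of sample size is to repeat the draw many times and watch how much the sample mean wanders. The sketch below assumes the same porosity distribution as above. The function `spread_of_mean`, the seed, and the choice of 200 repeats are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)  # Seeded only so the sketch is repeatable.

def spread_of_mean(n_samples, n_repeats=200):
    """Standard deviation of the sample mean over repeated draws."""
    draws = rng.normal(loc=0.15, scale=0.025, size=(n_repeats, n_samples))
    return draws.mean(axis=1).std()

# Sampling noise for three sample sizes.
spreads = {n: spread_of_mean(n) for n in (100, 1000, 10000)}
```

The spread shrinks roughly as $1/\sqrt{n}$, so 100 times as many samples give about a tenth of the noise.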

### EXERCISE

1. Compute HCIP with these distributions. Make a histogram of the result in millions of barrels.
1. How does the histogram look if you take 1000 samples instead of 100?
1. Make distributions for some of the other properties, like thickness and FVF.
1. Check that you don't get unreasonable values, like negative numbers, or decimal fractions over 1.0. Try to implement this if you have time.

In [None]:
# YOUR CODE HERE



In [None]:
area = np.random.normal(loc=2e4*3e4, scale=1e7, size=1000) * u.m**2
thick = np.random.normal(loc=80, scale=10, size=1000) * u.m
grv = calculate_grv(thick, area)

netg = np.random.normal(loc=0.5, scale=0.1, size=1000)
por = np.random.normal(loc=0.15, scale=0.025, size=1000)
s_o = np.random.normal(loc=0.8, scale=0.05, size=1000)

# Regularize.
netg = np.clip(netg, a_min=0.0, a_max=1.0)
por = np.clip(por, a_min=0.0, a_max=0.35)
s_o = np.clip(s_o, a_min=0.0, a_max=1.0)

hcpv = calculate_hcpv(netg, por, s_o)

fvf = np.random.normal(loc=1.05, scale=0.05, size=1000)
fvf = np.clip(fvf, a_min=1.0, a_max=np.inf)

hcip = grv * hcpv / fvf

In [None]:
_ = plt.hist(hcip.m)
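One side effect of `np.clip` is that out-of-range samples pile up exactly on the bounds. With the FVF distribution above, for example, values more than one standard deviation below the mean (about 16% of draws) all become exactly 1.0. A possible alternative, sketched here, is to redraw the offending samples instead. The helper `truncated_normal` is hypothetical and not part of the notebook.

```python
import numpy as np

def truncated_normal(rng, loc, scale, low, high, size):
    """Draw normal samples, redrawing any that fall outside [low, high]."""
    out = rng.normal(loc, scale, size)
    bad = (out < low) | (out > high)
    while bad.any():
        # Replace only the out-of-range samples with fresh draws.
        out[bad] = rng.normal(loc, scale, bad.sum())
        bad = (out < low) | (out > high)
    return out

rng = np.random.default_rng(1)  # Seeded only so the sketch is repeatable.
netg_t = truncated_normal(rng, loc=0.5, scale=0.1, low=0.0, high=1.0, size=1000)
fvf_t = truncated_normal(rng, loc=1.05, scale=0.05, low=1.0, high=np.inf, size=1000)
```

Redrawing changes the shape of the distribution slightly (its mean shifts away from the truncated side), so which approach is preferable depends on what the bounds are meant to represent.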

Now we can compute some summary statistics. For the **mode** we'll need the KDE:

In [None]:
import scipy.stats as st
import scipy.optimize as so

# Fit a KDE.
kernel = st.gaussian_kde(hcip.m)

# Evaluate it, e.g. to make a plot.
x = np.linspace(0, 7e9, 100)
kde = kernel(x)

# Find the maximum on the -ve KDE (all we have is 'minimize'):
x0 = x[np.argmax(kde)]  # Initial guess: the highest point on the grid.
maxx = so.fmin(lambda x_i: -kernel(x_i), x0)

# Make a plot.
plt.hist(hcip.m)
plt.plot(x, kde*300*2e9)
plt.axvline(maxx, c='r')
plt.show()

In [None]:
stats = {
    'p10': np.percentile(hcip, 10),  # Some people call this P90.
    'mode': maxx.item() * hcip.u,    # Add units.
    'p50': np.median(hcip),
    'mean': np.mean(hcip),
    'p90': np.percentile(hcip, 90),  # Some people call this P10.
}

for stat, x in stats.items():
    print(f"{stat} is {x.to('Gbbl').m:.1f} billion bbl")
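The comments about P10 and P90 in the cell above refer to two naming conventions. In the exceedance convention, 'P90' is the volume that has a 90% chance of being exceeded, which is the 10th percentile of the distribution. A small sketch with made-up lognormal volumes shows the relationship; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)  # Seeded only so the sketch is repeatable.
volumes = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)  # Made-up volumes.

# 10th percentile: 10% of outcomes are smaller than this value.
pct10 = np.percentile(volumes, 10)

# Fraction of outcomes that exceed it, i.e. the exceedance probability.
frac_exceeding = np.mean(volumes > pct10)
```

About 90% of the samples exceed the 10th percentile, which is why some people call it P90. Always state which convention you are using.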

In [None]:
sns.displot(hcip.to('Gbbl').m, kde=True, aspect=2)
for stat, x in stats.items():
    c = 'g' if stat == 'mode' else 'r' if stat == 'mean' else 'b'
    plt.axvline(x.to('Gbbl').m, c=c, alpha=0.67)

#### Go further

We've just scratched the surface of Monte Carlo simulation here. To find out more, check out the notebook [Monte_Carlo_simulation.ipynb](Monte_Carlo_simulation.ipynb).

Some things to think about:

- Our variables were uncorrelated, whereas some of them might actually be correlated. For example, perhaps net:gross and thickness are related.
- Some of these properties probably do not have a normal distribution; for example, porosity is often skewed.
- Rather than doing the entire field at once, we might want to break it into components and simulate them independently. This would allow us to model the spatial dependence of, say, thickness. Then we're going down the geomodeling road...
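As a starting point for the first idea, here is a minimal sketch of drawing correlated net:gross and thickness values from a bivariate normal distribution. The correlation coefficient of 0.6 is invented purely for illustration. The means and standard deviations are the ones used earlier.

```python
import numpy as np

rng = np.random.default_rng(42)  # Seeded only so the sketch is repeatable.

means = np.array([0.5, 80.0])    # net:gross (fraction), thickness (m)
stds = np.array([0.1, 10.0])
rho = 0.6                        # Assumed correlation, purely illustrative.

# Build the covariance matrix from the standard deviations and correlation.
corr = np.array([[1.0, rho], [rho, 1.0]])
cov = corr * np.outer(stds, stds)

samples = rng.multivariate_normal(means, cov, size=5000)
netg_c = np.clip(samples[:, 0], 0.0, 1.0)
thick_c = samples[:, 1]

# Check the correlation we actually got.
r = np.corrcoef(netg_c, thick_c)[0, 1]
```

Positive correlation between the inputs widens the spread of the resulting volumes, because high values tend to occur together.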

---

## Compute on a DataFrame

Suppose we have a spreadsheet of prospect data:

In [None]:
import pandas as pd

uid = "1P2JxXG_jLZ0vx8BlFvm0hD6sBBZH2zU8tk9T-SI27mE"
url = f"https://docs.google.com/spreadsheets/d/{uid}/export?format=csv"

df = pd.read_csv(url)
df.head()

We usually want to avoid looping over `pandas.DataFrame` objects, so when we want to do something to every row (or column) the best idea is usually to write a 'row processing' function, then use `df.apply()` to set it to work. This is a [**functional programming**](https://en.wikipedia.org/wiki/Functional_programming) paradigm.

In [None]:
def hcpv_row(row):
    """Process one row."""
    hcpv = calculate_hcpv(row['N:G'], row['phi'], row['So'])
    return hcpv

df.apply(hcpv_row, axis=1)

### EXERCISE

- Compute the HCIP for every prospect. <a title="Write a function that computes the HCIP for one row. Then use df.apply(func, axis=1) to apply it to the DataFrame.">Hover for HINT</a>
- Make a histogram of the HCIP volumes.
- List the largest prospects. <a title="Try df.nlargest()">Hover for HINT</a>
- Plot the prospects on a map, using HCIP or some other data as the size of the marker.
- Can you generate a Monte Carlo result for every prospect?

In [None]:
# OPTIONAL
# You might like to use this mapping from the
# DataFrame columns to the function arguments:
names = {
    'thickness': 'Thick [m]',
    'area': 'Area [km2]',
    'height': None,
    'top': None,
    'netg': 'N:G',
    'por': 'phi',
    's_o': 'So',
    'fvf': 'Bo',
    'g': 'GeomFactor',
}

In [None]:
# YOUR CODE HERE



In [None]:
names = {
    'thickness': 'Thick [m]',
    'area': 'Area [km2]',
    'height': None,
    'top': None,
    'netg': 'N:G',
    'por': 'phi',
    's_o': 'So',
    'fvf': 'Bo',
    'g': 'GeomFactor',
}

def hcip_row(row):
    params = {k: row.get(v) for k, v in names.items()}
    hcip = calculate_hcip(**params)
    return hcip

In [None]:
df['HCIP'] = df.apply(hcip_row, axis=1)

df.head()
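A note on units: if the column names are accurate, thickness is in metres and area is in square kilometres, so the HCIP values computed this way are in the mixed unit m × km² (millions of cubic metres). A plain-Python sketch of the conversion to millions of oil barrels follows. The function name `m_km2_to_mmbbl` is illustrative, and the value 400 is just an example input.

```python
M3_PER_OIL_BARREL = 42 * 231 * 0.0254 ** 3   # 42 US gallons, about 0.159 m^3

def m_km2_to_mmbbl(value):
    """Convert a volume in m x km^2 to millions of oil barrels."""
    volume_m3 = value * 1e6                  # 1 km^2 = 1e6 m^2
    return volume_m3 / M3_PER_OIL_BARREL / 1e6

mmbbl_demo = m_km2_to_mmbbl(400)
```

Because the two factors of a million cancel, the conversion amounts to multiplying by about 6.29.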

In [None]:
df.HCIP.hist()

In [None]:
df[df.HCIP > 400]

In [None]:
df.nlargest(5, ['HCIP'])

## A peek at GeoPandas

In [None]:
import geopandas as gpd

geometry = gpd.points_from_xy(df['UTMx [m]'], df['UTMy [m]'])

gdf = gpd.GeoDataFrame(df, geometry=geometry)

gdf

In [None]:
gdf.plot(markersize=df.HCIP, c=df.phi)

In [None]:
gdf.to_file('Prospects.shp')
gdf.to_file('Prospects.geojson', driver='GeoJSON')

<hr />

<div>
<img src="https://avatars1.githubusercontent.com/u/1692321?s=50"><p style="text-align:center">© Agile Geoscience 2021</p>
</div>