
### Kriging for Estimation Maps

#### Reidar B Bratvold, Professor, University of Stavanger

Here's a simple workflow for spatial estimation with kriging. This step is critical for:

1. Prediction away from wells, e.g. pre-drill assessments.
2. Spatial cross validation.
3. Spatial uncertainty modeling.

First let's explain the concept of spatial estimation.

#### Spatial Estimation

Consider the case of making an estimate at some unsampled location, $z(\bf{u}_0)$, where $z$ is the property of interest (e.g., porosity) and $\bf{u}_0$ is a location vector describing the unsampled location.

How would you do this given data, $z(\bf{u}_1)$, $z(\bf{u}_2)$, and $z(\bf{u}_3)$?

It would be natural to use a set of linear weights to formulate the estimator given the available data.

\begin{equation}
z^{*}(\bf{u}) = \sum^{n}_{\alpha = 1} \lambda_{\alpha} z(\bf{u}_{\alpha})
\end{equation}

We could add an unbiasedness constraint that forces the weights to sum to one. Instead, we assign the remainder of the weight (one minus the sum of the weights) to the global average; therefore, if we have no informative data, the estimate defaults to the global average of the property of interest.

\begin{equation}
z^{*}(\bf{u}) = \sum^{n}_{\alpha = 1} \lambda_{\alpha} z(\bf{u}_{\alpha}) + \left(1-\sum^{n}_{\alpha = 1} \lambda_{\alpha} \right) \overline{z}
\end{equation}

We will make a stationarity assumption, so let's assume that we are working with residuals, $y$. 

\begin{equation}
y^{*}(\bf{u}) = z^{*}(\bf{u}) - \overline{z}(\bf{u})
\end{equation}

If we substitute this form into our estimator the estimator simplifies, since the mean of the residual is zero.

\begin{equation}
y^{*}(\bf{u}) = \sum^{n}_{\alpha = 1} \lambda_{\alpha} y(\bf{u}_{\alpha})
\end{equation}

while satisfying the unbiasedness constraint.  
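
To make the two forms of the estimator concrete, here is a minimal sketch. The data values, weights, and global mean are hypothetical and chosen only for illustration; the weights are not kriging weights. It shows that the estimator with leftover weight on the global mean matches the residual form, and that zero weights return the global mean.

```python
import numpy as np

def linear_estimate(weights, values, global_mean):
    """Weighted linear estimate; the leftover weight goes to the global mean."""
    weights = np.asarray(weights, dtype=float)
    values = np.asarray(values, dtype=float)
    return float(weights @ values + (1.0 - weights.sum()) * global_mean)

values = np.array([0.12, 0.08, 0.15])    # hypothetical porosity data at u_1, u_2, u_3
weights = np.array([0.5, 0.2, 0.1])      # hypothetical weights (not from kriging)
global_mean = 0.10                       # hypothetical global mean

# Estimator written with the data values and the global mean
est = linear_estimate(weights, values, global_mean)

# The same estimator written with residuals, y = z - mean
est_resid = global_mean + float(weights @ (values - global_mean))

# With no informative data (all weights zero) we recover the global mean
est_no_info = linear_estimate(np.zeros(3), values, global_mean)

print(est, est_resid, est_no_info)
```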

#### Kriging

Now the next question is what weights should we use?  

We could use equal weighting, $\lambda = \frac{1}{n}$, and the estimator would be the average of the local data applied for the spatial estimate. This would not be very informative.

We could assign weights considering the spatial context of the data and the estimate:

* **spatial continuity** as quantified by the variogram (and covariance function)
* **redundancy** the degree of spatial continuity among the available data themselves
* **closeness** the degree of spatial continuity between the available data and the estimation location

The kriging approach accomplishes this, calculating the best linear unbiased weights for the local data to estimate at the unknown location.  The derivation of the kriging system and the resulting linear set of equations is available in the lecture notes.  Furthermore kriging provides a measure of the accuracy of the estimate!  This is the kriging estimation variance (sometimes just called the kriging variance).

\begin{equation}
\sigma^{2}_{E}(\bf{u}) = C(0) - \sum^{n}_{\alpha = 1} \lambda_{\alpha} C(\bf{u}_0 - \bf{u}_{\alpha})
\end{equation}

What is 'best' about this estimate? Kriging estimates are best in that they minimize the above estimation variance. 
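
As a minimal sketch of how the weights and the estimation variance can be computed, the block below solves the simple kriging system $\mathbf{C}\boldsymbol{\lambda} = \mathbf{c}_0$ with NumPy. The isotropic spherical covariance with a range of 300 and a sill of 1.0, the data locations, and the residual values are assumptions for illustration; this is not the `kb2d` implementation and it uses all data without a search neighborhood.

```python
import numpy as np

def spherical_cov(h, sill=1.0, a=300.0):
    """Isotropic spherical covariance C(h) with range a (an assumed model)."""
    r = np.minimum(np.asarray(h, dtype=float) / a, 1.0)
    return sill * (1.0 - 1.5 * r + 0.5 * r**3)

def simple_kriging(xy, resid, xy0, sill=1.0, a=300.0):
    """Solve C @ lam = c0; return weights, residual estimate, and variance."""
    # Data-to-data covariances capture redundancy
    C = spherical_cov(np.linalg.norm(xy[:, None] - xy[None, :], axis=2), sill, a)
    # Data-to-estimate covariances capture closeness
    c0 = spherical_cov(np.linalg.norm(xy - xy0, axis=1), sill, a)
    lam = np.linalg.solve(C, c0)
    return lam, float(lam @ resid), float(sill - lam @ c0)

xy = np.array([[100.0, 100.0], [200.0, 150.0], [300.0, 250.0]])  # hypothetical data locations
resid = np.array([0.02, -0.01, 0.03])                            # hypothetical residuals y(u_alpha)
xy0 = np.array([150.0, 120.0])                                   # estimation location u_0

lam, y_est, var_est = simple_kriging(xy, resid, xy0)
print("weights:", lam)
print("residual estimate:", y_est, "estimation variance:", var_est)
```

The variance is computed from the weights and covariances only, so it does not change if the residual values change.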

#### Properties of Kriging

Here are some important properties of kriging:

* **Exact interpolator** - kriging estimates with the data values at the data locations
* **Kriging variance** can be calculated before getting the sample information, as the kriging estimation variance is not dependent on the values of the data nor the kriging estimate, i.e. the kriging estimator is homoscedastic. 
* **Spatial context** - beyond spatial continuity, closeness, and redundancy, kriging accounts for the configuration of the data and the structural continuity of the variable being estimated.
* **Scale** - kriging may be generalized to account for the support volume of the data and estimate. We will cover this later.
* **Multivariate** - kriging may be generalized to account for multiple secondary data in the spatial estimate with the cokriging system. We will cover this later.
* **Smoothing effect** of kriging can be forecast. We will use this to build stochastic simulations later.
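
Two of these properties can be checked with a short worked derivation. This is a sketch that assumes a stationary residual $y$ with covariance $C$ and takes the simple kriging system from standard references rather than deriving it here:

\begin{equation}
\sum^{n}_{\beta = 1} \lambda_{\beta} C(\bf{u}_{\alpha} - \bf{u}_{\beta}) = C(\bf{u}_0 - \bf{u}_{\alpha}), \quad \alpha = 1, \ldots, n
\end{equation}

For exactness, place the estimate at a data location, $\bf{u}_0 = \bf{u}_{\beta}$. The right-hand side then equals one column of the left-hand side matrix, so the solution is $\lambda_{\beta} = 1$ with all other weights zero. The estimate is the data value and the estimation variance is $C(0) - C(0) = 0$.

For the smoothing effect, the variance of the estimate is

\begin{equation}
Var\{y^{*}(\bf{u}_0)\} = \sum^{n}_{\alpha = 1} \sum^{n}_{\beta = 1} \lambda_{\alpha} \lambda_{\beta} C(\bf{u}_{\alpha} - \bf{u}_{\beta}) = \sum^{n}_{\alpha = 1} \lambda_{\alpha} C(\bf{u}_0 - \bf{u}_{\alpha}) = C(0) - \sigma^{2}_{E}(\bf{u}_0)
\end{equation}

so the variance missing from the kriging estimates relative to the sill, $C(0)$, is the kriging variance itself.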

#### Objective 

To provide hands-on experience with building subsurface modeling workflows. Python provides an excellent vehicle to accomplish this.

The objective is to remove the hurdles of subsurface modeling workflow construction by providing building blocks and sufficient examples. This is not a coding class per se, but we need the ability to 'script' workflows working with numerical methods.    

#### Load the required libraries

The following code loads the required libraries.

In [None]:
import geostats                 # GSLIB methods converted to Python    

We will also need some standard packages. These should have been installed with Anaconda 3.

In [None]:
import os                                               # to set current working directory 
import numpy as np                                      # arrays and matrix math
import pandas as pd                                     # DataFrames
import matplotlib.pyplot as plt                         # plotting

If you get a package import error, you may have to first install some of these packages. This can usually be accomplished by opening up a command window on Windows and then typing 'python -m pip install [package-name]'. More assistance is available with the respective package docs.  

#### Loading Tabular Data

Here's the command to load our comma-delimited data file into a pandas DataFrame object. 

In [None]:
fraction_data = 0.2                                     # extract a fraction of data for demonstration / faster runs, set to 1.0 for homework

#df = pd.read_csv("sample_data_MV_biased.csv")                     # read a .csv file in as a DataFrame
df = pd.read_csv("../data/sample_data_MV_biased.csv") # load the data from the local data folder (the file is available in Dr. Pyrcz's GitHub repository)

if fraction_data < 1.0:
    df = df.sample(frac = fraction_data,replace = False,random_state = 73073) #`random_state = 73073` ensures reproducibility
df = df.reset_index() #ensuring a continuous index
df = df.iloc[:,2:] #remove the first two columns

df['LogPerm'] = np.log(df['Perm'].values)

df.head()                                               # we could also use this command for a table preview 

#### Summary Statistics

Let's look at summary statistics for all facies combined:

In [None]:
df.describe().transpose()                          # summary table of all facies combined DataFrame statistics

#### Distributions

In [None]:
plt.subplot(121)                                        # plot the cumulative histogram (CDF) of porosity, all facies
plt.hist(df['Porosity'], facecolor='darkorange',bins=np.linspace(0.0,0.25,1000),histtype="stepfilled",alpha=0.8,density=True,cumulative=True,edgecolor='black',label='Original')
plt.xlim([0.05,0.25]); plt.ylim([0,1.0])
plt.xlabel('Porosity (fraction)'); plt.ylabel('Cumulative Probability'); plt.title('Porosity')
plt.legend(loc='upper left')
plt.grid(True)

plt.subplot(122)                                        # plot the cumulative histogram (CDF) of permeability, all facies
plt.hist(df['Perm'], facecolor='darkorange',bins=np.linspace(0.0,1000.0,100000),histtype="stepfilled",alpha=0.8,density=True,cumulative=True,edgecolor='black',label='Original')
plt.xlim([0.0,1000.0]); plt.ylim([0,1.0])
plt.xlabel('Permeability (mD)'); plt.ylabel('Cumulative Probability'); plt.title('Permeability')
plt.legend(loc='upper left')
plt.grid(True)

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=1.2, wspace=0.2, hspace=0.3)
plt.show()

We can observe from the CDFs that the porosity distribution is 'Gaussian-like' in shape, while the permeability distribution is 'lognormal-like'.  They both look well behaved.

#### Calculating the Representative Mean with Declustering

For brevity we will omit data declustering from this workflow. We will assume declustered means for the porosity and permeability to apply with simple kriging.

#### Location Maps

Let's plot the location maps of porosity and permeability for all facies to compare spatial features between the two properties.

#### Location map function

In [None]:
def locmap_st(df, x_col, y_col, feature_col, xmin, xmax, ymin, ymax, 
              vmin, vmax, title, xlabel, ylabel, cmap="inferno"):
    """
    Plots a spatial location map of a feature using a scatter plot in the current subplot.

    Parameters:
    -----------
    df : pd.DataFrame
        Dataframe containing spatial data.
    x_col, y_col : str
        Column names for X and Y coordinates.
    feature_col : str
        Column name for the feature to be visualized.
    xmin, xmax : float
        Minimum and maximum values for the X-axis.
    ymin, ymax : float
        Minimum and maximum values for the Y-axis.
    vmin, vmax : float
        Minimum and maximum values for the color scale.
    title : str
        Title of the plot.
    xlabel, ylabel : str
        Labels for the X and Y axes.
    cmap : str (default="inferno")
        Colormap for visualization.

    Returns:
    --------
    None (displays the plot)
    """

    ax = plt.gca()  # Get the current subplot
    scatter = ax.scatter(
        df[x_col], df[y_col], c=df[feature_col],
        cmap=cmap, edgecolors="black", vmin=vmin, vmax=vmax
    )

    # Add color bar
    cbar = plt.colorbar(scatter, ax=ax, shrink=0.8)
    cbar.set_label(feature_col)

    # Set axis labels and limits
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_xlim(xmin, xmax)
    ax.set_ylim(ymin, ymax)

    # Set title
    ax.set_title(title)

In [None]:
xmin = 0.0; xmax = 1000.0               # range of x values
ymin = 0.0; ymax = 1000.0               # range of y values

xsiz = 10; ysiz = 10                    # cell size
nx = 100; ny = 100                      # number of cells
xmn = 5; ymn = 5                        # grid origin, location center of lower left cell

cmap = plt.cm.plasma                    # color map

plt.subplot(121)
locmap_st(df, 'X', 'Y', 'Porosity', 0, 1000, 0, 1000, 0, 0.25,
          'Porosity - All Facies', 'X (m)', 'Y (m)', cmap)

plt.subplot(122)
locmap_st(df, 'X', 'Y', 'Perm', 0, 1000, 0, 1000, 0, 1000,
          'Permeability - All Facies', 'X (m)', 'Y (m)', cmap)

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=1.0, wspace=0.3, hspace=0.3)
plt.show()
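
The grid parameters above follow the GSLIB convention stated in the comments, with `xmn` and `ymn` at the center of the lower-left cell. The following sketch shows how the cell centers and model extents follow from `nx`, `xmn`, and `xsiz`. The helper `cell_index` is hypothetical and is not part of `geostats`.

```python
import numpy as np

nx, ny = 100, 100            # number of cells
xsiz, ysiz = 10.0, 10.0      # cell size
xmn, ymn = 5.0, 5.0          # center of the lower-left cell

# Cell-center coordinates along each axis
x_centers = xmn + xsiz * np.arange(nx)
y_centers = ymn + ysiz * np.arange(ny)

# Model extents implied by the grid (cell edges, not cell centers)
xmin_grid = xmn - 0.5 * xsiz
xmax_grid = xmin_grid + nx * xsiz

def cell_index(coord, origin, size, n):
    """Index of the cell containing coord, clipped to the grid (hypothetical helper)."""
    i = int(np.floor((coord - (origin - 0.5 * size)) / size))
    return min(max(i, 0), n - 1)

print(x_centers[0], x_centers[-1], xmin_grid, xmax_grid)
print(cell_index(12.0, xmn, xsiz, nx))
```

With these values the cell edges span 0 to 1000, which matches `xmin`, `xmax`, `ymin`, and `ymax` used for plotting.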

#### Kriging for Porosity and Permeability Maps

Now let's try some kriging with the continuous properties. For this workflow we will demonstrate a cookie-cutter approach.  The steps are:

1. model the facies, sand and shale, probabilities with indicator kriging
2. model the porosity for sand and shale separately and exhaustively, i.e. at all locations in the model
3. model the permeability for sand and shale separately and exhaustively, i.e. at all locations in the model
4. assign sand and shale locations based on the probabilities from step 1 
5. combine the porosity and permeability from sand and shale regions together

Limitations of this Workflow:

* kriging is too smooth, the spatial continuity is too high
* kriging does not reproduce the continuous property distributions
* we are not accounting for the correlation between porosity and permeability 

We will correct these issues when we perform simulation later.

We need to add a couple of parameters and assume a porosity variogram model.


In [None]:
skmean_por = 0.10; skmean_perm = 65.0      # simple kriging mean (used if simple kriging is selected below)
ktype = 0                                  # kriging type, 0 - simple, 1 - ordinary
radius = 300                               # search radius for neighbouring data
nxdis = 1; nydis = 1                       # number of grid discretizations for block kriging (not tested)
ndmin = 0; ndmax = 40                      # minimum and maximum data for an estimate
por_min = 0.0; por_max = 0.3               # minimum and maximum porosity values
perm_min = 0.0; perm_max = 1000.0          # minimum and maximum permeability values

#### Simple Kriging Estimation Map of Porosity

Let's start with spatial estimates of porosity and permeability with all facies combined. We will also look at the kriging estimation variance.

We will also not use cyclicity for now, as we are just getting started.  

* Let's build a reasonable model to the sill.

##### `make_variogram` function

We use the make_variogram function to make a variogram model 

* a dictionary for compact storage of the variogram model parameters to pass into plotting (below), kriging and simulation methods 

The variogram model parameters include:

* **nug** - nugget effect contribution to sill
* **nst** - number of nested structures (1 or 2)
* **it** - type for this nested structure (1 - spherical, 2 - exponential, 3 - Gaussian)
* **c** - contribution of each nested structure (contributions + nugget must sum to the sill)
* **ang** - the azimuth for this nested structure of the major direction, the minor is orthogonal
* **hmaj** - the range for this nested structure in the major direction
* **hmin** - the range for this nested structure in the minor direction

The parameters it, c, ang, hmaj, and hmin are repeated with a suffix, 1 or 2, for the 1st and 2nd structures.

* for only 1 structure plus optional nugget, omit the 2nd structure parameters; the model then has no second-structure contribution (equivalent to $cc2 = 0$)

Here's my model:

```python
nug = 0.0; nst = 2
it1 = 1; cc1 = 0.6; azi1 = 45; hmaj1 = 350; hmin1 = 350                # first structure
it2 = 1; cc2 = 0.4; azi2 = 45; hmaj2 = 9999.9; hmin2 = 400             # second structure
```

Some comments on our model:

* we model to the sill of 1.0, since we applied the normal score transform ($nug + cc1 + cc2 = 1.0$)

* we used 2 spherical structures to capture zonal anisotropy in the 045 azimuth

* since the experimental variogram exceeds the sill with trend or cyclicity we could have attempted trend modeling and then worked with the residual, but we will not do this for workflow brevity and simplicity

We input these model parameters to make a variogram model dictionary with the make_variogram function as follows:

```python
vario = make_variogram(nug,nst,it1,cc1,azi1,hmaj1,hmin1,it2,cc2,azi2,hmaj2,hmin2)
```

##### `vmodel` function

To plot the variogram we use the vmodel function to project the model in the major and minor directions

The inputs for vmodel are:

* **nlag** - the number of points along the variogram to calculate for the projection

* **xlag** - the size of a lag for the projection

* **azm** - the direction of the projection in azimuth (this is all we need since we are working in 2D)

* **vario** - the variogram model dictionary from the make_variogram function (above)

Note: this function is just for visualization by projecting the variogram model in a direction, so the convention is to use a very small **xlag** and large **nlag** for a high resolution display of the variogram model

The outputs from the vmodel program include:

* **index** - the lag number for the projection

* **lag distance** - the distance offset along the projection (the **h** in the variogram plot)

* **variogram** - the variogram value at the lag distance for the projection (the $\gamma$(**h**) in the variogram plot)

* **covariance function** - the covariance function at the lag distance for the projection (for the C(**h**) plot)

* **correlogram** - the correlogram at the lag distance for the projection (for the $\rho$(**h**) plot)

We have 2 structures and no nugget effect.  We needed the 2nd structure to capture the zonal anisotropy in the 045 direction.  Let's calculate the variogram model in these directions and plot them with the experimental variograms.   
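
To illustrate what projecting the model in a direction means, here is a minimal sketch that evaluates the two-structure model above along a chosen azimuth. The function `project_variogram` is hypothetical and is not the `vmodel` implementation. It assumes spherical structures only, geometric anisotropy, and azimuths measured in degrees clockwise from north (the positive Y direction).

```python
import numpy as np

def spherical(r):
    """Standardized spherical structure, reaching 1.0 at r = 1."""
    r = np.minimum(r, 1.0)
    return 1.5 * r - 0.5 * r**3

def project_variogram(vario, azm, nlag=100, xlag=10.0):
    """Variogram values at nlag lags of size xlag along azimuth azm (degrees)."""
    lags = xlag * np.arange(nlag)
    hx = lags * np.sin(np.radians(azm))        # lag vector, x component
    hy = lags * np.cos(np.radians(azm))        # lag vector, y component
    gamma = np.full(nlag, float(vario["nug"]))
    gamma[0] = 0.0                             # the variogram is zero at zero lag
    for i in range(1, vario["nst"] + 1):
        azi = np.radians(vario[f"azi{i}"])
        h_major = hx * np.sin(azi) + hy * np.cos(azi)   # component along the major direction
        h_minor = hx * np.cos(azi) - hy * np.sin(azi)   # component along the minor direction
        r = np.hypot(h_major / vario[f"hmaj{i}"], h_minor / vario[f"hmin{i}"])
        gamma += vario[f"cc{i}"] * spherical(r)
    return lags, gamma

# The example model from the text, written as a dictionary with the same keys
vario = {"nug": 0.0, "nst": 2,
         "it1": 1, "cc1": 0.6, "azi1": 45.0, "hmaj1": 350.0, "hmin1": 350.0,
         "it2": 1, "cc2": 0.4, "azi2": 45.0, "hmaj2": 9999.9, "hmin2": 400.0}

lags, gamma_major = project_variogram(vario, azm=45.0)
_, gamma_minor = project_variogram(vario, azm=135.0)
print(lags[50], gamma_major[50], gamma_minor[50])
```

At a lag of 500 the minor direction has reached the sill of 1.0, while the major direction stays near the first-structure contribution of 0.6 because the second structure has a very long major range. This is the zonal anisotropy described above.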

In [None]:
def make_variogram(nug, nst, it1, c1, ang1, hmaj1, hmin1, it2=None, c2=None, ang2=None, hmaj2=None, hmin2=None):
    """
    Creates a variogram model structure for use in variogram simulations.

    Parameters:
    -----------
    nug : float
        Nugget effect.
    nst : int
        Number of nested structures (1 or 2).
    it1 : int
        Variogram type for first structure (1=Spherical, 2=Exponential, 3=Gaussian, 4=Power).
    c1 : float
        Sill contribution of first structure.
    ang1 : float
        Major azimuth angle for first structure.
    hmaj1 : float
        Major range for first structure.
    hmin1 : float
        Minor range for first structure.
    it2 : int, optional
        Variogram type for second structure (if applicable).
    c2 : float, optional
        Sill contribution of second structure.
    ang2 : float, optional
        Major azimuth angle for second structure.
    hmaj2 : float, optional
        Major range for second structure.
    hmin2 : float, optional
        Minor range for second structure.

    Returns:
    --------
    dict
        Variogram model dictionary.
    """

    vario = {
        "nug": nug,
        "nst": nst,
        "it1": it1,
        "cc1": c1,
        "azi1": ang1,
        "hmaj1": hmaj1,
        "hmin1": hmin1,
    }

    if nst == 2:
        vario.update({
            "it2": it2,
            "cc2": c2,
            "azi2": ang2,
            "hmaj2": hmaj2,
            "hmin2": hmin2,
        })

    return vario

#### Function to plot 2D probability map with overlaid data points

In [None]:


def locpix_st(array, xmin, xmax, ymin, ymax, xsiz, vmin, vmax, df, x_col, y_col, feature_col, 
              title, xlabel, ylabel, cbar_label, cmap="inferno"):
    """
    Plots a 2D categorical probability map with overlaid data points.

    Parameters:
    -----------
    array : np.ndarray
        2D grid of values to plot.
    xmin, xmax : float
        X-axis extent.
    ymin, ymax : float
        Y-axis extent.
    xsiz : float
        Cell size for spatial resolution.
    vmin, vmax : float
        Color scale limits.
    df : pd.DataFrame
        Dataframe containing spatial points.
    x_col, y_col : str
        Column names for X and Y coordinates.
    feature_col : str
        Column name for the feature (facies or probability).
    title : str
        Plot title.
    xlabel, ylabel : str
        Labels for the X and Y axes.
    cbar_label : str
        Label for the colorbar.
    cmap : str, optional (default="inferno")
        Colormap for visualization.

    Returns:
    --------
    None (displays the plot).
    """

    extent = [xmin, xmax, ymin, ymax]

    # Fix rotation issue: Flip vertically to match coordinate system
    array = np.flipud(array)
    array = np.where(np.isnan(array), np.nan, array)  # Preserve NaNs

    plt.imshow(array, extent=extent, origin="lower", vmin=vmin, vmax=vmax, cmap=cmap, aspect="auto")

    # Add data points with yellow color and black edges
    plt.scatter(df[x_col], df[y_col], color="yellow", edgecolors="black", s=20)

    # Add color bar
    cbar = plt.colorbar()
    cbar.set_label(cbar_label)

    # Set axis labels and title
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.title(title)

#### Function for plotting 2D pixel map

In [None]:

def pixelplt_st(array, xmin, xmax, ymin, ymax, step, vmin, vmax, title, xlabel, ylabel, cbar_label, cmap="inferno"):
    """
    Plots a 2D pixel map using `imshow()` with a colorbar.

    Parameters:
    -----------
    array : np.ndarray
        2D array to plot.
    xmin, xmax : float
        X-axis extent.
    ymin, ymax : float
        Y-axis extent.
    step : float
        Resolution step size.
    vmin, vmax : float
        Color scale limits.
    title : str
        Title of the plot.
    xlabel, ylabel : str
        Axis labels.
    cbar_label : str
        Label for the colorbar.
    cmap : str (default="inferno")
        Colormap for visualization.

    Returns:
    --------
    None (displays the plot).
    """

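    extent = [xmin, xmax, ymin, ymax]

    # Flip vertically, as in locpix_st, so maps from both functions share the same orientation
    array = np.flipud(array)
    plt.imshow(array, extent=extent, origin="lower", vmin=vmin, vmax=vmax, cmap=cmap, aspect="auto")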

    cbar = plt.colorbar()
    cbar.set_label(cbar_label)

    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    plt.title(title)

# **`kb2d` in GSLIB (2D Kriging)**

## **What is `kb2d`?**
The `kb2d` function (replicated from GSLIB) is used for **2D Kriging**—specifically, it performs **ordinary or simple kriging** in two dimensions.

---

## **Function Purpose**
`kb2d` is GSLIB's 2D kriging program; it supports simple or ordinary kriging of points or blocks.  
It estimates values at unsampled locations from the known values using **spatial interpolation**.

---

# **GeostatsPy `kb2d` Function Argument List**

The `kb2d` function in **GeostatsPy** performs **2D Kriging interpolation** over a **structured spatial grid**.

---

## **Example Call**
```python
por_kmap, por_vmap = geostats.kb2d(df, 'X', 'Y', 'Porosity', por_min, por_max, 
                                   nx, xmn, xsiz, ny, ymn, ysiz, 
                                   nxdis, nydis, ndmin, ndmax, radius, 
                                   ktype, skmean_por, por_vario)
```

---

## **Function Arguments**
| **Argument** | **Description** |
|-------------|----------------|
| `df`        | **Pandas DataFrame** containing spatial data. |
| `x_col`     | Column name in `df` for **X-coordinates** of known points. |
| `y_col`     | Column name in `df` for **Y-coordinates** of known points. |
| `z_col`     | Column name in `df` for **values to be interpolated** (e.g., Porosity). |
| `zmin`      | **Lower trimming limit**: in GeostatsPy, data below this value are excluded from kriging. |
| `zmax`      | **Upper trimming limit**: in GeostatsPy, data above this value are excluded from kriging. |
| `nx`        | **Number of grid cells** in the X-direction. |
| `xmn`       | **X-coordinate of the center** of the first (lower-left) grid cell. |
| `xsiz`      | **Grid cell size** in the X-direction. |
| `ny`        | **Number of grid cells** in the Y-direction. |
| `ymn`       | **Y-coordinate of the center** of the first (lower-left) grid cell. |
| `ysiz`      | **Grid cell size** in the Y-direction. |
| `nxdis`     | **Number of discretization points** in X (for block kriging). |
| `nydis`     | **Number of discretization points** in Y (for block kriging). |
| `ndmin`     | **Minimum number of neighboring points** required for kriging. |
| `ndmax`     | **Maximum number of neighboring points** used for kriging. |
| `radius`    | **Search radius** for selecting neighboring data points. |
| `ktype`     | **Kriging type**: `0` for **Simple Kriging**, `1` for **Ordinary Kriging**. |
| `skmean`    | **Mean value for Simple Kriging** ($\bar{Z}$). Ignored if `ktype=1`. |
| `vario`     | **Variogram model** dictionary (created with `make_variogram`). |

---

## **Kriging Type Explanation**
- **Simple Kriging (`ktype=0`)** assumes a **known global mean**:
  $$Z^*(x) = \bar{Z} + \sum_{i=1}^{n} \lambda_i \left(Z(x_i) - \bar{Z}\right)$$
  where:
  - $Z^*(x)$ is the estimated value at an unknown location.
  - $\bar{Z}$ is the known global mean.
  - $Z(x_i)$ are the known values at sampled locations.
  - $\lambda_i$ are the kriging weights.

- **Ordinary Kriging (`ktype=1`)** treats the mean as **unknown but constant within the search neighborhood**:
  $$Z^*(x) = \sum_{i=1}^{n} \lambda_i Z(x_i)$$
  with the constraint:
  $$\sum_{i=1}^{n} \lambda_i = 1$$
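
The difference between the two systems can be sketched in a few lines. The block below is an illustration and not the `kb2d` code. It assumes an isotropic spherical covariance with a range of 300 and a sill of 1.0, hypothetical data, and no search neighborhood. Ordinary kriging adds one row and column for the Lagrange multiplier that enforces the sum-to-one constraint.

```python
import numpy as np

def spherical_cov(h, sill=1.0, a=300.0):
    """Isotropic spherical covariance (an assumed model)."""
    r = np.minimum(np.asarray(h, dtype=float) / a, 1.0)
    return sill * (1.0 - 1.5 * r + 0.5 * r**3)

def kriging_weights(xy, xy0, ordinary):
    """Weights for simple (ordinary=False) or ordinary (ordinary=True) kriging."""
    n = len(xy)
    C = spherical_cov(np.linalg.norm(xy[:, None] - xy[None, :], axis=2))
    c0 = spherical_cov(np.linalg.norm(xy - xy0, axis=1))
    if not ordinary:
        return np.linalg.solve(C, c0)
    # Augment the system with the unbiasedness constraint, sum(weights) = 1
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = C
    A[n, n] = 0.0
    b = np.append(c0, 1.0)
    return np.linalg.solve(A, b)[:n]       # drop the Lagrange multiplier

xy = np.array([[100.0, 100.0], [200.0, 150.0], [300.0, 250.0]])  # hypothetical data locations
z = np.array([0.12, 0.09, 0.13])                                 # hypothetical porosity data
z_mean = 0.10                                                    # assumed global mean for SK
xy0 = np.array([150.0, 120.0])

w_sk = kriging_weights(xy, xy0, ordinary=False)
w_ok = kriging_weights(xy, xy0, ordinary=True)

est_sk = z_mean + w_sk @ (z - z_mean)      # leftover weight goes to the global mean
est_ok = w_ok @ z                          # weights sum to one, no global mean needed
print(w_sk.sum(), w_ok.sum(), est_sk, est_ok)
```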


## What is the variogram model in this example?

In [None]:
por_vario = make_variogram(nug=0.0,nst=1,it1=1,c1=1.0,ang1=45,hmaj1=300,hmin1=300) # porosity variogram
por_vario

#### Explanation

- The spherical variogram (it1=1) is used.

- There is no nugget effect (nug=0.0), meaning no measurement error or micro-scale variability.

- The spatial correlation extends up to 300 meters in both the major (hmaj1=300) and minor (hmin1=300) directions.

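- The major direction is set to 45° azimuth, but because the major and minor ranges are equal the model is isotropic and the azimuth has no effect.

This means that the correlation between data points decreases with separation distance and reaches zero at 300 meters; points farther apart than 300 meters are uncorrelated.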
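
As a quick check of this behaviour, the sketch below evaluates the correlation implied by an isotropic spherical model with a range of 300 and a sill of 1.0 at a few separation distances. The function name `spherical_correlation` is hypothetical.

```python
import numpy as np

def spherical_correlation(h, a=300.0):
    """Correlogram rho(h) = 1 - gamma(h) for a spherical model with sill 1.0."""
    r = np.minimum(np.asarray(h, dtype=float) / a, 1.0)
    return 1.0 - (1.5 * r - 0.5 * r**3)

dist = np.array([0.0, 75.0, 150.0, 300.0, 450.0])   # separation distances in meters
rho = spherical_correlation(dist)

for d, c in zip(dist, rho):
    print(f"h = {d:5.0f} m  correlation = {c:.4f}")
```

At half the range (150 m) the correlation has already dropped to 0.3125, and it is zero at and beyond 300 m.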

### Creating a Kriged porosity map.

In [None]:
por_vario = make_variogram(nug=0.0,nst=1,it1=1,c1=1.0,ang1=45,hmaj1=300,hmin1=300) # porosity variogram

por_kmap, por_vmap = geostats.kb2d(df,'X','Y','Porosity',por_min,por_max,nx,xmn,xsiz,ny,ymn,ysiz,nxdis,nydis,
         ndmin,ndmax,radius,ktype,skmean_por,por_vario)

plt.subplot(121)
locpix_st(por_kmap,xmin,xmax,ymin,ymax,xsiz,0.0,0.25,df,'X','Y','Porosity','Simple Kriging Estimates and Data','X(m)','Y(m)','Porosity (fraction)',cmap)

plt.subplot(122)
pixelplt_st(por_vmap,xmin,xmax,ymin,ymax,xsiz,0.0,1.0,'Kriging Variance','X(m)','Y(m)','Kriging Variance (sill = 1.0)',cmap)

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=1.0, wspace=0.3, hspace=0.3); plt.show()

## **Some observations:**

* See the smooth kriging estimates, and note that estimates farther than the variogram range from any data approach the global mean.

* See the kriging variance map and observe the impact of the variogram range and any anisotropy.

### Simple Kriging Estimation Map of Permeability

Now let's calculate the kriging estimation map for permeability

* Confidence is highest near data points and lowest where interpolation is needed.

* More data points = lower uncertainty.

* If important decisions depend on low-variance estimates, more sampling may be needed in high-variance areas.

* Note that, due to the strong positive skew of permeability and the negative weights produced by the kriging screening effect, permeability estimates may be very small or even negative. We truncate the map to clean this up.

* We will apply the log transformation to improve the visualization. We could also plot with a log color scale, but this is less convenient with Matplotlib.
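
The following sketch shows how a negative estimate can arise and how the truncation and log transform handle it. It is a 1D illustration with hypothetical data: two data on the same side of the estimation location, so the far datum is screened by the near one. It assumes a spherical covariance with a range of 300 and a sill of 1.0, and a simple kriging mean of 65 mD.

```python
import numpy as np

def spherical_cov(h, a=300.0):
    """Isotropic spherical covariance with sill 1.0 (an assumed model)."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return 1.0 - 1.5 * r + 0.5 * r**3

x_data = np.array([10.0, 20.0])       # the datum at 20 is screened by the datum at 10
perm = np.array([1.0, 1000.0])        # hypothetical permeabilities (mD), strongly skewed
skmean = 65.0                         # simple kriging mean (mD)
x0 = 0.0                              # estimation location

C = spherical_cov(x_data[:, None] - x_data[None, :])
c0 = spherical_cov(x_data - x0)
lam = np.linalg.solve(C, c0)          # the screened datum receives a negative weight

est = skmean + lam @ (perm - skmean)  # negative weight times a high value pulls the estimate down
est_trunc = max(est, 0.0001)          # same floor as applied to the map
log_est = np.log(est_trunc)

print("weights:", lam)
print("estimate:", est, "truncated:", est_trunc, "log:", log_est)
```

The truncated cell plots at the natural log of 0.0001, far below the other values, which is why such cells stand out on a log-scaled map.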

### Creating a Kriged permeability map.

In [None]:
perm_vario = make_variogram(nug=0.0,nst=1,it1=1,c1=1.0,ang1=45,hmaj1=300,hmin1=300) # permeability variogram

perm_kmap, perm_vmap = geostats.kb2d(df,'X','Y','Perm',perm_min,perm_max,nx,xmn,xsiz,ny,ymn,ysiz,nxdis,nydis,
          ndmin,ndmax,radius,ktype,skmean_perm,perm_vario)

perm_kmap[perm_kmap < 0.0001] = 0.0001           # remove small and negative values due to strong positive skew and negative kriging weights

plt.subplot(131)
locpix_st(perm_kmap,xmin,xmax,ymin,ymax,xsiz,0.0,1000,df,'X','Y','Perm','Simple Kriging Estimates and Data','X(m)','Y(m)','Permeability (mD)',cmap)

logperm_kmap = np.log(perm_kmap)

plt.subplot(132)
locpix_st(logperm_kmap,xmin,xmax,ymin,ymax,xsiz,0.0,10.0,df,'X','Y','LogPerm','Simple Kriging Log(Estimates and Data)','X(m)','Y(m)','Log Permeability (ln mD)',cmap)

plt.subplot(133)
pixelplt_st(perm_vmap,xmin,xmax,ymin,ymax,xsiz,0.0,1.0,'Kriging Variance','X(m)','Y(m)','Kriging Variance (sill = 1.0)',cmap)

plt.subplots_adjust(left=0.0, bottom=0.0, right=3.0, top=1.0, wspace=0.3, hspace=0.3); plt.show()

# **Some observations**

The maps provide insights into **spatial trends** and **uncertainty** in permeability estimation. Below is an interpretation of each map.

---

## **Left: Simple Kriging Estimates and Data**
- This map displays the **kriged permeability estimates** based on the available data.
- **Yellow dots** represent the **original sample points** (measured permeability values).
- The **color gradient** represents the interpolated permeability values:
  - **Bright areas (yellow-white)** indicate **higher permeability**.
  - **Dark areas (blue-purple)** indicate **lower permeability**.
- The **spatial pattern of high permeability regions** suggests:
  - A **clustered trend** of high permeability near **(x ≈ 300–400, y ≈ 600–800)**.
  - Smooth interpolation extends these values into unsampled regions.

### **Inference:**  
- **Permeability varies spatially** but follows a structured pattern.  
- **More data points = More reliable estimates** (regions with more yellow dots are better constrained).  

---

## **Middle: Simple Kriging Log(Estimates and Data)**
- This map applies a **log transformation** to the permeability estimates.
- Log transformations **reduce the influence of extreme values**, making **spatial variations more visible**.
- The **spatial pattern remains similar**, but:
  - Some **dark holes** may appear where estimates are very small, including cells truncated to 0.0001 mD (about -9.2 in natural log units), which fall below the color scale minimum.

### **Inference:**  
- The log-transformed map highlights **subtle spatial variations**.  
- **Regions with sharp permeability contrasts** become more visible.  
- Large **dark zones** indicate **very low or truncated estimates** rather than missing data; far from data, simple kriging returns the global mean.  

---

## **Right: Kriging Variance (Uncertainty Map)**
- This map represents the **kriging variance**, which measures **prediction uncertainty**.
- **Bright yellow regions** = **High uncertainty (low data support)**.
- **Dark purple regions** = **Low uncertainty (high data density and strong correlation)**.
- We observe **high uncertainty in areas with sparse data coverage**.

### **Inference:**  
- **Uncertainty is lowest near sample points** (dark spots around yellow dots).  
- **Uncertainty increases where kriging relies heavily on extrapolation**.  
- **Higher variance along grid edges** suggests **poorly constrained predictions**.  

---

## **Key Takeaways**
**Data-Rich Areas:**  
- Predictions are **more reliable** where sample points exist.  

**Data-Sparse Areas:**  
- High uncertainty occurs **where kriging relies heavily on extrapolation**.

**Usefulness of the Log Map:**  
- Helps interpret regions with extreme permeability values.

**Next Steps:**  
- Consider **adding more sample points** in high-variance areas to improve prediction accuracy.  
- Use **cross-validation** to test how well kriging estimates match observed data.   
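
A leave-one-out cross-validation can be sketched as follows. This is an illustration on synthetic data, not the workflow's dataset, and it is not a `geostats` function. It assumes simple kriging with an isotropic spherical covariance (range 300, sill 1.0), a known mean of 0.10, and no search neighborhood.

```python
import numpy as np

def spherical_cov(h, a=300.0):
    """Isotropic spherical covariance with sill 1.0 (an assumed model)."""
    r = np.minimum(np.asarray(h, dtype=float) / a, 1.0)
    return 1.0 - 1.5 * r + 0.5 * r**3

def sk_estimate(xy, z, xy0, mean):
    """Simple kriging estimate at xy0 using all supplied data."""
    C = spherical_cov(np.linalg.norm(xy[:, None] - xy[None, :], axis=2))
    c0 = spherical_cov(np.linalg.norm(xy - xy0, axis=1))
    lam = np.linalg.solve(C, c0)
    return mean + lam @ (z - mean)

# Synthetic data for illustration only
rng = np.random.default_rng(73073)
xy = rng.uniform(0.0, 1000.0, size=(30, 2))
z = 0.10 + 0.02 * np.sin(xy[:, 0] / 200.0) + rng.normal(0.0, 0.005, 30)

# Withhold each datum in turn and estimate it from the remaining data
loo = np.array([
    sk_estimate(np.delete(xy, i, axis=0), np.delete(z, i), xy[i], 0.10)
    for i in range(len(z))
])

errors = loo - z
rmse = float(np.sqrt(np.mean(errors**2)))
print("mean error:", errors.mean(), "RMSE:", rmse)
```

A mean error near zero suggests the estimates are unbiased, and the RMSE summarizes the typical estimation error at withheld locations.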


#### Ordinary Kriging Estimation Map of Porosity

Let's try ordinary kriging and compare the results to simple kriging.

* we shorten the variogram model range to exaggerate the difference between simple and ordinary kriging, i.e., so that many estimates are beyond the variogram range from any data

In [None]:
por_vario2 = make_variogram(nug=0.0,nst=1,it1=1,c1=1.0,ang1=45,hmaj1=50,hmin1=50) # porosity variogram

por_SK_kmap, por_vmap = geostats.kb2d(df,'X','Y','Porosity',por_min,por_max,nx,xmn,xsiz,ny,ymn,ysiz,nxdis,nydis,
         ndmin,ndmax,radius,0,skmean_por,por_vario2)

por_OK_kmap, por_OK_vmap = geostats.kb2d(df,'X','Y','Porosity',por_min,por_max,nx,xmn,xsiz,ny,ymn,ysiz,nxdis,nydis,
         ndmin,ndmax,radius,1,skmean_por,por_vario2)

plt.subplot(121)
locpix_st(por_SK_kmap,xmin,xmax,ymin,ymax,xsiz,0.0,0.25,df,'X','Y','Porosity','Simple Kriging Estimates and Data','X(m)','Y(m)','Porosity (fraction)',cmap)

plt.subplot(122)
locpix_st(por_OK_kmap,xmin,xmax,ymin,ymax,xsiz,0.0,0.25,df,'X','Y','Porosity','Ordinary Kriging Estimates and Data','X(m)','Y(m)','Porosity (fraction)',cmap)

plt.subplots_adjust(left=0.0, bottom=0.0, right=2.0, top=1.0, wspace=0.3, hspace=0.3); plt.show()

## **Takeaways - Simple Kriging vs. Ordinary Kriging**

## **Simple Kriging Estimates**

- Influenced by the global mean.

- Porosity estimates stay close to the assumed mean except where data exists.


## **Ordinary Kriging Estimates**

- The estimated values appear more locally adaptive.

- Unlike SK, OK does not require a known global mean; the mean is implicitly re-estimated from the data in each search neighborhood.

- More spatial variability is preserved, especially in data-sparse regions.

- Porosity values change more dynamically, rather than being pulled toward a global mean.

## **Inference (SK vs. OK):**

- Simple Kriging (SK) assumes a known global mean, so estimates far from data revert to that mean.

- Ordinary Kriging (OK) treats the mean as unknown but constant within each search neighborhood, allowing the estimates to follow local changes in average porosity.

- If you lack prior knowledge of a global mean, Ordinary Kriging is more appropriate.
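
The limiting case behind this comparison can be sketched directly. Assume the estimation location is beyond the variogram range from every datum in the search neighborhood, and the data are also beyond the range from each other, so all off-diagonal covariances are zero. The data values and mean below are hypothetical.

```python
import numpy as np

z = np.array([0.08, 0.12, 0.16])     # hypothetical porosity data in the search neighborhood
skmean = 0.10                        # assumed global mean for simple kriging
n = len(z)

# Beyond the range: no data-to-data or data-to-estimate covariance (sill = 1.0)
C = np.eye(n)
c0 = np.zeros(n)

# Simple kriging: all weights are zero, so the estimate is the global mean
w_sk = np.linalg.solve(C, c0)
est_sk = skmean + w_sk @ (z - skmean)

# Ordinary kriging: the sum-to-one constraint spreads the weight equally
A = np.ones((n + 1, n + 1))
A[:n, :n] = C
A[n, n] = 0.0
w_ok = np.linalg.solve(A, np.append(c0, 1.0))[:n]
est_ok = w_ok @ z

print("SK estimate:", est_sk, "OK estimate:", est_ok)
```

In this limit simple kriging returns the global mean, while ordinary kriging returns the average of the data found within the search radius, which is why the two maps differ most away from the data.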

#### Comments

This was a basic demonstration of spatial estimation. Much more could be done.

# The End