# Clay Fraction Module Example

This notebook gives a brief introduction to blending AEM (airborne electromagnetic) data and lithology logs using the clay fraction approach.

To simplify the data preprocessing steps, synthetic data is used in this notebook.

First, let's look at the clay fraction for borehole logs.

In [1]:
from clayfraction import *

In [2]:
bore_dict = {(0.61, 0.0): "subsoil",
                 (0, -0.61): "clay",
                 (-0.61, -1.52): "boulder",
                 (-1.52, -6.71): "limestone",
                 (-6.71, -10.36): "limestone",
                 (-10.36, -11.89): "clay",
                 (-11.89, -12.8): "limestone",
                 (-12.8, -14.93): "strata",
                 (-14.93, -18.59): "limestone",
                 (-18.59, -38.71): "limestone",
                 (-38.71, -40.23): "limestone",
                 (-40.23, -62.18): "limestone",
                 (-62.18, -77.72): "clay",
                 (-77.72, -86.87): "limestone"}


Assume the elevation is expressed in AHD (Australian Height Datum), and divide the log into 10 m elevation intervals starting from an elevation of 10 m.

In [3]:
fraction_list = bore_to_fraction(10, 10, bore_dict)
fraction_list

{(10, 0): 0.0,
 (0, -10): 0.061,
 (-10, -20): 0.1530000000000001,
 (-20, -30): 0.0,
 (-30, -40): 0.0,
 (-40, -50): 0.0,
 (-50, -60): 0.0,
 (-60, -70): 0.782,
 (-70, -80): 0.7719999999999999,
 (-80, -86.87): 0.0}
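The per-interval values above can be reproduced with a short overlap calculation. The sketch below is not the `bore_to_fraction` implementation from the `clayfraction` module; `clay_fraction_by_interval` is a hypothetical helper. It assumes that only segments labelled "clay" count, and that the last (partial) interval is still divided by the nominal interval thickness.

```python
def clay_fraction_by_interval(top, thickness, bore_log, clay_label="clay"):
    """Return {(upper, lower): clay fraction} for fixed-thickness elevation intervals."""
    bottom = min(seg_bottom for _, seg_bottom in bore_log)
    fractions = {}
    upper = top
    while upper > bottom:
        lower = upper - thickness
        clay = 0.0
        for (seg_top, seg_bottom), label in bore_log.items():
            if label == clay_label:
                # Length of the clay segment that falls inside this interval.
                overlap = min(upper, seg_top) - max(lower, seg_bottom)
                clay += max(0.0, overlap)
        # Assumption: divide by the nominal thickness, even for the last interval.
        fractions[(upper, max(lower, bottom))] = clay / thickness
        upper = lower
    return fractions


# Same log as bore_dict above: (top elevation, bottom elevation) -> lithology.
bore_log = {(0.61, 0.0): "subsoil",
            (0, -0.61): "clay",
            (-0.61, -1.52): "boulder",
            (-1.52, -6.71): "limestone",
            (-6.71, -10.36): "limestone",
            (-10.36, -11.89): "clay",
            (-11.89, -12.8): "limestone",
            (-12.8, -14.93): "strata",
            (-14.93, -18.59): "limestone",
            (-18.59, -38.71): "limestone",
            (-38.71, -40.23): "limestone",
            (-40.23, -62.18): "limestone",
            (-62.18, -77.72): "clay",
            (-77.72, -86.87): "limestone"}

for interval, cf in clay_fraction_by_interval(10, 10, bore_log).items():
    print(interval, round(cf, 3))
```

For example, the interval (0, -10) contains 0.61 m of clay, giving 0.61 / 10 = 0.061.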

The figure below illustrates the clay fraction (CF) for this borehole.

<img src="img/cf_for_borehole.png" width="500px;"/>

Now let's explore the clay fraction model with AEM data.

In [4]:
import torch
from CF_tensor import ClayFraction

In this section, we build a clay fraction model on a 3-D grid with a single depth (elevation) interval of 10 m.

The cell below generates 10 borehole log locations, each with its own clay fraction value.

In [5]:
log_x = torch.rand(1, 10) * 10   # x coordinates in [0, 10)
log_y = torch.rand(1, 10) * 10   # y coordinates in [0, 10)
log_cf = torch.rand(1, 10)       # clay fraction in [0, 1)
for i in range(10):
    print("Clay fraction value at location ({:f},{:f}) is {:f}".format(log_x[0][i], log_y[0][i], log_cf[0][i]))


Clay fraction value at location (8.050779,0.192690) is 0.946768
Clay fraction value at location (6.235059,0.137408) is 0.919424
Clay fraction value at location (3.452262,7.502848) is 0.535670
Clay fraction value at location (7.350189,6.684819) is 0.086902
Clay fraction value at location (7.569018,3.199718) is 0.971986
Clay fraction value at location (8.277180,1.795287) is 0.222878
Clay fraction value at location (4.203513,5.796410) is 0.929838
Clay fraction value at location (6.910735,2.410186) is 0.817423
Clay fraction value at location (8.164867,6.395492) is 0.620989
Clay fraction value at location (1.487830,3.759527) is 0.734680


This cell generates 100 AEM data points with resistivity values. (Assume this grid covers two AEM sample depth intervals, of 4 m and 6 m.)

In [6]:
aem_x = torch.rand(1,100)*10
aem_y = torch.rand(1,100)*10
aem_resist =  torch.rand(1,100,2)*100 
interval = torch.FloatTensor([4,6])
# Showing the first 20 AEM data points.
for i in range(20):
    print ("The resistivity at AEM data point ({:f},{:f}) is {:f} at 0-4m and {:f} at 4-6m".format(aem_x[0][i],
                                                                                                   aem_y[0][i],
                                                                                                   aem_resist[0][i][0],
                                                                                                  aem_resist[0][i][1]))

The resistivity at AEM data point (4.494390,3.425762) is 47.039421 at 0-4m and 90.334328 at 4-6m
The resistivity at AEM data point (3.997748,6.116557) is 21.490574 at 0-4m and 47.926807 at 4-6m
The resistivity at AEM data point (6.007145,6.378081) is 54.830872 at 0-4m and 37.543171 at 4-6m
The resistivity at AEM data point (9.278537,0.844442) is 67.812431 at 0-4m and 26.509130 at 4-6m
The resistivity at AEM data point (9.073026,7.924224) is 72.513321 at 0-4m and 77.426765 at 4-6m
The resistivity at AEM data point (7.961023,5.072204) is 53.470772 at 0-4m and 5.166566 at 4-6m
The resistivity at AEM data point (8.236984,6.791945) is 78.241486 at 0-4m and 83.284164 at 4-6m
The resistivity at AEM data point (9.412109,9.027265) is 79.690979 at 0-4m and 77.791328 at 4-6m
The resistivity at AEM data point (2.016736,7.174189) is 70.043259 at 0-4m and 49.707447 at 4-6m
The resistivity at AEM data point (0.972992,6.953687) is 60.638573 at 0-4m and 17.103703 at 4-6m
The resistivity at AEM data poi

The remaining parameters needed to build the model are given below. Note that constraint_pair is an empirical parameter that calls for expert judgement, but in general it depends on the sparsity of the borehole logs. If there are only a few borehole logs in the grid, we can slightly increase the constraint pair to compensate for the gaps between them.

tolerant_err represents the tolerated error in each "constraint dimension". The size of this tensor should be consistent with constraint_pair. For example, if the spatial constraint factor is \[2,3\], tolerant_err should have $2 \times 3 = 6$ elements.

In [7]:
constraint_pair = [2, 3]  # assume the constraint scale is 2*3 and it is applied to the whole map
tolerant_err = torch.rand(1, 6)
learning_rate = 0.01
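The size rule for tolerant_err can be checked before the model is built. The sketch below uses plain Python lists rather than tensors, and `expected_tolerance_size` is a hypothetical helper, not part of `CF_tensor`.

```python
import math


def expected_tolerance_size(constraint_pair):
    """Number of tolerance values implied by a constraint pair such as [2, 3]."""
    return math.prod(constraint_pair)


constraint_pair = [2, 3]
tolerant_err = [0.1] * 6  # placeholder tolerances, one per constraint cell

if len(tolerant_err) != expected_tolerance_size(constraint_pair):
    raise ValueError("tolerant_err size does not match constraint_pair")
```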

Now we can build a clay fraction model for this 3-D grid.

In [8]:
clay_fraction = ClayFraction(log_x,log_y,log_cf,aem_x,aem_y,aem_resist,constraint_pair,tolerant_err,learning_rate)

The steps below should be executed according to the flow chart.

First, the clay fraction value at each AEM data point is computed using the translator function. Here, assume the parameters m_up and m_low are 70 and 50, respectively.

In [9]:
m_up = 70
m_low = 50
# Normalise resistivity so that m_low maps to -1 and m_up maps to +1.
fed_out = (2 * clay_fraction.AEM_resist - m_up - m_low) / (m_up - m_low)
aem_cf = clay_fraction.translator_function(fed_out, interval)
# Showing the clay fraction value at first 20 AEM data points.
for i in range(20):
    print ("The CF at AEM data point ({:f},{:f}) is {:f}".format(aem_x[0][i],
                                                                aem_y[0][i],
                                                                aem_cf[0][i]))

The CF at AEM data point (4.494390,3.425762) is 0.397784
The CF at AEM data point (3.997748,6.116557) is 0.994610
The CF at AEM data point (6.007145,6.378081) is 0.937797
The CF at AEM data point (9.278537,0.844442) is 0.625144
The CF at AEM data point (9.073026,7.924224) is 0.003028
The CF at AEM data point (7.961023,5.072204) is 0.959870
The CF at AEM data point (8.236984,6.791945) is 0.000071
The CF at AEM data point (9.412109,9.027265) is 0.000169
The CF at AEM data point (2.016736,7.174189) is 0.596704
The CF at AEM data point (0.972992,6.953687) is 0.780080
The CF at AEM data point (1.392376,4.565879) is 0.400000
The CF at AEM data point (1.613799,5.789477) is 0.604820
The CF at AEM data point (8.283892,5.091525) is 0.400000
The CF at AEM data point (1.318916,7.317348) is 0.600248
The CF at AEM data point (0.200688,7.195616) is 0.979852
The CF at AEM data point (2.672215,5.641165) is 0.999982
The CF at AEM data point (6.626270,4.261785) is 0.662102
The CF at AEM data point (3.692
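To make the translator step concrete, here is a minimal sketch of one possible form. It is an assumption, not the `translator_function` from `CF_tensor`: it uses the complementary-error-function translator common in the clay fraction literature, $W(\rho) = 0.5\,\mathrm{erfc}(K x)$ with $K = \mathrm{erfc}^{-1}(0.05)$ and $x$ the normalised resistivity from the cell above, followed by a thickness-weighted average over the layers. The names `translator` and `column_cf` are hypothetical.

```python
import math
from statistics import NormalDist

# K = erfc^{-1}(0.05), so that W(m_low) = 0.975 and W(m_up) = 0.025.
K = NormalDist().inv_cdf(1 - 0.05 / 2) / math.sqrt(2)


def translator(resistivity, m_up, m_low):
    """Map a resistivity to a clay weight in (0, 1); low resistivity means more clay."""
    x = (2 * resistivity - m_up - m_low) / (m_up - m_low)
    return 0.5 * math.erfc(K * x)


def column_cf(resistivities, thicknesses, m_up, m_low):
    """Thickness-weighted clay fraction over the layers of one AEM sounding."""
    total = sum(thicknesses)
    return sum(t / total * translator(r, m_up, m_low)
               for r, t in zip(resistivities, thicknesses))


# First AEM point printed above, with layer thicknesses 4 and 6.
cf = column_cf([47.039421, 90.334328], [4, 6], m_up=70, m_low=50)
print(round(cf, 4))
```

Under these assumptions the first AEM point evaluates to approximately 0.3978, in line with the first value printed above, but the actual implementation may differ in detail.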

Next, use Kriging to interpolate the CF values from the AEM data locations to the borehole locations. The returned tensors should have the same size as the borehole data tensor.

In [10]:
interpolated_cf, interpolated_var = clay_fraction.Kriging_interpolation(aem_cf)
interpolated_cf, interpolated_var   # the interpolated CF value (using AEM data) at each borehole location, and the variance

(tensor([0.6022, 0.6088, 0.6202, 0.6040, 0.6059, 0.6027, 0.6165, 0.6077, 0.6001,
         0.6077], dtype=torch.float64),
 tensor([0.0971, 0.0969, 0.0960, 0.0960, 0.0962, 0.0967, 0.0957, 0.0963, 0.0961,
         0.0962], dtype=torch.float64))
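For readers unfamiliar with Kriging, the sketch below shows ordinary Kriging for a single target location in plain Python. It is not the `Kriging_interpolation` method used above: the exponential variogram, its sill and range, and the helper names (`variogram`, `solve`, `ordinary_kriging`) are all assumptions chosen for illustration.

```python
import math


def variogram(h, sill=1.0, rng=5.0):
    """Exponential variogram without nugget (an assumed model)."""
    return sill * (1.0 - math.exp(-h / rng))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(points, values, target):
    """Return (estimate, variance) at target from known (x, y) points and values."""
    n = len(points)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Kriging system: variogram matrix bordered by the unbiasedness constraint.
    a = [[variogram(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]
    estimate = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance


# Toy data: four "AEM" locations with CF values, one "borehole" target.
aem_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
aem_values = [0.2, 0.4, 0.6, 0.8]
print(ordinary_kriging(aem_points, aem_values, (0.5, 0.5)))
```

The weights sum to one, so a constant field is reproduced exactly, and with no nugget the estimate at a data location equals the data value with zero variance.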

Now, calculate the two regularisation terms $R_{data}$ and $R_{con}$.

In [11]:
r_data = clay_fraction.regularization_data(interpolated_cf,interpolated_var)
r_con = clay_fraction.regularization_constraint()
r_data,r_con

(0.6280386484814423, 1.4412093196001134)

Finally, we can see the initial objective function value for the parameters $(m_{up} = 70, m_{low} = 50)$.

In [12]:
obj_value = clay_fraction.regularization_data(interpolated_cf,interpolated_var)
obj_value

0.6280386484814423

This is, in effect, the first iteration of the search for suitable parameters $(m_{up}, m_{low})$ for a 3-D grid.

Future work includes:

* Real data preprocessing.
* Set up mini-batch training for the model. If we train the model for N epochs, the model with the smallest objective function value can be used to predict the clay fraction in this 3-D grid.
* Clustering analysis and visualisation.
* Parallel computation across multiple 3-D grids and multi-threaded computation.