# GeoClass - Uncertainty classifier:
# How to visualise uncertainty by zones and levels?

## Read our Paper [Link]

## Abstract

Subsurface geological structures are generally complicated and hard to interpret. This algorithm uses Python to visualise and measure the uncertainty in geological interpretations of any subsurface structure built from sparse and incomplete datasets. The complex history of geological structures is difficult to unravel from limited data, and even ‘accurate’ interpretations carry structural interpretation uncertainty. These challenges often lead interpreters to rely on heuristics, rules of thumb and known solutions, which introduce bias. Open3D, a modern open-source library created for processing large 3D datasets, provides excellent visualisation and is well suited to displaying interpretations of subsurface structural geometries. However, how to illustrate and quantify uncertainty in geological interpretations of subsurface cross-sections remains unclear. Here we provide an automatic, data-driven model to illustrate and quantify uncertainty using subsurface cross-sections of geological and structural geometries. Five zones are calculated to display uncertainty in geological cross-section interpretations, and they are applied to both horizon and fault interpretations. Together they form a critical part of the dataset. These calculated uncertainty zones and levels allow cross-section building and interpretation to be investigated from level 1, where direct observations of the rock can be made, outwards, illustrating increasing uncertainty. Our uncertainty classification model applies to any subsurface dataset and can inform approaches to subsurface interpretation elsewhere. We claim that quantifying uncertainty by zones and levels provides a framework for reducing interpretation risk and improving the visualisation of uncertainties in subsurface cross-sections.

## Overview

This model is a simple method to classify, quantify and illustrate the uncertainty in subsurface interpretation using open-source Python libraries. It takes a practical, coding-focused approach: it visualises the uncertainty in subsurface interpretations by calculating zones and levels of uncertainty. The five uncertainty zones are created by measuring the distance from outcrops, galleries and boreholes.

![image-5.png](attachment:image-5.png)

Schematic model to illustrate uncertainty and risk zones and the associated uncertainty nomenclature for horizon and fault interpretations. (a) Uncertainty zones defined by five levels. Zone-1 is the most certain zone and is defined as the area of the outcrops and galleries. Zone-2 is the certain zone and is defined as the areas around outcrops and between galleries. Zone-3 is the possible zone and is defined as the area within 100 m of the data source. Zone-4 is the uncertain zone and is defined as the area beyond 100 m from the data source. Zone-5 is the least certain zone and is defined as the surface area needed to understand the subsurface geological model. (b) Schematic representation of geology with the definition of uncertainty in geological boundaries. Level-1 is the direct observation from outcrops and galleries. Level-2 is the interpretation of geological boundaries in and around outcrops and between galleries. Level-3 is the interpretation of geological boundaries within 100 m of direct observation. Level-4 is the interpretation of geological boundaries beyond 100 m from direct observation. Level-5 is the interpretation of eroded geology needed to understand the subsurface geological model.

You can access our models and dataset, or apply the workflow to your own geological model, from visualisation to classification.

With geoclass you can perform complex 3D processing operations and visualise your uncertainty classes. For example, you can:

    1/ Load your 3D geological model from disk.

    2/ Visualise your 3D geological model using point clouds.

    3/ Create uncertainty zones around your structural interpretation.

    4/ Classify your interpretation by level 1, 2, 3, 4 and 5.

    5/ Visualise the risk in your subsurface structural interpretation.

    6/ Export an output file with the classes you choose, ready for further investigation.

### Zones

Zone-1 represents direct observation from outcrops and galleries.

Zone-2 is the area or space between galleries and around outcrops.

Zone-3 is the interpretation that fills the space within 100 m of direct observation.

Zone-4 is the interpretation that fills the space beyond 100 m from direct observation.

Zone-5 is the surface interpretation zone that fills the space above the Earth’s surface, projected into the air for eroded or covered geology.
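
The zone list above can be read as a small decision rule. The sketch below is a simplified reading of that list, not the notebook's implementation: the function name `assign_zone` and its flags are hypothetical, and the precedence (observation first, then the space between sources, then the distance threshold, then the above-surface test) is an assumption chosen to match the order in which the classes are assigned later in this notebook.

```python
def assign_zone(distance_m, observed=False, between_sources=False,
                above_surface=False, threshold_m=100.0):
    """Return a zone number (1-5) for a single interpretation point.

    distance_m      -- distance from the point to the nearest data source
    observed        -- True if the point is a direct observation
    between_sources -- True if the point lies between galleries or around outcrops
    above_surface   -- True if the point is projected above the Earth's surface
    threshold_m     -- user-defined uncertainty distance (100 m in the text)
    """
    if observed:
        return 1  # direct observation from outcrops and galleries
    if between_sources:
        return 2  # space between galleries and around outcrops
    if distance_m <= threshold_m:
        return 3  # within the threshold distance of direct observation
    if above_surface:
        return 5  # eroded or covered geology projected above the surface
    return 4      # beyond the threshold distance, below the surface


zones = [
    assign_zone(0.0, observed=True),
    assign_zone(20.0, between_sources=True),
    assign_zone(60.0),
    assign_zone(250.0),
    assign_zone(250.0, above_surface=True),
]
print(zones)
```

Keeping the threshold as a parameter reflects the point made in the text that the 100 m distance is a user choice.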

### Levels

Level-1 is defined as the parts of the horizons collected directly from outcrops, coal-mine galleries or boreholes, plus direct fault observations.

Level-2 geological boundaries are the verified parts of the horizons between the galleries and around the outcrops, plus the secured fault interpretations.

Level-3 represents the parts of the horizons projected from nearby excavations or boreholes, up to a distance of approximately 100 m.

Level-4 covers subsurface interpretations in areas beyond 100 m from direct observation.

Level-5 surface geological boundaries represent the parts of the interpretation above the Earth’s surface, for covered or eroded geology, plus the presumed fault interpretations.

## Requirements

•	Python 3.8.5

•	Jupyter notebook

•	Suitable geoscience environment e.g. geoclass

In [8]:
#conda create --name geoclass anaconda
#conda activate geoclass

### Run locally - recommended
Set up Python, download the notebook and install the required libraries. We recommend installing them with Conda or pip.

### Using free online resources:
Run on Colab (Google's cloud infrastructure), Binder or Kaggle.
However, many online resources do not support the external window that Open3D opens, so you may need a workaround such as Docker with headless rendering. For example:
https://stackoverflow.com/questions/54483960/pyopengl-headless-rendering/55429262

## Dataset

The geological dataset used in this tutorial is a high-resolution dataset in CSV format. We will provide 3D models to run the code. The dataset comes from the late Carboniferous multi-layered stratigraphy of the coal measures in the Ruhr basin, Germany.

## Visualise dataset

In [1]:
# import needed libraries
import numpy as np
import open3d as o3d
import pandas as pd

In [2]:
# Uncomment and run the commands below if imports fail
#!conda install numpy pandas 
#!pip install open3d
#!pip install matplotlib --upgrade --quiet

In [2]:
# open and read the csv file
# Read the x, y & z columns of the csv file with pandas and convert them to numpy arrays.
# replace the path below with the location of your own dataset

if __name__ == "__main__":
    data = pd.read_csv("C:/Users/r04ra18/Documents/Coding/cfv1/m1/hor1-6_model_plot.csv")
    data.columns = ["X", "Y", "Z"]
    X = data["X"].to_numpy()
    Y = data["Y"].to_numpy()
    Z = data["Z"].to_numpy()
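
The loading step above assumes the file has exactly three columns. A minimal sketch of a slightly more defensive loader follows; the helper name `load_xyz` is hypothetical and the in-memory CSV only stands in for a real file path, so treat it as an illustration rather than part of the published workflow.

```python
import io

import pandas as pd


def load_xyz(source):
    """Read a three-column CSV and return an (n, 3) float array of X, Y, Z."""
    data = pd.read_csv(source)
    if data.shape[1] != 3:
        # Fail early instead of silently mislabelling columns.
        raise ValueError(f"expected 3 columns, found {data.shape[1]}")
    data.columns = ["X", "Y", "Z"]
    return data.to_numpy(dtype=float)


# Hypothetical stand-in for a file on disk; pass your own path instead.
demo_csv = io.StringIO(
    "x,y,z\n"
    "1560.0,-4920.0,1.392\n"
    "1560.0,-5020.9417,-2.9413\n"
)
xyz_demo = load_xyz(demo_csv)
print(xyz_demo.shape)
```

Returning one `(n, 3)` array also removes the need to stack and transpose the three columns afterwards.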

In [3]:
# stack x, y & z into one array with numpy, then transpose it to shape (n, 3)
xyz = np.asarray([X,Y,Z])
xyz_t = np.transpose(xyz)

# check the data
print(xyz_t)

[[ 1.5600000e+03 -4.9200000e+03  1.3920000e+00]
 [ 1.5600000e+03 -5.0209417e+03 -2.9413000e+00]
 [ 1.5600000e+03 -5.1218833e+03 -7.2747000e+00]
 ...
 [ 1.4666667e+03 -5.2987499e+03 -3.2461540e+02]
 [ 1.4633333e+03 -5.2987499e+03 -3.2461540e+02]
 [ 1.4600000e+03 -5.2987499e+03 -3.2461540e+02]]
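
As a side note, the stack-then-transpose step can be written in one call. The sketch below uses three short made-up arrays (not the tutorial dataset) to show that `np.column_stack` gives the same `(n, 3)` result.

```python
import numpy as np

# Small illustrative coordinate arrays, not taken from the dataset.
X = np.array([1560.0, 1560.0, 1560.0])
Y = np.array([-4920.0, -5020.9417, -5121.8833])
Z = np.array([1.392, -2.9413, -7.2747])

# One row per point, one column per coordinate.
xyz_stacked = np.column_stack((X, Y, Z))

# Same result as stacking the three arrays as rows and transposing.
xyz_transposed = np.transpose(np.asarray([X, Y, Z]))
print(xyz_stacked.shape, np.array_equal(xyz_stacked, xyz_transposed))
```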


In [4]:
# Create a PointCloud object from your array and save it to a .ply file.
# A point cloud consists of point coordinates, and optionally point colors.
# write_point_cloud returns True if the file was written; you are then ready to plot it.

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(xyz_t)
pcd.colors = o3d.utility.Vector3dVector()
# use your own directory to dataset
o3d.io.write_point_cloud("C:/Users/r04ra18/Documents/Coding/cfv1/hor1-6_model_plot.ply", pcd)

True

In [5]:
# plot 3D model using point cloud - open 3D
# use your own directory to dataset

pcd_2 = o3d.io.read_point_cloud("C:/Users/r04ra18/Documents/Coding/cfv1/hor1-6_model_plot.ply")
o3d.visualization.draw_geometries([pcd_2],window_name="Tunnel", width=700,height=700,left=50,top=50)

![image-2.png](attachment:image-2.png)

![image-2.png](attachment:image-2.png)

## Calculate zones and levels

In [6]:
# import needed libraries
import open3d as o3d
import pandas as pd
from scipy.spatial import Delaunay
import numpy as np
from copy import deepcopy as copy
from tqdm import tqdm

Zone-1 is already defined by the user, since it is the source of information.
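
The classification cell decides whether an interpretation point belongs to a data source with a point-in-convex-hull test built on `scipy.spatial.Delaunay`. The sketch below shows that test on a unit cube; the function name `points_in_hull` and the cube are illustrative assumptions, not part of the dataset.

```python
import numpy as np
from scipy.spatial import Delaunay


def points_in_hull(points, hull_points):
    """Return a boolean array: True where a point lies inside the convex hull."""
    triangulation = Delaunay(hull_points)
    # find_simplex returns -1 for points outside the triangulated hull.
    return triangulation.find_simplex(points) >= 0


# Eight corners of a unit cube act as a toy "data source".
cube = np.array([[x, y, z]
                 for x in (0.0, 1.0)
                 for y in (0.0, 1.0)
                 for z in (0.0, 1.0)])

queries = np.array([
    [0.3, 0.4, 0.2],   # inside the cube
    [1.5, 0.5, 0.5],   # outside the cube
])
inside = points_in_hull(queries, cube)
print(inside)
```

Because the test uses the convex hull, concave or curved galleries are treated as if any gap inside their outline were filled.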

In [7]:
# open and read the model file
# use your own directory to dataset
interprets = "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/hor1-6.csv"

# Define the uncertainty distance between the data sources (100 m) in the y-direction, chosen according to separation, scale and complexity.
# This is the distance between outcrops and boreholes that defines zone-2.
y_dist = 100

# open and read the data source files
# Define the uncertainty distance in the x-direction for each file, chosen according to separation, scale and complexity.
# The classification cell uses each value (50 here) as the radius of the sphere that expands the data source.
geoclass = {
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw1.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw2.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw3.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw4.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw5.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw6.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw7.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw8.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw9.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw10.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw11.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw12.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/tw13.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/oc1.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/oc2.csv":50,
    "C:/Users/r04ra18/Documents/Coding/cfv1/v7/v7/oc3.csv":50,
}
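
When every source shares the same distance, the dictionary can also be generated instead of typed by hand. This is only a sketch: `base_dir` is a hypothetical placeholder for your own data folder, and the file names simply follow the `tw1` to `tw13` and `oc1` to `oc3` pattern used above.

```python
from pathlib import PurePosixPath

# Hypothetical folder; replace with the directory that holds your CSV files.
base_dir = PurePosixPath("data")
radius = 50  # uncertainty distance assigned to every source file

# File stems following the naming pattern of the dictionary above.
source_names = [f"tw{i}" for i in range(1, 14)] + [f"oc{i}" for i in range(1, 4)]

# Map each file path to its uncertainty distance.
geoclass_sketch = {str(base_dir / f"{name}.csv"): radius for name in source_names}

print(len(geoclass_sketch))
print(list(geoclass_sketch)[:2])
```

A per-file distance can still be set afterwards, for example `geoclass_sketch["data/oc1.csv"] = 25`.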

In [8]:
# Read the x, y & z columns of the csv file assigned to the variable 'interprets' into a DataFrame.
# Create a PointCloud object, which consists of point coordinates and point colors.
# Vector3dVector converts a float64 numpy array of shape (n, 3) to Open3D format.
# paint_uniform_color paints all the points in one color. The color is in RGB space, [0, 1] range.

interprets_pc = pd.read_csv(interprets, usecols = ["x", "y", "z"])
interprets_pc_pts = interprets_pc.values
interprets_pcd = o3d.geometry.PointCloud()
interprets_pcd.points = o3d.utility.Vector3dVector(interprets_pc_pts)
interprets_pcd.paint_uniform_color([0, 0, 1.0])

# Create an array of zeros with one row per data source and one column per interpretation point.
# It will hold the distance from every interpretation point to every data source.

shortest_distances = np.zeros((len(list(geoclass.keys())), len(interprets_pc_pts)))

# Loop over the data sources; enumerate() returns a counter together with each file name.
# Read the x, y & z columns of each data source file into an array.
# Create a PointCloud object for each outcrop, vertical or horizontal well.
# compute_point_cloud_distance returns, for every interpretation point, the distance to the nearest point of the data source.
# After the loop, argsort gives the indices of the two nearest sources for each point and sort orders the distances,
# so that the sum of the two smallest distances can be compared with y_dist.

for idx, key in enumerate(geoclass.keys()):
    geoclass_pc_pts = pd.read_csv(key, usecols = ["x", "y", "z"]).values
    geoclass_pcd = o3d.geometry.PointCloud()
    geoclass_pcd.points = o3d.utility.Vector3dVector(geoclass_pc_pts)
    distances = np.asarray(interprets_pcd.compute_point_cloud_distance(geoclass_pcd))
    shortest_distances[idx] = distances
two_min = np.argsort(shortest_distances, axis = 0)[:2]
shortest_distances = np.sort(shortest_distances,axis = 0)
valid_distances = np.sum(shortest_distances[:2], axis = 0) <= y_dist

def in_hull(p, hull):
        if not isinstance(hull,Delaunay):
            hull = Delaunay(hull)

        return hull.find_simplex(p)>=0

# Read every data source once so the CSV files are not re-read inside the loops.
keys = list(geoclass.keys())
source_pts = {key: pd.read_csv(key, usecols = ["x", "y", "z"]).values for key in keys}

# For every pair of different sources, test which interpretation points lie inside the
# convex hull of the two sources combined. Tuple keys avoid collisions: with concatenated
# strings, the pairs (1, 12) and (11, 2) would both be stored under "112".
lies_or_not = {}
for idx, key in enumerate(keys):
    for jdx, jkey in enumerate(keys):
        if idx != jdx:
            hull_points = np.vstack((source_pts[key], source_pts[jkey]))
            lies_or_not[(idx, jdx)] = in_hull(interprets_pc_pts, hull_points)

total_ans = np.zeros(len(interprets_pc_pts))
geometries = []
for key in tqdm(keys):
    geoclass_pc_pts = source_pts[key]
    geoclass_pcd = o3d.geometry.PointCloud()
    geoclass_pcd.points = o3d.utility.Vector3dVector(geoclass_pc_pts)
    geoclass_pcd.paint_uniform_color([1.0, 0, 0])

    # Sphere whose vertices are used to expand the data source by its uncertainty distance.
    # A local radius is used so that re-running the cell does not change the dictionary.
    radius = geoclass[key] + 0.01
    sphere_mesh = o3d.geometry.TriangleMesh.create_sphere(radius, 10)
    sphere_pts = np.asarray(sphere_mesh.vertices)

    # Add the sphere vertices to every point of the data source.
    to_expand_pcd_pts = np.asarray(copy(geoclass_pcd).points)
    expanded_pts = np.array([[to_expand_pcd_pts[0][0],to_expand_pcd_pts[0][1],to_expand_pcd_pts[0][2]]])
    for i in to_expand_pcd_pts:
        expanded_pts = np.vstack((expanded_pts, i+sphere_pts))

    expanded_pcd = o3d.geometry.PointCloud()
    expanded_pcd.points = o3d.utility.Vector3dVector(expanded_pts)

    hull, _ = expanded_pcd.compute_convex_hull()
    hull_ls = o3d.geometry.LineSet.create_from_triangle_mesh(hull)
    hull_ls.paint_uniform_color((0, 1, 1)) # expanded (loose) bound, cyan

    hull_hard, _ = geoclass_pcd.compute_convex_hull()
    hull_ls_hard = o3d.geometry.LineSet.create_from_triangle_mesh(hull_hard)
    hull_ls_hard.paint_uniform_color((1, 0, 0)) # data source (rigid) bound, red

    ans_losen = in_hull(interprets_pc_pts, expanded_pts)
    ans_hard = in_hull(interprets_pc_pts, geoclass_pc_pts)

    for i in range(len(total_ans)):
        nearest_pair = (int(two_min[0][i]), int(two_min[1][i]))
        # class 2: within y_dist of the two nearest sources and inside their combined hull
        if valid_distances[i] and total_ans[i] != 1 and lies_or_not[nearest_pair][i]:
            total_ans[i] = 2
        # class 1: inside the hull of the data source itself
        if ans_hard[i]:
            total_ans[i] = 1
        # class 3: inside the expanded hull only
        if not ans_hard[i] and ans_losen[i] and total_ans[i] != 1:
            total_ans[i] = 3
        # class 5: outside both hulls and above the surface (z > 0)
        if not ans_hard[i] and not ans_losen[i] and interprets_pc_pts[i][-1] > 0 and total_ans[i] != 1:
            total_ans[i] = 5
    # Only draw the expanded hull when the source has a non-zero uncertainty distance.
    if geoclass[key] != 0:
        geometries.append(hull_ls)
    geometries.append(hull_ls_hard)

interprets_pcd_true = o3d.geometry.PointCloud()
interprets_pcd_true.points = o3d.utility.Vector3dVector(interprets_pc_pts[total_ans == 1])
interprets_pcd_false = o3d.geometry.PointCloud()
interprets_pcd_false.points = o3d.utility.Vector3dVector(interprets_pc_pts[total_ans == 0])
interprets_pcd_losen = o3d.geometry.PointCloud()
interprets_pcd_losen.points = o3d.utility.Vector3dVector(interprets_pc_pts[total_ans == 3])
interprets_pcd_false_zpos = o3d.geometry.PointCloud()
interprets_pcd_false_zpos.points = o3d.utility.Vector3dVector(interprets_pc_pts[total_ans == 5])
interprets_pcd_y = o3d.geometry.PointCloud()
interprets_pcd_y.points = o3d.utility.Vector3dVector(interprets_pc_pts[total_ans == 2])

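interprets_pcd_true.paint_uniform_color([0, 1, 0])   # class 1: inside a data source hull (green)
interprets_pcd_false.paint_uniform_color([0.698, 0.133, 0.133])  # class 0: outside the hulls, exported as level 4 (dark red)
interprets_pcd_false_zpos.paint_uniform_color([0.5, 0.5, 0.5])  # class 5: outside the hulls and above z = 0 (grey)
interprets_pcd_losen.paint_uniform_color([1, 0, 1])  # class 3: inside an expanded hull (magenta)
interprets_pcd_y.paint_uniform_color([0, 0, 1])  # class 2: between the two nearest sources, within y_dist (blue)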
geometries.append(interprets_pcd_true)
geometries.append(interprets_pcd_false)
geometries.append(interprets_pcd_losen)
geometries.append(interprets_pcd_y)
geometries.append(interprets_pcd_false_zpos)

o3d.visualization.draw_geometries(geometries)
o3d.visualization.draw_geometries([interprets_pcd_true, interprets_pcd_false, interprets_pcd_false_zpos, interprets_pcd_losen, interprets_pcd_y])

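# Append the class codes to the interpretation table and save all levels together.
# index=False keeps the row index out of the file, so no extra unnamed column appears when it is read back.
interprets_pc_add = pd.read_csv(interprets)
interprets_pc_add['uncertaintyclasses'] = total_ans
interprets_pc_add.to_csv("level_1to5.csv", index=False)
df = pd.read_csv('level_1to5.csv')

# Split the table by class code. Note that level 4 is stored as class code 0.
level_1 = df[df['uncertaintyclasses']==1]
level_2 = df[df['uncertaintyclasses']==2]
level_3 = df[df['uncertaintyclasses']==3]
level_4 = df[df['uncertaintyclasses']==0]
level_5 = df[df['uncertaintyclasses']==5]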

level_1.to_csv('level_1.csv', index=False)
level_2.to_csv('level_2.csv', index=False)
level_3.to_csv('level_3.csv', index=False)
level_4.to_csv('level_4.csv', index=False)
level_5.to_csv('level_5.csv', index=False)

100%|██████████████████████████████████████████████████████████████████████████████████| 16/16 [16:51<00:00, 63.25s/it]


![image-2.png](attachment:image-2.png)

![image-2.png](attachment:image-2.png)
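
Most of the run time reported above probably comes from the expansion step, which grows `expanded_pts` with one `np.vstack` call per source point. A possible alternative, sketched here with NumPy broadcasting, adds every offset to every point in a single operation. The six axis-aligned offsets are a hypothetical stand-in for the sphere vertices that Open3D generates, and `expand_points` is not a function from the notebook.

```python
import numpy as np


def expand_points(points, offsets):
    """Add every offset to every point; returns shape (n_points * n_offsets, 3)."""
    # (n, 1, 3) + (1, m, 3) broadcasts to (n, m, 3), then flattens to rows.
    return (points[:, None, :] + offsets[None, :, :]).reshape(-1, 3)


radius = 50.0
# Stand-in for sphere vertices: one offset along each axis direction.
offsets = radius * np.array([
    [1, 0, 0], [-1, 0, 0],
    [0, 1, 0], [0, -1, 0],
    [0, 0, 1], [0, 0, -1],
], dtype=float)

# Two made-up source points 100 units apart.
source = np.array([[0.0, 0.0, 0.0], [100.0, 0.0, 0.0]])
expanded = expand_points(source, offsets)
print(expanded.shape)
```

The convex hull of the expanded points is the same as in the loop version, because the hull only depends on the set of points and not on the order in which they are stacked.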

Here we calculate zone 2: the area that fills the space between galleries and around outcrops within 100 m (the user can change this distance through `y_dist`).
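
The distance part of the zone-2 rule can be illustrated without Open3D. The sketch below assumes that a nearest-neighbour query with `scipy.spatial.cKDTree` is an acceptable substitute for `compute_point_cloud_distance`; the function name `two_nearest_sources` and the toy coordinates are hypothetical. A point passes when the sum of its distances to the two nearest sources does not exceed the chosen limit.

```python
import numpy as np
from scipy.spatial import cKDTree


def two_nearest_sources(points, sources, max_sum):
    """Return the indices of the two nearest sources per point and a pass/fail mask."""
    # One row per source, one column per point: nearest-neighbour distances.
    dists = np.vstack([cKDTree(src).query(points)[0] for src in sources])
    nearest_two = np.argsort(dists, axis=0)[:2]
    two_smallest = np.sort(dists, axis=0)[:2]
    return nearest_two, two_smallest.sum(axis=0) <= max_sum


# Three toy sources, each a single point along the x-axis.
sources = [
    np.array([[0.0, 0.0, 0.0]]),
    np.array([[80.0, 0.0, 0.0]]),
    np.array([[500.0, 0.0, 0.0]]),
]
# One point midway between the first two sources, one far from all of them.
points = np.array([[40.0, 0.0, 0.0], [300.0, 0.0, 0.0]])

nearest_two, valid = two_nearest_sources(points, sources, max_sum=100.0)
print(valid)
```

In the notebook this distance test is combined with a hull test on the two nearest sources, so passing it alone is not sufficient for a point to be placed in zone 2.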

## Output and Future work

The workflow generates one CSV file with all levels together, plus one CSV file per level of the interpretation. These files can be used as input for further investigation using machine learning, for example generative adversarial networks (GANs). For instance, part of the output (level-1, or perhaps levels 1 and 2) can be used as input to machine learning or deep learning models to predict the remaining levels (3, 4 and 5).
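
Before passing the output to another tool, it can help to translate the stored class codes into level numbers, since level 4 is written as code 0. The sketch below uses a small made-up table in place of `level_1to5.csv`; the mapping name `CODE_TO_LEVEL` and the split into known and remaining levels are illustrative assumptions.

```python
import pandas as pd

# Class codes written by the notebook mapped to level numbers (code 0 is level 4).
CODE_TO_LEVEL = {1: 1, 2: 2, 3: 3, 0: 4, 5: 5}

# Made-up rows standing in for the contents of level_1to5.csv.
df_demo = pd.DataFrame({
    "x": [0.0, 10.0, 20.0, 30.0, 40.0, 50.0],
    "y": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "z": [-5.0, -50.0, -20.0, -300.0, 15.0, -8.0],
    "uncertaintyclasses": [1.0, 0.0, 3.0, 0.0, 5.0, 2.0],
})

# Add a readable level column and count the points in each level.
df_demo["level"] = df_demo["uncertaintyclasses"].astype(int).map(CODE_TO_LEVEL)
counts = df_demo["level"].value_counts().sort_index()

# Levels 1 and 2 as known input; levels 3 to 5 as the part left to predict.
known = df_demo[df_demo["level"] <= 2]
remaining = df_demo[df_demo["level"] >= 3]
print(counts.to_dict(), len(known), len(remaining))
```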

## Reference 