<img src="../images/logo.png" alt="slb" style= "width: 1700px"/>

## ⚡️   - Tutorial 1: Log Evaluation

💡 In this tutorial, we will learn how to import and manipulate well log data for reservoir evaluation. The tutorial is subdivided into four sections:

1. Exploring well log data (LAS file)
2. Importing well tops 
3. Defining facies using well logs
4. Plotting well log data


💪 After completing this exercise you should be comfortable with displaying and manipulating log data

### 🏁 Step 1: Install Required Packages

👇 `pip install 'package-name'` is the standard way of installing required libraries

In [101]:
# !pip install lasio
# !pip install plotly

# If the libraries are already installed in the current environment, 
# the output message will be "Requirement already satisfied ..."

### 🏁 Step 2: Import Libraries

In [102]:
# Import required libraries

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
import lasio

### 🏁 Step 3: Import the .las File 

Use the lasio library to import the .las file. The file is named after its associated well 👉  **"Diamond-14.las"** 


In [103]:
# Read the .LAS file using lasio, call it D14

D14 = lasio.read("../Data/Diamond-14.las")

### 🏁 Step 4: Display the Data in the .las File

Now that our file has been loaded, we can display its content: header, curve descriptions, and log data 

In [None]:
# Display the header of the .las file



In [None]:
# Display the description of the log curves present in the .las file



In [None]:
# Display a summary of the log data in the .las file



### 🏁 Step 5: Create a Dataframe 

💡 In this step, we will convert the data that was loaded using lasio ('D14') into a pandas dataframe. This will facilitate data manipulation and allow for further plotting

👌 Dataframes are typically used for data wrangling because they represent data in a tabular format, with rows and columns

In [None]:
# Create the dataframe using the "D14" dataset, call it D14_logs



In [None]:
# To get an overview of the newly created dataframe, print its summary using the 'info()' method



In [None]:
# Display the first five rows of the dataframe using the 'head()' method



# ✍️ The value inside the brackets specifies the number of rows to display

In [None]:
# Generate descriptive statistics of the log data



In [None]:
# We can also use the .T attribute to transpose the dataframe (swap index and columns)



### 🏁 Step 6: Replace Negative Values by NaN

💡 From the previous step, we can see that certain logs, such as RHOB and DT, contain negative values. Density and transit time cannot be negative, so such values are likely due to tool errors and should be replaced with NaN 👇
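One possible way to do this is sketched below on a small synthetic dataframe (the values and the name `demo` are made up for illustration). `DataFrame.mask()` replaces every entry where the condition is True, and with no replacement value given it falls back to NaN.

```python
import pandas as pd

# Synthetic log samples; -999.25 stands in for an erroneous negative reading
demo = pd.DataFrame(
    {"RHOB": [2.45, -999.25, 2.60, 2.38], "DT": [85.0, 92.0, -999.25, 0.0]},
    index=pd.Index([1000.0, 1000.5, 1001.0, 1001.5], name="DEPT"),
)

cols = ["RHOB", "DT"]
# Where the condition (value < 0) is True, the value is replaced by NaN
demo[cols] = demo[cols].mask(demo[cols] < 0)

print(demo)
print(demo.describe().T)  # the minimum values should no longer be negative
```

Note that a DT of exactly 0 is not negative, so it survives this step.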


In [None]:
# Replace negative values in the 'RHOB' and 'DT' logs using the .mask() method



In [None]:
# Check the statistics for the well logs after we replaced the negative values



### 🏁 Step 7: Compute a New Well Log 

💡 After removing the negative values from our log data, we can use it to estimate geological properties. As an example, we will calculate an Acoustic Impedance (AI) log using the density (RHOB) and sonic (DT) logs

**AI= Bulk density x Velocity**

Note that the sonic log measures transit time, so we will need to compute velocity: 👇

**Velocity= 1000000 / DT**
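The two formulas can be sketched on a few synthetic samples as follows. The values, the name `demo` and the intermediate `VEL` column are hypothetical, and the units are an assumption (DT in microseconds per foot, which makes the velocity come out in feet per second).

```python
import pandas as pd

# Synthetic samples; DT is assumed to be in microseconds per foot
demo = pd.DataFrame({"RHOB": [2.5, 2.4, 2.6], "DT": [100.0, 80.0, 0.0]})

# Velocity = 1,000,000 / DT (a DT of 0 produces inf rather than an error)
demo["VEL"] = 1_000_000 / demo["DT"]

# Acoustic impedance = bulk density x velocity
demo["AI"] = demo["RHOB"] * demo["VEL"]

print(demo)
print(demo["AI"].describe())
```

The last sample has DT equal to 0, so its velocity and AI are infinite.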

In [None]:
# Calculate acoustic impedance (AI): AI= density x velocity



In [None]:
# Validate the resulting AI log by looking at its summary statistics



### 🏁 Step 8: Replace Infinity Values in the AI log with NaN

Note that the minimum and maximum values of the AI log are 👉-inf and inf👈. 

Infinite values may occur when the denominator (DT) equals 0

👉 We can use the .mask() method to replace any inf values with NaN
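A minimal sketch of this replacement, using a synthetic series (the values and the names `ai` and `ai_clean` are illustrative only). The two conditions are combined element by element with `|`, and each one needs its own parentheses.

```python
import numpy as np
import pandas as pd

# Synthetic AI values containing both positive and negative infinity
ai = pd.Series([25000.0, np.inf, 18000.0, -np.inf], name="AI")

# Mask entries that are +inf OR -inf; masked entries become NaN
ai_clean = ai.mask((ai == np.inf) | (ai == -np.inf))

print(ai_clean)
print(ai_clean.describe())  # min and max are finite again
```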

In [None]:
# Replace the inf and -inf data in the AI log by NaN



👁‍🗨 The **|** operator is the "or" operator

In [None]:
# Validate the results by looking at the stats of the AI log again



### 🏁 Step 9: Identify and Handle Outliers

👇 We will create a box plot to detect potential outliers in the data. We will use the AI log as an example

In [None]:
# Evaluate the data range from a boxplot and identify potential outliers



💡 Box plots are very useful to identify outliers

In this case, we can see that the outliers are so extreme that the plot is skewed, which is very common in geological data.

Therefore, it is recommended to remove some outliers for a better understanding of the data distribution 👇
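The row-count check and the threshold filter could look like the sketch below. The data and the names `demo` and `filtered` are made up, and the cut-off of 25000 is simply the value suggested in this exercise, not a general rule.

```python
import numpy as np
import pandas as pd

# Synthetic AI column with one extreme value and one missing value
demo = pd.DataFrame({"AI": [21000.0, 23500.0, 400000.0, 19800.0, np.nan, 26000.0]})

print("Rows before:", len(demo))

# Keep only rows at or below the threshold.
# Comparisons with NaN are False, so rows with a missing AI are dropped too.
filtered = demo[demo["AI"] <= 25000]

print("Rows after:", len(filtered))
print(filtered["AI"].describe())
```

If rows with a missing AI should be kept, the condition can be extended with `| demo["AI"].isna()`.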

In [None]:
# First, let's print the number of rows in the AI column before removing outliers



In [None]:
# Remove the rows for AI>25000 (Feel free to play with this number)


In [None]:
# Print the number of rows after removing the outliers



In [None]:
# Display the content of the edited AI column



In [None]:
# Create a boxplot of the AI data after removing the outliers



In [None]:
# Create a histogram of the AI log to validate the data range after removing outliers 



### 🏁 Step 10: Create a Sub-set of Data

👇 To facilitate the visualization and manipulation of the log data, we can create a sub-data frame to keep only the well logs that are required for the rest of the tutorial

Among the 38 columns in the original .las file (+ AI log), we will make a sub-set containing the following curves: CALI, GR, DT, RHOB, NPHI, PHIT, RT, VCL, and AI
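Column selection can be sketched on a smaller synthetic dataframe (the values and the names `demo`, `subset` and `single` are hypothetical, and only three of the nine curves are used to keep it short). Passing a list of names inside the indexing brackets returns a DataFrame, while a single name returns a Series.

```python
import pandas as pd

# Synthetic dataframe with one column (SP) that we do not want to keep
demo = pd.DataFrame({
    "GR": [45.0, 60.0, 72.0],
    "RHOB": [2.45, 2.50, 2.61],
    "SP": [-20.0, -18.5, -25.0],
    "AI": [21000.0, 23500.0, 24800.0],
})

subset = demo[["GR", "RHOB", "AI"]]  # list of names -> DataFrame
single = demo["GR"]                  # single name   -> Series

subset.info()
```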


In [None]:
# Create a sub-set selecting the following columns: 'CALI', 'GR', 'DT', 'RHOB', 'NPHI', 'PHIT', 'RT', 'VCL', 'AI'



# 👁‍🗨 Double square brackets are used to return a DataFrame

In [None]:
# Print a summary of the resulting sub-set



### 🏁 Step 11: Import the Well Tops from a .csv file

In [None]:
# Read well tops from a .csv file




# Display the content of the loaded well tops file



### 🏁 Step 12: Run Basic Operations on the Well Tops

In [None]:
# Compute the average depth for each formation top for the three wells (Diamond-14, Diamond-10 and Diamond-03)
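A possible approach is to group the tops by surface and average the depths. The sketch below uses a synthetic table: the depths are invented, and the column names "Well name", "Surface" and "MD" are assumed to match the .csv file.

```python
import pandas as pd

# Synthetic well tops (depths are illustrative, not the real picks)
tops = pd.DataFrame({
    "Well name": ["Diamond-14", "Diamond-14", "Diamond-10",
                  "Diamond-10", "Diamond-03", "Diamond-03"],
    "Surface": ["HOUSTON", "HOUSTON_BASE"] * 3,
    "MD": [1500.0, 1560.0, 1520.0, 1585.0, 1480.0, 1545.0],
})

# Average measured depth of each surface across the three wells
mean_md = tops.groupby("Surface")["MD"].mean()
print(mean_md)
```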



### 🏁 Step 13: Re-Arrange the Well Tops Dataframe  

In [None]:
# Re-arrange the dataframe using the .pivot_table() method with columns= Surface and index= Well name
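The reshaping can be sketched as follows, again on synthetic tops whose column names are assumed to match the .csv file. The result has one row per well and one column per surface. `pivot_table()` aggregates with the mean by default, which changes nothing here because each well and surface pair occurs once.

```python
import pandas as pd

# Synthetic well tops in "long" format: one row per well and surface
tops = pd.DataFrame({
    "Well name": ["Diamond-14", "Diamond-14", "Diamond-10",
                  "Diamond-10", "Diamond-03", "Diamond-03"],
    "Surface": ["HOUSTON", "HOUSTON_BASE"] * 3,
    "MD": [1500.0, 1560.0, 1520.0, 1585.0, 1480.0, 1545.0],
})

# "Wide" format: wells as rows, surfaces as columns, MD as cell values
tops_wide = tops.pivot_table(index="Well name", columns="Surface", values="MD")
print(tops_wide)
```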



### 🏁 Step 14: Sort Well Tops

In [None]:
# Sort the tops in each well based on depth ('MD')
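One way to sort within each well is to pass both columns to `sort_values()`, so that rows are ordered by well first and by depth second. The table below is synthetic and deliberately out of order.

```python
import pandas as pd

# Synthetic, unsorted well tops
tops = pd.DataFrame({
    "Well name": ["Diamond-14", "Diamond-10", "Diamond-14", "Diamond-10"],
    "Surface": ["HOUSTON_BASE", "HOUSTON", "HOUSTON", "HOUSTON_BASE"],
    "MD": [1560.0, 1520.0, 1500.0, 1585.0],
})

# Sort by well name, then by increasing depth inside each well
tops_sorted = tops.sort_values(["Well name", "MD"])
print(tops_sorted)
```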



### 🏁 Step 15: Define a Zone of Interest Between Two Well Tops 


💡 Now, we will create a sub-set of our log dataframe to keep only the values within the interval of "HOUSTON" and "HOUSTON_BASE" well tops


In [None]:
# Index the 'tops' dataframe based on well ("Well name") and well top ("Surface")


# 👁‍🗨 The inplace=True parameter means that the operation will be performed on the original tops DataFrame


# Display the resulting dataframe



# 👇 As shown in the output table, the data now has 2 index columns, Well name and Surface

🧐 The list of column names, ["Well name", "Surface"], specifies that the new index will have two levels: the first level is the "Well name" column and the second level is the "Surface" column
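A minimal sketch of the two-level index and of a lookup on it, using synthetic tops (the depths are invented and the column names are assumed to match the .csv file). With both index levels supplied as a tuple, `.loc` returns a single depth value.

```python
import pandas as pd

# Synthetic well tops
tops = pd.DataFrame({
    "Well name": ["Diamond-10", "Diamond-14", "Diamond-14"],
    "Surface": ["HOUSTON", "HOUSTON", "HOUSTON_BASE"],
    "MD": [1520.0, 1500.0, 1560.0],
})

# Build the two-level index in place: level 0 = well, level 1 = surface
tops.set_index(["Well name", "Surface"], inplace=True)
print(tops)

# Look up one depth by giving (well, surface) as a tuple, plus the column
top = tops.loc[("Diamond-14", "HOUSTON"), "MD"]
base = tops.loc[("Diamond-14", "HOUSTON_BASE"), "MD"]
print(top, base)
```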

In [None]:
# Define two variables to store the top= HOUSTON and base= HOUSTON_BASE of our zone of interest


# 👌 Here we are filtering the rows where the "Well name" is "Diamond-14" and the "Surface" is "HOUSTON"



In [None]:
# Let's print the variables top and base



In [None]:
# Use the .loc[] indexer to retrieve the data values in the zone of interest (within 'top' and 'base')



# 👌 Here we select only the rows whose index values are within the variables top and base

# Display the resulting dataframe
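The depth selection can be sketched on a synthetic log dataframe indexed by depth. The values of `top` and `base` are made up, and `zoi` is a hypothetical name for the result. Unlike ordinary Python slices, a label slice with `.loc` includes both end points.

```python
import numpy as np
import pandas as pd

# Synthetic GR log sampled every 10 m from 1490 m to 1570 m
depth = np.arange(1490.0, 1571.0, 10.0)
logs = pd.DataFrame(
    {"GR": np.linspace(40.0, 80.0, depth.size)},
    index=pd.Index(depth, name="DEPT"),
)

top, base = 1500.0, 1560.0  # illustrative depths of the two well tops

# Keep the rows whose depth index lies between top and base (both included)
zoi = logs.loc[top:base]
print(zoi)
```

This relies on the depth index being sorted, which is normally the case for log data.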



### 🏁 Step 16: Facies Classification Using Well Logs

💡 In this portion of the tutorial we will create a facies classification based on a Gamma Ray cut off

In [None]:
# Create a function to define the facies classes -> GR > 50: Shale, otherwise: Sand



In [None]:
# This is an example of how the function GR_Facies could be used for a single value



💊 However, since we need to compute the facies for every value in a dataframe column, we should use the .apply() method 👇
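A sketch of the cut-off function and of its use with `.apply()`, on synthetic GR values. Treating a GR of exactly 50 as Sand is a choice made here for illustration, since the cut-off rule does not say which class it belongs to.

```python
import pandas as pd

def GR_Facies(gr):
    """Return 'Shale' for GR above 50, otherwise 'Sand'."""
    return "Shale" if gr > 50 else "Sand"

# The function works on a single value...
print(GR_Facies(75))

# ...and .apply() calls it once for every value in the column
demo = pd.DataFrame({"GR": [35.0, 50.0, 82.0, 47.5]})
demo["Facies Type"] = demo["GR"].apply(GR_Facies)
print(demo)
```

With this definition a missing GR value (NaN) also ends up as Sand, because any comparison with NaN is False.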

In [None]:
# Create a column named 'Facies Type' by applying the 'GR_Facies' function to the 'GR' column




# Display a summary of the dataset. It should include the 'Facies Type' column



### 🏁 Step 17: Plotting Well Log Data

💡 The analysis of well log data relies on a variety of plots such as line plots (log vs depth), histograms, cross-plots, etc.

In this exercise, we will explore the usage of various Python libraries, including matplotlib, seaborn and plotly, to create the most frequently used plots for well log evaluation:

- Line plot
- 2D and 3D scatter plot
- Box plots
- Histogram 
- Correlation matrix
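As a starting point for the line plot, the sketch below draws a synthetic GR curve against depth with Matplotlib. The data, units and styling are assumptions chosen for illustration. The log goes on the x-axis and depth on the y-axis, and the y-axis is inverted so that depth increases downwards.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic gamma ray curve over a 60 m interval
depth = np.linspace(1500.0, 1560.0, 61)
gr = 60.0 + 20.0 * np.sin(depth / 5.0)

fig, ax = plt.subplots(figsize=(4, 8))  # tall, narrow figure like a log track
ax.plot(gr, depth, color="green")
ax.invert_yaxis()                       # depth increases downwards
ax.set_xlabel("GR (gAPI)")
ax.set_ylabel("Depth (m)")
ax.set_title("Gamma ray")
ax.grid(True)
```

In a notebook the figure is rendered when the cell finishes running.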

In [None]:
# Using Matplotlib create a vertical plot of the GR log within the ZOI




In [None]:
# We can improve the plot above by adjusting the size, adding a title, axis labels and a grid



In [None]:
# Use Matplotlib to create a scatter plot of gamma ray vs. bulk density, with markers colored by DT



In [None]:
# Use plotly to create a 3D plot of the NPHI, RHOB and GR logs with markers colored by DT



# 👁‍🗨 Make sure to explore the icons on the top right of the plot!

In [None]:
# Use plotly to create a histogram of the bulk density log. 
# Be sure to customize the bin number ('nbins')! 



In [None]:
# We can also create a histogram of the bulk density grouped/colored by facies



In [None]:
# Use plotly to create a boxplot of the gamma ray by facies



# 💥 Make sure to hover over the boxplot to read the statistics!

In [None]:
# Use plotly to create a strip plot of the Neutron porosity, grouped by facies



In [None]:
# Use Plotly to create a scatterplot of RHOB versus NPHI, and compute a trend line for the data



# 'ols' stands for Ordinary Least Squares (OLS) regression, which is the standard method ...
# used to fit a linear regression line to data points

### 🏁 Step 18: Calculate and Visualize a Correlation Matrix 

In [None]:
# Use the .corr() method to compute the Pearson correlation coefficient for all numeric columns ... 
# within the D14_ZOI dataframe
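A minimal sketch on synthetic data (the values and the names `demo` and `demo_matrix` are hypothetical). It assumes pandas 1.5 or newer, where `numeric_only=True` tells `.corr()` to skip text columns such as 'Facies Type'.

```python
import pandas as pd

# Synthetic logs: NPHI decreases linearly as RHOB increases
demo = pd.DataFrame({
    "RHOB": [2.30, 2.40, 2.50, 2.60],
    "NPHI": [0.30, 0.25, 0.20, 0.15],
    "Facies Type": ["Sand", "Sand", "Shale", "Shale"],
})

# Pearson correlation (the default method) between the numeric columns only
demo_matrix = demo.corr(numeric_only=True)
print(demo_matrix)
```

A coefficient close to -1 or 1 indicates a strong linear relationship, and a value near 0 indicates a weak one.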


In [None]:
# Use Seaborn to create a heat map of the correlation matrix ('Matrix_Full')



In [None]:
# Use Plotly to display the graphical correlation between 'RHOB','NPHI','DT', and 'GR'



In [None]:
# We can also use plotly to create a parallel coordinates plot to visualize the correlation between those logs




☝️ In the plot above, each line crossing the vertical axes represents a data point (a row in the dataframe)!

💥 The plot provides a way to quickly see the relationships between multiple variables in a single visualization 

🎯 Well done!