# Loading and Displaying Well Log Data from LAS

**Created by:** Andy McDonald  
  
This notebook illustrates how to load data from a LAS file and carry out a basic QC of the data before plotting it on a log plot.

## Loading and Checking Data
The first step is to import the required libraries: pandas, matplotlib and lasio.  
lasio is a library developed to read and work with LAS files. More information on the library can be found at: https://lasio.readthedocs.io/en/latest/

In [24]:
import pandas as pd
import matplotlib.pyplot as plt
import lasio

%matplotlib inline 
#%matplotlib qt 

To load our file, we can use the `read()` function from lasio like so:

In [25]:
las = lasio.read("FCIEN/NO_07_P_X1 - PELADO/NO_07_P_X1 - Logs/NO_07_P_X1 - Dipmeter_01.las")

Now that our file has been loaded, we can start investigating its contents.  
To find out where the file originated from, such as the well name, the location and the depth range it covers, we can create a simple for loop that goes over each header item. Using Python's f-strings, we can join the description, mnemonic and value of each item together.

In [26]:
for item in las.well:
    print(f"{item.descr} ({item.mnemonic}): {item.value}")

START DEPTH (STRT): 65536.5
STOP DEPTH (STOP): 65481.0
STEP (STEP): -0.5
NULL VALUE (NULL): -999.25
COMPANY (COMP): ANCAP
WELL (WELL): N 07 P X-1
FIELD (FLD): PELADO
LOCATION (LOC): 
COUNTY (CNTY): 
STATE (STAT): 
COUNTRY (CTRY): URUGUAY
SERVICE COMPANY (SRVC): Schlumberger
API NUMBER (API): 
LOG DATE (DATE): 20-11-2008
UNIQUE WELL ID (UWI): 
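
For context, the `mnemonic`, `value` and `descr` attributes used in the loop mirror the layout of a LAS 2.0 header line, which has the form `MNEM.UNIT VALUE : DESCRIPTION`. The sketch below splits such a line with plain Python. It is only an illustration and not how lasio parses files internally. The helper name `parse_header_line` is hypothetical, and the sample lines are assembled from the values printed above, with the unit `F` assumed for the start depth.

```python
def parse_header_line(line):
    """Split one LAS 2.0 header line into its four parts (illustrative only)."""
    # The first period ends the mnemonic
    mnemonic, rest = line.split(".", 1)
    # The last colon starts the description
    data, descr = rest.rsplit(":", 1)
    # The unit runs from the period up to the first space; the rest is the value
    unit, _, value = data.partition(" ")
    return {
        "mnemonic": mnemonic.strip(),
        "unit": unit.strip(),
        "value": value.strip(),
        "descr": descr.strip(),
    }


# Hypothetical sample lines built from the header values printed above
samples = [
    "STRT.F      65536.5 : START DEPTH",
    "WELL.       N 07 P X-1 : WELL",
]

for sample in samples:
    item = parse_header_line(sample)
    print(f"{item['descr']} ({item['mnemonic']}): {item['value']}")
```

Lines whose description itself contains a colon would need extra handling, which is one reason to rely on lasio for real files.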


If we just want to extract the Well Name, we can simply call it by:

In [27]:
las.well.WELL.value

'N 07 P X-1'

To quickly see which curves are present within the LAS file, we can loop through `las.curves`:

In [28]:
for curve in las.curves:
    print(curve.mnemonic)

INDEX
INDEX.
DIFF
TOD
TIME
ETIM
TENS
CS
MARK
FC1
FC2
FC3
FC4
RHDT
RAZI
RRB
RDEV
C1
C2
HAZI
AZIM
P1AZ
RB
DEVI


To get more detail on each curve, we can repeat the process and call upon the `unit` and `descr` attributes of each CurveItem object to get the units and the curve's description.
The enumerate function allows us to keep a count of the number of curves present within the file. As enumerate returns 0 on the first loop, we need to add 1 to the final count to get the total number of curves, including the depth curve.

In [29]:
for count, curve in enumerate(las.curves):
    print(f"Curve: {curve.mnemonic}, Units: {curve.unit}, Description: {curve.descr}")
print(f"There are a total of: {count+1} curves present within this file")

Curve: INDEX, Units: F, Description: DEPTH (BOREHOLE)
Curve: INDEX., Units: 1IN, Description: 
Curve: DIFF, Units: M, Description: 
Curve: TOD, Units: S, Description: 
Curve: TIME, Units: MS, Description: 
Curve: ETIM, Units: S, Description: 
Curve: TENS, Units: LB, Description: 
Curve: CS, Units: F/HR, Description: 
Curve: MARK, Units: M, Description: 
Curve: FC1, Units: , Description: 
Curve: FC2, Units: , Description: 
Curve: FC3, Units: , Description: 
Curve: FC4, Units: , Description: 
Curve: RHDT, Units: , Description: {A;0} [0]
Curve: RAZI, Units: DEG, Description: 
Curve: RRB, Units: DEG, Description: 
Curve: RDEV, Units: DEG, Description: 
Curve: C1, Units: IN, Description: 
Curve: C2, Units: IN, Description: 
Curve: HAZI, Units: DEG, Description: 
Curve: AZIM, Units: DEG, Description: 
Curve: P1AZ, Units: DEG, Description: 
Curve: RB, Units: DEG, Description: 
Curve: DEVI, Units: DEG, Description: 
There are a total of: 24 curves present within this file
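
As an alternative to adding 1 after the loop, `enumerate` accepts a `start` argument. A minimal sketch, using a short hypothetical list of mnemonics in place of `las.curves`:

```python
# Hypothetical short list standing in for the mnemonics in las.curves
mnemonics = ["INDEX", "DIFF", "TOD"]

count = 0  # stays at 0 if the list is empty
for count, mnemonic in enumerate(mnemonics, start=1):
    print(f"{count}: {mnemonic}")

print(f"There are a total of: {count} curves present within this file")
```

Calling `len()` on the list gives the same total without a loop.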


## Creating a Pandas Dataframe
Data loaded in using lasio can be converted to a pandas dataframe using the `.df()` method. This allows us to easily plot data and pass it into one of the many machine learning algorithms.

In [32]:
well = las.df()
well.to_excel('Pelado dipmeter.xlsx')

The `.describe()` function generates a table of summary statistics (count, mean, standard deviation, minimum, quartiles and maximum) for each curve within the dataframe.

In [31]:
well.describe()

| | INDEX. | DIFF | TOD | TIME | ETIM | TENS | CS | MARK | FC1 | FC2 | ... | RAZI | RRB | RDEV | C1 | C2 | HAZI | AZIM | P1AZ | RB | DEVI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | ... | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 | 112.0 |
| mean | 7861050.0 | -0.198 | 10000000.0 | 320.007812 | 33.343804 | 34.386161 | 2999.450893 | 0.0 | 0.0 | 0.0 | ... | 233.44017 | 39.641991 | -6.538902 | 3.344 | 3.451 | 233.44017 | 233.44017 | 273.12033 | 39.641991 | -6.536857 |
| std | 1948.109 | 3.066835e-16 | 1.496813e-08 | 0.773156 | 19.480859 | 1.388656 | 1.170604 | 0.0 | 0.0 | 0.0 | ... | 0.421438 | 0.556472 | 0.121649 | 4.460851e-15 | 4.014766e-15 | 0.421438 | 0.421438 | 0.734598 | 0.556472 | 0.018214 |
| min | 7857720.0 | -0.198 | 10000000.0 | 316.0 | 0.32 | 30.0 | 2993.25 | 0.0 | 0.0 | 0.0 | ... | 232.313 | 38.875 | -6.723 | 3.344 | 3.451 | 232.313 | 232.313 | 271.781 | 38.875 | -6.585 |
| 25% | 7859385.0 | -0.198 | 10000000.0 | 320.0 | 16.691 | 35.0 | 2999.5 | 0.0 | 0.0 | 0.0 | ... | 233.25 | 39.055 | -6.643 | 3.344 | 3.451 | 233.25 | 233.25 | 272.563 | 39.055 | -6.55125 |
| 50% | 7861050.0 | -0.198 | 10000000.0 | 320.0 | 33.341 | 35.0 | 3000.0 | 0.0 | 0.0 | 0.0 | ... | 233.25 | 39.6835 | -6.522 | 3.344 | 3.451 | 233.25 | 233.25 | 273.031 | 39.6835 | -6.532 |
| 75% | 7862715.0 | -0.198 | 10000000.0 | 320.0 | 49.991 | 35.0 | 3000.0 | 0.0 | 0.0 | 0.0 | ... | 233.422 | 40.313 | -6.402 | 3.344 | 3.451 | 233.422 | 233.422 | 273.5 | 40.313 | -6.522 |
| max | 7864352.0 | -0.198 | 10000000.0 | 325.0 | 66.641 | 35.0 | 3000.0 | 0.0 | 0.0 | 0.0 | ... | 234.625 | 40.313 | -6.282 | 3.344 | 3.451 | 234.625 | 234.625 | 275.0 | 40.313 | -6.508 |


In [None]:
well.reset_index(drop=True, inplace=True)
well.set_index(['DEPT:2'], inplace=True)
well.describe()

To find out more about the data, we can call upon the `.info()` and `.describe()` functions.  
  
The `.info()` function provides information about the data types and how many non-null values are present within each curve.  
The `.describe()` function provides statistical information about each curve and can be a useful QC for each curve.
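
To make the two summaries concrete, the sketch below applies them to a small made-up dataframe. The name `demo` and all of its values are hypothetical and are not taken from the LAS file. `.count()` is used here because it returns the same non-null counts that `.info()` prints.

```python
import numpy as np
import pandas as pd

# Hypothetical three-sample log; the values are made up for illustration
demo = pd.DataFrame(
    {"GR": [45.0, np.nan, 60.0], "CALI": [8.5, 8.6, 8.7]},
    index=pd.Index([100.0, 100.5, 101.0], name="DEPTH"),
)

# Non-null samples per curve (the same counts that .info() reports)
non_null = demo.count()

# Summary statistics per curve; NaN values are excluded from each statistic
stats = demo.describe()

print(non_null)
print(stats.loc[["count", "min", "max"]])
```

A `count` lower than the number of rows, as for `GR` here, is a quick sign that a curve has gaps.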

In [None]:
well['LITH'].describe()

In [None]:
well.info()

## Visualising Data Extent

Instead of the summary provided by the pandas `describe()` function, we can create a visualisation using matplotlib. First, we need to work out where we have nulls (NaN values). We can do this by creating a second dataframe and calling `.notnull()` on our well dataframe.  
  
As this returns a boolean (True or False) for each depth, we need to multiply by 1 to convert the values from True and False to 1 and 0 respectively.

In [None]:
# Remove zero values that spoil the plot

well = well.drop(well[well['CALI']==0].index)
well = well.drop(well[well['GR']==0].index)
well = well.drop(well[well['SP']==0].index)
well = well.drop(well[well['ILD']==0].index)
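
The four `drop` calls above could also be expressed as a single boolean mask. This is a sketch on a small hypothetical dataframe (`demo`, with made-up values and only two curves), not a change to the workflow above:

```python
import pandas as pd

# Hypothetical data: one zero in CALI and one zero in GR
demo = pd.DataFrame(
    {"CALI": [8.5, 0.0, 8.7, 8.8], "GR": [45.0, 50.0, 0.0, 60.0]},
    index=[100.0, 100.5, 101.0, 101.5],
)
curves = ["CALI", "GR"]

# True for rows where none of the listed curves is exactly zero
keep = (demo[curves] != 0).all(axis=1)
cleaned = demo[keep]

print(cleaned)
```

As with the `drop` version, rows containing NaN are kept, because NaN does not compare equal to zero.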


In [None]:
well_nan = well.notnull() * 1

In [None]:
well_nan.head()
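
The effect of `.notnull() * 1` is easiest to see on a tiny example. The dataframe below is hypothetical, with made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one gap in each curve
demo = pd.DataFrame(
    {"GR": [45.0, np.nan, 60.0], "CALI": [np.nan, 8.6, 8.7]},
    index=[100.0, 100.5, 101.0],
)

# True/False per sample, converted to 1/0 by the multiplication
flags = demo.notnull() * 1

# Percentage of samples that hold data in each curve
coverage = flags.mean() * 100

print(flags)
print(coverage)
```

`demo.notnull().astype(int)` gives the same result and may read more explicitly.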

We can now create a summary plot of the missing data

In [None]:
# Create an empty figure; the axes are added with subplot2grid below
fig = plt.figure(figsize=(7,10))

#Set up the plot axes
ax1 = plt.subplot2grid((1,7), (0,0), rowspan=1, colspan = 1) 
ax2 = plt.subplot2grid((1,7), (0,1), rowspan=1, colspan = 1)
ax3 = plt.subplot2grid((1,7), (0,2), rowspan=1, colspan = 1)
ax4 = plt.subplot2grid((1,7), (0,3), rowspan=1, colspan = 1)
ax5 = plt.subplot2grid((1,7), (0,4), rowspan=1, colspan = 1)
ax6 = plt.subplot2grid((1,7), (0,5), rowspan=1, colspan = 1)
ax7 = plt.subplot2grid((1,7), (0,6), rowspan=1, colspan = 1)

columns = well_nan.columns
axes = [ax1, ax2, ax3, ax4, ax5, ax6, ax7]

for i, ax in enumerate(axes):
    ax.plot(well_nan.iloc[:,i], well_nan.index, lw=0)
    ax.set_ylim(3000, 0)
    ax.set_xlim(0, 1)
    ax.set_title(columns[i])
    ax.set_facecolor('whitesmoke')
    ax.fill_betweenx(well_nan.index, 0, well_nan.iloc[:,i], facecolor='red')
    # Remove tick labels from each subplot
    if i > 0:
        plt.setp(ax.get_yticklabels(), visible = False)
    plt.setp(ax.get_xticklabels(), visible = False)

ax1.set_ylabel('Depth', fontsize=14)

plt.subplots_adjust(wspace=0)
plt.show()
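
A more compact way to build the same kind of plot is to let `plt.subplots` create one shared-depth axis per curve, so the number of tracks follows the number of columns. This is a sketch on hypothetical data (`demo` and its values are made up). The `Agg` backend line is only there so the sketch runs without a display and should be omitted inside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit inside a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data with gaps in each curve
demo = pd.DataFrame(
    {
        "GR": [45.0, np.nan, 60.0, 62.0],
        "CALI": [np.nan, 8.6, 8.7, 8.8],
        "SP": [-20.0, -22.0, np.nan, np.nan],
    },
    index=[100.0, 100.5, 101.0, 101.5],
)
flags = demo.notnull() * 1

# One subplot per curve, all sharing the depth axis
fig, axes = plt.subplots(nrows=1, ncols=len(flags.columns),
                         sharey=True, figsize=(7, 10))

for ax, name in zip(axes, flags.columns):
    # Red where data is present, background colour where it is missing
    ax.fill_betweenx(flags.index, 0, flags[name], facecolor="red")
    ax.set_xlim(0, 1)
    ax.set_title(name)
    ax.set_facecolor("whitesmoke")
    ax.set_xticks([])

# Invert the shared depth axis so depth increases downwards
axes[0].set_ylim(flags.index.max(), flags.index.min())
axes[0].set_ylabel("Depth", fontsize=14)
fig.subplots_adjust(wspace=0)
```

With `sharey=True`, matplotlib hides the depth tick labels on the inner tracks, so the `plt.setp` calls are not needed in this variant.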

## Plotting Log Data
Finally, we can plot our data using the code below. Essentially, the code is building up a series of subplots and plotting the data on the relevant tracks.  
  
When we add curves to the tracks, we need to set the curve's properties, including the limits, colour and labels. We can also specify the shading between curves. An example has been added to the caliper curve to show shading between a constant bit size value (8.0" in the code below) and the CALI curve.  
  
If there are a number of features that are common between the plots, we can iterate over them using a for loop.

In [None]:
# Create an empty figure; the track axes are added with subplot2grid below
fig = plt.figure(figsize=(15,10))

# Create a dictionary of formations with a top and bottom depth
formations = {"A":[470, 900], 
              "B": [900, 1370],
              "C": [1370, 1553],
              "D": [1553, 1668],
              "E": [1668, 1778]}

# Select the same number of colours as there are formations
zone_colours = ["red", "blue", "green", "orange", "purple"]

#Set up the plot axes
ax1 = plt.subplot2grid((1,5), (0,0), rowspan=1, colspan = 1)
ax2 = plt.subplot2grid((1,5), (0,1), rowspan=1, colspan = 1, sharey = ax1)
ax4 = plt.subplot2grid((1,5), (0,2), rowspan=1, colspan = 1, sharey = ax1)
ax6 = plt.subplot2grid((1,5), (0,3), rowspan=1, colspan = 1, sharey = ax1)
ax7 = ax2.twiny()

# As our curve scales will be detached from the top of the track,
# this code adds the top border back in without dealing with spines
ax10 = ax1.twiny()
ax10.xaxis.set_visible(False)
ax11 = ax2.twiny()
ax11.xaxis.set_visible(False)
ax13 = ax4.twiny()
ax13.xaxis.set_visible(False)
ax14 = ax6.twiny()
ax14.xaxis.set_visible(False)

# Gamma Ray track
ax1.plot(well["GR"], well.index, color = "green", linewidth = 0.5)
ax1.set_xlabel("Gamma")
ax1.xaxis.label.set_color("green")
ax1.set_xlim(0, 200)
ax1.set_ylabel("Depth (m)")
ax1.tick_params(axis='x', colors="green")
ax1.spines["top"].set_edgecolor("green")
ax1.title.set_color('green')
ax1.set_xticks([0, 50, 100, 150, 200])
ax1.fill_betweenx(well.index, well["GR"], 200, facecolor='yellow')

# Resistivity track
ax2.plot(well["ILD"], well.index, color = "red", linewidth = 0.5)
ax2.set_xlabel("Resistivity - Deep")
# Switch to a log scale first so the custom ticks set below are not replaced
ax2.set_xscale("log")
ax2.set_xlim(0.2, 3000)
ax2.xaxis.label.set_color("red")
ax2.tick_params(axis='x', colors="red")
ax2.spines["top"].set_edgecolor("red")
ax2.set_xticks([0.1, 1, 10, 100, 1000])

# Spontaneous potential (SP) track
ax4.plot(well["SP"], well.index, color = "purple", linewidth = 0.5)
ax4.set_xlabel("SP")
ax4.set_xlim(-80, 60)
ax4.xaxis.label.set_color("purple")
ax4.tick_params(axis='x', colors="purple")
ax4.spines["top"].set_edgecolor("purple")

# Caliper track
ax6.plot(well["CALI"], well.index, color = "black", linewidth = 0.5)
ax6.set_xlabel("Caliper")
ax6.set_xlim(6, 16)
ax6.xaxis.label.set_color("black")
ax6.tick_params(axis='x', colors="black")
ax6.spines["top"].set_edgecolor("black")
ax6.fill_betweenx(well_nan.index, 8.0, well["CALI"], facecolor='yellow')
ax6.set_xticks([6,  11, 16])



# Common functions for setting up the plot can be extracted into
# a for loop. This saves repeating code.
for ax in [ax1, ax2, ax4, ax6]:
    ax.set_ylim(2400, 500)
    ax.grid(which='major', color='lightgrey', linestyle='-')
    ax.xaxis.set_ticks_position("top")
    ax.xaxis.set_label_position("top")
    ax.spines["top"].set_position(("axes", 1.02))
    
    # loop through the formations dictionary and zone colours
    for depth, colour in zip(formations.values(), zone_colours):
        # use the depths and colours to shade across the subplots
        ax.axhspan(depth[0], depth[1], color=colour, alpha=0.1)
    
    
    
for ax in [ax2, ax4, ax6]:
    plt.setp(ax.get_yticklabels(), visible = False)
    
plt.tight_layout()
fig.subplots_adjust(wspace = 0.15)
plt.show()

In [None]:
well['LITH'].describe()

In [None]:
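# Object dtype lets the column hold formation names;
# 0 marks depths outside the intervals assigned below
well['FM'] = pd.Series(0, index=well.index, dtype=object)

# Use .loc with a boolean mask to avoid chained assignment
well.loc[(well.index > 470) & (well.index <= 900), 'FM'] = 'TACUAREMBO'
well.loc[(well.index > 900) & (well.index <= 1370), 'FM'] = 'BUENA VISTA'
well.head()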
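
When there are more than a couple of formations, the interval labelling can be done in one step with `pd.cut`. The sketch below reuses the `formations` dictionary from the plotting cell and a hypothetical dataframe `demo` with made-up values. It assumes the formations are listed in depth order and are contiguous, so that each base is the next top.

```python
import pandas as pd

# Formation tops and bases, as defined in the plotting cell
formations = {"A": [470, 900],
              "B": [900, 1370],
              "C": [1370, 1553],
              "D": [1553, 1668],
              "E": [1668, 1778]}

# Hypothetical depth-indexed data
demo = pd.DataFrame(
    {"GR": [40.0, 55.0, 70.0, 65.0, 80.0]},
    index=[100.0, 470.5, 900.0, 900.5, 1700.0],
)

# Bin edges: the first top followed by every base
intervals = list(formations.values())
edges = [intervals[0][0]] + [base for _, base in intervals]

# Intervals are open at the top and closed at the base, e.g. (470, 900],
# which matches the "> top" and "<= base" masks used above
demo["FM"] = pd.cut(demo.index.to_numpy(), bins=edges, labels=list(formations))

print(demo)
```

Depths outside all intervals, such as 100.0 here, are left as NaN rather than 0.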

In [None]:
well.to_excel('well.xlsx')

In [None]:
(9.8124+11.1527)/2