# Reading and writing LAS files

This notebook goes with [the Agile blog post](https://agilescientific.com/blog/2017/10/23/x-lines-of-python-load-curves-from-las) of 23 October 2017.

Set up a `conda` environment with:

    conda create -n welly python=3.6 matplotlib=2.0 scipy pandas

You'll need `welly` in your environment:

    conda install tqdm  # Should happen automatically but doesn't
    pip install welly
    
This will also install the latest versions of `striplog` and `lasio`.

In [None]:
import welly

In [None]:
ls ../data/*.LAS

### 1. Load the LAS file with `lasio`

In [None]:
import lasio

l = lasio.read('../data/P-129.LAS')  # Line 1.

That's it! But the object itself doesn't tell us much — it's really just a container:

In [None]:
l

### 2. Look at the WELL section of the header

In [None]:
l.header['Well']  # Line 2.

### 3. Look at the curve data

The curves are all present in one big NumPy array:

In [None]:
l.data

Or we can go after a single curve object:

In [None]:
l.curves.GR  # Line 3.

And there's a shortcut to its data:

In [None]:
l['GR']  # Line 4.

...so it's easy to make a plot against depth:

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

plt.figure(figsize=(15,3))
plt.plot(l['DEPT'], l['GR'])
plt.show()

### 4. Inspect the curves as a `pandas` dataframe

In [None]:
l.df().head()  # Line 5.

### 5. Load the LAS file with `welly` 

In [None]:
from welly import Well

w = Well.from_las('../data/P-129.LAS')  # Line 6.

`welly` Wells know how to display some basics:

In [None]:
w

The `Well` object also offers `lasio`-style access to a pandas DataFrame:

In [None]:
w.df().head()

### 6. Look at `welly`'s Curve object

Like the `Well`, a `Curve` object can report a bit about itself:

In [None]:
gr = w.data['GR']  # Line 7.
gr

One important thing about Curves is that each one knows its own depths, which are available as a property called `basis`. (The basis is not actually stored; it is computed on demand from the start depth, the sample interval, which must be constant for the whole curve, and the number of samples in the object.)

In [None]:
gr.basis
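
To make the idea concrete, here is a minimal plain-Python sketch of how a basis could be rebuilt from those three numbers. The function name `compute_basis` is hypothetical and this is not `welly`'s own code; the start and step come from the P-129 header shown in the bonus section, and the sample count is the one quoted for CALI in section 9.

```python
def compute_basis(start, step, n_samples):
    """Return the depth of every sample on a regularly sampled curve."""
    return [start + i * step for i in range(n_samples)]

# STRT and STEP from the P-129 header; 12718 is the CALI sample count.
basis = compute_basis(start=1.0668, step=0.1524, n_samples=12718)

# First depth, last depth and length of the reconstructed basis.
print(len(basis), round(basis[0], 4), round(basis[-1], 4))
```

The last depth should land on the STOP value in the header (1939.1376 m), which is a handy sanity check on the sample interval.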

### 7. Plot part of a curve

We'll grab the interval from 300 m to 1000 m and plot it.

In [None]:
gr.to_basis(start=300, stop=1000).plot()  # Line 8.

### 8. Smooth a curve

Curve objects are, fundamentally, NumPy arrays. But they have some extra tricks. We've already seen `Curve.plot()`. 

Using the `Curve.smooth()` method, we can easily smooth a curve, e.g. by 15 m (passing `samples=True` would smooth by 15 samples instead):

In [None]:
sm = gr.smooth(window_length=15, samples=False)  # Line 9.

sm.plot()
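
To see what a window given in metres means in terms of samples, here is a rough sketch using a simple centred moving average. Both helper names are hypothetical, and `welly` may well use a different filter and different edge handling internally; the only values taken from this notebook are the 15 m window and the 0.1524 m step.

```python
def window_in_samples(window_length_m, step_m):
    """Convert a window length in metres to a whole, odd number of samples."""
    n = max(1, int(round(window_length_m / step_m)))
    return n if n % 2 == 1 else n + 1

def boxcar_smooth(values, n_samples):
    """Centred moving average; the window shrinks near the ends of the log."""
    half = n_samples // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

n = window_in_samples(15, 0.1524)  # 15 m at the 0.1524 m step of P-129
spiky = [50.0] * 200
spiky[100] = 150.0                 # a single-sample spike
smooth = boxcar_smooth(spiky, n)
print(n, max(spiky), round(max(smooth), 2))
```

With this step, a 15 m window covers roughly 99 samples, so a one-sample spike is almost entirely averaged away.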

### 9. Export a set of curves as a matrix

You can get at all the data through the `lasio` object that the well carries around as `w.las`, via `w.las.data`:

In [None]:
print("Data shape: {}".format(w.las.data.shape))

w.las.data

But we might want to do other things, such as specify which curves we want (optionally using aliases like GR1, GRC, NGC, etc. for GR), resample the data, or specify a start and stop depth. `welly` can do all of this with `Well.data_as_matrix()`. This method is also wrapped by `Project.data_as_matrix()`, which is nice because it ensures that all the wells are exported at the same sample interval.
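
The alias idea can be sketched in a few lines of plain Python: look for the requested name first, then fall back through a list of alternative mnemonics. The helper `resolve_mnemonic` and the alias table below are hypothetical illustrations of the concept, not `welly`'s implementation.

```python
def resolve_mnemonic(name, available, alias=None):
    """Return the first curve mnemonic that satisfies `name`, or None."""
    if name in available:
        return name
    # Fall back through the alternatives listed for this name, in order.
    for candidate in (alias or {}).get(name, []):
        if candidate in available:
            return candidate
    return None

alias = {'Gamma': ['GR', 'GR1', 'GRC', 'NGC']}  # hypothetical alias table
found = resolve_mnemonic('Gamma', ['DEPT', 'GRC', 'RHOB'], alias)
print(found)
```

Here the well has no curve called GR, so the lookup settles on GRC, the first alternative that is actually present.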

Here are the curves in this well:

In [None]:
w.data.keys()

In [None]:
keys = ['CALI', 'DT', 'DTS', 'RHOB', 'SP']

In [None]:
w.plot(tracks=['TVD'] + keys)

In [None]:
X, basis = w.data_as_matrix(keys=keys, start=275, stop=1850, step=0.5, return_basis=True)

In [None]:
w.data['CALI'].shape

So CALI had 12,718 points in it... since we downsampled to 0.5 m and removed the top and tail, we should have substantially fewer points:

In [None]:
X.shape
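
As a back-of-envelope check on that shape, the number of rows can be estimated from the start, stop and step passed to `data_as_matrix()`. This is only arithmetic, not a call into `welly`, and the true row count may differ by one depending on whether the stop depth is included.

```python
start, stop, step = 275, 1850, 0.5  # the values passed to data_as_matrix()
original = 12718                    # samples in CALI before resampling

n_expected = round((stop - start) / step)
fraction = n_expected / original
print(n_expected, round(fraction, 2))
```

So we expect roughly 3150 rows, about a quarter of the original samples, with one column for each of the five keys.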

In [None]:
plt.figure(figsize=(15,3))
plt.plot(X.T[0])
plt.show()

### 10+. BONUS: fix the lat, lon

OK, we're definitely going to go over our budget on this one.

Did you notice that the location of the well did not get loaded properly?

In [None]:
w.location

Let's look at some of the header:

    # LAS format log file from PETREL
    # Project units are specified as depth units
    #==================================================================
    ~Version information
    VERS.   2.0:
    WRAP.   YES:
    #==================================================================
    ~WELL INFORMATION
    #MNEM.UNIT      DATA             DESCRIPTION
    #---- ------ --------------   -----------------------------
    STRT .M      1.0668          :START DEPTH     
    STOP .M      1939.13760      :STOP DEPTH     
    STEP .M       0.15240        :STEP        
    NULL .          -999.25      :NULL VALUE
    COMP .        Elmworth Energy Corporation              :COMPANY
    WELL .        Kennetcook #2                            :WELL
    FLD  .        Windsor Block                            :FIELD
    LOC  .        Lat = 45* 12' 34.237" N                  :LOCATION
    PROV .        Nova Scotia                              :PROVINCE
      UWI.        Long = 63* 45'24.460  W                  :UNIQUE WELL ID
    LIC  .        P-129                                    :LICENSE NUMBER
    CTRY .        CA                                       :COUNTRY (WWW code)
     DATE.        10-Oct-2007                              :LOG DATE {DD-MMM-YYYY}
    SRVC .        Schlumberger                             :SERVICE COMPANY
    LATI .DEG                                              :LATITUDE
    LONG .DEG                                              :LONGITUDE
    GDAT .                                                 :GeoDetic Datum
    SECT .        45.20 Deg N                              :Section
    RANG .        PD 176                                   :Range
    TOWN .        63.75 Deg W                              :Township

Look at **LOC** and **UWI**. There are two problems:

1. These items are in the wrong place. (Notice **LATI** and **LONG** are empty.)
2. The items are malformed, with lots of extraneous characters.

We can fix this in two steps:

1. Remap the header items to fix the first problem.
2. Parse the items to fix the second one.

We'll define these in reverse order, because the remapping step relies on the transform function.

In [None]:
import re

def transform_ll(text):
    """
    Parses malformed lat and lon so they load properly.
    """
    def callback(match):
        d = match.group(1).strip()
        m = match.group(2).strip()
        s = match.group(3).strip()
        c = match.group(4).strip()
        if c.lower() in ('w', 's') and d[0] != '-':
            d = '-' + d
        return ' '.join([d, m, s])
    pattern = re.compile(r""".+?([-0-9]+?).? ?([0-9]+?).? ?([\.0-9]+?).? +?([NESW])""", re.I)
    text = pattern.sub(callback, text)
    return welly.utils.dms2dd([float(i) for i in text.split()])

Make sure that works!

In [None]:
print(transform_ll("""Lat = 45* 12' 34.237" N"""))
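
For an independent cross-check of the numbers, the degrees, minutes and seconds can be converted by hand. The helper `dms_to_decimal` below is a hypothetical stand-alone sketch, not part of `welly`; it sums the magnitudes first and applies the sign last, so a western or southern hemisphere simply flips the result.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere='N'):
    """Convert degrees, minutes, seconds to signed decimal degrees."""
    value = abs(degrees) + minutes / 60 + seconds / 3600
    # South and west are negative by convention.
    return -value if hemisphere.upper() in ('S', 'W') else value

lat = dms_to_decimal(45, 12, 34.237, 'N')  # from the LOC item
lon = dms_to_decimal(63, 45, 24.460, 'W')  # from the UWI item
print(round(lat, 4), round(lon, 4))
```

These come out near 45.2095 and -63.7568, in line with the rough values sitting in the SECT and TOWN items. It is worth comparing the longitude with what `transform_ll` returns for the UWI string, since negative degrees are easy to get wrong.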

In [None]:
remap = {
    'LATI': 'LOC',  # Use LOC for the parameter LATI.
    'LONG': 'UWI',  # Use UWI for the parameter LONG.
    'LOC':  None,   # Use nothing for the parameter LOC.
    'SECT': None,   # Use nothing for the parameter SECT.
    'RANG': None,   # Use nothing for the parameter RANG.
    'TOWN': None,   # Use nothing for the parameter TOWN.
}

funcs = {
    'LATI': transform_ll,  # Pass LATI through this function before loading.
    'LONG': transform_ll,  # Pass LONG through it too.
    'UWI': lambda x: "No UWI, fix this!"
}
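
To illustrate what the two dictionaries are meant to do, here is a toy version operating on a plain dict. It follows the behaviour described in the comments above (read each item from its remapped source, then pass it through its function) and is an assumption about the intent, not `welly`'s actual loading code; `apply_remap_and_funcs` and the sample values are hypothetical.

```python
def apply_remap_and_funcs(header, remap, funcs):
    """Toy version of the remap-then-transform idea on a plain dict."""
    result = {}
    for key in header:
        source = remap.get(key, key)  # which item to read this one from
        value = header.get(source) if source is not None else None
        if key in funcs and value is not None:
            value = funcs[key](value)  # transform before 'loading'
        result[key] = value
    return result

toy_header = {'LOC': 'Lat = 45', 'UWI': 'Long = 63', 'LATI': '', 'LONG': ''}
toy_remap = {'LATI': 'LOC', 'LONG': 'UWI', 'LOC': None}
toy_funcs = {'LATI': str.upper, 'UWI': lambda x: 'No UWI, fix this!'}

fixed = apply_remap_and_funcs(toy_header, toy_remap, toy_funcs)
print(fixed)
```

In this toy, LATI picks up the text that was sitting in LOC and is transformed, LOC is emptied, and UWI is replaced with a placeholder.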

In [None]:
w = Well.from_las('../data/P-129.LAS', remap=remap, funcs=funcs)

In [None]:
w.location.latitude, w.location.longitude

In [None]:
w.uwi

Let's just hope the mess is the same mess in every well. (LOL, no-one's that lucky.)

<hr>

**&copy; 2017 [agilescientific.com](https://www.agilescientific.com/) and licensed [CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/)**