# Basic objects

A `striplog` depends on a hierarchy of objects. This notebook shows the objects and their basic functionality.

- [Lexicon](#Lexicon): A dictionary containing the words and word categories to use for rock descriptions.
- [Component](#Component): A set of attributes.
- [Position](#Position): A point in the earth, such as a top or a base, with optional uncertainty.
- [Interval](#Interval): One element of a Striplog, consisting of a top, a base, a description, one or more Components, and a source.

Striplogs (a set of `Interval`s) are described in [a separate notebook](Striplog_object.ipynb).

Decors and Legends are also described in [another notebook](Display_objects.ipynb).

In [43]:
import striplog
striplog.__version__

# If you get a lot of warnings here, just run it again.

'0.8.3'

<hr />

## Lexicon

In [44]:
from striplog import Lexicon, Striplog
print(Lexicon.__doc__)

In [None]:
lexicon = Lexicon.default()
lexicon

In [None]:
lexicon.synonyms

Most of the lexicon works 'behind the scenes' when processing descriptions into `Component` objects.

In [None]:
lexicon.find_synonym('Halite')

In [45]:
s = "grysh gn ss w/ sp gy sh"
lexicon.expand_abbreviations(s)

'greyish green sandstone with spotty gray shale'
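To give a feel for what abbreviation expansion involves, here is a minimal standalone sketch. It does not use striplog; the `ABBREVIATIONS` table and the `expand` function are hypothetical names invented for illustration, and the real lexicon may well work differently (for example with regular expressions).

```python
# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {
    "grysh": "greyish",
    "gn": "green",
    "ss": "sandstone",
    "w/": "with",
    "sh": "shale",
}


def expand(text, abbreviations=ABBREVIATIONS):
    """Swap every whitespace-delimited token that is a known abbreviation."""
    return " ".join(abbreviations.get(token.lower(), token) for token in text.split())


expanded = expand("grysh gn ss w/ sh")
print(expanded)
```

Tokens that are not in the table pass through unchanged, so ordinary words survive the expansion.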

<hr />

## Component

A set of attributes. All are optional.

In [46]:
from striplog import Component

In [47]:
print(Component.__doc__)


    Initialize with a dictionary of properties. You can use any
    properties you want e.g.:

        - lithology: a simple one-word rock type
        - colour, e.g. 'grey'
        - grainsize or range, e.g. 'vf-f'
        - modifier, e.g. 'rippled'
        - quantity, e.g. '35%', or 'stringers'
        - description, e.g. from cuttings
    


We define a new rock with a Python `dict` object:

In [48]:
r = {'colour': 'grey',
     'grainsize': 'vf-f',
     'lithology': 'sand'}
rock = Component(r)
rock

| Attribute | Value |
|---|---|
| colour | grey |
| grainsize | vf-f |
| lithology | sand |


The component has a colour:

In [49]:
rock['colour']

'grey'

And it has a summary, which is generated from its attributes. 

In [50]:
rock.summary()

'Grey, vf-f, sand'

We can format the summary if we wish:

In [51]:
rock.summary(fmt="My rock: {lithology} ({colour}, {grainsize!u})")

'My rock: sand (grey, VF-F)'

The formatting supports the usual `s`, `r`, and `a`: 

* `s`: `str`
* `r`: `repr`
* `a`: `ascii`

Also some string functions:

* `u`: `str.upper`
* `l`: `str.lower`
* `c`: `str.capitalize`
* `t`: `str.title`

And some numerical ones, for arrays of numbers:

* `+` or `∑`: `np.sum`
* `m` or `µ`: `np.mean`
* `v`: `np.var`
* `d`: `np.std`
* `x`: `np.prod` (the older alias `np.product` was removed in NumPy 2.0)
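Custom conversion codes like these can be added to Python's formatting machinery by subclassing `string.Formatter` and overriding `convert_field`. The sketch below is one way to do it, not necessarily how striplog does it; `SummaryFormatter` is a hypothetical name, and `statistics.mean` stands in for `np.mean` so that the example needs only the standard library.

```python
import statistics
import string


class SummaryFormatter(string.Formatter):
    """Formatter with a few extra conversion codes (hypothetical sketch)."""

    # Codes applied to the string form of the value.
    string_codes = {
        "u": str.upper,
        "l": str.lower,
        "c": str.capitalize,
        "t": str.title,
    }
    # Codes applied to a sequence of numbers.
    numeric_codes = {"m": statistics.mean}

    def convert_field(self, value, conversion):
        if conversion in self.string_codes:
            return self.string_codes[conversion](str(value))
        if conversion in self.numeric_codes:
            return self.numeric_codes[conversion](value)
        # Fall back to the built-in codes: s, r, a, or no conversion.
        return super().convert_field(value, conversion)


formatter = SummaryFormatter()
rock = {"lithology": "sand", "colour": "grey", "grainsize": "vf-f"}
summary = formatter.format("My rock: {lithology} ({colour}, {grainsize!u})", **rock)
porosity = formatter.format("Mean porosity: {phi!m:.1%}", phi=[0.20, 0.22])
print(summary)
print(porosity)
```

Because the conversion runs before the format spec is applied, a code such as `m` can be combined with an ordinary spec like `:.1%`.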

In [52]:
x = {'colour': ['Grey', 'Brown'],
     'bogosity': [0.45, 0.51, 0.66],
     'porosity': [0.2003, 0.1998, 0.2112, 0.2013, 0.1990],
     'grainsize': 'VF-F',
     'lithology': 'Sand',
     }

X = Component(x)

# This is not working at the moment.
#fmt  = 'The {colour[0]!u} {lithology!u} has a total of {bogosity!∑:.2f} bogons'
#fmt += 'and a mean porosity of {porosity!µ:2.0%}.'

fmt  = 'The {lithology!u} is {colour[0]!u}.'

X.summary(fmt)

'The SAND is _.'

In [53]:
X.json()

'{"grainsize": "VF-F", "lithology": "Sand"}'

We can compare components with the usual `==` operator. As the result below shows, the comparison ignores letter case:

In [54]:
rock2 = Component({'grainsize': 'VF-F',
                   'colour': 'Grey',
                   'lithology': 'Sand'})
rock == rock2

True
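In the cell above, `rock` and `rock2` differ only in letter case and still compare equal. A case-insensitive comparison of two property dictionaries could be sketched as follows; `normalise` and `same_component` are hypothetical helpers, not part of striplog.

```python
def normalise(properties):
    """Lower-case every key and every value so that case is ignored."""
    return {str(key).lower(): str(value).lower() for key, value in properties.items()}


def same_component(a, b):
    """True if the two property dictionaries match, ignoring case."""
    return normalise(a) == normalise(b)


r1 = {"colour": "grey", "grainsize": "vf-f", "lithology": "sand"}
r2 = {"grainsize": "VF-F", "colour": "Grey", "lithology": "Sand"}
print(same_component(r1, r2))
```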

In [55]:
rock

| Attribute | Value |
|---|---|
| colour | grey |
| grainsize | vf-f |
| lithology | sand |


In order to create a Component object from text, we need a lexicon to compare the text against. The lexicon describes the language we want to extract, and what it means.

In [56]:
rock3 = Component.from_text('Grey fine sandstone.', lexicon)
rock3

| Attribute | Value |
|---|---|
| lithology | sandstone |
| grainsize | fine |
| colour | grey |


Components support double-star-unpacking:

In [57]:
"My rock: {lithology} ({colour}, {grainsize})".format(**rock3)

'My rock: sandstone (grey, fine)'
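Double-star unpacking works for any object that behaves like a mapping: Python only needs a `keys()` method and `__getitem__`. The class below is a hypothetical stand-in to show the mechanism, not striplog's `Component`.

```python
class Attributes:
    """Minimal mapping-like object (hypothetical stand-in for a component)."""

    def __init__(self, properties):
        self._properties = dict(properties)

    def keys(self):
        return self._properties.keys()

    def __getitem__(self, key):
        return self._properties[key]


attrs = Attributes({"lithology": "sandstone", "colour": "grey", "grainsize": "fine"})
text = "My rock: {lithology} ({colour}, {grainsize})".format(**attrs)
print(text)
```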

<hr />

## Position

Positions define points in the earth, like a top, but with uncertainty. You can define:

* `upper` — the highest possible location
* `middle` — the most likely location
* `lower` — the lowest possible location
* `units` — the units of measurement
* `x` and `y` — the _x_ and _y_ location (these don't have uncertainty, sorry)
* `meta` — a Python dictionary containing anything you want

Positions don't have a 'way up'. 

In [58]:
from striplog import Position
print(Position.__doc__)


    Used to represent a position: a top or base.

    Not sure whether to go with upper-middle-lower or z_max, z_mid, z_min.
    Sticking to upper and lower, because ordering in Intervals is already
    based on 'above' and 'below'.
    


In [59]:
params = {'upper': 95,
          'middle': 100,
          'lower': 110,
          'meta': {'kind': 'erosive', 'source': 'DOE'}
          }

p = Position(**params)
p

| Attribute | Value |
|---|---|
| upper | 95.0 |
| middle | 100.0 |
| lower | 110.0 |


Even if you don't give a `middle`, you can always get `z`: the central, most likely position:

In [60]:
params = {'upper': 75, 'lower': 85}
p = Position(**params)
p

| Attribute | Value |
|---|---|
| upper | 75.0 |
| middle | |
| lower | 85.0 |


In [61]:
p.z

80.0
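The value 80.0 is the midpoint of `upper` and `lower`. A plausible rule, sketched here as an assumption rather than as striplog's actual code, is to return `middle` when it is given and otherwise fall back to that midpoint. `SimplePosition` is a hypothetical class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SimplePosition:
    """Hypothetical position with optional uncertainty bounds."""

    upper: Optional[float] = None
    middle: Optional[float] = None
    lower: Optional[float] = None

    @property
    def z(self):
        # Prefer the explicit most-likely value when there is one.
        if self.middle is not None:
            return float(self.middle)
        # Assumption: otherwise use the midpoint of the two bounds.
        return (self.upper + self.lower) / 2


z_without_middle = SimplePosition(upper=75, lower=85).z
z_with_middle = SimplePosition(upper=95, middle=100, lower=110).z
print(z_without_middle, z_with_middle)
```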

<hr />

## Interval

Intervals are where it gets interesting. An interval can have:

* a top
* a base
* a description (in natural language)
* a list of `Component`s

Intervals don't have an explicit 'way up'; it is implied by the order of `top` and `base`.

In [62]:
from striplog import Interval
print(Interval.__doc__)


    Used to represent a lithologic or stratigraphic interval, or single point,
    such as a sample location.

    Initialize with a top (and optional base) and a description and/or
    an ordered list of components.

    Args:
        top (float): Required top depth. Required.
        base (float): Base depth. Optional.
        description (str): Textual description.
        lexicon (dict): A lexicon. See documentation. Optional unless you only
            provide descriptions, because it's needed to extract components.
        max_component (int): The number of components to extract. Default 1.
        abbreviations (bool): Whether to parse for abbreviations.

    TODO:
        Seems like I should be able to instantiate like this:

            ``Interval({'top': 0, 'components':[Component({'age': 'Neogene'})``s

        I can get around it for now like this:

            ``Interval(**{'top': 0, 'components':[Component({'age': 'Neogene'})``

        Question: should Interval itself c

I might make an `Interval` explicitly from a Component...

In [63]:
Interval(10, 20, components=[rock])

| Attribute | Value |
|---|---|
| top | 10.0 |
| primary | colour: grey; grainsize: vf-f; lithology: sand |
| summary | 10.00 m of grey, vf-f, sand |
| description | |
| data | |
| base | 20.0 |


... or I might pass a description and a `lexicon` and Striplog will parse the description and attempt to extract structured `Component` objects from it.

In [64]:
repr(Interval(20, 40, "Grey sandstone with shale flakes.", lexicon=lexicon))

"Interval({'top': Position({'middle': 20.0, 'units': 'm'}), 'base': Position({'middle': 40.0, 'units': 'm'}), 'description': 'Grey sandstone with shale flakes.', 'data': {}, 'components': [Component({'lithology': 'sandstone', 'colour': 'grey'})]})"

Notice I only got one `Component`, even though the description contains a subordinate lithology. This is the default behaviour; we have to ask for more components with `max_component`:

In [65]:
interval = Interval(20, 40, "Grey sandstone with black shale flakes.", lexicon=lexicon, max_component=2)
print(interval)

{'top': Position({'middle': 20.0, 'units': 'm'}), 'base': Position({'middle': 40.0, 'units': 'm'}), 'description': 'Grey sandstone with black shale flakes.', 'data': {}, 'components': [Component({'lithology': 'sandstone', 'colour': 'grey'}), Component({'lithology': 'shale', 'amount': 'flakes', 'colour': 'black'})]}


`Interval`s have a `primary` attribute, which holds the first component, no matter how many components there are.

In [66]:
interval.primary

| Attribute | Value |
|---|---|
| lithology | sandstone |
| colour | grey |


Ask for the summary to see the thickness and a summary of the components. Note that the format code applies only to the component part of the summary, not to the thickness.

In [67]:
interval.summary(fmt="{colour} {lithology}")

'20.00 m of grey sandstone with black shale'

We can change an interval's properties:

In [68]:
interval.top = 18

In [69]:
interval

| Attribute | Value |
|---|---|
| top | 18.0 |
| primary | lithology: sandstone; colour: grey |
| summary | 22.00 m of sandstone, grey with shale, flakes, black |
| description | Grey sandstone with black shale flakes. |
| data | |
| base | 40.0 |


In [70]:
interval.top

| Attribute | Value |
|---|---|
| upper | 18.0 |
| middle | 18.0 |
| lower | 18.0 |


<hr />

## Comparing and combining intervals

In [71]:
# Depth ordered
i1 = Interval(top=61, base=62.5, components=[Component({'lithology': 'limestone'})])
i2 = Interval(top=62, base=63, components=[Component({'lithology': 'sandstone'})])
i3 = Interval(top=62.5, base=63.5, components=[Component({'lithology': 'siltstone'})])
i4 = Interval(top=63, base=64, components=[Component({'lithology': 'shale'})])
i5 = Interval(top=63.1, base=63.4, components=[Component({'lithology': 'dolomite'})])

# Elevation ordered
i8 = Interval(top=200, base=100, components=[Component({'lithology': 'sandstone'})])
i7 = Interval(top=150, base=50, components=[Component({'lithology': 'limestone'})])
i6 = Interval(top=100, base=0, components=[Component({'lithology': 'siltstone'})])

In [72]:
from striplog import Striplog

s_striplog = Striplog([i1, i2])
# No legend is defined in this notebook, so plot() is called without one.
s_striplog.plot(ladder=True, aspect=5)

In [35]:
i2.order

'depth'
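The order follows from comparing the top with the base: `i2` has its top at 62 and its base at 63, so it is depth ordered. A minimal sketch of that rule is below. `infer_order` is a hypothetical function, and the value returned when top equals base is an assumption.

```python
def infer_order(top, base):
    """Guess the ordering convention from a top and a base (hypothetical)."""
    if top < base:
        return "depth"  # Values increase downwards.
    if top > base:
        return "elevation"  # Values decrease downwards.
    return "none"  # Assumption: a single point has no order.


print(infer_order(62, 63))
print(infer_order(200, 100))
```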

**Technical aside:** The `Interval` class is decorated with `functools.total_ordering`, so providing `__eq__` and one other comparison (such as `__lt__`) in the class definition gives instances of the class a full set of ordering comparisons. So you can use `sorted` on a Striplog, for example.

It wasn't clear to me whether this should compare tops (say), so that '>' might mean 'above', or if it should be keyed on thickness. I chose the former, and implemented other comparisons instead.
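To illustrate the mechanism, here is a small sketch of a class ordered by its top, where `>` means 'above'. It assumes depth-ordered values only (a smaller top is shallower), and `Bed` is a hypothetical class, not striplog's `Interval`.

```python
from functools import total_ordering


@total_ordering
class Bed:
    """Hypothetical depth-ordered interval, compared by its top only."""

    def __init__(self, top, base):
        self.top = top
        self.base = base

    def __eq__(self, other):
        if not isinstance(other, Bed):
            return NotImplemented
        return self.top == other.top

    def __lt__(self, other):
        if not isinstance(other, Bed):
            return NotImplemented
        # 'Less than' means 'below': a deeper (larger) top.
        return self.top > other.top


b1, b4, b5 = Bed(61, 62.5), Bed(63, 64), Bed(63.1, 63.4)
print(b1 > b4)  # total_ordering derives __gt__ from __lt__ and __eq__.
print(min(b1, b4, b5).top)
print([b.top for b in sorted([b4, b1, b5], reverse=True)])
```

With only `__eq__` and `__lt__` written by hand, the decorator supplies `>`, `<=` and `>=`, which is what lets `min` and `sorted` work.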

In [36]:
print(i3 == i2)  # False, they don't have the same top
print(i1 > i4)   # True, i1 is above i4
print(min(i1, i2, i5).summary())  # 0.3 m of dolomite

False
True
0.30 m of dolomite


In [37]:
i2 > i4 > i5  # True

True

We can combine intervals with the `+` operator. (However, you cannot subtract intervals.)

In [38]:
i2 + i3

| Attribute | Value |
|---|---|
| top | 62.0 |
| primary | lithology: sandstone |
| summary | 1.50 m of sandstone with siltstone |
| description | 50.0% 1.00 m of siltstone with 50.0% 1.00 m of sandstone |
| data | |
| base | 63.5 |


Adding a rock adds a (minor) component and adds to the description. 

In [39]:
interval + rock3

| Attribute | Value |
|---|---|
| top | 18.0 |
| primary | colour: grey; lithology: sandstone |
| summary | 22.00 m of grey, sandstone with flakes, black, shale with grey, fine, sandstone |
| description | Grey sandstone with black shale flakes. with Grey, fine, sandstone |
| data | |
| base | 40.0 |


In [40]:
i6.relationship(i7), i5.relationship(i4)

('partially', 'containedby')

In [41]:
print(i1.partially_overlaps(i2))  # True
print(i2.partially_overlaps(i3))  # True
print(i2.partially_overlaps(i4))  # False
print()
print(i6.partially_overlaps(i7))  # True
print(i7.partially_overlaps(i6))  # True
print(i6.partially_overlaps(i8))  # False
print()
print(i5.is_contained_by(i3))  # True
print(i5.is_contained_by(i4))  # True
print(i5.is_contained_by(i2))  # False

True
True
False

True
True
False

True
True
False
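The results above are consistent with simple rules on the extents of the intervals. The sketch below reproduces them with plain `(top, base)` tuples; the function names mirror the methods but the logic is an assumption, not striplog's source. Sorting each pair first lets the same code handle depth-ordered and elevation-ordered intervals.

```python
def span(interval):
    """Return (low, high) regardless of whether top < base or top > base."""
    top, base = interval
    return min(top, base), max(top, base)


def is_contained_by(a, b):
    """True if a lies entirely within b."""
    (a_lo, a_hi), (b_lo, b_hi) = span(a), span(b)
    return b_lo <= a_lo and a_hi <= b_hi


def partially_overlaps(a, b):
    """True if a and b share some thickness but neither contains the other."""
    (a_lo, a_hi), (b_lo, b_hi) = span(a), span(b)
    shared = min(a_hi, b_hi) - max(a_lo, b_lo)
    return shared > 0 and not is_contained_by(a, b) and not is_contained_by(b, a)


# Same tops and bases as i1 to i8 above.
i1, i2, i3, i4, i5 = (61, 62.5), (62, 63), (62.5, 63.5), (63, 64), (63.1, 63.4)
i6, i7, i8 = (100, 0), (150, 50), (200, 100)

print(partially_overlaps(i1, i2), partially_overlaps(i2, i4))
print(partially_overlaps(i6, i7), partially_overlaps(i6, i8))
print(is_contained_by(i5, i4), is_contained_by(i5, i2))
```

Intervals that merely touch, such as `i2` and `i4`, share zero thickness and so do not count as overlapping in this sketch.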


In [42]:
x = i4.merge(i5)
x[-1].base = 65
x

Striplog(3 Intervals, start=63.0, stop=65.0)

In [43]:
i1.intersect(i2, blend=False)

| Attribute | Value |
|---|---|
| top | 62.0 |
| primary | lithology: sandstone |
| summary | 0.50 m of sandstone |
| description | |
| data | |
| base | 62.5 |


In [44]:
i1.intersect(i2)

| Attribute | Value |
|---|---|
| top | 62.0 |
| primary | lithology: limestone |
| summary | 0.50 m of limestone with sandstone |
| description | 60.0% 1.50 m of limestone with 40.0% 1.00 m of sandstone |
| data | |
| base | 62.5 |
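The percentages in the blended description appear to be each parent interval's share of the combined thickness: 1.5 m of limestone and 1.0 m of sandstone give 60% and 40%. A sketch of that arithmetic follows; `blend_description` is a hypothetical helper, and the weighting rule is inferred from the output rather than taken from striplog's source.

```python
def blend_description(parts):
    """Describe a blend from (lithology, thickness in metres) pairs."""
    total = sum(thickness for _, thickness in parts)
    pieces = [
        f"{100 * thickness / total:.1f}% {thickness:.2f} m of {lithology}"
        for lithology, thickness in parts
    ]
    return " with ".join(pieces)


blended = blend_description([("limestone", 1.5), ("sandstone", 1.0)])
print(blended)
```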


In [45]:
i1.union(i3)

| Attribute | Value |
|---|---|
| top | 61.0 |
| primary | lithology: limestone |
| summary | 2.50 m of limestone with siltstone |
| description | 60.0% 1.50 m of limestone with 40.0% 1.00 m of siltstone |
| data | |
| base | 63.5 |


In [46]:
i3.difference(i5)

(Interval({'components': [Component({'lithology': 'siltstone'})], 'top': Position({'middle': 62.5, 'units': 'm'}), 'data': {}, 'description': '', 'base': Position({'middle': 63.1, 'units': 'm'})}),
 Interval({'components': [Component({'lithology': 'siltstone'})], 'top': Position({'middle': 63.4, 'units': 'm'}), 'data': {}, 'description': '', 'base': Position({'middle': 63.5, 'units': 'm'})}))
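Geometrically, `intersect`, `union` and `difference` behave like set operations on the extents of the intervals. The sketch below reproduces the tops and bases seen above using depth-ordered `(top, base)` tuples. It ignores components and descriptions entirely, and the handling of non-overlapping inputs is an assumption.

```python
def intersect(a, b):
    """Shared extent of a and b, or None if they do not overlap."""
    top, base = max(a[0], b[0]), min(a[1], b[1])
    return (top, base) if top < base else None


def union(a, b):
    """Combined extent of two touching or overlapping intervals."""
    if max(a[0], b[0]) > min(a[1], b[1]):
        raise ValueError("Intervals have a gap between them.")
    return (min(a[0], b[0]), max(a[1], b[1]))


def difference(a, b):
    """Pieces of a that remain after removing b."""
    pieces = []
    if a[0] < b[0]:  # Something of a is left above b.
        pieces.append((a[0], min(a[1], b[0])))
    if b[1] < a[1]:  # Something of a is left below b.
        pieces.append((max(a[0], b[1]), a[1]))
    return tuple(pieces)


i1, i2, i3, i5 = (61, 62.5), (62, 63), (62.5, 63.5), (63.1, 63.4)
print(intersect(i1, i2))
print(union(i1, i3))
print(difference(i3, i5))
```

Removing an interval from the middle of another leaves two pieces, which is why `difference` returns a tuple.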

<hr />

<p style="color:gray">©2015 Agile Geoscience. Licensed CC-BY. <a href="https://github.com/agile-geoscience/striplog">striplog.py</a></p>