# Basic objects

A `striplog` depends on a hierarchy of objects. This notebook shows the objects and their basic functionality.

- [Lexicon](#Lexicon): A dictionary containing the words and word categories to use for rock descriptions.
- [Component](#Component): A set of attributes. 
- [Interval](#Interval): One element of a Striplog. It consists of a top, a base, a description, one or more Components, and a source.

Striplogs (a set of `Interval`s) are described in [a separate notebook](Striplog_object.ipynb).

Decors and Legends are also described in [another notebook](Display_objects.ipynb).

In [10]:
import striplog
striplog.__version__

'0.5.6'

<hr />

## Lexicon

In [11]:
from striplog import Lexicon
print(Lexicon.__doc__)


    A Lexicon is a dictionary of 'types' and regex patterns.

    Most commonly you will just load the default one.

    Args:
        params (dict): The dictionary to use. For an example, refer to the
            default lexicon in ``defaults.py``.
    


In [12]:
lexicon = Lexicon.default()
lexicon

{'lithology': ['overburden', 'sandstone', 'siltstone', 'shale', 'mudstone', 'limestone', 'dolomite', 'salt', 'halite', 'anhydrite', 'gypsum', 'sylvite', 'clay', 'mud', 'silt', 'sand', 'gravel', 'boulders'], 'grainsize': ['vf(?:-)?', 'f(?:-)?', 'm(?:-)?', 'c(?:-)?', 'vc', 'very fine(?: to)?', 'fine(?: to)?', 'medium(?: to)?', 'coarse(?: to)?', 'very coarse', 'v fine(?: to)?', 'med(?: to)?', 'med.(?: to)?', 'v coarse', 'grains?', 'granules?', 'pebbles?', 'cobbles?', 'boulders?'], 'colour': ['red(?:dish)?', 'gray(?:ish)?', 'grey(?:ish)?', 'black(?:ish)?', 'whit(?:e|ish)', 'blu(?:e|ish)', 'purpl(?:e|ish)', 'yellow(?:ish)?', 'green(?:ish)?', 'brown(?:ish)?', 'light', 'dark', 'sandy'], 'abbreviations': {'Rpl': 'ripple', 'ox': 'oxidized', 'Macrofos': 'macrofossil', 'or': 'orangish', 'op': 'open', 'mrl': 'marly', 'Peld': 'pelletoid', 'hornbd': 'hornblend', 'Deer': 'decrease', 'euhed': 'euhedral', 'oo': 'ooidal', 'mtx': 'matrix', 'prly': 'pearly', 'magn': 'magnetic', 'Scs': 'scarce', 'sa-c': 's

In [13]:
lexicon.synonyms

{'Anhydrite': ['Gypsum'],
 'Overburden': ['Drift'],
 'Salt': ['Halite', 'Sylvite']}

Most of the lexicon works 'behind the scenes' when processing descriptions into `Component` objects.

In [14]:
lexicon.find_synonym('Halite')

'salt'
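To make the lookup concrete, here is a minimal sketch of how a synonym search over a mapping like `lexicon.synonyms` could work. It is not striplog's implementation: the function below is a stand-alone illustration, and returning unmatched words unchanged is an assumption.

```python
# Stand-in for the mapping printed by `lexicon.synonyms` above.
SYNONYMS = {
    'Anhydrite': ['Gypsum'],
    'Overburden': ['Drift'],
    'Salt': ['Halite', 'Sylvite'],
}

def find_synonym(word, synonyms=SYNONYMS):
    """Return the preferred term (lower-cased) for `word`.

    Assumption: a word with no entry is returned unchanged.
    """
    # Invert the mapping: each synonym points back to its preferred term.
    reverse = {syn.lower(): preferred.lower()
               for preferred, syns in synonyms.items()
               for syn in syns}
    return reverse.get(word.lower(), word)

result = find_synonym('Halite')
```

With the mapping above, `find_synonym('Halite')` gives `'salt'`, matching the notebook output.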

In [15]:
s = "grysh gn ss w/ sp gy sh"
lexicon.expand_abbreviations(s)

'greyish green sandstone with spotty gray shale'
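The expansion above can be pictured as a token-by-token dictionary lookup. The sketch below is only an illustration of that idea, not the library's code; the `ABBREVIATIONS` entries are hypothetical, inferred from the input and output of the cell above rather than read from the default lexicon.

```python
# Hypothetical abbreviation table, inferred from the example above.
ABBREVIATIONS = {
    'grysh': 'greyish',
    'gn': 'green',
    'ss': 'sandstone',
    'w/': 'with',
    'sp': 'spotty',
    'gy': 'gray',
    'sh': 'shale',
}

def expand_abbreviations(text, abbreviations=ABBREVIATIONS):
    """Replace each whitespace-separated token that has an entry."""
    return ' '.join(abbreviations.get(token, token) for token in text.split())

expanded = expand_abbreviations("grysh gn ss w/ sp gy sh")
```

Tokens without an entry pass through untouched, so fully spelled-out descriptions are unaffected.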

<hr />

## Component

A set of attributes. All are optional.

In [16]:
from striplog import Component

In [17]:
print(Component.__doc__)


    Initialize with a dictionary of properties. You can use any
    properties you want e.g.:

        - lithology: a simple one-word rock type
        - colour, e.g. 'grey'
        - grainsize or range, e.g. 'vf-f'
        - modifier, e.g. 'rippled'
        - quantity, e.g. '35%', or 'stringers'
        - description, e.g. from cuttings

    You can include as many other things as you want, e.g.

        - porosity
        - cementation
        - lithology code
    


We define a new rock with a Python `dict` object:

In [18]:
r = {'colour': 'grey',
     'grainsize': 'vf-f',
     'lithology': 'sand'}
rock = Component(r)
rock

0,1
colour,grey
lithology,sand
grainsize,vf-f


The rock has a colour:

In [19]:
rock.colour

'grey'

And it has a summary, which is generated from its attributes. 

In [20]:
rock.summary()

'Grey, sand, vf-f'

We can format the summary if we wish:

In [21]:
rock.summary(fmt="My rock: {lithology} ({colour}, {GRAINSIZE})")

'My rock: sand (grey, VF-F)'
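Note that the upper-case field name `{GRAINSIZE}` produced upper-case text. One way to get that behaviour with the standard library is to subclass `string.Formatter`. This is a sketch of the idea only; `CaseFormatter` is a hypothetical name, and striplog may implement its formatting differently.

```python
from string import Formatter

class CaseFormatter(Formatter):
    """Look up fields case-insensitively; upper-case keys give upper-case text."""

    def get_value(self, key, args, kwargs):
        if isinstance(key, int):
            # Positional fields behave as usual.
            return args[key]
        value = str(kwargs[key.lower()])
        return value.upper() if key.isupper() else value

props = {'colour': 'grey', 'grainsize': 'vf-f', 'lithology': 'sand'}
fmt = "My rock: {lithology} ({colour}, {GRAINSIZE})"
text = CaseFormatter().format(fmt, **props)
```

Here `text` is `'My rock: sand (grey, VF-F)'`, the same string as the notebook output.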

We can compare rocks with the usual `==` operator: 

In [22]:
rock2 = Component({'grainsize': 'VF-F',
              'colour': 'Grey',
              'lithology': 'Sand'})
rock == rock2

True
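The comparison above is `True` even though the two dictionaries differ in letter case and key order. A minimal way to get that behaviour is to normalise case before comparing, as in the sketch below. `SimpleComponent` is a hypothetical stand-in class, and normalising at construction time is an assumption about the approach, not a description of striplog's internals.

```python
class SimpleComponent:
    """Toy component whose equality ignores case and key order."""

    def __init__(self, properties):
        # Normalise keys and values to lower case once, up front.
        self.properties = {str(k).lower(): str(v).lower()
                           for k, v in properties.items()}

    def __eq__(self, other):
        if not isinstance(other, SimpleComponent):
            return NotImplemented
        return self.properties == other.properties

a = SimpleComponent({'colour': 'grey', 'grainsize': 'vf-f', 'lithology': 'sand'})
b = SimpleComponent({'grainsize': 'VF-F', 'colour': 'Grey', 'lithology': 'Sand'})
same = (a == b)
```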

In order to create a Component object from text, we need a lexicon to compare the text against. The lexicon describes the language we want to extract, and what it means.

In [23]:
rock3 = Component.from_text('Grey fine sandstone.', lexicon)
rock3

0,1
colour,grey
lithology,sandstone
grainsize,fine


In [24]:
rock4 = Component.from_text('Grey, sandstone, vf-f ', lexicon)
rock4

0,1
colour,grey
lithology,sandstone
grainsize,vf-f
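To illustrate how a lexicon of regex patterns can turn text into attributes, here is a simplified sketch. It uses a small subset of the default patterns shown earlier and takes the first pattern that matches in each category. The function name `parse_component` is hypothetical, and striplog's own parser does more than this (for example, it handles ranges such as 'vf-f').

```python
import re

# A small subset of the default lexicon shown earlier.
LEXICON = {
    'lithology': ['sandstone', 'siltstone', 'shale', 'sand'],
    'grainsize': ['very fine(?: to)?', 'fine(?: to)?', 'medium(?: to)?'],
    'colour': ['red(?:dish)?', 'gray(?:ish)?', 'grey(?:ish)?'],
}

def parse_component(text, lexicon=LEXICON):
    """Return a dict with the first match found for each category."""
    found = {}
    for category, patterns in lexicon.items():
        for pattern in patterns:
            # Word boundaries stop 'sand' from matching inside 'sandstone'.
            match = re.search(r'\b(?:' + pattern + r')\b', text, re.IGNORECASE)
            if match:
                found[category] = match.group(0).lower()
                break
    return found

parsed = parse_component('Grey fine sandstone.')
```

For this input, `parsed` holds the same three attributes as `rock3`: colour 'grey', grainsize 'fine', and lithology 'sandstone'.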


<hr />

## Interval

Intervals are where it gets interesting. An interval can have:

* a top
* a base
* a description (in natural language)
* a list of `Component`s

Intervals don't have a 'way up'; it's implied by the order of `top` and `base`.

In [25]:
from striplog import Interval
print(Interval.__doc__)


    Used to represent a lithologic or stratigraphic interval, or single point,
    such as a sample location.

    Initialize with a top (and optional base) and a description and/or
    an ordered list of components.

    Args:
        top (float): Required top depth. Required.
        base (float): Base depth. Optional.
        lexicon (dict): A lexicon. See documentation. Optional unless you only
            provide descriptions, because it's needed to extract components.
        max_component (int): The number of components to extract. Default 1.
        abbreviations (bool): Whether to parse for abbreviations.

    


I might make an `Interval` explicitly from a Component...

In [26]:
Interval(10, 20, components=[rock])

Interval(top: 10.0, base: 20.0, description: '', components: [Component("colour":"grey", "lithology":"sand", "grainsize":"vf-f")])

... or I might pass a description and a `lexicon` and Striplog will parse the description and attempt to extract structured `Component` objects from it.

In [27]:
Interval(20, 40, "Grey sandstone with shale flakes.", lexicon=lexicon)

Interval(top: 20.0, base: 40.0, description: 'Grey sandstone with shale flakes.', components: [Component("colour":"grey", "lithology":"sandstone")])

Notice I only got one `Component`, even though the description contains a subordinate lithology. This is the default behaviour; we have to ask for more components:

In [28]:
interval = Interval(20, 40, "Grey sandstone with black shale flakes.", lexicon=lexicon, max_component=2)
interval

Interval(top: 20.0, base: 40.0, description: 'Grey sandstone with black shale flakes.', components: [Component("colour":"grey", "lithology":"sandstone"), Component("lithology":"shale", "colour":"black", "amount":"flakes")])

`Interval`s have a `primary` attribute, which holds the first component, no matter how many components there are.

In [29]:
interval.primary

0,1
colour,grey
lithology,sandstone


Ask for the summary to see the thickness and a `Component` summary of the primary component. Note that the format code only applies to the `Component` part of the summary.

In [30]:
interval.summary(fmt="{colour} {lithology} {amount}")

'20.00 m of grey sandstone  with black shale flakes'

We can compare intervals based on their thickness. Let's make one that is 5 m thicker than the previous one.

In [31]:
interval_2 = Interval(40, 65, "Red sandstone.", lexicon=lexicon)

**Technical aside:** The `Interval` class is decorated with `functools.total_ordering`, so providing `__eq__` and one other comparison (such as `__lt__`) in the class definition gives instances of the class a complete ordering. So you can use `sorted` on a Striplog, for example.

It wasn't clear to me whether this should compare tops (say), so that '>' might mean 'deeper', or if it should be keyed on thickness. I chose the latter, and implemented other comparisons instead.

In [32]:
print(interval_2 == interval)
print(interval_2 > interval)
print(max(interval, interval_2).summary())

False
True
25.00 m of sandstone, red
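The thickness-keyed ordering described in the technical aside can be sketched with a toy class. `SimpleInterval` is a hypothetical stand-in that only stores a top and a base; it shows how `functools.total_ordering` fills in the remaining comparisons from `__eq__` and `__lt__`, and is not the library's `Interval`.

```python
from functools import total_ordering

@total_ordering
class SimpleInterval:
    """Toy interval ordered by thickness, not by depth."""

    def __init__(self, top, base):
        self.top = float(top)
        self.base = float(base)

    @property
    def thickness(self):
        return abs(self.base - self.top)

    def __eq__(self, other):
        if not isinstance(other, SimpleInterval):
            return NotImplemented
        return self.thickness == other.thickness

    def __lt__(self, other):
        if not isinstance(other, SimpleInterval):
            return NotImplemented
        return self.thickness < other.thickness

thin = SimpleInterval(20, 40)    # 20 m thick
thick = SimpleInterval(40, 65)   # 25 m thick
ordered = sorted([thick, thin])  # thinnest first
```

Only `__eq__` and `__lt__` are written out, yet `thick > thin`, `max(thin, thick)` and `sorted` all work, which mirrors the comparisons in the cell above.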


We can combine intervals with the `+` operator. (However, you cannot subtract intervals.)

In [33]:
interval_2 + interval

Interval(top: 20.0, base: 65.0, description: '55.6% Red sandstone with 44.4% Grey sandstone with black shale flakes', components: [Component("lithology":"sandstone", "colour":"red"), Component("colour":"grey", "lithology":"sandstone"), Component("lithology":"shale", "colour":"black", "amount":"flakes")])
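The percentages in the combined description are consistent with weighting each interval by its thickness: 25 m and 20 m out of 45 m in total. The sketch below reproduces that arithmetic and the merged extent. It is inferred from the output above rather than taken from the library's source, and the helper names are hypothetical.

```python
def proportions(*thicknesses):
    """Percentage of the total contributed by each thickness, to 1 decimal."""
    total = sum(thicknesses)
    return [round(100 * t / total, 1) for t in thicknesses]

def merged_extent(*intervals):
    """Shallowest top and deepest base of (top, base) pairs."""
    tops, bases = zip(*intervals)
    return min(tops), max(bases)

shares = proportions(65 - 40, 40 - 20)       # interval_2, then interval
extent = merged_extent((40, 65), (20, 40))
```

Here `shares` is `[55.6, 44.4]` and `extent` is `(20, 65)`, matching the description and the top and base of the combined interval.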

If we add a number to an `interval`, it adds thickness to the base.

In [34]:
interval + 5

Interval(top: 20.0, base: 45.0, description: 'Grey sandstone with black shale flakes.', components: [Component("colour":"grey", "lithology":"sandstone"), Component("lithology":"shale", "colour":"black", "amount":"flakes")])

Adding a rock adds a (minor) component and adds to the description. 

In [35]:
interval + rock3

Interval(top: 20.0, base: 40.0, description: 'Grey sandstone with black shale flakes. with Grey, sandstone, fine', components: [Component("colour":"grey", "lithology":"sandstone"), Component("lithology":"shale", "colour":"black", "amount":"flakes"), Component("colour":"grey", "lithology":"sandstone", "grainsize":"fine")])

<hr />

<p style="color:gray">©2015 Agile Geoscience. Licensed CC-BY. <a href="https://github.com/agile-geoscience/striplog">striplog.py</a></p>