# Fundamentals of Programming

Evan Bianco
[agilegeoscience](http://agilegeoscience.com), [@EvanBianco](http://twitter.com/EvanBianco)

- Variables and Assignment

- Native data types

- Operators and Expressions

- Data collections and data structures

- Procedures and control: Loops and Making choices

- Getting data, manipulating data

- Defining functions and calling functions 

- Writing and running programs 

- Objects and classes

# Variables and Assignment

In [None]:
x = 7
y = 10

In [None]:
x, y

In [None]:
x

In [None]:
x.__repr__()

Checking the `type` of a variable

In [None]:
type(x)

In [None]:
%whos

In [None]:
del y

In [None]:
%whos 

# Native data `types`

In [None]:
z = 1.4 + 2.3

In [None]:
print(z)

In [None]:
c = 2 + 1.5j  # same as writing: complex(2, 1.5) 
c

In [None]:
5 / 3.0

In [None]:
5 // 3

Why are there 2 kinds of numbers?
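
One way to explore this question is to ask Python for the type of each result. The following is a minimal sketch; the variable names are illustrative and not part of the original session.

```python
# Integers are exact whole numbers; floats approximate real numbers.
whole = 5 // 3        # floor division keeps only the whole part
ratio = 5 / 3.0       # true division gives a float

print(type(whole), whole)
print(type(ratio), ratio)

# Floats are stored in binary, so some decimal values are only approximated.
exact = (0.1 + 0.2 == 0.3)
print(exact)
```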

## Strings `str`

In [None]:
s1 = '#Nordegg:'

In [None]:
s1.strip('#')

In [None]:
s1.startswith('Nor')

## `str` indexing (how to count, part 1)

----
- **Exercise**: return the `e` character in `s1`


Try `help(s1)`, `s1?`, `s1??`, `s1.<tab>`, `s1.upper()` , `s1.strip()`, `s1.startswith()`, `s1.pop()`

## Can we add two strings together?

In [None]:
s2, s3 = 'Limestone \n', 'Shale'

s2 + s3

In [None]:
print(s2 + s3)

In [None]:
print(s2 * 5)

In [None]:
lithology = s1 + s2 + 'has minor ' + s3 + ' fragments'
lithology

In [None]:
'{0} {1} has minor {2} fragments'.format(s1, s2, s3)

## String methods and string formatting
----
- **Exercise**: Use a combination of string methods and text formatting on `s1`, `s2` and `s3` to produce the following output:

    `> The Nordegg limestone has minor shale fragments` 
    
    (ensure sentence case and remove `'#'`, `':'`, `'\n'`)

# Operators and Expressions

* mathematical operations

* comparison operations

* bitwise operations

* augmented assignment, copies, and pointers

* boolean expressions

* conversion functions

### mathematical operations
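
As a minimal sketch of the arithmetic operators, assuming two arbitrary example numbers (`a` and `b` are illustrative names only):

```python
# Basic arithmetic on two numbers (values chosen only for illustration)
a, b = 7, 2

total = a + b        # addition
difference = a - b   # subtraction
product = a * b      # multiplication
quotient = a / b     # true division, always a float in Python 3
floored = a // b     # floor division
remainder = a % b    # modulo
power = a ** b       # exponentiation

print(total, difference, product, quotient, floored, remainder, power)
```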

### comparison operations
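
Comparison operators return `True` or `False`. The sketch below assumes a made-up `depth` value purely for illustration.

```python
depth = 1250.0   # hypothetical depth in metres

print(depth > 1000)          # greater than
print(depth == 1250)         # equality compares values, so 1250.0 equals 1250
print(depth != 1300)         # inequality

# Comparisons can be chained, as in mathematical notation.
in_interval = 1000 <= depth < 1300
print(in_interval)
```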

### bitwise operations
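
Bitwise operators act on the binary digits of integers. A small sketch, with values chosen only so that the bit patterns are easy to read:

```python
a, b = 0b1100, 0b1010    # 12 and 10 written as binary literals

print(bin(a & b))   # AND: bits set in both
print(bin(a | b))   # OR: bits set in either
print(bin(a ^ b))   # XOR: bits set in exactly one
print(a << 1)       # shifting left by one bit doubles the value
print(a >> 2)       # shifting right by two bits floor-divides by 4
```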

### augmented assignment, copies and pointers

In [None]:
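y = 10           # y was deleted earlier with `del y`, so define it again
y += y / 125.0   # augmented assignment: same as y = y + y / 125.0
y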

### boolean expressions

In [None]:
type(ord('\t'))
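
Boolean expressions combine truth values with `and`, `or` and `not`. The following sketch uses two hypothetical flags to show how they behave.

```python
is_porous = True        # hypothetical flags, for illustration only
is_permeable = False

both = is_porous and is_permeable    # True only if both are True
either = is_porous or is_permeable   # True if at least one is True
negated = not is_permeable           # flips the truth value
print(both, either, negated)

# Empty strings, zero and empty containers count as False in a boolean context.
print(bool(''), bool(0), bool([]), bool('shale'))
```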

### conversion functions
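
Conversion functions turn a value of one type into another. A minimal sketch, assuming a depth that arrived as text (the value is made up):

```python
depth_text = '1250.5'              # hypothetical value read from a text file

depth = float(depth_text)          # str -> float
whole_metres = int(depth)          # float -> int truncates towards zero
label = str(whole_metres) + ' m'   # int -> str so it can be concatenated
letters = list('Tr')               # str -> list of characters

print(depth, whole_metres, label, letters)
```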

# Data collections and data structures

`list, dict, tuples, sets`

### `list`

Lists in Python are one-dimensional, ordered containers whose elements may be any Python objects. Lists are *mutable* and have methods for adding and removing elements to and from themselves. The literal syntax for lists is to surround comma-separated values with square brackets (`[]`). The square brackets are a syntactic hint that lists are indexable.
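
The mutability described above can be sketched with a few list methods. The list contents here are hypothetical example values.

```python
rocks = ['sandstone', 'shale']     # hypothetical example values

rocks.append('limestone')          # add one element to the end
rocks.insert(0, 'granite')         # insert at a given position
rocks.remove('shale')              # remove the first matching element
last = rocks.pop()                 # remove and return the last element

rocks[0] = 'basalt'                # replace an element in place
print(rocks, last, len(rocks))
```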

In [None]:
# [1,1] + [3,3] + [4, 4]
list(str(10))

In [None]:
fib = [1, 1, 2, 3, 5] + [8]   # concatenating two lists
fib.append(13)                # append adds a single element to the end

In [None]:
# del(fib)  # this would delete the whole list, and the next cells would fail with a NameError

In [None]:
fib.extend([21.0, 34.0, 55.0])
fib

In [None]:
fib += [89.0, 144.0]
fib

In [None]:
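import numpy as np
import matplotlib.pyplot as plt

fibm = np.array(fib[:-1])   # every term except the last
fibp = np.array(fib[1:])    # every term except the first
plt.plot(fibp/fibm)         # ratio of consecutive terms
plt.title('Golden Ratio')
fibp/fibm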

### Indexing, slicing, striding

In addition to accessing a single element in a `list` or `string`, we can also *slice* or *stride* into data structures to access multiple elements at once.
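
As a minimal sketch of the `[start:stop:step]` notation, using a made-up string rather than the ones in the exercises below:

```python
rock = 'sandstone'   # hypothetical example string

first = rock[0]       # counting starts at zero
last = rock[-1]       # negative indices count from the end
head = rock[0:4]      # slice: start is included, stop is excluded
tail = rock[4:]       # omit stop to run to the end
strided = rock[::3]   # stride: every third character
backwards = rock[::-1]  # a negative stride walks backwards

print(first, last, head, tail, strided, backwards)
```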

In [None]:
name = 'Cambrian (C)'
name

## Without using the Python interpreter, what is the expected output of the following commands?

- a) `name[:7]`

- b) `name[:-4]`

- c) `name[3:7]`

- d) `name[::2]`

In [None]:
ages = ['Cambrian (C)', 'Ordovician (O)',  'Silurian (S)',  'Devonian (D)', 
           'Mississippian (M)', 'Pennsylvanian (IP)', 'Permian (P)',
           'Triassic (Tr)', 'Jurassic (J)',  'Cretaceous (C)', 
           'Tertiary (T)', 'Quaternary (Q)']

## Indexing practice

----
**Exercise**:

- return the string: 

    ` > Triassic (Tr)` 


- return just the word: 

    ` > Triassic`


- return the abbreviation, enclosed in parentheses:

    ` > (Tr)`


- return just the abbreviation: 

    ` > Tr` 


(bonus points if you can do the last one all in one line)

In [None]:
n = 7   # 'Triassic (Tr)' is at index 7
ages[n][ages[n].index('(')+1:ages[n].index(')')]
#ages[7][10:12]

## Nested `list`

Lists can contain anything*

In [None]:
age_intervals = [
                 ['Cambrian (C)', [544,495]], ['Ordovician (O)', [495, 442] ], 
                 ['Silurian (S)', [442, 416]], ['Devonian (D)',[416, 354]], 
                 ['Mississippian (M)', [354, 324]], ['Pennsylvanian (IP)', [324, 295]], 
                 ['Permian (P)', [304, 248]], ['Triassic (Tr)', [248, 205]], 
                 ['Jurassic (J)', [205, 144]], ['Cretaceous (C)', [160, 65]], 
                 ['Tertiary (T)', [65, 1.8]], ['Quaternary (Q)']
                 ]

*almost

In [None]:
age_intervals[9][1][0] = 144

In [None]:
age_intervals[-1].append([1.8, 0])

----
**Exercise**: what is the expected output of:

* a) `age_intervals[:2]`

* b) `age_intervals[6]`

* c) what command would you type to return the age of the end of the Permian, 248?

* d) the start of the Cretaceous is wrong (it should be 144). Change it to the correct value

* e) We've lost the dates for the Quaternary Period [1.8 mya to present (0)]. Index into that entry, and append it.

### `tuples`

*Tuples* are the immutable form of lists. They behave almost exactly the same as lists in every way except that you cannot change any of their values. There are no `append()` or `extend()` methods, and there are no *in-place* operators. 

They also differ from lists in their syntax. They are so central to how Python works, that *tuples* are defined by commas. Oftentimes, tuples will be seen surrounded by parentheses. These parentheses only serve to group actions or make the code more readable, not to actually define tuples.

In [None]:
a = 1,2,3,4  # a length-4 tuple
b = (42,)    # length-1 tuple defined by the comma
c = (42)     # not a tuple, just the number 42
d = ()       # length-0 tuple- no commas means no elements
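
The immutability of tuples can be checked directly. This is a small sketch; the name `interval` is illustrative, and its values are borrowed from the Cambrian entry used elsewhere in this notebook.

```python
interval = (544, 495)          # an illustrative (start, end) pair

try:
    interval[0] = 540          # item assignment is not allowed on a tuple
except TypeError as err:
    message = str(err)
    print('TypeError:', message)

print(interval)                # the tuple is unchanged
```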

You can concatenate tuples together in the same way as lists, but be careful about the order of operations. This is where parentheses come in handy:

In [None]:
(1,2)+(3,4)

Note that even though tuples are immutable, they may have mutable elements. Suppose that we have a list embedded in a tuple. This list may be modified in-place even though the list may not be removed or replaced wholesale:

In [None]:
x = 1.0, [2, 4], 16
x[1].append(8)
x

### `Sets`

Instances of the `set` type are equivalent to mathematical sets. Like their math counterparts, literal sets in Python are defined by comma-separated values between curly braces (`{}`). Sets are unordered containers of unique values. Duplicated elements are ignored. Because they are unordered, sets are not sequences and cannot be indexed.

In [None]:
# a literal set formed with elements of various types
{1.0, 10, "one hundred", (1, 0, 0, 0)}

In [None]:
# a literal set OF special values
{True, False, None, "", 0.0, 0}

In [None]:
# conversion from a list to a set
set([2.0, 4, "eight", (16,), 4, 4, 2.0])
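
Sets also support the usual mathematical set operations. A minimal sketch, assuming two made-up collections of lithologies:

```python
well_a = {'shale', 'sandstone', 'limestone'}   # hypothetical lithologies
well_b = {'shale', 'dolomite', 'shale'}        # the repeated 'shale' is dropped

common = well_a & well_b       # intersection: in both sets
either = well_a | well_b       # union: in at least one set
only_a = well_a - well_b       # difference: in well_a but not in well_b

print(len(well_b), common, only_a)
print('dolomite' in well_b)    # membership test
```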

### `dicts`

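Dictionaries are arguably *the most important* data structure in Python: module namespaces and most object attributes are stored in dictionaries behind the scenes. A dictionary, or `dict`, is a mutable collection of unique key / value pairs. Since Python 3.7 a `dict` remembers the order in which keys were inserted, but items are looked up by key, not by position.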

In [None]:
timescale = {
           'Cambrian (C)' : (544,495), 'Ordovician (O)': (495, 442), 
           'Silurian (S)' : (442, 416), 'Devonian (D)': (416, 354), 
           'Mississippian (M)' : (354, 324), 'Pennsylvanian (IP)' : (324, 295), 
           'Permian (P)' : (304, 248), 'Triassic (Tr)' : (248, 205), 
           'Jurassic (J)' : (205, 144), 'Cretaceous (C)' : (160, 65), 
           'Tertiary (T)' : (65, 1.8), 'Quaternary (Q)' : (1.8, 0.0)
           }

In [None]:
timescale[0]  # raises KeyError: dictionaries are indexed by key, not by position

In [None]:
timescale['Cambrian (C)']

`timescale = dict([(k1,v1),(k2,v2),(k3,v3)])`
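
Building a dictionary from pairs, and a few common operations on it, can be sketched as follows. The name `periods` is illustrative; the values are taken from the entries above.

```python
pairs = [('Triassic (Tr)', (248, 205)), ('Jurassic (J)', (205, 144))]
periods = dict(pairs)                    # build a dict from (key, value) pairs

periods['Permian (P)'] = (304, 248)      # assigning to a new key adds an entry
start, end = periods['Jurassic (J)']     # look up by key, then unpack the tuple

for name, (start_ma, end_ma) in periods.items():
    print(name, start_ma - end_ma)       # duration in millions of years

missing = periods.get('Devonian (D)')    # .get returns None for a missing key
print(missing)
```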

Here's a good time to take a break

- Variables and Assignment
- Native data types
- Operators and Expressions
- Data collections and data structures
- <font color='lightgrey'>Procedures and control: Loops and Making choices</font>
- <font color='lightgrey'>Getting data, manipulating data</font>
- <font color='lightgrey'>Defining functions and calling functions</font>
- <font color='lightgrey'>Writing and running programs</font>
- <font color='lightgrey'>Objects and classes</font>

# Procedures and control: Loops and Making choices

## Loops

*Doing stuff many times*

the <code><font color="green">while</font></code> loop
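
A `while` loop repeats for as long as its condition is true. A minimal sketch, assuming made-up depth values:

```python
depth = 0          # hypothetical starting depth in metres
step = 250
depths = []

while depth <= 1000:       # the condition is checked before every pass
    depths.append(depth)
    depth += step          # without this update the loop would never end

print(depths)
```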

the <code><font color="green">for</font></code> loop

In [None]:
nums = [10,11, 12, 'hello', 'dog', 'geology', 29]

In [None]:
# for loop syntax
for item in nums:
    print(str(item) + '\n') 

<font color="#0A5394">**\*iteration, *iterable**</font>

## List comprehension

In [None]:
[x*x for x in nums if isinstance(x, int)]  # square only the integers; two strings cannot be multiplied together

## Making choices

The <code><font color="green">if</font></code> statement

In [None]:
n = [[1,10],[2,20],3,4,5,6,7,8]
# the if statement:

<font color="#0A5394">**\*conditionals**</font>
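
One possible way to use `if`, `elif` and `else` on the list `n` defined above is sketched here. The categories and the `kinds` list are illustrative choices, not the workshop's own answer.

```python
n = [[1, 10], [2, 20], 3, 4, 5, 6, 7, 8]
kinds = []

for item in n:
    if isinstance(item, list):        # the nested lists
        kind = 'a list'
    elif item % 2 == 0:               # even integers
        kind = 'even'
    else:                             # everything else
        kind = 'odd'
    kinds.append(kind)
    print(item, 'is', kind)
```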

# Getting data...

## ... from text files

You can explicitly read from and write to files directly in your code. Python makes working with files pretty simple.

The first step to working with a text file is to obtain a 'file object' using `open`.

In [None]:
file_for_reading = open('reading_file.txt', 'r')  # 'r' means read-only

file_for_writing = open('writing_file.txt', 'w')  # 'w' is for write - will destroy file if already exists

file_for_appending = open('appending_file.txt', 'a')  # 'a' is for appending to the end of a file.

# don't forget to close your files when you're done.
file_for_reading.close()
file_for_writing.close()
file_for_appending.close()

In [None]:
## Open the file with read-only permission
f = open('data/B-41_tops.txt', "r")

header = f.readline()    # a string containing the first line of the file
data = f.readlines()     # a list containing all the remaining lines

## close the file after reading the lines.
f.close()
[item.strip() for item in data]

Because it is easy to forget to close your files, it is convenient to use them in a `with` block, at the end of which they will be closed automatically.

In [None]:
import re

fname = 'data/B-41_tops.txt'
starts_with_hash = 0                  # count of lines that start with '#'
with open(fname, 'r') as f:
    for line in f:                    # look at each line in the file 
        if re.match("^#", line):      # use a regex to see if it starts with '#'
            starts_with_hash += 1     # if it does, add 1 to the count

If you need to read a whole text file, you can just iterate over the lines of the file using `for`:

In [None]:
fname = 'data/B-41_tops.txt'
content = []
with open(fname) as f:
    for line in f:                       # iterate over every line in the file
        if not line.startswith('#'):     # skip the lines marked with '#'
            content.append(line)
print(content)

Every line you get this way ends in a newline character, `\n`, so you'll often want to `strip()` it before doing anything with it.
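
The effect of `strip()` can be sketched without a real file. Here `io.StringIO` stands in for an open file, and the contents are made up for illustration.

```python
import io

# io.StringIO behaves like an open text file, so the sketch runs anywhere.
fake_file = io.StringIO('# name, depth\nGOC, 1200.0\nOWC, 1300.0\n')

raw = fake_file.readline()        # still ends with '\n'
print(repr(raw))

cleaned = [raw.strip()] + [line.strip() for line in fake_file]
print(cleaned)
```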

# Defining and calling functions

the <code><font color="green">def</font></code> statement

In [None]:
def myfunc(args):
    """
    Documentation string
    """
    # statement
    # statement
    return # optional

<font color="#0A5394">**\*scope**</font>
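
Scope determines where a name is visible. A minimal sketch, assuming a hypothetical function `describe` and a global name `units`:

```python
units = 'm'                              # a global name

def describe(depth):
    label = str(depth) + ' ' + units     # 'label' is local; 'units' is read from the global scope
    return label

result = describe(1200.0)
print(result)

try:
    label                                # the local name does not exist out here
    leaked = True
except NameError:
    leaked = False
print(leaked)
```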

----
**Exercise**: write a function called `my_tops` that takes a
filename as input and returns a dictionary of the tops

In [None]:
topsfile = 'data/B-41_tops.txt'

def my_tops(filename='data/B-41_tops.txt'):
    """
    Takes a file as input and returns a dictionary of tops
    filename : path to a comma-delimited tops file
    """
    tops = {}   # named differently from the function to avoid confusion
    with open(filename, 'r') as f:
        for line in f:
            if not line.startswith('#'):               # skip lines marked with '#'
                row = line.strip('\n')
                split_rows = row.split(',')
                name = split_rows[0]                   # first column is the top name
                depth = float(split_rows[-1].strip())  # last column is the depth
                tops[name] = depth
    return tops

my_tops(filename='data/L-30_tops.txt')


# Writing and running programs

Put the previous function in a text file named `load_tops.py`.

## Your first module

In [None]:
import load_tops

In [None]:
tops = load_tops.my_tops('data/B-41_tops.txt')

## ... from delimited files

In [None]:
import csv

with open('data/periods.csv', 'rt', newline='') as f:
    reader = csv.DictReader(f, delimiter=',')   # each row becomes a dictionary keyed by the header
    for row in reader:
        print(row)

You can write out delimited data using `csv.writer`:

In [None]:
my_tops = {'GOC' : 1200.0 , 'OWC' : 1300.0, 'Top Reservoir' : 1100.0}

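# In Python 3, csv.writer needs a text-mode file ('w', not 'wb'), opened with newline=''
with open('comma_delimited_stock_prices.txt', 'w', newline='') as f: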
    writer = csv.writer(f, delimiter=',')
    for name, depth in my_tops.items():
        writer.writerow([name, depth])

## ... from the web

Use View Source in your browser to figure out where the age range is on the page, and what it looks like.

Try to find the same string here.

In [None]:
url = "http://en.wikipedia.org/wiki/Cretaceous"

In [None]:
import requests
r = requests.get(url)
r.text[:2000]

Using a [regular expression](https://docs.python.org/2/library/re.html):

In [None]:
import re

s = re.search(r'<i>(.+?million years ago)</i>', r.text)
text = s.group(1)
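
The same pattern can be tried offline. The sketch below applies it to a made-up HTML snippet (the live page may be formatted differently) and shows that `re.search` returns `None` when nothing matches.

```python
import re

# A made-up snippet shaped like the text the pattern looks for.
html = '<p>The Cretaceous lasted <i>145–66&#160;million years ago</i>.</p>'
pattern = r'<i>(.+?million years ago)</i>'

match = re.search(pattern, html)
text = match.group(1) if match else None     # guard against a failed search
print(text)

missing = re.search(pattern, '<p>no ages here</p>')
print(missing)
```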

----
**Exercise**: Make a function to get the start and end ages of *any* geologic period, taking the name of the period as an argument.

In [None]:
def get_age(period):
    url = "http://en.wikipedia.org/wiki/" + period
    r = requests.get(url)
    start, end = re.search(r'<i>([\.0-9]+)–([\.0-9]+)&#160;million years ago</i>', r.text).groups()
    return float(start), float(end)

In [None]:
period = "Cretaceous"
get_age(period)

In [None]:
def duration(period):
    t0, t1 = get_age(period)
    duration = t0 - t1
    response = "According to Wikipedia, the {0} period was {1:.2f} Ma long. ".format(period, duration)
    return response

In [None]:
duration('Cretaceous')

## Using built-in functions
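
A few built-in functions are available without importing anything. This sketch assumes a made-up list of depths.

```python
depths = [1100.0, 1300.0, 1200.0]    # hypothetical tops in metres

count = len(depths)                  # number of items
shallowest, deepest = min(depths), max(depths)
mean = sum(depths) / len(depths)     # arithmetic mean
ordered = sorted(depths)             # returns a new sorted list

print(count, shallowest, deepest, mean, ordered)
print(round(3.14159, 2), abs(-5))
```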

## Importing modules

the <code><font color="green">import</font></code> statement


In [None]:
import this

## The Python standard library

[Built-in functions](https://docs.python.org/3/library/functions.html)

[Built-in Types](https://docs.python.org/3/library/stdtypes.html)

[docs.python.org](https://docs.python.org/3/library/)

In [None]:
import datetime
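
As a small example of a standard library module in use, `datetime` can do date arithmetic. The dates and variable names below are hypothetical.

```python
import datetime

spud = datetime.date(2015, 3, 1)       # hypothetical start date
finish = datetime.date(2015, 4, 15)    # hypothetical end date

elapsed = finish - spud                # subtracting dates gives a timedelta
print(elapsed.days)
print(spud.isoformat())
```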

## External Python libraries

The Python Package Index, [PyPI](https://pypi.python.org/pypi)

* [SciPy](http://www.scipy.org/) -  a collection of often-used libraries

## Using external libraries

In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
np.loadtxt('../data/R-39.las')

In [None]:
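# Acoustic impedance
# Assumes `Well` is a dict-like collection of log curves loaded earlier.
DT = Well['DT']        # sonic log: slowness, assumed to be in microseconds per unit length
RHOB = Well['RHOB']    # bulk density
Vp = 1e6 / DT          # velocity is the reciprocal of slowness
Ip = Vp * RHOB 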

In [None]:
def rc(Ip2, Ip1):
    """
    Returns the normal incidence reflection coefficient
    between two layers with impedances Ip2, Ip1
    Ip2 : Impedance of the bottom layer 
    Ip1 : Impedance of the upper layer
    """
    return (Ip2 - Ip1) / (Ip2 + Ip1)
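
A quick check of the function with made-up impedance values shows how the sign of the result behaves. The function is repeated so that the sketch runs on its own.

```python
def rc(Ip2, Ip1):
    """Normal incidence reflection coefficient between two layers."""
    return (Ip2 - Ip1) / (Ip2 + Ip1)

upper = 6.0e6    # hypothetical impedance of the upper layer
lower = 9.0e6    # hypothetical impedance of the lower layer

print(rc(lower, upper))   # positive: impedance increases downwards
print(rc(upper, lower))   # same magnitude, opposite sign
print(rc(upper, upper))   # no contrast, so no reflection
```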

----
**Exercise**: compute a reflection coefficient series by completing the following for loop


In [None]:
rc_series = []  
for layer in range(len(Ip)-1):
    Z2 = Ip[layer + 1]
    Z1 = Ip[layer]
    coeff = rc(Z2, Z1)
    rc_series.append(coeff)

----
**Exercise**: make two tracks of a well log; plot the impedance in one track, and the R.C series in the other 

In [None]:
plt.subplot(121)
plt.plot(Ip, c='g', lw=4, alpha=0.5)         # impedance track
plt.subplot(122)
plt.plot(rc_series, c='g', lw=4, alpha=0.5)  # reflection coefficient track

Getting started with [bruges](https://github.com/agile-geoscience/notebooks/blob/master/Bruges_getting_started.ipynb)


# Objects and classes

In [None]:
class Layers(object):
    
    def __init__(self, layers, label=None):
        # Just make sure we end up with an array
        self.layers = np.array(layers)
        self.label = label or "My log"
        self.length = self.layers.size  # But storing len in an attribute is unexpected...
        
    def __len__(self):  # ...better to do this.
        return len(self.layers)
        
    def rcs(self):
        uppers = self.layers[:-1]
        lowers = self.layers[1:]
        return (lowers-uppers) / (uppers+lowers)
    
    def plot(self, lw=0.5, color='#6699ff'):
        fig = plt.figure(figsize=(2,6))
        ax = fig.add_subplot(111)
        ax.barh(range(len(self.layers)), self.layers, color=color, lw=lw, align='edge', height=1.0, alpha=1.0, zorder=10)
        ax.grid(zorder=2)
        ax.set_ylabel('Layers')
        ax.set_title(self.label)
        ax.set_xlim([-0.5,1.0])
        ax.set_xlabel('Measurement (units)')
        ax.invert_yaxis()  
        #ax.set_xticks(ax.get_xticks()[::2])    # take out every second tick
        ax.spines['right'].set_visible(False)  # hide the spine on the right
        ax.yaxis.set_ticks_position('left')    # Only show ticks on the left and bottom spines
        
        plt.show()

In [None]:
velocities = [0.23, 0.34, 0.45, 0.25, 0.23, 0.35]

In [None]:
l = Layers(velocities, label='Well # 1')

In [None]:
l.label

In [None]:
l.plot()
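
The idea behind `__len__` and `rcs` can be sketched without NumPy or matplotlib. `MiniLayers` below is a hypothetical, list-based stand-in for the class above, not a replacement for it, and its input values are made up.

```python
class MiniLayers:
    """A stripped-down, list-based stand-in for the Layers class (hypothetical)."""

    def __init__(self, layers, label=None):
        self.layers = list(layers)
        self.label = label or 'My log'

    def __len__(self):                 # lets the built-in len() work on the object
        return len(self.layers)

    def rcs(self):
        # Pair each layer with the one below it.
        pairs = zip(self.layers[:-1], self.layers[1:])
        return [(lower - upper) / (upper + lower) for upper, lower in pairs]

log = MiniLayers([0.25, 0.75, 0.25], label='Well # 1')
print(len(log), log.label)
print(log.rcs())
```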