# Introduction to Python
**Carlos Calvo Hernandez**

2019/05/29

"In a world where every subject matter can have a data-supported treatment, where computational devices are omnipresent and pervasive, the union of natural language and computation creates compelling communication and learning opportunities." Teaching and learning with Jupyter

## Why Python?

> "It's a beautifully designed, intuitive, but exceedingly powerful general-purpose programming language" From [Data Analysis in Python](www.data-analysis-in-python.org)

Python was designed to be human-readable and to minimize the amount of time spent writing code. If you learn Python, you're learning a full programming language. This means that if you're interested in doing *computational* social science, building a generalizable programming skill just makes you more flexible.

## Jupyter Notebook

At this point you should already have installed [Anaconda](https://www.anaconda.com/download/); if you haven't, go and do it ASAP.

To start Anaconda Navigator, go to your programs (Applications on macOS, or Start/Programs on Windows) and open it.

This is what it should look like:

![Anaconda Navigator](img/conda_navigator.png)

We're going to use Jupyter Notebook.

What is Jupyter Notebook? You might ask yourselves (or me, in this case). In the words of [Project Jupyter](https://jupyter-notebook.readthedocs.io/en/stable/notebook.html):

> "...notebook extends the console-based approach to interactive computing in a qualitatively new direction, providing a web-based application suitable for capturing the whole computation process..."

Jupyter notebooks combine two different components:

A **web application**: a browser-based tool for interactive authoring of documents which combine explanatory text, mathematics, computations and their rich media output.

**Notebook documents**: a representation of all content visible in the web application, including inputs and outputs of the computations, explanatory text, mathematics, images, and rich media representations of objects.



## Python

### Basic Properties

Python is **strongly typed** (i.e. object types are enforced), **dynamically**, **implicitly typed** (i.e. you don't have to declare variables), **case sensitive** (i.e. var and VAR are two different variables), and **object-oriented** (i.e. everything is an object).


Values are **assigned** with the *equals* sign (=). In fact, objects are **bound** to names. **Equality testing** is done using *two equals* signs (==). 

In [2]:
myvar = 3
myvar += 2
myvar

5

In [3]:
myvar -= 1
myvar

4

`+=` and `-=` increase/decrease values respectively by the RHS amount.
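
As a small sketch of the difference between assignment (`=`), equality testing (`==`), and name binding, the example below uses three made-up names (`a`, `b`, `c`). It also uses the `is` operator, which is not covered in the text above; `is` checks whether two names are bound to the very same object.

```python
a = [1, 2, 3]   # bind the name a to a new list object
b = a           # bind a second name, b, to the same object
c = [1, 2, 3]   # bind c to a different object with equal contents

print(a == c)   # True: the two lists hold equal values
print(a is c)   # False: they are two distinct objects
print(a is b)   # True: both names are bound to one object
```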

There is no **mandatory statement termination character** and **blocks are specified by indentation**. 
Indent to begin a block, dedent to end one. Statements that expect an indentation level end in a colon (:).

**Comments** start with (#) and are single-line. Python has no dedicated multi-line comment syntax; multi-line (""" ... """) strings are commonly used in their place.

In [4]:
"""This is a multi-line comment.
The following line concatenates two strings"""

mystring = "Hello"
mystring += " world!"
#Let's now print the result
print(mystring)

Hello world!


### [Strings](https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str)

A Python **string** `str` is a sequence of 0 or more characters enclosed within `''` or `""`.

In [5]:
my_string = 'Python is my favorite programming language!'

In [6]:
my_string

'Python is my favorite programming language!'

Python **identifiers** (or names) are composed of letters, numbers, and/or underscores, starting with a letter or underscore. Python identifiers are case sensitive. Python variables and functions typically start with lowercase letters; Python classes (we don't really need to use classes) start with uppercase letters.

In [7]:
type(my_string)

str

In [8]:
len(my_string)

43

#### Printing

The `print()` function writes the value of its comma-delimited arguments to the console. Each value in the output is separated by a single blank space.

In [9]:
print('A', 'B', 'C', 1, 2, 3)
print('Instance 1:', my_string)

A B C 1 2 3
Instance 1: Python is my favorite programming language!


The print function has an optional keyword argument, `end`. When this argument is used and its value does not include '\n' (newline character), the output cursor will not advance to the next line.

In [10]:
print('A', 'B')  # no end argument
print('C')
print('A', 'B', end='...\n')  # end includes '\n' --> output cursor advances to next line
print('C')
print('A', 'B', end=' ')  # end=' ' --> use a space rather than newline at the end of the line
print('C')  # so that subsequent printed output will appear on same line

A B
C
A B...
C
A B C


### Respecting [PEP8](https://www.python.org/dev/peps/pep-0008/#maximum-line-length) with long strings

In [11]:
# adjacent string literals are joined into one string; note the trailing
# space on each piece, since no separator is inserted between them
long_story = ('Lorem ipsum dolor sit amet, consectetur adipiscing elit. '
              'Pellentesque eget tincidunt felis. Ut ac vestibulum est. '
              'In sed ipsum sit amet sapien scelerisque bibendum. Sed '
              'sagittis purus eu diam fermentum pellentesque.')
long_story

'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque eget tincidunt felis. Ut ac vestibulum est. In sed ipsum sit amet sapien scelerisque bibendum. Sed sagittis purus eu diam fermentum pellentesque.'

#### `str.replace`

If you don't know how something works, just call `help`:

In [12]:
help(str.replace)

Help on method_descriptor:

replace(...)
    S.replace(old, new[, count]) -> str
    
    Return a copy of S with all occurrences of substring
    old replaced by new.  If the optional argument count is
    given, only the first count occurrences are replaced.



The call below will not modify `my_string`, because `replace` returns a new string rather than changing the original in place.

In [13]:
my_string.replace('a', '?')

'Python is my f?vorite progr?mming l?ngu?ge!'

In [14]:
my_string

'Python is my favorite programming language!'

You have to store the return value of `replace` instead.

In [15]:
my_modified_string = my_string.replace('is', 'will be')
print(my_modified_string)

Python will be my favorite programming language!


#### `str.format`

In [16]:
secret = '{} is cool'.format('Python')
print(secret)

Python is cool


In [17]:
print('My name is {} {}, you can call me {}.'.format('John', 'Doe', 'John'))
# is the same as:
print('My name is {first} {family}, you can call me {first}.'.format(first='John', family='Doe'))

My name is John Doe, you can call me John.
My name is John Doe, you can call me John.


#### `str.join`

In [18]:
pandas = "pandas"
numpy = "numpy"
requests = "requests"
cool_python_libs = ", ".join([pandas, numpy, requests])

In [19]:
print("Some cool Python libraries: {}".format(cool_python_libs))

Some cool Python libraries: pandas, numpy, requests


### Lists

A `list` is an ordered **sequence** of 0 or more comma-delimited elements enclosed within square brackets (`[`, `]`). The Python `str.split(sep)` method can be used to split a `sep`-delimited string into a corresponding list of elements.

In the following example, a comma-delimited string is split using `sep=','`.

In [20]:
single_instance_str = 'p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d'
single_instance_list = single_instance_str.split(',')
print(single_instance_list)

['p', 'k', 'f', 'n', 'f', 'n', 'f', 'c', 'n', 'w', 'e', '?', 'k', 'y', 'w', 'n', 'p', 'w', 'o', 'e', 'w', 'v', 'd']
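
A brief sketch of how `str.split` and `str.join` (shown earlier) undo each other. The names `csv_line`, `parts`, `rebuilt`, and `words` are made up for this illustration.

```python
csv_line = 'p,k,f,n'
parts = csv_line.split(',')   # split on commas into a list
rebuilt = ','.join(parts)     # join the list back with commas

print(parts)                  # ['p', 'k', 'f', 'n']
print(rebuilt == csv_line)    # True

# with no separator argument, split() splits on runs of whitespace
words = 'Python  is   cool'.split()
print(words)                  # ['Python', 'is', 'cool']
```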


Python lists are heterogeneous.

In [21]:
mixed_list = ['a', 1, 2.3, True, [1, 'b']]
print(mixed_list)

['a', 1, 2.3, True, [1, 'b']]


The Python + operator can be used for addition, and also to concatenate strings and lists.

In [22]:
print(1 + 2 + 3)
print('a' + 'b' + 'c')
print(['a', 1] + [2.3, True] + [[1, 'b']])

6
abc
['a', 1, 2.3, True, [1, 'b']]


In [23]:
month = "may"
day = "29"
date = " ".join([month,day])
print(date)

may 29


#### Accessing sequence elements & subsequences

Individual elements of **sequences** (e.g., lists and strings) can be accessed by specifying their **zero-based** index position within square brackets (`[`, `]`).

The following statements print out the 3rd element - at zero-based index position 2 - of single_instance_str and single_instance_list.

Note that the 3rd elements are not the same, as commas count as elements in the string, but not in the list created by splitting a comma-delimited string.

In [24]:
print(single_instance_str)
print(single_instance_str[2])
print(single_instance_list)
print(single_instance_list[2])

p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d
k
['p', 'k', 'f', 'n', 'f', 'n', 'f', 'c', 'n', 'w', 'e', '?', 'k', 'y', 'w', 'n', 'p', 'w', 'o', 'e', 'w', 'v', 'd']
f


Negative index values can be used to specify a position offset from the end of the sequence.

It is often useful to use a -1 index value to access the last element of a sequence.

In [25]:
print(single_instance_str)
print(single_instance_str[-1])
print(single_instance_str[-2])

p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d
d
,


In [26]:
print(single_instance_list)
print(single_instance_list[-1])
print(single_instance_list[-2])

['p', 'k', 'f', 'n', 'f', 'n', 'f', 'c', 'n', 'w', 'e', '?', 'k', 'y', 'w', 'n', 'p', 'w', 'o', 'e', 'w', 'v', 'd']
d
v


The Python slice notation can be used to access subsequences by specifying two index positions separated by a colon (`:`); seq[start:stop] returns all the elements in seq between start and stop - 1 (inclusive).

In [27]:
print(single_instance_str[2:4])
print(single_instance_list[2:4])

k,
['f', 'n']


Slice index values can be negative.

In [28]:
print(single_instance_str[-4:-2])
print(single_instance_list[-4:-2])

,v
['e', 'w']


The start and/or stop index can be omitted. A common use of slices with a single index value is to access all but the first element or all but the last element of a sequence.

In [29]:
print(single_instance_str)
print(single_instance_str[:-1])  # all but the last 
print(single_instance_str[:-2])  # all but the last 2 
print(single_instance_str[1:])  # all but the first
print(single_instance_str[2:])  # all but the first 2

p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d
p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,
p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v
,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d
k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d


Slice notation includes an optional third element, step, as in `seq[start:stop:step]`, that specifies the steps or increments by which elements are retrieved from seq between start and stop - 1:

In [30]:
print(single_instance_str)
print(single_instance_str[::2])  # print elements in even-numbered positions
print(single_instance_str[1::2])  # print elements in odd-numbered positions
print(single_instance_str[::-1])  # print elements in reverse order

p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d
pkfnfnfcnwe?kywnpwoewvd
,,,,,,,,,,,,,,,,,,,,,,
d,v,w,e,o,w,p,n,w,y,k,?,e,w,n,c,f,n,f,n,f,k,p


#### Splitting / separating statements

Python statements are typically separated by newlines (rather than, say, the semi-colon in other programming languages). Statements can extend over more than one line; it is generally best to break the lines after commas, parentheses, braces or brackets. Inserting a backslash character (`\`) at the end of a line will also enable continuation of the statement on the next line, but it is generally best to look for other alternatives.

In [31]:
attribute_names = ['class', 
                   'cap-shape', 'cap-surface', 'cap-color', 
                   'bruises?', 
                   'odor', 
                   'gill-attachment', 'gill-spacing', 'gill-size', 'gill-color', 
                   'stalk-shape', 'stalk-root', 
                   'stalk-surface-above-ring', 'stalk-surface-below-ring', 
                   'stalk-color-above-ring', 'stalk-color-below-ring',
                   'veil-type', 'veil-color', 
                   'ring-number', 'ring-type', 
                   'spore-print-color', 
                   'population', 
                   'habitat']
print(attribute_names)

['class', 'cap-shape', 'cap-surface', 'cap-color', 'bruises?', 'odor', 'gill-attachment', 'gill-spacing', 'gill-size', 'gill-color', 'stalk-shape', 'stalk-root', 'stalk-surface-above-ring', 'stalk-surface-below-ring', 'stalk-color-above-ring', 'stalk-color-below-ring', 'veil-type', 'veil-color', 'ring-number', 'ring-type', 'spore-print-color', 'population', 'habitat']


In [32]:
print('a', 'b', 'c',  # no '\' needed when breaking after comma
      1, 2, 3)

a b c 1 2 3


In [33]:
print(  # no '\' needed when breaking after parenthesis, brace or bracket
    'a', 'b', 'c',
    1, 2, 3)

a b c 1 2 3


In [34]:
print(1 + 2 \
      + 3)

6


#### Processing strings & other sequences

The `str.strip([chars])` method returns a copy of `str` in which any leading or trailing `chars` are removed. If no `chars` are specified, it removes all leading and trailing whitespace. [Whitespace is any sequence of spaces, tabs (`\t`) and/or newline (`\n`) characters.]

Note that since a blank space is inserted in the output after every item in a comma-delimited list, the second asterisk below is printed after a leading blank space is inserted on the new line.

In [35]:
print('*', '\tp,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d\n', '*')

* 	p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d
 *


In [36]:
print('*', '\tp,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d\n'.strip(), '*')

* p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d *


In [37]:
print('*', '\tp,k,f,n,f,n,f,c,n,w,e,       ?,k,y\t,w,n,p,w\n,o,e,w,v,d\n'.strip(), '*')

* p,k,f,n,f,n,f,c,n,w,e,       ?,k,y	,w,n,p,w
,o,e,w,v,d *
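
The examples above call `strip()` without arguments. As a hedged sketch of the optional `chars` argument, the snippet below uses made-up strings; it also uses `str.lstrip` and `str.rstrip`, the one-sided variants, which the text above does not cover.

```python
padded = '***p,k,f***'
both = padded.strip('*')     # remove '*' from both ends
left = padded.lstrip('*')    # remove '*' from the left end only
right = padded.rstrip('*')   # remove '*' from the right end only

print(both)    # p,k,f
print(left)    # p,k,f***
print(right)   # ***p,k,f

# chars is treated as a set of characters, not as a prefix or suffix
mixed = 'xyxhelloyx'.strip('xy')
print(mixed)   # hello
```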


A common programming pattern when dealing with CSV (comma-separated value) files is to repeatedly:

1. read a line from a file
2. strip off any leading and trailing whitespace
3. split the values separated by commas into a list

We will get to repetition control structures (loops) and file input and output shortly, but here is an example of how `str.strip()` and `str.split()` can be chained together in a single instruction for processing a line representing a single instance from the mushroom dataset file. Note that chained methods are executed in left-to-right order.

[Python provides a `csv` module to facilitate the processing of CSV files, but we will not use that module here.]

In [38]:
single_instance_str = 'p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d\n'
print(single_instance_str)
# first strip leading & trailing whitespace, then split on commas
single_instance_list = single_instance_str.strip().split(',')  
print(single_instance_list)

p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d

['p', 'k', 'f', 'n', 'f', 'n', 'f', 'c', 'n', 'w', 'e', '?', 'k', 'y', 'w', 'n', 'p', 'w', 'o', 'e', 'w', 'v', 'd']
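
The three-step read/strip/split pattern can be sketched as follows. This is only an illustration under stated assumptions: it uses a `for` loop ahead of its introduction, and `io.StringIO` stands in for a real open file. The two data lines and the names `fake_file` and `instances` are made up.

```python
import io

# stand-in for an open file: two made-up lines in the mushroom format
fake_file = io.StringIO('p,k,f,n\ne,x,s,y\n')

instances = []
for line in fake_file:                        # 1. read a line
    cleaned = line.strip()                    # 2. strip whitespace
    if cleaned:                               # skip blank lines
        instances.append(cleaned.split(','))  # 3. split on commas

print(instances)  # [['p', 'k', 'f', 'n'], ['e', 'x', 's', 'y']]
```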


A number of Python functions and operators can be used on strings, lists and other sequences.

The `len(s)` function can be used to find the length of (number of items in) a sequence `s`. It will also return the number of items in a dictionary, a data structure we will cover further below.

In [39]:
print(len(single_instance_str))
print(len(single_instance_list))

46
23


The `in` operator can be used to determine whether a sequence contains a value.

Boolean values in Python are `True` and `False`.

In [40]:
print(',' in single_instance_str)
print(',' in single_instance_list)

True
False
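
One detail worth a small sketch: on a string, `in` tests for a substring, while on a list it tests for a whole element. The names `line` and `items` are made up for this illustration, which also shows the related `not in` operator.

```python
line = 'p,k,f'
items = line.split(',')          # ['p', 'k', 'f']

in_string = 'k,f' in line        # substring test on a string
in_list = 'k,f' in items         # element test on a list
missing = 'x' not in items       # not in negates the test

print(in_string)  # True
print(in_list)    # False
print(missing)    # True
```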


#### Mutability

One important distinction between strings and lists has to do with their mutability.

Python strings are immutable, i.e., they cannot be modified. Most string methods (like `str.strip()`) return modified copies of the strings on which they are used.

Python lists are mutable, i.e., they can be modified.

The examples below illustrate a number of list methods that modify lists.

In [66]:
list_1 = [1, 2, 3, 5, 1]
list_2 = [list_1, 7]  # the first element of list_2 references the same object as list_1

print('list_1:             ', list_1)
print('list_2:             ', list_2)
print()

list_1.remove(1)  # remove [only] the first occurrence of 1 in list_1
print('list_1.remove(1):   ', list_1)
print()

list_1.pop(2)  # remove the element in position 2
print('list_1.pop(2):      ', list_1)
print()

list_1.append(6)  # add 6 to the end of list_1
print('list_1.append(6):   ', list_1)
print()

list_1.insert(0, 7)  # add 7 to the beginning of list_1 (before the element in position 0)
print('list_1.insert(0, 7):', list_1)
print()

list_1.sort()
print('list_1.sort():      ', list_1)
print()

list_1.reverse()
print('list_1.reverse():   ', list_1)

list_1:              [1, 2, 3, 5, 1]
list_2:              [[1, 2, 3, 5, 1], 7]

list_1.remove(1):    [2, 3, 5, 1]

list_1.pop(2):       [2, 3, 1]

list_1.append(6):    [2, 3, 1, 6]

list_1.insert(0, 7): [7, 2, 3, 1, 6]

list_1.sort():       [1, 2, 3, 6, 7]

list_1.reverse():    [7, 6, 3, 2, 1]


When more than one name (e.g., a variable) or container element is bound to the same mutable object, changes made to that object are visible through every reference to it. For example, in the second statement above, the first element of `list_2` is bound to the same object that is bound to `list_1`. All changes made to the object bound to `list_1` are thus reflected in `list_2[0]` (since they both reference the same object).

In [67]:
print('list_1:          ', list_1)
print('list_2:          ', list_2)

list_1:           [7, 6, 3, 2, 1]
list_2:           [[7, 6, 3, 2, 1], 7]
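
A minimal sketch of two ways a list object can end up shared: through a second name, and through an element of another list. The names `original`, `alias`, and `nested` are made up, and the `is` operator (which tests object identity) is used only to make the sharing visible.

```python
original = [1, 2, 3]
alias = original           # no copy is made; both names share one object
nested = [original, 7]     # the first element refers to that same object

original.append(4)         # one change, visible through every reference

print(alias)                   # [1, 2, 3, 4]
print(nested)                  # [[1, 2, 3, 4], 7]
print(alias is original)       # True
print(nested[0] is original)   # True
```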


We can create a copy of a list by using slice notation and not specifying a start or end parameter, i.e., `[:]`, and if we assign that copy to another variable, the variables will be bound to different objects, so changes to one do not affect the other.

In [43]:
list_1 = [1, 2, 3, 5, 1]
list_2 = list_1[:]  # list_1[:] returns a copy of the entire contents of list_1

print('list_1:             ', list_1)
print('list_2:             ', list_2)
print()

list_1.remove(1)  # remove [only] the first occurrence of 1 in list_1
print('list_1.remove(1):   ', list_1)
print()

print('list_1:          ', list_1)
print('list_2:          ', list_2)

list_1:              [1, 2, 3, 5, 1]
list_2:              [1, 2, 3, 5, 1]

list_1.remove(1):    [2, 3, 5, 1]

list_1:           [2, 3, 5, 1]
list_2:           [1, 2, 3, 5, 1]
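
A caveat, sketched with made-up names (`outer`, `shallow`, `deep`): a `[:]` copy is shallow, so it copies only the outer list, and any nested lists are still shared. The standard library's `copy.deepcopy` function, which the text above does not cover, copies the nested objects as well.

```python
import copy

outer = [[1, 2], [3, 4]]
shallow = outer[:]            # new outer list, same inner lists
deep = copy.deepcopy(outer)   # new outer list and new inner lists

outer.append([5, 6])          # affects only outer
outer[0].append(99)           # affects the inner list shared with shallow

print(outer)    # [[1, 2, 99], [3, 4], [5, 6]]
print(shallow)  # [[1, 2, 99], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```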


The `dir()` function returns all the attributes associated with a Python name (e.g., a variable) in alphabetical order.

When invoked with a name bound to a list object, it will return the methods that can be invoked on a list. The attributes with leading and trailing double underscores are special methods that Python invokes implicitly (e.g., `__len__` is what `len()` calls); they are not normally called directly. We'll discuss this further below.

In [44]:
dir(list_1)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__imul__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__reversed__',
 '__rmul__',
 '__setattr__',
 '__setitem__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

There are sorting and reversing functions, `sorted()` and `reversed()`, that do not modify their arguments, and can thus be used on mutable or immutable objects.

Note that `sorted()` always returns a sorted list of each element in its argument, regardless of which type of sequence it is passed. Thus, invoking `sorted()` on a string returns a list of sorted characters from the string, rather than a sorted string.

In [45]:
print('sorted(list_1):', sorted(list_1)) 
print('list_1:        ', list_1)
print()
print('sorted(single_instance_str):', sorted(single_instance_str)) 
print('single_instance_str:        ', single_instance_str)

sorted(list_1): [1, 2, 3, 5]
list_1:         [2, 3, 5, 1]

sorted(single_instance_str): ['\n', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', '?', 'c', 'd', 'e', 'e', 'f', 'f', 'f', 'k', 'k', 'n', 'n', 'n', 'n', 'o', 'p', 'p', 'v', 'w', 'w', 'w', 'w', 'y']
single_instance_str:         p,k,f,n,f,n,f,c,n,w,e,?,k,y,w,n,p,w,o,e,w,v,d



The `sorted()` function sorts its argument in ascending order by default.

An optional keyword argument, `reverse`, can be used to sort in descending order. The default value of this optional parameter is `False`; to get non-default behavior of an optional argument, we must specify the name and value of the argument, in this case, `reverse=True`.

In [46]:
print(sorted(single_instance_str)) 
print(sorted(single_instance_str, reverse=True))

['\n', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', '?', 'c', 'd', 'e', 'e', 'f', 'f', 'f', 'k', 'k', 'n', 'n', 'n', 'n', 'o', 'p', 'p', 'v', 'w', 'w', 'w', 'w', 'y']
['y', 'w', 'w', 'w', 'w', 'v', 'p', 'p', 'o', 'n', 'n', 'n', 'n', 'k', 'k', 'f', 'f', 'f', 'e', 'e', 'd', 'c', '?', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', ',', '\n']
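
If a sorted or reversed *string* is wanted rather than a list, one possible approach is to pass the result to `str.join`. The sketch below uses a made-up string, `letters`; note that `reversed()` returns an iterator, so it is wrapped in `''.join()` or `list()` to see its contents.

```python
letters = 'python'

sorted_letters = sorted(letters)              # a list of characters
sorted_string = ''.join(sorted_letters)       # back to a string
reversed_string = ''.join(reversed(letters))  # reversed() yields characters
reversed_list = list(reversed([1, 2, 3]))     # materialize the iterator

print(sorted_letters)    # ['h', 'n', 'o', 'p', 't', 'y']
print(sorted_string)     # hnopty
print(reversed_string)   # nohtyp
print(reversed_list)     # [3, 2, 1]
```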


#### Tuples (immutable list-like sequences)

A tuple is an ordered, immutable sequence of 0 or more comma-delimited values enclosed in parentheses (`(`, `)`). Many of the functions and methods that operate on strings and lists also operate on tuples.

In [74]:
x = (5, 4, 3, 2, 1)  # a tuple
print('x =', x)
print('len(x) =', len(x))
print('x.index(3) =', x.index(3))
print('x[2:4] = ', x[2:4])
print('x[4:2:-1] = ', x[4:2:-1])
print('sorted(x):', sorted(x))  # note: sorted() always returns a list

x = (5, 4, 3, 2, 1)
len(x) = 5
x.index(3) = 2
x[2:4] =  (3, 2)
x[4:2:-1] =  (1, 2)
sorted(x): [1, 2, 3, 4, 5]


In [68]:
dir(x)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'count',
 'index']

Note that the methods that modify lists (e.g., `append()`, `remove()`, `reverse()`, `sort()`) are not defined for immutable sequences such as tuples (or strings). Invoking one of these sequence modification methods on an immutable sequence will raise an `AttributeError` exception.

In [48]:
x.append(6)

AttributeError: 'tuple' object has no attribute 'append'

However, one can approximate these modifications by creating modified copies of an immutable sequence and then re-assigning it to a name.

In [75]:
x = x + (6,)  # need to include a comma to differentiate tuple from numeric expression
x

(5, 4, 3, 2, 1, 6)

REMEMBER that Python has a **`+=`** operator which is a shortcut for the *`name = name + new_value`* pattern. This can be used for addition (e.g., `x += 1` is shorthand for `x = x + 1`) or concatenation (e.g., `x += (7,)` is shorthand for `x = x + (7,)`).

In [50]:
x += (7,)
x

(5, 4, 3, 2, 1, 6, 7)
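
One subtlety, sketched with made-up names: for an immutable tuple, `+=` builds a new object and rebinds the name, whereas for a mutable list `+=` extends the existing object in place. A second name bound to the original object makes the difference visible.

```python
t = (1, 2)
t_alias = t
t += (3,)          # builds a new tuple and rebinds t
print(t)           # (1, 2, 3)
print(t_alias)     # (1, 2)

lst = [1, 2]
lst_alias = lst
lst += [3]         # extends the existing list in place
print(lst)         # [1, 2, 3]
print(lst_alias)   # [1, 2, 3]
```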

A tuple of one element must include a trailing comma to differentiate it from a parenthesized expression.

In [51]:
('a')

'a'

In [52]:
('a',)

('a',)

#### Conditionals

One common approach to handling errors is to *look before you leap (LBYL)*, i.e., test for potential exceptions before executing instructions that might raise those exceptions. 

This approach can be implemented using the `if` statement (which may optionally include an **`else`** and any number of **`elif`** clauses).

The following is a simple example of an `if` statement:

In [78]:
class_value = 'x'  # try changing this to 'e' or 'p'

if class_value == 'e':
    print('edible')
elif class_value == 'p':
    print('poisonous')
else:
    print('unknown')

unknown


Note that 

* a colon ('`:`') is used at the end of the lines with `if`, `else` or `elif`
* no parentheses are required to enclose the boolean condition (it is presumed to include everything between `if` or `elif` and the colon)
* the statements below each `if`, `elif` and `else` line are all indented

Python does not have special characters to delimit statement blocks (like the '{' and '}' delimiters in R); instead, sequences of statements with the same *indentation level* are treated as a statement block. The [Python Style Guide](https://www.python.org/dev/peps/pep-0008/) recommends using 4 spaces for each indentation level.

An `if` statement can be used to follow the LBYL paradigm in preventing the `ValueError` that `list.index()` raises when the requested value is not in the list:

In [82]:
attribute = 'bruises'  # try substituting 'bruises?' for 'bruises' and re-running this code

if attribute in attribute_names:
    i = attribute_names.index(attribute)
    print(attribute, 'is in position', i)
else:
    print(attribute, 'is not found')

bruises is not found


#### Seeking forgiveness vs. asking for permission (EAFP vs. LBYL)

Another perspective on handling errors championed by some pythonistas is that it is [*easier to ask forgiveness than permission (EAFP)*](http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#eafp-vs-lbyl).

As in many practical applications of philosophy, religion or dogma, it is helpful to *think before you choose (TBYC)*. There are a number of factors to consider in deciding whether to follow the EAFP or LBYL paradigm, including code readability and the anticipated likelihood and relative severity of encountering an exception.

In keeping with practices most commonly used with other languages, we will follow the LBYL paradigm throughout most of this primer. 

However, as a brief illustration of the EAFP paradigm in Python, here is an alternate implementation of the functionality of the code above, using a **`try/except`** statement.

In [81]:
attribute = 'bruises?'  # try substituting 'bruises' for 'bruises?' and re-running this code

i = -1
try:
    i = attribute_names.index(attribute)
    print(attribute, 'is in position', i)
except ValueError:
    print(attribute, 'is not found')

bruises? is in position 4


The Python *null object* is **`None`** (note the capitalization).

#### Defining and calling functions

Python *function definitions* start with the **`def`** keyword followed by a function name, a list of 0 or more comma-delimited *parameters* (aka 'formal parameters') enclosed within parentheses, and then a colon ('`:`'). 

A function definition may include one or more **`return`** statements to indicate the value(s) returned to where the function is called. It is good practice to include a short `docstring` to briefly describe the behavior of the function and the value(s) it returns.

In [64]:
def attribute_value(instance, attribute, attribute_names):
    '''Returns the value of attribute in instance, based on its position in attribute_names'''
    if attribute not in attribute_names:
        return None
    else:
        i = attribute_names.index(attribute)
        return instance[i]  # using the parameter name here

A *function call* starts with the function name, followed by a list of 0 or more comma-delimited *arguments* (aka 'actual parameters') enclosed within parentheses. A function call can be used as a statement or within an expression.

In [65]:
attribute = 'cap-shape'  # try substituting any of the other attribute names shown above
print(attribute, '=', attribute_value(single_instance_list, attribute, attribute_names))

cap-shape = k
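
Because `attribute_value` returns `None` for an unknown attribute, a caller can follow the LBYL style and check the result before using it. The sketch below repeats the function so that it runs on its own; the shortened `names` list, the `instance` list, and the `'smell'` attribute are made up for illustration, and the `for` loop is used ahead of its introduction.

```python
def attribute_value(instance, attribute, attribute_names):
    '''Returns the value of attribute in instance, or None if it is unknown'''
    if attribute not in attribute_names:
        return None
    i = attribute_names.index(attribute)
    return instance[i]

names = ['class', 'cap-shape', 'cap-surface']  # made-up, shortened list
instance = ['p', 'k', 'f']

results = []
for attribute in ['cap-shape', 'smell']:
    value = attribute_value(instance, attribute, names)
    results.append(value)
    if value is None:  # compare with None using is, not ==
        print(attribute, 'is not a known attribute')
    else:
        print(attribute, '=', value)
```

Run as written, this prints `cap-shape = k` followed by `smell is not a known attribute`.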
