# Programming in Python #

Now that you have some familiarity with the environment and tools we'll be working with, it's time to get to work and learn how to program in Python. I'm going to assume you have some basic familiarity with programming principles such as variables, conditionals and looping. I'll show you how to do these things in Python, but if you do not know what a variable is I suggest you stop, spend some time reading up on the basics and then come back to this.

## Python ##

Python is a general purpose, high-level, interpreted, dynamically typed, multi-paradigm programming language. It was created by Guido van Rossum, who began work on it in 1989, and it emphasizes simple, readable code and rapid development.

- General Purpose: Python is designed to be used for a variety of problems
- High-level: Python uses high-level abstractions such as variables and functions and deals with things like memory management for you
- Interpreted: Python code is not generally compiled ahead of time into machine code that the computer runs directly. Instead it is passed to a program called an interpreter, which executes it on the fly on any supported platform. This sacrifices speed for flexibility.
- Dynamically Typed: Variables in python do not have a type that you specify in the code (int, string etc.). The type belongs to the value, and the interpreter works it out when the code is run.
- Multi-Paradigm: Programming paradigms are ways of programming such as object-oriented, functional or procedural which use different abstractions for representing data and manipulations of data. Python supports most of these abstractions and so can be used to write programs in any of these styles or in a mix of them.
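
To make the dynamic typing point concrete, here is a minimal sketch in which one name is rebound to values of three different types. The variable names are just for illustration, and `print(...)` is called with a single argument so the snippet behaves the same on python 2.7 and python 3.

```python
# The same name can be rebound to values of different types;
# the type travels with the value, not with the variable.
value = 42
first_type = type(value).__name__      # 'int'

value = 'forty-two'
second_type = type(value).__name__     # 'str'

value = [4, 2]
third_type = type(value).__name__      # 'list'

print('{} -> {} -> {}'.format(first_type, second_type, third_type))
```

This flexibility is also why an unexpected reassignment can go unnoticed until some later operation fails.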

## Types and data structures ##

### Types ###

We're going to go over the built-in Python types as well as the numeric types introduced in Numpy which you will also be using heavily. Python is dynamically typed so any given variable may have any type assigned or reassigned to it. This gives a lot of flexibility but your main source of errors when dealing with data WILL be unexpected type assignments.

### Built-In Numeric Types ###

Python 2 has four distinct numeric types built in to the language. These are integers, long integers, floating point numbers and complex numbers. Integers hold whole numbers, positive or negative, with no fractional part, and are stored in a fixed-size machine word (at least 32 bits, 64 bits on many platforms). Long integers are the same thing but can hold whole numbers of arbitrary size. Floating point numbers hold numbers with a fractional part; they are double precision (64 bits on virtually all platforms), so their range and precision are limited. Complex numbers hold a real and an imaginary component, each stored as a floating point number.

In [1]:
# Declaring Numeric Types

x = 42    #Simple Integer
y = 42L   #Long Integer
z = 42.0  #Floating Point
a = 42J   #Complex Number

print 'Types are {} {} {} {}'.format( type(x),type(y),type(z),type(a) )

Types are <type 'int'> <type 'long'> <type 'float'> <type 'complex'>


### Numpy Numeric Types ###

For performance reasons numpy uses its own numeric types as well as its own container types.  Python's generic types and containers offer a lot of flexibility in terms of dynamic allocation and mixed type operations, but that flexibility comes at a cost.  The python interpreter has to do a lot of checking on the back end to determine what types are being used and their size in memory, and then pick a valid way to operate on them.  This prevents python from doing fast numeric operations, particularly matrix and vector operations, with its built in types.  Enter Numpy.

Numpy is a python module that provides fixed-size numeric types and homogeneous numeric containers so we can skip the checking and do blazing fast numeric and matrix operations in python. Numpy types and containers are the basis for almost all scientific computation in python so it is important for us to understand the difference and when to use them. Numpy types are fixed length, so you the programmer must decide the appropriate size and type for your needs and protect yourself from issues such as overflows and lost precision in floating point operations.

#### Signed Integers ####
Integer type of 8,16,32 or 64 bits. Can be negative.

In [3]:
import numpy as np

# Numpy signed integer types

a = np.int8(1)
b = np.int16(2)
c = np.int32(3)
d = np.int64(4)

print 'Signed Integers'
print 'a={} b={} c={} d={}\n'.format(a,b,c,d)


Signed Integers
a=1 b=2 c=3 d=4



#### Unsigned Integers ####
Integer type of 8,16,32 or 64 bits. Cannot be negative, represents greater range of positive numbers.

In [4]:
# Numpy unsigned integer types
a = np.uint8(1)
b = np.uint16(2)
c = np.uint32(3)
d = np.uint64(4)

print 'Unsigned Integers'
print 'a={} b={} c={} d={}\n'.format(a,b,c,d)

Unsigned Integers
a=1 b=2 c=3 d=4
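
The range each integer size can hold follows directly from its bit width. Here is a small sketch in plain Python that computes the smallest and largest value for each size. The helper name `int_range` is mine, not part of numpy, and the signed case assumes the usual two's complement layout.

```python
def int_range(bits, signed=True):
    """Return (smallest, largest) value representable in `bits` bits."""
    if signed:
        # one bit is spent on the sign, so the magnitude gets bits - 1
        return (-2 ** (bits - 1), 2 ** (bits - 1) - 1)
    # unsigned types start at zero and use every bit for magnitude
    return (0, 2 ** bits - 1)

for bits in (8, 16, 32, 64):
    print('{:>2} bits  signed={}  unsigned={}'.format(
        bits, int_range(bits), int_range(bits, signed=False)))
```

Any value outside these limits will not fit in the chosen type, which is worth keeping in mind for the exercise later in this section.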



#### Floating Point ####
Floating point numbers of 16, 32 or 64 bits. Represent decimal type numbers, always signed.

In [5]:
# Numpy floating point types
a = np.float16(1)
b = np.float32(2)
c = np.float64(3)

print 'Floating Point'
print 'a={} b={} c={}\n'.format(a,b,c)

Floating Point
a=1.0 b=2.0 c=3.0



#### Complex Numbers ####
Complex numbers of 64 or 128 bits. Represent numbers with an imaginary component.

In [6]:
# Numpy complex types
a = np.complex64(125 + 32j)
b = np.complex128(125 + 32j)

print 'Complex Numbers'
print 'a={} b={}\n'.format(a,b)

Complex Numbers
a=(125+32j) b=(125+32j)



As an exercise, figure out what is going on with the following variable initializations. This will give you some idea of what is meant when we say that numpy types can require greater care from a programmer than python's native numeric types.

In [None]:
# Bonus - Why do these assignments not give the output we expect?

a = np.int8(256)
b = np.uint8(512)
c = np.uint8(-1)
d = np.float16(0.1234567)
e = np.complex64(2147483649 + 2147483649j)

print 'Bonus'
print 'a={} b={} c={} d={} e={}'.format(a,b,c,d,e)

### Math Operators for Numeric Types ###

Python supports all the standard mathematical operations such as addition, subtraction and multiplication for its numeric types. Parentheses can be used to change order of operations and write longer expressions. Python also includes shorthand versions of common variable assignment operations such as adding a number to a variable and assigning the result to the original variable. Take a minute to read through the following code and write down the result you expect each line to produce. Are your predictions correct?

In [7]:
x = 2
y = 7

# Mathematical Operators
#
a = -x         #Negation
b = +x         #Does Nothing

c = x + y      #Addition
d = x - y      #Subtraction
e = x * y      #Multiplication
f = x / y      #Division
g = x // y     #Floored Division - drops the remainder
h = x % y      #Modulo - returns the remainder
i = (x + y)/x  #Parentheses - Change order of operations
j = x**y       #Exponentiation - X to the power of Y

print 'Common Operations'
print 'a={} b={} c={} d={} e={} f={} g={} h={} i={} j={}\n'.format(a,b,c,d,e,f,g,h,i,j)

Common Operations
a=-2 b=2 c=9 d=-5 e=14 f=0 g=0 h=2 i=4 j=128



#### Shorthand Operators ####
Shorthand operators (also called augmented assignment operators) give an abbreviated syntax for the common task of performing some operation on a variable and then setting the value of that variable to the result of the operation.

In [8]:
# Shorthand Operators
#
# shorthand assignment operators are available in Python for all the two-place operators ( +=, -=, *=, /=, **=, //=, %=)
# x += 1 is equivalent to x = x + 1

x += 1
y %= 2

print 'Shorthand Operations'
print 'x={} y={}'.format(x,y)

Shorthand Operations
x=3 y=1


### Same and Mixed Type Arithmetic ###

#### Same Type ####
If you wrote down the expected answers for the previous code you'll notice that in python 2.7 2 / 7 = 0 instead of the fraction 2/7. This is because both x and y are integers. By default any operation performed between two numbers of the same type will return an answer of the same type. Floats return floats, longs return longs and, as we saw, integers return integers. When dividing integers Python rounds down to the nearest whole number, which for 2 / 7 is zero. *Note: this default was changed in python 3, where / always performs true division, but you should still be aware of it until python 2 is fully phased out.*
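
Here is a short sketch of ways to get a predictable result regardless of which python version runs the code. It avoids a bare `/` between two integers entirely, so the values shown in the comments hold on both python 2.7 and python 3.

```python
# Floor division rounds toward negative infinity on every python version
floor_pos = 2 // 7        # 0
floor_neg = -7 // 2       # -4, not -3

# Converting one operand to float forces true division, even on python 2.7
true_div = float(2) / 7   # roughly 0.2857

# The remainder pairs with floor division: (a // b) * b + a % b == a
remainder = -7 % 2        # 1

print('{} {} {:.4f} {}'.format(floor_pos, floor_neg, true_div, remainder))
```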

#### Mixed Type ####
There are also rules for what is returned when performing operations between different types.   

The priority of each type from lowest to highest is

**Integer -> Long Integer -> Float -> Complex Number**

Any operation between numbers of two different types will return a number with the same type as the highest priority variable in your expression. For instance adding an Integer and a Float will always return a Float.

In [9]:
# Operations for same and mixed types

a = 5 * 2      #int and int -> int
b = 5 * 2.0    #int and float -> float
c = 5 * 2L     #int and long -> long
d = 5 * 2J     #int and complex -> complex

e = 5.0 * 2L   #float and long -> float
f = 5.0 * 2J   #float and complex -> complex

g = 5L * 2J    #long and complex -> complex

print 'Mixed Type Operations'
print 'a={} b={} c={} d={} e={} f={} g={}\n'.format( type(a),type(b),type(c),type(d),type(e),type(f),type(g) )

Mixed Type Operations
a=<type 'int'> b=<type 'float'> c=<type 'long'> d=<type 'complex'> e=<type 'float'> f=<type 'complex'> g=<type 'complex'>



#### Numpy Mixed Type ####
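When mixing numpy and standard types, the result is generally the numpy version of the highest priority type, and numpy tries to preserve maximum precision. In the numpy version used for the output below, a uint8 x float returns float64 and a float32 x int also returns float64. These promotion rules depend on the numpy version (numpy 2.0 changed how python scalars are promoted), so check the result types on your own installation.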

In [10]:
# Numpy Mixed Type operations
h = np.uint8(4) * 4.0  # numpy uint8 and float -> numpy float64
i = np.float32(4) * 4  # numpy float32 and int -> numpy float64

print 'Mixing Numpy and Default Types'
print 'h={} i={}'.format( type(h), type(i))

Mixing Numpy and Default Types
h=<type 'numpy.float64'> i=<type 'numpy.float64'>


### Comparison Operations ###

Python also supports all standard comparison operators for its numeric types. You can perform comparisons between numbers of different types and the comparisons will give you results based on their values. For each comparison you will get back a Boolean value (True, False).  

### Booleans ###

Booleans in python are implemented as a subclass of the integer type. Internally they are 1 for True and 0 for False. This means you can perform arithmetic using boolean values. For instance you could take the sum of a list of booleans to get the count of True values, but generally speaking doing too much math with booleans is bad practice.

In [11]:
# Comparison Operators
# 
# Each comparison returns a Boolean value: True (1) if the comparison holds, False (0) otherwise

x = 8
y = 22

a = x < y   #Less than
b = x <= y  #Less than or equal to
c = x > y   #Greater than
d = x >= y  #Greater than or equal to
e = x == y  #Equal to
f = x != y  #Not Equal to

print 'All Numeric Comparison Operations'
print 'a={} b={} c={} d={} e={} f={}'.format( a,b,c,d,e,f )

All Numeric Comparison Operations
a=True b=True c=False d=False e=False f=True
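
Since booleans are integers under the hood, counting how many comparisons came out True is just a sum. A minimal sketch, with made-up `readings` data:

```python
readings = [1, 5, 2, 8, 3]                # hypothetical data

# One boolean per reading: is it greater than 2?
above_two = [r > 2 for r in readings]     # [False, True, False, True, True]

# True counts as 1 and False as 0, so sum() counts the True values
count_above = sum(above_two)

# bool really is a kind of int
bool_is_int = isinstance(True, int)

print('{} readings above 2, bool is int: {}'.format(count_above, bool_is_int))
```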


### Strings and Unicode Strings ###

There are two string types in Python. The default string class 'str' and unicode strings which contain a broader range of characters. String types are used to store written text and characters of any sort. There is no character type in Python so even single letters are strings of length one. Any sort of data including numbers can be interpreted as a string. Strings are immutable in Python which means you cannot change them once they are declared. Strings are also objects in Python, not primitive types, which means the strings themselves can have functions and variables attached to them. Almost everything in Python is secretly an object.

*Note: In python 3 the default string type was changed to unicode strings.*

#### Encodings ####

Encodings are the mappings between binary representations of characters on a computer and the letters displayed on screen. By default Python 2 strings use the ASCII encoding, which covers common English characters. There are many other encodings available for the string type, and it is possible to use a different one if for instance you need to analyze text in other languages. A common error is performing equality comparisons between strings with different encodings. Even if they display the same characters the comparison may return False.

In [12]:
# Declaring String

a = "some text"   # anything wrapped in quotes is a string
b = 'some text'   # anything wrapped in single quotes is also a string
c = '234453'      # including numbers

# Declaring Unicode Strings

d = u"some text"
e = u'some text'
f = u'234453'

print 'Different Ways of Declaring Strings'
print 'a={} b={} c={} d={} e={} f={}'.format( type(a),type(b),type(c),type(d),type(e),type(f) )

# Bonus - Uncomment the following code, why does this return an error?
# a = 'abcdefg'
# a[0] = 'b'

Different Ways of Declaring Strings
a=<type 'str'> b=<type 'str'> c=<type 'str'> d=<type 'unicode'> e=<type 'unicode'> f=<type 'unicode'>
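
To see why comparisons across encodings can fail, here is a small sketch that encodes the same unicode text two ways. The sample word is my own choice; `utf-8` and `latin-1` are standard codec names that ship with python.

```python
# "cafe" with an accented final e, written with an escape so the source stays ASCII
text = u'caf\xe9'

as_utf8 = text.encode('utf-8')        # the accented letter takes two bytes
as_latin1 = text.encode('latin-1')    # the accented letter takes one byte

# The raw bytes differ even though both represent the same text
same_bytes = (as_utf8 == as_latin1)

# Decoding each with its own encoding recovers identical unicode strings
same_text = (as_utf8.decode('utf-8') == as_latin1.decode('latin-1'))

print('{} vs {} bytes, equal bytes: {}, equal text: {}'.format(
    len(as_utf8), len(as_latin1), same_bytes, same_text))
```

The safe habit this suggests is to decode incoming data to unicode first and compare afterwards.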


## Data Structures ##

Python's data structures have the same sort of style we saw in its built in types. Python offers a few main data structures which are flexible, powerful and accept heterogeneous types. The "list" structure can be a list of any mix of types or other data structures you want to put in it. Numpy data structures are brought in when the flexibility is less valuable than fast computation and the added utility of matrix operations.

**Mutable** - An object or data structure is mutable when its contents can be modified after it is created. An immutable object cannot be modified once it exists.

*ex: numbers, strings and tuples are immutable; lists, sets and dictionaries are mutable*

**Hashable** - An object or data structure is hashable when it can be converted (hashed) into an integer key that is based on the contents of the object and does not change over the object's lifetime. Different objects can occasionally share a hash value, so the key is not guaranteed to be unique. Python's built-in immutable types are hashable (a tuple only if everything inside it is), while its mutable containers such as lists, sets and dictionaries are not and raise an error if you try to hash them.

*ex: tuples are hashable and later we will see they can be used as keys in dicts, lists cannot*

**Container** - Container in python refers to any data structure which 'contains' other items. The container structures covered here all support iteration (looping over the contained items) and membership testing (is item1 in container2?)

*ex: dictionary, list, set, tuple, string... pretty much everything other than the numeric types is a container*

**Sequence** - Sequence in python refers to any data structure which is both a container and preserves the order of items it contains. Sequences implement all the methods of containers but also support indexing and slicing.

*ex: list, string, tuple*

**Generators** - A generator is an object that supports "lazy" iteration. Rather than building all of its items up front and storing them in memory, a generator produces one item at a time, only when the next item is asked for.
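
The definitions above can each be checked in a couple of lines. This is only a sketch using throwaway names (`numbers`, `point`, `squares`), but it shows the behaviour each term describes.

```python
# Mutable: a list can be changed in place
numbers = [1, 2, 3]
numbers[0] = 99

# Immutable: the same assignment on a tuple raises TypeError
point = (1, 2, 3)
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Hashable: a tuple of numbers has a hash, a list does not
tuple_hash_is_int = isinstance(hash(point), int)
try:
    hash(numbers)
    list_hashable = True
except TypeError:
    list_hashable = False

# Container: membership testing with `in`
has_two = 2 in numbers

# Sequence: order is preserved, so indexing and slicing work
last_two = numbers[-2:]

# Generator: nothing is computed until an item is requested
squares = (n * n for n in range(10))
first_square = next(squares)
second_square = next(squares)

print('{} {} {} {} {} {}'.format(tuple_changed, list_hashable, has_two,
                                 last_two, first_square, second_square))
```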

### Built-In Data Structures ###

#### Lists ####
Lists are a simple ordered sequence of anything you want to put in them. Lists can contain either primitive types such as integers or floats or they can contain other container types such as tuples, strings and other lists. Lists are containers, sequences and mutable.

In [13]:
# List structure
# Declare lists using any comma separated sequence between square brackets

a = []                              # empty list
b = [1,2,3,4,5]                     # list of integers
c = ['abc', 1, 2, 3, 4.1]           # mixed type list
d = [ [1,2,3], [4,5,6], [7,8,9] ]   # list of lists

# Access items using integer index between square brackets
print b[0]
print c[0]
print d[0]

# Nested lists can be indexed twice
print d[0][1]

# Bonus - in python lists are often created using a functional tool called a list-comprehension
# what do you expect list e to contain?
e = [x for x in range(100)]

1
abc
[1, 2, 3]
2


#### Tuples ####
Tuples are immutable lists. They are exactly the same in every way except that their values cannot be changed. Tuples are often used for returning values from functions and other places where you do not want your data modified.

In [14]:
# Tuple structure
# Declare tuples using any comma separated sequence between parentheses

a = ()                              # empty tuple
b = (1,2,3,4,5)                     # tuple of integers
c = ('abc', 1, 2, 3, 4.1)           # mixed type tuple
d = ( (1,2,3), (4,5,6), (7,8,9) )   # tuple of tuples

# Access items using integer index between square brackets
print b[0]
print c[0]
print d[0]

# Nested tuples can be indexed twice
print d[0][1]

# Bonus - while you might expect the following syntax to create a tuple it is actually
# reserved for creating generator expressions
e = (x for x in range(100))
f = tuple( (x for x in range(100) ))  # typecast to a tuple if you need to generate tuples on the fly

1
abc
(1, 2, 3)
2
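
Returning a tuple from a function and unpacking it at the call site is probably the most common use of tuples you will run into. A minimal sketch follows; the function `min_max` is a made-up example, not a built-in.

```python
def min_max(values):
    """Return the smallest and largest item as a 2-tuple."""
    return (min(values), max(values))

result = min_max([4, 9, 1, 7])

# Tuple unpacking assigns each position to its own name
lowest, highest = result

# A one-item tuple needs a trailing comma; parentheses alone do nothing
single = (5,)
not_a_tuple = (5)

print('{} {} {} {}'.format(result, lowest, highest, type(single).__name__))
```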


#### Sets ####

Sets are unordered collections which do not allow duplicate values. They may contain items of multiple types as long as every item is hashable, so a set can hold tuples but not lists. Sets are rarely used for basic data storage but are commonly used for eliminating duplicates and for set-theoretic operations such as union, intersection and difference.

In [15]:
# Set structure
# Sets are declared by type-casting any sequence type (list, tuple, etc.) using the set() function

a = [1,1,2,2,3,3,4,4]   # list
b = (3,3,4,4,5,5,6,6)   # tuple

c = set(a)              # set from list
d = set(b)              # set from tuple

print 'List and Tuple'
print 'a={} b={}\n'.format(a,b)
print 'List and Tuple cast to Sets'
print 'c={} d={}\n'.format(c,d)

# Sets reuse python's bitwise operators, so for instance the intersection of two sets is written
# a & b, which gives the members present in both a AND b
# note that ^ is the symmetric difference: members present in exactly one of the two sets

print 'Union(c,d): {}'.format(c | d)
print 'Intersection(c,d): {}'.format(c & d)
print 'Symmetric Difference(c,d): {}'.format(c ^ d)

List and Tuple
a=[1, 1, 2, 2, 3, 3, 4, 4] b=(3, 3, 4, 4, 5, 5, 6, 6)

List and Tuple cast to Sets
c=set([1, 2, 3, 4]) d=set([3, 4, 5, 6])

Union(c,d): set([1, 2, 3, 4, 5, 6])
Intersection(c,d): set([3, 4])
Symmetric Difference(c,d): set([1, 2, 5, 6])
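
Python also has a one-sided difference operator, `-`, which is distinct from `^`. This short sketch reuses the same two sets to show how the two relate.

```python
c = set([1, 2, 3, 4])
d = set([3, 4, 5, 6])

only_in_c = c - d          # members of c that are not in d
only_in_d = d - c          # members of d that are not in c
in_exactly_one = c ^ d     # symmetric difference: in one set but not both

# The symmetric difference is the union of the two one-sided differences
matches = (in_exactly_one == (only_in_c | only_in_d))

print('{} {} {}'.format(sorted(only_in_c), sorted(only_in_d), matches))
```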


#### Dictionaries ####

A dictionary is an unordered set of key-value items. In other languages these sorts of objects are called associative arrays. To retrieve an item from a dictionary you look it up by its key rather than by its position. Any hashable object can be a key, and anything can be a value. A common use for dictionaries is implementing directory systems where you look up information such as a name or phone number using an id number as the key.

*note: Item lookups for dictionaries are implemented using hash tables and are close to a constant time lookup, O(1), under most conditions*  

In [16]:
# Dictionary structure (dict)
# Declare dictionaries using key:value pairs separated by commas between curly brackets

a = {}                                                          # empty dict
b = {'john':123789133, 'alice':4562134343 }                     # dict of integers with string names as keys
c = {'john':'abc', 1:0.56, 4.56:300j }                          # dict with mixed-type keys and values
d = { ('john','mary'):123, ('fred','susan'):456 }               # dict using hashable tuples as keys

# Access items using key value between square brackets
print b['john']
print c[4.56]
print d[('fred','susan')]

# Bonus - even experienced python programmers don't always know that you can do dictionary-comprehensions
# what does dict e contain?  Hint: ord() returns the ascii code of a character
import string
e = { ord(x):x for x in string.ascii_letters }

123789133
300j
456
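
Looking up a key that is not in the dictionary raises an error, so it helps to know the standard ways of checking first. A small sketch with a hypothetical `phone_book`:

```python
phone_book = {'john': 123789133, 'alice': 4562134343}   # hypothetical data

# Assigning to a new key adds an entry
phone_book['bob'] = 5550000

# Membership testing checks the keys
has_alice = 'alice' in phone_book

# get() returns None (or a default you supply) instead of raising an error
missing = phone_book.get('carol')
fallback = phone_book.get('carol', 0)

# Square-bracket lookup of a missing key raises KeyError
try:
    phone_book['carol']
    raised = False
except KeyError:
    raised = True

print('{} {} {} {} {}'.format(len(phone_book), has_alice, missing, fallback, raised))
```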


### Indexing and Slicing ###

Recall that many of the data structures we just discussed are sequence types. A sequence is an ordered list of some item and because the order of items is preserved we can refer to items using their location (index). Python provides two standard ways of working with indexes that are the same across all sequence types: indexing and slicing. Sequence types include lists and tuples but also strings and non-standard types such as numpy and pandas data structures. 

#### Indexing ####

There are two types of indexing in python, standard and reverse indexing. For standard indexing the first item in a sequence is labeled with the integer 0, the second is 1 and so on. Reverse indexing starts at the end of the sequence. Because -0 is the same integer as 0, reverse indexing starts at -1 for the last item, then -2 and so on. Square brackets are used for all indexing operations.

In [17]:
# String Indexing
my_string = "the devil went down to georgia"

# Normal Indexing - specify a position using an integer index starting at zero
# begins at the leftmost position on the string and counts up to the right

# "t h e   d e v i l.."
# "0 1 2 3 4 5 6 7 8.."

a = my_string[0]   

# Reverse Indexing - specify a character using a negative integer starting at -1
# begins at the rightmost position on the string and counts down to the left

# "  t  o     g  e  o  r  g  i  a"
# "-10 -9 -8 -7 -6 -5 -4 -3 -2 -1"

b = my_string[-1]   # last character of the string

print 'Standard and Reverse Indexing'
print 'index1 = {} index1r = {}'.format( a,b )

Standard and Reverse Indexing
index1 = t index1r = a
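
The two indexing schemes refer to the same positions, which the following sketch checks. It also shows what happens when an index runs past the end of the sequence.

```python
my_string = "the devil went down to georgia"

# The last item can be reached from either direction
last_by_negative = my_string[-1]
last_by_length = my_string[len(my_string) - 1]

# In general, index -k refers to the same item as index len(sequence) - k
same_item = (my_string[-7] == my_string[len(my_string) - 7])

# Indexing one past the end raises IndexError
try:
    my_string[len(my_string)]
    out_of_range = False
except IndexError:
    out_of_range = True

print('{} {} {} {}'.format(last_by_negative, last_by_length, same_item, out_of_range))
```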


#### Slicing ####

Slices are one of the nice features of Python that you will not find in every other language. Slices are a way of pulling out subsets of any sequence object. For instance if you wanted to take the first 10 letters of a string you would use a slice. What if we need to reduce a large dataset and want to pull out every 10th item? Easily accomplished with slices. The syntax on slices is fairly intuitive once you understand indexing and is very powerful. 

In [18]:
# Slices

my_string = "abcdefghijklmnopqrstuvwxyz"

# Substring from position 0 up to but not including position 5 
a = my_string[0:5]
b = my_string[:5]    # defaults to 0 if we put nothing so these statements are equivalent

# Substring from position -5 up to but not including position -1
c = my_string[-5:-1]
d = my_string[-5:]   # defaults to end of string if we put nothing so these are NOT equivalent

# Step Option
# can include a step argument that provides the size of 'step' to make
# between start and end points

e = my_string[0:10:2]  # steps of size 2 starting at 0 up to 10

print "a= {} b= {} c= {} d= {} e= {}".format(a,b,c,d,e)


#Bonus
# why does this print a reversed string?
print my_string[::-1]

a= abcde b= abcde c= vwxy d= vwxyz e= acegi
zyxwvutsrqponmlkjihgfedcba
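
Slices work the same way on lists as on strings. Here is a sketch of the "every 10th item" case mentioned above, using a list of 100 integers as a stand-in for a real dataset.

```python
data = list(range(100))      # stand-in for a larger dataset

every_tenth = data[::10]     # start to end in steps of 10
first_ten = data[:10]        # positions 0 through 9
last_three = data[-3:]       # the final three items

# Unlike indexing, a slice that runs past the end is quietly cut short
past_the_end = data[95:200]

print('{} {} {}'.format(every_tenth, last_three, past_the_end))
```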


#### Assigning to Slices ####

You will sometimes find it useful to assign values to a slice of a sequence. Be aware this only works for mutable sequences, so you cannot do this for strings or tuples. When the slice uses a step (as in my_list[::2]) the sequence you assign must have exactly the same length as the slice. A plain slice with no step can be replaced by a sequence of any length, which grows or shrinks the list.

In [19]:
# Slice Assignment
import random

my_list = [0,1,2,3,4,5,6,7,8,9]
print my_list

# Set the first half of the list equal to 0
my_list[:5] = [0,0,0,0,0]
print my_list

# Set every other number to a random value between -1 and -10
my_list[::2] = [ -random.randint(1,10) for x in range(5)]
print my_list

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 0, 0, 0, 0, 5, 6, 7, 8, 9]
[-10, 0, -4, 0, -1, 5, -4, 7, -1, 9]
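
For a plain slice with no step, python's lists accept a replacement of a different length; the equal-length requirement only applies to slices that use a step. A short sketch with a throwaway `letters` list:

```python
letters = ['a', 'b', 'c', 'd', 'e']

# Replace two items with one: the list shrinks
letters[1:3] = ['X']
after_shrink = list(letters)

# Assign to an empty slice: the new items are inserted and the list grows
letters[1:1] = ['p', 'q']
after_grow = list(letters)

# With a step, the lengths must match exactly or python raises ValueError
try:
    letters[::2] = ['only one']
    stepped_ok = True
except ValueError:
    stepped_ok = False

print('{} {} {}'.format(after_shrink, after_grow, stepped_ok))
```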


## Conditionals ##

### If, Elif and Else ###

Conditionals are flow control statements that determine whether a block of code is executed based on some variable or state of the program. Python does not have a switch statement, so all branching is done with some combination of three conditional statements: if, elif and else.

**if** - executes a block of code when the attached statement evaluates to True

**elif** - must follow an if or another elif, executes a block of code when all preceding conditions evaluate to false and its own condition evaluates to true (is short for else if)

**else** - executes a block of code when all other if and elif statements have returned false

The most common mistake made with conditionals in Python is confusing if and elif. When you want to guarantee that only one block of code is executed by a conditional block you should always use the if, elif, elif, else pattern. 

In [20]:
# Conditionals - common patterns

# Execute one or more blocks based on some condition
print 'Repeated If Block'
x = 5
if x > 2:
    print 'x > 2'
if x > 3:
    print 'x > 3'
if x > 4:
    print 'x > 4'

# Execute only one block based on some condition
# note that while all statements are true only the first valid one is executed
print '\nIf, Elif, Else Block'
y = 5
if y > 2:
    print 'y > 2'
elif y > 3:
    print 'y > 3'
elif y > 4:
    print 'y > 4'
else:
    print 'y <= 2'

Repeated If Block
x > 2
x > 3
x > 4

If, Elif, Else Block
y > 2


## Loops & Iteration ##

### While Loops ###

While loops in Python will continue running a block of code as long as an attached condition evaluates to true. They are not used often in data analysis since a for loop is usually preferable but you may see them for consuming external data streams, game loops or waiting for user input.

In [21]:
# Sum all numbers from 0 to 4 using a while loop
counter = 0
c_sum = 0
while counter < 5:
    c_sum += counter
    counter += 1
    
print 'Sum = {}'.format(c_sum)

# Bonus - an obscure but useful bit of python syntax, you can use the else keyword with any loop construct to
# perform a final action after the loop finishes normally (the else block is skipped if the loop exits via break)
counter = 0
c_sum = 0
while counter < 5:
    c_sum += counter
    counter += 1
else:
    c_avg = float(c_sum) / float(counter)
        
print 'Sum = {} Avg = {}'.format(c_sum, c_avg)

Sum = 10
Sum = 10 Avg = 2.0
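
The loop `else` block earns its keep when the loop may also exit early through `break`, because the `else` only runs when no `break` happened. Here is a minimal sketch; the function name `find_first_multiple` is made up for the example.

```python
def find_first_multiple(limit, divisor):
    """Describe the first multiple of `divisor` found in 1 .. limit - 1."""
    candidate = 1
    while candidate < limit:
        if candidate % divisor == 0:
            result = 'found {}'.format(candidate)
            break                  # leaving via break skips the else block
        candidate += 1
    else:
        # only reached when the loop condition became false on its own
        result = 'none below {}'.format(limit)
    return result

print(find_first_multiple(10, 4))
print(find_first_multiple(3, 4))
```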


### For Loops ###

Python has a very powerful and flexible syntax for performing for loops. Rather than using counters and indices as in C++ or Java, Python for loops retrieve items one at a time from any iterable object, that is, any object whose `__iter__()` method returns an iterator with a `next()` method (`__next__()` in python 3), until the iterator is exhausted. This includes all the container types we've talked about so far and even user defined types.

In [22]:
# A simple for loop over a list
container = ['a', 'b', 'c', 1, 2, 3]

for item in container:
    print item

a
b
c
1
2
3
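
To show that user defined types really can take part in for loops, here is a sketch of a small class that defines `__iter__`. The class name `Countdown` is hypothetical, and writing `__iter__` as a generator (with `yield`) is just one convenient way to do it.

```python
class Countdown(object):
    """A user defined type that can be looped over: start, start - 1, ..., 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Using yield makes this method return a fresh iterator on every call
        current = self.start
        while current > 0:
            yield current
            current -= 1

collected = []
for number in Countdown(3):
    collected.append(number)

print(collected)
```

Because the for loop only relies on `__iter__`, the same object also works with `list()`, `sum()` and anything else that consumes an iterable.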


#### Common For Loop Patterns ####

Easy, right? For many problems that is as much as you'll need, but here are a number of common patterns for using loops in python that will make some problems much easier.

In [23]:
# For-Range Loops
# range(x) is a python function that returns a list of x integers counting up from zero
# using a for loop over the result of a range function gives you a loop that runs a fixed number of times
print 'range loop'
for x in range(10):
    print x
    
# For-XRange Loops
# xrange() is equivalent to range() but instead of building a list it returns a lazy object that produces each number on demand
# in python 2 prefer xrange unless you need the whole list pre-generated (python 3 removed xrange and made range itself lazy)
print '\n xrange loop (generator)'
for x in xrange(10):
    print x
    
# Multi-Item Loops
# use unpacking syntax to pull out multiple values from an object, for instance a list of tuples
print '\n loop with unpacking'
nested = [('a','b','c'),('d','e','f')]
for x,y,z in nested:
    print '{} {} {}'.format(x,y,z)
    
# Enumerate Loops
# enumerate returns an enumerate object, an iterator that yields tuples of (item index, item)
# using tuple unpacking we can easily retrieve both to use in a loop, use the enumerate pattern whenever you
# need both the item and its index
print '\n enumerate loop'
letters = ['a','b','c','d']
for index, item in enumerate(letters):
    print '{} {}'.format(index, item)
    
# Dictionary Loops
# python dictionaries implement an iter() method that returns keys so you can write loops like this
# note - the builtin iter() method is faster than calling dictionary.keys() so use the built-in
print '\n dictionary keys loop'
dictionary = {'a':1, 'b':2, 'c':3}
for x in dictionary:
    print 'key={}'.format(x)
    
# Dictionary items() loop
# items() returns a list of key value pairs for a dictionary you can loop over
print '\n dictionary key and values loop'
for key, val in dictionary.items():
    print 'key={} value={}'.format(key,val)
    
# Dictionary iteritems() loop
# iteritems() returns an iterator that produces key value pairs for a dictionary one at a time
# use iteritems() unless there is a reason to pre-generate the list
print '\n dictionary key and values loop (generator)'
for key, val in dictionary.iteritems():
    print 'key={} value={}'.format(key,val)    


range loop
0
1
2
3
4
5
6
7
8
9

 xrange loop (generator)
0
1
2
3
4
5
6
7
8
9

 loop with unpacking
a b c
d e f

 enumerate loop
0 a
1 b
2 c
3 d

 dictionary keys loop
key=a
key=c
key=b

 dictionary key and values loop
key=a value=1
key=c value=3
key=b value=2

 dictionary key and values loop (generator)
key=a value=1
key=c value=3
key=b value=2


### Break, Continue and Pass ###
Python also has several statements which can alter the normal flow of loops. While these can be useful they are easy to use badly. Think three times whether you could eliminate a break or continue statement by rewriting your loop before you use one. 

**break** - immediately terminates the innermost loop

**continue** - immediately stops the current iteration of the loop and begins the next one

**pass** - placeholder for when a statement is required syntactically, does nothing

In [24]:
# For loop with break statement
# what should the output of this be?
print 'Loop with break'
for x in range(10):
    print x
    break
    
# For loop with continue statement
# what should the output of this be?
print '\nLoop with continue'
for x in range(10):
    continue
    print x
    
# For loop with pass
print '\nLoop with pass'
for x in range(10):
    pass

Loop with break
0

Loop with continue

Loop with pass
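
The loops above are deliberately bare. A slightly more realistic sketch, with made-up data in which 0 marks the end of the useful values, shows the usual roles: `continue` skips items you do not want and `break` stops as soon as there is nothing left to do.

```python
values = [3, -1, 4, -1, 5, 0, 9]    # hypothetical data, 0 marks the end

positives = []
for v in values:
    if v == 0:
        break        # stop the loop entirely, the 9 is never visited
    if v < 0:
        continue     # skip this item and move on to the next one
    positives.append(v)

print(positives)
```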


## Functions ##

A function is a programming abstraction that defines a block of code to be executed when the function is called and then returns control back to the calling statement. Functions can take any sort of object or variable as arguments and can return any type of object. Functions should be modular and, wherever possible, avoid *side effects* (changes to the state of the program outside of the function such as modifying global variables).

In python functions can be written using either the def keyword to create a named function, or the lambda keyword to create an anonymous (unnamed) function. Lambdas are good for expressing simple mathematical relationships and for use with other functional tools but for anything more hefty than that use named functions.

In [25]:
# Named Function
# def keyword followed by name(arguments) and return return_var
def scalarAdd(x,y,n):
    '''Returns the sum of two inputs with each scaled by n'''
    return (x*n) + (y*n)

print 'Named Function Call'
print scalarAdd(4,1,2)

# Lambda Function
# lambda arguments: return_var
s = lambda x,y,n: (x*n) + (y*n) 
 
print '\nLambda Function Call'
print s(2,3,1)

# Bonus - python functions are also objects and you can access 
# their special fields such as the docstring we defined for scalarAdd
print '\nGet Function Docstring'
print scalarAdd.__doc__

Named Function Call
10

Lambda Function Call
5

Get Function Docstring
Returns the sum of two inputs with each scaled by n
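
Where lambdas tend to be most useful is as short throwaway arguments to other functions. Here is a sketch using the built-in `sorted()` and `map()`; the `scores` data is made up.

```python
scores = [('john', 72), ('alice', 91), ('fred', 85)]   # hypothetical data

# key= takes a function that picks the value to sort by, here the score
by_score = sorted(scores, key=lambda pair: pair[1])

# map applies a function to every item; list() makes the result a list on any version
names = list(map(lambda pair: pair[0], by_score))

print(by_score)
print(names)
```

If the lambda grows beyond a single simple expression, that is usually the point to switch to a named function.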


### Function Scope ###
A scope is a domain of bindings between variable names and their associated objects. Python functions have their own scope which is checked first when the python interpreter tries to find an object. Variables declared within functions are considered local and cannot be referenced outside of the function because they are not part of the outside scope. Inside of the function you can use both variables that are part of the local scope and those of any *enclosing* scopes. For instance if the function is nested in another function it can read variables from that function as well as anything from the global scope. (A class body does not count as an enclosing scope for its methods; they reach class data through the class or instance instead.)

#### Scope Resolution Order ####
It is possible for variables accessible from the same scope to use identical names. When this is the case which variable actually gets used? What do you think will happen when we run the code below?

In [26]:
x = 'outside variable'

def a():
    x = 'a function local variable'
    print x
   
# call function a
a()

a function local variable
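
One detail the cell above does not show is that the outside x is left untouched by the call. The sketch below, using the made-up function names shadow and rebind, illustrates this: assigning to a name inside a function creates a new local variable unless the name is declared with the global keyword, in which case the assignment changes the outside variable. That second case is exactly the kind of side effect functions should avoid.

```python
x = 'outside variable'

def shadow():
    # Assignment creates a new local x; the outside x is not touched
    x = 'local value'
    return x

def rebind():
    # global tells Python that x refers to the module-level name
    global x
    x = 'changed by rebind'

shadow()
print(x)   # still the outside value

rebind()
print(x)   # the outside variable has been reassigned
```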


### LEGB Rule ###

**Local** - names assigned inside the current function are resolved first

**Enclosing** - local variables of enclosing functions are searched next, in order from innermost to outermost

**Global** - names assigned at the top level of the module, or declared inside a function with the global keyword, are searched next

**Built-In** - built-in Python function and variable names are searched last

*Note: Because built-ins are resolved last, any function you define with the same name as a built-in function will be called instead of the built-in. It is even possible to redefine things you would not expect, like the + and - operators, although only for your own classes. Shadowing a built-in is usually a bad idea and it can go very badly if you do it accidentally. Never give your functions the same name as a Python built-in; doing so can break other code in your program that relies on that built-in. A common error is to define a "sum" function.*
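
The lookup order can be sketched with a few small functions. All names below (outer, inner_with_local, shadow_demo and so on) are invented for this illustration. Each function returns whatever the interpreter finds first when it resolves a name.

```python
name = 'global name'

def outer():
    name = 'enclosing name'

    def inner_with_local():
        name = 'local name'
        return name          # Local is found first

    def inner_without_local():
        return name          # no local name, so the Enclosing scope is used

    return inner_with_local(), inner_without_local()

def uses_global():
    return name              # no local or enclosing name, so Global is used

def uses_builtin():
    return len('abc')        # len is not defined above, so Built-in is used

def shadow_demo():
    # A local called sum hides the built-in sum inside this function only
    sum = lambda values: 'not a real sum'
    return sum([1, 2, 3])

print(outer())
print(uses_global())
print(uses_builtin())
print(shadow_demo())
print(sum([1, 2, 3]))        # outside shadow_demo the built-in is intact
```

Had sum been redefined at the top level of the module instead of inside shadow_demo, every later call to sum in that module would have used the replacement.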

## Classes and Objects ##
Classes are a type of structure that allows you to define your own member variables and functions and then create multiple instances of that class (objects). Effectively you can create your own custom types, along with methods (functions attached to a class) to act on them, and then use them wherever they are needed. Object-oriented programming is a very powerful technique and I will not go into too much depth here. Use classes when it makes sense to group several different data structures and functions as a logical unit. For instance, we might want to create a participant class to store data from an experiment.

### Python Class Syntax ###
The syntax for declaring classes in Python is not as straightforward as in some other languages. By default a variable declared directly inside the class definition does not belong to an individual object; it belongs to the class, which means it is shared by every instance. Usually we do not want this, so instance variables are instead created by assigning to attributes of self inside the class's methods. self is not a keyword but the conventional name for the first parameter of every method; Python passes the current instance to it automatically when you call a method on an object.

*Note: there is no such thing as private or protected class variables in Python. Any class or instance variable can be changed or reassigned, so it is up to you as the programmer to avoid doing this when you shouldn't.*

In [27]:
# class declared using class keyword
class Participant:
    
    # __init__ is a special method that is called when a new instance of the class is created
    # self must be the first argument to __init__ so we know which instance to assign variables to
    def __init__(self, participant_id, condition, results):
        
        # Set the instance variables self.x to the values passed in as function arguments to __init__ 
        self.participant_id = participant_id
        self.condition = condition
        self.results = results
        
    # All additional methods must also have self as an argument to access instance data
    def meanResult(self):
        # float() avoids Python 2 integer division, which would truncate 2.5 to 2
        return float(sum(self.results)) / len(self.results)


# Initialize an instance of the Participant class
p1 = Participant(participant_id=1, condition=2, results=[1,2,3,4])

# Retrieve or set instance/class variables using instance.var_name
p1.condition = 55
print 'Participant Condition = {}'.format(p1.condition)
print 'Results = {}'.format(p1.results)

# Call instance methods in the same way
r_mean = p1.meanResult()
print 'Mean Result = {}'.format(r_mean)

Participant Condition = 55
Results = [1, 2, 3, 4]
Mean Result = 2.5
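
The difference between variables owned by the class and variables owned by an instance is easier to see in a small sketch. The Counter class and its attribute names are hypothetical and exist only for this illustration. The code uses the single-argument print(...) form so it runs under both Python 2 and Python 3.

```python
class Counter:
    # Declared in the class body: one list shared by every instance
    shared_log = []

    def __init__(self, label):
        # Assigned through self: each instance gets its own attributes
        self.label = label
        self.own_log = []

    def record(self, value):
        self.shared_log.append(value)   # modifies the shared class-level list
        self.own_log.append(value)      # modifies only this instance's list

c1 = Counter('first')
c2 = Counter('second')
c1.record(1)
c2.record(2)

print(c1.own_log)      # only what c1 recorded
print(c2.own_log)      # only what c2 recorded
print(c1.shared_log)   # both values, because the list belongs to the class
```

This is why the Participant class assigns results through self: storing results as a class-level list would mix every participant's data together.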


## Input and Output ##

### Print ###
Python has several different methods for formatting printed output. They are fairly redundant, so for this tutorial I will just be teaching you the most modern one: **.format()**. Format is a method of the string type which takes a variable number of arguments and inserts them into a string to be printed. By now you should have seen me use this syntax a number of times, but let's take a look at the full power of this method.

In [28]:
# Simple variable output
# Variables are inserted at each {} in order
print 'Simple format {} {}'.format(4,5)

# Variable output with specified order
# Specifying an index, e.g. {1}, selects which argument is inserted, so arguments can be reordered or repeated
print 'Ordered format {1} {1} {0}'.format(4,5)

# We can also use keyword arguments and specify what to print by keyword
print 'Keyword format name={name} job={job}'.format(name='John', job='Accountant')

# A useful trick is **dict, the syntax for unpacking a dictionary into keyword arguments,
# which lets us pass a dictionary directly for printing
var_dict = {'name':'Alice', 'job':'Programmer'}
print 'Dict Format: name={name} job={job}'.format(**var_dict)

Simple format 4 5
Ordered format 5 5 4
Keyword format name=John job=Accountant
Dict Format: name=Alice job=Programmer
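
The placeholders can also carry a format specifier after a colon, which controls things like decimal places, alignment and padding. The sketch below shows a few common specifiers; the values and the variable names line1 to line4 are made up for illustration. It uses the single-argument print(...) form, which works in both Python 2 and Python 3.

```python
# Fixed number of decimal places
line1 = 'Mean = {:.2f}'.format(3.14159)

# Right-align text in a field 8 characters wide
line2 = '[{:>8}]'.format('abc')

# Pad an integer with leading zeros to a width of 5
line3 = 'ID {:05d}'.format(42)

# Format specifiers combine with keyword arguments
line4 = '{name}: {score:.1f}%'.format(name='Alice', score=92.456)

for line in (line1, line2, line3, line4):
    print(line)
```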


### Files ###
Python has built-in functions for reading and writing files on disk. In practice you will often be using read and write functions from libraries such as Pandas for loading and outputting data, but you should know the defaults. File objects in Python are essentially treated as iterable collections of "lines". Lines are some amount of text data terminated by a newline character. Let's do a simple example.

In [29]:
# Write data
data = ['alice','bob','carol','dave','elanor']
temp_file = open('test1.txt', 'w')

for x in data:
    temp_file.write( x + '\n' )
    
temp_file.close()

# Read data
temp_file = open('test1.txt','r')

for line in temp_file:
    print line
    
temp_file.close()

alice

bob

carol

dave

elanor



So what's going on here? We create a file object using the open function with two arguments. The first is the file name or path; if just the name is given, the file is looked for in the current working directory. The second argument is an optional string that specifies the mode we open the file with. The blank lines in the output appear because each line read from the file keeps its trailing newline and print adds another.

### File Modes ###

#### Writing Type ####

**r** - Read: the file is opened for reading

**w** - Write: the file is opened for writing, overwrites the file if it already exists

**a** - Append: the file is opened for writing, does not overwrite and adds new data to the end of the current file

**+** - Extended: must follow r, w or a; makes the file read/write

#### Other ####

**b** - Binary: opens the file as a binary blob instead of text

**x** - Exclusive create: creates the file and opens it for writing, and fails if the file already exists (Python 3.3 and later)

#### Examples ####

**'r+'** -  open in read/write mode

**'w+b'** - open as binary in read/write mode, overwrite existing

**'a+'**  - read and append mode

Now that we have an open file object we can perform a few other operations on it. We write string objects to the file or read them, and then finally we call close on the file object. Reading and writing are very straightforward. When reading you should generally just use the built-in iteration. When writing just call the write method of the file object with a string argument. Close does a few different things here. It cuts the connection between our file object and the file on disk, it commits any unsaved changes we made to the file and it releases the system resources held for the open file. (Leaving files open can become a problem very fast, especially when you work with many or large files.)
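
To see the difference between the w and a modes in practice, here is a minimal sketch. It works in a temporary directory so that no real files are touched; the file name modes_demo.txt and the variable names are assumptions made for this example.

```python
import os
import shutil
import tempfile

# Temporary directory for the demo file (an assumption for this sketch)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'modes_demo.txt')

# 'w' creates the file, or empties it if it already exists
f = open(path, 'w')
f.write('first\n')
f.close()

# 'a' keeps the existing contents and adds to the end
f = open(path, 'a')
f.write('second\n')
f.close()

# 'r' reads the file back
f = open(path, 'r')
after_append = f.read()
f.close()

# 'w' again: the previous contents are discarded
f = open(path, 'w')
f.write('third\n')
f.close()

f = open(path, 'r')
after_overwrite = f.read()
f.close()

print(repr(after_append))
print(repr(after_overwrite))

# Clean up the temporary directory
shutil.rmtree(tmp_dir)
```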

### With ###

Python also provides some nice syntactic sugar for the common process of opening a file, performing some operations on it and then closing it. Rather than write that all out you can create a "with" block. A with block opens some resource such as a file, performs whatever operations are inside the block and then, no matter how the block is exited (exception thrown, break, etc.), closes the file. Because of this, with is not only prettier but is also considered a safer way to handle resources. Use with rather than separate open, operate and close steps whenever possible.

In [30]:
data = ['a', 'b', 'c', 'd']

with open('test1.txt', 'w') as out:
    for x in data:
        out.write(x)
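
The same pattern works for reading, and the claim that the file is closed however the block exits can be checked through the closed attribute of the file object. The following is a sketch under a few assumptions: the file name with_demo.txt, the temporary directory and the deliberately raised ValueError are all invented for the demonstration.

```python
import os
import tempfile

# Temporary location for the demo file (an assumption for this sketch)
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'with_demo.txt')

with open(path, 'w') as out:
    for name in ['alice', 'bob', 'carol']:
        out.write(name + '\n')

# The file is closed as soon as the block ends
print(out.closed)

# Reading: strip the trailing newline from each line
with open(path, 'r') as infile:
    names = [line.rstrip('\n') for line in infile]
print(names)

# The file is closed even if the block exits through an exception
try:
    with open(path, 'r') as infile:
        raise ValueError('something went wrong')
except ValueError:
    pass
print(infile.closed)

# Clean up
os.remove(path)
os.rmdir(tmp_dir)
```

Stripping the newline while reading also avoids the blank lines that appear when each line is printed as-is.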