<h1 id="tocheading">Table of Contents and Notebook Setup</h1>
<div id="toc"></div>

In [1]:
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')

<IPython.core.display.Javascript object>

# Introduction to Functions

Functions are used for code organization and code reuse. They also make code more readable by giving a name to a block of code. They are defined with the "def" keyword, and they hand a value back to the caller with the "return" keyword.

In [2]:
def sum_square(x, y, z=2):
    return (x+y)**z

In [3]:
sum_square(4,5)

81

Functions have <b> positional </b> arguments and <b> keyword </b> arguments. In the preceding example, x and y are positional arguments and z is a keyword argument with a default value of 2. Python's rule is that <i> positional arguments must come before keyword arguments. </i> Typically keyword arguments are used for default values or optional arguments.

The function above can be called in any of the following three ways (the notebook displays only the value of the last call):

In [4]:
sum_square(x=4, y=5, z=3)
sum_square(4, 5, 3)
sum_square(4, y=3)

49

In the last line above, we make "y" a keyword argument since we use an equal sign to specify it. Sometimes, for clarity, it is nice to specify everything as a keyword argument like we did in the first line.
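
As a minimal sketch of the positional-before-keyword rule, the cell below uses a hypothetical function called power_sum. It is not part of the notebook above; it has the same signature as sum_square and is assumed to use z as the exponent. The invalid call is compiled from a string so that the rest of the cell still runs.

```python
# Hypothetical helper for illustration only; z is assumed to be the exponent.
def power_sum(x, y, z=2):
    return (x + y) ** z

# Each call binds x=4, y=5, z=3, so all four give the same result.
results = [
    power_sum(4, 5, 3),
    power_sum(4, 5, z=3),
    power_sum(4, y=5, z=3),
    power_sum(z=3, y=5, x=4),   # keyword arguments may appear in any order
]
print(results)   # [729, 729, 729, 729]

# A positional argument after a keyword argument is rejected before the code
# ever runs, so we compile the bad call from a string to observe the error.
try:
    compile("power_sum(x=4, 5)", "<example>", "eval")
    rejected = False
except SyntaxError:
    rejected = True
print(rejected)  # True
```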

# Namespaces, Scope, and Local Functions

Functions can access both <b> global </b> variables and <b> local </b> variables. These two terms refer to the <b> namespace </b> in which a Python variable lives. The local namespace is created when the function is called and is immediately populated by the function's arguments. When the function finishes, the local namespace is destroyed and only the return value survives.

In [5]:
def func(x,y):
    a=[]
    a.append(x)
    a.append(y)
    return 0

This is a pointless function, but it illustrates the points above. When the function is called, x and y are put in the local namespace. Then a is put in the local namespace (when the first line of the body is executed). When the function finishes, x, y, and a are all deleted.
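
One way to see the local namespace disappear is to check for a local name after the call. The sketch below uses a hypothetical function build_pair with a local variable pair; both names are made up for this illustration.

```python
def build_pair(x, y):
    pair = [x, y]        # 'pair' is created in the local namespace
    return len(pair)     # only the return value survives the call

length = build_pair(3, 4)

# After the call, 'pair' was never added to the global namespace.
still_defined = 'pair' in globals()
print(length, still_defined)   # 2 False
```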

If need be, we can <i> assign </i> to variables outside the function's scope (global variables) using the global keyword.

In [6]:
a = None
def func(x,y):
    global a
    a=[]
    a.append(x)
    a.append(y)
    return None

func(3,4)
print(a)
    

[3, 4]


Use of the global keyword is generally discouraged.
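
A common alternative, sketched here with a hypothetical function make_pair, is to build the object locally and return it, leaving the assignment to the caller. This gives the same result as the cell above without the function touching any global name.

```python
def make_pair(x, y):
    # Build the list locally and hand it back instead of assigning a global.
    return [x, y]

a = make_pair(3, 4)
print(a)   # [3, 4]
```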

# Returning Multiple Values

In a Python function, you can return multiple values in the form of a tuple. This is more convenient than in languages such as Java, where you would typically have to wrap the values in an array or a dedicated object.

In [7]:
def func(x,y,z):
    return x-y, z-y

a, b = func(2, 3, 4)
a

-1

Notice that the values get unpacked into the variables a and b. We could have also just done:

In [8]:
return_val = func(2, 3, 4)
return_val

(-1, 1)

but here return_val itself is a tuple (we didn't take advantage of the unpacking feature).

# Anonymous (Lambda) Functions

Lambda functions are easy ways to express functions in one line of code. This serves multiple purposes. For starters, we can define very simple functions in one line of code:

In [9]:
def long(x):
    return x**2

short = lambda x: x**2

long(2)
short(2) #equivalent ^^

4

It also serves an extremely important second purpose: sometimes the arguments to functions <i> are functions themselves. </i> This is very prevalent in data analysis and it will show up frequently in these notes. Take the following two snippets of code as an example.

In [10]:
def derivative(f, x, delta=0.001): #returns derivative of f at value x
    return (f(x+delta)-f(x))/delta

def x_square(x):
    return x**2

derivative(x_square, 1)
    
    

2.0009999999996975

Not bad. But we can cut down on the code significantly since x_square can be represented using a lambda function.

In [11]:
def derivative(f, x, delta=0.001): #returns derivative of f at value x
    return (f(x+delta)-f(x))/delta

derivative(lambda x: x**2, 1)

2.0009999999996975

Much more concise, and now we don't have to waste space in our program defining an x squared function. Note specifically how the lambda function takes the place of x_square in the first sample of code.

Lambda functions can take in multiple arguments and return multiple values as well.

In [12]:
value = lambda x, y, z: (x-y, y-z)
a, b = value(2,3,4)
a

-1

Go crazy. The possibilities are quite literally endless.
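
Several built-in functions also accept a function as an argument, which is where lambdas tend to show up in practice. The sketch below uses the built-ins sorted (with its key parameter) and map; the list contents and variable names are made up for illustration.

```python
words = ['banana', 'kiwi', 'apple', 'fig']

# sorted calls the key function on each element and orders by the results.
by_length = sorted(words, key=lambda w: len(w))
print(by_length)   # ['fig', 'kiwi', 'apple', 'banana']

# map applies the function to every element; list() collects the results.
squared = list(map(lambda x: x**2, [1, 2, 3]))
print(squared)     # [1, 4, 9]
```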

# Currying: Partial Argument Application

<b>Currying</b> is jargon for deriving new functions from existing ones by partial argument application, that is, by fixing some of their arguments. Lambda functions are a convenient way to do this.

In [13]:
def add(x, y):
    return x+y

add_five = lambda x: add(x,5)
add_five(2)

7

Here the second argument of add is fixed at 5, and the first argument to the add function is said to be <b>curried</b>. This will be used later in data analysis.
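
The standard library offers another route to the same result: functools.partial. The sketch below fixes y by keyword so that it mirrors the lambda version; this is just one way to write it.

```python
from functools import partial

def add(x, y):
    return x + y

# partial returns a new callable with y fixed at 5; x is still supplied later.
add_five = partial(add, y=5)
print(add_five(2))   # 7
```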

# Generators

Python has a <i> consistent </i> way to iterate over sequences by means of the <b> iterator protocol </b>. This protocol is a generic way to make objects iterable.

In [14]:
some_dict={'a':1, 'b':2, 'c':3}
for item in some_dict:
    print(item)

a
b
c


When writing "for item in some_dict" the Python interpreter first creates an iterator out of some_dict:

In [15]:
dict_iterator = iter(some_dict)
dict_iterator

<dict_keyiterator at 0x1055a2e58>

An <b>iterator</b> is an object that will <i> yield objects </i> when used in a context like a for loop. Most functions that accept a list or list-like object will also accept iterators. This includes built-ins like min, max, sum, tuple, and list.

In [16]:
list(dict_iterator)

['a', 'b', 'c']

A <b> generator </b> is a concise way to construct an iterable object. Normal functions execute and return a single object (this can be a list or tuple that contains multiple things). Generators return a sequence of values, but lazily; they pause at each one until the next one is requested. This can save memory, since the whole sequence never has to be stored at once. To create a generator, use the "yield" keyword inside a function.

In [17]:
def squares(n=10):
    for i in range(1, n+1):
        yield i**2

gen1 = squares()
gen1

<generator object squares at 0x105564938>

It is not until we start requesting values from gen1 that it actually computes them.

In [18]:
for x in gen1:
    print(x, end= ' ')

1 4 9 16 25 36 49 64 81 100 

Here's another example.

In [19]:
def sentence_gen():
    yield 'Cats in the cradle '
    yield 'and the silver spoon. '
    yield 'Little boy blue '
    yield 'and the man in the moon.'

gen2 = sentence_gen()
for x in gen2:
    print(x, end='')

Cats in the cradle and the silver spoon. Little boy blue and the man in the moon.
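
To watch the pausing behaviour directly, we can pull values one at a time with the built-in next. The sketch below repeats the squares generator from above so that the cell is self-contained. It also shows that a generator is used up after one pass.

```python
def squares(n=10):
    for i in range(1, n + 1):
        yield i ** 2

gen = squares(3)
first = next(gen)            # runs the body up to the first yield
second = next(gen)           # resumes where it paused
rest = list(gen)             # consumes whatever is left
again = list(gen)            # the generator is now exhausted
fallback = next(gen, 'done') # a default avoids StopIteration on an empty generator

print(first, second, rest, again, fallback)   # 1 4 [9] [] done
```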

## Generator Expressions

Just as lambda functions are a concise way to write functions, there is a more concise way to make generators as well. This involves using a generator expression, like so:

In [20]:
gen = (x**3 for x in range(10))
gen

<generator object <genexpr> at 0x1055ae620>

These generator expressions are often more convenient than lists when calling functions like "sum," since no intermediate list has to be built.

In [21]:
sum(x**3 for x in range(1000))

249500250000
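
As a quick check, the sketch below compares the generator expression with the equivalent list comprehension, and passes a generator expression of pairs to dict. The variable names are made up for illustration.

```python
# The list comprehension builds all 1000 cubes first; the generator
# expression feeds them to sum one at a time. The totals agree.
total_from_list = sum([x**3 for x in range(1000)])
total_from_gen = sum(x**3 for x in range(1000))
print(total_from_list == total_from_gen)   # True

# Generator expressions work with other constructors too, such as dict.
squares_by_index = dict((i, i**2) for i in range(3))
print(squares_by_index)                    # {0: 0, 1: 1, 2: 4}
```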

## itertools module

There is a standard library module called "itertools" that has a collection of generators for many common data algorithms. The function "groupby" takes a sequence and a function and groups <i> consecutive </i> elements in the sequence by the value the function returns for each of them:

In [22]:
import itertools
first_letter = lambda x: x[0]
male_names=['Adam', 'Axtor', 'Jim', 'James', 'Albert']

for letter, names in itertools.groupby(male_names, first_letter):
    print(letter, list(names))

A ['Adam', 'Axtor']
J ['Jim', 'James']
A ['Albert']


There are many other useful tools which you can find by googling "itertools."
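
Because groupby only groups consecutive elements, the usual approach when we want one group per key is to sort by the same key first. The sketch below does this; the list of names is made up for illustration.

```python
import itertools

names = ['Adam', 'Jim', 'Alan', 'James']
first_letter = lambda name: name[0]

# Sorting by the grouping key puts equal keys next to each other,
# so groupby produces exactly one group per first letter.
ordered = sorted(names, key=first_letter)
grouped = {letter: list(group)
           for letter, group in itertools.groupby(ordered, first_letter)}
print(grouped)   # {'A': ['Adam', 'Alan'], 'J': ['Jim', 'James']}
```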

# Errors and Exception Handling

In order to write elegant code, one needs to be able to handle errors or <i> exceptions </i> gracefully. In data analysis, many functions only work on certain kinds of data input. Take, for example, the float function.

In [23]:
float('1.234')

1.234

In [24]:
float('something')

ValueError: could not convert string to float: 'something'

If we want a float function that fails gracefully, we can use a <b> try </b> and <b> except </b> block like so.

In [25]:
def attempt_float(x):
    try:
        return float(x)
    except:
        return x
    
attempt_float('something')

'something'

The code in the except block is only executed if the code in the try block raises an exception.

We have freedom to decide <i> what type of exceptions </i> we want to permit. For example,

In [26]:
float((1,2))

TypeError: float() argument must be a string or a number, not 'tuple'

raises a TypeError and not a ValueError like before. Maybe we only want the code in the except block to run if the try block raises a ValueError. We can accomplish that like so:

In [27]:
def attempt_float(x):
    try:
        return float(x)
    except ValueError:
        return x
    
attempt_float('something')

'something'

In [28]:
attempt_float((1,2))

TypeError: float() argument must be a string or a number, not 'tuple'

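Exactly what we wanted: the TypeError is not caught and propagates to the caller. In general, we specify which types of exceptions to handle after the except keyword; several types can be given as a tuple in parentheses.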
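
If we did want to handle both kinds of failure, a tuple of exception types does the job. The sketch below is a variation on attempt_float, not a replacement for the version above.

```python
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        # Either exception type lands here; anything else still propagates.
        return x

print(attempt_float('1.5'))        # 1.5
print(attempt_float('something'))  # something
print(attempt_float((1, 2)))       # (1, 2)
```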

Consider the following most general form of a "try" block. We will analyze it afterwards.

In [29]:
a = 4
b = 0

try:
    print(a/b)
except:
    print("Failed")
else:
    print("Succeeded")
finally:
    print("The value of a was {0} and the value of b was {1}".format(a,b))

Failed
The value of a was 4 and the value of b was 0


First, the <b>try</b> block is attempted. If it raises an exception, the <b>except</b> block is executed. If it succeeds, the <b>else</b> block is executed. Regardless of whether or not the try block succeeds, the <b>finally</b> block is then executed. This becomes a nice way to organize code.
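
To see both paths side by side, the sketch below wraps the same structure in a hypothetical function divide_report that records which blocks ran. It catches ZeroDivisionError specifically rather than using a bare except.

```python
def divide_report(a, b):
    steps = []                    # names of the blocks that ran, in order
    try:
        a / b
    except ZeroDivisionError:
        steps.append('except')    # runs only if the try block raised
    else:
        steps.append('else')      # runs only if the try block succeeded
    finally:
        steps.append('finally')   # runs in both cases
    return steps

print(divide_report(4, 2))   # ['else', 'finally']
print(divide_report(4, 0))   # ['except', 'finally']
```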