# The Python Programming Language

From Coursera: Intro to Data Science, Week 1   
Patricia Schuster, University of Michigan  
Feb. 2017

# Advanced Python Objects, map()

Start by importing relevant packages

In [6]:
import numpy as np

## The `map()` function

The `map()` function takes a function as its first argument and one or more iterables (like lists or arrays) as the remaining arguments. In Python 3 it returns a lazy iterator (a `map` object) that yields the result of applying the function to each element in turn.

This is useful because it does not compute everything at once. The function is applied to one element at a time as values are requested, so the full set of results never has to be stored in memory.
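To make the lazy behavior concrete, here is a minimal sketch. The helper `square` and the variable names are made up for illustration and are not part of the lecture.

```python
def square(x):
    """Return x squared (hypothetical helper for this illustration)."""
    return x * x

# Creating the map object does not call square() yet.
lazy_squares = map(square, [1, 2, 3])

first = next(lazy_squares)     # computes only the first value
rest = list(lazy_squares)      # consumes the remaining values
leftover = list(lazy_squares)  # the iterator is exhausted, so this is empty

print(first, rest, leftover)
```

A `map` object can only be consumed once, so wrap it in `list()` if the results are needed more than once.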

## Basic implementation: sum two vectors

Try it out.

In [3]:
def add_two(a,b=0):
    'Returns the sum of a and b'
    return a+b

In [4]:
add_two(1,2)

3

Map `add_two` over an array of `a` values, using the default `b=0`. In essence, each result is just the corresponding `a` value.

In [10]:
a_values = np.arange(0,10,1)
print('a_values: ', a_values)

for added_two in map(add_two,a_values):
    print('added_two: ', added_two)

a_values:  [0 1 2 3 4 5 6 7 8 9]
added_two:  0
added_two:  1
added_two:  2
added_two:  3
added_two:  4
added_two:  5
added_two:  6
added_two:  7
added_two:  8
added_two:  9


Now pass in a second iterable for the `b` values. If I want to add 5 to each of the `a` values, I can build an array of fives with the same length:

In [14]:
# The np.int alias was deprecated and later removed from NumPy; the built-in int is equivalent
b_values = 5*np.ones(10, dtype=int)
print('b_values: ', b_values)

for added_two in map(add_two, a_values, b_values):
    print('added_two: ', added_two)

b_values:  [5 5 5 5 5 5 5 5 5 5]
added_two:  5
added_two:  6
added_two:  7
added_two:  8
added_two:  9
added_two:  10
added_two:  11
added_two:  12
added_two:  13
added_two:  14
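One detail worth knowing when passing several iterables: `map()` stops as soon as the shortest one runs out. The sketch below uses two hypothetical lists of different lengths (`short_values` and `long_values` are not from the lecture) to show this.

```python
def add_two(a, b=0):
    'Returns the sum of a and b'
    return a + b

# Hypothetical inputs of unequal length
short_values = [1, 2, 3]
long_values = [10, 20, 30, 40, 50]

# Only three pairs are formed; the extra entries in long_values are ignored
paired_sums = list(map(add_two, short_values, long_values))
print(paired_sums)
```

This is why `a_values` and `b_values` above were built with the same length.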


## More advanced implementation: split names

This is an example exercise from the Coursera lecture. I need to write a function, applied with `map()`, that produces a list of all faculty titles and last names (dropping their first names).

In [15]:
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson', 'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

In [52]:
person = 'Dr. Christopher Brooks'
print(person.split()[0] , person.split()[-1])

Dr. Brooks


To get the help information for `split`, I need to reference it through `str`, because `split` is a method of string objects.

So I can call it using `help(str.split)` or `help('abc'.split)`.

In [18]:
help(str.split)

Help on method_descriptor:

split(...)
    S.split(sep=None, maxsplit=-1) -> list of strings
    
    Return a list of the words in S, using sep as the
    delimiter string.  If maxsplit is given, at most maxsplit
    splits are done. If sep is not specified or is None, any
    whitespace string is a separator and empty strings are
    removed from the result.
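As a quick illustration of the `sep` and `maxsplit` parameters described in the help text, here is a small sketch. The comma-separated string is a made-up example.

```python
person = 'Dr. Christopher Brooks'

# No arguments: split on any run of whitespace
default_split = person.split()

# maxsplit=1: stop after the first split, leaving the rest of the string intact
one_split = person.split(maxsplit=1)

# An explicit separator keeps empty strings between adjacent separators
comma_split = 'a,b,,c'.split(',')

print(default_split)
print(one_split)
print(comma_split)
```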



In [56]:
def split_title_and_name(person):
    title = person.split()[0]
    last_name = person.split()[-1]
    
    return ' '.join([title,last_name])

An alternate method to using `join()` would be to use `format()`:

    return '{} {}'.format(title,last_name)
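For completeness, a full version of the `format()` variant might look like the sketch below. The name `split_title_and_name_fmt` is hypothetical, chosen so it does not overwrite the function above, and it splits the string only once.

```python
def split_title_and_name_fmt(person):
    """Return the title and last name, e.g. 'Dr. Brooks' (hypothetical variant)."""
    words = person.split()  # split once and reuse the result
    return '{} {}'.format(words[0], words[-1])

people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson',
          'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

formatted = list(map(split_title_and_name_fmt, people))
print(formatted)
```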

In [41]:
print(split_title_and_name(people[0]))

Dr. Brooks


In [44]:
list(map(split_title_and_name,people))

['Dr. Brooks', 'Dr. Collins-Thompson', 'Dr. Vydiswaran', 'Dr. Romero']

# Advanced Python lambda and list comprehensions

`lambda`s are Python's way of making anonymous functions. They are meant for simple or short-lived functions, where writing the function out in one line is easier than creating a named function.

The syntax is simple:

* Declare a `lambda` function with `function_name = lambda`
* Follow it with a list of arguments
* Follow that with a colon
* Finish with a single expression

The body is only one expression, and its value is what the function returns.

In [47]:
my_function = lambda a, b, c : a + b

In [48]:
my_function(1,2,3)

3

A `lambda` can take default argument values, but its body is limited to a single expression: it cannot contain statements such as assignments or loops. Lambdas are very useful for simple data cleaning tasks.
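As an illustration of a simple data cleaning task, the sketch below normalizes a list of messy strings with a `lambda` passed to `map()`. The list `raw_names` is invented for this example.

```python
# Hypothetical messy input: stray whitespace and inconsistent capitalization
raw_names = ['  Alice ', 'BOB', ' carol']

# Strip surrounding whitespace and lowercase each entry
clean_names = list(map(lambda name: name.strip().lower(), raw_names))
print(clean_names)
```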

## Apply `lambda` to the previous task

Try turning the previous task of splitting faculty names into a `lambda` expression.

In [54]:
split_title_and_name = lambda person : ' '.join([person.split()[0],person.split()[-1]])

In [55]:
split_title_and_name(person)

'Dr. Brooks'

## Using `lambda` on the fly

It is possible to define and apply `lambda` in the same line.

Try verifying that the function `split_title_and_name` works the same as the corresponding `lambda` expression. 

(Go back and rerun the cell where I defined `split_title_and_name` with `def`, since the `lambda` assignment above reused the same name.)

In [60]:
for person in people:
    print(split_title_and_name(person) == (lambda person : ' '.join([person.split()[0],person.split()[-1]]))(person))

True
True
True
True


Notice that to call a `lambda` right where it is defined, you must wrap the `lambda` expression in parentheses and provide the argument immediately afterward, here using `(person)` at the end.

I can also use `map()` to eliminate the `for` loop.

In [65]:
list(map(split_title_and_name, people)) == list(map(lambda person : ' '.join([person.split()[0],person.split()[-1]]), people))

True

# List comprehensions

Sequences are structures we can iterate over. Python has built-in support for creating lists from them using a more abbreviated syntax called list comprehensions.

For example, first use a `for` loop to iterate through the numbers 0 to 29 and collect all the even ones. A number is even if the remainder after dividing by two is zero, which is what `number % 2 == 0` checks.

In [69]:
my_list = []
for number in range(0,30):
    if number % 2 == 0:
        my_list.append(number)
        
my_list

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

We can rewrite this as a list comprehension by writing the iteration on one line. The syntax is:

* Start with the value we want to add to the list
* Follow it with the `for` clause
* Add any condition clauses

In [70]:
my_list = [number for number in range(0,30) if number % 2 == 0]
my_list

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
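To tie this back to `lambda`, the same list can also be built with the built-in `filter()` and a `lambda` as the test. This is only a sketch for comparison, and the variable names are made up; the comprehension is usually considered the more readable form.

```python
# List comprehension, as above
evens_comprehension = [number for number in range(0, 30) if number % 2 == 0]

# Equivalent using filter() with a lambda as the condition
evens_filter = list(filter(lambda number: number % 2 == 0, range(0, 30)))

print(evens_comprehension == evens_filter)
```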

Try converting the function `times_tables()` into a list comprehension.

In [71]:
def times_tables():
    lst = []
    for i in range(10):
        for j in range(10):
            lst.append(i*j)
    return lst

Since we are iterating over two variables, we can put both `for` clauses in one comprehension, in the same order as the nested loops.

In [81]:
times_tables() == [i*j for i in range(10) for j in range(10)]

True
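The order of the `for` clauses matters: the leftmost clause is the outer loop. A small sketch with tuples (using smaller, made-up ranges) makes the ordering visible.

```python
# Comprehension with two for clauses: i is the outer loop, j the inner loop
pairs = [(i, j) for i in range(2) for j in range(3)]

# The equivalent nested loops, written out
loop_pairs = []
for i in range(2):
    for j in range(3):
        loop_pairs.append((i, j))

print(pairs)
print(pairs == loop_pairs)
```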

# Challenge question

This question from the Coursera video brings a few things together.

Many organizations have user ids which are constrained in some way. Imagine you work at an internet service provider and the user ids are all two letters followed by two numbers (e.g. aa49). Your task at such an organization might be to hold a record on the billing activity for each possible user. 

Write an initialization line as a single list comprehension which creates a list of all possible user ids. Assume the letters are all lowercase.

In [82]:
lowercase = 'abcdefghijklmnopqrstuvwxyz'
digits = '0123456789'

In [105]:
all_possibilities = [''.join([lowercase[l1],lowercase[l2],digits[d1],digits[d2]]) for l1 in range(26) for l2 in range(26) for d1 in range(10) for d2 in range(10)]

In [106]:
print(all_possibilities[:30])

['aa00', 'aa01', 'aa02', 'aa03', 'aa04', 'aa05', 'aa06', 'aa07', 'aa08', 'aa09', 'aa10', 'aa11', 'aa12', 'aa13', 'aa14', 'aa15', 'aa16', 'aa17', 'aa18', 'aa19', 'aa20', 'aa21', 'aa22', 'aa23', 'aa24', 'aa25', 'aa26', 'aa27', 'aa28', 'aa29']


A much simpler way is to iterate over the characters of each string directly:

In [107]:
all_possibilities = [l1+l2+d1+d2 for l1 in lowercase for l2 in lowercase for d1 in digits for d2 in digits]

In [108]:
print(all_possibilities[:30])

['aa00', 'aa01', 'aa02', 'aa03', 'aa04', 'aa05', 'aa06', 'aa07', 'aa08', 'aa09', 'aa10', 'aa11', 'aa12', 'aa13', 'aa14', 'aa15', 'aa16', 'aa17', 'aa18', 'aa19', 'aa20', 'aa21', 'aa22', 'aa23', 'aa24', 'aa25', 'aa26', 'aa27', 'aa28', 'aa29']
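As a sanity check, and as one more alternative that is not from the lecture, the standard library's `itertools.product` generates the same combinations, and `string.ascii_lowercase` and `string.digits` supply the character sets. The sketch below also confirms the expected count of 26 × 26 × 10 × 10 = 67600 ids.

```python
from itertools import product
import string

lowercase = string.ascii_lowercase  # 'abcdefghijklmnopqrstuvwxyz'
digits = string.digits              # '0123456789'

# Nested comprehension, as above
all_possibilities = [l1 + l2 + d1 + d2
                     for l1 in lowercase for l2 in lowercase
                     for d1 in digits for d2 in digits]

# Alternative: product() yields the same tuples in the same order
product_ids = [''.join(chars)
               for chars in product(lowercase, lowercase, digits, digits)]

print(len(all_possibilities))
print(all_possibilities == product_ids)
```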
