# Ch. 2 - Crash Course in Python

## Basics

<ul>
<li>Whitespace is ignored inside parentheses and brackets = helpful for long-winded computations and for making code easier to read</li>
<li>Can use a backslash to indicate statements continue onto next line</li>
</ul>

In [1]:
list_of_lists = [[1,2,3],[4,5,6],
                [7,8,9]]

two_plus_three = 2 + \
3
print(list_of_lists)
print(two_plus_three)

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
5


IPython has a **magic function** `%paste` which allows one to *correctly* paste whatever is on the clipboard, and keep the formatting, in a shell, for example.

### Modules

**Modules** must be **imported** before use. We can import a whole module, or explicitly import specific values/functions to use them without qualifiers.

In [2]:
from collections import defaultdict, Counter

In Python 2.7, `/` performs **integer division** by default (i.e. 5/2 = 2). To override this and get true division, use

In [3]:
from __future__ import division

print(5/2)

# use // to do integer division
print(5//2)

2.5
2


### Functions

These are rules taking in 0+ inputs and returning an output, defined starting with `def`. They are also **first-class** = they can be assigned to variables and passed into other functions.

In [4]:
def double(x):
    """optional docstring to describe
    what function does"""
    return x*2

def apply_to_one(func):
    """calls provided function with an
    argument of an integer = 1"""
    return func(1)

my_double = double
x = apply_to_one(my_double)
print(x)

2


We can also create **lambdas**, which are short, anonymous (non-named) functions, which we *could* assign to a variable, but it'd be better to use `def` instead in such cases.

In [5]:
# lambda argument = *action to do to argument*
another_double = lambda x: 2*x # bad

def another_double_2(x): # good
    return 2*x

In [6]:
# make function with default value
def my_print(message = "default"):
    print(message)
    
my_print("hello")
my_print()

hello
default


In [7]:
# call function while specifying argument by name
def subtract(a=0, b=0):
    #return a-b
    print(a-b)

subtract(10,7)
subtract(5)
subtract(b=5)

3
5
-5


### Strings

Can be delimited by single *or* double quotes (but they must match). 

To encode special characters, we must **escape** them with backslashes; to use literal backslashes, create a **raw string** via `r"xxxxx"`.

Create multi-line strings with triple-double quotes

In [8]:
# tab as string
tab_as_tab = "\t"
print(len(tab_as_tab))

tab_as_string = r"\t"
print(tab_as_string)

1
\t
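
The cell above covers escaping and raw strings but not the triple-quoted form. A minimal sketch of a multi-line string follows; the variable names are only illustrative:

```python
# triple double quotes let a string span several lines
multi_line_string = """This is the first line
and this is the second line
and this is the third line"""

# the line breaks are stored as "\n" characters inside the string
lines = multi_line_string.split("\n")
print(len(lines))
print(lines[1])
```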


### Try and Except

When something goes wrong, Python raises an **exception**. Unhandled, it stops the program, but we can handle it using **try** and **except**:

In [9]:
try:
    print(0/0)
except ZeroDivisionError:
    print("can't divide by zero")

can't divide by zero


### Lists

**List** = ordered **collection** (like an array but with added functionality).

In [10]:
int_list = [1,2,3]
heterogenous_list = [1,"string",0.5,True]
list_of_lists = [[1,2,3],[4,5],[6]]

In [11]:
print(len(list_of_lists))
print(sum(int_list))

3
6


In [12]:
# integers from 0-9 (a list in Python2, a lazy range object in Python3)
x = range(10)

# get 1st and then 2nd element
print(x[0])
print(x[1])

0
1


In [13]:
# get the last and 2nd-to-last elements in the Pythonic way
print(x[-1])
print(x[-2])

9
8


In [14]:
# change 1st element to -1
# must convert to a list first, as range() in Python3 returns an
# immutable range object and no longer returns a list
x = list(x)
x[0] = -1

i = 0
while i < 5:
    print(x[i])
    i = i+1

-1
1
2
3
4


In [15]:
# slice lists with square brackets

# get 1st 3 elements
print(x[:3])
# get from 3rd element to the end
print(x[3:])

[-1, 1, 2]
[3, 4, 5, 6, 7, 8, 9]


In [16]:
# get last 3 elements
print(x[-3:])
# cut off 1st and last
print(x[1:-1])

[7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8]


In [17]:
# make a copy of list
x2 = x[:]
print(x2)

[-1, 1, 2, 3, 4, 5, 6, 7, 8, 9]
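
To see why the slice copy matters, here is a small sketch comparing a copy made with `[:]` against a plain second name for the same list. The names `original`, `copy_of_original` and `alias` are made up for this illustration:

```python
original = [1, 2, 3]
copy_of_original = original[:]   # slice of everything = a new list
alias = original                 # same list under a second name

copy_of_original[0] = -1         # only changes the copy
alias[1] = 99                    # changes original too

print(original)
print(copy_of_original)
```

Note that `[:]` makes a shallow copy, so nested lists inside it would still be shared.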


In [18]:
# check for values within a list
print(4 in x)
print(6 in [2,2,2,3])

True
False


The above list **membership** check goes over each element one-by-one, so it shouldn't be used unless the list is small.
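
As a rough sketch of that one-by-one check, the hypothetical helper below (`linear_contains` is not a built-in) mimics what `in` has to do for a list and also counts how many elements it looked at:

```python
def linear_contains(items, target):
    """Hypothetical helper mirroring `target in items` for a list,
    also returning how many elements were checked"""
    checks = 0
    for item in items:
        checks += 1
        if item == target:
            return True, checks
    return False, checks

big_list = list(range(1000))
print(linear_contains(big_list, 0))     # found at the very start
print(linear_contains(big_list, 999))   # found only at the very end
print(linear_contains(big_list, -5))    # missing = every element checked
```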

In [19]:
# concatenate lists
x1 = [1,2,3]
x1.extend([3,4,5])
print(x1)

[1, 2, 3, 3, 4, 5]


In [20]:
# concatenate with + to build a new list (x itself is unchanged)
y = x + [1,1,1]
print(y)

[-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 1, 1]


In [21]:
# append items to list one at a time
x.append(15)
print(x)

[-1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 15]


It is helpful to **unpack lists** if we know how many elements they contain

In [22]:
x,y = [1,2]
print(x)
print(y)

1
2


The above raises a `ValueError` if there is not the same number of elements on both sides. If we're going to throw a value away while unpacking, we use an underscore.

In [23]:
_,y,z = [3,2,1] # 3 is nowhere to be found
print(y)
print(z)

2
1
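
The `ValueError` for mismatched unpacking can be seen with a short sketch like this one (the name `unpack_error` is just for illustration):

```python
try:
    x, y = [1, 2, 3]   # 3 values but only 2 names
except ValueError as error:
    unpack_error = str(error)
    print("can't unpack:", unpack_error)
```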


### Tuples

These are **immutable** versions of lists: anything we can do to a list that doesn't involve modifying it, we can also do to a tuple.

In [24]:
my_list = [1,2]
my_tuple = (1,2)
tupl2 = 3,4
my_list[1] = 3

In [25]:
print(my_list)

[1, 3]


In [26]:
try:
    my_tuple[1] = 3
except TypeError:
    print("can't modify a tuple")

can't modify a tuple


A good use of tuples is returning multiple values from a function.

In [27]:
def sum_and_product(x,y):
    return (x+y),(x*y)

print(sum_and_product(2,3))

(5, 6)


In [28]:
sp = sum_and_product(4,3)
# unpack sp
s,p = sp
print(s)
print(p)

7
12


In [29]:
s,p = sum_and_product(5,10)
print(s)
print(p)

15
50


Can also use tuples and lists for **multiple assignment**.

In [30]:
x,y = 1,2
print(x)
print(y)
# swap variables in the Pythonic way
x,y = y,x
print(x)
print(y)

1
2
2
1


### Dictionaries

These are containers for **key-value (kv) pairs** from which we can easily obtain values via their keys

In [31]:
empty_dict = {} # = Pythonic manner
empty_dict2 = dict() # = not-so Pythonic

grades = {"Joel":80,"Tim":95} # literal defining of dict

In [32]:
# look up values via a key in square brackets
joels_grade = grades["Joel"]
print(joels_grade)

80


In [33]:
# get KeyError if we ask for a key that doesn't exist in dict
grades["Bob"]

KeyError: 'Bob'

In [None]:
# check for existence of keys via Try/Except
try:
    kates_grade = grades["Kate"]
except KeyError:
    print("No grade for Kate")

In [None]:
# check for existence of keys via 'in'
joel_has_grade = "Joel" in grades
rob_has_grade = "Rob" in grades

print(joel_has_grade)
print(rob_has_grade)

In [None]:
# get default value for a key instead of raising exception via .get()
joels_grade = grades.get("Joel",0)
kates_grade = grades.get("Kate",2)
no_ones_grade = grades.get("NA") # default value of None

print(joels_grade)
print(kates_grade)
print(no_ones_grade)

In [None]:
# assign KV pairs via brackets
grades["Tim"] = 99 # used to be 95
grades["Kate"] = 78 # new KV pair

print(grades)

Dictionaries are frequently used to represent structured data in a simple manner.

In [None]:
tweet = {
    "user":"joel_grus"
    ,"text": "Hello world"
    ,"retweet_count":108
    ,"hashtags":["data","science","hello","world"] # list
}

print(tweet)

In [None]:
# look for ALL keys
tweet_keys = tweet.keys()

# get ALL values
tweet_vals = tweet.values()

# get all KV pairs (tuples w/in dict)
tweet_tuples = tweet.items()

print(tweet_keys)
print(tweet_vals)
print(tweet_tuples)

In [None]:
print("joel_grus" in tweet_keys)
print("joel_grus" in tweet_vals)

Dictionary keys *must* be immutable, so we **cannot use lists as keys**. 

To create a multi-part key, we must use a tuple or we must figure out a way to turn the key into a string.
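
A minimal sketch of both options, using a made-up `visits` dictionary: tuples work as multi-part keys, a list is rejected with a `TypeError`, and a string built from the parts is the fallback.

```python
# (x, y) tuples as multi-part keys
visits = {}
visits[(0, 1)] = "first"
visits[(2, 3)] = "second"
print(visits[(2, 3)])

# a list is mutable (unhashable), so it is rejected as a key
try:
    visits[[0, 1]] = "nope"
    list_key_worked = True
except TypeError:
    list_key_worked = False
    print("can't use a list as a key")

# string alternative: join the parts into one key
string_key = "{}-{}".format(0, 1)
visits[string_key] = "third"
print(visits[string_key])
```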

### defaultdict

This is like a regular dictionary, but it's imported from `collections`, and when we attempt to look up a key that doesn't exist, it first adds the key to the dictionary with a value produced by a zero-argument function provided when the dictionary was created.

In [None]:
# try to count words in a doc in a dict where keys = words
# and vals = counts

# increment counts by 1 each time a word is checked and it's in the
# dict, and add it to the dict if it's not
word_counts = {}
# doc = list of words (looping over a plain string would give characters)
doc = ["Hello", "world", "hi"]
for word in doc:
    if word in word_counts:
        word_counts[word] += 1 # increment word count if present
    else:
        word_counts[word] = 1 # add word if not present

In [None]:
# Try "forgiveness is better than permission" = handle the exception
# from trying to look up a missing/invalid key
word_counts = {}
# doc = list of words (looping over a plain string would give characters)
doc = ["Hello", "world", "hi"]
for word in doc:
    try:
        word_counts[word] += 1 # increment word count if present
    except KeyError:
        word_counts[word] = 1 # add word if not present

In [None]:
# another approach = use .get() with a default value for missing keys
word_counts = {}
doc = ["Hello", "world", "hi"]
for word in doc:
    # default must be 0, so a word seen for the 1st time ends up at 1
    previous_count = word_counts.get(word,0)
    word_counts[word] = previous_count + 1

In [None]:
# use defaultdict (better)
from collections import defaultdict

word_counts = defaultdict(int) # missing keys get int() = a value of 0
doc = ["Hello", "world", "hi"]

for word in doc:
    word_counts[word] += 1 # missing words start at 0, so no check needed

`defaultdict`'s are also useful with `list`'s or `dict`'s, or user-defined functions.

In [None]:
dd_list = defaultdict(list) # (list) = list() =  produces empty list

# dd_list[key].append(value)
dd_list[2].append(1) 

print(dd_list)

In [None]:
dd_dict = defaultdict(dict) # = dict() = produces empty dict

# dd_dict[Key][InnerKey] = Value
dd_dict["Joel"]["City"] = "Seattle" # uses dictionary as a value

print(dd_dict)

In [None]:
# use lambda (anonymous) function so every missing key starts at [0,0]
dd_pair = defaultdict(lambda: [0,0])
print(dict(dd_pair))

In [None]:
# for key = 2, make the 2nd value for this key = 1
dd_pair[2][1] = 1
print(dict(dd_pair))

These are also useful when we collect results by some key and don't want to check whether the key already exists every time.
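
A small sketch of collecting results by key, assuming a made-up list of words grouped by their first letter:

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# keys = first letters, vals = list of words starting with that letter
words_by_letter = defaultdict(list)
for word in words:
    # no need to check whether word[0] is already a key
    words_by_letter[word[0]].append(word)

print(dict(words_by_letter))
```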

### Counter

These turn sequences of values into a `defaultdict(int)`-like object, mapping keys to counts. It's primarily used to create histograms.

In [None]:
from collections import Counter
c = Counter([0,1,2,3,0])
print(c)

This provides a simple solution to the `word_counts` problem

In [None]:
doc = ["Hello", "world", "hi","hi","hi","hi","hi","but","flyers",
      "why","suck","pizza"]

word_counts = Counter(doc)
print(word_counts)

In [None]:
# get the 10 most common words and their counts
for word,count in word_counts.most_common(10):
    print(word,count)

In [None]:
word_counts.most_common(10)

### Sets

These are a collection of *distinct* elements.

In [None]:
s = set()
s.add(1)
s.add(2)
s.add(2) # does not change the set as we already have a '2'
print(s)

In [None]:
x = len(s)
y = 2 in s
z = 3 in s
print(x,y,z)

Sets are useful because `in` works very fast on sets, and because they make it easy to find the distinct elements in a collection (although they are used less often than `dict`'s and `list`'s).

If we have a large collection of items we want to use for a **membership test**, sets are more appropriate than lists.

In [None]:
stopwords_list = ["a","an","at"] + ["yet","you"]

print("zip" in stopwords_list) # must check every element in list

stopwords_set = set(stopwords_list)
print("zip" in stopwords_set) # faster

In [None]:
# find distinct items
item_list = [1,2,3,1,2,3]
num_items = len(item_list)

# get distinct items
item_set = set(item_list)
num_distinct_items = len(item_set)

# convert back to list
distinct_item_list = list(item_set)

print(item_list)
print(num_items)
print(item_set)
print(num_distinct_items)
print(distinct_item_list)

### Control Flow

Perform an action **conditionally** using `if`

In [None]:
if 1 > 3:
    message = "if only 1 were greater than 3..."
elif 1 > 4:
    message = "nope"
else:
    message = "when all else fails (if desired)"
    
print(message)

Can also write **ternary** if-then-else statements on one line

In [None]:
parity = "even" if x % 2 == 0 else "odd"
print(parity)

In [None]:
# WHILE loops
x = 0
while x < 10:
    print(x,"is less than 10")
    x += 1

In [None]:
# more common to use `for` and `in` than a `while`
for x in range(10):
    print(x,"is less than 10")

In [None]:
# more complex logic = use `continue` and `break`
for x in range(10):
    if x == 3:
        continue # go to next iteration (increase x)
    if x == 5:
        break # quit loop
    print(x)

### Truthiness

**Booleans** work like in other languages, but capitalized

In [None]:
print(1 < 2)
print(True == False)

Python uses `None` for nonexistent values (similar to `null`, `NaN` or `NA` in other languages)

In [None]:
x = None
print(x == None)
print(x is None)

We can use any value where Python expects a Boolean

In [None]:
# all False
falsy = [False, None, [], {}, "", set(), 0, 0.0]

Anything else is treated as `True`. This makes it easy to use `if` statements to test for empty lists/strings/dicts, etc. But it can also cause bugs if we're not expecting this behavior.

In [None]:
def function_returning_a_string():
    return "it's a string, baby"

s = function_returning_a_string()
if s:
    first_char = s[0]
else:
    first_char = ""
    
first_char

In [None]:
# simpler way of doing the above
# `and` returns its 2nd arg when the 1st arg is "Truthy", and the 1st arg if not
first_char = s and s[0]
first_char

In [None]:
# if x is either a number or possibly None
safe_x = x or 0 # returns 2nd arg since x = None
safe_x
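
One way the `or` shortcut can bite: it replaces *every* "Falsy" value, not just `None`. The sketch below uses two hypothetical helpers and an arbitrary fallback of 10 to show the difference when 0 is a legitimate value:

```python
def safe_with_or(x):
    return x or 10                  # falls back whenever x is "Falsy"

def safe_with_is_none(x):
    return 10 if x is None else x   # falls back only for None

print(safe_with_or(None), safe_with_is_none(None))
print(safe_with_or(0), safe_with_is_none(0))   # 0 is lost by `or`
```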

`all()` takes a list + returns `True` precisely when *every* element is "Truthy"

`any()` returns `True` if *at least one* element is Truthy

In [None]:
print(all([True,1,{3}]))
print(all([True,1,{}]))
print(any([True,1,{}]))
print(all([]))
print(any([]))

## Not-so-basics

### Sorting

Python `list`'s have a `.sort()` method to sort the list *in-place*. To get a *new* list, use the `sorted()` function.

In [None]:
x = [4,1,3,2]
print(x.sort()) # prints None = .sort() works in-place and returns nothing
print(x)

In [None]:
x = [4,1,3,2]
y = sorted(x)
print(x,y)

By default both techniques above sort from "smallest" to "largest", based on a naive comparison of the list's elements.

To do the opposite, use the argument `reverse = True`.

Instead of comparing the elements themselves, we can also compare the results of a function specified via the `key` argument.

In [None]:
# sort list from largest to smallest
x = [4,1,3,2]
x.sort(reverse = True)
print(x)

In [None]:
# sort words and counts from highest to lowest counts
doc = ["Hello", "world", "hi","hi","hi","hi","hi","but","flyers",
      "why","suck","pizza","pizza","flyer","flyers"]

word_counts = Counter(doc)
print(word_counts)

wc = sorted(word_counts.items(),
            key=lambda x: x[1], # sort by 2nd value in items = count
            reverse=True)
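
The `key` argument also accepts built-in functions directly. A short sketch with made-up lists, sorting by absolute value and by word length:

```python
x = [-4, 1, -3, 2]
# sort by absolute value, largest to smallest
by_abs = sorted(x, key=abs, reverse=True)
print(by_abs)

# sort words by their length, shortest first
words = ["pizza", "hi", "flyers", "but"]
by_length = sorted(words, key=len)
print(by_length)
```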

### List Comprehensions

It's common to transform a list into another list by choosing only certain elements, transforming certain elements, or maybe both. To do this in a Pythonic manner, use **list comprehensions**.

In [40]:
# get all numbers that are divisible by 2 in range 0 to 4
even_num = [x for x in range(5) if x % 2 == 0] 

# get all values in range 0-4 and square them
squares = [x**2 for x in range(5)]
# squares = [x*x for x in range(5)]

# get all even squares
even_squares = [x**2 for x in even_num]
# even_squares = [x*x for x in even_num]

print(even_num)
print(squares)
print(even_squares)

[0, 2, 4]
[0, 1, 4, 9, 16]
[0, 4, 16]


Can also turn `list`'s into `dict`'s and `set`'s

In [42]:
# for each value in 0-4, make the int a key and its square = its value
square_dict = {x : x**2 for x in range(5)}

# for each int in list, square it into a set (DISTINCT)
square_set = {x**2 for x in [1,-1]}

print(square_dict)
print(square_set)

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
{1}


Use an underscore as a variable if we don't need a value from a list

In [44]:
# returns a list of 0's with same length as even_num
zeroes = [0 for _ in even_num]
print(zeroes)

[0, 0, 0]


In [47]:
## use multiple `for`'s in list comprehension

# make a 100 (10x10) element list of pairs
pairs = [(x,y)
        for x in range(10)
        for y in range(10)]

print(len(pairs))
print()
print(pairs)

100

[(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9), (1, 0), (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9), (3, 0), (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (4, 0), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), (4, 7), (4, 8), (4, 9), (5, 0), (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (5, 7), (5, 8), (5, 9), (6, 0), (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6), (6, 7), (6, 8), (6, 9), (7, 0), (7, 1), (7, 2), (7, 3), (7, 4), (7, 5), (7, 6), (7, 7), (7, 8), (7, 9), (8, 0), (8, 1), (8, 2), (8, 3), (8, 4), (8, 5), (8, 6), (8, 7), (8, 8), (8, 9), (9, 0), (9, 1), (9, 2), (9, 3), (9, 4), (9, 5), (9, 6), (9, 7), (9, 8), (9, 9)]


In [50]:
## use previous `for`'s in later `for`'s
increasing_pairs = [(x,y)
                   for x in range(10)
                    # start y at x+1 so that only pairs with x < y are made
                    for y in range(x+1,10)] 

print(increasing_pairs)

[(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (0, 9), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), (2, 9), (3, 4), (3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (4, 5), (4, 6), (4, 7), (4, 8), (4, 9), (5, 6), (5, 7), (5, 8), (5, 9), (6, 7), (6, 8), (6, 9), (7, 8), (7, 9), (8, 9)]


### Generators and Iterators

Lists can grow very big, which could be a problem. Ex: in Python2, `range(1000000)` makes a 1M-element list, and if we only need to deal with one element at a time, this can be very inefficient or memory-intensive. If we only need the first couple of elements, calculating them all is wasteful.

A `generator` is something we can iterate over (usually using `for`), but whose values are produced only as needed (i.e. **lazily**).

Generators can only be iterated over *once*, as they don't store all values in memory but calculate them on the fly. To iterate over something more than once, we need to recreate the generator each time it's needed, or use a list.

In [53]:
# create generator with functions and `yield` operator
def lazy_range(n):
    """Creates a 'lazy' version of range()"""
    i = 0
    while i < n:
        # yield = like return, but calling the function gives a generator
        # that produces i, one value at a time, while i is less than n
        yield i
        i += 1
        
# loop to consume only the `yield`ed values, one at a time
# until none are left
for i in lazy_range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


Python2 ships the "real" version of `lazy_range()` as `xrange()`, and in Python3 `range()` itself is lazy. Laziness also means we can create an infinite sequence (which shouldn't be iterated over without some `break` logic):

In [54]:
def natural_nums():
    """Returns 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1
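
Two possible ways to consume such an infinite generator safely are sketched below: an explicit `break`, and `itertools.islice` from the standard library. The cutoff of 5 is arbitrary.

```python
from itertools import islice

def natural_nums():
    """Returns 1, 2, 3, ..."""
    n = 1
    while True:
        yield n
        n += 1

# option 1: explicit `break` logic
small_nums = []
for n in natural_nums():
    if n > 5:
        break
    small_nums.append(n)
print(small_nums)

# option 2: islice takes only the first 5 values
first_five = list(islice(natural_nums(), 5))
print(first_five)
```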

A second way to create a generator is using **`for` comprehensions** wrapped inside `()`'s

In [61]:
lazy_even_below_20 = (i for i in lazy_range(20) if i % 2 == 0)
for i in lazy_even_below_20:
    print(i)

0
2
4
6
8
10
12
14
16
18
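
The "only once" behavior of generators can be checked with a quick sketch (variable names are illustrative): the second pass over the same generator produces nothing.

```python
evens = (i for i in range(10) if i % 2 == 0)

first_pass = list(evens)    # consumes every value
second_pass = list(evens)   # nothing left to produce

print(first_pass)
print(second_pass)

# recreate the generator (or keep the list) to iterate again
evens = (i for i in range(10) if i % 2 == 0)
third_pass = list(evens)
print(third_pass)
```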


### Randomness

It's common to generate random numbers in data science via the `random` module.

In [63]:
import random

# create list of 4 uniformly random values between 0-1
uniform_rand = [random.random() for _ in range(4)]
print(uniform_rand)

[0.46529535419225077, 0.5595849090036804, 0.6220746703184373, 0.22027435993386602]


The `random` module actually creates *pseudorandom* (**deterministic**) values based on an internal state that can be set via `random.seed` (for reproducible results).

In [67]:
random.seed(10)
print(random.random())
# get same "random" value
random.seed(10)
print(random.random())

0.5714025946899135
0.5714025946899135


Can use `random.randrange` with 1 or 2 args to return an element randomly chosen from the corresponding `range()`

In [70]:
print(random.randrange(10)) # random value from range(10) = 0,1,...,9
print(random.randrange(3,6)) # random val from range(3,6) = 3,4,5

3
4


In [72]:
# use random.shuffle to randomly re-order elements in a list
l_ten = list(range(10)) # must make return val of range into a list
random.shuffle(l_ten)
print(l_ten)

[6, 1, 5, 7, 8, 3, 9, 0, 2, 4]


In [75]:
# use random.choice to get a random element from a list
best_friend = random.choice(["Bob","Tim","John"])
print(best_friend)
best_friend = random.choice(["Bob","Tim","John"])
print(best_friend)

Bob
John


In [76]:
## get random sample of elements WITHOUT replacement
num = range(100)
# get random sample of 7
nums = random.sample(num, 7)
print(nums)

[45, 48, 53, 36, 86, 33, 58]


In [77]:
## get random sample of elements WITH replacement
## via multiple calls to random.choice()
num = range(10)
nums_with_replacement = [random.choice(num)
                        for _ in range(8)]
print(nums_with_replacement)

[2, 4, 5, 2, 7, 3, 7, 9]


### RegEx

These provide ways of searching for text, and are useful but complicated.

In [79]:
import re

print(all([ # returns True when every element is "Truthy"
    # return True if "cat" does NOT start with "a"
    not re.match("a", "cat"),
    
    # return True if "cat" contains "a"
    re.search("a", "cat"),
    
    # return True if "dog" does NOT contain "c"
    not re.search("c", "dog"),
    
    # split on 'a' OR 'b': 'carbs' --> ['c', 'r', 's']
    3 == len(re.split("[ab]", "carbs")),
    
    # replace digits in 'R2D2' with hyphens
    "R-D-" == re.sub("[0-9]", "-", "R2D2") 
    ])
)

True


### OOP

Python allows us to define `classes` to **encapsulate** data and functions that operate on them. They can be used to make code cleaner and simpler.

Example: Suppose Python didn't have a built-in `set`, so we want to create our own `Set` class. For its behavior, given an **instance** of `Set`, we need to be able to `add` items to it, `remove` items from it, and check whether it `contains` a value. These would all be **member functions** (accessed via `.` after the object's name).

In [95]:
# classes = PascalCase names
class Set:
    # define member functions that take a 1st parameter "self", which
    # refers to the particular Set object being used
    def __init__(self, values=None):
        """This is a constructor which is called when
        we want to create a new instance of Set
        Used like: s1 = Set(), which is empty and
        s2 = Set([1,2,2,3]), which is initialized
        with values"""
        # each Set instance has its own dict property
        # used to track memberships
        self.dict = {}
        
        # add only distinct values from given values
        if values is not None:
            for value in values:
                self.add(value)
                
    def __repr__(self):
        """This is the 'string' representation of the Set
        object if we had typed it at the Python prompt or
        passed it to str()"""
        return "Set: " + str(self.dict.keys())

    # represent membership via a key in self.dict w/ value = True
    def add(self, value):
        self.dict[value] = True

    # define that value is in the set if it is a key in the dict
    def contains(self, value):
        return value in self.dict

    def remove(self, value):
        del self.dict[value]

In [97]:
# use the class and its functions/methods
s = Set([1,2,4])
s.add(4)
s.add(3)
print(s)

Set: dict_keys([1, 2, 4, 3])


In [98]:
print(s.contains(5))
print(s.contains(3))

False
True


In [99]:
s.remove(3)
print(s)

Set: dict_keys([1, 2, 4])


### Functional Tools

When passing functions around, maybe we'd want to *partially* apply (**curry**) functions in order to create new ones.

In [100]:
# we have a function of 2 variables
def exp(base,power):
    return base**power

# use above function to create a one-variable function
# its input = value for the power to raise to
# output = call to exp() with given power

# using def can get unwieldy
def two_to_the(power):
    return exp(2,power)

two_to_the(3)

8

In [101]:
# better way = use `functools.partial`
from functools import partial

# create function of 1 variable, x
two_to_the = partial(exp,2) 
print(two_to_the(3))

8


In [103]:
# can also use `partial` to fill in later args if we specify names

# specify arg in the exp() we're calling via partial()
square_of = partial(exp,power=2)
print(square_of(3))

9


**Currying** arguments in the middle of a function can get messy, so try to avoid it.
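
One illustration of how this can get messy, assuming the same `exp` function as above: once `power` is filled in by name, a second positional argument collides with it, while a keyword at call time silently overrides it.

```python
from functools import partial

def exp(base, power):
    return base ** power

square_of = partial(exp, power=2)
print(square_of(3))             # base=3, power=2

# a 2nd positional arg collides with the filled-in `power`
try:
    square_of(3, 4)
    collided = False
except TypeError:
    collided = True
    print("power was supplied twice")

# a keyword passed at call time overrides the stored one
print(square_of(3, power=3))
```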

Can occasionally use `map`, `reduce`, and `filter` as functional alternatives to list comprehensions (in Python3, `map` and `filter` return **iterators**, so convert them to lists with `list`; `reduce` returns a single value).

In [105]:
def double(x):
    return 2*x

xs = [1,2,3,4]
twice_xs = [double(x) for x in xs]
print(twice_xs)

[2, 4, 6, 8]


In [108]:
# "function" to double a list
list_doubler = partial(map,double)
twice_xs = list_doubler(xs)
print(list(twice_xs))

[2, 4, 6, 8]


In [110]:
# use `map` with multiple-arg functions by providing multiple lists
def multiply(x,y): return x*y

# use multiply on multiple lists to do element-wise multiplication
products = map(multiply, [1,2], [4,5])
print(list(products)) # [1*4, 2*5]

[4, 10]


In [112]:
# can use `filter` to do the work of a list-comprehension `if`
def is_even(x):
    """Returns True if x is even, False if x is odd"""
    return x % 2 == 0

# list comprehension = get each value in xs if it's even (check via is_even())
x_evens = [x for x in xs if is_even(x)]
print(x_evens)

# filter() = get each value in xs if it's even (check via is_even())
x_evens2 = filter(is_even, xs)
print(list(x_evens2))

[2, 4]
[2, 4]


In [114]:
# make "function" to filter a list
list_evener = partial(filter, is_even)
print(list(list_evener(xs)))

[2, 4]


In [117]:
# use `reduce` to combine the 1st 2 elements of a list, then
# combine that result w/ the 3rd + so on returning a single value

# reduce() not available in Python3 without specifying module
from functools import reduce

x_product = reduce(multiply, xs) # = 1*2*3*4
print(x_product)

24


In [119]:
# make "function" to reduce a list
list_product = partial(reduce, multiply)
x_product = list_product(xs) # = 1*2*3*4
print(x_product)

24


### enumerate

We can iterate over a list and use both its elements *and* their indexes for things.

In [121]:
docs = ["Hello", "world", "hi","hi","hi","hi","hi","but","flyers",
      "why","suck","pizza"]

# non-Pythonic
for i in range(len(docs)):
    doc = docs[i]
    print(i, doc)

print()

# non-Pythonic
i = 0
for doc in docs:
    print(i,doc)
    i += 1

0 Hello
1 world
2 hi
3 hi
4 hi
5 hi
6 hi
7 but
8 flyers
9 why
10 suck
11 pizza

0 Hello
1 world
2 hi
3 hi
4 hi
5 hi
6 hi
7 but
8 flyers
9 why
10 suck
11 pizza


In [122]:
# Pythonic way to do the above
# returns tuples = (index, element)
for i, doc in enumerate(docs):
    print(i,doc)

0 Hello
1 world
2 hi
3 hi
4 hi
5 hi
6 hi
7 but
8 flyers
9 why
10 suck
11 pizza


In [124]:
# get just indexes

# non-Pythonic
for i in range(len(docs)):
    print(i)
    
print(i)

# Pythonic
for i, _ in enumerate(docs):
    print(i)

0
1
2
3
4
5
6
7
8
9
10
11
11
0
1
2
3
4
5
6
7
8
9
10
11


### zip and arg unpacking

Can `zip` 2 or more lists together, transforming multiple lists into a single list of tuples of corresponding elements.

If lists are different lengths, `zip` will stop as soon as the shortest list ends.

In [127]:
l1 = ["a","b","c"]
l2 = [1,2,3]
list(zip(l1,l2)) # returns zip object = must convert

[('a', 1), ('b', 2), ('c', 3)]
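
For lists of different lengths, a short sketch with made-up inputs shows that the extra elements of the longer list are simply dropped, regardless of argument order:

```python
letters = ["a", "b", "c", "d"]
numbers = [1, 2]

# "c" and "d" have no partner, so they are dropped
pairs = list(zip(letters, numbers))
print(pairs)

# the order of the arguments doesn't change the length
pairs_flipped = list(zip(numbers, letters))
print(pairs_flipped)
```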

In [128]:
# unzip
pairs = zip(l1,l2)

# * = argument unpacking, using elements of zip object as
# individual arguments to `zip`
letters, numbers = zip(*pairs)
print(letters,numbers)

('a', 'b', 'c') (1, 2, 3)


In [129]:
# get same result as above
list(zip(('a',1),('b',2),('c',3)))

[('a', 'b', 'c'), (1, 2, 3)]

In [130]:
# use argument unpacking with any function
def add(a,b):
    return a + b

print(add(1,2))
print(add[1,2]) # TypeError: [] tries to subscript the function, not call it

3


TypeError: 'function' object is not subscriptable

In [133]:
# unpack list of values to add
print(add(*[1,2]))

3


### args and kwargs

Suppose we want to create a higher-order function that takes, as input, some function `f` and returns a *new* function that, for any input, returns twice the value of `f`:

In [134]:
def doubler(f):
    def g(x):
        return 2*f(x)
    return g

# works in some cases
def f1(x):
    return x + 1

g = doubler(f1)
print(g(3)) # = (3+1)*2
print(g(-1)) # = (-1+1)*2

8
0


In [135]:
# breaks down with functions taking >1 arg
def f2(x,y):
    return x + y

g = doubler(f2)
print(g(1,2))

TypeError: g() takes 1 positional argument but 2 were given

So, we need a way to specify a function that takes an *arbitrary* number of args, via argument unpacking and some more:

In [136]:
def magic(*args, **kwargs):
    print("unnamed args:", args)
    print("named args:", kwargs)

magic(1,2, key="word", key2="word2")

unnamed args: (1, 2)
named args: {'key': 'word', 'key2': 'word2'}


Here, **`args`** is a tuple of unnamed args, and **`kwargs`** is a `dict` of its named args. It also works the other way around, to use a `list`/`tuple` and `dict` to *supply* args to a function:

In [137]:
def other_magic(x,y,z):
    return x + y + z

xy_list = [1,2]
z_dict = {"z":3}
print(other_magic(*xy_list,**z_dict))

6


Many things can be done in this way, but for data science it's mainly used to produce higher-order functions whose inputs can accept arbitrary arguments:

In [138]:
def doubler_correct(f):
    """Works no matter the inputs that f expects"""
    def g(*args,**kwargs):
        """Passes any args supplied to g on to f"""
        return 2* f(*args,**kwargs)
    return g

g = doubler_correct(f2)
print(g(1,2))

6
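
As a further check, the sketch below redefines `doubler_correct` along with the same `f1` and `f2` so it runs on its own, then calls the wrapped functions with positional args, named args, and a mix of both:

```python
def doubler_correct(f):
    """Works no matter the inputs that f expects"""
    def g(*args, **kwargs):
        return 2 * f(*args, **kwargs)
    return g

def f1(x):
    return x + 1

def f2(x, y):
    return x + y

print(doubler_correct(f1)(3))         # 1 positional arg
print(doubler_correct(f2)(1, y=2))    # positional + named arg
print(doubler_correct(f2)(x=1, y=2))  # only named args
```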
