# Python Workshop

This workshop reviews Python fundamentals
and prepares you for Galvanize's Data Science Immersive (DSI).

# Topics

### Day 2

Morning:

* sets
* dictionaries
* efficiency

Afternoon:

1. Discuss morning solutions:
    * View student solutions & official solutions.
    * Why is there such a dramatic increase in speed for each of the four functions?
2. Prep for afternoon exercise:
    * Understand the data.
    * Read through the exercise text.
    * Brainstorm the 4 functions you'll need to write.

# Sets

In [1]:
groceries = set()
groceries.add('carrots')
groceries.add('figs')
groceries.add('popcorn')

In [2]:
groceries2 = {'popcorn', 'carrots', 'figs'}

In [3]:
groceries3 = set(('popcorn', 'carrots', 'figs'))

In [4]:
groceries == groceries2 == groceries3

True

In [5]:
groceries

{'carrots', 'figs', 'popcorn'}

In [6]:
'figs' in groceries

True

In [7]:
'ryan' in groceries

False

In [8]:
for element in groceries:
    print element

popcorn
carrots
figs
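The loop above printed the items in a different order from the one in which they were added, because a set has no defined order and stores each element only once. A minimal sketch of both points follows; the name `ordered` is made up for this illustration, and the `print(...)` calls are written to run under both Python 2 and 3:

```python
groceries = {'popcorn', 'carrots', 'figs'}

# Adding an element that is already present leaves the set unchanged.
groceries.add('figs')
print(len(groceries))  # 3

# sorted() returns a list, which gives a repeatable (alphabetical) order.
ordered = sorted(groceries)
print(ordered)  # ['carrots', 'figs', 'popcorn']
```

Use `sorted()` whenever the order of the output matters, for example when comparing results between runs.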


# Dictionaries

A dictionary is like a set of keys in which each key has an associated value.

In [9]:
prices = {}
prices['banana'] = 1
prices['steak'] = 10
prices['ice cream'] = 5

In [10]:
prices2 = {'steak': 10, 'banana': 1, 'ice cream': 5}

In [11]:
prices == prices2

True

In [12]:
prices

{'banana': 1, 'ice cream': 5, 'steak': 10}

In [13]:
'banana' in prices

True

In [14]:
print prices['banana']

1


In [15]:
'ryan' in prices

False

In [16]:
print prices['ryan']

KeyError: 'ryan'
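Looking up a missing key with square brackets raises `KeyError`, as shown above. Two common ways to handle that are sketched here, assuming a missing price should fall back to `None` or `0`; the names `missing`, `fallback`, and `price` are illustrative only:

```python
prices = {'steak': 10, 'banana': 1, 'ice cream': 5}

# get() returns None (or a default you supply) instead of raising KeyError.
missing = prices.get('ryan')
fallback = prices.get('ryan', 0)
print(missing)   # None
print(fallback)  # 0

# Alternatively, catch the error explicitly.
try:
    price = prices['ryan']
except KeyError:
    price = None
print(price)  # None
```

Neither approach adds `'ryan'` to the dictionary.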

In [17]:
for key, value in prices.iteritems():
    print key, '->', value

steak -> 10
banana -> 1
ice cream -> 5


# Hashing

* a 'hash' function computes an integer for a given object; objects that compare equal always get the same integer
* dictionaries and sets use hashing for fast inserts, removals, and lookups

BOARDWORK HERE!

In [24]:
print hash("ryan")
print hash(7)
print hash((1, 3, 'bob'))

for i in range(10):
    print hash(i)
    
for i in range(10):
    print hash(str(i))

-5995074038886156178
7
-8771069815892485235
0
1
2
3
4
5
6
7
8
9
6144018481
6272018864
6400019251
6528019634
6656020021
6784020404
6912020791
7040021174
7168021561
7296021944
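The string and tuple hashes printed above depend on the Python version and platform, and recent Python 3 releases randomize string hashes per process by default, so expect different integers on your machine. The properties that matter stay the same everywhere. A small sketch of them (the variable `list_is_hashable` is made up for this illustration):

```python
# Equal objects must have equal hashes, even across numeric types.
print(hash(1) == hash(1.0))                        # True
print(hash((1, 3, 'bob')) == hash((1, 3, 'bob')))  # True

# Mutable containers such as lists are not hashable,
# so they cannot be set elements or dictionary keys.
try:
    hash([1, 3, 'bob'])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
print(list_is_hashable)  # False
```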


In [26]:
p = 113

print hash("ryan")        % p #modulo 113
print hash(7)             % p
print hash((1, 3, 'bob')) % p

63
7
105
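Taking the hash modulo `p`, as above, turns any hash into an index between 0 and `p - 1`. That is the core idea behind a hash table: each index names a "bucket", and a lookup inspects only one bucket. The class below is a deliberately simplified sketch of that idea. `ToyHashSet` and its methods are hypothetical names, and Python's built-in `set` is implemented differently in detail:

```python
class ToyHashSet(object):
    """Toy hash set with a fixed number of buckets (illustration only)."""

    def __init__(self, num_buckets=113):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, item):
        # Same idea as `hash(x) % p`: map the hash to a bucket index.
        return self.buckets[hash(item) % len(self.buckets)]

    def add(self, item):
        bucket = self._bucket(item)
        if item not in bucket:  # store each element only once
            bucket.append(item)

    def __contains__(self, item):
        # Only one short bucket is scanned, not the whole collection.
        return item in self._bucket(item)


toy = ToyHashSet()
for food in ['carrots', 'figs', 'popcorn', 'figs']:
    toy.add(food)

print('figs' in toy)  # True
print('ryan' in toy)  # False
```

Because each bucket stays short as long as the elements spread evenly, `add` and `in` take roughly the same time no matter how many elements are stored.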


# Variations on Dictionaries

* defaultdict
* Counter

### defaultdict

See: https://docs.python.org/2/library/collections.html

In [29]:
from collections import defaultdict

d_int = defaultdict(int) #defaults to 0
d_int[1] = 25
print d_int[1]
print d_int
print d_int[2] #automatically creates the entry instead of raising a KeyError
print d_int

25
defaultdict(<type 'int'>, {1: 25})
0
defaultdict(<type 'int'>, {1: 25, 2: 0})


In [30]:
d_float = defaultdict(float)
print 'Default float:', d_float['some_key']

d_str = defaultdict(str)
print 'Default string:', d_str['some_key']

d_list = defaultdict(list)
print 'Default list:', d_list['some_key']

Default float: 0.0
Default string: 
Default list: []


### Why defaultdict?

In [31]:
document = "ryan walks to the gym then walks home to eat and sleep".split()

# Common pattern:
word_counts = {}
for word in document:
    if word in word_counts:
        word_counts[word] += 1
    else:
        word_counts[word] = 1
print word_counts

# Better if you use a defaultdict!
word_counts_2 = defaultdict(int)
for word in document:
    word_counts_2[word] += 1
print word_counts_2

# Same?
word_counts == word_counts_2

{'and': 1, 'then': 1, 'gym': 1, 'ryan': 1, 'to': 2, 'sleep': 1, 'walks': 2, 'home': 1, 'the': 1, 'eat': 1}
defaultdict(<type 'int'>, {'and': 1, 'then': 1, 'gym': 1, 'ryan': 1, 'to': 2, 'sleep': 1, 'walks': 2, 'home': 1, 'the': 1, 'eat': 1})


True
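The same pattern works for the other default types shown earlier. As a sketch, `defaultdict(list)` can group the words of the sample sentence by their first letter without checking whether each key already exists; `words_by_letter` is a name chosen for this illustration:

```python
from collections import defaultdict

document = "ryan walks to the gym then walks home to eat and sleep".split()

# Group words by first letter; a missing key starts out as an empty list.
words_by_letter = defaultdict(list)
for word in document:
    words_by_letter[word[0]].append(word)

print(words_by_letter['t'])  # ['to', 'the', 'then', 'to']
print(words_by_letter['w'])  # ['walks', 'walks']
```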

### Counter

In [32]:
from collections import Counter

letters = ['c', 'a', 'a', 'b', 'b', 'c']
counter = Counter(letters) # note the difference in capitalization!
print counter

Counter({'a': 2, 'c': 2, 'b': 2})


In [33]:
# Counters have a 'most_common' method

print counter.most_common()

# Elements with equal counts are ordered arbitrarily:
print counter.most_common(2)
print counter.most_common(1) 

[('a', 2), ('c', 2), ('b', 2)]
[('a', 2), ('c', 2)]
[('a', 2)]
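A `Counter` also replaces the word-counting loops from the defaultdict section with a single call. A minimal sketch using the same sample sentence; `top_two` is an illustrative name, and the result is sorted before printing because the order of tied elements is not something to rely on:

```python
from collections import Counter

document = "ryan walks to the gym then walks home to eat and sleep".split()
word_counts = Counter(document)

print(word_counts['walks'])  # 2

# A missing key counts as 0 rather than raising KeyError.
print(word_counts['pizza'])  # 0

# The two words that appear twice, sorted for a repeatable order.
top_two = word_counts.most_common(2)
print(sorted(top_two))  # [('to', 2), ('walks', 2)]
```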


In [35]:
# Can two numbers drawn (with replacement) from a list sum to a given value?

list_ = [3, 5, 7, 9]

# make_sum(list_, 4) ==> False #can two numbers drawn from the list add up to 4?
# make_sum(list_, 8) ==> True
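"With replacement" means a number may be paired with itself. To make the two expected answers above concrete, this sketch lists every such pair for the example list and the sums they produce; `pairs` and `sums` are names introduced only for this illustration:

```python
from itertools import combinations_with_replacement

list_ = [3, 5, 7, 9]

# Every unordered pair, where a number may be paired with itself.
pairs = list(combinations_with_replacement(list_, 2))
print(len(pairs))  # 10

# The distinct sums those pairs can produce.
sums = sorted(set(sum(pair) for pair in pairs))
print(sums)  # [6, 8, 10, 12, 14, 16, 18]

print(4 in sums)  # False
print(8 in sums)  # True
```

The smallest possible sum is 3 + 3 = 6, so 4 is out of reach, while 8 comes from 3 + 5.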


In [39]:
# Method 1

from itertools import combinations_with_replacement

def make_sum1(numbers, target):
    combinations = combinations_with_replacement(numbers, 2)
    for combo in combinations:
        if sum(combo) == target:
            return True
    return False


In [40]:
# Method 2

def make_sum2(numbers, target):
    for number in numbers:
        if target - number in numbers:
            return True
    return False


In [54]:
# Method 3

def make_sum3(numbers, target):
    numbers = set(numbers)
    for number in numbers:
        if target - number in numbers:
            return True
    return False


In [52]:
import random

number_of_samples = 1000
list_range = (1, 1000)
list_length = 100

samples = []

for s in xrange(number_of_samples):
    list_ = [random.randint(*list_range) for i in xrange(list_length)]
    target = random.randint(*list_range)
    samples.append((list_, target))


In [53]:
def test_make_sum(samples, make_sum):
    for numbers, target in samples:
        make_sum(numbers, target)

In [44]:
time test_make_sum(samples, make_sum1)

CPU times: user 519 ms, sys: 19.6 ms, total: 539 ms
Wall time: 556 ms


In [45]:
time test_make_sum(samples, make_sum2)

CPU times: user 79.2 ms, sys: 3.85 ms, total: 83 ms
Wall time: 85.2 ms


In [46]:
time test_make_sum(samples, make_sum3)

CPU times: user 12.7 ms, sys: 3.41 ms, total: 16.1 ms
Wall time: 13.2 ms
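The timings above come from one run on one machine, so your numbers will differ, but the ranking has a simple explanation. Method 1 checks every pair. Method 2 checks each number once but scans the list for its partner. Method 3 does the same work with a hash lookup in place of the scan. The sketch below isolates that last difference with the standard `timeit` module; the collection size and repeat count are arbitrary choices for this illustration:

```python
import timeit

numbers_list = list(range(100000))
numbers_set = set(numbers_list)
missing = -1  # worst case for the list: every element gets compared

# Time 100 membership tests against each container.
list_time = timeit.timeit(lambda: missing in numbers_list, number=100)
set_time = timeit.timeit(lambda: missing in numbers_set, number=100)

print("list: %.6f s" % list_time)
print("set:  %.6f s" % set_time)
```

On typical hardware the list lookups should be slower by several orders of magnitude, and the gap widens as the collection grows.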
