Based on content by Wes McKinney

# List comprehensions

List comprehensions allow us to create new lists concisely based on an existing collection.

They take the form:

`[expr for val in collection if condition]`

This is basically equivalent to the following loop:

`result = []
for val in collection:
    if condition:
        result.append(expr)`
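As a quick check of that equivalence, the sketch below builds the same list both ways. The collection `range(1, 11)` and the even-number condition are illustrative choices, not part of the original notes.

```python
collection = range(1, 11)

# comprehension form: [expr for val in collection if condition]
squares_comp = [val**2 for val in collection if val % 2 == 0]

# equivalent explicit loop
squares_loop = []
for val in collection:
    if val % 2 == 0:
        squares_loop.append(val**2)

print(squares_comp)                  # [4, 16, 36, 64, 100]
print(squares_comp == squares_loop)  # True
```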

In [1]:
# make a list of the squares 
[x**2 for x in range(1,11)]

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

In [2]:
import numpy as np
np.array([x**2 for x in range(1,11)])

array([  1,   4,   9,  16,  25,  36,  49,  64,  81, 100])

In [3]:
# square only the odd numbers
[x**2 for x in range(1,11) if x % 2 == 1]

[1, 9, 25, 49, 81]

In [4]:
# take a list of strings, and write the words that are over 2 characters long in uppercase.
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

You can create a list comprehension from any iterable (list, tuple, string, etc.).

In [5]:
# extract the digits from a string
string = "Hello 963257 World"
[int(x) for x in string if x.isdigit()]
# for x in string, will look at each character individually
# if x is a digit, then convert it using int()

[9, 6, 3, 2, 5, 7]
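The same pattern works for other iterables. The following sketch uses a tuple, a `range`, and a `zip` object; the variable names are made up for illustration.

```python
point = (3, 4, 12)                       # a tuple
doubled = [2 * n for n in point]
print(doubled)                           # [6, 8, 24]

evens = [n for n in range(10) if n % 2 == 0]   # a range
print(evens)                             # [0, 2, 4, 6, 8]

pairs = zip('abc', [1, 2, 3])            # a zip object yields tuples
labels = [letter + str(num) for letter, num in pairs]
print(labels)                            # ['a1', 'b2', 'c3']
```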

In [6]:
# iterate over a dictionary's items
d = {'a':'apple', 'b':'banana', 'c':'cookie'}

In [7]:
list(d.items())  # dict.items() returns a view of (key, value) tuples; list() turns it into a list

[('a', 'apple'), ('b', 'banana'), ('c', 'cookie')]
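In Python 3, `d.items()` is a view object rather than a list, which is why the cell above wraps it in `list()`. A small sketch of the difference, using a shortened copy of the dictionary:

```python
d = {'a': 'apple', 'b': 'banana'}

items_view = d.items()
view_type = type(items_view).__name__
print(view_type)             # dict_items

snapshot = list(items_view)  # a list is a fixed copy
d['c'] = 'cookie'            # change the dictionary afterwards

live = list(items_view)      # the view reflects the change
print(live)                  # [('a', 'apple'), ('b', 'banana'), ('c', 'cookie')]
print(snapshot)              # [('a', 'apple'), ('b', 'banana')]
```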

In [8]:
[key + ' is for ' + value for key, value in d.items() if key != 'b' ]

['a is for apple', 'c is for cookie']

## Dictionary Comprehensions

A dict comprehension looks like this:

`dict_comp = {key-expr : value-expr for value in collection if condition}`

The examples below use a new list, `fruits`, and then the list `strings` from above.

In [9]:
# create a dictionary, where the key is the word capitalized, and the value is the length of the word
fruits = ['apple', 'mango', 'banana','cherry']
{f.capitalize():len(f) for f in fruits}

{'Apple': 5, 'Mango': 5, 'Banana': 6, 'Cherry': 6}
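The general form also allows an `if condition`, which the example above does not use. The sketch below adds one, and also shows what happens when two items produce the same key: the later item overwrites the earlier one. The length threshold is an arbitrary choice for illustration.

```python
fruits = ['apple', 'mango', 'banana', 'cherry']

# keep only the fruits with more than 5 letters
long_fruits = {f.capitalize(): len(f) for f in fruits if len(f) > 5}
print(long_fruits)   # {'Banana': 6, 'Cherry': 6}

# duplicate keys: 'apple' and 'mango' both have length 5, so 'mango' wins
by_length = {len(f): f for f in fruits}
print(by_length)     # {5: 'mango', 6: 'cherry'}
```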

In [10]:
# create a dictionary where the key is the index, and the value is the string in the strings list.
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

In [11]:
list(enumerate(strings))  # enumerate produces a collection of tuples, with index and value

[(0, 'a'), (1, 'as'), (2, 'bat'), (3, 'car'), (4, 'dove'), (5, 'python')]

In [12]:
index_map = {index:val for index, val in enumerate(strings)}
index_map

{0: 'a', 1: 'as', 2: 'bat', 3: 'car', 4: 'dove', 5: 'python'}

In [13]:
# note that enumerate returns tuples in the order (index, val)
# in the creation of a dictionary, you can swap those positions
# and even apply functions to them

# We create a dictionary where the key is the string, and the value is the index in the strings list.
loc_mapping = {val : index for index, val in enumerate(strings)}
loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

In [14]:
index_map['a']

KeyError: 'a'

In [15]:
loc_mapping['a']

0

In [16]:
# combine dictionaries with ** unpacking (to be covered later)
dd = {**loc_mapping, **index_map}
print(dd)

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5, 0: 'a', 1: 'as', 2: 'bat', 3: 'car', 4: 'dove', 5: 'python'}


In [17]:
# even better... use dict.update(). This modifies the dictionary in place
loc_mapping.update(index_map)
loc_mapping

{'a': 0,
 'as': 1,
 'bat': 2,
 'car': 3,
 'dove': 4,
 'python': 5,
 0: 'a',
 1: 'as',
 2: 'bat',
 3: 'car',
 4: 'dove',
 5: 'python'}
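The two approaches differ in one important way: `{**a, **b}` builds a new dictionary, while `a.update(b)` changes `a` itself. In both cases, when a key appears in both dictionaries, the value from the second one is kept. A minimal sketch with two small made-up dictionaries:

```python
a = {'x': 1, 'y': 2}
b = {'y': 20, 'z': 30}

merged = {**a, **b}      # new dictionary; a and b are untouched
print(merged)            # {'x': 1, 'y': 20, 'z': 30}
a_before = dict(a)
print(a_before)          # {'x': 1, 'y': 2}

a.update(b)              # modifies a in place and returns None
print(a)                 # {'x': 1, 'y': 20, 'z': 30}
```

In Python 3.9 and later, `a | b` produces the same merged result as `{**a, **b}`.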

# Prompting the user for input

You can prompt the user for input with the function `input()`.

In [18]:
name = input()
print('Hello, ' + name)

Miles
Hello, Miles


In [19]:
name = input()
print(type(name))

123
<class 'str'>
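Because `input()` always returns a string, numeric input has to be converted before doing arithmetic. In the sketch below, the variable `typed` stands in for a value returned by `input()` so that the example runs without a keyboard.

```python
typed = '123'                 # stands in for the result of input()

concatenated = typed + '1'    # + on strings concatenates
print(concatenated)           # 1231

added = int(typed) + 1        # convert first to do arithmetic
print(added)                  # 124

print(typed.isdigit())        # True: every character is a digit
print('12a'.isdigit())        # False
```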


In [20]:
# This is a Guess the Number game.
# adapted from Invent Your Own Computer Games with Python by Al Sweigart
import numpy as np

guessesTaken = 0

print('Hello! What is your name?')
myName = input()

number = np.random.randint(1, 21)
print('Hi ' + myName + ', I am thinking of a number between 1 and 20. You have up to 6 guesses.')

for guessesTaken in range(6):
    print('Take a guess.')
    guess = input()
    guess = int(guess)  # convert the string into an integer
    # if the input can't be converted, int() raises a ValueError

    if guess < number:
        print('Your guess is too low.')

    if guess > number:
        print('Your guess is too high.')

    if guess == number:
        break

if guess == number:
    guessesTaken = str(guessesTaken + 1)
    print('Good job, ' + myName + '! You guessed my number in ' +  
      guessesTaken + ' guesses!')

if guess != number:
    number = str(number)
    print('Nope. The number I was thinking of was ' + number + '.')

Hello! What is your name?
Miles
Hi Miles, I am thinking of a number between 1 and 20. You have up to 6 guesses.
Take a guess.
10
Your guess is too high.
Take a guess.
5
Your guess is too low.
Take a guess.
8
Your guess is too high.
Take a guess.
7
Good job, Miles! You guessed my number in 4 guesses!
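The game above stops with an error if the player types something that is not a number. One possible way to handle that is to catch the `ValueError` and ask again. The sketch below is not the original game: the function `play` and its parameters are hypothetical, and it takes the guesses as a list of strings instead of calling `input()`, so the logic can be run and checked without typing.

```python
def play(number, guesses, max_guesses=6):
    """Return (won, guesses_used) for a secret number and a sequence of typed guesses."""
    used = 0
    for text in guesses:
        if used == max_guesses:
            break                 # out of guesses
        try:
            guess = int(text)     # the same conversion as in the game
        except ValueError:
            continue              # invalid input is skipped and not counted
        used += 1
        if guess == number:
            return True, used
    return False, used

print(play(7, ['10', '5', '8', '7']))   # (True, 4)
print(play(7, ['ten', '7']))            # (True, 1)
```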


# Functions and scope

If Python can't find a variable in a function's body, it looks for it in the next higher scope. Each function has a parent scope, which is the scope where the function is defined.

Variables assigned inside a function exist only inside that function.

In [21]:
x, y, z = 1, 1, 1

def f():
    y = 2  # changing y to 2, only affects the value inside the function
    return(x,y,z)  # it does not find x or z in the local environment, so it searches the higher scope

print(f())
print(x,y,z)

(1, 2, 1)
1 1 1
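The lookup only goes outward. A name assigned inside a function cannot be seen from outside it, as this small sketch shows (the name `local_only` is made up for the example):

```python
def f():
    local_only = 'inside f'   # exists only while f is running
    return local_only

result = f()
print(result)                 # inside f

try:
    local_only                # not defined out here
except NameError as err:
    message = str(err)
    print('NameError:', message)
```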


In [22]:
x, y, z = 1, 1, 1

# g is defined inside f
# function f returns the value of g()
# g() uses z = 3, can't find x, or y
# it searches the higher scope f() for x and y
# it uses f's value of y = 2
# it uses global x = 1

def f():
    y = 2
    def g():
        z = 3
        return(x, y, z) 
    return(g()) 

print(f())
print(x, y, z)

(1, 2, 3)
1 1 1


In [23]:
x, y, z = 1, 1, 1

# g is defined in the global environment
# function f returns the value of g()
# g() uses z = 3, can't find x, or y
# it searches the global environment for x and y because g is defined in the global environment
# it uses global x = 1, and y = 1

def g():
    z = 3
    return(x, y, z)

def f():
    y = 2
    return(g()) 

print(f())
print(x, y, z)

(1, 1, 3)
1 1 1


In [24]:
# the keyword global lets a function assign to a variable in the global environment
x, y, z = 1, 1, 1

def f():
    y = 2
    def g():
        global z  # declaring z global makes assignments in g apply to the global z
        z = 3     # will assign 3 to the global variable z
        return(x, y, z) 
    return(g()) 

print(f())
print(x, y, z)

(1, 2, 3)
1 1 3


In [25]:
x, y, z = 1, 1, 1

def g():
    z = 3
    return(x, y, z)

def f():
    global y
    y = 2
    return(g()) 

print(g()) # when we first run g(), it uses the global values of x and y, but the local value of z
# the local value of z does not affect the global value of z
print(x, y, z)

(1, 1, 3)
1 1 1


In [26]:
print(f())  # when we run f(), the global value of y is changed
# it calls g(), which uses its own local value of z = 3,
# and uses the global value of y, which has been changed
print(x, y, z)

(1, 2, 3)
1 2 1


In [27]:
print(g())
print(x, y, z)

(1, 2, 3)
1 2 1


In [28]:
p, q = 1, 1

def f():
    global s   # will create s in the global environment
    s = 2
    return(p, q, s)
f()

(1, 1, 2)

In [29]:
x, y, z = 1, 1, 1

def f():
    global y
    y = 4
    def g():
        global y 
        y = 10 
        global z  
        z = 3
        return(x, y, z) 
    return(g()) 

print(f())
print(x, y, z)

(1, 10, 3)
1 10 3


In [30]:
# the keyword nonlocal refers to a variable in the nearest enclosing function scope
# so that assignments modify that variable
x, y, z = 1, 1, 1

def f():
    y = 4
    def g():
        nonlocal y  
        y = 10  # affects the y defined inside f
        global z  
        z = 3
        return(x, y, z)
    print(x, y, z)  # this line is run before g() is called
    return(g())  # when g() is called, y will be modified

print(f())
print(x, y, z)

1 4 1
(1, 10, 3)
1 1 3
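A common use of `nonlocal` is a function that remembers state between calls. The sketch below is an illustrative example, not part of the original notes: each call to the hypothetical `make_counter` creates its own `count`, and the inner function modifies it.

```python
def make_counter():
    count = 0                 # lives in make_counter's scope
    def increment():
        nonlocal count        # assign to the count defined in make_counter
        count += 1
        return count
    return increment

counter_a = make_counter()
counter_b = make_counter()

a_values = [counter_a(), counter_a(), counter_a()]
b_values = [counter_b()]
print(a_values)   # [1, 2, 3]
print(b_values)   # [1]  (counter_b has its own separate count)
```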


In [31]:
p, q = 1, 1

def f():
    nonlocal r   # raises a SyntaxError because r is not defined in any enclosing function scope
    r = 2
    return(p, q, r)

f()

SyntaxError: no binding for nonlocal 'r' found (<ipython-input-31-df8bb5ee4b32>, line 4)
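Note that this is a `SyntaxError`, so Python rejects the code when it compiles the function definition, before `f()` is ever called. One way to see this is to compile the source text directly with the built-in `compile()`; the function is never run.

```python
source = '''
def f():
    nonlocal r
    r = 2
'''

message = None
try:
    compile(source, '<example>', 'exec')   # compiles only; nothing is executed
except SyntaxError as err:
    message = err.msg
    print('SyntaxError:', message)
```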

# Python functions are objects themselves

You can reference Python functions as objects.
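In practice this means a function can be assigned to another name, stored in a collection, or passed to another function, like any other value. A brief sketch with made-up names (`shout`, `apply_twice`):

```python
def shout(text):
    return text.upper() + '!'

alias = shout                  # no parentheses: this refers to the function itself
print(alias('hello'))          # HELLO!
print(alias is shout)          # True
print(shout.__name__)          # shout

# functions stored in a dictionary
ops = {'upper': str.upper, 'title': str.title, 'shout': shout}
print(ops['title']('west virginia'))   # West Virginia

# a function passed as an argument
def apply_twice(func, value):
    return func(func(value))

print(apply_twice(shout, 'hi'))        # HI!!
```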

In [32]:
states = ['   Alabama ', 'Georgia!', 'Georgia', 'georgia', 
          'FlOrIda', 'south  carolina##', 'West virginia?']

In [33]:
import re  # package for regular expressions

# here is a function that applies a series of operations to clean up the strings

def clean_strings1(strings):
    result = []
    for value in strings:
        value = value.strip()  # strip whitespace
        value = re.sub('[!#?]', '', value)  # substitutes the characters !, #, ? with ''
        value = value.title()  # title case
        result.append(value)
    return result

In [34]:
clean_strings1(states)  # when we apply the function to the list, it cleans up the messy text

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South  Carolina',
 'West Virginia']

In [35]:
# we define a new function called remove_punctuation
def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

In [36]:
# this is a list of functions 
clean_ops = [str.strip, remove_punctuation, str.title]

In [37]:
# just to demonstrate what these functions do...
str.strip('    alabama    ')

'alabama'

In [38]:
# the function clean_strings2 takes two arguments:
# a list of strings
# a list of functions
def clean_strings2(strings, ops):
    result = []
    for value in strings:            # we loop over each string
        for function in ops:         # for each string, we loop over the functions listed in ops
            value = function(value)  # we update the value each time
        result.append(value)         # we append the value to the result list
    return result

In [39]:
clean_strings2(states, clean_ops)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South  Carolina',
 'West Virginia']

In [40]:
clean_strings2(states, [str.strip, remove_punctuation, str.upper, lambda x: re.sub('  ',' ', x)])  
# I can provide a different list of functions

['ALABAMA',
 'GEORGIA',
 'GEORGIA',
 'GEORGIA',
 'FLORIDA',
 'SOUTH CAROLINA',
 'WEST VIRGINIA']
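Because functions are objects, the list of operations can also be combined into a single new function. The sketch below is one possible way to do that with `functools.reduce`; the helper `pipeline` is hypothetical, and the list of states is shortened.

```python
import re
from functools import reduce

def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

def pipeline(*funcs):
    """Return a function that applies funcs to a value, left to right."""
    def run(value):
        return reduce(lambda acc, func: func(acc), funcs, value)
    return run

clean = pipeline(str.strip, remove_punctuation, str.title)

states = ['   Alabama ', 'Georgia!', 'FlOrIda', 'West virginia?']
cleaned = [clean(s) for s in states]
print(cleaned)   # ['Alabama', 'Georgia', 'Florida', 'West Virginia']
```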

In [41]:
# the Python function map() takes a function as an argument and applies it to each element of an iterable

map(str.strip, states)  # map returns a map object

<map at 0x276330b5a20>

In [42]:
# to see the contents of the map object, you can put it into a list:
# map only allows you to specify one function
list(map(str.strip, states))

['Alabama',
 'Georgia!',
 'Georgia',
 'georgia',
 'FlOrIda',
 'south  carolina##',
 'West virginia?']
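A map object computes its values only when asked and can be consumed only once. The sketch below shows this with a shortened list of states, and checks that a list comprehension gives the same result.

```python
states = ['   Alabama ', 'Georgia!', 'georgia']

stripped = map(str.strip, states)   # nothing is computed yet
first_pass = list(stripped)         # consumes the map object
second_pass = list(stripped)        # already exhausted

print(first_pass)    # ['Alabama', 'Georgia!', 'georgia']
print(second_pass)   # []

# the equivalent list comprehension
same = [s.strip() for s in states] == first_pass
print(same)          # True
```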

# Lambda functions

In one of the earlier examples, I created a lambda function.

A lambda function allows you to create and use a new short function without having to formally define it.

In [43]:
# I could define a function that replaces two spaces with one space:
def replace_space(x):
    return(re.sub('  ', ' ', x))

In [44]:
# and then apply it to the strings:
list(map(replace_space, states))

['  Alabama ',
 'Georgia!',
 'Georgia',
 'georgia',
 'FlOrIda',
 'south carolina##',
 'West virginia?']

In [45]:
# however, because the code for the function is so short, it might be easier to just create
# a quick function without a formal name. These 'anonymous' functions are also known as lambda functions

list(map(lambda x: re.sub('  ',' ', x), states))

['  Alabama ',
 'Georgia!',
 'Georgia',
 'georgia',
 'FlOrIda',
 'south carolina##',
 'West virginia?']

Lambda functions are written in the form:

`lambda argument1, argument2, etc: expression to return`

In [46]:
# lambda functions can accept multiple arguments
# if you use it with map, you'll need to provide one iterable for each argument
list(map(lambda x, y: x + y, [1,2,3], [100,200,300]))

[101, 202, 303]
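Two further points, sketched below under illustrative inputs: lambda functions are often passed as the `key` argument of `sorted()`, and when `map()` receives iterables of different lengths it stops at the shortest one.

```python
strings = ['python', 'a', 'dove', 'as', 'bat', 'car']

# sort by length; words of equal length keep their original order
by_length = sorted(strings, key=lambda s: len(s))
print(by_length)   # ['a', 'as', 'bat', 'car', 'dove', 'python']

# map stops when the shortest iterable runs out
sums = list(map(lambda x, y: x + y, [1, 2, 3], [100, 200]))
print(sums)        # [101, 202]

# a lambda can be given a name, although def is usually clearer for that
add = lambda x, y: x + y
print(add(2, 3))   # 5
```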