# Lecture 9-1

# Pythonic Features

## Week 9 Monday

## Miles Chen, PhD

## Named Tuples

Named tuples are a quick and simple way to define a new class when the class only needs to hold values and does not require its own methods.

Recall we defined a class Point with the following definition.

```
class Point:
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y
    def __str__(self):
        return '(%g, %g)' % (self.x, self.y)
```

A named tuple can be created that behaves in a nearly identical fashion. You will need to import `namedtuple` from the `collections` module.

In [1]:
from collections import namedtuple

Once we have imported namedtuple, we can create a named tuple.

We'll create a named tuple class called `Point` that contains two fields, `x` and `y`.

In [2]:
Point = namedtuple('Point', ['x', 'y'])

In [3]:
Point

__main__.Point

With our namedtuple defined, we can create instances of it like we would any other class.

In [4]:
p = Point(1, 2)

In [5]:
p

Point(x=1, y=2)

Now that we have created an instance of the named tuple, we can access the values using dot notation. Because it is a tuple, we can also access values by position using square-bracket indexing.

In [6]:
p.x

1

In [7]:
p.y

2

In [8]:
p[0]

1

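A named tuple inherits all of the methods associated with tuples, such as comparison and "addition". Comparison is element by element, from left to right, and `+` concatenates, so the result is a plain tuple rather than a `Point`.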

In [9]:
p1 = Point(0, 1)
p2 = Point(3, 4)
p3 = Point(2, 2)

In [10]:
p1 > p2

False

In [11]:
p1 + p2

(0, 1, 3, 4)

In [12]:
l = [p1, p2, p3]
l 

[Point(x=0, y=1), Point(x=3, y=4), Point(x=2, y=2)]

In [13]:
sorted(l)

[Point(x=0, y=1), Point(x=2, y=2), Point(x=3, y=4)]
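Because a named tuple is still a tuple, it also supports unpacking and cannot be modified in place. The following is a minimal sketch of that behavior. `_replace` and `_asdict` are standard named tuple methods that are not otherwise used in this lecture, and the variable names are only illustrative.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(1, 2)

# Unpacking works exactly as it does for a plain tuple.
x, y = p
print(x, y)  # 1 2

# Named tuples are immutable: assigning to a field raises AttributeError.
try:
    p.x = 10
    mutated = True
except AttributeError:
    mutated = False
print(mutated)  # False

# _replace returns a new instance with the chosen field changed;
# the original instance is left untouched.
q = p._replace(x=10)
print(q)  # Point(x=10, y=2)
print(p)  # Point(x=1, y=2)

# _asdict converts the instance to a dictionary keyed by field name.
as_dict = p._asdict()
print(as_dict['x'], as_dict['y'])  # 1 2
```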

If the class definition needs to become more complicated, you can define a new class that inherits from the named tuple.

In [14]:
class Vector(Point):
    """A class based on the named tuple Point"""
    def __add__(self, other):
        return Vector(x = self.x + other.x, y = self.y + other.y)

In [15]:
v1 = Vector(0, 1)
v2 = Vector(3, 4)
v3 = v1 + v2

In [16]:
v3

Vector(x=3, y=5)

## Counters

A `Counter` is a dictionary subclass that is useful for quickly tallying elements.

In [17]:
from collections import Counter

In [18]:
wordlist = ['red', 'blue', 'red', 'green', 'blue', 'blue']

In [19]:
tally = Counter(wordlist)

In [20]:
tally

Counter({'blue': 3, 'red': 2, 'green': 1})

In [21]:
tally.most_common(2)

[('blue', 3), ('red', 2)]
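To see what `Counter` saves us from writing, here is a sketch that compares it with a manual tally in a plain dictionary. It also shows two ways a `Counter` differs from a regular dict: a missing key returns 0, and `update()` adds to existing counts. The name `manual` is just for illustration.

```python
from collections import Counter

wordlist = ['red', 'blue', 'red', 'green', 'blue', 'blue']

# Manual tally with a plain dictionary.
manual = {}
for word in wordlist:
    manual[word] = manual.get(word, 0) + 1

tally = Counter(wordlist)

# The Counter holds the same keys and counts as the manual tally.
print(dict(tally) == manual)  # True

# Unlike a plain dict, a missing key returns 0 instead of raising KeyError.
print(tally['purple'])  # 0

# update() adds counts from another iterable rather than overwriting them.
tally.update(['green', 'green'])
print(tally['green'])  # 3
```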

## List comprehensions

List comprehensions allow us to create new lists concisely based on an existing collection.

They take the form:

`[expr for val in collection if condition]`

This is basically equivalent to the following loop:

```
result = []
for val in collection:
    if condition:
        result.append(expr)
```
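As a quick check of that equivalence, the sketch below builds the same list both ways and compares the results. The filter (even numbers only) and the variable names are chosen only for illustration.

```python
numbers = range(1, 11)

# Comprehension form: square only the even numbers.
squares_comp = [n ** 2 for n in numbers if n % 2 == 0]

# Equivalent loop form.
squares_loop = []
for n in numbers:
    if n % 2 == 0:
        squares_loop.append(n ** 2)

print(squares_comp)                  # [4, 16, 36, 64, 100]
print(squares_comp == squares_loop)  # True
```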

In [22]:
# make a list of the squares 
[x ** 2 for x in range(1, 11)]

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

In [23]:
import numpy as np
np.array([x**2 for x in range(1, 11)])

array([  1,   4,   9,  16,  25,  36,  49,  64,  81, 100])

In [24]:
# square only the odd numbers
[x**2 for x in range(1, 11) if x % 2 == 1]

[1, 9, 25, 49, 81]

In [25]:
# take a list of strings, and write the words that are over 2 characters long in uppercase.
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

You can create a list comprehension from any iterable (list, tuple, string, etc.).

In [26]:
# extract the digits from a string
string = "Hello 963257 World"
[int(x) for x in string if x.isdigit()]
# for x in string, will look at each character individually
# if x is a digit, then convert it using int()

[9, 6, 3, 2, 5, 7]

In [27]:
# iterate over a dictionary's items
d = {'a':'apple', 'b':'banana', 'c':'carrots', 'd':'donut', 'e':'eggs'}

In [28]:
list(d.items())  # recall that dict.items() returns a view of (key, value) tuples; list() turns it into a list

[('a', 'apple'),
 ('b', 'banana'),
 ('c', 'carrots'),
 ('d', 'donut'),
 ('e', 'eggs')]

In [29]:
['%s is for %s' % (key, value) for key, value in d.items() if key not in ('b', 'd') ]

['a is for apple', 'c is for carrots', 'e is for eggs']

## Dictionary Comprehensions

A dict comprehension looks like this:

`dict_comp = {key-expr : value-expr for value in collection if condition}`

In [30]:
# create a dictionary, where the key is the word capitalized, and the value is the length of the word
fruits = ['apple', 'mango', 'banana', 'cherry']
{f.capitalize():len(f) for f in fruits}

{'Apple': 5, 'Mango': 5, 'Banana': 6, 'Cherry': 6}
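The optional `if condition` part works the same way as in a list comprehension. The sketch below keeps only the fruits with more than five letters and builds the same dictionary with an explicit loop for comparison. The threshold of five is arbitrary.

```python
fruits = ['apple', 'mango', 'banana', 'cherry']

# Dict comprehension with a filter.
lengths_comp = {f.capitalize(): len(f) for f in fruits if len(f) > 5}

# Equivalent loop form.
lengths_loop = {}
for f in fruits:
    if len(f) > 5:
        lengths_loop[f.capitalize()] = len(f)

print(lengths_comp)                  # {'Banana': 6, 'Cherry': 6}
print(lengths_comp == lengths_loop)  # True
```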

In [31]:
# create a dictionary where the key is the index, and the value is the string in the strings list.
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

In [32]:
list(enumerate(strings))  # enumerate yields (index, value) tuples

[(0, 'a'), (1, 'as'), (2, 'bat'), (3, 'car'), (4, 'dove'), (5, 'python')]

In [33]:
index_map = {index:val for index, val in enumerate(strings)}
index_map

{0: 'a', 1: 'as', 2: 'bat', 3: 'car', 4: 'dove', 5: 'python'}

In [34]:
# note that enumerate returns tuples in the order (index, val)
# in the creation of a dictionary, you can swap those positions
# and even apply functions to them

# We create a dictionary where the key is the string, and the value is the index in the strings list.
loc_mapping = {val : index for index, val in enumerate(strings)}
loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

In [35]:
index_map['a']

KeyError: 'a'

In [36]:
loc_mapping['a']

0

In [37]:
# combine dictionaries with dictionary unpacking (**)
dd = {**loc_mapping, **index_map}
print(dd)

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5, 0: 'a', 1: 'as', 2: 'bat', 3: 'car', 4: 'dove', 5: 'python'}


In [38]:
# alternatively, use dict.update(). This modifies loc_mapping in place instead of creating a new dictionary
loc_mapping.update(index_map)
loc_mapping

{'a': 0,
 'as': 1,
 'bat': 2,
 'car': 3,
 'dove': 4,
 'python': 5,
 0: 'a',
 1: 'as',
 2: 'bat',
 3: 'car',
 4: 'dove',
 5: 'python'}
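The two dictionaries above share no keys, so nothing is overwritten. When keys do overlap, the value from the dictionary that comes later wins, both with `**` unpacking and with `update()`. Here is a small sketch with made-up dictionaries (`defaults` and `overrides` are hypothetical names).

```python
defaults = {'color': 'red', 'size': 'medium'}
overrides = {'size': 'large', 'shape': 'round'}

# Unpacking builds a new dictionary; later values replace earlier ones.
merged = {**defaults, **overrides}
print(merged)  # {'color': 'red', 'size': 'large', 'shape': 'round'}

# update() gives the same result but changes the dictionary it is called on,
# so we work on a copy to keep defaults intact.
combined = defaults.copy()
combined.update(overrides)
print(combined == merged)  # True
print(defaults)            # {'color': 'red', 'size': 'medium'}
```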

## Generator Expressions

Generator expressions are similar to list comprehensions, with the key difference being that they are *lazy*.

You create them with parentheses instead of square brackets.

The result is a generator object. You can access values in the generator one at a time using `next()`.

In [39]:
g = (n**2 for n in range(12))

In [40]:
g

<generator object <genexpr> at 0x000002631B870B30>

In [41]:
next(g)

0

In [42]:
next(g)

1

In [43]:
next(g)

4

In [44]:
next(g)

9

In [45]:
for val in g:
    print(val)

16
25
36
49
64
81
100
121


In [46]:
next(g) # calling next after the generator has run out of values raises StopIteration

StopIteration: 
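If you are calling `next()` by hand, there are two common ways to deal with an exhausted generator: pass a default value as the second argument to `next()`, or catch `StopIteration`. The sketch below shows both. Using `None` as the default is a choice made for this example and only works because the generator never yields `None`.

```python
g = (n ** 2 for n in range(3))

# next(g, default) returns the default instead of raising StopIteration.
values = []
while True:
    val = next(g, None)
    if val is None:
        break
    values.append(val)
print(values)  # [0, 1, 4]

# Alternatively, catch the exception explicitly.
g2 = (n ** 2 for n in range(3))
list(g2)  # consume every value so that g2 is exhausted
try:
    next(g2)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True
```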

## List Comprehension vs Generator Expressions in Python

The big difference between a list comprehension and a generator expression is that the generator is **lazy**.

The list comprehension will evaluate the entire sequence of iterations. The generator will only generate the next value when it is asked to do so.

Depending on the expression that needs to be evaluated, you may prefer to use a generator over the list comprehension.
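One way to observe the laziness directly is to record every call to the function used inside the expression. In this sketch, `noisy_square` and the `calls` list are hypothetical helpers written only for the demonstration.

```python
calls = []

def noisy_square(n):
    """Square n and record that the function was called (illustration only)."""
    calls.append(n)
    return n ** 2

# The list comprehension calls noisy_square for every element immediately.
squares_list = [noisy_square(n) for n in range(5)]
calls_after_list = len(calls)

calls.clear()

# The generator expression calls nothing until a value is requested.
squares_gen = (noisy_square(n) for n in range(5))
calls_after_gen = len(calls)

# Asking for one value triggers exactly one call.
first = next(squares_gen)
calls_after_next = len(calls)

print(calls_after_list, calls_after_gen, calls_after_next)  # 5 0 1
```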

The following examples are from: https://code-maven.com/list-comprehension-vs-generator-expression

In [47]:
l = [n*2 for n in range(1000)] # List comprehension
g = (n*2 for n in range(1000))  # Generator expression

In [48]:
print(type(l))  # 'list'
print(type(g))  # 'generator'

<class 'list'>
<class 'generator'>


In [49]:
import sys
print(sys.getsizeof(l))  # the list object stores a reference to every element
print(sys.getsizeof(g))  # the generator object stores only its current state

8856
112


In [50]:
# cannot access values in a generator by index
print(l[4])   # 8
print(g[4])   # TypeError: 'generator' object is not subscriptable

8


TypeError: 'generator' object is not subscriptable

In [51]:
# you can iterate over lists and generators
for item in l:
    print(item)
    if item > 12:
        break

0
2
4
6
8
10
12
14


In [52]:
for item in g:
    print(item)
    if item > 12:
        break

0
2
4
6
8
10
12
14


In [53]:
g

<generator object <genexpr> at 0x000002631B748820>

In [54]:
# sum demands that all remaining elements of g be calculated, so the generator evaluates them and provides the sum
# note that the first 8 values have already been consumed by the loop above, so the sum begins at n = 8
sum(g) 

998944

In [55]:
sum(l) # the list has all of the values in memory ready to be summed

999000

In [56]:
sum(l[8:]) # to get the equivalent sum, we can start it at 8

998944

In [57]:
# now that the generator has finished running, there are no more values left to evaluate
sum(g)

0

In [58]:
# the list is unaffected by calling sum on it.
sum(l)

999000

## map and lambda functions

The `map(function, iterable)` function takes a function and applies it to each element of an iterable. It returns a map object, which is itself a lazy iterator: values are computed only as they are requested.

A lambda function allows you to create and use a new short function without having to formally define it.

In [59]:
# the module re is used for regular expressions
import re

In [60]:
# re.sub substitutes one pattern of text with another.
# Here we define a function that replaces multiple instances of white space (\s+) with one space:
def replace_space(x):
    return re.sub(r'\s+', ' ', x)  # raw string so that \s is passed to re unchanged

In [61]:
replace_space('Hello     Alabama ')

'Hello Alabama '

In [62]:
text = ['Hello     Alabama', 
        'Georgia!',
        'Georgia',
        'georgia', 
        'FlOrIda',
        'south  carolina##',
        'West virginia?']

In [63]:
map(replace_space, text)

<map at 0x2631add45b0>

In [64]:
g2 = map(replace_space, text)

In [65]:
next(g2)

'Hello Alabama'

In [66]:
next(g2)

'Georgia!'

In [67]:
# we can use the map function to map the replace_space() function to each element of the list text
for item in map(replace_space, text):
    print(item)

Hello Alabama
Georgia!
Georgia
georgia
FlOrIda
south carolina##
West virginia?


In [68]:
# we can also put the map results inside a list
list(map(replace_space, text))

['Hello Alabama',
 'Georgia!',
 'Georgia',
 'georgia',
 'FlOrIda',
 'south carolina##',
 'West virginia?']

In [69]:
# however, because the code for the function is so short, it might be easier to just create
# a quick function without a formal name. These 'anonymous' functions are also known as lambda functions
list(map(lambda x: re.sub(r'\s+', ' ', x), text))

['Hello Alabama',
 'Georgia!',
 'Georgia',
 'georgia',
 'FlOrIda',
 'south carolina##',
 'West virginia?']

In [70]:
# here's a similar function that turns the text into title case.
list(map(lambda string: string.title(), text))

['Hello     Alabama',
 'Georgia!',
 'Georgia',
 'Georgia',
 'Florida',
 'South  Carolina##',
 'West Virginia?']
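The `map` calls above can also be written as list comprehensions, and many Python programmers prefer that form when the result is needed as a list. The sketch below compares the three spellings and also shows that a map object, like a generator, can only be consumed once. The `shout` function and the `words` list are made up for this example.

```python
words = ['alpha', 'beta', 'gamma']

def shout(word):
    """Return the word in uppercase followed by an exclamation mark."""
    return word.upper() + '!'

mapped = map(shout, words)

via_map = list(mapped)                                     # named function
via_lambda = list(map(lambda w: w.upper() + '!', words))   # anonymous function
via_comp = [w.upper() + '!' for w in words]                # list comprehension

print(via_map)                             # ['ALPHA!', 'BETA!', 'GAMMA!']
print(via_map == via_lambda == via_comp)   # True

# The map object was exhausted by the first list() call.
second_pass = list(mapped)
print(second_pass)                         # []
```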

Lambda functions are written in the form:

`lambda argument1, argument2, etc: expression to return`

In [71]:
# lambda functions can also accept multiple arguments
# if you use one with map, you'll need to provide one iterable for each argument
list(map(lambda x, y: x + y, [1, 2, 3], [100, 200, 300]))

[101, 202, 303]
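Two details are worth knowing when mapping over several iterables. First, `map` stops as soon as the shortest iterable runs out. Second, the same result can be written with `zip` inside a list comprehension. A minimal sketch with illustrative values:

```python
# map stops at the end of the shortest iterable, so the 3 is never used.
short_sums = list(map(lambda x, y: x + y, [1, 2, 3], [100, 200]))
print(short_sums)  # [101, 202]

# zip pairs up the elements, giving an equivalent list comprehension.
zip_sums = [x + y for x, y in zip([1, 2, 3], [100, 200, 300])]
print(zip_sums)    # [101, 202, 303]
```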