In this string,

• {0:.2f} means to format the first argument as a floating-point number with two
decimal places.

• {1:s} means to format the second argument as a string.

• {2:d} means to format the third argument as an exact integer.
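
The string these bullets describe is not shown here. As a minimal sketch, assuming an illustrative template and arguments (both chosen only for demonstration), the three format specifiers behave like this:

```python
# Illustrative template (an assumption): one float, one string, one integer
template = '{0:.2f} {1:s} are worth US${2:d}'

# {0:.2f} rounds 4.5560 to two decimals, {1:s} inserts the string,
# and {2:d} formats the integer 1
formatted = template.format(4.5560, 'Argentine Pesos', 1)
print(formatted)  # 4.56 Argentine Pesos are worth US$1
```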

In [3]:
a = range(10)

In [7]:
for i in a:
    print(i)

0
1
2
3
4
5
6
7
8
9


In [8]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [9]:
list(range(0,20,2))

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

In [10]:
list(range(5, 0, -1))

[5, 4, 3, 2, 1]

In [12]:
seq = [1, 2, 3, 4]
for i in range(len(seq)):
    val = seq[i]

### Chapter 3 - Data Structures in Python

#### Tuple

A tuple is a fixed-length, immutable sequence of Python objects.

In [16]:
tup = 1, 2, 3,

In [17]:
tup

(1, 2, 3)

In [25]:
nested_tup = (4,[1,2,3],6), (2,3)

In [19]:
nested_tup

((4, [1, 2, 3], 6), (2, 3))

Elements can be accessed with square brackets [], as with most other sequence types:

In [22]:
nested_tup[0]

(4, [1, 2, 3], 6)

In [23]:
nested_tup[0][0]

4

While the objects stored in a tuple may be mutable themselves, once the tuple is
created it’s not possible to modify which object is stored in each slot:

In [24]:
nested_tup[2] = 'renato'

TypeError: 'tuple' object does not support item assignment

If an object inside a tuple is mutable, such as a list, you can modify it in-place:

In [26]:
nested_tup[0][1].append(4)

In [27]:
nested_tup

((4, [1, 2, 3, 4], 6), (2, 3))

In [29]:
(1, 4) + (2,4) + (3,4,5)

(1, 4, 2, 4, 3, 4, 5)

In [30]:
(1, 4) * 4

(1, 4, 1, 4, 1, 4, 1, 4)

Note that the objects themselves are not copied, only the references to them.
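
A small sketch of what "only the references are copied" means in practice, using a hypothetical tuple that holds a single mutable list: after multiplication, every slot points at the same list object.

```python
inner = []                # a single mutable object
repeated = (inner,) * 3   # three references to that same list, not three copies

inner.append('x')         # mutate the one underlying list

print(repeated)                       # (['x'], ['x'], ['x'])
print(repeated[0] is repeated[2])     # True: same object in every slot
```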

**Unpacking**

In [31]:
a, b = 1, 2

In [32]:
a

1

In [33]:
b, a = a, b

In [34]:
a

2

In [35]:
seq = [(1,2,3), (4,5,6), (7,8,9)]

In [38]:
for a, b, c in seq:
    print('a={0}, b={1}, c={2}'.format(a,b,c))

a=1, b=2, c=3
a=4, b=5, c=6
a=7, b=8, c=9


**rest**

In [39]:
values = 1,2,3,4,5

In [54]:
a,b, *rest = values

In [55]:
a, b

(1, 2)

In [56]:
rest

[3, 4, 5]

In [57]:
a, b, *_  = values

In [60]:
b

2

#### List

In contrast with tuples, lists are variable-length and their contents can be modified
in-place. You can define them using square brackets [] or using the list type function:

In [61]:
gen = range(10)

In [62]:
gen

range(0, 10)

In [64]:
type(gen)

range

In [67]:
list_example = list(gen)

In [69]:
list_example.append('Renato')

In [70]:
list_example

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 'Renato']

In [72]:
list_example.insert(0, 'begin')

In [73]:
list_example


['begin', 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 'Renato']

insert is computationally expensive compared with append, because references to
subsequent elements have to be shifted internally to make room for the new element.
If you need to insert elements at both the beginning and end of a sequence, you may
wish to explore collections.deque, a double-ended queue, for this purpose.
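
A minimal sketch of the collections.deque alternative mentioned above. The variable names are illustrative; appendleft and popleft are the deque counterparts of inserting at, and removing from, position 0.

```python
from collections import deque

dq = deque(range(5))

dq.appendleft('begin')   # add at the left end without shifting other elements
dq.append('end')         # add at the right end, as with a list

as_list = list(dq)
print(as_list)           # ['begin', 0, 1, 2, 3, 4, 'end']

first = dq.popleft()     # remove from the left end
print(first)             # begin
```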

In [77]:
fast = list(gen)
fast

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [78]:
fast.pop(1)

1

In [79]:
fast

[0, 2, 3, 4, 5, 6, 7, 8, 9]

Elements can be removed by value with remove, which locates the first such value and
removes it from the list:

In [82]:
fast.remove(7)


In [83]:
fast

[0, 2, 3, 4, 5, 6, 8, 9]

In [84]:
2 in fast

True

In [85]:
100 in fast

False

Checking whether a list contains a value is a lot slower than doing so with dicts and
sets (to be introduced shortly), as Python makes a linear scan across the values of the
list, whereas it can check the others (based on hash tables) in constant time.
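
One way to see the difference without relying on timings is to count equality comparisons. The sketch below uses a hypothetical wrapper class, Counted, written only for this illustration; it assumes CPython's usual behavior of comparing a set element only when the hashes match.

```python
class Counted:
    """Hypothetical wrapper that counts how often == is evaluated."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        Counted.comparisons += 1
        return self.value == other.value

    def __hash__(self):
        return hash(self.value)


items = [Counted(i) for i in range(1000)]
as_set = set(items)
target = Counted(999)    # equal to the last element, but a distinct object

Counted.comparisons = 0
found_in_list = target in items       # linear scan over every element
list_comparisons = Counted.comparisons

Counted.comparisons = 0
found_in_set = target in as_set       # hash lookup, then very few == checks
set_comparisons = Counted.comparisons

print(list_comparisons, set_comparisons)
```

The list needs one comparison per element scanned, while the set needs only as many comparisons as there are entries sharing the target's hash.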

**Concatenating and combining lists**

Note that list concatenation by addition is a comparatively expensive operation since
a new list must be created and the objects copied over. Using extend to append
elements to an existing list, especially if you are building up a large list, is
usually preferable. Thus,

In [88]:
everything = []
list_of_lists = [[1,2,3,4,],[2,3,4,5]]

for chunk in list_of_lists:
    everything.extend(chunk)

In [89]:
# is faster than the concatenative alternative:
everything = []
for chunk in list_of_lists:
    everything = everything + chunk

**Sorting**

In [90]:
a = [7, 2, 5, 1, 3]

In [91]:
a.sort()

In [92]:
a

[1, 2, 3, 5, 7]

In [93]:
b = ['saw', 'small', 'He', 'foxes', 'six']

In [94]:
b.sort(key=len)

In [95]:
b

['He', 'saw', 'six', 'small', 'foxes']

**Binary search and maintaining a sorted list**

In [96]:
import bisect

In [97]:
c = [1,2,2,2,3,4,7]

In [99]:
bisect.bisect(c, 2)

4

In [100]:
bisect.bisect(c, 5)

6

In [101]:
c

[1, 2, 2, 2, 3, 4, 7]

In [102]:
bisect.insort(c, 6)

In [103]:
c

[1, 2, 2, 2, 3, 4, 6, 7]

The bisect module functions do not check whether the list is sorted, as doing so
would be computationally expensive. Thus, using them with an unsorted list will
succeed without error but may lead to incorrect results.
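
If the input may not be sorted, one option is to check explicitly before inserting. The helpers below (is_sorted and checked_insort) are hypothetical names, not part of the bisect module, and the check costs a full pass over the list, which is the expense bisect avoids.

```python
import bisect


def is_sorted(values):
    """Return True if values is in non-decreasing order (linear-time check)."""
    return all(x <= y for x, y in zip(values, values[1:]))


def checked_insort(values, item):
    """Hypothetical wrapper: refuse to insert into an unsorted list."""
    if not is_sorted(values):
        raise ValueError('list must be sorted before calling insort')
    bisect.insort(values, item)


data = [1, 2, 2, 2, 3, 4, 7]
checked_insort(data, 6)
print(data)              # [1, 2, 2, 2, 3, 4, 6, 7]

unsorted = [3, 1, 2]
# Runs without error, but the returned index is not meaningful
position = bisect.bisect(unsorted, 1)
```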


**Slicing**

In [111]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq.sort()

seq[1:5]

[1, 2, 3, 5]

In [112]:
seq

[0, 1, 2, 3, 5, 6, 7, 7]

In [113]:
seq[3:4] = [6, 3]

In [114]:
seq

[0, 1, 2, 6, 3, 5, 6, 7, 7]

In [115]:
seq[:5]

[0, 1, 2, 6, 3]

In [116]:
seq[5:]

[5, 6, 7, 7]

In [117]:
seq.sort()

In [118]:
seq

[0, 1, 2, 3, 5, 6, 6, 7, 7]

In [119]:
seq.pop(7)

7

In [120]:
seq

[0, 1, 2, 3, 5, 6, 6, 7]

In [121]:
seq[-4:]

[5, 6, 6, 7]

In [122]:
seq[:-4]

[0, 1, 2, 3]

In [123]:
seq[-4:-4]

[]

In [136]:
seq[:-4] == seq[:4]

True

In [137]:
seq[-4:] == seq[4:]

True

In [139]:
seq.pop(6)


6

A step can also be used after a second colon to, say, take every other element:

In [141]:
seq[::2]

[0, 2, 5, 7]

A clever use of this is to pass -1, which has the useful effect of reversing a list or tuple:

In [143]:
seq[::-1]

[7, 6, 5, 3, 2, 1, 0]

**Built-in Sequence Functions**

**enumerate**

In [167]:
collection = ['foo', 'bar', 'baz']
mapping = {}

In [168]:
for i, v in enumerate(collection):
    mapping[v] = i

In [169]:
mapping

{'foo': 0, 'bar': 1, 'baz': 2}

**sorted**

In [170]:
sorted(collection)

['bar', 'baz', 'foo']

In [171]:
sorted(seq)

[0, 1, 2, 3, 5, 6, 7]

In [172]:
sorted('renato')

['a', 'e', 'n', 'o', 'r', 't']

**zip**

In [175]:
seq1 = ['foo', 'bar', 'baz']
seq2 = ['one', 'two', 'three']
zipped = zip(seq1, seq2)
list(zipped)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

zip can take an arbitrary number of sequences, and the number of elements it
produces is determined by the shortest sequence:

In [176]:
seq3 = [False, True]
list(zip(seq1, seq2, seq3))

[('foo', 'one', False), ('bar', 'two', True)]

A very common use of zip is simultaneously iterating over multiple sequences,
possibly also combined with enumerate:


In [177]:
for i, (a,b) in enumerate(zip(seq1, seq2)):
    print('{0}: {1}, {2}'.format(i, a, b))

0: foo, one
1: bar, two
2: baz, three


Given a “zipped” sequence, zip can be applied in a clever way to “unzip” the
sequence. Another way to think about this is converting a list of rows into a list of
columns. The syntax, which looks a bit magical, is:

In [183]:
pitchers = [('Nolan', 'Ryan'), ('Roger', 'Clemens'), ('Curt', 'Schilling')]
first_names, last_names = zip(*pitchers)

In [184]:
first_names

('Nolan', 'Roger', 'Curt')

In [185]:
last_names

('Ryan', 'Clemens', 'Schilling')

**reversed**

Keep in mind that reversed returns a lazy iterator (iterators and generators are
discussed in more detail later), so it does not create the reversed sequence until
materialized (e.g., with list or a for loop).

In [186]:
list(reversed(range(10)))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
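
One consequence of this laziness, shown as a short sketch with illustrative variable names: the object returned by reversed can be consumed only once.

```python
rev = reversed([1, 2, 3])

first_pass = list(rev)    # materializes the reversed values
second_pass = list(rev)   # the iterator is now exhausted

print(first_pass)         # [3, 2, 1]
print(second_pass)        # []
```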

#### Dict

dict is likely the most important built-in Python data structure. A more common
name for it is hash map or associative array. It is a flexibly sized collection of key-value
pairs, where key and value are Python objects. One approach for creating one is to use
curly braces {} and colons to separate keys and values:

In [188]:
empty_dict = {}
d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4]}


In [189]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}

In [195]:
d1[7] = 'an integer'

In [197]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

You can delete values either using the del keyword or the pop method (which
simultaneously returns the value and deletes the key):

In [198]:
del d1[7]

In [199]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}

In [200]:
value = d1.pop('a')

In [201]:
value

'some value'

In [202]:
d1['renato'] = 'vencedor'

In [203]:
d1

{'b': [1, 2, 3, 4], 'renato': 'vencedor'}

In [204]:
list(d1.keys())

['b', 'renato']

In [205]:
list(d1.values())

[[1, 2, 3, 4], 'vencedor']

In [206]:
d1.update({'c': 'test', 'd':[1,2,3,4]})

In [207]:
d1

{'b': [1, 2, 3, 4], 'renato': 'vencedor', 'c': 'test', 'd': [1, 2, 3, 4]}

**Creating dicts from sequences**

In [209]:
key_list = [1,2,3,4,5,6,7]
value_list = ['a', 'b', 'c', 'd', 'f', 'g', 'h']

In [210]:
mapping = {}
for key, value in zip(key_list, value_list):
    mapping[key] = value

In [211]:
mapping

{1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'f', 6: 'g', 7: 'h'}

In [212]:
mapping = dict(zip(range(5), reversed(range(5))))

In [213]:
mapping

{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}

**Default value**

In [None]:
if key in some_dict:
    value = some_dict[key]
else:
    value = default_value

value = some_dict.get(key, default_value)
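
The cell above uses placeholder names. A concrete sketch, assuming a small hypothetical dict and default, shows that get returns the default for a missing key and None when no default is given:

```python
some_dict = {'a': 1, 'b': 2}   # hypothetical example data
default_value = 0              # hypothetical default

present = some_dict.get('a', default_value)   # key exists: stored value
missing = some_dict.get('z', default_value)   # key absent: the default
no_default = some_dict.get('z')               # key absent, no default: None

print(present, missing, no_default)           # 1 0 None
```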

In [229]:
words = ['apple', 'bat', 'bar', 'atom', 'book']

In [230]:
by_letter = {}

In [217]:
for word in words:
    letter = word[0]
    if letter not in by_letter:
        by_letter[letter] = [word]
    else:
        by_letter[letter].append(word)

In [218]:
by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

In [226]:
by_letter = {}  # start from an empty dict so words are not added twice
for word in words:
    letter = word[0]
    by_letter.setdefault(letter, []).append(word)

In [227]:
by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

In [231]:
from collections import defaultdict
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

In [234]:
by_letter['b']

['bat', 'bar', 'book']

In [235]:
by_letter

defaultdict(list, {'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']})

**Valid dict key types**

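Dict keys must be hashable. You can check whether an object is hashable (and so can
be used as a key) with the hash function: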

In [236]:
hash('string')

-3469262851541541628

In [237]:
hash((1, 2, (2, 3)))

-9209053662355515447

In [238]:
hash((1, 2, [2, 3]))

TypeError: unhashable type: 'list'

In [239]:
d = {}

In [240]:
d[tuple([1,2,3])] = 5

In [241]:
d

{(1, 2, 3): 5}

#### Set

A set is an unordered collection of unique elements. You can think of them like dicts,
but keys only, no values. A set can be created in two ways: via the set function or via
a set literal with curly braces:


In [242]:
set([1,2,3,4,5,6])

{1, 2, 3, 4, 5, 6}

In [243]:
{1,2,3,4,5,6}

{1, 2, 3, 4, 5, 6}

Sets support mathematical set operations like union, intersection, difference, and
symmetric difference. Consider these two example sets:

In [244]:
a = {1, 2, 3, 4, 5}

In [245]:
b = {3, 4, 5, 6, 7, 8}


In [246]:
a.union(b)

{1, 2, 3, 4, 5, 6, 7, 8}

In [247]:
a

{1, 2, 3, 4, 5}

In [248]:
b

{3, 4, 5, 6, 7, 8}

In [249]:
a | b

{1, 2, 3, 4, 5, 6, 7, 8}

In [250]:
a |= b

In [251]:
a

{1, 2, 3, 4, 5, 6, 7, 8}

In [252]:
a & b

{3, 4, 5, 6, 7, 8}

In [253]:
a &= b

In [255]:
a

{3, 4, 5, 6, 7, 8}

In [256]:
a.issubset(b)

True

In [258]:
# True if a and b have no elements in common
a.isdisjoint(b)

False

In [259]:
c = a.copy()

In [260]:
c

{3, 4, 5, 6, 7, 8}

In [261]:
c |= b

In [262]:
c

{3, 4, 5, 6, 7, 8}

In [263]:
b

{3, 4, 5, 6, 7, 8}

In [265]:
b.add(1)

In [266]:
b

{1, 3, 4, 5, 6, 7, 8}

In [267]:
a

{3, 4, 5, 6, 7, 8}

In [268]:
c 

{3, 4, 5, 6, 7, 8}

In [269]:
c |= b

In [270]:
c

{1, 3, 4, 5, 6, 7, 8}
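
The cells above cover union and intersection. Difference and symmetric difference, also mentioned earlier, can be sketched the same way. The sets are redefined under new names here so the example is self-contained.

```python
left = {1, 2, 3, 4, 5}
right = {3, 4, 5, 6, 7, 8}

difference = left - right            # elements in left but not in right
symmetric = left ^ right             # elements in exactly one of the two sets

print(difference)                    # {1, 2}
print(symmetric)                     # {1, 2, 6, 7, 8}

# Method forms of the same operations
same_difference = left.difference(right)
same_symmetric = left.symmetric_difference(right)
```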

Like dict keys, set elements generally must be immutable. To store list-like
elements, you must convert them to tuples:

In [273]:
my_data = [1, 2, 3, 4]

In [274]:
my_set = {tuple(my_data)}

In [275]:
my_set

{(1, 2, 3, 4)}

You can also check if a set is a subset of (is contained in) or a superset of (contains all
elements of) another set:

In [278]:
a_set = {1,2,3,4,5}

In [279]:
{1,2,3}.issubset(a_set)

True

In [280]:
a_set.issuperset({1,2,3})

True

Sets are equal if and only if their contents are equal:

In [281]:
{1,2,3} == {3,2,1}

True

**List, Set, and Dict Comprehensions**

List comprehensions are one of the most-loved Python language features. They allow
you to concisely form a new list by filtering the elements of a collection, transforming
the elements passing the filter in one concise expression. They take the basic form:

[expr for val in collection if condition]

This is equivalent to the following for loop:


In [None]:
result = []
for val in collection:
    if condition:
        result.append(expr)

In [284]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

In [285]:
[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

Set and dict comprehensions are a natural extension, producing sets and dicts in an
idiomatically similar way instead of lists. A dict comprehension looks like this:

dict_comp = {key-expr : value-expr for value in collection if condition}

A set comprehension looks like the equivalent list comprehension except with curly
braces instead of square brackets:

set_comp = {expr for value in collection if condition}


In [286]:
unique_lengths = {len(x) for x in strings}

In [287]:
unique_lengths

{1, 2, 3, 4, 6}

In [288]:
set(map(len, strings))

{1, 2, 3, 4, 6}

In [290]:
loc_mapping = {index: val for index, val in enumerate(strings)}
loc_mapping

{0: 'a', 1: 'as', 2: 'bat', 3: 'car', 4: 'dove', 5: 'python'}

**Nested list comprehensions**

In [293]:
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
            ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

In [294]:
names_of_interest = []
for names in all_data:
    enough_es = [name for name in names if name.count('e') >= 2]
    names_of_interest.extend(enough_es)

In [295]:
names_of_interest

['Steven']

In [296]:
result = [name for names in all_data for name in names if name.count('e') >= 2]

In [297]:
result

['Steven']

In [299]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

In [300]:
flattened = [x for tup in some_tuples for x in tup]

In [301]:
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [302]:
flattened = []
for tup in some_tuples:
    for x in tup:
        flattened.append(x)

In [303]:
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [304]:
[[x for x in tup] for tup in some_tuples]

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

## 3.2 Functions

Each function can have positional arguments and keyword arguments. Keyword arguments
are most commonly used to specify default values or optional arguments. Consider a
function my_function in which x and y are positional arguments while z is a keyword
argument with a default value. Such a function can be called in any of these ways:
    
my_function(5, 6, z=0.7)

my_function(3.14, 7, 3.5)

my_function(10, 20)
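
The definition of my_function does not appear above. As a minimal sketch, assuming a default of z=1.5 and an arbitrary body that simply reports the bound arguments (both are assumptions made for illustration), the three calls bind as follows:

```python
def my_function(x, y, z=1.5):
    # Hypothetical body: return the arguments so the binding is visible
    return {'x': x, 'y': y, 'z': z}


call_keyword = my_function(5, 6, z=0.7)      # z passed by keyword
call_positional = my_function(3.14, 7, 3.5)  # z passed by position
call_default = my_function(10, 20)           # z falls back to its default

print(call_keyword)     # {'x': 5, 'y': 6, 'z': 0.7}
print(call_positional)  # {'x': 3.14, 'y': 7, 'z': 3.5}
print(call_default)     # {'x': 10, 'y': 20, 'z': 1.5}
```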

**Cleaning some data**

In [312]:
import re

states = [' Alabama ', 'Georgia!', 'Georgia', 'georgia', 'FlOrIda', 'south carolina##', 'West virginia?']

def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = re.sub('[!#?]', '', value)
        value = value.title()
        result.append(value)
    return result

In [313]:
clean_strings(states)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South Carolina',
 'West Virginia']

In [316]:
def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

clean_ops = [str.strip, remove_punctuation, str.title]

def clean_strings(strings, ops):
    result = []
    for value in strings:
        for function in ops:
            value = function(value)
        result.append(value)
    return result

In [317]:
clean_strings(states, clean_ops)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South Carolina',
 'West Virginia']

In [318]:
for x in map(remove_punctuation, states):
    print(x)

 Alabama 
Georgia
Georgia
georgia
FlOrIda
south carolina
West virginia


In [320]:
def apply_to_list(some_list, f):
    return [f(x) for x in some_list]

ints = [4, 0, 1, 5, 6]
apply_to_list(ints, lambda x: x * 2)

[8, 0, 2, 10, 12]

In [321]:
[ x * 2 for x in ints]

[8, 0, 2, 10, 12]

In [322]:
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

In [323]:
strings.sort(key=lambda x: len(set(x)))  # sort by number of distinct letters

In [324]:
strings

['aaaa', 'foo', 'abab', 'bar', 'card']

**Currying: Partial Argument Application**

Currying is computer science jargon (named after the mathematician Haskell Curry)
that means deriving new functions from existing ones by partial argument application.

In [325]:
def add_numbers(x, y):
    return x + y

In [326]:
add_five = lambda y: add_numbers(5, y)

The second argument to add_numbers is said to be curried: add_five fixes x at 5 and
takes only y.

In [331]:
from functools import partial
add_five = partial(add_numbers, 5)

In [332]:
add_five(10)

15

In [333]:
gen = (x ** 2 for x in range(100))

In [335]:
gen

<generator object <genexpr> at 0x000002428CDC7040>

In [345]:
def _make_gen():
    for x in range(100):
        yield x ** 2
gen = _make_gen()

In [346]:
gen

<generator object _make_gen at 0x000002428D01E580>

In [348]:
dict((i, i ** 2) for i in range(5))

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

**itertools module**


In [352]:
import itertools

In [353]:
first_letter = lambda x: x[0]

In [354]:
names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']

In [355]:
for letter, group in itertools.groupby(names, first_letter):
    print(letter, list(group))  # group is an iterator, so materialize it

A ['Alan', 'Adam']
W ['Wes', 'Will']
A ['Albert']
S ['Steven']
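
The letter A appears twice in this output because itertools.groupby only groups consecutive elements that share a key. A minimal sketch of the usual remedy, sorting by the same key first (variable names are illustrative):

```python
import itertools

names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']


def first_letter(name):
    return name[0]


# Sorting by the grouping key makes equal keys adjacent
sorted_names = sorted(names, key=first_letter)

grouped = {letter: list(group)
           for letter, group in itertools.groupby(sorted_names, first_letter)}

print(grouped)
# {'A': ['Alan', 'Adam', 'Albert'], 'S': ['Steven'], 'W': ['Wes', 'Will']}
```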


**Handling Errors**

In [360]:
def attempt_float(x):
    try:
        return float(x)
    except (ValueError):
        return x

In [361]:
attempt_float((1,2))

TypeError: float() argument must be a string or a number, not 'tuple'

In [362]:
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        return x

In [363]:
attempt_float((1,2))

(1, 2)

**Opening Files**

In [None]:
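path = 'examples/segismundo.txt'
with open(path) as f:
    # Strip trailing whitespace (including the newline) from each line
    lines = [x.rstrip() for x in f]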

If we had typed f = open(path, 'w'), a new file at examples/segismundo.txt would
have been created (be careful!), overwriting any existing file in its place. There is
also the 'x' file mode, which creates a writable file but fails if the file path
already exists.
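
A short sketch contrasting the 'w' and 'x' modes. It writes to a temporary directory (an assumption made so that no real file is touched) rather than to examples/segismundo.txt.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_path = os.path.join(tmp_dir, 'example.txt')  # hypothetical file name

    # 'x' creates the file because it does not exist yet
    with open(tmp_path, 'x') as f:
        f.write('first\n')

    # A second 'x' open fails because the path now exists
    try:
        open(tmp_path, 'x')
        x_mode_failed = False
    except FileExistsError:
        x_mode_failed = True

    # 'w' silently replaces the existing contents
    with open(tmp_path, 'w') as f:
        f.write('second\n')

    with open(tmp_path) as f:
        contents = f.read()

print(x_mode_failed)   # True
print(repr(contents))  # 'second\n'
```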