# Chapter 2 — An Array of Sequences

**Sections with code snippets in this chapter:**

* [List Comprehensions and Generator Expressions](#List-Comprehensions-and-Generator-Expressions)
* [Slicing](#Slicing)
* [Building Lists of Lists](#Building-Lists-of-Lists)
* [Augmented Assignment with Sequences](#Augmented-Assignment-with-Sequences)
* [list.sort and the sorted Built-In Function](#list.sort-and-the-sorted-Built-In-Function)
* [Managing Ordered Sequences with bisect](#Managing-Ordered-Sequences-with-bisect)
* [Arrays](#Arrays)
* [Memory Views](#Memory-Views)
* [NumPy and SciPy](#NumPy-and-SciPy)
* [Deques and Other Queues](#Deques-and-Other-Queues)
* [Soapbox](#Soapbox)

## List Comprehensions and Generator Expressions

#### Example 2-1. Build a list of Unicode codepoints from a string

In [1]:
symbols = '$¢£¥€¤'
codes = []

for symbol in symbols:
    codes.append(ord(symbol))

codes

[36, 162, 163, 165, 8364, 164]

#### Example 2-2. Build a list of Unicode codepoints from a string, take 2

In [2]:
symbols = '$¢£¥€¤'

codes = [ord(symbol) for symbol in symbols]

codes

[36, 162, 163, 165, 8364, 164]

#### Box: Listcomps No Longer Leak Their Variables

In [3]:
x = 'ABC'
codes = [ord(x) for x in x]
x

'ABC'

In [4]:
codes

[65, 66, 67]
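
The cells above show that the loop variable of a listcomp stays local to it. As a contrast, a plain `for` statement rebinds the name in the surrounding scope, and so does an assignment expression (`:=`, available since Python 3.8) used inside a listcomp. A minimal sketch; the names `after_listcomp`, `after_for`, and `last` are illustrative only:

```python
x = 'ABC'
codes = [ord(x) for x in x]   # the inner x is local to the listcomp
after_listcomp = x            # still 'ABC'

for x in 'XYZ':               # a for statement rebinds x in this scope
    pass
after_for = x                 # now 'Z'

# A target of := inside a listcomp is bound in the enclosing scope
codes = [last := ord(c) for c in 'ABC']
print(after_listcomp, after_for, last)
```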

#### Example 2-3. The same list built by a listcomp and a map/filter composition

In [5]:
symbols = '$¢£¥€¤'
beyond_ascii = [ord(s) for s in symbols if ord(s) > 127]
beyond_ascii

[162, 163, 165, 8364, 164]

In [6]:
beyond_ascii = list(filter(lambda c: c > 127, map(ord, symbols)))
beyond_ascii

[162, 163, 165, 8364, 164]

#### Example 2-4. Cartesian product using a list comprehension

In [7]:
colors = ['black', 'white']
sizes = ['S', 'M', 'L']
tshirts = [(color, size) for color in colors for size in sizes]
tshirts

[('black', 'S'),
 ('black', 'M'),
 ('black', 'L'),
 ('white', 'S'),
 ('white', 'M'),
 ('white', 'L')]

In [8]:
for color in colors:
    for size in sizes:
        print((color, size))

('black', 'S')
('black', 'M')
('black', 'L')
('white', 'S')
('white', 'M')
('white', 'L')


In [9]:
tshirts = [(color, size) for size in sizes
                          for color in colors]
tshirts

[('black', 'S'),
 ('white', 'S'),
 ('black', 'M'),
 ('white', 'M'),
 ('black', 'L'),
 ('white', 'L')]

#### Example 2-5. Initializing a tuple and an array from a generator expression

In [10]:
symbols = '$¢£¥€¤'
tuple(ord(symbol) for symbol in symbols)

(36, 162, 163, 165, 8364, 164)

In [11]:
import array
array.array('I', (ord(symbol) for symbol in symbols))

array('I', [36, 162, 163, 165, 8364, 164])

#### Example 2-6. Cartesian product in a generator expression

In [12]:
colors = ['black', 'white']
sizes = ['S', 'M', 'L']

for tshirt in ('%s %s' % (c, s) for c in colors for s in sizes):
    print(tshirt)

black S
black M
black L
white S
white M
white L
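
As an alternative to nested `for` clauses, the standard library's `itertools.product` should yield the same pairs lazily and in the same order. A short sketch, where `pairs` is an illustrative name:

```python
from itertools import product

colors = ['black', 'white']
sizes = ['S', 'M', 'L']

# product() yields one tuple at a time; the rightmost iterable varies fastest
for color, size in product(colors, sizes):
    print(color, size)

# Materialize the pairs only to compare them with the listcomp version
pairs = list(product(colors, sizes))
```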


## Slicing

### Why Slices and Range Exclude the Last Item

In [13]:
l = [10, 20, 30, 40, 50, 60]

l[:2]  # split at 2

[10, 20]

In [14]:
l[2:]

[30, 40, 50, 60]

In [15]:
l[:3]  # split at 3

[10, 20, 30]

In [16]:
l[3:]

[40, 50, 60]
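
The convention illustrated above can be checked directly: the length of a slice is `stop - start`, and splitting at any index gives two parts that do not overlap. A small sketch, with `middle` and `parts` as illustrative names:

```python
l = [10, 20, 30, 40, 50, 60]
start, stop = 1, 4

# With a start and a stop, the slice length is stop - start
middle = l[start:stop]

# Splitting at any index n gives two parts that rebuild the list with no overlap
parts = [(l[:n], l[n:]) for n in range(len(l) + 1)]

print(len(middle), len(range(start, stop)))
```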

### Slice Objects

In [17]:
s = 'bicycle'
s[::3]

'bye'

In [18]:
s[::-1]

'elcycib'

In [19]:
s[::-2]

'eccb'

#### Example 2-9. Line items from a flat-file invoice

In [20]:
invoice = """
0.....6.................................40........52...55........
1909  Pimoroni PiBrella                     $17.50    3    $52.50
1489  6mm Tactile Switch x20                 $4.95    2    $9.90
1510  Panavise Jr. - PV-201                 $28.00    1    $28.00
1601  PiTFT Mini Kit 320x240                $34.95    1    $34.95
"""

SKU = slice(0, 6)
DESCRIPTION = slice(6, 40)
UNIT_PRICE = slice(40, 52)
QUANTITY = slice(52, 55)
ITEM_TOTAL = slice(55, None)

line_items = invoice.split('\n')[2:]

for item in line_items:
    print(item[UNIT_PRICE], item[DESCRIPTION])

    $17.50   Pimoroni PiBrella                 
     $4.95   6mm Tactile Switch x20            
    $28.00   Panavise Jr. - PV-201             
    $34.95   PiTFT Mini Kit 320x240            
 


### Assigning to Slices

In [21]:
l = list(range(10))
l

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [22]:
l[2:5] = [20, 30]
l

[0, 1, 20, 30, 5, 6, 7, 8, 9]

In [23]:
del l[5:7]
l

[0, 1, 20, 30, 5, 8, 9]

In [24]:
l[3::2] = [11, 22]
l

[0, 1, 20, 11, 5, 22, 9]

By design, this example raises an exception:

In [25]:
try:
    l[2:5] = 100
except TypeError as e:
    print(repr(e))

TypeError('can only assign an iterable')


In [26]:
l[2:5] = [100]
l

[0, 1, 100, 22, 9]

### Using + and * with Sequences

In [27]:
l = [1, 2, 3]
l * 5

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]

In [28]:
5 * 'abcd'

'abcdabcdabcdabcdabcd'

### Building Lists of Lists

#### Example 2-10. A list with three lists of length 3 can represent a tic-tac-toe board

In [29]:
board = [['_'] * 3 for i in range(3)]
board

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]

In [30]:
board[1][2] = 'X'
board

[['_', '_', '_'], ['_', '_', 'X'], ['_', '_', '_']]

#### Example 2-11. A list with three references to the same list is useless

In [31]:
weird_board = [['_'] * 3] * 3
weird_board

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]

In [32]:
weird_board[1][2] = 'O'
weird_board

[['_', '_', 'O'], ['_', '_', 'O'], ['_', '_', 'O']]
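
One way to confirm the aliasing is to compare object identities: the outer `* 3` repeats a reference to a single inner list, while the listcomp of Example 2-10 builds a new inner list on each iteration. A quick sketch, with `same_rows` and `distinct_rows` as illustrative names:

```python
weird_board = [['_'] * 3] * 3
# All three rows are the very same list object
same_rows = all(row is weird_board[0] for row in weird_board)

board = [['_'] * 3 for _ in range(3)]
# The listcomp evaluates ['_'] * 3 once per iteration: three distinct lists
distinct_rows = len({id(row) for row in board})

print(same_rows, distinct_rows)
```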

#### Explanation

In [33]:
board = []
for i in range(3):
    row = ['_'] * 3
    board.append(row)
board

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]

In [34]:
board[2][0] = 'X'
board

[['_', '_', '_'], ['_', '_', '_'], ['X', '_', '_']]

## Augmented Assignment with Sequences

In [35]:
l = [1, 2, 3]
idl = id(l)

In [36]:
# NBVAL_IGNORE_OUTPUT
idl

4414271936

In [37]:
l *= 2
l

[1, 2, 3, 1, 2, 3]

In [38]:
id(l) == idl  # same list

True

In [39]:
t = (1, 2, 3)
idt = id(t)

In [40]:
# NBVAL_IGNORE_OUTPUT
idt

4414275328

In [41]:
t *= 2
id(t) == idt # new tuple

False

### A += Assignment Puzzler

In [42]:
t = (1, 2, [30, 40])
try:
    t[2] += [50, 60]
except TypeError as e:
    print(repr(e))

TypeError("'tuple' object does not support item assignment")


In [43]:
t

(1, 2, [30, 40, 50, 60])
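
The list was changed even though the exception was raised, because the in-place addition succeeds before the item assignment on the tuple fails. One way to sidestep the puzzler, sketched here, is to call `extend` on the list, which mutates it without assigning to `t[2]`:

```python
t = (1, 2, [30, 40])

# extend() mutates the list in place; the tuple item is never reassigned
t[2].extend([50, 60])
print(t)
```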

#### Example 2-14. Bytecode for the expression s[a] += b

In [44]:
import dis

dis.dis('s[a] += b')

  1           0 LOAD_NAME                0 (s)
              2 LOAD_NAME                1 (a)
              4 DUP_TOP_TWO
              6 BINARY_SUBSCR
              8 LOAD_NAME                2 (b)
             10 INPLACE_ADD
             12 ROT_THREE
             14 STORE_SUBSCR
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE


## list.sort and the sorted Built-In Function

In [45]:
fruits = ['grape', 'raspberry', 'apple', 'banana']
sorted(fruits)

['apple', 'banana', 'grape', 'raspberry']

In [46]:
fruits

['grape', 'raspberry', 'apple', 'banana']

In [47]:
sorted(fruits, reverse=True)

['raspberry', 'grape', 'banana', 'apple']

In [48]:
sorted(fruits, key=len)

['grape', 'apple', 'banana', 'raspberry']

In [49]:
sorted(fruits, key=len, reverse=True)

['raspberry', 'banana', 'grape', 'apple']

In [50]:
fruits

['grape', 'raspberry', 'apple', 'banana']

In [51]:
fruits.sort()
fruits

['apple', 'banana', 'grape', 'raspberry']
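
Two details behind the cells above, sketched with the same data: `list.sort` works in place and returns `None`, and the sort is stable, so items with equal keys keep their original relative order. The names `result` and `by_len` are illustrative:

```python
fruits = ['grape', 'raspberry', 'apple', 'banana']

# Stable sort: 'grape' and 'apple' both have length 5 and keep their order
by_len = sorted(fruits, key=len)

# list.sort() changes the list itself and returns None
result = fruits.sort()
print(result, fruits, by_len)
```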

## Managing Ordered Sequences with bisect

#### Example 2-15. bisect finds insertion points for items in a sorted sequence

In [52]:
# BEGIN BISECT_DEMO
import bisect
import sys

HAYSTACK = [1, 4, 5, 6, 8, 12, 15, 20, 21, 23, 23, 26, 29, 30]
NEEDLES = [0, 1, 2, 5, 8, 10, 22, 23, 29, 30, 31]

ROW_FMT = '{0:2d} @ {1:2d}    {2}{0:<2d}'

def demo(haystack, needles, bisect_fn):
    print('DEMO:', bisect_fn.__name__)  # <1>
    print('haystack ->', ' '.join('%2d' % n for n in haystack))
    for needle in reversed(needles):
        position = bisect_fn(haystack, needle)  # <2>
        offset = position * '  |'  # <3>
        print(ROW_FMT.format(needle, position, offset))  # <4>

demo(HAYSTACK, NEEDLES, bisect.bisect)  # <5>
# END BISECT_DEMO

DEMO: bisect_right
haystack ->  1  4  5  6  8 12 15 20 21 23 23 26 29 30
31 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |31
30 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |30
29 @ 13      |  |  |  |  |  |  |  |  |  |  |  |  |29
23 @ 11      |  |  |  |  |  |  |  |  |  |  |23
22 @  9      |  |  |  |  |  |  |  |  |22
10 @  5      |  |  |  |  |10
 8 @  5      |  |  |  |  |8 
 5 @  3      |  |  |5 
 2 @  1      |2 
 1 @  1      |1 
 0 @  0    0 


In [53]:
demo(HAYSTACK, NEEDLES, bisect.bisect_left)

DEMO: bisect_left
haystack ->  1  4  5  6  8 12 15 20 21 23 23 26 29 30
31 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |31
30 @ 13      |  |  |  |  |  |  |  |  |  |  |  |  |30
29 @ 12      |  |  |  |  |  |  |  |  |  |  |  |29
23 @  9      |  |  |  |  |  |  |  |  |23
22 @  9      |  |  |  |  |  |  |  |  |22
10 @  5      |  |  |  |  |10
 8 @  4      |  |  |  |8 
 5 @  2      |  |5 
 2 @  1      |2 
 1 @  0    1 
 0 @  0    0 


#### Example 2-16. Given a test score, grade returns the corresponding letter grade

In [54]:
def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
    i = bisect.bisect(breakpoints, score)
    return grades[i]

[grade(score) for score in [55, 60, 65, 70, 75, 80, 85, 90, 95]]

['F', 'D', 'D', 'C', 'C', 'B', 'B', 'A', 'A']

#### Example 2-17. bisect_left maps a score of 60 to grade F, not D as in Example 2-16.

In [55]:
def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
    i = bisect.bisect_left(breakpoints, score)
    return grades[i]

[grade(score) for score in [55, 60, 65, 70, 75, 80, 85, 90, 95]]

['F', 'F', 'D', 'D', 'C', 'C', 'B', 'B', 'A']
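
Because `bisect_left` returns the position of the leftmost equal item when one exists, it can also back an exact-match lookup in a sorted sequence. The sketch below uses a hypothetical helper named `find_index`, which is not part of the `bisect` module:

```python
import bisect

def find_index(haystack, needle):
    """Return the index of the leftmost item equal to needle.

    Hypothetical helper; haystack must already be sorted.
    """
    i = bisect.bisect_left(haystack, needle)
    if i != len(haystack) and haystack[i] == needle:
        return i
    raise ValueError(f'{needle!r} not found')

HAYSTACK = [1, 4, 5, 6, 8, 12, 15, 20, 21, 23, 23, 26, 29, 30]
print(find_index(HAYSTACK, 23))  # leftmost of the two 23s, at index 9
```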

#### Example 2-18. Insort keeps a sorted sequence always sorted

In [56]:
import bisect
import random

SIZE = 7

random.seed(1729)

my_list = []

for i in range(SIZE):
    new_item = random.randrange(SIZE*2)
    bisect.insort(my_list, new_item)
    print(f'insert {new_item:2d} -> {my_list}')

insert 10 -> [10]
insert  0 -> [0, 10]
insert  6 -> [0, 6, 10]
insert  8 -> [0, 6, 8, 10]
insert  7 -> [0, 6, 7, 8, 10]
insert  2 -> [0, 2, 6, 7, 8, 10]
insert 10 -> [0, 2, 6, 7, 8, 10, 10]


## When a List Is Not the Answer

### Arrays

#### Example 2-19. Creating, saving, and loading a large array of floats

In [57]:
from array import array
from random import random

floats = array('d', (random() for i in range(10**7)))
floats[-1]

0.5963321947530882

In [58]:
with open('floats.bin', 'wb') as fp:
    floats.tofile(fp)

In [59]:
floats2 = array('d')

with open('floats.bin', 'rb') as fp:
    floats2.fromfile(fp, 10**7)

floats2[-1]

0.5963321947530882

In [60]:
floats2 == floats

True
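
The same round trip can be sketched without a file: `tobytes` returns the machine representation of the items and `frombytes` appends items from such bytes. A tiny array stands in for the ten million floats here:

```python
from array import array

floats = array('d', [0.5, 1.5, 2.5])

# Raw machine values, itemsize bytes per item
raw = floats.tobytes()

# Rebuild an equal array from the raw bytes
floats2 = array('d')
floats2.frombytes(raw)

print(len(raw), floats2 == floats)
```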

### Memory Views

#### Example 2-20. Changing the value of an array item by poking one of its bytes

In [61]:
numbers = array('h', [-2, -1, 0, 1, 2])
memv = memoryview(numbers)
len(memv)

5

In [62]:
memv[0]

-2

In [63]:
memv_oct = memv.cast('B')
memv_oct.tolist()

[254, 255, 255, 255, 0, 0, 1, 0, 2, 0]

In [64]:
memv_oct[5] = 4
numbers

array('h', [-2, -1, 1024, 1, 2])
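
The byte values shown in cell 63 assume a little-endian platform; on a big-endian machine the two bytes of each item would appear swapped, and poking index 5 would change the low-order byte instead. The sketch below rebuilds the expected bytes with `int.to_bytes` and `sys.byteorder` so the comparison should hold on either kind of platform; `octets` and `expected` are illustrative names:

```python
import sys
from array import array

numbers = array('h', [-2, -1, 0, 1, 2])
octets = memoryview(numbers).cast('B').tolist()

# Build the expected bytes from each value using the platform's byte order
expected = []
for n in numbers:
    expected.extend(n.to_bytes(numbers.itemsize, sys.byteorder, signed=True))

print(sys.byteorder, octets == expected)
```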

### NumPy and SciPy

#### Example 2-21. Basic operations with rows and columns in a numpy.ndarray

In [65]:
import numpy as np
a = np.arange(12)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [66]:
type(a)

numpy.ndarray

In [67]:
a.shape

(12,)

In [68]:
a.shape = 3, 4
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [69]:
a[2]

array([ 8,  9, 10, 11])

In [70]:
a[2, 1]

9

In [71]:
a[:, 1]

array([1, 5, 9])

In [72]:
a.transpose()


array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

#### Example 2-22. Loading, saving, and vectorized operations

In [73]:
with open('floats-1M-lines.txt', 'wt') as fp:
    for _ in range(1_000_000):
        fp.write(f'{random()}\n')

In [74]:
floats = np.loadtxt('floats-1M-lines.txt')

In [75]:
floats[-3:]

array([0.29150425, 0.33893554, 0.08112756])

In [76]:
floats *= .5
floats[-3:]

array([0.14575213, 0.16946777, 0.04056378])

In [77]:
from time import perf_counter as pc

t0 = pc()
floats /= 3
(pc() - t0) < 0.01

True

In [78]:
np.save('floats-1M', floats)
floats2 = np.load('floats-1M.npy', 'r+')
floats2 *= 6

In [79]:
floats2[-3:]

memmap([0.29150425, 0.33893554, 0.08112756])

### Deques and Other Queues

#### Example 2-23. Working with a deque

In [80]:
import collections

dq = collections.deque(range(10), maxlen=10)
dq

deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], maxlen=10)

In [81]:
dq.rotate(3)
dq

deque([7, 8, 9, 0, 1, 2, 3, 4, 5, 6], maxlen=10)

In [82]:
dq.rotate(-4)
dq

deque([1, 2, 3, 4, 5, 6, 7, 8, 9, 0], maxlen=10)

In [83]:
dq.appendleft(-1)
dq

deque([-1, 1, 2, 3, 4, 5, 6, 7, 8, 9], maxlen=10)

In [84]:
dq.extend([11, 22, 33])
dq

deque([3, 4, 5, 6, 7, 8, 9, 11, 22, 33], maxlen=10)

In [85]:
dq.extendleft([10, 20, 30, 40])
dq

deque([40, 30, 20, 10, 3, 4, 5, 6, 7, 8], maxlen=10)
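
A bounded deque discards items from the opposite end when it is full, which makes it a natural fit for keeping only the most recent items. A minimal sketch, assuming a hypothetical stream of readings:

```python
from collections import deque

recent = deque(maxlen=3)    # keeps only the 3 most recent items

for reading in [10, 20, 30, 40, 50]:
    recent.append(reading)  # appending to a full deque drops the leftmost item

print(recent)
```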

## Soapbox

### Mixed bag lists

In [86]:
l = [28, 14, '28', 5, '9', '1', 0, 6, '23', 19]

In [87]:
try:
    sorted(l)
except TypeError as e:
    print(repr(e))

TypeError("'<' not supported between instances of 'str' and 'int'")


### Key is Brilliant

In [88]:
l = [28, 14, '28', 5, '9', '1', 0, 6, '23', 19]

sorted(l, key=int)

[0, '1', 5, 6, '9', 14, 19, '23', 28, '28']

In [89]:
sorted(l, key=str)

[0, '1', 14, 19, '23', 28, '28', 5, 6, '9']