## Overview of Built-In Sequences
Container sequences: **list, tuple,** and **collections.deque** hold references to their items, so they can hold items of different types, including **nested containers**.

Flat sequences: **str, bytes, bytearray, memoryview,** and **array.array** store the values themselves and hold items of one simple type.
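The difference between the two groups can be seen in a minimal sketch. The variable names here are illustrative only and are not reused in the cells below.

```python
from array import array
from collections import deque

# Container sequences store references, so items may be of any type.
mixed = [1, 'two', (3.0, [4])]
nested = deque([mixed, ('a', 'b')])

# Flat sequences store the values themselves, all of one simple type.
numbers = array('d', [1.0, 2.5])
try:
    numbers.append('three')  # not a float, so the array rejects it
except TypeError as exc:
    rejected = type(exc).__name__

print(len(mixed), len(nested), rejected)
```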

### List Comprehensions

In [1]:
symbols = '$¢£¥€¤'
codes = []
for symbol in symbols:
    codes.append(ord(symbol))
codes

[36, 162, 163, 165, 8364, 164]

In [2]:
symbols = '$¢£¥€¤'
codes = [ord(symbol) for symbol in symbols]
codes

[36, 162, 163, 165, 8364, 164]

In [3]:
colors = ['black', 'white']
sizes = ['S', 'M', 'L']
tshirts = [(color, size) for color in colors for size in sizes]
tshirts

[('black', 'S'),
 ('black', 'M'),
 ('black', 'L'),
 ('white', 'S'),
 ('white', 'M'),
 ('white', 'L')]

### Listcomps vs. map and filter

In [4]:
beyond_ascii = [ord(s) for s in symbols if ord(s) > 127]
beyond_ascii

[162, 163, 165, 8364, 164]

In [5]:
beyond_ascii = list(filter(lambda c: c > 127, map(ord, symbols)))
beyond_ascii

[162, 163, 165, 8364, 164]

### Generator Expressions
To initialize tuples, arrays, and other types of sequences, you could also start from a listcomp, but a genexp saves memory because it yields items one by one using the iterator protocol instead of building a whole list just to feed another constructor.

Genexps use the same syntax as listcomps, but are enclosed in parentheses rather than brackets.

In [6]:
import array
array.array('I', (ord(symbol) for symbol in symbols))

array('I', [36, 162, 163, 165, 8364, 164])
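As a small sketch of the same idea with a tuple, the snippet below feeds a genexp straight into the constructor and then shows that a genexp produces its items lazily. The names `gen`, `first`, and `remaining` are illustrative.

```python
symbols = '$¢£¥€¤'

# A genexp passed as the only argument needs no extra parentheses.
codes = tuple(ord(symbol) for symbol in symbols)

# A genexp is lazy: nothing is computed until items are requested.
gen = (ord(symbol) for symbol in symbols)
first = next(gen)        # consumes only the first item
remaining = list(gen)    # consumes the other five code points

print(codes)
print(first, remaining)
```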

### Tuples
Tuples serve both as records and as immutable lists. The first code blocks show tuples used as records, followed by tuple unpacking.

In [7]:
lax_coordinates = (33.9425, -118.408056)
city, year, pop, chg, area = ('Tokyo', 2003, 32_450, 0.66, 8014)
traveler_ids = [('USA', '31195855'), ('BRA', 'CE342567'), ('ESP', 'XDA205856')]
for passport in sorted(traveler_ids):
    print('%s/%s' % passport)

BRA/CE342567
ESP/XDA205856
USA/31195855


In [8]:
for country, _ in traveler_ids:
    print(country)

USA
BRA
ESP


In [9]:
a = 1
b = 2
a, b = b, a
a, b

(2, 1)

In [10]:
t = (20, 8)
divmod(*t)

(2, 4)

In [11]:
a, b, *rest = range(5)
a, b, rest

(0, 1, [2, 3, 4])

In [12]:
a, *body, b = range(5)
a, body, b

(0, [1, 2, 3], 4)

In [13]:
*head, a, b = range(5)
head, a, b

([0, 1, 2], 3, 4)

Tuples are immutable, but the objects they reference might not be!

In [14]:
a = (10, 'alpha', [1, 2])
b = (10, 'alpha', [1, 2])
print(a == b)
b[-1].append(99)
print(a == b)

b

True
False


(10, 'alpha', [1, 2, 99])

To check whether a tuple has a fixed value, one possibility is to compute its hash, since an object is only hashable if its value can never change.

In [15]:
def fixed(o):
    try:
        hash(o)
    except TypeError:
        return False
    return True

tf = (10, 'alpha', (1, 2))
tm = (10, 'alpha', [1, 2])

print(fixed(tf))
print(fixed(tm))

True
False


### Slicing
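s\[a:b:c\] selects items starting at index a (inclusive) and ending at index b (exclusive), in steps of c. Omitted bounds default to the ends of the sequence, and a negative c walks through the sequence backward.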

In [16]:
s = 'bicycle'
s[::3]

'bye'

In [17]:
s[::-1]

'elcycib'

In [18]:
l = list(range(10))
l

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [19]:
l[2:5] = [20, 30]
l

[0, 1, 20, 30, 5, 6, 7, 8, 9]

In [20]:
del l[5:7]
l

[0, 1, 20, 30, 5, 8, 9]

In [21]:
l[3::2] = [11, 22]
l

[0, 1, 20, 11, 5, 22, 9]
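The a:b:c notation inside brackets produces a slice object, which can also be built and named explicitly. The following is a minimal sketch; the `record` string and the slice names are hypothetical examples, not part of the notes above.

```python
s = 'bicycle'

# s[a:b:c] is shorthand for indexing with a slice object.
every_third = slice(None, None, 3)
assert s[every_third] == s[::3] == 'bye'

# Naming slices can make fixed-width parsing easier to read.
record = 'USA  31195855'   # hypothetical fixed-width line
COUNTRY = slice(0, 3)
PASSPORT = slice(5, None)
print(record[COUNTRY], record[PASSPORT])

# slice.indices() normalizes start, stop, and step for a given length.
print(slice(None, None, -1).indices(len(s)))
```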

### Multidimensional Slicing
The \[\] operator can also take multiple indexes or slices separated by commas. The special methods that handle \[\] receive these as a single tuple, so to evaluate a\[i, j\] Python calls a.\_\_getitem__((i, j)). This is used, for example, in numpy.

In [22]:
import numpy as np
x = np.array([[1, 2, 3], [4, 5, 6]], np.int32)
x[1,1]

5
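To see what the special method actually receives, here is a minimal sketch without NumPy. `ShowKey` is a hypothetical helper class whose `__getitem__` simply returns the key it was given.

```python
class ShowKey:
    """Hypothetical helper: returns whatever key [] passes in."""
    def __getitem__(self, key):
        return key

probe = ShowKey()
print(probe[1])          # a single index arrives as-is
print(probe[1, 2])       # comma-separated indexes arrive as one tuple
print(probe[1:3, ::2])   # slices arrive as slice objects inside the tuple
print(probe[..., 0])     # ... arrives as the Ellipsis object
```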

### Building Lists of Lists: Pitfalls

In [23]:
board = [['_'] * 3 for i in range(3)]
board

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]

In [24]:
board[1][2] = 'X'
board

[['_', '_', '_'], ['_', '_', 'X'], ['_', '_', '_']]

In [25]:
weird_board = [['_'] * 3] * 3
weird_board

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]

In [26]:
weird_board[1][2] = 'X'
weird_board

[['_', '_', 'X'], ['_', '_', 'X'], ['_', '_', 'X']]
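The reason for the difference is that the outer `* 3` repeats a reference to one inner list rather than copying it. A short sketch that counts the distinct row objects in each board:

```python
board = [['_'] * 3 for _ in range(3)]
weird_board = [['_'] * 3] * 3

# Each row of board is a distinct list object ...
distinct_rows = len({id(row) for row in board})
# ... while weird_board holds three references to the same list.
shared_rows = len({id(row) for row in weird_board})

print(distinct_rows, shared_rows)
print(weird_board[0] is weird_board[2])
```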

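### Augmented Assignment with Sequences
The special method that makes += work is \_\_iadd__ (for “in-place addition”). However, if \_\_iadd__ is not implemented, Python falls back to calling \_\_add__. The same applies to \*=, which is implemented via \_\_imul__ and falls back to \_\_mul__. In the cell below, the list keeps its id after \*= because it is changed in place, while the tuple gets a new id because a new tuple is created.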

In [27]:
l = [1, 2, 3]
print(id(l))
l *= 2
print(id(l))
t = (1, 2, 3)
print(id(t))
t *= 2
print(id(t))

1276734821192
1276734821192
1276738676840
1276733376168
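The fallback rule can be sketched with two small classes. `AddOnly` and `InPlace` are hypothetical names made up for this illustration; they are not part of any library.

```python
class AddOnly:
    """Hypothetical immutable-style class: defines __add__ only."""
    def __init__(self, items):
        self.items = tuple(items)

    def __add__(self, other):
        return AddOnly(self.items + tuple(other))


class InPlace(AddOnly):
    """Hypothetical mutable-style class: also defines __iadd__."""
    def __iadd__(self, other):
        self.items += tuple(other)
        return self   # returning self keeps the same object bound


a = AddOnly([1, 2])
a_before = a
a += [3]              # no __iadd__, so __add__ builds a new object

b = InPlace([1, 2])
b_before = b
b += [3]              # __iadd__ updates the existing object

print(a is a_before, b is b_before)
```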


### list.sort and the sorted Built-In Function

The **list.sort** method sorts a list in-place, that is without making a copy. It returns **None** to remind us it changes the receiver.

In contrast, the built-in function **sorted** creates a new list and returns it (it accepts any iterable object as an argument, including immutable sequences and generators).

In [28]:
fruits = ['grape', 'raspberry', 'apple', 'banana']
sorted(fruits)

['apple', 'banana', 'grape', 'raspberry']

In [29]:
sorted(fruits, reverse=True, key=len)

['raspberry', 'banana', 'grape', 'apple']

In [30]:
fruits.sort()

In [31]:
fruits

['apple', 'banana', 'grape', 'raspberry']
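A short sketch of the two points made above: `list.sort` returns `None`, and `sorted` accepts iterables other than lists. The names `result`, `lengths`, and `letters` are illustrative.

```python
fruits = ['grape', 'raspberry', 'apple', 'banana']

result = fruits.sort()   # sorts in place
print(result)            # None, because no new list is created
print(fruits)

# sorted accepts any iterable and always returns a new list.
lengths = sorted(len(f) for f in fruits)          # from a generator
letters = sorted(('b', 'c', 'a'), reverse=True)   # from a tuple
print(lengths, letters)
```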

### Managing Ordered Sequences with bisect
The bisect module offers two main functions—**bisect and insort**—that use the binary search algorithm to quickly find and insert items in any sorted sequence.

In [32]:
import bisect
import sys

HAYSTACK = [1, 4, 5, 6, 8, 12, 15, 20, 21, 23, 23, 26, 29, 30]
NEEDLES = [0, 1, 2, 5, 8, 10, 22, 23, 29, 30, 31]

ROW_FMT = '{0:2d} @ {1:2d}    {2}{0:<2d}'

def demo(bisect_fn):
    for needle in reversed(NEEDLES):
        position = bisect_fn(HAYSTACK, needle)
        offset = position * '  |'
        print(ROW_FMT.format(needle, position, offset))

bisect_right = bisect.bisect
bisect_left = bisect.bisect_left

print('DEMO:', bisect_right.__name__)
print('haystack ->', ' '.join(f'{n:2}' for n in HAYSTACK))
demo(bisect_right)

print('DEMO:', bisect_left.__name__)
print('haystack ->', ' '.join(f'{n:2}' for n in HAYSTACK))
demo(bisect_left)

DEMO: bisect_right
haystack ->  1  4  5  6  8 12 15 20 21 23 23 26 29 30
31 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |31
30 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |30
29 @ 13      |  |  |  |  |  |  |  |  |  |  |  |  |29
23 @ 11      |  |  |  |  |  |  |  |  |  |  |23
22 @  9      |  |  |  |  |  |  |  |  |22
10 @  5      |  |  |  |  |10
 8 @  5      |  |  |  |  |8 
 5 @  3      |  |  |5 
 2 @  1      |2 
 1 @  1      |1 
 0 @  0    0 
DEMO: bisect_left
haystack ->  1  4  5  6  8 12 15 20 21 23 23 26 29 30
31 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |31
30 @ 13      |  |  |  |  |  |  |  |  |  |  |  |  |30
29 @ 12      |  |  |  |  |  |  |  |  |  |  |  |29
23 @  9      |  |  |  |  |  |  |  |  |23
22 @  9      |  |  |  |  |  |  |  |  |22
10 @  5      |  |  |  |  |10
 8 @  4      |  |  |  |8 
 5 @  2      |  |5 
 2 @  1      |2 
 1 @  0    1 
 0 @  0    0 


In [33]:
import bisect
import random

SIZE = 10

random.seed(1729)

my_list = []
for i in range(SIZE):
    new_item = random.randrange(SIZE * 2)
    bisect.insort(my_list, new_item)
    print(f'{new_item:2d} -> {my_list}')

 1 -> [1]
13 -> [1, 13]
16 -> [1, 13, 16]
14 -> [1, 13, 14, 16]
 5 -> [1, 5, 13, 14, 16]
 1 -> [1, 1, 5, 13, 14, 16]
 5 -> [1, 1, 5, 5, 13, 14, 16]
 9 -> [1, 1, 5, 5, 9, 13, 14, 16]
16 -> [1, 1, 5, 5, 9, 13, 14, 16, 16]
11 -> [1, 1, 5, 5, 9, 11, 13, 14, 16, 16]
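Besides searching and inserting, `bisect` can be used for table lookups. The sketch below follows the grade-lookup example from the `bisect` module documentation; the breakpoints and grade letters are just sample values.

```python
import bisect

def grade(score, breakpoints=(60, 70, 80, 90), grades='FDCBA'):
    # bisect returns how many breakpoints are <= score,
    # which is exactly the index of the matching grade.
    i = bisect.bisect(breakpoints, score)
    return grades[i]

results = [grade(s) for s in [33, 99, 77, 70, 89, 90, 100]]
print(results)
```

Because `bisect.bisect` is `bisect_right`, a score equal to a breakpoint lands in the higher grade.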


## When a List Is Not the Answer
### Arrays
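An **array.array** saves a lot of memory when storing millions of floating-point values, because it stores packed numeric values rather than full float objects.
Saving to a binary file with **.tofile** and loading it back with **.fromfile** is also very fast.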

In [34]:
from array import array
from random import random

#double-precision floats (typecode 'd')
floats = array('d', (random() for i in range(10**7)))
floats[-1]

0.8652059623922853

In [35]:
# wb: write binary; the with block closes the file so all data is flushed
with open('floats.bin', 'wb') as fp:
    floats.tofile(fp)
floats2 = array('d')
# rb: read binary
with open('floats.bin', 'rb') as fp:
    floats2.fromfile(fp, 10**7)
floats2 == floats

True
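The memory saving comes from the packed storage. As a rough sketch on a much smaller array, the item size and the raw byte count can be inspected directly, and the raw bytes can be round-tripped much as `tofile` and `fromfile` do with a file. The name `copy` is illustrative.

```python
from array import array

floats = array('d', (i / 10 for i in range(1000)))

# Each item is stored as a packed C double.
print(floats.itemsize)          # bytes per item
print(len(floats.tobytes()))    # itemsize * number of items

# Round-trip through raw bytes, without touching the file system.
copy = array('d')
copy.frombytes(floats.tobytes())
print(copy == floats)
```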

### Memory Views
The built-in memoryview class is a shared-memory sequence type that lets you handle slices of arrays without copying bytes.

In [36]:
from array import array
octets = array('B', range(6))
m1 = memoryview(octets)
m1.tolist()

[0, 1, 2, 3, 4, 5]

In [37]:
m2 = m1.cast('B', [2, 3])
m2.tolist()

[[0, 1, 2], [3, 4, 5]]

In [38]:
m3 = m1.cast('B', [3, 2])
m3.tolist()

[[0, 1], [2, 3], [4, 5]]

In [39]:
m2[1, 1] = 22
m3[1, 1] = 33

In [40]:
octets

array('B', [0, 1, 2, 33, 22, 5])

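Build a memoryview from an array of five 16-bit signed integers (typecode 'h'), then cast it to bytes (typecode 'B'). Changing a single byte changes the value of the integer that contains it: on a little-endian machine, writing 4 to byte 5 sets the high byte of item 2, giving 4 * 256 = 1024.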

In [41]:
numbers = array('h', [-2, -1, 0, 1, 2])
memv = memoryview(numbers)
memv[0]

-2

In [42]:
memv_oct = memv.cast('B')
memv_oct.tolist()
memv_oct[5] = 4
numbers

array('h', [-2, -1, 1024, 1, 2])
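The result 1024 depends on byte order. The sketch below repeats the experiment in a way that should also work on a big-endian machine, and then checks the arithmetic with an explicit byte order. The names `octets`, `low`, and `high` are illustrative.

```python
from array import array
import sys

numbers = array('h', [-2, -1, 0, 1, 2])
octets = memoryview(numbers).cast('B')

# Item 2 of numbers occupies bytes 4 and 5 of the buffer.
low, high = 4, 5
if sys.byteorder == 'big':
    low, high = high, low   # on big-endian, the high byte comes first
octets[high] = 4            # set the most significant byte to 4
print(numbers[2])           # 4 * 256

# Same arithmetic with an explicit byte order:
print(int.from_bytes(bytes([0, 4]), 'little'))
```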

### NumPy
NumPy is used for advanced array and matrix operations. It implements multi-dimensional, homogeneous arrays and matrix types that hold not only numbers but also user-defined records, and it provides efficient elementwise operations.

SciPy is a library, written on top of NumPy, offering many scientific computing algorithms from linear algebra, numerical calculus, and statistics.

Brief NumPy demo:

In [43]:
import numpy as np
a = np.arange(12)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [44]:
type(a)

numpy.ndarray

In [45]:
a.shape

(12,)

In [46]:
a.shape = 3, 4
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [47]:
a[2]

array([ 8,  9, 10, 11])

In [48]:
a[2, 1]

9

In [49]:
a[:, 1]

array([1, 5, 9])

In [50]:
a.transpose()

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

In [51]:
a *= 2
a

array([[ 0,  2,  4,  6],
       [ 8, 10, 12, 14],
       [16, 18, 20, 22]])

### Deques and Other Queues
The .append and .pop methods make a list usable as a stack or a queue. But inserting and removing from the head of a list (the 0-index end) is costly because the entire list must be shifted in memory. This is where **collections.deque** has an advantage: it is a double-ended queue designed for fast inserting and removing from both ends.

In [52]:
from collections import deque
dq = deque(range(10), maxlen=10)
dq

deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [53]:
dq.rotate(3)
dq

deque([7, 8, 9, 0, 1, 2, 3, 4, 5, 6])

In [54]:
dq.rotate(-4)
dq

deque([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])

In [55]:
dq.appendleft(-1)
dq

deque([-1, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [56]:
dq.extend([11, 22, 33])
dq

deque([3, 4, 5, 6, 7, 8, 9, 11, 22, 33])

In [57]:
dq.extendleft([10, 20, 30, 40])
dq

deque([40, 30, 20, 10, 3, 4, 5, 6, 7, 8])

In [58]:
dq.popleft()


40

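- **queue**: Provides the thread-safe classes **SimpleQueue, Queue, LifoQueue, and PriorityQueue**. The last three can be bounded with a maxsize argument, but they don't discard items to make room as deque does. Instead, when the queue is full the insertion of a new item blocks.
- **multiprocessing**: Implements its own queues designed for interprocess communication. It also provides multiprocessing.JoinableQueue, which adds task_done and join for task management.
- **asyncio**: Provides **Queue, LifoQueue, and PriorityQueue** for managing tasks in asynchronous programming. asyncio.Queue itself offers join and task_done, so recent Python versions no longer have a separate JoinableQueue class.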
- **heapq**: In contrast to the previous three modules, heapq does not implement a queue class, but provides functions like heappush and heappop that let you use a mutable sequence as a heap queue or priority queue.
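The differences in the list above can be sketched in a few lines: `heapq` functions operating on a plain list, a bounded deque dropping an item, and a bounded `queue.Queue` refusing one. The task names and priorities are made-up sample data, and `put_nowait` is used so the example raises `queue.Full` instead of blocking.

```python
import heapq
import queue
from collections import deque

# heapq: functions that maintain a plain list as a min-heap.
tasks = []
for priority, name in [(3, 'write'), (1, 'plan'), (2, 'test')]:
    heapq.heappush(tasks, (priority, name))
order = [heapq.heappop(tasks)[1] for _ in range(3)]
print(order)   # lowest priority number comes out first

# A bounded deque silently drops items from the opposite end ...
dq = deque([1, 2], maxlen=2)
dq.append(3)
print(dq)

# ... while a full queue.Queue blocks, or raises if asked not to wait.
q = queue.Queue(maxsize=2)
q.put(1)
q.put(2)
try:
    q.put_nowait(3)   # non-blocking put, so a full queue raises
except queue.Full:
    print('queue is full')
print(q.qsize())
```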
