###### References: 
- https://docs.python.org/3/reference/datamodel.html   
- Fluent Python by Luciano Ramalho. Chapter 2: An Array of Sequences

Sequences include strings, lists, tuples, arrays and queues. They all share common operations, including iteration, slicing, sorting and concatenation.

# Overview of Sequences
| Container Sequences | Flat Sequences |
| :---   | :---  |
| list | str |
| tuple | bytes |
| collections.deque | bytearray |
| - | memoryview |
| - | array.array |
| - | - |
| Can hold items of different types. | Hold items of one type. |
| Hold references to the objects they contain. | Store the values of their items in their own memory space. |

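As a rough illustration of the table above, the sketch below puts items of different types in a `list` and then shows that an `array.array` rejects an item that does not match its typecode. The variable names are illustrative only.

```python
from array import array

# A container sequence holds references to objects of any type.
mixed = [1, 'two', 3.0, (4, 5)]
item_types = [type(item).__name__ for item in mixed]

# A flat sequence stores raw values of a single type.
numbers = array('d', [1.0, 2.0, 3.0])  # typecode 'd': double-precision floats

try:
    numbers.append('four')  # not a number, so the array rejects it
    rejected = False
except TypeError:
    rejected = True

print(item_types)  # ['int', 'str', 'float', 'tuple']
print(rejected)    # True
```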

| Mutable sequences | Immutable sequences |
| :---  | :---  |
| list | tuple |
| bytearray | str |
| array.array | bytes |
| collections.deque | - |
| memoryview | - |

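The difference between the two columns can be sketched with item assignment: it works on a `list` and fails on a `tuple`. This is a minimal example with made-up values, not a full survey of the types listed.

```python
mutable = [1, 2, 3]
mutable[0] = 99            # lists support item assignment

immutable = (1, 2, 3)
try:
    immutable[0] = 99      # tuples do not
    error = None
except TypeError as exc:
    error = str(exc)

print(mutable)  # [99, 2, 3]
print(error)    # 'tuple' object does not support item assignment
```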





# List Comprehension
Build lists from sequences or any other iterable types by filtering and transforming items.  
A listcomp is usually easier to read than the equivalent `map`/`filter` call.

In [3]:
symbols = '♥♦♣♠'
codes = [ord(symbol) for symbol in symbols]
codes

[9829, 9830, 9827, 9824]

### using map/filter:

In [4]:
beyond_ascii = list(filter(lambda c: c > 9825, map(ord, symbols)))
beyond_ascii

[9829, 9830, 9827]

###  using listcomp:

In [5]:
beyond_ascii = [ord(symbol) for symbol in symbols  if ord(symbol) > 9825]
beyond_ascii

[9829, 9830, 9827]

###  Cartesian product with listcomp:

In [6]:
ranks = [x for x in 'JQKA']
suits = [x for x in '♥♦♣♠']

card_deck = [(rank,suit) for rank in ranks 
                         for suit in suits]
card_deck

[('J', '♥'),
 ('J', '♦'),
 ('J', '♣'),
 ('J', '♠'),
 ('Q', '♥'),
 ('Q', '♦'),
 ('Q', '♣'),
 ('Q', '♠'),
 ('K', '♥'),
 ('K', '♦'),
 ('K', '♣'),
 ('K', '♠'),
 ('A', '♥'),
 ('A', '♦'),
 ('A', '♣'),
 ('A', '♠')]

# Generator Expressions
A genexp saves memory compared to a listcomp because it yields items one by one instead of building a whole list first.

In [7]:
tuple(ord(symbol) for symbol in symbols)

(9829, 9830, 9827, 9824)

In [8]:
import array
array.array('I', (ord(symbol) for symbol in symbols))

array('I', [9829, 9830, 9827, 9824])

In [9]:
for card in ('%s %s' % (r,s) for r in ranks for s in suits):
    print(card)

J ♥
J ♦
J ♣
J ♠
Q ♥
Q ♦
Q ♣
Q ♠
K ♥
K ♦
K ♣
K ♠
A ♥
A ♦
A ♣
A ♠

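The memory claim for genexps can be checked roughly with `sys.getsizeof`. This is a sketch that assumes CPython; the exact byte counts vary by version and platform, so only the comparison is shown.

```python
import sys

n = 100_000
squares_list = [i * i for i in range(n)]   # builds all items up front
squares_gen = (i * i for i in range(n))    # builds nothing until iterated

list_size = sys.getsizeof(squares_list)    # size of the list object itself
gen_size = sys.getsizeof(squares_gen)      # size of the generator object

print(gen_size < list_size)                   # True on CPython
print(sum(squares_gen) == sum(squares_list))  # True: same items, produced lazily
```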

# Tuples
Tuples can be used as immutable lists.  
They are also used as records with no field names, where the position of each item gives it its meaning.

In [10]:
changi_coordinates = (1.359167, 103.989441)  # latitude, longitude
city, year, pop, chg, area = ('Singapore', 2022, 5400000, -4.1, 728.6)
traveler_ids = [('USA', '31195827'), ('BRA', 'CE345828'), ('ESP', 'XDA205918')]
for passport in sorted(traveler_ids):
    print('%s/%s' % passport)

BRA/CE345828
ESP/XDA205918
USA/31195827


## Tuple Unpacking

In [11]:
for country, _  in traveler_ids:
    print(country)

USA
BRA
ESP


In [12]:
divmod(20, 8)

(2, 4)

In [13]:
t = (20, 8)
divmod(*t)

(2, 4)

In [14]:
a, b, *rest = range(5)
a, b, rest

(0, 1, [2, 3, 4])

In [15]:
*head, a, b = range(5)
head, a, b

([0, 1, 2], 3, 4)

## Nested Tuple Unpacking

In [16]:
metro_areas = [
    ('Tokyo', 'JP', 36.933, (35.689722, 139.691667)),   # <1>
    ('Delhi NCR', 'IN', 21.935, (28.613889, 77.208889)),
    ('Mexico City', 'MX', 20.142, (19.433333, -99.133333)),
    ('New York-Newark', 'US', 20.104, (40.808611, -74.020386)),
    ('Sao Paulo', 'BR', 19.649, (-23.547778, -46.635833)),
]

print('{:15} | {:^9} | {:^9}'.format('', 'lat.', 'long.'))
fmt = '{:15} | {:9.4f} | {:9.4f}'
for name, cc, pop, (latitude, longitude) in metro_areas:  # <2>
    if longitude <= 0:  # <3>
        print(fmt.format(name, latitude, longitude))

                |   lat.    |   long.  
Mexico City     |   19.4333 |  -99.1333
New York-Newark |   40.8086 |  -74.0204
Sao Paulo       |  -23.5478 |  -46.6358


## Named Tuple
The `collections.namedtuple` function is a factory that produces subclasses of `tuple` enhanced with field names and a class name.

In [17]:
from collections import namedtuple
City = namedtuple('City', 'name country population coordinates')
tokyo = City('Tokyo', 'JP', 14.043, (35.652832, 139.839478))
tokyo

City(name='Tokyo', country='JP', population=14.043, coordinates=(35.652832, 139.839478))

In [18]:
tokyo.population

14.043

In [19]:
tokyo.coordinates

(35.652832, 139.839478)

In [20]:
tokyo[1]

'JP'

In [21]:
City._fields

('name', 'country', 'population', 'coordinates')

In [22]:
LatLong = namedtuple('LatLong', 'lat long')
delhi_data = ('Delhi NCR', 'IN', 21.935, LatLong(28.613889, 77.208889))
delhi = City._make(delhi_data)
delhi._asdict()

OrderedDict([('name', 'Delhi NCR'),
             ('country', 'IN'),
             ('population', 21.935),
             ('coordinates', LatLong(lat=28.613889, long=77.208889))])

In [23]:
for  key, value in delhi._asdict().items():
    print(key + ':', value)

name: Delhi NCR
country: IN
population: 21.935
coordinates: LatLong(lat=28.613889, long=77.208889)


# Slicing
    seq[start:stop:step]
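Slices and ranges exclude the last item, a convention that works well with zero-based indexing:  
* it is easy to see the length of a slice or range when only the stop position is given, e.g. `range(3)` and `my_list[:3]` both produce three items
* it is easy to compute the length when both start and stop are given, i.e. `stop - start`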
* easy to split a sequence in two parts at any index `x` without overlapping, e.g. `my_list[:x]` and `my_list[x:]`

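The three properties listed above can be checked directly. A minimal sketch, where `my_list` and `x` are placeholder names:

```python
my_list = [10, 20, 30, 40, 50, 60]

# Only the stop position given: the slice has exactly `stop` items.
first_three = my_list[:3]

# Start and stop given: the length is stop - start.
middle = my_list[1:5]

# Splitting at index x gives two parts that do not overlap.
x = 2
left, right = my_list[:x], my_list[x:]

print(len(first_three))         # 3
print(len(middle) == 5 - 1)     # True
print(left + right == my_list)  # True
```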
In [24]:
l = [20, 30, 40, 50, 60]
l[:2]

[20, 30]

In [25]:
l[2:]

[40, 50, 60]

We can also assign names to slices, which makes code that parses flat-file data easier to read.

In [30]:
invoice = """
0.....6.................................40..........52...55........
1909  Pimoroni PiBrella                       $17.50  3  $52.50
1489  6mm Tactile Switch x20                   $4.95  2  $9.90
1510  Panavise Jr. - PV-201                   $28.00  1  $28.00
1601  PiTFT Mini Kit 320x240                  $34.95  1  $34.95
"""

SKU = slice(0, 6)
DESCRIPTION = slice(6, 40)
UNIT_PRICE = slice(40, 52)
QUANTITY = slice(52, 55)
ITEM_TOTAL = slice(55, None)

line_items = invoice.split('\n')[2:]

for item in line_items:
    print(item[QUANTITY], item[UNIT_PRICE], item[DESCRIPTION], item[ITEM_TOTAL])

  3       $17.50 Pimoroni PiBrella                    $52.50
  2        $4.95 6mm Tactile Switch x20               $9.90
  1       $28.00 Panavise Jr. - PV-201                $28.00
  1       $34.95 PiTFT Mini Kit 320x240               $34.95
   


## Assigning to Slices

In [31]:
l = list(range(10))
l

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [32]:
l[2:5] = [20, 30]
l

[0, 1, 20, 30, 5, 6, 7, 8, 9]

In [33]:
del l[5:7]
l

[0, 1, 20, 30, 5, 8, 9]

In [34]:
l[3::2] = [11, 22]
l

[0, 1, 20, 11, 5, 22, 9]

In [35]:
try:
    l[2:5] = 100
except TypeError as e:
    print(repr(e))

TypeError('can only assign an iterable')


In [36]:
l[2:5] = [100]
l

[0, 1, 100, 22, 9]

## Using + and * with Sequences
Pythonistas expect sequences to support `+` and `*`.  
Both operators create a new object and never change their operands.

In [37]:
l = [1, 2, 3]
l * 5

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]

In [38]:
5 * 'abcd'

'abcdabcdabcdabcdabcd'

### Building Lists of Lists
A list with three lists of length 3 can represent a tic-tac-toe board

In [45]:
def print_b(board):
    for row in board:
        print(row)
        
board = [['_'] * 3 for i in range(3)]

print_b(board)

['_', '_', '_']
['_', '_', '_']
['_', '_', '_']


In [46]:
board[1][2] = 'X'
print_b(board)

['_', '_', '_']
['_', '_', 'X']
['_', '_', '_']


A list with three references to the same list is useless as a board: changing one row changes all three.

In [47]:
weird_board = [['_'] * 3] * 3
print_b(weird_board)

['_', '_', '_']
['_', '_', '_']
['_', '_', '_']


In [48]:
weird_board[1][2] = 'O'
print_b(weird_board)

['_', '_', 'O']
['_', '_', 'O']
['_', '_', 'O']

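The aliasing behind the weird board can be confirmed with the `is` operator. A minimal sketch (`good_board` is a name introduced here for contrast):

```python
good_board = [['_'] * 3 for _ in range(3)]  # a new inner list per iteration
weird_board = [['_'] * 3] * 3               # the same inner list, three times

print(good_board[0] is good_board[1])    # False
print(weird_board[0] is weird_board[1])  # True

weird_board[1][2] = 'O'
print(weird_board[0][2])                 # O
```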

## Augmented Assignment with Sequences
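The special method that makes `+=` work is `__iadd__` (for "in-place addition"); `*=` uses `__imul__` in the same way.  
If `__iadd__` is not implemented, Python falls back to `__add__`, so `a += b` has the same effect as `a = a + b`.  
Mutable sequences such as `list` implement the in-place methods and keep their identity; immutable sequences cannot change in place, so a new object is created and bound to the name.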

In [58]:
l = [1, 2, 3]
idl = id(l)
idl

4399381952

In [59]:
l *= 2
l

[1, 2, 3, 1, 2, 3]

In [60]:
id(l)

4399381952

In [61]:
t = (1, 2, 3)
idt = id(t)
idt

4399351760

In [62]:
t *= 2
t

(1, 2, 3, 1, 2, 3)

In [63]:
id(t)

4397866352

## A += Assignment Puzzler

    t = (1,  2, [30, 40])
    t[2] +=  [50, 60]

A. `t` becomes `(1, 2, [30, 40, 50, 60])`   
B. `TypeError` is raised with message `'tuple' object does not support item assignment`  
C. Both A and B  

    t
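
Running the puzzler inside a `try` block suggests the answer is C: the list inside the tuple is extended in place, and then storing the result back into the tuple fails. A minimal sketch:

```python
t = (1, 2, [30, 40])

try:
    t[2] += [50, 60]
    raised = False
except TypeError:
    raised = True

print(raised)  # True
print(t)       # (1, 2, [30, 40, 50, 60])
```

This matches the order of operations in the bytecode: the in-place add runs before the item is stored back.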

In [67]:
import dis

dis.dis('s[a] += b')

  1           0 LOAD_NAME                0 (s)
              2 LOAD_NAME                1 (a)
              4 DUP_TOP_TWO
              6 BINARY_SUBSCR
              8 LOAD_NAME                2 (b)
             10 INPLACE_ADD
             12 ROT_THREE
             14 STORE_SUBSCR
             16 LOAD_CONST               0 (None)
             18 RETURN_VALUE


##  list.sort and the sorted Built-In Function
The `list.sort` method sorts a list in place. It returns `None`.  
In contrast, the built-in function `sorted` creates a new sequence and returns it.

In [68]:
fruits = ['grape', 'raspberry', 'apple', 'banana']
sorted(fruits)

['apple', 'banana', 'grape', 'raspberry']

In [69]:
fruits

['grape', 'raspberry', 'apple', 'banana']

In [70]:
sorted(fruits, reverse=True)

['raspberry', 'grape', 'banana', 'apple']

In [71]:
sorted(fruits, key=len)

['grape', 'apple', 'banana', 'raspberry']

In [72]:
sorted(fruits, key=len, reverse=True)

['raspberry', 'banana', 'grape', 'apple']

In [73]:
fruits.sort()
fruits

['apple', 'banana', 'grape', 'raspberry']

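Because `list.sort` returns `None`, writing `fruits = fruits.sort()` would lose the list. The sketch below contrasts the two return values, reusing the `fruits` example:

```python
fruits = ['grape', 'raspberry', 'apple', 'banana']

new_list = sorted(fruits)   # returns a new list; fruits is unchanged
result = fruits.sort()      # sorts in place and returns None

print(new_list == fruits)   # True: both are now sorted
print(new_list is fruits)   # False: sorted() made a separate list
print(result)               # None
```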
## Managing Ordered Sequences with bisect
The `bisect` module has two main functions -- `bisect` and `insort` -- that use the binary search algorithm to quickly find and insert items in any sorted sequence.
### `bisect(haystack, needle)`
`bisect` finds insertion points for items in a sorted sequence.



In [74]:
# BEGIN BISECT_DEMO
import bisect

HAYSTACK = [1, 4, 5, 6, 8, 12, 15, 20, 21, 23, 23, 26, 29, 30]
NEEDLES = [0, 1, 2, 5, 8, 10, 22, 23, 29, 30, 31]

ROW_FMT = '{0:2d} @ {1:2d}    {2}{0:<2d}'

def demo(haystack, needles, bisect_fn):
    print('DEMO:', bisect_fn.__name__)  # <1> print header with name of function selected
    print('haystack ->', ' '.join('%2d' % n for n in haystack))
    for needle in reversed(needles):
        position = bisect_fn(haystack, needle)  # <2> use the chosen bisect function to get the insertion point
        offset = position * '  |'  # <3> build a pattern of vertical bars proportional to the offset
        print(ROW_FMT.format(needle, position, offset))  # <4> print formatted row showing needle and insertion point

demo(HAYSTACK, NEEDLES, bisect.bisect)  # <5> pass the bisect function to use; bisect.bisect is an alias for bisect_right
# END BISECT_DEMO

DEMO: bisect_right
haystack ->  1  4  5  6  8 12 15 20 21 23 23 26 29 30
31 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |31
30 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |30
29 @ 13      |  |  |  |  |  |  |  |  |  |  |  |  |29
23 @ 11      |  |  |  |  |  |  |  |  |  |  |23
22 @  9      |  |  |  |  |  |  |  |  |22
10 @  5      |  |  |  |  |10
 8 @  5      |  |  |  |  |8 
 5 @  3      |  |  |5 
 2 @  1      |2 
 1 @  1      |1 
 0 @  0    0 


In [75]:
demo(HAYSTACK, NEEDLES, bisect.bisect_left)

DEMO: bisect_left
haystack ->  1  4  5  6  8 12 15 20 21 23 23 26 29 30
31 @ 14      |  |  |  |  |  |  |  |  |  |  |  |  |  |31
30 @ 13      |  |  |  |  |  |  |  |  |  |  |  |  |30
29 @ 12      |  |  |  |  |  |  |  |  |  |  |  |29
23 @  9      |  |  |  |  |  |  |  |  |23
22 @  9      |  |  |  |  |  |  |  |  |22
10 @  5      |  |  |  |  |10
 8 @  4      |  |  |  |8 
 5 @  2      |  |5 
 2 @  1      |2 
 1 @  0    1 
 0 @  0    0 


Given a test score, `grade` returns the corresponding letter grade:

In [76]:
def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
    i = bisect.bisect(breakpoints, score)
    return grades[i]

[grade(score) for score in [55, 60, 65, 70, 75, 80, 85, 90, 95]]

['F', 'D', 'D', 'C', 'C', 'B', 'B', 'A', 'A']

`bisect_left` maps a score of 60 to grade F, not D, because it returns the position before an equal item:

In [77]:
def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
    i = bisect.bisect_left(breakpoints, score)
    return grades[i]

[grade(score) for score in [55, 60, 65, 70, 75, 80, 85, 90, 95]]

['F', 'F', 'D', 'D', 'C', 'C', 'B', 'B', 'A']

### `insort(seq, item)`
`insort` inserts an item into a sorted sequence so that the sequence stays sorted.

In [78]:
import bisect
import random

SIZE = 7

random.seed(1729)

my_list = []

for i in range(SIZE):
    new_item = random.randrange(SIZE*2)
    bisect.insort(my_list, new_item)
    print(f'insert {new_item:2d} -> {my_list}')

insert 10 -> [10]
insert  0 -> [0, 10]
insert  6 -> [0, 6, 10]
insert  8 -> [0, 6, 8, 10]
insert  7 -> [0, 6, 7, 8, 10]
insert  2 -> [0, 2, 6, 7, 8, 10]
insert 10 -> [0, 2, 6, 7, 8, 10, 10]

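One way to think about `insort` is as a `bisect` lookup followed by `list.insert`. The sketch below compares the two approaches on a small made-up list; it illustrates the idea and is not a description of how the module is implemented.

```python
import bisect

scores = [10, 20, 30, 40]

# Manual version: find the insertion point, then insert there.
manual = scores.copy()
manual.insert(bisect.bisect(manual, 25), 25)

# insort does both steps in one call.
auto = scores.copy()
bisect.insort(auto, 25)

print(manual)          # [10, 20, 25, 30, 40]
print(manual == auto)  # True
```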

# When a List Is Not the Answer
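If a list will only contain numbers, an `array.array` is more efficient.

## `array.array`
An `array.array` stores packed values of a single numeric type, given by its typecode, instead of full Python objects.

Creating, saving, and loading a large array of floats: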

In [79]:
from array import array
from random import random

floats = array('d', (random() for i in range(10**7)))
floats[-1]

0.5963321947530882

In [80]:
with open('floats.bin', 'wb') as fp:
    floats.tofile(fp)

In [81]:
floats2 = array('d')

with open('floats.bin', 'rb') as fp:
    floats2.fromfile(fp, 10**7)

floats2[-1]

0.5963321947530882

In [82]:
floats2 == floats

True

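The efficiency gain can be estimated with `sys.getsizeof`. This is a rough sketch that assumes CPython, where a list holds pointers to separate `float` objects; exact sizes depend on the interpreter and platform.

```python
import sys
from array import array

values = [float(i) for i in range(1000)]
packed = array('d', values)

# A list stores pointers to separate float objects, so count both.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
# An array stores the doubles directly in one buffer.
array_bytes = sys.getsizeof(packed)

print(packed.itemsize)           # 8 on typical platforms
print(array_bytes < list_bytes)  # True on CPython
```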
# Memory Views
https://docs.python.org/dev/library/stdtypes.html#memoryview

The `memoryview` class is a shared-memory sequence type that allows handling slices of arrays without copying bytes.

Changing the value of an array item by poking one of its bytes:

In [83]:
numbers = array('h', [-2, -1, 0, 1, 2]) # typecode 'h' - short signed integers
memv = memoryview(numbers)
len(memv)

5

In [84]:
memv[0]

-2

In [85]:
memv_oct = memv.cast('B') # typecode 'B' - unsigned char 
memv_oct.tolist()

[254, 255, 255, 255, 0, 0, 1, 0, 2, 0]

In [86]:
memv_oct[5] = 4
numbers

array('h', [-2, -1, 1024, 1, 2])

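The "without copying" part can be seen by slicing a `memoryview` and writing through the slice. A minimal sketch using a `bytearray` (the names `data`, `view` and `window` are illustrative):

```python
data = bytearray(b'abcdef')
view = memoryview(data)

window = view[2:5]     # a slice of the view: no bytes are copied
window[0] = ord('X')   # writes through to the underlying bytearray

print(data)            # bytearray(b'abXdef')
print(bytes(window))   # b'Xde'
```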
# NumPy and SciPy
For advanced array and matrix operations.

Basic operations with rows and columns in a `numpy.ndarray`:

In [87]:
import numpy as np
a = np.arange(12)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [88]:
type(a)

numpy.ndarray

In [89]:
a.shape

(12,)

In [90]:
a.shape = 3, 4
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [91]:
a[2]

array([ 8,  9, 10, 11])

In [92]:
a[2, 1]

9

In [93]:
a[:, 1]

array([1, 5, 9])

In [94]:
a.transpose()

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

Loading, saving, and vectorized operations:

In [95]:
with open('floats-1M-lines.txt', 'wt') as fp:
    for _ in range(1_000_000):
        fp.write(f'{random()}\n')

In [96]:
floats = np.loadtxt('floats-1M-lines.txt')

In [97]:
floats[-3:]

array([0.29150425, 0.33893554, 0.08112756])

Multiply every element by `.5`:

In [98]:
floats *= .5
floats[-3:]

array([0.14575213, 0.16946777, 0.04056378])

In [99]:
from time import perf_counter as pc

t0 = pc()
floats /= 3
(pc() - t0) < 0.01

True

In [100]:
np.save('floats-1M', floats)
floats2 = np.load('floats-1M.npy', 'r+')
floats2 *= 6

In [101]:
floats2[-3:]

memmap([0.29150425, 0.33893554, 0.08112756])

# Deques and Other Queues
Even though we can emulate FIFO behaviour with a `list` using `.append` and `.pop(0)`, removing from the left end is costly because every remaining item has to be shifted.

The class `collections.deque` is a thread-safe double-ended queue designed for fast inserting and removing from both ends.

It is also useful for keeping a list of the 'last x items', because a `deque` can be bounded: when it is full, it discards items from the opposite end as new ones are added.

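The two uses described above, a bounded "last x items" buffer and a FIFO queue, can be sketched as follows. The values and the names `last_three` and `queue` are made up for illustration.

```python
from collections import deque

last_three = deque(maxlen=3)    # bounded: holds at most 3 items

for reading in [10, 20, 30, 40, 50]:
    last_three.append(reading)  # when full, the oldest item drops off the left

print(list(last_three))         # [30, 40, 50]

# FIFO use: append on the right, popleft on the left.
queue = deque(['a', 'b', 'c'])
queue.append('d')
first = queue.popleft()
print(first, list(queue))       # a ['b', 'c', 'd']
```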
In [102]:
import collections

dq = collections.deque(range(10), maxlen=10)
dq

deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [103]:
dq.rotate(3)
dq

deque([7, 8, 9, 0, 1, 2, 3, 4, 5, 6])

In [104]:
dq.rotate(-4)
dq

deque([1, 2, 3, 4, 5, 6, 7, 8, 9, 0])

In [105]:
dq.appendleft(-1)
dq

deque([-1, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [106]:
dq.extend([11, 22, 33])
dq

deque([3, 4, 5, 6, 7, 8, 9, 11, 22, 33])

In [107]:
dq.extendleft([10, 20, 30, 40])
dq

deque([40, 30, 20, 10, 3, 4, 5, 6, 7, 8])