## Overview of Built-in Sequences
- Container sequences (hold references to the objects they contain)

    `list`, `tuple` and `collections.deque` can hold items of different types, including nested containers

- Flat sequences (store the values of their contents in their own memory space)

    `str`, `bytes`, `bytearray`, `memoryview` and `array.array` hold items of one simple type (only primitive machine values, like bytes, integers and floats)

Another way of grouping sequence types:
- Mutable sequences

    `list`, `bytearray`, `memoryview`, `array.array` and `collections.deque`
- Immutable sequences

    `tuple`, `str` and `bytes`

    Immutable sequences do not support item assignment (e.g. `t[0] = 1` raises `TypeError` for a `tuple`). Mutable sequences support it, along with in-place methods that change their value (e.g. `append` for `list`).
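
A minimal sketch of the difference, using made-up sample values (`point` and `values` are illustrative names, not from the notes above):

```python
# Hypothetical sample data, chosen only for illustration.
point = (1, 2, 3)    # immutable sequence
values = [1, 2, 3]   # mutable sequence

try:
    point[0] = 10    # item assignment on a tuple raises TypeError
except TypeError as exc:
    error = str(exc)
    print(error)

values[0] = 10       # item assignment works on a list
values.append(4)     # so do in-place methods such as append
print(values)
```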

## List Comprehensions (listcomps) and Generator Expressions (genexps)
### List Comprehensions
Listcomps build lists from sequences or any other iterable type by filtering and transforming items.

In [1]:
symbols = "WKW快去写代码"
# `ord` function returns an integer representing the Unicode character.
beyond_ascii = [ord(s) for s in symbols if ord(s) > 127]
beyond_ascii

[24555, 21435, 20889, 20195, 30721]

The same list can also be built with a `map`/`filter` composition, but readability suffers:


In [2]:
beyond_ascii = list(filter(lambda c : c > 127, map(ord, symbols)))
beyond_ascii

[24555, 21435, 20889, 20195, 30721]

Example: imagine you need to produce a list of T-shirts available in two colors and three sizes

In [3]:
colors = ["black", "white"]
sizes = ["S", "M", "L"]
tshirts = [(color, size) for color in colors for size in sizes]
tshirts

[('black', 'S'),
 ('black', 'M'),
 ('black', 'L'),
 ('white', 'S'),
 ('white', 'M'),
 ('white', 'L')]

### Generator Expressions
A genexp can be used to initialize tuples, [arrays](#Arrays-(flat,-mutable)) and other types of sequences. It yields items **one by one** instead of building a whole list first.

Genexps use the same syntax as listcomps, but are enclosed in parentheses `()` rather than brackets `[]`

In [4]:
symbols = "WKW快去写代码"

tuple(ord(s) for s in symbols if ord(s) > 127) # the genexp is the only argument here, so no extra parentheses are needed

(24555, 21435, 20889, 20195, 30721)

In [5]:
import array
array.array("I", (ord(s) for s in symbols if ord(s) > 127)) # array constructor takes two arguments, parentheses around the genexp are mandatory

array('I', [24555, 21435, 20889, 20195, 30721])

In [6]:
colors = ["black", "white"]
sizes = ["S", "M", "L"]

# yield items one by one, a list with all 6 T-shirts variations is never produced
for tshirt in (f"{color} {size}" for color in colors for size in sizes):
    print(tshirt)

black S
black M
black L
white S
white M
white L
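
The "one by one" behaviour can be observed directly with `next`. The sketch below reuses the `symbols` string from the earlier cells; the variable names are only illustrative.

```python
symbols = "WKW快去写代码"

# Creating the genexp computes nothing yet.
codes = (ord(s) for s in symbols if ord(s) > 127)

first = next(codes)      # computes only the first matching item
remaining = list(codes)  # consumes the rest of the generator
leftover = list(codes)   # the generator is exhausted, so this is empty

print(first)
print(remaining)
print(leftover)
```

A genexp can be iterated only once, which is the price paid for not materializing the list.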


## Tuples are NOT just immutable lists
### Tuples as Records (Unpacking)

In [7]:
# parallel assignment
lax_coordinates = (33.9425, -118.4108056)
latitude, longitude = lax_coordinates
print(latitude)
print(longitude)

33.9425
-118.4108056


In [8]:
# swapping without a temporary variable
a, b = 5, 6
b, a = a, b
print(a)
print(b)

6
5


In [9]:
# prefixing an argument with a `*` when calling a function
t = (20, 8)

# returns a tuple consisting of their quotient and remainder.
divmod(*t)

(2, 4)

In [10]:
import os

# _ can be used as a placeholder for what we don't need
_, filename = os.path.split("/Desktop/Fluent-Python-2/02.ipynb")
filename

'02.ipynb'

In [11]:
# Use `*` to grab excess items (zero or more); the starred target does not have to be last
a, *rest, b = range(5)
rest

[1, 2, 3]

In [12]:
a, b, *rest = range(2)
rest

[]

### Nested Tuple Unpacking

In [13]:
metro_areas = [
    ('Tokyo', 'JP', 36.933, (35.689722, 139.691667)),
    ('Delhi NCR', 'IN', 21.935, (28.613889, 77.208889)),
    ('Mexico City', 'MX', 20.142, (19.433333, -99.133333)),
    ('New York-Newark', 'US', 20.104, (40.808611, -74.020386)),
    ('Sao Paulo', 'BR', 19.649, (-23.547778, -46.635833)),
]

# 15 is the width of the first column, `^` means centered
print(f'{"":15} | {"lat.":^9} | {"long.":^9}')
for name, cc, pop, (latitude, longitude) in metro_areas:
    if longitude <= 0:
        print(f'{name:15} | {latitude:9.4f} | {longitude:9.4f}')

                |   lat.    |   long.  
Mexico City     |   19.4333 |  -99.1333
New York-Newark |   40.8086 |  -74.0204
Sao Paulo       |  -23.5478 |  -46.6358


### Tuples as Immutable Lists
- Length of a `tuple` will never change
- A `tuple` uses less memory than a `list` of the same length
- The immutability of a `tuple` only applies to the references contained in it. References in a tuple cannot be deleted or replaced, but if one of those references points to a mutable object, and that object is changed, then the value of the `tuple` **changes**.

In [14]:
a = (10, 'alpha', [1, 2])
a[-1].append(99)

a

(10, 'alpha', [1, 2, 99])

How to make sure a `tuple` will stay unchanged?
- An object is hashable only if its value can never change, so checking whether the `tuple` is hashable is one way:

In [15]:
def fixed(obj):
    try:
        hash(obj)
    except TypeError:
        return False
    return True

a = (10, 'alpha', [1, 2])
b = (10, 'alpha', (1, 2))
print(fixed(a))
print(fixed(b))

False
True


### Tuple VS. List methods
`tuple` supports all `list` methods that do not involve adding or removing items, with one exception: `tuple` lacks the `__reversed__` method, but `reversed(my_tuple)` works without it.
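
A quick check of that exception, as a sketch with an arbitrary sample tuple:

```python
t = (1, 2, 3)  # arbitrary sample data

tuple_has_reversed = hasattr(t, "__reversed__")  # tuple does not define it
list_has_reversed = hasattr([], "__reversed__")  # list does

# reversed() falls back to the sequence protocol (__len__ and __getitem__).
backwards = tuple(reversed(t))

print(tuple_has_reversed, list_has_reversed, backwards)
```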

## Slicing
### Slice Objects
`s[a:b:c]` specifies a stride or step `c`, which can also be negative

In [16]:
s = 'bicycle'
s[::3], s[::-1], s[2::2]

('bye', 'elcycib', 'cce')

The notation `a:b:c` inside `[]` actually produces a slice object `slice(a, b, c)`. To evaluate `s[a:b:c]`, the Python interpreter calls `s.__getitem__(slice(a, b, c))`.
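
To see the slice object for yourself, here is a small sketch. `ShowIndex` is a hypothetical helper class written for this note; it simply returns whatever index object `__getitem__` receives.

```python
class ShowIndex:
    """Hypothetical helper: echoes the index passed to __getitem__."""

    def __getitem__(self, index):
        return index


probe = ShowIndex()
print(probe[1:5:2])  # a slice object with start=1, stop=5, step=2
print(probe[::3])    # omitted parts become None

s = 'bicycle'
# The three spellings below are equivalent.
via_brackets = s[::3]
via_slice = s[slice(None, None, 3)]
via_dunder = s.__getitem__(slice(None, None, 3))
print(via_brackets, via_slice, via_dunder)
```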

As a result, we can also name some slices to make them more readable

In [17]:
invoice = """
0.....6.................................40........52...55........
1909  Pimoroni PiBrella                     $17.50    3    $52.50
1489  6mm Tactile Switch x20                 $4.95    2     $9.90
1510  Panavise Jr. - PV-201                 $28.00    1    $28.00
1601  PiTFT Mini Kit 320x240                $34.95    1    $34.95
"""
SKU = slice(0, 6)
DESCRIPTION = slice(6, 40)
UNIT_PRICE = slice(40, 52)
QUANTITY =  slice(52, 55)
ITEM_TOTAL = slice(55, None)

line_items = invoice.split("\n")[2:-1]

for item in line_items:
    print(item[UNIT_PRICE], item[DESCRIPTION])

    $17.50   Pimoroni PiBrella                 
     $4.95   6mm Tactile Switch x20            
    $28.00   Panavise Jr. - PV-201             
    $34.95   PiTFT Mini Kit 320x240            


### Multidimensional Slicing and Ellipsis
To evaluate `a[i, j]`, Python calls `a.__getitem__((i, j))`

The Ellipsis `...` can be passed as an argument to functions and as part of a slice specification, as in `f(a, ..., z)` or `a[i:...]`

Multidimensional slicing and the Ellipsis are mostly used to support user-defined types or extensions such as NumPy, where `x[i, ...]` is a shortcut for `x[i, :, :, :]` if `x` is a four-dimensional array
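
The built-in sequences do not accept tuple indices, but a user-defined class can. The sketch below uses a hypothetical `IndexEcho` class (not part of any library) to show what `__getitem__` receives in each case.

```python
class IndexEcho:
    """Hypothetical helper: returns the index object it is given."""

    def __getitem__(self, index):
        return index


grid = IndexEcho()

pair = grid[1, 2]          # a plain tuple of ints
slices = grid[1:3, ::2]    # a tuple of slice objects
with_dots = grid[0, ...]   # `...` is the Ellipsis singleton

print(pair)
print(slices)
print(with_dots)
```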

### Assigning to Slices
Mutable sequences can be modified in place using slice notation on the left-hand side of an assignment statement or as the target of a `del` statement

In [18]:
l = list(range(10))
# left-hand side of an assignment statement
l[2:5] = [20] # the right-hand side must be an iterable, even for a single item
print(l)

# target of a `del` statement
del l[2:5]
print(l)

[0, 1, 20, 5, 6, 7, 8, 9]
[0, 1, 7, 8, 9]


## Using `+` and `*` with Sequences
- Usually both operands of `+` must be of the same sequence type.
- Both `+` and `*` always create a new object, and never change their operands


In [19]:
5 * 'abcd'

'abcdabcdabcdabcdabcd'
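
A short sketch of both bullet points above, using arbitrary sample lists: the operands are left untouched, and mixing sequence types with `+` fails.

```python
a = [1, 2]  # arbitrary sample data
b = [3]

c = a + b   # new list
d = a * 2   # new list

print(c, d)
print(a, b)  # the operands are unchanged

try:
    a + (3,)  # list + tuple: operands are different sequence types
except TypeError as exc:
    mixed_error = str(exc)
    print(mixed_error)
```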

### Building Lists of Lists
- `['_'] * 3` is equivalent to `['_' for i in range(3)]`
- But `[['_'] * 3] * 3` is not equivalent to `[['_'] * 3 for i in range(3)]`

In [20]:
board = [['_'] * 3 for i in range(3)]
print(board)
# board[1,2] = 'X' #list indices must be integers or slices, not tuple
board[1][2] = 'X'
board

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]


[['_', '_', '_'], ['_', '_', 'X'], ['_', '_', '_']]

In [21]:
# A tempting but wrong shortcut
weird_board = [['_'] * 3] * 3 # `weird_board` is actually made of three references to the SAME inner list
print(weird_board)
weird_board[1][2] = 'O' # Also changes other two rows
weird_board

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]


[['_', '_', 'O'], ['_', '_', 'O'], ['_', '_', 'O']]
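
The aliasing can be confirmed with identity checks. This is a small sketch that rebuilds both boards from scratch; `same_rows` and `distinct_rows` are illustrative names.

```python
weird_board = [['_'] * 3] * 3               # three references to ONE inner list
board = [['_'] * 3 for _ in range(3)]       # three separate inner lists

# Every row of weird_board is the very same object.
same_rows = all(row is weird_board[0] for row in weird_board)

# Each row of board is a different object.
distinct_rows = len({id(row) for row in board})

print(same_rows, distinct_rows)
```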

In [22]:
temp = ['_' for i in range(3)]
temp[1] = 'O'
temp

['_', 'O', '_']

## Augmented Assignment with Sequences
`+=` calls `__iadd__` (in-place addition) if available. If `__iadd__` is not implemented, Python falls back to `__add__` to compute the sum, and then binds the result to the original name. So whether the `id` of `a` changes after `a += b` depends on whether its class implements `__iadd__`.

Likewise, `*=` calls `__imul__`, falling back to `__mul__`

In [23]:
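class A:
    def __init__(self, x=None):
        self.x = [0] if x is None else x  # avoid a shared mutable default argument
    def __add__(self, y):
        return A(self.x + y)
    def __iadd__(self, y):
        self.x += y
        return self

class B:
    def __init__(self, x=None):
        self.x = [0] if x is None else x
    def __add__(self, y):
        return B(self.x + y)  # no `__iadd__`, so `+=` creates a new B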

a = A([5])
old_id = id(a)
a += [5]
new_id = id(a)
print(old_id == new_id)

b = B([5])
old_id = id(b)
b += [5]
new_id = id(b)
print(old_id == new_id)

True
False


Augmented assignment on a mutable sequence keeps the same object (same `id`), but on an immutable sequence it creates a new object and rebinds the name, so the `id` changes

In [24]:
l = [1, 2, 3]
old_id = id(l)
l *= 2
new_id = id(l)
print(old_id == new_id)

True


In [25]:
t = (1, 2, 3)
old_id = id(t)
t *= 2
new_id = id(t)
print(old_id == new_id)

False


### A `+=` riddle

In [26]:
t = (1, 2, [30, 40])

try:
    t[2] += [50, 60]
except TypeError:   # TypeError because a `tuple` object does not support item assignment
    print(t)        # yet the in-place addition on the inner list succeeded

(1, 2, [30, 40, 50, 60])


In [27]:
# Bytecode for the expression `t[2] += [50, 60]`
import dis
dis.dis('t[2] += [50, 60]')

  1           0 LOAD_NAME                0 (t)
              2 LOAD_CONST               0 (2)
              4 DUP_TOP_TWO
              6 BINARY_SUBSCR
              8 LOAD_CONST               1 (50)
             10 LOAD_CONST               2 (60)
             12 BUILD_LIST               2
             14 INPLACE_ADD
             16 ROT_THREE
             18 STORE_SUBSCR
             20 LOAD_CONST               3 (None)
             22 RETURN_VALUE



- `BINARY_SUBSCR`: Put the value of `t[2]` on TOS (Top Of Stack)
- `INPLACE_ADD`: Perform `TOS += [50, 60]`. This succeeds because `TOS` refers to a mutable object (a `list`)
- `STORE_SUBSCR`: Assign `t[2] = TOS`. This fails because `t` is immutable (a `tuple`)

AVOID putting mutable items in tuples!!!
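
If a tuple already holds a list, one way to sidestep the riddle is to mutate the list through a method call instead of `+=`, since no assignment to `t[2]` is then attempted. A minimal sketch with the same sample tuple:

```python
t = (1, 2, [30, 40])

# extend() mutates the inner list in place; there is no store into the tuple,
# so no TypeError is raised.
t[2].extend([50, 60])
print(t)
```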

## `list.sort` and the `sorted` built-in function
- `list.sort` method sorts a list in-place without making a new copy and returns `None` to remind us that it changes the receiver and does not create a new list. **Similar behavior can be seen in other in-place functions** (e.g. `random.shuffle()`)

- Built-in function `sorted` creates a new list and returns it.

Both `list.sort` and `sorted` take two optional, keyword-only arguments:
- reverse: `False` by default. If `True`, the items are returned in descending order
- key: A one-argument function that is applied to each item to produce its sorting key (e.g. `len`)

In [28]:
fruits = ['grape', 'raspberry', 'apple', 'banana']
sorted(fruits, key=len)

['grape', 'apple', 'banana', 'raspberry']

In [29]:
print(fruits.sort(key=len))
print(fruits)

None
['grape', 'apple', 'banana', 'raspberry']
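
The two arguments can be combined. The sketch below reuses the `fruits` list and also checks that `key` really is keyword-only; `positional_key_rejected` is just an illustrative flag.

```python
fruits = ['grape', 'raspberry', 'apple', 'banana']

# Longest first; the sort is stable, so 'grape' stays ahead of 'apple'.
by_len_desc = sorted(fruits, key=len, reverse=True)
alphabetical = sorted(fruits)

print(by_len_desc)
print(alphabetical)
print(fruits)  # sorted() never changes its argument

try:
    sorted(fruits, len)  # passing the key positionally is rejected
except TypeError:
    positional_key_rejected = True
else:
    positional_key_rejected = False
print(positional_key_rejected)
```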


## Managing Ordered Sequences with `bisect`
The `bisect` module offers two main functions— `bisect` and `insort`—that use the **binary search** algorithm to quickly find and insert items in any sorted sequence.

### Searching with bisect (Not exactly binary search)

`bisect(haystack, needle)` does a binary search for `needle` in `haystack`—which must be a sorted sequence—to locate the position where `needle` can be inserted while maintaining `haystack` in ascending order.

In [30]:
import bisect

# `bisect.bisect` (alias of `bisect_right`) returns the index of the first element greater than needle, or len(haystack) if there is none
breakpoints = [60, 70, 80, 90]
grades = 'FDCBA'
def grade(score):
    i = bisect.bisect(breakpoints, score)
    return grades[i]

[grade(score) for score in [55,60,65,70,75,80,85,90,95]]

['F', 'D', 'D', 'C', 'C', 'B', 'B', 'A', 'A']

In [31]:
# `bisect.bisect_left` returns the index of the first element greater than or equal to needle, or len(haystack) if there is none
breakpoints = [60, 70, 80, 90]
grades = 'FDCBA'
def grade(score):
    i = bisect.bisect_left(breakpoints, score)
    return grades[i]

[grade(score) for score in [55,60,65,70,75,80,85,90,95]]

['F', 'F', 'D', 'D', 'C', 'C', 'B', 'B', 'A']
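
The two functions differ only when the needle is already present. A small sketch with a made-up `haystack` containing duplicates:

```python
import bisect

haystack = [1, 4, 4, 4, 7]  # hypothetical sorted data with duplicates

left = bisect.bisect_left(haystack, 4)    # before the existing 4s
right = bisect.bisect_right(haystack, 4)  # after the existing 4s
same = bisect.bisect(haystack, 4)         # bisect is an alias of bisect_right

# For a needle that is absent, both give the same insertion point.
absent = (bisect.bisect_left(haystack, 5), bisect.bisect_right(haystack, 5))

print(left, right, same, absent)
```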

### Inserting with `bisect.insort`
`insort` inserts an item into a sorted sequence so that it stays sorted. Internally, `insort` uses `bisect` to find the proper index.

In [32]:
import random
SIZE = 7
random.seed(42)
my_list = []
for i in range(SIZE):
    new_item = random.randrange(SIZE * 2)
    bisect.insort(my_list, new_item)
    print(f'{new_item:>2d} -> {my_list}')

10 -> [10]
 1 -> [1, 10]
 0 -> [0, 1, 10]
11 -> [0, 1, 10, 11]
 4 -> [0, 1, 4, 10, 11]
 3 -> [0, 1, 3, 4, 10, 11]
 3 -> [0, 1, 3, 3, 4, 10, 11]
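
In effect, `insort(seq, item)` behaves like a `bisect` lookup followed by `list.insert`. The sketch below compares the two on made-up data; it illustrates the behaviour and is not the module's actual implementation.

```python
import bisect

scores = [10, 20, 30]   # hypothetical sorted data
manual = list(scores)   # independent copy for the manual version

bisect.insort(scores, 25)

index = bisect.bisect(manual, 25)  # find the insertion point
manual.insert(index, 25)           # then insert there

print(scores, manual)
```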


## When a List is NOT the Answer
The `list` type is flexible and easy to use, but depending on specific requirements, there are better options. 
- An `array` **saves a lot of memory** when you need to store millions of floating-point values. 
- If you are constantly adding and removing items from opposite ends of a `list`, it’s good to know that a `deque` (double-ended queue) is a more efficient FIFO data structure.
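
A rough way to see the memory difference is to add up `sys.getsizeof` for a `list` and its float objects and compare with an `array` holding the same values. This is only a sketch: the size `n` is arbitrary and the exact byte counts depend on the Python implementation and platform.

```python
import sys
from array import array

n = 10_000  # arbitrary sample size
as_list = [float(i) for i in range(n)]
as_array = array('d', as_list)

# A list stores references; each float is a separate object with its own overhead.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)

# An array stores the raw C doubles contiguously.
array_bytes = sys.getsizeof(as_array)

print(list_bytes, array_bytes)
```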

### Arrays (flat, mutable)
When creating an `array`, you provide a [typecode](https://docs.python.org/3/library/array.html#module-array), a letter to determine the underlying C type used to store each item in the array. 
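
Because of the typecode, an `array` rejects values that do not fit the underlying C type. A minimal sketch with typecode `'b'` (signed char); the flag names are illustrative.

```python
from array import array

signed = array('b', [-128, 0, 127])  # 'b': signed char, range -128..127

try:
    signed.append(200)   # out of range for a signed char
except OverflowError:
    overflow_rejected = True
else:
    overflow_rejected = False

try:
    signed.append(1.5)   # a float is not accepted by an integer typecode
except TypeError:
    float_rejected = True
else:
    float_rejected = False

print(overflow_rejected, float_rejected, signed)
```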

Methods for **fast** loading and saving: `.fromfile`, `.tofile`

In [33]:
from array import array
from random import random

floats = array('d', (random() for i in range(10**7)))
print(floats[-1])
with open('floats.bin', 'wb') as fp:
    floats.tofile(fp)

floats2 = array('d')
with open('floats.bin', 'rb') as fp:
    floats2.fromfile(fp, 10**7)  # second argument: how many items to read

floats2[-1]

0.1703067320049848


0.1703067320049848

### Memory Views
The built-in `memoryview` class is a **shared-memory sequence** type that lets you handle slices of `array`s **without copying bytes**. 

In [34]:
octets = array('B', range(6)) # 'B' : unsigned char
m1 = memoryview(octets)
m2 = m1.cast('B', [2, 3])
octets[4] == m2[1, 1]  # same element seen through both objects (comparing `id()` of small ints would only reflect int caching)

True

In [35]:
m2[1,1] = 40
m1.tolist() # `m1` and `m2` share memory

[0, 1, 2, 3, 40, 5]

Casting a `memoryview` can also reinterpret the same bytes under a different typecode, for example switching between signed (`'b'`) and unsigned (`'B'`) values
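
A minimal sketch of that reinterpretation, using a made-up array of signed bytes:

```python
from array import array

signed = array('b', [-1, 0, 1])       # 'b': signed char
view = memoryview(signed).cast('B')   # same bytes, read as unsigned char

as_unsigned = view.tolist()
print(as_unsigned)                    # -1 shows up as 255

view[1] = 200                         # write through the unsigned view
print(signed)                         # the array sees 200 as 200 - 256
```

No bytes are copied: the write through `view` is visible in `signed` because both share the same memory.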

### Deques and Other Queues
The `.append` and `.pop` methods make a list usable as a stack or a queue. But inserting and removing from the head of a list is costly because the entire list **must be shifted in memory**.

In [36]:
from collections import deque

dq = deque(range(10), maxlen=10) # The optional `maxlen` argument sets the maximum number of items allowed in this instance of `deque`
dq.rotate(-4) # Rotating with n > 0 takes items from the right end and prepends them to the left; when n < 0 items are taken from left and appended to the right.
dq

deque([4, 5, 6, 7, 8, 9, 0, 1, 2, 3])

In [37]:
dq.extendleft([1, 2]) # Appending to a deque that is full (len(d) == d.maxlen) discards items from the other end;
dq # 2 and 3 are discarded

deque([2, 1, 4, 5, 6, 7, 8, 9, 0, 1])

In [38]:
dq.extend([3, 4, 5])
dq # 2, 1 and 4 are discarded

deque([5, 6, 7, 8, 9, 0, 1, 3, 4, 5])
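
For the queue use case mentioned at the start of this section, a `deque` gives efficient operations at both ends. A small sketch of FIFO use with made-up items:

```python
from collections import deque

queue = deque()        # unbounded deque used as a FIFO queue
queue.append('a')      # enqueue on the right
queue.append('b')
queue.append('c')

first = queue.popleft()  # dequeue from the left, no shifting of other items
queue.appendleft('z')    # pushing at the head is cheap as well

print(first, queue)
```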