# Chapter 2: An array of sequences

## Overview of built-in sequences

- Container vs. flat
  - Container sequences
    - list, tuple and collections.deque can hold items of different types
    - hold references to the objects they contain
  - Flat sequences
    - str, bytes, bytearray, memoryview and array.array hold items of one type
    - physically store the value of each item within their own memory space
    
- Mutable vs. Immutable
  - Mutable sequences
    - list, bytearray, array.array, collections.deque and memoryview
  - Immutable sequences
    - tuple, str and bytes
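
The two classifications above can be checked with a short sketch. The variable names (`mixed`, `numbers`, `flat_accepts_str`, `tuple_changed`) are made up for illustration; only the standard `array` module is assumed.

```python
import array

# Container sequence: a list holds references, so item types can be mixed.
mixed = [1, 'two', 3.0, (4, 5)]

# Flat sequence: array.array stores raw values of a single type code.
numbers = array.array('i', [1, 2, 3])
try:
    numbers.append('four')  # not an int, so the array rejects it
    flat_accepts_str = True
except TypeError:
    flat_accepts_str = False

# Mutable vs. immutable: a list supports item assignment, a tuple does not.
mutable = [1, 2, 3]
mutable[0] = 99

immutable = (1, 2, 3)
try:
    immutable[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(flat_accepts_str, mutable, tuple_changed)
```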

## List comprehensions (list-comps) and generator expressions (genexps)

- List-comps
  - Only meant to build a new list
  - In Python 3, the loop variable no longer leaks into the surrounding scope
  - Listcomps do everything the map and filter functions do, without the contortions of the functionally challenged Python lambda
- Genexps (Generator expressions)
  -  A genexp saves memory because it yields items one by one using the iterator protocol instead of building a whole list just to feed another constructor.
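
A rough way to see the memory claim is to compare the size of a list object with the size of a generator object built from the same expression. This is only a sketch: `sys.getsizeof` reports the size of the object itself (for a list, its array of references, not the items), and the names `as_list`, `as_genexp` and the item count `n` are arbitrary choices for illustration.

```python
import sys

n = 100_000
as_list = [i * 2 for i in range(n)]      # builds all n items up front
as_genexp = (i * 2 for i in range(n))    # produces nothing until consumed

list_size = sys.getsizeof(as_list)       # grows with n
genexp_size = sys.getsizeof(as_genexp)   # small and independent of n

# The genexp can still feed a consumer such as sum(), one item at a time.
total = sum(as_genexp)

print(list_size, genexp_size, total)
```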
 

In [1]:
dummy = [x for x in 'alksdjf']

In [2]:
x

NameError: name 'x' is not defined

^ The loop variable x no longer leaks. List comprehensions, generator expressions and their siblings, set and dict comprehensions, now have their own local scope, like functions.
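
The same check can be sketched for dict and set comprehensions. The function name `comprehension_variable_leaks` and the loop variable `inner_item` are hypothetical names chosen for this example; the sketch assumes no global called `inner_item` exists.

```python
def comprehension_variable_leaks():
    # Both comprehensions use inner_item as their loop variable.
    squares = {inner_item: inner_item ** 2 for inner_item in range(3)}
    parities = {inner_item % 2 for inner_item in range(3)}
    try:
        inner_item  # looked up outside the comprehensions
    except NameError:
        return False  # the name never escaped the comprehension scope
    return True

leaked = comprehension_variable_leaks()
print(leaked)
```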

In [3]:
x = 'ABC'

In [7]:
dummy = [ord(x) for x in x]
print(dummy)

[65, 66, 67]


In [8]:
# the comprehension's own x does not overwrite the x in the surrounding scope
print(x)

ABC


In [1]:
symbols = '$¢£¥€¤'

In [2]:
%%time
i = 0
while i <= 1000000:
    beyond_ascii = [ord(s) for s in symbols if ord(s) > 127]
    i += 1

CPU times: user 2.52 s, sys: 21.1 ms, total: 2.54 s
Wall time: 2.55 s


In [3]:
%%time
i = 0
while i <= 1000000:
    beyond_ascii = list(filter(lambda c: c > 127, map(ord, symbols)))
    i += 1

CPU times: user 3.04 s, sys: 25.3 ms, total: 3.07 s
Wall time: 3.09 s


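In this run the listcomp was slightly faster (2.55 s vs. 3.09 s wall time), but a single timing is not enough to say which one is faster in general.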
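
A less noisy comparison might use the standard `timeit` module and keep the best of several repeats. This is a sketch, not a benchmark: the helper names `with_listcomp` and `with_map_filter` and the repeat counts are assumptions, and the numbers will vary by machine and Python version.

```python
import timeit

symbols = '$¢£¥€¤'

def with_listcomp():
    return [ord(s) for s in symbols if ord(s) > 127]

def with_map_filter():
    return list(filter(lambda c: c > 127, map(ord, symbols)))

# Both approaches must build the same list before timing means anything.
assert with_listcomp() == with_map_filter()

# Take the minimum of a few repeats to reduce noise from other processes.
listcomp_best = min(timeit.repeat(with_listcomp, number=10_000, repeat=3))
map_filter_best = min(timeit.repeat(with_map_filter, number=10_000, repeat=3))

print(f'listcomp   : {listcomp_best:.4f} s')
print(f'map/filter : {map_filter_best:.4f} s')
```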

In [6]:
# Cartesian products
colors = ['black', 'white']
sizes = ['s', 'm', 'xxxxxl']
tshirts = [(color, size) for color in colors for size in sizes]
tshirts

[('black', 's'),
 ('black', 'm'),
 ('black', 'xxxxxl'),
 ('white', 's'),
 ('white', 'm'),
 ('white', 'xxxxxl')]

In [7]:
symbols = '$¢£¥€¤'
tuple(ord(symbol) for symbol in symbols)

(36, 162, 163, 165, 8364, 164)

In [8]:
import array
array.array('I', (ord(symbol) for symbol in symbols))

array('I', [36, 162, 163, 165, 8364, 164])

^ genexps can feed the constructors of other sequence types, such as tuple and array.array, not just list

In [10]:
for tshirt in ('%s %s' % (c, s) for c in colors for s in sizes):
    print(tshirt)

black s
black m
black xxxxxl
white s
white m
white xxxxxl


^ genexp yields items one by one; a list with all 6 t-shirt variations is never produced

## Tuples are not just immutable lists
- Tuples do double-duty:
  - can be used as immutable lists
  - also as records with no field names: each item in the tuple holds the data for one field, and the position of the item gives its meaning
- Tuple unpacking
  - Swapping the values of variables without using a temporary variable
  - Prefixing an argument with a star when calling a function
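  - Parallel assignment: the * prefix can be applied to exactly one variable, but it can appear in any position
- Nested tuple unpacking
  - The target of an unpacking can itself contain nested tuples, such as (latitude, longitude), as long as it matches the structure of the item being unpacked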

In [15]:
traveler_ids = [('USA', '31195855'), ('BRA', 'CE342567'), ('ESP', 'XDA205856')]
for passport in sorted(traveler_ids):
    # equivalent: print('%s/%s' % passport), where % takes each tuple item as a separate field
    print('/'.join(passport))

BRA/CE342567
ESP/XDA205856
USA/31195855


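^ The tuple is used as a record: the % formatting operator (commented out above) takes each item of passport as a separate field, which works like unpacking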

In [16]:
# swapping the values of variables without using a temporary variable
a = 1
b = 2
print(('a', a))
print(('b', b))
b, a = a, b
print(('a', a))
print(('b', b))

('a', 1)
('b', 2)
('a', 2)
('b', 1)


In [17]:
# prefixing an argument with a star when calling a function
t = (20, 7)
divmod(*t)

(2, 6)

In [20]:
a, b, *rest = range(5)
print((a, b, rest))

(0, 1, [2, 3, 4])


In [21]:
a, b, *rest = range(2)
print((a, b, rest))

(0, 1, [])


In [23]:
a, *middle, b = range(6)
print((a, middle, b))

(0, [1, 2, 3, 4], 5)


^ Parallel assignment: the * prefix can be applied to exactly one variable, but it can appear in any position
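
The "exactly one" rule can be checked without crashing the session by compiling the statements as strings. The helper `compiles` is a hypothetical name introduced only for this sketch.

```python
def compiles(source):
    """Return True if source is syntactically valid Python."""
    try:
        compile(source, '<demo>', 'exec')
        return True
    except SyntaxError:
        return False

# One starred target is fine, wherever it appears.
one_star_middle = compiles('a, *middle, b = range(6)')
one_star_front = compiles('*head, a, b = range(6)')

# Two starred targets in the same assignment are rejected at compile time.
two_stars = compiles('a, *b, *c = range(6)')

print(one_star_middle, one_star_front, two_stars)
```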

In [24]:
metro_areas = [
    ('Tokyo','JP',36.933,(35.689722,139.691667)),
    ('Delhi NCR', 'IN', 21.935, (28.613889, 77.208889)),
    ('Mexico City', 'MX', 20.142, (19.433333, -99.133333)),
    ('New York-Newark', 'US', 20.104, (40.808611, -74.020386)),
    ('Sao Paulo', 'BR', 19.649, (-23.547778, -46.635833)),
]

In [25]:
print('{:15} | {:^9} | {:^9}'.format('', 'lat.', 'long.'))
fmt = '{:15} | {:9.4f} | {:9.4f}'
for name, cc, pop, (latitude, longitude) in metro_areas:
    if longitude <= 0:
        print(fmt.format(name, latitude, longitude))

                |   lat.    |   long.  
Mexico City     |   19.4333 |  -99.1333
New York-Newark |   40.8086 |  -74.0204
Sao Paulo       |  -23.5478 |  -46.6358


^ Nested unpacking
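
Nested unpacking only works when the target mirrors the structure of the item. The sketch below reuses the Tokyo record from above; `bad_record`, with a made-up third coordinate, is a hypothetical value added to show what happens when the shapes differ.

```python
record = ('Tokyo', 'JP', 36.933, (35.689722, 139.691667))

# The nested target (latitude, longitude) matches the nested 2-tuple.
name, cc, pop, (latitude, longitude) = record

# Hypothetical record whose inner tuple has three items instead of two.
bad_record = ('Tokyo', 'JP', 36.933, (35.689722, 139.691667, 40.0))
try:
    name2, cc2, pop2, (lat2, lon2) = bad_record
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)  # the inner tuple has too many values

print(name, latitude, longitude)
print(mismatch_error)
```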