<h1>Chapter 2 — An Array of Sequences </h1>

<h2>Built-in Sequences:</h2>

Python has several built-in ways to store ordered collections of items, called sequences.

Container sequences (like list, tuple, deque): They can hold items of different types (e.g., a list can have numbers, strings, and even other lists). They store references to these objects.<br>
Flat sequences (like str, bytes, bytearray, memoryview, array.array): They are more specialized and can only hold items of a single, primitive type (like characters, bytes, or numbers). They store the actual values directly in memory, making them more memory-efficient for their specific purpose.

Sequences are also categorized by whether you can change them after they're created:<br>
Mutable sequences (like list, bytearray, array.array, deque, memoryview): You can add, remove, or modify elements in these sequences.<br>
Immutable sequences (like tuple, str, bytes): Once created, you cannot change the elements within these sequences.
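
The two classifications above can be checked with a short experiment. This is a minimal sketch using only the standard library; the variable names (mixed, floats, point) are illustrative and not part of the original notes.

```python
from array import array

# Container sequence: holds references to objects of any type.
mixed = [1, 'two', [3.0, 4.0]]

# Flat sequence: array('d') stores only C doubles, so a str is rejected.
floats = array('d', [1.0, 2.0, 3.0])
try:
    floats.append('four')
    flat_rejected = False
except TypeError:
    flat_rejected = True

# Mutable sequence: item assignment works.
mixed[0] = 100

# Immutable sequence: item assignment raises TypeError.
point = (1, 2)
try:
    point[0] = 100
    tuple_rejected = False
except TypeError:
    tuple_rejected = True

print(flat_rejected, tuple_rejected, mixed)
```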

<h2>List Comprehensions and Generator Expressions</h2>

In Python code, line breaks are ignored inside pairs of [], {} or ().
So you can build multi-line lists, listcomps, genexps, dictionaries etc.
without using the ugly \ line continuation escape.
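
As a small illustration of that rule, the sketch below splits a listcomp and a genexp over several lines without any backslash. The names squares_of_evens and total are made up for this example.

```python
# Line breaks inside [] and () are ignored, so no backslash is needed.
squares_of_evens = [
    n * n
    for n in range(10)
    if n % 2 == 0
]

total = sum(
    n
    for n in squares_of_evens
)

print(squares_of_evens, total)
```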

<h3>Example 2-1. Build a list of Unicode codepoints from a string</h3>

In [1]:
symbols = '$¢£¥€¤'
codes = []

for symbol in symbols:
    codes.append(ord(symbol))  # ord() takes a single character (a string of length 1)

codes

[36, 162, 163, 165, 8364, 164]

List comprehensions are a concise and often more readable way to create new lists based on existing sequences or other iterable things.<br>
Instead of writing a traditional for loop to build a list element by element, you can do it in a single, more expressive line.

In [11]:
cubes = []
for x in range(10):
    if x % 2 == 0:
        cubes.append(x**3)
print("Using for loop:", cubes)

Using for loop: [0, 8, 64, 216, 512]


<h3>Example 2-2. Build a list of Unicode codepoints from a string, using a listcomp</h3>

In [2]:
symbols = '$¢£¥€¤'

codes = [ord(symbol) for symbol in symbols]

codes

[36, 162, 163, 165, 8364, 164]

In [15]:
cube = [x ** 3 for x in range(10) if x % 2 == 0]
print("Using listcomp:", cube)

Using listcomp: [0, 8, 64, 216, 512]


<h3>Listcomps No Longer Leak Their Variables</h3>

In [3]:
x = 'ABC'
codes = [ord(x) for x in x]
x

'ABC'

In [5]:
codes

[65, 66, 67]

In [4]:
codes = [last := ord(c) for c in x]
last

67

In last := ord(c), the walrus operator does two things at once:<br>
It assigns the value of the expression on its right (ord(c)) to the variable on its left (last).<br>
It evaluates to the assigned value, which becomes the element added to the codes list in the current iteration of the comprehension.<br>
Unlike the loop variable c, last remains accessible after the comprehension finishes, which is why the cell above can read it.
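
The difference in scope between the loop variable and the walrus target can be seen in a short sketch. The names text, char and char_leaked are chosen for this example only.

```python
text = 'ABC'
codes = [last := ord(char) for char in text]

print(codes)  # all code points
print(last)   # the walrus target survives: code point of the final character

# The loop variable of the comprehension does not leak into this scope.
try:
    char
    char_leaked = True
except NameError:
    char_leaked = False

print(char_leaked)
```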

<h3>Example 2-3. The same list built by a listcomp and a map/filter composition</h3>

This section argues that list comprehensions are a more Pythonic and often more readable alternative to using the built-in map() and filter() functions, especially when combined with lambda functions.

In [6]:
symbols = '$¢£¥€¤'
beyond_ascii = [ord(s) for s in symbols if ord(s) > 127]
beyond_ascii

[162, 163, 165, 8364, 164]

In [7]:
beyond_ascii = list(filter(lambda c: c > 127, map(ord, symbols)))
beyond_ascii

[162, 163, 165, 8364, 164]

<h3>Example 2-4. Cartesian product using a list comprehension</h3>

In [8]:
colors = ['black', 'white']
sizes = ['S', 'M', 'L']
tshirts = [(color, size) for color in colors for size in sizes]
tshirts

[('black', 'S'),
 ('black', 'M'),
 ('black', 'L'),
 ('white', 'S'),
 ('white', 'M'),
 ('white', 'L')]

In [16]:
for color in colors:
    for size in sizes:
        print((color, size))

('black', 'S')
('black', 'M')
('black', 'L')
('white', 'S')
('white', 'M')
('white', 'L')


In [19]:
tshirts = [(color, size) for size in sizes
                        for color in colors]
tshirts

[('black', 'S'),
 ('white', 'S'),
 ('black', 'M'),
 ('white', 'M'),
 ('black', 'L'),
 ('white', 'L')]

<h3>Generator Expressions</h3>
Generator expressions have the same syntax as list comprehensions, but instead of being enclosed in square brackets [], they are enclosed in parentheses ().

To initialize tuples, arrays and other types of sequences, you could also start from a
listcomp but a genexp saves memory because it yields items one by one using the iterator
protocol instead of building a whole list just to feed another constructor.

In [20]:
symbols = '$¢£¥€¤'
tuple(ord(symbol) for symbol in symbols)

(36, 162, 163, 165, 8364, 164)

In [21]:
import array

array.array('I', (ord(symbol) for symbol in symbols))

array('I', [36, 162, 163, 165, 8364, 164])

In [22]:
colors = ['black', 'white']
sizes = ['S', 'M', 'L']

for tshirt in ('%s %s' % (c, s) for c in colors for s in sizes):
    print(tshirt)

black S
black M
black L
white S
white M
white L


The generator expression yields items one by one; a list with all 6 t-shirt
variations is never produced in this example.
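
One way to observe this laziness is to pull items from the generator by hand with next(). This is a sketch that reuses the colors and sizes lists from above; the names first, second and remaining are illustrative.

```python
colors = ['black', 'white']
sizes = ['S', 'M', 'L']

# Nothing is computed yet: the genexp only describes how to produce items.
tshirts = (f'{c} {s}' for c in colors for s in sizes)

first = next(tshirts)       # computes only the first item
second = next(tshirts)      # computes only the second item
remaining = list(tshirts)   # consumes whatever is left

print(first, second, remaining)

# A generator can be consumed only once; it is now exhausted.
print(list(tshirts))
```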


<h2>Tuples are not just immutable lists</h2>

Tuples do double-duty: they can be used as immutable lists and also
as records with no field names.

<h3>Tuples as records</h3>

Tuples hold records: each item in the tuple holds the data for one field and the position
of the item gives its meaning.


In [23]:
lax_coordinates = (33.9425, -118.408056)
city, year, pop, chg, area = ('Tokyo', 2003, 32_450, 0.66, 8014)
traveler_ids = [('USA', '31195855'), ('BRA', 'CE342567'), ('ESP', 'XDA205856')]

for passport in sorted(traveler_ids):
    print('%s/%s' % passport)

BRA/CE342567
ESP/XDA205856
USA/31195855


In [24]:
for country, _ in traveler_ids:
    print(country)

USA
BRA
ESP


In [25]:
a = (10, 'alpha', [1, 2])
b = (10, 'alpha', [1, 2])
a == b

True

In [26]:
b[-1].append(99)
a == b

False

In [27]:
b

(10, 'alpha', [1, 2, 99])

<h3>Unpacking sequences and iterables</h3>

<h4>Tuple Unpacking</h4>

In [28]:
lax_coordinates = (33.9425, -118.408056)
latitude, longitude = lax_coordinates  # unpacking
latitude

33.9425

In [29]:
longitude

-118.408056

In [30]:
divmod(20, 8)

(2, 4)

Another example of tuple unpacking is prefixing an argument with a star when calling
a function.

In [31]:
t = (20, 8)
divmod(*t)


(2, 4)

In [32]:
quotient, remainder = divmod(*t)
quotient, remainder


(2, 4)

The os.path.split() function builds a tuple (path, last_part) from a filesystem path.


In [33]:
import os

_, filename = os.path.split('/home/luciano/.ssh/id_rsa.pub')
filename

'id_rsa.pub'

<h3>Using * to grab excess items</h3>

In [34]:
a, b, *rest = range(5)
a, b, rest

(0, 1, [2, 3, 4])

In [35]:
a, b, *rest = range(3)
a, b, rest

(0, 1, [2])

In [36]:
a, b, *rest = range(2)
a, b, rest

(0, 1, [])

In [37]:
*head, b, c, d = range(5)
head, b, c, d

([0, 1], 2, 3, 4)

<h3> Nested unpacking </h3>

<h3>Example 2-8. Unpacking nested tuples to access the longitude</h3>

In [38]:
metro_areas = [
 ('Tokyo', 'JP', 36.933, (35.689722, 139.691667)),
 ('Delhi NCR', 'IN', 21.935, (28.613889, 77.208889)),
 ('Mexico City', 'MX', 20.142, (19.433333, -99.133333)),
 ('New York-Newark', 'US', 20.104, (40.808611, -74.020386)),
 ('Sao Paulo', 'BR', 19.649, (-23.547778, -46.635833)),
]
print('{:15} | {:^9} | {:^9}'.format('', 'lat.', 'long.'))
fmt = '{:15} | {:9.4f} | {:9.4f}'
for name, cc, pop, (latitude, longitude) in metro_areas:
    if longitude <= 0:
        print(fmt.format(name, latitude, longitude))

                |   lat.    |   long.  
Mexico City     |   19.4333 |  -99.1333
New York-Newark |   40.8086 |  -74.0204
Sao Paulo       |  -23.5478 |  -46.6358


Each tuple holds a record with four fields, the last of which is a coordinate pair.<br>
By assigning the last field to a nested tuple of names, we unpack the coordinates.<br>
The test if longitude <= 0: limits the output to metropolitan areas in the Western hemisphere.


<h2>Slicing</h2>

A common feature of list, tuple, str and all other sequence types in Python is support for slicing operations.

In [1]:
l = [10, 20, 30, 40, 50, 60]

l[:2]  # split at 2

[10, 20]

In [2]:
l[2:]

[30, 40, 50, 60]

In [3]:
l[:3]

[10, 20, 30]

<h3>Slice Objects</h3>

In [4]:
s = 'bicycle'
s[::3]

'bye'

The stride can also be negative, returning items in reverse.

In [5]:
s[::-1]

'elcycib'

In [6]:
s[::-2]

'eccb'

Instead of filling your code with hard-coded slices, you can name them. See how readable this
makes the for loop at the end of the example:

In [7]:
invoice = """
0.....6.................................40........52...55........
1909  Pimoroni PiBrella                     $17.50    3    $52.50
1489  6mm Tactile Switch x20                 $4.95    2    $9.90
1510  Panavise Jr. - PV-201                 $28.00    1    $28.00
1601  PiTFT Mini Kit 320x240                $34.95    1    $34.95
"""

SKU = slice(0, 6)
DESCRIPTION = slice(6, 40)
UNIT_PRICE = slice(40, 52)
QUANTITY = slice(52, 55)
ITEM_TOTAL = slice(55, None)

line_items = invoice.split('\n')[2:]

for item in line_items:
    print(item[UNIT_PRICE], item[DESCRIPTION])

    $17.50   Pimoroni PiBrella                 
     $4.95   6mm Tactile Switch x20            
    $28.00   Panavise Jr. - PV-201             
    $34.95   PiTFT Mini Kit 320x240            
 


<h2>Assigning to Slices</h2>

In [8]:
l = list(range(10))
l

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [9]:
l[2:5] = [20, 30]
l

[0, 1, 20, 30, 5, 6, 7, 8, 9]

In [10]:
del l[5:7]
l

[0, 1, 20, 30, 5, 8, 9]

<h2>Using + and * with Sequences</h2>

In [11]:
l = [1, 2, 3]
l * 5

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]

In [12]:
5 * 'abcd'

'abcdabcdabcdabcdabcd'

<h3> Building lists of lists</h3>

In [13]:
board = [['_'] * 3 for i in range(3)]
board

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]

In [14]:
board[1][2] = 'X'
board

[['_', '_', '_'], ['_', '_', 'X'], ['_', '_', '_']]

The cells above create a list of 3 lists of 3 items each and inspect the structure. <br>
 They then place a mark in row 1, column 2 and check the result.

In [15]:
weird_board = [['_'] * 3] * 3
weird_board

[['_', '_', '_'], ['_', '_', '_'], ['_', '_', '_']]

In [16]:
weird_board[1][2] = 'O'
weird_board

[['_', '_', 'O'], ['_', '_', 'O'], ['_', '_', 'O']]

The outer list is made of three references to the same inner list. While it is
 unchanged, all seems right. <br>
 Placing a mark in row 1, column 2 reveals that all rows are aliases referring to
 the same object.
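
The aliasing can be confirmed by counting distinct row objects. This is a small sketch; distinct_rows and aliased_rows are names introduced here for illustration.

```python
board = [['_'] * 3 for _ in range(3)]   # a new inner list per iteration
weird_board = [['_'] * 3] * 3           # one inner list, referenced 3 times

# Count how many different objects each outer list refers to.
distinct_rows = len({id(row) for row in board})
aliased_rows = len({id(row) for row in weird_board})
print(distinct_rows, aliased_rows)

# All rows of weird_board are the very same object.
print(weird_board[0] is weird_board[1] is weird_board[2])
```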

<h2>A += Assignment Puzzler</h2>

In [17]:
t = (1, 2, [30, 40])
try:
    t[2] += [50, 60]
except TypeError as e:
    print(repr(e))

TypeError("'tuple' object does not support item assignment")


In [18]:
t

(1, 2, [30, 40, 50, 60])
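
The result above, an exception and a changed tuple item, makes sense once the augmented assignment is spelled out in steps. The sketch below is an approximation of what the interpreter does, not its exact bytecode; the names item, error and u are illustrative.

```python
t = (1, 2, [30, 40])

# Roughly what t[2] += [50, 60] does:
item = t[2]            # 1. fetch the list stored in the tuple
item += [50, 60]       # 2. extend that list in place (this succeeds)
try:
    t[2] = item        # 3. store the result back into the tuple (this fails)
    error = None
except TypeError as exc:
    error = exc

print(repr(error))
print(t)               # the list inside the tuple was already extended

# Calling extend on the item avoids the failing store step altogether.
u = (1, 2, [30, 40])
u[2].extend([50, 60])
print(u)
```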

<h2>list.sort and the sorted Built-In Function</h2>

The list.sort method sorts a list in-place, that is, without making a copy. It returns
 None to remind us that it changes the target object, and does not create a new list. This
 is an important Python API convention: functions or methods that change an object
 in-place should return None to make it clear to the caller that the object itself was
 changed, and no new object was created. 
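
The convention is easy to verify by capturing the return values. This is a minimal sketch; result_sorted, snapshot and result_sort are names made up for this example.

```python
fruits = ['grape', 'raspberry', 'apple', 'banana']

result_sorted = sorted(fruits)  # returns a new list
snapshot = list(fruits)         # copy taken to show fruits is still unchanged

result_sort = fruits.sort()     # sorts in place and returns None

print(result_sorted)
print(snapshot)
print(result_sort)
print(fruits)
```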

In [20]:
fruits = ['grape', 'raspberry', 'apple', 'banana']
sorted(fruits)

['apple', 'banana', 'grape', 'raspberry']

In [21]:
fruits

['grape', 'raspberry', 'apple', 'banana']

In [22]:
sorted(fruits, reverse=True)

['raspberry', 'grape', 'banana', 'apple']

In [23]:
sorted(fruits, key=len)

['grape', 'apple', 'banana', 'raspberry']

In [24]:
sorted(fruits, key=len, reverse=True)

['raspberry', 'banana', 'grape', 'apple']

In [25]:
fruits

['grape', 'raspberry', 'apple', 'banana']

In [26]:
fruits.sort()
fruits

['apple', 'banana', 'grape', 'raspberry']

<h2>NumPy</h2>

 Basic operations with rows and columns in a numpy.ndarray

In [27]:
import numpy as np

a = np.arange(12)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

In [28]:
type(a)

numpy.ndarray

In [29]:
a.shape

(12,)

In [30]:
a.shape = 3, 4
a

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [31]:
a[2]

array([ 8,  9, 10, 11])

In [32]:
a[2, 1]

np.int32(9)

In [34]:
a[:, 1]  # get the column at index 1

array([1, 5, 9])

In [35]:
a.transpose()

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

<h2>Deques and Other Queues</h2>

<h3>Working with a deque</h3>

In [36]:
import collections

dq = collections.deque(range(10), maxlen=10)
dq

deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], maxlen=10)

In [37]:
dq.rotate(3)
dq

deque([7, 8, 9, 0, 1, 2, 3, 4, 5, 6], maxlen=10)

Rotating with n > 0 takes items from the right end and prepends them to the
 left; when n < 0, items are taken from the left and appended to the right.
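
The same behavior can be reproduced with pops and appends. The helper rotate_by_hand below is hypothetical, written only to illustrate the description; it assumes a non-empty deque.

```python
from collections import deque


def rotate_by_hand(dq, n):
    """Hypothetical helper: mimic deque.rotate(n) one step at a time."""
    for _ in range(abs(n)):
        if n > 0:
            dq.appendleft(dq.pop())    # move one item from right to left
        else:
            dq.append(dq.popleft())    # move one item from left to right


builtin = deque(range(10))
manual = deque(range(10))

builtin.rotate(3)
rotate_by_hand(manual, 3)
print(builtin == manual, manual)

builtin.rotate(-4)
rotate_by_hand(manual, -4)
print(builtin == manual, manual)
```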

In [38]:
dq.rotate(-4)
dq

deque([1, 2, 3, 4, 5, 6, 7, 8, 9, 0], maxlen=10)

In [39]:
dq.appendleft(-1)
dq

deque([-1, 1, 2, 3, 4, 5, 6, 7, 8, 9], maxlen=10)

In [40]:
dq.extend([11, 22, 33])
dq

deque([3, 4, 5, 6, 7, 8, 9, 11, 22, 33], maxlen=10)

In [41]:
dq.extendleft([10, 20, 30, 40])
dq

deque([40, 30, 20, 10, 3, 4, 5, 6, 7, 8], maxlen=10)

Note that extendleft(iter) works by appending each successive item of the
 iter argument to the left of the deque, therefore the final position of the items
 is reversed.
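
A sketch of that equivalence, using small deques without maxlen to keep the result easy to read. The names by_hand and ordered are illustrative; passing reversed(...) is one possible way to keep the original order, not something the notes prescribe.

```python
from collections import deque

dq = deque([3, 4, 5])
dq.extendleft([10, 20, 30])

# The same effect, one appendleft call per item.
by_hand = deque([3, 4, 5])
for item in [10, 20, 30]:
    by_hand.appendleft(item)

print(dq, dq == by_hand)

# To keep the new items in their original order, reverse them first.
ordered = deque([3, 4, 5])
ordered.extendleft(reversed([10, 20, 30]))
print(ordered)
```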

<h2>Soapbox</h2>

<h3>Mixed bag lists</h3>

In [43]:
l = [28, 14, '28', 5, '9', '1', 0, 6, '23', 19]

In [44]:
try:
    sorted(l)
except TypeError as e:
    print(repr(e))

TypeError("'<' not supported between instances of 'str' and 'int'")


<h3>Key is Brilliant</h3>

In [45]:
l = [28, 14, '28', 5, '9', '1', 0, 6, '23', 19]

sorted(l, key=int)

[0, '1', 5, 6, '9', 14, 19, '23', 28, '28']

In [47]:
sorted(l, key=str)

[0, '1', 14, 19, '23', 28, '28', 5, 6, '9']