# An array of Sequences

## Introduction

Before working on Python, Guido van Rossum was a contributor to the ABC language, so it is no surprise that Python inherited ABC's uniform handling of sequences: strings, lists, byte sequences, arrays and many more.

Understanding the variety of sequences available in Python saves us from reinventing the wheel. Let's take a quick overview of the built-in sequences:

 - Container sequences: can hold items of different types, e.g. `list`, `tuple`, `collections.deque`
 - Flat sequences: hold items of one simple type, e.g. `str`, `bytes`, `array.array`
 
A container sequence holds references to the objects it contains; on the other hand, a flat sequence stores the values of its contents in its own memory space. Flat sequences are therefore more compact, but they are limited to holding primitive machine values like bytes, integers and floats.

Every Python object in memory has a header with metadata. The simplest Python object, a `float`, has a value field and two metadata fields:
 - ob_refcnt: the object's reference count
 - ob_type: a pointer to the object's type
 - ob_fval: a C double holding the value
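
To make the difference in compactness concrete, the sketch below compares a `list` of floats with an `array.array` holding the same values. It relies on `sys.getsizeof`, whose results are specific to CPython and vary by version and platform, so the printed numbers are only indicative; the variable names are illustrative.

```python
import sys
from array import array

values = [float(i) for i in range(1000)]
packed = array("d", values)  # "d" is the typecode for C doubles

# A list stores one pointer per item, and every float is a separate object
# with its own header (reference count and type pointer).
list_total = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# An array stores the raw C doubles side by side in a single buffer.
array_total = sys.getsizeof(packed)

print(f"list plus float objects: {list_total} bytes")
print(f"array of doubles:        {array_total} bytes")
print(f"bytes per array item:    {packed.itemsize}")
```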
 

Another way to group sequences in Python is by mutability: mutable versus immutable. Despite the evident difference, both kinds are still subclasses of `Sequence`.

In [1]:
from collections import abc

print(issubclass(tuple, abc.Sequence))
print(issubclass(list, abc.Sequence))

True
True


But `list` in particular is a subclass of `MutableSequence` as well, while `tuple` is not:

In [2]:
print(issubclass(tuple, abc.MutableSequence))
print(issubclass(list, abc.MutableSequence))

False
True


This chapter assumes that we are already familiar with sequences, especially lists, so we can dive directly into list comprehensions and generator expressions.

### List comprehensions

A quick way to build a sequence is a list comprehension, as long as the target is actually a list; otherwise a generator expression may be a better choice. For brevity, Python developers often refer to list comprehensions as _listcomps_ and to generator expressions as _genexps_, and we will stick to that too.

#### List comprehension and readability

The main goal of a list comprehension is always to build a list. Use it when a new list needs to be created from another iterable, not merely to iterate over data and display it.

The `for` loop below shows a typical use case: creating a new list from an existing sequence.

In [3]:
# Create a list of unicode code points from a string

symbols = "ü$%ç"
codes = []

for char in symbols:
    codes.append(ord(char))

print(codes)

[252, 36, 37, 231]


Although the example above works, it can be reduced to one line with a _listcomp_, which also makes it easier to read.

In [4]:
codes = [ord(char) for char in "ü$%ç"]
print(codes)

[252, 36, 37, 231]


Of course, the readability of a list comprehension depends on the developer too: it is possible to write huge, incomprehensible chunks of code. Remember, if you are not going to use the created list for something, you should not use this syntax. Try to keep _listcomps_ short; if the goal is to create a list but a lot of logic is involved, it is better to use a classic `for` loop to be more explicit.

**Remember:** the loop variables of a _listcomp_ have a local scope, which means they are not accessible after the comprehension finishes. Since the addition of the walrus operator `:=` in Python 3.8, a name assigned with `:=` inside a comprehension lives in the enclosing scope: it is rebound on every iteration and keeps the value of the last one afterwards. The same applies to generator expressions, set comprehensions and dict comprehensions.

In [5]:
[x := int(i) for i in "123456789"]
print(x)

9
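
The scoping rule is easier to see side by side. The following sketch, with illustrative variable names, shows that a loop variable does not leak out of a _listcomp_ while a name bound with `:=` does:

```python
x = "outer value"

# The loop variable `x` is local to the comprehension: the outer `x` survives.
codes = [ord(x) for x in "ABC"]
print(x)      # outer value
print(codes)  # [65, 66, 67]

# A name bound with `:=` belongs to the enclosing scope instead.
codes = [last := ord(char) for char in "ABC"]
print(last)   # 67
```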


#### _Listcomps_ vs filter and map

People used to believe that `filter` and `map` were faster than an equivalent _listcomp_, but this is not true: the difference is negligible. _Listcomps_ are also easier to read and write, since you don't have to deal with `lambda`.
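
As a minimal illustration of the readability argument, here is the same list built both ways. The sample string and variable names are made up for this sketch:

```python
symbols = "$¢£¥€¤"

# Listcomp: the transformation and the condition read left to right.
beyond_ascii = [ord(s) for s in symbols if ord(s) > 127]

# Same result with map and filter, which needs a lambda for the condition.
beyond_ascii_mf = list(filter(lambda code: code > 127, map(ord, symbols)))

print(beyond_ascii)     # [162, 163, 165, 8364, 164]
print(beyond_ascii_mf)  # [162, 163, 165, 8364, 164]
```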

_Listcomps_ can iterate over more than one iterable; whether that is a good idea depends on the requirements of your implementation and on keeping the code readable.

In [6]:
colors = ["black", "white"]
sizes = ["s", "m", "l"]

shirts = [(color, size) for color in colors for size in sizes]

for shirt in shirts:
    print(*shirt)

black s
black m
black l
white s
white m
white l


The _listcomp_ above groups the shirts by color, because `colors` is the outermost `for` clause. To group them by size instead, we need to swap the order of the `for` clauses.

In [7]:
shirts = [(color, size) for size in sizes for color in colors]

for shirt in shirts:
    print(*shirt)

black s
white s
black m
white m
black l
white l
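
For Cartesian products like these, the standard library function `itertools.product` is an alternative worth knowing. The sketch below is not part of the original example; it only checks that `product` yields the same pairs as the two _listcomps_:

```python
from itertools import product

colors = ["black", "white"]
sizes = ["s", "m", "l"]

# The first iterable passed to product() plays the role of the outer loop.
by_color = list(product(colors, sizes))
by_size = [(color, size) for size, color in product(sizes, colors)]

print(by_color == [(c, s) for c in colors for s in sizes])  # True
print(by_size == [(c, s) for s in sizes for c in colors])   # True
```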


### Generator expressions

To initialize tuples, arrays and other sequences you could start from a _listcomp_, but a _genexp_ saves memory: instead of building the whole list up front, it yields the items one by one, keeping only the iteration state between items.

Conveniently, the syntax is the same as for _listcomps_, but using `()` rather than `[]`.


In [8]:
codes = tuple(ord(char) for char in "ü$%ç")
print(codes)

(252, 36, 37, 231)


Note that if a _genexp_ is the only argument in a function call, the extra parentheses around it are not necessary.
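
A short sketch of the parentheses rule, using `sum` for the single-argument case and the `array` constructor for a call that takes a second argument:

```python
from array import array

# Sole argument: no extra parentheses needed around the genexp.
total = sum(n * n for n in range(5))
print(total)  # 30

# With another argument in the call, the genexp needs its own parentheses.
codes = array("I", (ord(char) for char in "ü$%ç"))
print(codes)  # array('I', [252, 36, 37, 231])
```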

Let's see how the previous shirt combinations example works with a _genexp_:

In [9]:
for shirt in (f"{color} {size}" for size in sizes for color in colors):
    print(shirt)

black s
white s
black m
white m
black l
white l


#### When to use each one?

You may want to use a _listcomp_ if:
 - The values are finite and not too many
 - You will need to use the list after its creation
 - The transformation of the items is simple
 
You may want to use a _genexp_ if:
 - There are many values, or infinitely many
 - You don't need to keep the results after the iteration
 - The transformation of the items is simple
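
The "infinite values" case can be sketched with `itertools.count`, which never stops producing numbers. A _listcomp_ over it would never finish, while a _genexp_ only computes what is actually consumed:

```python
from itertools import count, islice

# An endless stream of squares; nothing is computed until it is consumed.
squares = (n * n for n in count(1))

# Take only the first five items from the stream.
first_five = list(islice(squares, 5))
print(first_five)  # [1, 4, 9, 16, 25]

# The genexp remembers its position: the next item is 6 * 6.
print(next(squares))  # 36
```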

### Tuples are not just immutable lists

Tuples are another built-in data type that works as an immutable sequence. They can be used as immutable lists, yes, but also as records with no field names. Let's start with the latter.

#### Tuples as records

Tuples hold items, each item in the tuple holds the data for one field, and the position gives its meaning.

If you think of a tuple as an immutable list, the number and order of the items may or may not be relevant, depending on the context. On the other hand, when a tuple is used as a record, the number of items is usually fixed and their order is always important.

See how, in the following example, the meaning of each value in a record depends on its position, and so does the order of the variables that receive the unpacked values:

In [10]:
coordinates = (-39.62871, 92.33189)

city, year, pop, chg, area = ("Tokyo", 2003, 32_450, 0.66, 8014)
traveler_ids = [("USA", "223311"), ("MX", "223300"),]

for id_ in traveler_ids:
    print(*id_)

USA 223311
MX 223300


#### Tuples as immutable lists

Whether to use a tuple rather than a list depends on the context. Sometimes the data in a list stays the same throughout the whole application; in that case you might want to use a tuple, since it provides two main benefits:
 - Clarity: when you see a tuple in the code, you know its length will never change
 - Performance: a `tuple` uses less memory than a `list` of the same length, and it allows Python to do some optimizations
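
The performance claim can be checked informally. The sketch below uses `sys.getsizeof`, so the exact sizes are CPython implementation details that change between versions; the identity check on `tuple(...)` is likewise CPython behavior rather than a language guarantee:

```python
import sys

items_list = list(range(10))
items_tuple = tuple(items_list)

# The tuple is smaller than the list holding the same ten references.
print(sys.getsizeof(items_list), sys.getsizeof(items_tuple))

# Building a tuple from a tuple returns the very same object, no copy needed...
same = tuple(items_tuple) is items_tuple
# ...whereas building a list from a list always creates a new object.
copied = list(items_list) is not items_list
print(same, copied)
```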

#### "Mutable" tuples

Keep in mind that the immutability of a `tuple` only applies to the references contained in it. References cannot be deleted or replaced, but they can point to a mutable object; if that object changes, the value of the `tuple` changes too.

In [11]:
a = (9, "tuple", [5, 6])
b = (9, "tuple", [5, 6])

a == b

True

The snippet above shows that two tuples holding equal items compare equal, even though each tuple refers to its own list object.

In [12]:
b[-1].append(0)

a == b

False

Tuples with mutable items can be a source of bugs. Immutability will also matter in the following chapters: an object is only hashable if its value can never change, so a `tuple` is hashable only when all of its items are hashable too. An unhashable object cannot be used as a `dict` key or as a `set` element.

Let's create a function to determine if an object is hashable.

In [13]:
def is_hashable(o):
    try:
        hash(o)
    except TypeError:
        return False
    return True

immutable_elements = (1, "strings are immutable!", (2, 3))
elements = (1, "strings are immutable!", [2, 3])

print("immutable_elements is hashable? ", is_hashable(immutable_elements))
print("elements is hashable? ", is_hashable(elements))

immutable_elements is hashable?  True
elements is hashable?  False


#### Unpacking

Unpacking is a technique that avoids the unnecessary use of indexes to extract elements from sequences. It also works with any iterable object as the data source, including iterators, which don't support index notation (`[]`).

The most visible form of unpacking is parallel assignment, that is, assigning items from an iterable to a tuple of variables, as you can see in the example below:

In [14]:
coordinates = (127.212, -102.12)
lat, lon = coordinates
print(f"{lat=}")
print(f"{lon=}")

lat=127.212
lon=-102.12
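
Parallel assignment also gives an elegant way to swap two variables without a temporary one. A minimal sketch:

```python
a, b = 1, 2

# The right-hand side is evaluated first, then both names are rebound.
a, b = b, a
print(a, b)  # 2 1
```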


But that is not all unpacking can do: it is also useful for passing the items of a tuple as separate arguments to a function. For example, if we need to print both coordinates on the same row, your first reflex might be:

In [15]:
print(coordinates[0], coordinates[1])

127.212 -102.12


This works, of course, but Python has another trick to unpack all the values at once, which is handy when the tuple has many elements and retrieving them by index gets cumbersome: prefixing the argument with `*`.

In [16]:
print(*coordinates)

127.212 -102.12


Unpacking is also handy when a function returns several values and you want to store them in separate variables:

In [17]:
quotient, remain = divmod(20, 8)

print(quotient, remain)

2 4


#### Using `*` in variadic functions

Defining a function's parameters with `*args` will grab arbitrary excess arguments in a `tuple` called `args` available in your function's scope.
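
A small sketch of `*args` in a function definition, together with `*` in a call. The function `summarize` is a made-up name used only for illustration:

```python
def summarize(first, second, *args):
    # `args` is always a tuple, empty when there are no excess arguments
    return first, second, args

print(summarize(1, 2))        # (1, 2, ())
print(summarize(1, 2, 3, 4))  # (1, 2, (3, 4))

# `*` in the call unpacks an iterable into positional arguments
print(summarize(*range(5)))   # (0, 1, (2, 3, 4))
```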

In Python 3, this idea was extended to apply to parallel assignment as well:

In [18]:
one, two, *rest = range(1, 5)

print(one)
print(two)
print(rest)

1
2
[3, 4]


With this notation, the `*` prefix can be applied to exactly one variable, but that variable can appear in any position.

In [19]:
one, *middle, five, six = range(1, 7)

print(one)
print(middle)
print(five)
print(six)

1
[2, 3, 4]
5
6


The `*` prefix is also useful when building a new `list`, `tuple` or `set`.

In [20]:
*range(4), 4

(0, 1, 2, 3, 4)

In [21]:
[1, *range(2, 10), 10]

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [22]:
{1, *range(10)}

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

#### Nested unpacking

Unpacking is smart enough to let us pick the important pieces of data out of big, nested iterables, as long as the target of the assignment has the same nesting structure as the data.

In [23]:
metro_areas = [
    ("Tokyo", "JP", 36.933, (35.68, 139.6916)),
    ("Delhi", "IN", 21.935, (28.61, 77.20)),
    ("Mexico City", "MX", 20.142, (19.43, -99.13)),
]

print(f'{"":15} | {"latitude":>9} | {"longitude":>9}')
for name, _, _, (lat, lon) in metro_areas:
    if lon <= 0:
        print(f'{name:15} | {lat:9.4f} | {lon:9.4f}')

                |  latitude | longitude
Mexico City     |   19.4300 |  -99.1300


If the iterable being unpacked contains a single record (e.g., the result of a SQL query with a `LIMIT 1` clause), you can unpack it like this:

In [24]:
from random import randint

def latest_user_id():
    return (randint(999, 10000), )

[last] = latest_user_id()
print(last)

3199


If that record has only one field, you can get the field directly with nested unpacking:

In [25]:
from random import randint

def latest_user_id():
    return ((randint(999, 10000), ), )

[[last]] = latest_user_id()
print(last)

4267


### Pattern matching with sequences

The most visible new feature in Python 3.10 is pattern matching with the `match/case` statement.

Imagine you are receiving messages structured as a sequence of words and values describing the action you need to perform. Writing this code as a chain of `if/elif/else` statements might be inconvenient; this is the first application of pattern matching.

In [26]:
import time

class Led:
    def __init__(self):
        self.brightness = 0
        self.color = "white"

    def set_brightness(self, val):
        self.brightness = val

    def set_color(self, r, g, b):
        self.color = r, g, b


class Robot:
    
    def __init__(self):
        self.leds = {k: Led() for k in range(1, 6)}
        self.neck_angle = 0
        
    def beep(self, times, freq):
        for _ in range(times):
            print("BEEP!")
            time.sleep(freq)

    def rotate_neck(self, angle):
        self.neck_angle += angle
        print(f"NECK ANGLE IS NOW: {self.neck_angle}")

    def adjust_bright_of_led(self, led, intensity):
        self.leds[led].set_brightness(intensity)
        print(f"LED NUMBER {led} INTENSITY IS NOW {self.leds[led].brightness}")
        
    def adjust_color_of_led(self, led, r, g, b):
        self.leds[led].set_color(r, g, b)
        print(f"LED NUMBER {led} COLOR IS NOW RGB{self.leds[led].color}")

    def handle_command(self, command):
        match command:
            case ["BEEP", times, freq]:
                self.beep(times, freq)
            case ["NECK", angle]:
                self.rotate_neck(angle)
            case ["LED", ident, intensity]:
                self.adjust_bright_of_led(ident, intensity)
            case ["LED", ident, red, green, blue]:
                self.adjust_color_of_led(ident, red, green, blue)
            case _:
                print("ERROR: Invalid command")

The example above shows a simple robot class with a few commands. Its `handle_command()` method receives a command and executes it. Note how pattern matching accepts any sequence as the subject, and a `case` clause matches only if the subject has the structure defined in its pattern.


In [27]:
rob = Robot()

message = ["BEEP", 3, 1]
rob.handle_command(message)

BEEP!
BEEP!
BEEP!


In [28]:
message = ["NECK", 180]
rob.handle_command(message)

NECK ANGLE IS NOW: 180


In [29]:
message = ["LED", 2, 100]
rob.handle_command(message)

LED NUMBER 2 INTENSITY IS NOW 100


In [30]:
message = ["LED", 4, 255, 255, 255]
rob.handle_command(message)

LED NUMBER 4 COLOR IS NOW RGB(255, 255, 255)


In [31]:
message = ["LASER", "info"]
rob.handle_command(message)

ERROR: Invalid command


This only scratches the surface of what pattern matching can do. Destructuring, a more advanced form of unpacking, is a new word in the Python vocabulary, but it is common in languages that support pattern matching, like Elixir or Scala.

Let's rewrite our previous example, now using a slightly more complex pattern.

In [32]:
metro_areas = [
    ("Tokyo", "JP", 36.933, (35.68, 139.6916)),
    ("Delhi", "IN", 21.935, (28.61, 77.20)),
    ("Mexico City", "MX", 20.142, (19.43, -99.13)),
]

print(f'{"":15} | {"latitude":>9} | {"longitude":>9}')
for record in metro_areas:
    match record:
        case [name, _, _, (lat, lon)] if lon >= 0:
            print(f'{name:15} | {lat:9.4f} | {lon:9.4f}')

                |  latitude | longitude
Tokyo           |   35.6800 |  139.6916
Delhi           |   28.6100 |   77.2000


This example has two parts: the `case` pattern and an optional `if` guard. We can add more `case` clauses and let the guards decide which one runs.

In [33]:
print(f'{"Hemisphere":12} | {"City":15} | {"latitude":>9} | {"longitude":>9}')
for record in metro_areas:
    match record:
        # Longitude tells east from west: zero or negative means Western
        case [name, _, _, (lat, lon)] if lon <= 0:
            print(f'{"Western":12} | {name:15} | {lat:9.4f} | {lon:9.4f}')
        case [name, _, _, (lat, lon)] if lon > 0:
            print(f'{"Eastern":12} | {name:15} | {lat:9.4f} | {lon:9.4f}')

Hemisphere   | City            |  latitude | longitude
Eastern      | Tokyo           |   35.6800 |  139.6916
Eastern      | Delhi           |   28.6100 |   77.2000
Western      | Mexico City     |   19.4300 |  -99.1300


In general terms, a sequence pattern matches if:
 - The subject is a sequence _and_;
 - The subject and the pattern have the same length _and_;
 - Each corresponding item matches, including nested items
 
Sequence patterns may be written as tuples or lists, or any combination of nested tuples and lists; it makes no difference which syntax you use.

There are many more tricks. For example, if you want a more general pattern, such as matching any sequence that starts with a string and ends with a nested sequence of two floats, you can write:

```py
case [str(name), *_, (float(lat), float(lon))]
```

The `*_` matches any number of items, without binding them to a variable. Using `*extra` instead would bind the items to `extra` as a list with `0` or more items.

The optional guard clause starting with `if` is evaluated only if the pattern matches, and can reference variables bound in the pattern.

### Quick recap of lists

Lists are useful for a wide range of tasks, so here is a quick recap of the operations you can perform with them.

In [34]:
# Lists can store any type of data
my_list = ["See you space cowboy", 3.142592, ["A list", "inside a list!"]]

# Slicing gives you access to a portion of the list
print("A)", my_list[1:3]) # start:end:step, all of them are optional

# As you might know, negative indexes are valid in Python, also in slices; a step of -1 reverses the list
print("B)", my_list[::-1])

# Python has a built-in function named `slice()` to create named, reusable slices
BEBOP_MESSAGE = slice(0, 1) # start, end, step
print("C)", my_list[BEBOP_MESSAGE])
# REMEMBER! "end" is exclusive; out-of-range bounds in a slice are clamped and never raise IndexError

# You can concatenate lists as well, which returns a new list instead of altering the original
concat_list = my_list + [3+1j, (20, 77)]
print("D)", concat_list)

# Multiplying by an integer "n" concatenates n shallow copies of the list
times_list = my_list[1:3] * 2
print("E)", times_list)

# Of course, augmented assignment is available through the += and *= operators
# Subtraction and division are not supported

# To sort a list you can use the built-in function `sorted()` or the method `list.sort()`
# The first one creates a new sorted list from the original
# The second one sorts the original list in place and returns None

numbers = [8, 2, 5, 1, 4, 7, 2, 3]
sorted_list = sorted(numbers)
print("F)", sorted_list, "<- New list")
print("F)", numbers, "<- Original list")

result_of_list_dot_sort = numbers.sort()
print("G)", result_of_list_dot_sort, "<- Result of method")
print("G)", numbers, "<- Original list")

# Both accept an optional `key` keyword argument: a one-argument callable that
# produces the sort key for each item, useful for more complex data structures

A) [3.142592, ['A list', 'inside a list!']]
B) [['A list', 'inside a list!'], 3.142592, 'See you space cowboy']
C) ['See you space cowboy']
D) ['See you space cowboy', 3.142592, ['A list', 'inside a list!'], (3+1j), (20, 77)]
E) [3.142592, ['A list', 'inside a list!'], 3.142592, ['A list', 'inside a list!']]
F) [1, 2, 2, 3, 4, 5, 7, 8] <- New list
F) [8, 2, 5, 1, 4, 7, 2, 3] <- Original list
G) None <- Result of method
G) [1, 2, 2, 3, 4, 5, 7, 8] <- Original list
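
The `key` argument mentioned in the last comment deserves a tiny example of its own. The list of fruits is made up for this sketch; note that Python's sort is stable, so items with equal keys keep their original relative order, even with `reverse=True`:

```python
fruits = ["grape", "raspberry", "apple", "banana"]

# Sort by length; "grape" stays before "apple" because the sort is stable
by_length = sorted(fruits, key=len)
print(by_length)  # ['grape', 'apple', 'banana', 'raspberry']

# Longest first; equal-length items still keep their original order
by_length_desc = sorted(fruits, key=len, reverse=True)
print(by_length_desc)  # ['raspberry', 'banana', 'grape', 'apple']

# The original list is untouched by sorted()
print(fruits)  # ['grape', 'raspberry', 'apple', 'banana']
```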


### Arrays

As we said in the introduction, container sequences such as lists and tuples do not store the values of their items, only references to them. This can be wasteful when it comes to storing a large number of items of the same type. That is where arrays are better: since an `array.array` stores only the packed values, it is lighter than a list or a tuple, but it won't accept items of different types, since all of them must be of the type declared when the array is created.

Python arrays are as lean as C arrays. The constructor takes as its first argument a typecode, a letter that indicates the underlying C type used to store each item.
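
A minimal sketch of what the typecode implies in practice: the array reports its typecode and item size, and rejects values that do not fit the declared C type. The item size printed depends on the platform:

```python
from array import array

ints = array("i", [1, 2, 3])  # "i" is the typecode for a signed C int
print(ints.typecode, ints.itemsize)

try:
    ints.append(2.5)  # a float cannot be stored as a C int
    rejected = False
except TypeError:
    rejected = True

print(rejected)  # True
print(ints)      # array('i', [1, 2, 3])
```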

In [35]:
from array import array
from random import random

# Note the genexp as second argument: any iterable of numbers works there
arr = array("d", (random() for _ in range(10**5)))
print(arr[-1])

0.911721898087911


Arrays are sequences too, which means they have many of the methods that lists offer, plus a few more. Let's see how to save an array to a binary file and load it back using the built-in methods `tofile()` and `fromfile()`.

Note that the last element of the array loaded from the file in the cell below is equal to the last element of the original array.

In [36]:
# `with` closes the file even if writing or reading fails
with open("numbers.bin", "wb") as fp:
    arr.tofile(fp)

arr2 = array("d") # Note the typecode is the same
with open("numbers.bin", "rb") as fp:
    arr2.fromfile(fp, 10**5)
print(arr2[-1])

0.911721898087911


If you still have doubts, let's compare both arrays

In [37]:
arr == arr2

True

### Deques

As you might have guessed, a list offers _LIFO_ (stack) behavior when you use `append()` and `pop()`, and it can emulate a _FIFO_ queue with `append()` and `pop(0)`. However, inserting and deleting at the head of a list is expensive, since the whole list has to be shifted in memory.

This is why Python has the class `collections.deque`, a thread-safe, double-ended queue designed for fast inserting and removing at both ends. It is quite useful to store things like a list of recently viewed items: when a bounded deque reaches its maximum capacity, adding a new item discards an item from the opposite end.
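
Before looking at the deque API in more detail, this small sketch contrasts stack-style use of a list with queue-style use of a deque. The variable names are illustrative:

```python
from collections import deque

stack = [1, 2, 3]
stack.append(4)
last_in = stack.pop()       # removes from the right end: last in, first out
print(last_in, stack)       # 4 [1, 2, 3]

queue = deque([1, 2, 3])
queue.append(4)
first_in = queue.popleft()  # removes from the left end without shifting items
print(first_in, queue)      # 1 deque([2, 3, 4])
```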

In [38]:
from collections import deque

q = deque(range(10), maxlen=10) # Note again, the first argument can be any iterable

Rotating with `n > 0` takes n items from the right end and prepends them to the left; when `n < 0` n items are taken from left and appended to the right

In [39]:
q.rotate(3)
print(q)
q.rotate(-3)
print(q)

deque([7, 8, 9, 0, 1, 2, 3, 4, 5, 6], maxlen=10)
deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], maxlen=10)


Note how, since we set the optional parameter `maxlen`, extending a full deque with n items on the right discards the first n items from the left to make room for them.

In [40]:
q.extend([10, 11, 12])
print(q)

deque([3, 4, 5, 6, 7, 8, 9, 10, 11, 12], maxlen=10)


There is another method that does pretty much the same at the "beginning" of the sequence: `extendleft()`. Note that it prepends the items one by one, so they end up in reverse order.

In [41]:
q.extendleft([-2, -1, 0])
print(q)

deque([0, -1, -2, 3, 4, 5, 6, 7, 8, 9], maxlen=10)
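
The "recently viewed items" use case mentioned at the start of this section can be sketched with a bounded deque. The page names and the capacity of 3 are assumptions made for this example:

```python
from collections import deque

recent = deque(maxlen=3)

for page in ["home", "search", "cart", "checkout"]:
    recent.append(page)  # once full, the oldest page is discarded from the left

print(recent)  # deque(['search', 'cart', 'checkout'], maxlen=3)
```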


## Conclusion

Mastering the standard library sequence types is key to writing concise, effective and idiomatic Python code.

Sequences are often categorized as mutable or immutable, but it is also useful to think of them as flat or container sequences. Flat sequences are faster, more compact and easier to use, but they are limited to storing atomic data; on the other hand, container sequences are more flexible. Remember to be careful when storing mutable objects inside immutable containers.

List comprehensions and generator expressions are powerful one-liners to build and initialize sequences.

Tuples play two roles, as records and as immutable lists, and both have their uses and advantages. Using them as records allows you to extract data by unpacking them; beyond tuples, `*` works with any iterable.

Choosing one sequence type or another will always depend on the requirements of your implementation; no sequence type is a silver bullet for every problem in Python.