# CHAPTER 1: Data Structures and Algorithms

Python provides a variety of useful $\text{built-in}$ data structures, such as $\text{lists}$, $\text{sets}$, and $\text{dictionaries}$. For the most part, the use of these structures is straightforward. However, common questions concerning $searching$, $sorting$, $ordering$, and $filtering$ often arise. Thus, the goal of this chapter is to discuss common data structures and algorithms involving data. In addition, treatment is given to the various data structures contained in the $\text{collections}$ module.

## 1.1. Unpacking a Sequence into Separate Variables

### Problem
You have an $\text{N}$-element tuple or sequence that you would like to unpack into a collection of $\text{N}$ variables.

### Solution
Any $\text{sequence}$ (or $\text{iterable}$) can be $unpacked$ into variables using a simple $\text{assignment}$ operation.   
The only requirement is that the number of variables and structure match the sequence.

For example:

In [1]:
p = (4, 5)

In [2]:
x, y = p

In [3]:
x

4

In [4]:
y

5

Unpacking also works with longer sequences and with nested structures:

In [5]:
data = ['LEE', 50, 90.1, (2016, 12, 12)]

In [6]:
name, shares, price, date = data

In [7]:
name

'LEE'

In [8]:
date

(2016, 12, 12)

In [9]:
name, shares, price, (year, mon, day) = data

In [10]:
name

'LEE'

In [11]:
year

2016

In [12]:
mon

12

In [13]:
day

12

If the number of elements does not match the number of variables, Python raises a $\text{ValueError}$.

For example:

In [14]:
p = (4, 5)

In [15]:
x, y, z = p

ValueError: need more than 2 values to unpack
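The mismatch can go in either direction: too few values and too many values both raise $\text{ValueError}$. The following is a minimal sketch, assuming a hypothetical helper named $\text{try_unpack_pair}$ that is not part of the recipe. The exact wording of the error message differs between Python versions, so the code only relies on the exception type:

```python
def try_unpack_pair(values):
    """Hypothetical helper: unpack exactly two values or report the failure."""
    try:
        x, y = values
    except ValueError as exc:
        # Raised for both too few and too many values
        return None, str(exc)
    return (x, y), None


ok, ok_msg = try_unpack_pair((4, 5))
too_few, few_msg = try_unpack_pair((4,))
too_many, many_msg = try_unpack_pair((4, 5, 6))

print(ok)        # the unpacked pair
print(few_msg)   # message text depends on the Python version
print(many_msg)  # message text depends on the Python version
```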

### Discussion

$Unpacking$ actually works with any object that happens to be iterable, not just tuples or lists.   
This includes $\text{strings}$, $\text{files}$, $\text{iterators}$, and $\text{generators}$.

For example:

In [16]:
s = 'Hello'

In [17]:
a, b, c, d, e = s

In [18]:
a

'H'

In [19]:
b

'e'

In [20]:
e

'o'
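The same assignment syntax applies to the other iterables mentioned above. As a small sketch (the $\text{countdown}$ generator is a made-up example, and $\text{io.StringIO}$ stands in for a real file so that the code is self-contained):

```python
import io


def countdown(n):
    """Hypothetical generator yielding n, n-1, ..., 1."""
    while n > 0:
        yield n
        n -= 1


# Unpack from a generator
three, two, one = countdown(3)

# Unpack from a file-like object: iterating over it yields lines,
# each still carrying its trailing newline
first_line, second_line = io.StringIO('alpha\nbeta\n')

# Unpack from an explicit iterator
low, high = iter([0, 99])
```

In every case the number of variables must still match the number of items the iterable produces.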

When unpacking, you may sometimes want to discard certain values.   
Python has no special syntax for this, but you can often just pick a throwaway variable name for it. 

For example:

In [21]:
data = ['LEE', 50, 90.1, (2016, 12, 12)]

In [22]:
_, shares, price, _ = data

In [23]:
shares

50

In [24]:
price

90.1

However, make sure that the variable name you pick isn’t being used for something else already.

## 1.2. Unpacking Elements from Iterables of Arbitrary Length

### Problem
You need to unpack $\text{N}$ elements from an iterable, but the iterable may be longer than $\text{N}$ elements, causing a “too many values to unpack” exception.

### Solution
Python $\text{star expressions}$ can be used to address this problem. 

For example, suppose you run a course and decide at the end of the semester that you’re going to drop the first and last homework grades, and only average the rest of them. If there are only four assignments, maybe you simply unpack all four, but what if there are 24? A star expression makes it easy:

In [1]:
from statistics import mean

def drop_first_last(grades):
    # Discard the first and last grades, then average the rest
    first, *middle, last = grades
    return mean(middle)

As another use case, suppose you have user records that consist of a name and email address, followed by an arbitrary number of phone numbers. You could unpack the records like this:

In [4]:
user_record = ('Dave', 'dave@example.com', '773-555-1212', '847-555-1212')

In [5]:
name, email, *phone_numbers = user_record

In [6]:
name

'Dave'

In [7]:
email

'dave@example.com'

In [8]:
phone_numbers

['773-555-1212', '847-555-1212']

It’s worth noting that the $\text{phone_numbers}$ variable will always be a list, regardless of how many phone numbers are unpacked (including none). Thus, any code that uses $\text{phone_numbers}$ won’t have to account for the possibility that it might not be a list or perform any kind of additional type checking.
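A quick sketch of the edge cases, using made-up records: the starred name receives an empty list when nothing is left over, it is a list even when the source is not, and the non-starred names must still be satisfied:

```python
# A record with no phone numbers at all
name, email, *phone_numbers = ('Dave', 'dave@example.com')
print(phone_numbers)   # []

# The source can be any iterable; the starred name is still a list
first, *rest = range(5)
print(rest)            # [1, 2, 3, 4]

# The non-starred names still need a value each
try:
    n, e, *p = ('Dave',)
except ValueError:
    too_short = True
else:
    too_short = False
print(too_short)       # True
```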

The starred variable can also come first in the list of variables.

For example:

In [10]:
*trailing, current = [10, 8, 7, 1, 9, 5, 10, 3]

In [11]:
trailing

[10, 8, 7, 1, 9, 5, 10]

In [12]:
current

3
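Wherever the starred variable is placed, a single unpacking can contain only one of them, since two would make the split ambiguous. The sketch below checks this with $\text{compile}$, so that the invalid statement can be shown without breaking the example itself; the error message text is not inspected because it varies between Python versions:

```python
# One starred name is fine, in any position
*trailing, current = [10, 8, 7, 1, 9, 5, 10, 3]

# Two starred names in the same unpacking are rejected at compile time
try:
    compile('a, *b, *c = [1, 2, 3, 4]', '<example>', 'exec')
except SyntaxError:
    two_stars_allowed = False
else:
    two_stars_allowed = True

print(two_stars_allowed)  # False
```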

### Discussion
Extended iterable unpacking is tailor-made for unpacking iterables of unknown or arbitrary length. Oftentimes, these iterables have some known component or pattern in their construction (e.g. “everything after element 1 is a phone number”), and star unpacking lets the developer leverage those patterns easily instead of performing acrobatics to get at the relevant elements in the iterable.

It is worth noting that the star syntax can be especially useful when iterating over a sequence of tuples of varying length. For example, perhaps a sequence of tagged tuples:

In [13]:
records = [
    ('foo', 1, 2),
    ('bar', 'hello'),
    ('foo', 3, 4),
]

In [14]:
def do_foo(x, y):
    print('foo', x, y)

In [15]:
def do_bar(s):
    print('bar', s)

In [16]:
for tag, *args in records:
    if tag == 'foo':
        do_foo(*args)
    elif tag == 'bar':
        do_bar(*args)

foo 1 2
bar hello
foo 3 4


Star unpacking can also be useful when combined with certain kinds of string processing operations, such as splitting. For example:

In [17]:
passwd = 'liheyi:x:1000:1000:liheyi,,,:/home/liheyi:/bin/bash'

In [20]:
uname, *fields, homedir, sh = passwd.split(':')

In [21]:
uname

'liheyi'

In [22]:
homedir

'/home/liheyi'

In [23]:
sh

'/bin/bash'

Sometimes you might want to unpack values and throw them away. You can’t just specify a bare * when unpacking, but you could use a common throwaway variable name, such as _ or $ign$ (ignored). For example:

In [24]:
record = ('ACME', 50, 123.45, (12, 18, 2012))

In [25]:
name, *_, (*_, year) = record

In [26]:
name

'ACME'

In [27]:
year

2012

There is a certain similarity between star unpacking and list-processing features of various functional languages. For example, if you have a list, you can easily split it into head and tail components like this:

In [28]:
items = [1, 10, 7, 4, 5, 9]

In [29]:
head, *tail = items

In [30]:
head

1

In [31]:
tail

[10, 7, 4, 5, 9]

One could imagine writing functions that perform such splitting in order to carry out some kind of clever recursive algorithm. For example:

In [32]:
def recursive_sum(items):
    # Named recursive_sum so the built-in sum() is not shadowed
    head, *tail = items
    return head + recursive_sum(tail) if tail else head

In [33]:
recursive_sum(items)

36

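However, be aware that recursion is not one of Python’s strengths: the interpreter enforces a recursion limit (1000 by default in CPython), and each call in this example also copies the remainder of the list. Thus, this last example might be nothing more than an academic curiosity in practice.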
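The limit is easy to reach with the head/tail approach. The sketch below is illustrative only: the function names $\text{recursive_sum}$ and $\text{iterative_sum}$ are chosen here to avoid shadowing the built-in $\text{sum}$, and the input is sized from $\text{sys.getrecursionlimit()}$ so that it exceeds whatever limit is in effect:

```python
import sys


def recursive_sum(items):
    """Head/tail recursion: one stack frame per element."""
    head, *tail = items
    return head + recursive_sum(tail) if tail else head


def iterative_sum(items):
    """Plain loop: constant stack depth regardless of input size."""
    total = 0
    for item in items:
        total += item
    return total


# A list slightly longer than the current recursion limit
big = list(range(sys.getrecursionlimit() + 100))

try:
    recursive_sum(big)
except RecursionError:
    hit_limit = True
else:
    hit_limit = False

total = iterative_sum(big)

print(hit_limit)  # True
```

For real code, a loop or the built-in $\text{sum}$ is the practical choice.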

## 1.3. Keeping the Last N Items

### Problem
You want to keep a limited history of the last few items seen during iteration or during some other kind of processing.

### Solution
Keeping a limited history is a perfect use for a $\text{collections.deque}$. For example, the following code performs a simple text match on a sequence of lines and yields the matching line along with the previous $\text{N}$ lines of context when found: