# [Chapter 1. Data Structures and Algorithms](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html)

Python provides a variety of useful built-in data structures, such as lists, sets, and dictionaries.  
For the most part, the use of these structures is straightforward.  
However, common questions concerning searching, sorting, ordering, and filtering often arise.  
Thus, the goal of this chapter is to discuss common data structures and algorithms involving data.  
In addition, treatment is given to the various data structures contained in the `collections` module.

## [Unpacking a Sequence into Separate Variables](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#iterableunpack)

### Problem

You have an $n$-element tuple or sequence that you would like to unpack into a collection of $n$ variables.

### Solution

Any sequence (or iterable) can be unpacked into variables using a simple assignment operation.  
The only requirement is that the number of variables and structure match the sequence.  
For example:

In [1]:
p = (4, 5)
x, y = p
print("x: {}".format(x))
print("y: {}".format(y))
print() 

data = [ 'ACME', 50, 91.1, (2012, 12, 21) ]
name, shares, price, date = data
print("name: {}".format(name))
print("date: {}".format(date))
print()

name, shares, price, (year, month, day) = data
print("name: {}".format(name))
print("year: {}".format(year))
print("month: {}".format(month))
print("day: {}".format(day))

x: 4
y: 5

name: ACME
date: (2012, 12, 21)

name: ACME
year: 2012
month: 12
day: 21


If there is a mismatch in the number of elements, you'll get an error.  
For example:
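
As a minimal sketch of the mismatch case (the name `error_message` is introduced here only for illustration), unpacking a two-element tuple into three variables raises a `ValueError`:

```python
p = (4, 5)

try:
    x, y, z = p          # three targets, but only two values
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)
```

The exact wording of the message varies between Python versions, so it is safer to catch the exception type than to match on the text.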

### Discussion

Unpacking actually works with any object that happens to be iterable, not just tuples or lists.  
This includes strings, files, iterators, and generators.  
For example:

In [2]:
s = 'Hello'
a, b, c, d, e = s
print("a: {}".format(a))
print("e: {}".format(e))

a: H
e: o


When unpacking, you may sometimes want to discard certain values.  
Python has no special syntax for this, but you can often just pick a throwaway variable name for it.  
For example:

In [3]:
data = [ 'ACME', 50, 91.1, (2012, 12, 21) ]
_, shares, price, _ = data
print("shares: {}".format(shares))
print("price: {}".format(price))

shares: 50
price: 91.1


Just make sure that the variable name you pick isn't being used for something else already.

## [Unpacking Elements from Iterables of Arbitrary Length](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_unpacking_elements_from_iterables_of_arbitrary_length)

### Problem

You need to unpack $n$ elements from an iterable, but the iterable may be longer than $n$ elements, causing a "too many values to unpack" exception.

### Solution

Python "star expressions" can be used to address this problem.  
For example, suppose you run a course and decide at the end of the semester that you're going to drop the first and last homework grades, and only average the rest of them.  
If there are only four assignments, maybe you simply unpack all four, but what if there are 24?  
A star expression makes it easy:

In [4]:
def drop_first_last(grades):
    first, *middle, last = grades
    return sum(middle) / len(middle)  # assumes at least three grades

As another use case, suppose you have user records that consist of a name and email address, followed by an arbitrary number of phone numbers.  
You could unpack the records like this:

In [5]:
user_record = ('Dave', 'dave@example.com', '773-555-1212', '847-555-1212')
name, email, *phone_numbers = user_record
print("name: {}".format(name))
print("email: {}".format(email))
print("phone_numbers: {}".format(phone_numbers))

name: Dave
email: dave@example.com
phone_numbers: ['773-555-1212', '847-555-1212']


It’s worth noting that the `phone_numbers` variable will always be a list, regardless of how many phone numbers are unpacked (including none).  
Thus, any code that uses `phone_numbers` won’t have to account for the possibility that it might not be a list or perform any kind of additional type checking.  
The starred variable can also be the first one in the list.  
For example, say you have a sequence of values representing your company’s sales figures for the last eight quarters.  
If you want to see how the most recent quarter stacks up to the average of the first seven, you could do something like this:

In [6]:
sales_record = [10, 8, 7, 1, 9, 5, 10, 3]
*trailing_qtrs, current_qtr = sales_record
trailing_avg = sum(trailing_qtrs) / len(trailing_qtrs)

print("trailing_qtrs: {}".format(trailing_qtrs))
print("current_qtr: {}".format(current_qtr))
print("trailing_avg: {}".format(trailing_avg))

trailing_qtrs: [10, 8, 7, 1, 9, 5, 10]
current_qtr: 3
trailing_avg: 7.142857142857143


### Discussion

Extended iterable unpacking is tailor-made for unpacking iterables of unknown or arbitrary length.  
Oftentimes, these iterables have some known component or pattern in their construction (e.g. "everything after element 1 is a phone number"), and star unpacking lets the developer leverage those patterns easily instead of performing acrobatics to get at the relevant elements in the iterable.  
It is worth noting that the star syntax can be especially useful when iterating over a sequence of tuples of varying length.  
For example, perhaps a sequence of tagged tuples:

In [7]:
records = [
    ('foo', 1, 2),
    ('bar', 'hello'),
    ('foo', 3, 4)
]

def do_foo(x, y):
    print('foo', x, y)
    
def do_bar(s):
    print('bar', s)
    
for tag, *args in records:
    if tag == 'foo':
        do_foo(*args)
    elif tag == 'bar':
        do_bar(*args)
    else:
        print("Something went wrong. Try again.")

foo 1 2
bar hello
foo 3 4


Star unpacking can also be useful when combined with certain kinds of string processing operations, such as splitting.  
For example:

In [8]:
line = 'nobody:*:-2:-2:Unprivileged User:/var/empty:/usr/bin/false'
uname, *fields, homedir, sh = line.split(':')
print("uname: {}".format(uname))
print("fields: {}".format(fields))
print("homedir: {}".format(homedir))
print("sh: {}".format(sh))

uname: nobody
fields: ['*', '-2', '-2', 'Unprivileged User']
homedir: /var/empty
sh: /usr/bin/false


Sometimes you might want to unpack values and throw them away.  
You can't just specify a bare `*` when unpacking, but you could use a common throwaway variable name, such as `_` or `ign` (ignore).  
For example:

In [9]:
record = ('ACME', 50, 123.45, (12, 18, 2017))
name, *_, (*_, year) = record
print("name: {}".format(name))
print("year: {}".format(year))

name: ACME
year: 2017


There is a certain similarity between star unpacking and list-processing features of various functional languages.  
For example, if you have a list, you can easily split it into head and tail components like this:

In [10]:
items = [1, 10, 7, 4, 5, 9]
head, *tail = items
print("head: {}".format(head))
print("tail: {}".format(tail))

head: 1
tail: [10, 7, 4, 5, 9]


One could imagine writing functions that perform such splitting in order to carry out some kind of clever recursive algorithm, like this:

In [11]:
def add_up(items):
    head, *tail = items
    return head + add_up(tail) if tail else head

add_up(items)

36

However, be aware that recursion really isn't a strong Python feature due to the [inherent recursion limit](https://docs.python.org/3/library/sys.html#sys.getrecursionlimit).  
Thus, this last example might be nothing more than an academic curiosity in practice.
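
To make the limit concrete, here is a small sketch that feeds `add_up()` a list longer than the interpreter's current recursion limit. The names `too_long`, `hit_limit`, and `total` are illustrative only, and the built-in `sum()` is shown simply as the non-recursive alternative:

```python
import sys

def add_up(items):
    head, *tail = items
    return head + add_up(tail) if tail else head

# Build a list longer than the interpreter's recursion limit.
too_long = [1] * (sys.getrecursionlimit() + 100)

try:
    add_up(too_long)
    hit_limit = False
except RecursionError:
    hit_limit = True

# The built-in sum() handles the same input without recursion.
total = sum(too_long)
print(hit_limit, total == len(too_long))
```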

## [Keeping the Last N Items](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_keeping_the_last_n_items)

### Problem

You want to keep a limited history of the last few items seen during iteration or during some other kind of processing.

### Solution

Keeping a limited history is a perfect use for a [collections.deque](https://docs.python.org/3/library/collections.html#collections.deque).  
For example, the following code performs a simple text match on a sequence of lines and yields the matching line along with the previous $n$ lines of context when found:

In [12]:
from collections import deque

def search(lines, pattern, history=5):
    previous_lines = deque(maxlen=history)
    for line in lines:
        if pattern in line:
            yield line, previous_lines
        previous_lines.append(line)
        
# Example use on a file:
if __name__=='__main__':
    with open('python_ipsum.txt') as f:
        for line, prevlines in search(f, 'python', 5):
            for pline in prevlines:
                print(pline, end='')
            print(line, end='')
            print('-'*20)

Python Ipsum: Your source for Python-flavored placeholder text.
http://pythonipsum.com/
--------------------
Python Ipsum: Your source for Python-flavored placeholder text.
http://pythonipsum.com/

Lambda raspberrypi beautiful test script. Kwargs integration itertools dict reduce egg import cython.

Django integration functools unit object kwargs functools dictionary cython. Cython integration exception. Lambda integration diversity bdfl. Return integration exception self dunder. Python integration mercurial bdfl python lambda generator. Kwargs raspberrypi decorator unit cython import. Cython raspberrypi exception unit future klass exception. Python integration community. Object raspberrypi community bdfl cython import method.
--------------------

Lambda raspberrypi beautiful test script. Kwargs integration itertools dict reduce egg import cython.

Django integration functools unit object kwargs functools dictionary cython. Cython integration exception. Lambda integration diversity bd

### Discussion

When writing code to search for items, it is common to use a generator function involving `yield`, as shown in this recipe's solution.  
This decouples the process of searching from the code that uses the results.  
If you're new to generators, see ["Creating New Iteration Patterns with Generators"](http://chimera.labs.oreilly.com/books/1230000000393/ch04.html#_solution_59).  
Using `deque(maxlen=n)` creates a fixed-sized queue.  
When new items are added and the queue is full, the oldest item is automatically removed.  
For example:

In [13]:
q = deque(maxlen=3)
q.append(1)
q.append(2)
q.append(3)
print(q)
q.append(4)
print(q)
q.append(5)
print(q)

deque([1, 2, 3], maxlen=3)
deque([2, 3, 4], maxlen=3)
deque([3, 4, 5], maxlen=3)


Although you could manually perform such operations on a list (e.g., appending, deleting, and so on), the queue solution is far more elegant and runs a lot faster.  
More generally, a `deque` can be used whenever you need a simple queue structure.  
If you don't give it a maximum size, you get an unbounded queue that lets you append and pop items on either end, like this:

In [14]:
q = deque()
q.append(1)
q.append(2)
q.append(3)
print(q)
q.appendleft(4)
print(q)
q.pop()
print(q)
q.popleft()
print(q)

deque([1, 2, 3])
deque([4, 1, 2, 3])
deque([4, 1, 2])
deque([1, 2])


Adding or popping items from either end of a queue has $O(1)$ complexity.  
This is unlike a [list](https://docs.python.org/3/library/stdtypes.html#list) where inserting or removing items from the front of a list is $O(n)$.

## [Finding the Largest or Smallest N Items](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#findingthelargestorsmallest)

### Problem

You want to make a list of the largest or smallest $n$ items in a collection.

### Solution

The [heapq](https://docs.python.org/3.6/library/heapq.html) module has two functions -- `nlargest()` and `nsmallest()` -- that do exactly what you want.

In [15]:
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
print("heapq.nlargest(3, nums): {}".format(heapq.nlargest(3, nums)))
print("heapq.nsmallest(3, nums): {}".format(heapq.nsmallest(3, nums)))

heapq.nlargest(3, nums): [42, 37, 23]
heapq.nsmallest(3, nums): [-4, 1, 2]


Both functions also accept a key parameter that allows them to be used with more complicated data structures.

In [16]:
portfolio = [
   {'name': 'IBM', 'shares': 100, 'price': 91.1},
   {'name': 'AAPL', 'shares': 50, 'price': 543.22},
   {'name': 'FB', 'shares': 200, 'price': 21.09},
   {'name': 'HPQ', 'shares': 35, 'price': 31.75},
   {'name': 'YHOO', 'shares': 45, 'price': 16.35},
   {'name': 'ACME', 'shares': 75, 'price': 115.65}
]

cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price'])
print("cheap: {}".format(cheap))
expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price'])
print("expensive: {}".format(expensive))

cheap: [{'name': 'YHOO', 'shares': 45, 'price': 16.35}, {'name': 'FB', 'shares': 200, 'price': 21.09}, {'name': 'HPQ', 'shares': 35, 'price': 31.75}]
expensive: [{'name': 'AAPL', 'shares': 50, 'price': 543.22}, {'name': 'ACME', 'shares': 75, 'price': 115.65}, {'name': 'IBM', 'shares': 100, 'price': 91.1}]


### Discussion

If you are looking for the $n$ smallest or largest items and $n$ is small compared to the overall size of the collection, these functions provide superior performance.  
Underneath the covers, they work by first converting the data into a list where items are ordered as a heap.

In [17]:
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
myheap = list(nums)
print("myheap: {}".format(myheap))

myheap: [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]


In [18]:
heapq.heapify(myheap)
myheap

[-4, 2, 1, 23, 7, 2, 18, 23, 42, 37, 8]

The most important feature of a heap is that `heap[0]` is always the smallest item.  
Moreover, subsequent items can be easily found using the `heapq.heappop()` function, which pops off the first item and replaces it with the next smallest item (an operation that takes $O(\log n)$ time, where $n$ is the size of the heap).  
For example, to find the three smallest items, you would do this:

In [19]:
heapq.heappop(myheap)

-4

In [20]:
heapq.heappop(myheap)

1

In [21]:
heapq.heappop(myheap)

2

The `nlargest()` and `nsmallest()` functions are most appropriate if you are trying to find a relatively small number of items.  
If you are simply trying to find the single smallest or largest item ($n$=1), it is faster to use `min()` and `max()`.  
Similarly, if $n$ is about the same size as the collection itself, it is usually faster to sort it first and take a slice (i.e., use `sorted(items)[:n]` or `sorted(items)[-n:]`).  
It should be noted that the actual implementation of `nlargest()` and `nsmallest()` is adaptive in how it operates and will carry out some of these optimizations on your behalf (e.g., using sorting if $n$ is close to the same size as the input).  
Although it’s not necessary to use this recipe, the implementation of a heap is an interesting and worthwhile subject of study.  
This can usually be found in any decent book on algorithms and data structures.  
The documentation for the heapq module also discusses the underlying implementation details.
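
As a quick sanity check on the alternatives mentioned above, the following sketch confirms that the three approaches agree on the recipe's sample data. It only demonstrates equivalence of results, not relative speed, which depends on the input size and on $n$:

```python
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]

# n = 1: min() and max() return the same items as the heap functions.
same_min = min(nums) == heapq.nsmallest(1, nums)[0]
same_max = max(nums) == heapq.nlargest(1, nums)[0]

# Larger n: sorting and slicing gives the same result lists.
same_smallest = sorted(nums)[:3] == heapq.nsmallest(3, nums)
same_largest = sorted(nums, reverse=True)[:3] == heapq.nlargest(3, nums)

print(same_min, same_max, same_smallest, same_largest)
```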

## [Implementing a Priority Queue](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#priorityqueue)

### Problem

You want to implement a queue that sorts items by a given priority and always returns the item with the highest priority on each pop operation.

### Solution

The following class uses the `heapq` module to implement a simple priority queue:

In [22]:
import heapq

class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0
        
    def pq_push(self, item, priority):
        heapq.heappush(self._queue, (-priority, self._index, item))
        self._index += 1
        
    def pq_pop(self):
        return heapq.heappop(self._queue)[-1]

Here is how it might be used:

In [23]:
class Item:
    def __init__(self, name):
        self.name = name
        
    def __repr__(self):
        return "Item({!r})".format(self.name)
    
q = PriorityQueue()
q.pq_push(Item('foo'), 1)
q.pq_push(Item('bar'), 5)
q.pq_push(Item('spam'), 4)
q.pq_push(Item('grok'), 1)
q.pq_pop()

Item('bar')

In [24]:
q.pq_pop()

Item('spam')

In [25]:
q.pq_pop()

Item('foo')

In [26]:
q.pq_pop()

Item('grok')

Observe how the first `pq_pop()` operation returned the item with the highest priority.  
Also observe how the two items with the same priority (`foo` and `grok`) were returned in the same order in which they were inserted into the queue.

### Discussion

The core of this recipe concerns the use of the `heapq` module.  
The functions `heapq.heappush()` and `heapq.heappop()` insert and remove items from a list `_queue` in a way such that the first item in the list has the smallest priority (as discussed in “Finding the Largest or Smallest N Items”).  
The `heappop()` function always returns the "smallest" item, so that is the key to making the queue pop the correct items.  
Moreover, since the push and pop operations have `O(log N)` complexity where `N` is the number of items in the heap, they are fairly efficient even for fairly large values of `N`.

In this recipe, the queue consists of tuples of the form `(-priority, index, item)`.  
The `priority` value is negated to get the queue to sort items from highest priority to lowest priority.  
This is the opposite of the normal heap ordering which sorts from lowest to highest value.  
The role of the `index` variable is to properly order items with the same priority level.  
By keeping a constantly increasing index, the items will be sorted according to the order in which they were inserted.  
However, the index also serves an important role in making the comparison operations work for items that have the same priority level.

To elaborate on that, instances of `Item` in the example can't be ordered.  
For example:
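
A minimal sketch of the failure, using a cut-down `Item` class like the one in the solution (the flag `comparable` is only there to record the outcome):

```python
class Item:
    def __init__(self, name):
        self.name = name

a = Item('foo')
b = Item('bar')

try:
    a < b                # Item defines no ordering methods
    comparable = True
except TypeError as exc:
    comparable = False
    print("TypeError:", exc)
```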

If you make `(priority, item)` tuples, they can be compared as long as the priorities are different.  
However, if two tuples with equal priorities are compared, the comparison fails as before.  
For example:
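
The sketch below illustrates both cases with two-element tuples. The variable names `different_priorities` and `equal_priorities_ok` are illustrative only:

```python
class Item:
    def __init__(self, name):
        self.name = name

a = (1, Item('foo'))
b = (5, Item('bar'))
c = (1, Item('grok'))

# Decided by 1 < 5, so the Item instances are never compared.
different_priorities = a < b

try:
    a < c                # priorities tie, so Python falls back to Item < Item
    equal_priorities_ok = True
except TypeError:
    equal_priorities_ok = False

print(different_priorities, equal_priorities_ok)
```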

By introducing the extra index and making `(priority, index, item)` tuples, you avoid this problem entirely since no two tuples will ever have the same value for `index` (and Python never bothers to compare the remaining tuple values once the result of comparison can be determined):

In [27]:
a = (1, 0, Item('foo'))
b = (5, 1, Item('bar'))
c = (1, 2, Item('grok'))
print(a < b)
print(a < c)

True
True


If you want to use this queue for communication between threads, you need to add appropriate locking and signaling. See “Communicating Between Threads” for an example of how to do this.  
The documentation for the heapq module has further examples and discussion concerning the theory and implementation of heaps.

## [Mapping Keys to Multiple Values in a Dictionary](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#multidict)

### Problem

You want to make a dictionary that maps keys to more than one value (a so-called "multidict").

### Solution

A dictionary is a mapping where each key is mapped to a single value.  
If you want to map keys to multiple values, you need to store the multiple values in another container such as a list or set.  
For example, you might make dictionaries like this:

In [28]:
d = {
    'a' : [1, 2, 3],
    'b' : [4, 5]
}

e = {
    'a' : {1, 2, 3},
    'b' : {4, 5}
}

The choice of whether or not to use lists or sets depends on intended use.  
Use a list if you want to preserve the insertion order of the items.  
Use a set if you want to eliminate duplicates (and don’t care about the order).  
To easily construct such dictionaries, you can use `defaultdict` in the `collections` module.  
A feature of `defaultdict` is that it automatically initializes the first value so you can simply focus on adding items.  
For example:
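
A short sketch of both variants, one `defaultdict` backed by lists and one backed by sets (the sample keys and values are made up for illustration):

```python
from collections import defaultdict

d = defaultdict(list)
d['a'].append(1)
d['a'].append(2)
d['b'].append(4)

e = defaultdict(set)
e['a'].add(1)
e['a'].add(2)
e['a'].add(2)   # the duplicate is silently dropped by the set
e['b'].add(4)

print(dict(d))
print(dict(e))
```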

One caution with `defaultdict` is that it will automatically create dictionary entries for keys accessed later on (even if they aren’t currently found in the dictionary).  
If you don’t want this behavior, you might use `setdefault()` on an ordinary dictionary instead.  
For example:
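
Here is a minimal sketch of the `setdefault()` approach on a plain dictionary. The lookup of the missing key `'c'` is included only to show that no entry is created for it:

```python
d = {}   # a regular dictionary
d.setdefault('a', []).append(1)
d.setdefault('a', []).append(2)
d.setdefault('b', []).append(4)

# Unlike a defaultdict, looking up a missing key does not create an entry.
missing = d.get('c')

print(d, missing)
```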

However, many programmers find `setdefault()` to be a little unnatural -- not to mention the fact that it always creates a new instance of the initial value on each invocation (the empty list [] in the example).

### Discussion

In principle, constructing a multivalued dictionary is simple.  
However, initialization of the first value can be messy if you try to do it yourself.  
For example, you might have code that looks like this:
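
A sketch of the manual approach, assuming the input arrives as a hypothetical list of `(key, value)` tuples named `pairs`:

```python
# Hypothetical input data: (key, value) pairs to group by key.
pairs = [('a', 1), ('a', 2), ('b', 4)]

d = {}
for key, value in pairs:
    if key not in d:
        d[key] = []          # initialize the first value by hand
    d[key].append(value)

print(d)
```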

Using a `defaultdict` simply leads to much cleaner code:
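
The following sketch groups a hypothetical list of `(key, value)` tuples named `pairs`; the membership test and manual initialization disappear:

```python
from collections import defaultdict

# Hypothetical input data: (key, value) pairs to group by key.
pairs = [('a', 1), ('a', 2), ('b', 4)]

d = defaultdict(list)
for key, value in pairs:
    d[key].append(value)     # the empty list is created on first access

print(dict(d))
```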

This recipe is strongly related to the problem of grouping records together in data processing problems.  
See “Grouping Records Together Based on a Field” for an example.

## [Keeping Dictionaries in Order](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#ordereddict)

### Problem

You want to create a dictionary, and you also want to control the order of the items when iterating or serializing.

### Solution

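To control the order of items in a dictionary, you can use an `OrderedDict` from the `collections` module.  
It exactly preserves the original insertion order of the data when iterating (as of Python 3.7, the built-in `dict` makes the same guarantee, but `OrderedDict` states the intent explicitly and adds order-aware methods such as `move_to_end()`).  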
Here's how that works:

In [29]:
from collections import OrderedDict

d = OrderedDict()
d['foo'] = 1
d['bar'] = 2
d['spam'] = 3
d['grok'] = 4

# Outputs "foo 1", "bar 2", "spam 3", "grok 4"
for key in d:
    print(key, d[key])

foo 1
bar 2
spam 3
grok 4


An `OrderedDict` can be particularly useful when you want to build a mapping that you may want to later serialize or encode into a different format.  
For example, if you want to precisely control the order of fields appearing in a [JSON encoding](https://docs.python.org/3/library/json.html), first building the data in an `OrderedDict` will do the trick:

In [30]:
import json
json.dumps(d)

'{"foo": 1, "bar": 2, "spam": 3, "grok": 4}'

### Discussion

An `OrderedDict` internally maintains a doubly linked list that orders the keys according to insertion order.  
When a new item is first inserted, it is placed at the end of this list.  
Subsequent reassignment of an existing key doesn't change the order.

Be aware that the size of an `OrderedDict` is more than twice as large as a normal dictionary due to the extra linked list that's created.  
Thus, if you are going to build a data structure involving a large number of `OrderedDict` instances (like reading 100,000 lines of a CSV file into a list of `OrderedDict` instances), you would need to study the requirements of your application to determine if the benefits of using an `OrderedDict` outweighed the extra memory overhead.

## [Calculating with Dictionaries](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#dictcalc)

### Problem

You want to perform various calculations (like minimum value, maximum value, sorting, and so forth) on a dictionary of data.

### Solution

Consider a dictionary that maps stock names to prices:

In [31]:
prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75
}

In order to perform useful calculations on the dictionary contents, it is often useful to invert the keys and values of the dictionary using `zip()`.  
In the following example, we find the minimum and maximum price and stock name:

In [32]:
min_price = min(zip(prices.values(), prices.keys()))
print("min_price: {}".format(min_price))

max_price = max(zip(prices.values(), prices.keys()))
print("max_price: {}".format(max_price))

min_price: (10.75, 'FB')
max_price: (612.78, 'AAPL')


Similarly, to rank the data, use `zip()` with `sorted()`, as follows:

In [33]:
prices_sorted = sorted(zip(prices.values(), prices.keys()))
print("prices_sorted: \n{}".format(prices_sorted))

prices_sorted: 
[(10.75, 'FB'), (37.2, 'HPQ'), (45.23, 'ACME'), (205.55, 'IBM'), (612.78, 'AAPL')]


When doing these calculations, be aware that `zip()` creates an iterator that can only be consumed once.  
For example, the following code will throw an error:
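
A minimal sketch of the pitfall, reusing the `prices` dictionary from the solution (the names `prices_and_names`, `lowest`, and `highest` are illustrative):

```python
prices = {
    'ACME': 45.23,
    'AAPL': 612.78,
    'IBM': 205.55,
    'HPQ': 37.20,
    'FB': 10.75
}

prices_and_names = zip(prices.values(), prices.keys())
lowest = min(prices_and_names)          # OK: this consumes the iterator

try:
    highest = max(prices_and_names)     # the iterator is now exhausted
except ValueError as exc:
    highest = None
    print("ValueError:", exc)
```

Creating a fresh `zip()` for each reduction, as the solution does, avoids the problem.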

### Discussion

If you try to perform common data reductions on a dictionary, you will find that they only process the keys, not the values.

In [34]:
print("min(prices): {}".format(min(prices)))
print("max(prices): {}".format(max(prices)))

min(prices): AAPL
max(prices): IBM


This is probably not what you want because you're actually trying to perform a calculation involving the dictionary values.  
You might try to fix this using the `values()` method of a dictionary:

In [35]:
print("min(prices.values()): {}".format(min(prices.values())))
print("max(prices.values()): {}".format(max(prices.values())))

min(prices.values()): 10.75
max(prices.values()): 612.78


Unfortunately, this is often not exactly what you want either.  
For example, you may want to know information about the corresponding keys (for instance, which stock has the lowest price?).

You can get the key corresponding to the min or max value if you supply a key function to `min()` and `max()`.

In [36]:
min(prices, key=lambda k: prices[k])

'FB'

In [37]:
max(prices, key=lambda k: prices[k])

'AAPL'

However, to get the minimum *value*, you'll need to perform an extra lookup step:

In [38]:
min_value = prices[min(prices, key=lambda k: prices[k])]
min_value

10.75

The solution involving `zip()` solves the problem by "inverting" the dictionary into a sequence of `(value, key)` pairs.  
When performing comparisons on such tuples, the value element is compared first, followed by the key.  
This gives you exactly the behavior that you want and allows reductions and sorting to be easily performed on the dictionary contents using a single statement.  
It should be noted that in calculations involving `(value, key)` pairs, the key will be used to determine the result in instances where multiple entries happen to have the same value.  
For instance, in calculations such as `min()` and `max()`, the entry with the smallest or largest key will be returned if there happen to be duplicate values.

In [39]:
prices = { 'AAA' : 45.23, 'ZZZ' : 45.23}
print("min_price: {}".format(min(zip(prices.values(), prices.keys()))))
print("max_price: {}".format(max(zip(prices.values(), prices.keys()))))

min_price: (45.23, 'AAA')
max_price: (45.23, 'ZZZ')


## [Finding Commonalities in Two Dictionaries](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_finding_commonalities_in_two_dictionaries)

### Problem

You have two dictionaries and you want to find out what they might have in common (same keys, same values, etc.).

### Solution

Consider two dictionaries:

In [40]:
a = {
    'x': 1,
    'y': 2,
    'z': 3
}

b = {
    'w': 10,
    'x': 11,
    'y': 2
}

To find out what the two dictionaries have in common, simply perform common [set operations](https://docs.python.org/3/library/stdtypes.html#set) using the `keys()` or `items()` methods.

In [41]:
# Find keys in common:
a.keys() & b.keys()

{'x', 'y'}

In [42]:
# Find keys in a that are not in b:
a.keys() - b.keys()

{'z'}

In [43]:
# Find (key, value) pairs in common:
a.items() & b.items()

{('y', 2)}

These kinds of operations can also be used to alter or filter dictionary contents.  
For example, suppose you want to make a new dictionary with selected keys removed.  
Here is some sample code using a [dictionary comprehension](http://www.diveintopython3.net/comprehensions.html#dictionarycomprehension):

In [44]:
# Make a new dictionary with certain keys removed:
c = {key: a[key] for key in a.keys() - {'z', 'w'}}
c

{'x': 1, 'y': 2}

### Discussion

A dictionary is a mapping between a set of keys and values.  
The `keys()` method of a dictionary returns a keys-view object that exposes the keys.  
A little-known feature of keys views is that they also support common set operations such as unions, intersections, and differences.  
Thus, if you need to perform common set operations with dictionary keys, you can often just use the keys-view objects directly without first converting them into a set.

The `items()` method of a dictionary returns an items-view object consisting of `(key, value)` pairs.  
This object supports similar set operations and can be used to perform operations such as finding out which key-value pairs two dictionaries have in common.

Although similar, the `values()` method of a dictionary does not support the set operations described in this recipe.  
In part, this is due to the fact that unlike keys, the items contained in a values view aren’t guaranteed to be unique.  
This alone makes certain set operations of questionable utility.  
However, if you must perform such calculations, they can be accomplished by simply converting the values to a set first.
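
As a small sketch of that conversion, using the two dictionaries from the solution (the names `views_support_and` and `common_values` are illustrative):

```python
a = {'x': 1, 'y': 2, 'z': 3}
b = {'w': 10, 'x': 11, 'y': 2}

try:
    a.values() & b.values()      # values views do not implement set operators
    views_support_and = True
except TypeError:
    views_support_and = False

# Converting to sets first makes the intersection possible.
common_values = set(a.values()) & set(b.values())
print(views_support_and, common_values)
```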

## [Removing Duplicates from a Sequence while Maintaining Order](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_removing_duplicates_from_a_sequence_while_maintaining_order)

### Problem

You want to eliminate the duplicate values in a sequence, but preserve the order of the remaining items.

### Solution

If the values in the sequence are hashable, the problem can be easily solved using a set and a generator.

In [45]:
def deduplicate(items):
    seen = set()
    for item in items:
        if item not in seen:
            yield item
            seen.add(item)

Now let's make this function useful:

In [46]:
a = [1, 5, 2, 1, 9, 1, 5, 10]
list(deduplicate(a))

[1, 5, 2, 9, 10]

This only works if the items in the sequence are hashable.  
If you are trying to eliminate duplicates in a sequence of unhashable types (such as dicts), you can make a change to this recipe, as follows:

In [47]:
def dedupe2(items, key=None):
    seen = set()
    for item in items:
        val = item if key is None else key(item)
        if val not in seen:
            yield item
            seen.add(val)

In [48]:
list(dedupe2(a))

[1, 5, 2, 9, 10]

Here, the purpose of the `key` argument is to specify a function that converts sequence items into a hashable type for the purposes of duplicate detection.

In [49]:
a = [ {'x':1, 'y':2}, {'x':1, 'y':3}, {'x':1, 'y':2}, {'x':2, 'y':4} ]
list(dedupe2(a, key=lambda d: (d['x'], d['y'])))

[{'x': 1, 'y': 2}, {'x': 1, 'y': 3}, {'x': 2, 'y': 4}]

In [50]:
list(dedupe2(a, key=lambda d: d['x']))

[{'x': 1, 'y': 2}, {'x': 2, 'y': 4}]

This latter solution also works nicely if you want to eliminate duplicates based on the value of a single field or attribute, or a larger data structure.

### Discussion

If all you want to do is eliminate duplicates, it is often easy enough to make a set.

In [51]:
a = [1, 5, 2, 1, 9, 1, 5, 10]
a

[1, 5, 2, 1, 9, 1, 5, 10]

In [52]:
set(a)

{1, 2, 5, 9, 10}

However, this approach doesn't preserve any kind of ordering.  
So, the resulting data will be scrambled afterward.  
The solution shown avoids this.

The use of a generator function in this recipe reflects the fact that you might want the function to be extremely general purpose -- not necessarily tied directly to list processing.  
For example, if you want to read a file, eliminating duplicate lines, you could simply do this:
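
The sketch below uses an `io.StringIO` object as a stand-in for an open file so that it runs without any file on disk; that substitution, the sample text, and the name `unique_lines` are assumptions made for illustration. With a real file you would iterate over the object returned by `open()` in the same way:

```python
import io

def deduplicate(items):
    seen = set()
    for item in items:
        if item not in seen:
            yield item
            seen.add(item)

# io.StringIO stands in for a real file here; it yields lines just like
# a file object opened in text mode.
f = io.StringIO("spam\neggs\nspam\nham\neggs\n")
unique_lines = [line.rstrip('\n') for line in deduplicate(f)]
print(unique_lines)
```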

The specification of a `key` function mimics similar functionality in built-in functions such as `sorted()`, `min()`, and `max()`.

## [Naming a Slice](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_naming_a_slice)

### Problem

Your program has become an unreadable mess of hardcoded slice indices and you want to clean it up.

### Solution

Suppose you have some code that is pulling specific data fields out of a record string with fixed fields, like a flat file or similar format.

In [53]:
#         0123456789012345678901234567890123456789012345678901234567890
record = '....................100          .......513.25     ..........'
cost = int(record[20:32]) * float(record[40:48])
cost

51325.0

Instead of doing that, why not name the slices like this?

In [54]:
SHARES = slice(20, 32)
PRICE = slice(40, 48)

cost = int(record[SHARES]) * float(record[PRICE])
cost

51325.0

In the latter version, you avoid having a lot of mysterious hardcoded indices, and what you're doing becomes much clearer.

### Discussion

As a general rule, writing code with a lot of hardcoded index values leads to a readability and maintenance mess.  
For example, if you come back to the code a year later, you'll look at it and wonder what you were thinking when you wrote it.  
The solution shown is simply a way of more clearly stating what your code is actually doing.  
In general, the built-in `slice()` creates a slice object that can be used anywhere a slice is allowed.

In [55]:
items = [0, 1, 2, 3, 4, 5, 6]
a = slice(2, 4)
print(items[2:4])
print(items[a])
items[a] = [10, 11]
print(items)
del items[a]
print(items)

[2, 3]
[2, 3]
[0, 1, 10, 11, 4, 5, 6]
[0, 1, 4, 5, 6]


If you have a `slice` instance `s`, you can get more information about it by looking at its `s.start`, `s.stop`, and `s.step` attributes.

In [56]:
a = slice(10, 50, 2)
print(a.start)
print(a.stop)
print(a.step)

10
50
2


In addition, you can map a slice onto a sequence of a specific size by using its `indices(size)` method.  
This returns a tuple `(start, stop, step)` where all values have been suitably limited to fit within bounds (so you can avoid `IndexError` exceptions when indexing).

In [57]:
s = 'HelloWorld'
print(a.indices(len(s)))
for i in range(*a.indices(len(s))):
    print(s[i])

(10, 10, 2)
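
In the cell above, the slice starts at index 10, which is also the length of `s`, so the clipped range is empty and the loop prints nothing. As a small sketch with a different, assumed slice (`slice(5, 50, 2)`, which is not used elsewhere in this recipe), the same calls produce a non-empty range:

```python
s = 'HelloWorld'
b = slice(5, 50, 2)   # assumed example slice; its stop value is past the end of s

# indices() clips the slice so that it fits a sequence of length len(s)
clipped = b.indices(len(s))
print(clipped)

# Walking the clipped range by hand visits the same characters as s[b]
chars = [s[i] for i in range(*clipped)]
print(chars)
```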


## [Determining the Most Frequently Occurring Items in a Sequence](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_determining_the_most_frequently_occurring_items_in_a_sequence)

### Problem

You have a sequence of items, and you would like to determine the most frequently occurring items in the sequence.

### Solution

The [collections.Counter](https://docs.python.org/3/library/collections.html#collections.Counter) class is designed for just such a problem.  
It even comes with a handy [most_common()](https://docs.python.org/3.6/library/collections.html#collections.Counter.most_common) method that will give you the answer.  
To illustrate, let's say you have a list of words and you want to find out which words occur most often.  
Here's how you would do it:

In [58]:
words = [
   'look', 'into', 'my', 'eyes', 'look', 'into', 'my', 'eyes',
   'the', 'eyes', 'the', 'eyes', 'the', 'eyes', 'not', 'around', 'the',
   'eyes', "don't", 'look', 'around', 'the', 'eyes', 'look', 'into',
   'my', 'eyes', "you're", 'under'
]
words

['look',
 'into',
 'my',
 'eyes',
 'look',
 'into',
 'my',
 'eyes',
 'the',
 'eyes',
 'the',
 'eyes',
 'the',
 'eyes',
 'not',
 'around',
 'the',
 'eyes',
 "don't",
 'look',
 'around',
 'the',
 'eyes',
 'look',
 'into',
 'my',
 'eyes',
 "you're",
 'under']

In [59]:
from collections import Counter

word_counts = Counter(words)
top_three = word_counts.most_common(3)
print(top_three)

[('eyes', 8), ('the', 5), ('look', 4)]


### Discussion

As input, `Counter` objects can be fed any sequence of hashable input items.  
Under the covers, a `Counter` is a dictionary that maps the items to the number of occurrences.

In [60]:
word_counts['not']

1

In [61]:
word_counts['eyes']

8

If you want to increment the count manually, simply use addition:

In [62]:
morewords = ['why','are','you','not','looking','in','my','eyes']

for word in morewords:
    word_counts[word] += 1
    
word_counts['eyes']

9

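Or you could use the `update()` method, which adds the counts from `morewords` in a single call (running it after the loop above counts those words a second time):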

In [63]:
word_counts.update(morewords)

A little-known feature of `Counter` instances is that they can easily be combined using various mathematical operations.

In [64]:
a = Counter(words)
b = Counter(morewords)
print("a: {}".format(a))
print("b: {}".format(b))

a: Counter({'eyes': 8, 'the': 5, 'look': 4, 'into': 3, 'my': 3, 'around': 2, 'not': 1, "don't": 1, "you're": 1, 'under': 1})
b: Counter({'why': 1, 'are': 1, 'you': 1, 'not': 1, 'looking': 1, 'in': 1, 'my': 1, 'eyes': 1})


In [65]:
# Combine counts:
c = a + b
print("c: {}".format(c))

c: Counter({'eyes': 9, 'the': 5, 'look': 4, 'my': 4, 'into': 3, 'not': 2, 'around': 2, "don't": 1, "you're": 1, 'under': 1, 'why': 1, 'are': 1, 'you': 1, 'looking': 1, 'in': 1})


In [66]:
# Subtract counts:
d = a - b
print("d: {}".format(d))

d: Counter({'eyes': 7, 'the': 5, 'look': 4, 'into': 3, 'my': 2, 'around': 2, "don't": 1, "you're": 1, 'under': 1})
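
Note that `'not'` is missing from `d`: the `-` operator keeps only items whose resulting count is positive. The sketch below uses two made-up counters (`x` and `y` are illustrative names) to contrast the operator with the in-place `Counter.subtract()` method, which keeps zero and negative counts:

```python
from collections import Counter

x = Counter(a=3, b=1)
y = Counter(a=1, b=1, c=2)

# The - operator drops any item whose resulting count is zero or negative
diff = x - y
print(diff)

# subtract() updates a counter in place and keeps zero and negative counts
z = Counter(x)
z.subtract(y)
print(z)
```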


Needless to say, `Counter` objects are a tremendously useful tool for almost any kind of problem where you need to tabulate and count data.  
You should prefer this over manually written solutions involving dictionaries.

## [Sorting a List of Dictionaries by a Common Key](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#itemgetter)

### Problem

You have a list of dictionaries and you would like to sort the entries according to one or more of the dictionary values.

### Solution

Sorting this type of structure is easy using the [operator module's itemgetter() function](https://docs.python.org/3.6/library/operator.html#operator.itemgetter).  
Let's say that you have queried a database table to get a listing of the members on your website, and you receive the following data structure in return:

In [67]:
rows = [
    {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
    {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
    {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
    {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}
]

It's fairly easy to output these rows ordered by any of the fields common to all of the dictionaries.

In [68]:
from operator import itemgetter
import pprint

rows_by_fname = sorted(rows, key=itemgetter('fname'))
pprint.pprint(rows_by_fname)

[{'fname': 'Big', 'lname': 'Jones', 'uid': 1004},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
 {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'John', 'lname': 'Cleese', 'uid': 1001}]


In [69]:
rows_by_uid = sorted(rows, key=itemgetter('uid'))
pprint.pprint(rows_by_uid)

[{'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
 {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
 {'fname': 'Big', 'lname': 'Jones', 'uid': 1004}]


The `itemgetter()` function can also accept multiple keys.

In [70]:
rows_by_lfname = sorted(rows, key=itemgetter('lname','fname'))
pprint.pprint(rows_by_lfname)

[{'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
 {'fname': 'Big', 'lname': 'Jones', 'uid': 1004},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003}]


### Discussion

In this example, `rows` is passed to the built-in `sorted()` function, which accepts a keyword argument `key`.  
This argument is expected to be a callable that accepts a single item from rows as input and returns a value that will be used as the basis for sorting.  
The `itemgetter()` function creates just such a callable.  
The `operator.itemgetter()` function takes as arguments the lookup indices used to extract the desired values from the records in rows.  
It can be a dictionary key name, a numeric list element, or any value that can be fed to an object’s `__getitem__()` method.  
If you give multiple indices to `itemgetter()`, the callable it produces will return a tuple with all of the elements in it, and `sorted()` will order the output according to the sorted order of the tuples.  
This can be useful if you want to simultaneously sort on multiple fields (such as last and first name, as shown in the example).
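
To make the behavior of the callable concrete, here is a small sketch that applies `itemgetter()` directly to one record outside of `sorted()`. The variable names `get_uid`, `get_name`, and `second` are chosen only for this illustration.

```python
from operator import itemgetter

row = {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003}

get_uid = itemgetter('uid')
get_name = itemgetter('lname', 'fname')

print(get_uid(row))    # a single value
print(get_name(row))   # a tuple, compared element by element when sorting

# The same mechanism works with positional indices on a list or tuple
second = itemgetter(1)
print(second(['a', 'b', 'c']))
```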

The functionality of `itemgetter()` is sometimes replaced by lambda expressions.

In [71]:
rows_by_fname = sorted(rows, key=lambda r: r['fname'])
pprint.pprint(rows_by_fname)

[{'fname': 'Big', 'lname': 'Jones', 'uid': 1004},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003},
 {'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'John', 'lname': 'Cleese', 'uid': 1001}]


In [72]:
rows_by_lfname = sorted(rows, key=lambda r: (r['lname'],r['fname']))
pprint.pprint(rows_by_lfname)

[{'fname': 'David', 'lname': 'Beazley', 'uid': 1002},
 {'fname': 'John', 'lname': 'Cleese', 'uid': 1001},
 {'fname': 'Big', 'lname': 'Jones', 'uid': 1004},
 {'fname': 'Brian', 'lname': 'Jones', 'uid': 1003}]


This solution often works just fine.  
However, the solution involving `itemgetter()` typically runs a bit faster.  
Thus, you might prefer `itemgetter()` if performance is a concern.

Last, but not least, don't forget that the technique shown in this recipe can be applied to functions such as `min()` and `max()`.

In [73]:
min(rows, key=itemgetter('uid'))

{'fname': 'John', 'lname': 'Cleese', 'uid': 1001}

In [74]:
max(rows, key=itemgetter('uid'))

{'fname': 'Big', 'lname': 'Jones', 'uid': 1004}

## [Sorting Objects Without Native Comparison Support](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_sorting_objects_without_native_comparison_support)

### Problem

You want to sort objects of the same class, but they don't natively support comparison operations.

### Solution

The built-in `sorted()` function takes a `key` argument that can be passed a callable that will return some value in the object that `sorted()` will use to compare the objects.  
For example, if you have a sequence of `User` instances in your application, and you want to sort them by their `user_id` attribute, you would supply a callable that takes a `User` instance as input and returns the `user_id`.

In [75]:
class User:
    def __init__(self, user_id):
        self.user_id = user_id
    def __repr__(self):
        return "User({})".format(self.user_id)

In [76]:
users = [User(23), User(3), User(99)]
users

[User(23), User(3), User(99)]

In [77]:
sorted(users, key=lambda u: u.user_id)

[User(3), User(23), User(99)]

Instead of using `lambda`, an alternative approach is to use `operator.attrgetter()`:

In [78]:
from operator import attrgetter

sorted(users, key=attrgetter('user_id'))

[User(3), User(23), User(99)]

### Discussion

The choice between `lambda` and `attrgetter()` is largely one of personal preference.  
However, `attrgetter()` is often a bit faster and also allows multiple fields to be extracted simultaneously.  
This is analogous to the use of `operator.itemgetter()` for dictionaries (see “Sorting a List of Dictionaries by a Common Key”).  
For example, if `User` instances also had `first_name` and `last_name` attributes, you could sort on both at once by passing both names, as in `attrgetter('last_name', 'first_name')`.
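
A minimal sketch of such a multi-attribute sort follows. The `NamedUser` class and the sample names are hypothetical and exist only for this illustration; they are not the `User` class defined earlier in the recipe.

```python
from operator import attrgetter

class NamedUser:
    # Hypothetical class for illustration only
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name
    def __repr__(self):
        return "NamedUser({!r}, {!r})".format(self.first_name, self.last_name)

people = [
    NamedUser('Brian', 'Jones'),
    NamedUser('David', 'Beazley'),
    NamedUser('Big', 'Jones'),
]

# Sort by last name, then by first name within equal last names
by_name = sorted(people, key=attrgetter('last_name', 'first_name'))
print(by_name)
```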

It is also worth noting that the technique used in this recipe can be applied to functions such as `min()` and `max()`.

In [79]:
min(users, key=attrgetter('user_id'))

User(3)

In [80]:
max(users, key=attrgetter('user_id'))

User(99)

## [Grouping Records Together Based on a Field](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#groupby)

### Problem

You have a sequence of dictionaries or instances and you want to iterate over the data in groups based on the value of a particular field, such as date.

### Solution

The `itertools.groupby()` function is particularly useful for grouping data together like this.  
To illustrate, suppose you have the following list of dictionaries:

In [81]:
rows = [
    {'address': '5412 N CLARK', 'date': '07/01/2012'},
    {'address': '5148 N CLARK', 'date': '07/04/2012'},
    {'address': '5800 E 58TH', 'date': '07/02/2012'},
    {'address': '2122 N CLARK', 'date': '07/03/2012'},
    {'address': '5645 N RAVENSWOOD', 'date': '07/02/2012'},
    {'address': '1060 W ADDISON', 'date': '07/02/2012'},
    {'address': '4801 N BROADWAY', 'date': '07/01/2012'},
    {'address': '1039 W GRANVILLE', 'date': '07/04/2012'},
]

Now suppose that you want to iterate over the data in chunks grouped by date.  
To do that, first sort by the desired field (`date` in this case) and then use `itertools.groupby()`:

In [82]:
from operator import itemgetter
from itertools import groupby

# Sort by the desired field first:
rows.sort(key=itemgetter('date'))

# Iterate in groups:
for date, items in groupby(rows, key=itemgetter('date')):
    print(date)
    for i in items:
        print('   ', i)

07/01/2012
    {'address': '5412 N CLARK', 'date': '07/01/2012'}
    {'address': '4801 N BROADWAY', 'date': '07/01/2012'}
07/02/2012
    {'address': '5800 E 58TH', 'date': '07/02/2012'}
    {'address': '5645 N RAVENSWOOD', 'date': '07/02/2012'}
    {'address': '1060 W ADDISON', 'date': '07/02/2012'}
07/03/2012
    {'address': '2122 N CLARK', 'date': '07/03/2012'}
07/04/2012
    {'address': '5148 N CLARK', 'date': '07/04/2012'}
    {'address': '1039 W GRANVILLE', 'date': '07/04/2012'}


### Discussion

The `groupby()` function works by scanning a sequence and finding sequential "runs" of identical values (or values returned by the given key function).  
On each iteration, it returns the value along with an iterator that produces all of the items in a group with the same value.  
An important preliminary step is sorting the data according to the field of interest.  
Since `groupby()` only examines consecutive items, failing to sort first won’t group the records as you want.  
If your goal is to simply group the data together by dates into a large data structure that allows random access, you may have better luck using `defaultdict()` to build a multidict, as described in “Mapping Keys to Multiple Values in a Dictionary”.

In [83]:
from collections import defaultdict
rows_by_date = defaultdict(list)
for row in rows:
    rows_by_date[row['date']].append(row)
    
rows_by_date

defaultdict(list,
            {'07/01/2012': [{'address': '5412 N CLARK', 'date': '07/01/2012'},
              {'address': '4801 N BROADWAY', 'date': '07/01/2012'}],
             '07/02/2012': [{'address': '5800 E 58TH', 'date': '07/02/2012'},
              {'address': '5645 N RAVENSWOOD', 'date': '07/02/2012'},
              {'address': '1060 W ADDISON', 'date': '07/02/2012'}],
             '07/03/2012': [{'address': '2122 N CLARK', 'date': '07/03/2012'}],
             '07/04/2012': [{'address': '5148 N CLARK', 'date': '07/04/2012'},
              {'address': '1039 W GRANVILLE', 'date': '07/04/2012'}]})

This allows the records for each date to be accessed easily like this:

In [84]:
for r in rows_by_date['07/01/2012']:
    print(r)

{'address': '5412 N CLARK', 'date': '07/01/2012'}
{'address': '4801 N BROADWAY', 'date': '07/01/2012'}


For this latter example, it's not necessary to sort the records first.  
Thus, if memory is of no concern, it may be faster to do this than to first sort the records and iterate using `groupby()`.
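
To see why the sort matters when you do use `groupby()`, the following sketch runs it on a short, made-up list of date strings both before and after sorting and records the size of each group:

```python
from itertools import groupby

dates = ['07/01', '07/02', '07/01', '07/02']   # made-up, deliberately unsorted

# Without sorting, every run of equal values becomes its own group
unsorted_groups = [(k, len(list(g))) for k, g in groupby(dates)]
print(unsorted_groups)

# After sorting, equal values are adjacent and land in one group each
sorted_groups = [(k, len(list(g))) for k, g in groupby(sorted(dates))]
print(sorted_groups)
```

In the unsorted case each date shows up as two separate groups of one item, which is rarely what you want.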

## [Filtering Sequence Elements](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_filtering_sequence_elements)

### Problem

You have data inside of a sequence, and you need to extract values or reduce the sequence using some criteria.

### Solution

The easiest way to filter sequence data is to use a list comprehension.

In [85]:
mylist = [1, 4, -5, 10, -7, 2, 3, -1]
[n for n in mylist if n > 0]

[1, 4, 10, 2, 3]

In [86]:
[n for n in mylist if n < 0]

[-5, -7, -1]

One potential downside of using a list comprehension is that it might produce a large result if the original input is large.  
If this is a concern, you can use generator expressions to produce filtered values iteratively.

In [87]:
pos = (n for n in mylist if n > 0)
pos

<generator object <genexpr> at 0x1101a26d0>

In [88]:
for x in pos:
    print(x)

1
4
10
2
3


Sometimes, the filtering criteria cannot be easily expressed in a list comprehension or generator expression.  
Suppose that the filtering process involves exception handling or some other complicated detail.  
For this, put the filtering code into its own function and use the built-in `filter()` function.

In [89]:
values = ['1', '2', '-3', '-', '4', 'N/A', '5']

def is_int(val):
    try:
        x = int(val)
        return True
    except ValueError:
        return False

In [90]:
ivals = list(filter(is_int, values))
ivals

['1', '2', '-3', '4', '5']

The `filter()` function creates an iterator, so if you want to create a list of results, make sure you also use `list()` as shown.

### Discussion

[List comprehensions](https://docs.python.org/3.6/tutorial/datastructures.html#list-comprehensions) and [generator expressions](https://docs.python.org/3/reference/expressions.html#generator-expressions) are often the easiest and most straightforward ways to filter simple data.  
They also have the added power to transform the data at the same time.

In [91]:
mylist = [1, 4, -5, 10, -7, 2, 3, -1]
import math
[math.sqrt(n) for n in mylist if n > 0]

[1.0, 2.0, 3.1622776601683795, 1.4142135623730951, 1.7320508075688772]

One variation on filtering involves replacing the values that don't meet the criteria with a new value instead of discarding them.  
For example, perhaps instead of just finding positive values, you want to keep every position in the list and clip negative values to zero.  
This is easily accomplished by moving the filter criterion into a conditional expression like this:

In [92]:
clip_neg = [n if n > 0 else 0 for n in mylist]
clip_neg

[1, 4, 0, 10, 0, 2, 3, 0]

The same pattern works in the other direction, keeping negative values and replacing positive ones with zero:

In [93]:
clip_pos = [n if n < 0 else 0 for n in mylist]
clip_pos

[0, 0, -5, 0, -7, 0, 0, -1]

Another notable filtering tool is [itertools.compress()](https://docs.python.org/3/library/itertools.html#itertools.compress), which takes an iterable and an accompanying Boolean selector sequence as input.  
As output, it gives you all of the items in the iterable where the corresponding element in the selector is `True`.  
This can be useful if you're trying to apply the results of filtering one sequence to another related sequence.

Suppose you have the following two columns of data:

In [94]:
addresses = [
    '5412 N CLARK',
    '5148 N CLARK',
    '5800 E 58TH',
    '2122 N CLARK',
    '5645 N RAVENSWOOD',
    '1060 W ADDISON',
    '4801 N BROADWAY',
    '1039 W GRANVILLE',
]

counts = [ 0, 3, 10, 4, 1, 7, 6, 1]

Here's one way you can make a list of all addresses where the corresponding count value is greater than 5:

In [95]:
from itertools import compress

more5 = [n > 5 for n in counts]
more5

[False, False, True, False, False, True, True, False]

In [96]:
list(compress(addresses, more5))

['5800 E 58TH', '1060 W ADDISON', '4801 N BROADWAY']

The key here is to first create a sequence of Booleans that indicates which elements satisfy the desired condition.  
The `compress()` function then picks out the items corresponding to `True` values.  
Like `filter()`, `compress()` returns an iterator.  
Thus, you need to use `list()` to turn the results into a list if desired.
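
As a point of comparison, the same selection can be written as a comprehension over paired items. The sketch below uses short, made-up data (`streets` and `visits` are illustrative names) and shows both forms side by side:

```python
from itertools import compress

streets = ['CLARK', 'ADDISON', 'BROADWAY', 'GRANVILLE']   # made-up sample data
visits = [0, 7, 6, 1]

# compress() with a precomputed Boolean selector
via_compress = list(compress(streets, [n > 5 for n in visits]))

# The same selection written as a comprehension over paired items
via_zip = [s for s, n in zip(streets, visits) if n > 5]

print(via_compress)
print(via_zip)
```

Both `compress()` and `zip()` stop at the end of the shorter input, so a data sequence and a selector of different lengths are silently misaligned or truncated rather than reported as an error. It is worth checking that the two sequences have the same length.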

## [Extracting a Subset of a Dictionary](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_extracting_a_subset_of_a_dictionary)

### Problem

You want to make a dictionary that is a subset of another dictionary.

### Solution

This is easily accomplished using a dictionary comprehension.

In [97]:
prices = {
   'ACME': 45.23,
   'AAPL': 612.78,
   'IBM': 205.55,
   'HPQ': 37.20,
   'FB': 10.75
}

In [98]:
# Make a dictionary of all prices over 200:
p1 = { key:value for key, value in prices.items() if value > 200 }
p1

{'AAPL': 612.78, 'IBM': 205.55}

In [99]:
# Make a dictionary of tech stocks:
tech_names = { 'AAPL', 'IBM', 'HPQ', 'MSFT' }
p2 = { key:value for key, value in prices.items() if key in tech_names }
p2

{'AAPL': 612.78, 'HPQ': 37.2, 'IBM': 205.55}

### Discussion

Much of what can be accomplished with a dictionary comprehension might also be done by creating a sequence of tuples and passing them to the `dict()` function.

In [100]:
p1 = dict((key, value) for key, value in prices.items() if value > 200)
p1

{'AAPL': 612.78, 'IBM': 205.55}

However, the dictionary comprehension solution is a bit clearer and actually runs quite a bit faster (over twice as fast when tested on the `prices` dictionary used in the example).

Sometimes there are multiple ways of accomplishing the same thing.  
For instance, the second example could be rewritten as:

In [101]:
# Make a dictionary of tech stocks:
tech_names = { 'AAPL', 'IBM', 'HPQ', 'MSFT' }
p2 = { key:prices[key] for key in prices.keys() & tech_names }
p2

{'AAPL': 612.78, 'HPQ': 37.2, 'IBM': 205.55}

However, a timing study reveals that this solution is almost 1.6 times slower than the first solution.  
If performance matters, it usually pays to spend a bit of time studying it.  
See ["Profiling and Timing Your Program"](http://chimera.labs.oreilly.com/books/1230000000393/ch14.html#profiling) for specific information about timing and profiling.
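
As a rough sketch of how such a comparison might be set up with the standard `timeit` module, the block below times the dictionary comprehension against the `dict()` call. The function names and the repetition count are arbitrary choices for this illustration, and the measured times will vary with the machine and Python version, so no particular ratio should be expected.

```python
import timeit

prices = {'ACME': 45.23, 'AAPL': 612.78, 'IBM': 205.55, 'HPQ': 37.20, 'FB': 10.75}

def with_comprehension():
    return {key: value for key, value in prices.items() if value > 200}

def with_dict_call():
    return dict((key, value) for key, value in prices.items() if value > 200)

# Both approaches should build the same dictionary before timing means anything
assert with_comprehension() == with_dict_call()

# number=20000 is an arbitrary repetition count chosen for this sketch
t_comp = timeit.timeit(with_comprehension, number=20000)
t_dict = timeit.timeit(with_dict_call, number=20000)
print("comprehension: {:.3f}s".format(t_comp))
print("dict():        {:.3f}s".format(t_dict))
```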

## [Mapping Names to Sequence Elements](http://chimera.labs.oreilly.com/books/1230000000393/ch01.html#_mapping_names_to_sequence_elements)