# Chapter 10: Maps, Hash Tables, and Skip Lists

## Chapter 10.1: Maps and Dictionaries

Python's `dict` class is arguably the most significant data structure in the language. It represents an abstraction known as a dictionary in which unique keys are mapped to associated values. We note that the keys are assumed to be unique, but the values are not necessarily unique.

**Common applications of maps include the following.**

* A university's information system relies on some form of a student ID as a key that is mapped to that student's associated record serving as the value.
* The domain-name system (DNS) maps a host name, such as `www.wiley.com`, to an Internet-Protocol (IP) address, such as `208.215.179.146`.
* A social media site typically relies on a (nonnumeric) username as a key that can be efficiently mapped to a particular user's associated information.
* A computer graphics system may map a color name, such as `turquoise`, to the triple of numbers that describes the color's RGB representation, such as `(64, 224, 208)`.
* Python uses a dictionary to represent each namespace, mapping an identifying string, such as `pi`, to an associated object, such as `3.14159`.

In this chapter and the next we demonstrate that a map may be implemented so that a search for a key, and its associated value, can be performed very efficiently, thereby supporting fast lookup in such applications.

### 10.1.1 The Map ADT

We begin by listing what we consider the five most significant behaviors of a map `M`:

* `M[k]`: Return the value `v` associated with key `k` in map `M`, if one exists; otherwise raise a `KeyError`. In Python, this is implemented with the special method `__getitem__`.
* `M[k] = v`: Associate value `v` with key `k` in map `M`, replacing the existing value if the map already contains an item with key equal to `k`. In Python, this is implemented with the special method `__setitem__`.
* `del M[k]`: Remove from map `M` the item with key equal to `k`; if `M` has no such item, then raise a `KeyError`. In Python, this is implemented with the special method `__delitem__`.
* `len(M)`: Return the number of items in map `M`. In Python, this is implemented with the special method `__len__`.
* `iter(M)`: The default iteration for a map generates a sequence of keys in the map. In Python, this is implemented with the special method `__iter__`, and it allows loops of the form `for k in M`.
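
As a quick illustration, the five core behaviors can be exercised on Python's built-in `dict`. The map name `rgb` and its entries are made up for this sketch; the comments note which special method each statement triggers.

```python
# A small map from color names to RGB triples (illustrative data only).
rgb = {}
rgb['turquoise'] = (64, 224, 208)   # M[k] = v calls __setitem__
rgb['black'] = (1, 1, 1)
rgb['black'] = (0, 0, 0)            # same key again: the old value is replaced

value = rgb['turquoise']            # M[k] calls __getitem__
size_before = len(rgb)              # len(M) calls __len__

del rgb['black']                    # del M[k] calls __delitem__
keys = [k for k in rgb]             # iteration calls __iter__ and yields keys

try:
    rgb['magenta']                  # looking up a missing key raises KeyError
    missing_raised = False
except KeyError:
    missing_raised = True

print(value, size_before, keys, missing_raised)
```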

We have highlighted the above five behaviors because they demonstrate the core functionality of a map: namely, the ability to query, add, modify, or delete a key-value pair, and the ability to report all such pairs. For additional convenience, map `M` should also support the following behaviors:

* `k in M`: Return `True` if the map contains an item with key `k`. In Python, this is implemented with the special `__contains__` method.
* `M.get(k, d=None)`: Return `M[k]` if key `k` exists in the map; otherwise return default value `d`. This provides a form to query `M[k]` without risk of a `KeyError`.
* `M.setdefault(k, d)`: If key `k` exists in the map, simply return `M[k]`; if key `k` does not exist, set `M[k] = d` and return that value.
* `M.pop(k, d=None)`: Remove the item associated with key `k` from the map and return its associated value `v`. If key `k` is not in the map, return default value `d` (or raise `KeyError` if parameter `d` is `None`).
* `M.popitem()`: Remove an arbitrary key-value pair from the map, and return a `(k, v)` tuple representing the removed pair. If the map is empty, raise a `KeyError`.
* `M.clear()`: Remove all key-value pairs from the map.
* `M.keys()`: Return a set-like view of all keys of `M`.
* `M.values()`: Return a view of all values of `M` (unlike the key and item views, it is not set-like, since values may repeat).
* `M.items()`: Return a set-like view of `(k, v)` tuples for all entries of `M`.
* `M.update(M2)`: Assign `M[k] = v` for every `(k, v)` pair in map `M2`.
* `M == M2`: Return `True` if maps `M` and `M2` have identical key-value associations.
* `M != M2`: Return `True` if maps `M` and `M2` do not have identical key-value associations.
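
One way to see why the five core behaviors are enough is that the convenience behaviors can be derived from them. The sketch below assumes a hypothetical class `ListMap`, invented here for illustration, that stores its pairs in an unsorted list and defines only the five special methods. It inherits everything else from the standard library's `collections.abc.MutableMapping`. It is a minimal sketch with linear-time lookups, not an efficient implementation.

```python
from collections.abc import MutableMapping


class ListMap(MutableMapping):
    """Hypothetical map that keeps (key, value) pairs in an unsorted list."""

    def __init__(self):
        self._pairs = []

    def __getitem__(self, k):
        for key, value in self._pairs:
            if key == k:
                return value
        raise KeyError(k)

    def __setitem__(self, k, v):
        for i, (key, _) in enumerate(self._pairs):
            if key == k:
                self._pairs[i] = (k, v)   # key present: replace its value
                return
        self._pairs.append((k, v))        # key absent: add a new pair

    def __delitem__(self, k):
        for i, (key, _) in enumerate(self._pairs):
            if key == k:
                del self._pairs[i]
                return
        raise KeyError(k)

    def __len__(self):
        return len(self._pairs)

    def __iter__(self):
        for key, _ in self._pairs:
            yield key


m = ListMap()
m['a'] = 1
m.update({'b': 2})                  # inherited: assigns m['b'] = 2
got = m.get('z', 0)                 # inherited: 'z' is absent, so the default is returned
m.setdefault('c', 3)                # inherited: 'c' is absent, so m['c'] = 3
popped = m.pop('a')                 # inherited: removes 'a' and returns its value
has_b = 'b' in m                    # inherited __contains__
same = (m == {'b': 2, 'c': 3})      # inherited __eq__ compares key-value associations
print(got, popped, has_b, same, sorted(m.items()))
```

None of the methods called in the usage lines are defined in `ListMap` itself; each one is built by `MutableMapping` on top of the five special methods.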

### 10.1.2 Application: Counting Word Frequencies

In [13]:
text_sample = "ALEXEY Fyodorovitch Karamazov was the third son of Fyodor Pavlovitch Karamazov, a landowner well known in our district in his own day, and still remembered among us owing to his gloomy and tragic death, which happened thirteen years ago, and which I shall describe in its proper place. For the present I will only say that this 'landowner'- for so we used to call him, although he hardly spent a day of his life on his own estate- was a strange type, yet one pretty frequently to be met with, a type abject and vicious and at the same time senseless. But he was one of those senseless persons who are very well capable of looking after their worldly affairs, and, apparently, after nothing else. Fyodor Pavlovitch, for instance, began with next to nothing; his estate was of the smallest; he ran to dine at other men's tables, and fastened on them as a toady, yet at his death it appeared that he had a hundred thousand roubles in hard cash. At the same time, he was all his life one of the most senseless, fantastical fellows in the whole district. I repeat, it was not stupidity- the majority of these fantastical fellows are shrewd and intelligent enough- but just senselessness, and a peculiar national form of it."

In [18]:
freq = {}
words = [c for c in text_sample.lower().split() if c.isalpha()]

In [19]:
for word in words:
    freq[word] = 1 + freq.get(word, 0)

In [20]:
freq

{'alexey': 1,
 'fyodorovitch': 1,
 'karamazov': 1,
 'was': 6,
 'the': 8,
 'third': 1,
 'son': 1,
 'of': 8,
 'fyodor': 2,
 'pavlovitch': 1,
 'a': 7,
 'landowner': 1,
 'well': 2,
 'known': 1,
 'in': 5,
 'our': 1,
 'district': 1,
 'his': 7,
 'own': 2,
 'and': 8,
 'still': 1,
 'remembered': 1,
 'among': 1,
 'us': 1,
 'owing': 1,
 'to': 5,
 'gloomy': 1,
 'tragic': 1,
 'which': 2,
 'happened': 1,
 'thirteen': 1,
 'years': 1,
 'i': 3,
 'shall': 1,
 'describe': 1,
 'its': 1,
 'proper': 1,
 'for': 3,
 'present': 1,
 'will': 1,
 'only': 1,
 'say': 1,
 'that': 2,
 'this': 1,
 'so': 1,
 'we': 1,
 'used': 1,
 'call': 1,
 'although': 1,
 'he': 5,
 'hardly': 1,
 'spent': 1,
 'day': 1,
 'life': 2,
 'on': 2,
 'strange': 1,
 'yet': 2,
 'one': 3,
 'pretty': 1,
 'frequently': 1,
 'be': 1,
 'met': 1,
 'type': 1,
 'abject': 1,
 'vicious': 1,
 'at': 4,
 'same': 2,
 'time': 1,
 'but': 2,
 'those': 1,
 'senseless': 1,
 'persons': 1,
 'who': 1,
 'are': 2,
 'very': 1,
 'capable': 1,
 'looking': 1,
 'after': 2,
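
Note that the `isalpha()` filter discards any token with punctuation attached, which is why `karamazov` is counted only once even though the name appears twice in the text (the second time as `Karamazov,`). A possible variant, sketched here under the assumption that trimming leading and trailing punctuation is acceptable, uses `str.strip` with `string.punctuation`. The short `sample` string and the name `freq_stripped` are made up for this example.

```python
import string

# Short illustrative string, not the chapter's text_sample.
sample = "Karamazov was a landowner; Karamazov, a 'landowner'- for so we used to call him."

freq_stripped = {}
for token in sample.lower().split():
    word = token.strip(string.punctuation)   # trim punctuation at both ends only
    if word.isalpha():                        # still skip tokens with inner punctuation or digits
        freq_stripped[word] = 1 + freq_stripped.get(word, 0)

print(freq_stripped['karamazov'], freq_stripped['landowner'])
```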
 

In [22]:
# Scan every (word, count) pair, keeping the largest count seen so far.
max_word = ''
max_count = 0
for w, c in freq.items():
    if c > max_count:  # strict '>' keeps the first word encountered on ties
        max_word = w
        max_count = c

print('The most frequent word is', max_word)
print('Its number of occurrences is', max_count)


The most frequent word is the
Its number of occurrences is 8
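
As an alternative to the hand-written loops, the standard library's `collections.Counter` (a `dict` subclass) can do both the counting and the search for the maximum. This is a sketch of a different technique, not the approach used above, and the short `sample` string is made up for illustration.

```python
from collections import Counter

# Short illustrative sentence, not the chapter's text_sample.
sample = "the cat and the hat and the bat"

# Same filtering as above: lowercase, split on whitespace, keep alphabetic tokens.
words = [w for w in sample.lower().split() if w.isalpha()]

counts = Counter(words)                          # maps each word to its count
max_word, max_count = counts.most_common(1)[0]   # the single most frequent (word, count) pair

print('The most frequent word is', max_word)
print('Its number of occurrences is', max_count)
```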
