# Python review: Basic collections of values

This notebook continues the review of Python basics based on [Chris Simpkins's](https://www.cc.gatech.edu/~simpkins/) [Python Bootcamp](http://datamastery.gitlab.io/msabc/august2018.html). The focus here is on basic collections: tuples, dictionaries, and sets.

**Exercise 0** (`minmax_test`: 1 point). Complete the function `minmax(L)`, which takes a list `L` and returns a pair---that is, a 2-element Python tuple, or "2-tuple"---whose first element is the minimum value in the list and whose second element is the maximum. For instance:

```python
  minmax([8, 7, 2, 5, 1]) == (1, 8)
```
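Because a pair is an ordinary tuple, the caller can index it or unpack it into two names. The following is a minimal sketch; the names `values`, `pair`, `lo`, and `hi` are illustrative and do not appear elsewhere in this notebook.

```python
# A 2-tuple built from the smallest and largest values of a list.
values = [8, 7, 2, 5, 1]
pair = (min(values), max(values))

# Tuples support indexing and unpacking.
lo, hi = pair
assert (lo, hi) == (1, 8)
assert pair[0] == 1 and pair[1] == 8
assert type(pair) is tuple
```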

In [1]:
def minmax(L):
    assert hasattr(L, "__iter__")
    
    return (min(L), max(L))

In [2]:
# `minmax_test`: Test cell

L = [8, 7, 2, 5, 1]
mmL = minmax(L)
mmL_true = (1, 8)
print("minmax({}) -> {} [True: {}]".format(L, mmL, mmL_true))
assert type(mmL) is tuple and mmL == (1, 8)

from random import sample
L = sample(range(1000), 10)
mmL = minmax(L)
L_s = sorted(L)
mmL_true = (L_s[0], L_s[-1])
print("minmax({}) -> {} [True: {}]".format(L, mmL, mmL_true))
assert mmL == mmL_true

print("\n(Passed!)")

minmax([8, 7, 2, 5, 1]) -> (1, 8) [True: (1, 8)]
minmax([923, 309, 977, 868, 839, 316, 57, 970, 648, 749]) -> (57, 977) [True: (57, 977)]

(Passed!)


**Exercise 1** (`remove_all_test`: 2 points). Complete the function `remove_all(L, x)` so that, given a list `L` and a target value `x`, it returns a *copy* of the list that excludes *all* occurrences of `x` but preserves the order of the remaining elements. For instance:

```python
    remove_all([1, 2, 3, 2, 4, 8, 2], 2) == [1, 3, 4, 8]
```

> **Note.** Your implementation should *not* modify the list being passed into `remove_all`.
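To see why this note matters, compare a filter that builds a new list with one that calls `list.remove` on its argument. This is only an illustrative sketch: the helper names `remove_all_copy` and `remove_all_in_place` are hypothetical and are not part of the exercise.

```python
def remove_all_copy(L, x):
    # Builds a new list; `L` itself is left untouched.
    return [elem for elem in L if elem != x]

def remove_all_in_place(L, x):
    # Mutates `L` directly, which the exercise forbids.
    while x in L:
        L.remove(x)
    return L

original = [1, 2, 3, 2, 4, 8, 2]

result = remove_all_copy(original, 2)
assert result == [1, 3, 4, 8]
assert original == [1, 2, 3, 2, 4, 8, 2]  # input unchanged

mutated_input = original.copy()
remove_all_in_place(mutated_input, 2)
assert mutated_input == [1, 3, 4, 8]      # input was modified
```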

In [9]:
def remove_all(L, x):
    assert type(L) is list and x is not None
    #
    S = [elem for elem in L if elem != x]
    return S
    #


In [10]:
# `remove_all_test`: Test cell
def test_it(L, x, L_ans):
    print("Testing `remove_all({}, {})`...".format(L, x))
    print("\tTrue solution: {}".format(L_ans))
    L_copy = L.copy()
    L_rem = remove_all(L_copy, x)
    print("\tYour computed solution: {}".format(L_rem))
    assert L_copy == L, "Your code appears to modify the input list."
    assert L_rem == L_ans, "The returned list is incorrect."

# Test 1: Example
test_it([1, 2, 3, 2, 4, 8, 2], 2, [1, 3, 4, 8])

# Test 2: Random list
from random import randint
target = randint(0, 9)
L_input = []
L_ans = []
for _ in range(20):
    v = randint(0, 9)
    L_input.append(v)
    if v != target:
        L_ans.append(v)
test_it(L_input, target, L_ans)

print("\n(Passed!)")

Testing `remove_all([1, 2, 3, 2, 4, 8, 2], 2)`...
	True solution: [1, 3, 4, 8]
	Your computed solution: [1, 3, 4, 8]
Testing `remove_all([5, 3, 9, 4, 7, 7, 5, 4, 9, 9, 8, 1, 4, 5, 8, 4, 0, 1, 2, 7], 3)`...
	True solution: [5, 9, 4, 7, 7, 5, 4, 9, 9, 8, 1, 4, 5, 8, 4, 0, 1, 2, 7]
	Your computed solution: [5, 9, 4, 7, 7, 5, 4, 9, 9, 8, 1, 4, 5, 8, 4, 0, 1, 2, 7]

(Passed!)


**Exercise 2** (`compress_vector_test`: 2 points). Suppose you are given a vector, `x`, containing real values that are mostly zero. For instance:

```python
    x = [0.0, 0.87, 0.0, 0.0, 0.0, 0.32, 0.46, 0.0, 0.0, 0.10, 0.0, 0.0]
```

Complete the function `compress_vector(x)` so that it returns a dictionary `d` with two keys, `d['inds']` and `d['vals']`, which are lists that give the positions and values, respectively, of all the *non-zero* entries of `x`. For the previous example,

```python
    d['inds'] = [1, 5, 6, 9]
    d['vals'] = [0.87, 0.32, 0.46, 0.10]
```

> **Note 1.** Your implementation must _not_ modify the input vector `x`.

> **Note 2.** If `x` contains only zero entries, `d['inds']` and `d['vals']` should be empty lists.

In [11]:
from collections import defaultdict

In [12]:
def compress_vector(x):
    assert type(x) is list
    d = {
        'inds': [], 
        'vals': [],
    }
    #
    for i, ex in enumerate(x):
        # Record the position and value of each non-zero entry
        if ex != 0.0:
            d['inds'].append(i)
            d['vals'].append(ex)
    #
    return d

In [13]:
# `compress_vector_test`: Test cell
def check_compress_vector(x_orig):
    print("Testing `compress_vector(x={})`:".format(x_orig))
    x = x_orig.copy()
    nz = x.count(0.0)
    print("\t`x` has {} zero entries.".format(nz))
    d = compress_vector(x)
    print("\tx (after call): {}".format(x))
    print("\td: {}".format(d))
    assert x == x_orig, "Your implementation appears to modify the input."
    assert type(d) is dict, "Output type is not `dict` (a dictionary)."
    assert 'inds' in d and type(d['inds']) is list, "Output key, 'inds', does not have a value of type `list`."
    assert 'vals' in d and type(d['vals']) is list, "Output key, 'vals', does not have a value of type `list`."
    assert len(d['inds']) == len(d['vals']), "`d['inds']` and `d['vals']` are lists of unequal length."
    for i, v in zip(d['inds'], d['vals']):
        assert x[i] == v, "x[{}] == {} instead of {}".format(i, x[i], v)
    assert nz + len(d['vals']) == len(x), "Output may be missing values."
    assert len(d.keys()) == 2, "Output may have keys other than 'inds' and 'vals'."
    
# Test 1: Example
x = [0.0, 0.87, 0.0, 0.0, 0.0, 0.32, 0.46, 0.0, 0.0, 0.10, 0.0, 0.0]
check_compress_vector(x)

# Test 2: Random sparse vectors
from random import random
for _ in range(3):
    print("")
    x = []
    for _ in range(20):
        if random() <= 0.8: # Make about 80% of entries zero
            v = 0.0
        else:
            v = float("{:.2f}".format(random()))
        x.append(v)
    check_compress_vector(x)
    
# Test 3: All-zero vector
x = [0.0] * 10
check_compress_vector(x)

print("\n(Passed!)")

Testing `compress_vector(x=[0.0, 0.87, 0.0, 0.0, 0.0, 0.32, 0.46, 0.0, 0.0, 0.1, 0.0, 0.0])`:
	`x` has 8 zero entries.
	x (after call): [0.0, 0.87, 0.0, 0.0, 0.0, 0.32, 0.46, 0.0, 0.0, 0.1, 0.0, 0.0]
	d: {'inds': [1, 5, 6, 9], 'vals': [0.87, 0.32, 0.46, 0.1]}

Testing `compress_vector(x=[0.0, 0.0, 0.0, 0.81, 0.78, 0.59, 0.0, 0.0, 0.0, 0.62, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])`:
	`x` has 16 zero entries.
	x (after call): [0.0, 0.0, 0.0, 0.81, 0.78, 0.59, 0.0, 0.0, 0.0, 0.62, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
	d: {'inds': [3, 4, 5, 9], 'vals': [0.81, 0.78, 0.59, 0.62]}

Testing `compress_vector(x=[0.01, 0.0, 0.75, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.99, 0.85, 0.0, 0.0, 0.0, 0.0, 0.78, 0.37, 0.0])`:
	`x` has 13 zero entries.
	x (after call): [0.01, 0.0, 0.75, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.99, 0.85, 0.0, 0.0, 0.0, 0.0, 0.78, 0.37, 0.0]
	d: {'inds': [0, 2, 10, 11, 12, 17, 18], 'vals': [0.01, 0.75, 0.4, 0.99, 0.85, 0.78, 0.37]}

Testing `compr

**Repeated indices.** Consider the compressed vector data structure, `d`, in the preceding exercise, which stores a list of indices (`d['inds']`) and a list of values (`d['vals']`).

Suppose we allow duplicate indices, possibly with different values. For example:

```python
    d['inds'] == [0,   3,   7,   3,   3,   5, 1]
    d['vals'] == [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```

In this case, the index 3 appears three times. (Also note that the indices `d['inds']` need not appear in sorted order.)

Let's adopt the convention that when there are repeated indices, the "true" value there is the _sum_ of the individual values. In other words, the true vector corresponding to this example of `d` would be:

```python
    # ind:  0    1    2    3*    4    5    6    7
    x == [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
```
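One way to picture this convention is to accumulate the values index by index. The sketch below is an illustration under that reading, not the reference solution to the exercise that follows; the names `sums` and `x_true` are assumptions made for this example.

```python
d = {
    'inds': [0,   3,   7,   3,   3,   5, 1],
    'vals': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
}

# Accumulate the values that share an index.
sums = {}
for i, v in zip(d['inds'], d['vals']):
    sums[i] = sums.get(i, 0.0) + v

# Indices that never appear are treated as zero.
n = max(d['inds']) + 1
x_true = [sums.get(i, 0.0) for i in range(n)]

assert x_true == [1.0, 7.0, 0.0, 11.0, 0.0, 6.0, 0.0, 3.0]
```

Using a dictionary for the running sums means the order of `d['inds']` does not matter, which matches the remark that the indices need not be sorted.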

In [14]:
vector_x = [0.0, 0.87, 0.0, 0.0, 0.0, 0.32, 0.46, 0.0, 0.0, 0.10, 0.0, 0.0]

**Exercise 3** (`decompress_vector_test`: 2 points). Complete the function `decompress_vector(d)` that takes a compressed vector `d`, which is a dictionary with keys for the indices (`inds`) and values (`vals`), and returns the corresponding full vector. For any repeated index, the values should be summed.

The function should accept an _optional_ parameter, `n`, that specifies the length of the full vector. You may assume this length is at least `max(d['inds'])+1`.

In [None]:
def decompress_vector(d, n=None):
    # Checks the input
    assert type(d) is dict and 'inds' in d and 'vals' in d, "Not a dictionary or missing keys"
    assert type(d['inds']) is list and type(d['vals']) is list, "Not a list"
    assert len(d['inds']) == len(d['vals']), "Length mismatch"
    
    # Determine length of the full vector
    i_max = max(d['inds']) if d['inds'] else -1
    if n is None:
        n = i_max+1
    else:
        assert n > i_max, "Bad value for full vector length"
        
    #
    # YOUR CODE HERE
    #


