In [0]:
#@title
# Use matplotlib's inline backend, which creates and inserts
# plots directly below our Jupyter cells.
%matplotlib inline
import matplotlib.pyplot as plt

### Logistics

* This lab is due on **Monday night, 2019-09-09 at 11:59:59 pm**.
* To submit, print the completed Jupyter notebook (with all cells expanded and showing all required results) as a PDF, name it `JHED_lab1.pdf`, and upload it to Gradescope. The course code is 9D6XXR. 

* Make sure you run all code cells in order (including the one above) to set up the environment correctly.

* Make sure to ask any questions on Piazza.

### Read This First

#### You are allowed to import modules, but only from the Python 3 standard library. For example, do *not* import numpy, torch, etc. (The only exception is `matplotlib`, which we have already imported above.)

#### Remember that `tab` is useful for autocompletion.

#### Remember that `shift + tab` is useful for rapidly obtaining usage + documentation.

### Moving Averages, Padding, and Edge Effects

#### Complete the `moving_average` function below, which computes the moving average of a 1-D input signal. (See the examples in the documentation if you're unfamiliar with moving averages.)

In [0]:
import statistics
def moving_average(x, window_size=3):
    """ Compute a moving average.
    
    Example 1: moving_average([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], window_size=3) returns
        [(1 + 2 + 3) / 3, (2 + 3 + 4) / 3, (3 + 4 + 5) / 3, (4 + 5 + 6) / 3]
        which is [2.0, 3.0, 4.0, 5.0].
    
    Example 2: moving_average([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], window_size=5) returns
        [(1 + 2 + 3 + 4 + 5) / 5, (2 + 3 + 4 + 5 + 6) / 5]
        which is [3.0, 4.0].
    
    Args:
        x: A list of floats.
        window_size: A positive, odd integer.
        
    Returns:
        A list of floats.
    """
    if window_size < 1 or window_size % 2 != 1:
        raise ValueError('window_size must be a positive, odd integer.')
    if window_size > len(x):
        raise ValueError('window_size must not exceed len(x).')
        
    # TODO: Replace with valid code.
    y = []
        
    return y

#### Print the outputs from your `moving_average` function when run on the inputs given in the documentation.

Verify that they match the expected outputs that are given in the documentation.

In [0]:
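print(moving_average([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], window_size=3))
print(moving_average([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], window_size=5))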

#### Create `x = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1]` and plot both `x` and `y = moving_average(x, window_size=3)` using `matplotlib`'s `plot` function.

Notice that `y` differs in length from `x`, and that `y` is *shifted* in that the peaks above are not horizontally aligned.

Later in the course, we will see this same effect when using convolutions, and it will sometimes be convenient to enforce both equal length and centering (so that the peaks above are aligned). One common way of achieving this is to pad `x` with 0s on both sides before applying the moving average.
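As a rough sketch of why padding can restore the length, write $n$ for the length of `x`, $w$ for the (odd) window size, and $p$ for the number of 0s added to each side. These symbols are introduced here for illustration only. Without padding, the window fits in $n - w + 1$ positions, which matches the docstring examples ($6 - 3 + 1 = 4$ and $6 - 5 + 1 = 2$). With padding, the output length becomes $n + 2p - w + 1$, and requiring that it equal $n$ gives:

$$
n + 2p - w + 1 = n \quad\Longrightarrow\quad p = \frac{w - 1}{2}
$$

Because $w$ is odd, $p$ is an integer. For `window_size=3`, for example, this gives $p = 1$.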

Before we get to the `padded_moving_average` function, let's go through a simple warm-up exercise:

#### Evaluate the expression `[0] * 5`

If this result surprises you, remember that Python lists can contain *arbitrary items*, not just numerical values.

#### Evaluate the expression `[0] + ['test']`

#### Evaluate the expression `[0] + ['test'] * 5`

#### Evaluate the expression `[0] + ['a', 'b', 'c'] + [print] * 2`

Later, when we get to NumPy and PyTorch, we will see behavior that might be a bit more intuitive. Unlike Python lists, arrays and tensors will almost always store numerical values, and so we will see that (for example) `5 * np.array([1, 2, 3])` evaluates to `np.array([5, 10, 15])`.
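To make the contrast concrete using only plain Python lists, here is a minimal sketch. The names `values`, `repeated`, and `scaled` are illustrative and not part of the lab. Multiplying a list by an integer repeats the list, so element-wise scaling has to be written out explicitly, for instance with a list comprehension.

```python
values = [1, 2, 3]

# Multiplying a list by an integer repeats it; the items are not scaled.
repeated = 5 * values
print(len(repeated))   # 15

# Element-wise scaling with a plain list needs an explicit comprehension.
scaled = [5 * v for v in values]
print(scaled)          # [5, 10, 15]
```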

#### Complete the `padded_moving_average` function below by zero padding both sides of the input `x` before passing it to your previous `moving_average` function.

In [0]:
def padded_moving_average(x, window_size=3):
    """ Compute a moving average.
    
    This differs from moving_average in that the input is first
    padded on both sides with an appropriate number of 0s, so that
    the output has the same length as x and so that x and y are
    aligned.
    
    Example: padded_moving_average([1.0, 1.0, 1.0], window_size=3) returns
        [(0 + 1 + 1) / 3, (1 + 1 + 1) / 3, (1 + 1 + 0) / 3]
        which has approximate values of [0.67, 1.0, 0.67].
    
    Args:
        x: A list of floats.
        window_size: A positive, odd integer that's less than the length of x.
        
    Returns:
        A list of floats.
    """
    
    # TODO: Replace with valid code
    y = None
    
    return y

#### Create `x = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1]` and plot both `x` and `y = padded_moving_average(x, window_size=3)` using `matplotlib`'s `plot` function. Be sure to verify that `x` is not modified after calling `padded_moving_average`.

#### Does this padding procedure introduce any artifacts in our moving average?

### 1-D Data Normalization

#### Write a function `normalize` which takes in a list of 1-D data and returns a list of *normalized* data which is centered and scaled to have a mean of 0 and a standard deviation of 1. Be sure to include documentation using Google style docstrings, as done above.

#### Create `data = [10.3, 15.5, 12.7, 13.3, 8.9, 12.3, 14.6, 11.2, 12.8, 9.5]` and form `normalized_data` using your function. Compute the mean and standard deviation of `normalized_data`, and make sure that the mean is very close to 0.0 and that the standard deviation is very close to 1.0.

(Here, let's agree that `x` and `y` are 'very close' if the distance between them is less than $10^{-10}$.)
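One possible way to express this check in code is sketched below. The helper name `is_very_close` and the constant `TOLERANCE` are assumptions made for illustration; only the $10^{-10}$ threshold comes from the text above.

```python
TOLERANCE = 1e-10  # the threshold agreed on above


def is_very_close(a, b, tol=TOLERANCE):
    """Return True if the distance between a and b is less than tol."""
    return abs(a - b) < tol


# Floating-point round-off is far below the tolerance.
print(is_very_close(0.1 + 0.2, 0.3))    # True

# A difference of about 1e-9 is larger than the tolerance.
print(is_very_close(1.0, 1.0 + 1e-9))   # False
```

Note that the standard library's `statistics` module provides both `pstdev` (population) and `stdev` (sample). The check above is only likely to pass if the same convention is used when normalizing and when verifying.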

### Matrices as Lists of Rows

One way to represent matrices is as *lists* of *rows*, with each row having the same length. Here is an example:

In [0]:
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
print(A)
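As a brief sketch of how this representation is read, assuming the same `A` as above: `A[i]` is row `i`, and `A[i][j]` is the entry in row `i` and column `j`, both counted from 0. The variable names `num_rows`, `num_cols`, and `is_rectangular` are illustrative only.

```python
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

num_rows = len(A)      # number of rows
num_cols = len(A[0])   # length of the first row

print(A[1])       # [4, 5, 6]
print(A[1][2])    # 6

# Check that every row has the same length, as the representation requires.
is_rectangular = all(len(row) == num_cols for row in A)
print(num_rows, num_cols, is_rectangular)   # 3 3 True
```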

#### Write a function `transpose` which accepts a matrix in this form and *transposes* it. Be sure to include documentation using Google style docstrings, as done above.

For example, if `A` is defined as above, then `transpose(A)` should return `[[1, 4, 7], [2, 5, 8], [3, 6, 9]]`.

Define `B = [[1, 2, 3, 4], [5, 6, 7, 8]]` and `B_T = transpose(B)`, and verify that `B_T` is correct.

### Building a Vocabulary

Later in the course, when we deal with RNNs, we will likely build a character-level language model. For example, if our model is trained with English sentences, then we might expect the following three probabilities to be listed in decreasing order:

- $P$(next character is 'e' | previous characters were 'appl')
- $P$(next character is 'y' | previous characters were 'appl')
- $P$(next character is 'h' | previous characters were 'appl')

For now, suppose that our training data is one string (e.g. 100 MB of Wikipedia), and that our objective is to form a vocabulary of the characters that appear in it. For example, we would expect `form_char_vocab('this is cs382')` to return `[' ', '2', '3', '8', 'c', 'h', 'i', 's', 't']`, since these are the unique characters that are present.

#### Write a function `form_char_vocab` that takes in a string and returns a *sorted list* of the unique characters in that string. Be sure to include documentation using Google style docstrings, as done above.

#### Print the output of your `form_char_vocab` function when run on the string `'this is a short string'`.

#### See how long your function takes to run on the following ~10 MB string by running the following code:

In [0]:
long_ish_string = 'this is a test' * 1000000
%timeit form_char_vocab(long_ish_string)

(If this doesn't terminate within, say, tens of seconds, then something is wrong.)

### In-Place Operations

#### Complete the `replace_element` function below, which returns a new list with all occurrences of a particular element replaced. `replace_element` should not modify any of its arguments.

In [0]:
def replace_element(a_list, element, replacement):
    """ Replace all occurrences of an element in a list.
    
    a_list is not modified in place.
    
    Example:
        old_list = [1, 'a', 2, 'b', 'a', 1, 'b', 1]
        new_list = replace_element(old_list, 'b', 10)
        # new_list is now [1, 'a', 2, 10, 'a', 1, 10, 1]
    
    Args:
        a_list: A list.
        element: The element to be replaced.
        replacement: The replacement.
    
    Returns:
        A copy of a_list, with all occurrences of element replaced
        by replacement.
    """
    
    # TODO: Replace with valid code.
    new_list = None
    
    return new_list

#### Verify that `replace_element` produces the expected output for the example in the documentation, and also verify that `replace_element` does not modify `old_list`.
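One way to approach the "does not modify" check is sketched below, using the built-in `sorted` and a small in-place sorting function as stand-ins rather than `replace_element` itself. The helper `check_not_modified` and the function `sort_in_place` are hypothetical names introduced for this illustration. The idea is to snapshot the argument before the call and compare afterwards.

```python
import copy


def check_not_modified(func, a_list, *args):
    """Call func(a_list, *args) and report whether a_list is unchanged."""
    snapshot = copy.deepcopy(a_list)
    result = func(a_list, *args)
    return result, a_list == snapshot


# sorted() returns a new list, so its input is left alone.
result, unchanged = check_not_modified(sorted, [3, 1, 2])
print(result, unchanged)   # [1, 2, 3] True


# A function that sorts in place fails the check.
def sort_in_place(items):
    items.sort()
    return items


result, unchanged = check_not_modified(sort_in_place, [3, 1, 2])
print(result, unchanged)   # [1, 2, 3] False
```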

#### Write a `replace_element_` function that replaces all occurrences of a particular element *in place*. That is, the function should not return anything; instead, its first argument should be *modified*. Be sure to include documentation using Google style docstrings, as done above.

#### Create an example list and verify that your `replace_element_` function modifies the list as intended.

### Indexing with Arbitrary Indices

It is often useful to pluck out the elements from a list that are at specified indices.

In [0]:
some_indices = [2**i - i for i in range(16)]
print(some_indices)

#### Complete the following code to form `some_elements` by plucking out elements of `long_ish_string` according to `some_indices`.

In [0]:
# TODO: Replace with valid code.
some_elements = []
print(some_elements)

### Zip and String Formatting

#### Loop through the following lists and print sentences such as 'Geoffrey Hinton is affiliated with U of T.' *without ever accessing the lists by index* (i.e. no `first_names[0]`).

In [0]:
first_names = ['Geoffrey', 'Yoshua', 'Juergen']
last_names = ['Hinton', 'Bengio', 'Schmidhuber']
affiliations = ['U of T', 'U of M', 'IDSIA']

# TODO: Replace with valid code
pass