Make sure you fill in any place that says `YOUR CODE HERE`. 

---

# Homework 2: Efficiency

# Instructions

## The Format of a Python Notebook

*This* is a Python Notebook homework.  It consists of various types of cells: 

* Text: you can read them :-) 
* Code: you should run them, as they may set up the problems that you are asked to solve.
* **Solution:** These are cells where you should enter a solution.  You will see a marker in these cells that indicates where your work should be inserted.  

```
    # YOUR CODE HERE
```    

* Test: These cells contain some tests, and are worth some points.  You should run them as a way to debug your code, to see whether you understood the question, and to check that your code produces its output in the correct format.  The notebook contains both the tests you see and some secret ones that you cannot see.  This prevents you from using the simple trick of hard-coding the desired output. 

## Working on Your Notebook

To work on your notebook, you can just work on `colab.research.google.com`.  Please don't download it and work directly on your laptop.  Working on Colab has two key features: 

* The notebook is shared with the TAs, tutors, and with the instructor.  So when you report that you have difficulties, they can open your notebook and help you. 
* The notebook preserves its revision history, which is useful for many reasons; one of them is that we can see how you reached your solution.

## Submitting Your Notebook

To give you control over which version of your notebook you submit for consideration, you need to submit your work as follows: 

* Download the notebook from Colab, clicking on "File > Download .ipynb".
* Upload the resulting file to [this Google form](https://docs.google.com/forms/d/e/1FAIpQLSf_MSdPKykxSbRuWNF2ZzRpXqv9JdAUClSHyjkxXX42DwI6sg/viewform?usp=sf_link).
* **Deadline: Friday October 4, 7pm.**

You can submit multiple times, and the last submission before the deadline will be used to assign you a grade. 

## What Happens Next? 

After you submit, your instructor at some point will retreat to a secret hideout, put on some good music, and run some mysterious scripts.  These will generate two things: 

* Your grade, which goes into a spreadsheet. 
* Feedback, shared with you as a pdf file on Google Drive (you will receive an email notification).  The pdf shows your work, your grade, the tests that passed and those that failed, and any comments left by the instructor and the TAs. 

In [1]:
try:
    from nose.tools import assert_equal, assert_almost_equal
    from nose.tools import assert_true, assert_false
    from nose.tools import assert_not_equal, assert_greater_equal
except ImportError:
    !pip install nose
    from nose.tools import assert_equal, assert_almost_equal
    from nose.tools import assert_true, assert_false
    from nose.tools import assert_not_equal, assert_greater_equal

Collecting nose
  Downloading https://files.pythonhosted.org/packages/15/d8/dd071918c040f50fa1cf80da16423af51ff8ce4a0f2399b7bf8de45ac3d9/nose-1.3.7-py3-none-any.whl (154kB)
Installing collected packages: nose
Successfully installed nose-1.3.7


## The task: inserting elements into a list

The following function `do_insertions_simple` takes as input a list `l`, and a list of insertions `insertions`.  The latter is a list of elements of the form

    (i, x)

specifying that `x` should be inserted at position `i` in the list.  The list `insertions` is _sorted according to `i`_ (the problem would be markedly more difficult otherwise, though you can consider on your own how to solve that case too).  The function `do_insertions_simple` performs the insertions one after the other, in the order given, and returns the result without modifying the original list `l`.

In [0]:
def do_insertions_simple(l, insertions):
    """Performs the insertions specified into l.
    @param l: list in which to do the insertions.  It is not modified.
    @param insertions: list of pairs (i, x), indicating that x should
        be inserted at position i.
    """
    r = list(l)
    for i, x in insertions:
        r.insert(i, x)
    return r
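
To make the semantics concrete, here is a small worked example.  It repeats the function above so that it runs on its own; the input values are made up for illustration.  Note that each position `i` refers to the list as it stands after the earlier insertions, so `'b'` ends up at index 3 of the list that already contains `'a'`.

```python
def do_insertions_simple(l, insertions):
    # Copy of the function above, repeated so this example is self-contained.
    r = list(l)
    for i, x in insertions:
        r.insert(i, x)
    return r

l = [0, 1, 2, 3]
insertions = [(1, 'a'), (3, 'b')]   # sorted by position
r = do_insertions_simple(l, insertions)
print(r)   # [0, 'a', 1, 'b', 2, 3]
print(l)   # [0, 1, 2, 3]  (the original list is left untouched)
```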

We can time how long it takes to execute:

In [0]:
import time

def timeit(f, l, insertions):
    t0 = time.time()
    f(l, insertions)
    return time.time() - t0
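
A single wall-clock measurement can be noisy, especially on a shared machine such as a Colab instance.  As a possible variation (not the helper used for grading in this notebook), the sketch below uses `time.perf_counter` and keeps the best of a few runs.  The name `timeit_best_of` and the `repeats` parameter are assumptions made for this example.

```python
import time

def timeit_best_of(f, *args, repeats=3):
    """Return the smallest elapsed time, in seconds, over `repeats` runs of f(*args)."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        f(*args)
        best = min(best, time.perf_counter() - t0)
    return best

# Example: time the built-in sorted() on a reversed list.
elapsed = timeit_best_of(sorted, list(range(100000, 0, -1)))
print(elapsed)
```

Taking the minimum rather than the average is a common choice when the goal is to estimate how fast the code can run, since interference from other processes can only add time.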

In [0]:
import numpy as np
import random

def generate_testing_case(list_len=1000000, num_insertions=10000):
    l = list(np.random.random(list_len))
    insertions = []
    for j in range(num_insertions):
        i = random.randint(0, list_len + j)
        x = random.random()
        insertions.append((i, x))
    insertions.sort()
    return l, insertions
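
A detail worth checking is that the positions are still valid after sorting.  `random.randint` includes both endpoints, so the `j`-th generated position is at most `list_len + j`.  After sorting, the `k`-th smallest position is still at most `list_len + k`, because at least `k + 1` of the generated values (those with `j <= k`) are no larger than that.  The sketch below checks this on a small case; the helper `sorted_positions` and the seeded generator are assumptions made to keep the example reproducible.

```python
import random

def sorted_positions(list_len, num_insertions, seed=0):
    # Hypothetical helper: generates only the positions, as the cell above does.
    rng = random.Random(seed)
    positions = [rng.randint(0, list_len + j) for j in range(num_insertions)]
    positions.sort()
    return positions

list_len = 10
positions = sorted_positions(list_len, 5)
# When the k-th insertion happens, the list has list_len + k elements,
# so any position from 0 to list_len + k is valid.
in_range = all(0 <= p <= list_len + k for k, p in enumerate(positions))
print(positions, in_range)
```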

In [5]:
l, insertions = generate_testing_case()
simple_time = timeit(do_insertions_simple, l, insertions)
print(simple_time)

2.7500853538513184


## Your task: writing a faster implementation

Your task is to write a faster implementation of `do_insertions_simple`.  In doing so, you can use the content of the _Data structures and their access characteristics_ chapter of the [class book](https://sites.google.com/ucsc.edu/programmingabstractions).  You do not need anything extra (no special modules, no advanced algorithms) to obtain a considerable speedup.  Simply think about what makes `do_insertions_simple` slow, and about how you can rewrite the whole thing in a faster way.
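
As a rough illustration of the kind of access characteristic that matters here, the sketch below builds the same list in two ways: by inserting at the front, which makes Python shift every existing element on each call, and by appending and reversing once at the end.  The function names and the size `n` are assumptions made for this example; the printed timings will vary from machine to machine.

```python
import time

def build_with_front_inserts(n):
    r = []
    for k in range(n):
        r.insert(0, k)   # every existing element moves one slot to the right
    return r

def build_with_appends(n):
    r = []
    for k in range(n):
        r.append(k)      # amortized constant time per element
    r.reverse()          # one linear pass at the end
    return r

n = 20000
t0 = time.perf_counter()
a = build_with_front_inserts(n)
t1 = time.perf_counter()
b = build_with_appends(n)
t2 = time.perf_counter()
print("front inserts:", t1 - t0, "seconds")
print("appends:      ", t2 - t1, "seconds")
print("same result:  ", a == b)
```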


In [0]:
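def do_insertions_fast(l, insertions):
    """Implement here a faster version of do_insertions_simple """
    # YOUR CODE HERE
    new_list = []
    count = 0  # number of insertions performed so far
    for i, x in insertions:
        # Copy the stretch of l that precedes position i and is not yet copied.
        new_list += l[len(new_list) - count : i - count]
        # Usually an append; for a repeated i, x lands before the earlier item.
        new_list.insert(i, x)
        count += 1
    # Copy whatever remains of l after the last insertion.
    return new_list + l[len(new_list) - count:]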

### Correctness

First, let's check that you compute the right thing.

In [7]:
import string
l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
insertions = [(0, 'a'), (2, 'b'), (2, 'b'), (7, 'c')]
r1 = do_insertions_simple(l, insertions)
r2 = do_insertions_fast(l, insertions)
print("r1:", r1)
print("r2:", r2)
assert_equal(r1, r2)

is_correct = False
for _ in range(20):
    l, insertions = generate_testing_case(list_len=100, num_insertions=20)
    r1 = do_insertions_simple(l, insertions)
    r2 = do_insertions_fast(l, insertions)
    assert_equal(r1, r2)
    is_correct = True

r1: ['a', 0, 'b', 'b', 1, 2, 3, 'c', 4, 5, 6, 7, 8, 9]
r2: ['a', 0, 'b', 'b', 1, 2, 3, 'c', 4, 5, 6, 7, 8, 9]


### Performance

For every doubling in speed of `do_insertions_fast` with respect to our `do_insertions_simple`, you get a point.  Let's see how many points you can get! 

What is the maximum?  Nobody knows.  How many points did the instructor get with his implementation?  He won't tell.  The race is on! 

This problem can be a time sink; please do not devote more than 2-3 hours to it. 

In [8]:
def do_insertions_simple(l, insertions):
    r = list(l)
    for i, x in insertions:
        r.insert(i, x)
    return r
    
def generate_testing_case(list_len=1000000, num_insertions=10000):
    l = list(np.random.random(list_len))
    insertions = []
    for j in range(num_insertions):
        i = random.randint(0, list_len + j)
        x = random.random()
        insertions.append((i, x))
    insertions.sort()
    return l, insertions

l, insertions = generate_testing_case()
simple_time = timeit(do_insertions_simple, l, insertions)
fast_time = timeit(do_insertions_fast, l, insertions)
speedup = simple_time / fast_time
print("Speedup:", speedup)
points = int(np.log2(speedup))
print("You got", points, "points")

Speedup: 65.37191191369502
You got 6 points
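
The point count is the base-2 logarithm of the speedup, truncated to an integer, so each extra point requires doubling the speedup again.  The sketch below reproduces the calculation with `math.log2` instead of `np.log2` to stay self-contained; the helper name `points_for` is an assumption made for this example.

```python
import math

def points_for(speedup):
    # int() truncates toward zero, so 6.03 becomes 6.
    return int(math.log2(speedup))

print(points_for(65.37191191369502))   # 6, since 2**6 = 64 <= 65.37 < 128
print(points_for(64.0))                # 6
print(points_for(63.9))                # 5
print(points_for(2.0))                 # 1
```

With the speedup shown above, a seventh point would require a speedup of at least 128.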


In [0]:
assert_greater_equal(points, 1)
assert is_correct

In [0]:
assert_greater_equal(points, 2)
assert is_correct

In [0]:
assert_greater_equal(points, 3)
assert is_correct

In [0]:
assert_greater_equal(points, 4)
assert is_correct

In [0]:
assert_greater_equal(points, 5)
assert is_correct

In [0]:
assert_greater_equal(points, 6)
assert is_correct

In [15]:
assert_greater_equal(points, 7)
assert is_correct

AssertionError: ignored

In [0]:
assert_greater_equal(points, 8)
assert is_correct

In [0]:
assert_greater_equal(points, 9)
assert is_correct

In [0]:
assert_greater_equal(points, 10)
assert is_correct