# Exercise Sheet 2
## Instructions
The instructions for this exercise sheet are the same as for the previous one, so this section is just a summary to serve as a reminder. Refer to the material on Engage and at the start of Exercise Sheet 1 for more detail, and if you are unsure, ask on the Q&A forum.

This exercise sheet counts towards your overall grade for the unit.

Complete each question in the code cell provided, which will usually contain some skeleton code to help you start the question. Unless specified otherwise you may change the skeleton code as you wish, but your code must pass the formatting tests in the following code cell – if the cell runs without errors then your code is eligible for submission. However, these tests only check the basics of the question. Your code will also be subject to additional hidden tests, which will determine your grade. Test your own code thoroughly to make sure it works correctly before submission.

**Note**: For any question which accepts a mutable input (e.g. a list), your code must not modify that input unless specified otherwise in the question.
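
One way to check this in your own testing is to keep a copy of the input and compare it after the call. The sketch below uses a hypothetical helper name, `leaves_input_unchanged`, which is not part of the exercise sheet. It takes only a shallow copy, so it would not notice changes made to nested items.

```python
def leaves_input_unchanged(func, data):
    """Return True if calling func(data) leaves the list data unchanged.

    Hypothetical testing helper, not part of the exercise sheet.
    """
    snapshot = list(data)  # shallow copy taken before the call
    func(data)
    return data == snapshot


def doubled(items):
    # Builds a new list, so the input is left alone
    return [item * 2 for item in items]


def doubled_in_place(items):
    # Overwrites the input list, which the note above forbids by default
    for i in range(len(items)):
        items[i] = items[i] * 2
    return items


print(leaves_input_unchanged(doubled, [1, 2, 3]))           # True
print(leaves_input_unchanged(doubled_in_place, [1, 2, 3]))  # False
```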

## Questions
### Question 1
Write a function that takes a list of items and returns a new list with the order of items reversed, without using any inbuilt Python features that will do this for you (e.g. `.reverse()` or `reversed(…)`). As specified in the instructions, your function must not modify the contents of the original list.

In [948]:
def reversed_list(in_list):
    new_list = []
    for i in in_list:
        new_list = [i] + new_list
    return new_list

In [949]:
assert(reversed_list([1, 2]) == [2, 1])

### Question 2
Now write a procedure that takes a list as an input and reverses its contents. So, the procedure should not return anything, and the inputted list *should* be modified. Of course, you still must not use inbuilt Python features which reverse lists.
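
When testing a procedure like this, it helps to try several lengths, including empty, odd and even lists, since swap-based approaches often go wrong for one parity. A minimal sketch of such a check follows. `check_in_place_reversal` is a hypothetical helper, and `slice_reverse` is a stand-in that uses slicing purely so the example runs; it would not be allowed in a submission.

```python
def check_in_place_reversal(procedure, max_length=8):
    """Return the list lengths (0..max_length) for which procedure fails.

    Hypothetical testing helper: a correct in-place reversal should give
    an empty list of failures and should return None each time.
    """
    failures = []
    for length in range(max_length + 1):
        data = list(range(length))
        expected = data[::-1]  # slicing used only to build the expected answer
        result = procedure(data)
        if result is not None or data != expected:
            failures.append(length)
    return failures


def slice_reverse(in_list):
    # Stand-in procedure for the demo only; it relies on slicing, so it is
    # NOT a valid answer to the question.
    in_list[:] = in_list[::-1]


print(check_in_place_reversal(slice_reverse))  # []
```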

In [950]:
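def reverse_list(in_list):
    length = len(in_list)
    # Swap each item in the first half with its mirror in the second half.
    # For odd lengths the middle item stays where it is. Looping past the
    # halfway point would swap pairs back again, which is why the previous
    # version failed for even lengths of 4 or more.
    for i in range(length // 2):
        index = length - (1 + i)
        in_list[i], in_list[index] = in_list[index], in_list[i]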

In [951]:
test_in = [1, 2]
reverse_list(test_in)
assert(test_in == [2, 1])

### Question 3
Given two lists `[a1, a2, a3, …]` and `[b1, b2, b3, …]` of the same length, return a list containing pairs of items as tuples by position: a tuple containing the first item from each list, a tuple containing the second item, and so on. In other words, return `[(a1, b1), (a2, b2), …]`.

This is quite a useful feature when you want to iterate over two collections at once, and it also has an inbuilt function called `zip`! Of course, you cannot use `zip` (or other similar functions) in your submission.

In [952]:
def merge(list1, list2):
    merged_list = []
    # Pair up items by position; two empty lists give an empty result
    for i in range(len(list1)):
        merged_list.append((list1[i], list2[i]))
    return merged_list

In [953]:
assert(merge([1, 2], [3, 4]) == [(1, 3), (2, 4)])

### Question 4
Write another function like the one from the previous question. It should take two *non-empty* lists and create a list of tuples. The two input lists do not need to be the same length. If they are not the same length, *wrap around* the shorter list, i.e. start from the beginning again. (Hint: don't forget the *modulo* operator.) As usual, do not use `zip` or other similar inbuilt functions which implement this functionality already.

So for example, given the inputs `[1, 2, 3, 4, 5]`, and `['a', 'b']`, the result should be: <br>
`[(1, 'a'), (2, 'b'), (3, 'a'), (4, 'b'), (5, 'a')]`.
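
As a small illustration of the hint, the modulo operator maps an ever-increasing index back into the valid range of a shorter list. The variable names below are made up for the example.

```python
letters = ['a', 'b']

# Indices 0..4 wrap around a list of length 2 as 0, 1, 0, 1, 0
wrapped_indices = [i % len(letters) for i in range(5)]
wrapped_items = [letters[i % len(letters)] for i in range(5)]

print(wrapped_indices)  # [0, 1, 0, 1, 0]
print(wrapped_items)    # ['a', 'b', 'a', 'b', 'a']
```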

In [954]:
def merge_wrap(list1, list2):
    merged_list = []
    for i in range(max(len(list1),len(list2))):
        item_1 = list1[i % len(list1)]
        item_2 = list2[i % len(list2)]
        merged_list.append((item_1, item_2))
    return merged_list

In [955]:
assert(merge_wrap([1, 2], [3, 4, 5]) == [(1, 3), (2, 4), (1, 5)])

### Question 5
Now one more variation: this time you must write a version of the merge and wrap function which takes *any number* of non-empty lists as input, using the *varargs* syntax you saw in the material about tuples. The return value should be a single list containing a number of tuples equal to the length of the longest list from all of the arguments, where each tuple has a number of elements equal to the number of arguments. Hopefully this is obvious, but you must not use `zip` or any other similar inbuilt functions which do the work for you.

See an example in the test cell below.
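
As a reminder of the varargs syntax, here is a minimal sketch with a hypothetical function named `describe`: every positional argument is collected into a single tuple called `args`.

```python
def describe(*args):
    # Inside the function, args is a tuple holding every positional argument
    lengths = []
    for arg in args:
        lengths.append(len(arg))
    return (len(args), lengths)


print(describe([1, 2], [3], [4, 5, 6]))  # (3, [2, 1, 3])
```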

In [956]:
def merge_wrap_n(*args):
    max_count = 0 
    final_list = []
    arg_list = []
    for i in args:
        if len(i) > max_count:
            max_count = len(i)
    for i in range(max_count):
        for arg in args:  
            idx = arg[i % len(arg)]
            arg_list.append(idx)  
        final_list.append(tuple(arg_list))
        arg_list = []
    return final_list

In [957]:
assert(merge_wrap_n([1, 2], [3], [4, 5, 6]) == [(1, 3, 4), (2, 3, 5), (1, 3, 6)])

### Question 6
Write a function which takes a list of numbers of length $\ge 2$, and returns a tuple containing the *arithmetic mean* and the *unbiased sample variance* of those numbers.

i.e. given a list of items $[x_1, x_2, \dots, x_N]$, return a tuple $(m, s^2)$ where:

\begin{align} 
m &= \frac{1}{N} \sum_{i=1}^{N} x_i \\
s^2 &= \frac{1}{N-1} \sum_{i=1}^{N} (x_i - m)^2 .
\end{align}

The symbol $\sum$ means to add up all the values in this range for the variable specified. In other words, take the value where $i=1$, add the value where $i=2$, and so on until $i=N$. So the arithmetic mean can be explained in words as: add up all the values in the list, then divide by the total number of values ($N$). For the variance you need to subtract the mean from each value and square the result, add this up, then divide by $N-1$. You are welcome to use mathematically equivalent reformulations if you prefer (and it may be more efficient to do so), just ensure you match the outputs of these formulas exactly.

You must do the calculations manually, do not use inbuilt functions like `sum`, any `math` functions, etc.
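
To make the formulas concrete, here is a worked example on the list `[1, 2, 3, 4]`, computed both from the definition and from one equivalent reformulation, $s^2 = \frac{1}{N-1}\left(\sum x_i^2 - N m^2\right)$. The variable names are chosen for the example only. The two routes agree exactly here, but with floating point values in general they may differ in the last few digits.

```python
data = [1, 2, 3, 4]
n = len(data)

# Accumulate the sum and the sum of squares in one loop (no use of sum())
total = 0
total_squares = 0
for x in data:
    total += x
    total_squares += x * x

mean = total / n  # 10 / 4 = 2.5

# Definition from the question: add up squared deviations from the mean
squared_deviations = 0
for x in data:
    squared_deviations += (x - mean) ** 2  # 2.25 + 0.25 + 0.25 + 2.25

variance_two_pass = squared_deviations / (n - 1)  # 5.0 / 3

# Equivalent reformulation: (sum of squares - N * mean^2) / (N - 1)
variance_one_pass = (total_squares - n * mean * mean) / (n - 1)  # (30 - 25.0) / 3

print(mean, variance_two_pass, variance_one_pass)
```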

In [958]:
def mean_variance(numbers):
    n = len(numbers)
    total = 0
    variance_total = 0
    for i in numbers:
        total += i
    mean = total / n
    for i in numbers:
        variance_total += ((i - mean) ** 2)
    variance = variance_total / (n - 1)
    return (mean, variance)

In [959]:
assert(mean_variance([1, 2, 3]) == (2.0, 1.0))

### Question 7
Write a function which takes a string as an input and returns a dictionary which maps from each letter a-z (the keys), *ignoring case*, to the number of times this letter appears in the string (the values), *if the letter appears in the string* (all counts should be $\ge 1$). The keys to the dictionary must all be *lower case*. Ignore all punctuation or whitespace.
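
The underlying pattern is a tally stored in a dictionary. A minimal sketch on a list of numbers, using a hypothetical function named `tally`:

```python
def tally(items):
    counts = {}
    for item in items:
        # get() returns 0 the first time an item is seen
        counts[item] = counts.get(item, 0) + 1
    return counts


print(tally([3, 1, 3, 3]))  # {3: 3, 1: 1}
```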

In [960]:
def frequency_analysis(string):
    frequency_dict = {}
    for c in string.lower():
        # Only count the letters a-z. This skips digits, punctuation and
        # whitespace, and also accented letters, which isalnum() would accept.
        if 'a' <= c <= 'z':
            frequency_dict[c] = frequency_dict.get(c, 0) + 1
    return frequency_dict

In [961]:
assert(frequency_analysis("Hello world!") == {'h': 1, 'e': 1, 'l': 3, 'o': 2, 'w': 1, 'r': 1, 'd': 1})

### Question 8
Write a function which takes a string as an input and returns a dictionary where the keys are the *words* from the string and the values are the *counts*, how often each word appears in the string. A word is a sequence of consecutive letters which is broken by a space, the end of the string, or any of the following punctuation symbols: `.,?!` (no other punctuation will be included in the input string). Ignore case when counting words: the dictionary's keys must be entirely in lower case. Ignore consecutive spaces or punctuation.

You are *encouraged* to make use of the various [string methods](https://docs.python.org/3/library/stdtypes.html#string-methods) available in Python.
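
For instance, `lower`, `replace` and `split` can be combined to break a string into words. This is only a sketch of the idea on a made-up input, not a complete answer.

```python
text = "Hello,  WORLD! hello?"

lowered = text.lower()  # 'hello,  world! hello?'
for symbol in ".,?!":
    # Turn each punctuation symbol into a space so it acts as a word break
    lowered = lowered.replace(symbol, " ")

# split() with no arguments splits on runs of whitespace and drops empty strings
words = lowered.split()
print(words)  # ['hello', 'world', 'hello']
```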

In [962]:
def count_words(string):
    word_list = []
    word = ''
    # Append a trailing space so the final word is flushed like any other
    for i in string + ' ':
        if i in ' .,?!':
            # isalpha() is False for an empty string, so consecutive
            # separators are ignored
            if word.isalpha():
                word_list.append(word.lower())
            word = ''
        else:
            word += i
    dictionary = {}
    for w in word_list:
        dictionary[w] = dictionary.get(w, 0) + 1
    return dictionary

In [963]:
assert(count_words("Hello world!") == {"hello": 1, "world": 1})
assert(count_words("The Bart, the") == {"the": 2, "bart": 1})

### Question 9
In this question you will implement the [sieve of Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes), a method for finding prime numbers. A prime number is any integer which is divisible by exactly two numbers: $1$ and itself. The basic idea is that you start with a list (or equivalent) of all the numbers from $2$ to $N$, then you repeat the following two steps:
* remove the first element from the list, this is your next prime number, call it $p$
* remove every element from the rest of the list which is a multiple of $p$

By repeating these steps you will find every prime number up to $N$. There are various optimisations you can perform, but this basic idea is sufficient for this exercise.

For this question, write a function which takes a single integer `n` and returns a *set* containing all of the prime numbers up to *and including* `n`.
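
To see the two steps in action, the sketch below records the state after each round for a small $N$. `sieve_trace` is a hypothetical helper for illustration only; the question itself asks for a set of primes.

```python
def sieve_trace(n):
    """Return a list of (p, remaining) pairs, one per round of the two steps.

    Hypothetical helper for illustration; not the function the question asks for.
    """
    candidates = list(range(2, n + 1))
    rounds = []
    while candidates:
        p = candidates[0]  # step 1: the first element is the next prime
        # step 2: keep only the later elements that are not multiples of p
        candidates = [x for x in candidates[1:] if x % p != 0]
        rounds.append((p, candidates))
    return rounds


for p, remaining in sieve_trace(10):
    print(p, remaining)
# 2 [3, 5, 7, 9]
# 3 [5, 7]
# 5 [7]
# 7 []
```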

In [964]:
def sieve(n):
    primes = []
    # Candidates are every integer from 2 to n inclusive
    candidates = list(range(2, n + 1))
    while candidates:
        # The first remaining candidate is always the next prime
        p = candidates[0]
        primes.append(p)
        # Build a new list of non-multiples instead of calling remove() on
        # the list being iterated over, which can skip elements
        candidates = [x for x in candidates[1:] if x % p != 0]
    return set(primes)

In [965]:
assert(sieve(10) == {2, 3, 5, 7})

### Question 10
In this question you need to write three functions, all of which return 2D lists with a given pattern made up of single-character strings containing either a hash symbol `'#'` or a space `' '`.

The three patterns are:
* `rectangle(n, m)`, where $n\ge2$ and $m\ge2$, must produce an `n` by `m` rectangle outline, e.g.:
  ```
  rectangle(3, 3) returns
      [['#', '#', '#'], 
       ['#', ' ', '#'], 
       ['#', '#', '#']]
       
  rectangle(4, 5) returns
      [['#', '#', '#', '#', '#'],
       ['#', ' ', ' ', ' ', '#'],
       ['#', ' ', ' ', ' ', '#'],
       ['#', '#', '#', '#', '#']]
  ```
* `diamond(n)`, where $n\ge3$, must produce an `n` by `n` filled diamond, where odd sized diamonds are one hash wide at their points, and even sized diamonds are two hashes wide, e.g.:
  ```
  diamond(5) returns
      [[' ', ' ', '#', ' ', ' '],
       [' ', '#', '#', '#', ' '],
       ['#', '#', '#', '#', '#'],
       [' ', '#', '#', '#', ' '],
       [' ', ' ', '#', ' ', ' ']]
       
  diamond(6) returns
      [[' ', ' ', '#', '#', ' ', ' '],
       [' ', '#', '#', '#', '#', ' '],
       ['#', '#', '#', '#', '#', '#'],
       ['#', '#', '#', '#', '#', '#'],
       [' ', '#', '#', '#', '#', ' '],
       [' ', ' ', '#', '#', ' ', ' ']]
  ```
* `chequerboard(n, m, c_h, c_w)`, all arguments $\ge1$, returns an `n` by `m` chequerboard where each chequer (region) is of size `c_h` by `c_w`, the top left cell is *always* filled, and *only* the right hand and bottom edge may contain incomplete chequers, e.g.:
  ```
  chequerboard(6, 8, 2, 2) returns
      [['#', '#', ' ', ' ', '#', '#', ' ', ' '],
       ['#', '#', ' ', ' ', '#', '#', ' ', ' '],
       [' ', ' ', '#', '#', ' ', ' ', '#', '#'],
       [' ', ' ', '#', '#', ' ', ' ', '#', '#'],
       ['#', '#', ' ', ' ', '#', '#', ' ', ' '],
       ['#', '#', ' ', ' ', '#', '#', ' ', ' ']]
      
  chequerboard(5, 7, 2, 3) returns
      [['#', '#', '#', ' ', ' ', ' ', '#'],
       ['#', '#', '#', ' ', ' ', ' ', '#'],
       [' ', ' ', ' ', '#', '#', '#', ' '],
       [' ', ' ', ' ', '#', '#', '#', ' '],
       ['#', '#', '#', ' ', ' ', ' ', '#']]
  ```
  
  
For this question, `rectangle(n, m)` is worth a maximum of 2 marks, `diamond(n)` is worth 3 marks, and `chequerboard(…)` is worth 5 marks.
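
When checking patterns by eye, it can help to print a 2D list as plain text. The sketch below uses a hypothetical helper named `render`, which is not required by the question.

```python
def render(grid):
    """Join a 2D list of single-character strings into one printable string.

    Hypothetical helper for eyeballing patterns; not part of the question.
    """
    lines = []
    for row in grid:
        lines.append(''.join(row))
    return '\n'.join(lines)


example = [['#', '#', '#'],
           ['#', ' ', '#'],
           ['#', '#', '#']]
print(render(example))
# ###
# # #
# ###
```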

In [966]:
def rectangle(n, m):
    grid = []
    row = []
    for i in range(n):
        if i == 0:
            for x in range(m):
                row.append('#')
            grid.append(row)
            row = []
        if i != 0 and i != n-1:
            for x in range(m):
                if x == 0 or x == m-1:
                    row.append('#')
                else:
                    row.append(' ')
            grid.append(row)
            row = []
        if i == n-1:
            for x in range(m):
                row.append('#')
            grid.append(row)
            row = []
    return grid
    
    
def diamond(n):
    grid = []
    sequence = []
    row = []
    even = False
    if n % 2 == 0:
        even = True
    if even == True:
        idx = 2
        for i in range(n):
            if i < int(n / 2) - 1:
                sequence.append(idx)
                idx += 2
            elif i == int(n / 2) - 1:
                sequence.append(idx)
            else:
                sequence.append(idx)
                idx -= 2         
        idx = 1
        count = 0
        switch = False
        for i in sequence: 
            mid_idx = int(n / 2)
            low_idx = mid_idx - idx
            high_idx = mid_idx + idx - 1
            if i == n and count == int(n / 2):
                switch = True
                idx -= 1
            if switch == False:
                idx += 1
            else:
                idx -= 1
            for x in range(n):
                if x < low_idx or x > high_idx:
                    row.append(' ')
                else:
                    row.append('#')
            count += 1
            grid.append(row)
            row = []
    else:
        idx = 1
        for i in range(n):
            if i < int(n / 2):
                sequence.append(idx)
                idx += 2
            else:
                sequence.append(idx)
                idx -= 2
        idx = 0
        switch = False
        for i in sequence:
            mid_idx = int(n / 2)
            low_idx = mid_idx - idx
            high_idx = mid_idx + idx
            if i == n and idx == int(n / 2):
                switch = True
            if switch == False:
                idx += 1
            else:
                idx -= 1
            for x in range(n):
                if x < low_idx or x > high_idx:
                    row.append(' ')
                else:
                    row.append('#')
            grid.append(row)
            row = []
    return grid


def chequerboard(n, m, c_h, c_w):
    grid = []
    row = []
    c_h_switch = True
    c_w_switch = True
    c_w_idx = 0
    c_h_idx = 0
    for row_idx in range(n):
        for col_idx in range(m):
            if c_w_switch == True:
                row.append('#')
                c_w_idx += 1
                if c_w_idx == c_w:
                    c_w_idx = 0
                    c_w_switch = False
            else:
                row.append(' ')
                c_w_idx += 1
                if c_w_idx == c_w:
                    c_w_switch = True
                    c_w_idx = 0
        grid.append(row)
        row = []
        c_w_idx = 0
        c_h_idx += 1
        if c_h_idx == c_h:
            c_h_idx = 0
            if c_h_switch == True:
                c_h_switch = False
            else:
                c_h_switch = True
        if c_h_switch == True:
            c_w_switch = True
        else:
            c_w_switch = False     
    return grid

In [947]:
rect = [['#', '#', '#'], 
        ['#', ' ', '#'], 
        ['#', '#', '#']]

diam = [[' ', '#', ' '], 
        ['#', '#', '#'], 
        [' ', '#', ' ']]

cheq = [['#', ' ', '#'], 
        [' ', '#', ' '], 
        ['#', ' ', '#']]

assert(rectangle(3, 3) == rect)
assert(diamond(3) == diam)
assert(chequerboard(3, 3, 1, 1) == cheq)