# Everything in one notebook - Advanced Techniques
## Lecture 4a - Advanced Techniques

In this lecture we will introduce a number of advanced techniques in Python, which can be useful in a variety of situations. Some of these are particular to Python and its syntax; others are more general techniques. In particular we will cover:

* List, Set and Dictionary comprehensions
* Generators
* Recursion
* Functions as objects, and the Lambda operator

Please do ask if you find any of this confusing. We're more than happy to go over this (whether today, or in a future lecture). These topics are largely separate from each other, so you could go over them in whatever order makes sense to you.

## List comprehensions

"List comprehensions" is a fancy way of describing a syntax feature in Python, which lets you write simple for loops in a condensed form. Let's consider a for loop which collects the first 100 cubic numbers.

In [None]:
cube_numbers = []
for i in range(1,101):
    cube_numbers.append(i**3)

This is 3 lines of code. This can be written in one line as a list comprehension with the following:

In [None]:
cube_numbers = [i**3 for i in range(1,101)]
print(cube_numbers)

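You can see that exactly the same variables (`i` and `cube_numbers`) are used, and both use the same `range` function. The two versions build exactly the same list. The main practical difference is that in Python 3 the variable `i` in a comprehension is local to the comprehension, whereas after the explicit `for` loop `i` still exists.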

One more example, and then it's over to you. Here we have as input a list of names, as strings in the format "FIRST_NAME LAST_NAME" (e.g. `["Ian Harry", "Laura Nutall", "Gareth Cabourn-Davies"]`). We want to create two lists, one list of the first names, and one of the last names. This can be done using:

In [None]:
names = ["Boris Johnson", "Theresa May", "David Cameron", "Gordon Brown", "Tony Blair", "John Major", "Margaret Thatcher",
         "James Callaghan", "Harold Wilson", "Edward Heath", "Harold Wilson", "Alec Douglas-Home", "Harold Macmillan"] # Turns out this list dated a lot in a year!

first_names = []
last_names = []
for name in names:
    first_name, last_name = name.split(' ')
    first_names.append(first_name)
    last_names.append(last_name)
print (first_names)
print (last_names)

This can be written more simply as *two* list comprehensions.

In [None]:
first_names = [name.split(' ')[0] for name in names]
last_names = [name.split(' ')[1] for name in names]

print (first_names)
print (last_names)

In this case, the list comprehensions, while less code, are actually slower, as we need to loop and do the split twice. However, as the "Zen of Python" states:

https://www.python.org/dev/peps/pep-0020/

"Readability counts", so if you can make code more readable (even if it's slower) it might be worth doing if you don't need that speed difference. I note that it is possible to reduce this into a single line command, but then it is still only as fast as the two list comprehensions, and almost unreadable:

In [None]:
# AVOID CODE LIKE THIS, IF YOU DON'T KNOW WHAT IT DOES, HOW WILL ANYONE ELSE?
# .... But this does demonstrate that you can nest list comprehensions in list comprehensions
first_names, last_names = [[name.split(' ')[i] for name in names] for i in [0,1]]

print (first_names)
print (last_names)
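
If the repeated split matters, one possible alternative (a sketch, not part of the original lecture code) is to split each name once and let `zip` regroup the pieces. The short `names` list here is just a stand-in for the longer list above, and note that the results come back as tuples rather than lists. Whether this reads better than two list comprehensions is a matter of taste.

```python
# A shortened stand-in for the names list used above
names = ["Boris Johnson", "Theresa May", "David Cameron"]

# Split every name exactly once, then let zip(*...) regroup the pieces:
# all first elements together, and all second elements together.
first_names, last_names = zip(*(name.split(' ') for name in names))

print(first_names)
print(last_names)
```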

**EXERCISE**

1. Compute $x^{0.5}$ for all integer values of $x$ between 15 and 105 using a list comprehension.
1. Compute $\sin(x)^2 - \cos(x)^2$ for 1000 values of x, uniformly distributed between 0 and $2 \pi$ using a list comprehension.

There's one more level of complexity that we can unlock with list comprehensions. That is that they can also contain a single conditional statement. So for example if we wanted to store all values of $x^3$ for x between 1 and 100 (inclusive) where $x^3$ does not end with the digit 8, we can do:

In [None]:
cubic_numbers = []
for x in range(1,101):
    x3 = x**3
    if not (x3 % 10) == 8:
        cubic_numbers.append(x3)
print (cubic_numbers)

This can be written as a single list comprehension with

In [None]:
cubic_numbers = [x**3 for x in range(1,101) if (not (x**3 % 10) == 8)]

Again though, list comprehensions should be used to help with code readability by making things more concise. Things that become so concise that they are unreadable are not good!
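
As a small illustration of keeping the condition readable, the same filter can be written with `!=` instead of `not ... ==`. The sketch below (the variable names are made up for this example) checks that the two forms agree and counts how many cubes the condition removes.

```python
# Filter written with "not ... ==" (as above) and with "!="
cubes_not_eq = [x**3 for x in range(1, 101) if not (x**3 % 10) == 8]
cubes_neq = [x**3 for x in range(1, 101) if x**3 % 10 != 8]

# Both forms should give the same list
print(cubes_not_eq == cubes_neq)

# Number of cubes removed by the condition
print(100 - len(cubes_neq))
```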

**EXERCISE**

1. Compute $x^{0.5}$ for all integer values of $x$ between 15 and 105 *excluding* all values of $x$ where $x$ is a multiple of 7, using a list comprehension.
1. Compute $\sin(x)^2 - \cos(x)^2$ for 1000 values of $x$, uniformly distributed between 0 and $2 \pi$. Include only values of $x$ where $\cos(x)$ is greater than 0 *or* $\sin(x)$ is less than -0.5. Use a single-line list comprehension.

As well as list comprehensions, we can also consider dictionary and even set comprehensions. These are simple enough to write:

In [None]:
cubic_numbers_dict = {x:x**3 for x in range(1,101) if (not (x**3 % 10) == 8)}
print (cubic_numbers_dict)
print (type(cubic_numbers_dict))

# I avoid this because the syntax is too similar (in my mind) to dictionaries.
# I would just do a list comprehension and wrap set() around it.
cubic_numbers_set = {x**3 for x in range(1,101) if (not (x**3 % 10) == 8)}
print (cubic_numbers_set)
print (type(cubic_numbers_set))
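
One place where a set comprehension behaves differently from a list comprehension is with repeated values, since a set keeps each distinct value only once. A minimal sketch, using the last digit of each cube as a made-up example:

```python
# Last digit of each cube: a list keeps all 100 values, duplicates included
last_digits_list = [x**3 % 10 for x in range(1, 101)]

# The set comprehension keeps each distinct value only once
last_digits_set = {x**3 % 10 for x in range(1, 101)}

# A dictionary comprehension mapping each x to the last digit of its cube
last_digit_dict = {x: x**3 % 10 for x in range(1, 11)}

print(len(last_digits_list), len(last_digits_set))
print(last_digit_dict)
```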

We can also do this (using circular brackets), but we do *not* get a tuple, instead we get ...

In [None]:
cubic_numbers_gen = (x**3 for x in range(1,101) if (not (x**3 % 10) == 8))
print (cubic_numbers_gen)
print (type(cubic_numbers_gen))

A "generator" object?! What's that? This leads nicely into our next section!

## Generators

What is a generator? A generator is a specific type of function in Python, which has some special properties allowing it to be used as an iterator (i.e. in a for loop). Let me quote from the nice reference https://realpython.com/introduction-to-python-generators/ here:

"
Introduced with PEP 255, generator functions are a special kind of function that return a lazy iterator. These are objects that you can loop over like a list. However, unlike lists, lazy iterators do not store their contents in memory.
"
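
To make the memory point concrete, here is a small sketch (not from the quoted article) using the generator-expression syntax from the end of the previous section. It compares the size of a list with the size of the equivalent generator object, and shows that a generator can only be looped over once. The exact sizes printed will depend on your Python version.

```python
import sys

# A list comprehension builds and stores all 100000 values immediately
cubes_list = [x**3 for x in range(100000)]

# A generator expression only stores the recipe for producing them
cubes_gen = (x**3 for x in range(100000))

# Size in bytes of the container objects themselves (for the list this does
# not even count the integers it refers to)
print(sys.getsizeof(cubes_list), sys.getsizeof(cubes_gen))

# The generator can be consumed once; a second pass finds nothing left
first_pass = sum(cubes_gen)
second_pass = sum(cubes_gen)
print(first_pass == sum(cubes_list), second_pass)
```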

Why would you need this?

"
Have you ever had to work with a dataset so large that it overwhelmed your machine’s memory? Or maybe you have a complex function that needs to maintain an internal state every time it’s called, but the function is too small to justify creating its own class. In these cases and more, generators and the Python yield statement are here to help.
"

Still not making sense? Let's try to explain with some examples.

As an example of this (following closely https://realpython.com/introduction-to-python-generators/), let's look at reading in our `stm.txt` file, which we've used in previous lectures (you can download it from Moodle). Let's first start by writing a function to read this in line-by-line using stuff we've already seen:

In [None]:
def txt_reader(file_name):
    # Open the file in a with block so that it is closed again once it has been read
    with open(file_name) as file:
        lines = file.read().split("\n")
    result = [line.split(' ') for line in lines]
    # Result will be a list of lists. result[i] will refer to the ith row, result[i][j] will refer to the jth column
    # in the ith row.
    return result

txt_gen = txt_reader("stm.txt")
row_count = 0

for row in txt_gen:
    row_count += 1

# This uses a cool feature called f-strings. It lets us refer to specific variables, and print them, directly!
print(f"Row count is {row_count}")

# Note that the entire file is stored in memory, so we can do:
print (txt_gen[2][10])
# To quickly get a specific value


However, if this file was extremely long, then it might not be possible to hold the whole thing in memory (or you just might not want to be so inefficient with memory usage). So if we just wanted to do something that involves reading the file in a linear order (for example counting the number of entries, or lines, in the file, or counting how often 123 occurs in the file), we can instead use a generator. This looks something like

In [None]:
def txt_reader(file_name):
    # The with block closes the file once the generator has finished
    with open(file_name, "r") as file:
        for row in file:
            yield row

txt_gen = txt_reader("stm.txt")
row_count = 0

for row in txt_gen:
    row_count += 1

print(f"Row count is {row_count}")

# Note that the entire file is NOT stored in memory, so we CANNOT do:
# print (txt_gen[2][10])
# To quickly get a specific value
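
As a sketch of the "counting how often 123 occurs" idea mentioned above, the example below writes a small made-up file (standing in for `stm.txt`, whose contents are not assumed here) and counts the entries equal to `123` while only ever holding one line in memory. The file contents and the name `count_123` are invented for illustration.

```python
import os
import tempfile

def txt_reader(file_name):
    # Yield one line at a time, so only the current line is held in memory
    with open(file_name, "r") as file:
        for row in file:
            yield row

# Write a small made-up file to stand in for stm.txt
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("123 456 123\n789 123 0\n12 1234 5\n")
    tmp_name = tmp.name

# Count how many whitespace-separated entries are exactly "123"
count_123 = sum(row.split().count("123") for row in txt_reader(tmp_name))
print(f"123 occurs {count_123} times")

# Tidy up the temporary file
os.remove(tmp_name)
```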


So let's try and break down how this works. The magic here is the `yield` statement. Any function whose body contains a `yield` statement is a *generator function*: calling it does not run the body, but instead returns a generator object. When that object is used as an iterator (i.e. in a for loop), the body runs until it reaches `yield`, and the value after `yield` becomes the first value of the iterator. The function is then paused; it resumes from that point when the next value is requested and runs until `yield` is reached again, which gives the second value. When the function finishes without reaching another `yield`, the iteration stops. So as a simple example of generating integers between 1 and 1000 you can do:

In [None]:
def integer_generator():
    i = 1
    while i <= 1000:
        yield i
        i += 1

gen = integer_generator()

for curr_int in gen:
    print (curr_int)

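I emphasize that this is just a simple example; for such a case the `range` function should be used. (Strictly speaking `range` is not a generator but a lazy sequence type: like a generator it computes values on demand rather than storing them all, but it can also be indexed and looped over more than once.) But you can make this an infinite integer generator. The following will just keep generating integers until you terminate the process (kernel->interrupt will stop this):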

In [None]:
def integer_generator():
    i = 1
    while 1:
        yield i
        i += 1

gen = integer_generator()

for curr_int in gen:
    print (curr_int)
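
If you only want the first few values of an infinite generator, one option (an addition here, using `islice` from the standard library's `itertools` module) is to slice it lazily, so the loop ends by itself. A minimal sketch:

```python
from itertools import islice

def integer_generator():
    i = 1
    while True:
        yield i
        i += 1

# islice stops after 5 values, so the infinite generator is never exhausted
first_five = list(islice(integer_generator(), 5))
print(first_five)
```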

We can also use the built-in `next` function, which will simply obtain the next value from the generator. So in the previous case we can do:

In [None]:
def integer_generator():
    i = 1
    while 1:
        yield i
        i += 1

gen = integer_generator()

print (next(gen))
print (next(gen))
print (next(gen))
print (next(gen))
print (next(gen))
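
With a generator that does stop, `next` needs a little care: once the values run out it raises `StopIteration` (this is what a for loop uses internally to know when to finish). The sketch below uses a made-up three-value generator to show this, along with the optional default argument of `next`.

```python
def three_values():
    yield "a"
    yield "b"
    yield "c"

gen = three_values()
print(next(gen), next(gen), next(gen))

# The generator is now exhausted: a further next() raises StopIteration
try:
    next(gen)
except StopIteration:
    print("No more values")

# Supplying a default as the second argument avoids the exception
print(next(gen, "nothing left"))
```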

**EXERCISES** Let's try a couple of exercises at this stage.

* Write a generator that returns $x !$ (x-factorial) for x between 1 and 40.
* Write a generator that loops over values of $x^2$ while $x^2$ is smaller than 10000.


Let's add a bit more complexity to the problem now. Here's an example which will generate $N$ evenly spaced numbers between 0 and $\pi$ (inclusive). $N$ is supplied as an argument when creating the generator; because the step is computed as $\pi / (N-1)$, this needs $N \geq 2$.

In [None]:
import numpy as np

def gen_N_numbers(N):
    diff = np.pi / (N-1)
    for i in range(N):
        yield i * diff

for num in gen_N_numbers(10):
    print (num)

print()

for num in gen_N_numbers(20):
    print (num)


And here's a more complex example, which is really one of the main examples for why a generator would be used. In this case we consider the motion of some particle in the x direction. With each iteration the particle moves by 1 unit in the x direction *either* forwards or backwards, with a 50% chance of doing either. The iteration should stop when the particle has reached an x position of either +50 or -50.

This is a special example because it's an *evolving* system. The next point depends on the previous point and this information can be stored inside the generator. This is great if doing some quite complex random evolution problems (we'll explore one example at the end!)

In [None]:
# The random module is great for random problems. We explored random processes already
import random

def moving_particle():
    particle_position = 0
    
    while 1:
        forward_or_backwards = random.choice([-1,1])
        particle_position = particle_position + forward_or_backwards
        yield particle_position
        if abs(particle_position) == 50:
            break

particle_moves = []
for pos in moving_particle():
    particle_moves.append(pos)

print (len(particle_moves), particle_moves)
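
The same "state lives inside the generator" idea also works for deterministic systems, which makes it easier to check by hand. As a sketch, here is a generator for the logistic map $x_{n+1} = r x_n (1 - x_n)$; the function name `logistic_map` and its arguments `x0`, `r` and `n_steps` are made up for this illustration.

```python
def logistic_map(x0, r, n_steps):
    # The current state lives inside the generator between yields
    x = x0
    for _ in range(n_steps):
        yield x
        # The next value depends only on the current one
        x = r * x * (1 - x)

values = list(logistic_map(0.25, 2.0, 3))
print(values)
```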

**EXERCISES**

* Write a generator to return N random numbers (uniformly distributed between 0 and 1). N should be supplied to the generator.
* Write a generator to return the first N numbers in the Fibonacci sequence.

For a bit more detail on generators, and some more advanced functionality, I refer you again to this article:

https://realpython.com/introduction-to-python-generators/

## Recursion

Recursion is an important idea in many programming languages. This basically boils down to "some function that calls itself". A standard use case would be where the problem can be repeatedly reduced to a simpler problem of the same form. Some of the examples below might seem more natural using iteration (a for loop or generator), but they're useful practice for building to problems where recursion is more natural. As always, the idea here is to provide you with a toolkit; which tool to use for a given problem is something to consider when attempting it!

As an example of a recursive function (following https://www.programiz.com/python-programming/recursion) let's consider the case of generating X factorial. This can be defined, using a function that calls itself, as follows:

In [None]:
def calc_factorial(x):
    """This is a recursive function
    to find the factorial of an integer"""

    # Base case: 0! and 1! are both 1 (using <= also stops x = 0 from recursing forever)
    if x <= 1:
        return 1
    else:
        return (x * calc_factorial(x-1))

num = 4
print("The factorial of", num, "is", calc_factorial(num))

According to the page above, this form has a number of advantages, and disadvantages:

ADVANTAGES:
* Recursive functions make the code look clean and elegant.
* A complex task can be broken down into simpler sub-problems using recursion.
* Sequence generation is easier with recursion than using some nested iteration.

DISADVANTAGES
* Sometimes the logic behind recursion is hard to follow through.
* Recursive calls can be expensive (inefficient) as they take up a lot of memory and time.
* Recursive functions are hard to debug.
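
To illustrate the cost point, here is a sketch of the same factorial written with a loop (the name `calc_factorial_iterative` is made up here), checked against `math.factorial`. It also prints Python's recursion limit: a recursive function that calls itself more deeply than this raises a `RecursionError`, which the loop version never runs into.

```python
import math
import sys

def calc_factorial_iterative(x):
    # Same result as the recursive version, but using a loop
    result = 1
    for i in range(2, x + 1):
        result *= i
    return result

print(calc_factorial_iterative(10), math.factorial(10))

# Python limits how deeply functions may call themselves; a recursive
# function that goes deeper than this raises a RecursionError
print(sys.getrecursionlimit())
```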


Here's another example for computing the sum of numbers between 1 and N (thanks to https://realpython.com/python-thinking-recursively/):

In [None]:
def sum_recursive(current_number, highest_number, accumulated_sum=0):
    # Base case
    # Return the final state
    if current_number == highest_number:
        return accumulated_sum

    # Recursive case
    # Thread the state through the recursive call
    else:
        #return current_number + sum_recursive(current_number + 1, highest_number)
        return sum_recursive(current_number + 1, highest_number, accumulated_sum=(accumulated_sum + current_number))

# Sum of numbers between 1 and 10 (the upper limit, 11, is not included)
print (sum_recursive(1,11))

# Sum of numbers between 5 and 59 (the upper limit, 60, is not included)
print (sum_recursive(5,60))

The `sum_recursive` function tracks the current sum in the `accumulated_sum` variable. In this example this wasn't needed (it could have been done as with the previous example, see the commented out code), but in many cases this is the kind of approach that is needed to deal with a recursion problem. Brute-force path-finding is one example where you might want to do something like this.
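
A quick way to convince yourself of what `sum_recursive` computes is to compare it with the built-in `sum` over a `range`, which also excludes its upper limit. The function is repeated (without the commented-out line) so that this sketch runs on its own.

```python
def sum_recursive(current_number, highest_number, accumulated_sum=0):
    # Base case: the upper limit itself is not added
    if current_number == highest_number:
        return accumulated_sum
    # Recursive case: pass the running total on to the next call
    return sum_recursive(current_number + 1, highest_number,
                         accumulated_sum + current_number)

# range(a, b) also excludes b, so the two should agree
print(sum_recursive(1, 11), sum(range(1, 11)))
print(sum_recursive(5, 60), sum(range(5, 60)))
```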

**EXERCISES**

* Write a program to compute f(n)=f(n-1)+100 when n>0 and f(0)=1 with a given n input by console (n>0).
* Write a function to return the Nth Fibonacci number using recursion.

## Functions as values and the lambda operator

In our last topic of this lecture, let's explore the idea of functions as values themselves, and the lambda operator in Python.

Let's take a closer look at the `math.sin` function:

In [None]:
import math

a = math.sin
b = math.sin(3.0)

print(a, b)

The meaning of `b` is (I hope!) clear. It's a float storing the sin of 3.0 .... Or it's the result of *operating* the function `math.sin` on the float `3.0`. However, `a` is the function itself. So we can do:

In [None]:
a = math.sin
b = math.sin(3.0)
c = a(b)

print(a, b, c)

This can be useful if you want to write a function that will take as input a value (`x`), and a function (`F`), and returns `F(x)`:

In [None]:
def F_of_x(F, x):
    return F(x)

print(F_of_x(math.sin, 1.0), math.sin(1.0))
print(F_of_x(math.cos, 1.0), math.cos(1.0))
print(F_of_x(math.exp, 2.0), math.exp(2.0))
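
Because functions are values, they can also be stored in containers such as lists and dictionaries. A minimal sketch (the dictionary `functions` and its keys are made up for this example):

```python
import math

# Functions can be stored in a dictionary just like any other value
functions = {"sin": math.sin, "cos": math.cos, "exp": math.exp}

# Look each function up by name and call it on the same input
results = {name: func(0.0) for name, func in functions.items()}
print(results)
```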


One example of a function taking another (which we briefly touched on earlier in the course) is the `sorted` function. This can take a function, through its `key` argument, which determines the sort order. For example if I wanted to sort values between 0 and 10 according to the sin of that value I could do:

In [None]:
import numpy as np
inputs = np.linspace(0,10,100)
sorted_inps = sorted(inputs, key=math.sin)
print(sorted_inps)
# Checking that this worked as intended.
print ([math.sin(val) for val in sorted_inps])

The `lambda` operator can be used to quickly define simple functions. For example if we wanted to define a function that returns $x^2$, we could do:

In [None]:
def x_squared(x):
    return x * x

Or this could be written as a single line function according to:

In [None]:
x_squared_alt = lambda x : x*x

print(x_squared(10), x_squared_alt(10))

A standard use case for this is being able to write `sorted` commands which do not extend over multiple lines, for example we could do:

In [None]:
import numpy as np
inputs = np.linspace(-9.5,10,51)
sorted_inps = sorted(inputs, key=lambda x : (x-2)*(x-2))
print(sorted_inps)
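
The key function does not have to be numerical. As a sketch tying back to the names example from the list comprehension section (using a shortened copy of that list), a lambda can pick out the last name to sort by:

```python
names = ["Boris Johnson", "Theresa May", "David Cameron", "Gordon Brown", "Tony Blair"]

# The key function picks out the part of each string to sort by
by_last_name = sorted(names, key=lambda name: name.split(' ')[1])
print(by_last_name)
```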


**EXERCISE**

* Generate 100 numbers uniformly distributed between 5 and 50. Then sort the numbers by using the function $F(x) = x^3 - 50.5 x^2 - 34$. This means that after sorting the numbers into `array` if I did `array**3 - 50.5 * array**2 - 34` the output would be in ascending order. Do *not* store the values of F(x).

## Summary

We've introduced a number of different concepts at this point. Let me try and set some summary exercises to close things out. These will probably be quite challenging.

### Exercise 1

Write a generator with the following properties:

* The generator should model the motion of a particle in two directions (call them x and y).
* The initial position of the particle should be (0,0) (corresponding to the positions in x and y). This should be the first `yield` of this generator.
* Every subsequent `yield` should move the particle by a random amount in x and y. The magnitude moved should be drawn from a uniform distribution between 0 and 1 and the direction of this move should be drawn from a uniform angle between 0 and $2 \pi$.
* The generator should stop when the distance between the particle and the origin is at least 10.

Run this generator (use a list comprehension to run the generator and store the x,y positions of the particle over time).

Plot these positions in a 2D plot.

Run the generator some additional times and show these evolutions on the same plot.


### Exercise 2

(Courtesy of codesignal.com, https://app.codesignal.com/arcade/python-arcade/lambda-illusions/eP7hJDmLdZym2Kdo3)

"""
You've been preparing all night for the upcoming test and entered the class certain that you will ace it. Now that you received the test questions, you died inside a little: looks like you prepared for the test on a completely different topic.

You're not even sure if you should bother to answer the questions. You still have some hope though: it is known that there's a glitch in the test preparing system, so that if the sum of digits of question ids is divisible by k, the answer to each question has a 90% probability to be an A.

Given the list of question ids, determine if the sum of their digits is divisible by k to see if it's worth trying to pass the test.
"""

To solve this problem you must write a lambda function in the space indicated. You *must not* add any additional lines to solve this!


In [None]:
def isTestSolvable(ids, k):
    digitSum = # Add lambda function here. MAKE NO OTHER CHANGES!!

    sm = 0
    for questionId in ids:
        sm += digitSum(questionId)
    return sm % k == 0


In [None]:
ids = [529665, 909767, 644200]
k = 3
print (isTestSolvable(ids,k), "Should be True")

ids = [882144, 993441, 460418, 325830, 404529, 912233, 255818, 68407, 94032, 6801, 38227, 997782, 747063, 754688, 725338, 802267, 673468, 271162, 478014, 21599]
k = 6
print (isTestSolvable(ids,k), "Should be False")


### Exercise 3

(From https://www.w3resource.com/)

Write a Python program to calculate the value of 'a' to the power 'b'. Do not use the `**` operator, the `pow` function, `math.pow`, equivalent numpy functions or anything else that already does this for you.

Assume that b is an integer, and implement this using only the multiply `*` operator. Can you write a solution:

1. Using recursion
1. Not using recursion

Which do you prefer?

### Exercise 4

Given a string, write a function to return all unique palindromic (https://en.wikipedia.org/wiki/Palindrome) substrings (https://en.wikipedia.org/wiki/Substring) sorted according to the scrabble value of each of the substrings (if the scrabble score is equal, it should be alphabetically sorted). By this we mean if the string is "arsfgfsgh", the output should be `['a', 'r', 's', 'g', 'f', 'h', 'fgf', 'sfgfs']`. These are all the possible substrings of the input which are palindromes and they are ordered in increasing values of the scrabble score (a = 1, r = 1, s = 1, g = 2, f = 4, h = 4, fgf = 10, sfgfs = 12).

Some other examples and outputs:

* Input: 'cabca'; Output: `['a', 'b', 'c']`
* Input: 'ccccccccccc'; Output: `['c', 'cc', 'ccc', 'cccc', 'ccccc', 'cccccc', 'ccccccc', 'cccccccc', 'ccccccccc', 'cccccccccc']`
* Input: 'abacabaabacab'; Output: `['a', 'aa', 'b', 'c', 'aba', 'aca', 'baab', 'abaaba', 'bacab', 'abacaba', 'cabaabac', 'acabaabaca']`
* Input: 'zazazaza'; Output: `['a', 'z', 'aza', 'zaz', 'azaza', 'zazaz', 'zazazaz']`
* Input: 'shjzzovuzvabrcrfxemkhbiguanipxaxrnybexth'; Output: `['a', 'e', 'i', 'n', 'o', 'r', 's', 't', 'u', 'g', 'b', 'c', 'm', 'p', 'f', 'h', 'v', 'y', 'k', 'rcr', 'j', 'x', 'z', 'xax', 'zz']`

In [None]:
scrabble_score = {"a": 1 , "b": 3 , "c": 3 , "d": 2 ,
         "e": 1 , "f": 4 , "g": 2 , "h": 4 ,
         "i": 1 , "j": 8 , "k": 5 , "l": 1 ,
         "m": 3 , "n": 1 , "o": 1 , "p": 3 ,
         "q": 10, "r": 1 , "s": 1 , "t": 1 ,
         "u": 1 , "v": 4 , "w": 4 , "x": 8 ,
         "y": 4 , "z": 10}
