# Optimising Loops
Loops in Python are slow. Each iteration carries its own interpreter overhead and, more importantly, a loop causes a section of code to be repeated a potentially very large number of times. For this reason, loops are very often the first thing we look to optimise (assuming profiling suggests this is where the time is being spent).

When optimising loops, a general principle is to optimise the innermost loop first as its contents will always be carried out more times than any other.
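As a rough illustration of why the innermost loop matters most, the sketch below moves a calculation that does not depend on the inner loop variable out of the inner loop. The function names `slow` and `faster` are made up for this example.

```python
import math

def slow(n):
    total = 0.0
    for i in range(n):
        for j in range(n):
            # math.sqrt(i) does not depend on j, yet it is recomputed n times per i
            total += math.sqrt(i) * j
    return total

def faster(n):
    total = 0.0
    for i in range(n):
        root = math.sqrt(i)  # computed once per value of i
        for j in range(n):
            total += root * j
    return total

# Both versions should agree
print(math.isclose(slow(50), faster(50)))
```

The inner loop body runs `n * n` times, so anything removed from it is saved `n * n` times.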

If it's possible to eliminate a loop, this is almost always advantageous. This may be done by noticing the loop is expressible as a multiplication, arithmetic sum, geometric sum, etc.
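For instance, the sum of the integers from 0 to n - 1 is an arithmetic sum with a closed form, so the loop can be dropped altogether. The sketch below uses illustrative function names that do not appear elsewhere in this notebook.

```python
def sum_with_loop(n):
    # Adds up 0 + 1 + ... + (n - 1) one term at a time
    total = 0
    for i in range(n):
        total += i
    return total

def sum_closed_form(n):
    # Arithmetic sum: 0 + 1 + ... + (n - 1) = n * (n - 1) / 2
    return n * (n - 1) // 2

print(sum_with_loop(1000), sum_closed_form(1000))
```

The closed form takes the same time regardless of `n`, whereas the loop takes time proportional to `n`.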

## List comprehensions

One way to remove a loop is to replace it with a [list comprehension](https://www.datacamp.com/community/tutorials/python-list-comprehension). Due to time constraints, we're not going to offer a full discussion of list comprehensions here, but the example below should give you a rough idea of how to use one. This syntax populates a list without an explicit ```for``` loop and without a call to ```append``` on every iteration, which usually speeds up the code noticeably. For example, take the code:

In [None]:
%pip install line_profiler
%load_ext line_profiler

def make_list():
  #This function makes a list of a million elements with each being equal to the square of its index
  my_list = []

  for i in range(1000000):
    my_list.append(i**2)

  return my_list

%lprun -f make_list print(make_list()[-10:])

We can replace the ```for``` loop with a list comprehension:

In [None]:
%pip install line_profiler
%load_ext line_profiler

def make_list():
  #This function makes a list of a million elements with each being equal to the square of its index
  my_list = [i**2 for i in range(1000000)]

  return my_list

%lprun -f make_list print(make_list()[-10:])

This typically runs noticeably faster and is also more compact and arguably easier to read once you're familiar with the syntax.
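If you would like a quick comparison without the line profiler, the standard library's `timeit` module can be used. The sketch below is one possible way to do this; the function names are illustrative and the timings will vary from machine to machine.

```python
import timeit

def squares_loop(n):
    # Explicit loop with append
    result = []
    for i in range(n):
        result.append(i**2)
    return result

def squares_comprehension(n):
    # Equivalent list comprehension
    return [i**2 for i in range(n)]

# Check the two versions build the same list before timing them
assert squares_loop(1000) == squares_comprehension(1000)

# Total time for 10 calls of each version
loop_time = timeit.timeit(lambda: squares_loop(100_000), number=10)
comp_time = timeit.timeit(lambda: squares_comprehension(100_000), number=10)
print(f"loop: {loop_time:.3f} s, comprehension: {comp_time:.3f} s")
```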

### Converting a range to a list

If you want a list of ascending numbers, passing a ```range``` to ```list()``` can be even faster than a list comprehension.

In [None]:
%pip install line_profiler
%load_ext line_profiler

def make_list():
  # A list comprehension
  my_list = [i for i in range(100000)]

  # list(range()) syntax
  my_list = list(range(100000))

  return my_list

%lprun -f make_list print(make_list()[-10:])

## The ```map``` Function

The ```map``` function applies a specified function to every entry of an iterable (such as a ```list```) and returns an iterator over the results. This can then be converted to another iterable class, such as a ```list```. For instance:

In [None]:
import math

my_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print("my_list: ", my_list)

my_map = map(math.cos, my_list)

print("my_map: ", list(my_map))
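One detail worth knowing is that the object returned by `map` is evaluated lazily and can only be consumed once. The short sketch below illustrates this; the variable names are chosen for the example only.

```python
import math

values = [0, 1, 2]
cosines = map(math.cos, values)

first_pass = list(cosines)   # consumes the iterator
second_pass = list(cosines)  # nothing is left to consume

print(first_pass)
print(second_pass)
```

If the results are needed more than once, convert the `map` to a `list` and reuse the list.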

This is interesting from the perspective of performance as the loop is not explicitly represented in our code. Instead it is contained in the implementation of the ```map``` function (which is written in C). Because ```map``` fulfils a more specific role than a generic ```for``` loop, its loop can be implemented more efficiently and so is very often faster to execute.

For example, we can compare the following two pieces of code:

In [None]:
%pip install line_profiler
%load_ext line_profiler
import math

def log_list():

  my_list = list(range(1,1000000))

  result_list=[]
  for value in my_list:
    result_list.append(math.log(value))

  return result_list

%lprun -f log_list print(log_list()[-10:])

In [None]:
%pip install line_profiler
%load_ext line_profiler
import math

def log_list():

  my_list = list(range(1,1000000))

  return list(map(math.log, my_list))

%lprun -f log_list print(log_list()[-10:])

The second version typically runs in about half the time. Once you're familiar with the ```map``` function it's also about as readable as the ```for``` loop version.

The ```map``` function can also be applied to ```lambda``` functions (if you don't know what this means, don't worry) and, with a little [extra work](https://stackoverflow.com/questions/10834960/how-to-do-multiple-arguments-to-map-function-where-one-remains-the-same-in-pytho), can be applied to functions which take multiple arguments.
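As a minimal sketch of both ideas, the example below maps a `lambda`, passes two iterables to `map`, and uses `functools.partial` to hold one argument fixed. The helper `scale` is hypothetical and exists only for this illustration.

```python
from functools import partial

values = [1, 2, 3, 4]

# A lambda is a small unnamed function defined inline
doubled = list(map(lambda x: 2 * x, values))

# map accepts several iterables and passes one element from each per call
powers = list(map(pow, values, [2, 2, 2, 2]))

# partial fixes the first argument so that a single iterable is enough
def scale(factor, x):  # hypothetical helper
    return factor * x

tripled = list(map(partial(scale, 3), values))

print(doubled, powers, tripled)
```

Note that mapping a `lambda` still calls a Python-level function for every element, so it may not give the same speed-up as mapping a built-in such as `math.log`.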

It can also be applied to a ```range``` directly (which is actually faster than applying it to an equivalent list, as no intermediate list has to be built). Using the ```sum()``` function on a ```map``` will evaluate all the values in the map and return their sum:

In [None]:
%pip install line_profiler
%load_ext line_profiler
import math

def square_root_sum(n):
  # Two methods to calculate the sum of the square roots of all numbers up to n

  # Convert to a list
  result_1 = sum(map(math.sqrt, list(range(n))))

  # Map operates directly on a range
  result_2 = sum(map(math.sqrt, range(n)))

  # Both methods produce the same answer
  print(result_1, result_2)

%lprun -f square_root_sum square_root_sum(100000)

## Loop Control

It's possible to make a loop finish early using the ```break``` statement, which causes the innermost loop currently running to exit. This can be useful when your code is checking whether some condition holds for any element. By breaking out of the loop as soon as the final result has been determined, you avoid unnecessary executions of the loop body.

For example, consider the two versions of the code below, each of which determines whether a string contains a number:

In [None]:
%pip install line_profiler
%load_ext line_profiler

def contains_number(string):
  # Initially assume the string doesn't contain a number
  contains_number = False

  for letter in string:
    # Check if each character is a digit
    if letter in "1234567890":
      # If it is, reflect this by changing contains_number to True
      contains_number = True

  return contains_number

def get_random_string(length):
  # This function generates a random string
  # Don't worry about how it works
  import random
  import string
  random_string = ''.join([random.choice(string.ascii_letters + string.digits) for n in range(length)])
  return random_string

def check_strings(n_strings, length):
  # This function generates n_strings strings, each of length length and checks if each of them contains a number

  n_with_numbers = 0
  
  for i in range(n_strings):
    string = get_random_string(length)
    if contains_number(string):
      n_with_numbers = n_with_numbers + 1

  print(n_with_numbers)

%lprun -f contains_number check_strings(100000, 10)

In [None]:
%pip install line_profiler
%load_ext line_profiler

def contains_number(string):
  # Initially assume the string doesn't contain a number
  contains_number = False

  for letter in string:
    # Check if each character is a digit
    if letter in "1234567890":
      # If it is, reflect this by changing contains_number to True
      contains_number = True
      # The only change is to add "break" here so we stop checking characters after finding a number
      break

  return contains_number

def get_random_string(length):
  # This function generates a random string with a specified length
  # Don't worry about how it works
  import random
  import string
  random_string = ''.join([random.choice(string.ascii_letters + string.digits) for n in range(length)])
  return random_string

def check_strings(n_strings, length):
  # This function generates n_strings strings, each of length length and checks if each of them contains a number

  n_with_numbers = 0
  
  for i in range(n_strings):
    string = get_random_string(length)
    if contains_number(string):
      n_with_numbers = n_with_numbers + 1

  print(n_with_numbers)

%lprun -f contains_number check_strings(100000, 10)

Note that the number of times line 10 (the contents of the loop) is executed decreases significantly.
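The same early exit can be written in other ways. The sketch below shows two possible alternatives, neither of which is used in the cells above: returning from inside the loop, and the built-in `any()`, which stops at the first true value. The function names are chosen for this illustration.

```python
def contains_number_return(string):
    for letter in string:
        if letter in "1234567890":
            # Returning here leaves the loop (and the function) immediately
            return True
    return False

def contains_number_any(string):
    # any() stops consuming the generator as soon as one test is True
    return any(letter in "1234567890" for letter in string)

print(contains_number_return("abc1def"), contains_number_any("abc1def"))
print(contains_number_return("abcdef"), contains_number_any("abcdef"))
```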

## Exercise
Below is some code that uses three nested loops. Using the techniques described above, optimise the second copy of the code. Ensure the result remains the same to within 5 significant figures. Note that there are three sample solutions with progressively greater optimisation.

In [None]:
# The original version
%pip install line_profiler
%load_ext line_profiler
import math

def loopy_function():

  my_list=[]

  for i in range(100):
    my_list.append(i**2)

  result = 0

  for i in range(100):
    for j in range(100):
      temp_var = math.sqrt(j)
      for k in range(100):
        result = result + math.tan(my_list[i]) + k + temp_var

  return result

%lprun -f loopy_function print(loopy_function())

In [None]:
# Edit this version
%pip install line_profiler
%load_ext line_profiler
import math

def loopy_function():

  my_list=[]

  for i in range(100):
    my_list.append(i**2)

  result = 0

  for i in range(100):
    for j in range(100):
      temp_var = math.sqrt(j)
      for k in range(100):
        result = result + math.tan(my_list[i]) + k + temp_var

  return result

%lprun -f loopy_function print(loopy_function())
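One possible way to check that an optimised result agrees with the original is `math.isclose` with a relative tolerance. In the sketch below, the helper name `same_to_five_figures` and the choice of `rel_tol=1e-5` as a stand-in for "5 significant figures" are assumptions made for this example.

```python
import math

def same_to_five_figures(a, b):
    # rel_tol=1e-5 is used here as an approximation of
    # agreement to 5 significant figures (an assumption)
    return math.isclose(a, b, rel_tol=1e-5)

# Values differing only in the seventh significant figure
print(same_to_five_figures(123456.7, 123456.9))

# Values differing in the fourth significant figure
print(same_to_five_figures(123456.7, 123556.7))
```

You could pass the return values of the original and edited versions of the function to this helper after running both.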

In [None]:
#@title
# The first optimisation is to note that the inner loop always adds tan(my_list[i]) + temp_var 100 times, plus the sum of k from 0 to 99, which is (99+0)*100/2=4950
# This optimisation should occur first as it's in an inner loop and, as shown by the profiling, takes up most of the time
# We see this immediately reduces the time taken for the function to run by a factor of ~100 as we've eliminated the innermost loop
%pip install line_profiler
%load_ext line_profiler
import math

def loopy_function():

  my_list=[]

  for i in range(100):
    my_list.append(i**2)

  result = 0

  for i in range(100):
    for j in range(100):
      temp_var = math.sqrt(j)
      result = result + 100 * (math.tan(my_list[i]) + temp_var) + 4950

  return result

%lprun -f loopy_function print(loopy_function())

In [None]:
#@title
# The second optimisation is to note that we can replace the sum of temp_var over the loop over j with a map function which we then take the sum of
# In addition, we can move this map outside of the outer loop entirely as it does not depend on i
# The loop over j then disappears, as the remaining terms are simply multiplied by 100
%pip install line_profiler
%load_ext line_profiler
import math

def loopy_function():

  my_list=[]

  for i in range(100):
    my_list.append(i**2)

  result = 0

  for i in range(100):
    result = result + 10000 * (math.tan(my_list[i])) + 495000

  result = result + 10000 * sum(map(math.sqrt, range(100)))

  return result

%lprun -f loopy_function print(loopy_function())

In [None]:
#@title
# The third optimisation is to combine the two remaining loops and replace them with a map function that we take the sum of
# We use a list comprehension to form the list passed to the map
# The resultant function is approximately 10,000 times faster than the function we began with
%pip install line_profiler
%load_ext line_profiler
import math

def loopy_function():
  result = 10000 * sum(map(math.tan, [i ** 2 for i in range(100)])) + 49500000

  result = result + 10000 * sum(map(math.sqrt, range(100)))

  return result

%lprun -f loopy_function print(loopy_function())