# Session 2: Algorithms and functions

*Data Structures and Algorithms*

*Achyuthuni Sri Harsha*

------------------------------------------------------------------------

In this session, we will learn to build our first algorithms and
organize our code into building blocks using *functions*.

------------------------------------------------------------------------

## Preparation

***Readings:***

Guttag: Chapters 3.1, 3.3, 4.0-4.2.

***OR***

Sweigart, Al. Automate the Boring Stuff with Python.

-   Chapter 3 – Functions <https://automatetheboringstuff.com/chapter3/>

***Optional Readings:***

Ford, Paul. What is code?

-   <https://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/>

**Questions:**

Please read the readings above and think about how you would explain to
your classmates:

-   What is a function?
-   Why do we use functions?
-   What does it mean to "call a function"?

------------------------------------------------------------------------

## Recap

In Session 1, we learned how to execute statements such as arithmetic operations, store the values of our calculations, and use conditional statements to make decisions in our programs.

## Functions

We've now introduced many key structures of programming: calculations, variable assignment, conditions, and loop structures. These are enough for us to start solving computational problems, but before we do, let us introduce another useful concept: functions.

We use algorithms all the time in our everyday lives, and often trust someone else or a service to run them for us. For example, we give a destination to a mapping service and get the shortest route. Or we phone in an order to a restaurant and get a fresh pizza. There is a "map algorithm" or "pizza algorithm" behind the scenes, but we don't know exactly how it works; we simply trust that others have designed it to work.

Similarly, in Python, we have started using code that others have written and that we trust to work. In the first session, we used commands like:


In [35]:
abs(-5)
max(6, 9)

9

to calculate the absolute value of a number and the maximum of two
numbers. `abs` and `max` are called *functions*, and they come built in
with the Python language. We've used them without thinking about how the
program for calculating the maximum has been implemented. This
*abstraction* is both convenient and powerful: we can simply use the
functions to build more interesting calculations without worrying about
the details behind the calculations.

We'll soon start defining our own functions to similarly package our
algorithms for convenient use. But first, let's try to understand how
the above statements work. Each of the two statements is a *function
call*, which is always of the form

    function_name(arguments)

An *argument* is an expression within the parentheses of the function
call. There can be several arguments, separated by a comma. When a
function gets called, the argument expressions are evaluated one at a
time, from left to right. The resulting values are passed to the
function. The function is then executed and produces an output value.
For example, the following code calculates the absolute value of a
temperature difference.


In [36]:
temp_morning = 5
temp_afternoon = 10
abs(temp_morning - temp_afternoon)

5

The argument expression `temp_morning - temp_afternoon` is evaluated
first, producing the value `-5`, which gets passed to `abs`, resulting
in the absolute value `5`.

Function calls can also be nested, for example:

In [37]:
min(abs(-1), max(3, 0))

1

Here Python will start evaluating the arguments from left to right.
First, `abs(-1)` evaluates to `1`, and then `max(3, 0)` evaluates to
`3`. These then get passed to `min`. The above function call is
therefore equivalent to evaluating `min(1, 3)`.
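To make the evaluation order concrete, the nested call above can be unpacked into separate steps. This is only an illustrative sketch; the variable names `left_value`, `right_value`, and `smallest` are made up for this example.

```python
# Step 1: the leftmost argument expression is evaluated first
left_value = abs(-1)

# Step 2: then the next argument expression is evaluated
right_value = max(3, 0)

# Step 3: the resulting values are passed to the outer function
smallest = min(left_value, right_value)

# The unpacked version gives the same result as the nested call
print(smallest)
print(min(abs(-1), max(3, 0)))
```

Both `print` statements display the same value, since the nested call performs exactly these steps internally.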

If you don't know exactly what a function does, you can type
`help(function_name)` into the console.

In [38]:
help(abs)

Help on built-in function abs in module builtins:

abs(x, /)
    Return the absolute value of the argument.



This will provide documentation for the function.

But how do these calculations work, and how can we define functions for
our own algorithms?

### Defining a function

Python's built-in functions, such as `print`, `type` and `abs`, are
useful but limited: often there aren't built-in functions for what we
want to do. Now we'll learn how to add to these by defining our own
functions.

We'd like to be able to do something like this.

    In[1]: miles_to_km(30)

However, what we get is an error. Python tells us that there is no
function with the name `miles_to_km`.

    NameError: name 'miles_to_km' is not defined

We'll need to *define* a new function that tells Python what to do when
we call `miles_to_km`. Here's how we can do it.

In [39]:
def miles_to_km(miles):
    return miles * 1.609

The first line is the function header. It consists of the keyword `def`,
which tells Python that we're defining a function. It is followed by the
function name `miles_to_km`, and a comma-separated list of parameters
within parentheses, and finally a colon. Here the only parameter is
`miles`. Defining a function is similar to defining a variable: with the
definition, Python knows that the name `miles_to_km` exists and points
to this piece of code.

After the colon, we have a sequence of statements that form the function
body. These statements are executed whenever we call the function. The
body must be indented relative to the function header. By convention,
the indentation is exactly four spaces, which you'll get in Spyder with
the Tab key.

Now, if we run this code and then call the function, we get a result.

In [40]:
def miles_to_km(miles):
    return miles * 1.609

x = miles_to_km(30)

What happens when we run the code? First, Python executes the function
definition, which creates a new function object. Next, Python executes
the function call `miles_to_km(30)`. The argument `30` is assigned to
the parameter variable `miles` within this function call. Python then
executes the `return` statement. It evaluates the expression
`miles * 1.609`. The keyword `return` then tells Python that this is the
value we want to produce as output of the function. With the `return`,
the function call finishes, and Python literally returns to the place in
our program where the function was called. Here the resulting value gets
assigned to the variable `x`.

Now it is convenient to call the function many times over with different
arguments.

In [41]:
miles_to_km(10)
miles_to_km(20)
miles_to_km(45)

72.405

In general, a function is defined and called as follows.

    def function_name(parameters):
        statement
        statement

    function_name(arguments)

The `def` keyword starts the function definition, followed by the name
we give the function, parameters within parentheses, and a colon. The
parameters are variables that the function uses that we may wish to vary
from the outside, such as the variable `miles` above. The block of
statements that follows is indented, and gets executed when we call the
function. The call is always by the name, followed by arguments. These
arguments are then assigned to the parameter variables of the function.

In brief, a function allows us to give a name to a block of code. We can
then run that block from anywhere in our program, perhaps many times
over.
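As a small sketch of the general form with more than one parameter, consider the following function. The name `rectangle_area` and its parameters `width` and `height` are hypothetical, chosen only for illustration.

```python
def rectangle_area(width, height):
    # width and height are parameters: local names for the values passed in
    area = width * height
    return area

# Arguments are matched to parameters by position: 2 -> width, 3 -> height
small = rectangle_area(2, 3)

# Argument expressions are evaluated before the call: 2 + 3 becomes 5
large = rectangle_area(2 + 3, 4)

print(small)
print(large)
```

The same block of code runs twice here, each time with different values assigned to `width` and `height`.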

### Printing and returning

Sometimes we don't need our function to return a value. For example, we
might just want a function to display some text in the console. We can
write the function as follows, with nothing following the return
statement. When we call the function and the code execution hits the
return statement, Python simply returns to the original line of code.


In [42]:
def print_compliment(name):
    print('Great job, ' + name + '!')
    return

print_compliment('John')

Great job, John!


If we now assign the result of the function call to a variable, the
value is the special value `None`, which indicates an "empty" or "null"
value.

In [43]:
x = print_compliment('John')
print(x)

Great job, John!
None


Since the absence of a return value is fairly common, Python allows us
to omit the return statement altogether. If a function finishes without
a return, the code execution still returns to the point where the
function was called, with the value `None`. So, the below code works
exactly like the code above.

In [44]:
def print_compliment(name):
    print('Great job, ' + name + '!')

print_compliment('John')

Great job, John!


Note that the text now only exists within the function call. If we want
to use it outside the function, we need to return the text. Then we can
use it multiple times without always calling the function.

In [45]:
def print_compliment(name):
    return 'Great job, ' + name + '!'

text = print_compliment('John')
print(text)
print(text)

Great job, John!
Great job, John!
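The difference between printing and returning matters most when we want to keep computing with a result. The following sketch contrasts the two; the function names `show_km` and `to_km` are made up for this example and reuse the conversion factor from `miles_to_km`.

```python
def show_km(miles):
    # Displays the result, but hands nothing back to the caller
    print(miles * 1.609)

def to_km(miles):
    # Hands the result back so the caller can keep working with it
    return miles * 1.609

shown = show_km(10)            # prints the distance; shown is None
total = to_km(10) + to_km(20)  # returned values can be combined

print(shown)
print(total)
```

Trying to compute `show_km(10) + show_km(20)` would fail with a `TypeError`, because both calls produce `None`.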


Note that a function does not need to have parameters. For example, we
could define a function to print compliments to a specific person, as
follows.

In [46]:
def print_compliment_donald():
    print('Well done, Donald!')
    print('Great job!')

print_compliment_donald()

Well done, Donald!
Great job!


### A common error

Just like a variable, a function must be defined before we can call it.
The following code will result in an error if the function
`print_personal_compliment` has not been previously defined.

In [47]:
print_personal_compliment('Donald')

def print_personal_compliment(name):
    print('Well done, ' + name + '!') # insert the name into the phrase

NameError: name 'print_personal_compliment' is not defined
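A likely fix, sketched below, is simply to reorder the code so that the definition is executed before the call.

```python
# Define the function first...
def print_personal_compliment(name):
    print('Well done, ' + name + '!')  # insert the name into the phrase

# ...and only then call it
print_personal_compliment('Donald')
```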

### Local variables

Let's look at another function example. We'll replicate what the
built-in function `max` might do. Here's how it might work.

In [3]:
def max_value(a, b):
    if a > b:
        return a
    else: 
        return b

The function takes two arguments, and compares their values, returning
the larger one.

Now let's call this function. Keep track of the values of the different
variables when we run the following code.

In [4]:
c = max_value(3, 5)
print(c)
print(a)

5


NameError: name 'a' is not defined

What would you expect to be printed out?

Here's a
[visualization](http://pythontutor.com/visualize.html#code=def%20max_value%28a,%20b%29%3A%0A%20%20%20%20if%20a%20%3E%20b%3A%0A%20%20%20%20%20%20%20%20return%20a%0A%20%20%20%20else%3A%20%0A%20%20%20%20%20%20%20%20return%20b%0A%0Ac%20%3D%20max_value%283,%205%29%0Aprint%28c%29%0Aprint%28a%29&cumulative=false&curInstr=0&heapPrimitives=false&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)
of the above code executing.

Let's walk through the code execution, which goes roughly as follows:

1.  We define the function, then call it with the arguments 3 and 5.
2.  Python finds the function and starts executing it.
3.  First, it assigns the two arguments to the parameters `a` and `b`.
4.  It then executes the statements and returns a value. But any
    variables that are created within the function are *local*
    variables, including the parameter variables `a` and `b`. These
    variables only exist when the function is being executed, and
    disappear when we return from the function call.
5.  When we try to use a local variable from outside the function call,
    we therefore get a `NameError`.

The part of a program where we can use a variable is called its *scope*.
The scope of a variable defined within a function is the function call.

Here's another example:

In [5]:
b = 4
c = max_value(b, 1)
print(c)
print(b)
print(a)

4
4


NameError: name 'a' is not defined

What do you think Python would display here?

Let's again step through the code.

1.  We assign 4 to `b`.
2.  We then call the function `max_value` with the arguments `b` and 1.
3.  Python starts executing the function by creating the local variables
    `a` and `b`. These are in the local scope, or what is sometimes
    called the namespace. Since it is local, it does not overwrite
    variable values outside its scope. In this local frame, `a` is 4 and
    `b` is 1, and we return the maximum of those.
4.  When we return from the function call, the local namespace or frame
    disappears, and we're left with the variables we had outside the
    function.
5.  So, after we return, `b` is still equal to 4 and `a` does not exist.

Generally, we can think about local variables and scoping as follows.

1.  At the global level, in the Python interpreter, there is a "frame"
    (or "table") of names keeping track of all names defined at that
    level and their assigned values.
2.  When a function is called, a new frame is created to keep track of
    all names within that function call. This could result in many such
    tables, as a function could call another function that could call
    another function, etc.
3.  When a function call is completed, its frame disappears. Some values
    may be returned to the frame from which the function was called.

Why does Python work like this? One good reason is **abstraction**. If
the variables we declare inside the function did affect outside
variables, we would have to be careful not to use the same variable
names inside functions. But as it is, we don't have to worry about how
the code has been written inside the function. This allows us to simply
take a function as a building block for more complex programs,
abstracting from how it's been implemented.
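The following sketch shows this separation in action. The function `double` is a hypothetical example; it deliberately uses a local variable with the same name as a variable outside the function.

```python
def double(value):
    result = value * 2   # this 'result' is local to the function call
    return result

result = 100             # a separate 'result' in the global frame
doubled = double(5)

print(doubled)
print(result)            # the global 'result' is unchanged by the call
```

The assignment inside `double` creates a name in the function call's own frame, so the outer `result` keeps its value.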

Let's practice these ideas with some exercises.

**The folder `ses02` contains a Python script file `ses02.py`. Open this
file in Spyder.** Select `File -> Open` in Spyder's menu and navigate to
the folder. The file contains "skeleton" implementations of the
functions you'll define. Your task is to complete the functions.

**Please do not change the file name from `ses02.py`**. When you later
test your code, it will always test the contents of `ses02.py`
in the folder `ses02`.

When you open the file, you should see something like this.

The function `sum_of_squares` contains first a string explaining what
the function does (or should do!), and giving some examples of its
output for different inputs. This is called a *documentation string*, or
*docstring* for short. When writing functions, including a docstring is
good practice as it makes the purpose and workings of the function clear
to whoever is using your code.


When working on a function, you can test your implementation in Spyder's
console. For example, when you edit the code of `sum_of_squares`, first
run the file `ses02.py` using the green play button. This makes sure
your changes to the function are available to be called.

Then copy or type into the console a function call of the first example:

In [6]:
from ses02 import sum_of_squares
sum_of_squares(1, 2)

5

If you haven't edited the function yet, the result will not be correct:
the output should be 5, as shown above. After you edit the code, the
output for each example should be the one indicated in the documentation
string.

Now edit the code in the function `sum_of_squares` to work as per the
function documentation. Note that **whenever you edit the function code,
you must run it again for Python to take into account your changes.**

When you have completed the code and tested it in Spyder, use `ok` to
test your function more comprehensively. **First save your work in
Spyder.** Otherwise `ok` won't see the changes you've made.

Next, complete the function `print_grade` in `ses02.py` as per the
instructions in the code below.

In [7]:
def print_grade(mark, grade_high, grade_low):
    """
    Prints out distinction, pass, or fail depending on mark

    If mark is at least grade_high, prints 'distinction'
    Else if mark is at least grade_low, prints 'pass'
    Else prints 'fail'

    Example use:
    >>> grade_high = 70 
    >>> grade_low = 50
    >>> print_grade(20, grade_high, grade_low)
    fail
    >>> print_grade(61, 70, 50)
    pass
    >>> print_grade(90, 80, 60)
    distinction
    """
    # DON'T CHANGE ANYTHING ABOVE
    # YOUR CODE HERE
help(print_grade)

Help on function print_grade in module __main__:

print_grade(mark, grade_high, grade_low)
    Prints out distinction, pass, or fail depending on mark
    
    If mark is at least grade_high, prints 'distinction'
    Else if mark is at least grade_low, prints 'pass'
    Else prints 'fail'
    
    Example use:
    >>> grade_high = 70 
    >>> grade_low = 50
    >>> print_grade(20, grade_high, grade_low)
    fail
    >>> print_grade(61, 70, 50)
    pass
    >>> print_grade(90, 80, 60)
    distinction



### Function recap

Functions are very useful because they

-   Allow us to write a set of commands once and execute them whenever
    needed. This means that we can save a lot of time and effort when
    writing larger programs.
-   Create useful abstractions. Once the function works, we don't care
    about the code inside the function anymore - we just use it by
    calling it. This means that we can write small parts of programs and
    use them to create larger programs. We have been using this idea
    every time we use Python's built-in functions like `print`: we've
    never actually looked at the code behind the function. We've been
    able to assume it works correctly, and use it to build more
    complicated programs.
-   Make it easier to change our code, by being modular. Whenever we
    want to change our code, we need to do it in one place only,
    and it will work similarly every time the function is called.
    Without functions, we would have to make the same changes in
    multiple places in our programs, making us more vulnerable to bugs
    and mistakes.

### Designing functions

Functions are key components of our programs. They typically solve a
specific problem, and can then be used by other parts of our program.
When starting to write a function, we find it useful to start by stating
what we want the function to do. This could be, for example, "convert
temperature from Fahrenheit to Celsius to two decimals". We'd like to
give the function a name that gives a clear indication of this purpose,
and a comment which states this more clearly. So we start out writing:

In [8]:
def fahrenheit_to_celsius(fahrenheit):
    """
    Converts temperature from Fahrenheit to Celsius to two decimals.
    """
    pass

help(fahrenheit_to_celsius)

Help on function fahrenheit_to_celsius in module __main__:

fahrenheit_to_celsius(fahrenheit)
    Converts temperature from Fahrenheit to Celsius to two decimals.



Here, `pass` is a keyword used as a placeholder for code we haven't
written yet. As we noted above, the string within triple quotes
directly following the function header is called a *docstring*.
Docstrings are commonly used to describe functions to make them
convenient for others to use. Using the command

    help(fahrenheit_to_celsius)

will print out this docstring, telling a prospective user what the
function does.

The next step is determining what the function should do, i.e. what its
output should be for different inputs. If we look up the conversion
formula, we can use it to calculate what the result should be, for
example

    >>> fahrenheit_to_celsius(10)
    -12.22
    >>> fahrenheit_to_celsius(-20)
    -28.89

It's useful to add these code examples to the docstring to make the
output clear to users. We also often want to clarify the types of input
and output of the function, as follows:

In [9]:
def fahrenheit_to_celsius(fahrenheit):
    """
    Converts temperature from Fahrenheit to Celsius to two decimals.

    Parameters:
    fahrenheit: temperature in fahrenheit

    Returns:
    temperature in celsius

    Example use:
    >>> fahrenheit_to_celsius(10)
    -12.22
    >>> fahrenheit_to_celsius(-20)
    -28.89
    """
    pass

Now, after specifying what the function *should* do, we are ready to
write the program to give us the desired result.

In [10]:
def fahrenheit_to_celsius(fahrenheit):
    """
    Converts temperature from Fahrenheit to Celsius to two decimals.

    Parameters:
    fahrenheit: temperature in fahrenheit

    Returns:
    temperature in celsius

    Example use:
    >>> fahrenheit_to_celsius(10)
    -12.22
    >>> fahrenheit_to_celsius(-20)
    -28.89
    """
    celsius = (fahrenheit - 32)*5/9
    return round(celsius, 2)

fahrenheit_to_celsius(-20)

-28.89
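Docstring examples written in the `>>>` style can also be checked automatically with Python's standard `doctest` module. The sketch below shows one way to do this for a single function; it is an illustration only, and the course's own `ok` tool may work differently.

```python
import doctest

def fahrenheit_to_celsius(fahrenheit):
    """
    Converts temperature from Fahrenheit to Celsius to two decimals.

    Example use:
    >>> fahrenheit_to_celsius(10)
    -12.22
    >>> fahrenheit_to_celsius(-20)
    -28.89
    """
    celsius = (fahrenheit - 32) * 5 / 9
    return round(celsius, 2)

# Runs the examples in the docstring and compares actual with expected output.
# Nothing is printed when every example passes; mismatches are reported.
doctest.run_docstring_examples(
    fahrenheit_to_celsius,
    {'fahrenheit_to_celsius': fahrenheit_to_celsius},
    name='fahrenheit_to_celsius',
)
```

Keeping the examples in the docstring means the documentation and the checks cannot drift apart unnoticed.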

In practice, the process of solving computational and data problems is
exploratory, messy, and highly nonlinear. It therefore often involves
playing around with code to figure things out before the solution gets
organized into functions. But spending some extra time organizing
your code into functions and documenting it can be very valuable when
you share it with others, or return to it later.

The larger the programming project, the more important this organization
and documentation becomes.

## Homework exercises

### Middle of three

Complete the function `middle_of_three` in the file `ses02_extra.py`.

### Sum up to

Complete the function `sum_up_to` in the file `ses02_extra.py`. 

### Faster than Heron?

During the lecture, we used Heron's square root algorithm to find square
roots. It looked like in that case it was quite quick to converge to the
solution. But how quick is it actually?

In `ses02_extra.py`, you will find two algorithms for calculating the
square root of a number: Heron's algorithm and bisection search. Your
task is to compare how quickly they find the solution.

First, update Heron's algorithm so that it counts the number of
iterations it does, and returns the count along with the square root
value. You'll need to add a variable that increases in value at each
iteration, and return it together with the result.

In [11]:
def square_root_heron(x, epsilon=0.01):
    """
    Find square root using Heron's algorithm

    Parameters:
    x: integer or float
    epsilon: desired precision,
        default value epsilon = 0.01 if not specified

    Returns:
        the square root value, rounded to two decimals,
        the number of iterations the algorithm ran

    Example use:
    >>> y, c = square_root_heron(20)
    >>> print(y, c)
    4.47 4
    """
    # DON'T CHANGE ANYTHING ABOVE
    # UPDATE CODE BELOW THIS

    guess = x/2 # Make initial guess
    # Loop until squared value of guess is close to x
    while abs(guess**2 - x) >= epsilon:
        guess = (guess + x/guess)/2 # Update guess using Heron's formula
    return round(guess, 2), ...
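The pattern of counting iterations and returning the count together with a result can be illustrated on a different, simpler loop. The function `count_halvings` below is a hypothetical example, not part of `ses02_extra.py`; it shows a counter variable and a `return` with two comma-separated values, which the caller unpacks into two variables.

```python
def count_halvings(x):
    count = 0              # counter starts at zero before the loop
    while x > 1:
        x = x / 2          # the actual work of the loop
        count = count + 1  # one more iteration completed
    return x, count        # two values, separated by a comma

# The two returned values are unpacked into two variables
value, steps = count_halvings(20)
print(value, steps)
```

Starting from 20, the loop halves the value five times before it drops to 1 or below.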

Next, you'll compare Heron's approach with another classic algorithm called bisection search.

The idea is simple: we know that the square root of a number $x$ is somewhere between $0$ and $x$. We make a guess: the square root is the midpoint of this range. Like with Heron, if the guess is close enough, we stop. If the guess is too low (guess squared is lower than $x$), we know that the square root is not in the bottom half, so we need only search the top half. Otherwise, we similarly discard the top half. We then pick the midpoint of the updated range as a new guess. We repeat this until we're close enough. The procedure is called *bisection search* because it halves the search area at each iteration.

For example, to find the square root of 10, our first guess in the search range $[0,10]$ would be $(10+0)/2=5$. Since this guess is too high ($5^2 > 10$), we would discard everything above $5$ and repeat the search in the range $[0,5]$. Our new guess would then be $2.5$...
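The first few steps of this example can be traced by hand in Python. This is only a manual walk-through of the numbers above, written without a loop; the variable names `low`, `high`, and `guess` are chosen for illustration.

```python
x = 10
low = 0.0
high = 10.0

guess = (low + high) / 2     # first guess: midpoint of [0, 10]
print(guess, guess**2 > x)   # 5.0 is too high, since 25 > 10

high = guess                 # discard everything above 5
guess = (low + high) / 2     # second guess: midpoint of [0, 5]
print(guess, guess**2 < x)   # 2.5 is too low, since 6.25 < 10

low = guess                  # discard everything below 2.5
guess = (low + high) / 2     # third guess: midpoint of [2.5, 5]
print(guess)
```

Each step halves the width of the search range: from 10 to 5 to 2.5.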

The algorithm is thus as follows:

-   Start with a guess $g$ as average of search range $low=0$ and
    $high=\max\{1.0, x\}$
-   If $g \cdot g$ is close to $x$, stop and return $g$ as the
    answer
-   Otherwise, if $g \cdot g < x$, update search range: $low = g$
-   Otherwise, if $g \cdot g \geq x$, update search range: $high = g$
-   Make new guess as average of updated search range
-   Repeat process using new guess until close enough

Let's think about these steps in terms of how we could translate them
into a while loop. After the initial guess, we have actions that are
repeated as long as a condition is met. We can write this as a while
loop as follows.

-   Start with a guess $g$ as average of search range $low=0$ and
    $high=\max\{1.0, x\}$
-   While $g \cdot g$ is not "close" to $x$:
    -   If $g \cdot g < x$, update search range: $low = g$
    -   If $g \cdot g \geq x$, update search range: $high = g$
    -   Make new guess as average of updated search range
        $\{low, high\}$
-   When the while loop finishes, return the final guess

Your task now is to complete the Python function in the file to match
these steps.

In [12]:
def square_root_bisection(x, epsilon=0.01):
    """
    Find square root using bisection search

    Parameters:
    x: integer or float
    epsilon: desired precision, 
        default value epsilon = 0.01 if not specified

    Returns:
    the square root value, rounded to two decimals,
    the number of iterations of the algorithm run

    Example use:
    >>> y, c = square_root_bisection(20)
    >>> print(y, c)
    4.47 9
    """
    # DON'T CHANGE ANYTHING ABOVE
    # UPDATE CODE BELOW THIS

    low = 0.0
    high = max(1.0, x) # Why are we doing this? What would happen for x=0.5?
    guess = (low + high)/2 # First guess at midpoint of low and high
    while abs(guess**2 - x) >= epsilon:
        if guess**2 < x:
            low = ... # update low
        else:
            high = ... # update high
        guess = ... # new guess at midpoint of low and high
    return ..., ...  

How do the algorithms compare in general for different values? Why do
you think that is the case?

## Question: Guess the number game

This exercise brings together many of the programming concepts from the first two sessions (with a couple of new tricks too!).

Starting from the code below, write a *Guess the number* game. Here's how it works. The computer draws a random (integer) number (say, between
1 and 100), and prompts the user to guess the number. If the guess is correct, the program ends. Otherwise, the program prints out "Too low"
or "Too high", and prompts the user again.

For example, it might work like this (for a particularly lucky guesser!):

    >>> guess_the_number(1, 100)

    Guess a number between 1 and 100: 30

    Too high!

    Guess a number between 1 and 100: 10

    Too low!

    Guess a number between 1 and 100: 13

    Correct!

Create a new file in Spyder and save it as `guess_the_number.py`. In the file, complete this function to implement the game:

In [13]:
def guess_the_number(min_value, max_value):
    """
    User must guess the value of a number between min_value and max_value

    Parameters:
        min_value, max_value (integers)
    """
    # YOUR CODE HERE

To complete the function, you'll need to draw a random number and ask
for user input. Here's how you can do it:

In [14]:
# We import a library to draw pseudorandom numbers. 
# We'll learn more about libraries later in the module. 
import random  

# draw random integer between 1 and 100
x = random.randint(1, 100)

In [15]:
# we can get user input using the input function
name = input('What is your name?')

What is your name?Harsha
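Two details are worth keeping in mind when combining these tools. First, `input` always returns a string, so a typed number has to be converted with `int` before it can be compared with an integer. Second, `random.randint(a, b)` can return either endpoint. The sketch below illustrates both; the variable `typed` stands in for a value that `input` would return, so that the example runs without waiting for the keyboard.

```python
import random

typed = '30'                     # stands in for a value returned by input()
guess = int(typed)               # convert the text to an integer

secret = random.randint(1, 100)  # both 1 and 100 are possible values

print(type(typed))               # a string
print(type(guess))               # an integer
print(1 <= secret <= 100)        # always within the requested range
```

Note that `int` raises a `ValueError` if the text is not a whole number, for example `int('abc')`.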


Extra part: what is a good strategy for guessing? Make the program count
and return how many attempts you needed, and try a few times for numbers
between 1 and 100. How would this change for 1000? What about 1 000 000?