# Introduction: Algorithms

Module Algorithms and Data Structures | Chapter 1 | Notebook 1

***
Welcome to your first exercise in the Algorithms and Data Structures module! In this exercise you will get to know the recursive implementation of algorithms.
By the end of this exercise you will be able to:
* Tell the difference between recursive and iterative algorithms,
* Describe when recursive algorithms are advantageous,
* Implement the binary search algorithm recursively.

***


## Linear search algorithm


Let's go back to our book search example from the text lesson. 
Our algorithm was described as follows: 

1)	Start searching on the far left of the shelf.  
2)	If there is no book at the position, end the search and report that the title you are looking for was not found.  
3)	If a book is available, look at the title of the book. If the title matches what you are looking for, report the book and its position and end the search.
4)	Move one position to the right and repeat all steps from step 2.


We saw that this algorithm has a time complexity of $O(n)$ and that this is a *linear search algorithm*. We also discussed that there cannot be a faster algorithm as long as the library is unsorted. The situation is different if our library is already sorted. The faster algorithm you learned about in this context is the *binary search algorithm*. We will implement it later in this exercise.


We can translate our linear search algorithm into code. When implementing functions or algorithms, we often have two conceptual approaches at our disposal: iterative and recursive.

* With the **iterative** method, we look at the problem "from the beginning". We use loop structures such as `for` or `while` to search through the data. These structures are generally intuitive and things we are familiar with.

* With the **recursive** method, a function calls itself with modified arguments until a base condition is met that ends the recursive call. Recursion can often be more elegant and shorter to write. However, the approach takes some getting used to and requires a little practice. 

For our linear search algorithm, we will look at the iterative version first. Conceptually, we will go through our data list and check each element until we either find the element we are looking for or realize that it is not in the list.


##### <font color="#17415F">Code example I</font>
> The `linear_search_iter()` function is an example of an iterative implementation of our linear search algorithm.


In [None]:
def linear_search_iter(title, library):
    """
    Return the book title and its position in library, or (None, -1) if the title is not found.

    Args:
        title (str): book title to be searched for.
        library (list): list of book titles in the library.

    Returns:
        tuple: (title, index position in library), or (None, -1) if the title was not found.

    """

    pos = 0
    while pos < len(library):  # an empty library skips the loop entirely
        if title == library[pos]:
            return (title, pos)
        pos += 1
    return (None, -1)

In this implementation, we use a `while` loop to search all entries of `library` in a linear way. 
We had already established that the linear search algorithm, as described above, has a time complexity of $O(n)$. Our implementation should have the same time complexity. Is that really the case?

In `linear_search_iter()` we used a built-in function: `len()`. If we want to evaluate the time complexity of our algorithm, we must also consider the runtime of `len()`. A naive implementation, which iterates through all list elements and counts them, would have a linear runtime. However, the time complexity of `len()` for lists is actually $O(1)$. This is related to how `list` is implemented as a data structure. We will come across this again in the second part of the module. Until then, here is a little bit of background: in Python, a list keeps track of its own length, so when `len()` is called, this stored value is retrieved directly without iterating through the list.
This constant runtime is therefore negligible compared to the linear runtime of the `while` loop, which in the worst case runs through all entries, namely when the title is found only at the very end or not at all.
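
To make the contrast concrete, here is a minimal sketch of such a naive counting function. The name `naive_len` is hypothetical and only serves as an illustration. It is not how Python determines the length of a list.

```python
def naive_len(items):
    """Count the elements of items one by one (linear runtime)."""
    count = 0
    for _ in items:  # one step per element -> O(n)
        count += 1
    return count


books = ['garp', 'ulysses', 'don_quixote']
print(naive_len(books))  # 3, determined in n steps
print(len(books))        # 3, read from the stored length
```

Both calls return the same value. The difference lies only in the number of steps needed to get there.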


Now let's look at the recursive solution. We design our implementation in such a way that the function calls itself again and again. The search area is reduced with each call until the element being searched for is found or the search area is exhausted.


##### <font color="#17415F">Code example II</font>
> The `linear_search_rec()` function is an example of a recursive implementation of our linear search.


In [None]:
def linear_search_rec(title, library, index=0):
    """
    Recursively return the book title and its position in library, or (None, -1) if the title is not found.

    Args:
        title (str): book title to be searched for.
        library (list): list of book titles in the library.
        index (int): index to be checked in the current recursive step. Defaults to 0.

    Returns:
        tuple: (title, index position in library), or (None, -1) if the title was not found.

    """

    if index == len(library):  # base case 1: no title found
        return (None, -1)
    if library[index] == title:  # base case 2: title found
        return (title, index)
    return linear_search_rec(title, library, index + 1)  # recursion

Let's take a closer look at how this recursive implementation works. In addition to the parameters `title` and `library`, there is an additional argument `index`. This serves as a pointer that guides us through the book list and is adjusted during the recursion.

In every recursive function, there is at least one base case that ends the recursion. We have two of these cases in our implementation:

1. The end of the shelf (or list) is reached and the book we are looking for has not been found.
2. The book at the current position (`index`) matches the title we are looking for.

If neither of these base cases applies, the function goes into the recursive step and calls itself with an incremented `index`. This allows us to systematically go through the book list without using a loop. Adjusting the `index` value with each recursive call ensures that the search progresses and eventually reaches a base case.
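
The sequence of recursive calls can be made visible with a small tracing variant. This is only a sketch: the function `linear_search_trace` and its extra parameter `visited` are hypothetical additions for illustration and are not part of the exercise.

```python
def linear_search_trace(title, library, index=0, visited=None):
    """Like linear_search_rec(), but also record every index that is checked."""
    if visited is None:  # create the list on the first call only
        visited = []
    if index == len(library):  # base case 1: no title found
        return (None, -1), visited
    visited.append(index)
    if library[index] == title:  # base case 2: title found
        return (title, index), visited
    return linear_search_trace(title, library, index + 1, visited)  # recursion


shelf = ['great_gatsby', 'tonio_krueger', 'infinite_jest', 'garp']
result, visited = linear_search_trace('infinite_jest', shelf)
print(result)   # ('infinite_jest', 2)
print(visited)  # [0, 1, 2]
```

Each entry in `visited` corresponds to one call of the function, so the list shows how `index` moves one position to the right per recursive step.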

In summary, recursion is characterized by the following features:

* There is **at least one base case** that ends the recursion.
* Often **additional arguments** are used that are especially important for the recursive process. It makes sense to assign default values to these arguments so that callers do not have to pass them explicitly.
* The function **calls itself**, adjusting the arguments so that the recursive process progresses and eventually reaches a base case.
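
The three features can be seen again in a second, deliberately small example. The function `count_title` is a hypothetical illustration, not part of the exercise. It counts how often a title appears on the shelf.

```python
def count_title(title, library, index=0):
    """Recursively count how often title occurs in library."""
    if index == len(library):  # base case: end of the shelf reached
        return 0
    found_here = 1 if library[index] == title else 0
    # recursive call with an adjusted argument: index moves one step right
    return found_here + count_title(title, library, index + 1)


shelf = ['great_gatsby', 'garp', 'infinite_jest', 'garp']
print(count_title('garp', shelf))     # 2
print(count_title('ulysses', shelf))  # 0
```

Thanks to the default value of `index`, the caller only passes `title` and `library`.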


Now it's your turn: try to implement an algorithm recursively yourself. The algorithm is similar to our previous linear search algorithm, with the difference that the search should now start on the right-hand side of the shelf. Write your code directly into the cell; we don't need a script for this exercise.


##### <font color="#3399DB">Task 1</font>
> Write a recursive function `linear_search_rec_inv()`. Proceed in the same way as in `linear_search_rec()`, but start your search on the right. The index positions should be returned as before, counted from the left. Use the prepared tests in the following cell to test your code.


In [None]:
# Test your code
lib = ['great_gatsby', 'tonio_krueger', 'infinite_jest', 'garp', 'python_for_dummies', 'linear_algebra_1', 'crimson_labyrinth', 'garp', 'ulysses', 'don_quixote']
assert linear_search_rec_inv('ulysses', lib) == ('ulysses', 8)
assert linear_search_rec_inv('garp', lib) == ('garp', 7)
assert linear_search_rec_inv('great_gatsby', lib) == ('great_gatsby', 0)
assert linear_search_rec_inv('lord_of_the_rings', lib) == (None, -1)
print('tests passed')

Recursive approaches are not always the first choice, especially when the iterative approach is obvious and intuitive, as in our previous example. However, there are scenarios where recursion simplifies problem solving considerably. If a problem can be broken down into several smaller but essentially identical sub-problems, the recursive structure represents this breakdown naturally and clearly. Recursive code can then be shorter and more concise, as it can dispense with many case distinctions. This can improve readability, reduce the likelihood of errors and make debugging easier. A classic example of recursion's strength is the binary search algorithm, which we will look at next.
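
One practical reason why a recursive linear search is not the first choice in Python is that the interpreter limits how deeply function calls may be nested. The following sketch repeats the recursive search from code example II so that it runs on its own. The names `big_library` and `hit_limit` are illustrative assumptions.

```python
import sys


def linear_search_rec(title, library, index=0):
    """Recursive linear search as in code example II."""
    if index == len(library):  # base case 1: no title found
        return (None, -1)
    if library[index] == title:  # base case 2: title found
        return (title, index)
    return linear_search_rec(title, library, index + 1)  # recursion


# A shelf with more books than the interpreter allows nested calls.
big_library = ['book_' + str(i) for i in range(sys.getrecursionlimit() + 100)]

try:
    linear_search_rec('lord_of_the_rings', big_library)
    hit_limit = False
except RecursionError:  # raised when the nesting limit is exceeded
    hit_limit = True

print(hit_limit)  # True
```

The linear recursion needs one nested call per book. The binary search discussed next only needs a number of calls that grows logarithmically with the shelf size, so it stays far below this limit.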


## The binary search algorithm


If our library is sorted alphabetically by title, we can use a faster search algorithm: the binary search algorithm. We already looked at this in the last text lesson. This algorithm utilizes the sorted structure of the library and achieves a very good time complexity of $O(\log n)$. Here is the procedure:

1. If the shelf is empty, end the search. Otherwise, start with the book in the middle of the shelf. If the number of books is even, take the book directly to the left of the center.
2. Check the title of the book. If it matches the title you are looking for, report the book title and its position on the shelf and end the search.
3. If not, compare the title you are looking for with the title of the current book: if it comes before the current title in alphabetical order, continue with the left half of the shelf. If it comes after it, concentrate on the right half. Restart the search process in the selected area from step 1.
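
Two building blocks of this procedure can be sketched in isolation: picking the middle position and deciding between the left and right half. The helper `middle_index` is a hypothetical name, and the sketch assumes that the search area is described by two inclusive index bounds.

```python
def middle_index(low, high):
    """Return the middle index of the range low..high (both inclusive)."""
    return (low + high) // 2  # integer division rounds down -> left of center


print(middle_index(0, 9))  # 4: the fifth of ten books
print(middle_index(0, 3))  # 1
print(middle_index(5, 9))  # 7

# Python compares strings lexicographically, character by character.
print('garp' < 'great_gatsby')      # True: continue in the left half
print('ulysses' < 'tonio_krueger')  # False: continue in the right half
```

Because the comparison looks at as many characters as necessary, titles that share the same first letter are still ordered correctly.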


By repeatedly halving the search space, the binary search algorithm effectively reduces the number of steps required to find the desired book or to determine that it is not on the shelf. The algorithm bears its name because it successively divides the search space into two (not necessarily equally sized) parts.
We will implement this algorithm recursively in a moment. 
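
How quickly the halving shrinks the search space can be checked with a short calculation. The sketch below counts how many books have to be inspected in the worst case. The function name `max_probes` is a hypothetical illustration, and it assumes the middle-book rule from step 1.

```python
def max_probes(n):
    """Return the worst-case number of inspected books for a shelf of n books."""
    probes = 0
    while n > 0:
        probes += 1  # inspect the middle book
        n //= 2      # the larger remaining half contains n // 2 books
    return probes


print(max_probes(10))         # 4
print(max_probes(1000))       # 10
print(max_probes(1_000_000))  # 20
```

A shelf that is 100,000 times larger only requires 16 additional inspections, which is what the time complexity $O(\log n)$ expresses.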

The illustration shows an example of the process for searching for a book that is not on the shelf: 
    
<img src="pyp_ads_nb1_binaersuchalgo.png" alt="Binary search algorithm" style="width: 600px;"/>

In the shelf with 10 books, we first look at the fifth book (index position 4) (step 1). Its title does not match the one we are looking for (step 2). The title we are looking for comes earlier in the alphabet, so we turn to the half of the shelf to the left of index position 4 (step 3) and repeat the process there. After the third iteration, we would again have to look at the half of the shelf to the left of the current book, the one with the title `garp`. That half of the shelf is empty, so step 1 ends the search.


We face a number of challenges when implementing the binary search algorithm. In order to get a clear overview of the upcoming steps, it makes sense to first write some **pseudocode**.

Pseudocode is a kind of intermediate step between the verbal description of an algorithm and the actual code. Here, we formulate the sequence of our algorithm in a structured way that follows the logic of a programming language, but without getting caught up in specific syntactic details. It serves as a bridge that allows us to focus on the core logic of the algorithm before we start with the actual programming.


For example, the pseudocode of `linear_search_rec()`, the recursive version of our linear search algorithm above, could look like this:

> **Input:** `library` (`list` of books), `title` (`str` containing title), `index` (helper variable for recursion)
>
> **Process:**
> * **Base case 1**: If `index` is out of range (library fully searched) -> Return `(None, -1)`
> * **Base case 2**: If title found at current `index` -> Return `(title, index)`
>
> **Recursive step:**
> * Else (library not fully searched and title not found):
>   * `index += 1`
>   * Recursively call the function with the new `index` and return its result


The pseudocode already looks very similar to our final code. This is because Python is a high-level programming language whose syntax is close to natural language and to the way we think about the problem. The path from pseudocode to code is therefore short. Nevertheless, pseudocode can help you to organize your thoughts.


Before we start with the recursive implementation of our binary search algorithm, we will outline the structure and flow of the algorithm using pseudocode. It is important to us that the function not only returns the book title searched for, but also its position on the shelf.

Our aim is for the implementation to achieve a time complexity of $O(\log n)$. To ensure this, we refrain from using existing functions and methods, whose runtime could otherwise dominate our own code without us noticing. There is one exception: the built-in function `len()`, which has a constant time complexity of $O(1)$ for lists, may be used because it has no influence on our overall target complexity.


##### <font color="#3399DB">Task 2</font>
> First, write pseudocode for the binary search algorithm in our library example. What additional variables do you need for the recursion? What are the base cases? You may assume that the titles are in alphabetical order. You may use the built-in function `len()` in your later code. Do not use any of the functions that we have already implemented. Remember that the algorithm should return both the title and the index position if the book is found.


In the following box you can see the docstring for a possible solution. It gives you an idea of which additional variables you can use for the recursive process.


**Note:** Open this box to see the docstring for a possible solution:

<div class="details">

```python 
def binary_search_rec(title, library, low=0, high=None): 
    """
    Recursively return the book title and its position in library, or (None, -1) if the title is not found.
    
    Args: 
        title (str): book title to be searched for. 
        library (list): list of book titles in the library.
        low (int): lower index bound for recursion. Defaults to 0.
        high (int): upper index bound for recursion. Defaults to None. 
        
    Returns: 
        tuple: (title, index position in library), or (None, -1) if the title was not found.
        
    """
```
</div>
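
A default value of `None` is a common way to express "not set yet" when the real default depends on another argument. The following sketch shows one possible way to resolve it. The helper `resolve_bounds` is hypothetical, and it assumes that `high` is meant as an inclusive index bound, which the docstring does not specify.

```python
def resolve_bounds(library, low=0, high=None):
    """Return the index bounds of the search area as a tuple (low, high)."""
    if high is None:  # first call: search the whole shelf
        high = len(library) - 1  # assumption: high is an inclusive bound
    return (low, high)


print(resolve_bounds(['a', 'b', 'c']))        # (0, 2)
print(resolve_bounds(['a', 'b', 'c'], 1, 1))  # (1, 1)
print(resolve_bounds([]))                     # (0, -1): empty search area
```

With inclusive bounds, a search area is empty as soon as `low` is greater than `high`.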


Does your pseudocode use different recursion variables, or only one of these? Then feel free to use your own version for the implementation. If you like, please share your solution in the forum.


##### <font color="#3399DB">Task 3</font>
> Now implement `binary_search_rec()` in the following code cell. Avoid using any of the functions that we have already implemented. The built-in function `len()` is an exception.
> Use the prepared tests in the following cell to test your code.


In [None]:
# Test your code
lib_sorted = ['crimson_labyrinth',
 'don_quixote',
 'garp',
 'garp',
 'great_gatsby',
 'infinite_jest',
 'linear_algebra_1',
 'python_for_dummies',
 'tonio_krueger',
 'ulysses']

assert binary_search_rec('ulysses', lib_sorted) == ('ulysses', 9)
assert binary_search_rec('garp', lib_sorted) == ('garp', 2)
assert binary_search_rec('crimson_labyrinth', lib_sorted) == ('crimson_labyrinth', 0)
assert binary_search_rec('infinite_jest', lib_sorted) == ('infinite_jest', 5)
assert binary_search_rec('lord_of_the_rings', lib_sorted) == (None, -1)
print('tests passed')

**Congratulations:** You have implemented a binary search recursively. It is an example of an algorithm based on the *divide and conquer* principle: The actual problem is repeatedly divided into basically identical sub-problems. This recursion therefore represents the natural structure of the search.
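
For comparison with your recursive solution, the same search can also be written with a loop. This is a minimal sketch, not a reference solution: the function name `binary_search_iter` is hypothetical, and it assumes inclusive index bounds and the middle-book rule described above.

```python
def binary_search_iter(title, library):
    """Iteratively search a sorted library; return (title, index) or (None, -1)."""
    low, high = 0, len(library) - 1  # inclusive bounds of the search area
    while low <= high:  # stop when the search area is empty
        mid = (low + high) // 2  # left of center for an even number of books
        if library[mid] == title:
            return (title, mid)
        if title < library[mid]:
            high = mid - 1  # continue in the left half
        else:
            low = mid + 1  # continue in the right half
    return (None, -1)


lib_sorted = ['crimson_labyrinth', 'don_quixote', 'garp', 'garp', 'great_gatsby',
              'infinite_jest', 'linear_algebra_1', 'python_for_dummies',
              'tonio_krueger', 'ulysses']
print(binary_search_iter('infinite_jest', lib_sorted))      # ('infinite_jest', 5)
print(binary_search_iter('lord_of_the_rings', lib_sorted))  # (None, -1)
```

The loop updates `low` and `high` in place, whereas a recursive version would pass the new bounds on to the next call.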


**Deep dive:** the Quicksort algorithm. 
<div class="details">
In our exercise, we have assumed that our library is sorted alphabetically. We can therefore implement our search in $O(\log n)$ instead of $O(n)$ as before. However, sorting itself involves a lot of work. In the previous text lesson, we assumed in question 2 that sorting in $O(n \log n)$ is possible. In fact, $O(n \log n)$ is the best time complexity that a sorting algorithm based on pairwise comparisons can achieve. The <i>Quicksort</i> algorithm, among others, achieves this on average, although its worst case is $O(n^2)$. In the algorithm, what we call a <i>pivot</i> element is compared with all the other elements in each recursion step. Larger elements are arranged to the right of the pivot, smaller ones to the left, and both parts are then sorted recursively in the same way.
If you are interested in this, you can read more 
<a href="https://realpython.com/sorting-algorithms-python/#the-quicksort-algorithm-in-python">here</a>, for example.
</div>
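
The pivot idea from the deep dive can be sketched in a few lines. This is a simplified illustration, not an optimized implementation: it builds new lists instead of sorting in place, and the choice of the first element as pivot is an assumption made for brevity.

```python
def quicksort(items):
    """Return a new, sorted list containing the elements of items."""
    if len(items) <= 1:  # base case: nothing left to sort
        return items[:]
    pivot = items[0]  # assumption: first element serves as the pivot
    smaller = [x for x in items[1:] if x < pivot]
    larger_or_equal = [x for x in items[1:] if x >= pivot]
    # sort both parts recursively and place the pivot between them
    return quicksort(smaller) + [pivot] + quicksort(larger_or_equal)


lib = ['great_gatsby', 'tonio_krueger', 'infinite_jest', 'garp', 'ulysses', 'garp']
print(quicksort(lib))
# ['garp', 'garp', 'great_gatsby', 'infinite_jest', 'tonio_krueger', 'ulysses']
```

With this pivot choice, an already sorted list is the worst case, because one of the two parts is always empty.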


Later in the module we will get to know the binary search tree data structure. If we want to implement a search method in this data structure, we also use a binary search algorithm. 
However, in the following text lesson, we will first look at what data structures are and why we should care about them.


**Remember:** 
* Many problems can be solved both iteratively and recursively.
* Recursive implementations are advantageous for problems whose natural structure can be broken down into identical subproblems.
* The binary search is an example of one of these problems.
* Pseudocode can help you to pre-structure your thoughts.


***
Do you have any questions about this exercise? Look in the forum to see if they have already been discussed.
***
Found a mistake? Contact Support at support@stackfuel.com.
***
