<a href="https://colab.research.google.com/github/brendanpshea/computing_concepts_python/blob/main/IntroCS_07_Algorithms_and_Loops.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Algorithms and Loops
## Brendan Shea, PhD (Brendan.Shea@rctc.edu)


An **algorithm** is a step-by-step procedure or a set of rules for solving a particular problem or accomplishing a specific task. Algorithms can be found in various aspects of everyday life and computing. They are the basis for how computer programs and applications function, as well as how people make decisions or solve problems in their daily lives.

Examples of algorithms in computing:

1.  Sorting algorithms: Sorting is a common computational task where items are arranged in a specific order, such as alphabetical, numerical, or chronological. Some popular sorting algorithms include Bubble Sort, Quick Sort, and Merge Sort. These algorithms have different ways of organizing data, but they all achieve the same goal of sorting items.

2.  Search algorithms: Search algorithms are used to locate specific items or data within a larger data set. Examples include Linear Search, which involves iterating through each item in the data set until the desired item is found, and Binary Search, which involves repeatedly dividing a sorted data set in half and checking if the desired item is in the left or right subset.

Examples of algorithms in everyday life:

1.  Cooking a recipe: When you cook a meal following a recipe, you are essentially executing an algorithm. The recipe provides a list of ingredients and a step-by-step procedure to follow in order to prepare the dish. By following the steps in the correct order, you will successfully cook the meal.

2.  Driving to a destination: When you drive from one location to another, you follow a series of steps and make decisions based on traffic conditions, road signs, and your knowledge of the route. This process can be considered an algorithm, as you are following a procedure to reach your destination.

3.  Getting dressed: The process of getting dressed involves a sequence of actions, such as selecting clothes, putting them on in a specific order, and adjusting as needed. This routine can be seen as an algorithm for preparing yourself for the day.

In short, an algorithm is a systematic set of rules or instructions used to solve a problem or accomplish a task. Algorithms can be found in both computing and everyday life, guiding actions and decisions to achieve specific goals.

## Brendan's Lecture
Run the following cell to launch my lecture.

In [None]:
from IPython.display import YouTubeVideo
YouTubeVideo('D7gYz215ioc', width=800, height=500)

# Properties of Algorithms
The technical definition of an algorithm is *a finite, well-defined sequence of steps or instructions that, when followed, accomplishes a specific task or solves a particular problem.* An algorithm must have the following properties:

1.  **Unambiguity:** Each step of the algorithm should be clearly defined and not open to multiple interpretations. The instructions must be precise and unambiguous to ensure consistent execution.

2. **Finiteness:** An algorithm must have a finite number of steps. It should eventually terminate after a limited number of iterations or actions.

3.  **Input:** An algorithm typically takes zero or more input values to process and generate the desired output. These inputs can be in various forms, such as numbers, text, or other data types.

4.  **Output:** An algorithm must produce at least one output, which is the result of processing the input data. The output is the solution to the problem or the accomplishment of the task.

5.  **Effectiveness:** The steps in an algorithm should be simple, basic, and executable within a finite amount of time. The algorithm should be efficient and practical for the problem it is designed to solve.

We can use this definition to explain why the following processes are algorithms.

1.  *Binary search algorithm.* A binary search algorithm is used to find a specific value in a sorted array. It repeatedly divides the array in half, comparing the middle element with the target value, and then narrowing the search to the left or right half, depending on the comparison result. This algorithm is unambiguous, finite, takes input, generates output, and is effective.

2.  *Euclidean algorithm.* The Euclidean algorithm is used to find the greatest common divisor (GCD) of two integers. It repeatedly applies the division algorithm, replacing the larger integer with the remainder of the division until the remainder is zero. The last non-zero remainder is the GCD. This algorithm satisfies all the properties of an algorithm.
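
The Euclidean algorithm described above can be sketched in a few lines. This is a minimal illustration, not the only way to write it: the function name `gcd` is chosen here for convenience, and the sketch assumes both inputs are non-negative integers.

```python
def gcd(a, b):
    """Return the greatest common divisor of two non-negative integers."""
    # Each pass replaces (a, b) with (b, a % b). The second value shrinks
    # every time, so the loop must stop after a finite number of steps.
    while b != 0:
        a, b = b, a % b
    # When the remainder reaches zero, a holds the last non-zero remainder.
    return a

print(gcd(48, 18))  # prints 6
```

Notice how the properties line up: the inputs are `a` and `b`, the output is the returned value, every step is unambiguous, and the shrinking remainder guarantees finiteness.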

We can also use this definition to explain why these processes are NOT algorithms:

1.  *A vague or incomplete recipe.* A recipe that lacks specific measurements or cooking times, or has ambiguous instructions, does not qualify as an algorithm. It fails the unambiguity and effectiveness criteria.

2.  *An infinite loop.* A sequence of steps that never terminates (e.g., "repeat the action forever") is not an algorithm, as it does not meet the finiteness requirement.

3.  *A philosophical argument.* A philosophical argument or debate may involve logical reasoning but does not constitute an algorithm, as it lacks specific input and output, and may not be unambiguous or effective.

The nature of algorithms can help us appreciate why computers are so good at certain tasks, and so bad at others.


# What Computers are (Not) Good At
Computers excel at tasks that involve repetitive calculations, processing large volumes of data, or following explicit instructions with precision and speed. Their ability to perform loops and execute repetitive tasks quickly and accurately is a significant advantage over humans. Some tasks that computers do well, better or faster than humans, include:

1.  Computers can perform *complex mathematical operations* and calculations much faster and more accurately than humans. They can handle large numbers, fractions, and various arithmetic or algebraic operations with ease.

2.  Computers can *process and analyze data* efficiently. Tasks such as sorting, searching, filtering, and statistical analysis can be performed rapidly by computers, often outpacing human capabilities.

3.  Computers excel at performing *repetitive tasks* through loops, which allow them to execute the same set of instructions multiple times without errors or fatigue. Examples include running simulations, generating reports, or updating databases.

4.  Computers can be trained to *recognize patterns* in data using machine learning and artificial intelligence algorithms. They can process large datasets to identify trends, anomalies, or correlations that might be challenging for humans to detect.

5.  Computers are capable of *controlling and monitoring intricate systems*, such as power grids, traffic control, or telecommunications networks, by processing real-time data and making decisions based on predefined rules or algorithms.

However, there are tasks that computers struggle with, especially those that involve human-like understanding, intuition, or creativity. Some of these tasks include:

1. While computers have made significant progress in *natural language processing*, they still struggle with understanding context, idiomatic expressions, and implicit meanings in human language.

2.  Computers lack the ability to perceive, understand, and respond to *human emotions* effectively. They struggle to interpret subtle cues, such as tone of voice, body language, or facial expressions, that humans use to communicate emotions. (This is unsurprising, since a large portion of the human brain is devoted to this sort of thing; it is a difficult task!)

3. Computers can produce original content based on algorithms or templates, but they generally lack the human-like *creativity* that drives innovation, artistic expression, or problem-solving in novel situations. Here too, there has been progress in recent years (with AI programs that generate art and text).

4.  Computers struggle with tasks that require common sense or general knowledge about the world. They often lack the ability to make inferences based on everyday experiences or situations, which humans can do with ease. Programming computers to do these tasks can be difficult, since humans bring so much "background knowledge and skills" to everyday tasks.

# Loops
Loops are control structures in programming languages that allow a sequence of instructions to be executed repeatedly, based on a specific condition or a predetermined number of iterations. In Python, there are two types of loops: "for" loops and "while" loops. Both types can be used to perform repetitive tasks using numbers without involving lists or dictionaries.

## For loops:
A "for" loop in Python iterates over a range of numbers, executing the instructions within the loop body for each number in the range. The syntax for a "for" loop is as follows:

```
for variable in range(start, end, step):
    # Code to be executed
```
Here's an example of a "for" loop that prints the first 5 even numbers:



In [None]:
for i in range(2, 11, 2):
    print(i)

In this example, the loop variable i takes on the values 2, 4, 6, 8, and 10, and the print(i) statement is executed for each value.

## While loops:
A "while" loop in Python repeatedly executes a block of code as long as a given condition is true. The syntax for a "while" loop is as follows:

```
while condition:
    # Code to be executed
```
Here's an example of a "while" loop that prints the first 5 even numbers:

In [None]:
i = 2
count = 0

while count < 5:
    print(i)
    i += 2
    count += 1


In this example, the loop executes as long as count is less than 5. The print(i) statement is executed, i is incremented by 2, and count is incremented by 1 on each iteration.

Both "for" loops and "while" loops can be used to perform repetitive tasks with numbers in Python. The choice between the two depends on the specific requirements of the task and the desired loop control structure.

# More on While Loops
A "while" loop is a control structure in programming languages, like Python, that allows you to repeatedly execute a block of code as long as a specific condition is met. It can be particularly useful for tasks that involve an indefinite number of iterations, such as validating user input or performing an action until a certain requirement is satisfied.

To use a "while" loop, you can follow these steps:

1. Start with the "while" keyword, followed by the condition that needs to be true for the loop to continue executing. In Python, the condition does not need to be wrapped in parentheses.

2. After the condition, write a colon (":") to indicate the beginning of the loop body.

3. Indent the code block that you want to execute repeatedly. This block of code is executed on each iteration of the loop.

Here's a simple example of how to use a "while" loop:
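
A minimal sketch of such a loop, assuming a counter variable named `count` that starts at 0 (the starting value is an assumption; only the `count < 5` condition is fixed by the description):

```python
count = 0  # assumed starting value

while count < 5:
    print(count)   # prints 0, 1, 2, 3, 4 on separate lines
    count += 1     # without this line the condition would stay true forever
```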

In this example, the loop will execute as long as the value of count is less than 5. The loop body prints the value of count and increments it by 1 on each iteration.

## Using while True/break for Validating Input
The "while True" construct creates an infinite loop that runs until a "break" statement is encountered. It can be particularly useful when you don't know how many iterations the loop will need before a specific condition is met.  
Here's an example of using a "while True" loop for validating user input:



In [None]:
user_input = None

while True:
    user_input_str = input("Enter an integer between 1 and 10: ")

    if user_input_str.isnumeric():
        user_input = int(user_input_str)
        if 1 <= user_input <= 10:
            break
        else:
            print("Invalid input. Please enter an integer between 1 and 10.")
    else:
        print("Invalid input. Please enter an integer.")

print(f"Your input is {user_input}.")

In this example, the "while True" loop keeps running indefinitely until a valid user input is provided. We first check if the input string is numeric using the isnumeric() method. If it is, we convert the string to an integer and then check if it is within the desired range (1 to 10). If the input meets the criteria, the "break" statement is executed, which terminates the loop. If the input does not meet the criteria, the loop continues, and the user is prompted to enter a new input.
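
One possible refinement, offered as a sketch rather than a required change: the validity check can be moved into a small helper so it can be tried out without typing input. The helper name `is_valid_choice` and its `low`/`high` parameters are hypothetical. The sketch also swaps in `isdecimal()`, a stricter relative of `isnumeric()`: a character such as `"²"` counts as numeric but cannot be converted by `int()`.

```python
def is_valid_choice(text, low=1, high=10):
    """Return True if text is a whole number between low and high (inclusive)."""
    # isdecimal() is True only when every character is a decimal digit,
    # so strings such as "abc", "-3", "4.5", and "" are rejected here.
    if not text.isdecimal():
        return False
    return low <= int(text) <= high

print(is_valid_choice("7"))    # True
print(is_valid_choice("11"))   # False
print(is_valid_choice("ten"))  # False
```

Inside a `while True` loop, the `break` would then run only when `is_valid_choice(user_input_str)` returns `True`.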

## None
The "None" keyword in Python represents the absence of a value or a null value. In this example, we initialize user_input with the value None to indicate that the user input has not yet been provided or assigned a value.

In summary, you can use "while" loops to repeatedly execute a block of code as long as a specific condition is met. The "while True" construct and the "break" statement can be used to create an infinite loop that terminates when a certain requirement is satisfied, such as validating user input. Using "None" can help you represent the absence of a value or a null value for a variable, like user input, before it is assigned an appropriate value.

## Table: Some Simple While-Loops

| Code | Description |
| --- | --- |
| `i = 1; while i <= 5: print(i); i += 1` | Python while loop that prints numbers from 1 to 5 |
| `n = 5; result = 1; while n > 0: result *= n; n -= 1; print(result)` | Python while loop that calculates the factorial of 5 |
| `count = 0; while count < 3: print('Hello, world!'); count += 1` | Python while loop that prints 'Hello, world!' three times |
| `i = 1; while i <= 5: print(i ** 2); i += 1` | Python while loop that prints the first 5 square numbers |
| `word = 'Python'; reversed_word = ''; index = len(word) - 1; while index >= 0: reversed_word += word[index]; index -= 1; print(reversed_word)` | Python while loop that reverses a string 'Python' |
| `lst = [1, 2, 3]; index = 0; while index < len(lst): print(lst[index]); index += 1` | Python while loop that prints all elements in a list `[1, 2, 3]` |
| `i = 5; while i > 0: print(i); i -= 1` | Python while loop that counts down from 5 to 1 |
| `i = 2; while i <= 10: print(i); i += 2` | Python while loop that prints all even numbers from 2 to 10 |
| `i = 1; squares = []; while i <= 5: squares.append(i ** 2); i += 1; print(squares)` | Python while loop that appends squares of 1 to 5 to a list and prints the list |
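
The table entries use semicolons to squeeze each loop onto one line; in a real cell, the loop body goes on its own indented lines. As a sketch, here is the factorial row written out in full, assuming the final `print` is meant to run once after the loop finishes:

```python
n = 5
result = 1

while n > 0:
    result *= n   # multiply the running product by the current n
    n -= 1        # count down so the condition eventually becomes false

print(result)  # prints 120
```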

# More on For Loops
A "for" loop is another type of control structure in programming languages like Python, allowing you to iterate over a sequence of values and execute a block of code for each value in the sequence. The "for" loop is particularly useful when you know the number of iterations you want the loop to perform, or when you want to iterate over a specific range of values.

To use a "for" loop, you can follow these steps:

1. Start with the `for` keyword, followed by a loop variable that represents the current value in the sequence.

2. Use the `in` keyword, followed by a sequence of values to iterate over. In Python, you can use the built-in range() function to generate a sequence of numbers.

3. Write a colon (":") to indicate the beginning of the loop body.

4. Indent the code block that you want to execute repeatedly. This block of code is executed on each iteration of the loop, with the loop variable taking on the next value in the sequence.

Here's a simple example of using a "for" loop to print the numbers from 0 to 4:



In [None]:
for i in [0,1,2,3,4]:
  print(i)

0
1
2
3
4



## range()
The range() function generates a sequence of numbers and has three possible arguments:

* `range(stop)`: Generates a sequence of numbers from 0 (inclusive) to stop (exclusive), with a step of 1.

* `range(start, stop)`: Generates a sequence of numbers from start (inclusive) to stop (exclusive), with a step of 1.

* `range(start, stop, step)`: Generates a sequence of numbers from start (inclusive) to stop (exclusive), with a custom step.
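
To see exactly which numbers each form produces, one option is to wrap the call in `list()`. The variable names below are illustrative only; the last line shows a negative step, which counts downward.

```python
one_arg = list(range(5))           # stop only
two_args = list(range(2, 6))       # start and stop
three_args = list(range(1, 10, 3)) # start, stop, and step
countdown = list(range(5, 0, -1))  # a negative step counts down

print(one_arg)     # [0, 1, 2, 3, 4]
print(two_args)    # [2, 3, 4, 5]
print(three_args)  # [1, 4, 7]
print(countdown)   # [5, 4, 3, 2, 1]
```

In every case the stop value itself is left out of the sequence.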

## Use Cases
Here are some potential uses of "for" loops:

### Performing calculations:
A "for" loop can be used to perform a calculation over a specific range of values. For example, you can use a "for" loop to calculate the sum of the first 10 positive integers:



In [None]:
sum_of_numbers = 0

for i in range(1, 11):
    sum_of_numbers += i

print(f"The sum of the first 10 positive integers is {sum_of_numbers}.")

### Repeating an action a specific number of times:
A "for" loop can be used to execute an action for a predetermined number of iterations. For example, you can use a "for" loop to print a message five times:


In [None]:
for i in range(5):
    print("Hello, world!")

### Iterating over a sequence with a specific step:
A "for" loop can be used to iterate over a sequence of values with a custom step. For example, you can use a "for" loop to print all multiples of three from 6 to 21 inclusive.

In [None]:
for i in range(6, 22, 3):
    print(i)


In the end, "for" loops are a versatile control structure that can be used to iterate over a specific range of values and execute a block of code for each value in the sequence. They are particularly useful for tasks that require a known number of iterations, performing calculations over a range of values, or repeating an action a specific number of times.

# Looping Through Strings
In programming languages like Python, strings are sequences of characters. A string can be considered as a collection of individual characters arranged in a specific order. Python provides built-in support for iterating over the characters of a string, which allows you to perform various operations on them using for loops.

As strings are sequences, you can use a for loop to iterate over each character in the string. On each iteration, the loop variable takes on the value of the next character in the string. This feature enables you to perform operations on each character or derive information from the string.

Here's an example that demonstrates how to use a for loop to iterate over the characters of a string and capitalize every other character:

In [None]:
text = "Hello, world!"
capitalized_alternate_text = ""
length = len(text)

for i in range(length):
    char = text[i]
    if i % 2 == 0:
        capitalized_char = char.upper()
    else:
        capitalized_char = char
    capitalized_alternate_text += capitalized_char

print(f"Capitalized alternate text: {capitalized_alternate_text}")


In this example, we first use the len() function to find the length of the string text. The len() function returns the number of characters in the string, which allows us to use a for loop with a range from 0 to the length of the string (exclusive).

On each iteration, we access the character at the index i using the expression `text[i]`. Then, we use a conditional statement to check if the index i is even. If it is, we capitalize the character using the `upper()` method; otherwise, we keep the character unchanged. We then append the capitalized or original character to the capitalized_alternate_text string.

The resulting capitalized_alternate_text string has every other character capitalized: "HeLlO, wOrLd!"

This example illustrates how you can use for loops to iterate over the characters of a string, derive information about the string using the len() function, and perform operations on the characters, such as capitalizing every other character.
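
When the index is not needed, the loop variable can take on each character directly. A small sketch, assuming we want to count the vowels in the same sample string (the variable name `vowel_count` is chosen for illustration):

```python
text = "Hello, world!"
vowel_count = 0

for char in text:
    # lower() lets a single check cover both "E" and "e"
    if char.lower() in "aeiou":
        vowel_count += 1

print(f"Number of vowels: {vowel_count}")  # Number of vowels: 3
```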

# Nested Loops
Nested loops are loops that are present within another loop. They are often used when you need to perform a set of operations that involve multiple levels of iteration or when you need to repeat a block of code for each combination of items from two or more sequences. The outer loop iterates over the first sequence, and the inner loop(s) iterate over the second (and subsequent) sequences. The inner loop completes all its iterations for each single iteration of the outer loop.

Let's break down the concept of nested loops step by step:

1. **Outer loop:** This is the loop that encloses the inner loop(s). It iterates over a sequence of values, just like any other loop.

2. **Inner loop:** This loop is placed within the body of the outer loop. It also iterates over a sequence of values, executing its code block for each value in the sequence.

3. **Combined iterations:** For each iteration of the outer loop, the inner loop goes through all its iterations. This means that the code block of the inner loop will be executed for every combination of values from the outer and inner loop sequences.

Here's a simple example to illustrate the concept of nested loops:



In [None]:
for i in range(1, 4):
    print(f"Outer loop iteration {i}:")
    for j in range(1, 3):
        print(f"  Inner loop iteration {j}")

In this example, we have an outer loop that iterates over the range of numbers from 1 to 3 (inclusive), and an inner loop that iterates over the range of numbers from 1 to 2 (inclusive). The inner loop is placed inside the body of the outer loop, which means that for each iteration of the outer loop, the inner loop will go through all its iterations.

If you run the program (in the cell above), you'll discover that for each iteration of the outer loop (i = 1, 2, 3), the inner loop completes all its iterations (j = 1, 2). This results in a total of 6 combined iterations, as the outer loop iterates 3 times, and the inner loop iterates 2 times for each outer loop iteration (3 * 2 = 6).

Nested loops are a powerful programming concept that allows you to handle problems requiring multiple levels of iteration or when you need to explore all possible combinations of elements from multiple sequences.
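
As one more illustration of "every combination," here is a sketch that prints a small multiplication grid. The names `row`, `col`, and `total_cells` are chosen for this example, and the counter is included only to confirm how many times the inner body runs.

```python
total_cells = 0

for row in range(1, 4):
    for col in range(1, 4):
        # end=" " keeps the products for one row on the same line
        print(row * col, end=" ")
        total_cells += 1
    print()  # start a new line after each row

print(f"Inner body ran {total_cells} times")  # 3 * 3 = 9
```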


## Lists and Loops
A Python list is a built-in data structure that can hold an ordered collection of items. These items can be of any type---integers, floats, strings, and even other lists or complex objects. Lists in Python are mutable, meaning you can modify them after they are created.

### Syntax

The syntax for creating a list is simple: square brackets `[]` enclose the list items, which are separated by commas.


In [None]:
# An empty list
empty_list = []

# A list of integers
integer_list = [1, 2, 3, 4, 5]

# A list of strings
string_list = ['Mario', 'Luigi', 'Peach', "Yoshi", "Donkey Kong"]

# A mixed list
mixed_list = [1, 'Mario Kart', 3.14]


### Accessing List Elements
Just like the characters of a string, list elements are accessed by their index, which starts from 0. For example, in the list `[1, 2, 3]`, the element at index 0 is `1`. Negative indexing can be used to access elements from the end of the list. For example, an index of `-1` refers to the last element.


In [None]:
print(string_list[0])
print(string_list[-1])

Mario
Donkey Kong
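
Because lists are mutable, the same index syntax can also be used to change an element, and `append()` adds a new element to the end. A short sketch using a separate sample list (`racers` is a name chosen for this example):

```python
racers = ["Mario", "Luigi", "Peach"]  # sample data for this sketch

racers[1] = "Toad"       # replace the element at index 1
racers.append("Yoshi")   # add a new element to the end

print(racers)       # ['Mario', 'Toad', 'Peach', 'Yoshi']
print(len(racers))  # 4
```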


Lists and for-loops are often used together in Python for tasks like iteration, data manipulation, and data analysis. Below are ways you can use lists with for-loops.

### Using Lists With For Loops
You can iterate through a list using a for-loop, performing some action with each item.

In [None]:
# Iterating through each item in the list
for item in string_list:
    print(item)

Mario
Luigi
Peach
Yoshi
Donkey Kong


### Using the `range()` Function

You can also use the `range()` function with the `len()` function to iterate through the list by index. This is particularly useful when you need to know the position of an item within the list.

In [None]:
for dex in range(len(string_list)):
  print(f"Position {dex} is {string_list[dex]}")

Position 0 is Mario
Position 1 is Luigi
Position 2 is Peach
Position 3 is Yoshi
Position 4 is Donkey Kong
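
An alternative worth knowing, shown here as a sketch rather than a replacement: Python's built-in `enumerate()` hands back the position and the item together, so no separate index lookup is needed. The list is repeated so the snippet runs on its own, and the names `position` and `name` are chosen for this example.

```python
string_list = ['Mario', 'Luigi', 'Peach', 'Yoshi', 'Donkey Kong']

for position, name in enumerate(string_list):
    # position starts at 0, matching the list's indexes
    print(f"Position {position} is {name}")
```

This produces the same five lines as the `range(len(...))` version.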


### Nested Lists and Loops

Lists can contain other lists, creating a nested structure. You can use nested for-loops to iterate through the elements of a nested list.

In [None]:
# A nested list
nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Iterating through a nested list
for sublist in nested_list:
    for item in sublist:
        print(item, end=' ')
    print()  # Newline at the end of each sublist


1 2 3 
4 5 6 
7 8 9 


## Table: For-Loops to Remember

| Code | Description |
| --- | --- |
| `for i in range(1, 8): print(i)` | Print out the numbers from 1 to 7 |
| `for i in range(1, 11): print(i*i)` | Print the square of numbers from 1 to 10 |
| `fruits = ['apple', 'banana', 'cherry']; for fruit in fruits: print(fruit)` | Print each fruit in a list containing ['apple', 'banana', 'cherry'] |
| `even_numbers = [i*2 for i in range(1, 11)]` | Create a list with the first 10 even numbers |
| `for char in "Hello": print(char)` | Print each character in the string "Hello" |
| `numbers = [1, 2, 3, 4, 5]; total = 0; for num in numbers: total += num` | Calculate the sum of the numbers in a list containing [1, 2, 3, 4, 5] (the name `total` avoids hiding Python's built-in `sum()`) |
| `d = {'a': 1, 'b': 2, 'c': 3}; for key, value in d.items(): print(key, value)` | Print each key-value pair in a dictionary containing {'a': 1, 'b': 2, 'c': 3} |
| `for i in range(1, 11): print("5 x", i, "=", 5*i)` | Print a multiplication table for the number 5 |
| `list_of_tuples = [(1, 2), (3, 4), (5, 6)]; for a, b in list_of_tuples: print(a, b)` | Print the elements of each tuple in a list containing [(1, 2), (3, 4), (5, 6)] |
| `my_list = ['a', 'b', 'c', 'd', 'e']; for element in reversed(my_list): print(element)` | Print elements of a list containing ['a', 'b', 'c', 'd', 'e'] in reverse order |

## Exercises

### Exercise 1: Lap Counter

- Objective: Write a function `lap_counter()` to practice using a for-loop.
- Description: Ask the user to input the total number of laps in a Mario Kart race (a positive integer). Use a for-loop to count and print each lap until the total number of laps is reached.
- Hint: Use a for-loop with a range that goes up to the total number of laps.
- Sample Function Call: `lap_counter()`
- Sample Output:

```
Enter the total number of laps: 3
Lap 1
Lap 2
Lap 3
```


### Exercise 2: Boost Pad Activation

- Objective: Write a function `boost_pad()` to practice using a while-loop.
- Description: Mario hits a boost pad that activates every 2 seconds. Use a while-loop to print "Boost activated!" every 2 seconds until 10 seconds have passed.
- Hint: Use a while-loop and a variable to keep track of time.
- Sample Function Call: `boost_pad()`
- Sample Output:

```
Boost activated!
Boost activated!
Boost activated!
Boost activated!
Boost activated!
```


### Exercise 3: Collecting Coins

- Objective: Write a function `collect_coins()` to practice using `range()` with steps.
- Description: In a Mario Kart race, coins are placed every 10 meters on a 100-meter track. Use a for-loop with `range()` to print the distance at which each coin is collected.
- Hint: Use the `range()` function with a step of 10 to go from 10 to 100.
- Sample Function Call: `collect_coins()`
- Sample Output:

```
Coin collected at 10 meters
Coin collected at 20 meters
Coin collected at 30 meters
Coin collected at 40 meters
Coin collected at 50 meters
Coin collected at 60 meters
Coin collected at 70 meters
Coin collected at 80 meters
Coin collected at 90 meters
Coin collected at 100 meters
```


### Exercise 4: Time Trials

- Objective: Combine loops, conditionals, and user input.
- Description: Write a function `time_trials()` that asks the user for the number of laps they want to race (a positive integer). Then, for each lap, ask the user for the time it took to complete the lap in seconds (a positive float). Calculate and print the average lap time using an f-string.
- Hint: Use a for-loop to iterate through each lap and collect lap times. Use conditionals to ensure that input is positive.
- Sample Function Call: `time_trials()`
- Sample Output:

```
Enter the total number of laps: 3
Enter the time for lap 1: 30.5
Enter the time for lap 2: 29.7
Enter the time for lap 3: 31.2
Average lap time: 30.466666666666665 seconds
```


### Exercise 5: Blue Shell Dodge (Bonus)

- Objective: Combine while-loops, conditionals, and f-strings.
- Description: Write a function `blue_shell_dodge()` that simulates dodging Blue Shells. The user starts at position 4 and can move between positions 1 and 4. The Blue Shell targets the user every 15 seconds. Ask the user if they want to move down a position to dodge the Blue Shell. If they say 'yes', move them down a position. If they say 'no', print "You've been hit by a blue shell!". Continue this for 60 seconds.
- Hint: Use a while-loop to keep track of time and a conditional to check the user's decision.
- Sample Function Call: `blue_shell_dodge()`
- Sample Output:

```
Time: 15 seconds. You are in position 4. Do you want to move down a position to dodge the Blue Shell? (yes/no): yes
Time: 30 seconds. You are in position 3. Do you want to move down a position to dodge the Blue Shell? (yes/no): no
You've been hit by a blue shell!
```

### Exercise 6: Item Box Roulette (Bonus)

- Objective: Use loops, conditionals, and random choices.
- Description: Write a function `item_box_roulette()` that simulates hitting an item box in Mario Kart. For each item box hit, randomly choose an item from a list of items (e.g., "Green Shell", "Banana", "Star"). Print the item received and keep track of the number of each item collected (you'll have to figure out the best way to do this). At the end, print a summary using an f-string.
- Hint: Use a for-loop to iterate a fixed number of item box hits (e.g., 10). Use a list to store the items and the `random.choice()` function to pick an item.
- Sample Function Call: `item_box_roulette()`
- Sample Output:

```
Item received: Green Shell
Item received: Banana
Item received: Star
Item received: Green Shell
(continue for 10 total)
Summary: Green Shell x 5, Banana x 4, Star x 1
```


In [None]:
# Problem 1

In [None]:
# Problem 2

In [None]:
# Problem 3

In [None]:
# Problem 4

In [None]:
# Problem 5

In [None]:
# Problem 6

# Case Study: The Search-Sort Tradeoff

In the world of computer science, search and sort algorithms are some of the most fundamental building blocks. In this case study, we will explore the tradeoffs between searching and sorting, focusing on linear and binary search methods. By understanding these concepts, you will be better equipped to solve real-world problems and make efficient decisions in computer programming.

##  The Problem of Search

Imagine you are a librarian, and you need to find a specific book in a vast library. How can you locate the book quickly and efficiently? This is similar to the problem of searching in computer science, where we need to find a particular element in a collection (like an array or a list) of items.

There are several search algorithms, but we will focus on two common ones: linear search and binary search.

##  Linear Search

Linear search is a simple approach where you start at the beginning of the collection and examine each element sequentially until you find the desired item or reach the end of the collection. In the library example, this would be like checking every book on every shelf one by one until you find the book you're looking for.

Pros:

-   Easy to understand and implement.
-   Works on both sorted and unsorted collections.

Cons:

-   Inefficient for large collections, as it may require checking every element.

## Binary Search

Binary search is a more efficient method that takes advantage of a sorted collection. It works by repeatedly dividing the collection in half and focusing on the section where the desired item is likely to be found. In the library example, this would be like knowing the book's location based on the alphabetical order of titles and narrowing down your search by looking at the middle book of each section.

Pros:

-   Much faster than linear search for large collections.
-   Efficient, as it reduces the search space with each step.

Cons:

-   Requires a sorted collection.
-   More complex to understand and implement than linear search.

## The Search-Sort Tradeoff

Now that we understand the basic concepts of linear and binary search, we can discuss the search-sort tradeoff. The tradeoff arises when deciding whether to sort a collection before searching or to search the collection as-is.

If you use binary search, you must first sort the collection, which can be time-consuming. On the other hand, linear search works on unsorted collections but can be slower for large datasets. The optimal choice depends on the specific problem and the number of searches you expect to perform. If you plan to search the collection frequently, it may be more efficient to sort it once and use binary search for subsequent searches. However, if you only need to search the collection occasionally, linear search might be a more practical choice.
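
A rough back-of-the-envelope sketch can make this tradeoff concrete. The cost model below is an assumption for illustration only: about n comparisons per linear search, about n * log2(n) for the one-time sort, and about log2(n) per binary search. The function name `estimated_comparisons` is hypothetical.

```python
import math


def estimated_comparisons(n, searches):
    """Rough comparison counts for `searches` lookups in a list of n items.

    Assumption: linear search costs about n comparisons per lookup, sorting
    costs about n * log2(n), and each binary search costs about log2(n).
    These are estimates, not measurements.
    """
    linear_total = searches * n
    sort_then_binary_total = n * math.log2(n) + searches * math.log2(n)
    return linear_total, sort_then_binary_total


for searches in (1, 10, 100):
    linear_total, binary_total = estimated_comparisons(1000, searches)
    better = "linear" if linear_total < binary_total else "sort + binary"
    print(f"{searches:>3} searches: linear ~{linear_total:,.0f}, "
          f"sort + binary ~{binary_total:,.0f} -> {better}")
```

Under these assumptions, a single search of 1,000 items favors linear search, while 100 searches favor sorting once and then using binary search.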

## Python Comparison
So, that's the theory. Now, let's implement both algorithms and see how they perform:

In [None]:
import random

# Linear search implementation
def linear_search(arr, x):
    for i in range(len(arr)):
        if arr[i] == x:
            return i
    return -1

# Binary search implementation
def binary_search(arr, x):
    # Sort the list in ascending order. This happens in place, so the caller's
    # list is reordered and the returned index refers to the sorted order.
    arr.sort()
    left = 0 # Set the left index to 0
    right = len(arr) - 1 # Set the right index to the length of the array minus 1
    # Loop until the left index is greater than the right index
    while left <= right:
        # Find the middle index
        mid = (left + right) // 2
        # Check if the middle element is equal to the value we are searching for
        if arr[mid] == x:
            return mid
        # If the middle element is less than the value, search the right half
        elif arr[mid] < x:
            left = mid + 1
        # Otherwise, the middle element is greater, so search the left half
        else:
            right = mid - 1
    return -1


In [None]:
import random

# %timeit allows us to time the execution of a single line of code
# It works by running the code multiple times and returning the average time
# The -o flag returns the time as an object instead of printing it

# Generate a list of 1000 random integers
arr = [random.randint(1, 1000) for i in range(1000)]

# Test linear search
linear_search_time = %timeit -o linear_search(arr, 500)
print("Linear search time:", linear_search_time)

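# Test binary search
# Note: binary_search sorts `arr` in place, so after the first run the list is
# already sorted. Later runs spend very little time sorting, which means this
# measurement understates the cost of sorting an unsorted list.
binary_search_time = %timeit -o binary_search(arr, 500)
print("Binary search time:", binary_search_time)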

62.3 µs ± 4.77 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Linear search time: 62.3 µs ± 4.77 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
8.83 µs ± 2.49 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Binary search time: 8.83 µs ± 2.49 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
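
Because `binary_search` sorts the list inside the function, the timing above mixes the cost of sorting with the cost of searching. Here is a minimal sketch that measures the two separately using the standard-library `timeit` module. The helper name `binary_search_presorted` and the fixed seed are assumptions for illustration, and the numbers you see will depend on your machine.

```python
import random
import timeit


def binary_search_presorted(sorted_arr, x):
    """Binary search that assumes sorted_arr is already sorted (no sort inside)."""
    left, right = 0, len(sorted_arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if sorted_arr[mid] == x:
            return mid
        elif sorted_arr[mid] < x:
            left = mid + 1
        else:
            right = mid - 1
    return -1


random.seed(0)  # fixed seed so the list is the same on every run
data = [random.randint(1, 1000) for _ in range(1000)]

# Time the one-time sort. sorted() returns a new list and leaves `data`
# unsorted, so every repetition sorts genuinely unsorted input.
sort_seconds = timeit.timeit(lambda: sorted(data), number=1000) / 1000

# Time the search alone on a list that was sorted once, up front.
sorted_data = sorted(data)
search_seconds = timeit.timeit(
    lambda: binary_search_presorted(sorted_data, 500), number=1000
) / 1000

print(f"Sort once:   {sort_seconds * 1e6:.2f} µs")
print(f"Each search: {search_seconds * 1e6:.2f} µs")
```

Separating the two costs is what lets us reason about the tradeoff: the sort is paid once, while the search cost is paid on every lookup.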


## How Does Sorting Work?
In the binary search code above, we used Python's built-in `sort` method to sort our data. It's worth remembering that sorting takes time as well! Some common sorting algorithms include the following. We'll use the example of C-3PO (from *Star Wars*) to help us understand how some of these algorithms work. (Note: These are generally covered in detail in more advanced CS classes, so the goal here is to focus on the "big ideas" rather than the nitty-gritty implementation.)

### Bubble Sort: The Galactic Shuffle

In a galaxy far, far away, C-3PO finds himself in a peculiar situation. He is tasked with sorting a collection of Ewokian trinkets, each assigned a numeric value for its rarity. The droid opts for the Bubble Sort algorithm, a simple sorting technique that repeatedly steps through the list, compares adjacent elements, and swaps them if they are in the wrong order. The steps are:

1. Start at the beginning of the list of Ewokian trinkets.
2. Compare the first pair of adjacent trinkets.
3. If the first trinket is greater than the second, swap them.
4. Move to the next pair of adjacent trinkets and repeat step 3.
5. When the end of the unsorted part of the list is reached, the largest unsorted trinket has "bubbled" to its final position.
6. Exclude that trinket from the next pass.
7. Go back to step 1 and make another pass over the remaining trinkets.
8. Repeat until no unsorted trinkets remain.
9. The list of Ewokian trinkets is now sorted.

This algorithm is easy for humans (and droids!) to understand, but it's considerably slower than some other sorting algorithms.



In [None]:
def bubble_sort(ewokian_trinkets):
  """Sorts a list of Ewokian trinkets in ascending order, in place.

  Args:
    ewokian_trinkets: A list of Ewokian trinkets.

  Returns:
    The same list object, now sorted.
  """

  # Get the length of the list.
  n = len(ewokian_trinkets)

  # Iterate over the list, comparing adjacent elements and swapping them if
  # they are in the wrong order.
  for i in range(n):
    for j in range(0, n - i - 1):
      if ewokian_trinkets[j] > ewokian_trinkets[j + 1]:
        ewokian_trinkets[j], ewokian_trinkets[j + 1] = ewokian_trinkets[j + 1], ewokian_trinkets[j]

  # Return the sorted list.
  return ewokian_trinkets
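
To watch the "bubbling" happen, here is a small sketch that takes a snapshot of the list after each pass. The function name `bubble_sort_passes` and the sample values are illustrative assumptions. It works on a copy, so the original list is left alone.

```python
def bubble_sort_passes(values):
    """Return a list of snapshots, one after each bubble sort pass.

    Works on a copy, so the caller's list is left unchanged.
    """
    items = list(values)
    snapshots = []
    n = len(items)
    for i in range(n - 1):
        # After pass i, the last i + 1 positions hold their final values.
        for j in range(n - i - 1):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
        snapshots.append(list(items))
    return snapshots


for pass_number, snapshot in enumerate(bubble_sort_passes([5, 1, 4, 2]), start=1):
    print(pass_number, snapshot)
# 1 [1, 4, 2, 5]
# 2 [1, 2, 4, 5]
# 3 [1, 2, 4, 5]
```

After the first pass, the largest value (5) has already reached the end of the list.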

### Insertion Sort: The Jedi Way

C-3PO then encounters a collection of lightsaber crystals, each with a unique power level. He decides to use Insertion Sort, a sorting algorithm that builds the sorted array one element at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. Steps:

1.  Start at the second lightsaber crystal in the list.
2.  For each unsorted crystal, do steps 3-7.
3.  Take the current crystal as the key.
4.  Compare the key with each sorted crystal before it.
5.  If the key is smaller, shift the sorted crystal to the right.
6.  Place the key in its correct position among sorted crystals.
7.  Go to the next unsorted crystal.
8.  Repeat until the list is sorted.
9.  The list of lightsaber crystals is now sorted.

In [None]:
def insertion_sort(lightsaber_crystals):
  """Sorts a list of lightsaber crystals in ascending order, in place.

  Args:
    lightsaber_crystals: A list of lightsaber crystals.

  Returns:
    The same list object, now sorted.
  """

  # Iterate over the list, starting at the second element.
  for i in range(1, len(lightsaber_crystals)):
    # Store the current element in a variable called `key`.
    key = lightsaber_crystals[i]
    # Initialize a variable called `j` to the previous index.
    j = i - 1
    # While `j` is greater than or equal to 0 and `key` is less than the element at
    # index `j`, shift the element at index `j` to the right by one position.
    while j >= 0 and key < lightsaber_crystals[j]:
      lightsaber_crystals[j + 1] = lightsaber_crystals[j]
      j -= 1
    # Insert the `key` element at the index `j + 1`.
    lightsaber_crystals[j + 1] = key

  # Return the sorted list.
  return lightsaber_crystals

This is, like Bubble Sort, a "simple" algorithm that ends up being pretty slow.
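
How slow it is depends on the input. As a rough illustration, the sketch below counts how many times an element gets shifted to the right. The function name `insertion_sort_shifts` is a hypothetical helper, and it sorts a copy rather than the original list.

```python
def insertion_sort_shifts(values):
    """Sort a copy of values with insertion sort; return (sorted_list, shift_count)."""
    items = list(values)
    shifts = 0
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        while j >= 0 and key < items[j]:
            items[j + 1] = items[j]  # shift the larger element one slot right
            shifts += 1
            j -= 1
        items[j + 1] = key
    return items, shifts


print(insertion_sort_shifts([1, 2, 3, 4, 5]))  # ([1, 2, 3, 4, 5], 0)
print(insertion_sort_shifts([5, 4, 3, 2, 1]))  # ([1, 2, 3, 4, 5], 10)
```

An already-sorted list needs no shifts at all, while a reversed list of 5 items needs 10. The reversed case is the one that grows quadratically as the list gets longer.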

### Merge Sort: The Clone Trooper Strategy

Finally, C-3PO is presented with Clone Trooper IDs that need sorting. He employs Merge Sort, a divide-and-conquer algorithm. It divides the list into n sub-lists, each containing one element, and then repeatedly merges sub-lists to produce new sorted sub-lists until only one remains. This is a much faster way of sorting (at least for most lists). However, it's a bit more complex for us humans to understand! Steps:

1.  If the list has one or zero elements, it is already sorted, so stop.
2.  Otherwise, split the list of Clone Trooper IDs into two halves.
3.  Sort each half **recursively** using Merge Sort.
4.  Create pointers for the two halves and the merged list.
5.  While both halves have elements, do steps 6-7.
6.  Take the smaller of the two elements at the front of the halves.
7.  Add that element to the merged list and advance the matching pointer.
8.  Add any remaining elements from whichever half is not yet empty to the merged list.
9.  Replace the original list with the merged list.
10. The list of Clone Trooper IDs is now sorted.

In [None]:
def merge_sort(clone_trooper_ids):
  """Sorts a list of clone trooper IDs in ascending order, in place.

  Args:
    clone_trooper_ids: A list of clone trooper IDs.

  Returns:
    The same list object, now sorted.
  """

  if len(clone_trooper_ids) > 1:
    # Divide the list into two halves.
    mid = len(clone_trooper_ids) // 2
    L = clone_trooper_ids[:mid]
    R = clone_trooper_ids[mid:]

    # Recursively sort the two halves.
    merge_sort(L)
    merge_sort(R)

    # Merge the two sorted halves into the original list.
    i = j = k = 0
    while i < len(L) and j < len(R):
      if L[i] < R[j]:
        clone_trooper_ids[k] = L[i]
        i += 1
      else:
        clone_trooper_ids[k] = R[j]
        j += 1
      k += 1

    # Copy the remaining elements from the left half, if any.
    while i < len(L):
      clone_trooper_ids[k] = L[i]
      i += 1
      k += 1

    # Copy the remaining elements from the right half, if any.
    while j < len(R):
      clone_trooper_ids[k] = R[j]
      j += 1
      k += 1

  return clone_trooper_ids
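
The heart of Merge Sort is the merge step, so it can help to look at it on its own. Here is a minimal sketch that merges two lists that are already sorted. The function name `merge_sorted` is an assumption for illustration, and unlike the code above it builds a new list instead of writing back into the original.

```python
def merge_sorted(left, right):
    """Merge two already-sorted lists into one new sorted list."""
    merged = []
    i = j = 0
    # Repeatedly take the smaller front element from the two lists.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these still has elements; they are already in order.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sorted([2, 7, 9], [1, 3, 8, 10]))  # [1, 2, 3, 7, 8, 9, 10]
```

Each element is looked at only once during a merge, which is a big part of why Merge Sort scales so well.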


### Testing Our Sorting Algorithms
OK. So, we've defined three different ways of sorting. Let's reassure ourselves that these do, in fact, all sort lists of numbers in the same way.

In [None]:
import random

# Generate a list of unsorted numbers.
unsorted_numbers = random.sample(range(100), 10)

# Sort the list using bubble sort.
bubble_sorted_numbers = bubble_sort(unsorted_numbers)

# Sort the list using insertion sort.
insertion_sorted_numbers = insertion_sort(unsorted_numbers)

# Sort the list using merge sort.
merge_sorted_numbers = merge_sort(unsorted_numbers)

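# Print the results.
# Note: each function sorts the list in place and returns that same list, so
# `unsorted_numbers` has already been sorted by the time it is printed here.
print("Unsorted numbers:", unsorted_numbers)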
print("Bubble sorted numbers:", bubble_sorted_numbers)
print("Insertion sorted numbers:", insertion_sorted_numbers)
print("Merge sorted numbers:", merge_sorted_numbers)

Unsorted numbers: [10, 16, 28, 46, 57, 59, 62, 67, 76, 97]
Bubble sorted numbers: [10, 16, 28, 46, 57, 59, 62, 67, 76, 97]
Insertion sorted numbers: [10, 16, 28, 46, 57, 59, 62, 67, 76, 97]
Merge sorted numbers: [10, 16, 28, 46, 57, 59, 62, 67, 76, 97]
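
Notice that the "Unsorted numbers" line in the output is already sorted. That happens because all three functions sort the list in place and return the same list object. The sketch below shows the effect and one way around it, which is to pass a copy. `sort_in_place` is a hypothetical stand-in for the sorting functions above.

```python
def sort_in_place(items):
    """Stand-in for the sorts above: sorts the list in place and returns it."""
    items.sort()
    return items


numbers = [42, 7, 19]
result = sort_in_place(numbers)
print(result is numbers)  # True: both names refer to the same, now sorted, list

original = [42, 7, 19]
sorted_copy = sort_in_place(original.copy())  # sort a copy instead
print(original, sorted_copy)  # [42, 7, 19] [7, 19, 42]
```

Passing `unsorted_numbers.copy()` to each sorting function in the test above would keep the original order available for comparison.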


## Big O Notation
Big O notation is a way of describing the time complexity of algorithms, which is a measure of how long an algorithm takes to run as the input size grows larger. As we saw in our search-sort tradeoff case study, different algorithms can have very different performance characteristics, even when they are solving the same problem. Big O notation allows us to abstract away the specific details of an algorithm and focus on its overall efficiency as the input size grows.

In Big O notation, we use a mathematical formula to describe the worst-case time complexity of an algorithm, as a function of the input size n. We typically ignore lower-order terms and constant factors, since these become less significant as n grows large. For example, the Big O notation for bubble sort is O(n^2), which means that the worst-case running time of bubble sort grows quadratically with the input size. This is because bubble sort requires iterating over the entire array multiple times, which leads to a nested loop structure.
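
We can check the quadratic claim by counting comparisons directly. The sketch below uses the same nested loop structure as bubble sort, but only counts how many comparisons would be made. The function name `bubble_sort_comparisons` is an assumption for illustration.

```python
def bubble_sort_comparisons(n):
    """Count the comparisons bubble sort's nested loops make on a list of length n."""
    comparisons = 0
    for i in range(n):
        for j in range(0, n - i - 1):
            comparisons += 1
    return comparisons


for n in (10, 20, 40):
    print(n, bubble_sort_comparisons(n))
# 10 45
# 20 190
# 40 780
```

Each time the input doubles, the number of comparisons roughly quadruples, which is the signature of O(n^2) growth. The exact count is n(n-1)/2, and Big O keeps only the n^2 part.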


Knowing the Big O notation of an algorithm allows us to make informed decisions about which algorithm to use for a particular task, based on the size and characteristics of the input data. For example, if we are working with very large data sets, we may want to choose an algorithm with a lower Big O complexity, such as merge sort or quicksort, even if it requires more memory to run. Conversely, if we are working with very small data sets, a simpler algorithm like bubble sort or insertion sort may be sufficient, since their running time will still be relatively fast.

| Big O | Name | Example Algorithms |
| --- | --- | --- |
| O(1) | Constant | Hash table lookup |
| O(log n) | Logarithmic | Binary search |
| O(n) | Linear | Linear search |
| O(n log n) | Linearithmic | Merge sort, quicksort (average case) |
| O(n^2) | Quadratic | Bubble sort, insertion sort |
| O(n^3) | Cubic | Naive matrix multiplication |
| O(2^n) | Exponential | Brute-force search over all subsets |
| O(n!) | Factorial | Brute-force Traveling Salesman |

As you can see, different algorithms have different Big O complexities, which can have a significant impact on their performance as the input size grows. For example, binary search has a complexity of O(log n), which means that its running time grows very slowly as the input size grows. In contrast, bubble sort has a complexity of O(n^2), which means that its running time grows very quickly as the input size grows.
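
To get a feel for how slowly O(log n) grows, the sketch below counts how many times a list of n items can be cut in half before nothing is left. That count is the most steps binary search can take. The function name `max_binary_search_steps` is hypothetical.

```python
def max_binary_search_steps(n):
    """Count how many times a list of n items can be halved before it is empty."""
    steps = 0
    while n > 0:
        n //= 2
        steps += 1
    return steps


for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,} items: at most {max_binary_search_steps(n)} steps, "
          f"versus up to {n:,} for linear search")
```

Going from a thousand items to a billion only raises the step count from 10 to 30.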

By understanding the Big O complexity of different algorithms, we can make informed decisions about which algorithm to use for a particular task, based on the size and characteristics of the input data. This can help us optimize our code and avoid performance issues as our programs scale up to handle larger and more complex data sets.

## Big O in Action
To see how much difference this makes in running time, suppose an algorithm takes 1 unit of time (for example, 1 second) on an input of size 1 (for example, a single integer). Here is how the running time grows with the input size for each complexity class:


In [None]:
# Pandas table showing common Big O runtimes for numbers 1 - 20

import pandas as pd
import numpy as np

# Create an array of the numbers 1 - 20 using numpy
# (the stop value of np.arange is exclusive, so we pass 21)
numbers = np.arange(1, 21)

# Create dataframe by calculating the runtime for each number
df = pd.DataFrame({'Num Items to Sort': numbers,
                   'Constant Time': [1] * len(numbers),
                   'Logarithmic Time': np.log(numbers),
                   'Linear Time': numbers,
                   'Log Linear Time': numbers * np.log(numbers),
                   'Quadratic Time': numbers ** 2,
                   'Cubic Time': numbers ** 3,
                   'Exponential Time': 2 ** numbers})

round(df,2)

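| | Num Items to Sort | Constant Time | Logarithmic Time | Linear Time | Log Linear Time | Quadratic Time | Cubic Time | Exponential Time |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1 | 1 | 0.0 | 1 | 0.0 | 1 | 1 | 2 |
| 1 | 2 | 1 | 0.69 | 2 | 1.39 | 4 | 8 | 4 |
| 2 | 3 | 1 | 1.1 | 3 | 3.3 | 9 | 27 | 8 |
| 3 | 4 | 1 | 1.39 | 4 | 5.55 | 16 | 64 | 16 |
| 4 | 5 | 1 | 1.61 | 5 | 8.05 | 25 | 125 | 32 |
| 5 | 6 | 1 | 1.79 | 6 | 10.75 | 36 | 216 | 64 |
| 6 | 7 | 1 | 1.95 | 7 | 13.62 | 49 | 343 | 128 |
| 7 | 8 | 1 | 2.08 | 8 | 16.64 | 64 | 512 | 256 |
| 8 | 9 | 1 | 2.2 | 9 | 19.78 | 81 | 729 | 512 |
| 9 | 10 | 1 | 2.3 | 10 | 23.03 | 100 | 1000 | 1024 |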
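
To put these step counts in human terms, here is a rough sketch that converts them into days for an input of size 20. It assumes each step takes one second, which is purely illustrative, and it adds the factorial row from the Big O table using the standard-library `math.factorial`.

```python
import math

n = 20
seconds_per_step = 1  # assumption: one step takes one second

growth = {
    "Linear": n,
    "Quadratic": n ** 2,
    "Exponential": 2 ** n,
    "Factorial": math.factorial(n),
}

for name, steps in growth.items():
    days = steps * seconds_per_step / 86_400  # 86,400 seconds in a day
    print(f"{name:<12} {steps:>26,} steps  ~{days:,.2f} days")
```

Under this assumption, the quadratic algorithm finishes in under seven minutes, the exponential one needs about 12 days, and the factorial one would not finish in any practical amount of time.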


## Discussion Questions: The Search-Sort Tradeoff
1. What are the key differences between linear search and binary search in terms of their approach to searching through data? How do these differences impact their efficiency?

2. Can you explain the concept of Big O notation and how it is used to analyze the performance of algorithms? How do linear search and binary search compare in terms of their Big O notations?

3. What factors should be considered when choosing between linear search and binary search for a particular problem? How do these factors relate to the idea of "tradeoffs"?

4. In what situations would a linear search be more appropriate than a binary search, and vice versa? Provide real-world examples of when each algorithm might be the better choice. (These don't need to involve computers! Just think of any real-world activity that involves "searching" for something.)

5. Above, we introduced the idea of Big-O notation for "time complexity." However, we can also use it as a measure of "space complexity" (i.e., how much memory an algorithm uses). Which of the algorithms we studied above do you think have the highest space complexity? Why?


## Discussion Questions: Your Answers
1.

2.

3.

4.

5.

## Review with Quizlet
Run the following cell to launch the chapter review on Quizlet.

In [None]:
%%html
<iframe src="https://quizlet.com/821232086/learn/embed?i=psvlh&x=1jj1" height="600" width="100%" style="border:0"></iframe>

## Glossary

| Term | Definition |
| --- | --- |
| Algorithm | A finite sequence of unambiguous instructions or rules designed to effectively solve a particular problem. |
| Unambiguity | The clarity in design such that a step or process can only be interpreted in one way. |
| Finiteness | The attribute of having a definite and limited number of steps. |
| Effectiveness | The ability of a process or algorithm to achieve its intended result. |
| for-loop | A control flow statement in programming which executes a piece of code a specified number of times. |
| while-loop | A control flow statement that repeatedly executes a block of code as long as a certain condition is true. |
| Iterate | To repeat a set of steps in a systematic way, as programming loops do. |
| Input Validation | The process of checking if the data entered by a user meets specific criteria before it's processed. |
| Nested Loop | A loop within another loop, where the inner loop executes multiple times for each iteration of the outer loop. |
| Inner Loop | The loop that is inside another loop (the outer loop). This loop performs its entire cycle for every single iteration of the outer loop. |
| Outer Loop | The loop that encloses another loop (the inner loop). This loop goes through one iteration for each full set of iterations of the inner loop. |
| Search | The process of locating a specific item in a set of items. This is a common operation in computing and algorithms. |
| Linear Search | A search algorithm that traverses through a list sequentially until the desired element is found or the end of the list is reached. |
| Binary Search | A search algorithm that efficiently finds a target value within a sorted list or array by halving the search space at each step. |
| Sort | The process of arranging data in a particular order, typically numerical or lexicographical. |
| Bubble Sort | A simple sorting algorithm that repeatedly steps through the list, compares adjacent elements and swaps them if they are in the wrong order. |
| Insertion Sort | A simple sorting algorithm that builds the final sorted list one item at a time. It is generally less efficient on larger lists compared to more advanced algorithms such as quicksort or mergesort. |
| Merge Sort | A divide and conquer algorithm that splits a list in half, recursively sorts the halves, and then merges the sorted halves. |
| Quick Sort | A divide and conquer algorithm which picks an element as a pivot, partitions the array around the pivot, and then recursively applies the steps to the sub-arrays. |
| Big O | Notation used in computer science to describe how the running time (or memory use) of an algorithm grows with the input size. It is most often used to describe the worst-case scenario. |
| O(1) | Denotes an algorithm that always performs a fixed number of operations, regardless of the size of the input data set. Example: Accessing an array element by its index. |
| O(log n) | Denotes an algorithm whose performance grows logarithmically with the input size. Example: Binary search in a sorted array. |
| O(n) | Denotes an algorithm whose performance grows linearly and in direct proportion to the size of the input data set. Example: Linear search in an unsorted array. |
| O(n log n) | Denotes an algorithm whose performance is proportional to the number of items times the logarithm of the number of items. Example: Merge sort, or quicksort in the average case. |
| O(n^2) | Denotes an algorithm whose performance is proportional to the square of the size of the input data set. Example: Bubble sort or insertion sort. |
| Search vs Sort tradeoff | The concept that highlights the trade-off between time taken to search for items in a list and the time taken to sort the list. For example, sorting a list can make subsequent searches faster, but the initial sort may take considerable time. |