# Complexity Analysis

## Part 1: What Are Recurrence Relations?

### Intuitive Definition

A **recurrence relation** is a way to define a sequence where each term is expressed in terms of previous terms. It's like a recipe that tells you how to build the next step based on what you've already done.

**Real-world analogy**: Think of climbing stairs
- To get to step $n$, you can either:
  - Come from step $n-1$ and take 1 step, OR
  - Come from step $n-2$ and take 2 steps
- If $T(n)$ counts the number of ways to reach step $n$, this gives us: $T(n) = T(n-1) + T(n-2)$ (Fibonacci-like)

### Mathematical Definition

A recurrence relation expresses $T(n)$ in terms of $T(k)$ where $k < n$, plus some function of $n$.

**General form**: $T(n) = f(T(n-1), T(n-2), ..., T(n-k)) + g(n)$

**Base case(s)**: Values for small $n$ (like $T(0)$, $T(1)$) that stop the recursion.

## Part 2: Simple Examples to Build Intuition

### Example 1: Factorial

**Problem**: Calculate $n! = n \times (n-1) \times (n-2) \times ... \times 1$

**Recursive thinking**: $n! = n \times (n-1)!$

**Recurrence relation**:
$$T(n) = T(n-1) + c$$
where $c$ is the constant time to do one multiplication.

**Base case**: $T(1) = c$

**Let's trace it**:
```
T(4) = T(3) + c
T(3) = T(2) + c  
T(2) = T(1) + c
T(1) = c
```

**Substituting back**:
$$\begin{align}
T(4) &= T(3) + c \\
&= (T(2) + c) + c \\
&= ((T(1) + c) + c) + c \\
&= (c + c + c) + c \\
&= 4c
\end{align}$$

**Pattern**: $T(n) = nc = O(n)$
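
The pattern can be checked by counting calls directly. The sketch below is illustrative only: `factorial_with_calls` is a hypothetical helper (not part of the original text) that returns the number of recursive calls alongside the result, treating each call as one unit of cost $c$.

```python
def factorial_with_calls(n):
    """Return (n!, number of calls made), for n >= 1."""
    if n == 1:
        # Base case: T(1) = c, a single call.
        return 1, 1
    sub_value, sub_calls = factorial_with_calls(n - 1)
    # One more call on top of the calls for n - 1: T(n) = T(n-1) + c.
    return n * sub_value, sub_calls + 1


for n in (1, 4, 10):
    value, calls = factorial_with_calls(n)
    print(n, value, calls)  # the call count equals n
```

The call count grows one-for-one with $n$, which is the $T(n) = nc$ pattern.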

### Example 2: Sum of First n Numbers

**Problem**: Calculate $1 + 2 + 3 + ... + n$

**Recursive thinking**: $\text{Sum}(n) = n + \text{Sum}(n-1)$

**Recurrence relation**:
$$T(n) = T(n-1) + c$$

**Base case**: $T(1) = c$

**This is identical to factorial!** So $T(n) = O(n)$.

### Example 3: Array Traversal

**Problem**: Visit every element in an array of size $n$

**Recursive thinking**: Process first element, then process remaining $n-1$ elements

**Recurrence relation**:
$$T(n) = T(n-1) + c$$

**Base case**: $T(1) = c$

**Again, same pattern**: $T(n) = O(n)$

**Key insight**: Any algorithm that makes a single recursive call on a problem of size $n-1$ and does constant work per call will be $O(n)$.

## Part 3: The Divide-and-Conquer Pattern

### Example 4: Binary Search (Revisited)

**Problem**: Search in a sorted array by cutting it in half repeatedly

**Recursive thinking**: Compare with middle, then search half the remaining array

**Recurrence relation**:
$$T(n) = T\left(\frac{n}{2}\right) + c$$

**Base case**: $T(1) = c$

**Let's trace it for $n = 8$**:
```
T(8) = T(4) + c
T(4) = T(2) + c
T(2) = T(1) + c  
T(1) = c
```

**Substituting back**:
$$\begin{align}
T(8) &= T(4) + c \\
&= (T(2) + c) + c \\
&= ((T(1) + c) + c) + c \\
&= (c + c + c) + c \\
&= 4c
\end{align}$$

**How many steps**: $8 \to 4 \to 2 \to 1$ is 3 halvings, plus the base case = 4 units of work

**General pattern**: $T(n) = c(\log_2(n) + 1) = O(\log n)$

### Example 5: Merge Sort

**Problem**: Sort an array by dividing it in half, sorting each half, then merging

**Recursive thinking**: 
1. Sort left half: $T(n/2)$
2. Sort right half: $T(n/2)$  
3. Merge the halves: $O(n)$

**Recurrence relation**:
$$T(n) = 2T\left(\frac{n}{2}\right) + cn$$

**Base case**: $T(1) = c$

**Let's trace it for $n = 4$**:
```
T(4) = 2T(2) + 4c
T(2) = 2T(1) + 2c = 2c + 2c = 4c
```

**Substituting back**:
$$\begin{align}
T(4) &= 2T(2) + 4c \\
&= 2(4c) + 4c \\
&= 8c + 4c \\
&= 12c
\end{align}$$

**For general $n$, this gives**: $T(n) = O(n \log n)$
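
To see the $n \log n$ growth in running code, here is a minimal sketch of merge sort that also counts the elements written during merging. The name `merge_sort_count` and the choice to count only merge writes (so the base case costs 0 rather than $c$) are assumptions made for illustration.

```python
def merge_sort_count(values):
    """Return (sorted copy, number of element writes done while merging)."""
    if len(values) <= 1:
        return list(values), 0
    mid = len(values) // 2
    left, left_writes = merge_sort_count(values[:mid])
    right, right_writes = merge_sort_count(values[mid:])

    # Merge the two sorted halves; every element is written exactly once.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, left_writes + right_writes + len(merged)


print(merge_sort_count([5, 2, 8, 1, 9, 3, 7, 4]))
```

For $n = 8$ the merges write $8 + 2 \cdot 4 + 4 \cdot 2 = 24 = n \log_2 n$ elements, matching the $cn$ term summed over $\log_2 n$ merging levels.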

## Part 4: Systematic Solution Methods

### Method 1: Substitution (Expansion)

**Steps**:
1. Keep substituting the recurrence into itself
2. Look for a pattern
3. Express in terms of the base case
4. Simplify to get the final answer

**Example**: $T(n) = T(n-1) + c$, $T(1) = c$

$$\begin{align}
T(n) &= T(n-1) + c \\
&= (T(n-2) + c) + c \\
&= ((T(n-3) + c) + c) + c \\
&= T(n-k) + kc
\end{align}$$

**When do we hit base case?** When $n - k = 1$, so $k = n - 1$.

$$T(n) = T(1) + (n-1)c = c + (n-1)c = nc = O(n)$$

### Method 2: Tree Method (For Divide-and-Conquer)

**Example**: $T(n) = 2T(n/2) + cn$

**Draw the recursion tree**:
```
Level 0:           T(n)                    Cost: cn
                  /    \
Level 1:      T(n/2)  T(n/2)              Cost: cn/2 + cn/2 = cn
             /   \    /   \
Level 2: T(n/4) T(n/4) T(n/4) T(n/4)     Cost: cn/4 × 4 = cn
         ...
```

**Pattern**:
- Each level has cost $cn$
- Number of levels = $\log_2(n) + 1$ (levels $0$ through $\log_2 n$, where the base case is reached)
- **Total cost** = $cn \times (\log_2(n) + 1) = O(n \log n)$
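
The tree total can be compared with the recurrence itself. The sketch below evaluates $T(n) = 2T(n/2) + cn$ with $T(1) = c$ for powers of 2, assuming $c = 1$ (the constant `C` is a hypothetical stand-in), and checks it against $cn(\log_2 n + 1)$, i.e. one $cn$ per level including the leaf level.

```python
from functools import lru_cache

C = 1  # hypothetical cost constant c


@lru_cache(maxsize=None)
def T(n):
    """Evaluate T(n) = 2T(n/2) + c*n with T(1) = c, for n a power of 2."""
    if n == 1:
        return C
    return 2 * T(n // 2) + C * n


for k in range(0, 11):
    n = 2 ** k
    # k = log2(n), so the tree has k + 1 levels of cost c*n each.
    assert T(n) == C * n * (k + 1)

print(T(4), T(1024))
```

The extra $cn$ from the leaf level does not change the asymptotic result, which stays $O(n \log n)$.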

### Method 3: Master Theorem (Complete Explanation)

The Master Theorem is a powerful tool for solving divide-and-conquer recurrences, but it requires understanding the intuition behind it.

#### What Problems Does Master Theorem Solve?

**Form**: $T(n) = aT(n/b) + f(n)$ where:
- $a \geq 1$ = number of subproblems
- $b > 1$ = factor by which problem size shrinks
- $f(n)$ = cost of dividing and combining

#### Building Intuition: The Tree Perspective

When we have $T(n) = aT(n/b) + f(n)$, we can draw the recursion tree:

```
Level 0:           f(n)                    [1 node, each costs f(n)]
                  /  |  \
Level 1:      f(n/b) f(n/b) f(n/b)        [a nodes, each costs f(n/b)]
             /||\   /||\    /||\
Level 2:   f(n/b²) nodes...               [a² nodes, each costs f(n/b²)]
           ...
Level k:                                  [aᵏ nodes, each costs f(n/bᵏ)]
```

**Key questions**:
1. How many levels are there? $\log_b(n) + 1$ levels (level $0$ through level $\log_b n$)
2. How many nodes at level $i$? $a^i$ nodes
3. What's the cost at level $i$? $a^i \cdot f(n/b^i)$
4. What's the total cost? Sum over all levels
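
These four questions translate into a few lines of code. The following is a rough sketch, assuming $n$ is an exact power of $b$ and charging $f$ at every level including the leaves; `level_costs` is a hypothetical helper introduced here for illustration.

```python
def level_costs(a, b, f, n):
    """Return the list of per-level costs a^i * f(n / b^i) for T(n) = a*T(n/b) + f(n).

    Assumes n is a power of b, so subproblem sizes stay integral down to 1.
    """
    costs = []
    nodes, size = 1, n
    while size >= 1:
        costs.append(nodes * f(size))  # a^i nodes, each costing f(n / b^i)
        nodes *= a
        size //= b
    return costs


# The three recurrences used as case examples in this section, with n = 8:
print(level_costs(4, 2, lambda m: m, 8))      # costs grow toward the leaves
print(level_costs(2, 2, lambda m: m, 8))      # every level costs the same
print(level_costs(2, 2, lambda m: m * m, 8))  # costs shrink away from the root
```

Summing a list gives the total cost; which end of the list is largest shows whether the leaves, all levels equally, or the root dominate.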

#### The Critical Value: $n^{\log_b a}$

**Why is $n^{\log_b a}$ important?**

At the deepest level (level $\log_b n$), we have:
- Number of nodes = $a^{\log_b n}$
- But $a^{\log_b n} = (b^{\log_b a})^{\log_b n} = b^{\log_b a \cdot \log_b n} = (b^{\log_b n})^{\log_b a} = n^{\log_b a}$

So we have $n^{\log_b a}$ leaves, each with cost $T(1) = O(1)$.

**Leaf level cost** = $n^{\log_b a} \cdot O(1) = O(n^{\log_b a})$

#### The Three Cases Explained

The Master Theorem compares $f(n)$ with $n^{\log_b a}$ to see which dominates:

##### Case 1: Tree is Bottom-Heavy

**Condition**: $f(n) = O(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$

**Translation**: $f(n)$ grows slower than $n^{\log_b a}$

**Intuition**: The leaves dominate the total cost

**Result**: $T(n) = O(n^{\log_b a})$

**Example**: $T(n) = 4T(n/2) + n$
- $a = 4$, $b = 2$, so $n^{\log_b a} = n^{\log_2 4} = n^2$
- $f(n) = n = O(n^{2-1}) = O(n^{2-\epsilon})$ with $\epsilon = 1$
- Case 1 applies: $T(n) = O(n^2)$

**Tree visualization**:
```
Level 0:  1 node of size n       [Cost: n]
Level 1:  4 nodes of size n/2    [Cost: 4 × n/2 = 2n]
Level 2:  16 nodes of size n/4   [Cost: 16 × n/4 = 4n]
          ...                    (cost doubles at every level)
Leaves:   n² nodes of size 1     [Cost: n² × O(1) = n²]  ← DOMINATES
```

##### Case 2: Tree is Balanced

**Condition**: $f(n) = \Theta(n^{\log_b a})$

**Translation**: $f(n)$ grows at the same rate as $n^{\log_b a}$

**Intuition**: All levels contribute equally

**Result**: $T(n) = O(n^{\log_b a} \log n)$

**Example**: $T(n) = 2T(n/2) + n$ (merge sort)
- $a = 2$, $b = 2$, so $n^{\log_b a} = n^{\log_2 2} = n^1 = n$
- $f(n) = n = \Theta(n^1) = \Theta(n^{\log_b a})$
- Case 2 applies: $T(n) = O(n \log n)$

**Tree visualization**:
```
Level 0:      n          [Cost: n]
            /   \
Level 1:   n/2  n/2      [Cost: n]
          / |   | \
Level 2: n/4 n/4 n/4 n/4 [Cost: n]
         ...
All levels have cost n, and there are log n levels
Total: n × log n
```

##### Case 3: Tree is Top-Heavy

**Condition**: $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some $\epsilon > 0$, AND the regularity condition holds

**Translation**: $f(n)$ grows faster than $n^{\log_b a}$

**Intuition**: The root dominates the total cost

**Result**: $T(n) = O(f(n))$

**Regularity condition**: $af(n/b) \leq cf(n)$ for some $c < 1$ and sufficiently large $n$

**Example**: $T(n) = 2T(n/2) + n^2$
- $a = 2$, $b = 2$, so $n^{\log_b a} = n^1 = n$
- $f(n) = n^2 = \Omega(n^{1+1}) = \Omega(n^{\log_b a + \epsilon})$ with $\epsilon = 1$
- Check regularity: $af(n/b) = 2(n/2)^2 = 2n^2/4 = n^2/2 \leq cn^2$ with $c = 1/2 < 1$ ✓
- Case 3 applies: $T(n) = O(n^2)$

**Tree visualization**:
```
Level 0:      n²         [Cost: n²]  ← DOMINATES
            /   \
Level 1:  (n/2)² (n/2)²  [Cost: 2 × n²/4 = n²/2]
          / |     | \
Level 2: 4 × (n/4)²      [Cost: 4 × n²/16 = n²/4]
         ...
Each level is half the cost of the previous level
```

#### Step-by-Step Application Process

**Given**: $T(n) = aT(n/b) + f(n)$

**Step 1**: Identify $a$, $b$, and $f(n)$

**Step 2**: Calculate $n^{\log_b a}$

**Step 3**: Compare $f(n)$ with $n^{\log_b a}$:
- Is $f(n)$ polynomially smaller? → Case 1
- Is $f(n)$ the same order? → Case 2
- Is $f(n)$ polynomially larger? → Check regularity → Case 3

**Step 4**: Apply the appropriate case formula
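
For the common situation where $f(n) = \Theta(n^d)$ is a plain polynomial, the four steps can be sketched as a small classifier. This is an illustrative helper under that assumption, not a general implementation: `master_theorem_poly` is a hypothetical name, and it relies on the fact that for polynomial $f$ the regularity condition holds automatically whenever $d > \log_b a$, since $a(n/b)^d = (a/b^d)\,n^d$ with $a/b^d < 1$.

```python
import math


def master_theorem_poly(a, b, d):
    """Classify T(n) = a*T(n/b) + Theta(n^d).

    Returns (case, exponent, has_log_factor), meaning
    T(n) = Theta(n^exponent * log n) if has_log_factor else Theta(n^exponent).
    """
    if a < 1 or b <= 1 or d < 0:
        raise ValueError("need a >= 1, b > 1, d >= 0")
    critical = math.log(a) / math.log(b)  # Step 2: log_b(a)
    # Step 3: compare d with the critical exponent (tolerating float rounding).
    if math.isclose(d, critical, abs_tol=1e-12):
        return 2, d, True          # same order: extra log factor
    if d < critical:
        return 1, critical, False  # leaves dominate
    return 3, d, False             # root dominates


print(master_theorem_poly(4, 2, 1))  # Case 1 example from above
print(master_theorem_poly(2, 2, 1))  # merge sort, Case 2
print(master_theorem_poly(2, 2, 2))  # Case 3 example from above
```

Driving functions with logarithmic factors, such as $n \log n$ or $n / \log n$, fall outside this sketch and need the full case conditions.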

#### Common Examples Worked Out

**Example 1**: $T(n) = 3T(n/4) + n \log n$
- $a = 3$, $b = 4$, $f(n) = n \log n$
- $n^{\log_b a} = n^{\log_4 3} = n^{0.79...}$
- Since $n \log n = \Omega(n^{0.79 + \epsilon})$ for, e.g., $\epsilon = 0.2$, this looks like Case 3
- Check regularity: $3f(n/4) = 3 \cdot \frac{n}{4} \log(\frac{n}{4}) = \frac{3n}{4}(\log n - \log 4)$
- For large $n$: $\frac{3n}{4}(\log n - \log 4) \leq \frac{3n \log n}{4} = \frac{3}{4} \cdot f(n)$ with $c = 3/4 < 1$ ✓
- **Result**: $T(n) = O(n \log n)$

**Example 2**: $T(n) = 9T(n/3) + n^2$
- $a = 9$, $b = 3$, $f(n) = n^2$
- $n^{\log_b a} = n^{\log_3 9} = n^2$
- Since $f(n) = n^2 = \Theta(n^2)$, this is Case 2
- **Result**: $T(n) = O(n^2 \log n)$

#### When Master Theorem Doesn't Apply

**Gap cases**: When $f(n)$ is between Case 1 and Case 2, or between Case 2 and Case 3.

**Example**: $T(n) = 2T(n/2) + \frac{n}{\log n}$
- $a = 2$, $b = 2$, $n^{\log_b a} = n$
- $f(n) = \frac{n}{\log n}$ grows more slowly than $n$, but not polynomially more slowly (there is no $\epsilon > 0$ with $f(n) = O(n^{1-\epsilon})$)
- Master Theorem doesn't apply - need other methods

**Wrong form**: Recurrences that don't fit $T(n) = aT(n/b) + f(n)$

**Examples where it doesn't work**:
- $T(n) = T(n-1) + n$ (not divide-and-conquer)
- $T(n) = T(n/2) + T(n/3) + n$ (different subproblem sizes)
- $T(n) = 2T(n/2 + 1) + n$ (subproblem size isn't exactly $n/b$)

## Part 5: Worked Examples Step-by-Step

### Example A: Linear Reduction

**Recurrence**: $T(n) = T(n-1) + 5$, $T(1) = 3$

**Step 1 - Substitution**:
$$\begin{align}
T(n) &= T(n-1) + 5 \\
&= (T(n-2) + 5) + 5 \\
&= ((T(n-3) + 5) + 5) + 5 \\
&= T(n-k) + 5k
\end{align}$$

**Step 2 - Find base case**: $n - k = 1 \Rightarrow k = n - 1$

**Step 3 - Substitute**: $T(n) = T(1) + 5(n-1) = 3 + 5n - 5 = 5n - 2$

**Step 4 - Big-O**: $T(n) = O(n)$

### Example B: Binary Division

**Recurrence**: $T(n) = T(n/2) + 7$, $T(1) = 2$

**Step 1 - Substitution**:
$$\begin{align}
T(n) &= T(n/2) + 7 \\
&= (T(n/4) + 7) + 7 \\
&= ((T(n/8) + 7) + 7) + 7 \\
&= T(n/2^k) + 7k
\end{align}$$

**Step 2 - Find base case**: $n/2^k = 1 \Rightarrow 2^k = n \Rightarrow k = \log_2(n)$

**Step 3 - Substitute**: $T(n) = T(1) + 7\log_2(n) = 2 + 7\log_2(n)$

**Step 4 - Big-O**: $T(n) = O(\log n)$
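
Following the "check your work" habit, the closed forms from Examples A and B can be compared against the recurrences numerically. The function names `T_a` and `T_b` are hypothetical, and Example B is only evaluated at powers of 2 so that $n/2$ stays an integer.

```python
def T_a(n):
    """Example A: T(n) = T(n-1) + 5, T(1) = 3 (iterated to avoid deep recursion)."""
    value = 3
    for _ in range(n - 1):
        value += 5
    return value


def T_b(n):
    """Example B: T(n) = T(n/2) + 7, T(1) = 2, for n a power of 2."""
    return 2 if n == 1 else T_b(n // 2) + 7


for n in (1, 2, 10, 1000):
    assert T_a(n) == 5 * n - 2       # closed form from Example A

for k in range(0, 20):
    assert T_b(2 ** k) == 2 + 7 * k  # closed form from Example B, k = log2(n)

print(T_a(10), T_b(1024))
```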

### Example C: Double Recursion (Fibonacci-style)

**Recurrence**: $T(n) = T(n-1) + T(n-2) + c$, $T(1) = T(2) = c$

**This is tricky!** Let's use substitution carefully:

**Step 1 - Lower bound**: 
$T(n) \geq T(n-2) + T(n-2) = 2T(n-2)$

Unrolling this about $n/2$ times gives $T(n) \geq 2^{(n-2)/2} \cdot c$, so $T(n) = \Omega(2^{n/2})$

**Step 2 - Upper bound**:
$T(n) \leq T(n-1) + T(n-1) = 2T(n-1)$

This gives us $T(n) \leq 2^n \cdot c$, so $T(n) = O(2^n)$

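**Result**: $T(n) = O(2^n)$ (exponential - very expensive!). The two bounds bracket the true growth rate, which is $\Theta(\varphi^n)$ with $\varphi = \frac{1+\sqrt{5}}{2} \approx 1.618$.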
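
A quick numeric check makes the bounds concrete. This sketch assumes $c = 1$ and evaluates the recurrence with memoization; it verifies that the values stay between $2^{(n-2)/2}$ (a lower bound of the $2^{n/2}$ form, up to a constant factor) and $2^n$.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(n-1) + T(n-2) + c with T(1) = T(2) = c, taking c = 1."""
    if n <= 2:
        return 1
    return T(n - 1) + T(n - 2) + 1


for n in range(1, 40):
    # Exponential lower and upper bounds from Steps 1 and 2.
    assert 2 ** ((n - 2) / 2) <= T(n) <= 2 ** n

print(T(30) / T(29))  # ratio of consecutive terms, close to 1.618
```

The ratio of consecutive values settles near the golden ratio, so the true growth sits strictly between the two bounds.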

## Part 6: Common Recurrence Patterns

### Pattern 1: Linear Reduction
**Form**: $T(n) = T(n-1) + f(n)$
**Solution**: $T(n) = \sum_{i=1}^{n} f(i)$
**Common case**: If $f(n) = c$, then $T(n) = O(n)$

**Examples**:
- Linear search: $T(n) = T(n-1) + c \Rightarrow O(n)$
- Insertion sort (worst case): $T(n) = T(n-1) + cn \Rightarrow O(n^2)$

### Pattern 2: Binary Division  
**Form**: $T(n) = T(n/2) + f(n)$
**Common case**: If $f(n) = c$, then $T(n) = O(\log n)$

**Examples**:
- Binary search: $T(n) = T(n/2) + c \Rightarrow O(\log n)$
- Rounds in a single-elimination tournament: $T(n) = T(n/2) + c \Rightarrow O(\log n)$ rounds (finding the max itself still takes $n - 1$ comparisons in total)

### Pattern 3: Divide and Conquer
**Form**: $T(n) = aT(n/b) + f(n)$
**Solution**: Use Master Theorem

**Examples**:
- Merge sort: $T(n) = 2T(n/2) + cn \Rightarrow O(n \log n)$
- Quick sort (balanced splits): $T(n) = 2T(n/2) + cn \Rightarrow O(n \log n)$; the average case has a different recurrence but the same $O(n \log n)$ bound
- Karatsuba multiplication: $T(n) = 3T(n/2) + cn \Rightarrow O(n^{\log_2 3}) \approx O(n^{1.585})$

### Pattern 4: Multiple Branches
**Form**: $T(n) = T(n-1) + T(n-2) + ...$
**Solution**: Usually exponential

**Examples**:
- Naive Fibonacci: $T(n) = T(n-1) + T(n-2) + c \Rightarrow O(2^n)$
- Tower of Hanoi: $T(n) = 2T(n-1) + c \Rightarrow O(2^n)$

## Part 7: Practice Problems with Solutions

### Problem 1: Mystery Algorithm A
**Recurrence**: $T(n) = T(n-2) + 3$, $T(1) = T(2) = 5$

**Solution**:
- For even $n$: $T(n) = T(n-2) + 3 = T(n-4) + 6 = ... = T(2) + 3(n/2 - 1) = 5 + 3n/2 - 3 = 3n/2 + 2$
- For odd $n$: $T(n) = T(n-2) + 3 = T(n-4) + 6 = ... = T(1) + 3((n-1)/2) = 5 + 3(n-1)/2$

**Result**: $T(n) = O(n)$
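
The even and odd closed forms are easy to get wrong by one, so a direct check is worthwhile. In this sketch `closed_form` is a hypothetical helper encoding the two formulas derived above.

```python
def T(n):
    """T(n) = T(n-2) + 3 with T(1) = T(2) = 5."""
    return 5 if n <= 2 else T(n - 2) + 3


def closed_form(n):
    """Closed forms from the solution, split by parity."""
    if n % 2 == 0:
        return 3 * n // 2 + 2        # even n: 3n/2 + 2
    return 5 + 3 * (n - 1) // 2      # odd n: 5 + 3(n-1)/2


assert all(T(n) == closed_form(n) for n in range(1, 200))
print(T(3), T(4), T(10))
```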

### Problem 2: Mystery Algorithm B  
**Recurrence**: $T(n) = 3T(n/3) + 2n$, $T(1) = 1$

**Using Master Theorem**:
- $a = 3$, $b = 3$, $f(n) = 2n$
- $n^{\log_b a} = n^{\log_3 3} = n^1 = n$
- Since $f(n) = 2n = \Theta(n) = \Theta(n^{\log_b a})$, we're in Case 2
- **Result**: $T(n) = O(n \log n)$

### Problem 3: Mystery Algorithm C
**Recurrence**: $T(n) = T(n/4) + T(3n/4) + n$, $T(1) = 1$

**Using tree method**:
```
Level 0:              n                     Cost: n
                   /     \
Level 1:         n/4     3n/4               Cost: n/4 + 3n/4 = n  
               /   \     /    \
Level 2:    n/16  3n/16 3n/16 9n/16        Cost: n/16 + 3n/16 + 3n/16 + 9n/16 = n
```

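**Pattern**: Each complete level costs $n$ (deeper, partially filled levels cost at most $n$), and the longest path, which follows the $3n/4$ branch, has $\log_{4/3} n = O(\log n)$ levels.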
**Result**: $T(n) = O(n \log n)$
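
Because the Master Theorem does not cover unequal splits, a numeric sanity check is a reasonable complement to the tree argument. The sketch below makes two assumptions that are not in the problem statement: sizes are rounded so that the two parts add up to exactly $n$, and every $n < 4$ is treated as a base case of cost 1.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(n/4) + T(3n/4) + n on integers (see assumptions above)."""
    if n < 4:
        return 1
    quarter = n // 4
    return T(quarter) + T(n - quarter) + n


for n in (64, 1024, 100_000):
    # If T(n) = Theta(n log n), this ratio should stay within a constant range.
    print(n, T(n) / (n * math.log2(n)))
```

The printed ratios staying bounded as $n$ grows is consistent with $T(n) = O(n \log n)$, though a numeric check is not a proof.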

## Part 8: Common Mistakes and How to Avoid Them

### Mistake 1: Forgetting Base Cases
**Wrong**: $T(n) = T(n-1) + c$ (what happens when $n = 0$?)
**Right**: $T(n) = T(n-1) + c$, $T(1) = c$

### Mistake 2: Incorrect Substitution
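**Wrong**: $T(n) = T(n/2) + n$, then jumping to $T(n) = nT(1)$ without expanding the recurrence
**Right**: Expand and account for every level: $T(n) = n + n/2 + n/4 + \dots + T(1) < 2n + T(1)$, so $T(n) = O(n)$ because the per-level work shrinks geometrically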

### Mistake 3: Misapplying Master Theorem
**Wrong**: Using $T(n) = T(n-1) + n$ with Master Theorem
**Right**: Master Theorem only applies to $T(n) = aT(n/b) + f(n)$ form

### Mistake 4: Ignoring Non-Dominant Terms
**Example**: $T(n) = T(n/2) + \log n + 100$
**Common error**: Treating the $+100$ as significant
**Truth**: The $\log n$ term dominates for large $n$, constant is negligible

## Part 9: Why Recurrence Relations Matter

### Algorithm Analysis
Recurrence relations are the mathematical foundation for analyzing:
- **Recursive algorithms**: Direct translation of code to math
- **Divide-and-conquer**: Breaking problems into smaller pieces  
- **Dynamic programming**: Understanding overlapping subproblems
- **Tree algorithms**: Height and traversal analysis

### Practical Applications
Understanding recurrences helps you:
1. **Predict performance**: Will your algorithm scale?
2. **Compare approaches**: Which recursive strategy is better?
3. **Optimize code**: Identify bottlenecks in recursive solutions
4. **Design algorithms**: Choose the right recursive structure

### Real-World Examples
- **Database indexing**: B-tree operations follow $T(n) = T(n/m) + O(\log m)$
- **Graphics rendering**: Quad-tree operations are $T(n) = 4T(n/4) + O(1)$
- **Network routing**: Shortest-path algorithms are often defined recursively
- **Machine learning**: Many ML algorithms have a recursive structure (for example, decision-tree training splits the data recursively)

## Part 10: Summary and Key Takeaways

### The Big Picture
1. **Recurrence relations** express how algorithms break down problems
2. **Solution methods** include substitution, tree method, and Master Theorem
3. **Common patterns** emerge based on how problem size reduces
4. **Big-O analysis** follows naturally from recurrence solutions

### Pattern Recognition Guide
| Problem Reduction | Typical Recurrence | Time Complexity |
|-------------------|-------------------|-----------------|
| Subtract constant | $T(n) = T(n-1) + f(n)$ | $O(\sum f(i))$ |
| Divide by constant | $T(n) = T(n/c) + f(n)$ | $O(f(n) \log n)$ for non-decreasing $f$ (tight when $f$ is constant) |
| Split into parts | $T(n) = aT(n/b) + f(n)$ | Use Master Theorem |
| Multiple recursion | $T(n) = T(n-1) + T(n-2)$ | Often exponential |

### Master the Fundamentals
- **Start simple**: Linear and logarithmic patterns first
- **Practice substitution**: Most reliable method for beginners  
- **Draw trees**: Visual approach helps with divide-and-conquer
- **Check your work**: Verify with small examples

**Remember**: Recurrence relations are the bridge between recursive thinking and mathematical analysis - master them to understand algorithm complexity at a deep level!

---

# Linear Search Complexity

## Part 1: Understanding the Algorithm First

### What is Linear Search?

Linear search is the most intuitive search algorithm - it's exactly how you'd naturally look for something:

1. Start at the beginning
2. Check each item one by one
3. If you find what you're looking for, stop
4. If you reach the end without finding it, it's not there

Think of it like looking for your keys by checking every pocket, drawer, and surface in order until you find them (or run out of places to look).

### Visual Example: Finding 7 in [3,1,4,1,5,9,2,6,7,3,5,8]

```
Step 1: Check position 0: 3 ≠ 7, continue
        [3,1,4,1,5,9,2,6,7,3,5,8]
         ↑

Step 2: Check position 1: 1 ≠ 7, continue  
        [3,1,4,1,5,9,2,6,7,3,5,8]
           ↑

Step 3: Check position 2: 4 ≠ 7, continue
        [3,1,4,1,5,9,2,6,7,3,5,8]
             ↑

Step 4: Check position 3: 1 ≠ 7, continue
        [3,1,4,1,5,9,2,6,7,3,5,8]
               ↑

Step 5: Check position 4: 5 ≠ 7, continue
        [3,1,4,1,5,9,2,6,7,3,5,8]
                 ↑

Step 6: Check position 5: 9 ≠ 7, continue
        [3,1,4,1,5,9,2,6,7,3,5,8]
                   ↑

Step 7: Check position 6: 2 ≠ 7, continue
        [3,1,4,1,5,9,2,6,7,3,5,8]
                     ↑

Step 8: Check position 7: 6 ≠ 7, continue
        [3,1,4,1,5,9,2,6,7,3,5,8]
                       ↑

Step 9: Check position 8: 7 = 7, Found it! ✓
        [3,1,4,1,5,9,2,6,7,3,5,8]
                         ↑
```

**Result**: Found 7 at position 8 after 9 comparisons.
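
The trace above can be reproduced with a version of linear search that counts its comparisons. `linear_search_count` is a hypothetical name used here for illustration.

```python
def linear_search_count(arr, target):
    """Return (index of first match or -1, number of comparisons made)."""
    comparisons = 0
    for i, value in enumerate(arr):
        comparisons += 1
        if value == target:
            return i, comparisons
    return -1, comparisons


data = [3, 1, 4, 1, 5, 9, 2, 6, 7, 3, 5, 8]
print(linear_search_count(data, 7))   # (8, 9): position 8 after 9 comparisons
print(linear_search_count(data, 42))  # (-1, 12): every element was checked
```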

## Part 2: Analyzing Different Scenarios

### Best Case: Element is at the Beginning

Looking for 3 in [3,1,4,1,5,9,2,6,7,3,5,8]:
```
Step 1: Check position 0: 3 = 3, Found it! ✓
        [3,1,4,1,5,9,2,6,7,3,5,8]
         ↑
```
**Comparisons needed**: 1

### Worst Case: Element is at the End (or Not There)

#### Element at the end - Looking for 8:
```
Step 1-11: Check positions 0-10: all ≠ 8
Step 12: Check position 11: 8 = 8, Found it! ✓
         [3,1,4,1,5,9,2,6,7,3,5,8]
                                 ↑
```
**Comparisons needed**: 12 (the full array size)

#### Element not in array - Looking for 42:
```
Step 1-12: Check all positions 0-11: all ≠ 42
Result: Not found
```
**Comparisons needed**: 12 (still the full array size)

### Average Case: Element Could Be Anywhere

If the element we're looking for is equally likely to be at any position:

**Positions**: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11  
**Comparisons needed**: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

**Average comparisons** = $\frac{1 + 2 + 3 + ... + n}{n} = \frac{n(n+1)/2}{n} = \frac{n+1}{2}$

For our 12-element array: $\frac{12+1}{2} = 6.5$ comparisons on average.
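
The averaging step can be written out explicitly. This small sketch uses exact fractions to avoid rounding; `average_comparisons` is a hypothetical helper, and it assumes the target is present and equally likely at each position.

```python
from fractions import Fraction


def average_comparisons(n):
    """Average comparisons when the target is equally likely at each of n positions."""
    # A target at 0-based position i is found after i + 1 comparisons.
    total = sum(position + 1 for position in range(n))
    return Fraction(total, n)


print(average_comparisons(12))         # 13/2
print(float(average_comparisons(12)))  # 6.5
```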

## Part 3: Building the Pattern - Different Array Sizes

### Let's Count Steps for Various Sizes

#### Size 1: [5]
- **Best case**: 1 comparison (element is there)
- **Worst case**: 1 comparison (element not there)  
- **Average case**: 1 comparison

#### Size 3: [2,7,1]
- **Best case**: 1 comparison (first element)
- **Worst case**: 3 comparisons (last element or not found)
- **Average case**: $\frac{1+2+3}{3} = 2$ comparisons

#### Size 5: [9,3,1,8,4]
- **Best case**: 1 comparison
- **Worst case**: 5 comparisons  
- **Average case**: $\frac{1+2+3+4+5}{5} = 3$ comparisons

### The Pattern Emerges

| Array Size (n) | Best Case | Worst Case | Average Case |
|----------------|-----------|------------|--------------|
| 1              | 1         | 1          | 1.0          |
| 2              | 1         | 2          | 1.5          |
| 3              | 1         | 3          | 2.0          |
| 5              | 1         | 5          | 3.0          |
| 10             | 1         | 10         | 5.5          |
| 100            | 1         | 100        | 50.5         |
| 1000           | 1         | 1000       | 500.5        |

**Key Observation**: The worst case grows **linearly** with array size!

## Part 4: Mathematical Analysis

### Setting Up the Analysis

Let $T(n)$ = time to search in an array of size $n$

**What happens in linear search?**
- In the **best case**: Check first element, done → $T(n) = O(1)$
- In the **worst case**: Check all $n$ elements → $T(n) = O(n)$  
- In the **average case**: Check about $n/2$ elements → $T(n) = O(n)$

### Detailed Worst-Case Analysis

**Step-by-step for worst case**:
1. Check element at position 0: 1 comparison
2. Check element at position 1: 1 comparison  
3. Check element at position 2: 1 comparison
4. ...
5. Check element at position $n-1$: 1 comparison

**Total comparisons** = $1 + 1 + 1 + ... + 1$ ($n$ times) = $n$

Therefore: $T(n) = n = O(n)$

### Average-Case Analysis (More Detailed)

**Assumption**: The target is in the array with probability $p$, and if it is present it is equally likely to be at any of the $n$ positions.

**Scenario 1**: Element is in the array (probability = $p$)
- Could be at position 1, 2, 3, ..., $n$ with equal probability $\frac{p}{n}$
- Expected comparisons = $\frac{p}{n}(1 + 2 + 3 + ... + n) = \frac{p}{n} \cdot \frac{n(n+1)}{2} = \frac{p(n+1)}{2}$

**Scenario 2**: Element is not in array (probability = $1-p$)  
- Must check all $n$ elements
- Expected comparisons = $(1-p) \cdot n$

**Total expected comparisons**:
$$E[T(n)] = \frac{p(n+1)}{2} + (1-p)n = \frac{p(n+1) + 2(1-p)n}{2} = \frac{pn + p + 2n - 2pn}{2} = \frac{2n - pn + p}{2}$$

**Special case** (element is definitely in array, $p = 1$):
$$E[T(n)] = \frac{n+1}{2} = O(n)$$

**Common case** (element may or may not be there, $p = 0.5$):
$$E[T(n)] = \frac{2n - 0.5n + 0.5}{2} = \frac{1.5n + 0.5}{2} = 0.75n + 0.25 = O(n)$$
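
The expected-value formula can be double-checked by summing over the individual outcomes. This is a sketch under the same probability model as above; `expected_comparisons` is a hypothetical helper and uses exact fractions.

```python
from fractions import Fraction


def expected_comparisons(n, p):
    """Expected comparisons: target present with probability p, uniform over n positions."""
    p = Fraction(p)
    # Found at the k-th check with probability p / n, for k = 1..n.
    present = sum((p / n) * k for k in range(1, n + 1))
    # Absent with probability 1 - p, which costs a full scan of n checks.
    absent = (1 - p) * n
    return present + absent


n = 12
print(expected_comparisons(n, 1))               # equals (n + 1) / 2
print(expected_comparisons(n, Fraction(1, 2)))  # equals 0.75 n + 0.25
```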

## Part 5: Concrete Walkthrough Examples

### Example 1: Searching for "Emma" in a Class List

Names: `["Alice", "Bob", "Charlie", "David", "Emma", "Frank", "Grace"]`

```
Step 1: Compare "Emma" with "Alice" → Not equal, continue
Step 2: Compare "Emma" with "Bob" → Not equal, continue  
Step 3: Compare "Emma" with "Charlie" → Not equal, continue
Step 4: Compare "Emma" with "David" → Not equal, continue
Step 5: Compare "Emma" with "Emma" → Equal! Found at position 4
```

**Result**: 5 comparisons for array of size 7.

### Example 2: Searching for "Zoe" (Not in List)

```
Step 1: Compare "Zoe" with "Alice" → Not equal, continue
Step 2: Compare "Zoe" with "Bob" → Not equal, continue
Step 3: Compare "Zoe" with "Charlie" → Not equal, continue  
Step 4: Compare "Zoe" with "David" → Not equal, continue
Step 5: Compare "Zoe" with "Emma" → Not equal, continue
Step 6: Compare "Zoe" with "Frank" → Not equal, continue
Step 7: Compare "Zoe" with "Grace" → Not equal, continue
End of array reached → Not found
```

**Result**: 7 comparisons for array of size 7 (checked every element).

### Example 3: Finding Multiple Occurrences

Looking for all occurrences of 3 in `[1,3,7,3,9,3,2]`:

```
Step 1: 1 ≠ 3, continue
Step 2: 3 = 3, found at position 1, continue searching
Step 3: 7 ≠ 3, continue  
Step 4: 3 = 3, found at position 3, continue searching
Step 5: 9 ≠ 3, continue
Step 6: 3 = 3, found at position 5, continue searching
Step 7: 2 ≠ 3, continue
End reached
```

**Result**: Still needed 7 comparisons to find all occurrences, but found 3 matches.

## Part 6: Why Linear Growth Matters

### The Impact of Size

Here's how linear search performance degrades:

| Array Size | Best Case | Average Case | Worst Case |
|------------|-----------|--------------|------------|
| 10         | 1         | 5.5          | 10         |
| 100        | 1         | 50.5         | 100        |
| 1,000      | 1         | 500.5        | 1,000      |
| 10,000     | 1         | 5,000.5      | 10,000     |
| 100,000    | 1         | 50,000.5     | 100,000    |
| 1,000,000  | 1         | 500,000.5    | 1,000,000  |

**Key insight**: When you double the array size, you double the worst-case time!

### Real-World Implications

**Example**: Searching through a phone book
- 1,000 names: Up to 1,000 comparisons  
- 10,000 names: Up to 10,000 comparisons
- 100,000 names: Up to 100,000 comparisons

This is why we need more efficient algorithms for large datasets!

### The "Doubling" Problem

Unlike binary search's logarithmic growth, linear search has a **linear growth problem**:

**Why**:
- Array of size $n$: up to $n$ comparisons in the worst case
- Array of size $2n$: up to $2n$ comparisons, exactly twice as many

**Doubling the input doubles the time** - this doesn't scale well!

## Part 7: Space Complexity Analysis

### Iterative Version (Most Common)

```python
def linear_search(arr, target):
    for i in range(len(arr)):      # O(1) space for loop variable
        if arr[i] == target:       # O(1) space for comparison
            return i               # O(1) space for return
    return -1                      # O(1) space
```

**Space complexity**: $O(1)$ - only uses a constant amount of extra space.

### With Index Tracking

```python
def linear_search_with_tracking(arr, target):
    found_indices = []             # O(k) space where k = number of matches
    for i in range(len(arr)):      # O(1) space for loop variable  
        if arr[i] == target:       # O(1) space for comparison
            found_indices.append(i) # Space grows with matches
    return found_indices
```

**Space complexity**: $O(k)$ where $k$ is the number of matches (worst case: $O(n)$ if all elements match).

### Recursive Version (Less Common)

```python
def linear_search_recursive(arr, target, index=0):
    if index >= len(arr):          # Base case: not found
        return -1
    if arr[index] == target:       # Base case: found
        return index  
    return linear_search_recursive(arr, target, index + 1)  # Recursive call
```

**Space complexity**: $O(n)$ - each recursive call uses stack space, and we might make $n$ calls in the worst case.

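**Note**: The recursive version gains nothing here: it makes the same comparisons as the loop but adds $O(n)$ stack space, and in CPython it raises `RecursionError` once the search runs deeper than the recursion limit (1000 by default).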

## Part 8: When to Use Linear Search

### Advantages of Linear Search

1. **Simplicity**: Easiest algorithm to understand and implement
2. **No preprocessing**: Works on unsorted arrays  
3. **Memory efficient**: $O(1)$ space complexity
4. **Predictable result**: Naturally returns the first occurrence
5. **Works everywhere**: No special requirements for data structure

### When Linear Search is Optimal

#### Small Arrays (n < 20)
For very small arrays, linear search can be faster than more complex algorithms due to:
- Lower constant factors
- No overhead from complex operations
- Better cache locality

#### Unsorted Data
When data isn't sorted and you need to search infrequently:
- Sorting cost: $O(n \log n)$  
- Linear search cost: $O(n)$
- If you run only a few searches (fewer than roughly $\log n$) between changes to the data, linear search wins because sorting never pays for itself

#### Finding All Occurrences
When you need all instances of a value:
```
Linear search: O(n) - must check every element anyway
Binary search: O(log n + k) where k = number of occurrences
```
For finding all occurrences, linear search is often simpler.

### When Linear Search is Poor

#### Large Datasets
- 1 million elements: up to 1 million comparisons
- Binary search on same data: at most 20 comparisons
- **Performance difference**: 50,000× more comparisons in the worst case!

#### Repeated Searches
If you search the same dataset many times:
- Sort once: $O(n \log n)$
- Each binary search: $O(\log n)$  
- Each linear search: $O(n)$
- **Break-even point**: After $\log n$ searches, binary search wins
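
The break-even claim can be made concrete with a simplified cost model. The sketch below assumes unit constants (sorting costs $n \log_2 n$, one binary search $\log_2 n$, one linear search $n$), which real implementations will not match exactly; `break_even_searches` is a hypothetical helper.

```python
import math


def break_even_searches(n):
    """Smallest k where sorting once plus k binary searches beats k linear searches.

    Cost model (an assumption): n*log2(n) + k*log2(n) versus k*n.
    """
    log_n = math.log2(n)
    k = 1
    while n * log_n + k * log_n >= k * n:
        k += 1
    return k


for n in (1024, 10 ** 6):
    print(n, break_even_searches(n), round(math.log2(n), 2))
```

Under this model the break-even count lands just above $\log_2 n$, in line with the rule of thumb above.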

## Part 9: Comparison with Other Search Algorithms

### Linear Search vs Binary Search

| Aspect           | Linear Search | Binary Search |
|------------------|---------------|---------------|
| **Prerequisite** | None          | Sorted array  |
| **Time (Best)**  | $O(1)$        | $O(1)$        |
| **Time (Average)**| $O(n)$       | $O(\log n)$   |
| **Time (Worst)** | $O(n)$        | $O(\log n)$   |
| **Space**        | $O(1)$        | $O(1)$ iterative |
| **Implementation** | Very simple  | Moderate      |
| **Data structure** | Any array    | Sorted array  |

### Performance Comparison Table

| Array Size | Linear (Worst) | Binary (Worst) | Speedup |
|------------|----------------|----------------|---------|
| 10         | 10             | 4              | 2.5×    |
| 100        | 100            | 7              | 14×     |
| 1,000      | 1,000          | 10             | 100×    |
| 10,000     | 10,000         | 14             | 714×    |
| 100,000    | 100,000        | 17             | 5,882×  |
| 1,000,000  | 1,000,000      | 20             | 50,000× |
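
The table rows can be regenerated rather than taken on trust. This sketch assumes the standard worst-case probe count for binary search, $\lfloor \log_2 n \rfloor + 1$, which for a positive integer equals Python's `int.bit_length()`; `binary_search_worst` is a hypothetical helper name.

```python
def binary_search_worst(n):
    """Worst-case probes for binary search on n >= 1 elements: floor(log2 n) + 1."""
    return n.bit_length()


for n in (10, 100, 1_000, 10_000, 100_000, 1_000_000):
    worst = binary_search_worst(n)
    # Columns: array size, binary search worst case, ratio to linear worst case.
    print(f"{n:>9,} {worst:>3} {n / worst:>10,.1f}x")
```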

## Part 10: Summary and Key Takeaways

### The Big Picture

1. **Time Complexity**: 
   - Best case: $O(1)$ - element is first
   - Average case: $O(n)$ - about $(n+1)/2$ comparisons when the element is present
   - Worst case: $O(n)$ - element is last or missing

2. **Space Complexity**: $O(1)$ - constant extra space

3. **Key Characteristic**: **Linear growth** - doubling input size doubles time

### Why This Matters

#### Understanding Algorithm Trade-offs
Linear search teaches us about:
- **Simplicity vs Efficiency**: Sometimes the simplest solution isn't the most efficient
- **Preprocessing costs**: Sorting enables faster searches but has upfront cost
- **Problem constraints**: Unsorted data limits our algorithmic choices

#### When Simple is Better
Linear search reminds us that:
- Complex isn't always better for small problems
- Implementation simplicity has value
- Understanding worst-case behavior is crucial

### The Mathematical Beauty

The recurrence relation for linear search is trivial: $T(n) = T(n-1) + O(1)$ with solution $T(n) = O(n)$, but this simplicity teaches us:

- How algorithm analysis works on basic examples
- The importance of input characteristics (sorted vs unsorted)  
- Why we need more sophisticated approaches for large-scale problems

**Remember**: Linear search is the foundation - understanding its limitations motivates learning more efficient algorithms like binary search, hash tables, and tree-based searches!

---

# Binary Search Complexity

## Part 1: Understanding the Algorithm First

### What is Binary Search?

Binary search is like playing a number guessing game where someone thinks of a number between 1 and 100, and you have to guess it. The smart strategy is:

1. Guess 50 (the middle)
2. If they say "higher," guess 75 (middle of 51-100)  
3. If they say "lower," guess 25 (middle of 1-49)
4. Keep splitting the remaining range in half

**Key insight**: Each guess eliminates half of the remaining possibilities!

### Visual Example: Finding 7 in [1,2,3,4,5,6,7,8,9,10,11,12]

```
Step 0: [1,2,3,4,5,6,7,8,9,10,11,12]  (size = 12)
         ↑         ↑             ↑
        low      mid=6          high
        Compare 7 vs 6: 7 > 6, so search right half

Step 1: [7,8,9,10,11,12]  (size = 6)
         ↑   ↑       ↑
        low mid=9    high
        Compare 7 vs 9: 7 < 9, so search left half

Step 2: [7,8]  (size = 2)
         ↑ ↑
   low=mid high
        Compare 7 vs 7: Found it! ✓
```

**Pattern**: 12 → 6 → 2 (found after 3 comparisons)
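
The same walk-through can be reproduced in code. This is a minimal iterative sketch; `binary_search_count` is a hypothetical name, and it counts how many middle elements are probed.

```python
def binary_search_count(arr, target):
    """Return (index or -1, number of elements probed) for a sorted list."""
    low, high = 0, len(arr) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if arr[mid] == target:
            return mid, probes
        if arr[mid] < target:
            low = mid + 1    # discard the left half
        else:
            high = mid - 1   # discard the right half
    return -1, probes


data = list(range(1, 13))             # [1, 2, ..., 12]
print(binary_search_count(data, 7))   # (6, 3): probes the values 6, 9, 7
print(binary_search_count(data, 13))  # (-1, 4): not found
```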

## Part 2: Counting Steps - Building the Pattern

### Let's Try Different Array Sizes

#### Size 4: [1,2,3,4]
```
Step 0: [1,2,3,4] → compare with middle element (2 or 3)
Step 1: [3,4] or [1,2] → compare with remaining middle  
Step 2: [4] or [1] → found (or not found)
```
**Maximum steps**: 3

#### Size 8: [1,2,3,4,5,6,7,8]
```
Step 0: [1,2,3,4,5,6,7,8] (size 8)
Step 1: [5,6,7,8] or [1,2,3,4] (size 4)  
Step 2: [7,8] or [1,2] or [5,6] or [3,4] (size 2)
Step 3: [8] or [1] or [6] or [3] or similar (size 1)
```
**Maximum steps**: 4

#### Size 16: Maximum 5 steps
#### Size 32: Maximum 6 steps

### The Pattern Emerges

| Array Size | Max Steps | Size as a Power of 2 |
|------------|-----------|----------------------|
| 1          | 1         | $2^0 = 1$            |
| 2          | 2         | $2^1 = 2$            |
| 4          | 3         | $2^2 = 4$            |
| 8          | 4         | $2^3 = 8$            |
| 16         | 5         | $2^4 = 16$           |
| 32         | 6         | $2^5 = 32$           |

**Aha moment**: If the array size is $2^k$, then we need at most $k+1$ steps!
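
One way to check this pattern is to run binary search for every possible target and record the largest step count. The sketch below assumes two hypothetical helpers, `comparisons` and `max_steps`, and uses an array of even numbers so that odd targets exercise the "not found" cases too.

```python
def comparisons(arr, target):
    """Return how many middle elements binary search inspects for `target`."""
    left, right, steps = 0, len(arr) - 1, 0
    while left <= right:
        mid = (left + right) // 2
        steps += 1
        if arr[mid] == target:
            break
        if arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return steps


def max_steps(n):
    """Worst case over every present value and every gap between values."""
    arr = list(range(0, 2 * n, 2))   # n even numbers: 0, 2, 4, ...
    return max(comparisons(arr, t) for t in range(-1, 2 * n))


for k in range(6):
    n = 2 ** k
    print(f"n = {n:2d}  max steps = {max_steps(n)}  (k + 1 = {k + 1})")
```

For each size $2^k$ in the table, the measured maximum equals $k+1$.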

## Part 3: The Mathematical Connection

### Why Does This Pattern Work?

At each step, we cut the problem size in half:

$$\text{Original size} = n$$

$$\text{After 1 step} = \frac{n}{2}$$

$$\text{After 2 steps} = \frac{n}{4} = \frac{n}{2^2}$$
$$\text{After 3 steps} = \frac{n}{8} = \frac{n}{2^3}$$
$$\text{After k steps} = \frac{n}{2^k}$$

### When Do We Stop?

We stop when we have 1 element left (or 0 if not found):

$$\frac{n}{2^k} = 1$$

Solving for $k$:
$$n = 2^k$$
$$k = \log_2(n)$$

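### What About Non-Powers of 2?

The halving count $k = \log_2(n)$ tells us how many times the array can be cut in half before one element is left, and that last element still costs one comparison. For any $n \geq 1$, including sizes that aren't perfect powers of 2, we round down with the floor function and add 1:

**Maximum steps** = $\lfloor \log_2(n) \rfloor + 1$

For $n = 2^k$ this gives $k + 1$, matching the table above.

#### Examples:
- $n = 10$: $\log_2(10) = 3.32...$, so $\lfloor 3.32 \rfloor + 1 = 4$ steps
- $n = 100$: $\log_2(100) = 6.64...$, so $\lfloor 6.64 \rfloor + 1 = 7$ steps
- $n = 1000$: $\log_2(1000) = 9.97...$, so $\lfloor 9.97 \rfloor + 1 = 10$ steps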
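
These step counts can also be checked without logarithms by counting halvings directly. The sketch below assumes the standard midpoint rule, under which the larger remaining half of a range of size $n$ has $n // 2$ elements; `max_steps` is a name chosen here for illustration.

```python
def max_steps(n):
    """Worst-case comparisons for binary search on n elements."""
    steps = 0
    while n > 0:
        steps += 1   # one comparison with the middle element
        n //= 2      # worst case: continue in the larger half
    return steps


for n in (8, 10, 100, 1000):
    print(n, max_steps(n))
# 8 4
# 10 4
# 100 7
# 1000 10
```

The result is the number of binary digits of $n$, which Python also exposes as `n.bit_length()`. For an exact power of 2 such as $n = 8 = 2^3$, it gives $3 + 1 = 4$ steps, in line with the $k+1$ pattern from the table.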

## Part 4: Step-by-Step Complexity Analysis

### Setting Up the Recurrence

Let $T(n)$ = time to search in an array of size $n$

**What happens in one step of binary search?**
1. Compare target with middle element: $O(1)$ time
2. Choose left or right half: $O(1)$ time  
3. Search the chosen half of size $\frac{n}{2}$: $T(\frac{n}{2})$ time

**Recurrence relation:**
$$T(n) = T\left(\frac{n}{2}\right) + O(1)$$

**Base case:** 
$$T(1) = O(1) \text{ (found it or determined it's not there)}$$

### Solving the Recurrence Step-by-Step

Let's expand $T(n)$ by substitution:

$$\begin{align}
T(n) &= T\left(\frac{n}{2}\right) + c \\
&= T\left(\frac{n}{4}\right) + c + c \\  
&= T\left(\frac{n}{8}\right) + c + c + c \\
&= T\left(\frac{n}{2^k}\right) + k \cdot c
\end{align}$$

When do we reach the base case? When $\frac{n}{2^k} = 1$, so $k = \log_2(n)$.

Substituting back:
$$T(n) = T(1) + \log_2(n) \cdot c = O(1) + O(\log n) = O(\log n)$$
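
As a sanity check on this solution, the recurrence can be evaluated numerically. The sketch below makes two assumptions that the derivation leaves implicit: $n/2$ is rounded down for odd sizes, and the base case is $T(1) = c$ with $c = 1$ by default.

```python
import math


def T(n, c=1):
    """Evaluate T(n) = T(n // 2) + c with base case T(1) = c."""
    if n <= 1:
        return c
    return T(n // 2, c) + c


for n in (1, 2, 16, 1024, 1_000_000):
    closed_form = math.floor(math.log2(n)) + 1   # c * (log2(n) + 1) with c = 1
    print(n, T(n), closed_form)
# 1 1 1
# 2 2 2
# 16 5 5
# 1024 11 11
# 1000000 20 20
```

Growing $n$ from 16 to 1,000,000 only raises $T(n)$ from 5 to 20, which is the $O(\log n)$ behavior the expansion predicts.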

## Part 5: Concrete Walkthrough Examples

### Example 1: Searching for 25 in array of size 15

Array: `[1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]`

```
Step 1: Array[0..14], middle = 7, value = 15
        25 > 15, search right: Array[8..14]
        
Step 2: Array[8..14], middle = 11, value = 23
        25 > 23, search right: Array[12..14]
        
Step 3: Array[12..14], middle = 13, value = 27
        25 < 27, search left: Array[12..12]
        
Step 4: Array[12..12], middle = 12, value = 25
        Found it! ✓
```

**Steps taken**: 4  
**Theory predicts**: at most $\lfloor \log_2(15) \rfloor + 1 = 3 + 1 = 4$ ✓

### Example 2: Why Linear Search is Much Worse

Same array, but with linear search for 25:
```
Check position 0: 1 ≠ 25
Check position 1: 3 ≠ 25
Check position 2: 5 ≠ 25
...
Check position 12: 25 = 25 ✓
```

**Linear search**: 13 steps  
**Binary search**: 4 steps  
**Improvement**: about 3× fewer steps
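
To compare the two searches on every element of this array rather than a single target, a short sketch can count how many elements each one inspects. `linear_steps` and `binary_steps` are hypothetical helper names introduced here.

```python
def linear_steps(arr, target):
    """Elements inspected by a left-to-right scan until target is found."""
    for steps, value in enumerate(arr, start=1):
        if value == target:
            return steps
    return len(arr)            # target absent: every element was checked


def binary_steps(arr, target):
    """Elements inspected by binary search on a sorted array."""
    left, right, steps = 0, len(arr) - 1, 0
    while left <= right:
        mid = (left + right) // 2
        steps += 1
        if arr[mid] == target:
            break
        if arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return steps


arr = list(range(1, 30, 2))    # [1, 3, 5, ..., 29], 15 elements
print("target  linear  binary")
for target in arr:
    print(f"{target:6d}  {linear_steps(arr, target):6d}  {binary_steps(arr, target):6d}")
```

Across all 15 targets, binary search inspects at most 4 elements, while linear search inspects up to 15.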

## Part 6: The Power of Logarithmic Growth

### Why Logarithms Grow So Slowly

Here's the stunning reality of logarithmic growth:

| Array Size | Linear Search (worst) | Binary Search (worst) | Ratio |
|------------|----------------------|-----------------------|--------|
| 10         | 10                   | 4                     | 2.5×   |
| 100        | 100                  | 7                     | 14×    |
| 1,000      | 1,000                | 10                    | 100×   |
| 10,000     | 10,000               | 14                    | 714×   |
| 100,000    | 100,000              | 17                    | 5,882× |
| 1,000,000  | 1,000,000            | 20                    | 50,000×|
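
The table can be regenerated with a few lines of Python. This sketch assumes the worst case for binary search on $n \geq 1$ elements equals the number of binary digits of $n$, which the built-in `int.bit_length()` returns; `worst_case_steps` is a name chosen for illustration.

```python
def worst_case_steps(n):
    """Worst-case binary search comparisons for n >= 1 elements."""
    return n.bit_length()


print(f"{'Array Size':>10} | {'Linear':>9} | {'Binary':>6} | Ratio")
for exponent in range(1, 7):
    n = 10 ** exponent
    binary = worst_case_steps(n)
    # Linear search inspects all n elements in the worst case.
    print(f"{n:>10,} | {n:>9,} | {binary:>6} | {n / binary:,.1f}×")
```

Each tenfold increase in array size adds only 3 or 4 comparisons for binary search, while the linear count grows tenfold.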

### The "Doubling" Property

**Key insight**: When you double the array size, binary search needs only **one more step**.

**Proof**:
- Array of size $n$: $\log_2(n)$ steps
- Array of size $2n$: $\log_2(2n) = \log_2(2) + \log_2(n) = 1 + \log_2(n)$ steps

This is why binary search scales incredibly well!
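
The doubling property is easy to observe empirically. The sketch below searches for a value larger than every element, which forces the longest path through the array. It uses a `range` object as a stand-in for a sorted array to avoid allocating large lists; that choice, and the name `steps_for_missing_max`, are assumptions of this example.

```python
def steps_for_missing_max(n):
    """Count comparisons when searching 0..n-1 for a value that is too large."""
    arr = range(n)             # behaves like the sorted array [0, 1, ..., n-1]
    target = n                 # absent, so the search always moves right
    left, right, steps = 0, n - 1, 0
    while left <= right:
        mid = (left + right) // 2
        steps += 1
        if arr[mid] == target:
            break
        if arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return steps


n = 1000
for _ in range(5):
    print(n, steps_for_missing_max(n))
    n *= 2
# 1000 10
# 2000 11
# 4000 12
# 8000 13
# 16000 14
```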

## Part 7: Space Complexity Analysis

### Iterative Version (Most Common)
```python
def binary_search_iterative(arr, target):
    left, right = 0, len(arr) - 1  # O(1) space
    
    while left <= right:           # O(1) space per iteration
        mid = (left + right) // 2  # O(1) space
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1
```

**Space complexity**: $O(1)$ - only uses a constant amount of extra variables.

### Recursive Version
```python  
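def binary_search_recursive(arr, target, left=0, right=None):
    """Return the index of target in sorted arr[left..right], or -1 if absent."""
    if right is None:
        right = len(arr) - 1       # first call: search the whole array

    if left > right:               # empty range: target is not present
        return -1

    mid = (left + right) // 2
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        # target can only be in the right half; this call adds one stack frame
        return binary_search_recursive(arr, target, mid+1, right)
    else:
        # target can only be in the left half
        return binary_search_recursive(arr, target, left, mid-1)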
```

**Space complexity**: $O(\log n)$ - each recursive call adds a stack frame, and the recursion is at most $\lfloor \log_2 n \rfloor + 2$ calls deep (including the final call on an empty range).
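
To see how deep the recursion actually goes, the recursive search can be instrumented with a depth counter. This is a sketch: `search_depth` and its extra `depth` parameter are additions for illustration, not part of the version above.

```python
def search_depth(arr, target, left=0, right=None, depth=1):
    """Recursive binary search returning (index, depth of the deepest call)."""
    if right is None:
        right = len(arr) - 1
    if left > right:
        return -1, depth       # empty range reached: target is absent
    mid = (left + right) // 2
    if arr[mid] == target:
        return mid, depth
    if arr[mid] < target:
        return search_depth(arr, target, mid + 1, right, depth + 1)
    return search_depth(arr, target, left, mid - 1, depth + 1)


n = 1_000_000
# range(n) stands in for a sorted array; searching for n is a worst case.
print(search_depth(range(n), n))        # (-1, 21)
print(search_depth(range(n), 499_999))  # (499999, 1)
```

Even for a million elements the call stack stays about 20 frames deep, far below Python's default recursion limit.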

## Part 8: Summary and Key Takeaways

### The Big Picture
1. **Time Complexity**: $O(\log n)$ - logarithmic time
2. **Space Complexity**: $O(1)$ iterative, $O(\log n)$ recursive
3. **Requirement**: Array must be sorted
4. **Strategy**: Eliminate half the possibilities with each comparison

### Why This Matters
- **Efficiency**: Searching 1 billion elements takes only ~30 comparisons
- **Scalability**: Performance barely degrades as data grows
- **Fundamental**: Forms the basis for many advanced algorithms

### The Mathematical Beauty
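The recurrence relation $T(n) = T(n/2) + O(1)$ with solution $T(n) = O(\log n)$ captures a pattern that appears throughout computer science: repeated halving. The same logarithm shows up in the height of balanced trees and in the number of recursion levels in merge sort and other divide-and-conquer algorithms (whose full recurrences also account for the work done at each level).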

**Remember**: Binary search reduces a search over $n$ sorted elements to about $\log_2 n$ comparisons - that's the power of smart algorithm design!