# Longest Increasing Subsequence (LIS) - Complete Guide

## 🎯 Problem Statement

**Longest Increasing Subsequence (LIS)**: Given an array/string, find the longest subsequence where elements are in **strictly increasing order**.

**Key Points**:
- **Subsequence**: Not necessarily consecutive (can skip elements)
- **Strictly increasing**: Each element must be greater than the previous one
- **Goal**: Find the maximum possible length
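
To make the definition concrete, here is a minimal brute-force sketch that checks every subsequence directly. The function name `lis_brute_force` is hypothetical (not from the lecture), and the approach is exponential, so it is only meant for checking tiny inputs by hand.

```python
from itertools import combinations

def lis_brute_force(A):
    """Return the LIS length by testing every subsequence (exponential time)."""
    # Try lengths from longest to shortest; combinations() keeps the original order,
    # so every tuple it yields is a subsequence of A.
    for k in range(len(A), 0, -1):
        for sub in combinations(A, k):
            # Strictly increasing: each element is greater than the previous one
            if all(x < y for x, y in zip(sub, sub[1:])):
                return k
    return 0  # Empty input

print(lis_brute_force([3, 1, 4, 1, 5, 9, 2, 6]))  # 4
```

Because of the strict comparison `x < y`, repeated values such as the two `1`s never appear together in a valid subsequence.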

---

## 🌟 Examples to Build Intuition

### 📝 **Example 1: Simple Array**
**Input**: `[3, 1, 4, 1, 5, 9, 2, 6]`

**Possible increasing subsequences**:
- `[3, 4, 5, 9]` → length 4
- `[1, 4, 5, 9]` → length 4  
- `[1, 4, 5, 6]` → length 4
- `[3, 4, 5, 6]` → length 4
- `[1, 2, 6]` → length 3

**LIS**: any of the four length-4 subsequences above, for example `[1, 4, 5, 9]` or `[1, 4, 5, 6]` (length **4**)

### 📝 **Example 2: String (Lexicographic)**
**Input**: `"carbohydrate"`

Characters: `c, a, r, b, o, h, y, d, r, a, t, e`

**LIS**: `"abort"` (a < b < o < r < t) → length **5**

Trace (chosen characters in bold): c-**a**-r-**b**-**o**-h-y-d-**r**-a-**t**-e → `"abort"` ✓

Note that the `r` must be the second one (index 8): the first `r` (index 2) comes before `b` and `o`, so it cannot follow them in a subsequence.
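
A quick way to double-check a claimed answer like `"abort"` is to verify both properties separately: that it is a subsequence of the input, and that it is strictly increasing. The helpers below are a small sketch with hypothetical names, not part of the lecture.

```python
def is_subsequence(candidate, s):
    """True if candidate can be obtained from s by deleting characters."""
    it = iter(s)
    # `ch in it` consumes the iterator up to and including the first match,
    # so each later character must be found further to the right.
    return all(ch in it for ch in candidate)

def is_strictly_increasing(seq):
    """True if every element is greater than the one before it."""
    return all(x < y for x, y in zip(seq, seq[1:]))

word = "carbohydrate"
print(is_subsequence("abort", word), is_strictly_increasing("abort"))  # True True
```

This only confirms that `"abort"` is a valid increasing subsequence; it does not prove that no longer one exists.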

---

## 🤔 Why LIS is Tricky (Different from LCS)

### 🔄 **First Attempt: Simple Suffix DP**

**Naive idea**: Let `dp[i]` = LIS length starting from position i

**Problem**: How do we ensure the "increasing" property?

```python
# This DOESN'T work!
def bad_lis(A):
    n = len(A)
    dp = [1] * n
    
    for i in range(n-2, -1, -1):  # Go backwards
        # What goes here? We need to know what comes next!
        # But we don't know which element to choose next
        pass
```

**The issue**: When computing `dp[i]`, we don't know which future element we'll pick next, so we can't enforce the increasing constraint.

---

## 💡 The Key Insight: Constraint the Subproblem

### 🎯 **Better Subproblem Definition**

Instead of asking *"What's the LIS starting from position i?"*

Ask: *"What's the LIS starting from position i that **includes A[i] as the first element**?"*

**New Definition**: `dp[i]` = length of LIS starting from position i where A[i] **must be included**

### 🧠 **Why This Works**

Now when we compute `dp[i]`, we know:
1. A[i] is definitely in our subsequence
2. The next element must be > A[i]
3. We can check all positions j > i where A[j] > A[i]
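
The three observations above translate almost directly into a recursive function. The following is a minimal top-down sketch, assuming small inputs (recursion depth grows with the array length); the names `lis_top_down` and `best_from` are illustrative only.

```python
from functools import lru_cache

def lis_top_down(A):
    n = len(A)

    @lru_cache(maxsize=None)
    def best_from(i):
        # Length of the longest increasing subsequence whose first element is A[i]
        best = 1  # A[i] on its own
        for j in range(i + 1, n):
            if A[j] > A[i]:  # The next element must be strictly greater
                best = max(best, 1 + best_from(j))
        return best

    # The overall LIS may start at any position
    return max((best_from(i) for i in range(n)), default=0)

print(lis_top_down([3, 1, 4, 1, 5, 9, 2, 6]))  # 4
```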

---

## 🌳 From DFS with Cache to Dynamic Programming

### 🔍 **Visualizing the Recursion Tree**

Let's trace through array `[1, 2, 4, 3]` to see how DFS explores the solution space:

![DFS with Cache](./03-1-dfs.png)

The image shows:
- **Green arrows**: Valid transitions (where next element > current element)
- **Red X's**: Invalid transitions (where next element ≤ current element)
- **Cache results**: LIS[0]=3, LIS[1]=2, LIS[2]=1, LIS[3]=1

### 🧠 **From Recursion to DP Table**

![](./03-2-DP.png)

The calculation shows:
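- `LIS[3] = 1` (base case: nothing comes after index 3)
- `LIS[2] = 1` (A[3] = 3 is not greater than A[2] = 4, so the `1 + LIS[3]` option is not allowed)
- `LIS[1] = max(1, 1+LIS[2], 1+LIS[3]) = max(1, 2, 2) = 2`
- `LIS[0] = max(1, 1+LIS[1], 1+LIS[2], 1+LIS[3]) = max(1, 3, 2, 2) = 3`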

### 📊 **Complete Recursion Tree Analysis**

![](./03-3-DP.png)

This diagram illustrates:
- **Pink boxes**: Overlapping subproblems that get cached
- **Decision tree**: Whether to include or skip each element
- **Final answer**: Maximum LIS length found
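
To see how much work the cache saves, the sketch below counts how many times the body of the recursion is actually evaluated, with and without caching. `count_evaluations` and its `use_cache` flag are hypothetical names introduced for this illustration.

```python
def count_evaluations(A, use_cache):
    """Return (LIS length, number of times a subproblem body was evaluated)."""
    n = len(A)
    cache = {}
    evaluations = 0

    def best_from(i):
        nonlocal evaluations
        if use_cache and i in cache:
            return cache[i]  # Overlapping subproblem: reuse the stored answer
        evaluations += 1
        best = 1
        for j in range(i + 1, n):
            if A[j] > A[i]:
                best = max(best, 1 + best_from(j))
        cache[i] = best
        return best

    answer = max((best_from(i) for i in range(n)), default=0)
    return answer, evaluations

print(count_evaluations([1, 2, 4, 3], use_cache=False))  # (3, 11)
print(count_evaluations([1, 2, 4, 3], use_cache=True))   # (3, 4)
```

With the cache, each of the four subproblems is evaluated exactly once; without it, the same suffix positions are recomputed along every path that reaches them.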

### 🕸️ **LIS Subproblem DAG**

![](./03-4-DP.png)


The DAG shows how subproblems depend on each other, with arrows pointing toward the base cases (empty suffix).
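
One way to make the DAG view concrete is to build the edges explicitly and compute the longest path. This is a simplified sketch: it omits the empty-suffix node mentioned above and only draws an edge `i -> j` when `A[j]` may directly follow `A[i]`. The function names are assumptions for illustration.

```python
def lis_dag_edges(A):
    """Edge i -> j means A[j] may directly follow A[i] in an increasing subsequence."""
    n = len(A)
    return {i: [j for j in range(i + 1, n) if A[j] > A[i]] for i in range(n)}

def longest_path_nodes(edges):
    """Longest path in the DAG, measured in nodes (this equals the LIS length)."""
    longest = {}
    # Every edge goes from a smaller index to a larger one, so processing indices
    # in decreasing order guarantees all successors are already solved.
    for i in sorted(edges, reverse=True):
        longest[i] = 1 + max((longest[j] for j in edges[i]), default=0)
    return max(longest.values(), default=0)

edges = lis_dag_edges([1, 2, 4, 3])
print(edges)                      # {0: [1, 2, 3], 1: [2, 3], 2: [], 3: []}
print(longest_path_nodes(edges))  # 3
```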

## 🛠️ Dynamic Programming Solution

### 📝 **SRT BOT Framework**

**S - Subproblems**: `dp[i]` = LIS length starting from position i, including A[i]

**R - Relation**: 
```
dp[i] = max({1} ∪ {1 + dp[j] | j > i and A[j] > A[i]})
```

**T - Topological Order**: Decreasing i (from right to left)

**B - Base Cases**: None needed explicitly. For `i = n-1` no valid `j` exists, so the `{1}` term gives `dp[n-1] = 1` (A[i] alone, as the last element)

**O - Original Problem**: `max{dp[i] | 0 ≤ i < n}`

**T - Time**: O(n²) - n subproblems, O(n) work each

### 💻 **Code Implementation**

```python
def lis(A):
    n = len(A)
    if n == 0:
        return 0
    
    # dp[i] = LIS length starting from i, including A[i]
    dp = [1] * n  # Initialize all to 1 (single element)
    
    # Fill backwards: from n-1 to 0
    for i in range(n-2, -1, -1):  # i = n-2, n-3, ..., 0
        # Try all positions j after i
        for j in range(i+1, n):
            if A[j] > A[i]:  # Can extend the sequence
                dp[i] = max(dp[i], 1 + dp[j])
    
    # Answer is the maximum among all starting positions
    return max(dp)

# Test with examples
print(lis([3, 1, 4, 1, 5, 9, 2, 6]))  # Expected: 4
print(lis([10, 20, 10, 30, 20, 50]))   # Expected: 4
```

---

## 📊 Step-by-Step Example

**Array**: `[3, 1, 4, 1, 5]`

### 🔄 **Filling Process**

**Initial**: `dp = [1, 1, 1, 1, 1]` (each element forms LIS of length 1)

**Step 1**: `i = 3` (A[3] = 1)
- Check j = 4: A[4] = 5 > A[3] = 1 ✓
- `dp[3] = max(1, 1 + dp[4]) = max(1, 1 + 1) = 2`
- **dp = [1, 1, 1, 2, 1]**

**Step 2**: `i = 2` (A[2] = 4)
- Check j = 3: A[3] = 1 < A[2] = 4 ✗
- Check j = 4: A[4] = 5 > A[2] = 4 ✓
- `dp[2] = max(1, 1 + dp[4]) = max(1, 1 + 1) = 2`
- **dp = [1, 1, 2, 2, 1]**

**Step 3**: `i = 1` (A[1] = 1)
- Check j = 2: A[2] = 4 > A[1] = 1 ✓ → `1 + dp[2] = 1 + 2 = 3`
- Check j = 3: A[3] = 1 = A[1] = 1 ✗ (not strictly greater)
- Check j = 4: A[4] = 5 > A[1] = 1 ✓ → `1 + dp[4] = 1 + 1 = 2`
- `dp[1] = max(1, 3, 2) = 3`
- **dp = [1, 3, 2, 2, 1]**

**Step 4**: `i = 0` (A[0] = 3)
- Check j = 1: A[1] = 1 < A[0] = 3 ✗
- Check j = 2: A[2] = 4 > A[0] = 3 ✓ → `1 + dp[2] = 1 + 2 = 3`
- Check j = 3: A[3] = 1 < A[0] = 3 ✗
- Check j = 4: A[4] = 5 > A[0] = 3 ✓ → `1 + dp[4] = 1 + 1 = 2`
- `dp[0] = max(1, 3, 2) = 3`
- **dp = [3, 3, 2, 2, 1]**

### 🎯 **Final Answer**
`max(dp) = max([3, 3, 2, 2, 1]) = 3`

**LIS**: Either starting from position 0 or 1, both give length 3
- From pos 0: `[3, 4, 5]`
- From pos 1: `[1, 4, 5]`
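
The hand trace above can be reproduced mechanically. The generator below is a small sketch (the name `lis_trace` is hypothetical) that runs the same right-to-left fill and yields a snapshot of `dp` after each value of `i`.

```python
def lis_trace(A):
    """Yield (i, dp snapshot) after each outer-loop step of the right-to-left fill."""
    n = len(A)
    dp = [1] * n
    for i in range(n - 2, -1, -1):
        for j in range(i + 1, n):
            if A[j] > A[i]:
                dp[i] = max(dp[i], 1 + dp[j])
        yield i, dp.copy()  # Copy so later updates do not change earlier snapshots

for i, snapshot in lis_trace([3, 1, 4, 1, 5]):
    print(f"i = {i}: dp = {snapshot}")
# i = 3: dp = [1, 1, 1, 2, 1]
# i = 2: dp = [1, 1, 2, 2, 1]
# i = 1: dp = [1, 3, 2, 2, 1]
# i = 0: dp = [3, 3, 2, 2, 1]
```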

---

## 🎨 Visual Understanding

```
Array:  [3, 1, 4, 1, 5]
Index:   0  1  2  3  4

dp[i] represents: "Best LIS starting from i, including A[i]"

dp[4] = 1  (just [5])
dp[3] = 2  ([1, 5])
dp[2] = 2  ([4, 5])  
dp[1] = 3  ([1, 4, 5])
dp[0] = 3  ([3, 4, 5])

Answer = max(3, 3, 2, 2, 1) = 3
```

---

## 🔍 More Complex Example

**String**: `"carbohydrate"`

```python
def lis_string(s):
    A = list(s)  # Convert to list of characters
    n = len(A)
    if n == 0:
        return 0, []  # Empty input: max() on an empty list would raise
    dp = [1] * n
    
    for i in range(n-2, -1, -1):
        for j in range(i+1, n):
            if A[j] > A[i]:  # Lexicographic comparison
                dp[i] = max(dp[i], 1 + dp[j])
    
    return max(dp), dp

# Test
s = "carbohydrate"
length, dp_table = lis_string(s)
print(f"LIS length: {length}")
print(f"DP table: {dp_table}")
```

### 📊 **Trace for "carbohydrate"**

Characters: `c(0) a(1) r(2) b(3) o(4) h(5) y(6) d(7) r(8) a(9) t(10) e(11)`

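Working backwards (each entry shows one best chain starting at that character):
- `dp[11] = 1` (just 'e')
- `dp[10] = 1` (just 't'; the only later character, 'e', is smaller)
- `dp[9] = 2` (a→t)
- `dp[8] = 2` (r→t)
- `dp[7] = 3` (d→r→t)
- `dp[6] = 1` (just 'y'; nothing later is greater)
- `dp[5] = 3` (h→r→t)
- `dp[4] = 3` (o→r→t)
- `dp[3] = 4` (b→o→r→t)
- `dp[2] = 2` (r→y)
- `dp[1] = 5` (a→b→o→r→t)
- `dp[0] = 4` (c→o→r→t)

**Final LIS**: `max(dp) = dp[1] = 5`, achieved by `"abort"` (other length-5 answers exist, such as `"abdrt"`).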

---

## 🚀 LIS Reconstruction

```python
def lis_with_sequence(A):
    n = len(A)
    if n == 0:
        return 0, []  # Empty input: nothing to reconstruct
    dp = [1] * n
    parent = [-1] * n  # parent[i] = index of the element that follows A[i], or -1 if A[i] is last
    
    # Fill DP table
    for i in range(n-2, -1, -1):
        for j in range(i+1, n):
            if A[j] > A[i] and 1 + dp[j] > dp[i]:
                dp[i] = 1 + dp[j]
                parent[i] = j  # Remember which element follows A[i]
    
    # Find the starting position of LIS
    max_length = max(dp)
    start_pos = dp.index(max_length)
    
    # Reconstruct the sequence
    lis_sequence = []
    pos = start_pos
    while pos != -1:
        lis_sequence.append(A[pos])
        pos = parent[pos]
    
    return max_length, lis_sequence

# Test
A = [3, 1, 4, 1, 5, 9, 2, 6]
length, sequence = lis_with_sequence(A)
print(f"LIS length: {length}")      # LIS length: 4
print(f"LIS sequence: {sequence}")  # LIS sequence: [3, 4, 5, 9]
```

---

## ⚡ Optimization: O(n log n) Solution

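The O(n²) solution can be improved to O(n log n) by scanning left to right while maintaining a sorted `tails` array and binary-searching it for each element. Note that `tails` gives the LIS length only; its contents are not necessarily a valid subsequence of the input.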

```python
import bisect

def lis_optimized(A):
    if not A:
        return 0
    
    # tails[i] = smallest ending element of all increasing subsequences of length i+1
    tails = []
    
    for num in A:
        # Leftmost index with tails[pos] >= num (bisect_left keeps the result strictly increasing)
        pos = bisect.bisect_left(tails, num)
        
        if pos == len(tails):
            tails.append(num)  # Extend the longest subsequence
        else:
            tails[pos] = num   # Replace to keep smallest possible ending
    
    return len(tails)

# Test
print(lis_optimized([3, 1, 4, 1, 5, 9, 2, 6]))  # Output: 4
```
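
To build intuition for how `tails` evolves, the sketch below records a snapshot after each element. The helper name `tails_trace` is an assumption for illustration; the update rule is the same as in `lis_optimized`.

```python
import bisect

def tails_trace(A):
    """Return the contents of `tails` after processing each element of A."""
    tails = []
    history = []
    for num in A:
        pos = bisect.bisect_left(tails, num)
        if pos == len(tails):
            tails.append(num)  # num extends the longest subsequence found so far
        else:
            tails[pos] = num   # num becomes a smaller ending for length pos + 1
        history.append(tails.copy())
    return history

for step in tails_trace([3, 1, 4, 1, 5, 9, 2, 6]):
    print(step)
# [3]
# [1]
# [1, 4]
# [1, 4]
# [1, 4, 5]
# [1, 4, 5, 9]
# [1, 2, 5, 9]
# [1, 2, 5, 6]
```

The final `tails` is `[1, 2, 5, 6]`, which is not a subsequence of the input (the `2` appears after the `5`). Only its length, 4, is meaningful.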

---

## 📊 Complexity Comparison

| **Method** | **Time** | **Space** | **Description** |
|------------|----------|-----------|-----------------|
| DP Solution | O(n²) | O(n) | Standard DP approach |
| Optimized | O(n log n) | O(n) | Using binary search |
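
Since both methods should always agree on the length, a randomized cross-check is a cheap sanity test. The sketch below re-declares both approaches under hypothetical names (`lis_quadratic`, `lis_nlogn`) so it runs on its own; the seed and value ranges are arbitrary choices.

```python
import bisect
import random

def lis_quadratic(A):
    """O(n^2) suffix DP: dp[i] = LIS length starting at i, including A[i]."""
    n = len(A)
    dp = [1] * n
    for i in range(n - 2, -1, -1):
        for j in range(i + 1, n):
            if A[j] > A[i]:
                dp[i] = max(dp[i], 1 + dp[j])
    return max(dp, default=0)

def lis_nlogn(A):
    """O(n log n) approach using a sorted tails array and binary search."""
    tails = []
    for num in A:
        pos = bisect.bisect_left(tails, num)
        if pos == len(tails):
            tails.append(num)
        else:
            tails[pos] = num
    return len(tails)

random.seed(0)  # Fixed seed so the check is repeatable
for _ in range(200):
    # Small value range on purpose, so duplicates exercise the strict comparison
    A = [random.randint(0, 20) for _ in range(random.randint(0, 30))]
    assert lis_quadratic(A) == lis_nlogn(A), A
print("all random checks passed")
```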

---

## 🎯 Key Takeaways

### 🧠 **Why Constraint Was Needed**
- Original subproblem lacked information to maintain "increasing" property
- Adding "must include A[i]" constraint gives us the anchor we need
- This is a common DP pattern: **add constraints to make subproblems well-defined**

### 🔄 **Pattern Recognition**
- **When subsequence problems seem hard**: Try constraining what must be included
- **When order matters**: Consider starting position as a parameter
- **When multiple choices**: Try all valid options and take the best

### 🌟 **Applications**
- Stock price analysis (longest sequence of rising prices, not necessarily on consecutive days)
- Sorting analysis (measuring how "sorted" an array is)
- Bioinformatics (gene sequence analysis)
- Patience sorting algorithm

The key insight is that sometimes we need to **add more structure to our subproblems** to make them solvable, even if it seems like we're making the problem harder!