# DP for Coding Interviews: Practice Problems

---

## Problem 1: **Edit Distance**

- Problem:
  - The words `COMPUTER` and `COMMUTER` are very similar, and an **update** of just one letter, `P`->`M`, will change the first word into the second. Similarly, the word `SPORT` can be changed into `SORT` by **deleting** one character, `P`, or equally `SORT` can be changed into `SPORT` by **inserting** `P`.
  - Edit distance between two strings is defined as the minimum number of character operations (update, delete, insert) required to convert one string into another.
- Input:
  - `CAT` & `CAR`. Edit distance is 1: replace `T` with `R`.
  - `SUNDAY` & `SATURDAY`. Edit distance is 3: insert `A`, insert `T`, replace `N` with `R`.
  - From `SUNDAY` to `SATURDAY`: S(insert `A`)(insert `T`)U(replace `N` with `R`)DAY
- Observations:
  1. There are 3 choices: insert, delete, replace.
     - Means there are 3 possible options per lazy manager.
     - Means the recurrence relation should define all 3 scenarios.
     - Means exponential time, on the order of 3^n, since every mismatch branches 3 ways.
     - Means Brute Force recursive solution.
  2. _"How can the first lazy manager reduce the problem?"_
     - Compares the 2 strings one character at a time.
     - If the characters are the SAME, continue with the next character in both strings.
     - If the characters are different, try all 3 operations on the strings & **reduce** the string sizes so the next manager looks at shorter strings.


In [32]:
def edit_distance_recursive(a, b):
    if not a:
        return len(b)
    if not b:
        return len(a)
    if a[0] == b[0]:
        return edit_distance_recursive(a[1:], b[1:])
    edit_a = edit_distance_recursive(a[1:], b)
    edit_both = edit_distance_recursive(a[1:], b[1:])
    edit_b = edit_distance_recursive(a, b[1:])
    _min = min(edit_b, edit_a, edit_both) + 1
    return _min


edit_distance_recursive("SUNDAY", "SATURDAY")

3
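The brute-force recursion above re-solves the same pair of suffixes many times. As a minimal sketch (not part of the original notebook), the same three choices can be memoized on a pair of indices with `functools.lru_cache`; the names `edit_distance_memo` and `solve` are hypothetical.

```python
from functools import lru_cache


def edit_distance_memo(a, b):
    """Top-down edit distance; (i, j) stand for the suffixes a[i:] and b[j:]."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        if i == len(a):
            return len(b) - j  # insert whatever is left of b
        if j == len(b):
            return len(a) - i  # delete whatever is left of a
        if a[i] == b[j]:
            return solve(i + 1, j + 1)  # same character: no operation needed
        return 1 + min(
            solve(i + 1, j),      # delete a[i]
            solve(i, j + 1),      # insert b[j]
            solve(i + 1, j + 1),  # replace a[i] with b[j]
        )

    return solve(0, 0)


print(edit_distance_memo("SUNDAY", "SATURDAY"))  # 3
```

Using indices instead of string slices keeps the cache keys small; there are only `(len(a) + 1) * (len(b) + 1)` distinct `(i, j)` pairs to solve.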

In [20]:
def edit_distance_DP(a, b):
    nRows = len(a) + 1
    nCols = len(b) + 1
    table = [[0] * nCols for _ in range(nRows)]
    for r in range(nRows):
        for c in range(nCols):
            if r == 0:
                table[r][c] = c
            elif c == 0:
                table[r][c] = r
            elif a[r - 1] == b[c - 1]:
                table[r][c] = table[r - 1][c - 1]
            else:
                _min = (
                    min(
                        table[r - 1][c],  # upper
                        table[r][c - 1],  # left
                        table[r - 1][c - 1],  # upper-left
                    )
                    + 1
                )
                table[r][c] = _min
    for row in table:
        print(row)
    return table[-1][-1]


edit_distance_DP("SUNDAY", "SATURDAY")

[0, 1, 2, 3, 4, 5, 6, 7, 8]
[1, 0, 1, 2, 3, 4, 5, 6, 7]
[2, 1, 1, 2, 2, 3, 4, 5, 6]
[3, 2, 2, 2, 3, 3, 4, 5, 6]
[4, 3, 3, 3, 3, 4, 3, 4, 5]
[5, 4, 3, 4, 4, 4, 4, 3, 4]
[6, 5, 4, 4, 5, 5, 5, 4, 3]


3
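Each cell of the table above only reads from its own row and the row directly above it, so the full table is not strictly required when only the final distance is needed. The following is a sketch under that assumption; `edit_distance_two_rows` is a hypothetical name, not something from the original notebook.

```python
def edit_distance_two_rows(a, b):
    """Edit distance that keeps only the previous and current table rows."""
    prev = list(range(len(b) + 1))  # row 0: distance from "" to b[:c] is c
    for r in range(1, len(a) + 1):
        curr = [r] + [0] * len(b)  # column 0: distance from a[:r] to "" is r
        for c in range(1, len(b) + 1):
            if a[r - 1] == b[c - 1]:
                curr[c] = prev[c - 1]  # upper-left, no extra cost
            else:
                curr[c] = 1 + min(
                    prev[c],      # upper
                    curr[c - 1],  # left
                    prev[c - 1],  # upper-left
                )
        prev = curr
    return prev[-1]


print(edit_distance_two_rows("SUNDAY", "SATURDAY"))  # 3
```

The trade-off is that the printed table, and any chance of tracing back which operations were used, is lost.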

---

## Problem 2: **Total Path Count**

- Problem:
  - Given a 2D array, find the total number of paths possible from the top-left cell to the bottom-right cell if we are allowed to move only rightward and downward. For example, if the matrix is of order 2\*2 then only two paths are possible.
- Input:
  - Destination coordinates: `x, y`
- Observations:
  1. There are 2 choices: move-down, move-right.
     - Means there are 2 possible choices per lazy manager in the recursive solution.
     - Means the recurrence relation should define both scenarios.
       - Since the question is asking for a **Total**, rather than a **Max** or **Min**, we should add the return values of both calls.
     - Means exponential time, on the order of 2^(x+y), since each call branches twice and the recursion is up to `x + y` deep.
     - Means Brute Force recursive solution
  2. _"How can the first lazy manager reduce the problem?"_
     - Each decision represents 1 unit of distance travelled, so every call is one step closer to the destination.
  3. Recurrence Relation:
     - `T(x, y) = T(x-1, y) + T(x, y-1)`: 2 changing variables, so a 2D table.
     - Base Cases are defined as
       - `x = 0` or `y = 0` then `return 1`: if one coordinate is already `0`, one of our two choices has been taken away, so only a single straight-line path remains and we can count it directly.
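The recurrence and base case above can be sketched with memoization and cross-checked against a closed form: every path is a sequence of `x + y` moves of which `x` go in one direction, so the count should equal the binomial coefficient C(x+y, x). The names `num_paths_memo` and `num_paths_closed_form` are hypothetical, and this sketch treats `(0, 0)` as having 1 path, which differs from the recursive cell in the notebook.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def num_paths_memo(x, y):
    # Base case from the notes: on an edge only one straight path remains.
    if x == 0 or y == 0:
        return 1
    return num_paths_memo(x - 1, y) + num_paths_memo(x, y - 1)


def num_paths_closed_form(x, y):
    # Choose which x of the x + y moves go in the x direction.
    return comb(x + y, x)


print(num_paths_memo(4, 3), num_paths_closed_form(4, 3))  # 35 35
```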


In [2]:
def num_paths_recursive(x, y):
    if x == 0 and y == 0:
        return 0
    if x == 0 or y == 0:
        return 1
    return num_paths_recursive(x - 1, y) + num_paths_recursive(x, y - 1)


num_paths_recursive(4, 3)

35

In [16]:
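def number_of_paths(matrix, x=0, y=0):
    # Out of bounds, or a blocked cell (0): no path goes through here.
    if y >= len(matrix) or x >= len(matrix[0]) or matrix[y][x] == 0:
        return 0
    # Reached the bottom-right cell: exactly one path ends here.
    if x == len(matrix[0]) - 1 and y == len(matrix) - 1:
        return 1
    # Unlike the open grid, an edge cell is not automatically worth 1,
    # because a blocked cell further along the edge can cut the path off.
    return number_of_paths(matrix, x + 1, y) + number_of_paths(matrix, x, y + 1)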


number_of_paths([[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]])

0

In [23]:
def num_of_paths_DP(x, y):
    nRows = x + 1
    nCols = y + 1
    table = [[0] * nCols for _ in range(nRows)]
    for r in range(nRows):
        for c in range(nCols):
            if r == 0 or c == 0:
                # Top row or left column: only one straight path exists.
                # This also covers table[0][0], so no separate start-cell branch is needed.
                table[r][c] = 1
            else:
                table[r][c] = table[r - 1][c] + table[r][c - 1]
    for row in table:
        print(row)
    return table[-1][-1]


num_of_paths_DP(4, 3)

[1, 1, 1, 1]
[1, 2, 3, 4]
[1, 3, 6, 10]
[1, 4, 10, 20]
[1, 5, 15, 35]


35

In [36]:
def number_of_paths(matrix):
    if not matrix or not matrix[0]:
        return 0
    nRows = len(matrix)
    nCols = len(matrix[0])
    dp = [[0] * nCols for _ in range(nRows)]

    # Base case: starting point
    if matrix[0][0] == 0:
        return 0
    dp[0][0] = 1

    # Fill first row: each cell is reachable only from its left neighbor
    for c in range(1, nCols):
        if matrix[0][c] == 1:
            dp[0][c] = dp[0][c - 1]

    # Fill first column: each cell is reachable only from the cell above
    for r in range(1, nRows):
        if matrix[r][0] == 1:
            dp[r][0] = dp[r - 1][0]

    # Fill rest of the table
    for r in range(1, nRows):
        for c in range(1, nCols):
            if matrix[r][c] == 1:
                dp[r][c] = dp[r - 1][c] + dp[r][c - 1]

    for row in dp:
        print(row)
    return dp[-1][-1]


number_of_paths([[1, 1, 1, 0, 1, 1, 1]])

[1, 1, 1, 0, 0, 0, 0]


0

---

## Problem 3: **String Interleaving**

- Problem:
  - String C is said to be interleaving of string A and B if it contains all the characters of A and B and the relative order of characters of both the strings is preserved in C. For example, if values of A, B and C are as given below:
- Input:
  ```python
  A = 'bbca'
  B = 'bcc'
  C = 'bbcbcac'
  ```
  - String C is the interleaving of strings A and B.
  - Given 3 strings A, B, C, write a function to check if third string is the interleaving of first and second strings.
- Observations:
  1. There are 2 choices: take the next character of C from string A, or from string B.
     - Means there are 2 possible choices per lazy manager in the recursive solution.
     - Means the recurrence relation should define both scenarios.
       - Since the question is asking for a **Determination** rather than a **Max** or **Min**, we should return `True` **or** `False`.
     - Means 2^n Time Complexity
     - Means Brute Force recursive solution
  2. _"How can the first lazy manager reduce the problem?"_
     - Each choice checks which string's first character matches the first character of C, then reduces both C and the matching string by one character.
  3. Recurrence Relation:
     - `T(A, B, C) = T(A-1, B, C-1) or T(A, B-1, C-1)`
     - **Base Cases**:
       - `A.len == 0 && B.len == 0 && C.len == 0`: Means we have a solution, so return `True`
       - `else C.len == 0`: Means A or B still has characters that C cannot account for, so return `False`
       - `else A.len == 0 and B.len == 0`: Means C still has characters that neither A nor B can supply, so return `False`
  4. DP Tabulation:
     - Table Dimensions: 3 changing variables. But the position in C is implied by the other 2 strings: whenever A or B shrinks by one character, C shrinks by one as well. This means we only need **2 dimensions**.
     - Initialization Row/Col: Yes, because one string can be empty while we continue checking the other string.
     - _"What question is each cell answering?"_:
       ```markdown
       |     |     | b   | b   | c   | a   |
       | --- | --- | --- | --- | --- | --- |
       |     | T   |     |     |     |     |
       | b   |     |     |     |     |     |
       | c   |     |     |     |     |     |
       | c   |     |     |     |     |     |
       ```
     - A cell `table[i][j]` is `True` if the first `i` characters of the row string (B in the table above) and the first `j` characters of the column string (A in the table above) can be interleaved to form the first `i+j` characters of C.
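The observation that C's position is implied by the other two strings can be sketched as a memoized recursion over two indices only. This is an illustrative version, not the notebook's own code; `is_interleaving_memo` and `solve` are hypothetical names.

```python
from functools import lru_cache


def is_interleaving_memo(a, b, c):
    """True if c is an interleaving of a and b (relative order preserved)."""
    if len(a) + len(b) != len(c):
        return False

    @lru_cache(maxsize=None)
    def solve(i, j):
        # i characters of a and j characters of b are used,
        # so exactly i + j characters of c are already matched.
        k = i + j
        if k == len(c):
            return True
        take_a = i < len(a) and a[i] == c[k] and solve(i + 1, j)
        take_b = j < len(b) and b[j] == c[k] and solve(i, j + 1)
        return take_a or take_b

    return solve(0, 0)


print(is_interleaving_memo("bbca", "bcc", "bbcbcac"))  # True
```

Because the lengths are checked up front, reaching the end of `c` implies both `a` and `b` are fully consumed, which is why a single base case is enough here.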


In [61]:
def string_interleaving_recursive(a, b, c):
    if not a and not b and not c:
        return True
    if not c:
        return False
    if not a and not b:
        return False
    r1, r2 = False, False
    if len(a) and a[0] == c[0]:
        r1 = string_interleaving_recursive(a[1:], b, c[1:])
    if len(b) and b[0] == c[0]:
        r2 = string_interleaving_recursive(a, b[1:], c[1:])
    return r1 or r2


string_interleaving_recursive("bbca", "bcc", "bbcbcac")

True

In [74]:
def string_interleaving_DP(a, b, c):
    len_a = len(a) + 1
    len_b = len(b) + 1
    if len(c) != len(a) + len(b):
        return False
    table = [[False] * len_b for _ in range(len_a)]
    table[0][0] = True
    # First column: only characters of a are used, and the prefix must match so far
    for row in range(1, len_a):
        table[row][0] = table[row - 1][0] and a[row - 1] == c[row - 1]
    # First row: only characters of b are used, and the prefix must match so far
    for col in range(1, len_b):
        table[0][col] = table[0][col - 1] and b[col - 1] == c[col - 1]
    for row in table:
        print(row)
    print("\n")

    for i in range(1, len_a):
        for j in range(1, len_b):
            k = i + j - 1  # index in c of the character being matched
            from_a = a[i - 1] == c[k] and table[i - 1][j]  # last char came from a
            from_b = b[j - 1] == c[k] and table[i][j - 1]  # last char came from b
            table[i][j] = from_a or from_b
    for row in table:
        print(row)
    return table[-1][-1]


string_interleaving_DP("bcc", "bbca", "bbcbcac")

[True, True, True, True, False]
[True, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]


[True, True, True, True, False]
[True, True, False, True, False]
[False, True, True, True, True]
[False, False, True, False, True]


True

---

## Problem 4: **Subset Sum**

- Problem:
  - Given an array of non-negative integers and a positive number X, determine if there exists a subset of the elements of array with sum equal to X.
- Input:
  ```python
  Array = [3, 2, 7, 1]
  X = 6
  ```
- Output: `True` because `[3, 2, 1]` can be summed up to 6.
- Observations:
  1. There are 2 choices: include the `i`'th element in the sum, or exclude the `i`'th element.
     - Means there are 2 possible choices per lazy manager in the recursive solution.
     - Means the recurrence relation should define both scenarios.
       - Since the question is asking for a **Determination** rather than some _Optimum_ (_Max_ or _Min_), we should return `True` **or** `False`.
     - Means 2^n Time Complexity
     - Means Brute Force recursive solution
  2. _"How can the first lazy manager reduce the problem"_
     - If the manager chooses to include a number `e`, then X is reduced by `e` and the # of available elements to choose from next is reduced by 1.
     - If the manager chooses to exclude a number, then X stays the same, but the # of available elements to choose from next is still reduced by 1 so that the next manager doesn't reconsider an element that has already been decided.
     - _NOTE_ Overlapping subproblems are easy to miss in this problem. Each manager decides on one element and then removes it, so the same decision is never made twice along a single path, and in the small trace below no `(remaining elements, X)` pair repeats. Across different paths, however, the same pair can be reached more than once: with `[3, 2, 1, ...]`, including `3` and excluding `2` and `1` leaves the same state as excluding `3` and including `2` and `1`. Those repeats are what a DP table saves. Does this problem have sub-problems? **YES**, and that's the most important feature of this question. The second most important feature is that this is a **counting problem** in disguise: deciding whether a subset exists implicitly considers every possible subset, thus a **counting problem**. DP is a good way to **count**. So I would summarize all the above as follows.
       > DP is the best solution structure when the problem can be reduced to _sub-problems_ AND we must _count_ ALL solutions.
       > What we do with the _count_ is variable. Sometimes we count to optimize the **minimum**. Sometimes we count to optimize the **maximum**. Sometimes we count to simply **determine** if a solution exists at the expected count total. All scenarios either have an explicit or implicit count involved.
  3. Recurrence Relation:
     - `T(E, X) = T(E-1, X-e) or T(E-1, X)`
     - **Base Cases**:
       - `X == 0`: Means we have a solution; `return True`
       - `E == [] and X > 0`: Means solution is impossible `return False`
  4. Time Complexity:
     - Generally speaking, the recursive solution is upper-bounded by O(2^n), where `n` is the number of elements in `E`.
     - Additionally, `T(E-1, X-e)` shrinks `X` by `e`, which is usually more than 1, so the include branch can reach `X == 0` (or drop below 0) well before the elements run out. In the best case, E contains large numbers that reduce X to 0 quickly, and a solution is found down the left side of the recursion tree. This only saves work if the two branches are combined with a short-circuiting `or`; the `any([...])` form used in the cell below evaluates both branches first, which is why its trace is exhaustive.
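The recurrence `T(E-1, X-e) or T(E-1, X)` can also be memoized. The sketch below is an assumption-laden illustration rather than the notebook's code: it uses an index `i` in place of slicing the list, prunes negative remainders (valid because the elements are non-negative), and the names `subset_sum_memo` and `solve` are hypothetical.

```python
from functools import lru_cache


def subset_sum_memo(s, x):
    """True if some subset of s sums to x; s holds non-negative integers."""

    @lru_cache(maxsize=None)
    def solve(i, remaining):
        if remaining == 0:
            return True  # the chosen elements add up to x
        if i == len(s) or remaining < 0:
            return False  # out of elements, or overshot the target
        # Include s[i], or exclude it; `or` stops as soon as one branch succeeds.
        return solve(i + 1, remaining - s[i]) or solve(i + 1, remaining)

    return solve(0, x)


print(subset_sum_memo([3, 2, 7, 1], 6))  # True
```

There are at most `len(s) * (x + 1)` distinct `(i, remaining)` states with a non-negative remainder, which is the same bound the 2D table works with.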


In [101]:
def subset_sum_recursive(s, x):
    if x == 0:
        return True
    if not s:
        return False
    print("subPr: ", s, " | ", x)
    return any([subset_sum_recursive(s[1:], x - s[0]), subset_sum_recursive(s[1:], x)])


subset_sum_recursive([1, 5, 11, 5], 11)

subPr:  [1, 5, 11, 5]  |  11
subPr:  [5, 11, 5]  |  10
subPr:  [11, 5]  |  5
subPr:  [5]  |  -6
subPr:  [5]  |  5
subPr:  [11, 5]  |  10
subPr:  [5]  |  -1
subPr:  [5]  |  10
subPr:  [5, 11, 5]  |  11
subPr:  [11, 5]  |  6
subPr:  [5]  |  -5
subPr:  [5]  |  6
subPr:  [11, 5]  |  11
subPr:  [5]  |  11


True

5. DP Tabulation:

   1. Table Dimensions: 2 changing variables = **2 dimensions**
      - _"Do we need an initialization Row/Col"_: The answer depends on whether there are answers we need to record when one of the inputs is nullified.
        - If `s == []`, only `x == 0` is achievable, because the empty subset sums to 0; any `x > 0` is `False`.
        - If `x == 0` we return `True`, since the empty subset always sums to 0.
        - Both of these conditions are captured by `table[0][0] = True` together with the `j == 0` column, so the answer is **NO**: we do not need a separate initialization row & column. Row 0 instead represents the first element on its own.
   2. Table Cell Meaning: _"What question is each cell answering?"_

      ```markdown
      | i   | s[i] | j=0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
      | --- | ---- | --- | - | - | - | - | - | - | - | - | - | -- | -- |
      | 0   | 1    | T   | T | F | F | F | F | F | F | F | F | F  | F  |
      | 1   | 5    | T   | T | F | F | F | T | T | F | F | F | F  | F  |
      | 2   | 11   | T   | T | F | F | F | T | T | F | F | F | F  | T  |
      | 3   | 5    | T   | T | F | F | F | T | T | F | F | F | T  | T  |
      ```

      **Observe**: It's important to realize that `j` is every possible sub-value of `x`. So when we talk about `j` we should immediately map `j` to the `x` value for a sub-problem.

      - A cell `table[i][j]` is `True` if the values `s[0]` through `s[i]` contain some subset that sums to `j`.
      - Once a cell becomes `True` it propagates `True` down the rest of the column, indicating that a subset summing to `j` exists regardless of any later `s[i]` values considered.
      - If a cell at `table[i][j]` is `False` it means both of the following hold.
        1. No subset of the earlier values sums to `j`: every cell from `table[i-1][j]` up to `table[0][j]` is `False` (think vertically).
        2. Including `s[i]` doesn't help either: `s[i]` is greater than `j`, or no subset of the earlier values sums to the remainder `j - s[i]` (think horizontally, looking `s[i]` columns to the left in the previous row).

   3. Table Cell Assignment: _"How do we build from previous sub-problem answers?"_
      **Observe**: For cell `table[i][j]`
      1. If the current `s[i]` value is greater than `j` then we pull the previous answer `table[i-1][j]` into the current cell, since this `i`-th value is too large to be part of any subset summing to `j`.
      2. If the current `s[i]` is less than or equal to `j` then we pull the previous answer if `True` **OR** (trickiest part) we check whether some subset of the previous values can be combined with the current `s[i]` value to equal the sum `j`. The intuition for this is hard to wrap the head around, but the following observations may help.
         - The previous row contains the results for **all** subproblems so far, because a `True` cell keeps propagating down its column. If the previous row has a `False` value at `j`, it's because no subset of the rows so far sums to `j`. The point is, the previous row has any answer we need for the current row.
         - How to find the right column in the previous row? Since we only ask this question when _`s[i]` is less than or equal to `j`_, we're ensuring that `j - s[i]` is non-negative. This difference is the sum the earlier values still have to supply, and it is also the column to look up in the previous row to answer the question: _"Can `s[i]` plus some subset of the previous values sum to `j`?"_ We can express this as `table[i-1][j - s[i]]`.
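Since each row only reads from the previous row, the lookup `table[i-1][j - s[i]]` can be sketched with a single row that is updated in place. This is an illustrative variant, not the notebook's implementation; `subset_sum_1d` and `reachable` are hypothetical names, and the right-to-left loop is what stands in for "the previous row".

```python
def subset_sum_1d(s, x):
    """Subset-sum check using one boolean row; s holds non-negative integers."""
    reachable = [False] * (x + 1)
    reachable[0] = True  # the empty subset sums to 0
    for value in s:
        # Go right to left so reachable[j - value] still reflects the previous
        # row, meaning each element is used at most once.
        for j in range(x, value - 1, -1):
            if reachable[j - value]:
                reachable[j] = True
    return reachable[x]


print(subset_sum_1d([1, 5, 11, 5], 11))  # True
```

Iterating left to right instead would let the same element be counted repeatedly, which answers a different question (unbounded reuse of each value).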


In [99]:
def subset_sum_DP(s, x):
    len_cols = x + 1
    len_rows = len(s)
    table = [[False] * len_cols for row in range(len_rows)]
    for i in range(len_rows):
        for j in range(len_cols):
            if i == 0 and j == 0:
                table[i][j] = True
            elif i == 0:
                table[i][j] = s[i] == j
            elif j == 0:
                table[i][j] = True
            elif s[i] > j:
                # s[i] is too large to be part of a subset summing to j: take prev answer
                table[i][j] = table[i - 1][j]
            else:
                # Exclude s[i] (cell above) OR include it (previous row, s[i] columns back)
                table[i][j] = table[i - 1][j] or table[i - 1][j - s[i]]
    for row in table:
        print(row)
    return table[-1][-1]


subset_sum_DP([1, 5, 11, 5], 11)

[True, True, False, False, False, False, False, False, False, False, False, False]
[True, True, False, False, False, True, True, False, False, False, False, False]
[True, True, False, False, False, True, True, False, False, False, False, True]
[True, True, False, False, False, True, True, False, False, False, True, True]


True

---

## Problem 5: **Longest Common Subsequence LCS**

- Problem:
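  - Given two strings, determine if String 2 is a subsequence of String 1, i.e. whether the characters of String 2 appear in String 1 in the same relative order. This holds exactly when the longest common subsequence (LCS) of the two strings is String 2 itself.
- Input:
  ```python
  s1 = 'abcdefghij'
  s2 = 'cdgi'
  # True: cdgi is a subsequence of s1, so the LCS of s1 and s2 is cdgi
  ```
  ```python
  s1 = 'abcdefghij'
  s2 = 'ecdgi'
  # False: egi and cdgi are both common subsequences, and cdgi is the LCS. ecdgi is NOT a subsequence of s1 since in s1 c comes before e. Letters must keep the same relative order as in s1.
  ```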
- Observations:
  1. There are 2 cases: if the first characters of s1 & s2 are the same, exclude that character from both strings and call again. Otherwise, exclude the first character of s1, and exclude the first character of s2, as 2 separate recursive calls.
     - Means there are at most 2 possible choices per lazy manager in the recursive solution.
     - Means the recurrence relation should define both scenarios.
       - The recursion computes an _Optimum_ (the LCS length, a _Max_ over the two calls); the **Determination** (`True` **or** `False`) is then whether that length equals the length of s2.
     - Means 2^n Time Complexity
     - Means Brute Force recursive solution
  2. _"How can the first lazy manager reduce the problem?"_
     - If the first characters match, the manager counts 1 and hands both strings, each shortened by one character, to the next manager.
     - If they don't match, the manager shortens s1 by one character for one call and s2 by one character for another call, and keeps the larger of the two answers.
     - _NOTE_ The same pair of shortened strings is reached along many different paths, so this problem has _overlapping_ sub-problems.
  3. Recurrence Relation:
     - `T(s1, s2) = 1 + T(s1-1, s2-1)` if the first characters match, else `max(T(s1-1, s2), T(s1, s2-1))`
     - **Base Cases**:
       - `s1 == ''` or `s2 == ''`: Means nothing is left to match; `return 0`
  4. Time Complexity:
     - Generally speaking the recursive solution is exponential, on the order of 2^(n+m), where `n` and `m` are the lengths of s1 and s2: each mismatch branches into 2 calls and each call removes one character.
     - There are only `(n+1) * (m+1)` distinct pairs of suffixes, so a table (or memoization) brings this down to O(n\*m).


In [None]:
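def lcs_recursive(s1, s2):
    # Base case: an empty string has nothing in common with anything
    if not s1 or not s2:
        return 0
    # Matching first characters extend the common subsequence by one
    if s1[0] == s2[0]:
        return 1 + lcs_recursive(s1[1:], s2[1:])
    # Otherwise drop the first character of either string and keep the better result
    return max(lcs_recursive(s1[1:], s2), lcs_recursive(s1, s2[1:]))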
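A bottom-up version in the style of the earlier tables might look like the sketch below. It is not from the original notebook: `lcs_length_DP` and `is_subsequence` are hypothetical names, and the subsequence check assumes the problem statement's framing that s2 is a subsequence of s1 exactly when the LCS length equals `len(s2)`.

```python
def lcs_length_DP(s1, s2):
    """Length of the longest common subsequence of s1 and s2."""
    rows, cols = len(s1) + 1, len(s2) + 1
    # Row 0 and column 0 stay 0: an empty prefix has no common subsequence.
    table = [[0] * cols for _ in range(rows)]
    for r in range(1, rows):
        for c in range(1, cols):
            if s1[r - 1] == s2[c - 1]:
                table[r][c] = table[r - 1][c - 1] + 1  # extend the match
            else:
                table[r][c] = max(table[r - 1][c], table[r][c - 1])
    return table[-1][-1]


def is_subsequence(s1, s2):
    # s2 is a subsequence of s1 exactly when all of s2 is part of the LCS.
    return lcs_length_DP(s1, s2) == len(s2)


print(is_subsequence("abcdefghij", "cdgi"))   # True
print(is_subsequence("abcdefghij", "ecdgi"))  # False
```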

---

## Problem 6: **Longest Substring Half Sums Equal**

- Problem:
  - Find the length of the longest even-length substring of a digit string such that the sum of the digits in its first half equals the sum of the digits in its second half.
- Input:
  ```python
  # input
  string = '9430723'
  # answer: 4, from the substring '4307' (4 + 3 == 0 + 7 == 7)
  ```
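Before looking at a table-based solution, a brute-force baseline can help confirm what is being asked. The sketch below is an assumption, not the notebook's approach: it tries every even-length substring and compares the two half sums using prefix sums. The name `max_substr_length_prefix` is hypothetical.

```python
def max_substr_length_prefix(digits):
    """Longest even-length substring whose two halves have equal digit sums."""
    # prefix[k] holds the sum of the first k digits.
    prefix = [0]
    for ch in digits:
        prefix.append(prefix[-1] + int(ch))

    best = 0
    n = len(digits)
    for i in range(n):
        for length in range(2, n - i + 1, 2):  # even lengths only
            mid = i + length // 2
            end = i + length
            left_sum = prefix[mid] - prefix[i]
            right_sum = prefix[end] - prefix[mid]
            if left_sum == right_sum:
                best = max(best, length)
    return best


print(max_substr_length_prefix("9430723"))  # 4
```

This runs in O(n^2) time with O(n) extra space, which makes it a convenient reference for checking other implementations on small inputs.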


In [8]:
def max_substr_length_DP(substr):
    n = len(substr)
    max_len = 0
    table = [["-"] * n for _ in range(n)]
    for i in range(n):
        table[i][i] = int(substr[i])
    for length in range(2, n + 1):
        for i in range((n - length) + 1):
            j = (length - 1) + i
            mid = length // 2
            # NOTE: think triangularly, NOT quadrilaterally
            left_half_sum = table[i][j - mid]
            right_half_sum = table[j - mid + 1][j]
            table[i][j] = left_half_sum + right_half_sum
            if length % 2 == 0 and left_half_sum == right_half_sum and length > max_len:
                max_len = length
    return max_len


max_substr_length_DP("9430723")

4