# Ultimate coding interview Algorithms and Concepts

## What is this?
<font color=green>
    <p>
        This is a notebook with descriptions, use cases, reasons, and, if applicable, implementations for several algorithms or concepts that are most important for passing the coding segment of the software engineer interview.
    </p>
    <p>
        These are techniques that are most common and can be applied to various types of questions.
    </p>
</font>

<p>
    These techniques include:
</p>
<ol>
    <li><b>Depth-first search (DFS)</b></li>
    <li><b>Breadth-first search (BFS)</b></li>
    <li><b>Matching parentheses</b></li>
    <li><b>Hash tables</b></li>
    <li><b>Variables/Pointers manipulation</b></li>
    <li><b>Reversing a linked list</b></li>
    <li><b>Sorting fundamentals</b></li>
    <li><b>Recursion</b></li>
    <li><b>Custom data structures</b></li>
    <li><b>Binary search</b></li>
</ol>

In [4]:
from collections import defaultdict

class Graph:
    def __init__(self):
        self.graph = defaultdict(list)
        
    def addEdge(self, v: int, e: int):
        self.graph[v].append(e)

## 1. Depth-first search (DFS)

DFS is one of the most fundamental tree and graph traversal algorithms. It appears in many technical questions, including ones that do not seem related to graphs at first but can be turned into graph problems.

The essential idea is to traverse a tree or graph down to its leaf nodes, then backtrack and repeat. Sometimes, however, the structure is implicit: for example, a set of strings that we traverse character by character in a depth-first manner.
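
As a rough illustration of DFS over an implicit structure, the sketch below walks a string character by character and backtracks after each choice. The problem (generating every upper/lower-case variant of a string) and the name `letter_case_permutations` are assumptions chosen for illustration; they are not part of the original notebook.

```python
def letter_case_permutations(s: str) -> list:
    '''Return every string obtainable by changing the case of the letters in s.'''
    results = []

    def dfs(i: int, path: list) -> None:
        # Leaf of the implicit tree: every character has been decided.
        if i == len(s):
            results.append(''.join(path))
            return

        ch = s[i]
        # Letters branch two ways; any other character has a single branch.
        choices = [ch.lower(), ch.upper()] if ch.isalpha() else [ch]
        for c in choices:
            path.append(c)    # go one level deeper
            dfs(i + 1, path)
            path.pop()        # backtrack before trying the next choice

    dfs(0, [])
    return results


print(letter_case_permutations('a1b'))
```

No graph is ever built here: the "children" of a node are just the possible choices for the next character, which is why the running time follows the O(b^d) pattern rather than O(v + e).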

<p>
    <p>
        <strong>Time:</strong>
        <p>
            O(v + e) for explicit graphs traversed without repetition <strong>or</strong> O(b^d) for implicit graphs
        </p>
    </p>
    <br>
    <p>
        <strong>Space:</strong>
        <p>
            O(v) if entire graph is traversed without repetition <strong>or</strong> O(bd) for implicit graphs without elimination of duplicate nodes
        </p>
    </p>
    <p>
        where v = # of vertices, e = # of edges, b = branching factor, d = depth
    </p>
</p>

DFS is prone to looping infinitely if the graph contains a cycle. To combat this, we use an extra data structure to mark visited nodes.

In [8]:
import random

def dfs(g: Graph) -> None:
    visited = defaultdict(bool)  # unseen vertices default to False

    # choose an arbitrary starting vertex among those with outgoing edges
    dfs_util(g, visited, random.choice(list(g.graph.keys())))


def dfs_util(g: Graph, visited: defaultdict, v: int) -> None:
    '''
    Recursive version
    '''
    visited[v] = True
    print(v)
    for e in g.graph[v]:  # the adjacency list lives in g.graph
        if not visited[e]:
            dfs_util(g, visited, e)

In [9]:
import random

def dfs(g: Graph, v: int = None) -> None:
    '''
    Iterative version
    '''
    visited = defaultdict(bool)  # unseen vertices default to False
    if v is None:
        v = random.choice(list(g.graph.keys()))  # choose an arbitrary starting vertex

    stack = [v]

    while len(stack) > 0:
        v = stack.pop()
        if visited[v]:
            continue  # a vertex can be pushed more than once before it is visited
        visited[v] = True
        print(v)
        for e in g.graph[v]:
            if not visited[e]:
                stack.append(e)

## 2. Breadth-first search (BFS)

The main difference between BFS and DFS is the order in which we traverse child nodes. In DFS we use a stack: we push all the child nodes of the current node, then pop the most recently added child and repeat.

In BFS, however, we use a queue. We add all the child nodes as before, but instead of taking the most recently added child, we take the child that was added first and append all of its children to the end of the queue.
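
To make the stack-versus-queue difference concrete, here is a minimal sketch that runs the same loop over the same tree and only changes which end of the frontier is taken next. The `tree` dictionary and the `traverse` helper are hypothetical names introduced for this example.

```python
from collections import deque

# A small tree stored as an adjacency dict (illustrative data).
tree = {1: [2, 3], 2: [4, 5], 3: [6], 4: [], 5: [], 6: []}


def traverse(tree: dict, start: int, use_queue: bool) -> list:
    '''Return the visit order; use_queue=True gives BFS, False gives DFS.'''
    frontier = deque([start])
    seen = {start}  # mark nodes when they are added so none is added twice
    order = []
    while frontier:
        # Queue: take the oldest entry. Stack: take the newest entry.
        node = frontier.popleft() if use_queue else frontier.pop()
        order.append(node)
        for child in tree[node]:
            if child not in seen:
                seen.add(child)
                frontier.append(child)
    return order


print('BFS:', traverse(tree, 1, use_queue=True))
print('DFS:', traverse(tree, 1, use_queue=False))
```

BFS visits the tree level by level, while the stack-based run dives down one branch before returning to the others. Note that a stack-based DFS explores the last-added child first, so its order differs from a recursive DFS that visits children left to right.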

<p>
    <p>
        <strong>Time:</strong>
        <p>
            O(v + e) <strong>or</strong> O(b^d)
        </p>
    </p>
    <br>
    <p>
        <strong>Space:</strong>
        <p>
            O(v) for explicit graphs <strong>or</strong> O(b^d) for implicit graphs
        </p>
    </p>
    <p>
        where v = # of vertices, e = # of edges, b = branching factor, d = depth
    </p>
</p>

BFS is also prone to looping infinitely if the graph contains a cycle. To combat this, we use an extra data structure to mark visited nodes.

In [11]:
import queue
import random
from collections import defaultdict

def bfs(g: Graph, v: int = None) -> None:
    '''
    A queue and a stack are essentially opposites in use, so there is no
    practical recursive implementation of BFS: the call stack would have to
    behave like a queue, which should never be done.
    '''
    visited = defaultdict(bool)  # unseen vertices default to False
    if v is None:
        v = random.choice(list(g.graph.keys()))  # choose an arbitrary starting vertex

    q = queue.Queue()
    q.put(v)
    visited[v] = True  # mark on enqueue so no vertex is queued twice

    while not q.empty():
        v = q.get()
        print(v)
        for e in g.graph[v]:
            if not visited[e]:
                visited[e] = True
                q.put(e)

## 3. Matching Parentheses

<p>
    Given a string of brackets such as '()[]{}', we want to find out either whether the string is valid, i.e. every bracket is correctly closed, or which bracket would be the next valid one to add.
</p>

<p>
    Typically, we use a stack to solve this problem. We push opening parentheses onto the stack, and when we encounter a closing one, we try to match it with the most recently added opening parenthesis. More convoluted answers can use recursion, but a stack is the easiest to understand.
</p>
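
The second variant mentioned above, finding the next valid bracket to add, can be handled with the same stack. The sketch below is one possible reading of that variant: it returns the closing bracket that would match the most recent unclosed opener. The function name `next_closing_bracket` and the choice to raise `ValueError` on an invalid prefix are assumptions made for this example.

```python
from typing import Optional

PAIRS = {'(': ')', '[': ']', '{': '}'}


def next_closing_bracket(parens: str) -> Optional[str]:
    '''
    Return the closing bracket that must come next to close the innermost
    open bracket, or None if nothing is left open.
    '''
    stack = []
    for ch in parens:
        if ch in PAIRS:
            stack.append(ch)
        elif not stack or PAIRS[stack.pop()] != ch:
            # A closer with no opener, or one that does not match the last opener.
            raise ValueError('not a valid bracket prefix')

    return PAIRS[stack[-1]] if stack else None


print(next_closing_bracket('([{}'))
```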

In [12]:
def matching_parens(parens: str) -> bool:
    paren_dict = {
        '(': ')',
        '[': ']',
        '{': '}',
    }
    
    stack = []
    for paren in parens:
        if paren in paren_dict.keys():
            stack.append(paren)
        else:
            if len(stack) == 0:
                return False
            
            last_open = stack[-1]
            stack.pop()
            if paren != paren_dict[last_open]:
                return False
            
    if len(stack) != 0:
        return False
    
    return True

In [16]:
matching_parens('')

True

## 4. Making use of Hash Tables

Another fundamental skill is to be able to recognize situations in which a hash table can be utilized. For example, suppose we want to traverse a 2-D matrix. We need to somehow keep track of already visited locations in the matrix. A hash table can be used to maintain this information. 
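
A minimal sketch of the matrix case, assuming a grid of 0s and 1s and a starting cell that contains a 1: a hash set of `(row, col)` tuples records which cells have already been visited. The function name `region_size` and the sample grid are illustrative only.

```python
def region_size(grid: list, start: tuple) -> int:
    '''Count the cells containing 1 that are connected to start (4 directions).'''
    rows, cols = len(grid), len(grid[0])
    visited = {start}  # hash set of visited (row, col) locations
    stack = [start]

    while stack:
        r, c = stack.pop()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and grid[nr][nc] == 1 and (nr, nc) not in visited:
                visited.add((nr, nc))  # O(1) average-time membership and insert
                stack.append((nr, nc))

    return len(visited)


grid = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]
print(region_size(grid, (0, 0)))
```

Without the `visited` set, neighbouring cells would keep re-adding each other and the loop would never finish.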

Further, we can also use hash tables to cache values. For example, when computing the nth Fibonacci number, fib(10) = fib(9) + fib(8), so a hash table can cache the previously computed Fibonacci numbers.
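
The caching idea can also be applied top-down. The following is a small sketch of a recursive Fibonacci that stores each result in a dictionary the first time it is computed; the name `fib_memo` and the optional `cache` parameter are assumptions for illustration.

```python
def fib_memo(n: int, cache: dict = None) -> int:
    '''Top-down Fibonacci for n >= 0; each value is computed at most once.'''
    if cache is None:
        cache = {0: 0, 1: 1}  # base cases seed the cache

    if n not in cache:
        # Compute once, then reuse on every later request for the same n.
        cache[n] = fib_memo(n - 1, cache) + fib_memo(n - 2, cache)

    return cache[n]


print(fib_memo(10))
```

Without the cache, the naive recursion recomputes the same subproblems exponentially many times; with it, each value from 0 to n is computed once.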

In [2]:
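def two_sum(nums: list, k: int) -> tuple:
    '''
    In the two sum problem, the goal is to find two numbers that sum up to the 
    target value: k
    '''
    seen = set()  # numbers encountered so far
    for num in nums:
        # a complement seen earlier is a different element, so a number
        # is never paired with itself
        if k - num in seen:
            return (k - num, num)
        seen.add(num)

    return None  # no pair sums to k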

res = two_sum([5,1,3], 8)
print(res)

(5, 3)


In [4]:
def fib(num: int) -> int:
    '''Calculate the fibonacci number'''
    fib_map = dict()
    fib_map[0] = 0
    fib_map[1] = 1
    
    for i in range(2, num+1):
        fib_map[i] = fib_map[i-1] + fib_map[i-2]
        
    return fib_map[num]

res = fib(10)
print(res)

55


## 5. Variables/Pointers manipulation

Manipulating multiple variables and/or pointers at once is not an algorithm in itself, but it is a concept that algorithms need very often.

It is useful for traversing a string from several positions at once, analyzing linked lists, etc.

A good example is finding the longest palindromic substring.

In [20]:
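def longest_palindrome(S: str) -> str:
    '''Find the longest palindromic substring'''
    longest = ''
    n = len(S)
    for i in range(n):
        sub1 = expand(S, i, i)      # odd-length palindrome centered at i
        sub2 = expand(S, i, i + 1)  # even-length palindrome centered between i and i+1
        # compare by length; plain max() would compare strings alphabetically
        longest = max(longest, sub1, sub2, key=len)

    return longest

def expand(S: str, left: int, right: int) -> str:
    '''Grow outward from the center while the end characters match'''
    palindrome = ''
    while left > -1 and right < len(S):
        if S[left] != S[right]:
            return palindrome

        palindrome = S[left:right+1]
        left -= 1
        right += 1

    return palindrome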

longest_palindrome('abaabba')

'baab'
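
Pointer manipulation is just as common on linked lists. As a small sketch, two pointers moving at different speeds can locate the middle of a list in a single pass. The `Node` class and `middle_node` function are hypothetical names defined here so the example is self-contained.

```python
class Node:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next


def middle_node(head: Node) -> Node:
    '''Return the middle node (the second of the two middles for even lengths).'''
    slow = fast = head
    # fast moves two steps for every step of slow, so when fast reaches
    # the end, slow is halfway through the list.
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
    return slow


head = Node(1, Node(2, Node(3, Node(4, Node(5)))))
print(middle_node(head).val)
```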

## 6. Reversing a linked list

Reversing a linked list is a somewhat contrived and tricky problem, but it has applications in many interview questions. Problems revolving around linked lists rarely arise in real-world work, but they are very popular in interviews. Typical tasks include deleting a node, finding duplicates, deleting duplicates, and, of course, reversing the list.

To reverse a linked list, you need three pointers, which can be stressful to juggle on the fly, and even writing the linked list class can cause issues, so let's practice!

In [23]:
class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next


In [54]:
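def reverse_ll(root: ListNode) -> ListNode:
    # an empty list or a single node is already reversed
    if root is None or root.next is None:
        return root

    prev = None
    cur = root
    nxt = None

    while cur is not None:
        nxt = cur.next   # remember the rest of the list
        cur.next = prev  # point the current node backwards
        prev = cur       # advance prev
        cur = nxt        # advance cur

    return prev  # prev is the new head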

def print_ll(node: ListNode):
    while node:
        print(node.val, end=' ')
        node = node.next
    print()

if __name__ == '__main__':
    root = ListNode(1)
    root.next = ListNode(5)
    root.next.next = ListNode(7)
    print_ll(root)
    root = reverse_ll(root)
    print_ll(root)

1 5 7 
7 5 1 
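
For comparison, the same reversal can be sketched recursively: reverse everything after the current node, then hook the current node onto the end. The `Node` class and the names `reverse_recursive` and `to_list` are defined here only to keep the example self-contained; they mirror, but are not, the notebook's own `ListNode` code.

```python
class Node:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next


def reverse_recursive(node: Node) -> Node:
    '''Reverse the list starting at node and return the new head.'''
    # Base case: an empty list or a single node is already reversed.
    if node is None or node.next is None:
        return node

    new_head = reverse_recursive(node.next)  # reverse the rest of the list
    node.next.next = node                    # the old successor now points back to us
    node.next = None                         # we become the new tail
    return new_head


def to_list(node: Node) -> list:
    '''Collect the values of a linked list into a Python list.'''
    values = []
    while node:
        values.append(node.val)
        node = node.next
    return values


print(to_list(reverse_recursive(Node(1, Node(5, Node(7))))))
```

The iterative version uses O(1) extra space, whereas this one uses O(n) call-stack space, which is worth mentioning if an interviewer asks about trade-offs.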


## 7. Sorting fundamentals

Understanding the fundamentals of sorting is important. You do not have to memorize exactly how mergesort, quicksort, and heapsort work, but you should understand the general concepts behind them and why sorts like quicksort and mergesort run so much faster than bubble sort.

You are unlikely to be asked to implement an actual sort, but you should know the runtime of a sort, how algorithms behave on sorted vs. unsorted inputs, and how sorting affects the overall computational complexity of an algorithm.

For example, if we sort an array in O(n log n) time inside an algorithm that runs in O(n^2) overall, the sort is fine: it does not change the overall complexity. If the rest of the algorithm would otherwise run in O(n) time, however, the sort becomes the dominant cost and the whole algorithm becomes O(n log n).
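
As a small illustration of how a sort can dominate the cost, the sketch below checks a list for duplicates in two ways. Both function names are hypothetical. The first sorts and then scans adjacent elements, so the O(n log n) sort dominates the O(n) scan; the second avoids sorting by using a set and runs in O(n) time on average, at the price of O(n) extra space.

```python
def has_duplicate_sorted(nums: list) -> bool:
    '''O(n log n): after sorting, equal values sit next to each other.'''
    ordered = sorted(nums)
    return any(a == b for a, b in zip(ordered, ordered[1:]))


def has_duplicate_hashed(nums: list) -> bool:
    '''O(n) on average: remember every value seen so far in a set.'''
    seen = set()
    for num in nums:
        if num in seen:
            return True
        seen.add(num)
    return False


print(has_duplicate_sorted([3, 1, 4, 1, 5]), has_duplicate_hashed([3, 1, 4, 1, 5]))
```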

<p>
    <strong>If you want to brush up on popular sorting algorithms, take a look at the "All sorts of Sorts" notebook!</strong>
</p>

## 8. Recursion

Recursion is rarely seen in production code or in an app, but it is still used and tested a lot in interviews, because it is a good indicator of your problem-solving skills and your understanding of fundamental coding.

It is not that practical, but it is tricky. There will probably be a couple of problems with recursive solutions in the interview, so being very familiar with recursion is a must.

Recursive solutions may seem obvious when we are learning about recursion and reading them, but in a whiteboard setting where we are asked to implement one ourselves, the complexity of recursion may creep up on us and trip us up.

The key to preparation is a lot of practice writing recursion.

<p>
    What to remember when implementing a recursive solution:
    <ul>
        <li>what you do at every call</li>
        <li>what your cases for ending the recursion are</li>
        <li>what the recursive calls themselves look like</li>
    </ul>
</p>
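
The three points in the checklist above can be seen in even a tiny function. The sketch below sums the digits of a non-negative integer; the function `sum_digits` is a hypothetical example, with comments marking where each point shows up.

```python
def sum_digits(n: int) -> int:
    '''Sum the decimal digits of a non-negative integer.'''
    # Ending case: a single digit is its own digit sum.
    if n < 10:
        return n

    # Work done at every call: peel off the last digit.
    last_digit = n % 10

    # Recursive call: the same problem on a strictly smaller input.
    return last_digit + sum_digits(n // 10)


print(sum_digits(1234))
```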

To get to the elegant forms that recursion can offer, we need to practice writing solutions out in their rougher versions first and then smooth them into an elegant form, for example by removing checks that the recursion already covers.
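
To illustrate that smoothing-out process, here is a sketch that computes the depth of a binary tree twice: first with every case spelled out, then with the checks that the base case already covers removed. The `TreeNode` class and both function names are assumptions introduced for this example.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def depth_verbose(node: TreeNode) -> int:
    '''First draft: handles every combination of missing children explicitly.'''
    if node is None:
        return 0
    if node.left is None and node.right is None:
        return 1
    if node.left is None:
        return 1 + depth_verbose(node.right)
    if node.right is None:
        return 1 + depth_verbose(node.left)
    return 1 + max(depth_verbose(node.left), depth_verbose(node.right))


def depth(node: TreeNode) -> int:
    '''Cleaned up: the None base case already covers every missing child.'''
    if node is None:
        return 0
    return 1 + max(depth(node.left), depth(node.right))


root = TreeNode(1, TreeNode(2, TreeNode(4)), TreeNode(3))
print(depth_verbose(root), depth(root))
```

Both versions return the same result; the second is simply what remains once the redundant checks are recognized as special cases of the base case.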

## 9. Custom data structures

Again, this is not an algorithm but the ability to construct a custom data structure that an algorithm might need.

A prominent one is a suffix tree-like structure in which you want to capture multiple strings. Some problems are difficult to solve with just an algorithm and are better solved with a class. So the idea is not so much constructing a classic data structure as writing a class, which may itself be a data structure, that lets us solve the problem well.

Not every question is purely algorithm-focused. As we've seen, some are just coding exercises where we manipulate pointers. Such questions focus on the data structure or class that we need to construct, and the algorithm that finishes the solution is usually not too complicated.
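
As one possible example of such a class, the sketch below implements a trie (prefix tree). Note that this is a simpler relative of the suffix tree-like structure mentioned above, used here as a stand-in; the class name `Trie` and its methods are assumptions, not code from the original notebook.

```python
class Trie:
    '''A prefix tree that stores several strings, sharing their common prefixes.'''

    def __init__(self):
        self.children = {}    # maps a character to the child Trie node
        self.is_word = False  # True if an inserted word ends at this node

    def insert(self, word: str) -> None:
        node = self
        for ch in word:
            # Reuse the existing child for ch, or create it if missing.
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def _walk(self, s: str):
        '''Follow s from this node; return the final node or None.'''
        node = self
        for ch in s:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word: str) -> bool:
        node = self._walk(word)
        return node is not None and node.is_word

    def starts_with(self, prefix: str) -> bool:
        return self._walk(prefix) is not None


trie = Trie()
for w in ('car', 'card', 'care'):
    trie.insert(w)
print(trie.contains('car'), trie.contains('ca'), trie.starts_with('ca'))
```

Once the class exists, questions such as "does any stored word start with this prefix?" reduce to a short walk down the tree, which is the point made above: the data structure carries most of the solution.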

## 10. Binary search

Binary search is quite simple once you understand it, but it is absolutely fundamental. You do not want to be the person who walks into the interview not knowing how to implement binary search.

Binary search is commonly used in real life. 

Say you have a sorted array of integers and you want to find a particular integer. You look at the middle of the array and check whether that element is the one you are looking for. If it is, great! If it is greater than your target, you know the target lies in the section of the array before it. Similarly, if it is less than your target, the target lies in the section after it. You repeat this process until you find the target or run out of elements to search.

On each iteration of the search, you eliminate half of the subarray being searched, which brings the search time down to O(log n).

A practical application: you have many versions of your app and want to find the version where a crash was introduced. Binary search finds this faulty version. The same goes for finding the commit that introduced a bug in a long series of git commits.
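
The version-hunting scenario can be sketched as follows, assuming versions are numbered 1 to n, every version after the first faulty one is also faulty, and at least one faulty version exists. The function `first_bad_version` and the `is_bad` callback are hypothetical names; in practice `is_bad` would run a test against that version.

```python
from typing import Callable


def first_bad_version(n: int, is_bad: Callable[[int], bool]) -> int:
    '''Return the smallest version in 1..n for which is_bad returns True.'''
    left, right = 1, n
    while left < right:
        mid = (left + right) // 2
        if is_bad(mid):
            right = mid      # mid might be the first bad one, so keep it
        else:
            left = mid + 1   # everything up to and including mid is good
    return left


# Illustrative check: pretend version 7 introduced the crash.
print(first_bad_version(10, lambda version: version >= 7))
```

Here the loop narrows a range instead of looking for an exact match, so it uses `left < right` and keeps `mid` in the range when it is bad. That differs from the exact-match search, which discards `mid` in both directions.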

Often, you will not get an interview question that literally says "implement binary search". Instead, you will get a question whose solution leads to binary search.
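
One example of a question that leads to binary search without mentioning it is computing an integer square root. The sketch below searches the range 0..n for the largest value whose square does not exceed n; the function name `integer_sqrt` is an assumption for illustration.

```python
def integer_sqrt(n: int) -> int:
    '''Return the largest integer r such that r * r <= n, for n >= 0.'''
    if n < 0:
        raise ValueError('n must be non-negative')

    left, right = 0, n
    while left <= right:
        mid = (left + right) // 2
        if mid * mid <= n:
            left = mid + 1    # mid is a valid candidate; look for a larger one
        else:
            right = mid - 1   # mid is too large
    # The loop ends with right at the last value that satisfied mid * mid <= n.
    return right


print(integer_sqrt(17))
```

There is no array here at all: the "sorted input" is the range of candidate answers, and the comparison `mid * mid <= n` plays the role of checking the middle element.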

In [20]:
def binary_search(arr: list, num: int, left: int, right: int) -> int:
    '''Recursive version'''
    if right < left: # case where the integer could not be found
        return -1
    
    mid = (left + right) // 2
    if arr[mid] == num:
        return mid
    elif arr[mid] > num:
        return binary_search(arr, num, left, mid - 1)
    else:
        return binary_search(arr, num, mid + 1, right)
    
arr = [1,2,4,8,9,14,17]
pos = binary_search(arr, 14, 0, len(arr) - 1)
print(pos)

5


In [22]:
def binary_search(arr: list, num: int) -> int:
    '''Iterative version'''
    left = 0
    right = len(arr) - 1
    
    while left <= right:  # <= so that a range of one element is still checked
        mid = (left + right) // 2
        if arr[mid] == num:
            return mid
        elif arr[mid] > num:
            right = mid - 1
        else:
            left = mid + 1
            
    return -1

arr = [1,2,4,8,9,14,17]
pos = binary_search(arr, 14)
print(pos)

5
