<a href="https://colab.research.google.com/github/lingchm/datascience/blob/master/4_Fundamental_Algorithsm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#4. Fundamental Algorithms

##4.1.Search Algorithms

###Binary Search

How it works
* Set L and R boundaries. M is the middle
* Compare target to M. Readjust L and R
* Proceed until the end

General
* The size of the search space is reduced roughly by 1/2 for every iteration: n -> n/2 -> n/4 -> ... n/2^h
* Stop when L < R, When there is only one number, still needs to check if it is it. Return false if not found
* Must guarantee that the search space decreases over time
* Must guarantee that the target cannot be ruled out
* Minimum number of searches: H = O(logN)
* Time complexity: O(logN)

Key points:
* new search space must be smaller than the old one. Otherwise, you will end up with infinite loop
* New search space must contain all the possible candidates

Three types:
* left <= right: can only be used to find a number
* left < right: ?
* left < right - 1: needs post processsing


Summary of problems:
1. Get the last element == target --> Use upper_bound
2. Get the largest element < target --> Use lower_bound
3. Get total number of occurences --> Upper_bound - lower_bound


**Q1: Search number in a sorted array**
* Return None if not found

In [0]:
# Binary Search 1D
def binary_search(nums, target):
  if nums == None or len(nums) == 0: #if nums is not None and len(nums) != 0:
    return None
  left = 0
  right = len(nums) - 1
  while left <= right:
    mid = int((left + right) / 2)
    if nums[mid] > target:
      right = mid - 1
    elif nums[mid] < target:
      left = mid + 1
    else: 
      return mid
  return None

print (binary_search([1, 2, 3], 3))

2


In [0]:
# Laioffer Solution - with post processing
class Solution(object):
  def binarySearch(self, array, target):
    """
    input: int[] array, int target
    return: int
    """
    if not array:
      return -1
    left = 0
    right = len(array) - 1
    while left < right - 1:
      mid = (left + right) // 2
      if array[mid] == target:  #moved first 
        return mid
      if array[mid] < target:
        left = mid
      else:
        right = mid
    if array[left] == target:
      return left
    if array[right] == target:
      return right
    return -1

**Q2: 2D Matrix search**
* Convert 2D array into linear form
* Given matrix NxM; N = len(matrix); M = len(matrix[0])
* Row_index = index / M
* Col_index = index % M
* NOTE: len([])) == 0; len(None) == error
* Time complexity: O(log(N*M))

In [0]:
#Binary Search 2D
def binary_search_2d(matrix, target):
  #matrix is a 2D array, ex. [[0,1,2,3],[4,5,6,7]]
  if matrix == None or len(matrix) == 0:
    return None
  N, M = len(matrix), len(matrix[0])
  left, right = 0, N*M-1
  while left <= right:
    mid = int((left+right)/2)
    row_num = int(mid / M)  #row index
    col_num = int(mid % M)  #col index
    if matrix[row_num][col_num] > target:
      right = mid - 1
    elif matrix[row_num][col_num] < target:
      left = mid + 1
    else:
      return(row_num, col_num)
  return None

print (binary_search_2d([[0,1,2,3],[4,5,6,7]], 3))

(0, 3)


In [0]:
# Laioffer Solution
class Solution(object):
  def search(self, matrix, target):
    """
    input: int[][] matrix, int target
    return: int[]
    """
    if len(matrix) == 0 or len(matrix[0] == 0):
      return [-1, -1]
    left = 0
    right = len(matrix) * len(matrix[0]) - 1
    while left < right - 1:
      middle = left + (right - left) // 2
      middlex = middle / len(matrix[0])
      middley = middle % len(matrix[0])
      if matrix[middlex][middley] < target:
        left = middle
      elif matrix[middlex][middley] > target:
        right = middle
      else:
        return [middlex, middley]
    if matrix[left/len(matrix[0])][left % len(matrix[0])]:
      return [left/len(matrix[0]), left % len(matrix[0])]
    if matrix[right/len(matrix[0])][right % len(matrix[0])]:
      return [right/len(matrix[0]), right % len(matrix[0])]
    return [-1, -1]

# time: O(log(m*n))
# space: O(1)

**Q3 Find closest element**

* Loop condition: while left < right - 1
* When left = right - 1, infinite loop because mid wil be == left or right

In [0]:
#Binary Search (closest number)
def binary_search_C3(nums, target):
  if nums == None or len(nums) == 0: #if nums is not None and len(nums) != 0:
    return None
  left = 0
  right = len(nums) - 1
  while left < right - 1:
    mid = int((left + right) / 2)
    if nums[mid] > target:
      right = mid 
    elif nums[mid] < target:
      left = mid 
    else: 
      return mid
  return left if abs(nums[left]-target) < abs(nums[right]-target) else right

print (binary_search_C3([1,2,3], 3))

2


In [0]:
# Laioffer Solution
class Solution(object):
  def closest(self, array, target):
    """
    input: int[] array, int target
    return: int
    """
    if len(array) == 0:
      return -1
    left = 0
    right = len(array) - 1
    while left < right - 1:
      mid = (left + right) // 2
      if array[mid] > target:
        right = mid
      elif array[mid] < target:
        left = mid
      else:
        return mid
    return left if abs(array[left] - target) < abs(array[right] - target) else right


**Q4 Find k closest elements**

In [0]:
# M1
class Solution(object):
  def kClosest(self, array, target, k):
    """
    input: int[] array, int target, int k
    return: int[]
    """
    # write your solution here
    if array == None or len(array) == 0:
        return -1
    result = []
    while len(result) < k:
        left = 0
        right = len(array) - 1
        while left < right - 1:
            mid = (left + right) / 2
            if array[mid] < target:
                left = mid  
            elif array[mid] > target:
                right = mid
            else:
                result.append(array[mid])
                del(array[mid])
                break
        if left >= right - 1:
            if abs(array[left]-target) < abs(array[right]-target):
                result.append(array[left])
                del(array[left])
            else:
                result.append(array[right])
                del(array[right])
    return result 

In [0]:
# M2
class Solution(object):
  def kClosest(self, array, target, k):
    """
    input: int[] array, int target, int k
    return: int[]
    """
    # write your solution here
    if not array:
      return -1
    left = 0
    right = len(array) - 1
    res = []
    mid = (left + right) // 2
    if k == 0:
      return res
    while left < right - 1:
      mid = (left + right) // 2
      if array[mid] < target:
        left = mid
      elif array[mid] > target:
        right = mid
      else:
        break
    # set left right pointers 
    if array[mid] == target:
      res.append(array[mid])
      left, right = mid, mid
    elif abs(array[right] - target) > abs(array[left] - target):
      res.append(array[left])
      right = left
    else:
      res.append(array[right])
      left = right
    # Extend the range using left, right pointers 
    for i in range(k-1):
      if right >= len(array) - 1:
        left -= 1
        res.append(array[left])
      elif left == 0:
        right += 1
        res.append(array[right])
      elif abs(array[left-1] - target) > abs(array[right+1] - target):
        right +=1
        res.append(array[right])
      else:
        left -= 1
        res.append(array[left])
    return res

In [0]:
# Laioffer Solution
class Solution(object):
  def closest(self, array, target, k):
    """
    input: int[] array, int target, int k
    return: int[]
    """
    left = 0
    right = len(array) - 1
    if right < 0:
      return -1
    while left < right - 1:
      middle = left + (right - left) // 2
      if array[middle] < target:
        left = middle
      elif array[middle] > target:
        right = middle
      else:
        return middle
    if abs(array[left] - target) < abs(array[right] - target):
      return left
    return right

  def kClosest(self, array, target, k):
    res = []
    if len(array) == 0 or k == 0:
      return res
    close = self.closest(array, target)
    res.append(array[close])
    left = close - 1
    right = close + 1
    total = len(array)
    while len(res) < k and (left >= 0 or right < total):
      if right < total and (left < 0 or abs(array[left] - target) > abs(array[right] - target)):
        res.append(array[right])
        right += 1
      elif left >= 0:
        res.append(array[left])
        left -= 1
    return res

# time: O(logn + k)
# space: O(1)

In [0]:
### New method: Direct binary search
Could we directly use binary search to find the first element in those k closest ones towards target?


**Q5 First Occurence**

Given an array of integers that is non-decreasing and a target value, find the first position in this array at which the element is >= target. If you cannot find one, return len(array)

In [0]:
#First Occurence
def first_occurence(nums, target):
  if not nums: #if nums is not None and len(nums) != 0:
    return None
  left = 0
  right = len(nums) - 1
  while left < right - 1: # [left, right] >= 2
    mid = (left + right) // 2
    if nums[mid] < target:
      left = mid + 1     # left = mid is also okay
    else:
      right = mid 
  if nums[left] == target:
    return left
  if nums[right] == target:
    return right
  return None

print (first_occurence([1,2,2,5], 2))

1


In [0]:
#First Occurence
def first_occurence(nums, target):
  if nums == None or len(nums) == 0: #if nums is not None and len(nums) != 0:
    return None
  left = 0
  right = len(nums) - 1
  while left < right - 1:   # range <= 2
    mid = (left + right) // 2
    if nums[mid] >= target: # 如果大于或等于，往左看
      right = mid
    else:
      left = mid + 1
  if nums[left] == target:
    return left
  if nums[right] == target:
    return right
  return None

print (first_occurence([1,2,2,5], 2))

In [0]:
# Laioffer Solution
class Solution(object):
  def firstOccur(self, array, target):
    """
    array: in[]
    target: int
    return: int
    """
    left = 0
    right = len(array) - 1
    if right < 0:
      return -1
    while left < right - 1:
      middle = left + (right - left) // 2
      if array[middle] < target:
        left = middle
      elif array[middle] >= target:
        right = middle
    if array[left] == target:
      return left
    if array[right] == target:
      return right
    return -1

In [0]:
# Laioffer Solution - no post processing
class Solution(object):
  def firstOccur(self, array, target):
    left, right = 0, len(array) #includes len(array) because its part of answers
    while left < right:
      mid = (right + left) // 2
      if array[mid] >= target:
        right = mid
      else:
        left = mid + 1
    return l

# If only two elements, converge into one element
[1,2,5,6] -> 4
 l   m   r
     r
   m
     l
     
    

**Q6 Last Occurrence**

In [0]:
#Last Occurence
def last_occurence(nums, target):
  if nums == None or len(nums) == 0: #if nums is not None and len(nums) != 0:
    return None
  left = 0
  right = len(nums) - 1
  while left < right - 1:
    mid = int((left + right) / 2)
    if nums[mid] > target:
      right = mid - 1
    else:
      left = mid 
  if nums[right] == target:
    return right
  if nums[left] == target:
    return left
  return None

print (last_occurence([1,2,2,5], 2))
#time: O(logn)
#space: O(1)

2


In [0]:
# Laioffer Solution
class Solution(object):
  def lastOccur(self, array, target):
    """
    array: int[]
    target: int
    return: int
    """
    left = 0
    right = len(array) - 1
    if right < 0:
      return -1
    while left < right - 1:
      middle = left + (right - left) // 2
      if array[middle] <= target:
        left = middle
      elif array[middle] > target:
        right = middle
    if array[right] == target:
      return right
    if array[left] == target:
      return left
    return -1

**Q7 Search in Unknown Sized Sorted Array**
* S1 expand the boundary by two times to find the right boundary
* do binary searhc inside left and right bounda

In [0]:
# M1
class Solution(object):
  def search(self, dic, target):
    """
    input: Dictionary dic, int target
    return: int
    """
    # write your solution here
    i = 0
    while dic.get(i) != None:
      if dic.get(i) == target:
        return i
      i += 1
    return -1

In [0]:
# M2
# Definition for a unknown sized dictionary.
# class Dictionary(object):
#   def get(self, index):
#     pass

class Solution(object):
  def search(self, dic, target):
    """
    input: Dictionary dic, int target
    return: int
    """
    # write your solution here
    if not dic:
      return -1
    start = 1
    while dic.get(start) and dic.get(start) < target:
      start = start * 2
    left = 0
    right = start
    while left <= right:
      mid = (left + right) // 2
      if dic.get(mid) is None or dic.get(mid) < target:
        left = mid + 1
      elif dic.get(mid) > target:
        right = mid - 1
      else:
        return mid
    return -1

In [0]:
# Laioffer Solution
# Definition for a unknown sized dictionary.
# class Dictionary(object):
#   def get(self, index):
#     pass

class Solution(object):
  def search(self, dic, target):
    """
    input: Dictionary dic, int target
    return: int
    """
    # write your solution here
    start = 1
    while dic.get(start) and dic.get(start) < target:
      start = start * 2
    left, right = 0, start
    while left <= right:
      mid = (left + right) // 2
      if dic.get(mid) is None or dic.get(mid) < target:
        left = mid + 1
      elif dic.get(mid) > target:
        right = mid - 1
      else:
        return mid
    return -1

In [0]:
'''
M1 Binary search in [1, n]
bad: right might be very big
time: O(log(version_number))
'''
def bug():
  version = 1
  while is_bug(version):
    version *= 2
'''
M2 [start, end?]
S1 How find the end? binary search
    try start*2, then start*2^2, start*2^#...
    until find a bug
S2 binary search [end/2, end]
time: O(log(first_bug_version))
'''

**Q8 Total Occurence**

Given a target integer T and an integer array A sorted in ascending order, Find the total number of occurrences of T in A.
* Return 0 if A is null


In [0]:
class Solution(object):
  def totalOccurrence(self, array, target):
    """
    input: int[] array, int target
    return: int
    """
    # write your solution here
    if not array:
      return 0
    left = 0
    right = len(array) - 1
    res = 0
    # first occurence
    while left < right - 1:
      mid = (left + right) // 2
      if array[mid] >= target:
        right = mid 
      elif array[mid] < target:
        left = mid + 1
    if array[left] == target:
      mid = left
    elif array[right] == target:
      mid = right
    else:
      return res
    # more occurrences
    res = 1
    while (mid + 1 < len(array)):
      if (array[mid + 1] == target):
        mid += 1
        res += 1
      else:
        break
    return res

In [0]:
# Laioffer Solution
# Find the First Occurence and Last Occurence, get the distance between them
class Solution(object):
  def totalOccurences(self, array, target):
    """
    input: int[] array, int target
    return: int
    """
    if not array:
      return 0
    lastOccur = self.lastOccur(array, target)
    firstOccur = self.firstOccur(array, target)
    return 0 if lastOccur == -1 else lastOccur - firstOccur + 1

  def firstOccur(self, array, target):
    if not array:
      return -1
    left, right = 0, len(array) - 1
    while left < right - 1:
      mid = (left + right) // 2
      if array[mid] >= target:
        right = mid
      else:
        left = mid
    if array[left] == target:
      return left
    if array[right] == target:
      return right
    return -1
  
  def lastOccur(self, array, target):
    if not array:
      return -1
    left, right = 0, len(array) - 1
    while left < right - 1:
      mid = (left + right) // 2
      if array[mid] <= target:
        left = mid
      else:
        right = mid
    if array[right] == target:
      return right
    if array[left] == target:
      return left
    return -1

# time: O(logn)
# space: O(1)



**Q9 Square Root I**

Given an integer number n, find its integer square root.
* n is guaranteed to be >= 0.

In [0]:
'''
M1 Try every number in [1, n] to [1,n/2]
time: O(n)
space: O(1)
'''
def sqrt(n):
  for val in range(1, n/2+1):
    if val * val > n:
      return val - 1
  return n 

def sqrt(n):
  val = 1
  while vale * val <= n: 
    val += 1
  return val - 1

In [0]:
'''
M2 Binary search in [1, n/2]
if mid * mid < n: go right
if mid * mid > n: go left
if mid * mid == n: return
time: O(n)
space: O(1)
'''
def sqrt(n):
  if n <= 1:
    return n 
  left, right = 1, n/2
  while left < right - 1:
    mid = (left + right) / 2
    mdsq = mid * mid
    if midsq == n:
      return mid
    elif midsq > n:
      right = mid
    else:
       left = mid
  if right * right <= n:
    return right
  else:
    return left

In [0]:
# Laioffer Solution
class Solution(object):
  def sqrt(self, x):
    """
    input: int x
    return: int
    """
    if x == 0:
      return 0
    left = 1
    right = x
    while True:
      if mid > x // mid:
        right = mid - 1
      elif mid <= x // mid and mid + 1 > x // (mid + 1):
        return mid
      else:
        left = mid + 1

# time: O(logn) better, with fewer loops
# space: O(1)

**Q10 The first element larger than target**


In [0]:
def upper_bound(nums, target):
  

**Q11 Variant**

Suppose an array sorted might be rotated at some pivot unknown to you beforehand. Find the minimum element. You may assume no duplicates.

e.g. [0,1,2,4,5,6,7] might be [4,5,6,7,0,1,2]




In [0]:
'''
Three scenarios:
* sorted, classical binary search: nums[l] < nums[m] < nums[r]
* Min is on the left half: nums[l] > nums[m] < nums[r]
* Min is on the right half
'''



**Q12 Bad Version**

Suppose there is a version control interface contains n versions of product [1,2,3....n].

There is an API called isBadVersions(int n) in which input is version number and output is boolean representing that whether the version is bad or not. Versions after the first bad version are all bad. Versions before the first bad version are all good.

Write a new API called findFirstBadVersion(int n) where n is the total number of versions that returns the version number of the first bad one. 


In [0]:
l, r = 1, n
while...
  m = 
  if isBadVersion(m) is bad:
    look to left
  else:
    look to right

**Q13 Ropes**

You have N ropes and the ith rope has length l_i. If you want to cut these ropes down so that you could have at least K ropes taht have exactly the same length, what will be the maximum length you could get? The answer should have two decimal places

In [0]:
l = 0, r = +ing
while...
  m = ...
  if C(m):
    look to the right

start at 0,
  increment 0.01
  until reach K

Given a length l
C(l) => check whether we could have a solution with length l => O(n)
  for every rope length Li:
    count += Li / l
  if count > K

Entire question => O(N*log(max Li)) ~ O(N)

##4.2. Sort Algorithms

###Bubble Sort 

How it works

* Every iteration, sort by pairs of 2.

* 1st iteration: compared n numbers or (n-1) times --> the largest number will be at the end

* 2nd iteration: compared n-1 numbers. The second largest will be before n

General
* Time: O(n^2) = (n-1) + (n-2) + ....+ 1
* Space: O(1)

In [0]:
#M1
def bubble_sort(list):
  for n in range(len(list)-1, 0, -1): #0 is not included, loop n-1 times
    for j in range(n):                #loop n times
      #print ("compare:", list[j], list[j+1]) 
      if (list[j]>list[j+1]):
        list[j], list[j+1]=list[j+1], list[j]

alist = [2,3,1,0,5,-1]
bubble_sort(alist)
print (alist)

# time: O(n^2)
# space: O(1)

compare: 2 3
compare: 3 1
compare: 3 0
compare: 3 5
compare: 5 -1
compare: 2 1
compare: 2 0
compare: 2 3
compare: 3 -1
compare: 1 0
compare: 1 2
compare: 2 -1
compare: 0 1
compare: 1 -1
compare: 0 -1
[-1, 0, 1, 2, 3, 5]


In [0]:
#M2
def bubble_sort2(list):
  for n in range(len(list)):    #0,1,2,3...
    for j in range(len(list)-n-1): #[0,6), [0,5),...
      #print ("compare:", list[j], list[j+1]) 
      if (list[j]>list[j+1]):
        list[j], list[j+1]=list[j+1], list[j]

alist = [2,3,1,0,5,-1,-2]
bubble_sort2(alist)
print (alist)

compare: 2 3
compare: 3 1
compare: 3 0
compare: 3 5
compare: 5 -1
compare: 5 -2
compare: 2 1
compare: 2 0
compare: 2 3
compare: 3 -1
compare: 3 -2
compare: 1 0
compare: 1 2
compare: 2 -1
compare: 2 -2
compare: 0 1
compare: 1 -1
compare: 1 -2
compare: 0 -1
compare: 0 -2
compare: -1 -2
[-2, -1, 0, 1, 2, 3, 5]


###Selection Sort

How it works
* Find the max number, swap the max with the last
* Sublist of items sorted + sublist of unsorted 
* Find the smallest/largest in the unsorted list

General
* Inefficient on large lists, worse than insertion sort
* SImplicity; good when auxiliary memory is limited
* Time: O(n^2) = (n-1)+(n-2)+....+ 1
* Space: O(1)

In [0]:
#M1 Find max
def selection_sort(alist):
  for i in range(len(alist)-1, 0, -1):  #loop n-1 times: 5,4,3,2,1 
    max_index = 0
    for j in range(i+1):                #loop: n, n-1, n-2 ...
      if (alist[j] > alist[max_index]):
        max_index = j
    alist[i], alist[max_index] = alist[max_index], alist[i]   #swap
    print("swap: ", alist[i], alist[max_index], "->", alist)

alist = [2,3,1,0,5,4]
selection_sort(alist)
print (alist)

swap:  5 4 -> [2, 3, 1, 0, 4, 5]
swap:  4 4 -> [2, 3, 1, 0, 4, 5]
swap:  3 0 -> [2, 0, 1, 3, 4, 5]
swap:  2 1 -> [1, 0, 2, 3, 4, 5]
swap:  1 0 -> [0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5]


In [0]:
#M2 Find min 
def selection_sort(alist):
  for i in range(len(alist)-1):         #loop n-1 times: 0, 1, ..., n-1
    min_index = i   
    for j in range(i+1, len(alist)):    #loop: n-1, n-2, n-3.. 1 times
      if (alist[j]) < alist[min_index]:
        min_index = j
    alist[i], alist[min_index] = alist[min_index], alist[i]
    print("swap: ", alist[i], alist[min_index], "->", alist)

alist = [2,3,1,0,5,-1]
selection_sort(alist)
print (alist)

swap:  -1 2 -> [-1, 3, 1, 0, 5, 2]
swap:  0 3 -> [-1, 0, 1, 3, 5, 2]
swap:  1 1 -> [-1, 0, 1, 3, 5, 2]
swap:  2 3 -> [-1, 0, 1, 2, 5, 3]
swap:  3 5 -> [-1, 0, 1, 2, 3, 5]
[-1, 0, 1, 2, 3, 5]


###Insertion Sort

How it works
* Each time insert a number and puts it in a correct position.
* Use inverse bubble sort to find the correct position to insert

General
* Time: O(n^2)
* Space: O(n)  because created a new array with length n

Improvements
* Option 1: use new_arr and while greater, loop
* Option 2: use binary search to find the insertion location
* Optimal: In-place insertion using three indexes
 * Time: O(n^2)
 * Space: O(1)


In [0]:
#M1 Not in place. Use bubble sort for each insertion
def insert_num(array,num):
  idx = len(array) - 1
  array.append(num)  #insert at the end
  while idx >= 0:    #bubble sort (1 iteration)
    if (array[idx] > array[idx+1]):
      array[idx+1], array[idx] = array[idx], array[idx+1]
    idx -= 1

def insertion_sort(array):
  if not array:
    return None
  new_arr = [] 
  for i in range(len(array)):
    insert_num(new_arr, array[i])
    print("insert", array[i], "->", new_arr)
  return new_arr

alist = [1,5,6,3,2]
new_arr = insertion_sort(alist)
print (new_arr)

insert 1 -> [1]
insert 5 -> [1, 5]
insert 6 -> [1, 5, 6]
insert 3 -> [1, 3, 5, 6]
insert 2 -> [1, 2, 3, 5, 6]
[1, 2, 3, 5, 6]


In [0]:
#Binary search to find the closest number smaller than n
def binary_search(array, n):
  if len(array) == 0 or array == None:
    return 0
  left = 0
  right = len(array) - 1
  while left < right - 1:
    mid = (left+right) // 2
    if array[mid] >= n:
      right = mid - 1
    else:
      left = mid
  return right #returns the position smaller than the number's value

def insert_num(array, n):
  idx = binary_search(array, n) + 1
  array.insert(idx, n)

def insertion_sort(array):
  new_arr = [] 
  for i in range(len(array)):
    insert_num(new_arr, array[i])
    print("insert", array[i], "->", new_arr)
  return new_arr

alist = [1,5,6,3,2]
new_arr = insertion_sort(alist)
print (new_arr)

insert 1 -> [1]
insert 5 -> [1, 5]
insert 6 -> [1, 5, 6]
insert 3 -> [1, 3, 5, 6]
insert 2 -> [1, 2, 3, 5, 6]
[1, 2, 3, 5, 6]


In [0]:
#M2 In-place
'''
[1,5,6,3]  --> [1,5,6,6] --> [1,5,5,6] --> [1,3,5,6]
       I              i             i 
       k            k           k
     cur = 3      cur = 3     cur = 3
loop using i; 
cur holds value at i; 
k starts as index at i, moves backwards and changes value at k until no smaller found
'''
def insertion_sort(alist):
  for i in range(1, len(alist)):  #loop: [1, n-1]
    cur = alist[i]  
    k = i
    while k > 0 and cur < alist[k-1]: 
      alist[k] = alist[k-1]
      k -= 1 
    alist[k] = cur 
    
alist = [1,5,6,3,2]
new_arr = insertion_sort(alist)
print (alist)
print (new_arr)

[1, 2, 3, 5, 6]
None


###Merge Sort

How it works:
* A recursive algorithm that continually splits a list in half
 * If the list is empty of has one item, it is sorted by definition (the base case).
 * If the list has more than one item, we split the list and recursively invoke a merge sort on both halves.
* Once the two halves are sorted, merge is performed. Combines two smaller sorted lists and combine them into a single, sorted, new list

General
* Time: O(n logn) = O(n) per layer * logn layers
* Space: O(n)
 * Top layer, right, left occupies O(n) in the worst case
  * array[:middle], array[middle:] worst case oses O(n) -> may use index range to improve to O(1)
  * Recursion uses n/2 + n/3 + n/8 + ... 1 = O(n)
  

In [0]:
# Divide unsorted list into left + right
def divide(unsortedList):
  if(len(unsortedList) <= 1): #if only one or None element
    return unsortedList
  mid = int(len(unsortedList) / 2)
  lefthalf = unsortedList[:mid]
  righthalf = unsortedList[mid:]
  return (lefthalf, righthalf)

# Merge two sorted arrays using left index point + right index pointer
# time: O(2n) = O(n)
# space: O(n+n) = O(n)
def merge(list1, list2):
  i = 0
  j = 0
  new_list = []
  while (i < len(list1) and j < len(list2)):
    if (list1[i] < list2[j]):
      new_list.append(list1[i])
      i = i + 1
    else:
      new_list.append(list2[j])
      j = j + 1
  while(i < len(list1)):
    new_list.append(list1[i])
    i = i + 1
  while(j < len(list2)):
    new_list.append(list2[j])
    j = j + 1
  return (new_list)

# MergeSort 
def mergeSort(alist):
  if(len(alist) <= 1):
    return alist
  lefthalf, righthalf = divide(alist)
  lefthalf = mergeSort(lefthalf)
  righthalf = mergeSort(righthalf)
  return(merge(lefthalf, righthalf))

# time: O(nlogn) = logn layers * n per layer
# space: O(n) = O(logn) + O(n)

print (mergeSort([7,3,8,4,5,6,2]))

[2, 3, 4, 5, 6, 7, 8]


In [0]:
### Simpler version without divide
def merge(array1, array2):
  i = j = 0
  results = []
  while i < len(array1) and j < len(array2):
    if array1[i] < array2[j]:
      results.append(array1[i])
      i += 1
    else:
      results.append(array2[j])
      j += 1
  while i < len(array1):
    results.append(array1[i])
    i += 1
  while j < len(array2):
    results.append(array2[j])
    j += 1
  return results

def merge_sort(array):
  if len(array) <= 1 or array == None:
    return array
  middle = int(len(array) / 2)
  left = merge_sort(array[:middle])
  right = merge_sort(array[middle:])
  return merge(left, right)

print (merge_sort([7,3,8,4,5,6,2]))

[2, 3, 4, 5, 6, 7, 8]


In [0]:
# Laioffer solution
class Solution(object):
  def mergeSort(self, head):
    if not head or not head.next:
      return head
    one, two = self.splitInHalf(head)
    one = self.mergeSort(one)
    two = self.mergeSort(two)
    return self.merge(one, two)

  def splitInHalf(self, head):
    slow, fast = head, head.next
    while fast and fast.next:
      slow = slow.next
      fast = fast.next.next
    next = slow.next
    slow.next = None
    return head, next

  def merge(self, one, two):
    prev = ListNode(None)
    curr = prev
    while one and two:
      if one.val < two.val:
        curr.next = one
        one = one.next
      else:
        curr.next = two
        two = two.next
      curr = curr.next
    if one:
      curr.next = one
    else:
      curr.next = two
    return prev.next

# time: O(nlogn)
# space: O(logn)

###Quick Sort

General
* First randomly select a pivot value
* Use one pointer, called store_index. This is a dynamic pointer of how many processed values are smaller than the pivot. So, everything on the left of store_index is smaller than the pivor, everything on the right is bigger. Finally, swap pivot with store_index
* After each iteration, at least one number (pivot) is in correct order
* Partition: Put items smaller than the pivot on the left, greater on the right

 * (..., store_index): < pivot
 *[store_index, i): >= pivot
 * [i,...end): unknown]
 

General
* Time: O(nlogn) = logn layers * n per layer
 * Worst case: one partition each layer --> O(n^2)
* Space: O(logn)



In [0]:
from random import randrange

def partition(lst, start, end, pivot_index):
  lst[pivot_index], lst[end] = lst[pivot_index], lst[end]
  store_index = start # store index is a counter for how many are smaller than pivot
  pivot = lst[end]    # pivot is moved to the end
  for i in range(start, end):
    if lst[i] < pivot:
      lst[i], lst[store_index] = lst[store_index], lst[i]
      print("swap:", lst[i], lst[store_index], lst[pivot_index])
      store_index += 1
  lst[store_index], lst[end] = lst[end], lst[store_index]
  print("sorted:", lst)
  return store_index
# time: O(n)
# space: O(1)

def quick_sort(lst, start, end):
  if start >= end:
    return
  pivot_index = randrange(start, end + 1)
  new_pivot_index = partition(lst, start, end, pivot_index)
  quick_sort(lst, start, new_pivot_index - 1)
  quick_sort(lst, new_pivot_index + 1, end)
# time: O(nlogn) // worst case O(n^2)
# space: O(logn)

alist = [28, 1, -1, 5, 3, -3, -2]
quick_sort(alist, 0, len(alist)-1)
print(alist)

swap: 28 -3 -2
sorted: [-3, -2, -1, 5, 3, 28, 1]
swap: -1 -1 5
sorted: [-3, -2, -1, 1, 3, 28, 5]
swap: 3 3 3
sorted: [-3, -2, -1, 1, 3, 5, 28]
[-3, -2, -1, 1, 3, 5, 28]


###Rainbow Sort

How it works
* Use n pointers for n features/colors
* Iterative to count how many of each color -> O(n)
* Recreate the list directly using the counts (in-place) -> O(1) 

General
* Applicable for defined datasets (known length)
* Datasets can only have a finite number of values

In [0]:
def RainbowSort(array):
  if not array:
    return None
  left = 0
  right = len(array) - 1
  index = 0
  while index <= right:
    if array[index] == -1:
      array[index], array[left] = array[left], array[index]
      left += 1
      index += 1
    elif array[index] == 1:
      array[index], array[right] = array[right], array[index]
      right -= 1
    else:
      index += 1
  return array

#time: O(n)
#space: O(1)

##4.3. Recursion


what
* 表面上：function calls itself
* 实质上：boil down a big problem to smaller ones

implementation
* base case: the smallest problem to solve
* recursive rule: how to make the problem smaller


In [0]:
#了解recursion层数限制
import sys
sys.setrecursionlimit(10000)

In [0]:
#Q1 Fibonacci number

#M1 loop
def fib_array(n): 
  array = [0,1]
  for i in range(2, n+1):
    array.append(array[i-1]+array[i-2])
  return array[n]
#space: O(n)  needs to keep an array
#time: O(n)

#M2 Fibonacci
def fib(n):
  if n <= 1:
    return n 
  num_a = 0
  num_b = 1
  for i in range(n-1):
    temp = num_b
    num_b += num_a
    num_a = temp
  return num_b
#space: O(1)  only keeps previous two numbers
#time: O(n)

#M3 Recursion
def fib(n):
  if n <= 1:               #base case
    return n
  return fib(n-1)+fib(n-2) #recursive rule

In [0]:
# Q2 Sum from 1 to n

#M1 Loop
def getSum(n):
  acc = 0
  for num in range(1, n+1):
    acc += num
  return acc
#time: O(n)
#space: O(1)

#M2 Recursion
def getSum(n):
  if n == 0:      #termination point
    return 0
  return n + getSum(n-1)
#time: O(n)
#space: O(n)

print(getSum(2))

3


In [0]:
# Q3 Sum even numbers from 1 to n
def sumEven(n):
  if n == 0:
    return 0
  if n % 2 == 1:
    n -= 1
  return n + sumEven(n-2)

print(sumEven(5))

6


In [0]:
# Q4 Print a linked list
def printList(head):
  if not head:
    return
  print(head.value)
  printList(head.next)

# Q5 Reverse a singly linked list
def Reverse(head):
  if head is None or head.next is None: #When Linkedlist <= 1 node
    return head
  #problem of the previous step
  node = Reverse(head.next)  #this node is the original tail and the new head
  # Add original head as the new tail
  # M1 Since node is the head of the reversed list, 
  #    traverse from it and find the tail and add our original head to it
  #tail = node
  #while tail.next is not None:
  #  tail = tail.next
  #tail.next = head
  #head.next = None
  # M2 Directly reverse 
  head.next.next = head      
  head.next = None
  return node

def Reverse(head):
  if head is None or head.next is None: 
    return head
  node = Reverse(head.next)  
  head.next.next = head      
  head.next = None
  return node

# Q6 Merge two singly linked lists
#Define merge(head1, head2) is a function that will merge these two singly lnked lists and return the head of the new list
def merge(head1, head2):
  if not head1:
    return head2
  if not head2:
    return head1
  if head1.value < head2.value:
    head1.next = merge(head1.next, head2)
    return head1
  else:
    head2.next = merge(head1, head2.next)
    return head2



###Divide and Conquer

1. Divide the problem into a number of subproblems that are smaller instances of the same problem
2. Conquer the subproblems by solving them recursively. If they are small enough, solve the subproblems as base cases
3. Combine the solutions to the subproblems into the solution for the original problem

**Bottom Up**

Way of thinking
1. What do you expect form your left child/right child?
2. What do you want to do in the current layer?
3. What do you want to return to your parent?

**Top Down**

* Use variables to pass information; Use a global variable to record final result
* Usually no return value
* Logic more straight-forward, but code not as short as bottom-up
* 大部分需要输出完整path的题，top down 更实用
* 如果依赖subtree的结果，bottom up

In [0]:
#Q1 Get height of a binary tree
class Solution(object):
  def findHeight(self, root):
    if root is None:
      return 0
    self.result = 0
    self.helper(root, 0) #Can also use 1
    return self.result
  
  def helper(self, root, depth):
    if root is None:
      self.result = max(self.result, depth)
      return
    self.helper(root.left, depth + 1)
    self.helper(root.right, depth + 1)
    return
#time: O(n)
#space: worst O(n), avg O(logn)

In [0]:
#Q2 Get min depth
class Solution(object):
  def minDepth(self, root):
    if root is None:
      return 0
    self.ret = float('inf')
    self.helper(root, 0)
    return self.ret
  
  def helper(self, root, depth):
    if root is None:
      return
    if root.left is None and root.right is None:
      self.ret = min(self.ret, depth + 1)
      return
    self.helper(root.left, depth + 1)
    self.helper(root.right, depth + 1)
    return
#time: O(n)
#space: O(logn)
  
  

In [0]:
#Q3 Invert Binary tree (left with right)

#Bottom Up
class Solution:
  def invertTree(self, root):
    if root is None:
      return None
    right = self.invertTree(root.right)
    left = self.invertTree(root.left)
    root.left, root.right = right, left
    #or root.left = right
    #or root.right = left
    return root
#post-order
  
#Top Down
class Solution:
  def invertTree(self, root):
    if root is None:
      return None
    root.left, root.right = root.right, root.left
    self.invertTree(root.left)
    self.invertTree(root.right)
    return root
#pre-order

class Solution:
  def invertTree(self, root):
    self.helper(root)
    return root
  
  def helper(self, root):
    if root is None:
      return None
    root.left, root.right = root.right, root.left
    self.invertTree(root.left)
    self.invertTree(root.right)
    return root

In [0]:
def preorder_traversal(root):
  output = []
  if not root:
    return output
  stack = [(root, 1)]
  while stack:
    node, count = stack.pop()
    if count == 1: #first visit
      output.append(node.val)
      stack.append((node, count + 1))
      if node.left:
        stack.append((node.left, 1))
    if count == 2: #second visit
      if node.right:
        stack.append((node.right, 1))
  return output

#time: O(cn) = O(n)
#space: O(h)
#same performance as recursion

def inorder_traversal(root):
  output = []
  if not root:
    return output
  stack = [(root, 1)]
  while stack:
    node, count = stack.pop()
    if count == 2: #first visit
      output.append(node.val)
      stack.append((node, count + 1))
      if node.left:
        stack.append((node.left, 1))
    if count == 1: #second visit
      if node.right:
        stack.append((node.right, 1))
  return output
      

**Q1 Maximum Subarray**

Given an integer array nums, find the contiguous subarray (containing at least one number) which has the largest sum and return its sum

Input: [-2, 1, -3, 4, -1, 2, 1, -5, 4]

Output: 6

Input: [2, -1] 

Output: 2

Input: [2, -1, 3]  4


**Solution 1**: Divide into 2 parts: left, right, mid --> max(left_result, right_result, mid_result)

[2,1,-4,-3] --> 3

[2, 1] --> 3

[-4, -3] --> -3

[2, 1, -4] --> -1 中心开花 -- 从左往右，从右往左，分别算最大值

**Solution 2***: supporting array M, len(M) == len(a)

M[i] = the largest sum of a subarray that ends at a[i]

M[0] = a[0]

M[i] = if M[i-1] < 0 : a[i]  else: M[i-1] + a[i]

Final_result = max(M[0], M[1], M[2], M[n-1])

A:  -2  -3  4  -1  -2  1  5  -3

M: -2  -3  4   3   1  2  7   4


In [0]:
#Q1 M1

def maxSubarray(A, left, right):
  if left == right:
    return A[left]
  center = (left + right)/2
  maxLeftSum = maxSubarray(A, left, center)
  maxRightSum = maxSubarray(A, center+1, right)
  maxCenterSum = getMaxCenterSum(A, center, left, right)
  maxSum = max(maxLeftSum, maxRightSum, maxCenterSum)
  return maxSum

def getMaxCenterSum(A, center, left, right):
  maxLeftBorderSum = -float('inf')
  leftBorderSum = 0
  for i in range(center, left-1, -1):
    leftBorderSum += A[i]
    maxLeftBorderSum = max(leftBorderSum, maxLeftBorderSum)
  rightBorderSum = 0
  for i in range(center + 1, right+1):
    rightBorderSum += A[i]
    maxRightBordedSum = max(maxRightBorderSum, rightBorderSum)
  return maxLeftBorderSum + maxRightBorderSum

#time: O(nlogn) = n + (n/2+n/2) + (n/4+n/4+n/4+n/4)...
    

In [0]:
#Q1 M2 Improved

'''
#What if not using a supporting array and complete in one loop?

initialize:
global_max = 0
max_ending_here = 0

loop for each element of the array
  if max_Ending_here < 0
    max_ending_here = a[i]
  else:
    max_ending_here = max_ending_here + a[i]
  global_max = max(global_max, max_ending_here)
return global_max

'''

###Backtracking

**Q0 Motivating Example**

We have 3 types of flour, 2 types of creams, and 6 types of fruits. How many cakes can we make? -> 36

In [0]:
cakes = []
for flour in [flour1, four2, four3]:
  for creat in [cream1, cream2]:
    for fruit in [fruit1, fruit2, fruit3, fruit4, fruit5, fruit6]:
      cakes.append(make_Cake(flour, cream, fruit))

Let's generalize this. What if not all the choices can be compatible? For instance, flour1 does not work well with cream2, flour3 does not work well with cream1, cream1 and flour3 does not work well.

In [0]:
# M1 
cakes = []
for flour in [flour1, four2, four3]:
  for cream in [cream1, cream2]:
    for fruit in [fruit1, fruit2, fruit3, fruit4, fruit5, fruit6]:
      if is_compatible(flour, cream, fruit):
        cakes.append(make_Cake(flour, cream, fruit))

In [0]:
# M2
# this is better because reduces the number of loops
for flour in [flour1, four2, four3]:
  for cream in [cream1, cream2]:
    if (flour, cream) in [(flour1, cream2), (flour3, cream1)]:
      continue
    for fruit in [fruit1, fruit2, fruit3, fruit4, fruit5, fruit6]:
      if (cream, fruit) in [(cream1, fruit3), (cream2, fruit1), (Cream2, fruit6)]:
        continue
      cakes.append(make_Cake(flour, cream, fruit))

What if we don't know the types of flour, cream, and fruit ahead?
* This is a multistage decision problem
* hard to express in divide and conquer recursion manner
* It involves building a set of candidates incrementally and abandons a candidate ("backtracks") as soon as it determines that the candidate cannot possibly be a valid solution
* Tree structure of different partially built candidates set; each partial candidate set is a path from the root to a certain subtree's root --> similarly to preorder tree traversal



Backtracking = to **systematically** and **exhaustively** generate all possible combinations of compatible choices for all stages of a multistage decision porblem.

Strategy
1. Multistage decision problem? yes
2. What's the stage? How many stages?
3. What are possible candidates for a certain stage? How do we make sure it is compatible with previous choices we make?

Complexity -- Decision Tree Traversal
* Estimate number of nodes: number of choices
* Time = O(b*h)  = O(n^n)
* Every node might have different braching factor b, we can assume the worst

In [0]:
'''
Visualize the decision process as a tree
          cake
        /  |  \
       f1  f2  f3
     / \ 
    c1 c2
  /
fr1 .. fr6

We want to generate all possible combinations = tree traversal
This can be seen as a recursion:
  all possible combinations for n stages 
  = all possible combinations for n - 1 stages when the choice is C1
  + all possible combinations for n - 1 stages when the choice is C2
  + all possible combinations for n - 1 stages when the choice is C3
'''

In [0]:
# answer = tuple that contains compatible decisions we previously made. 
#          The size of it should be == current_position
# current_position = integer indicates the id of the current step. Starts at 0
# N = integer indicates total number of steps to build our final answer
# Possible_decisions = a map that associates the id of a step to a collection of possible decisions for that step
  
def backtracking(answer, current_position, N, possible_decisions):
  if len(answer) == N: # we have successfully built one answer here
    pass
  else:
    for decision in possible_decisions[current_position]]:
      if is_compatible(answer, decision): # only proceed if compatible
        answer.add(decision)
        backtracking(answer, current_position+1, N, possible_decisions)
        answer.remove(decision) 

In [0]:
# Cake example transformed

def bt(recipes, recipe, ingredients):
  if len(recipe) == len(ingredients): # base case
    recipes.append(recipe[:]) # make a copy for each full recipe
    return
  for ingredient in ingredients[len(recipe)]:
    recipe.append(ingredient)
    bt(recipes, recipe, ingredients)
    recipe.pop()

def gen_all_recipes(ingredients):
  recipes, recipe = [], []
  bt(recipes, recipe, ingredients)
  return recipes
                            
ingredients = [
               ["four1", "flour2", "flour3"],
               ["cream1", "cream2"],
               ["fruit1", "fruit2", "fruit3", "fruit4", "fruit5", "fruit6"]
]

print(gen_all_recipes(ingredients))

[['four1', 'cream1', 'fruit1'], ['four1', 'cream1', 'fruit2'], ['four1', 'cream1', 'fruit3'], ['four1', 'cream1', 'fruit4'], ['four1', 'cream1', 'fruit5'], ['four1', 'cream1', 'fruit6'], ['four1', 'cream2', 'fruit1'], ['four1', 'cream2', 'fruit2'], ['four1', 'cream2', 'fruit3'], ['four1', 'cream2', 'fruit4'], ['four1', 'cream2', 'fruit5'], ['four1', 'cream2', 'fruit6'], ['flour2', 'cream1', 'fruit1'], ['flour2', 'cream1', 'fruit2'], ['flour2', 'cream1', 'fruit3'], ['flour2', 'cream1', 'fruit4'], ['flour2', 'cream1', 'fruit5'], ['flour2', 'cream1', 'fruit6'], ['flour2', 'cream2', 'fruit1'], ['flour2', 'cream2', 'fruit2'], ['flour2', 'cream2', 'fruit3'], ['flour2', 'cream2', 'fruit4'], ['flour2', 'cream2', 'fruit5'], ['flour2', 'cream2', 'fruit6'], ['flour3', 'cream1', 'fruit1'], ['flour3', 'cream1', 'fruit2'], ['flour3', 'cream1', 'fruit3'], ['flour3', 'cream1', 'fruit4'], ['flour3', 'cream1', 'fruit5'], ['flour3', 'cream1', 'fruit6'], ['flour3', 'cream2', 'fruit1'], ['flour3', 'cream2'

**Q1 Generate all permutations**

Given a collection of distinct integers, return all possible permutations.



```
Example:
input: [1,2,3]
output: [[1,2,3], [1,3,2], [2,1,3], [3,1,2], [3,2,1]]

Solving:
stage1: pick one from [1,2,3], e.g. 1
stage2: pick one from [2,3]
```






In [0]:
# M1
def bt(perms, perm, nums):
  if len(perm) == len(nums): 
    perms.append(perm[:])    # making a copy of perm, object reference
    return
  for i in nums:
    if i not in perm: # compatibility test
      perm.append(i)
      bt(perms, perm, nums)
      perm.pop()
  
def GetAllPerms(nums):
  perms = [] #store all possible permutations
  perm = []
  bt(perms, perm, nums)
  return perms

print(GetAllPerms([1,2,3]))

[[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]


In [0]:
# M2 Divide and conquer
# Assume we already solved n-1 elements

def permute(nums):
  res = [[]]
  for num in nums:
    res = [perm[:i] + [num] + perm[i:] for perm in res for i in range(len(perm)+1)]
  return res

print(permute([1,2,3]))

[[3, 2, 1], [2, 3, 1], [2, 1, 3], [3, 1, 2], [1, 3, 2], [1, 2, 3]]


**Q1.2 Permutations with duplicates**

Given a collection of integers that might contain duplicates, return all possible permutations

Input: [1,2,2]

Output: [[1,2,2], [2,1,2], [2,2,1]]

* Compatibility testing conditions are changed
 * note that [1, 2a, *] and [1, 2b, *] will be duplicated
 * at the same stage, we should not pick the same element more than once --> need to store all the decisions we make at the current stage --> set is preferred (O(1)) vs list (O(n))

In [0]:
def bt(perms, perm, nums):
  if len(perm) == len(nums): 
    perms.append(perm[:])    
    return
  used = set
  for i in nums:
    if perm.count(num) < nums.count(num) and num not in used:
      used.add(num)
      perm.append(i)
      bt(perms, perm, nums)
      perm.pop()
  
def GetAllPerms(nums):
  perms, perm = [],[]
  bt(perms, perm, nums)
  return perms

print(GetAllPerms([1,2,3]))

NameError: ignored

**Q1.3 All Permutations I - String**

Given a string with no duplicate characters, return a list with all permutations of the characters.

Assume that input string is not null.

In [0]:
class Solution(object):
  def permutations(self, input):
    """
    input: string input
    return: string[]
    """
    # write your solution here
    perms, perm = [], ""
    self.bt(perms, perm, input)
    return perms
    
  def bt(self, perms, perm, input):
    if len(perm) == len(input):
      perms.append(perm)
      return
    for i in range(len(input)):      
      if perm.find(input[i]) == -1:
        perm += input[i]
        self.bt(perms, perm, input)
        perm = perm[:len(perm)-1]

**Q2 Generate all subsets**

[*,*,*]

Stage 1: only two possible choices: choose 1 or do not choose 1

Stage 2: only two possible choices: choose 2 or do not choose 2

All possible ways = 2 * 2 * 2 = 8\


```
input: nums = [1,2,3]
output:  
[
  [3],
  [1],
  [2],
  [1,2,3],
  [1,3],
  [2,3],
  [1,2],
  []
]
```


Key observations
1. We can build a valid subset incrementally
2. At a certain position, we are not forced to always choose an item.
3. When consider a candidate, only consider the ones haven't been considered before

In [0]:
# M1 Backtracking
def subsets(self, nums):
    answers, subset = [], []
    bt(answers, subset, 0, nums)
    return answers

def bt(answer, subset, current_position, nums):
  if current_position == len(nums):
    answers.append(subset[:])
    return
  # case 1: do not choose current number
  bt(answers, subset, current_position + 1, nums)
  # case 2: choose current number
  subset.append(nums[current_position])
  bt(answers, subset, current_position + 1, nums)
  subset.pop()
  return
  
# time: O(2^n)

In [0]:
# M2 Divide and Conquer
# The previous result contains all subsets that do not contain 3.

def subsets(nums):
  if not nums:
    return [[]]
  r = subsets(nums[:-1]) #subsets that do not contain the last element
  return r + [s + [nums[-1]]for s in r]

def subsets(nums):
  res = [[]]
  for num in nums:
    for i in range(len(res)):
      print("append", res[i]+[num])
      res.append(res[i]+[num])
  return res

subsets([1,2,3])

append [1]
append [2]
append [1, 2]
append [3]
append [1, 3]
append [2, 3]
append [1, 2, 3]


[[], [1], [2], [1, 2], [3], [1, 3], [2, 3], [1, 2, 3]]

**Q2.2 Subsets with duplicates**

Given a collection of integers that might contain duplicates, nums, return all possible subsets (the power set)

```
input: [1,2,2]
output: [[2], [1], [1,2,2], [2,2], [1,2], [1]]
```

* Similar to the combination problem, we can try to enforce an order to eliminate the duplicate situation
* sort first so that all same elements are together
* if for a given element at the current position we choose not to include, then all the same elements will also be excluded; [2,2] cannot resolve in [N, Y] --> the idea is choose 2 0 times, 1 time, 2 times...


In [0]:
def bt(answer, subset, C, N, nums):
  if C == N:
    answers.append(subset[:])
    return
  #case 1: choose current number
  subset.append(nums[C])
  bt(answers, subset, C+1, N, nums)
  subset.pop()
  #case 2: do not choose current number
  i = C + 1
  while i < len(nums) and nums[C] == nums[i]:
    i += 1
  bt(answers, subset, i, N, nums)
  return
  
class Solution(object):
  def subsets(self, nums):
    nums.sort()  #sort to facilitate
    answers = []
    bt(answers, [], 0, len(nums), nums)
    return anwers

**Q3 Generate all valid parentheses**

Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses.

For example, n = 3:
[ "((()))", "(()())". "(())()", "()(())", "()()()"]

Recursion:
1. Multi-stage problem: 6 decisions
2. Decisions at each stage: two possible choices "(", ")"
3. For any position, is all these two choices ok? Compatibility test
 a. if # ( = # ) --> cannot put )
 b. if # ( = n --> cannot put (

[*,*,*,*,*,*]

In [0]:
def bt(answers, sequence, l, r):
  # l, r = number of remaining left, right parenthesis
  if r == 0:
    answers.append(''.join(sequence))
    return
  if l > 0:
    sequence.append('(')
    bt(answers, sequence, l - 1, r)
    sequence.pop()
  if l < r:       # note it is not else if
    sequence.append(')')
    bt(answers, sequence, l, r - 1)
    sequence.pop()
  return

class Solution(object):
  def generateParenthesis(self, n):
    answers, sequence = [], []
    bt(answers, sequence, n, n)
    return answers

# time: O(2^2n)

In [0]:
# Backtracking style 2
def bt(seqs, seq, n):
  if len(seq) == 2 * n:
    seqs.append(seq[:])
    return
  if seq.count('(') < n:
    seq.append('(')
    bt(seqs, seq, n)
    seq.pop()
  if HasMachingLeft(seq):
    seq.append(')')
    bt(seqs, seq, n)
    seq.pop()

def HasMatchingLeft(seq):
  s = []
  for c in seq:
    if c == '(':
      s.append(c)
    else:
      s.pop()
  return len(s) > 0

In [0]:
# M2 Divide and conquer
# Assume S, A, B are valid parenthesis sequence, then
# 1. (S) is a valid sequence
# 2. A + B is a valid sequence

def impl(n):
  if n == 0:
    return set([''])
  return set(['(' + s + ')' for s in impl(n-1)]) | set([s1+s2 for k in range(1,n) for s1 in impl(k) for s2 in impl(n-k)])

class Solution(object):
  def generateParenthesis(self, n):
    return list(impl(n))

**Q4 N Queens problem**

Need to place n queens on an nxn chessboard such that no two queens attach each other

Example: input 4

Output: 2 solutions

[".Q..", "...Q", "Q...", "..Q."]
["..Q.", "Q...", "...Q", ".Q.."]


M1 
* 16 stages
* each stage decision: put or not

M2
* 4 stages (each stage one row)
* each stage decision: 

Observations
1. All queens will be on different rows
2. We could find result by putting queens row by row 
3. Assume we have put queens to previous k rows and the ith queen on ith row is put in c col, when we consider k + 1 row, we need to make sure the queen we put will not be on the same column and diagonals as the previous ones
4. For diagonal testing, |cj-ci| != j - 1
5. For column testing ci != cj  


In [0]:
def is_compatible(previous_columns, current_column, current_row):
  for previous_row in range(current_row):
    if previous_columns[previous_row] == current_column or current_row - previous_row == abs(previous_columns[previous_row] - current_column):
      return False
    return True

def bt(answers, positions, row, n):
  # positions: a list such that positions[i] means the column we will place the chass at ith row
  # row: an intger indicate the current row
  # n: total number of rows
  if row == n:
    answers.apend(['.' * p + 'Q' + '.' * (n-1-p) for p in positions])
    return
  for column in range(n):
    if is_compatible(positions, column, row):
      positions.append(column)
      bt(answers, positions, row + 1, n)
      positions.pop()
  return

class Solution(object):
  def solveNQueens(self, n):
    answers, positions = [], []
    bt(answers, positions, 0, n)
    return answers

**Q5 Combinations**

Given two integers n and k, return all possible combinations of k numbers out of 1 ... n

```
Input: n = 4, k = 2

Output: [[2,4], [3,4], [2,3], [1,2],[1,3],[1,3]]
```


In [0]:
'''
M1
* Multistage decision problem. We have k stages
* What will be the possible candidates for a certain stage? (1,4]

Problem: [1,2] is same as [2,1]
Solution: enforce a certain generation order when you build the result
We only pick numbers that is larger than the number we picked from the previous stage
'''

def bt(combs, comb, index, n, k, lower_bound):  # use lower bound
  if k == 0:
    combs.append(comb[:])
    return
  for num in range(lower_bound, n + 1):
    comb.append(num)
    bt(combs, comb, index + 1, n, k - 1, num + 1) # index + 1 is the next node
    comb.pop()
  
def gen_combs(n, k):
  combs, comb = [], []
  index = 0
  bt(combs, comb, 0, n, k)
  return combs


In [0]:
'''
M2 
* Multistage decision problem. We have n stages
* What will be the possible candidates for a certain stage? [*, *] The numbers not chosen before. The range of the stage K + 1 will start from the number we pick at stage K+1 -> avoids duplicates such as [1,2] and [2,1]
 * stage1: pick a number from 1~4 pick 1
 * stage2: pick a number from 2~4
* Compatibility test? We don't need 
'''

**Q6 Combinations of Coins**

Given a number of different coins (1 cent, 5 cents, 25 cents), get the possible ways to pay a target number of cents. {0,1,2,3,4} correspond to {25, 10, 5, 2,1}


```
input: coins = {2,1}, target = 4
output: 
[0,4] (4 cents = 0*2 cents + 4*1 cent)
[1,2] (4 cents = 1*2 cents + 2*1 cents)
[2,0] (4 cents = 2*2 cents + 0*1 cent)
```


In [0]:
'''
Picture a recursion tree of n levels, corresponding to n types of coins
Each level, decide how many coins we need
'''
def bt(coin_combs, coin_comb, target, coins):
  if len(coin_comb) == len(coins): # already reach the bottom
    if target == 0: # already enough money
      coin_combs.append(coin_comb[:])
    return
  # try a coin with 0 or more times
  for times in range(target // coins[len(coin_comb)] + 1):
    coin_comb.append(times)
    bt(coint_combs, coin_comb, target - coins[len(coin_comb)] * times, coins)
    coin_comb.pop()

def gen_all_coin_combs(coins, target):
  coin_combs, coin_comb = [], []
  bt(coin_combs, coin_comb, target, coins)
  return coin_combs

**Q7 Factor Combinations**

Given a number, generate all combinations of its factors
* may assume n is always positive
* factors should be greater than 1 and less than n

```
input: 37
output: []

input: 12
output: [[2,6], [2,2,3],[3,4]]
```

* Multistage problem
 * Stage 1: find the first factor f (n%f==0), range [2, n]. Though, for n = f*b, f is always < sqrt(n)
 * Stage 2: find the second factor f2, range[]
* Similar to coinds combination
* We need to avoid the generation of a single combination multiple times. For example, [2,2,3] and [2,3,2] are the same. We can achieve this by forcing the way we choose the factor -> the factor for the current stage should be >= the factor we choose for the last stage



In [0]:
'''
for factor in range(2, n):
  if n & factor == 0:
    for factor2 in range(factor1, ):
      if (n / factor) 
'''

# S1 (slow, 21 cases for 2s)
def bt(answers, comb, n):
  if n == 1 and len(comb) > 1:  # generate all possible factor combs for n
    answers.append(comb[:])
    return
  for f in range(2 if not comb else comb[-1], n+1):
    if n%f == 0:
      comb.append(f)
      bt(answers, comb, n/f)
      comb.pop()

In [0]:
#S2.1 improved （21ms for 21 cases)
'''
T = f1 * f2, assuming f1 <= f2
then f1 <= sqrt(T)
'''

import math
def bt(answers, comb, n):
  if len(comb) > 0:  # every stage is a valid solution, doesnt have to reach the bottom
    answers.append(comb + [n])  # no return because want to continue factoring
  for f in range(2 if not comb else comb[-1], int(math.sqrt(n)) + 1):
    if n % f == 0:
      comb.append(f)
      bt(answers, comb, n/f)
      comb.pop()
      

In [0]:
# S2.2 Simplified
def bt(answers, comb, s, n):
  while s * s <= n:
    if n % s == 0:
      answers.append(comb + [s, n / s]) #union
      bt(answers, comb + [s], s, n / s)
    s += 1
      
class Solution(object):
  def getFactors(self, n):
    answers = []
    bt(answers, [], n)
    return answers

###DFS Depth First Search

**DFS**

Concept - maze
* Start with one direction, remember all decisions along the way, until you hit a dead end, go back to the last junction then make a new decision to try a different route 不撞南墙不回头
* Recursively explore the graph, backtracking as necessary

DFS is great for edge classifications
* Tree edge = normal edge during DFS process. The whole DFS will give you a DFS spanning tree of the input graph; connects all vertices of the graph
* Forward edge = edge will point from a node to its descendant in the DFS spanning tree 后代到祖先
* Backward edge = edge that will point from a node to its ancestor in the DFS spanning tree 祖先到后代
* Cross edge = between tow nodes that reside in two non-ancestor related subtrees

Does a graph have cycle?
* If there is backward edge, yes
* If there is directed


In [0]:
#pseudo code

#从一个节点能走到的所有的点
#recursively visit every reachable vertex from s that are still not being visited
DFS(graph, visited, s):
  for u in graph.neighbors(s):
    if u not in visited:
      visited.add(u)
      dfs(graph, visited, u)

#尝试从不同点出发，graph may not be all connected
DFS_All(graph):
  visited = set() 所有dfs分享visited
  for v in graph:
    if v not in visited:
      visited.add(v)
      dfs(graph, visited, v)
  
# Time: O(V+E)
#  looking at each neighbor is equivalent to each edge
# Space: O(V+E) for adjacency list; note this is a lower bound
#  first, using adjacency list to represent a graph, the representation takes V+E spaces


**Q1 Graph Valid Tree?**

Given n nodes labeled from 0 to n-1 and a list of undirected edges (each edge is a pair of nodes), write a function to check whether these edges make up a valid tree

input: n=5, edges = [[0,1],[0,2],[0,3],[1,4]]

output: true

input: n=5, edges = [[0,1], [1,2], [2,3], [1,3],[1,4]]

output: false


Note: you can assume that noduplicate edges will appear in edges. All edges are undirected

A graph is a tree iff
* no cycle
* connected

For undirected graph, what type of edge could you have
* tree edge
* backward edge

In [0]:
'''
Graph is a map -> dictionary (given a node, needs to know what other nodes are edged)
For a vertex, need to be able to iterate through all its neighbors
  
#Start from vertex 0, do DFS.
   (1) if found backward edge, then not a tree
       backward edge occurs when node is (a) visited and (b) it is not the parent
   (2) otherwise, check whether all the nodes have been visited
 
#Pseudo code
from collections import defaultdict
graph = defaultdict(list)   #if a key does not exist, default to a 
for edge in edges:  #for undirected graph, A is B is neighbor, B is A's as well
  graph[edge[0]].append(edge[1])
  graph[edge[1]].append(edge[0])
'''

#Template to build an undirectde graph
DFS(graph, visited, parent, s):
  for u in graph.neighbors(s):
    #recursively visited every reachable vertex. True if backward edge is found
    if u not in visited:
      visited.add(u)
      if dfs(graph, visited, u):
        return True
    elif u != parent:
      return True
  return False


#Solution - undirected graph
from collections import defaultdict

def has_cycle(graph, visited, parent, u): #v being a neighbor of u
  visited.add(u)
  for v in graph[u]:
    if v != parent:
      if v in visited or has_cycle(graph, visited, u, v):
        return True
  return False
  
class Solution(object):
  def validTree(self, n, edges):
    """
    :type n: int
    :type edges: List[List[int]]
    :rtype: bool
    """
    visited = set()
    graph = defaultdict(list)
    for edge in edges:
      graph[edge[0]].append(edge[1])
      graph[edge[1]].append(edge[0])
    return not has_cycle(Graph, visited, -1, 0) and len(visited) == n #all nodes are connected 
  

**Q2 Does a directed graph have cycle?**


For directed graph, what type of edge could you have
* tree edge
* backward edge
* forward edge
* cross edge



In [0]:
'''
Since directed graph can have cross or forward edge, just using visited criteria
cannot determine if it is a backward edge

3 states for a node;
- not visited
- visiting (not all its neighbors are visited)
- visited

(1) Backward edge - u -> v is backward edge iff v is in a visiting state
    In other words, v still have edges not visited and may go back to u
    
Using dictionary

'''

#Solution - directed graph
from collections import defaultdict

def dfs(graph, visit_status, u): 
  # 0 represents we are currently still visiting a node
  # 1 represents we are done with a node
  visit_status[u] = 0
  for v in graph[u]:               
    if v not in visit_status:  #if not visited, recursion
      if dfs(graph, visit_status, v):
        return True
    elif visit_status[v] == 0: #if visiting, cycle
      return True
  visit_status[u] = 1 #mark visited before exit
  return False

def has_cycle(graph):
  visit_status = {}
  for v in graph:
    if v not in visit_status and dfs(graph, visit_status, v):
      return True
  return False
  

**Q3 Is Graph Bipartite?**

Given an undirected graph, return true if and only if it is bipartite

Recall that a grpah is bipartite if we can split it's set of nodes into two independent subsets A and B such that every edge in the graph has one node in A and another node in B

Bipartite conditions
* has a cycle
* the cycle's length is not an odd number

In [0]:
'''
input: [[1,2,3], [0,2], [0,1,3], [0,2]]
output: false
0--1
|\ |
| \}
3--2

output: true
0--1
|  |
|  }
3--2

3 states for a node:
- 用色看环边的奇偶性
'''

def can_color(graph, colors, u, color):
  color[u] = color
  for v in graph[u]:
    #u染成color, 所有从u出发的节点染成-color
    if color[v] == color or (not colors[v] and not can_color(Graph, colors, v, -color)):
      return False
  return True  
  
class Solution(object):
  def isBipartite(self, graph):
    """
    :type graph: List[List[int]]
    :rtype: bool
    """
    colors = [0]*len(graph)
    for v in range(len(graph)):
      if not colors[v] and not can_color(graph, colors, v, 1):
        return False
    return True

**Topological Sort**

Topological ordering of a DAG = a linear ordering of its vertices such that for every directed edge {u, v}, u comes before v in the ordering. This can be used to simulate dependency graph

Example:
* each vertex represents a task
* each edge represents a dependency between twotasks, if there is an edge from A to B, then B can only be executed after finishing A
* Topolofical ordering: 3->5->7->8->1...

How
* Leveraging property of DFS
* Append each DFS to the final result list before return 
* After visiting every node, reverse that list and it will be the correct answer. Why? the last one will be the one done latest, which have dependency on other nodes, so we want to start with the earliest, which do not have dependency on the later ones
* Note that there might be many solutions. As long as the dependency order is statisfied, the solution is correct
 * if we visit u before v, for sure we will visit v before u finishes
 * if we visit v before u, then we will finish v before even visiting u since graph is acryclic



**Q4 Course Schedule II**

There are a total of n courses you have to take, labeled 0 to n-1.

Some courses may have prerequisites. Given [0, 1], you have to take course 1 before taking course 0.

Given the total number of courses and a list of prerequisite pairs, return the ordering of courses you should take.

How
* Topolofical ordering
* Need to check if it is impossible to finish all courses

Input: 2, [[1, 0]]

Output: [0, 1]

Input: 4, [[1,0], [2,0], [3,1],[3,2]]

Output: [0,1,2,3] or [0,2,1,3]





In [0]:
'''
Note that [1,0] results in 1 -> 0, which we do not need to reverse the result
We can directly use this ordering and not reverse at the end.

'''
from collections import defaultdict

class Solution(object):
  def findOrder(self, numCourses, prerequisites):
    """
    :type numCourses. int
    :type prerequisites: List[List[int]]
    :rtype: List[int]
    """
    graph = defaultdict(list)
    for p in prerequisites:
      graph[p[0]].append(p[1])
    courses = [] #final result
    visited = [-1]*numCourses

    def dfs(u):
      visited[u] = 0
      for v in graph[u]:
        if visited[v] == 0 or (visited[v] == -1 and not dfs(v)):
          return False
      visited[u] = 1
      courses.append(u)
      return True
    for u in range(numCourses):
      if visited[u] == -1 and not dfs(u):
        return []
    return courses

**Graph Summary**

In [0]:
'''
MetaGraphSearchAlgorithm(graph, s):
  #systematically explore all vertices that are reachable from s
  put s in bag as well as mark s accordingly
  while bag is not empty:
    extract a node from the bag
    for neighbor in graph.neighbors(node):
      if neighbor is not marked:
        put neighbor in bag and mark neighbor
     
Bag implemented as queue --> BFS
Bag implemented as stack --> DFS
Bag implemented as a priority queue --> best first search
'''