# Technical Interview Practice - Python

August 2017, by Jude Moon

# Practice Overview

For this practice, I am given [five technical interviewing questions](https://classroom.udacity.com/nanodegrees/nd002/parts/19280355-b835-4ce4-b867-2c2e2e85d0a0/modules/07cc5d99-f81d-45df-af3b-9206dca1739d/lessons/7736707697239847/concepts/78912813390923) on a variety of topics covered in the technical interviewing course. I answer each one in Python and explain the efficiency of the code and the design choices behind it.

***

# Question 1.
Given two strings s and t, determine whether some anagram of t is a substring of s. For example: if s = "udacity" and t = "ad", then the function returns True. Your function definition should look like: question1(s, t) and return a boolean True or False.

## Understand the Question:
"Anagram" means a word, phrase, or name formed by **rearranging** the letters of another, using each of the original letters exactly **once**. With s = "udacity", the input t = "aad" should return False because t needs "a" twice while s contains it only once. With s = "apple" and t = "pp", the function should return True because s contains "p" twice, so each letter of s is still used only once.

Some edge cases, such as an empty string or null, need to be defined to return error messages.
- return an error if either argument is not a string.
- return an error if the length of the subject string (the first argument) is zero or smaller than that of the anagram string (the second argument).

## Answer Code:

In [1]:
from time import time

# helper procedure to search character list of t in the character list of s and
# if any character of t list is not in the s list, return False
# note: this only checks that s contains the letters of t often enough;
# it does not require those letters to be adjacent in s
def is_anagram(s, t):
    
    # convert all characters to lower case
    s = s.lower()
    t = t.lower()
    
    # convert string to list; each character of the string as element of the list
    s_list = list(s)
    t_list = list(t)
    
    # return False as soon as any character of t is not found in s
    for char in t_list:
        if char not in s_list:
            return False
        
        s_list.remove(char) # to prevent from using the same letter more than once, 
                            # remove the letter from the list
    return True

# main procedure
def question1(s, t):
    
    # if s is not string, return error message
    if type(s) != str:
        return "Error: The first argument is not string!"

    # if t is not string, return error message
    if type(t) != str:
        return "Error: The second argument is not string!"
    
    # if the number of characters of s is zero or smaller than that of t, return error message
    if len(s) == 0 or len(s) < len(t):
        return "Error: The string length of the first argument needs to be greater than that of the second argument!"
    
    # if t is empty string, the answer should always be True
    if len(t) == 0:
        return True
    
    if is_anagram(s, t):
        return True
    
    return False

### Test Case 1-1:

In [None]:
s1 = "udacity"
t1 = "ad"
question1(s1, t1)

In [None]:
t2 = "cityuda"
question1(s1, t2)

### Test Case 1-2:

In [None]:
# Empty string for anagram
t3 = ""
question1(s1, t3)

In [None]:
# Null anagram
t4 = None
question1(s1, t4)

### Test Case 1-3:

In [None]:
start = time()
t5 = "uuda"

print question1(s1, t5)
print "\nThis took %.8f seconds\n" %(time() - start)

In [None]:
start = time()
s2 = "University"
t6 = "universe"

print question1(s2, t6)
print "\nThis took %.8f seconds\n" %(time() - start)

In [None]:
start = time()
s3 = "udacity"*1000
t7 = "ad"*1000

print question1(s3, t7)
print "\nThis took %.8f seconds\n" %(time() - start)

## Explanation:

I chose a list as the data structure, treated as an indexed array. The Python built-in function list() converts a string into a list with one character per element. The for-loop iterates over each character of the anagram (t) list, and the if-statement checks whether that character appears in the subject (s) list. Because both the membership test and remove() scan the s list, each iteration costs up to len(s) steps, so the worst-case time is **O(len(t) \* len(s))**. The two lists take **O(len(s) + len(t))** space. Note that the helper only checks that s contains every letter of t often enough; it does not check that those letters sit next to each other in s, so an input such as s = "udacity" and t = "uy" also returns True.
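
The question asks for an anagram of t that appears as a contiguous substring of s. One common way to check contiguity is to slide a window of length len(t) over s and compare letter counts. The following is a minimal sketch under that assumption, not the notebook's method; the function name anagram_in_substring is hypothetical and input validation is left out:

```python
from collections import Counter


def anagram_in_substring(s, t):
    """Return True if some anagram of t is a contiguous substring of s."""
    s, t = s.lower(), t.lower()
    if len(t) == 0:
        return True
    if len(t) > len(s):
        return False

    need = Counter(t)               # letter counts required by t
    window = Counter(s[:len(t)])    # letter counts of the first window
    if window == need:
        return True

    for i in range(len(t), len(s)):
        window[s[i]] += 1           # letter entering the window on the right
        left = s[i - len(t)]        # letter leaving the window on the left
        window[left] -= 1
        if window[left] == 0:
            del window[left]        # drop zero counts so equality stays exact
        if window == need:
            return True
    return False
```

Each step updates two counts and compares two small dictionaries, so for a fixed alphabet the running time grows roughly linearly with len(s).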

***

# Question 2.
Given a string a, find the longest palindromic substring contained in a. Your function definition should look like question2(a), and return a string.


## Understand the Question:

"Palindrome" means a word, phrase, number, or other sequence of characters that reads the same backward as forward, such as madam or racecar. The question asks for the longest palindromic substring. For example, the procedure with the input 'ababa' will return 'ababa', not 'aba'.

Some edge cases, such as an empty string or null, need to be defined.
- return the original input if the length of the string is smaller than 2.
- return an error if the argument is not a string.

## Answer Code:

In [None]:
# helper procedure to find palindromes by looking at all
# possible substrings and checking them individually
def longest_palindrome(a):
    
    a = a.lower() # convert all characters to lower case
    
    longest = ""
    for i in range(len(a)):
        for j in range(i + 1): # include j == i so single characters count
            substring = a[j:i + 1] # get every possible substring
            if substring == substring[::-1]: # [::-1] reverses the string
                if len(substring) > len(longest):
                    longest = substring
    
    # a non-empty string always contains a one-character palindrome
    return longest


# main procedure
def question2(a):
    # if a is not string, return error message
    if type(a) != str:
        return "Error: a not string!"
    
    # if the length of a is zero or smaller than 2, return a as it is
    if len(a) < 2:
        return a
    
    return longest_palindrome(a)
        

### Test Case 2-1:

In [None]:
question2('ABCBAbcba')

In [None]:
question2("AbcbaI4ojajo4iaj8aoa8jA")

### Test Case 2-2:

In [None]:
# Null string case
question2(None)

In [None]:
# Empty string case
question2("")

In [None]:
# One letter string case
question2("a")

### Test Case 2-3:

In [None]:
start = time()
a1 = "udacity"
print question2(a1)
print "\nThis took %.8f seconds\n" %(time() - start)

In [None]:
start = time()
a2 = "udaaaaacity"
print question2(a2)
print "\nThis took %.8f seconds\n" %(time() - start)

In [None]:
start = time()
print question2(a2*1000)
print "\nThis took %.8f seconds\n" %(time() - start)

## Explanation:

I chose the string itself as the data structure, treated as an indexed array. Substrings are created by slicing, and each one is reversed with [::-1] for comparison. Two nested for-loops enumerate the substrings, which gives O(n<sup>2</sup>) candidates for n = len(a), and slicing, reversing, and comparing each candidate takes up to O(n) more. The output starts as an empty string, is updated whenever a longer palindrome is found, and is returned at the end of the procedure. The worst-case time is therefore **O(n<sup>3</sup>)**, and the extra space is **O(n)** for the current substring, its reverse, and the longest palindrome found so far.
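
For comparison, a well-known alternative is to treat every position (and every gap between two positions) as the center of a palindrome and expand outward while the characters match. The following is a minimal sketch of that idea, not the notebook's method; the function name longest_palindrome_centered is hypothetical, and it lower-cases the input to mirror the helper above:

```python
def longest_palindrome_centered(a):
    """Return the longest palindromic substring of a (case-insensitive)."""
    a = a.lower()
    if len(a) < 2:
        return a

    best_start, best_len = 0, 1
    for center in range(len(a)):
        # try an odd-length center (center, center)
        # and an even-length center (center, center + 1)
        for left, right in ((center, center), (center, center + 1)):
            while left >= 0 and right < len(a) and a[left] == a[right]:
                left -= 1
                right += 1
            length = right - left - 1   # size of the last matching span
            if length > best_len:
                best_start, best_len = left + 1, length
    return a[best_start:best_start + best_len]
```

There are about 2n centers and each expansion takes at most n steps, so this runs in O(n<sup>2</sup>) time without building every substring.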

***

# Question 3.
Given an undirected graph G, find the minimum spanning tree within G. A minimum spanning tree connects all vertices in a graph with the smallest possible total weight of edges. Your function should take in and return an adjacency list structured like this:

>{'A': [('B', 2)],

> 'B': [('A', 2), ('C', 5)], 

> 'C': [('B', 5)]}
 
Vertices are represented as unique strings. The function definition should be question3(G)

## Understand the Question:

A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight.
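
The definition above can be turned into a quick sanity check for any candidate result. The following is a minimal sketch, assuming the adjacency-list format from the question with no self-loops; the helper names total_weight and is_spanning_tree are hypothetical. It checks only that the candidate spans the graph without cycles and reports its weight; it does not prove that the weight is minimal:

```python
def total_weight(tree):
    """Sum the edge weights, counting each undirected edge once."""
    return sum(w for v, neighbours in tree.items()
               for u, w in neighbours if v < u)


def is_spanning_tree(G, tree):
    """Check that tree covers all vertices of G, is connected, and has no cycle."""
    if set(tree) != set(G):
        return False
    edge_count = sum(1 for v, neighbours in tree.items()
                     for u, _ in neighbours if v < u)
    if edge_count != len(G) - 1:    # a tree on V vertices has V - 1 edges
        return False

    # depth-first search to confirm every vertex is reachable
    start = next(iter(tree))
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for u, _ in tree[v]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == len(G)
```

A connected graph with exactly V - 1 edges cannot contain a cycle, which is why the edge count and the reachability check together are enough.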

Kruskal's algorithm builds the spanning tree by adding edges one by one, in order of increasing weight, and skipping any edge that would create a cycle.

Pseudocode

KRUSKAL(G):
1. A = ∅
2. foreach v ∈ G.V:
3.    MAKE-SET(v)
4. foreach (u, v) in G.E ordered by weight(u, v), increasing:
5.    if FIND-SET(u) ≠ FIND-SET(v):
6.       A = A ∪ {(u, v)}
7.       UNION(u, v)
8. return A

Some edge cases, such as an empty dictionary or null, need to be defined.
- return the original input if the dictionary has fewer than 2 vertices.
- return an error if the argument is not a dictionary.


## Answer Code:

In [None]:
# declare global variables 'parent' and 'rank' which are used in all procedures
# parent stores the parent of each node; following parents leads to the root of its set,
# and entries are updated when two disjoint sets are unified
# rank is an upper bound on the height of each root's tree and decides which root
# becomes the parent when two sets are unified
parent = {}
rank = {}

# helper procedure to find the root node which this vertex belongs, 
# using recursive definition; 
# base case is parent[vertex] == vertex, where itself is the root
# recursive case is parent[vertex] != vertex
def find_root(vertex):
    if parent[vertex] != vertex:  
        parent[vertex] = find_root(parent[vertex])
    return parent[vertex]

# helper procedure to unify the sets by comparing their root nodes and rank 
def union(vertex1, vertex2):
    root1 = find_root(vertex1)
    root2 = find_root(vertex2)
    if rank[root1] > rank[root2]:
        parent[root2] = root1
    else:
        parent[root1] = root2
        if rank[root1] == rank[root2]: 
            rank[root2] += 1

# main procedure
def question3(G):
    
    # if G is not dictionary, return error message
    if type(G) != dict:
        return "Error: G is not dictionary!"
    
    # if the length of G is zero or smaller than 2, return G as it is
    if len(G) < 2:
        return G
    
    ### perform kruskal algorithm to find MST ###
    
    # step1: initialize disjoint sets  
    vertices = G.keys() # collect vertices
    for vertex in vertices:
        parent[vertex] = vertex # the parent value of the key is set to itself
        rank[vertex] = 0 # the rank value is set to 0, meaning 0 union occurrence
    
        
    # step2: get list of unique edges
    # each edge consists of 3 element-tuple, (vertex1, vertex2, edge_weight)
    # vertex1 is always alphabetically earlier than vertex2
    edges = []
    for vertex in vertices:
        for connections in G[vertex]:
            if vertex < connections[0]:
                edges.append((vertex, connections[0], connections[1]))
                
    # step3: sort edges by increasing edge_weight
    edges = sorted(edges, key=lambda x : x[2])
    
    # step4: find and compare roots of vertex1 and vertex2,
    # and if the roots are different, unify disjoint sets containing vertex1 and vertex2,
    # update rank of the roots, and then add the edge to MST set
    MST = set() # a set is an unordered collection with no duplicate elements
    for edge in edges:
        if find_root(edge[0]) != find_root(edge[1]): 
            union(edge[0], edge[1])
            MST.add(edge)
    
    # MST set into the dictionary of adjacency list
    output = {}
    for node in MST:
        if  node[0] in output:
            output[node[0]].append((node[1], node[2]))
        else:
            output[node[0]] = [(node[1], node[2])]
        if node[1] in output:
            output[node[1]].append((node[0], node[2]))
        else:
            output[node[1]] = [(node[0], node[2])]
            
    return output

### Test Case 3-1:

In [None]:
G1 = {'A': [('B', 2)],
      'B': [('A', 2), ('C', 5)], 
      'C': [('B', 5)]}

question3(G1) 

### Test Case 3-2:

In [None]:
# Null dictionary case
G3 = None
question3(G3)

In [None]:
# Empty dictionary case
G4 = {}
question3(G4) # expect {}

In [None]:
# One node dictionary case
G5 = {'A': [('A', 2)]}
question3(G5) # expect G5 returned unchanged

### Test Case 3-3:

In [2]:
import pprint
start = time()
G2 = {'A': [('B', 1), ('C', 7)],
     'B': [('A', 1), ('C', 5), ('D', 3), ('E', 4)],
     'C': [('A', 7), ('B', 5), ('D', 6)],
     'D': [('B', 3), ('C', 6), ('E', 2)],
     'E': [('B', 4), ('D', 2)]}

pprint.pprint(question3(G2))
print "\nThis took %.8f seconds\n" %(time() - start)

In [None]:
start = time()
G6 = {'A': [('B', 7), ('D', 5)],
      'B': [('A', 7), ('C', 8), ('D', 9), ('E', 7)],
      'C': [('B', 8), ('E', 5)],
      'D': [('A', 5), ('B', 9), ('E', 15), ('F', 6)],
      'E': [('B', 7), ('C', 5), ('D', 15), ('F', 8), ('G', 9)],
      'F': [('D', 6), ('E', 8), ('G', 11)],
      'G': [('E', 9), ('F', 11)]}

pprint.pprint(question3(G6)) 
print "\nThis took %.8f seconds\n" %(time() - start)

## Explanation:

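I chose dictionaries and lists as the data structures. The dictionaries map each vertex to its adjacency list, parent, and rank, and the lists hold the edges.

Time efficiency of each step:

(G is graph input; V is vertices; E is edges)

1. initialize disjoint sets, containing one for-loop; O(len(V))
2. get list of unique edges, containing two nested for-loops that visit every adjacency entry once; O(len(V) + len(E))
3. sort edges by weight; O(len(E) \* log(len(E)))
4. find and compare roots of vertex1 and vertex2, containing one for-loop over the edges; each find or union is close to constant time on average with path compression and union by rank, so roughly O(len(E))

Sorting dominates, so the overall time is O(len(E) \* log(len(E))). The extra space is O(len(V) + len(E)) for the parent and rank dictionaries and the edge list.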

***

# Question 4.
Find the least common ancestor between two nodes on a binary search tree. The least common ancestor is the farthest node from the root that is an ancestor of both nodes. For example, the root is a common ancestor of all nodes on the tree, but if both nodes are descendents of the root's left child, then that left child might be the lowest common ancestor. You can assume that both nodes are in the tree, and the tree itself adheres to all BST properties. The function definition should look like question4(T, r, n1, n2), where T is the tree represented as a matrix, where the index of the list is equal to the integer stored in that node and a 1 represents a child node, r is a non-negative integer representing the root, and n1 and n2 are non-negative integers representing the two nodes in no particular order. For example, one test case might be

>question4(

>           [[0, 1, 0, 0, 0],

>           [0, 0, 0, 0, 0],

>           [0, 0, 0, 0, 0],

>           [1, 0, 0, 0, 1],

>           [0, 0, 0, 0, 0]],

>           3, 1, 4)

and the answer would be 3.

## Understand the Question:

A binary tree is a tree data structure in which each node has at most two children, referred to as the left child and the right child. A binary search tree (BST) is a binary tree in which the key of every node in the right subtree is larger than the key of the current node and the key of every node in the left subtree is smaller than the key of the current node. The tree from the example given in the question looks like this:


![image](Capture2.JPG)

Notice that the nodes in the left subtree are '0' and '1', which are smaller than the root key '3', and the node on the right is '4', which is greater than the root key.

Steps for the solution:
1. get a list of ancestors for each node, including the node itself; one list for n1 and one list for n2.
2. compare the two lists and find the first common element.

Some edge cases, such as an empty list or null, need to be defined.
- return an error if the T argument is not a list or is null.
- return an error if the T argument is not a 2D array with row length = col length.
- return an error if the r, n1, and n2 arguments are not non-negative integers.
- return an error if the r, n1, and n2 arguments are not in the tree (T).
- return an error if the n1 or n2 argument is the root (r) of the tree (T).


## Answer Code:

In [49]:
import numpy as np

# helper procedure to get the immediate parent
def get_parent(tree, child):
    for i in range(len(tree)):
        if tree[i, child] == 1: # to access elements, use tree[row, col]
            return i # i is the parent of the child

# helper procedure to get a list of all ancestors including the node itself
def get_history(tree, root, child):
    history = [child] 
    while (child != root): 
        parent = get_parent(tree, child)
        history.append(parent) # collect parents each iteration
        child = parent # parent becomes child for the next iteration
    return history

# helper procedure to get the first appeared common element from two lists
def common_ancestor(n1_parents, n2_parents):
    for n1_parent in n1_parents:
        for n2_parent in n2_parents:
            if n1_parent == n2_parent:
                return n2_parent 

# main procedure
def question4(T, r, n1, n2):
    
    # if T is not list, return error message
    if type(T) != list:
        return "Error: T is not list!"
    
    # if T is not 2D array with same lengths of row and column, return error message
    if len(np.shape(T)) != 2 or np.shape(T)[0] != np.shape(T)[1]:
        return "Error: T is not 2D array with len(row) = len(col)!"
    
    # if r, n1 and n2 are not non-negative integers, return error message
    if type(r) != int or r < 0:
        return "Error: r not positive integer!"
    if type(n1) != int or n1 < 0:
        return "Error: n1 not positive integer!"
    if type(n2) != int or n2 < 0:
        return "Error: n2 not positive integer!"
    
    # if r, n1 and n2 are not nodes in the tree, return error message
    if r >= len(T):
        return "Error: r not in the tree!"
    if n1 >= len(T):
        return "Error: n1 not in the tree!"
    if n2 >= len(T):
        return "Error: n2 not in the tree!"
    
    # if n1 and n2 are r, return error message
    if n1 == r:
        return "Error: n1 cannot be r!"
    if n2 == r:
        return "Error: n2 cannot be r!"
    
    # convert T from list to numpy array to access col and row using tuple
    T = np.array(T)
    
    return common_ancestor(get_history(T, r, n1), get_history(T, r, n2)) 

### Test Case 4-1:

In [50]:
T1 = [[0, 1, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [1, 0, 0, 0, 1],
     [0, 0, 0, 0, 0]]

question4(T1, 3, 1, 4)

3

In [51]:
question4(T1, 3, 1, 0)

0

### Test Case 4-2:

In [53]:
# Null tree case
question4(None, 3, 1, 0)

'Error: T is not list!'

In [59]:
# Empty list tree case
question4([[]], 3, 1, 0)

'Error: T is not 2D array with len(row) = len(col)!'

In [58]:
# Tree with different col length and row length
T2 = [[0, 1, 0, 0, 0],
      [0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0],
      [1, 0, 0, 0, 1]]

question4(T2, 3, 1, 4)

'Error: T is not 2D array with len(row) = len(col)!'

In [73]:
# Negative integer for node index
question4(T1, 3, -1, -1)

'Error: n1 not positive integer!'

In [71]:
# Node index is not in the tree
question4(T1, 3, 1, 10)

'Error: n2 not in the tree!'

In [72]:
# Node index is same as root
question4(T1, 3, 3, 4)

'Error: n1 cannot be r!'

### Test Case 4-3:

![image](Capture3.JPG)

In [74]:
start = time()
T3 = [[0, 0, 0, 0, 0, 0, 0],
      [1, 0, 0, 1, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0],
      [0, 0, 1, 0, 1, 0, 0],
      [0, 0, 0, 0, 0, 0, 0],
      [0, 1, 0, 0, 0, 0, 1],
      [0, 0, 0, 0, 0, 0, 0]]

print question4(T3, 5, 0, 4)
print "\nThis took %.8f seconds\n" %(time() - start)

1

This took 0.00099993 seconds



## Explanation:

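I chose a list of lists as the data structure, converted to a NumPy array so that rows and columns can be indexed directly.

Time efficiency of each step (n is len(T); h is the height of the tree):

1. get a parent, containing one for-loop over a column; O(n)
2. get list of parents, containing one while-loop that calls step 1 once per level; O(n \* h), which is O(n<sup>2</sup>) in the worst case
3. find a common parent between two lists, containing two nested for-loops; O(h<sup>2</sup>), which is O(n<sup>2</sup>) in the worst case

The NumPy copy of T takes O(n<sup>2</sup>) space.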
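
The steps above do not use the BST ordering itself. As a minimal sketch of an alternative that does, one can walk down from the root and stop at the first node whose value lies between n1 and n2. This assumes, as the question allows, that both nodes are in the tree and the tree is a valid BST; the function name lca_bst is hypothetical and input validation is left out:

```python
def lca_bst(T, r, n1, n2):
    """Walk down from root r until the current value lies between n1 and n2."""
    low, high = min(n1, n2), max(n1, n2)
    current = r
    while not (low <= current <= high):
        if current > high:
            # both nodes are smaller: move to the left child,
            # i.e. the child of current with a smaller value
            current = next(c for c in range(current)
                           if T[current][c] == 1)
        else:
            # both nodes are larger: move to the right child,
            # i.e. the child of current with a larger value
            current = next(c for c in range(current + 1, len(T))
                           if T[current][c] == 1)
    return current
```

Each level scans part of one row of the matrix, so the walk costs O(n \* h) for n = len(T) and tree height h, and it works on the plain list of lists without NumPy.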


***

# Question 5.
Find the element in a singly linked list that's m elements from the end. For example, if a linked list has 5 elements, the 3rd element from the end is the 3rd element. The function definition should look like question5(ll, m), where ll is the first node of a linked list and m is the "mth number from the end". You should copy/paste the Node class below to use as a representation of a node in the linked list. Return the value of the node at that position.

class Node(object):
  def __init__(self, data):
    self.data = data
    self.next = None
    
    
## Understand the Question:

A linked list is a linear data structure where each element is a separate object. Each element (or node) of a list consists of two items: the data and a reference to the next node. The last node has a reference to null. The entry point into a linked list is called the head of the list.

## Answer Code:

In [75]:
# linked list node
class Node(object):
    def __init__(self, data):
        self.data = data
        self.next = None
   
# linked list 
class LinkedList(object):
    def __init__(self, head=None):
        self.head = head
        
    def append(self, new_node):
        current = self.head
        if self.head:
            while current.next:
                current = current.next
            current.next = new_node
        else:
            self.head = new_node

    def nthNodeFromLast(self, m):
        """
        Maintain two pointers - reference pointer and main pointer. Initialize both reference and main 
        pointers to head. First move reference pointer m nodes from head. Now move both pointers one 
        by one until reference pointer reaches end. Now main pointer will point to mth node from the end. 
        Return main pointer.
        """
        main_ptr = self.head
        ref_ptr = self.head 
     
        count = 0
        if(self.head is not None):
            while(count < m):
                if(ref_ptr is None):
                    print "ERROR: %d is greater than the no. of nodes in list" % (m)
                    return
  
                ref_ptr = ref_ptr.next
                count += 1
 
        while(ref_ptr is not None):
            main_ptr = main_ptr.next
            ref_ptr = ref_ptr.next
        
        return main_ptr
    
# main
def question5(ll, m):
    if ll:
        return ll.nthNodeFromLast(m)

# testcases
print "\nAnswer 5:"

# base testcase
# setup nodes
n1 = Node(1)
n2 = Node(2)
n3 = Node(3)
n4 = Node(4)
n5 = Node(5)

# setup LinkedList
ll = LinkedList(n1)
ll.append(n2)
ll.append(n3)
ll.append(n4)
ll.append(n5)

m = 3
answer = question5(ll, m) # expect 3
print "node in linked list that is {} steps from the end is {}".format(m, answer.data)

# edge testcase-1
m = 99
answer = question5(ll, m) # expect ERROR: 99 is greater than the no. of nodes in list

# edge testcase-2
ll = None
m = 3
answer = question5(ll, m) # expect None
print "node in linked list that is {} steps from the end is {}".format(m, answer)



Answer 5:
node in linked list that is 3 steps from the end is 3
ERROR: 99 is greater than the no. of nodes in list
node in linked list that is 3 steps from the end is None


In [76]:
class Node(object):
  def __init__(self, data):
    self.data = data
    self.next = None

def get_length(ll):
    # get the length of ll
    # also checking whether the linked list is circular
    # return -1 if the linked list is circular

    # length == 1
    if ll.next == None:
        return 1
    
    length_ll = 0
    current_node = ll
    current_node2 = ll.next
    while current_node != None and current_node != current_node2:
        current_node = current_node.next
        if current_node2 != None:
            current_node2 = current_node2.next
        if current_node2 != None:
            current_node2 = current_node2.next
        length_ll += 1

    if current_node == None:
        return length_ll
    else:
        return -1

def question5(ll, m):
    # make sure ll is a Node
    if type(ll) != Node:
        return "Error: ll not a Node!"

    # make sure m is a positive integer
    if type(m) != int:
        return "Error: m not an integer!"
    if m < 1:
        return "Error: m not a positive integer!"
    
    # get the length of ll
    length_ll = get_length(ll)

    # make sure ll is not circular
    if length_ll == -1:
        return "Error: circular linked list!"
        
    # make sure m is less than or equal to the length of ll
    if length_ll < m:
        return "Error: m greater than the length of ll!"
    
    # traverse to the last mth element
    current_node = ll
    for i in xrange(length_ll - m):
        current_node = current_node.next
        
    return current_node.data

def test5():
    n1, n2, n3, n4, n5 = Node(1), Node(2), Node(3), Node(4), Node(5)
    n4.next = n5
    n3.next = n4
    n2.next = n3
    n1.next = n2
    
    print "\nTesting 5"
    print "Edge case (ll not Node):", "Pass" if "Error: ll not a Node!" == question5(123, 111) else "Fail"
    print "Edge case (m > length of ll):", "Pass" if "Error: m greater than the length of ll!" == question5(n1, 6) else "Fail"
    print "Case (ll = n1 and m = 3):", "Pass" if 3 == question5(n1, 3) else "Fail" 
    n5.next = n1
    print "Case (circular linked list):", "Pass" if "Error: circular linked list!" == question5(n1, 3) else "Fail" 


In [77]:
test5()


Testing 5
Edge case (ll not Node): Pass
Edge case (m > length of ll): Pass
Case (ll = n1 and m = 3): Pass
Case (circular linked list): Pass


### Test Case 5-1:

### Test Case 5-2:

### Test Case 5-3:

## Explanation:
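
The final cell above measures the list with get_length and then walks length_ll - m steps from the head, so it makes two passes over the list, each O(n) in time, with O(1) extra space. As a minimal sketch of the single-pass two-pointer idea applied directly to the first node, assuming the list is not circular (the name mth_from_end and the choice to return None for an out-of-range m are hypothetical, not part of the notebook):

```python
class Node(object):
    def __init__(self, data):
        self.data = data
        self.next = None


def mth_from_end(head, m):
    """Return the data of the node m positions from the end, or None."""
    if head is None or m < 1:
        return None

    # move the lead pointer m nodes ahead of the head
    lead = head
    for _ in range(m):
        if lead is None:
            return None     # the list has fewer than m nodes
        lead = lead.next

    # advance both pointers until lead runs off the end;
    # trail is then m nodes from the end
    trail = head
    while lead is not None:
        lead = lead.next
        trail = trail.next
    return trail.data
```

This sketch would loop forever on a circular list, which is the case the cycle check in get_length guards against.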