# List-based Collections

## Lists

<p><b>Array</b></p>
<p>
<ul>
    <li>have indices; can access a particular item quickly ($O(1)$)</li>
    <li>insertion and deletion in the middle are expensive ($O(n)$) because later items must shift</li>
    <li>getting the length of a list in Python is $O(1)$ -- the list object stores its own length</li>
</ul>
<p><b>Linked List</b></p>
<p>
<ul>
    <li>no indices</li>
    <li>each element holds a pointer to the next element (and to the previous one in doubly-linked lists), or null in the last element</li>
    <li>insertion and deletion are $O(1)$ once you hold a reference to the neighbouring element -- just update pointers (finding that element is still $O(n)$)</li>
    <li>insert/delete gotcha -- update the pointers before dropping an element, or the rest of the list becomes unreachable</li>
    <li>append traverses the entire list, $O(n)$, unless a tail reference is kept; more often used as a stack with push/pop at the head</li>
</ul>
</p>

In [None]:
### Linked List Implementation ###

class Element(object):
    def __init__(self, value):
        self.value = value
        self.next = None
        
class LinkedList(object):
    def __init__(self, head=None):
        self.head = head
        
    def append(self, new_element):
        current = self.head
        
        if self.head:
            while current.next:
                current = current.next
            current.next = new_element
        else:
            self.head = new_element
            
    def get_element_at_position(self, position):
        """
        Get an element from a particular position.
        Assumes 1-based indexing.
        Returns None if position is not in list.        
        """
        
        current_position = 1
        current_element = self.head
        
        if position < 1:
            return None
        
        while current_element:
            if current_position == position:
                return current_element
            current_position +=1
            current_element = current_element.next
        return None   
        
    def insert(self, new_element, position):
        """
        Insert a new node at the given position.
        Assumes 1-based indexing.
        Inserting at position 3 means between the 2nd and 3rd elements.
        """
        
        previous_element = self.get_element_at_position(position-1)
        
        if previous_element:
            new_element.next = previous_element.next
            previous_element.next = new_element
            return
        elif position == 1:
            new_element.next = self.head
            self.head = new_element
    
    def delete(self, value):
        """Delete the first node with a given value."""
        
        previous_element = None
        current_element = self.head
        
        while current_element:
            if current_element.value == value:
                if previous_element:
                    previous_element.next = current_element.next
                    # detach the removed node so it no longer points into the list
                    current_element.next = None
                else:
                    self.head = current_element.next
                return
            else:
                previous_element = current_element
                current_element = current_element.next
                
    def insert_first(self, new_element):
        """Add element to 'top' of list (re: stack)"""
        
        new_element.next = self.head
        self.head = new_element
        return
    
    def delete_first(self):
        
        if self.head:
            deleted_element = self.head
            self.head = self.head.next
            deleted_element.next = None
        return


# Test cases

# Set up some Elements
e1 = Element(1)
e2 = Element(2)
e3 = Element(3)
e4 = Element(4)

# Start setting up a LinkedList
ll = LinkedList(e1)
ll.append(e2)
ll.append(e3)

# Test get_element_at_position
# Should be 3
assert ll.head.next.next.value == 3
# Should also be 3
assert ll.get_element_at_position(3).value == 3

# Test insert
ll.insert(e4,3)
# Should be 4 now
assert ll.get_element_at_position(3).value == 4

# Test delete
ll.delete(1)
# Should be 2 now
assert ll.get_element_at_position(1).value == 2
# Should be 4 now
assert ll.get_element_at_position(2).value == 4
# Should be 3 now
assert ll.get_element_at_position(3).value == 3

<p><b>Stacks</b></p>
<p>
<ul>
    <li>Last In, First Out</li>
    <li>adding to stack called 'push', removing called 'pop'; both $O(1)$</li>
    <li>could be implemented with various data structures, e.g. a linked list</li>
</ul>
</p>

In [None]:
### Stack Implementation (using a LinkedList) ###

# adds insert_first, delete_first methods to LinkedList class above

class Stack(object):
    def __init__(self,top=None):
        self.ll = LinkedList(top)

    def push(self, new_element):
        self.ll.insert_first(new_element)
        return
        
    def pop(self):
        element = self.ll.head
        self.ll.delete_first()
        return element
        
    
# Test cases
# Set up some Elements
e1 = Element(1)
e2 = Element(2)
e3 = Element(3)
e4 = Element(4)

# Start setting up a Stack
stack = Stack(e1)

# Test stack functionality
stack.push(e2)
stack.push(e3)
print stack.pop().value
print stack.pop().value
print stack.pop().value
print stack.pop()
stack.push(e4)
print stack.pop().value

<p><b>Queues</b></p>
<p>
<ul>
    <li>First In, First Out</li>
    <li>adding to queue called 'enqueue', removing called 'dequeue', looking at head element 'peek'; all $O(1)$</li>
    <li>could be implemented with various data structures, e.g. a linked list with an added reference to the tail</li>
</ul>
</p>
<p><b>Deque</b></p>
<p>
<ul>
    <li>double-ended queue; can enqueue or dequeue from either end</li>
    <li>could function like a stack or a queue</li>
    <li>collections.deque</li>
</ul>
</p>
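A short sketch of the deque notes above, using the standard library's `collections.deque`. The variable names are just for illustration.

```python
from collections import deque

d = deque([1, 2, 3])

# Use as a queue (FIFO): add on the right, remove from the left.
d.append(4)            # deque([1, 2, 3, 4])
first = d.popleft()    # removes and returns 1

# Use as a stack (LIFO): add and remove on the same end.
d.append(5)            # deque([2, 3, 4, 5])
last = d.pop()         # removes and returns 5

# The left end works symmetrically.
d.appendleft(0)        # deque([0, 2, 3, 4])
```

Both ends support $O(1)$ appends and pops, which a plain Python list only offers on the right.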
<p><b>Priority Queue</b></p>
<p>
<ul>
    <li>each element has a (numerical) priority assigned when added</li>
    <li>elements dequeued by (highest) priority; oldest element dequeued first in event of tie</li>
</ul>
</p>
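One possible way to get the priority-queue behaviour described above is to build on the standard library's `heapq`. The `PriorityQueue` class and its method names below are hypothetical, chosen for this sketch. Since `heapq` is a min-heap, the priority is negated, and an insertion counter breaks ties so that the oldest element comes out first.

```python
import heapq
import itertools

class PriorityQueue(object):
    """Illustrative helper: highest priority first, oldest first among ties."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # insertion order, used to break ties

    def enqueue(self, item, priority):
        # heapq pops the smallest tuple, so negate the priority
        # to make the largest priority come out first.
        heapq.heappush(self._heap, (-priority, next(self._counter), item))

    def dequeue(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

pq = PriorityQueue()
pq.enqueue("low", 1)
pq.enqueue("high-first", 5)
pq.enqueue("high-second", 5)
order = [pq.dequeue() for _ in range(3)]
# order is ["high-first", "high-second", "low"]
```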

In [None]:
### Queue implemented with Python list ###

class Queue:
    def __init__(self, head=None):
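        # Start empty when no head is given; [None] would count as one queued item.
        self.storage = [] if head is None else [head]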

    def enqueue(self, new_element):
        self.storage.append(new_element)

    def peek(self):
        try:
            return self.storage[0]
        except IndexError:
            return "Queue is empty"

    def dequeue(self):
        try:
            return self.storage.pop(0)
        except IndexError:
            return "Queue is empty"
            
# Setup
q = Queue(1)
q.enqueue(2)
q.enqueue(3)

# Test peek
# Should be 1
print q.peek()

# Test dequeue
# Should be 1
print q.dequeue()

# Test enqueue
q.enqueue(4)
# Should be 2
print q.dequeue()
# Should be 3
print q.dequeue()
# Should be 4
print q.dequeue()
q.enqueue(5)
# Should be 5
print q.peek()

# Searching and Sorting

<p><b>Binary Search</b></p>
<p>
<ul>
    <li>find an element in ordered array</li>
    <li>check middle element, repeat for the half the value would be in; $O(log(n))$</li>
</ul>
</p>

In [None]:
# You're going to write a binary search function.
# You should use an iterative approach - meaning
# using loops.
# Your function should take two inputs:
# a Python list to search through, and the value
# you're searching for.
# Assume the list only has distinct elements,
# meaning there are no repeated values, and 
# elements are in a strictly increasing order.
# Return the index of value, or -1 if the value
# doesn't exist in the list.

def binary_search(input_array, value):
    # err low
    mindex = 0
    maxdex = len(input_array) - 1
    
    while mindex <= maxdex:
        checking_index = (mindex + maxdex) // 2
        if input_array[checking_index] < value:
            mindex = checking_index + 1
        elif input_array[checking_index] > value:
            maxdex = checking_index - 1
        else:
            return checking_index
    
    return -1

test_list = [1,3,9,11,15,19,29]
test_val1 = 25
test_val2 = 15
print binary_search(test_list, test_val1)
print binary_search(test_list, test_val2)

<p><b>Recursion</b></p>
<p>
<ul>
    <li>Base case</li>
    <li>Increment/modify value (input)</li>
    <li>Call function inside, until base case met</li>
</ul>
</p>

In [None]:
"""
Implement a function recursively to get the desired
Fibonacci sequence value.
Your code should have the same input/output as the 
iterative code in the instructions.
"""

def get_fib(position):

    if position > 1:    
        return get_fib(position - 2) + get_fib(position - 1)
    elif position < 0:
        return -1
    else:
        return position
    
# Test cases
print get_fib(9)
print get_fib(11)
print get_fib(0)

<p><b>Bubble Sort</b></p>
<p>
<ul>
    <li>"Naive approach" comparing each element</li>
    <li>Compares pairs, swapping them if necessary</li>
    <li>Largest element in array bubbles to top with each iteration</li>
    <li>$(n-1)$ comparisons $\times$ $(n-1)$ iterations = $O(n^2)$</li>
    <li>some implementations know not to compare the last i elements, where i = # of completed iterations</li>
    <li>$\Omega(n)$ best case (already sorted), provided the sort stops after a pass with no swaps</li>
    <li>in-place sort; constant space complexity</li>
</ul>
</p>
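A minimal bubble sort sketch covering the points above. The name `bubble_sort` is illustrative. It skips the already-placed tail on each pass and stops early when a pass makes no swaps, which is what gives the $\Omega(n)$ best case.

```python
def bubble_sort(items):
    """Sort a list in place and return it."""
    n = len(items)
    for i in range(n - 1):
        swapped = False
        # The last i elements are already in their final positions.
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break  # a full pass without swaps means the list is sorted
    return items

bubble_result = bubble_sort([21, 4, 1, 3, 9, 20, 25, 6, 21, 14])
```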

<p><b>Merge Sort</b></p>
<p>
<ul>
    <li>"Divide and conquer" -- breaking up array to smaller parts and sorting as re-assembled</li>
    <li>at most m-1 comparisons to merge two sorted parts into an array of length m</li>
    <li>about n comparisons per level * $log(n)$ levels = $O(n\,log(n))$ speed</li>
    <li>Auxiliary space = $O(n)$; old arrays can be discarded after every step</li>
</ul>
</p>
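The divide-and-conquer idea above can be sketched as follows. This is one straightforward top-down version, not the only way to write it, and `merge_sort` is an illustrative name. It returns a new list, which is where the $O(n)$ auxiliary space comes from.

```python
def merge_sort(items):
    """Return a new sorted list; the input is left unchanged."""
    if len(items) <= 1:
        return list(items)

    # Divide: sort each half independently.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Conquer: merge the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= keeps equal elements in their original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One half is exhausted; the remainder of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

merge_result = merge_sort([21, 4, 1, 3, 9, 20, 25, 6, 21, 14])
```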

<p><b>Quick Sort</b></p>
<p>
<ul>
    <li>pick a value in the array -- the <i>pivot</i> -- then move all larger values above it and all smaller values below it</li>
    <li>repeat recursively in the new subarrays</li>
    <li>a common convention is to pick the last element as the pivot</li>
    <li>worst case $O(n^2)$ (e.g. already-sorted input with a last-element pivot)</li>
    <li>best case $\Omega(n\,log(n))$, average case $\Theta(n\,log(n))$</li>
    <li>could pivot on the median of the last x elements; could sort the two subarrays concurrently</li>
    <li>standard quick sort is in-place; the only extra space is the recursion stack, $O(log(n))$ on average</li>
</ul>
</p>
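The in-place variant mentioned in the last bullet could look like the sketch below. It assumes the last element is the pivot and uses a Lomuto-style partition, which is one common scheme among several. The name `quicksort_in_place` and its `low`/`high` parameters are illustrative.

```python
def quicksort_in_place(items, low=0, high=None):
    """Sort items[low..high] in place, using the last element as the pivot."""
    if high is None:
        high = len(items) - 1
    if low >= high:
        return items

    pivot = items[high]
    boundary = low  # everything in items[low:boundary] is smaller than the pivot
    for i in range(low, high):
        if items[i] < pivot:
            items[i], items[boundary] = items[boundary], items[i]
            boundary += 1
    # Place the pivot between the smaller values and the larger-or-equal values.
    items[boundary], items[high] = items[high], items[boundary]

    quicksort_in_place(items, low, boundary - 1)
    quicksort_in_place(items, boundary + 1, high)
    return items

quick_result = quicksort_in_place([21, 4, 1, 3, 9, 20, 25, 6, 21, 14])
```

Only swaps are used, so no new lists are created; the recursion depth is the remaining space cost.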

In [None]:
"""
Implement quick sort in Python.
Input a list.
Output a sorted list.
"""

def quicksort(array):
    
    if len(array) > 1:
        pivot_index = len(array) - 1
        pivot_value = array[pivot_index]
        target_index = 0
        while target_index < pivot_index:
            if array[target_index] >= pivot_value:
                popped = array.pop(target_index)
                array.insert(pivot_index, popped)
                #array.append(array.pop(target_index))
                pivot_index -= 1
            else:
                target_index += 1
        
        upper_sorted = quicksort(array[(pivot_index + 1):])             
        lower_sorted = quicksort(array[:pivot_index])        
                
        return lower_sorted + [pivot_value] + upper_sorted
        #return lower_sorted.append(pivot_value) + upper_sorted
        # fails because append method returns None (in-place operation)
    else:
        return array
    

test = [21, 4, 1, 3, 9, 20, 25, 6, 21, 14]
print quicksort(test)

# Maps and Hashing

<p><b>Sets and Maps</b></p>
<p>
<ul>
    <li>collection of elements with no order</li>
    <li>unique elements only</li>
    <li><b>Maps</b> are set-based data structures</li>
    <li>keys in a map form a set (hashed for quick lookup) -- lookup is $\Omega(1)$ (at the cost of more space), or $O(m)$ in the worst case, where m is the number of entries sharing a bucket</li>
</ul>
</p>
<p><b>(Magic) Hash function</b></p>
<p>
<ul>
    <li>a value is converted with a hashing function, producing a ~unique value that's often used as a key</li>
    <li>one common technique when hashing large numbers is to use last few digits % some_num = hash_value</li>
    <li><b>Collisions</b></li>
    <li>avoid collisions by changing hash value or hash function</li>
    <li>store multiple values (e.g. a list) in the bucket; 1-3 values ideal (?); or hash the hash value if buckets grow large</li>
    <li>Load factor = # of Entries / # of Buckets</li>
    <li>Load factors closer to 0 waste space; closer to 1 at risk of collisions; >1 guaranteed collisions</li>
    <li><b>Hash Maps</b></li>
    <li>maps that use hash of key to store the key-value pair</li>
    <li><b>String hashing</b></li>
    <li>Java's string hashing function prefers large hash table to collisions</li>
    <li>$s[0]*31^{n-1} + s[1]*31^{n-2} + ... + s[n-1]$</li>
    <li>early hash functions got a lot of juice out of '31', but now w/ more complex functions, more convention</li>
</ul>
</p>
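The polynomial string hash above can be sketched in a few lines. This version wraps to a signed 32-bit value to mimic Java's `int` overflow; the function names `java_style_hash`, `bucket_index`, and `load_factor` are illustrative and not part of any library.

```python
def java_style_hash(s):
    """Polynomial hash with base 31, wrapped to a signed 32-bit integer."""
    h = 0
    for ch in s:
        # Horner's rule: equivalent to s[0]*31^(n-1) + ... + s[n-1].
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as signed, like a Java int.
    return h - 0x100000000 if h >= 0x80000000 else h

def bucket_index(s, bucket_count):
    """Map the hash onto a bucket array; Python's % is non-negative here."""
    return java_style_hash(s) % bucket_count

def load_factor(entry_count, bucket_count):
    """Load factor = number of entries / number of buckets."""
    return float(entry_count) / bucket_count

abc_hash = java_style_hash("abc")  # 97*31**2 + 98*31 + 99 = 96354
```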

In [None]:
"""
Write a HashTable class that stores strings
in a hash table, where keys are calculated
using the first two letters of the string.

Hash Value = (ASCII Value of First Letter * 100) + ASCII Value of Second Letter

You can assume that the string will have at least two letters, and the first 
two characters are uppercase letters (ASCII values from 65 to 90).
"""

class HashTable(object):
    def __init__(self):
        # only indices 6565 ('AA') through 9090 ('ZZ') are ever used; the rest is wasted space
        self.table = [None]*10000

    def store(self, string):
        """Input a string that's stored in 
        the table."""
        index = self.calculate_hash_value(string)
        if self.table[index] is None:
            self.table[index] = [string,]
            return
        self.table[index].append(string)
        return

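    def lookup(self, string):
        """Return the hash value if the
        string is already in the table.
        Return -1 otherwise."""
        index = self.calculate_hash_value(string)
        bucket = self.table[index]
        # A non-empty bucket is not enough: the string itself must be stored in it.
        if bucket is None or string not in bucket:
            return -1
        return index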
        
    def calculate_hash_value(self, string):
        """Helper function to calculate a
        hash value from a string."""
        first, second = tuple(string[:2])
        return ord(first) * 100 + ord(second)
    
# Setup
hash_table = HashTable()

# Test calculate_hash_value
# Should be 8568
print hash_table.calculate_hash_value('UDACITY')

# Test lookup edge case
# Should be -1
print hash_table.lookup('UDACITY')

# Test store
hash_table.store('UDACITY')
# Should be 8568
print hash_table.lookup('UDACITY')

# Test store edge case
hash_table.store('UDACIOUS')
# Should be 8568
print hash_table.lookup('UDACIOUS')

# Trees

<p>
<ul>
    <li>starts with a root node</li>
    <li>data added ("branches")</li>
    <li>terminal nodes are "leaves"/external nodes</li>
    <li>collection is a "forest"</li>
    <li>like a linked list, with pointers to several elements</li>
    <li>all nodes must be connected (get to any node from the root)</li>
    <li>must not be any cycles in the tree (encountering the same node more than once; only one parent)</li>
    <li>root is <b>level</b> 1, its children level 2, etc.</li>
    <li><b>edges</b> connect nodes, and form a <b>path</b> between distant nodes</li>
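    <li><b>depth</b> of a node is the number of edges from the root down to it (its level minus 1); <b>height</b> of a node is the number of edges on the longest path from it down to a leaf, so leaves have height 0</li>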
</ul>
</p>

### Tree Traversal
<p><b>DFS - Depth-First Search</b></p>
<ul>
    <li>prioritize searching down to a leaf before exploring other branches</li>
    <li><i>Pre-order traversal</i>: check off nodes in the order they're seen, starting from root</li>
    <li><i>In-order traversal</i>: check off a node once the entirety of its left branch is seen; when the tree is sorted (a BST), nodes are checked in sorted order</li>
    <li><i>Post-order traversal</i>: check off a node only once all of its children are checked</li>
</ul>
<p><b>BFS - Breadth-First Search</b></p>
<ul><li>prioritize searching all nodes at each level before searching deeper</li></ul>
<p><b>Search and delete</b></p>
<ul>
    <li><b>binary trees</b> nodes have 0-2 children</li>
    <li>searching all nodes $O(n)$</li>
    <li>deleting an internal node requires repairing tree, $O(n)$</li>
    <li>inserting a node into unordered tree -- from root, BFS to find first open spot</li>
    <li>"perfect" trees are full and balanced</li>
    <li>each level n in a binary tree holds $2^{n-1}$ nodes max</li>
</ul>
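The traversal orders described above can be compared side by side on a small tree. The `TreeNode` class and function names are hypothetical helpers for this sketch only.

```python
from collections import deque

class TreeNode(object):
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def pre_order(node):
    """Node first, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)

def in_order(node):
    """Left subtree, then node, then right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)

def post_order(node):
    """Both subtrees first, node last."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]

def breadth_first(root):
    """Level by level, left to right, using a queue."""
    order = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        order.append(node.value)
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)
    return order

#       1
#      / \
#     2   3
#    / \
#   4   5
example_root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
# pre_order     -> [1, 2, 4, 5, 3]
# in_order      -> [4, 2, 5, 1, 3]
# post_order    -> [4, 5, 2, 3, 1]
# breadth_first -> [1, 2, 3, 4, 5]
```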

In [None]:
class Node(object):
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

class BinaryTree(object):
    def __init__(self, root):
        self.root = Node(root)

    def search(self, find_val):
        """Return True if the value
        is in the tree, return
        False otherwise."""
        return self.preorder_search(self.root, find_val)
        
    def print_tree(self):
        """Print out all tree nodes
        as they are visited in
        a pre-order traversal."""
        preordered_nodes = map(str, self.preorder_print(self.root, None))
        print "-".join(preordered_nodes)
        return
        
    def preorder_search(self, start, find_val):
        """Helper method - use this to create a 
        recursive search solution."""
        if start:
            if start.value == find_val:
                return True
            else:
                return self.preorder_search(start.left, find_val) or self.preorder_search(start.right, find_val)
        return False
    
    def preorder_print(self, start, traversal=None):
        """Helper method - use this to create a 
        recursive print solution."""
        
        visited = [] if traversal is None else traversal
        visited.append(start.value)
        
        if start.left is not None:
            self.preorder_print(start.left, visited)
        
        if start.right is not None:
            self.preorder_print(start.right, visited)
            
        return visited


# Set up tree
tree = BinaryTree(1)
tree.root.left = Node(2)
tree.root.right = Node(3)
tree.root.left.left = Node(4)
tree.root.left.right = Node(5)

# Test search
# Should be True
print tree.search(4)
# Should be False
print tree.search(6)

# Test print_tree
# Should be 1-2-4-5-3
# print_tree() prints the traversal itself and returns None, so call it directly
tree.print_tree()

### Binary Search Tree (BST)
<p>
<ul>
    <li>sorted with lower values to left, larger to right</li>
    <li>search takes at most height-many steps, i.e. $\Theta(log(n))$ when the tree is balanced</li>
    <li><i>balanced</i> BSTs have nodes that are evenly distributed around root and sub-trees</li>
    <li>in unbalanced trees, search, insert, and delete degrade toward $O(n)$ (the tree behaves like a linked list)</li>
</ul>
</p>

In [None]:
### SOFTBALL

class Node(object):
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

class BST(object):
    def __init__(self, root):
        self.root = Node(root)

    def insert(self, new_val):
        return self.find_open_position(self.root, new_val)
        
    def search(self, find_val):
        return self.find_value(self.root, find_val)
    
    def find_open_position(self, start_node, new_val):
        """
        Checks for direction from start_node and calls itself until no node 
        in that direction exists, then adds a child node with the new_val there.
        """
        
        if new_val < start_node.value:
            if start_node.left:
                self.find_open_position(start_node.left, new_val)
            else:
                start_node.left = Node(new_val)
                return
            
        else: # > start_node.value; "assume that two nodes with the same value won't be inserted"
            if start_node.right:
                self.find_open_position(start_node.right, new_val)
            else:
                start_node.right = Node(new_val)
                return
    
    def find_value(self, start_node, find_val):
        if not start_node:
            return False
        if start_node.value == find_val:
            return True
        # Use the BST ordering: only one subtree can contain the value.
        if find_val < start_node.value:
            return self.find_value(start_node.left, find_val)
        return self.find_value(start_node.right, find_val)
    
    
# Set up tree
tree = BST(4)

# Insert elements
tree.insert(2)
tree.insert(1)
tree.insert(3)
tree.insert(5)

# Check search
# Should be True
print tree.search(4)
# Should be False
print tree.search(6)

## Heaps
<p>
<ul>
    <li>trees where elements are arranged in increasing/decreasing order, with root the max/min elem.</li>
    <li>max heap - parent must always have a greater value than its child; min heap</li>
    <li>heaps don't have to be binary trees</li>
    <li>max binary heap: peek $O(1)$; search $O(n)$ -- roughly $\dfrac{n}{2}$ nodes examined on average, since the ordering lets some subtrees be skipped, which is still linear</li>
    <li><i>heapify</i> - reordering the elements based on heap property</li>
    <li>insert $O(log(n))$ -- add item to the first open position, then heapify by comparing with the parent and swapping upward as needed</li>
    <li>delete root (extract) $O(log(n))$ (roughly the height of the tree) -- remove root, move the right-most leaf of the bottom level into the root position, and heapify downward</li>
</ul>
</p>

### Heap Implementation

<p>
<ul>
    <li>often drawn as trees, but stored as arrays, which works because a binary heap is a complete tree; saves the space of child pointers</li>
    <li>root is first element, next level is next $2^{level-1}$ elements, etc. (left to right)</li>
</ul>
</p>
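A sketch of an array-backed max binary heap following the layout above. With 0-based indexing, the parent of index `i` sits at `(i - 1) // 2` and its children at `2*i + 1` and `2*i + 2`. The `MaxHeap` class and its method names are illustrative.

```python
class MaxHeap(object):
    def __init__(self):
        self.items = []  # level-by-level, left-to-right storage

    def peek(self):
        return self.items[0] if self.items else None

    def insert(self, value):
        # Add at the first open position, then sift up past smaller parents.
        self.items.append(value)
        i = len(self.items) - 1
        while i > 0:
            parent = (i - 1) // 2
            if self.items[parent] >= self.items[i]:
                break
            self.items[parent], self.items[i] = self.items[i], self.items[parent]
            i = parent

    def extract_max(self):
        if not self.items:
            return None
        top = self.items[0]
        last = self.items.pop()  # right-most leaf of the bottom level
        if self.items:
            # Move the last leaf to the root, then sift it down.
            self.items[0] = last
            i, n = 0, len(self.items)
            while True:
                left, right, largest = 2 * i + 1, 2 * i + 2, i
                if left < n and self.items[left] > self.items[largest]:
                    largest = left
                if right < n and self.items[right] > self.items[largest]:
                    largest = right
                if largest == i:
                    break
                self.items[i], self.items[largest] = self.items[largest], self.items[i]
                i = largest
        return top

heap = MaxHeap()
for value in [3, 9, 1, 7, 5]:
    heap.insert(value)
largest_value = heap.peek()                        # 9
drained = [heap.extract_max() for _ in range(5)]   # [9, 7, 5, 3, 1]
```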

## Self-balancing Trees

<p>
<ul>
    <li>try to use the minimum number of levels; insert and delete operations keep balance, and nodes may have additional properties</li>
</ul>
</p>

### Red-Black Tree

<p>
<ul>
    <li>a self-balancing tree, extension of binary search tree</li>
    <li>nodes have additional color property; nodes must be red or black</li>
    <li>every node without two children gets Null leaf children to fill the gaps; all Null leaf nodes are black</li>
    <li>if a node is red, both its children must be black</li>
    <li>root node must be black (an additional rule that some definitions treat as optional)</li>
    <li>every path from a node to its descendant Null nodes must contain the same number of black nodes</li>
</ul></p>
<p><b>Insertion</b></p>
<ul>
    <li>in general, try to insert new element as red node</li>
    <li>in some cases, rotation is necessary to keep tree balanced</li>
    <li>insert is $O(log(n))$</li>
</ul>
</p>
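The rotation mentioned above can be illustrated on its own. This is only a sketch of a left rotation: it shows that the shape changes while the sorted (in-order) sequence is preserved, and it deliberately leaves out the recolouring and case analysis a full red-black insert needs. `RBNode`, `rotate_left`, and `in_order_values` are hypothetical names.

```python
class RBNode(object):
    def __init__(self, value, color="red", left=None, right=None):
        self.value = value
        self.color = color
        self.left = left
        self.right = right

def rotate_left(node):
    """Rotate the subtree rooted at node to the left; return the new subtree root.

    Assumes node.right is not None.
    """
    pivot = node.right
    node.right = pivot.left  # pivot's left subtree moves under node
    pivot.left = node        # node becomes pivot's left child
    return pivot

def in_order_values(node):
    if node is None:
        return []
    return in_order_values(node.left) + [node.value] + in_order_values(node.right)

# A right-leaning chain 1 -> 2 -> 3 has height 2.
chain = RBNode(1, right=RBNode(2, right=RBNode(3)))
before = in_order_values(chain)

# After rotating left around 1, node 2 is the root with children 1 and 3 (height 1).
balanced = rotate_left(chain)
after = in_order_values(balanced)
```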

# Graphs

<ul>
    <li>show how elements are connected; composed of <i>nodes/vertices</i> and <i>edges</i></li>
    <li>edges can have a direction; 'Directed Graph' vs. 'Undirected Graph'</li>
    <li>graph cycles -- can return to starting node through edge; dangerous for some operations</li>
    <li>DAG -- Directed, Acyclic Graph</li>
</ul>

### Connectivity; aka 'Graph Theory'

<ul>    
    <li>Connected graph -- there is some path between one vertex and every other vertex</li>
    <li>connectivity metric -- minimum number of elements (vertices or edges) that would need to be removed for a graph to become disconnected</li>
    <li>more strongly connected graphs would need more such elements removed</li>
    <li>a directed graph is weakly connected when replacing all of the directed edges with undirected edges yields a connected graph</li>
    <li>Strongly connected directed graphs must have a path from every node to every other node. So, there must be a path from A to B AND from B to A</li>
</ul>
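The weak and strong connectivity definitions above can be checked with a simple reachability search. This is a sketch under a few assumptions: nodes are hashable labels, edges are `(from, to)` pairs, and the function names are made up for illustration. Strong connectivity is tested by checking that the first node reaches every node both in the graph and in its reverse.

```python
def reachable(start, neighbors):
    """Return the set of nodes reachable from start via the neighbors mapping."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def is_weakly_connected(nodes, edges):
    """Connected once every directed edge is treated as undirected."""
    if not nodes:
        return True
    undirected = {}
    for a, b in edges:
        undirected.setdefault(a, []).append(b)
        undirected.setdefault(b, []).append(a)
    return reachable(nodes[0], undirected) == set(nodes)

def is_strongly_connected(nodes, edges):
    """A path exists from every node to every other node."""
    if not nodes:
        return True
    forward, backward = {}, {}
    for a, b in edges:
        forward.setdefault(a, []).append(b)
        backward.setdefault(b, []).append(a)
    everything = set(nodes)
    # nodes[0] reaches everyone, and everyone reaches nodes[0].
    return (reachable(nodes[0], forward) == everything and
            reachable(nodes[0], backward) == everything)

example_nodes = [0, 1, 2]
chain_edges = [(0, 1), (1, 2)]          # weakly but not strongly connected
cycle_edges = [(0, 1), (1, 2), (2, 0)]  # strongly connected
```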

### Graph Representations

<ul>
    <li><b>Edge list</b> -- list of edges represented by a pair of the nodes they connect; eg. [[0,1], [1,2], [2,3]]</li>
    <li><b>Adjacency list</b> -- element at position n of list is a list of the elements node n shares an edge with; also a 2D list</li>
    <li><b>Adjacency matrix</b> -- an n x n 2D array, where each row represents a node, and columns are bools to indicate existence of edge</li>
</ul>

In [None]:
class Node(object):
    def __init__(self, value):
        self.value = value
        self.edges = []

class Edge(object):
    def __init__(self, value, node_from, node_to):
        self.value = value
        self.node_from = node_from
        self.node_to = node_to

class Graph(object):
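    def __init__(self, nodes=None, edges=None):
        # Avoid mutable default arguments: a shared [] would leak state between Graph instances.
        self.nodes = nodes if nodes is not None else []
        self.edges = edges if edges is not None else []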

    def insert_node(self, new_node_val):
        new_node = Node(new_node_val)
        self.nodes.append(new_node)
        
    def insert_edge(self, new_edge_val, node_from_val, node_to_val):
        from_found = None
        to_found = None
        for node in self.nodes:
            if node_from_val == node.value:
                from_found = node
            if node_to_val == node.value:
                to_found = node
        if from_found == None:
            from_found = Node(node_from_val)
            self.nodes.append(from_found)
        if to_found == None:
            to_found = Node(node_to_val)
            self.nodes.append(to_found)
        new_edge = Edge(new_edge_val, from_found, to_found)
        from_found.edges.append(new_edge)
        to_found.edges.append(new_edge)
        self.edges.append(new_edge)

    def get_edge_list(self):
        """Don't return a list of edge objects!
        Return a list of triples that looks like this:
        (Edge Value, From Node Value, To Node Value)"""
        return [(i.value, i.node_from.value, i.node_to.value) for i in self.edges]

    def get_adjacency_list(self):
        """Don't return any Node or Edge objects!
        You'll return a list of lists.
        The indices of the outer list represent
        "from" nodes.
        Each section in the list will store a list
        of tuples that looks like this:
        (To Node Value, Edge Value)"""
             
        adjacency_list = []
        for node in self.nodes:
            matches = []
            for edge in node.edges:
                if edge.node_from == node:
                    matches.append((edge.node_to.value, edge.value))
            adjacency_list.append(matches or None)
        return adjacency_list       
        
    def get_adjacency_matrix(self):
        """Return a matrix, or 2D list.
        Row numbers represent from nodes,
        column numbers represent to nodes.
        Store the edge values in each spot,
        and a 0 if no edge exists."""
        
        # conflates node value and index?
        
        node_count = len(self.nodes)
        #matrix = [ [0] * node_count] * node_count #!#!#! inner lists are same object!
        matrix = [ [0] * node_count for _ in range(node_count)]
        
        for edge in self.edges:
            matrix[edge.node_from.value][edge.node_to.value] = edge.value
        return matrix

graph = Graph()
graph.insert_node(0) # to make the submission match Udacity
graph.insert_edge(100, 1, 2)
graph.insert_edge(101, 1, 3)
graph.insert_edge(102, 1, 4)
graph.insert_edge(103, 3, 4)
# Should be [(100, 1, 2), (101, 1, 3), (102, 1, 4), (103, 3, 4)]
print graph.get_edge_list()
# Should be [None, [(2, 100), (3, 101), (4, 102)], None, [(4, 103)], None]
print graph.get_adjacency_list()
# Should be [[0, 0, 0, 0, 0], [0, 0, 100, 101, 102], [0, 0, 0, 0, 0], [0, 0, 0, 0, 103], [0, 0, 0, 0, 0]]
print graph.get_adjacency_matrix()

### Graph Traversal

<b>Depth-First Search</b>
<ul>
    <li>like tree DFS, but no root; begin wherever</li>
    <li>common implementations store "seen" nodes in a stack:</li>
    <li>repeat, adding nodes to stack; if hit a repeat node, go back to previous and try another edge</li>
    <li>if run out of edges with new nodes, pop current node from stack and go back to one before it (next on stack)</li>
    <li>continue until stack is empty (or the target node is found)</li>
    <li>another implementation uses recursion, letting the call stack stand in for an explicit stack</li>
    <li>$O(|E| + |V|)$ -- each edge visited twice, each vertex visited once</li>
    <li>https://www.cs.usfca.edu/~galles/visualization/DFS.html</li>
</ul>
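A compact iterative DFS sketch using an explicit stack. It is a common variant of the steps above rather than a line-by-line transcription: instead of backtracking along a path, it pushes unseen neighbours and skips anything already seen. The adjacency-dict input format and the name `dfs_iterative` are assumptions made for this example.

```python
def dfs_iterative(adjacency, start):
    """Return node values in the order a depth-first search visits them."""
    visited = []
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # reached again through another edge (e.g. a cycle)
        seen.add(node)
        visited.append(node)
        # Push in reverse so the first listed neighbour is explored first.
        for neighbour in reversed(adjacency.get(node, [])):
            if neighbour not in seen:
                stack.append(neighbour)
    return visited

# Directed graph with a cycle: 0 -> 1 -> 3 -> 0, and 0 -> 2 -> 3
example_graph = {0: [1, 2], 1: [3], 2: [3], 3: [0]}
dfs_order = dfs_iterative(example_graph, 0)  # [0, 1, 3, 2]
```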

<b>Breadth-First Search</b>
<ul>
    <li>search every edge of one node before continuing on</li>
    <li>start with a node, mark it as "seen"</li>
    <li>visit adjacent node, and add that node to a queue</li>
    <li>back to first node and repeat for all edges</li>
    <li>when all edges exhausted, dequeue a node and repeat for it</li>
    <li>$O(|E| + |V|)$</li>
    <li>https://www.cs.usfca.edu/~galles/visualization/BFS.html</li>
</ul>
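The BFS steps above map closely onto a queue-based loop. In this sketch the graph is assumed to be a dict mapping each node to a list of neighbours, and `bfs_order_from` is an illustrative name.

```python
from collections import deque

def bfs_order_from(adjacency, start):
    """Return node values in the order a breadth-first search visits them."""
    seen = {start}           # mark the start node as seen
    order = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()  # dequeue a node and examine all of its edges
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order

# Directed graph with a cycle: 0 -> 1 -> 3 -> 0, and 0 -> 2 -> 3
example_graph = {0: [1, 2], 1: [3], 2: [3], 3: [0]}
bfs_order = bfs_order_from(example_graph, 0)  # [0, 1, 2, 3]
```

Each vertex is enqueued at most once and each edge is examined once, which is where the $O(|E| + |V|)$ bound comes from.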

In [None]:
class Node(object):
    def __init__(self, value):
        self.value = value
        self.edges = []

class Edge(object):
    def __init__(self, value, node_from, node_to):
        self.value = value
        self.node_from = node_from
        self.node_to = node_to

# You only need to change code with docs strings that have TODO.
# Specifically: Graph.dfs_helper and Graph.bfs
# New methods have been added to associate node numbers with names
# Specifically: Graph.set_node_names
# and the methods ending in "_names" which will print names instead
# of node numbers

class Graph(object):
    def __init__(self, nodes=None, edges=None):
        self.nodes = nodes or []
        self.edges = edges or []
        self.node_names = []
        self._node_map = {} # node.value: node_obj

    def set_node_names(self, names):
        """The Nth name in names should correspond to node number N.
        Node numbers are 0 based (starting at 0).
        """
        self.node_names = list(names)

    def insert_node(self, new_node_val):
        "Insert a new node with value new_node_val"
        new_node = Node(new_node_val)
        self.nodes.append(new_node)
        self._node_map[new_node_val] = new_node
        return new_node

    def insert_edge(self, new_edge_val, node_from_val, node_to_val):
        "Insert a new edge, creating new nodes if necessary"
        nodes = {node_from_val: None, node_to_val: None}
        for node in self.nodes:
            if node.value in nodes:
                nodes[node.value] = node
                if all(nodes.values()):
                    break
        for node_val in nodes:
            nodes[node_val] = nodes[node_val] or self.insert_node(node_val)
        node_from = nodes[node_from_val]
        node_to = nodes[node_to_val]
        new_edge = Edge(new_edge_val, node_from, node_to)
        node_from.edges.append(new_edge)
        node_to.edges.append(new_edge)
        self.edges.append(new_edge)

    def get_edge_list(self):
        """Return a list of triples that looks like this:
        (Edge Value, From Node Value, To Node Value)"""
        return [(e.value, e.node_from.value, e.node_to.value)
                for e in self.edges]

    def get_edge_list_names(self):
        """Return a list of triples that looks like this:
        (Edge Value, From Node Name, To Node Name)"""
        return [(edge.value,
                 self.node_names[edge.node_from.value],
                 self.node_names[edge.node_to.value])
                for edge in self.edges]

    def get_adjacency_list(self):
        """Return a list of lists.
        The indices of the outer list represent "from" nodes.
        Each section in the list will store a list
        of tuples that looks like this:
        (To Node VALUE, Edge Value)"""
        max_index = self.find_max_index()
        adjacency_list = [[] for _ in range(max_index)]
        for edg in self.edges:
            from_value, to_value = edg.node_from.value, edg.node_to.value
            adjacency_list[from_value].append((to_value, edg.value))
        return [a or None for a in adjacency_list] # replace []'s with None

    def get_adjacency_list_names(self):
        """Each section in the list will store a list
        of tuples that looks like this:
        (To Node Name, Edge Value).
        Node names should come from the names set
        with set_node_names."""
        adjacency_list = self.get_adjacency_list()
        def convert_to_names(pair, graph=self):
            node_number, value = pair
            return (graph.node_names[node_number], value)
        def map_conversion(adjacency_list_for_node):
            if adjacency_list_for_node is None:
                return None
            return map(convert_to_names, adjacency_list_for_node)
        return [map_conversion(adjacency_list_for_node)
                for adjacency_list_for_node in adjacency_list]

    def get_adjacency_matrix(self):
        """Return a matrix, or 2D list.
        Row numbers represent from nodes,
        column numbers represent to nodes.
        Store the edge values in each spot,
        and a 0 if no edge exists."""
        max_index = self.find_max_index()
        adjacency_matrix = [[0] * (max_index) for _ in range(max_index)]
        for edg in self.edges:
            from_index, to_index = edg.node_from.value, edg.node_to.value
            adjacency_matrix[from_index][to_index] = edg.value
        return adjacency_matrix

    def find_max_index(self):
        """Return the number of index slots needed: the length of the
        node names if set with set_node_names(), otherwise one more
        than the highest node VALUE found (0 if there are no nodes)."""
        if len(self.node_names) > 0:
            return len(self.node_names)
        max_index = -1
        for node in self.nodes:
            if node.value > max_index:
                max_index = node.value
        # + 1 so that range(max_index) in the callers covers the highest value
        return max_index + 1

    def find_node(self, node_number):
        "Return the node with value node_number or None"
        return self._node_map.get(node_number)

    def dfs_helper(self, start_node, visited):
        """Helper function for a recursive implementation
        of Depth First Search iterating through a node's edges. The
        output is a list of numbers corresponding to the
        values of the traversed nodes.
        ARGUMENTS: start_node is the starting Node
        Because this is recursive, we pass in the set of visited node
        values.
        RETURN: a list of the traversed node values (integers).
        """
        ret_list = [start_node.value]
        visited.add(start_node.value)

        # a node's edge list may include incoming edges,
        # so follow only the edges that leave start_node
        for edge in start_node.edges:
            if edge.node_from == start_node and edge.node_to.value not in visited:
                ret_list.extend(self.dfs_helper(edge.node_to, visited))
        return ret_list

    def dfs(self, start_node_num):
        """Outputs a list of numbers corresponding to the traversed nodes
        in a Depth First Search.
        ARGUMENTS: start_node_num is the starting node number (integer)
        RETURN: a list of the node values (integers)."""
        start_node = self.find_node(start_node_num)
        return self.dfs_helper(start_node, visited=set())

    def dfs_names(self, start_node_num):
        """Return the results of dfs with numbers converted to names."""
        return [self.node_names[num] for num in self.dfs(start_node_num)]

    def bfs(self, start_node_num):
        """Iterative implementation of Breadth First Search
        iterating through a node's edges. The output is a list of
        numbers corresponding to the traversed nodes.
        ARGUMENTS: start_node_num is the node number (integer)
        RETURN: a list of the node values (integers)."""
        # deque pops from the left in O(1), unlike list.insert(0, ...)
        from collections import deque

        node = self.find_node(start_node_num)
        ret_list = [node.value]
        seen = set(ret_list)   # O(1) membership checks
        queue = deque([node])  # FIFO queue of nodes still to be explored
        while queue:
            current_node = queue.popleft()
            for edge in current_node.edges:
                # follow outgoing edges only, and enqueue each node once
                if edge.node_from == current_node and edge.node_to.value not in seen:
                    seen.add(edge.node_to.value)
                    ret_list.append(edge.node_to.value)
                    queue.append(edge.node_to)

        return ret_list
        
        
    def bfs_names(self, start_node_num):
        """Return the results of bfs with numbers converted to names."""
        return [self.node_names[num] for num in self.bfs(start_node_num)]

graph = Graph()

# You do not need to change anything below this line.
# You only need to implement Graph.dfs_helper and Graph.bfs

graph.set_node_names(('Mountain View',   # 0
                      'San Francisco',   # 1
                      'London',          # 2
                      'Shanghai',        # 3
                      'Berlin',          # 4
                      'Sao Paolo',       # 5
                      'Bangalore'))      # 6 

graph.insert_edge(51, 0, 1)     # MV <-> SF
graph.insert_edge(51, 1, 0)     # SF <-> MV
graph.insert_edge(9950, 0, 3)   # MV <-> Shanghai
graph.insert_edge(9950, 3, 0)   # Shanghai <-> MV
graph.insert_edge(10375, 0, 5)  # MV <-> Sao Paolo
graph.insert_edge(10375, 5, 0)  # Sao Paolo <-> MV
graph.insert_edge(9900, 1, 3)   # SF <-> Shanghai
graph.insert_edge(9900, 3, 1)   # Shanghai <-> SF
graph.insert_edge(9130, 1, 4)   # SF <-> Berlin
graph.insert_edge(9130, 4, 1)   # Berlin <-> SF
graph.insert_edge(9217, 2, 3)   # London <-> Shanghai
graph.insert_edge(9217, 3, 2)   # Shanghai <-> London
graph.insert_edge(932, 2, 4)    # London <-> Berlin
graph.insert_edge(932, 4, 2)    # Berlin <-> London
graph.insert_edge(9471, 2, 5)   # London <-> Sao Paolo
graph.insert_edge(9471, 5, 2)   # Sao Paolo <-> London
# (6) 'Bangalore' is intentionally disconnected (no edges)
# for this problem and should produce None in the
# Adjacency List, etc.

import pprint
pp = pprint.PrettyPrinter(indent=2)

print("Edge List")
pp.pprint(graph.get_edge_list_names())

print("\nAdjacency List")
pp.pprint(graph.get_adjacency_list_names())

print("\nAdjacency Matrix")
pp.pprint(graph.get_adjacency_matrix())

print("\nDepth First Search")
pp.pprint(graph.dfs_names(2))

# Should print:
# Depth First Search
# ['London', 'Shanghai', 'Mountain View', 'San Francisco', 'Berlin', 'Sao Paolo']

print("\nBreadth First Search")
pp.pprint(graph.bfs_names(2))

# Should print:
# Breadth First Search
# ['London', 'Shanghai', 'Berlin', 'Sao Paolo', 'Mountain View', 'San Francisco']

### Notable Paths

<b>Eulerian Path and Cycle</b>
<ul>
    <li>travels through every edge exactly once</li>
    <li><i>Eulerian Cycle</i> -- traverses each edge exactly once and ends at the starting node</li>
    <li>Eulerian Cycle possible if the graph is connected and all vertices have an even degree (# of edges connected to them)</li>
    <li>regular Eulerian Path can have two odd-degree vertices, if they're the start and end of the path</li>
    <li><b>Finding Eulerian Cycles:</b></li>
    <li>start at any vertex, follow unused edges until returning to that vertex (e.g. A-B-E-A)</li>
    <li>if not all edges are used, pick a vertex seen already that still has unused edges, and repeat until no unused edges remain (e.g. B-C-D-B)</li>
    <li>combine the cycles by splicing them together at the vertices they have in common (e.g. A-B-C-D-B-E-A)</li>
    <li>$O(|E|)$</li>
</ul>
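
The cycle-finding steps above can be sketched in a stack-based form (commonly known as Hierholzer's algorithm). This is a minimal illustration, not part of the `Graph` class above: the function name `eulerian_cycle` and the edge-list input format are assumptions made for the example, and the graph is treated as undirected.

```python
from collections import defaultdict


def eulerian_cycle(edges):
    """Return an Eulerian cycle as a list of vertices, or None if none exists.

    `edges` is a list of (u, v) pairs describing an undirected graph
    (hypothetical input format chosen for this sketch).
    """
    if not edges:
        return None

    # adjacency[v] holds (neighbor, edge index) so each edge can be marked used
    adjacency = defaultdict(list)
    for index, (u, v) in enumerate(edges):
        adjacency[u].append((v, index))
        adjacency[v].append((u, index))

    # every vertex needs an even degree
    if any(len(neighbors) % 2 for neighbors in adjacency.values()):
        return None

    used = [False] * len(edges)
    stack = [edges[0][0]]
    cycle = []
    while stack:
        vertex = stack[-1]
        # drop edges that were already walked from their other endpoint
        while adjacency[vertex] and used[adjacency[vertex][-1][1]]:
            adjacency[vertex].pop()
        if adjacency[vertex]:
            neighbor, index = adjacency[vertex].pop()
            used[index] = True
            stack.append(neighbor)
        else:
            # dead end: this vertex is finished, so it joins the cycle
            cycle.append(stack.pop())

    # if some edge was never reached, the graph was not connected
    if len(cycle) != len(edges) + 1:
        return None
    return cycle


example_edges = [("A", "B"), ("B", "E"), ("E", "A"),
                 ("B", "C"), ("C", "D"), ("D", "B")]
print(eulerian_cycle(example_edges))
```

Each edge is pushed and popped at most once, which is where the $O(|E|)$ bound comes from.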

<b>Hamiltonian Path</b>
<ul>
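    <li>must go through every vertex exactly once</li>
    <li>Hamiltonian Cycle must also start and end at the same vertex</li>
    <li>determining whether a Hamiltonian Path or Cycle exists is hard -- the problem is NP-complete, so no polynomial-time algorithm is known</li>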
</ul>
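
To make the difficulty concrete, here is a small backtracking search for a Hamiltonian Path. It is a sketch under assumptions: the name `hamiltonian_path` and the dict-of-neighbors input are invented for the example, and the worst-case running time is factorial in the number of vertices.

```python
def hamiltonian_path(adjacency):
    """Return a list of vertices that visits each one exactly once, or None.

    `adjacency` maps each vertex to an iterable of its neighbors
    (hypothetical input format chosen for this sketch).
    """
    vertices = list(adjacency)

    def extend(path, visited):
        if len(path) == len(vertices):
            return path
        for neighbor in adjacency[path[-1]]:
            if neighbor not in visited:
                visited.add(neighbor)
                result = extend(path + [neighbor], visited)
                if result:
                    return result
                visited.remove(neighbor)  # backtrack and try another branch
        return None

    # the path may have to begin at any vertex, so try them all
    for start in vertices:
        result = extend([start], {start})
        if result:
            return result
    return None


square = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(hamiltonian_path(square))
print(hamiltonian_path(star))
```

The star graph has no Hamiltonian Path because every route between two leaves must pass through the center, which can only be visited once.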

# Case Studies in Algorithms

## Shortest Path Problem

<ul>
    <li>if weighted edges, smallest cumulative weight between two vertices</li>
    <li>if unweighted edges, smallest number of edges between two vertices (BFS)</li>
</ul>
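
For the unweighted case, a BFS that records how many edges it took to reach each vertex gives the answer directly. The sketch below is standalone; `fewest_edges` and the dict-based adjacency format are names assumed for illustration, and the sample data mirrors the flight graph used earlier (node numbers only).

```python
from collections import deque


def fewest_edges(adjacency, start, goal):
    """Return the smallest number of edges from start to goal, or None.

    `adjacency` maps each vertex to a list of neighbors
    (hypothetical input format chosen for this sketch).
    """
    if start == goal:
        return 0
    distance = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in distance:
                # BFS reaches vertices in order of distance,
                # so the first visit is the shortest one
                distance[neighbor] = distance[vertex] + 1
                if neighbor == goal:
                    return distance[neighbor]
                queue.append(neighbor)
    return None  # goal is unreachable


flights = {0: [1, 3, 5], 1: [0, 3, 4], 2: [3, 4, 5],
           3: [0, 1, 2], 4: [1, 2], 5: [0, 2], 6: []}
print(fewest_edges(flights, 2, 1))  # London to San Francisco
print(fewest_edges(flights, 2, 6))  # Bangalore is disconnected
```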

### Dijkstra's Algorithm

<ul>
    <li>one solution to Shortest Path for weighted graphs (directed or undirected) with non-negative edge weights</li>
    <li>vertices have a distance value (sum of edge weights between starting point and current vertex) with default value of infinity, replaced once a distance is known or a smaller one is found</li>
    <li>one common implementation uses a Min. Priority Queue, where minimum element is removed first</li>
    <li>"greedy" -- chooses next node by smallest distance value (queue.extract_min())</li>
    <li>$O(|V|^2)$ with a simple array-based queue; $O(|E| + |V|\log|V|)$ with a Fibonacci-heap priority queue</li>
</ul>
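
A minimal sketch of the priority-queue version, using Python's `heapq` module as the min-priority queue. The function name `dijkstra` and the adjacency format are assumptions for the example. Because `heapq` has no decrease-key operation, the sketch pushes a new entry whenever a shorter distance is found and skips stale entries when they are popped; with a binary heap like this the running time is $O((|V| + |E|)\log|V|)$.

```python
import heapq


def dijkstra(adjacency, source):
    """Return a dict of the shortest distance from source to every vertex.

    `adjacency` maps each vertex to a list of (neighbor, weight) pairs
    with non-negative weights (hypothetical input format for this sketch).
    """
    # every vertex starts at infinity, as described above
    distances = {vertex: float("inf") for vertex in adjacency}
    distances[source] = 0
    queue = [(0, source)]
    while queue:
        # "extract_min": the closest unprocessed vertex comes out first
        distance, vertex = heapq.heappop(queue)
        if distance > distances[vertex]:
            continue  # stale entry, a shorter route was already found
        for neighbor, weight in adjacency.get(vertex, ()):
            candidate = distance + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


roads = {
    "a": [("b", 1), ("c", 4)],
    "b": [("a", 1), ("c", 2)],
    "c": [("a", 4), ("b", 2)],
    "d": [],
}
print(dijkstra(roads, "a"))
```

Here the direct edge from `a` to `c` costs 4, but going through `b` costs 3, so the distance value of `c` is replaced once the smaller weight is found.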

## Knapsack Problem

<ul>
    <li>a knapsack can hold a limited amount of weight</li>
    <li>there are items to place in the knapsack, each with a weight and a value</li>
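    <li>how to maximize the total value of items in the knapsack without exceeding the weight limit?</li>
    <li>brute force is exponential, $O(2^n)$ (imagine binary string, each item in (1) or out (0))</li>
    <li><b>a better way:</b></li>
    <li>build a table of the max value achievable for each weight capacity from 0 up to the limit, updating it as each item is considered</li>
    <li>$O(nW)$ where $n$ is the number of items and $W$ is the weight limit -- a "pseudo-polynomial time" solution, since the runtime depends on the numeric value of $W$ rather than on the size of the input</li>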
</ul>
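
The table-based approach can be sketched in a few lines. This assumes the 0/1 variant of the problem (each item is used at most once) and integer weights; the function name `knapsack` and the `(weight, value)` tuple format are illustrative choices.

```python
def knapsack(items, weight_limit):
    """Return the max total value that fits within weight_limit.

    `items` is a list of (weight, value) pairs with non-negative
    integer weights (hypothetical input format for this sketch).
    """
    # best[c] = max value achievable with capacity c using the items seen so far
    best = [0] * (weight_limit + 1)
    for weight, value in items:
        # walk capacities downwards so each item is counted at most once
        for capacity in range(weight_limit, weight - 1, -1):
            best[capacity] = max(best[capacity],
                                 best[capacity - weight] + value)
    return best[weight_limit]


print(knapsack([(2, 6), (5, 9), (4, 5)], 6))
```

The two nested loops run $n$ and $W$ times respectively, giving the $O(nW)$ bound.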

## Dynamic Programming

<ul>
    <li>solving a problem by breaking it down into sub-problems and reusing their solutions</li>
    <li>base case: smallest computation</li>
    <li>lookup table often used to store solutions to sub-problems</li>
    <li><i>memoization</i> -- storing pre-computed values so each sub-problem is solved only once</li>
</ul>
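
As a small illustration of a base case plus a lookup table, here is a memoized Fibonacci function. Fibonacci is not discussed in these notes; it is used only because its sub-problems overlap heavily, and the name `fib` and the `memo` parameter are choices made for the example.

```python
def fib(n, memo=None):
    """Return the n-th Fibonacci number, with fib(0) == 0 and fib(1) == 1."""
    if memo is None:
        memo = {}  # lookup table: n -> fib(n)
    if n < 2:
        return n  # base case: smallest computation
    if n not in memo:
        # solve each sub-problem once and store the result
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]


print(fib(30))
```

Without the table the same sub-problems are recomputed exponentially many times; with it, each value from 2 to `n` is computed once.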

## Traveling Salesman (TSP)

<ul>
    <li><i>NP-Hard</i> problem (NP stands for "nondeterministic polynomial time", not "non-polynomial"); no polynomial-time general solution is known, actively researched</li>
    <li><i>Exact Algorithms</i> are not known to run in polynomial time, but will get the correct answer</li>
    <li><i>Approximation Algorithms</i> don't always find the optimal solution, but usually get close to it and often run in polynomial time</li>
</ul>
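
The contrast between the two families can be sketched with one example of each: a brute-force exact search over all orderings, and the nearest-neighbor heuristic. Both function names and the distance-matrix input are assumptions for the example, and nearest neighbor is only one of many heuristics, with no guarantee on how close it gets.

```python
from itertools import permutations


def tour_length(tour, dist):
    """Total length of a round trip that returns to the first city."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def exact_tsp(dist):
    """Try every ordering of the cities: always optimal, but O(n!)."""
    cities = range(1, len(dist))
    best = min(permutations(cities),
               key=lambda rest: tour_length([0] + list(rest), dist))
    return [0] + list(best)


def nearest_neighbor_tsp(dist):
    """Always visit the closest unvisited city: O(n^2), not always optimal."""
    tour = [0]
    unvisited = set(range(1, len(dist)))
    while unvisited:
        closest = min(unvisited, key=lambda city: dist[tour[-1]][city])
        tour.append(closest)
        unvisited.remove(closest)
    return tour


# dist[i][j] is the distance between city i and city j (made-up numbers)
dist = [[0, 10, 15, 20],
        [10, 0, 35, 25],
        [15, 35, 0, 30],
        [20, 25, 30, 0]]
print(tour_length(exact_tsp(dist), dist))
print(tour_length(nearest_neighbor_tsp(dist), dist))
```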

# Technical Interviewing Techniques


#### Clarifying the Question

<ul>
    <li>avoid diving head-first into the wrong solution and wasting time</li>
</ul>

#### Confirming Inputs

<ul>
    <li>confirm and specify the inputs and return value(s)</li>
</ul>

#### Test Cases

<ul>
    <li>account for edge cases: null or empty values, etc.</li>
</ul>

#### Brainstorming

<ul>
    <li>open brainstorming to show thought process, elicit (potentially helpful) interviewer feedback</li>
    <li>generalizing problem to data structures or algorithms can help</li>
</ul>

#### Runtime Analysis

<ul>
    <li>show understanding of runtime performance</li>
</ul>

#### Coding

<ul>
    <li>explain out loud the (higher-level) purpose of the code being written</li>
</ul>

#### Debugging

<ul>
    <li>check over code after writing</li>
</ul>