# 1.1 Is Unique

Implement an algorithm to determine if a string has all unique characters. What if we cannot use additional data structures?

## Answer - Hash Table / Dictionary

* uniqueness checking = use a hash table
* hash each character. if it's already in the hash table then we've seen it before, so return false. if the scan finishes without a repeat, return true
* runtime:
  * hash a character: O(1)
  * n characters -> O(n)
  * do this once on a string

In [1]:
from collections import defaultdict

def is_unique(string):
    d = defaultdict(int)
    for character in string:
        index = hash(character)
        if d[index]:
            return False
        
        d[index] = 1
    return True

# tests
assert(is_unique('a'))
assert(is_unique(''))
assert(is_unique('abc'))
assert(not is_unique('aa'))
assert(not is_unique('hotelmotelholidayinn'))

## Can we do better?

*Time*: No. At the very least each character needs to be visited.

*Memory*: Maybe.

## Book Answer

* Uses a fixed length array O(n)
* Also uses a bit vector of length 
* No additional data structures:
  * O(n^2) --- compare everything with everything after it
  * O(n log(n)) + O(n) --- sort and then compare neighbors
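
The two approaches without an extra hash table can be sketched as follows. The function names are illustrative and not from the book. The sort-based version uses `sorted()`, which builds a new list, so it only avoids extra memory if the characters can instead be sorted in place.

```python
def is_unique_quadratic(string):
    # compare every character with every character after it: O(n^2) time
    n = len(string)
    for i in range(n):
        for j in range(i + 1, n):
            if string[i] == string[j]:
                return False
    return True


def is_unique_sorted(string):
    # after sorting, any duplicate must sit next to its twin:
    # O(n log n) to sort plus an O(n) scan of neighbors
    chars = sorted(string)  # note: allocates a new list of n characters
    for k in range(1, len(chars)):
        if chars[k - 1] == chars[k]:
            return False
    return True
```

Both return the same answers as the hash-table version; they trade running time for not needing a dictionary.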
  
### Side Note: Python's Sorting Algorithm

Python uses [timsort](https://en.wikipedia.org/wiki/Timsort)
* Time:
  * $O(n)$ best
  * $O(n \log(n))$ average
  * $O(n \log(n))$ worst
* Memory: $O(n)$

## 1.2 Check Permutation

Given two strings, write a method to decide if one is a permutation of the other.

## Solution

Can also be solved by a hash table: keys = h(characters), values = #occurrences in `s1`. After processing `s1` you next process the characters `c` in `s2`.
* if `c` is not in the hash table: return False
* if `c` is in the hash table:
  * if `d[c] == 0`: return False
  * else: `d[c] -= 1`

**Complexity**
* Time: $2n = O(n)$
  * n hash stores ($O(1)$ each) in `s1`, n hashes in `s2`
* Memory: $n = O(n)$
  * best case: $1 = O(1)$ (both strings contain only one letter)
  * worst case: $n = O(n)$. ($n$ different letters)

In [2]:
from collections import defaultdict

def is_permutation(s1, s2):
    # first check if s1 and s2 are same length
    if len(s1) != len(s2):
        return False
    
    d = defaultdict(int)
    
    # store s1
    for c in s1:
        d[c] += 1
        
    # compare with s2
    for c in s2:
        if not d[c]:
            return False
        else:
            d[c] -= 1
    return True

# tests
assert(is_permutation('', ''))
assert(is_permutation('a', 'a'))
assert(is_permutation('ab', 'ba'))
assert(is_permutation('abc', 'cab'))
assert(not is_permutation('abc', 'caa'))
assert(not is_permutation('abc', ''))
assert(not is_permutation('', 'abc'))


### Can we reduce constant factor from subtraction?

We can use a null stack and pop.

* Memory: $O(n)$
  * worst case: $n = O(n)$ (even if there is only one letter you need $n$ list elements to track the occurrences)
  * best case: $n$
  
However, the memory operations to create a stack node may take longer than integer arithmetic. (1 cycle for addition)

In [3]:
from collections import defaultdict

def is_permutation(s1, s2):
    # first check if s1 and s2 are same length
    if len(s1) != len(s2):
        return False
    
    d = defaultdict(list)
    
    # store s1
    for c in s1:
        d[c].append(None)
        
    # compare with s2
    for c in s2:
        if not d[c]:
            return False
        else:
            d[c].pop()
    return True

# tests
assert(is_permutation('', ''))
assert(is_permutation('a', 'a'))
assert(is_permutation('ab', 'ba'))
assert(is_permutation('abc', 'cab'))
assert(not is_permutation('abc', 'caa'))
assert(not is_permutation('abc', ''))
assert(not is_permutation('', 'abc'))



# 1.9 String Rotation

Assume you have a method `isSubstring` which checks if one word is a substring of another. Given two strings, `s1` and `s2`, write code to check if `s2` is a rotation of `s1` using only one call to `isSubstring` (e.g. `waterbottle` is a rotation of `erbottlewat`).

## Attempt

We might first assume that we can take, say, the first four letters of `waterbottle` and check if it's a substring of `s2`. However, this wouldn't work since `wate` is not a substring of `s2`.

Consider the first character of `s1` and find all the positions in `s2` where the character occurs. Store indices in queue.
* $O(n)$ operation

Now take the second character of `s1`. Dequeue an index `i` and check if the character of `s2` at index `(i+1) % n` is equal to it. If so, enqueue `(i+1) % n`. Otherwise, discard.
* $O(n)$ operation

Problem: we need to keep track of old and new parts. Use two queues?

Repeat: need to do this $n$ times so this is an $O(n^2)$ solution.

In [4]:
def is_rotation(s1, s2):
    n = len(s1)
    if n != len(s2):
        return False
    
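    # two empty strings are trivially rotations of each other
    if n == 0:
        return True
    
    # get the indices in s2 matching s1[0]
    current = [i for i in range(n) if s1[0] == s2[i]]
    for k in range(1,n):
        # advance each surviving index by one (wrapping around)
        # and keep it only if s2 there matches s1[k]
        current = [(i+1) % n for i in current if s1[k] == s2[(i+1) % n]]
        
        if current == []:
            return False
        
    # for n == 1 the loop never runs, so check the initial matches
    return current != []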
    
    
# tests
assert(is_rotation('abc', 'abc'))
assert(is_rotation('abc', 'cab'))
assert(is_rotation('waterbottle', 'erbottlewat'))
assert(is_rotation('aaabbbccc', 'aabbbccca'))

assert(not is_rotation('abc', 'acb'))

def shift_check(s1):
    n = len(s1)
    for k in range(n):
        s2 = s1[k:] + s1[:k]
        assert(is_rotation(s1,s2))
        
shift_check('banana')
shift_check('abracadabra')
shift_check('puppymonkeybaby')
shift_check('aaaaaaaaaaaaah')

The worst case is when the string consists of `n` equal characters.

## Anything Faster?

Hint: try concatenating `s2` with itself.

## Solution

If `s1` splits into two parts `x` and `y` such that `xy = s1` and `yx = s2` then `s2` is a rotation of `s1`. Moreover, `yx` is always a substring of `xyxy = s1s1`. So, provided the two strings have the same length, the solution is to do

`is_substring(s1s1, s2)`
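
A minimal sketch of this idea, assuming Python's `in` operator stands in for the `isSubstring` method given by the problem. The name `is_rotation_concat` is made up here to avoid clashing with the earlier `is_rotation`.

```python
def is_substring(big, small):
    # stand-in for the `isSubstring` method assumed by the problem
    return small in big


def is_rotation_concat(s1, s2):
    # a rotation must have the same length; without this check any
    # short substring of s1s1 (for example 'wat') would be accepted
    if len(s1) != len(s2):
        return False
    # the single permitted call to is_substring
    return is_substring(s1 + s1, s2)
```

The length check is what makes the substring test sufficient: a substring of `s1s1` with length `len(s1)` has to be of the form `yx`.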

# Random Problem - p. 67

Given an array of distinct integer values, count the number of pairs of integers that have difference `k`. For example, given `{1, 7, 5, 9, 2, 12, 3}` and `k = 2` there are four pairs with difference 2: `(1,3), (3,5), (5,7), (7,9)`.

# My Solution

Easier to determine differences after sorting: check current and next. 
* sorting: $O(n \log n)$
* checking: single forward scan $O(n)$
* ==> $O(n \log n)$

Is there an approach faster than $O(n \log n)$?

`a - b = k iff h(a-b) = h(k) iff h(a) == h(k+b)`

One approach is to take the array, `arr`, and the shifted array `arr + k`. Hash the elements of `arr` and then hash the elements of `arr + k`. If there is a collision then we found a pair.

Does it matter that `a > b`?

`a - b < 0 ==> b - a > 0 ==> b - a == k iff b == k + a iff h(b) = h(k+a)`

but we end up computing h(b) anyway from `arr` and `h(k+a)` from `arr + k`. So we shouldn't have to check for which is greater or not.

**Complexity**

* Total 2n hashes computed: Time = $O(n)$
* Only n hashes stored: Mem = $O(n)$

In [5]:
from collections import defaultdict

def f(arr, k):
    d = defaultdict(int)
    num_pairs = 0
    
    for a in arr:
        d[hash(a)] = 1
        
    for b in arr:
        if d[hash(b+k)]:
            num_pairs += 1
            
    return num_pairs

arr = [1, 7, 5, 9, 2, 12, 3]
print f(arr, 2)

4


# Random Problem - p. 70

Given a smaller string `s` and a bigger string `b` design an algorithm to find all permutations of the shorter string within the longer one. Print the location of each permutation.

## My Solution

The wrong way: create a list of all permutations of `s` and perform a linear search on `b` for each one. $O(|b|\;|s|!)$

Create a hash table with `character of s` |-> `number of occurrences`. Then, starting at each position of `b`, scan up to `|s|` characters and decrement the corresponding number of occurrences:
* if a character is not in the table (a key error), break; no window containing that character can match, so the scan can resume one past it
* if a character's count is already zero, the window fails: move the start forward by one and repeat
* otherwise, if we reach `|s|` characters and every count is now zero, add the start index to the list

example

`s = cat`
`b = tacocat`

output

`0,4`

**Analysis**

* Creation of `s` hash table: $O(s)$
* Scanning each `|s|` block of text in `b`: $O(s)$
* Number of substrings we need to check: $O(b-s)$
* Total: $O(s(b-s)) = O(sb)$
* Skipping trick (constant factor)

In [6]:
from collections import defaultdict

def all_permutations(s, b):
    d = defaultdict(int)
    
    # build the s-character dictionary
    for char in s:
        d[char] += 1
        
    # start scanning
    indices = []
    n = len(b) - len(s) + 1
    s = len(s)
    for i in range(n):
        # we want a copy since Python assigns dictionaries by reference
        d_pass = d.copy()
        
        is_substring = True
        for j in range(s):
            char = b[i+j]
            if d_pass[char] > 0:
                d_pass[char] -= 1
            else:
                is_substring = False
                break
                
        if is_substring:
            indices.append(i)
            
    return indices
            
# tests

assert(all_permutations('a', 'a') == [0])
assert(all_permutations('a', 'aa') == [0,1])
assert(all_permutations('a', 'aaa') == [0,1,2])
assert(all_permutations('a', 'aba') == [0,2])
assert(all_permutations('ab', 'abcba') == [0,3])
assert(all_permutations('bc', 'abcba') == [1,2])

assert(all_permutations('cat', 'tacocat') == [0,4])



## Alternate Approach

I'm thinking we might be able to get this down to $O(b-s) + O(s)$ by using a "scanner approach".

`s = bat`

`b = otbafabstab`

`otbafabstab` should output `[1, 8]`

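* initialize on the first window of length `|s|`: count the number of matching letters
* slide the window one position at a time: one letter enters, one letter leaves

At first glance this is still $O(s(b-s))$ time if every window is recounted from scratch. If the counts are instead updated only for the entering and the leaving letter, each slide costs $O(1)$ and the total is $O(b + s)$.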
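
A sliding-window sketch that avoids rescanning each window, under the assumption that the letter counts are updated incrementally as the window moves. The name `all_permutations_sliding` and the `matched` bookkeeping are illustrative choices, not part of the original solution.

```python
from collections import Counter


def all_permutations_sliding(s, b):
    k = len(s)
    if k == 0 or k > len(b):
        return []

    need = Counter(s)        # letter counts required by s
    window = Counter(b[:k])  # letter counts of the current window
    # number of distinct letters of s whose window count is exactly right
    matched = sum(1 for c in need if window[c] == need[c])

    indices = []
    if matched == len(need):
        indices.append(0)

    for i in range(k, len(b)):
        incoming, outgoing = b[i], b[i - k]

        # letter entering the window on the right
        if incoming in need:
            if window[incoming] == need[incoming]:
                matched -= 1
            window[incoming] += 1
            if window[incoming] == need[incoming]:
                matched += 1

        # letter leaving the window on the left
        if outgoing in need:
            if window[outgoing] == need[outgoing]:
                matched -= 1
            window[outgoing] -= 1
            if window[outgoing] == need[outgoing]:
                matched += 1

        if matched == len(need):
            indices.append(i - k + 1)

    return indices
```

Because the window always has length `|s|`, having every letter of `s` at its exact count leaves no room for other letters, so the window is a permutation. Each slide does a constant amount of work, which gives roughly $O(b + s)$ time overall.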

# EPI - Problem 6.8

Design an algorithm that takes a sequence of n three-dimensional coordinates to be traversed and returns the minimum battery capacity needed to complete the journey. The robot begins with the battery fully charged.

## My Solution

`pi = (xi,yi,zi)`

Note that the x,y coordinates don't matter since recharging is a function of potential energy, not slope. Therefore, the problem can be rephrased as

> "Given `[z1, ..., zn]` heights the robot spends / accumulates energy by travelling from `zi` to `zj` which is proportional to `zi - zj`. (If `zj > zi` then the robot spends energy and vice versa.) Find a sequence of `zi`'s that minimizes the amount of energy spent."

By visiting `z_i1`, `z_i2`, ..., `z_in` the robot's net energy is `(z_i1 - z_i2) + (z_i2 - z_i3) + ... + (z_i(n-1) - z_in) = (z_i1 - z_in)`. Therefore, **for a given sequence, `z`,** the difference of the first and last elements of the sequence is the amount of energy spent.

...

This problem statement is a confusing version of the max-difference problem:

> Given stock prices `s[i]` find the indices `i < j` such that `s[j] - s[i]` is maximized. Return the maximized value. (Buy low and sell high.)

**Brute force:** Compute all pairwise differences.

In [7]:
def max_difference(s):
    value = 0
    n = len(s)
    for i in range(n):
        for j in range(i,n):
            diff = s[j] - s[i]
            if diff > value:
                value = diff
                
    return value
                
s = [3, 6, 2, 4, 10, 8, 1]
assert(max_difference(s) == 8)

**Better approach:** Observe that on a given day `j` the max profit can be obtained by having bought on the minimum of the previous days `i`. Time $O(n)$. Space $O(1)$.

In [8]:
def max_difference(s):
    # need at least two elements to compare
    n = len(s)
    assert(n > 1)
    
    max_diff = 0
    current_diff = 0
    min_price = s[0]
    for i in range(1,n):
        current_diff = s[i] - min_price
        if (s[i] - min_price) > max_diff:
            max_diff = current_diff
        elif s[i] < min_price:
            min_price = s[i]
            
    return max_diff
            
s = [3, 6, 2, 4, 10, 8, 1]
assert(max_difference(s) == 8)

# Generalization of Max Profit Problem

Original problem:

> Given a time series of prices, `s`, find the maximum difference `s[j] - s[i]` where `i < j`. (Buy at `i` sell at `j`.)

Next step generalization:

> Given a time series of stock prices, `s`, assume you can buy and sell twice but you must sell first before you can buy again. What is the maximum profit that can be made?

## My Solution

The max difference code above can be used to construct an $O(n^2)$ solution. Let `max_difference_bounded(s, i, j)` be the max difference occurring in the half-open range starting at `s[i]` and ending just before `s[j]`. The strategy, then, is for each `i` to split the array into two arrays `s[0..i]` and `s[i..n]` (both include `s[i]`, so a sale and the next purchase may fall on the same day) and compute the sum of the max differences on each part.

Example:

`[2, 3, 10, 6, 4, 8, 1] -> (10-2 = 8) + (8-4 = 4) = 12`

Example:

`[7, 9, 5, 6, 3, 2] -> (9-7 = 2) + (6-5 = 1) = 3`

In [9]:
def max_difference_bounded(s, i, j):
    return max_difference(s[i:j])

def max_difference_two_buys(s):
    n = len(s)
    assert(n > 2)
    
    max_diff = 0
    for i in range(1,n-1):
        diff = max_difference_bounded(s, 0, i+1) + max_difference_bounded(s, i, n)
        if diff > max_diff:
            max_diff = diff
            
    return max_diff

l = [2, 3, 10, 6, 4, 8, 1]
assert(max_difference_two_buys(l) == 12)

l = [7, 9, 5, 6, 3, 2]
assert(max_difference_two_buys(l) == 3)

**Greedy Approach**

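Here's a possible solution in $3n = O(n)$ time: rewrite the `max_difference` algorithm to also return the indices `i,j` where `s[j] - s[i]` is maximized. The second trade must then lie entirely before `i` or entirely after `j`, so compute the max difference in `[0,i-1]` and in `[j+1,n]`, and add the larger of the two to the first trade's profit.

Caveat: this greedy choice is not always optimal, because the best pair of trades need not include the single best trade. For `[1, 5, 2, 8]` the best single trade is `8-1 = 7` and nothing remains on either side, while two trades give `(5-1) + (8-2) = 10`.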

In [10]:
def max_difference(s):
    # need at least two elements to compare
    n = len(s)
    if n < 2:
        return 0, 0, 0
    
    max_diff = 0
    min_price = s[0]
    min_index = 0
    buy_index = 0
    sell_index = 0
    for i in range(1,n):
        diff = s[i] - min_price
        if (diff > max_diff):
            # new best trade: buy at the running minimum, sell today
            max_diff = diff
            buy_index = min_index
            sell_index = i            
        elif (s[i] < min_price):
            # remember where the running minimum sits; it only becomes
            # the buy index if a later day beats the current best trade
            min_price = s[i]
            min_index = i
            
    return max_diff, buy_index, sell_index

# test new max_difference
l = [2, 3, 10, 6, 4, 8, 1]
diff, i, j = max_difference(l)
assert(diff == 8)
assert(i == 0)
assert(j == 2)

l = [7, 9, 5, 6, 3, 2]
diff, i, j = max_difference(l)
assert(diff == 2)
assert(i == 0)
assert(j == 1)

l = [8, 6, 2, 3, 10, 6, 4, 1, 3]
diff, i, j = max_difference(l)
assert(diff == 8)
assert(i == 2)
assert(j == 4)

def max_difference_two_buys(s):
    n = len(s)
    max_diff, i, j = max_difference(s)
    
    if i > 0:
        other_diff_1, _, _ = max_difference(s[:i])
    else:
        other_diff_1 = 0
        
    if j < n:
        other_diff_2, _, _ = max_difference(s[(j+1):])
    else:
        other_diff_2 = 0
    
    max_diff += max(other_diff_1, other_diff_2)
    return max_diff

l = [2, 3, 10, 6, 4, 8, 1]
assert(max_difference_two_buys(l) == 12)

l = [7, 9, 5, 6, 3, 2]
assert(max_difference_two_buys(l) == 3)

l = [8, 2, 3, 10, 6, 4, 1, 3]
assert(max_difference_two_buys(l) == 10)
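
For comparison, here is a sketch of a two-pass approach that stays $O(n)$ in time and does not rely on the greedy choice. It is offered as one standard way to handle the two-trade case rather than as the book's solution. The name `max_profit_two_trades` and the two helper arrays are illustrative, and it uses $O(n)$ extra memory.

```python
def max_profit_two_trades(s):
    n = len(s)
    if n < 2:
        return 0

    # best_before[i]: best single trade that sells on or before day i
    best_before = [0] * n
    min_price = s[0]
    for i in range(1, n):
        min_price = min(min_price, s[i])
        best_before[i] = max(best_before[i - 1], s[i] - min_price)

    # best_after[i]: best single trade that buys on or after day i
    best_after = [0] * n
    max_price = s[n - 1]
    for i in range(n - 2, -1, -1):
        max_price = max(max_price, s[i])
        best_after[i] = max(best_after[i + 1], max_price - s[i])

    # try every split day: first trade ends by day i, second starts at day i
    return max(best_before[i] + best_after[i] for i in range(n))
```

It agrees with the examples above and also handles inputs such as `[1, 5, 2, 8]`, where the two best trades do not contain the single best trade.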

# Cracking  - Problem 16.2 p. 181

Design a method to find the frequency of occurrences of any given word in a book. What if we were running this algorithm multiple times?

## My Solution

Implementation of `find_word(word, text)` is `O(kn)` where `k = len(word)` and `n = len(text)`:
* search for first letter of `word`
* when found, check if next letter is second letter of `word`
* at worst, `n` checks of `k` letters
* modify to return index of occurrence `idx` in text and rerun on `text[idx:]`
* compute `cumsum` of result array


In [11]:
def find_word(word, text):
    k = len(word)
    n = len(text)

    # try every starting position that leaves room for the whole word
    for start in range(n - k + 1):
        word_index = 0
        # move forward in both text and word while they agree
        while (word_index < k) and (text[start + word_index] == word[word_index]):
            word_index += 1
                
        # if the inner loop ran to completion then we found 
        # the location of the word
        if word_index == k:
            return start
        
    # "not found" is signalled by an index past the end of the text
    return n+1

def find_words(word, text):
    n = len(text)
    locations = []
    index = 0
    while (index < n):
        # search for the next occurrence of the word
        # at the current point
        offset = find_word(word, text[index:])
        if offset > n - index:
            # no further occurrence in the remaining text
            break
        index += offset
        locations.append(index)
        
        # step past this occurrence to begin searching 
        # for the next one
        index += 1
        
    return locations

word = 'cat'
text = 'the cat in the hat cat'

index = find_word(word, text)
print index
index = find_word(word, text[(index)+1:])
print index

4
14


In [12]:
print find_words(word, text)
print len(text)


[4, 19]
22


### Time Taken: 19 mins

Final comments: use a hash table to store previous searches. (Maybe up to a certain size.)

## Book Solution / Comments

One approach is to hash the entire book with words as keys and frequencies as values. This is easily done if we assume that each word is separated by a space.

In [13]:
from collections import defaultdict

def convert_to_table(text):
    words = map(lambda s: s.lower(), text.split(' '))
    d = defaultdict(int)
    for string in words:
        d[string] += 1

    return d

def frequency(word, table):
    word = word.lower()
    return table[word]


word = 'cat'
text = 'the cat in the hat cat'
table = convert_to_table(text)

print frequency(word, table)

2


# Cracking  - Problem 16.3 p. 181

Design an algorithm to figure out if someone has won a game of tic-tac-toe.

## My Solution

How do we represent a tic-tac-toe-grid?
* matrix:
  * `x` = `1`
  * `o` = `-1`
  * not filled = 0
  
`x + x + x == 3` and `o + o + o == -3`

If a row, column, or diag sum is equal to three then `x` wins. If equal to -3 then `o` wins.
Search all rows, columns, and the two diags. If `n x n` tic-tac-toe board then `n` rows, `n` cols, and `2` diags.

In [14]:
def sum_winner(value, n):
    if value == n:
        return 1
    if value == -n:
        return -1
    return 0

def winner(grid):
    n = len(grid)
    
    # check rows
    for row_index in range(n):
        result = sum_winner(sum(grid[row_index]), n)
        if result != 0:
            return result
        
    # check cols
    for col_index in range(n):
        value = sum(grid[k][col_index] for k in range(n))
        result = sum_winner(value, n)
        if result != 0:
            return result
        
    # check diags
    diag = sum(grid[k][k] for k in range(n))
    antidiag = sum(grid[k][n-k-1] for k in range(n))
    if (diag == n) or (antidiag == n):
        return 1
    if (diag == -n) or (antidiag == -n):
        return -1
        
    return 0

grid = [
    [1,0,0],
    [1,0,0],
    [1,0,0]
]
assert(winner(grid) == 1)

grid = [
    [1,1,1],
    [0,0,0],
    [0,0,0]
]
assert(winner(grid) == 1)

grid = [
    [1,1,1],
    [0,-1,0],
    [0,0,-1]
]
assert(winner(grid) == 1)

grid = [
    [-1,1,1],
    [0,-1,0],
    [0,0,-1]
]
assert(winner(grid) == -1)


grid = [
    [-1,1,1,1],
    [0,-1,0,1],
    [0,1,-1,0],
    [0,0,1,-1],
]
assert(winner(grid) == -1)

## Time: 11 mins

Some observations:
* no checking if there are multiple winners

Shouldn't be too bad to solve.

# Cracking - 17.10 p.187

A majority element is an element that makes up more than half of the items in an array. Given an array of positive integers, find the majority element. If there is no majority element then return `-1`. Do this in `O(n)` time and `O(1)` space.

### My Solution

Can do this in `O(n)` time `O(n)` space with a hash table: key = element, value = frequency.

* Idea #1: Given up to position `k` keep a hash table of the most frequent element visited thus far. (Bad idea in the event that all elements are unique, overwriting, etc.)


examples:

`[1,0,0,0,0,1,1,1,1,1,1,1] - len(12)`

* the moment a count `> n/2` we can stop
* actually, the binary case is a good starting point: either `0` or `1` will be majority
* trinary case: if `2` is majority then `#0's` and `#1's < n/2`

other approach: subdivide

problem:

```
[1,1 | 1,0,0]

 1:2    0:2

* majority left:  O(n/2)
* majority right: O(n/2)
* occurrence of left in right, right in left: O(?) --> is there an invariant?


If x is the majority then it is always the majority of some subarray
```

In [15]:
from collections import defaultdict

def majority(array):
    # store frequencies O(n) time O(n) space
    d = defaultdict(int)
    for a in array:
        d[a] += 1
        
    # O(n) scan of array to retrieve highest frequency
    maxvalue = 0
    number = None
    for a in array:
        value = d[a]
        if value >= maxvalue:
            maxvalue = value
            number = a
            
    return number

array = [1,2,5,9,5,9,5,5,5]
print majority(array)

5


```
[1 2 5 9 5 9 5 5 5]

[1 2 5 9] [5 9 5 5 5]

[1 2] [5 9] | [5 9] [5 5 5]

[1] [2] . [5] [9] | [5] [9] . [5] [5 5]
   2:1      9:1        9:1        5:3
       2:1                  5:3
                
[1]  [2] -> 1:1  2:2
[5]  [9] -> 5:1  9:1  \
[5]  [5] -> 5:2       / 5:3 9:1

So is there a special case in a tie?
```

In [16]:
def majority_fast(array):
    subarray_length = 0
    current_majority = array[0]
    current_majority_count = 1
    for a in array[1:]:
        subarray_length += 1
        if a == current_majority:
            current_majority_count += 1
        
        if current_majority_count <= subarray_length/2:
            current_majority = a
            subarray_length = 0
            
    return current_majority

array = [1,2,5,9,5,9,5,5,5]
print majority_fast(array)

5


In [17]:
import numpy
array = numpy.random.randint(0,4,10)
print array


[3 2 2 1 3 2 1 1 1 1]


In [18]:
array[-1] = 2
print array

[3 2 2 1 3 2 1 1 1 2]


In [19]:
print majority(array)
print majority_fast(array)

2
1
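
The two functions disagree on this array, which in fact has no majority element (`1` and `2` each occur 4 times out of 10). As a point of comparison, here is a sketch of the Boyer-Moore majority vote, a standard technique that meets the `O(n)` time and `O(1)` space requirement. It is not necessarily the book's solution, and the name `majority_vote` is illustrative.

```python
def majority_vote(array):
    # first pass: pair off differing elements; only a true majority
    # element is guaranteed to survive as the candidate
    candidate = None
    count = 0
    for a in array:
        if count == 0:
            candidate = a
            count = 1
        elif a == candidate:
            count += 1
        else:
            count -= 1

    # second pass: the candidate may be a false positive, so verify it
    occurrences = sum(1 for a in array if a == candidate)
    if 2 * occurrences > len(array):
        return candidate
    return -1
```

The second pass is what allows the function to return `-1` when no element makes up more than half of the array.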


# Cracking - 4.2 - Minimal Tree

Given a sorted array with unique integer elements write an algorithm to create a binary search tree with *minimal height*.

### My Solution

Naive implementation: standard binary tree creation (insert the elements one at a time).

But this is not balanced. Because the input is sorted, every insertion descends to the right, so the tree reduces to the linked list case!

Solution: midpoint and recurse

In [20]:
class Node(object):
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        
    def add(self, value):
        if value < self.value:
            if self.left:
                self.left.add(value)
            else:
                self.left = Node(value)
        else:
            if self.right:
                self.right.add(value)
            else:
                self.right = Node(value)
            
    def dfs(self):
        result = [self.value]
        if self.left:
            result.extend(self.left.dfs())
        if self.right:
            result.extend(self.right.dfs())
        return result
                
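def sorted_tree(array, left=0, right=None):
    # builds a subtree from the half-open range array[left:right]
    if right is None:
        right = len(array)
        
    # empty range: no subtree
    if left >= right:
        return None
    
    # the midpoint becomes the root so that the two halves
    # differ in size by at most one element
    mid = (left + right)//2
    tree = Node(array[mid])
    tree.left = sorted_tree(array, left, mid)
    tree.right = sorted_tree(array, mid+1, right)
        
    return tree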

In [21]:
import random
array = [random.randint(-10, 10) for _ in range(10)]
print array
         

[8, 3, 9, -5, 0, -6, 1, -1, 7, 6]


In [22]:
array[6] = -3
array.sort()
print array

[-6, -5, -3, -1, 0, 3, 6, 7, 8, 9]
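
To check the midpoint-and-recurse idea on this sorted array, here is a self-contained sketch. `TreeNode`, `minimal_tree`, `tree_height` and `in_order` are illustrative names introduced here so the block does not depend on the `Node` class above.

```python
class TreeNode(object):
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def minimal_tree(array, left=0, right=None):
    # build a subtree from the half-open range array[left:right]
    if right is None:
        right = len(array)
    if left >= right:
        return None
    mid = (left + right) // 2
    node = TreeNode(array[mid])
    node.left = minimal_tree(array, left, mid)
    node.right = minimal_tree(array, mid + 1, right)
    return node


def tree_height(node):
    # number of nodes on the longest root-to-leaf path
    if node is None:
        return 0
    return 1 + max(tree_height(node.left), tree_height(node.right))


def in_order(node):
    # an in-order walk of a binary search tree yields sorted values
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


array = [-6, -5, -3, -1, 0, 3, 6, 7, 8, 9]
root = minimal_tree(array)
```

With 10 elements a binary tree needs at least 4 levels, so a height of 4 (counted in nodes) is the minimum; the in-order walk returning the original array confirms the search-tree ordering.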


# Cracking 17.17

In [24]:
class Trie(object):
    def __init__(self, string=None):
        self.root = Node()
        self.root.insert(string, 0)
        
    def __getitem__(self, key):
        return self.root.children[key]
        
    def insert(self, string, location):
        self.root.insert(string, location)
        
    def search(self, string):
        return self.root.search(string)
        
    def dfs(self):
        self.root.dfs()
        
class Node(object):
    # each node is indexed by a character and its values are the indices
    # at which the string occurs in a given string b
    def __init__(self):
        self.children = {}
        self.indices = []
        
    def __repr__(self):
        return str(self.children)
    
    def __getitem__(self, key):
        return self.children[key]
        
    def insert(self, string, index):
        self.indices.append(index)
        
        if string:
            key = string[0]
            
            if key in self.children:
                child = self.children[key]
            else:
                child = Node()
                self.children[key] = child
                
            # handle remainder
            suffix = string[1:]
            child.insert(suffix, index+1)
        else:
            self.children['\x00'] = None # terminating character
            
    def search(self, string):
        if not string:
            return self.indices
        
        first = string[0]
        if first in self.children:
            suffix = string[1:]
            return self.children[first].search(suffix)
        
        return []
        
    def is_terminal(self):
        return '\x00' in self.children
    
    
    def dfs(self):
        keys = self.children.keys()
        for key in keys:
            print key
            node = self.children[key]
            if node:
                node.dfs()
                
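def create_trie_from_string_list(T):
    trie = Trie()
    for s in T:
        trie.insert(s,0)
    return trie

def find_strings_at_location(node, b, i):
    # walk the trie along b[i:], collecting every string of T
    # that ends somewhere on the way
    strings = []
    j = i
    while j < len(b) and b[j] in node.children:
        node = node.children[b[j]]
        j += 1
        if node.is_terminal():
            strings.append(b[i:j])
    return strings
            
def search_all(b, T):
    lookup = create_trie_from_string_list(T)
    n = len(b)
    
    for i in range(n):
        strings = find_strings_at_location(lookup.root, b, i)
        print strings, i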

In [25]:
T = ['is', 'ppi', 'hi', 'sis', 'i', 'ssippi']
b = 'mississippi'

trie = Trie()
#for s in T:
#    trie.insert(s,0)
    
#trie.root.children
trie.insert(b, 0)

In [26]:
trie.dfs()

 
m
i
s
s
i
s
s
i
p
p
i
 


In [27]:
trie.search('s')

[]
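
The search for `'s'` comes back empty because only the whole string `b` was inserted, so the trie can only match prefixes of `b`. One way around this, sketched here with plain nested dictionaries instead of the `Trie` and `Node` classes above, is to insert every suffix of `b`. The function names are illustrative, and the sketch uses $O(|b|^2)$ memory in the worst case.

```python
def build_suffix_trie(b):
    # nested dicts keyed by character; the key None holds the start
    # positions of every suffix that passes through that node
    root = {}
    for start in range(len(b)):
        node = root
        for char in b[start:]:
            node = node.setdefault(char, {})
            node.setdefault(None, []).append(start)
    return root


def search_trie(root, string):
    # follow the characters of `string`; give up at the first miss
    node = root
    for char in string:
        if char not in node:
            return []
        node = node[char]
    return node.get(None, [])


def search_all_sketch(b, T):
    # map each small string to the positions in b where it starts
    root = build_suffix_trie(b)
    return dict((t, search_trie(root, t)) for t in T)


T = ['is', 'ppi', 'hi', 'sis', 'i', 'ssippi']
b = 'mississippi'
result = search_all_sketch(b, T)
```

Each lookup then costs time proportional to the length of the small string, independent of the length of `b`.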

# Anagram Checker

An anagram of `cinema` is `iceman`.

In [28]:
from collections import defaultdict

def is_anagram(s1, s2):
    d = defaultdict(int)
    
    # make sure s1 and s2 are the same length
    if len(s1) != len(s2):
        return False
    
    # build letter count
    for c in s1:
        d[c] += 1
        
    # decrement letter count using s2. note that
    # because the lengths are the same, hitting a count
    # of zero means the letter doesn't exist in s1 or
    # occurs more often in s2 than in s1
    for c in s2:
        if d[c] == 0:
            return False
        else:
            d[c] -= 1
            
    return True
        

In [29]:
is_anagram('cinema', 'iceman')

True

# EPI - Linked List Problems

**8.1** Merge two sorted lists.

In [30]:
class Node(object):
    def __init__(self, value=None):
        self.value = value
        self.next = None
        
    def __repr__(self):
        s = str(self.value)
        if self.next:
            s += ',' + str(self.next)
        return s
        
    def insert(self, value):
        if self.next:
            self.next.insert(value)
        else:
            self.next = Node(value)
            
    def insert_node(self, node):
        if self.next:
            self.next.insert_node(node)
        else:
            self.next = node
            
            
L = Node(0)
L.insert(2)
L.insert(3)
L.insert(8)
L.insert(9)

R = Node(1)
R.insert(5)
R.insert(6)

print L
print R

0,2,3,8,9
1,5,6


In [31]:
def merge(L, R):       
    # T is the merged list, node is the end of T
    node = Node() # dummy head
    T = node
    while (L and R):
        print node
        # both L and R exist: compare
        if L.value < R.value:
            # current node points to L's head
            node.next = L
            
            # move 'node' to next 
            node = node.next
            
            # move the head of L
            L = L.next
        else:
            # current node points to R's head
            node.next = R
            
            # move 'node' to next 
            node = node.next
            
            # move the head of R
            R = R.next
            
    # if elements of L remain, add. otherwise, elements of R remain
    if L:
        node.next = L
    elif R:
        node.next = R
        
    # return the linked list without the dummy
    return T.next
          
T = merge(L,R)

print T

None
0,2,3,8,9
1,5,6
2,3,8,9
3,8,9
5,6
0,1,2,3,5,6,8,9


In [32]:
def reverse(L):
    # initialize the head of the reversed list to the head of L
    head = L
    rev = None
    
    while (head):
        new_head = head.next
        head.next = rev
        rev = head
        head = new_head
        
    return rev

L = Node(0)
L.insert(2)
L.insert(3)
L.insert(8)
L.insert(9)

R = Node(1)
R.insert(5)
R.insert(6)

print L
print reverse(L)

print
print R
print reverse(R)

0,2,3,8,9
9,8,3,2,0

1,5,6
6,5,1


**Problem 8.3** Write a function which takes a singly linked list L and two integers s and f as arguments and reverses the order of the nodes from the s-th node to the f-th node, inclusive.

Numbering begins at 1. Perform the reversal in a single pass. Do not allocate additional nodes.



In [33]:
def reverse_sublist(L, s, f):
    # scan to find the s-th node and the node just before it
    # (`before` stays None when s == 1)
    before = None
    sub_head = L
    index = 1
    while (index < s):
        before = sub_head
        sub_head = sub_head.next
        index += 1
        
    # `sub_head` is now the s-th node; after the reversal it
    # becomes the tail of the reversed sublist
    rev_sub_head = None
    rev_sub_tail = sub_head
    
    # build the reverse of nodes s..f (inclusive)
    while (index <= f) and sub_head:
        new_sub_head = sub_head.next
        sub_head.next = rev_sub_head
        rev_sub_head = sub_head
        sub_head = new_sub_head
        index += 1
        
    # make the end of the reversed sublist point to the
    # rest of the list
    rev_sub_tail.next = sub_head
    
    # splice the reversed sublist back in; if the reversal
    # started at the head, the reversed sublist is the new head
    if before is None:
        return rev_sub_head
    before.next = rev_sub_head
    return L


L = Node(0)
L.insert(2)
L.insert(3)
L.insert(8)
L.insert(9)
L.insert(14)
L.insert(20)
L.insert(43)
L.insert(100)


print L
print reverse_sublist(L,2,6)

0,2,3,8,9,14,20,43,100
0,14,9,8,3,2,20,43,100


In [34]:
L = Node(0)
L.insert(2)
L.insert(3)
L.insert(8)
L.insert(9)
L.insert(14)
L.insert(20)
L.insert(43)
L.insert(100)
L.insert_node(L.next.next.next)

r = Node(0)
r.insert(2)
r.insert(3)
r.insert(8)
r.insert(9)
r.insert(14)
r.insert(20)
r.insert(43)
r.insert(100)

In [35]:
def has_cycle_dict(L):
    node = L
    d = {}
    while node:
        if node in d:
            return True
        d[node] = True
        node = node.next
    return False

print has_cycle_dict(L)
print has_cycle_dict(r)

True
False


In [36]:
def has_cycle_slow(L):
    start = L
    node = L
    prev = L

    while (node):
        # scan all previous nodes
        prev = start
        while (prev != node):
            if prev == node.next:
                return True
            prev = prev.next
            
        node = node.next
        
    return False
            
print has_cycle_slow(L)
print has_cycle_slow(r)

True
False


In [37]:
def has_cycle(L):
    
    slow_iter = L
    fast_iter = L.next
    
    while 1:
        if slow_iter == fast_iter:
            return True
        
        if slow_iter is None:
            return False
        
        if fast_iter is None or fast_iter.next is None:
            return False
        
        slow_iter = slow_iter.next
        fast_iter = fast_iter.next.next
        
    return False

print has_cycle(L)
print has_cycle(r)

True
False


# Binary Trees

In [38]:
class Node(object):
    def __init__(self, value=None):
        self.value = value
        self.left = None
        self.right = None
        
    def insert(self, value):
        if value < self.value:
            if self.left:
                self.left.insert(value)
            else:
                self.left = Node(value)
        else:
            if self.right:
                self.right.insert(value)
            else:
                self.right = Node(value)
                

t = Node(10)
t.insert(0)
t.insert(20)
t.insert(5)
t.insert(-1)
t.insert(25)
t.insert(15)

In [39]:
def check_balanced(node):
    if node is None:
        return (True, -1)

    left_result = check_balanced(node.left)
    if not left_result[0]:
        # left subtree is not balanced
        return (False, 0)
    
    right_result = check_balanced(node.right)
    if not right_result[0]:
        return (False, 0)
    
    is_balanced = abs(left_result[1] - right_result[1]) <= 1
    height = max(left_result[1], right_result[1]) + 1
    return (is_balanced, height)

def is_balanced(tree):
    return check_balanced(tree)[0]


t = Node(10)
t.insert(0)
t.insert(20)
t.insert(5)
t.insert(-1)
t.insert(25)
t.insert(15)
t.insert(30)
t.insert(40)
t.insert(50)

check_balanced(t)

(False, 0)

## ECS 10.3

Write a function that checks whether a binary tree is symmetric.

In [40]:
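def is_symmetric(subtreeleft, subtreeright):
    # two empty subtrees mirror each other
    if (subtreeleft is None) and (subtreeright is None):
        return True
    
    if (subtreeleft and subtreeright):
        equal_data = (subtreeleft.value == subtreeright.value)
        # the outer children must mirror each other, and so must the inner ones
        equal_subtrees = (is_symmetric(subtreeleft.left, subtreeright.right) and
                          is_symmetric(subtreeleft.right, subtreeright.left))
        return equal_data and equal_subtrees
    
    # exactly one of the two subtrees is empty
    return False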

# EPI 10.4

Find the lowest common ancestor of two nodes in a tree.

In [41]:
class Node(object):
    def __init__(self, value=None):
        self.value = value
        self.parent = None
        self.left = None
        self.right = None
        
    def insert(self, value):
        if value < self.value:
            if self.left:
                self.left.insert(value)
            else:
                node = Node(value)
                self.left = node
                node.parent = self
        else:
            if self.right:
                self.right.insert(value)
            else:
                node = Node(value)
                self.right = node
                node.parent = self
                
                
    def search(self, value):
        # return None instead of failing when the value is absent
        if value < self.value:
            return self.left.search(value) if self.left else None
        
        if value > self.value:
            return self.right.search(value) if self.right else None
        
        # value == self.value
        return self
                

t = Node(10)
t.insert(0)
t.insert(20)
t.insert(5)
t.insert(-1)
t.insert(25)
t.insert(15)
t.insert(3)
t.insert(7)

#         10
#     0        20
#  -1   5    15  25
#      3 7

def height(tree, node, h=0):
    if hasattr(node,'parent'):
        return height_parent(tree, node)
    
    if node.value == tree.value:
        return h    
    
    if tree.left:
        if node.value < tree.value:
            return height(tree.left, node, h+1)
    if tree.right:
        if node.value >= tree.value:
            return height(tree.right, node, h+1)
    
    return -1

def height_parent(tree, node):
    pred = node.parent
    h = 0
    while (pred):
        pred = pred.parent
        h += 1
    return h

n = t.search(3)
print height_parent(t,n)
    
def lca(tree, n1, n2):
    h1 = height(tree, n1)
    h2 = height(tree, n2)
    
    while (h1 > h2):
        n1 = n1.parent
        h1 -= 1
        
    while (h2 > h1):
        n2 = n2.parent
        h2 -= 1
        
    h = h1
    while (n1.value != n2.value):
        n1 = n1.parent
        n2 = n2.parent
        h -= 1
        
        if h < 0:
            return None
        
    return n1.value

n1 = t.search(3)
n2 = t.search(20)

print lca(t, n1, n2)

# O(h) + O(h) (searches)
# O(h) traverse to top
# O(1) memory

3
10
