## Problem 1: BST Traversal
This problem builds on Problem 1 of Homework 7 in which you wrote a binary search tree.

### Part 1

As discussed in lecture, the three types of depth-first traversal are preorder, inorder, and postorder. Here is a reference: [Tree Traversal](https://en.wikipedia.org/wiki/Tree_traversal#Depth-first_search).
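
To make the three orders concrete, here is a minimal recursive sketch. The `Node` class and the three helper functions are hypothetical names used only for this illustration; they are not the array-backed `BinaryTree` from Homework 7.

```python
# Hypothetical minimal node type, used only for this illustration.
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def preorder(node):
    """Visit the node, then its left subtree, then its right subtree."""
    if node is None:
        return []
    return [node.value] + preorder(node.left) + preorder(node.right)

def inorder(node):
    """Visit the left subtree, then the node, then the right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)

def postorder(node):
    """Visit the left subtree, then the right subtree, then the node."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.value]

# The BST obtained by inserting 3, 9, 2, 11 in that order:
#       3
#      / \
#     2   9
#          \
#           11
root = Node(3, Node(2), Node(9, None, Node(11)))

print(preorder(root))   # [3, 2, 9, 11]
print(inorder(root))    # [2, 3, 9, 11]
print(postorder(root))  # [2, 11, 9, 3]
```

Note that an inorder traversal of a binary search tree yields the values in sorted order.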

Write an iterator class called `DFSTraversal` with the following specifications:

* `__init__(self, tree, traversalType)`: The constructor takes a `BinaryTree` object and one of the members of the `DFSTraversalTypes` enum:

```python
from enum import Enum

class DFSTraversalTypes(Enum):
    PREORDER = 1
    INORDER = 2
    POSTORDER = 3
```

* `changeTraversalType(self, traversalType)`: Changes the traversal type
* `__iter__(self)`: Initializes the iteration and returns the iterator object
* `__next__(self)`: Returns the next value, and raises `StopIteration` when no values remain
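
As a reminder of how the iterator protocol fits together, here is a small sketch that has nothing to do with trees. `Countdown` is a hypothetical class invented for this example.

```python
class Countdown:
    """Hypothetical example class: yields start, start - 1, ..., 1."""

    def __init__(self, start):
        self.start = start
        self.current = start

    def __iter__(self):
        # Reset the state so the object can be looped over more than once.
        self.current = self.start
        return self

    def __next__(self):
        # Signal the end of the iteration once the counter is exhausted.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

countdown = Countdown(3)
print(list(countdown))  # [3, 2, 1]
print(list(countdown))  # [3, 2, 1] again, because __iter__ resets the state
```

A `for` loop calls `__iter__()` once at the start and then `__next__()` repeatedly until `StopIteration` is raised.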

Here's how you might use your `DFSTraversal` class:

```python
input_array = [3, 9, 2, 11]
bt = BinaryTree()
for val in input_array:
    bt.insert(val)
traversal = DFSTraversal(bt, DFSTraversalTypes.INORDER)
for val in traversal:
    print(val)
2
3
9
11
```

### Part 2
Put your `BinaryTree` class (from homework 7) and your `DFSTraversal` class (from Part 1 of this homework) in a file titled `TreeTraversal.py`.

In [None]:
# from HW7 solutions
import numpy as np
class BinaryTree:
    def __init__(self):
        self.data = [None]
        
    def insert(self, val):
        # keep track of idx we're at traversing the tree
        idx = 0
        while self.data[idx] is not None:
            idx = idx * 2 + (1 if self.data[idx] > val else 2)
            if idx >= len(self.data):
                self.data = self.data + [None]*(idx + 1 - len(self.data))
        self.data[idx] = val
        
    def find(self, val):
        idx = 0
        while self.data[idx] is not None and self.data[idx] != val:
            idx = idx * 2 + (1 if self.data[idx] > val else 2)
            if len(self.data) <= idx:
                return -1
        return idx if self.data[idx] is not None else -1
    
    def levelUp(self, idx):
        self.data[idx] = None
        leftChild = idx * 2 + 1
        rightChild = (idx + 1) * 2
        if len(self.data) > leftChild:
            if self.data[leftChild] is not None:
                self.data[idx] = self.data[leftChild]
                self.levelUp(leftChild)
            elif len(self.data) > rightChild and self.data[rightChild] is not None:
                self.data[idx] = self.data[rightChild]
                self.levelUp(rightChild)
                
    def getValues(self, level):
        values = []
        for x in range(2**level - 1, 2**(level + 1) - 1):
            if len(self.data) <= x:
                values.append(None)
            else:
                values.append(self.data[x])
        return values

In [None]:
from enum import Enum

class DFSTraversalTypes(Enum):
    PREORDER = 1
    INORDER = 2
    POSTORDER = 3

In [None]:
class DFSTraversal:
    def __init__(self, tree, traversalType):
        self.traversalType = traversalType.name
        self.tree = tree
        
    def changeTraversalType(self, traversalType):
        self.traversalType = traversalType.name
        
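    def __iter__(self):
        # implement here -- set up the traversal state and return the iterator
        raise NotImplementedError

    def __next__(self):
        # implement here -- return the next value, or raise StopIteration when done
        raise NotImplementedError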
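
One possible way to fill in the stub is sketched below. It is not the official solution. It assumes the array layout used by the `BinaryTree` above (children of index `i` at `2*i + 1` and `2*i + 2`), stores the enum member itself rather than its `.name`, and computes the whole visiting order eagerly in `__iter__` instead of lazily. The helper `_walk` and the attributes `order` and `pos` are names introduced for this sketch.

```python
from enum import Enum

class DFSTraversalTypes(Enum):
    PREORDER = 1
    INORDER = 2
    POSTORDER = 3

class BinaryTree:
    """Trimmed copy of the array-backed tree above (insert only)."""
    def __init__(self):
        self.data = [None]

    def insert(self, val):
        idx = 0
        while self.data[idx] is not None:
            idx = idx * 2 + (1 if self.data[idx] > val else 2)
            if idx >= len(self.data):
                self.data = self.data + [None] * (idx + 1 - len(self.data))
        self.data[idx] = val

class DFSTraversal:
    def __init__(self, tree, traversalType):
        self.tree = tree
        self.traversalType = traversalType
        self.order = []  # values in visiting order (sketch-specific attribute)
        self.pos = 0     # position of the next value to return

    def changeTraversalType(self, traversalType):
        self.traversalType = traversalType

    def _walk(self, idx):
        # Recursive depth-first walk over the array representation.
        data = self.tree.data
        if idx >= len(data) or data[idx] is None:
            return
        if self.traversalType is DFSTraversalTypes.PREORDER:
            self.order.append(data[idx])
        self._walk(2 * idx + 1)  # left child
        if self.traversalType is DFSTraversalTypes.INORDER:
            self.order.append(data[idx])
        self._walk(2 * idx + 2)  # right child
        if self.traversalType is DFSTraversalTypes.POSTORDER:
            self.order.append(data[idx])

    def __iter__(self):
        # Recompute the order so changes to the tree or type take effect.
        self.order = []
        self.pos = 0
        self._walk(0)
        return self

    def __next__(self):
        if self.pos >= len(self.order):
            raise StopIteration
        value = self.order[self.pos]
        self.pos += 1
        return value

bt = BinaryTree()
for val in [3, 9, 2, 11]:
    bt.insert(val)
traversal = DFSTraversal(bt, DFSTraversalTypes.INORDER)
print(list(traversal))  # [2, 3, 9, 11]
traversal.changeTraversalType(DFSTraversalTypes.POSTORDER)
print(list(traversal))  # [2, 11, 9, 3]
```

Precomputing the order keeps `__next__` trivial at the cost of extra memory; a lazy version would keep an explicit stack of indices instead.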

## Problem 2: Markov Chains

[Markov Chains](https://en.wikipedia.org/wiki/Markov_chain) are widely used to model and predict discrete events. Underlying Markov chains are Markov processes, which assume that the outcome of a future event depends only on the event immediately preceding it. In this exercise, we will assume that weather has Markov properties (e.g. today's weather depends only on yesterday's weather). We will use the Markov assumption to create a basic model for predicting weather.

To begin, let's categorize weather into 6 types: ['sunny', 'cloudy', 'rainy', 'snowy', 'windy', 'hailing'].

In the `weather.csv` file accompanying this homework, each row corresponds to one type of weather (in the order given above) and each column is the probability of one type of weather occurring the following day (also in the order given above).

The $ij$th element is the probability that the $j$th weather type occurs after the $i$th weather type, with rows and columns counted from 1. So for example, (1,2) is the probability a cloudy day occurs after a sunny day.

Take a look at the data. Make sure you see that if the previous day was sunny, the following day has a 0.4 probability of being sunny as well. If the previous day was rainy (index $i = 3$), then the following day (index $j$) has a 0.05 probability of being windy ($j = 5$).
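
The two lookups above can be checked with a short sketch. The matrix is typed in by hand from the contents of `weather.csv` as displayed in this notebook, and the name `P` is chosen only for this example. Keep in mind that the text counts rows and columns from 1, while NumPy indexing starts at 0.

```python
import numpy as np

# Transition matrix copied from weather.csv as displayed in this notebook.
# Row = previous day's weather, column = following day's weather.
P = np.array([
    [0.40, 0.30, 0.10, 0.05, 0.10, 0.05],  # sunny
    [0.30, 0.40, 0.10, 0.10, 0.08, 0.02],  # cloudy
    [0.20, 0.30, 0.35, 0.05, 0.05, 0.05],  # rainy
    [0.10, 0.20, 0.25, 0.30, 0.10, 0.05],  # snowy
    [0.15, 0.20, 0.10, 0.15, 0.30, 0.10],  # windy
    [0.10, 0.20, 0.35, 0.10, 0.05, 0.20],  # hailing
])

# (i, j) = (1, 1) in the text is P[0, 0] in NumPy: sunny -> sunny.
print(P[0, 0])  # 0.4

# (i, j) = (3, 5) in the text is P[2, 4] in NumPy: rainy -> windy.
print(P[2, 4])  # 0.05

# Each row is a probability distribution over tomorrow's weather,
# so every row should sum to 1 (up to floating-point rounding).
print(np.allclose(P.sum(axis=1), 1.0))  # True
```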

In [3]:
import pandas as pd
df = pd.read_csv('weather.csv', header=None)
df

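|   | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0 | 0.4 | 0.3 | 0.1 | 0.05 | 0.1 | 0.05 |
| 1 | 0.3 | 0.4 | 0.1 | 0.1 | 0.08 | 0.02 |
| 2 | 0.2 | 0.3 | 0.35 | 0.05 | 0.05 | 0.05 |
| 3 | 0.1 | 0.2 | 0.25 | 0.3 | 0.1 | 0.05 |
| 4 | 0.15 | 0.2 | 0.1 | 0.15 | 0.3 | 0.1 |
| 5 | 0.1 | 0.2 | 0.35 | 0.1 | 0.05 | 0.2 |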


### Part 1:  Parse the `.csv` file into a `Numpy` array

In [4]:
#Load CSV file -- hint: you can use np.genfromtxt()
import numpy as np
data = np.genfromtxt("weather.csv", delimiter=",")
data

array([[ 0.4 ,  0.3 ,  0.1 ,  0.05,  0.1 ,  0.05],
       [ 0.3 ,  0.4 ,  0.1 ,  0.1 ,  0.08,  0.02],
       [ 0.2 ,  0.3 ,  0.35,  0.05,  0.05,  0.05],
       [ 0.1 ,  0.2 ,  0.25,  0.3 ,  0.1 ,  0.05],
       [ 0.15,  0.2 ,  0.1 ,  0.15,  0.3 ,  0.1 ],
       [ 0.1 ,  0.2 ,  0.35,  0.1 ,  0.05,  0.2 ]])

In [5]:
data.shape


(6, 6)

### Part 2:  Create a class called `Markov` that has the following methods:

* `load_data(array)`: loads the Numpy 2D array and stores it as an attribute of the object.
* `get_prob(previous_day, following_day)`: returns the probability of `following_day` weather given `previous_day` weather. 

**Note:** `previous_day` and `following_day` should be passed in string form (e.g. "sunny"), as opposed to an index (e.g. 0). 

In [6]:
class Markov:
    def __init__(self):
        # implement here
        self.weather = {"sunny":0, "cloudy":1, "rainy":2, "snowy":3, "windy":4, "hailing":5}
        
    def load_data(self, array):
        # implement here
        self.array = array
    
    def get_prob(self, previous_day, following_day):
        # implement here -- returns a probability
        i = self.weather[previous_day]
        j = self.weather[following_day]
        return self.array[i,j]
        

In [7]:
m = Markov()
m.load_data(data)
m.get_prob("sunny", "rainy")

0.10000000000000001

In [8]:
m.get_prob("rainy", "windy") 

0.050000000000000003

## Problem 3: Iterators

Iterators are a convenient way to walk along your Markov chain.

#### Part 1: Using your `Markov` class from Problem 2, write `Markov` as an iterator by implementing the `__iter__()` and `__next__()` methods.

Remember:  
* `__iter__()` should return the iterator object and is implicitly called when the loop begins.
* The `__next__()` method should return the next value and is implicitly called at each step in the loop.

Each 'next' step should be stochastic (i.e. randomly selected based on the relative probabilities of the following day weather types) and should return the next day's weather as a string (e.g. "sunny") rather than an index (e.g. 0).
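
A single stochastic step can be sketched as a weighted random draw from one row of the transition matrix. This is only one way to do it: it uses NumPy's `Generator.choice` with the `p` argument, and the helper name `sample_next` and the fixed seed are assumptions made for this example.

```python
import numpy as np

weather_types = ["sunny", "cloudy", "rainy", "snowy", "windy", "hailing"]

# Row of the transition matrix for a "sunny" previous day (from weather.csv).
sunny_row = np.array([0.4, 0.3, 0.1, 0.05, 0.1, 0.05])

# The seed is arbitrary; it only makes the example reproducible.
rng = np.random.default_rng(seed=0)

def sample_next(row, rng):
    """Hypothetical helper: draw tomorrow's weather given today's row."""
    idx = rng.choice(len(row), p=row)  # index drawn with probabilities `row`
    return weather_types[idx]

# Over many draws, the share of "sunny" should be close to 0.4.
draws = [sample_next(sunny_row, rng) for _ in range(10000)]
share_sunny = draws.count("sunny") / len(draws)
print(share_sunny)
```

In a `__next__()` method, the row would be selected by the current weather's index, and the sampled index would become the new current state.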

In [9]:
class MarkovIterator:
    def __init__(self, markov):
        self.markov = markov
        self.current_idx = self.markov.current_idx
        self.table = self.markov.array

    def __next__(self):
        # Transition probabilities out of the current state
        next_prob = self.table[self.current_idx]
        # Cumulative distribution over the possible next states
        cdf = np.cumsum(next_prob)
        # Inverse-CDF sampling: pick the first state whose cumulative probability
        # reaches the uniform draw; clamp in case rounding leaves cdf[-1] below 1
        rand_num = np.random.random()
        next_idx = min(int(np.searchsorted(cdf, rand_num)), len(cdf) - 1)

        # Return the current state, then advance to the sampled one.
        # The chain never terminates, so StopIteration is never raised.
        current_str = self.markov.idx2str[self.current_idx]
        self.current_idx = next_idx
        return current_str

    def __iter__(self):
        return self

class Markov:
    def __init__(self, current_weather = "sunny"):
        self.idx2str = ["sunny", "cloudy", "rainy", "snowy", "windy", "hailing"]
        self.weather = {"sunny":0, "cloudy":1, "rainy":2, "snowy":3, "windy":4, "hailing":5}
        self.current_idx = self.weather[current_weather]

    def load_data(self, array):
        self.array = array
    
    def get_prob(self, previous_day, following_day):
        i = self.weather[previous_day]
        j = self.weather[following_day]
        return self.array[i,j]
    
    def set_current(self, current_weather):
        self.current_idx = self.weather[current_weather]
    
    def __iter__(self):
        return MarkovIterator(self)
    

    
#     def __repr__(self):
#         return self.following_day

In [10]:
m2 = Markov()
m2.load_data(data)
print(m2.current_idx)
m2.set_current("hailing")
print(m2.current_idx)

iter1 = iter(m2)
iter2 = iter(m2)
print(next(iter1))
print(next(iter1))
print(next(iter1))
print(next(iter1))
# print(next(iter(m2)))

#for i in m2:
 #   print(i)

0
5
hailing
rainy
sunny
cloudy


#### Part 2: We want to predict what weather will be like in a week for 7 different cities.

Now that we have our `Markov` iterator, we can try to predict what the weather will be like seven days from now.

Given each city's current weather in the dictionary `city_weather` (see below), simulate what the weather will be like 7 days from now. Rather than just producing one prediction per city, simulate 100 such predictions per city and store the most commonly occurring prediction.

In your submission, print a dictionary `city_weather_predictions` that has each city as a key and the most commonly predicted weather as the corresponding value.

**Note**: Don't worry if your values don't seem to make intuitive sense.  We made up the weather probabilities.
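
Picking the most commonly occurring prediction can be done with `collections.Counter`. The sketch below uses a short made-up list of outcomes in place of real simulation results; `day7_outcomes` is a hypothetical name.

```python
from collections import Counter

# Hypothetical day-7 outcomes from a handful of simulation runs for one city.
day7_outcomes = ["cloudy", "sunny", "cloudy", "rainy", "cloudy", "sunny"]

counts = Counter(day7_outcomes)

# most_common(1) returns a list with one (value, count) pair.
most_common_weather, frequency = counts.most_common(1)[0]
print(most_common_weather, frequency)  # cloudy 3
```

When two weather types are tied, `most_common` returns the one that was encountered first.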

In [11]:
city_weather = {
    'New York': 'rainy',
    'Chicago': 'snowy',
    'Seattle': 'rainy',
    'Boston': 'hailing',
    'Miami': 'windy',
    'Los Angeles': 'cloudy',
    'San Fransisco': 'windy'
}

In [17]:
from collections import Counter
city_weather_predictions = {}
nPredictions = 7
nSimulations = 100

m1 = Markov()
m1.load_data(data)

for city, weather in city_weather.items():    
    m1.set_current(weather)
    simulations = []
    for i in range(nSimulations):
        predictor = iter(m1)
        next(predictor)
        predictions = []
        for j in range(nPredictions):
            predictions.append(next(predictor))
        simulations.append(predictions)
    city_weather_predictions[city] = simulations
    
city_weather_predictions

{'Boston': [['snowy', 'rainy', 'rainy', 'rainy', 'cloudy', 'cloudy', 'sunny'],
  ['rainy', 'cloudy', 'sunny', 'snowy', 'cloudy', 'cloudy', 'cloudy'],
  ['hailing', 'cloudy', 'sunny', 'sunny', 'sunny', 'sunny', 'cloudy'],
  ['rainy', 'cloudy', 'windy', 'snowy', 'snowy', 'hailing', 'windy'],
  ['cloudy', 'cloudy', 'cloudy', 'cloudy', 'cloudy', 'cloudy', 'sunny'],
  ['rainy', 'rainy', 'rainy', 'rainy', 'windy', 'sunny', 'sunny'],
  ['cloudy', 'cloudy', 'cloudy', 'cloudy', 'rainy', 'windy', 'cloudy'],
  ['cloudy', 'cloudy', 'sunny', 'sunny', 'cloudy', 'sunny', 'sunny'],
  ['hailing', 'sunny', 'cloudy', 'sunny', 'windy', 'snowy', 'rainy'],
  ['cloudy', 'sunny', 'sunny', 'cloudy', 'sunny', 'windy', 'windy'],
  ['snowy', 'snowy', 'cloudy', 'sunny', 'hailing', 'hailing', 'snowy'],
  ['rainy', 'cloudy', 'sunny', 'cloudy', 'sunny', 'sunny', 'sunny'],
  ['hailing', 'rainy', 'rainy', 'windy', 'cloudy', 'rainy', 'hailing'],
  ['sunny', 'sunny', 'cloudy', 'snowy', 'sunny', 'rainy', 'rainy'],
  ['sno

In [21]:
Bos = city_weather_predictions['Boston']
Bos

[['snowy', 'rainy', 'rainy', 'rainy', 'cloudy', 'cloudy', 'sunny'],
 ['rainy', 'cloudy', 'sunny', 'snowy', 'cloudy', 'cloudy', 'cloudy'],
 ['hailing', 'cloudy', 'sunny', 'sunny', 'sunny', 'sunny', 'cloudy'],
 ['rainy', 'cloudy', 'windy', 'snowy', 'snowy', 'hailing', 'windy'],
 ['cloudy', 'cloudy', 'cloudy', 'cloudy', 'cloudy', 'cloudy', 'sunny'],
 ['rainy', 'rainy', 'rainy', 'rainy', 'windy', 'sunny', 'sunny'],
 ['cloudy', 'cloudy', 'cloudy', 'cloudy', 'rainy', 'windy', 'cloudy'],
 ['cloudy', 'cloudy', 'sunny', 'sunny', 'cloudy', 'sunny', 'sunny'],
 ['hailing', 'sunny', 'cloudy', 'sunny', 'windy', 'snowy', 'rainy'],
 ['cloudy', 'sunny', 'sunny', 'cloudy', 'sunny', 'windy', 'windy'],
 ['snowy', 'snowy', 'cloudy', 'sunny', 'hailing', 'hailing', 'snowy'],
 ['rainy', 'cloudy', 'sunny', 'cloudy', 'sunny', 'sunny', 'sunny'],
 ['hailing', 'rainy', 'rainy', 'windy', 'cloudy', 'rainy', 'hailing'],
 ['sunny', 'sunny', 'cloudy', 'snowy', 'sunny', 'rainy', 'rainy'],
 ['snowy', 'cloudy', 'sunny', '

In [28]:
Boston1 = [item[0] for item in Bos]
Counter(Boston1).most_common(1)[0][0]

'rainy'

In [29]:
most_common = []
for i in range(nPredictions):
    each_day = [item[i] for item in Bos]
    most_common.append(Counter(each_day).most_common(1)[0][0])
most_common

['rainy', 'rainy', 'cloudy', 'cloudy', 'cloudy', 'cloudy', 'cloudy']

In [33]:
report = {}
for c, w in city_weather_predictions.items():
    most_common = []
    for i in range(nPredictions):
        each_day = [item[i] for item in city_weather_predictions[c]]
        most_common.append(Counter(each_day).most_common(1)[0][0])
    report[c] = most_common

In [34]:
report

{'Boston': ['rainy',
  'rainy',
  'cloudy',
  'cloudy',
  'cloudy',
  'cloudy',
  'cloudy'],
 'Chicago': ['rainy',
  'cloudy',
  'sunny',
  'sunny',
  'cloudy',
  'cloudy',
  'cloudy'],
 'Los Angeles': ['cloudy',
  'cloudy',
  'cloudy',
  'cloudy',
  'cloudy',
  'cloudy',
  'cloudy'],
 'Miami': ['windy', 'cloudy', 'cloudy', 'cloudy', 'cloudy', 'cloudy', 'sunny'],
 'New York': ['rainy',
  'cloudy',
  'cloudy',
  'sunny',
  'cloudy',
  'cloudy',
  'cloudy'],
 'San Fransisco': ['windy',
  'cloudy',
  'cloudy',
  'sunny',
  'cloudy',
  'cloudy',
  'cloudy'],
 'Seattle': ['cloudy',
  'sunny',
  'sunny',
  'sunny',
  'cloudy',
  'cloudy',
  'cloudy']}
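
The assignment asks for a single prediction per city, namely the weather 7 days from now. Assuming `report` holds the most common weather for each of days 1 through 7 (as in the output above), the final dictionary could be obtained by keeping only the last entry of each list. The sketch below uses two entries copied from that output; `report_excerpt` and `day7_predictions` are names chosen for this example.

```python
# Two entries copied from the `report` output above; the others are omitted.
report_excerpt = {
    "Boston": ["rainy", "rainy", "cloudy", "cloudy", "cloudy", "cloudy", "cloudy"],
    "Miami": ["windy", "cloudy", "cloudy", "cloudy", "cloudy", "cloudy", "sunny"],
}

# Keep only the day-7 entry (the last element) for each city.
day7_predictions = {city: days[-1] for city, days in report_excerpt.items()}
print(day7_predictions)  # {'Boston': 'cloudy', 'Miami': 'sunny'}
```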