### Heuristic Models
Look at the Seattle weather in the **data** folder. Come up with a heuristic model to predict if it will rain today. Keep in mind this is a time series, which means that you only know what happened historically (before a given date). One example of a heuristic model is: It will rain tomorrow if it rained more than 1 inch (>1.0 PRCP) today. Describe your heuristic model in the next cell.

### My heuristic model
**if it rained yesterday but not today, it will rain tomorrow.** 


**if it rained today but not yesterday , it will rain tomorrow.**   




In [1]:
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/daniel-dc-cd/data_science/master/module_4_ML/data/seattle_weather_1948-2017.csv')

numrows = 25549 # can be as large as 25549

#create an empty dataframe to hold 100 values
heuristic_df = pd.DataFrame({'yesterday':[0.0]*numrows,
                             'today':[0.0]*numrows,
                             'tomorrow':[0.0]*numrows,
                             'guess':[False]*numrows, #logical guess
                             'rain_tomorrow':[False]*numrows, #historical observation
                             'correct':[False]*numrows}) #TRUE if your guess matches the historical observation

#sort columns for convience
seq = ['yesterday','today','tomorrow','guess','rain_tomorrow','correct']
heuristic_df = heuristic_df.reindex(columns=seq)

In [2]:
df.head()

Unnamed: 0,DATE,PRCP,TMAX,TMIN,RAIN
0,1948-01-01,0.47,51,42,True
1,1948-01-02,0.59,45,36,True
2,1948-01-03,0.42,45,35,True
3,1948-01-04,0.31,45,34,True
4,1948-01-05,0.17,45,32,True


In [3]:
heuristic_df.head()

Unnamed: 0,yesterday,today,tomorrow,guess,rain_tomorrow,correct
0,0.0,0.0,0.0,False,False,False
1,0.0,0.0,0.0,False,False,False
2,0.0,0.0,0.0,False,False,False
3,0.0,0.0,0.0,False,False,False
4,0.0,0.0,0.0,False,False,False


Build a loop to add your heuristic model guesses as a column to this dataframe

### **if it rained yesterday but not today, it will rain tomorrow.**

In [4]:
# here is an example loop that populates the dataframe created earlier
# with the total percip from yesterday and today
# then the guess is set to true if it rained yesterday but not today 

for z in range(numrows):
    #start at time 2 in the data frame
    i = z + 2
    #pull values from the dataframe
    yesterday = df.iloc[(i-2),1]
    today = df.iloc[(i-1),1]
    tomorrow = df.iloc[i,1]
    rain_tomorrow = df.iloc[(i),1]
    
    heuristic_df.iat[z,0] = yesterday
    heuristic_df.iat[z,1] = today
    heuristic_df.iat[z,2] = tomorrow
    heuristic_df.iat[z,3] = False # set guess default to False
    heuristic_df.iat[z,4] = rain_tomorrow
    
# if it rained yesterday but not  today, then it will rain tomorrow
    if (today == 0.0) and (yesterday > 0.0 ):
        heuristic_df.iat[z,2] = True

    if heuristic_df.iat[z,3] == heuristic_df.iat[z,4]:
        heuristic_df.iat[z,5] = True
    else:
        heuristic_df.iat[z,5] = False

### Evaluate the performance of the Heuristic model

***the accuracy of your predicitions***

In [5]:
from sklearn.metrics import accuracy_score
print(accuracy_score(heuristic_df['correct'], heuristic_df['guess']))

0.4266703197776821


In [6]:
## Accuracy 57%
heuristic_df['correct'].value_counts()/numrows

True     0.57333
False    0.42667
Name: correct, dtype: float64

### **if it rained today but not yesterday , it will rain tomorrow.**

In [7]:
# here is an example loop that populates the dataframe created earlier
# with the total percip from yesterday and today
# then the guess is set to true if it rained today but not yesterday  

for z in range(numrows):
    #start at time 2 in the data frame
    i = z + 2
    #pull values from the dataframe
    yesterday = df.iloc[(i-2),1]
    today = df.iloc[(i-1),1]
    tomorrow = df.iloc[i,1]
    rain_tomorrow = df.iloc[(i),1]
    
    heuristic_df.iat[z,0] = yesterday
    heuristic_df.iat[z,1] = today
    heuristic_df.iat[z,2] = tomorrow
    heuristic_df.iat[z,3] = False # set guess default to False
    heuristic_df.iat[z,4] = rain_tomorrow
    
# if it rained today but not yesterday, then it will rain tomorrow 
    if (today > 0.0) and (yesterday == 0.0 ):
        heuristic_df.iat[z,2] = True

    if heuristic_df.iat[z,3] == heuristic_df.iat[z,4]:
        heuristic_df.iat[z,5] = True
    else:
        heuristic_df.iat[z,5] = False

### Evaluate the performance of the Heuristic model

***the accuracy of your predicitions***

In [8]:
from sklearn.metrics import accuracy_score
print(accuracy_score(heuristic_df['correct'], heuristic_df['guess']))

0.4266703197776821


In [9]:
## Accuracy 
heuristic_df['correct'].value_counts()/numrows

True     0.57333
False    0.42667
Name: correct, dtype: float64