#### Computing probabilities
The where9am data frame contains 91 days (thirteen weeks) of data in which Brett recorded his location at 9am each day, along with whether the day was a weekend or a weekday (daytype).

Using the conditional probability formula below, you can compute the probability that Brett is working in the office, given that it is a weekday.

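$$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$$

Here A is the event that Brett is in the office and B is the event that it is a weekday. Calculations like these are the basis of the Naive Bayes destination prediction model you'll develop in later exercises.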

In [None]:
# Compute P(A) 
p_A <- nrow(subset(where9am, location == "office")) / nrow(where9am)

# Compute P(B)
p_B <- nrow(subset(where9am, daytype == "weekday")) / nrow(where9am)

# Compute the observed P(A and B)
p_AB <- nrow(subset(where9am, location == "office" & daytype == "weekday")) / nrow(where9am)

# Compute P(A | B) and print its value
p_A_given_B <- p_AB / p_B
p_A_given_B
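
As a sanity check on the formula, here is a minimal sketch on a small, hypothetical data frame (`toy` is made up for illustration and is not Brett's data). It computes P(A | B) from the formula and again directly, as the share of office mornings among weekdays only; the two values should agree.

```r
# Hypothetical toy data: 10 mornings, not Brett's real records
toy <- data.frame(
  location = c("office", "office", "office", "home", "office",
               "home", "home", "office", "home", "home"),
  daytype  = c("weekday", "weekday", "weekday", "weekday", "weekday",
               "weekend", "weekend", "weekday", "weekend", "weekend")
)

# P(B): share of mornings that fall on a weekday
toy_p_B <- mean(toy$daytype == "weekday")

# P(A and B): share of mornings that are both office and weekday
toy_p_AB <- mean(toy$location == "office" & toy$daytype == "weekday")

# P(A | B) from the conditional probability formula
toy_p_A_given_B <- toy_p_AB / toy_p_B

# The same quantity computed directly within the weekday rows
toy_direct <- mean(toy$location[toy$daytype == "weekday"] == "office")

toy_p_A_given_B
toy_direct
```

Dividing by P(B) is what restricts attention to weekdays, which is why the direct calculation on the weekday subset gives the same answer.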

#### A simple Naive Bayes location model
The previous exercises showed that the probability that Brett is at work or at home at 9am is highly dependent on whether it is the weekend or a weekday.

To see this finding in action, use the where9am data frame to build a Naive Bayes model on the same data.

You can then use this model to predict the future: where does the model think that Brett will be at 9am on Thursday and at 9am on Saturday?

The where9am data frame is available in your workspace. It contains Brett's location at 9am on different days.

In [None]:
# Load the naivebayes package
library(naivebayes)

# Build the location prediction model
locmodel <- naive_bayes(location ~ daytype, data = where9am)

# Predict Thursday's 9am location
predict(locmodel, thursday9am)

# Predict Saturday's 9am location
predict(locmodel, saturday9am)
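
The exercise does not show how thursday9am and saturday9am were created. One plausible construction, sketched here on hypothetical toy data, is a one-row data frame whose column name and factor levels match the predictor used for training. The names `toy`, `toy_model`, `a_weekday`, and `a_weekend` are assumptions for illustration only.

```r
library(naivebayes)

# Hypothetical toy training data (not Brett's real records)
toy <- data.frame(
  location = factor(c("office", "office", "office", "office", "office",
                      "home", "home", "home")),
  daytype  = factor(c("weekday", "weekday", "weekday", "weekday", "weekday",
                      "weekday", "weekend", "weekend"),
                    levels = c("weekday", "weekend"))
)

# Fit the same kind of one-predictor model as in the exercise
toy_model <- naive_bayes(location ~ daytype, data = toy)

# One-row data frames holding the predictor value for the day to predict;
# the column name and factor levels match the training data
a_weekday <- data.frame(daytype = factor("weekday", levels = c("weekday", "weekend")))
a_weekend <- data.frame(daytype = factor("weekend", levels = c("weekday", "weekend")))

# By default predict() returns the most likely class
pred_weekday <- predict(toy_model, a_weekday)
pred_weekend <- predict(toy_model, a_weekend)

pred_weekday
pred_weekend
```

The package may warn about zero probabilities here, because the toy data never places the office on a weekend.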

#### Examining "raw" probabilities
The naivebayes package offers several ways to peek inside a Naive Bayes model.

Typing the name of the model object provides the a priori (overall) and conditional probabilities of each of the model's predictors. If you were so inclined, you could use these to calculate posterior (predicted) probabilities by hand.

Alternatively, R will compute the posterior probabilities for you if the type = "prob" parameter is supplied to the predict() function.

Using these methods, examine how the model's predicted 9am location probability varies from day-to-day. The model locmodel that you fit in the previous exercise is in your workspace.

In [None]:
# The 'naivebayes' package is loaded into the workspace
# and the Naive Bayes 'locmodel' has been built

# Print the locmodel object to the console to view the computed a priori and conditional probabilities
locmodel
#===================== Naive Bayes ===================== 
#Call: 
#naive_bayes.formula(formula = location ~ daytype, data = where9am)

#A priori probabilities: 

#appointment      campus        home      office 
# 0.01098901  0.10989011  0.45054945  0.42857143 

#Tables: 
         
#daytype   appointment    campus      home    office
#  weekday   1.0000000 1.0000000 0.3658537 1.0000000
#  weekend   0.0000000 0.0000000 0.6341463 0.0000000

# Obtain the predicted probabilities for Thursday at 9am
predict(locmodel, thursday9am, type = "prob")
#     appointment    campus      home office
#[1,]  0.01538462 0.1538462 0.2307692    0.6

# Obtain the predicted probabilities for Saturday at 9am
predict(locmodel, saturday9am, type = "prob")
#     appointment campus home office
#[1,]           0      0    1      0
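
The by-hand calculation mentioned earlier can be sketched in base R. The numbers below are copied from the printed a priori probabilities and the weekday row of the conditional table; the variable names are illustrative. Multiplying each prior by its conditional probability and normalizing should reproduce the Thursday probabilities returned by predict().

```r
# A priori probabilities copied from the printed model
prior <- c(appointment = 0.01098901, campus = 0.10989011,
           home = 0.45054945, office = 0.42857143)

# P(daytype = weekday | location), copied from the printed table
p_weekday_given_loc <- c(appointment = 1, campus = 1,
                         home = 0.3658537, office = 1)

# Numerator of Bayes' rule for each location
unnormalized <- prior * p_weekday_given_loc

# Divide by the total so the posterior probabilities sum to 1
posterior_thursday <- unnormalized / sum(unnormalized)

round(posterior_thursday, 4)
```

Swapping in the weekend row of the table (0, 0, 0.6341463, 0) leaves home as the only location with a nonzero product, which is why the Saturday prediction puts all of its probability on home.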

#### Who are you calling naive?
The Naive Bayes algorithm got its name because it makes a "naive" assumption about event independence.

What is the purpose of making this assumption?

The joint probability calculation is simpler for independent events.
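
To make the simplification concrete: for two independent events, the joint probability is just the product of the individual probabilities,

$$P(A \text{ and } B) = P(A)\,P(B)$$

In a Naive Bayes model the assumption is applied to the predictors within each class. Writing $y$ for the outcome and $x_1, \dots, x_n$ for the predictors (generic symbols, used here only for illustration), the posterior reduces to a product of simple one-predictor terms:

$$P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

Without the assumption, the model would need an estimate of the probability of every combination of predictor values, which quickly becomes impractical as predictors are added.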

#### A more sophisticated location model
The locations dataset records Brett's location every hour for 13 weeks. Each hour, the tracking information includes the daytype (weekend or weekday) as well as the hourtype (morning, afternoon, evening, or night).

Using this data, build a more sophisticated model to see how Brett's predicted location varies not only by the day of the week but also by the time of day. The dataset locations is already loaded in your workspace.

You can specify additional independent variables in your formula using the + sign (e.g. y ~ x + b).

In [None]:
# Build a NB model of location
locmodel <- naive_bayes(location ~ daytype + hourtype, locations)

# Predict Brett's location on a weekday afternoon
predict(locmodel, weekday_afternoon)

# Predict Brett's location on a weekday evening
predict(locmodel, weekday_evening)
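
To see what adding a second predictor does to the calculation, here is a base R sketch on hypothetical toy records (not Brett's data; all names are illustrative). It multiplies the prior by one conditional probability per predictor, which is the textbook Naive Bayes calculation and is assumed to mirror what the package does internally.

```r
# Hypothetical toy hourly records (not Brett's real data)
toy <- data.frame(
  location = c("office", "office", "office", "home",
               "home", "home", "home", "office"),
  daytype  = c("weekday", "weekday", "weekday", "weekday",
               "weekend", "weekend", "weekday", "weekday"),
  hourtype = c("morning", "afternoon", "afternoon", "evening",
               "afternoon", "evening", "evening", "evening")
)

# A priori probability of each location, as a plain named vector
loc_counts <- table(toy$location)
prior <- setNames(as.vector(loc_counts) / sum(loc_counts), names(loc_counts))

# Conditional tables: one row per location, each row sums to 1
p_day  <- prop.table(table(toy$location, toy$daytype), margin = 1)
p_hour <- prop.table(table(toy$location, toy$hourtype), margin = 1)

# Naive Bayes score for a weekday evening: prior times one term per predictor
score <- prior * p_day[, "weekday"] * p_hour[, "evening"]

# Normalize the scores into posterior probabilities
posterior <- score / sum(score)
posterior
```

Each extra predictor in the formula contributes one more multiplicative term to the score, so the prediction can change with the time of day even when the day type stays the same.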

#### Preparing for unforeseen circumstances
While Brett was tracking his location over 13 weeks, he never went into the office during the weekend. Consequently, the joint probability P(office and weekend) = 0.

Explore how this affects the predicted probability that Brett will go to work on a weekend in the future. You can also see how the Laplace correction leaves a small, nonzero chance for this kind of unforeseen circumstance.

The model locmodel is already in your workspace, along with the data frame weekend_afternoon.

In [None]:
# The 'naivebayes' package is loaded into the workspace already
# The Naive Bayes location model (locmodel) has already been built

# Observe the predicted probabilities for a weekend afternoon
predict(locmodel, weekend_afternoon, type = "prob")
#     appointment campus      home office restaurant      store theater
#[1,]  0.02472535      0 0.8472217      0  0.1115693 0.01648357       0

# Build a new model using the Laplace correction
locmodel2 <- naive_bayes(location ~ daytype + hourtype, locations, laplace = 1)

# Observe the new predicted probabilities for a weekend afternoon
predict(locmodel2, weekend_afternoon, type = "prob")
#     appointment      campus      home      office restaurant      store
#[1,]  0.01107985 0.005752078 0.8527053 0.008023444  0.1032598 0.01608175
#         theater
#[1,] 0.003097769
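
The idea behind the correction can be sketched with standard add-one smoothing in base R. The counts below are hypothetical, and the formula is the textbook version; the package's internal calculation is assumed to be analogous rather than quoted from its source.

```r
# Hypothetical counts of daytype within the "office" class (not Brett's real data)
office_counts <- c(weekday = 40, weekend = 0)

# Smoothing amount, analogous to the laplace = 1 argument
laplace <- 1

# Unsmoothed conditional probabilities: weekend is exactly 0
unsmoothed <- office_counts / sum(office_counts)

# Add `laplace` to every count, and `laplace * number of levels`
# to the denominator so the probabilities still sum to 1
smoothed <- (office_counts + laplace) /
  (sum(office_counts) + laplace * length(office_counts))

unsmoothed
smoothed
```

Because every conditional probability is now strictly positive, a single unseen combination can no longer force the whole product, and therefore the posterior, to zero.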