# Problem Description

A spam filter attempts to flag messages as spam based on the number of "target words" they contain, i.e., words likely to appear only in spam messages.  Given the number of target words it finds in a message, the filter estimates the probability that the message is spam using Bayes' rule.  To do this, however, the filter needs an estimate of the probability of observing a given number of target words, conditioned on whether the message is spam.  Estimating this probability for every possible number of target words is the problem.

# Random Variables

W represents the number of target words in the message  
    W is a multi-valued variable defined for all non-negative integers (a message may contain no target words)  
S represents whether the message is spam  
    S is a binary variable  
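
To make W concrete, here is a minimal sketch of how the count might be obtained from a message. The word list `TARGET_WORDS`, the function name `count_target_words`, and the choice to count every occurrence (rather than distinct words) are assumptions made for illustration only.

```python
import re

# Hypothetical target-word list, chosen only for illustration.
TARGET_WORDS = {"free", "winner", "prize", "urgent"}

def count_target_words(message, target_words=TARGET_WORDS):
    """Return W: the number of tokens in the message that are target words."""
    # Lowercase the message and split it into simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", message.lower())
    # Each occurrence counts, so a repeated target word is counted twice.
    return sum(token in target_words for token in tokens)

w = count_target_words("URGENT: you are a winner, claim your free prize, free!")
print(w)  # prints 5
```

A message with no target words gives W = 0, which is why W needs to include zero.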

# Probabilities

We want P(W|S), the posterior and the hidden quantity we are trying to estimate: the probability of a given number of target words given whether the message is spam.  
P(S|W) is our likelihood, the probability that a message is spam given a certain number of target words.  
P(W) is our prior, the probability of a given number of target words in any given message.  
P(S) is our normalizing constant, the probability that any given message is spam.
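
Written out with the roles named above, Bayes' rule ties the four quantities together. The second identity is the law of total probability, which is what makes P(S) a normalizing constant (here S stands for the event that the message is spam, and the sum runs over all values w that W can take):

$$P(W = w \mid S) = \frac{P(S \mid W = w)\,P(W = w)}{P(S)}, \qquad P(S) = \sum_{w} P(S \mid W = w)\,P(W = w)$$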

In [5]:
p_s = 0.3
print("We want P(W|S)")
print("Let's suppose we already know P(S)=" + str(p_s))

We want P(W|S)
Let's suppose we already know P(S)=0.3


# Values

In [16]:
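import numpy as np

# Assumed probability that any given message is spam.
p_s = 0.3
print("Suppose P(S)=", p_s)

def p_s_given_w(w):
    """Likelihood P(S|W=w): 0 when w = 0 and approaching 1 as w grows."""
    return 1.0 - 1.0 / (w + 1)

print("Suppose P(S|W)=1-1/(W+1).  This way, the probability the message is spam increases with the number of target "
      "words found and is in [0,1]")

# Parameters of the normal distribution assumed for P(W).
mu = 10.0
sigma_squared = 5.0

def p_w(w):
    """Prior P(W=w): normal density with mean mu and variance sigma_squared, evaluated at w."""
    return np.exp(-(w - mu) ** 2 / (2 * sigma_squared)) / np.sqrt(2 * np.pi * sigma_squared)

print("Let P(W) follow the normal distribution with a mean of ", mu, " and variance ", sigma_squared)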

Suppose P(S)= 0.3
Suppose P(S|W)=1-1/(W+1).  This way, the probability the message is spam increases with the number of target words found and is in [0,1]
Let P(W) follow the normal distribution with a mean of  10.0  and variance  5.0
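
As a minimal sketch of how these example values combine under Bayes' rule, the snippet below evaluates P(W|S) on a finite grid of word counts. The cutoff `w_max`, the renormalization of the normal density into a discrete distribution, and the name `p_s_implied` are assumptions made for illustration, not part of the original setup. It normalizes by the sum of likelihood times prior rather than by the assumed P(S) = 0.3, because with these example functions that sum works out to roughly 0.9, so dividing by 0.3 would not give a distribution that sums to 1.

```python
import numpy as np

mu = 10.0
sigma_squared = 5.0

def p_s_given_w(w):
    # Likelihood from the Values cell: P(S|W=w) = 1 - 1/(w+1).
    return 1.0 - 1.0 / (w + 1)

def p_w(w):
    # Prior from the Values cell: normal density evaluated at w.
    return np.exp(-(w - mu) ** 2 / (2 * sigma_squared)) / np.sqrt(2 * np.pi * sigma_squared)

# Hypothetical cutoff: the prior mass beyond this point is negligible.
w_max = 40
w = np.arange(0, w_max + 1)

# Assumption: rescale the density values so they sum to 1 over the grid.
prior = p_w(w)
prior = prior / prior.sum()

# Numerator of Bayes' rule for every w, then the normalizing constant.
joint = p_s_given_w(w) * prior
p_s_implied = joint.sum()

# Posterior P(W=w|S) over the grid.
posterior = joint / p_s_implied

print("Implied P(S) =", p_s_implied)
print("Most probable W given spam =", int(np.argmax(posterior)))
```

Because P(S|W=0) is 0 under this likelihood, the posterior assigns no probability to W = 0.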


# Generative Process

# Example Data