# Machine Problem: Test 1, Set 0 (practice)

The goal of this machine problem is to help you prepare for the upcoming live computational challenge.
In particular, you should be comfortable reading data from a CSV file into a Pandas dataframe, manipulating the dataframe, and writing it back to a CSV file.

The task is to create a decision rule for a binary detection problem in the Bayesian setting.

## Statistical Structure

The prior probabilities are $\Pr (H_0) = 0.5$ and $\Pr (H_1) = 0.5$.
The probability density function under hypothesis zero is
$$f(y;\theta_0) = \frac{1}{\sqrt{2 \pi}} \exp \left( - \frac{y^2}{2} \right)$$
and the probability density function under hypothesis one is
$$f(y;\theta_1) = \frac{1}{\sqrt{2 \pi}} \exp \left( - \frac{(y-1)^2}{2} \right) .$$

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import norm

mean0 = 0.0
mean1 = 1.0

# Plot the unit-variance Gaussian densities under each hypothesis.
fig, ax1 = plt.subplots()
y = np.linspace(-3, 4, 200)
ax1.plot(y, norm.pdf(y, mean0, 1), 'b-', label='f(y;H0)')
ax1.plot(y, norm.pdf(y, mean1, 1), 'r-', label='f(y;H1)')
ax1.legend(loc='best', frameon=False)
plt.show()

In [None]:
import pandas as pd
from scipy.stats import bernoulli

identities_df = pd.read_csv("ecen662names.csv", index_col=0)
sample_size = 10
np.random.seed(0)

Y0 = np.random.randn(sample_size) + mean0
Y1 = np.random.randn(sample_size) + mean1
Z = bernoulli.rvs(0.5, size=sample_size)
Y = [h0*(1-h) + h1*h for h,h0,h1 in zip(Z,Y0,Y1)]

source_df = pd.DataFrame({'Y0':Y0, 'Y1':Y1, 'Y':Y, 'Z':Z})
sample_df = pd.DataFrame({'Y':Y})

source_df.to_csv("Data1Solution0.csv")
sample_df.to_csv("Data1Set0.csv")


## Data Set Provided to Students

Actual data sets will be given in the form of CSV files.
Your program should be able to load the appropriate data set in a Pandas dataframe and subsequently process it.

In [None]:
sample_df = pd.read_csv("Data1Set0.csv", index_col=0)
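
As a quick illustration of the CSV round trip, the sketch below writes a small dataframe and reads it back. The values and the in-memory buffer are made up for the example (they stand in for a file such as Data1Set0.csv), and it assumes the file was written with the default index column, which is why `index_col=0` is passed when reading.

```python
import io

import pandas as pd

# Hypothetical observations; the real data come from the provided CSV file.
original = pd.DataFrame({'Y': [0.25, 1.5, -0.75]})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
original.to_csv(buffer)  # the row index is written as the first column
buffer.seek(0)

# index_col=0 tells pandas to use that first column as the row index again.
loaded = pd.read_csv(buffer, index_col=0)
print(loaded)
```

Without `index_col=0`, the saved index would come back as an extra unnamed data column next to Y.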

## Decision Rule

This part of the code simply translates a mathematical decision rule into Python code.
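
For reference, the threshold in this rule can be derived from the densities in the Statistical Structure section. The following is a sketch that assumes the objective is to minimize the probability of error (the MAP rule); the symbol $L(y)$ for the likelihood ratio is notation introduced here.
$$L(y) = \frac{f(y;\theta_1)}{f(y;\theta_0)} = \exp \left( \frac{y^2}{2} - \frac{(y-1)^2}{2} \right) = \exp \left( y - \frac{1}{2} \right)$$
$$L(y) > \frac{\Pr (H_0)}{\Pr (H_1)} = 1 \iff y > \frac{1}{2}$$
Under that assumption, the rule decides $H_1$ when $y > 0.5$ and $H_0$ otherwise.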

In [None]:
# Decide H1 (Z_hat = 1) when Y exceeds the threshold 0.5, otherwise H0 (Z_hat = 0).
Z_hat = (sample_df[['Y']] > 0.5).astype(int)
Z_hat = Z_hat.rename(columns={'Y': 'Z_hat'})
print(Z_hat)
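
One way to sanity-check a threshold rule like this is to estimate its error rate on simulated data and compare it with the value predicted by the Gaussian model. The sketch below is not part of the assignment: the sample size, the seed, and the variable names are arbitrary choices for illustration, and the data are freshly simulated rather than read from the practice set.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 100_000                     # hypothetical sample size, far larger than the practice set

z = rng.integers(0, 2, size=n)   # true hypotheses, 0 or 1 with equal probability
y = rng.standard_normal(n) + z   # mean 0 under H0, mean 1 under H1, unit variance
z_hat = (y > 0.5).astype(int)    # same threshold rule as above

empirical_error = np.mean(z_hat != z)

# Under H0 an error occurs when Y > 0.5; under H1 when Y < 0.5.
# Both events have probability Q(0.5), so the overall error is Q(0.5).
theoretical_error = norm.sf(0.5)

print(empirical_error, theoretical_error)
```

With only ten samples, as in the practice set, the observed error fraction can differ substantially from the model value, so a large simulated sample is used here.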

## Output

You need to write your decisions to a CSV file called Data1Answer0.csv.
Remember to add, commit, pull, and push solution files to GitHub.

Z_hat.to_csv("Data1Answer0.csv")