# Practice for in-class assignment

Use the problem from Assignment 2 Problem 2:

$H_0: \theta=0$ 

$H_1: \theta=1$

The observed random variable is $Y = N + \theta\lambda$, where $N\sim U(-1,1)$ and $0 \leq \lambda \leq 2$.
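
Written out under each hypothesis, this definition gives two uniform densities of height $1/2$ whose supports are offset by $\lambda$. The likelihood ratio follows directly from them. The symbols $p_0$, $p_1$, and $L$ are notation introduced here for illustration and may differ from the notation used in the assignment.

$$
p_0(y) = \tfrac{1}{2}\,\mathbf{1}\{-1 < y < 1\}, \qquad
p_1(y) = \tfrac{1}{2}\,\mathbf{1}\{\lambda - 1 < y < \lambda + 1\}
$$

$$
L(y) = \frac{p_1(y)}{p_0(y)} =
\begin{cases}
0, & -1 < y \le \lambda - 1 \\
1, & \lambda - 1 < y < 1 \\
\infty, & 1 \le y < \lambda + 1
\end{cases}
$$

Because $L(y)$ is non-decreasing in $y$, a likelihood-ratio test can be expressed as a threshold on $y$ itself.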

## Generate data from the two distributions


In [1]:
import numpy as np
import pandas as pd

# values of theta
theta0 = 0.0
theta1 = 1.0

# number of samples in each distribution (not the noise variable N above)
N = 10

# use random value of lambda
lambd = np.random.uniform(0.0,2.0)
# generate samples from distribution 0 and 1
Y0 = np.random.uniform(-1.0,1.0,size=(N,)) + theta0*lambd
Y1 = np.random.uniform(-1.0,1.0,size=(N,)) + theta1*lambd
# stack the columns and create an index of classes
Y = np.concatenate((Y0,Y1))
classes = np.concatenate((np.zeros(N,),np.ones(N,)))
# randomize the data order
ind = np.random.permutation(2*N)
classes = classes[ind]
Y = Y[ind]

# collect the data in pandas DataFrames; only the unlabelled copy is written to disk
truth = pd.DataFrame({'X':classes,'Y':Y})
unlabelled = pd.DataFrame({'Y':Y})
unlabelled.to_csv('unlabelled.csv')

print("Wrote to file unlabelled.csv")

Wrote to file unlabelled.csv


## Use the Neyman-Pearson classifier

In [2]:
import scipy.stats
# load the data file, using the first column as the index
read = pd.read_csv('unlabelled.csv', index_col=0)
# classify according to the likelihood ratio
Yv = read.values.copy()
# compute the likelihood for distribution 0
p0 = 0.5*((Yv-theta0*lambd > -1.0) & (Yv-theta0*lambd < 1.0))
# likelihood for distribution 1
p1 = 0.5*((Yv-theta1*lambd > -1.0) & (Yv-theta1*lambd < 1.0))

# Neyman-Pearson decision rule with false-alarm probability alph:
alph = 0.01
# threshold value of Y from the homework
ystar = 1.0-2.0*alph
# pick H1 for y > ystar; these are our guesses for the true classes
guesses = 1.0*(Yv > ystar)
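
The threshold `ystar` comes from requiring $P(Y > y^* \mid H_0) = (1 - y^*)/2 = \alpha$ for $Y \sim U(-1,1)$ under $H_0$. A minimal sketch of that calculation, together with the miss probability it implies under $H_1$, might look like the following. The function names `np_threshold` and `theoretical_miss` are hypothetical helpers, not part of the assignment.

```python
import numpy as np

def np_threshold(alph):
    """Threshold y* with P(Y > y* | H0) = alph for Y ~ U(-1, 1) under H0."""
    # Under H0, P(Y > y*) = (1 - y*) / 2, so y* = 1 - 2*alph.
    return 1.0 - 2.0 * alph

def theoretical_miss(lambd, alph):
    """Miss probability P(Y <= y* | H1) for Y ~ U(lambd - 1, lambd + 1)."""
    ystar = np_threshold(alph)
    # Fraction of the H1 support [lambd - 1, lambd + 1] lying at or below y*,
    # clipped to [0, 1] in case y* falls outside that support.
    return float(np.clip((ystar - (lambd - 1.0)) / 2.0, 0.0, 1.0))

print(np_threshold(0.01))
print(theoretical_miss(1.0, 0.01))
```

For small $\lambda$ the two distributions overlap heavily, so the miss probability stays close to $1 - \alpha$; it drops to zero once $\lambda \geq 2 - 2\alpha$.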

## Evaluate the false alarm and miss rates in the data

In [3]:
# print the values for debugging
print(guesses.flatten())
print(classes)
# evaluate how good our classifier is. The false-alarm rate is the
# fraction of true class-0 samples that were classified as class 1
g = guesses.flatten()
FAR = np.mean(g[classes == 0] == 1)
print("False alarm rate:")
print(FAR)
# the miss rate is the fraction of true class-1 samples classified as class 0
MR = np.mean(g[classes == 1] == 0)
print("Miss rate:")
print(MR)

[ 0.  1.  1.  0.  0.  0.  0.  1.  0.  1.  0.  0.  0.  0.  0.  0.  0.  1.
  0.  1.]
[ 1.  1.  1.  0.  0.  0.  1.  1.  1.  1.  0.  0.  1.  0.  0.  0.  0.  1.
  0.  1.]
False alarm rate:
0.0
Miss rate:
0.4
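
With only 10 samples per class, these rates are noisy estimates. One way to check them is to repeat the experiment with many more samples and compare against the theoretical values: $\alpha$ for the false-alarm probability and $1 - \alpha - \lambda/2$ for the miss probability (valid when $\lambda \leq 2 - 2\alpha$). The sketch below assumes a fixed $\lambda = 1$ and a seeded NumPy generator; both choices, and the helper name `simulate_rates`, are illustrative.

```python
import numpy as np

def simulate_rates(lambd, alph, n, seed=0):
    """Estimate false-alarm and miss rates of the threshold test by simulation."""
    rng = np.random.default_rng(seed)
    ystar = 1.0 - 2.0 * alph                      # threshold y* = 1 - 2*alph
    y0 = rng.uniform(-1.0, 1.0, size=n)           # samples under H0 (theta = 0)
    y1 = rng.uniform(-1.0, 1.0, size=n) + lambd   # samples under H1 (theta = 1)
    far = np.mean(y0 > ystar)    # H0 samples wrongly assigned to H1
    miss = np.mean(y1 <= ystar)  # H1 samples wrongly assigned to H0
    return far, miss

far, miss = simulate_rates(lambd=1.0, alph=0.01, n=200_000)
print("Simulated false alarm rate:", far)
print("Simulated miss rate:", miss)
print("Theoretical values:", 0.01, 1.0 - 0.01 - 1.0 / 2.0)
```

Conditioning each rate on its own class (dividing by the number of $H_0$ or $H_1$ samples, not by the total) is what makes the simulated values directly comparable to $\alpha$ and the miss probability.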
