## This notebook shows how to estimate the parameters of a categorical distribution. It covers:
*   How to generate data that follow a categorical distribution
*   How to estimate the probability of each state with the frequentist approach
*   How to estimate the distribution parameters with the Bayesian approach

Finally, you will compare the frequentist estimate, the Bayesian estimate, and the true value (ground truth).


# **Part 1: Generating data that follow a categorical distribution**

In [1]:
import numpy as np
import scipy.stats as stats

# Discrete random variable with K states
# Generate N = 50 categorical observations, each taking a value in {0, ..., K-1}
N = 50
K = 5

states = np.arange(K)  # state labels 0, 1, 2, 3, 4
params_alpha = 0.5 * np.ones(K)  # Dirichlet parameters alpha = (0.5, 0.5, 0.5, 0.5, 0.5)
# Use alpha to generate mu (mu[k] is the probability of state k) by drawing
# one sample from the Dirichlet distribution.
# Note that the five mu's sum to 1.
mu_probs = np.random.dirichlet(params_alpha)

print('Parameters Mu for Categorical Distribution are:', mu_probs)
# Generate data following the categorical distribution
categoricaldist = stats.rv_discrete(name="categoricaldist", values=(states, mu_probs))
x = categoricaldist.rvs(size=N)
print('The data following categorical distribution are: ')
print(x)

Parameters Mu for Categorical Distribution are: [0.58546094 0.04158259 0.11165693 0.23015395 0.03114558]
The data following categorical distribution are: 
[0 0 3 0 0 0 1 0 0 0 0 0 3 0 4 0 3 3 2 3 3 2 0 0 0 4 0 0 0 0 0 0 0 0 0 2 0
 3 0 0 3 0 2 1 0 0 0 0 0 0]
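
The same sampling step can also be written with NumPy alone. The sketch below is one possible alternative, not the notebook's method: it assumes a seeded `np.random.default_rng` generator (the seed `0` and the names `rng`, `mu_sketch`, `x_sketch` are illustrative) so that reruns give the same draw.

```python
import numpy as np

# Assumption: a fixed seed so the draw is repeatable; the notebook itself is unseeded.
rng = np.random.default_rng(0)

K = 5    # number of states
N = 50   # number of observations

# Draw one probability vector from a symmetric Dirichlet(0.5, ..., 0.5).
mu_sketch = rng.dirichlet(0.5 * np.ones(K))

# Sample N state labels in {0, ..., K-1} with probabilities mu_sketch.
x_sketch = rng.choice(K, size=N, p=mu_sketch)

print(mu_sketch.sum())   # 1 up to floating-point error
print(x_sketch[:10])     # first ten sampled states
```

Because the seed is fixed, the values differ from the unseeded output shown above but stay the same across reruns.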


Question: What do the generated parameters mean? Fill in each question mark below.

For each observation, the probability of being in

State 4 is ?

State 3 is ?

State 2 is ?

State 1 is ?

State 0 is ?

# **Part 2: Frequentist approach to estimating the parameters (maximum likelihood)**

In [2]:
# Estimate the parameters of the categorical distribution from data x
# The maximum likelihood estimate depends purely on the data
# Report your MLE estimates of all the parameters of the categorical distribution
# Count the number of times x falls in each state
# (np.unique only returns states that appear at least once in x)
unique, mk = np.unique(x, return_counts=True)

# Use the formula on Slide 20 in Chapter_2_Inference
MLE = mk / N
#print('The MLE estimation for mu is:', MLE)

for value, mle in zip(unique, MLE):
    print(f"State: {value}, MLE for Mu: {mle}")

State: 0, MLE for Mu: 0.68
State: 1, MLE for Mu: 0.04
State: 2, MLE for Mu: 0.08
State: 3, MLE for Mu: 0.16
State: 4, MLE for Mu: 0.04
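
The estimate `mk/N` can be derived in a few lines. The following is a sketch of the standard Lagrange-multiplier argument; the symbols $m_k$ (number of observations in state $k$), $\mu_k$, and $\lambda$ follow common textbook notation and are assumed here to match the slides.

```latex
\begin{aligned}
\ln p(\mathcal{D}\mid\boldsymbol{\mu}) &= \sum_{k=0}^{K-1} m_k \ln \mu_k,
\qquad \text{subject to } \sum_{k=0}^{K-1}\mu_k = 1,\\
\frac{\partial}{\partial \mu_k}\Big[\sum_{j} m_j \ln \mu_j
  + \lambda\Big(\sum_{j}\mu_j - 1\Big)\Big]
&= \frac{m_k}{\mu_k} + \lambda = 0
\;\Rightarrow\; \mu_k = -\frac{m_k}{\lambda},\\
\sum_{k}\mu_k = 1 &\;\Rightarrow\; \lambda = -N
\;\Rightarrow\; \mu_k^{\mathrm{MLE}} = \frac{m_k}{N}.
\end{aligned}
```

In the sample above, state 0 appears 34 times out of $N = 50$, which gives $34/50 = 0.68$.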


# **Part 3: Bayesian approach to estimating the model parameters**

In [3]:
# The Bayesian estimate combines the data with prior knowledge about mu
# (here a Dirichlet prior with parameters alpha)
# Report your MAP estimate for all parameters of the categorical distribution in this example
from scipy.stats import dirichlet
m = dict(zip(unique, mk))  # dictionary: state -> observed count
alpha = dict(zip(states, params_alpha))  # dictionary: state -> prior parameter alpha
MAP_Numerator = {}
# Apply the formula in (2.11)
for k in states:
    # Skip unobserved states: with alpha[k] < 1 their numerator would be negative
    if k in m and m[k] != 0:
        MAP_Numerator[k] = alpha[k] + m[k] - 1

# Remember: MAP is proportional to alpha + mk - 1; divide by the sum to normalize
MAP = {k: v / sum(MAP_Numerator.values()) for k, v in MAP_Numerator.items()}
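
As a worked illustration of the MAP step, the sketch below redoes the calculation with arrays. It assumes the counts implied by the MLE output above (each MLE value times N = 50) and the same symmetric prior of 0.5; the names `counts`, `numerator`, and `map_sketch` are illustrative and not part of the notebook.

```python
import numpy as np

# Counts per state implied by the MLE output above (MLE * N with N = 50).
counts = np.array([34, 2, 4, 8, 2])
alpha = 0.5 * np.ones(5)   # same symmetric prior as params_alpha

# Unnormalized posterior mode: alpha_k + m_k - 1 for each state.
numerator = alpha + counts - 1

# Assumption: clip negative entries to 0 (they can arise when alpha_k < 1
# and m_k = 0), then renormalize so the estimate sums to 1.
numerator = np.clip(numerator, 0, None)
map_sketch = numerator / numerator.sum()

print(np.round(map_sketch, 4))
```

Every state is observed at least once in this sample, so no entry is clipped and the result should agree with the dictionary-based `MAP` above.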



# **Part 4: Comparison and summary of all results**

In [None]:
# Compare all the estimates and then answer: Which one is closer to the true value?
print('The MLE estimator of Mu is:', MLE)
print('The MAP estimator for Mu is:', MAP)
print('The true Mu is', mu_probs)
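
One way to make "closer to the true value" concrete is to compute a distance between each estimate and the true parameters. The sketch below assumes the L1 distance (sum of absolute differences) as the yardstick, which is only one reasonable choice. It hard-codes the true `mu_probs` and the MLE values printed earlier; the helper `l1_error` and the other names are hypothetical.

```python
import numpy as np

# Values copied from the outputs printed earlier in this notebook.
mu_true = np.array([0.58546094, 0.04158259, 0.11165693, 0.23015395, 0.03114558])
mle = np.array([0.68, 0.04, 0.08, 0.16, 0.04])

# MAP recomputed from the implied counts (MLE * 50) and the prior alpha = 0.5.
counts = np.array([34, 2, 4, 8, 2])
map_est = (counts + 0.5 - 1) / (counts + 0.5 - 1).sum()

def l1_error(estimate, truth):
    """Sum of absolute differences between an estimate and the truth."""
    return float(np.abs(estimate - truth).sum())

errors = {"MLE": l1_error(mle, mu_true), "MAP": l1_error(map_est, mu_true)}
for name, err in errors.items():
    print(f"{name}: L1 error = {err:.4f}")
```

A single sample of 50 observations is noisy, so the ranking of the two estimators can change from one random draw to the next.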

# **Part 5: Count Data**
*   Frequentist approach
*   Bayesian approach

In [None]:
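# Frequentist approach
# Count all K states, including states that never appear in x
counts = np.bincount(x, minlength=K)
mu_MLE = counts / N
print(' MLE of mu= '+np.array2string(mu_MLE, precision=4))

# Bayesian approach
# Assume a flat prior for alpha in Dirichlet distribution
alpha = np.ones(K)
# Formula on Slide 30 (assumed to be the same numerator as in Part 3: alpha + counts - 1)
mu_MAP = alpha + counts - 1
# Set negative values to 0
mu_MAP = np.maximum(mu_MAP, 0)
# Normalize the mu_MAP
mu_MAP = mu_MAP / mu_MAP.sum()
print('MAP of mu= '+np.array2string(mu_MAP, precision=4))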