"""
Here you will compute the likelihood of the data under two
    distributions. The code below reads in a vector of 100 data points.
    You have three tasks: 1) Calculate the likelihood of this data coming
    from a Normal distribution. 2) Calculate the likelihood of this data coming
    from a Uniform distribution. 3) Compute the "likelihood ratio",
    which is the ratio of the two likelihoods. Remember, a larger likelihood
    indicates that the data is more likely to have come from that distribution.

Uniform model
    1) Calculate the minimum (a) and maximum (b) of the data.
    2) Construct a uniform distribution with these extremes.
    3) Use that distribution to construct the likelihood of this data.
    4) Wrap this in a function "uniform_likelihood_calc(data)"
        that outputs the total likelihood.
    5) Evaluate that function on the data.

Normal model
    1) Calculate the mean (μ) and standard deviation (σ) of the data.
    2) Construct a Normal distribution with these parameters.
    3) Use that distribution to construct the likelihood of this data.
    4) Wrap this in a function "normal_likelihood_calc(data)"
        that outputs the total likelihood.
    5) Evaluate that function on the data.

Finally, use these two functions to evaluate the likelihood ratio.

Which distribution better matches the structure of this data?

"""
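
For reference, the two likelihoods described above can be written out explicitly. This is only a sketch of the standard formulas under the notation assumed here: $x_1, \dots, x_n$ are the data points, $n = 100$, and $\Lambda$ is a label chosen for the likelihood ratio (none of these symbols appear in the original instructions).

```latex
% Uniform model with a = min_i x_i and b = max_i x_i:
% every point lies in [a, b], so each density value is 1/(b - a).
L_U = \prod_{i=1}^{n} \frac{1}{b - a} = (b - a)^{-n}

% Normal model with mean \mu and standard deviation \sigma:
L_N = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}
      \exp\!\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right)

% Likelihood ratio (Normal over Uniform):
\Lambda = \frac{L_N}{L_U}
```

A value of $\Lambda$ above 1 favors the Normal model, and a value below 1 favors the Uniform model.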

In [1]:
using Pkg
Pkg.instantiate()
Pkg.status()

[32m[1mStatus[22m[39m `~/.julia/environments/v1.10/Project.toml`
  [90m[31c24e10] [39mDistributions v0.25.107
  [90m[033835bb] [39mJLD2 v0.4.45
  [90m[f0f68f2c] [39mPlotlyJS v0.18.12
  [90m[91a5bcdd] [39mPlots v1.40.0
  [90m[f3b207a7] [39mStatsPlots v0.15.6


In [2]:
using Distributions
using Plots
using Random
using JLD2

Random.seed!(1234);

In [24]:
n = 100;

data_load = load("Likelihood_calc_data.jld2");
data = data_load["data"]

100-element Vector{Float64}:
  1.5972424026826477
 -1.7741176410029897
  9.442721649109373
 -9.701823014298
  0.40709987447435836
  2.791231993605468
  6.792438681161421
  9.34285537830766
  5.795288190702614
  3.920813962878004
  ⋮
 -7.503055205511268
  9.872175725051665
 -8.626857487848499
  9.067230197335967
 -0.31326693032862174
 -7.379486875582819
  8.92906452462767
  1.4864697055663498
  3.5529981519915577

In [25]:
using Distributions

#  Uniform distribution
function uniform_likelihood_calc(data)
    a = minimum(data)
    b = maximum(data)
    dist = Uniform(a, b)
    total_likelihood = 1.0 
    for point in data
        total_likelihood *= pdf(dist, point)
    end
    return total_likelihood
end

#  Normal distribution
function normal_likelihood_calc(data)
    μ = mean(data)
    σ = std(data)
    dist = Normal(μ, σ)
    total_likelihood = 1.0
    for point in data
        total_likelihood *= pdf(dist, point)
    end
    return total_likelihood
end


# Use the `data` vector loaded from "Likelihood_calc_data.jld2" above;
# do not overwrite it with random draws such as randn(100).

# Calculate likelihoods
uniform_likelihood = uniform_likelihood_calc(data)
normal_likelihood = normal_likelihood_calc(data)

# Calculate the ratio of likelihoods
ratio_normal_to_uniform = normal_likelihood / uniform_likelihood

# Output the results
println("Uniform Likelihood: ", uniform_likelihood)
println("Normal Likelihood: ", normal_likelihood)
println("Ratio of Normal to Uniform Likelihood: ", ratio_normal_to_uniform)


# OVERALL WHAT IS LEARNED: The likelihood ratio identifies the better fit. A ratio above 1
# favors the Normal distribution; a ratio below 1 favors the Uniform distribution.


normal_likelihood_calc (generic function with 1 method)
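
Multiplying 100 density values gives a very small number, and with more data points the product can underflow to zero. A common alternative is to sum log-densities and compare the difference of the two log-likelihoods. The sketch below assumes that approach; the function names `uniform_loglikelihood_calc` and `normal_loglikelihood_calc` and the vector `example` are hypothetical and not part of the exercise.

```julia
using Distributions
using Statistics  # mean and std

# Log-likelihood of the data under a Uniform(min, max) model.
function uniform_loglikelihood_calc(data)
    dist = Uniform(minimum(data), maximum(data))
    return sum(logpdf.(dist, data))
end

# Log-likelihood of the data under a Normal(mean, std) model.
function normal_loglikelihood_calc(data)
    dist = Normal(mean(data), std(data))
    return sum(logpdf.(dist, data))
end

# Hypothetical example input, not the notebook's data.
example = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Difference of log-likelihoods = log of the likelihood ratio.
# Positive favors the Normal model, negative favors the Uniform model.
log_ratio = normal_loglikelihood_calc(example) - uniform_loglikelihood_calc(example)
println("Log-likelihood ratio (Normal vs Uniform): ", log_ratio)
```

To apply this to the exercise, pass the loaded `data` vector in place of `example`; `exp(log_ratio)` then recovers the plain likelihood ratio whenever it is representable as a `Float64`.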