# BAT.jl Tutorial - Poisson Counting Experiment

In [None]:
using BAT
using Distributions 
using IntervalSets
using ValueShapes
using Plots
using ArraysOfArrays
using StatsBase 

## The Situation
We want to measure a source in the presence of background. 
For example, this could be a radioactive element that is measured in a laboratory and is therefore observed together with background from natural radioactivity.


## 1. Background only measurement
In order to measure the count rate of the radioactive source, we first need to estimate the background rate $\lambda_b$.
A measurement without the signal source yields $N_B=10$ counts.
### Task: 
Perform a Bayesian analysis of this situation to estimate the parameter $\lambda_b$ using a Poisson model.
Start by defining the model and its likelihood using the *logpdf()* and *Poisson()* functions of the *Distributions* package.
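
As a quick sanity check of what *logpdf()* returns, the sketch below compares it with the Poisson log-likelihood $\ln P(k|\lambda) = k \ln\lambda - \lambda - \ln k!$ written out by hand. It uses only the *Distributions* package; the trial rate `λ_trial` is an arbitrary illustrative value and not part of the analysis.

```julia
using Distributions

k = 10          # observed counts from the background-only measurement
λ_trial = 7.5   # arbitrary trial value for the rate (illustrative assumption)

# Poisson log-likelihood written out by hand: k*log(λ) - λ - log(k!)
manual_loglik = k * log(λ_trial) - λ_trial - sum(log, 1:k)

# The same quantity as returned by Distributions
package_loglik = logpdf(Poisson(λ_trial), k)

println(manual_loglik ≈ package_loglik)
```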

In [None]:
struct Background<:AbstractDensity
    k::Float64 # observed counts
end

function BAT.density_logval(target::Background, params::Union{NamedTuple,AbstractVector{<:Real}})
    return logpdf(Poisson(params[1][1]), target.k) # poisson log-likelihood
end

Create an instance of the model and define the prior with the help of the *NamedPrior()* function.

Afterwards, use the model and the prior to define the *PosteriorDensity()*. 

In [None]:
# Number of observed background events
kb = 10
likelihood_B = Background(kb)

prior_B = NamedPrior(
    λb = 0..30
)

posterior_B = PosteriorDensity(likelihood_B, prior_B)

Define the settings for the sampling. Choose *MetropolisHastings()* as the algorithm and set the number of chains and samples.

In [None]:
algorithm = MetropolisHastings()
nchains = 8
nsamples = 10^5

Start the sampling by using *rand()* on the *MCMCSpec()* object using the settings defined above.

In [None]:
samples_B, sampleids_B, stats_B, chains_B = rand(MCMCSpec(algorithm, posterior_B), nsamples, nchains);

Finally, look at the resulting distribution for the background rate using *plot()* and print the results of the sampling.

In [None]:
par_names=["\$\\lambda_b\$"]
plot(posterior_B, samples_B, :λb, xlabel = par_names[1], ylabel = "P($(par_names[1]))")
plot!(prior_B, :λb)

In [None]:
print(stats_B)

Questions:
* What is the distribution of the posterior? What is the best estimator for the parameter?

## 2. Further Background only measurement
A second measurement without the signal source yields a number of $N_B=8$ counts.
Therefore, we want to update our estimate of the background rate using this new knowledge and the old result.
### Task:
Perform an analysis similar to the first one, using the posterior distribution of the first background measurement as the prior of this analysis.
This can be done by building a *StatsBase* histogram with *fit(Histogram,flatview(samples),weights,nbins)*.

Be mindful to carry the sample weights along by wrapping them in *FrequencyWeights()*.
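
To illustrate why the weights matter, the following sketch shows that a histogram filled with frequency weights is the same as a histogram of the samples repeated according to their weights. The sample positions, multiplicities and bin edges are made-up toy values; only *StatsBase* functions are used.

```julia
using StatsBase

x = [1.5, 2.5, 3.5]   # toy sample positions (illustrative assumption)
w = [2, 5, 1]         # toy multiplicities, i.e. how often each position occurs
edges = 0:1:4         # toy bin edges

# Histogram that takes the multiplicities into account via frequency weights
h_weighted = fit(Histogram, x, FrequencyWeights(w), edges, closed = :left)

# Equivalent histogram of the explicitly repeated samples
x_expanded = inverse_rle(x, w)
h_expanded = fit(Histogram, x_expanded, edges, closed = :left)

println(h_weighted.weights == h_expanded.weights)
```

Dropping the weights would instead count every position once and distort the shape of the distribution.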

In [None]:
hist_10 = fit(Histogram, flatview(samples_B.params)[1, :], FrequencyWeights(samples_B.weight), nbins = 400, closed = :left)

The histogram can be used as prior by converting it into a univariate distribution using *BAT.HistogramAsUvDistribution()*.
Otherwise proceed similarly to the first task.

In [None]:
# Number of observed background events
kb2 = 8
likelihood_B2 = Background(kb2)

prior_B2 = NamedPrior(
    λb = BAT.HistogramAsUvDistribution(hist_10) # replace by analytic poisson prior
)

prior_B2flat = NamedPrior(
    λb = 0..30
)

posterior_B2 = PosteriorDensity(likelihood_B2, prior_B2)
likelipost_B2 = PosteriorDensity(likelihood_B2, prior_B2flat)
;

In [None]:
samples_B2, sampleids_B2, stats_B2, chains_B2 = rand(MCMCSpec(algorithm, posterior_B2), nsamples, nchains);
likelipost_samples_B2, likelids_B2, likeli_stats_B2, like_chains_B2 = rand(MCMCSpec(algorithm, likelipost_B2), nsamples, nchains);

In [None]:
stats_B2.mode

Use the *plot!(prior)* function to visualize both the posterior of the first analysis and the updated posterior.

In [None]:
plot(posterior_B2, samples_B2, :λb, xlabel = par_names[1], ylabel = "P($(par_names[1]))")
#plot!(likelipost_B2, likelipost_samples_B2, :λb, seriestype=:stephist, linecolor=:blue,linewidth=1.5, localmode=false, label="likelihood")
plot!(prior_B2, :λb, linewidth=1.5)

Questions:
* How does the posterior change with this new knowledge about the background?
* Would there be another way to implement the posterior of the first analysis other than using the samples themselves?
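
One possible approach to the last question is sketched here, under the assumption that the upper bound of 30 on the flat prior can be ignored because the posterior has negligible mass there. A flat prior times a Poisson likelihood is a Gamma distribution in $\lambda_b$, so the update can be written in closed form. The variable names are illustrative, and whether such a distribution object can be passed to *NamedPrior()* directly is not tested here.

```julia
using Distributions

k1, k2 = 10, 8   # counts of the two background-only measurements

# Flat prior × Poisson(k1 | λ) ∝ λ^k1 * exp(-λ): Gamma with shape k1 + 1 and scale 1
post_first = Gamma(k1 + 1, 1.0)

# Multiplying by Poisson(k2 | λ) gives λ^(k1 + k2) * exp(-2λ): shape k1 + k2 + 1, scale 1/2
post_second = Gamma(k1 + k2 + 1, 0.5)

println("first:  mode = ", mode(post_first), ", mean = ", mean(post_first))
println("second: mode = ", mode(post_second), ", mean = ", mean(post_second))

# Probability mass that a cut at λ = 30 would remove (assumed to be negligible)
println("tail beyond 30: ", ccdf(post_first, 30.0))
```

The modes and means can be compared with the values reported by the sampler in the two analyses.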

## 3. Signal + Background
Having added the radioactive source to our experimental setup, we repeat the measurement and observe $N_{S+B}=12$ counts.
From this measurement and our prior knowledge we should be able to estimate the rate of the signal $\lambda_s$.
### Task
Perform a third analysis using a Poisson model with the combined signal + background rate.
Use the known information about the background as prior and choose a suitable prior for the signal.
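
The combined model relies on the fact that the sum of two independent Poisson-distributed counts is again Poisson distributed, with rate $\lambda_s + \lambda_b$. The sketch below checks this numerically for one count value; the two rates are arbitrary illustrative values.

```julia
using Distributions

λs_trial, λb_trial = 3.0, 9.0   # arbitrary illustrative rates
k = 12                          # observed total counts

# Probability of k total counts, summed over all splits into signal and background counts
p_convolved = sum(pdf(Poisson(λs_trial), j) * pdf(Poisson(λb_trial), k - j) for j in 0:k)

# Probability from a single Poisson distribution with the summed rate
p_combined = pdf(Poisson(λs_trial + λb_trial), k)

println(p_convolved ≈ p_combined)
```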

In [None]:
struct SignalAndBackground<:AbstractDensity
    k::Float64 # observed counts
end

function BAT.density_logval(target::SignalAndBackground, params::Union{NamedTuple,AbstractVector{<:Real}})
    return logpdf(Poisson(params[1][1] + params[2][1]), target.k)  # poisson log-likelihood
end

kSB = 12
likelihood_SB = SignalAndBackground(kSB)

In [None]:
hist_B2 = fit(Histogram, flatview(samples_B2.params)[1, :], FrequencyWeights(samples_B2.weight), nbins = 400, closed = :left)
B2 = BAT.HistogramAsUvDistribution(hist_B2);

In [None]:
prior_SB = NamedPrior(
    λb = B2,
    λs = 0..30
)

posterior_SB = PosteriorDensity(likelihood_SB, prior_SB);

In [None]:
samples_SB, sampleids_SB, stats_SB, chains_SB = rand(MCMCSpec(algorithm, posterior_SB), nsamples, nchains);

In [None]:
plot(samples_SB)

Questions:
* (How) Does the distribution of the background rate change?
* How would you communicate your result of the signal rate (estimate value and uncertainty)? 

## 4. Error propagation

Finally, we want to calculate the cross section of the signal process using the formula
$$σ = \frac{λ_s}{ε \cdot L}$$
with the value of $L$ set to $1.1$.
Our final result should be either a value or an upper limit on the signal cross section.

### Task a) Use $ε \sim$ normal distribution
The efficiency has been measured to be $ε = 0.1 \pm 0.02$.
Assume the error follows a normal distribution and calculate $σ$.
Use the *Distributions* package and *rand()* to obtain a sample for $\epsilon$ and calculate $\sigma$ using the sampling points for $\lambda_s$ and the formula.
The function *broadcast()* might be useful for element-wise operations when handling the samples.
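
The propagation itself can be sketched without BAT, as below. Two things in this sketch are assumptions and not part of the exercise: the draws for $\lambda_s$ come from an arbitrary stand-in distribution instead of the real posterior samples, and the normal distribution for $\epsilon$ is truncated to $[0, 1]$ so that no unphysical efficiencies enter the division.

```julia
using Distributions
using Statistics
using Random

Random.seed!(1)
n = 100_000
L = 1.1

# Stand-in for unweighted posterior draws of λs (illustrative assumption, not the BAT result)
λs_draws = rand(Exponential(4.0), n)

# Efficiency draws; the truncation to [0, 1] is a choice made for this sketch
ε_draws = rand(truncated(Normal(0.1, 0.02), 0.0, 1.0), n)

# Element-wise propagation through σ = λs / (ε * L)
σ_draws = λs_draws ./ (ε_draws .* L)

# A 90% upper limit read off as a quantile of the propagated sample
upper90 = quantile(σ_draws, 0.9)
println("90% upper limit on σ (toy input): ", upper90)
```

With weighted samples, such as those returned by the sampler, the weights have to be carried into the quantile or histogram as well.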

In [None]:
nsamples=800000
ε = rand(Normal(0.1,0.02),nsamples)
L = 1.1
σS = (samples_SB.params.data[2,1:nsamples])./(ε*L)

In [None]:
hist_σ = fit(Histogram, σS,FrequencyWeights(samples_SB.weight),nbins=300,closed = :left)
plot(hist_σ,1,seriestype=:smallest_intervals,xlim=(0,400))

Questions:
* What is the limit on the cross section?

### Task b) Binomial analysis of calibration measurement with known source 
The number of expected events is $N_\text{expected} = 1000$.
The detector measures $N_\text{measured} = 123$ events.
Implement a binomial model using the *Binomial(n,p)* function of the *Distributions* package and extract the efficiency of the detector with BAT.
Afterwards, repeat the calculations in a) using the posterior distribution of the efficiency.
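
For a cross-check of the sampled efficiency, and assuming a flat prior on the efficiency between 0 and 1, the binomial posterior is known in closed form: a Beta distribution with parameters $k+1$ and $n-k+1$. The sketch uses only *Distributions* with the numbers quoted above; the variable names are illustrative.

```julia
using Distributions

n_expected = 1000   # number of expected events (trials)
n_measured = 123    # number of measured events (successes)

# Flat prior × Binomial likelihood ∝ p^k * (1 - p)^(n - k): Beta(k + 1, n - k + 1)
post_eff = Beta(n_measured + 1, n_expected - n_measured + 1)

println("mode = ", mode(post_eff))
println("mean = ", mean(post_eff), ", std = ", std(post_eff))

# Central 68% interval from the quantile function
interval68 = [quantile(post_eff, q) for q in (0.16, 0.84)]
println("central 68% interval: ", interval68)
```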

In [None]:
struct BinomialModel<:AbstractDensity
    n::Int64 # number of trials
    k::Int64 # number of successes
end

function BAT.density_logval(target::BinomialModel, params::Union{NamedTuple,AbstractVector{<:Real}})
    return logpdf(Binomial(target.n, params[1][1]), target.k) # binomial log-likelihood
end

likelihood_binomial = BinomialModel(1000, 123) # N_expected = 1000 trials, N_measured = 123 successes

In [None]:
prior_binomial = NamedPrior(
    p = 0..1
)
nsamples_binom = 20000
posterior_binomial = PosteriorDensity(likelihood_binomial, prior_binomial);

In [None]:
samples_binomial, sampleids_binomial, stats_binomial, chains_binomial = rand(MCMCSpec(algorithm, posterior_binomial), nsamples_binom, nchains);

In [None]:
plot(samples_binomial, 1)

In [None]:
σS = (samples_SB.params.data[2,1:nsamples_binom])./(samples_binomial.params.data[1,1:nsamples_binom]*L)
hist_σ = fit(Histogram, σS,FrequencyWeights(broadcast(+,samples_SB.weight[1:nsamples_binom],samples_binomial.weight[1:nsamples_binom])),nbins=200,closed = :left)
plot(hist_σ,1,seriestype=:smallest_intervals)
# TODO: check whether adding the two weight vectors is the correct way to combine them

In [None]:
#hist_binom = fit(Histogram,flatview(samples_binomial.params.data)[1,:],FrequencyWeights(samples_binomial.weight),closed = :left)
#hist_SB = fit(Histogram, flatview(samples_SB.params.data)[2,:],FrequencyWeights(samples_SB.weight),closed = :left)

#uv_binom = BAT.HistogramAsUvDistribution(hist_binom)
#uv_SB = BAT.HistogramAsUvDistribution(hist_SB)

#σS  = rand(uv_SB,80000)./(rand(uv_binom,80000)*L)
#σS_h= fit(Histogram,σS,closed = :left,nbins=200)
#plot(σS_h,1,seriestype=:smallest_intervals)

Questions:
* What is the final upper limit on the cross section?
* How could the experiment and its results be improved/checked?