# Chapter 6. Odds and Addends
[Link to chapter online](https://allendowney.github.io/ThinkBayes2/chap06.html)

## Warning

The content of this file may be incorrect, erroneous and/or harmful. Use it at your own risk.

## Imports

In [None]:
import DataFrames as Dfs
import Distributions as Dsts
import CairoMakie as Cmk

In [None]:
include("./pmf.jl")
import .ProbabilityMassFunction as Pmf

## Odds

Odds, defined as $odds = \frac{P(success)}{P(failure)} = \frac{P(success)}{1 -
P(success)}$, are used in the odds form of Bayes's theorem, the so-called **Bayes's Rule**.

Bayes’s Rule is convenient if you want to do a Bayesian update on paper or in
your head. It also sheds light on the important idea of evidence and how we can
quantify the strength of evidence.

In [None]:
function getOdds(prob::Float64)::Float64
    @assert (0 <= prob <= 1) "prob takes value in range [0-1]"
    return prob / (1 - prob)
end

In [None]:
# if prob of winning is 75% then the odds are
# 3 to 1
getOdds(0.75)

In [None]:
# odds in favor (1 to 9)
getOdds(0.1)

In [None]:
# we may prefer to report odds against (9 to 1)
getOdds(1-0.1)

We can also do the opposite transformation. Below is my (BL) derivation, for practice:

$odds = \frac{p}{1-p}$

$\frac{p}{1-p} = odds$ # multiply both sides by (1-p)

$p = odds * (1-p)$ # expand the right side

$p = odds - odds * p$ # add (odds * p) to both sides

$p + (odds * p) = odds$ # factor out p on the left side

$p * (1 + odds) = odds$ # divide both sides by (1 + odds)

$p = odds / (1 + odds)$

Below is a Julia function for that calculation:

In [None]:
function getProb(odds::T)::T where T<:Union{Rational, Float64}
    return odds / (odds + 1) 
end

In [None]:
getProb(3/2),
getProb(3//2)

In [None]:
function getProb(yes::Int, no::Int)::Float64
    return yes / (yes + no) 
end

In [None]:
getProb(3, 2)
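
As a quick sanity check on the two conversions, the sketch below converts a probability to odds and back again. The names `toOdds` and `toProb` are hypothetical stand-ins, used here only so that the block is self-contained and does not overwrite `getOdds` and `getProb`.

```julia
# Hypothetical helper names, chosen so they do not overwrite getOdds/getProb.
toOdds(p::Real) = p / (1 - p)
toProb(o::Real) = o / (1 + o)

# Converting a probability to odds and back should return the original value.
for p in (0.1, 0.25, 0.5, 0.75, 0.9)
    @assert isapprox(toProb(toOdds(p)), p)
end

# With rationals the round trip is exact.
@assert toProb(toOdds(3 // 4)) == 3 // 4
```

If the derivation above is right, none of the assertions should fail.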

[...] some computations are easier when we work with odds, [...], and some
computations are even easier with log odds, [...].

## Bayes's Rule

So far we have worked with Bayes’s theorem in the “probability form”:

$P(H|D) = \frac{P(H)P(D|H)}{P(D)}$

Writing $odds(A)$ for odds in favor of `A`, we can express Bayes’s Theorem in “odds form”:

$odds(A|D) = odds(A) * \frac{P(D|A)}{P(D|B)}$

In general Bayes's Rule says that the posterior odds are prior odds times the likelihood ratio.


**My comment**

Not sure how this odds form came to be all of a sudden?!

$P(D|B)$ suggests that the probability of failure was introduced, since
in the case of two exhaustive, mutually exclusive events (success and failure) we get:

$P(success) + P(failure) = 1$ or $P(A) + P(B) = 1$

therefore

$P(success) = 1 - P(failure)$ or $P(A) = 1 - P(B)$

So if:

$odds(A) = \frac{P(A)}{1 - P(A)} = \frac{P(A)}{P(B)}$, then

$odds(A|D) = \frac{P(A|D)}{1 - P(A|D)} = \frac{P(A|D)}{P(B|D)}$

Moreover, from the outset Bayes's theorem would have to be written in the $P(A|D)$ form and not the $P(H|D)$ form.

So:

$P(A|D) = \frac{P(A)P(D|A)}{P(D)}$, instead of

$P(H|D) = \frac{P(H)P(D|H)}{P(D)}$

Given the above, we can write Bayes's theorem once for `A` and once for `B`:

$P(A|D) = \frac{P(A)P(D|A)}{P(D)}$ and $P(B|D) = \frac{P(B)P(D|B)}{P(D)}$

If we divide the first equation by the second, $P(D)$ cancels:

$\frac{P(A|D)}{P(B|D)} = \frac{P(A)P(D|A)}{P(B)P(D|B)} = \frac{P(A)}{P(B)} * \frac{P(D|A)}{P(D|B)}$

Since $\frac{P(A|D)}{P(B|D)} = odds(A|D)$ and $\frac{P(A)}{P(B)} = odds(A)$, we get

$odds(A|D) = odds(A) * \frac{P(D|A)}{P(D|B)}$, which is the odds form.

Anyway, let's get back to the cookie problem:

> Suppose there are two bowls of cookies. Bowl 1 contains 30 vanilla cookies and
> 10 chocolate cookies. Bowl 2 contains 20 of each. Now suppose you choose one of
> the bowls at random and, without looking, select a cookie at random. The cookie
> is vanilla. What is the probability that it came from Bowl 1?

Let's use the **Bayes's Rule**:

$odds(A|D) = odds(A) * \frac{P(D|A)}{P(D|B)}$

In [None]:
# odds(A) = odds(bowl1) = p(bowl1)/p(bowl2)
cookiePriorOdds = 1 # 0.5/0.5
# p(D|A) = p(vanilla|bowl1) = 30/40 = 3/4 = 0.75
# p(D|B) = p(vanilla|bowl2) = 20/40 = 2/4 = 0.5
# likelihood ratio = p(D|A)/p(D|B)
cookieLikelihoodRatio = 0.75 / 0.5
cookiePostOdds = cookiePriorOdds * cookieLikelihoodRatio

In [None]:
cookiePostProb = getProb(cookiePostOdds)

If we draw another cookie and it's chocolate, we can do another update:

In [None]:
cookieLikelihoodRatio = (10/40) / (20/40)
cookiePostOdds *= cookieLikelihoodRatio

In [None]:
cookiePostProb = getProb(cookiePostOdds)
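
To check that the odds form agrees with the probability form, the sketch below runs both cookie updates both ways. The helper `bayesTableUpdate` and the variable names are hypothetical, introduced only for this comparison.

```julia
# Probability form: multiply priors by likelihoods, then normalize.
function bayesTableUpdate(priors::Vector{Float64}, likelihoods::Vector{Float64})::Vector{Float64}
    unnorm = priors .* likelihoods
    return unnorm ./ sum(unnorm)
end

bowlPriors = [0.5, 0.5]                                  # P(bowl1), P(bowl2)
afterVanilla = bayesTableUpdate(bowlPriors, [30 / 40, 20 / 40])
afterChocolate = bayesTableUpdate(afterVanilla, [10 / 40, 20 / 40])

# Odds form: prior odds times the likelihood ratio of each observation.
oddsAfterVanilla = 1.0 * (30 / 40) / (20 / 40)
oddsAfterChocolate = oddsAfterVanilla * (10 / 40) / (20 / 40)

# Both forms should give the same posterior probability of bowl 1.
@assert isapprox(afterVanilla[1], oddsAfterVanilla / (1 + oddsAfterVanilla))       # 0.6
@assert isapprox(afterChocolate[1], oddsAfterChocolate / (1 + oddsAfterChocolate)) # 3/7
```

The odds form needs only one multiplication per observation, with no normalization step, which is why it is convenient on paper.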

## Oliver's Blood

Problem from MacKay's [Information Theory, Inference, and Learning
Algorithms](https://www.inference.org.uk/mackay/itila/):

> Two people have left traces of their own blood at the scene of a crime. A
> suspect, Oliver, is tested and found to have type ‘O’ blood. The blood groups of
> the two traces are found to be of type ‘O’ (a common type in the local
> population, having frequency 60%) and of type ‘AB’ (a rare type, with frequency
> 1%). Do these data [the traces found at the scene] give evidence in favor of the
> proposition that Oliver was one of the people [who left blood at the scene]?

What does it mean for data to be evidence in favor of a hypothesis? Bayes's Rule says:

$odds(A|D) = odds(A) * \frac{P(D|A)}{P(D|B)}$

Dividing through by $odds(A)$, we get:

$\frac{odds(A|D)}{odds(A)} = \frac{P(D|A)}{P(D|B)}$

The term on the left is the ratio of the posterior and prior odds. The term on
the right is the likelihood ratio, also called the **Bayes factor**.

If the Bayes factor is greater than 1, that means that the data were more likely
under `A` than under `B`. And that means that the odds are greater, in light of
the data, than they were before.

If the Bayes factor is less than 1, that means the data were less likely under 
`A` than under `B`, so the odds in favor of `A` go down.

Finally, if the Bayes factor is exactly 1, the data are equally likely under
either hypothesis, so the odds do not change.

If Oliver is one of the people who left blood at the crime scene, he accounts
for the ‘O’ sample; in that case, the probability of the data is the
probability that a random member of the population has type ‘AB’ blood, which
is 1%.

If Oliver did not leave blood at the scene, we have two samples to account for.
If we choose two random people from the population, what is the chance of
finding one with type ‘O’ and one with type ‘AB’? Well, there are two ways it
might happen:
- O and AB (prob 0.6 * 0.01 = 0.006)
- AB and O (prob 0.01 * 0.6 = 0.006)

So the total probability is 0.006 + 0.006 = 0.012.
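
The same number can be obtained by brute-force enumeration. The sketch below is only an illustration: the name `probOneOOneAB` is hypothetical, and the `"other"` entry is an assumption that lumps all remaining blood types together so that the frequencies sum to 1.

```julia
# Blood type frequencies from the problem statement; "other" lumps together
# every remaining type (assumption: the frequencies sum to 1).
bloodFreqs = Dict("O" => 0.60, "AB" => 0.01, "other" => 0.39)

# Sum the probability of every ordered pair of independent random people
# whose blood types are exactly one "O" and one "AB".
function probOneOOneAB(freqs::Dict{String,Float64})::Float64
    total = 0.0
    for (t1, p1) in freqs, (t2, p2) in freqs
        if sort([t1, t2]) == ["AB", "O"]
            total += p1 * p2
        end
    end
    return total
end

pTwoStrangers = probOneOOneAB(bloodFreqs)
@assert isapprox(pTwoStrangers, 0.012)
```

Only the two ordered pairs (O, AB) and (AB, O) pass the filter, which is where the factor of two comes from.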

In [None]:
# p(D|A) = p(Oliver left the O sample and someone else accounts for AB)
like1 = 0.01
# p(D|B) = p(both the O and AB samples were left by other people)
like2 = 0.6*0.01*2
likelihoodRatio = like1/like2

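The likelihood ratio is about 0.83, which is less than 1, so the blood evidence is weak evidence against the hypothesis that Oliver left blood at the scene.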

In [None]:
getProb(likelihoodRatio) # posterior prob that Oliver left blood at the scene, assuming prior odds of 1

If this result still bothers you, this way of thinking might help: the data
consist of a common event, type ‘O’ blood, and a rare event, type ‘AB’ blood. If
Oliver accounts for the common event, that leaves the rare event unexplained. If
Oliver doesn’t account for the ‘O’ blood, we have two chances to find someone in
the population with ‘AB’ blood. And that factor of two makes the difference.

### Exercise 0.1

Suppose that based on other evidence, your prior belief in Oliver’s guilt is 90%.
How much would the blood evidence in this section change your beliefs? What if
you initially thought there was only a 10% chance of his guilt?



Bayes's Rule.

$odds(A|D) = odds(A) \frac{P(D|A)}{P(D|B)}$

Previously the prior odds, $odds(A)$, were 1: we assumed it was equally likely
that Oliver left the ‘O’ sample (p = 0.5) as that he did not (p = 0.5).
Now the prior probability is 0.9 (90%), so the prior odds are 0.9/0.1 = 9.

In [None]:
# prior probability of Oliver's guilt: 90% = 0.9
postOdds = getOdds(0.9) * like1/like2

In [None]:
# posterior probability of Oliver's guilt, given the 90% prior
getProb(postOdds)

Bayes's Rule.

$odds(A|D) = odds(A) \frac{P(D|A)}{P(D|B)}$

Previously the prior odds, $odds(A)$, were 1: we assumed it was equally likely
that Oliver left the ‘O’ sample (p = 0.5) as that he did not (p = 0.5).
Now the prior probability is 0.1 (10%), so the prior odds are
0.1/0.9 = 1/9 ≈ 0.111.

In [None]:
# prior probability of Oliver's guilt: 10% = 0.1
postOdds = getOdds(0.1) * like1/like2

In [None]:
# posterior probability of Oliver's guilt, given the 10% prior
getProb(postOdds)
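
Both parts of the exercise follow the same three steps (probability to odds, multiply by the Bayes factor, odds back to probability), so they can be wrapped in one function. This is a minimal sketch; `updateProb` and `bloodBayesFactor` are hypothetical names, not part of the original notebook.

```julia
# Hypothetical helper: posterior probability from a prior probability and a Bayes factor.
function updateProb(prior::Float64, bayesFactor::Float64)::Float64
    posteriorOdds = prior / (1 - prior) * bayesFactor
    return posteriorOdds / (1 + posteriorOdds)
end

# Same value as like1 / like2 in the Oliver's Blood section, about 0.83.
bloodBayesFactor = 0.01 / (2 * 0.6 * 0.01)

for prior in (0.1, 0.5, 0.9)
    post = updateProb(prior, bloodBayesFactor)
    println("prior = ", prior, "  posterior = ", round(post, digits=3))
end
```

With a Bayes factor this close to 1, the posterior stays close to the prior in every case: about 0.085, 0.455 and 0.882 respectively.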

## Addends

The second half of this chapter is about distributions of sums and results of
other operations.

Let's start with simple 6-sided dice.

In [None]:
# creates a Pmf with names - number of dots on a dice
# priors - probability of obtaining the number of dots on a dice
function makeDice(nSides::Int = 6)::Pmf.Pmf{Int}
    return Pmf.getPmfFromSeq(1:nSides |> collect)
end

In [None]:
dice = makeDice(6)

In [None]:
fig = Cmk.Figure()
Cmk.barplot(fig[1, 1], dice.names, dice.priors, color = "lightblue",
    axis=(;title="Prior distribution of dots after a dice throw",
    xlabel="Outcome (number of dots)",
    ylabel="PMF",
    xticks=1:6)
)
fig

In [None]:
function sumProbsByNames(names::Vector{Int}, probs::Vector{Float64})::Dict{Int,Float64}
    @assert length(names) == length(probs)
    res::Dict{Int,Float64} = Dict()
    for i in eachindex(names)
        res[names[i]] = get(res, names[i], 0) + probs[i]
    end
    return res
end

In [None]:
"""
Inspired by a similar function found in empiricaldist by Allen Downey.
    applies fn to cartesianProduct(pmf1.names, pmf2.names)
    applies * to cartesianProduct(pmf1.priors, pmf2.priors), i.e. P(A) * P(B)

args:
fn - function accepting two Ints as input and returning an Int as output
"""
function convolveDist(pmf1::Pmf.Pmf{Int}, pmf2::Pmf.Pmf{Int}, fn::Function)::Pmf.Pmf{Int}
    # Iterators.product(v1, v2) gives the cartesian product of two vectors: (v1[1], v2[1]), (v1[2], v2[1]), etc.
    # vec flattens the resulting matrix into a vector; both vectors keep the same element order
    newNames::Vector{Int} = vec([fn(a, b) for (a, b) in Iterators.product(pmf1.names, pmf2.names)])
    newPriors::Vector{Float64} = vec([a * b for (a, b) in Iterators.product(pmf1.priors, pmf2.priors)])
    probs::Dict{Int, Float64} = sumProbsByNames(newNames, newPriors)
    orderedKeys::Vector{Int64} = sort(collect(keys(probs)))
    orderedVals::Vector{Float64} = [probs[k] for k in orderedKeys]
    return Pmf.Pmf(orderedKeys, orderedVals ./ sum(orderedVals))
end

In [None]:
function addDist(pmf1::Pmf.Pmf{Int}, x::Int)::Pmf.Pmf{Int}
    return Pmf.Pmf(pmf1.names .+ x, pmf1.priors)
end

In [None]:
function addDist(pmf1::Pmf.Pmf{Int}, pmf2::Pmf.Pmf{Int})::Pmf.Pmf{Int}
    return convolveDist(pmf1, pmf2, +)
end

In [None]:
twice = addDist(dice, dice)
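
To see what the convolution does without relying on `pmf.jl`, here is a self-contained sketch that represents each distribution as a plain `Dict` of value => probability. The name `combineDicts` is hypothetical, and the sketch assumes the two inputs are independent.

```julia
# Distribution of fn(a, b) for independent a and b, each given as value => probability.
function combineDicts(d1::Dict{Int,Float64}, d2::Dict{Int,Float64}, fn::Function)::Dict{Int,Float64}
    res = Dict{Int,Float64}()
    for (a, pa) in d1, (b, pb) in d2
        key = fn(a, b)
        # outcomes that map to the same key accumulate their probabilities
        res[key] = get(res, key, 0.0) + pa * pb
    end
    return res
end

oneDie = Dict(side => 1 / 6 for side in 1:6)
twoDice = combineDicts(oneDie, oneDie, +)

@assert isapprox(sum(values(twoDice)), 1.0)
@assert isapprox(twoDice[7], 6 / 36)   # six of the 36 pairs sum to 7
@assert isapprox(twoDice[2], 1 / 36)   # only (1, 1) sums to 2
```

Passing a different `fn`, such as `max` or `*`, would give the distribution of the maximum or the product in the same way.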

In [None]:
fig = Cmk.Figure()
Cmk.barplot(fig[1, 1], twice.names, twice.priors, color = "wheat",
    axis=(;title="Prior distribution of sum of dots after a throw of two dice",
    xlabel="Outcome (sum of dots)",
    ylabel="PMF",
    xticks=1:12)
)
fig

In [None]:
"""
Add all pmfs in a sequence 
"""
function addDistSeq(seq::Vector{Pmf.Pmf{Int}})::Pmf.Pmf{Int}
    return reduce(addDist, seq) 
end

In [None]:
thrice = addDistSeq(repeat([dice], 3))

In [None]:
fig = Cmk.Figure()
ax1, l1 = Cmk.lines(fig[1, 1], dice.names, dice.priors,
    color="blue", linestyle=:solid, linewidth=2,
    axis=(; title="Distribution of sums",
    xlabel="Outcome (sum of dots)",
    ylabel="PMF",
    xticks=1:18)
)
l2 = Cmk.lines!(fig[1, 1], twice.names, twice.priors,
    color = "orange", linestyle=:dash, linewidth=2)
l3 = Cmk.lines!(fig[1, 1], thrice.names, thrice.priors,
    color="green", linestyle=:dot, linewidth=2)
Cmk.axislegend(ax1,
    [l1, l2, l3],
    ["one dice", "two dice", "three dice"],
    "# dice used"
)
fig

## Gluten Sensitivity

A [paper](https://onlinelibrary.wiley.com/doi/full/10.1111/apt.13372) on gluten
sensitivity.

Out of 35 subjects, 12 correctly identified the gluten flour based on resumption
of symptoms while they were eating it.  Another 17 wrongly identified the
gluten-free flour based on their symptoms, and 6 were unable to distinguish.

So here’s the question: based on this data, how many of the subjects are
sensitive to gluten and how many are guessing?

Solution using Bayes’s Theorem. Assumptions:
- People who are sensitive to gluten have a 95% chance of correctly identifying
gluten flour under the challenge conditions, and
- People who are not sensitive have a 40% chance of identifying the gluten flour
by chance (and a 60% chance of either choosing the other flour or failing to
distinguish).

These values are arbitrary, but sensible.

First, assuming that we know how many subjects are sensitive, compute the
distribution of the data.
Second, using the likelihood of the data, compute the posterior distribution of
the number of sensitive patients.

The first is the **forward problem**; the second is the **inverse problem**.

### The Forward Problem

Suppose we know that 10 of the 35 subjects are sensitive to gluten.

In [None]:
n = 35
nSensitive = 10
nInsensitive = n - nSensitive

In [None]:
dstSensitive = Pmf.getBinomialPmf(nSensitive, 0.95)
dstInsensitive = Pmf.getBinomialPmf(nInsensitive, 0.4)

In [None]:
dstTotal = addDist(dstSensitive, dstInsensitive)

In [None]:
fig = Cmk.Figure()
ax1, l1 = Cmk.lines(fig[1, 1],
    dstSensitive.names, dstSensitive.priors,
    color="blue", linestyle=:dot, linewidth=2,
    axis=(;title="Gluten sensitivity",
        xlabel="Number of correct identifications",
        ylabel="PMF",
        xticks=0:5:35)
)
l2 = Cmk.lines!(fig[1, 1],
    dstInsensitive.names, dstInsensitive.priors,
    color="orange", linestyle=:dash, linewidth=2,
)
l3 = Cmk.lines!(fig[1, 1],
    dstTotal.names, dstTotal.priors,
    color="green", linestyle=:solid, linewidth=2,
)
Cmk.axislegend(ax1,
    [l1, l2, l3],
    ["sensitive", "insensitive", "total"],
    "distribution"
)
fig

We expect most of the sensitive subjects to identify the gluten flour correctly.
Of the 25 insensitive subjects, we expect about 10 to identify the gluten flour
by chance. So we expect about 20 correct identifications in total.

This is the answer to the forward problem: given the number of sensitive
subjects, we can compute the distribution of the data.
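
The figure of about 20 correct identifications can be checked numerically. The sketch below rebuilds the forward problem with plain vectors instead of `Pmf`; the helpers `binomPmf`, `convolveCounts` and `pmfMean` are hypothetical names, written only for this check.

```julia
# Binomial PMF as a vector: entry k + 1 holds P(k successes out of n).
binomPmf(n::Int, p::Float64) = [binomial(n, k) * p^k * (1 - p)^(n - k) for k in 0:n]

# PMF of the sum of two independent counts (discrete convolution).
function convolveCounts(pmf1::Vector{Float64}, pmf2::Vector{Float64})::Vector{Float64}
    res = zeros(length(pmf1) + length(pmf2) - 1)
    for i in eachindex(pmf1), j in eachindex(pmf2)
        # index i stands for count i - 1, so counts (i - 1) + (j - 1) land at index i + j - 1
        res[i + j - 1] += pmf1[i] * pmf2[j]
    end
    return res
end

# Mean of a PMF stored as a vector whose index k stands for the count k - 1.
pmfMean(pmf::Vector{Float64}) = sum((k - 1) * pmf[k] for k in eachindex(pmf))

# 10 sensitive subjects (95% correct) plus 25 insensitive subjects (40% correct).
totalPmf = convolveCounts(binomPmf(10, 0.95), binomPmf(25, 0.4))
@assert isapprox(sum(totalPmf), 1.0)
@assert isapprox(pmfMean(totalPmf), 10 * 0.95 + 25 * 0.4)   # 9.5 + 10 = 19.5
```

The mean of the total is the sum of the two binomial means, 19.5, which matches the rough estimate of 20.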