## Week 12 Lecture 2 - Prior elicitation

A key paper in this area is [Mikkola et al. 2024](https://projecteuclid.org/journals/bayesian-analysis/volume-19/issue-4/Prior-Knowledge-Elicitation-The-Past-Present-and-Future/10.1214/23-BA1381.full).

And a great blog post from [Michael Betancourt](https://betanalpha.github.io/assets/case_studies/prior_modeling.html).


In [None]:
# Import python packages
%matplotlib inline
import pandas as pd
import numpy as np
import seaborn as sns
import scipy as sp 
import random as rd
import pdb
import pymc as pm
import patsy
import arviz as az
import networkx as nx
from matplotlib import pyplot as plt
import dataframe_image as dfi
import pytensor as pyt
from scipy.optimize import minimize
from math import factorial as f



# Helper functions
def stdize(x):
    return (x-np.mean(x))/np.std(x)


def indexall(L):
    poo = []
    for p in L:
        if not p in poo:
            poo.append(p)
    Ix = np.array([poo.index(p) for p in L])
    return poo,Ix

def indexall_(L):
    Il, Ll = pd.factorize(L, sort=True)
    return Ll, Il



# Binomial distribution
def dbinom(x,n,p):
    return f(n)/(f(x)*f(n-x))*p**(x)*(1-p)**(n-x)

# Whence priors?

A key question in Bayesian modelling is: what are priors anyhow? What do they represent? The most succinct definition is that priors are representations of our personal beliefs. But how can we encode a belief as a probability distribution? The idea seems both sensible and nonsensical at the same time.

> "...statistics are always to some extent constructed on the basis of judgements, and it would be an obvious delusion to think the full complexity of personal experience can be unambiguously coded and put into a spreadsheet or other software." -- David Spiegelhalter

And yet this is ultimately what Bayes' theorem demands of us:

> "In Bayesian theory, a 'prior' represents one's personal degree of belief before considering current evidence." -- ET Jaynes

So if this is the case, and we wish to set priors in a principled way, how can we go about it? How should we go about specifying our own priors? And how can we specify the priors of others?


# Personal Priors

As a first step, when we think about priors we need to define the scale of what we're talking about. The first example often given on the topic - perhaps because bounds make things easier to think about - is a prior on a probability: how likely we think something is to happen, or what percentage something is. If we think back to our second lecture and the percentage of water on the earth, we can assign a plausibility value to each percentage of water and get something we think is reasonable.

For example, my personal beliefs about the percentage of water is that it is something near 72%, but certainly not less than 65% and certainly not more than 80%. 

In [None]:
# Subjective prior
# Start with zero prior weight at each of 100 grid points
my_prior = np.zeros(100)

# Fill in relevant values
my_prior[64:79] = np.array([1., 2., 3., 5., 7., 8., 9., 10., 9., 8., 6., 4., 3., 2., 1.])
# Normalize
my_prior = my_prior/sum(my_prior)
my_prior

In [None]:
# Plot the hand-drawn subjective prior
plt.figure(figsize=(8, 6))
x = np.linspace(0, 1, len(my_prior))
plt.plot(x, my_prior, label="My prior", color='blue')

# Add graph details
plt.title("My priors about water on earth", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("PDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0.72, color='red', linewidth=0.8)
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('my_prior.jpg',dpi=300);

We can then apply this to the data from class

In [None]:
# Define grid
p_grid = np.linspace(0,1,len(my_prior))
p_grid

# New observations
W = 6
L = 5
# Number of trials
N = W+L

# Calculate likelihood
likelihood = dbinom(W,N,p_grid)

# Bayes theorem
posterior = (likelihood*my_prior)/sum(likelihood*my_prior)
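
Before plotting, it can help to reduce a grid posterior to a couple of numbers. The sketch below rebuilds the same hand-drawn prior and the 6-water, 5-land data, then reads the posterior mean and a rough 90% interval off the grid. It uses `scipy.stats.binom.pmf` in place of the `dbinom` helper, and the names `post_mean`, `lower` and `upper` are just illustrative.

```python
import numpy as np
from scipy import stats

# Same 100-point grid and hand-drawn prior weights as in the cells above
p_grid = np.linspace(0, 1, 100)
prior = np.zeros(100)
prior[64:79] = [1, 2, 3, 5, 7, 8, 9, 10, 9, 8, 6, 4, 3, 2, 1]
prior /= prior.sum()

# 6 water observations out of 11 tosses
W, N = 6, 11
likelihood = stats.binom.pmf(W, N, p_grid)

# Bayes' theorem on the grid
posterior = likelihood * prior
posterior /= posterior.sum()

# Posterior mean and an approximate central 90% interval from the grid CDF
post_mean = np.sum(p_grid * posterior)
cum = np.cumsum(posterior)
lower = p_grid[np.searchsorted(cum, 0.05)]
upper = p_grid[np.searchsorted(cum, 0.95)]
print(f"posterior mean = {post_mean:.3f}, 90% interval = ({lower:.3f}, {upper:.3f})")
```

Because the prior is zero outside roughly 65% to 79%, the posterior can never leave that range, no matter what the data say.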

In [None]:
# Plot posterior over range of p_grid
plt.plot(p_grid, posterior, label='Posterior')
plt.plot(p_grid, likelihood/sum(likelihood), linestyle="-", label='Likelihood')
plt.plot(p_grid, my_prior, linestyle=":", label='Prior')
plt.legend()
plt.xlabel('Proportion water'),plt.ylabel('Posterior')
plt.savefig('my_prior_anal.jpg',dpi=300);

This might seem a little ad-hoc to you; is there something more formal we can do? The canonical reference for elicitation is [*Uncertain Judgements*](https://onlinelibrary.wiley.com/doi/book/10.1002/0470033312) by O'Hagan *et al*. (2006), who point out that with only a few datapoints we can parameterize a distribution using the cumulative distribution function for an appropriate distribution. 

Using my numbers above and interpreting 'certainly' as P>0.001 and P<0.999, we can find the closest cdf from a Beta distribution that represents probabilities:  

In [None]:
# Target values for the CDF
target_cdf_values = [0.5, 0.001, 0.999]
target_probs = [0.72, 0.65, 0.80]

# Define the objective function to minimize the difference between target and actual CDF values
def objective(params):
    alpha, beta = params
    cdf_values = [sp.stats.beta.cdf(p, alpha, beta) for p in target_probs]
    return np.sum((np.array(cdf_values) - np.array(target_cdf_values))**2)

# Initial guess for alpha and beta
initial_guess = [2, 2]

# Bounds for alpha and beta
bounds = [(1e-6, None), (1e-6, None)]

# Minimize the objective function
result = minimize(objective, initial_guess, bounds=bounds)

# Extract optimal alpha and beta
optimal_alpha, optimal_beta = result.x

# Compute CDF values with the fitted parameters
fitted_cdf_values = [sp.stats.beta.cdf(p, optimal_alpha, optimal_beta) for p in target_probs]

optimal_alpha, optimal_beta, fitted_cdf_values
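
A quick sanity check on any fit like this is to turn it around and ask the fitted Beta for its quantiles, then compare them with the numbers we stated. The sketch below does this with `scipy.stats.beta.ppf`, taking α = 282.47 and β = 110.06 (the values quoted in the plot labels in this section) as given rather than re-running the optimizer; swap in your own `result.x` if it differs.

```python
from scipy import stats

# Fitted parameters as quoted in the plot labels of this section (assumed, not re-fitted here)
alpha_fit, beta_fit = 282.47, 110.06

# Stated beliefs: CDF level -> proportion of water
targets = {0.001: 0.65, 0.5: 0.72, 0.999: 0.80}

# Quantiles implied by the fitted Beta at the same CDF levels
implied_quantiles = stats.beta.ppf(list(targets), alpha_fit, beta_fit)

for (level, stated), implied in zip(targets.items(), implied_quantiles):
    print(f"CDF level {level}: stated {stated:.3f}, implied {implied:.3f}")
```

A two-parameter Beta cannot in general hit three points exactly, so some mismatch in the tails is expected; the question is whether it is small enough to live with.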


And we can take a look at how we've done

In [None]:
# Generate points for the Beta distribution CDF
x = np.linspace(0, 1, len(my_prior))
y = sp.stats.beta.cdf(x, optimal_alpha, optimal_beta)

# Plot the Beta distribution CDF
plt.figure(figsize=(8, 6))
plt.plot(x, y, label="Beta CDF (α=282.47, β=110.06)", color='blue')

# Add target points
plt.scatter(target_probs, target_cdf_values, color='red', label="Target Points", zorder=5)

# Add labels for the target points
for prob, cdf in zip(target_probs, target_cdf_values):
    plt.text(prob, cdf, f"({prob}, {cdf})", fontsize=9, ha='left', va='bottom')

# Add graph details
plt.title("Fitted Beta Distribution CDF", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("CDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('binom_elicit_beta_cdf.jpg',dpi=300);

This gives us a prior to work with that seems a bit more grounded

In [None]:
# Generate points for the Beta distribution PDF
pdf_y = sp.stats.beta.pdf(x, optimal_alpha, optimal_beta)

# Plot the Beta distribution PDF
plt.figure(figsize=(8, 6))
plt.plot(x, pdf_y, label="Beta PDF (α=282.47, β=110.06)", color='green')

# Add graph details
plt.title("Fitted Beta Distribution PDF", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("PDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('binom_elicit_beta_pdf_prior.jpg',dpi=300);

Now we can run this through Bayes theorem on the grid

In [None]:
# Bayes theorem
posterior2 = (likelihood*pdf_y)/sum(likelihood*pdf_y)
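
One nice thing about landing on a Beta prior is that the Beta is conjugate to the Binomial, so the exact posterior is Beta(α + W, β + L) and the grid is not strictly needed. The sketch below compares the grid posterior mean with the conjugate answer, again taking α = 282.47 and β = 110.06 from the plot labels in this section as assumed inputs.

```python
import numpy as np
from scipy import stats

# Elicited Beta prior (values quoted in the plot labels; assumed here)
a_prior, b_prior = 282.47, 110.06
W, L = 6, 5  # water and land observations

# Grid approximation, as in the cell above
p_grid = np.linspace(0, 1, 100)
prior = stats.beta.pdf(p_grid, a_prior, b_prior)
likelihood = stats.binom.pmf(W, W + L, p_grid)
grid_post = prior * likelihood
grid_post /= grid_post.sum()
grid_mean = np.sum(p_grid * grid_post)

# Conjugate update: add successes to alpha and failures to beta
a_post, b_post = a_prior + W, b_prior + L
exact_mean = a_post / (a_post + b_post)

print(f"grid mean = {grid_mean:.4f}, conjugate mean = {exact_mean:.4f}")
```

With α + β near 400, eleven new observations barely move the posterior, which is one way to see how informative this prior is.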

In [None]:
# Plot posterior over range of p_grid
plt.plot(p_grid, posterior, label='Posterior')
plt.plot(p_grid, likelihood/sum(likelihood), linestyle="-", label='Likelihood')
plt.plot(p_grid, my_prior, linestyle=":", label='Prior')
plt.plot(p_grid, pdf_y/sum(pdf_y), linestyle=":", label='Elicited prior')
plt.plot(p_grid, posterior2, label='Elicited posterior')
plt.legend()
plt.xlabel('Proportion water'),plt.ylabel('Posterior')
plt.savefig('my_prior_anal2.jpg',dpi=300);

# Multiple respondents

Now in practice we might want to do better than a single person's guess as to the distribution of water; instead we may poll a group of experts on the question at hand. Assuming we have interviewed three people, we can undertake the same exercise:

In [None]:
# Updated target values
target_cdf_values_multi = [0.5, 0.001, 0.999]
target_prob_ranges = np.array([
    [0.72, 0.68, 0.70],  # Around 0.5
    [0.65, 0.62, 0.69],  # Around 0.001
    [0.80, 0.82, 0.75],  # Around 0.999
]).T

# Define a modified objective function for multiple target points
def objective_multi(params):
    alpha, beta = params
    total_error = 0
    for target_probs, target_cdf in zip(target_prob_ranges, target_cdf_values_multi):
        cdf_values = [sp.stats.beta.cdf(p, alpha, beta) for p in target_probs]
        avg_cdf = np.mean(cdf_values)
        total_error += (avg_cdf - target_cdf)**2
    return total_error

# Minimize the objective function
result_multi = minimize(objective_multi, initial_guess, bounds=bounds)

# Extract optimal alpha and beta
optimal_alpha_multi, optimal_beta_multi = result_multi.x

# Compute CDF values for each set of points with the fitted parameters
fitted_cdf_values_multi = [
    [sp.stats.beta.cdf(p, optimal_alpha_multi, optimal_beta_multi) for p in target_probs]
    for target_probs in target_prob_ranges
]

optimal_alpha_multi, optimal_beta_multi, fitted_cdf_values_multi


In [None]:
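# Evaluate the fitted multi-respondent Beta CDF on the grid
y_multi = sp.stats.beta.cdf(x, optimal_alpha_multi, optimal_beta_multi)

# Plot the Beta distribution CDF with adjusted labels
plt.figure(figsize=(8, 6))
plt.plot(x, y_multi, label=f"Beta CDF (α={optimal_alpha_multi:.2f}, β={optimal_beta_multi:.2f})", color='blue')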

# Add target points and adjusted labels
offsets = [0.02, 0.04, -0.03]  # Y-offsets for the text labels
colors = ['red', 'green', 'purple']
for i, (target_probs, target_cdf) in enumerate(zip(target_prob_ranges, target_cdf_values_multi)):
    plt.scatter(target_probs, [target_cdf] * len(target_probs), color=colors[i], label=f"Target Set {i+1}", zorder=5)
    for j, prob in enumerate(target_probs):
        plt.text(prob, target_cdf + offsets[j % len(offsets)], f"({prob:.2f}, {target_cdf})", 
                 fontsize=8, ha='center', va='bottom', color=colors[i])

# Add graph details
plt.title("Fitted Beta Distribution CDF for Multiple Point Sets", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("CDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('binom_elicit_beta_cdf_allone.jpg',dpi=300);

In [None]:
# Generate points for the Beta distribution PDF
pdf_y = sp.stats.beta.pdf(x, optimal_alpha_multi, optimal_beta_multi)

# Plot the Beta distribution PDF
plt.figure(figsize=(8, 6))
plt.plot(x, pdf_y, label=f"Beta PDF (α={optimal_alpha_multi:.2f}, β={optimal_beta_multi:.2f})", color='green')

# Add graph details
plt.title("Fitted Beta Distribution PDF", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("PDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0.708, color='red', linewidth=0.8, linestyle='--')
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('binom_elicit_beta_pdf2.jpg',dpi=300);

Now we have a prior - but heck it's really spiky! Really it's **too** informative about the quantity we're trying to elicit. If the true percentage of water is 70.8% (vertical red line above), then we're saying *a priori*

In [None]:
sp.stats.beta.cdf(0.708, optimal_alpha_multi, optimal_beta_multi)

that there is a 0.0000000000001% chance of there being less than the true percentage of water on the earth, which seems like we're missing the boat. In truth we are - the optimization is fitting 3 sets of 3 points with a function that doesn't account for the scale of each person's beliefs. We're throwing out individual uncertainty and optimizing the group mean uncertainty. So what can we do? 

One solution is to fit each set of estimates individually, then combine the resulting pdfs into a single estimate

In [None]:
# Define sets of target probabilities and corresponding CDF values for three curves
target_cdf_values_list = [
    [0.5, 0.001, 0.999],  # Target CDF values for the first curve
    [0.5, 0.001, 0.999],  # Target CDF values for the second curve
    [0.5, 0.001, 0.999],  # Target CDF values for the third curve
]

target_probs_list = [
    [0.72, 0.65, 0.80],  # Target probabilities for the first curve
    [0.68, 0.62, 0.82],  # Target probabilities for the second curve
    [0.70, 0.69, 0.75],  # Target probabilities for the third curve
]

# Function to fit a Beta distribution for each set of points
def fit_beta(target_probs, target_cdf_values):
    def objective(params):
        alpha, beta = params
        cdf_values = [sp.stats.beta.cdf(p, alpha, beta) for p in target_probs]
        return np.sum((np.array(cdf_values) - np.array(target_cdf_values))**2)

    result = minimize(objective, initial_guess, bounds=bounds)
    return result.x  # Return alpha and beta

# Fit separate Beta distributions for each set of points
fitted_params_list = [
    fit_beta(target_probs, target_cdf_values)
    for target_probs, target_cdf_values in zip(target_probs_list, target_cdf_values_list)
]

fitted_params_list

In [None]:
x = p_grid
# Extract individual parameters
alpha_values, beta_values = zip(*fitted_params_list)

# Plot CDFs for all three fitted distributions
plt.figure(figsize=(8, 6))
for i, (alpha, beta) in enumerate(zip(alpha_values, beta_values)):
    y_cdf = sp.stats.beta.cdf(x, alpha, beta)
    plt.plot(x, y_cdf, label=f"Set {i+1} CDF (α={alpha:.2f}, β={beta:.2f})")

# Add graph details
plt.title("Fitted Beta Distribution CDFs", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("CDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('binom_elicit_beta_cdf_multi.jpg',dpi=300);

In [None]:
# Generate and plot PDFs for each set
plt.figure(figsize=(10, 8))
x = np.linspace(0, 1, 1000)

for i, (params, target_probs) in enumerate(zip(fitted_params_list, target_probs_list)):
    alpha, beta = params
    y = sp.stats.beta.pdf(x, alpha, beta)
    plt.plot(x, y, label=f"Beta PDF {i+1} (α={alpha:.2f}, β={beta:.2f})", lw=2)

# Combine the three PDFs by taking a weighted average
# Assume equal weighting for simplicity
weights = [1/3, 1/3, 1/3]  # Equal weights for the three PDFs
combined_pdf = sum(
    weight * sp.stats.beta.pdf(x, alpha, beta)
    for weight, (alpha, beta) in zip(weights, fitted_params_list)
)

# Function to estimate alpha and beta for the combined PDF
def estimate_combined_alpha_beta(pdf, x):
    def objective(params):
        alpha, beta = params
        estimated_pdf = sp.stats.beta.pdf(x, alpha, beta)
        return np.sum((pdf - estimated_pdf) ** 2)

    # Minimize the objective function to find the best alpha and beta
    result = minimize(objective, initial_guess, bounds=bounds)
    return result.x

# Estimate alpha and beta for the combined PDF
combined_alpha, combined_beta = estimate_combined_alpha_beta(combined_pdf, x)

plt.plot(x, sp.stats.beta.pdf(x,combined_alpha, combined_beta), label=f"Combined PDF (α={combined_alpha:.2f}, β={combined_beta:.2f})", color="purple", lw=2)

# Add graph details
plt.title("Fitted Beta Distribution PDFs for Separate CDF Curves", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("PDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('binom_elicit_beta_pdf3.jpg',dpi=300);
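
The least-squares fit to the pooled curve is one way to get a single Beta; another is to match moments. A weighted average of pdfs is a mixture, whose mean and variance are available in closed form, and a Beta with that mean and variance can be written down directly. The sketch below shows this as an alternative to the optimizer, not as the method used above. The three (α, β) pairs are made-up placeholders standing in for `fitted_params_list`.

```python
import numpy as np

# Hypothetical per-expert Beta parameters (placeholders, not the fitted values above)
expert_params = [(280.0, 110.0), (150.0, 65.0), (900.0, 380.0)]
weights = np.array([1/3, 1/3, 1/3])

def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

means, variances = map(np.array, zip(*(beta_mean_var(a, b) for a, b in expert_params)))

# Mean and variance of the weighted mixture (linear opinion pool)
pool_mean = np.sum(weights * means)
pool_var = np.sum(weights * (variances + means**2)) - pool_mean**2

# Beta with the same mean and variance: nu plays the role of alpha + beta
nu = pool_mean * (1 - pool_mean) / pool_var - 1
alpha_mm = pool_mean * nu
beta_mm = (1 - pool_mean) * nu
print(f"moment-matched Beta: α={alpha_mm:.2f}, β={beta_mm:.2f}")
```

Moment matching preserves the pooled mean and spread exactly, whereas least squares on the pdf tends to chase the tallest peak; which is preferable depends on what the prior is for.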

Here, the weights are equal - we assume equal expertise for all individuals. But what if we value some opinions over others? There are many possible methods to figure out weightings:


1. Performance-Based Weighting

    Assign weights based on the historical accuracy of each expert in similar contexts or test scenarios.

    Implementation:

    - Test experts on questions with known answers.
    - Calculate weights proportional to the inverse of their error rates (e.g., Brier score for probabilistic predictions).
    - Normalize weights to sum to 1.

2. Self-Assessment

    Let experts rate their confidence or expertise.

    Implementation:

    - Collect confidence scores from experts.
    - Normalize these scores to sum to 1.

    Limitations: May introduce bias as experts might over- or underestimate their abilities.

3. Calibration and Informativeness Scoring

    Evaluate experts based on their calibration (how well probabilities match observed frequencies) and informativeness (the specificity of their predictions).

    Implementation:

    - Compute calibration scores (e.g., using a reliability diagram).
    - Combine with informativeness measures (e.g., entropy reduction or Kullback-Leibler divergence).
    - Assign higher weights to well-calibrated and informative experts.

4. Behavioral Aggregation

    Use elicitation techniques to gather meta-judgments about the reliability of peers.

    Implementation:

    - Ask experts to assess the relative reliability of others.
    - Use these assessments to construct a weighting scheme (e.g., using pairwise comparisons or ranking).

5. Bayesian Methods

    Treat expert weights as parameters in a Bayesian model and update them based on data or performance.

    Implementation:

    - Use a prior distribution for weights.
    - Update weights based on observed data (e.g., accuracy of predictions or outcomes of test scenarios).

6. Delphi Method

    Iteratively refine expert judgments through structured feedback and convergence.

    Implementation:

    - Conduct multiple rounds of elicitation.
    - Weigh experts more heavily if their opinions stabilize around a consensus.

7. Equal Weighting

    Assign equal weights when there is no clear basis for differential weighting.

    Use Case: Appropriate when there is no performance data or clear distinction in expertise.
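
As a concrete illustration of the first option, here is a minimal sketch of performance-based weighting using Brier scores. The calibration questions, outcomes and forecasts are entirely made up for the example; in a real elicitation they would come from seed questions with known answers.

```python
import numpy as np

# Hypothetical calibration exercise: 5 yes/no events with known outcomes
outcomes = np.array([1, 0, 1, 1, 0])

# Hypothetical probabilities each of 3 experts gave for "yes" on each event
forecasts = np.array([
    [0.9, 0.2, 0.8, 0.7, 0.1],  # Expert 1
    [0.6, 0.5, 0.6, 0.5, 0.4],  # Expert 2
    [0.3, 0.7, 0.4, 0.6, 0.8],  # Expert 3
])

# Brier score per expert: mean squared error of the probabilities (lower is better)
brier = np.mean((forecasts - outcomes) ** 2, axis=1)

# Weights proportional to inverse Brier score, normalized to sum to 1
weights = (1 / brier) / np.sum(1 / brier)
print("Brier scores:", np.round(brier, 3))
print("Weights:", np.round(weights, 3))
```

Inverse-score weighting is only one of several reasonable mappings from scores to weights, and it can hand almost all of the weight to a single expert when scores differ a lot.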

Assuming we perform one of these, we can re-weight the Beta PDF for each expert accordingly:

In [None]:
# Generate and plot PDFs for each set
plt.figure(figsize=(10, 8))
x = np.linspace(0, 1, 1000)

for i, (params, target_probs) in enumerate(zip(fitted_params_list, target_probs_list)):
    alpha, beta = params
    y = sp.stats.beta.pdf(x, alpha, beta)
    plt.plot(x, y, label=f"Beta PDF {i+1} (α={alpha:.2f}, β={beta:.2f})", lw=2)

# Combine the three PDFs by taking a weighted average
# Unequal weights this time, e.g. from one of the schemes above
weights = [.2, .7, .1]  # Weights for the three PDFs (sum to 1)
combined_pdf = sum(
    weight * sp.stats.beta.pdf(x, alpha, beta)
    for weight, (alpha, beta) in zip(weights, fitted_params_list)
)

# Function to estimate alpha and beta for the combined PDF
def estimate_combined_alpha_beta(pdf, x):
    def objective(params):
        alpha, beta = params
        estimated_pdf = sp.stats.beta.pdf(x, alpha, beta)
        return np.sum((pdf - estimated_pdf) ** 2)

    # Minimize the objective function to find the best alpha and beta
    result = minimize(objective, initial_guess, bounds=bounds)
    return result.x

# Estimate alpha and beta for the combined PDF
combined_alpha, combined_beta = estimate_combined_alpha_beta(combined_pdf, x)

plt.plot(x, sp.stats.beta.pdf(x,combined_alpha, combined_beta), label=f"Combined PDF (α={combined_alpha:.2f}, β={combined_beta:.2f})", color="purple", lw=2)

# Add graph details
plt.title("Fitted Beta Distribution PDFs for Separate CDF Curves", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("PDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('binom_elicit_beta_pdf4.jpg',dpi=300);

This elicitation is on the probability scale, but what if we're dealing with log-odds? How can we convert to the real line? We can approximate with a Normal distribution, starting from the mean and variance of the combined Beta

$$
\mu_{Beta} = \frac{\alpha}{\alpha+\beta}
$$

$$
\sigma^2_{Beta} = \frac{\alpha \beta} {(\alpha + \beta)^2 (\alpha + \beta + 1)}
$$

which we can then transform to the log-odds scale

$$
\mu = \operatorname{logit}(\mu_{Beta})
$$

$$
\sigma = \sqrt{\frac{\sigma^2_{Beta}}{\left[\mu_{Beta}(1-\mu_{Beta})\right]^2}}
$$

and plug into a normal distribution

$$
\operatorname{logit}(p) \sim N(\mu, \sigma)
$$


In [None]:
# Define the logit function and its inverse
def logit(p):
    return np.log(p / (1 - p))

def invlogit(x):
    return 1 / (1 + np.exp(-x))

# Compute the mean and variance of the Beta distribution
mean_beta = combined_alpha / (combined_alpha + combined_beta)
var_beta = (combined_alpha * combined_beta) / ((combined_alpha + combined_beta) ** 2 * (combined_alpha + combined_beta + 1))

# Find the parameters for the Normal distribution that closely matches
# the transformed Beta distribution under the logit link
mean_normal = logit(mean_beta)
sd_normal = np.sqrt((var_beta / (mean_beta * (1 - mean_beta)) ** 2))
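
The standard deviation above is a first-order (delta method) approximation, since the derivative of the logit at $p$ is $1/(p(1-p))$. For a Beta the log-odds moments are also known exactly: the mean is $\psi(\alpha) - \psi(\beta)$ and the variance is $\psi_1(\alpha) + \psi_1(\beta)$, with $\psi$ the digamma and $\psi_1$ the trigamma function. The sketch below compares the two for a hypothetical Beta(36, 14), chosen only because it has mean 0.72; it is not one of the fitted distributions.

```python
import numpy as np
from scipy.special import digamma, polygamma

# Hypothetical Beta parameters for illustration (mean 0.72)
a, b = 36.0, 14.0

# Beta mean and variance on the probability scale
mean_beta = a / (a + b)
var_beta = a * b / ((a + b) ** 2 * (a + b + 1))

# Delta-method approximation on the log-odds scale
delta_mean = np.log(mean_beta / (1 - mean_beta))
delta_sd = np.sqrt(var_beta) / (mean_beta * (1 - mean_beta))

# Exact log-odds moments of a Beta(a, b)
exact_mean = digamma(a) - digamma(b)
exact_sd = np.sqrt(polygamma(1, a) + polygamma(1, b))

print(f"delta method: mean={delta_mean:.3f}, sd={delta_sd:.3f}")
print(f"exact:        mean={exact_mean:.3f}, sd={exact_sd:.3f}")
```

The approximation gets better as α + β grows, and worse for small or very lopsided parameters.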

In [None]:
# Generate and plot PDFs for each set
plt.figure(figsize=(10, 8))
x = np.linspace(0, 1, 1000)

for i, (params, target_probs) in enumerate(zip(fitted_params_list, target_probs_list)):
    alpha, beta = params
    y = sp.stats.beta.pdf(x, alpha, beta)
    plt.plot(x, y, label=f"Beta PDF {i+1} (α={alpha:.2f}, β={beta:.2f})", lw=2)

# Combine the three PDFs by taking a weighted average
# Assume equal weighting for simplicity
weights = [1/3, 1/3, 1/3]  # Equal weights for the three PDFs
combined_pdf = sum(
    weight * sp.stats.beta.pdf(x, alpha, beta)
    for weight, (alpha, beta) in zip(weights, fitted_params_list)
)

# Function to estimate alpha and beta for the combined PDF
def estimate_combined_alpha_beta(pdf, x):
    def objective(params):
        alpha, beta = params
        estimated_pdf = sp.stats.beta.pdf(x, alpha, beta)
        return np.sum((pdf - estimated_pdf) ** 2)

    # Minimize the objective function to find the best alpha and beta
    result = minimize(objective, initial_guess, bounds=bounds)
    return result.x

# Estimate alpha and beta for the combined PDF
combined_alpha, combined_beta = estimate_combined_alpha_beta(combined_pdf, x)

plt.plot(x, sp.stats.beta.pdf(x,combined_alpha, combined_beta), label=f"Combined PDF (α={combined_alpha:.2f}, β={combined_beta:.2f})", color="purple", lw=2)

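# Recompute the Normal approximation from the combined Beta fitted in this cell
mean_beta = combined_alpha / (combined_alpha + combined_beta)
var_beta = (combined_alpha * combined_beta) / ((combined_alpha + combined_beta) ** 2 * (combined_alpha + combined_beta + 1))
mean_normal = logit(mean_beta)
sd_normal = np.sqrt(var_beta) / (mean_beta * (1 - mean_beta))

# Normal density on the log-odds scale, mapped back to probabilities
# and rescaled to the Beta peak for visual comparison only
x2 = np.linspace(-5, 5, 1000)
y2 = sp.stats.norm.pdf(x2, mean_normal, sd_normal)
plt.plot(invlogit(x2), (y2/max(y2))*max(sp.stats.beta.pdf(x,combined_alpha, combined_beta)), label=f"Normal-logit PDF (μ={mean_normal:.2f}, σ={sd_normal:.2f})", color="black", lw=2, linestyle=':')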


# Add graph details
plt.title("Fitted Beta Distribution PDFs for Separate CDF Curves and combined", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("PDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.legend()
plt.xlim(0.6, 0.82)
plt.grid(alpha=0.4)

plt.savefig('logOdds_beta.jpg',dpi=300);

Besides eliciting bounds or ranges of things, there are many other ways of getting at the underlying (implied) parameters of a probability distribution. One highlighted by O'Hagan *et al.* is the *equivalent prior sample* (EPS) method, which recognizes that elicitation methods often make the uncertainties too low, even for an opinion from one person. 

The EPS method asks for a point estimate of the expected value (in my case, 0.72) but also then asks "based on how many samples?" Then given $n$ the calculation can be made that $\alpha = n\hat{p}$ and $\beta = n(1-\hat{p})$.
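
The equivalent-sample calculation is simple enough to wrap in a helper. The sketch below uses two hypothetical functions, `eps_beta` and `beta_sd` (names invented here), to show how the implied Beta tightens as the stated sample size grows while the mean stays at $\hat{p}$.

```python
import numpy as np

def eps_beta(p_hat, n):
    """Beta parameters implied by a point estimate p_hat 'worth' n observations."""
    return n * p_hat, n * (1 - p_hat)

def beta_sd(alpha, beta):
    """Standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return np.sqrt(alpha * beta / (total**2 * (total + 1)))

# Same point estimate, increasing equivalent sample sizes
for n in [1, 5, 10, 50, 500]:
    alpha, beta = eps_beta(0.72, n)
    print(f"n={n:>3}: Beta({alpha:.2f}, {beta:.2f}), sd={beta_sd(alpha, beta):.3f}")
```

Since $\alpha + \beta = n$, the stated sample size is the only thing controlling the spread, which makes it easy to explain to a respondent.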

For the earth there is only one sample ($n=1$), so the resulting pdf would simply be $Beta(0.72, 0.28)$

In [None]:
# Generate points for the Beta distribution PDF
pdf_y = sp.stats.beta.pdf(x, 0.72, 0.28)

# Plot the Beta distribution PDF
plt.figure(figsize=(8, 6))
plt.plot(x, pdf_y, label="Beta PDF (α=0.72, β=0.28)", color='green')

# Add graph details
plt.title("Some Beta Distribution PDF", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("PDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('one_earth.jpg',dpi=300);

Which is a bit of a crap estimate. But let's instead use samples from people to give some perspective

In [None]:
# Generate points for the Beta distribution PDF

# Plot the Beta distribution PDF
plt.figure(figsize=(8, 6))
plt.plot(x, sp.stats.beta.pdf(x, 0.72, 0.28), label="Beta PDF (α=0.72, β=0.28)")
plt.plot(x, sp.stats.beta.pdf(x, 5*0.72, 5*0.28), label="Beta PDF  (α=5x0.72, β=5x0.28)")
plt.plot(x, sp.stats.beta.pdf(x, 10*0.72, 10*0.28), label="Beta PDF  (α=10x0.72, β=10x0.28)")
plt.plot(x, sp.stats.beta.pdf(x, 50*0.72, 50*0.28), label="Beta PDF  (α=50x0.72, β=50x0.28)")
plt.plot(x, sp.stats.beta.pdf(x, 500*0.72, 500*0.28), label="Beta PDF  (α=500x0.72, β=500x0.28)")


# Add graph details
plt.title("Some Beta Distributions PDF", fontsize=14)
plt.xlabel("Probability", fontsize=12)
plt.ylabel("PDF Value", fontsize=12)
plt.axhline(0, color='black', linewidth=0.8, linestyle='--')
plt.axvline(0, color='black', linewidth=0.8, linestyle='--')
plt.legend()
plt.grid(alpha=0.4)
plt.savefig('multi_earth.jpg',dpi=300);

In general, with only a few people this looks worse than what we got using the CDFs, and in practice it frequently is - the CDF approach tends to work better. But it is good to know that there are other options.

While the elicitation of probabilities is good in the sense it is bounded and therefore tractable, what about parameters in a geocentric linear model? How can people elicit such things? 

In [None]:
# Simulate expert responses
# Assume three experts provide a mean and standard deviation for the slope
expert_responses = {
    "Expert 1": {"mean": 2.0, "std": 0.5},  # Mean and standard deviation
    "Expert 2": {"mean": 2.2, "std": 0.3},
    "Expert 3": {"mean": 1.8, "std": 0.4},
}

# Simulate individual distributions
n_samples = 1000
x = np.linspace(0, 4, n_samples)  # Range for the slope
distributions = {
    name: sp.stats.norm.pdf(x, loc=data["mean"], scale=data["std"])
    for name, data in expert_responses.items()
}

# Aggregate expert opinions using weighted averaging
# Equal weights for simplicity
weights = np.array([1/3, 1/3, 1/3])
aggregated_pdf = sum(weight * dist for weight, dist in zip(weights, distributions.values()))


# Calculate the aggregated mean and standard deviation
aggregated_mean = sum(weights[i] * expert_responses[f"Expert {i+1}"]["mean"] for i in range(len(weights)))
aggregated_variance = sum(weights[i] * (expert_responses[f"Expert {i+1}"]["std"]**2 + 
                                       (expert_responses[f"Expert {i+1}"]["mean"] - aggregated_mean)**2) for i in range(len(weights)))
aggregated_std = round(np.sqrt(aggregated_variance),2)
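
The aggregated variance above is the law of total variance applied to a mixture: the weighted average of the within-expert variances plus the weighted spread of the expert means. As a rough check, the sketch below recomputes it for the same three experts and compares it with a brute-force numerical integral of the mixture pdf. The integration grid from -3 to 7 is an arbitrary choice that comfortably covers all three Normals.

```python
import numpy as np
from scipy import stats

# Expert means and standard deviations from the cell above
means = np.array([2.0, 2.2, 1.8])
sds = np.array([0.5, 0.3, 0.4])
weights = np.array([1/3, 1/3, 1/3])

# Closed form: mixture mean and variance (law of total variance)
mix_mean = np.sum(weights * means)
mix_var = np.sum(weights * (sds**2 + (means - mix_mean) ** 2))

# Brute-force check: integrate the mixture pdf on a fine grid
grid = np.linspace(-3, 7, 20001)
dx = grid[1] - grid[0]
mix_pdf = sum(w * stats.norm.pdf(grid, m, s) for w, m, s in zip(weights, means, sds))
num_mean = np.sum(grid * mix_pdf) * dx
num_var = np.sum((grid - num_mean) ** 2 * mix_pdf) * dx

print(f"closed form: mean={mix_mean:.3f}, sd={np.sqrt(mix_var):.3f}")
print(f"numerical:   mean={num_mean:.3f}, sd={np.sqrt(num_var):.3f}")
```

Note that the mixture itself is not Normal, so the mean and standard deviation summarize it without fully describing its shape.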

In [None]:
# Plot individual expert distributions and the aggregated distribution
plt.figure(figsize=(10, 6))
for name, pdf in distributions.items():
    plt.plot(x, pdf, label=f"{name} (Mean: {expert_responses[name]['mean']}, Std: {expert_responses[name]['std']})")
plt.plot(x, aggregated_pdf, label=f"Aggregated (Mean: {aggregated_mean:.2f}, Std: {aggregated_std:.2f})", color="black", lw=2, linestyle="--")

# Add plot details
plt.title("Direct elicitation of Slope in Simple Linear Regression", fontsize=14)
plt.xlabel("Slope", fontsize=12)
plt.ylabel("Density", fontsize=12)
plt.legend()
plt.grid(alpha=0.4)

plt.savefig('direct.jpg',dpi=300);

This is fine if we have experts who know something about - and can think in terms of - the mean and standard deviation of a regression slope. But what about normal people? First, the language has to be good. Asking "what is the slope?" isn't helpful, but asking "what is the most change you would expect in Y given a change from X1 to X2?" (with context-appropriate words for Y, X1 and X2) is. We can follow up with "and what is the minimum change you would expect?" and "how sure are you that the change would be within this range?"

With these statements in hand we can convert into quantitative estimates

In [None]:
# Experts provide a plausible range (lower and upper bounds) with a confidence level (e.g., 95%)
expert_ranges = {
    "Expert 1": {"lower": 1.5, "upper": 2.5, "confidence": 0.90},
    "Expert 2": {"lower": 2.0, "upper": 2.4, "confidence": 0.95},
    "Expert 3": {"lower": 1.6, "upper": 2.0, "confidence": 0.75},
}

# Convert ranges to mean and std assuming a normal distribution
for name, data in expert_ranges.items():
    mean = (data["lower"] + data["upper"]) / 2
    std = (data["upper"] - data["lower"]) / (2 * sp.stats.norm.ppf((1 + data["confidence"]) / 2))
    expert_ranges[name]["mean"] = mean
    expert_ranges[name]["std"] = std

# Calculate the aggregated mean from the weighted means
aggregated_mean = sum(
    weights[i] * expert_ranges[f"Expert {i+1}"]["mean"] for i in range(len(weights))
)

# Calculate the aggregated variance from the weighted variances
aggregated_variance = sum(
    weights[i] * (expert_ranges[f"Expert {i+1}"]["std"]**2 +
                  (expert_ranges[f"Expert {i+1}"]["mean"] - aggregated_mean)**2)
    for i in range(len(weights))
)

# Compute the aggregated standard deviation
aggregated_std = np.sqrt(aggregated_variance)


aggregated_pdf = sp.stats.norm.pdf(x, aggregated_mean, scale=aggregated_std)
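
The range-to-Normal conversion treats the stated range as a central interval: the mean is the midpoint and the half-width equals $z \cdot \sigma$, where $z$ is the standard Normal quantile at $(1 + \text{confidence})/2$. The sketch below wraps this in a hypothetical helper, `range_to_normal` (a name invented here), and checks that the resulting Normal puts the stated probability inside the range, using Expert 1's numbers.

```python
from scipy import stats

def range_to_normal(lower, upper, confidence):
    """Mean and sd of the Normal whose central `confidence` interval is (lower, upper)."""
    mean = (lower + upper) / 2
    z = stats.norm.ppf((1 + confidence) / 2)  # e.g. about 1.645 for 90%
    sd = (upper - lower) / (2 * z)
    return mean, sd

# Expert 1 from the cell above: 90% sure the change is between 1.5 and 2.5
mean, sd = range_to_normal(1.5, 2.5, 0.90)

# Round trip: probability mass the fitted Normal places inside the stated range
coverage = stats.norm.cdf(2.5, mean, sd) - stats.norm.cdf(1.5, mean, sd)
print(f"mean={mean:.2f}, sd={sd:.3f}, coverage={coverage:.3f}")
```

This assumes the respondent's beliefs are symmetric about the midpoint; if the range is lopsided around their best guess, a skewed distribution would be a better fit.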


In [None]:
# Plot the expert distributions implied by the elicited ranges, plus the aggregate
plt.figure(figsize=(10, 6))
for name, data in expert_ranges.items():
    pdf = sp.stats.norm.pdf(x, loc=data["mean"], scale=data["std"])
    plt.plot(x, pdf, label=f"{name} (Mean: {data['mean']:.2f}, Std: {data['std']:.2f})")
plt.plot(x, aggregated_pdf, label=f"Aggregated (Mean: {aggregated_mean:.2f}, Std: {aggregated_std:.2f})", color="black", lw=2, linestyle="--")

# Add plot details
plt.title("Sort-of indirect elicitation of Slope in Simple Linear Regression", fontsize=14)
plt.xlabel("Slope", fontsize=12)
plt.ylabel("Density", fontsize=12)
plt.legend()
plt.grid(alpha=0.4)

plt.savefig('indirect.jpg',dpi=300);

All this is but the tip of the elicitation iceberg - there are many other, very complex ways to derive estimates!