<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Bayesian-estimation-equivalent-ofFactor-analyis" data-toc-modified-id="Bayesian-estimation-equivalent-ofFactor-analyis-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Bayesian estimation equivalent of Factor analysis</a></span><ul class="toc-item"><li><span><a href="#Classic-Factor-analysis" data-toc-modified-id="Classic-Factor-analysis-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Classic Factor analysis</a></span></li></ul></li><li><span><a href="#Bayesian-inference" data-toc-modified-id="Bayesian-inference-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Bayesian inference</a></span></li><li><span><a href="#Steps-of-Bayesian-data-analysis" data-toc-modified-id="Steps-of-Bayesian-data-analysis-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Steps of Bayesian data analysis</a></span></li><li><span><a href="#Step-1---Identify-the-relevant-data-for-question-under-investigation" data-toc-modified-id="Step-1---Identify-the-relevant-data-for-question-under-investigation-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Step 1 - Identify the relevant data for question under investigation</a></span><ul class="toc-item"><li><span><a href="#Study/data-description" data-toc-modified-id="Study/data-description-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Study/data description</a></span></li><li><span><a href="#Import-data" data-toc-modified-id="Import-data-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Import data</a></span><ul class="toc-item"><li><span><a href="#Clean-the-data" data-toc-modified-id="Clean-the-data-4.2.1"><span class="toc-item-num">4.2.1&nbsp;&nbsp;</span>Clean the data</a></span></li></ul></li></ul></li><li><span><a href="#Step-4---Use-Bayes-rule" data-toc-modified-id="Step-4---Use-Bayes-rule-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Step 4 - Use Bayes rule</a></span><ul class="toc-item"><li><span><a href="#Stan-model-of-Bayesian-Factor-analysis" 
data-toc-modified-id="Stan-model-of-Bayesian-Factor-analysis-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>Stan model of Bayesian Factor analysis</a></span></li></ul></li></ul></div>

In [1]:
# Import analysis packages
%matplotlib inline
import pystan as ps
import numpy as np
import pandas as pd
import seaborn as sns
import arviz as az
import matplotlib.pyplot as plt
import scipy.stats as ss

In [2]:
# Centre plot output in the notebook (IPython.display is the public import path)
from IPython.display import HTML as Center

Center(""" <style>
.output_png {
    display: table-cell;
    text-align: center;
    vertical-align: middle;
}
</style> """)

# Bayesian estimation equivalent of Factor analysis

## Classic Factor analysis


# Bayesian inference
<font size = "3"> Following the quick description of classic factor analysis above, it is important to keep in mind that all Bayesian inference is derived from the application of Bayes' rule $P(\theta \mid y) = \large \frac{P(y \mid \theta) \, P(\theta)}{P(y)}$. As such, while the Bayesian model described below is an equivalent of factor analysis, it is fundamentally different: it uses fully probabilistic modelling, and its inference is not based on sampling distributions.</font>
    
<font size = "1"> For a fuller description see the Practicing Bayesian statistics markdown file within the GitHub repository.</font>
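To make Bayes' rule concrete, the following is a minimal sketch that evaluates the prior, likelihood, and evidence on a grid of parameter values. The data (7 successes in 10 trials) and the Beta(2, 2) prior are hypothetical and chosen only for illustration; they are unrelated to the data set used in this notebook.

```python
import numpy as np
from scipy import stats

# Hypothetical data: 7 successes in 10 Bernoulli trials (illustration only)
successes, trials = 7, 10

# Grid of candidate values for theta, the success probability
theta = np.linspace(0.0, 1.0, 1001)
dtheta = theta[1] - theta[0]

prior = stats.beta.pdf(theta, 2, 2)                      # P(theta), assumed Beta(2, 2)
likelihood = stats.binom.pmf(successes, trials, theta)   # P(y | theta)

# P(y): the evidence, approximated by numerical integration over the grid
evidence = np.sum(likelihood * prior) * dtheta

# Bayes' rule: posterior density on the grid
posterior = likelihood * prior / evidence

posterior_mean = np.sum(theta * posterior) * dtheta
print(f"Posterior mean of theta: {posterior_mean:.3f}")
```

Because the Beta prior is conjugate to the binomial likelihood, the exact posterior here is Beta(9, 5), so the grid result can be checked against its mean of 9/14. Grid approximation only scales to a handful of parameters, which is why models such as the one below rely on MCMC sampling instead.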

# Steps of Bayesian data analysis

<font size = "3"> Kruschke (2015) offers a step-by-step formulation for how to conduct a Bayesian analysis:

1. Identify the relevant data for the question under investigation.

2. Define the descriptive (mathematical) model for the data.

3. Specify the priors for the model. In scientific research, publication is the goal, so the priors must be acceptable to a skeptical audience. Much of this can be achieved using prior predictive checks to ascertain whether the priors are reasonable.

4. Using Bayes' rule, estimate the posterior for the parameters of the model from the likelihood and priors. Then interpret the posterior.

5. Conduct model checks, i.e. posterior predictive checks.</font> 

<font size = "1">This notebook will follow this approach generally.</font> 
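As an illustration of the prior predictive checks mentioned in step 3, here is a minimal sketch that draws parameter values from a prior, simulates data from each draw, and summarises how extreme the simulated data can become. The half-Cauchy(0, 1) prior on a standard deviation, the Normal data model, and the sample sizes are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_draws = 1000   # number of draws from the prior
n_obs = 50       # hypothetical number of observations per simulated data set

# Half-Cauchy(0, 1) prior on a standard deviation:
# the absolute value of standard Cauchy draws
sigma = np.abs(rng.standard_cauchy(n_draws))

# Simulate one data set per prior draw from Normal(0, sigma)
y_sim = rng.normal(loc=0.0, scale=sigma[:, None], size=(n_draws, n_obs))

# Largest absolute value in each simulated data set
max_abs = np.abs(y_sim).max(axis=1)

# Percentiles show how heavy the tails of the prior predictive are
print(np.percentile(max_abs, [50, 90, 99]))
```

If the upper percentiles are far outside the range that is plausible for the measured variables, that is a sign the prior is wider than a skeptical audience would accept.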

#  Step 1 - Identify the relevant data for question under investigation

## Study/data description

## Import data

In [5]:
# Call github repository
url = 'https://raw.githubusercontent.com/ebrlab/Statistical-methods-for-research-workers-bayes-for-psychologists-and-neuroscientists/master/Data/Birthweight_reduced_kg.csv'
df = pd.read_csv(url)

### Clean the data

# Step 4 - Use Bayes rule

## Stan model of Bayesian Factor analysis

In [7]:
factorAnalysis = '''

data {
  int<lower=1> N;                // number of observations
  int<lower=1> P;                // number of observed variables (columns of Y)
  matrix[N,P] Y;                 // data matrix of order [N,P]
  int<lower=1> D;              // number of latent dimensions 
}
transformed data {
  int<lower=1> M;
  vector[P] mu;
  M  = D*(P-D)+ D*(D-1)/2;  // number of free below-diagonal loadings
  mu = rep_vector(0.0,P);   // the model assumes zero-mean (centred) data
}
parameters {    
  vector[M] L_t;   // below-diagonal elements of L
  vector<lower=0>[D] L_d;   // diagonal elements of L (constrained positive)
  vector<lower=0>[P] psi;         // vector of unique (residual) variances
  real<lower=0>   mu_psi;
  real<lower=0>  sigma_psi;
  real   mu_lt;
  real<lower=0>  sigma_lt;
}
transformed parameters{
  cholesky_factor_cov[P,D] L;  //lower triangular factor loadings Matrix 
  cov_matrix[P] Q;   //Covariance mat
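  {
    int idx1;
    int idx2;
    real zero;
    idx1 = 0;   // counts the zeroed upper-triangular elements (not used for indexing)
    idx2 = 0;   // running index into L_t; must start at 0 before the loop
    zero = 0;
    for (i in 1:P) {
      for (j in (i+1):D) {
        idx1 = idx1 + 1;
        L[i,j] = zero; // constrain the upper triangular elements to zero
      }
    }
    for (j in 1:D) {
      L[j,j] = L_d[j];
      for (i in (j+1):P) {
        idx2 = idx2 + 1;
        L[i,j] = L_t[idx2];
      }
    }
  }
  Q = L*L' + diag_matrix(psi);  // implied covariance: loadings plus unique variances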
}
model {
// the hyperpriors 
   mu_psi ~ cauchy(0, 1);
   sigma_psi ~ cauchy(0,1);
   mu_lt ~ cauchy(0, 1);
   sigma_lt ~ cauchy(0,1);
// the priors 
  L_d ~ cauchy(0,3);
  L_t ~ cauchy(mu_lt,sigma_lt);
  psi ~ cauchy(mu_psi,sigma_psi);
//The likelihood
for( j in 1:N)
    Y[j] ~ multi_normal(mu,Q); 
}
'''
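
The loading-matrix construction in the `transformed parameters` block can be mirrored in NumPy to see what the Stan program builds. This is a sketch under assumed sizes (`P = 5`, `D = 2`) and made-up parameter values; the helper name `build_loadings` is hypothetical and not part of the notebook.

```python
import numpy as np

def build_loadings(L_d, L_t, P):
    """Fill a P x D lower-triangular loading matrix column by column."""
    D = len(L_d)
    M = D * (P - D) + D * (D - 1) // 2   # number of free below-diagonal loadings
    assert len(L_t) == M, "L_t must hold exactly M below-diagonal loadings"
    L = np.zeros((P, D))                 # upper-triangular elements stay at zero
    idx = 0
    for j in range(D):
        L[j, j] = L_d[j]                 # positive diagonal loading
        for i in range(j + 1, P):
            L[i, j] = L_t[idx]           # next free below-diagonal loading
            idx += 1
    return L

# Hypothetical sizes and values, for illustration only
P, D = 5, 2
M = D * (P - D) + D * (D - 1) // 2
rng = np.random.default_rng(0)
L = build_loadings(L_d=np.array([1.0, 0.8]), L_t=rng.normal(size=M), P=P)

psi = np.full(P, 0.5)                    # unique (residual) variances
Q = L @ L.T + np.diag(psi)               # implied covariance matrix of Y
print(L)
```

Fixing the upper triangle at zero and constraining the diagonal to be positive removes the rotational and sign indeterminacy of the loadings, which is why only `M` below-diagonal values and `D` diagonal values are free parameters.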

In [8]:
sm = ps.StanModel(model_code = factorAnalysis)

INFO:pystan:COMPILING THE C++ CODE FOR MODEL anon_model_b416f8d5e3e0c0d71640c639266fc9df NOW.
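
The compiled model expects a dictionary whose keys match the `data` block (`N`, `P`, `Y`, `D`). The sketch below shows one way to build it. The helper name `make_stan_data`, the choice to use every numeric column, and the standardisation step are assumptions; centring is needed in some form because the model fixes the mean vector `mu` at zero. A toy data frame stands in for `df` so that the example runs on its own.

```python
import numpy as np
import pandas as pd

def make_stan_data(df, n_factors):
    """Build the data dictionary expected by the Stan model's data block."""
    # Assumption: every numeric column is an indicator; rows with missing values are dropped
    numeric = df.select_dtypes(include="number").dropna()
    # The model fixes mu at zero, so centre each column (scaling is an extra choice)
    Y = (numeric - numeric.mean()) / numeric.std()
    N, P = Y.shape
    if not 1 <= n_factors <= P:
        raise ValueError("n_factors must be between 1 and the number of columns")
    return {"N": N, "P": P, "Y": Y.to_numpy(), "D": n_factors}

# Toy data frame standing in for the notebook's df (hypothetical values)
toy = pd.DataFrame(np.random.default_rng(0).normal(size=(20, 4)),
                   columns=list("abcd"))
stan_data = make_stan_data(toy, n_factors=2)
print(stan_data["N"], stan_data["P"], stan_data["D"])

# With the compiled model `sm`, sampling could then look like this
# (PyStan 2 interface; the iteration and chain counts are arbitrary choices):
# fit = sm.sampling(data=stan_data, iter=2000, chains=4)
```

The number of latent dimensions `D` is a modelling decision that the notebook does not specify, so the value of 2 used here is only a placeholder.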
