## 1. Exploring political differences through latent factor analysis

- Motivation
    - Political spectrum is usually considered 1-dimensional
    - Latent factor analysis has been used in other settings to understand human actions and motivation


- Overview of project
    - Survey data
    - Naive Bayesian latent factor analysis
    - Extension with ordered probit to accommodate the discrete nature of the data


## 2. Introduction to factor models

\begin{equation}
    \underset{(p \times 1)}{y_i} = \underset{(p \times k)}{\beta} \underset{(k \times 1)}{F_i} + \underset{(p \times 1)}{\epsilon_i}
\end{equation}

for $i = 1, \dots, T$ individuals, with $p$ observed variables and $k$ latent factors. $\beta$ denotes the factor loadings and $F_i$ the factor scores.

\begin{equation}
    \epsilon_i \overset{\text{i.i.d.}}{\sim} N(0,\Sigma)
\end{equation}

\begin{equation}
    \underset{(p \times p)}{var(y)} = \underset{(p \times k)(k \times k)(k \times p)}{\beta\psi\beta'} + \underset{(p \times p)}{\Sigma} = \underset{(p \times p)}{\Omega}
\end{equation}

where a standard normalising assumption on the factor covariance $\psi = var(F_i)$ is

\begin{equation}
    \psi = I_k
\end{equation}

<img style="float" src="factorloading.jpeg" width="300">
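
As a quick numerical check of the covariance identity above, the sketch below simulates from the model with $\psi = I_k$ and compares the sample covariance of $y$ with $\beta\beta' + \Sigma$. The dimensions, the seed, and the diagonal $\Sigma$ are illustrative assumptions, not the values used in the project.

```python
import numpy as np

rng = np.random.default_rng(0)
p, k, T = 6, 3, 200_000  # illustrative sizes (assumed)

beta = rng.normal(size=(p, k))              # factor loadings (p x k)
sigma_diag = rng.uniform(0.5, 1.5, size=p)  # diagonal of Sigma (assumed diagonal)

F = rng.normal(size=(k, T))                 # factor scores with var(F_i) = I_k
eps = rng.normal(size=(p, T)) * np.sqrt(sigma_diag)[:, None]
y = beta @ F + eps                          # observations (p x T)

omega_implied = beta @ beta.T + np.diag(sigma_diag)
omega_sample = np.cov(y)                    # rows are variables, columns are individuals

max_abs_err = np.max(np.abs(omega_sample - omega_implied))
print(max_abs_err)
```

With a large number of simulated individuals the two matrices should agree up to sampling noise.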

## 3. Naive Factor model

\begin{equation}
    \underset{(p \times T)}{y} = \underset{(p \times k)}{\beta} \underset{(k \times T)}{F} + \underset{(p \times T)}{\epsilon}
\end{equation}


Misspecification: the survey answers are discrete, but this model treats them as continuous.
                                                     
                                                     
\begin{equation}
    p(\beta \vert y, F, \Sigma) \propto p(y \vert \beta, F, \Sigma)\, p(\beta)
\end{equation}
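
With a Gaussian likelihood and a Gaussian prior, the conditional posterior of each row of $\beta$ is again Gaussian, so it can be drawn directly in a Gibbs step. The sketch below shows one such draw. It assumes a prior $\beta_j \sim N(0, c I_k)$ and a diagonal $\Sigma$; neither is stated on the slide, and the function name `draw_beta_row` is hypothetical.

```python
import numpy as np

def draw_beta_row(y_j, F, sigma2_j, prior_var, rng):
    """Draw row j of beta from p(beta_j | y_j, F, sigma2_j).

    y_j: (T,) responses to question j; F: (k, T) factor scores.
    Assumes the prior beta_j ~ N(0, prior_var * I_k) (illustrative assumption).
    Returns the draw and the posterior mean.
    """
    k = F.shape[0]
    precision = np.eye(k) / prior_var + (F @ F.T) / sigma2_j
    cov = np.linalg.inv(precision)
    mean = cov @ (F @ y_j) / sigma2_j
    chol = np.linalg.cholesky(cov)            # cov = chol chol'
    return mean + chol @ rng.normal(size=k), mean

rng = np.random.default_rng(1)
k, T = 3, 5000
F = rng.normal(size=(k, T))
beta_true = np.array([0.8, -0.5, 0.3])        # assumed loadings for the demo
y_j = beta_true @ F + rng.normal(size=T)      # unit error variance

draw, post_mean = draw_beta_row(y_j, F, sigma2_j=1.0, prior_var=10.0, rng=rng)
```

In this simulated setting the posterior mean should lie close to `beta_true`, since the factors are treated as known.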
                                                     
                                                     



## 4. Naive Factor model 2

For Q being an orthogonal matrix $Q\cdot Q' = I_k$, we have the following problem:

Define a new factor loading matrix $\tilde\beta = \beta Q'$. Then

\begin{equation}
    var(y) = \tilde\beta \tilde\beta' + \Sigma = \beta Q'(\beta Q')' + \Sigma = \beta Q'Q\beta' + \Sigma = \beta\beta' + \Sigma
\end{equation}
so $\beta$ and $\tilde\beta$ imply the same covariance: no identification!
                                    
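A common remedy is to restrict $\beta$ so that all entries above the diagonal are zero:

\begin{equation}
\beta = \left(\begin{array}{lcr}
\beta_{11} & 0 & 0 \\
\beta_{21} & \beta_{22} & 0 \\
\beta_{31} & \beta_{32} & \beta_{33} \\
\vdots & \vdots & \vdots \\
\beta_{p1} & \beta_{p2} & \beta_{p3} \\ \end{array}\right)
\end{equation}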


Second problem: sign switching. Flipping the sign of a column of $\beta$ together with the matching factor leaves $var(y)$ unchanged.
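
Both identification problems can be checked numerically. The sketch below builds a random orthogonal $Q$ (via a QR decomposition) and a sign-switching matrix, and confirms that the implied covariance is unchanged in each case. All values are simulated for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
p, k = 5, 3
beta = rng.normal(size=(p, k))
Sigma = np.diag(rng.uniform(0.5, 1.5, size=p))

# Random orthogonal matrix Q from a QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(k, k)))
beta_rot = beta @ Q.T                      # rotated loadings, beta Q'

# Sign switching: flip the sign of the second column of beta.
S = np.diag([1.0, -1.0, 1.0])
beta_flip = beta @ S

omega = beta @ beta.T + Sigma
same_after_rotation = np.allclose(beta_rot @ beta_rot.T + Sigma, omega)
same_after_sign_flip = np.allclose(beta_flip @ beta_flip.T + Sigma, omega)
loadings_changed = not np.allclose(beta_rot, beta)
```

The loadings differ while the covariance does not, which is why the data alone cannot distinguish between them.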



## 5. Ordered probit Factor Model

\begin{equation}
    Y_i = l \text{ if } \tau_{l-1} < Y_i^* \leq \tau_{l}, \text{ for } l=1,2,3,4
\end{equation}

\begin{equation}
    Y^*_i = \beta F_i + \epsilon_i
\end{equation}

<img style="float" src="gaussian.png" width="300">

Posterior:
\begin{equation}
    P(\beta \vert Y^*, Y, \tau, F) \propto  P(Y \vert Y^*, \tau)P(Y^*\vert \beta, F)P(\beta)P(F)
\end{equation}
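
The link between the latent response $Y^*$ and the observed category can be sketched by simulating one question. The loadings and thresholds below are assumed values for illustration, with $\tau_0 = -\infty$ and $\tau_4 = +\infty$ left implicit.

```python
import numpy as np

rng = np.random.default_rng(3)
k, T = 3, 1000
beta_j = np.array([0.9, -0.4, 0.2])       # loadings of one question (assumed values)
tau = np.array([-1.0, 0.0, 1.0])          # interior thresholds tau_1..tau_3 (assumed)

F = rng.normal(size=(k, T))
y_star = beta_j @ F + rng.normal(size=T)  # latent response with unit error variance

# right=True places a value equal to a threshold in the lower category.
y = np.digitize(y_star, tau, right=True) + 1   # observed categories 1..4
```

Only `y` would be observed in the survey; `y_star` and `F` are latent and have to be sampled.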


## 6. Probit Factor Model

In order to properly identify the model we do the following:

\begin{equation}
    \Sigma = I_p
\end{equation}

If the model included a constant term, we would have to fix one of the thresholds (e.g. $\tau_l = 0$). We do not include one, so no threshold is fixed.

\begin{equation}
    Y_i^* \vert Y_i = l \sim TN_{[\tau_{l-1}, \tau_{l}]}(\beta F_i; 1)
\end{equation}

\begin{equation}
    \tau_l \sim U[\underline{\tau_l};\bar{\tau_l}]
\end{equation}
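
The truncated normal draw for $Y^*$ can be sketched with inverse-CDF sampling. This is one simple way to do it, not necessarily the sampler used in the project; the function name `draw_latent` and all numerical values are hypothetical. Category $l$ is mapped to the interval between $\tau_{l-1}$ and $\tau_l$, following the category definition of the ordered probit model.

```python
import numpy as np
from statistics import NormalDist

_std = NormalDist()
_cdf = np.vectorize(_std.cdf, otypes=[float])
_inv_cdf = np.vectorize(_std.inv_cdf, otypes=[float])

def draw_latent(mean, y, tau_full, rng):
    """Draw Y* ~ N(mean, 1) truncated to the interval implied by category y.

    tau_full holds tau_0..tau_4 with tau_0 = -inf and tau_4 = +inf, so
    category l corresponds to [tau_full[l-1], tau_full[l]].
    """
    lower = tau_full[y - 1]
    upper = tau_full[y]
    p_lo = _cdf(lower - mean)
    p_hi = _cdf(upper - mean)
    u = rng.uniform(size=mean.shape)
    # Clip to keep the inverse CDF defined at the extremes.
    prob = np.clip(p_lo + u * (p_hi - p_lo), 1e-12, 1 - 1e-12)
    return mean + _inv_cdf(prob)

rng = np.random.default_rng(4)
tau_full = np.array([-np.inf, -1.0, 0.0, 1.0, np.inf])  # assumed thresholds
mean = rng.normal(size=500)          # stands in for beta F_i
y = rng.integers(1, 5, size=500)     # observed categories 1..4
y_star = draw_latent(mean, y, tau_full, rng)
```

Inverse-CDF sampling loses accuracy when the interval lies far in the tail of the distribution, so a dedicated truncated normal sampler may be preferable in practice.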



## 7. Data, simulations and factor loading

- Data
    - Survey data from the 2017 Danish municipal elections (KV17)
    - Politicians answered a 15-question questionnaire from the Danish broadcasting service
    - 1200 answers from politicians of the largest Danish parties are used in the analysis
    - Dataset size: 1200 rows by 15 variables
    
    
- Simulations
    - To test whether our algorithm converges, we used two strategies for generating test data:
    - Strategy 1: simulate from 3 underlying factors
    - Strategy 2: use a Cholesky decomposition to create a dataset with the correct covariance matrix
    - The algorithm converged quickly on both
    
    
- Factor loading
    - The lower-triangular loading matrix (zeros above the diagonal) requires us to think about the ordering of the questions
    - The first two questions should differ in nature (in which latent factor they represent)
        + loads only on factor 0: _Municipality tax should be reduced_
        + loads only on factors 0 & 1: _Institutions run by local authorities take the concerns of religious minorities too much into consideration._
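
Strategy 2 from the simulations can be sketched as follows. The slides do not say how the target covariance was built, so the sketch assumes it is $\beta\beta' + I_p$ with a restricted loading matrix that has zeros above the diagonal. The sizes match the 15 questions and 1200 rows of the data, and $k = 3$ follows strategy 1.

```python
import numpy as np

rng = np.random.default_rng(5)
p, k, T = 15, 3, 1200

# Zeros above the diagonal: question j (0-based) loads only on factors 0..j.
beta = np.tril(rng.normal(size=(p, k)))
omega = beta @ beta.T + np.eye(p)      # target covariance, assuming Sigma = I_p

L = np.linalg.cholesky(omega)          # omega = L L'
z = rng.normal(size=(p, T))            # independent standard normals
y_sim = L @ z                          # each column has covariance omega
```

The first row of `beta` has a single nonzero entry and the second row has two, which mirrors the roles of the first two questions.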


## 8. Results Naive - Trace and distribution of $\beta$ 



<table>
<tr>
    <td> <img src="estimation_trace_beta_plot.png" width="400"> </td>
    <td> <img src="estimation_dist_beta_plot.png" width="400"> </td>
</tr>
</table>

## 9. Results Naive - scatter plots 1


<table>
<tr>
    <td> <img src="estimation_scatter_la_el.png" width="450"> </td>
    <td> <img src="estimation_scatter_rv_df.png" width="450"> </td>
</tr>
</table>



## 10. Results Naive - scatter plots 2 

<table>
<tr>
    <td> <img src="estimation_scatter_v_c.png" width="450"> </td>
    <td> <img src="estimation_scatter_v_sd.png" width="450"> </td>
</tr>
</table>


## 11. Feedback/issues

- Computation issues
- Factor loading matrix: not what we expected
- Unknown unknowns