In [None]:
bayesplay::loadpackages()

# Bayesian data analysis workshop 2020


## Guide

All analysis code should be written in this notebook. Explanatory text can also be included. 

Once complete, the notebook should be downloaded as an `.ipynb` file and attached to the assignment. Note that you have no persistent storage, so download the file regularly while you are working on it to avoid losing your work.


# Question 1

A researcher was interested in whether there is a gender difference in maths anxiety scores. They predict that women will have higher scores than men. Using the aMAS (which assigns scores from 9 to 45), they obtain the following data:


In [None]:
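set.seed(1)
min_possible = 9   # lowest possible aMAS score
max_possible = 45  # highest possible aMAS score

# Round each simulated score and clamp it to the valid aMAS range
clamp_score = function(x) min(max(round(x), min_possible), max_possible)

# The outer parentheses matter: %>% binds more tightly than + and *,
# so without them only the scaled values would be rounded
male = (28 + 9 * scale(rnorm(n = 15, 0, 1))) %>% map_dbl(., clamp_score)
female = (35 + 10 * scale(rnorm(n = 16, 0, 1))) %>% map_dbl(., clamp_score)

amas_data = rbind(tibble(gender = "male", score = male),
                  tibble(gender = "female", score = female))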
                                                            
amas_data

## 1.1

Represent this prediction (that women would score higher than men) in three ways.

1. With a uniform prior (remember that the max score is 45)
2. With a normal prior. For the normal prior you theorise that the difference is most likely to be small. You decide to put most of the weight near 0, with the prior weight reducing steadily as it approaches the maximum possible difference. To do this, use a normal prior centred at zero with a standard deviation of half the maximum possible difference.
3. Some other way you think is reasonable and explain your reasoning.
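
As a rough illustration of the first two options, the priors can be written as plain density functions in base R. This is only a sketch: `max_diff`, `prior_uniform` and `prior_half_normal` are hypothetical names, and taking the maximum possible difference to be 45 − 9 = 36 is an assumption (adjust it if you read the scale differently). Both priors are restricted to positive differences to reflect the directional prediction.

```r
# Assumption: the largest possible group difference on a 9-45 scale
max_diff <- 45 - 9

# 1. Uniform prior over positive differences (women score higher than men)
prior_uniform <- function(theta) dunif(theta, min = 0, max = max_diff)

# 2. Half-normal prior: normal centred at 0 with sd = max_diff / 2,
#    truncated to theta >= 0 and renormalised (hence the factor of 2)
prior_half_normal <- function(theta) {
  ifelse(theta >= 0, 2 * dnorm(theta, mean = 0, sd = max_diff / 2), 0)
}

# Sanity check: each prior should integrate to approximately 1
area_uniform     <- integrate(prior_uniform, lower = 0, upper = max_diff)$value
area_half_normal <- integrate(prior_half_normal, lower = 0, upper = Inf)$value
```

Checking that a prior integrates to 1 is a quick way to catch a forgotten truncation or normalisation before it distorts a Bayes factor.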

## 1.2

1. For your $\mathcal{H}_0$ you should use a point null of 0. 
2. In addition, you should also try using an interval null. You consider differences between 0 and 5 to be equally "null". Represent this with a uniform prior.

## 1.3

1. Define a likelihood for your data. Explain your reasoning behind it.
2. Compute the corresponding **Bayes Factors** for the 3 $\times$ $\mathcal{H}_1$ priors and 2 $\times$ $\mathcal{H}_0$ priors

You can also choose to include plots of the priors and likelihoods, or anything else that will help your explanations.
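
The general recipe for a Bayes factor can be sketched in base R with numerical integration: the marginal likelihood of each hypothesis is the likelihood weighted by that hypothesis's prior, and the Bayes factor is the ratio of two marginal likelihoods. The summary statistics below (`obs_diff`, `obs_se`) are made-up placeholder values, not computed from `amas_data`, and the normal likelihood and the upper bound of 36 are assumptions for illustration only.

```r
# Hypothetical summary statistics (illustrative only, NOT from amas_data)
obs_diff <- 6   # observed mean difference
obs_se   <- 3   # standard error of that difference

# Normal likelihood of the observed difference, as a function of the
# true difference theta (an assumed likelihood; justify your own choice)
lik <- function(theta) dnorm(obs_diff, mean = theta, sd = obs_se)

# H1: uniform prior over 0..36 (assumed maximum difference)
# H0 (interval): uniform prior over 0..5
h1_prior <- function(theta) dunif(theta, min = 0, max = 36)
h0_prior <- function(theta) dunif(theta, min = 0, max = 5)

# Marginal likelihoods: integrate likelihood x prior over each prior's support
m1          <- integrate(function(t) lik(t) * h1_prior(t), lower = 0, upper = 36)$value
m0_interval <- integrate(function(t) lik(t) * h0_prior(t), lower = 0, upper = 5)$value
m0_point    <- lik(0)  # a point null puts all its prior mass at theta = 0

# Bayes factors in favour of H1 over each null
bf_vs_point    <- m1 / m0_point
bf_vs_interval <- m1 / m0_interval
```

With a point null no integration is needed, because the marginal likelihood is simply the likelihood evaluated at 0.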


In [None]:
bayesplay::loadpackages()

# You can write the code for question 1 in here!

# Question 2


The data set `math_data` contains data from a mathematics and reading assessment for 40 children. The data contains 4 columns. 

1. `id` is the ID of the child
2. `reading` is their reading score 
3. `procspeed` is their score on a processing speed task
4. `math` is their score on a math assessment

You're interested in the **math** outcome measure. Specifically, you're interested in whether **reading** predicts **math**. 

In [None]:
set.seed(613)

n   <- 40
rho <- .7

math_data = tibble(
  id        = 1:n,
  reading   = rnorm(n, 100, 1),                          # reading score
  procspeed = rnorm(n, rho * reading, sqrt(1 - rho^2)),  # processing speed, tied to reading via rho
  math      = 100 + rnorm(n, (reading - procspeed), 1))  # math score

math_data

## 2.1

1. Using **brms** fit 4 regression models. 
    1. Model 1 should be an intercept only model (predicting math)
    2. Model 2 should include 1 predictor (predicting math)
    3. Model 3 should include 1 predictor (predicting math); Model 2 and Model 3 should use different predictors
    4. Model 4 should include both predictors (i.e., reading and processing speed)

2. Describe the priors you've used for the parameters and include your reasoning about the prior choices.
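
One possible way to organise the four fits is sketched below. The prior values are placeholders chosen only to make the example concrete, not recommendations, and `priors_base`, `priors_slopes` and `fit_math_models` are hypothetical names. The intercept-only model gets its own prior set because it has no slope (`b`) parameters, and brms may reject priors that do not match any parameter in the model.

```r
library(brms)

# Placeholder priors (assumptions for illustration; justify your own choices)
priors_base <- c(
  prior(normal(100, 50), class = "Intercept"),
  prior(exponential(1), class = "sigma")
)
# Models with predictors additionally need a prior on the slopes
priors_slopes <- c(priors_base, prior(normal(0, 5), class = "b"))

# Hypothetical helper: fits the four models to a data frame with
# columns math, reading and procspeed, and returns them as a named list
fit_math_models <- function(data, seed = 1) {
  list(
    m1 = brm(math ~ 1, data = data, prior = priors_base, seed = seed),
    m2 = brm(math ~ reading, data = data, prior = priors_slopes, seed = seed),
    m3 = brm(math ~ procspeed, data = data, prior = priors_slopes, seed = seed),
    m4 = brm(math ~ reading + procspeed, data = data, prior = priors_slopes, seed = seed)
  )
}
```

Under these assumptions, `fits <- fit_math_models(math_data)` would compile and sample all four models, which can take several minutes.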

## 2.2

1. Provide a table or figure of the regression coefficients for all 4 models

## 2.3 

1. Compare the models using PSIS-LOO and state which model is "best"
2. Average the models using PSIS-LOO
3. Generate **model averaged** predictions for the relationship between Reading and Math.
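
The three steps above can be sketched with functions that brms provides: `loo()`, `loo_compare()`, `loo_model_weights()` and `pp_average()`. The helper name `compare_and_average` and its arguments are hypothetical, and the choice of `weights = "loo"` and `method = "posterior_epred"` is one option among several rather than the required approach.

```r
library(brms)

# Hypothetical helper: m1..m4 are fitted brmsfit objects; newdata is a data
# frame holding values for every predictor used by any of the models
compare_and_average <- function(m1, m2, m3, m4, newdata) {
  list(
    # 1. PSIS-LOO comparison: the model in the first row has the highest elpd
    comparison = loo_compare(loo(m1), loo(m2), loo(m3), loo(m4)),
    # 2. Model weights derived from PSIS-LOO
    weights = loo_model_weights(m1, m2, m3, m4),
    # 3. Model-averaged expected values of math at the newdata values
    predictions = pp_average(m1, m2, m3, m4,
                             weights = "loo",
                             method = "posterior_epred",
                             newdata = newdata)
  )
}
```

To trace the averaged relationship between reading and math, `newdata` could vary `reading` over its observed range while holding `procspeed` at a fixed value such as its mean.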


In [None]:
bayesplay::loadpackages()

# You can write the code for question 2 in here!