In [86]:
import pymc as pm
import numpy as np
import arviz as az
from pymc.math import matrix_inverse, extract_diag, sqrt
import aesara.tensor as at

%load_ext lab_black
%load_ext watermark



# Dental Development


Adapted from [Unit 10: growth.odc](https://raw.githubusercontent.com/areding/6420-pymc/main/original_examples/Codes4Unit10/growth.odc).

Data for the y array can be found [here](https://raw.githubusercontent.com/areding/6420-pymc/main/data/growthy.txt).

## Associated lecture video: Unit 10 Lesson 2

In [39]:
%%html
<iframe width="560" height="315" src="https://www.youtube.com/embed?v=xomK4tcePmc&list=PLv0FeK5oXK4l-RdT6DWJj0_upJOG2WKNO&index=99" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>

## Problem statement


The dental development data set was first published by Potthoff and Roy in their 1964 paper. It consists of longitudinal observations on 11 girls (gender=1) and 16 boys (gender=2).

There are 4 observations on each subject, taken at times centered to -3, -1, 1, and 3, in units of years.

The measurement on each subject is the distance (in mm) from the center of the pituitary to the pterygomaxillary fissure.

Potthoff and Roy (1964). "A Generalized Multivariate Analysis of Variance Model Useful Especially for Growth Curve Problems," Biometrika, 51, 313-326.

The model is a multivariate normal (MVN) with gender-specific means but a common precision matrix.
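
Reading the model cell further down, the structure can be written out roughly as follows. This is a sketch inferred from the code, not taken from the original .odc file: $i$ indexes subjects, $g_i$ is the gender of subject $i$, and $t_j \in \{-3, -1, 1, 3\}$. The Wishart line corresponds to the commented-out prior in that cell, not to the LKJ prior the cell actually uses.

$$
\begin{aligned}
y_i \mid \beta, T &\sim \mathcal{N}_4\left(\mu_{g_i},\ T^{-1}\right), \\
\mu_{g,j} &= \beta_{1,g} + \beta_{2,g}\, t_j, \\
\beta_{1,g} &\sim \mathcal{N}(20,\ \tau = 0.001), \qquad \beta_{2,g} \sim \mathcal{N}(1,\ \tau = 0.001), \\
T &\sim \text{Wishart}(\nu = 4,\ V = I_4).
\end{aligned}
$$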

notes:
- Definitely talk about the Wishart prior. The PyMC docs say `pm.Wishart` is unusable in a model and point to `LKJCholeskyCov` or `LKJCorr` instead, so why is it still included?
- There is a very interesting discussion at https://github.com/pymc-devs/pymc/issues/538.
- This is all new to me; I need to think about it more.

In [9]:
t = np.array((-3, -1, 1, 3))
gender = np.array(
    (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)
)
y = np.loadtxt("../data/growthy.txt")
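
The gender codes above are 1 and 2, while positional indexing into a length-2 coefficient vector needs 0 and 1. Below is a minimal sketch of that shift. `beta1_demo` holds made-up values purely for illustration, and the array is rebuilt with the same codes as typed above (11 ones followed by 15 twos, 26 subjects in total).

```python
import numpy as np

# Same gender codes as in the cell above: 11 ones, then 15 twos.
gender = np.array([1] * 11 + [2] * 15)

# Shift codes 1/2 to zero-based positions 0/1 so they can be used as indices.
gender_idx = gender - 1

# Count subjects per group as a sanity check on the array.
counts = np.bincount(gender_idx)

# Hypothetical per-gender intercepts, only to show the indexing.
beta1_demo = np.array([21.0, 23.0])
per_subject = beta1_demo[gender_idx]  # one intercept per subject

print(counts, per_subject.shape)
```

Indexing with the raw codes instead would pick position 2 for every boy, which is out of bounds for a length-2 vector.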

In [21]:
C = np.eye(4)
C

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [124]:
coords = {"t": t, "gender": np.array([1, 2]), "id": np.array(range(26))}

In [26]:
coords

{'gender': array([1, 2]),
 'id': array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19, 20, 21, 22, 23, 24, 25])}

In [97]:
pm.Wishart.dist(nu=4, V=C).eval()

array([[ 0.20265061,  0.71904916,  0.14760878,  0.59028027],
       [ 0.71904916,  3.97705637,  2.11725019,  1.05959649],
       [ 0.14760878,  2.11725019,  2.0715082 , -0.6685237 ],
       [ 0.59028027,  1.05959649, -0.6685237 ,  2.49725134]])
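
A draw like this one plays the role of the precision matrix `T`. The commented-out lines in the model cell further down turn a precision matrix into a correlation matrix, and the NumPy sketch below mirrors those steps. `T_demo` is a hypothetical positive-definite matrix generated here for illustration, not a value from the notebook.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical positive-definite precision matrix (B @ B.T with more columns than rows).
B = rng.standard_normal((4, 6))
T_demo = B @ B.T

# Precision -> covariance.
V = np.linalg.inv(T_demo)

# Covariance -> correlation: divide each row and column by the standard deviations.
diag_sqrt = np.sqrt(np.diag(V))
corr = V / diag_sqrt[:, None] / diag_sqrt[None, :]

print(np.round(corr, 3))
```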

In [115]:
pm.LKJCorr.dist(n=4, eta=0.4).eval()

array([-0.28551757,  0.25067463,  0.6477226 , -0.7740144 ,  0.08064782,
        0.06060438])
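
The six values are the n(n-1)/2 = 6 off-diagonal entries of a 4x4 correlation matrix. The sketch below rebuilds the full matrix from them. It assumes the entries are packed in row-major order over the upper triangle, which is how I understand this PyMC version to return them; treat that ordering as an assumption to check against the docs.

```python
import numpy as np

n = 4

# The packed draw shown above.
packed = np.array(
    [-0.28551757, 0.25067463, 0.6477226, -0.7740144, 0.08064782, 0.06060438]
)

# Start from ones on the diagonal, then fill both triangles.
corr = np.eye(n)
rows, cols = np.triu_indices(n, k=1)  # assumed packing order: row-major upper triangle
corr[rows, cols] = packed
corr[cols, rows] = packed  # mirror to keep the matrix symmetric

print(corr)
```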

In [107]:
pm.WishartBartlett(nu=4, S=C).eval()

TypeError: WishartBartlett() missing 1 required positional argument: 'name'
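
The error shows that `WishartBartlett` expects a `name`, which suggests it registers variables inside a model rather than returning a standalone distribution that can be evaluated directly. To see what the Bartlett decomposition does without a model, here is a plain NumPy sketch of the textbook construction. It is not PyMC's implementation, and `wishart_bartlett_draw` is a hypothetical helper name.

```python
import numpy as np


def wishart_bartlett_draw(nu, V, rng):
    """Draw from Wishart(nu, V) using the Bartlett decomposition."""
    p = V.shape[0]
    L = np.linalg.cholesky(V)  # V = L @ L.T

    # Lower-triangular factor A: sqrt of chi-square draws on the diagonal
    # (degrees of freedom nu, nu - 1, ..., nu - p + 1), standard normals below it.
    A = np.zeros((p, p))
    A[np.diag_indices(p)] = np.sqrt(rng.chisquare(nu - np.arange(p)))
    A[np.tril_indices(p, k=-1)] = rng.standard_normal(p * (p - 1) // 2)

    LA = L @ A
    return LA @ LA.T  # symmetric by construction


rng = np.random.default_rng(0)
draw = wishart_bartlett_draw(nu=4, V=np.eye(4), rng=rng)
print(draw.shape)
```

Because the result is built as a matrix times its own transpose, every draw is symmetric with a positive diagonal, which is the property a generic sampler proposal on the full matrix does not preserve.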

In [127]:
with pm.Model(coords=coords) as m:
    beta1 = pm.Normal("beta1", 20, tau=.001, dims="gender")
    beta2 = pm.Normal("beta2", 1, tau=.001, dims="gender")
    g = pm.Data("gender_idx", gender, dims="id", mutable=False)

    # T = pm.Wishart("T", nu=4, V=C) # might need to switch to LKJ
    T, corr, stds = pm.LKJCholeskyCov("T", n=4, eta=1, sd_dist=pm.Normal.dist(0, 1), compute_corr=True)

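    # NOTE: beta1[g] has shape (26,) while t has shape (4,), so this expression cannot broadcast.
    # The mean needs one row per subject, i.e. shape (26, 4), to match y.
    mu = pm.Deterministic("mu", beta1[g] + beta2[g] * t, dims=("gender", "t"))
    pm.MvNormal("likelihood", mu, chol=T, observed=y)  # y has shape (26, 4)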

    pm.Deterministic("corr", corr)
    #V = matrix_inverse(T)
    #diag_sqrt = sqrt(extract_diag(V))
    #V = V/diag_sqrt[:, None]
    #corr = pm.Deterministic("corr", V/diag_sqrt[None, :])

    trace = pm.sample(2000)

ERROR (aesara.graph.opt): Optimization failure due to: constant_folding
ERROR (aesara.graph.opt): node: Assert{msg=Could not broadcast dimensions}(TensorConstant{26}, TensorConstant{False})
ERROR (aesara.graph.opt): TRACEBACK:
ERROR (aesara.graph.opt): Traceback (most recent call last):
  File "/Users/aaron/mambaforge/envs/pymc_env/lib/python3.10/site-packages/aesara/graph/opt.py", line 1850, in process_node
    replacements = lopt.transform(fgraph, node)
  File "/Users/aaron/mambaforge/envs/pymc_env/lib/python3.10/site-packages/aesara/graph/opt.py", line 1055, in transform
    return self.fn(fgraph, node)
  File "/Users/aaron/mambaforge/envs/pymc_env/lib/python3.10/site-packages/aesara/tensor/basic_opt.py", line 2944, in constant_folding
    required = thunk()
  File "/U

AssertionError: Could not broadcast dimensions
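
The broadcast failure most likely comes from the `mu` expression: indexing by subject gives vectors of length 26, and `t` has length 4, so the two cannot be combined elementwise. The NumPy sketch below reproduces the shape problem and shows one possible fix, adding a trailing axis so each subject gets a row of four means. The coefficient values are hypothetical, and `gender_idx` assumes the codes have already been shifted to 0/1.

```python
import numpy as np

t = np.array([-3, -1, 1, 3])
gender_idx = np.array([0] * 11 + [1] * 15)  # zero-based gender positions, 26 subjects

# Hypothetical per-gender intercepts and slopes, only to show the shapes.
beta1_demo = np.array([21.0, 23.0])
beta2_demo = np.array([0.5, 0.8])

# Shapes (26,) and (4,) cannot broadcast, mirroring the error above.
try:
    beta1_demo[gender_idx] + beta2_demo[gender_idx] * t
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# Adding a trailing axis gives (26, 1) against (4,), which broadcasts to (26, 4).
mu = beta1_demo[gender_idx][:, None] + beta2_demo[gender_idx][:, None] * t

print(broadcast_ok, mu.shape)
```

Inside the model, the same indexing would read `beta1[g][:, None] + beta2[g][:, None] * t`, with `dims=("id", "t")` in place of `("gender", "t")`. Two further things are worth checking in that cell. The PyMC docs describe `sd_dist` as a positive distribution, so `pm.Normal.dist(0, 1)` may need to become something like `pm.Exponential.dist(1.0)`. Also, `LKJCholeskyCov` parameterizes the covariance rather than the precision, so the name `T` is misleading there.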