### Floran Defossez, Sarra Mars, Abdellah Laassairi, Sébastien Roig

# Report

# Kidney: Weibull regression with random effects

## Presentation of the problem

The objective of this study is to analyze time to first and second recurrence of infection in kidney patients on dialysis. The analysis uses a Weibull regression model with random effects and includes age, sex, and underlying disease as risk variables. The aim is to identify the risk factors associated with recurrence of infection in kidney patients.

# Model 

To model the time to first and second recurrence of infection $t_{i,j}$ (recurrence $j$ of patient $i$), we use a Weibull distribution:
$$ t_{i,j} \sim Weibull(r, \mu_{i,j}) $$
The Weibull distribution is well suited to survival analysis; for example, it is often used to model the failure-free operating time of an appliance.

<img src="Docs/Weibull density.png" data-canonical-src="Docs/Weibull density.png" width="400" height="300" />

#### Figure 1: Weibull density for different parameters ($\lambda$ scale parameter, $k$ shape parameter)

Note that some observations are right-censored; we model this censoring with a truncated Weibull distribution.
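As a minimal sketch of how an observed and a censored time could enter the likelihood, the function below assumes the BUGS-style parametrization in which $Weibull(r, \mu)$ has density $r \mu t^{r-1} e^{-\mu t^r}$ (an assumption, since the density is not written out here). It also assumes that a right-censored time contributes its survival probability $e^{-\mu c^r}$, which is what integrating the Weibull density over all times beyond the censoring time $c$ gives. The function names are hypothetical.

```python
import math


def weibull_loglik(t, censored, r, mu):
    """Log-likelihood contribution of a single time t.

    Assumed parametrization: f(t) = r * mu * t**(r - 1) * exp(-mu * t**r).
    """
    if censored:
        # Right-censored at t: we only know the true time exceeds t,
        # so the contribution is log S(t) = -mu * t**r.
        return -mu * t ** r
    # Observed recurrence: log of the Weibull density at t.
    return math.log(r) + math.log(mu) + (r - 1.0) * math.log(t) - mu * t ** r


def total_loglik(times, censored_flags, r, mus):
    """Sum the contributions over all observations (one mu per observation)."""
    return sum(
        weibull_loglik(t, c, r, mu)
        for t, c, mu in zip(times, censored_flags, mus)
    )
```

With $r = 1$ the Weibull reduces to an exponential distribution, which gives a quick sanity check: `weibull_loglik(0.5, False, 1.0, 2.0)` equals `math.log(2.0) - 1.0`.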

The shape parameter $r$ has a gamma prior, and $\mu_{i,j}$ is obtained from a log-linear regression on the covariates:
$$ r \sim Gamma(1, 0.0001)$$

$$ log(\mu_{i,j}) = \alpha + \beta_{age}AGE_{i,j} + \beta_{sex}SEX_{i} + \beta_{disease1}DISEASE_{i1} + \beta_{disease2}DISEASE_{i2} + \beta_{disease3}DISEASE_{i3} + b_i $$  
where $SEX_i$ is a 2-level factor and the $DISEASE_{ik}$ are dummy variables representing the 4-level factor for underlying disease. The $\beta$ coefficients capture the effect of each risk factor; they are the quantities of interest in the results.
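The regression above can be sketched as a small function. This is only an illustration: the function name, the argument layout, and the numeric values in the usage note are hypothetical and are not estimates from this study.

```python
import math


def log_mu(alpha, beta_age, beta_sex, beta_disease, age, sex, disease, b_i):
    """Linear predictor log(mu_ij) for one observation.

    beta_disease: the three disease coefficients.
    disease: the three 0/1 dummy variables of the 4-level disease factor
             (all zeros for the reference level).
    b_i: random effect of the patient.
    """
    disease_term = sum(coef * dummy for coef, dummy in zip(beta_disease, disease))
    return alpha + beta_age * age + beta_sex * sex + disease_term + b_i


def mu(*args):
    """mu_ij itself, obtained by exponentiating the linear predictor."""
    return math.exp(log_mu(*args))
```

For instance, with made-up values, `log_mu(-4.0, 0.01, -2.0, (0.1, 0.6, -1.2), 50, 1, (0, 1, 0), 0.3)` adds up the intercept, the age and sex terms, the second disease coefficient, and the patient effect.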

To account for differences between patients, a latent variable $b_i$ is added to the regression.
$b_i$ is the random effect of patient $i$, with:
$$ b_i \sim Normal(0, \tau) $$

All the regression parameters are given non-informative priors:
$$ \alpha \sim Normal(0, 0.0001)$$

$$ \beta_{diseasek}, \beta_{age}, \beta_{sex} \sim Normal(0, 0.0001) $$ 
for $k \in \{1, 2, 3\}$.
(The normal distributions are parametrized by their precision, not their variance, so a precision of $0.0001$ corresponds to a variance of $10^4$.)

Finally, the precision $\tau$ of the random effects has a gamma prior:
$$ \tau \sim Gamma(0.0001, 0.0001)$$

This gives the following directed acyclic graph :

<img src="Docs/DAG.png" data-canonical-src="Docs/DAG.png" width="600" height="400" />

#### Figure 2: DAG of our model

# Gibbs sampler and Metropolis-Hastings

We use a Gibbs sampler to draw from the posterior distribution of the model parameters.

To do this, we first determine the full conditional distribution used to update each variable:

#### Conditional distributions

- Distribution of $\tau$:

$\pi\left(\tau \mid r, \alpha,  \beta_{age}, \beta_{disease1,2,3}, \beta_{sex}, \textbf{b}\right) \propto \pi(\tau) \cdot \prod_{i=1}^{N} \pi\left(b_{i}\mid \tau\right)$

with $\tau \sim Gamma(a, b)$ and $(b_i\mid \tau) \sim Normal(0, \tau)$, where $N$ is the number of patients.

The gamma prior on $\tau$ is conjugate to the normal distribution of the $b_i$, so the conditional distribution has a closed form:  
$\tau \mid r, \alpha,  \beta_{age}, \beta_{disease1,2,3}, \beta_{sex}, \textbf{b} \sim Gamma(a^*, b^*)$, with $a^* = a + \frac{N}{2}$ and $b^* = b + \frac{1}{2} \cdot \sum_{i=1}^{N} b_i^2$

- Distribution of $\alpha$:

$\pi\left(\alpha \mid r, \beta_{age}, \beta_{disease1,2,3}, \beta_{sex}, \textbf{b}\right) \propto \pi(\alpha) \cdot \prod_{i,j} \pi(t_{i,j} \mid r, \alpha,  \beta_{age}, \beta_{disease1,2,3}, \beta_{sex}, b_i)$ 

Here $t_{i,j} \sim Weibull(r, \mu_{i,j})$ if $t_{i,j}$ is not a censored observation; if it is censored, it follows a Weibull distribution truncated below at the censoring time. $\mu_{i,j}$ is computed as described above.

The prior is $\alpha \sim Normal(0, 0.0001)$.

- Distributions of $\beta_{age}, \beta_{disease1,2,3}, \beta_{sex}$:

Each has the same form as for $\alpha$, with its own $Normal(0, 0.0001)$ prior in place of $\pi(\alpha)$.

- Distribution of $r$:

$\pi\left(r \mid \alpha, \beta_{age}, \beta_{disease1,2,3}, \beta_{sex}, \textbf{b}\right) \propto \pi(r) \cdot \prod_{i,j} \pi(t_{i,j} \mid r, \alpha,  \beta_{age}, \beta_{disease1,2,3}, \beta_{sex}, b_i)$ 

With $r \sim Gamma(1, 0.0001)$.

- Distribution of $b_i$:

$\pi\left(b_i \mid \tau, \alpha, \beta_{age}, \beta_{disease1,2,3}, \beta_{sex}, r \right) \propto \pi(b_i \mid \tau) \cdot \prod_{j} \pi(t_{i,j} \mid r, \alpha,  \beta_{age}, \beta_{disease1,2,3}, \beta_{sex}, b_i)$

With $b_i \mid \tau \sim Normal(0, \tau).$
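Of these conditionals, only the one for $\tau$ has a closed form: multiplying the $Gamma(a, b)$ prior by the $N$ normal densities gives a kernel proportional to $\tau^{a + N/2 - 1} e^{-\tau (b + \frac{1}{2}\sum_i b_i^2)}$. A minimal sketch of the corresponding Gibbs draw follows, assuming $Gamma(a, b)$ uses the shape and rate convention (which the update $b^* = b + \frac{1}{2}\sum_i b_i^2$ implies). The function name `sample_tau` is hypothetical.

```python
import random


def sample_tau(b, a=0.0001, rate=0.0001, rng=random):
    """Draw tau from its full conditional Gamma(a + N/2, rate + sum(b_i^2)/2).

    b: current values of the random effects b_1, ..., b_N.
    a, rate: shape and rate of the gamma prior on tau.
    rng: the random module or a random.Random instance.
    """
    shape = a + len(b) / 2.0
    post_rate = rate + 0.5 * sum(x * x for x in b)
    # random.gammavariate expects (shape, scale), so pass scale = 1 / rate.
    return rng.gammavariate(shape, 1.0 / post_rate)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible.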

Since the conditional distributions of $b_i$, $r$, $\alpha$, $\beta_{age}, \beta_{disease1,2,3}, \beta_{sex}$ have no convenient closed form, we use Metropolis-Hastings steps to sample from them.
For these parameters, we use a random-walk proposal: the proposal kernel is a centred normal distribution whose variance we tune.
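A single random-walk update of one scalar parameter can be sketched as follows. This is a generic illustration rather than the exact code used for this report: `rw_metropolis_step` and its arguments are hypothetical names, and `log_target` stands for the unnormalized log conditional density of the parameter being updated (log prior plus log-likelihood).

```python
import math
import random


def rw_metropolis_step(x, log_target, step_sd, rng=random):
    """One random-walk Metropolis update of a scalar parameter.

    x: current value of the parameter.
    log_target: function returning the unnormalized log conditional density.
    step_sd: standard deviation of the centred normal proposal.
    Returns (new_value, accepted).
    """
    proposal = x + rng.gauss(0.0, step_sd)
    log_ratio = log_target(proposal) - log_target(x)
    # The normal proposal is symmetric, so the Hastings correction cancels
    # and the acceptance probability is min(1, target ratio).
    if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
        return proposal, True
    return x, False
```

For a parameter restricted to positive values, such as $r$, `log_target` can return `-math.inf` outside the support so that such proposals are always rejected. The proposal standard deviation is typically tuned by monitoring the acceptance rate; a common heuristic is to aim for a moderate rate, neither close to 0 nor close to 1.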


# Results