---
title: "Tutorial for Cumulative Link Mixed Models (CLMMs) in R with the package 'ordinal'"
author: "Christophe Bousquet"
date: "25/02/2021"
output:
  html_document: default
  pdf_document: default
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Introduction

Cumulative Link Mixed Models (CLMMs) make it possible to analyse ordinal response variables while allowing the use of random effects.

Ordinal responses are very common, most often in the form of Likert scales (e.g. 1. strongly disagree, 2. disagree, 3. neither agree nor disagree, 4. agree, 5. strongly agree) or preference ranks (e.g. 1. apple, 2. pear, 3. melon).

By allowing the use of random effects, CLMMs enable the inclusion of data collected repeatedly on several subjects or at several study sites.
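
As a brief formal sketch (the notation is introduced here for illustration and assumes the parameterisation used by the `ordinal` package with its default logit link), a CLMM with a single random intercept models the cumulative probability that observation $i$ of subject $j$ falls in category $k$ or below:

$$
\mathrm{logit}\left(P(Y_{ij} \le k)\right) = \theta_k - x_{ij}^{\top}\beta - u_j,
\qquad u_j \sim N(0, \sigma_u^2),
\qquad k = 1, \dots, K - 1
$$

Here $\theta_1 < \dots < \theta_{K-1}$ are the thresholds separating the $K$ ordered categories, $\beta$ contains the fixed effects and $u_j$ is the random intercept of subject $j$. Because of the minus sign, a positive coefficient shifts probability towards the higher categories.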

The example presented here takes the data published by [Bousquet et al. in Behaviour](https://brill.com/view/journals/beh/154/4/article-p467_5.xml) (2017) and shows how to model the effects of different parameters on the order of individuals during collective movements of female mallards.

Briefly, this study sought to determine the factors favouring leadership among those already identified in the literature: energy needs, social relations and personality.
First, mallards learned to find food in different compartments at the end of a maze. Each informed duck therefore had a preferred direction in the maze. The average time each duck took to cross the maze during its solitary training was also recorded.
Once this learning phase was over, groups were formed in which mallards held different information in different proportions. The order in which the mallards entered the maze was recorded, as well as the order in which they arrived at its end. For the order of arrival, we excluded the cases in which the groups split up (72 out of 471).

The retained variables for all experiments are:

- Scaled Mass Index of each bird [high values indicate higher needs]
- Average exploration score of each bird in the personality tests (3 novel environment tests and 3 novel object tests) [high values indicate more exploratory birds]
- Frequency of being the nearest neighbour of another bird [high values indicate social birds]
- Average crossing time of the maze during the learning phase [high values indicate slower birds]
- Number of passages in the maze, to control for maze exposure

From the second set on, behaviours in the pre-departure area (where the mallards were grouped together for 5 minutes before the opening of the sliding door into the maze) were recorded:

- Mobility of each bird during the pre-departure period [high values indicate more mobile birds]
- Surface explored by each bird [high values indicate that the bird explored a higher portion of the pre-departure area]
- Number of vocalizations [high values indicate more vocal birds]
- Time spent foraging
- Number of attacks emitted towards other birds
- Number of attacks received from other birds
- Distance to the door

Other variables are included in the various sets of the experiment, when appropriate.

As all individuals are females that hatched on the same day, sex and age are not taken into account.

As we are interested in the relative effects of these variables in groups of birds, we first compute for each experimental group the difference between each bird's values and the average values for the corresponding group.

In order to improve the interpretability of the estimates, we scale all numeric variables before adding them to the model.
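
The two preparation steps described above (centring on the group average, then scaling) can be sketched on toy data. The data frame `toy`, its columns and its values are hypothetical; only base R is used here, whereas the tutorial itself relies on `dplyr`.

```r
# Toy data: hypothetical masses for two groups of three birds
toy <- data.frame(
  group = factor(c("g1", "g1", "g1", "g2", "g2", "g2")),
  mass  = c(900, 1000, 1100, 1200, 1250, 1300)
)

# Step 1: difference between each bird and the mean of its own group
toy$mass_rel <- toy$mass - ave(toy$mass, toy$group, FUN = mean)

# Step 2: scale to mean 0 and standard deviation 1.
# scale() returns a one-column matrix; as.numeric() turns it into a plain vector.
toy$mass_rel_s <- as.numeric(scale(toy$mass_rel))

toy
```

After the first step, the values sum to zero within each group, so every bird is described relative to its own group mates.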

# A - Setup
## 1. Load the necessary packages to run the script
(Run `install.packages("package")` first if a package is not yet installed on your machine.)

```{r libraries, message = FALSE}
library(dplyr)
library(tidyr)
library(ordinal)
library(lme4)
library(ggplot2)
library(car)
library(MuMIn)
```

## 2. Import the data

Load the files `Leadership_RawResults.csv` and `Leadership_IndividualCharacteristics.csv`.
```{r datasets}
Leadership <- read.csv("Leadership_RawResults.csv", header = TRUE)
Ind_Charac <- read.csv("Leadership_IndividualCharacteristics.csv", header = TRUE)
```

## 3. Join the two datasets
```{r joinLeaCha}
Leadership <- Leadership %>%
  left_join(Ind_Charac, by = "Individual")
```

## 4. Specify which variables are factors

Treating `OrderEntrance` and `OrderEnd` as ordered factors is essential to run CLMMs: the response variable has to be an (ordered) factor.

```{r factors}
Leadership$OrderEntrance <- factor(Leadership$OrderEntrance, ordered = TRUE)
Leadership$OrderEntrance4 <- factor(Leadership$OrderEntrance4, ordered = TRUE)
Leadership$OrderEnd <- factor(Leadership$OrderEnd, ordered = TRUE)
Leadership$OrderEnd4 <- factor(Leadership$OrderEnd4, ordered = TRUE)
Leadership$Individual <- factor(Leadership$Individual)
Leadership$Group <- factor(Leadership$Group)
Leadership$OrderAll <- factor(Leadership$OrderAll)
```

## 5. Form subsets for each experimental set

The analysis of OrderEntrance takes into account all events (as all individuals always entered the maze), but the analysis of OrderEnd takes into account only events in which the group did not split up.

In order to produce predictions on new data, it is important to limit the levels of `OrderEntrance` and `OrderEnd` to the levels that actually occur in each set, hence the use of the `droplevels()` function.
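
A minimal illustration of why `droplevels()` matters, using hypothetical ranks pooled over pairs (ranks 1 and 2) and trios (ranks 1 to 3). The object names and values are made up for this sketch.

```r
# Hypothetical ranks: four observations from pairs, three from a trio
rank_all <- factor(c(1, 2, 1, 2, 1, 2, 3), ordered = TRUE)
set_id   <- c(1, 1, 1, 1, 2, 2, 2)

# Subsetting keeps all original levels, including the unused rank 3
rank_set1 <- rank_all[set_id == 1]
levels(rank_set1)

# droplevels() removes the unused level and keeps the factor ordered
rank_set1 <- droplevels(rank_set1)
levels(rank_set1)
```

Without this step, a model fitted on the pairs would still carry a third rank that is never observed in that set.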

```{r subsets}
Set1 <- Leadership %>%
  filter(Set == 1)
Set1$OrderEntrance <- droplevels(Set1$OrderEntrance)
Set1$OrderEnd <- droplevels(Set1$OrderEnd)

Set1_nf <- Leadership %>%
  filter(Set == 1)
Set1_nf$OrderEntrance <- droplevels(Set1_nf$OrderEntrance)
Set1_nf$OrderEnd <- droplevels(Set1_nf$OrderEnd)


Set2 <- Leadership %>%
  filter(Set == 2)
Set2$OrderEntrance <- droplevels(Set2$OrderEntrance)
Set2$OrderEnd <- droplevels(Set2$OrderEnd)

Set2_nf <- Leadership %>%
  filter(Set == 2)
Set2_nf$OrderEntrance <- droplevels(Set2_nf$OrderEntrance)
Set2_nf$OrderEnd <- droplevels(Set2_nf$OrderEnd)


Set3 <- Leadership %>%
  filter(Set == 3)
Set3$OrderEntrance <- droplevels(Set3$OrderEntrance)
Set3$OrderEnd <- droplevels(Set3$OrderEnd)

Set3_nf <- Leadership %>%
  filter(Set == 3)
Set3_nf$OrderEntrance <- droplevels(Set3_nf$OrderEntrance)
Set3_nf$OrderEnd <- droplevels(Set3_nf$OrderEnd)
```

# B - Analysis of leadership using CLMMs in Set1

This set focuses on pairs of female mallards with identical information and rewards.

## 1. Create group-level variables (differences between individuals and group average)

```{r Set1GroupLevel}
Set1 <- Set1 %>%
  group_by(OrderAll) %>%
  mutate(SMI_av = mean(SMI),
         Exp_av = mean(Exploration),
         LTi_av = mean(LearningTime),
         FNN_av = mean(FreqNN),
         NPa_av = mean(NumPas),
         SMI_rel = SMI - SMI_av,
         Exp_rel = Exploration - Exp_av,
         LTi_rel = LearningTime - LTi_av,
         FNN_rel = FreqNN - FNN_av,
         NPa_rel = NumPas - NPa_av)

# scale numerical variables
# scale() returns a one-column matrix; as.numeric() converts it to a plain numeric vector
Set1$SMI_rel_s <- as.numeric(scale(Set1$SMI_rel))
Set1$Exp_rel_s <- as.numeric(scale(Set1$Exp_rel))
Set1$LTi_rel_s <- as.numeric(scale(Set1$LTi_rel))
Set1$FNN_rel_s <- as.numeric(scale(Set1$FNN_rel))
Set1$NPa_rel_s <- as.numeric(scale(Set1$NPa_rel))


Set1_nf <- Set1_nf %>%
  group_by(OrderAll) %>%
  mutate(SMI_av = mean(SMI),
         Exp_av = mean(Exploration),
         LTi_av = mean(LearningTime),
         FNN_av = mean(FreqNN),
         NPa_av = mean(NumPas),
         SMI_rel = SMI - SMI_av,
         Exp_rel = Exploration - Exp_av,
         LTi_rel = LearningTime - LTi_av,
         FNN_rel = FreqNN - FNN_av,
         NPa_rel = NumPas - NPa_av)

# scale numerical variables
# scale() returns a one-column matrix; as.numeric() converts it to a plain numeric vector
Set1_nf$SMI_rel_s <- as.numeric(scale(Set1_nf$SMI_rel))
Set1_nf$Exp_rel_s <- as.numeric(scale(Set1_nf$Exp_rel))
Set1_nf$LTi_rel_s <- as.numeric(scale(Set1_nf$LTi_rel))
Set1_nf$FNN_rel_s <- as.numeric(scale(Set1_nf$FNN_rel))
Set1_nf$NPa_rel_s <- as.numeric(scale(Set1_nf$NPa_rel))
```

## 2. Analysis of the entrance order in the maze

### 2a. Modelling
First, we create a null model containing only the random effects: the identifier of the experimental group (`OrderAll`) and the identifier of each individual (`Individual`).
```{r CLMMSet1EntranceNull}
CLMM_Entrance_S1_Null <- clmm(OrderEntrance ~ 1 +
                            (1|OrderAll) + (1|Individual), data = Set1)
summary(CLMM_Entrance_S1_Null)
```
We can see that the random effect of the experimental group (OrderAll) does not explain any variance, but we keep it to respect the experimental protocol.

Next, we fit the full model with all relevant scaled variables for Set1.

```{r CLMMSet1EntranceFull}
CLMM_Entrance_S1_Full <- clmm(OrderEntrance ~ NPa_rel_s + SMI_rel_s + Exp_rel_s + LTi_rel_s + FNN_rel_s +
                               (1|OrderAll) + (1|Individual), data = Set1)
summary(CLMM_Entrance_S1_Full)
```

By backward elimination of non-significant terms (one term is dropped per step and the two nested models are compared with `anova()`), we arrive at a minimal model.
```{r CLMMSet1EntranceMin}
CLMM_Entrance_S1_Min_temp <- clmm(OrderEntrance ~ NPa_rel_s + SMI_rel_s + Exp_rel_s + LTi_rel_s +
                               (1|OrderAll) + (1|Individual), data = Set1)
anova(CLMM_Entrance_S1_Full, CLMM_Entrance_S1_Min_temp)

CLMM_Entrance_S1_Min_temp2 <- clmm(OrderEntrance ~ NPa_rel_s + SMI_rel_s + LTi_rel_s +
                               (1|OrderAll) + (1|Individual), data = Set1)
anova(CLMM_Entrance_S1_Min_temp, CLMM_Entrance_S1_Min_temp2)

CLMM_Entrance_S1_Min <- clmm(OrderEntrance ~ SMI_rel_s + LTi_rel_s +
                               (1|OrderAll) + (1|Individual), data = Set1)
anova(CLMM_Entrance_S1_Min_temp2, CLMM_Entrance_S1_Min)
summary(CLMM_Entrance_S1_Min)
```
Both scaled mass index and crossing time during training have a significant effect on the probability of entering the maze first. A higher SMI than the group average decreases the probability of being first, and a longer crossing time during training also decreases it.
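
The comparison of two nested models boils down to a likelihood-ratio test. The sketch below reproduces that computation by hand on simulated data with base R's `glm()`, used here only as a stand-in for `clmm()`. The data, variable names and effect sizes are hypothetical.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)                          # x2 has no true effect
y  <- rbinom(n, 1, plogis(0.8 * x1))

fit_full    <- glm(y ~ x1 + x2, family = binomial)
fit_reduced <- glm(y ~ x1, family = binomial)

# Likelihood-ratio statistic: twice the difference in log-likelihoods
lr_stat <- as.numeric(2 * (logLik(fit_full) - logLik(fit_reduced)))
# Degrees of freedom: difference in the number of estimated parameters
df_diff <- attr(logLik(fit_full), "df") - attr(logLik(fit_reduced), "df")
# p-value from the chi-square distribution
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)

c(LR = lr_stat, df = df_diff, p = p_value)
```

A large p-value indicates that the dropped term does not improve the fit noticeably, which is the criterion used to simplify the model step by step.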

### 2b. Visualization of the effects

To visualize the effects, we cannot use a model fitted with `clmm()` (at the time of writing); we have to refit it with `clmm2()`, which allows only one random effect to be specified and has a slightly different syntax. Here we keep `Individual` as the random effect, since `OrderAll` explained no variance in the null model. We have to specify `Hess = TRUE` to get a summary.
```{r CLMMSet1EntranceMinclmm2}
CLMM2_Entrance_S1_min <- clmm2(OrderEntrance ~ SMI_rel_s + LTi_rel_s,
                               random = Individual, Hess = TRUE, data = Set1)
summary(CLMM2_Entrance_S1_min)
```
We can now predict values on new data, which will cover the whole parameter space of SMI_rel_s and LTi_rel_s.
```{r CLMMSet1EntranceNewData}
newdata_Set1_Entrance <- expand.grid(SMI_rel = seq(from = -100,
                                                   to = 100,
                                                   by = 10),
                                     LTi_rel = seq(from = -80,
                                                   to = 80,
                                                   by = 10))

# the next two lines convert the new data to the scaled units used in the model
newdata_Set1_Entrance$SMI_rel_s <- (newdata_Set1_Entrance$SMI_rel - mean(Set1$SMI_rel)) / sd(Set1$SMI_rel)
newdata_Set1_Entrance$LTi_rel_s <- (newdata_Set1_Entrance$LTi_rel - mean(Set1$LTi_rel)) / sd(Set1$LTi_rel)
```
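
The conversion between raw and scaled units can be checked on a small example. The vector `raw` is a hypothetical stand-in for a centred variable such as `SMI_rel`; the point is that a constant step in raw units becomes a constant step of `step / sd` in scaled units, whatever the mean.

```r
# Hypothetical variable standing in for a column such as Set1$SMI_rel
raw <- c(-60, -20, 5, 30, 50)
m   <- mean(raw)
s   <- sd(raw)

# Grid in raw units and its equivalent in scaled units
grid_raw    <- seq(from = -100, to = 100, by = 10)
grid_scaled <- (grid_raw - m) / s

# A step of 10 raw units corresponds to a step of 10 / s scaled units
step_scaled <- diff(grid_scaled)
all.equal(step_scaled, rep(10 / s, length(step_scaled)))

# Back-transformation recovers the raw grid
all.equal(grid_scaled * s + m, grid_raw)
```

Keeping both versions of each variable in the new data makes it possible to predict on the scaled units while labelling the plot axes in raw units.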

Now, we predict the values for the new data from the fitted model. The `sapply()` call is needed to get the predicted probability of each of the two outcomes (first or second in the pair order).
```{r Set1PredictedValues}
predict_Set1_Entrance <- sapply(as.character(1:2),
                       function(x) {
                         newdata1 = expand.grid(SMI_rel_s = seq(from = min(newdata_Set1_Entrance$SMI_rel_s),
                                                                to = max(newdata_Set1_Entrance$SMI_rel_s),
                                                                by =  max(newdata_Set1_Entrance$SMI_rel_s) - max(newdata_Set1_Entrance[newdata_Set1_Entrance$SMI_rel_s != max(newdata_Set1_Entrance$SMI_rel_s), ]$SMI_rel_s)),
                                                LTi_rel_s = seq(from = min(newdata_Set1_Entrance$LTi_rel_s),
                                                                to = max(newdata_Set1_Entrance$LTi_rel_s),
                                                                by =  max(newdata_Set1_Entrance$LTi_rel_s) - max(newdata_Set1_Entrance[newdata_Set1_Entrance$LTi_rel_s != max(newdata_Set1_Entrance$LTi_rel_s), ]$LTi_rel_s)),
                                                OrderEntrance = factor(x, levels = levels(Set1$OrderEntrance)))
                         predict(CLMM2_Entrance_S1_min, newdata = newdata1) })

# bind together the new data and the predicted values
predict_Set1_Entrance <- cbind(newdata_Set1_Entrance, predict_Set1_Entrance)

# reshape the data frame to long format
predict_Set1_Entrance <- predict_Set1_Entrance %>%
  gather("1", "2", key = "Rank", value = "prob")
```
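
To see where the probability of each rank comes from, the sketch below computes category probabilities by hand from cumulative logits. The thresholds and the slope are hypothetical values chosen for illustration (they are not estimates from the models above), and the random effect is set to zero.

```r
# Hypothetical values, chosen only for illustration
theta <- c(-0.5, 0.9)   # two thresholds, hence three ranks
beta  <- 0.7            # slope of one scaled predictor
x     <- c(-1, 0, 1)    # predictor values

eta <- beta * x

# Cumulative probabilities P(rank <= k); the last one is 1 by definition
cum <- cbind(plogis(theta[1] - eta), plogis(theta[2] - eta), 1)

# Category probabilities are successive differences of the cumulative ones
prob <- cbind(cum[, 1], cum[, 2] - cum[, 1], cum[, 3] - cum[, 2])
colnames(prob) <- c("1", "2", "3")

cbind(x, prob)
rowSums(prob)           # each row sums to 1
```

With a positive slope, the probability of rank 1 decreases as the predictor increases, while the probability of the last rank increases.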

As two numerical variables influence the entrance order, it is easier to visualise the respective effects if we extract specific values (min, 0 and max) of one of the two variables. Then, we can plot the effects by using `ggplot2`.
```{r Set1EntrancePlot}
predict_Set1_Entrancebis <- predict_Set1_Entrance %>%
  filter(LTi_rel == min(LTi_rel) |
           LTi_rel == 0 |
           LTi_rel == max(LTi_rel))

LTi_rel.labs <- c("-80 s\nin crossing time", "no difference\nin crossing time ", "+80 s\nin crossing time")
names(LTi_rel.labs) <- c("-80", "0", "80") 

predict_Set1_Entrancebis %>%
  ggplot(aes(x = SMI_rel, y = prob, colour = Rank)) +
  geom_point() +
  geom_line() +
  facet_grid(~ LTi_rel, labeller = labeller(LTi_rel = LTi_rel.labs)) +
  labs(x = "Scaled Mass Index relative to the average of the group (in grams)",
       y = "Probability to attain a specific rank") +
  theme_gray(base_size = 15)
```

By plotting the random effect of the identity of the mallards, we can verify that the 95% confidence interval of every individual contains 0: not a single female mallard behaved very differently from the others.

```{r Set1EntranceRandomEffect}
ci_Set1_Entrance <- CLMM2_Entrance_S1_min$ranef + qnorm(0.975) * sqrt(CLMM2_Entrance_S1_min$condVar) %o% c(-1, 1)
ord.re_Set1_Entrance <- order(CLMM2_Entrance_S1_min$ranef)
ci_Set1_Entrance <- ci_Set1_Entrance[order(CLMM2_Entrance_S1_min$ranef), ]
plot(1:10, CLMM2_Entrance_S1_min$ranef[ord.re_Set1_Entrance], axes = FALSE, ylim = range(ci_Set1_Entrance),
     xlab = "Individual", ylab = "Individual effect")
axis(1, at = 1:10, labels = ord.re_Set1_Entrance)
axis(2)
for(i in 1:10) segments(i, ci_Set1_Entrance[i, 1], i, ci_Set1_Entrance[i, 2])
abline(h = 0, lty = 2)
```
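
The interval computation used in the plot relies on the outer product operator `%o%`. The sketch below applies the same expression to hypothetical conditional modes and conditional variances for four individuals, so that the shape of the result is easier to see.

```r
# Hypothetical conditional modes and conditional variances for four individuals
ranef_toy   <- c(-0.40, 0.10, 0.55, -0.05)
condvar_toy <- c(0.09, 0.16, 0.04, 0.25)

# The outer product with c(-1, 1) yields one column per bound
ci_toy <- ranef_toy + qnorm(0.975) * sqrt(condvar_toy) %o% c(-1, 1)
colnames(ci_toy) <- c("lower", "upper")

# TRUE where the interval contains 0
contains_zero <- ci_toy[, "lower"] < 0 & ci_toy[, "upper"] > 0
cbind(ci_toy, contains_zero)
```

An individual whose interval excludes 0 (the third one in this toy example) would stand out from the rest of the population.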

### 2c. Side note on the equivalence between logistic regression and CLMMs when the ordinal variable has only two levels

When the ordinal variable has only two levels, the cumulative link approach is equivalent to a logistic regression. To run a mixed-effects logistic regression, we will use the `glmer()` function from the R package `lme4` and follow the same steps.
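
The equivalence can be written out explicitly. The notation is introduced here for illustration and assumes the logit link and the parameterisation of the `ordinal` package: $\theta_1$ is the single threshold, $\beta$ the fixed effects and $u_j$ the random intercept of individual $j$. With two ranks, being first is the same event as having a rank of at most 1:

$$
\mathrm{logit}\left(P(Y_{ij} = 1)\right)
= \mathrm{logit}\left(P(Y_{ij} \le 1)\right)
= \theta_1 - x_{ij}^{\top}\beta - u_j
$$

A logistic regression for the probability of being first, written as $\mathrm{logit}\left(P(Y_{ij} = 1)\right) = \alpha + x_{ij}^{\top}\gamma + v_j$, therefore corresponds to $\alpha = \theta_1$ and $\gamma = -\beta$: the coefficients of the two approaches are expected to have the same magnitude but opposite signs.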

First, we compute a null model.
```{r GLMERSet1EntranceNull}
GLMER_Entrance_S1_Null <- glmer(OrderEntrance ~ 1 +
                              (1|OrderAll) + (1|Individual), family = "binomial", data = Set1)
summary(GLMER_Entrance_S1_Null)
```
Again, the `OrderAll` random effect explains no variance, but we keep it to respect the experimental protocol.

Then, we compute the full model. In order to analyze the probability of being the leader, we need to recode the response variable: the binary response used in the logistic regression takes the value 1 when the individual was first and 0 when it was second. That is what `ifelse(OrderEntrance == 1, 1, 0)` does. The argument `na.action = na.fail` is only necessary because we will later use the `dredge()` function from the R package `MuMIn`.
```{r GLMERSet1EntranceFull}
GLMER_Entrance_S1_Full <- glmer(ifelse(OrderEntrance == 1, 1, 0) ~ NPa_rel_s + SMI_rel_s + Exp_rel_s + LTi_rel_s + FNN_rel_s +
                                  (1|OrderAll) + (1|Individual), family = "binomial", data = Set1, na.action = na.fail)
summary(GLMER_Entrance_S1_Full)
```
By using the `glmer()` framework, we can apply the `vif()` function from the R package `car` in order to assess the Variance Inflation Factor (VIF) associated with each predictor.
```{r GLMERvif}
vif(GLMER_Entrance_S1_Full)
```
None of the VIFs is above the threshold, but some are close to it, potentially indicating collinearity issues.
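
For intuition, the textbook definition of the VIF of a predictor is $1 / (1 - R^2)$, where $R^2$ comes from regressing that predictor on all the others. The sketch below computes it by hand on simulated predictors (all names and values are hypothetical). `car::vif()` applied to a fitted mixed model works from the fitted model itself, so its values may differ from this by-hand version.

```r
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n, sd = 0.5)   # deliberately correlated with x1
x3 <- rnorm(n)                        # independent of the others
preds <- data.frame(x1, x2, x3)

# Regress each predictor on the remaining ones and convert R^2 into a VIF
vif_by_hand <- sapply(names(preds), function(v) {
  others <- setdiff(names(preds), v)
  r2 <- summary(lm(reformulate(others, response = v), data = preds))$r.squared
  1 / (1 - r2)
})

round(vif_by_hand, 2)
```

The correlated predictors `x1` and `x2` get inflated values, whereas the independent predictor `x3` stays close to 1.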

The `glmer()` framework also allows the use of the `dredge()` function, which reports the best candidate models (we only print the models with delta < 2).
```{r GLMERdredge, message = FALSE, warning = FALSE}
GLMER_Entrance_S1_dredge <- dredge(GLMER_Entrance_S1_Full)
subset(GLMER_Entrance_S1_dredge, delta < 2)
```
The best model is again the one containing only the scaled mass index and the average learning time, even though the model that also includes the number of passages is very close to it.
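
The idea behind ranking candidate models by their delta can be sketched with base R. The example uses simulated data, `glm()` instead of `glmer()`, and plain AIC (by default, `dredge()` ranks models by AICc, so this is an analogue rather than a reproduction). All names and values are hypothetical.

```r
set.seed(7)
n  <- 150
x1 <- rnorm(n)
x2 <- rnorm(n)                          # x2 has no true effect
y  <- rbinom(n, 1, plogis(1.0 * x1))
toy_dat <- data.frame(y, x1, x2)

# All candidate models built from the two predictors
candidates <- list(
  null = y ~ 1,
  x1   = y ~ x1,
  x2   = y ~ x2,
  both = y ~ x1 + x2
)

# Information criterion of each candidate and its distance to the best one
aic <- sapply(candidates, function(f) AIC(glm(f, family = binomial, data = toy_dat)))
tab <- data.frame(model = names(aic), AIC = aic, delta = aic - min(aic))
tab <- tab[order(tab$delta), ]

# Models within 2 units of the best candidate
tab[tab$delta < 2, ]
```

Models with a delta below 2 are usually regarded as having similar support, which is why a slightly larger model can appear next to the best one.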

The minimal model is the following:
```{r GLMERSet1EntranceMin}
GLMER_Entrance_S1_min <- glmer(ifelse(OrderEntrance == 1, 1, 0) ~ SMI_rel_s + LTi_rel_s +
                                  (1|OrderAll) + (1|Individual), family = "binomial", data = Set1)
summary(GLMER_Entrance_S1_min)
```

In this model, the VIFs of both variables are much more acceptable.
```{r GLMERSet1EntranceMinvif}
vif(GLMER_Entrance_S1_min)
```
We can now plot the predictions of the model on new data, but this time only the probability of being the leader is plotted (because we ran a logistic regression).
```{r Set1EntrancePlotLogReg}
predict_Set1_Entrance_glmer <- cbind(newdata_Set1_Entrance,
                                     prob = predict(GLMER_Entrance_S1_min, newdata = newdata_Set1_Entrance,
                                                    type = "response", re.form = NA))

predict_Set1_Entrance_glmer_bis <- predict_Set1_Entrance_glmer %>%
  filter(LTi_rel == min(LTi_rel) |
           LTi_rel == 0 |
           LTi_rel == max(LTi_rel))

LTi_rel.labs <- c("-80 s\nin crossing time", "no difference\nin crossing time ", "+80 s\nin crossing time")
names(LTi_rel.labs) <- c("-80", "0", "80")

predict_Set1_Entrance_glmer_bis %>%
  ggplot(aes(x = SMI_rel, y = prob)) +
  geom_point() +
  geom_line() +
  facet_grid(~ LTi_rel, labeller = labeller(LTi_rel = LTi_rel.labs)) +
  labs(x = "Scaled Mass Index relative to the average of the group in grams",
       y = "Probability to be the leader") +
  theme_gray(base_size = 15)
```
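
With `re.form = NA`, `predict()` ignores the random effects and returns population-level predictions. The sketch below shows the underlying arithmetic with hypothetical fixed-effect values on the logit scale (they are not taken from the fitted model): the linear predictor is computed from the design matrix and passed through the inverse logit.

```r
# Hypothetical fixed-effect estimates on the logit scale
beta_toy <- c("(Intercept)" = 0.1, SMI_rel_s = -0.6, LTi_rel_s = -0.9)

# Small grid of scaled predictor values
new_toy <- expand.grid(SMI_rel_s = c(-1, 0, 1), LTi_rel_s = c(-1, 0, 1))
X <- model.matrix(~ SMI_rel_s + LTi_rel_s, data = new_toy)

# Linear predictor with the random effects set to zero, then inverse logit
new_toy$prob <- as.numeric(plogis(X %*% beta_toy))
new_toy
```

With negative coefficients, as assumed here, the predicted probability of being the leader decreases when either predictor increases.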

We can also verify that the residuals of the model show no systematic trend along any of the explanatory variables (the confidence band of the smoother always contains 0).
```{r Set1EntrancePlotLogRegresid}
Set1 %>%
  ggplot(aes(x = SMI_rel_s, y = resid(GLMER_Entrance_S1_min))) +
  geom_point() +
  geom_smooth()

Set1 %>%
  ggplot(aes(x = LTi_rel_s, y = resid(GLMER_Entrance_S1_min))) +
  geom_point() +
  geom_smooth()
```


## 3. Analysis of the order at the end of the maze

For this analysis, the same approach is taken, but this time focusing on cases without group fission.
```{r CLMMSet1EndCode, message = FALSE, warning = FALSE}
# Null CLMM
CLMM_End_S1_Null <- clmm(OrderEnd ~ 1 +
                           (1|OrderAll) + (1|Individual), data = Set1_nf)
summary(CLMM_End_S1_Null)

# Full CLMM
CLMM_End_S1_Full <- clmm(OrderEnd ~ NPa_rel_s + SMI_rel_s + Exp_rel_s + LTi_rel_s + FNN_rel_s +
                           (1|OrderAll) + (1|Individual), data = Set1_nf)
summary(CLMM_End_S1_Full)


CLMM2_End_S1_min <- clmm2(OrderEnd ~ FNN_rel_s,
                           random = Individual, Hess = TRUE, data = Set1_nf)
summary(CLMM2_End_S1_min)

newdata_Set1_End <- expand.grid(FNN_rel = seq(from = -0.036,
                                                   to = 0.036,
                                                   by = 0.001))

# the next line converts the new data to the scaled units used in the model
newdata_Set1_End$FNN_rel_s <- (newdata_Set1_End$FNN_rel - mean(Set1_nf$FNN_rel)) / sd(Set1_nf$FNN_rel)

predict_Set1_End <- sapply(as.character(1:2),
                            function(x) {
                              newdata1 = expand.grid(FNN_rel_s = seq(from = min(newdata_Set1_End$FNN_rel_s),
                                                                     to = max(newdata_Set1_End$FNN_rel_s),
                                                                     by =  max(newdata_Set1_End$FNN_rel_s) - max(newdata_Set1_End[newdata_Set1_End$FNN_rel_s != max(newdata_Set1_End$FNN_rel_s), ]$FNN_rel_s)),
                                                     OrderEnd = factor(x, levels = levels(Set1_nf$OrderEnd)))
                              predict(CLMM2_End_S1_min, newdata = newdata1) })

predict_Set1_End <- cbind(newdata_Set1_End, predict_Set1_End)

predict_Set1_End <- predict_Set1_End %>%
  gather("1", "2", key = "Rank", value = "prob")

predict_Set1_End %>%
  ggplot(aes(x = FNN_rel, y = prob, colour = Rank)) +
  geom_point() +
  geom_line() +
  labs(x = "Frequency of being a nearest neighbour\nrelative to the average of the group (in percent)",
       y = "Probability to attain a specific rank") +
  theme_gray(base_size = 15)

# Now, plot the Individual random effect for the end order of Set1
ci_Set1_End <- CLMM2_End_S1_min$ranef + qnorm(0.975) * sqrt(CLMM2_End_S1_min$condVar) %o% c(-1, 1)
ord.re_Set1_End <- order(CLMM2_End_S1_min$ranef)
ci_Set1_End <- ci_Set1_End[order(CLMM2_End_S1_min$ranef), ]
plot(1:10, CLMM2_End_S1_min$ranef[ord.re_Set1_End], axes = FALSE, ylim = range(ci_Set1_End),
     xlab = "Individual", ylab = "Individual effect")
axis(1, at = 1:10, labels = ord.re_Set1_End)
axis(2)
for(i in 1:10) segments(i, ci_Set1_End[i, 1], i, ci_Set1_End[i, 2])
abline(h = 0, lty = 2)
```


# C - Analysis of leadership using CLMMs in Set2

This set focuses on trios of female mallards with identical rewards but different spatial information on where to find the rewards.

## 1. Analysis of the entrance order in the maze
```{r CLMMSet2Entrance}
Set2 <- Set2 %>%
  group_by(OrderAll) %>%
  mutate(PrD_Mob_av = mean(Mobility),
         PrD_Exp_av = mean(Explored_Surface),
         PrD_Voc_av = mean(PrD_voca),
         PrD_For_av = mean(PrD_foraging),
         PrD_Emi_av = mean(Att_Emitted),
         PrD_Suf_av = mean(Att_Suffered),
         PrD_Dis_av = mean(DistDoor),
         SMI_av = mean(SMI),
         Exp_av = mean(Exploration),
         LTi_av = mean(LearningTime),
         FNN_av = mean(FreqNN),
         NPa_av = mean(NumPas),
         PrD_Mob_rel = Mobility - PrD_Mob_av,
         PrD_Exp_rel = Explored_Surface - PrD_Exp_av,
         PrD_Voc_rel = PrD_voca - PrD_Voc_av,
         PrD_For_rel = PrD_foraging - PrD_For_av,
         PrD_Emi_rel = Att_Emitted - PrD_Emi_av,
         PrD_Suf_rel = Att_Suffered - PrD_Suf_av,
         PrD_Dis_rel = DistDoor - PrD_Dis_av,
         SMI_rel = SMI - SMI_av,
         Exp_rel = Exploration - Exp_av,
         LTi_rel = LearningTime - LTi_av,
         FNN_rel = FreqNN - FNN_av,
         NPa_rel = NumPas - NPa_av)

Set2$PrD_Mob_rel_s <- as.numeric(scale(Set2$PrD_Mob_rel))
Set2$PrD_Exp_rel_s <- as.numeric(scale(Set2$PrD_Exp_rel))         
Set2$PrD_Voc_rel_s <- as.numeric(scale(Set2$PrD_Voc_rel))         
Set2$PrD_For_rel_s <- as.numeric(scale(Set2$PrD_For_rel))         
Set2$PrD_Emi_rel_s <- as.numeric(scale(Set2$PrD_Emi_rel))         
Set2$PrD_Suf_rel_s <- as.numeric(scale(Set2$PrD_Suf_rel))         
Set2$PrD_Dis_rel_s <- as.numeric(scale(Set2$PrD_Dis_rel))         
Set2$SMI_rel_s <- as.numeric(scale(Set2$SMI_rel))         
Set2$Exp_rel_s <- as.numeric(scale(Set2$Exp_rel))         
Set2$LTi_rel_s <- as.numeric(scale(Set2$LTi_rel))         
Set2$FNN_rel_s <- as.numeric(scale(Set2$FNN_rel))         
Set2$NPa_rel_s <- as.numeric(scale(Set2$NPa_rel))         

##Entrance Order
# Null CLMM
CLMM_Entrance_S2_Null <- clmm(OrderEntrance ~ 1 +
                            (1|OrderAll) + (1|Individual), data = Set2)
summary(CLMM_Entrance_S2_Null)

# Full CLMM
CLMM_Entrance_S2_Full <- clmm(OrderEntrance ~ PrD_Mob_rel_s + PrD_Exp_rel_s + PrD_Voc_rel_s + PrD_For_rel_s + PrD_Emi_rel_s +
                                PrD_Suf_rel_s + PrD_Dis_rel_s + SMI_rel_s + Exp_rel_s + LTi_rel_s + FNN_rel_s + NPa_rel_s +
                                Side + MajMin +
                                (1|OrderAll) + (1|Individual), data = Set2)
summary(CLMM_Entrance_S2_Full)

# backward elimination of non-significant terms; only the first step and the minimal model are shown here
CLMM_Entrance_S2_Full_drop <- drop1(CLMM_Entrance_S2_Full)
CLMM_Entrance_S2_Full_drop[CLMM_Entrance_S2_Full_drop$AIC == min(CLMM_Entrance_S2_Full_drop$AIC), ]

CLMM_Entrance_S2_min <- clmm(OrderEntrance ~ PrD_Dis_rel_s + SMI_rel_s + LTi_rel_s +
                                (1|OrderAll) + (1|Individual), data = Set2)
summary(CLMM_Entrance_S2_min)

# in order to obtain predicted values, switch to clmm2()
CLMM2_Entrance_S2_min <- clmm2(OrderEntrance ~ PrD_Dis_rel_s + SMI_rel_s + LTi_rel_s,
                               random = Individual, Hess = TRUE, data = Set2)
summary(CLMM2_Entrance_S2_min)

# New data for Set2
newdata_Set2_Entrance <- expand.grid(PrD_Dis_rel = seq(from = -40,
                                                       to = 70,
                                                       by = 10),
                                     SMI_rel = seq(from = -160,
                                                     to = 220,
                                                     by = 20),
                                     LTi_rel = seq(from = -80,
                                                     to = 90,
                                                     by = 10))

# the next three lines convert the new data to the scaled units used in the model
newdata_Set2_Entrance$PrD_Dis_rel_s <- (newdata_Set2_Entrance$PrD_Dis_rel - mean(Set2$PrD_Dis_rel, na.rm = TRUE)) / sd(Set2$PrD_Dis_rel, na.rm = TRUE)
newdata_Set2_Entrance$SMI_rel_s <- (newdata_Set2_Entrance$SMI_rel - mean(Set2$SMI_rel)) / sd(Set2$SMI_rel)
newdata_Set2_Entrance$LTi_rel_s <- (newdata_Set2_Entrance$LTi_rel - mean(Set2$LTi_rel)) / sd(Set2$LTi_rel)


# Predict values for Set2
predict_Set2_Entrance <- sapply(as.character(1:3),
                       function(x) {
                         newdata1 = expand.grid(PrD_Dis_rel_s = seq(from = min(newdata_Set2_Entrance$PrD_Dis_rel_s),
                                                                    to = max(newdata_Set2_Entrance$PrD_Dis_rel_s),
                                                                    by = 10 / sd(Set2$PrD_Dis_rel, na.rm = TRUE)),  # step of 10 raw units, in scaled units
                                                SMI_rel_s = seq(from = min(newdata_Set2_Entrance$SMI_rel_s),
                                                                    to = max(newdata_Set2_Entrance$SMI_rel_s),
                                                                    by = 20 / sd(Set2$SMI_rel)),  # step of 20 raw units, in scaled units
                                                LTi_rel_s = seq(from = min(newdata_Set2_Entrance$LTi_rel_s),
                                                                    to = max(newdata_Set2_Entrance$LTi_rel_s),
                                                                    by = 10 / sd(Set2$LTi_rel)),  # step of 10 raw units, in scaled units
                                                OrderEntrance = factor(x, levels = levels(Set2$OrderEntrance)))
                         predict(CLMM2_Entrance_S2_min, newdata = newdata1) })

# bind together the new data and the predicted values
predict_Set2_Entrance <- cbind(newdata_Set2_Entrance, predict_Set2_Entrance)

# reshape the data frame to long format
predict_Set2_Entrance <- predict_Set2_Entrance %>%
  gather("1", "2", "3", key = "Rank", value = "prob")


# produce the graph
predict_Set2_Entrancebis <- predict_Set2_Entrance %>%
  filter((LTi_rel == min(LTi_rel) |
           LTi_rel == 0 |
           LTi_rel == max(LTi_rel)) &
           (PrD_Dis_rel == min(PrD_Dis_rel) |
           PrD_Dis_rel == 0 |
           PrD_Dis_rel == max(PrD_Dis_rel)))

LTi_rel.labs <- c("-80 s\nin crossing time", "no difference\nin crossing time ", "+90 s\nin crossing time")
names(LTi_rel.labs) <- c("-80", "0", "90") 

PrD_Dis_rel.labs <- c("-40 cm\nin distance\nfrom door", "no difference\nin distance\nfrom door", "+70 cm\nin distance\nfrom door")
names(PrD_Dis_rel.labs) <- c("-40", "0", "70") 

predict_Set2_Entrancebis %>%
  ggplot(aes(x = SMI_rel, y = prob, colour = Rank)) +
  geom_point() +
  geom_line() +
  facet_grid(PrD_Dis_rel ~ LTi_rel, labeller = labeller(LTi_rel = LTi_rel.labs, PrD_Dis_rel = PrD_Dis_rel.labs)) +
  labs(x = "Scaled Mass Index relative to the average of the group (in grams)",
       y = "Probability to attain a specific rank") +
  theme_gray(base_size = 15)

# random effect for Set2 Entrance order
ci_Set2_Entrance <- CLMM2_Entrance_S2_min$ranef + qnorm(0.975) * sqrt(CLMM2_Entrance_S2_min$condVar) %o% c(-1, 1)
ord.re_Set2_Entrance <- order(CLMM2_Entrance_S2_min$ranef)
ci_Set2_Entrance <- ci_Set2_Entrance[order(CLMM2_Entrance_S2_min$ranef), ]
plot(1:10, CLMM2_Entrance_S2_min$ranef[ord.re_Set2_Entrance], axes = FALSE, ylim = range(ci_Set2_Entrance),
     xlab = "Individual", ylab = "Individual effect")
axis(1, at = 1:10, labels = ord.re_Set2_Entrance)
axis(2)
for(i in 1:10) segments(i, ci_Set2_Entrance[i, 1], i, ci_Set2_Entrance[i, 2])
abline(h = 0, lty = 2)
```

## 2. Analysis of the order at the end of the maze


```{r CLMMSet2End}
Set2_nf <- Set2_nf %>%
  group_by(OrderAll) %>%
  mutate(PrD_Mob_av = mean(Mobility),
         PrD_Exp_av = mean(Explored_Surface),
         PrD_Voc_av = mean(PrD_voca),
         PrD_For_av = mean(PrD_foraging),
         PrD_Emi_av = mean(Att_Emitted),
         PrD_Suf_av = mean(Att_Suffered),
         PrD_Dis_av = mean(DistDoor),
         SMI_av = mean(SMI),
         Exp_av = mean(Exploration),
         LTi_av = mean(LearningTime),
         FNN_av = mean(FreqNN),
         NPa_av = mean(NumPas),
         PrD_Mob_rel = Mobility - PrD_Mob_av,
         PrD_Exp_rel = Explored_Surface - PrD_Exp_av,
         PrD_Voc_rel = PrD_voca - PrD_Voc_av,
         PrD_For_rel = PrD_foraging - PrD_For_av,
         PrD_Emi_rel = Att_Emitted - PrD_Emi_av,
         PrD_Suf_rel = Att_Suffered - PrD_Suf_av,
         PrD_Dis_rel = DistDoor - PrD_Dis_av,
         SMI_rel = SMI - SMI_av,
         Exp_rel = Exploration - Exp_av,
         LTi_rel = LearningTime - LTi_av,
         FNN_rel = FreqNN - FNN_av,
         NPa_rel = NumPas - NPa_av)

Set2_nf$PrD_Mob_rel_s <- as.numeric(scale(Set2_nf$PrD_Mob_rel))
Set2_nf$PrD_Exp_rel_s <- as.numeric(scale(Set2_nf$PrD_Exp_rel))         
Set2_nf$PrD_Voc_rel_s <- as.numeric(scale(Set2_nf$PrD_Voc_rel))         
Set2_nf$PrD_For_rel_s <- as.numeric(scale(Set2_nf$PrD_For_rel))         
Set2_nf$PrD_Emi_rel_s <- as.numeric(scale(Set2_nf$PrD_Emi_rel))         
Set2_nf$PrD_Suf_rel_s <- as.numeric(scale(Set2_nf$PrD_Suf_rel))         
Set2_nf$PrD_Dis_rel_s <- as.numeric(scale(Set2_nf$PrD_Dis_rel))         
Set2_nf$SMI_rel_s <- as.numeric(scale(Set2_nf$SMI_rel))         
Set2_nf$Exp_rel_s <- as.numeric(scale(Set2_nf$Exp_rel))         
Set2_nf$LTi_rel_s <- as.numeric(scale(Set2_nf$LTi_rel))         
Set2_nf$FNN_rel_s <- as.numeric(scale(Set2_nf$FNN_rel))         
Set2_nf$NPa_rel_s <- as.numeric(scale(Set2_nf$NPa_rel))         

##End Order
# Null CLMM
CLMM_End_S2_Null <- clmm(OrderEnd ~ 1 +
                            (1|OrderAll) + (1|Individual), data = Set2_nf)
summary(CLMM_End_S2_Null)

# Full CLMM
CLMM_End_S2_Full <- clmm(OrderEnd ~ PrD_Mob_rel_s + PrD_Exp_rel_s + PrD_Voc_rel_s + PrD_For_rel_s + PrD_Emi_rel_s +
                                PrD_Suf_rel_s + PrD_Dis_rel_s + SMI_rel_s + Exp_rel_s + LTi_rel_s + FNN_rel_s + NPa_rel_s +
                                Side + MajMin +
                                (1|OrderAll) + (1|Individual), data = Set2_nf)
summary(CLMM_End_S2_Full)

# backward elimination of non-significant terms; only the first step and the minimal model are shown here
CLMM_End_S2_Full_drop <- drop1(CLMM_End_S2_Full)
CLMM_End_S2_Full_drop[CLMM_End_S2_Full_drop$AIC == min(CLMM_End_S2_Full_drop$AIC), ]

CLMM_End_S2_min <- clmm(OrderEnd ~ FNN_rel_s +
                          (1|OrderAll) + (1|Individual), data = Set2_nf)
summary(CLMM_End_S2_min)

# to obtain predicted values, switch to clmm2(), which accepts a single random term (here Individual)
CLMM2_End_S2_min <- clmm2(OrderEnd ~ FNN_rel_s,
                               random = Individual, Hess = TRUE, data = Set2_nf)
summary(CLMM2_End_S2_min)

# New data for Set2
newdata_Set2_End <- expand.grid(FNN_rel = seq(from = -0.036,
                                              to = 0.036,
                                              by = 0.001))

# the next line expresses the raw-unit grid on the scaled units used by the model
newdata_Set2_End$FNN_rel_s <- (newdata_Set2_End$FNN_rel - mean(Set2_nf$FNN_rel)) / sd(Set2_nf$FNN_rel)

# predict() on a clmm2 fit returns the probability of the response category
# supplied in newdata, so the grid is evaluated once per rank
predict_Set2_End <- sapply(as.character(1:3),
                            function(x) {
                              # reuse the scaled grid so that rows match newdata_Set2_End
                              # (a step in scaled units is step / sd; the mean must not be subtracted)
                              newdata1 <- data.frame(FNN_rel_s = newdata_Set2_End$FNN_rel_s,
                                                     OrderEnd = factor(x, levels = levels(Set2_nf$OrderEnd)))
                              predict(CLMM2_End_S2_min, newdata = newdata1) })

# bind together the new data and the predicted values
predict_Set2_End <- cbind(newdata_Set2_End, predict_Set2_End)

# reshape the data frame to long format
predict_Set2_End <- predict_Set2_End %>%
  gather("1", "2", "3", key = "Rank", value = "prob")


predict_Set2_End %>%
  ggplot(aes(x = FNN_rel, y = prob, colour = Rank)) +
  geom_point() +
  geom_line() +
  labs(x = "Frequency of being a nearest neighbour\nrelative to the average of the group (in percent)",
       y = "Probability to attain a specific rank") +
  theme_gray(base_size = 15)

# random effect for Set2 End order
ci_Set2_End <- CLMM2_End_S2_min$ranef + qnorm(0.975) * sqrt(CLMM2_End_S2_min$condVar) %o% c(-1, 1)
ord.re_Set2_End <- order(CLMM2_End_S2_min$ranef)
ci_Set2_End <- ci_Set2_End[order(CLMM2_End_S2_min$ranef), ]
plot(1:10, CLMM2_End_S2_min$ranef[ord.re_Set2_End], axes = FALSE, ylim = range(ci_Set2_End),
     xlab = "Individual", ylab = "Individual effect")
axis(1, at = 1:10, labels = ord.re_Set2_End)
axis(2)
for(i in 1:10) segments(i, ci_Set2_End[i, 1], i, ci_Set2_End[i, 2])
abline(h = 0, lty = 2)
```
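
The prediction grid in the chunk above is defined in raw units and then expressed in the scaled units seen by the model. The following is a minimal sketch of that conversion on simulated values; the vector `x_rel` and the helper `to_scaled()` are hypothetical and are not part of the study data. Positions are shifted by the mean and divided by the standard deviation, whereas a step between two grid points is only divided by the standard deviation.

```r
# Simulated stand-in for a group-centred predictor such as FNN_rel (hypothetical values)
set.seed(1)
x_rel <- rnorm(50, mean = 0, sd = 0.02)

# Hypothetical helper: express raw values on the scale produced by scale()
to_scaled <- function(x, ref) (x - mean(ref)) / sd(ref)

grid_raw    <- seq(from = -0.036, to = 0.036, by = 0.001)
grid_scaled <- to_scaled(grid_raw, x_rel)

# A step is a difference between two positions, so the mean cancels out
step_scaled <- 0.001 / sd(x_rel)

# Back-transformation to raw units, e.g. for axis labels
grid_back <- grid_scaled * sd(x_rel) + mean(x_rel)
```

Building the scaled grid directly from the raw grid, rather than regenerating it with a second `seq()` call, keeps both grids the same length and in the same order.
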

# D - Analysis of leadership using CLMMs in Set3

This set focuses on trios of female mallards with different rewards and different spatial information on where to find the rewards.

## 1. Analysis of the entrance order in the maze
```{r CLMMSet3Entrance}
Set3 <- Set3 %>%
  group_by(OrderAll) %>%
  mutate(PrD_Mob_av = mean(Mobility),
         PrD_Exp_av = mean(Explored_Surface),
         PrD_Voc_av = mean(PrD_voca),
         PrD_For_av = mean(PrD_foraging),
         PrD_Emi_av = mean(Att_Emitted),
         PrD_Suf_av = mean(Att_Suffered),
         PrD_Dis_av = mean(DistDoor),
         SMI_av = mean(SMI),
         Exp_av = mean(Exploration),
         LTi_av = mean(LearningTime),
         FNN_av = mean(FreqNN),
         NPa_av = mean(NumPas),
         PrD_Mob_rel = Mobility - PrD_Mob_av,
         PrD_Exp_rel = Explored_Surface - PrD_Exp_av,
         PrD_Voc_rel = PrD_voca - PrD_Voc_av,
         PrD_For_rel = PrD_foraging - PrD_For_av,
         PrD_Emi_rel = Att_Emitted - PrD_Emi_av,
         PrD_Suf_rel = Att_Suffered - PrD_Suf_av,
         PrD_Dis_rel = DistDoor - PrD_Dis_av,
         SMI_rel = SMI - SMI_av,
         Exp_rel = Exploration - Exp_av,
         LTi_rel = LearningTime - LTi_av,
         FNN_rel = FreqNN - FNN_av,
         NPa_rel = NumPas - NPa_av)

Set3$PrD_Mob_rel_s <- as.numeric(scale(Set3$PrD_Mob_rel))
Set3$PrD_Exp_rel_s <- as.numeric(scale(Set3$PrD_Exp_rel))         
Set3$PrD_Voc_rel_s <- as.numeric(scale(Set3$PrD_Voc_rel))         
Set3$PrD_For_rel_s <- as.numeric(scale(Set3$PrD_For_rel))         
Set3$PrD_Emi_rel_s <- as.numeric(scale(Set3$PrD_Emi_rel))         
Set3$PrD_Suf_rel_s <- as.numeric(scale(Set3$PrD_Suf_rel))         
Set3$PrD_Dis_rel_s <- as.numeric(scale(Set3$PrD_Dis_rel))         
Set3$SMI_rel_s <- as.numeric(scale(Set3$SMI_rel))         
Set3$Exp_rel_s <- as.numeric(scale(Set3$Exp_rel))         
Set3$LTi_rel_s <- as.numeric(scale(Set3$LTi_rel))         
Set3$FNN_rel_s <- as.numeric(scale(Set3$FNN_rel))         
Set3$NPa_rel_s <- as.numeric(scale(Set3$NPa_rel))         

##Entrance Order
# Null CLMM
CLMM_Entrance_S3_Null <- clmm(OrderEntrance ~ 1 +
                            (1|OrderAll) + (1|Individual), data = Set3)
summary(CLMM_Entrance_S3_Null)

# Full CLMM
CLMM_Entrance_S3_Full <- clmm(OrderEntrance ~ PrD_Mob_rel_s + PrD_Exp_rel_s + PrD_Voc_rel_s + PrD_For_rel_s + PrD_Emi_rel_s +
                                PrD_Suf_rel_s + PrD_Dis_rel_s + SMI_rel_s + Exp_rel_s + LTi_rel_s + FNN_rel_s + NPa_rel_s +
                                Side + MajMin + Mot +
                                (1|OrderAll) + (1|Individual), data = Set3)
summary(CLMM_Entrance_S3_Full)

# backward elimination of non-significant terms; only the first step and the minimal model are shown here
CLMM_Entrance_S3_Full_drop <- drop1(CLMM_Entrance_S3_Full)
CLMM_Entrance_S3_Full_drop[CLMM_Entrance_S3_Full_drop$AIC == min(CLMM_Entrance_S3_Full_drop$AIC), ]

CLMM_Entrance_S3_min <- clmm(OrderEntrance ~ SMI_rel_s + Mot +
                                (1|OrderAll) + (1|Individual), data = Set3, na.action = na.omit)
summary(CLMM_Entrance_S3_min)

# to obtain predicted values, switch to clmm2(), which accepts a single random term (here Individual)
CLMM2_Entrance_S3_min <- clmm2(OrderEntrance ~ SMI_rel_s + Mot,
                               random = Individual, Hess = TRUE, data = Set3)
summary(CLMM2_Entrance_S3_min)

# New data for Set3
newdata_Set3_Entrance <- expand.grid(SMI_rel = seq(from = -180,
                                                     to = 240,
                                                     by = 10),
                                     Mot = c("H", "L"))

# the next line expresses the raw-unit grid on the scaled units used by the model
newdata_Set3_Entrance$SMI_rel_s <- (newdata_Set3_Entrance$SMI_rel - mean(Set3$SMI_rel)) / sd(Set3$SMI_rel)


# Predict values for Set3
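# predict() on a clmm2 fit returns the probability of the response category
# supplied in newdata, so the grid is evaluated once per rank
predict_Set3_Entrance <- sapply(as.character(1:3),
                       function(x) {
                         # reuse the grid built above so that rows match newdata_Set3_Entrance
                         # (a step in scaled units is step / sd; the mean must not be subtracted)
                         newdata1 <- data.frame(SMI_rel_s = newdata_Set3_Entrance$SMI_rel_s,
                                                Mot = newdata_Set3_Entrance$Mot,
                                                OrderEntrance = factor(x, levels = levels(Set3$OrderEntrance)))
                         predict(CLMM2_Entrance_S3_min, newdata = newdata1) })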

# bind together the new data and the predicted values
predict_Set3_Entrance <- cbind(newdata_Set3_Entrance, predict_Set3_Entrance)

# reshape the data frame to long format
predict_Set3_Entrance <- predict_Set3_Entrance %>%
  gather("1", "2", "3", key = "Rank", value = "prob")


# produce the graph
Mot.labs <- c("High motivation", "Low motivation")
names(Mot.labs) <- c("H", "L") 

predict_Set3_Entrance %>%
  ggplot(aes(x = SMI_rel, y = prob, colour = Rank)) +
  geom_point() +
  geom_line() +
  facet_grid(~ Mot, labeller = labeller(Mot = Mot.labs)) +
  labs(x = "Scaled Mass Index relative to the average of the group (in grams)",
       y = "Probability to attain a specific rank") +
  theme_gray(base_size = 15)

# random effect for Set3 Entrance order
ci_Set3_Entrance <- CLMM2_Entrance_S3_min$ranef + qnorm(0.975) * sqrt(CLMM2_Entrance_S3_min$condVar) %o% c(-1, 1)
ord.re_Set3_Entrance <- order(CLMM2_Entrance_S3_min$ranef)
ci_Set3_Entrance <- ci_Set3_Entrance[order(CLMM2_Entrance_S3_min$ranef), ]
plot(1:10, CLMM2_Entrance_S3_min$ranef[ord.re_Set3_Entrance], axes = FALSE, ylim = range(ci_Set3_Entrance),
     xlab = "Individual", ylab = "Individual effect")
axis(1, at = 1:10, labels = ord.re_Set3_Entrance)
axis(2)
for(i in 1:10) segments(i, ci_Set3_Entrance[i, 1], i, ci_Set3_Entrance[i, 2])
abline(h = 0, lty = 2)
```
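
The chunk above repeats the same two operations for every covariate: centre each value on the mean of its group (`OrderAll`), then standardise the centred values. As a sketch only, the same logic can be written once with base R's `ave()`. The helper name `center_scale_within()` and the toy data frame are hypothetical and merely stand in for `Set3`.

```r
# Toy data: two trials (groups) of three individuals each (hypothetical values)
toy <- data.frame(OrderAll = rep(c("t1", "t2"), each = 3),
                  SMI      = c(900, 1000, 1100, 950, 1000, 1200),
                  FreqNN   = c(0.30, 0.35, 0.40, 0.20, 0.50, 0.35))

# Hypothetical helper: centre x on its group mean, then scale to unit SD
center_scale_within <- function(x, group) {
  x_rel <- x - ave(x, group, FUN = mean)   # deviation from the group mean
  as.numeric(scale(x_rel))                 # same standardisation as in the tutorial
}

# Apply the helper to several covariates at once
vars <- c("SMI", "FreqNN")
toy[paste0(vars, "_rel_s")] <- lapply(toy[vars], center_scale_within, group = toy$OrderAll)
toy
```

Writing the transformation once reduces the risk of a copy-and-paste slip when the same block is repeated for each data set.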

## 2. Analysis of the end order in the maze
```{r CLMMSet3End}
Set3_nf <- Set3_nf %>%
  group_by(OrderAll) %>%
  mutate(PrD_Mob_av = mean(Mobility),
         PrD_Exp_av = mean(Explored_Surface),
         PrD_Voc_av = mean(PrD_voca),
         PrD_For_av = mean(PrD_foraging),
         PrD_Emi_av = mean(Att_Emitted),
         PrD_Suf_av = mean(Att_Suffered),
         PrD_Dis_av = mean(DistDoor),
         SMI_av = mean(SMI),
         Exp_av = mean(Exploration),
         LTi_av = mean(LearningTime),
         FNN_av = mean(FreqNN),
         NPa_av = mean(NumPas),
         PrD_Mob_rel = Mobility - PrD_Mob_av,
         PrD_Exp_rel = Explored_Surface - PrD_Exp_av,
         PrD_Voc_rel = PrD_voca - PrD_Voc_av,
         PrD_For_rel = PrD_foraging - PrD_For_av,
         PrD_Emi_rel = Att_Emitted - PrD_Emi_av,
         PrD_Suf_rel = Att_Suffered - PrD_Suf_av,
         PrD_Dis_rel = DistDoor - PrD_Dis_av,
         SMI_rel = SMI - SMI_av,
         Exp_rel = Exploration - Exp_av,
         LTi_rel = LearningTime - LTi_av,
         FNN_rel = FreqNN - FNN_av,
         NPa_rel = NumPas - NPa_av)

Set3_nf$PrD_Mob_rel_s <- as.numeric(scale(Set3_nf$PrD_Mob_rel))
Set3_nf$PrD_Exp_rel_s <- as.numeric(scale(Set3_nf$PrD_Exp_rel))         
Set3_nf$PrD_Voc_rel_s <- as.numeric(scale(Set3_nf$PrD_Voc_rel))         
Set3_nf$PrD_For_rel_s <- as.numeric(scale(Set3_nf$PrD_For_rel))         
Set3_nf$PrD_Emi_rel_s <- as.numeric(scale(Set3_nf$PrD_Emi_rel))         
Set3_nf$PrD_Suf_rel_s <- as.numeric(scale(Set3_nf$PrD_Suf_rel))         
Set3_nf$PrD_Dis_rel_s <- as.numeric(scale(Set3_nf$PrD_Dis_rel))         
Set3_nf$SMI_rel_s <- as.numeric(scale(Set3_nf$SMI_rel))         
Set3_nf$Exp_rel_s <- as.numeric(scale(Set3_nf$Exp_rel))         
Set3_nf$LTi_rel_s <- as.numeric(scale(Set3_nf$LTi_rel))         
Set3_nf$FNN_rel_s <- as.numeric(scale(Set3_nf$FNN_rel))         
Set3_nf$NPa_rel_s <- as.numeric(scale(Set3_nf$NPa_rel))         

##End Order
# Null CLMM
CLMM_End_S3_Null <- clmm(OrderEnd ~ 1 +
                            (1|OrderAll) + (1|Individual), data = Set3_nf)
summary(CLMM_End_S3_Null)

# Full CLMM
CLMM_End_S3_Full <- clmm(OrderEnd ~ PrD_Mob_rel_s + PrD_Exp_rel_s + PrD_Voc_rel_s + PrD_For_rel_s + PrD_Emi_rel_s +
                                PrD_Suf_rel_s + PrD_Dis_rel_s + SMI_rel_s + Exp_rel_s + LTi_rel_s + FNN_rel_s + NPa_rel_s +
                                Side + MajMin + Mot +
                                (1|OrderAll) + (1|Individual), data = Set3_nf)
summary(CLMM_End_S3_Full)

# backward elimination of non-significant terms; only the first step and the minimal model are shown here
CLMM_End_S3_Full_drop <- drop1(CLMM_End_S3_Full)
CLMM_End_S3_Full_drop[CLMM_End_S3_Full_drop$AIC == min(CLMM_End_S3_Full_drop$AIC), ]

CLMM_End_S3_min <- clmm(OrderEnd ~ PrD_Emi_rel_s + SMI_rel_s + LTi_rel_s + FNN_rel_s +
                                (1|OrderAll) + (1|Individual), data = Set3_nf, na.action = na.omit)
summary(CLMM_End_S3_min)

# to obtain predicted values, switch to clmm2(), which accepts a single random term (here Individual)
CLMM2_End_S3_min <- clmm2(OrderEnd ~ PrD_Emi_rel_s + SMI_rel_s + LTi_rel_s + FNN_rel_s,
                               random = Individual, Hess = TRUE, data = Set3_nf)
summary(CLMM2_End_S3_min)

# New data for Set3
newdata_Set3_End <- expand.grid(SMI_rel = seq(from = -180,
                                              to = 240,
                                              by = 20),
                                PrD_Emi_rel = seq(from = -2,
                                                  to = 3,
                                                  by = 0.1),
                                LTi_rel = seq(from = -90,
                                              to = 90,
                                              by = 10),
                                FNN_rel = seq(from = -0.05,
                                              to = 0.05,
                                              by = 0.005))

# the next four lines express the raw-unit grid on the scaled units used by the model
newdata_Set3_End$SMI_rel_s <- (newdata_Set3_End$SMI_rel - mean(Set3_nf$SMI_rel)) / sd(Set3_nf$SMI_rel)
newdata_Set3_End$PrD_Emi_rel_s <- (newdata_Set3_End$PrD_Emi_rel - mean(Set3_nf$PrD_Emi_rel)) / sd(Set3_nf$PrD_Emi_rel)
newdata_Set3_End$LTi_rel_s <- (newdata_Set3_End$LTi_rel - mean(Set3_nf$LTi_rel)) / sd(Set3_nf$LTi_rel)
newdata_Set3_End$FNN_rel_s <- (newdata_Set3_End$FNN_rel - mean(Set3_nf$FNN_rel)) / sd(Set3_nf$FNN_rel)


# Predict values for Set3
# predict() on a clmm2 fit returns the probability of the response category
# supplied in newdata, so the grid is evaluated once per rank
predict_Set3_End <- sapply(as.character(1:3),
                       function(x) {
                         # reuse the scaled grid so that rows match newdata_Set3_End
                         # (a step in scaled units is step / sd; the mean must not be subtracted)
                         newdata1 <- data.frame(SMI_rel_s = newdata_Set3_End$SMI_rel_s,
                                                PrD_Emi_rel_s = newdata_Set3_End$PrD_Emi_rel_s,
                                                LTi_rel_s = newdata_Set3_End$LTi_rel_s,
                                                FNN_rel_s = newdata_Set3_End$FNN_rel_s,
                                                OrderEnd = factor(x, levels = levels(Set3_nf$OrderEnd)))
                         predict(CLMM2_End_S3_min, newdata = newdata1) })

# bind together the new data and the predicted values
predict_Set3_End <- cbind(newdata_Set3_End, predict_Set3_End)

# reshape the data frame to long format
predict_Set3_End <- predict_Set3_End %>%
  gather("1", "2", "3", key = "Rank", value = "prob")


# produce the graph
predict_Set3_Endbis <- predict_Set3_End %>%
  filter((LTi_rel == min(LTi_rel) |
            LTi_rel == 0 |
            LTi_rel == max(LTi_rel)) &
           (PrD_Emi_rel == min(PrD_Emi_rel) |
              PrD_Emi_rel == 0 |
              PrD_Emi_rel == max(PrD_Emi_rel)) &
           (SMI_rel == min(SMI_rel) |
              SMI_rel == 0 |
              SMI_rel == max(SMI_rel)))

LTi.labs <- c("-90 s\nin\ncrossing\ntime", "Same\ncrossing\ntime", "+90 s\nin\ncrossing\ntime")
names(LTi.labs) <- c("-90", "0", "90") 
PrD_Emi.labs <- c("-2\nemitted\nattacks", "Same\nemitted\nattacks", "+3\nemitted\nattacks")
names(PrD_Emi.labs) <- c("-2", "0", "3") 
SMI.labs <- c("-180 g\nin\nScaled\nMass\nIndex", "Same\nScaled\nMass\nIndex", "+240 g\nin\nScaled\nMass\nIndex")
names(SMI.labs) <- c("-180", "0", "240") 


predict_Set3_Endbis %>%
  ggplot(aes(x = FNN_rel, y = prob, colour = Rank)) +
  geom_point() +
  geom_line() +
  facet_grid(PrD_Emi_rel ~ LTi_rel + SMI_rel,
             labeller = labeller(LTi_rel = LTi.labs, PrD_Emi_rel = PrD_Emi.labs, SMI_rel = SMI.labs)) +
  labs(x = "Frequency of being a nearest neighbour\nrelative to the average of the group (in percent)",
       y = "Probability to attain a specific rank") +
  theme_gray(base_size = 10)

# random effect for Set3 End order
ci_Set3_End <- CLMM2_End_S3_min$ranef + qnorm(0.975) * sqrt(CLMM2_End_S3_min$condVar) %o% c(-1, 1)
ord.re_Set3_End <- order(CLMM2_End_S3_min$ranef)
ci_Set3_End <- ci_Set3_End[order(CLMM2_End_S3_min$ranef), ]
plot(1:10, CLMM2_End_S3_min$ranef[ord.re_Set3_End], axes = FALSE, ylim = range(ci_Set3_End), xlab = "Individual", ylab = "Individual effect")
axis(1, at = 1:10, labels = ord.re_Set3_End)
axis(2)
for(i in 1:10) segments(i, ci_Set3_End[i, 1], i, ci_Set3_End[i, 2])
abline(h = 0, lty = 2)
```
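
The curves plotted in these sections are category probabilities from a cumulative link model. As a rough illustration of where such probabilities come from under a logit link, the sketch below computes them by hand for a three-rank response. The thresholds `theta`, the slope `beta` and the helper `rank_probs()` are made-up for the example and are not estimates from the mallard data; the random effect is set to zero, which corresponds to an average individual.

```r
# Hypothetical parameters for a 3-category ordinal response
theta <- c(-0.8, 0.9)   # two thresholds separate three ranks
beta  <- 0.6            # slope of a scaled predictor

# Category probabilities under a cumulative logit model:
# P(Y <= j) = plogis(theta_j - beta * x), with the random effect fixed at 0
rank_probs <- function(x, theta, beta) {
  cum <- plogis(outer(-beta * x, theta, "+"))   # P(Y <= j), one column per threshold
  probs <- cbind(cum, 1) - cbind(0, cum)        # successive differences give P(Y = j)
  colnames(probs) <- as.character(seq_len(ncol(probs)))
  probs
}

x_grid <- seq(-2, 2, by = 0.5)
probs  <- rank_probs(x_grid, theta, beta)
round(probs, 3)
```

With a positive slope in this parameterisation, larger predictor values move probability towards the higher ranks, which is the pattern to look for when reading the fitted curves.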