In [None]:
# Load the packages: lme4 for mixed-effects models, MuMIn for model selection and averaging

library(lme4)
library(MuMIn)

# Analysis on the Checkpoint data

In [None]:
# Import the data
data <- read.csv("/mnt/upramdya_data/MD/F1_Tracks/Datasets/241114_F1_Checkpoints.csv")

# Show the data
head(data)

In [None]:
# Get all columns
colnames(data)

In [None]:
# List variables to be used in the model
vars <- c("adjusted_time", "pretraining", "unlocked", "distance", "fly", "Date")

# Remove the rows with missing values in the variables of interest
data_clean <- data[complete.cases(data[vars]), ]

# Check what was removed
print(dim(data))
print(dim(data_clean))

# Remove all columns that are not in the vars list
data_clean <- data_clean[vars]

# Check the structure of the cleaned data
str(data_clean)


In [None]:
# Confirm that no missing values remain in any column
sapply(data_clean, function(x) sum(is.na(x)))

In [None]:
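# Fit a linear mixed-effects model to test whether the adjusted time to reach
# each checkpoint differs between conditions.
# Fixed effects: pretraining, unlocked, distance and all their interactions.
# Random effects: random intercepts for fly and Date.
# na.action = na.fail is required by dredge(), which is applied to this model later.
# Note: lmer() fits by REML by default; when comparing models that differ in
# their fixed effects, a maximum-likelihood fit (REML = FALSE) is generally recommended.

model <- lmer(adjusted_time ~ pretraining * unlocked * distance + (1|fly) + (1|Date), 
              data = data_clean, 
              na.action = na.fail)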

summary(model)

In [None]:
# Fit every sub-model of the global model and rank them by AICc
model_selection <- dredge(model)

# Print the ranked model-selection table
model_selection

In [None]:
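# Keep the models within 2 AICc units of the best one and average their coefficients
# (model.avg() needs at least two models in the set)
top_models <- get.models(model_selection, subset = delta < 2)
model.avg(top_models)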


In [None]:
# Sum of Akaike weights for each term across all candidate models
sw(model_selection)

In [None]:
# Refit the model retained as best: main effects plus all two-way interactions
best_model <- lmer(adjusted_time ~ distance + pretraining + unlocked +
    distance:pretraining + distance:unlocked + pretraining:unlocked +
    (1 | fly) + (1 | Date), data=data_clean)

summary(best_model)


# Summary of the model:

## Linear Mixed-Effects Model Analysis

### Model Overview

A linear mixed-effects model was fitted using the `lmer` function from the `lme4` package to analyze the adjusted time to reach each checkpoint. The model included fixed effects for distance, pretraining, unlocking, and their two-way interactions, with random intercepts for individual flies and dates.

**Model Formula:**
$$
\text{adjusted\_time} \sim \text{distance} + \text{pretraining} + \text{unlocked} + \text{distance:pretraining} + \text{distance:unlocked} + \text{pretraining:unlocked} + (1 | \text{fly}) + (1 | \text{Date})
$$

### Model Fit

- **REML Criterion at Convergence:** 12520

### Residuals

The scaled residuals are roughly centered on zero, with a longer upper tail (maximum 4.18 versus minimum -2.97):
- **Minimum:** -2.9675
- **1st Quartile:** -0.4848
- **Median:** -0.0290
- **3rd Quartile:** 0.4729
- **Maximum:** 4.1826

### Random Effects

The random intercepts capture substantial variability, most of it between flies:

| Groups   | Name        | Variance  | Std. Dev. |
|----------|-------------|-----------|-----------|
| fly      | (Intercept) | 2,236,313 | 1495.4    |
| Date     | (Intercept) | 93,525    | 305.8     |
| Residual |             | 1,031,233 | 1015.5    |

- **Number of Observations:** 733
- **Groups:** 
  - Flies: 147
  - Dates: 14
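
The variance components above can be expressed as shares of the total variance, which makes the relative weight of flies, dates, and residual noise easier to compare. A minimal sketch in base R that uses only the values reported in the table; the variable names `vc` and `share` are illustrative.

```r
# Variance components copied from the random-effects table above
vc <- c(fly = 2236313, Date = 93525, Residual = 1031233)

# Share of the total (random + residual) variance held by each component
share <- vc / sum(vc)
print(round(share, 3))
```

With these numbers, flies account for about 67% of the total, dates for about 3%, and the residual for about 31%.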

### Fixed Effects

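The fixed-effects estimates are summarized below. Rows labeled `(y)` refer to level `y` of that factor relative to its reference level; `lme4` reports t values but no p-values: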

| Predictor                  | Estimate   | Std. Error | t value |
|----------------------------|------------|-------------|---------|
| (Intercept)                | 1327.894   | 277.419     | 4.787   |
| distance                   | 42.318     | 2.818       | 15.019  |
| pretraining (y)           | -395.013   | 387.482     | -1.019  |
| unlocked (y)               | -241.974   | 366.413     | -0.660  |
| distance:pretraining (y)   | -21.341    | 3.981       | -5.361  |
| distance:unlocked (y)      | 6.259      | 3.740       | 1.673   |
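
Because `lme4` does not report p-values, one rough way to read the table above is through approximate 95% Wald intervals (estimate ± z × standard error). The sketch below is base R only and rests on a normal approximation, which may be optimistic for mixed models; the data frame `fixed` is an illustrative container for the reported numbers.

```r
# Estimates and standard errors copied from the fixed-effects table above
fixed <- data.frame(
  term     = c("(Intercept)", "distance", "pretraining (y)", "unlocked (y)",
               "distance:pretraining (y)", "distance:unlocked (y)"),
  estimate = c(1327.894, 42.318, -395.013, -241.974, -21.341, 6.259),
  se       = c(277.419, 2.818, 387.482, 366.413, 3.981, 3.740)
)

# Approximate 95% Wald intervals under a normal approximation
z <- qnorm(0.975)
fixed$lower <- fixed$estimate - z * fixed$se
fixed$upper <- fixed$estimate + z * fixed$se

# An interval that excludes zero corresponds to |t| above roughly 1.96
fixed$excludes_zero <- fixed$lower > 0 | fixed$upper < 0
print(fixed)
```

For a fitted model object, `confint(best_model, method = "Wald")` should give comparable intervals for the fixed effects; profile or bootstrap intervals are slower but usually more reliable.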

### Correlation of Fixed Effects

The correlation between fixed effects coefficients is as follows:

| Predictor                  | distance   | pretraining (y) | unlocked (y) |
|----------------------------|------------|------------------|--------------|
| distance                   | -0.389     |                  |              |
| pretraining (y)           | -0.667     | 0.278            |              |
| unlocked (y)               | 0.005      | -0.553           |              |
| distance:pretraining (y)   | 0.276      | -0.708           | -0.395       |
| distance:unlocked (y)      | 0.001      | 0.210            | -0.384       |
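
`summary()` prints this matrix in abbreviated lower-triangular form, so it can be easier to inspect the full, labeled matrix directly. The sketch below shows the idea on a simulated stand-in (`toy`, fitted with `lm`) so that it runs without the checkpoint data; for the mixed model, the analogous call is assumed to be `cov2cor(as.matrix(vcov(best_model)))`.

```r
set.seed(1)

# Simulated stand-in data (hypothetical; not the checkpoint dataset)
toy <- data.frame(
  distance    = runif(200, 0, 100),
  pretraining = factor(sample(c("n", "y"), 200, replace = TRUE))
)
toy$adjusted_time <- 1000 + 40 * toy$distance -
  20 * toy$distance * (toy$pretraining == "y") + rnorm(200, sd = 500)

fit <- lm(adjusted_time ~ distance * pretraining, data = toy)

# Full, symmetric correlation matrix of the coefficient estimates
coef_cor <- cov2cor(vcov(fit))
print(round(coef_cor, 3))
```

Each row and column carries the full coefficient name, which removes any ambiguity about which pair of coefficients a given correlation refers to.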

### Warnings

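A warning was issued indicating that the fixed-effect model matrix is rank deficient, so one column/coefficient was dropped. The `pretraining:unlocked` interaction appears in the model formula but not in the fixed-effects table, so it is presumably the dropped coefficient.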
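
One common cause of this warning is a factor combination that never occurs in the data, which makes the interaction column a copy of another column. The sketch below reproduces that situation on a hypothetical toy design (it is an assumption that the checkpoint data has this structure) and shows two base R checks: a cross-tabulation and the numerical rank of the design matrix.

```r
# Hypothetical design in which one factor combination never occurs:
# there are no rows with pretraining = "n" and unlocked = "y"
toy <- data.frame(
  pretraining = factor(c("n", "n", "y", "y", "y", "y")),
  unlocked    = factor(c("n", "n", "n", "n", "y", "y"))
)

# Cross-tabulation: a zero cell signals a non-estimable interaction
cell_counts <- table(toy$pretraining, toy$unlocked)
print(cell_counts)

# Fixed-effects design matrix and its numerical rank
X <- model.matrix(~ pretraining * unlocked, data = toy)
rank_X <- qr(X)$rank
cat("columns:", ncol(X), " rank:", rank_X, "\n")
```

Here the design matrix has four columns but rank three. Running `table(data_clean$pretraining, data_clean$unlocked)` on the real data would show whether an empty cell explains the warning.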

### Conclusion

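The analysis indicates that:
- Adjusted time increases with distance (estimate 42.3 per unit of distance, t = 15.02).
- The distance:pretraining interaction is clearly negative (estimate -21.3, t = -5.36), suggesting that pretraining roughly halves the effect of distance on adjusted time.
- The main effects of pretraining (t = -1.02) and unlocking (t = -0.66), which apply at distance 0, are not distinguishable from zero, and the distance:unlocked interaction is weak (t = 1.67).
- Most of the remaining variability lies between individual flies (about 67% of the total random and residual variance), whereas Date accounts for only about 3%.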
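
The distance:pretraining interaction is easiest to read as a change in slope. The arithmetic below uses the reported estimates and assumes R's default treatment coding, with the non-`y` level as reference and `unlocked` held at its reference level; the variable names are illustrative.

```r
# Coefficients copied from the fixed-effects table above
b_distance             <- 42.318
b_distance_pretraining <- -21.341

# Slope of adjusted_time on distance at each pretraining level
# (assumes treatment coding; unlocked held at its reference level)
slope_not_pretrained <- b_distance
slope_pretrained     <- b_distance + b_distance_pretraining

cat("not pretrained:", slope_not_pretrained, "\n")
cat("pretrained:    ", slope_pretrained, "\n")
cat("ratio:         ", round(slope_pretrained / slope_not_pretrained, 2), "\n")
```

Under these assumptions the slope drops from 42.3 to about 21.0 per unit of distance in pretrained flies, roughly half.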

Further investigation may be warranted to address the rank deficiency and explore potential collinearity among predictors.