In [None]:
# Source the package setup script
source("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/scripts/00_setup_packages.R")

# Source the custom graphing functions
source("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/scripts/01_graphing_functions.R")

# Substrate Experiment Model

## Question

How do environmental factors (**microhabitat**, **nudibranch presence**) affect *P. cristatus* occurrence across individuals, sexes, and morphs?

## Objective

Test for the effect of **microhabitat** on the occurrence of **pods** *ex situ*. 

## Method

### 1. Load cleaned data.

Load data that was cleaned and transformed in "02_data_cleaning.qmd".


In [None]:
data_exp_clean <- read.csv("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/cleaned/data_exp_clean.csv")

---

### 2. Prepare data for modeling.

#### **Categorical predictors**

For categorical and binary predictors, R automatically dummy-codes the variables. By default, R uses the first factor level as the baseline (alphabetical order for text values, the lowest value for numeric codes), so for a binary variable such as Sex (0 = Female, 1 = Male) the baseline is 0 unless specified otherwise. All comparisons are then made relative to the baseline group.
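
As a minimal sketch of how this dummy coding behaves, the toy example below builds a factor, changes its baseline with `relevel()`, and inspects the resulting design matrix. The data frame `toy` is hypothetical; only the level names mirror the Microhabitat column used in this notebook.

```r
# Hypothetical toy data; level names mirror the Microhabitat column in this notebook
toy <- data.frame(
  Microhabitat = factor(c("Sand", "Red_Algae", "Hydrallmania", "Sertulariidae"))
)

# By default the baseline is the first level in alphabetical order
default_baseline <- levels(toy$Microhabitat)[1]

# relevel() moves the chosen level to the front, making it the baseline
toy$Microhabitat <- relevel(toy$Microhabitat, ref = "Red_Algae")
new_baseline <- levels(toy$Microhabitat)[1]

# The design matrix has an intercept plus one 0/1 column per non-baseline level
design <- model.matrix(~ Microhabitat, data = toy)
print(colnames(design))
```

The baseline level has no column of its own: its row contains zeros in every dummy column, so its expected value is carried by the intercept alone.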

#### **Continuous predictors**

We standardize continuous predictors, such as Fecundity, by centering them and dividing by two standard deviations. This improves interpretability by putting their coefficients on a scale comparable to those of binary predictors, as suggested by Gelman (2008).
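
The scaling itself can be sketched in a few lines. The helper `standardize_2sd()` and the input values are hypothetical and only illustrate the arithmetic; they are not part of the project's scripts.

```r
# Center a numeric vector and divide by two standard deviations (Gelman 2008)
standardize_2sd <- function(x) {
  (x - mean(x, na.rm = TRUE)) / (2 * sd(x, na.rm = TRUE))
}

# Hypothetical values standing in for a continuous predictor
x <- c(2, 4, 6, 8, 10)
x_std <- standardize_2sd(x)

print(round(x_std, 3))
print(sd(x_std))  # the rescaled predictor has a standard deviation of 0.5
```

A standard deviation of 0.5 is what a balanced 0/1 variable has, which is why coefficients of predictors scaled this way can be read alongside those of binary predictors.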

In [None]:
# Convert categorical variables to factors
# Define the columns to convert to factor
columns_to_convert_exp <- c("Chamber", "TOD", "Microhabitat", "Sex")

data_exp_clean <- data_exp_clean %>%
    mutate(across(all_of(columns_to_convert_exp), as.factor))

# Set reference category for categorical predictors
data_exp_clean$Microhabitat <- relevel(data_exp_clean$Microhabitat, ref = "Red_Algae")



# Convert continuous variables to numeric
# (both columns go in one vector; assigning them separately would overwrite "Count")
columns_to_convert_exp <- c("Count", "Time")

data_exp_clean <- data_exp_clean %>%
    mutate(across(all_of(columns_to_convert_exp), as.numeric))



---

### 3. Visualize response variable distributions

To understand the distribution of our response variable, we plot pod counts by microhabitat as boxplots. This helps us judge whether a Poisson distribution is appropriate for modeling.


In [None]:
# Function to generate boxplots of pod count by Microhabitat
generate_pod_count_plot <- function(data, 
                                    axis_title_x = TRUE, 
                                    axis_text_x = TRUE,
                                    axis_title_y = TRUE, 
                                    axis_text_y = TRUE,
                                    axis_title_size = 10, 
                                    axis_text_size = 8,
                                    plot_title = NULL) {
  data %>% 
    ggplot(aes(x = Microhabitat, y = Count, fill = Microhabitat)) +
    geom_boxplot(alpha = 0.7, outlier.shape = NA) +
    theme_bw(base_size = 8) +
    labs(
      title = plot_title,
      x = if (axis_title_x) "Microhabitat" else NULL,
      y = if (axis_title_y) "Pod Count" else NULL,
      fill = NULL
    ) +
    theme(
      axis.title.x = if (axis_title_x) element_text(size = axis_title_size) else element_blank(),
      axis.text.x  = if (axis_text_x) element_text(size = axis_text_size, angle = 45, hjust = 1) else element_blank(),
      axis.title.y = if (axis_title_y) element_text(size = axis_title_size) else element_blank(),
      axis.text.y  = if (axis_text_y) element_text(size = axis_text_size) else element_blank(),
      legend.position = "none",
      panel.grid.major = element_blank(),
      panel.grid.minor = element_blank()
    )
}



# Generate the boxplot for the variable "Count"
plot_exp_count <- generate_pod_count_plot(
  data = data_exp_clean,
  axis_title_y = TRUE,
  axis_text_y = TRUE,
  axis_title_x = TRUE,
  axis_text_x = TRUE,
  axis_title_size = 10, 
  axis_text_size = 8
)


ggsave("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_exp_count.png", plot = plot_exp_count, width = 3, height = 3, units = "in", dpi = 300)


In [None]:
# Convert images to base64
plot_exp_count <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_exp_count.png")

# Create the HTML 
html_count <- paste0("
  <style>
    .image-container {
      display: flex;
      justify-content: space-between;
      align-items: flex-start;
      flex-wrap: wrap;
      gap: 20px;
      max-width: 100%;
    }

    .image-container img {
      flex: 1 1 48%;
      max-width: 48%;
      height: auto;
      border: 1px solid #ccc;
    }

    @media screen and (max-width: 800px) {
      .image-container img {
        max-width: 100%;
        flex: 1 1 100%;
      }
    }
  </style>

  <div class='image-container'>
    <img src='", plot_exp_count, "' alt='Pod Count by Microhabitat'>
  </div>
")

# Display the HTML
IRdisplay::display_html(html_count)


---

### 4. Justification for GLMs and Bayesian methods

#### **Why GLMs?**

Generalized Linear Models (GLMs) are used because our response variable is a discrete, non-negative, right-skewed count, which violates the assumptions of linear regression. The Poisson distribution with a log link function allows us to model such counts appropriately.

---

#### **Why Bayesian methods?**

Bayesian GLMs were chosen over frequentist approaches because:

-   **Sparse data**: Bayesian methods handle sparse datasets more robustly.

-   **Priors**: They allow us to incorporate prior knowledge, improving model performance.

-   **Posterior distributions**: Bayesian models provide posterior distributions, offering a full view of parameter uncertainty.

---

### 5. Define priors

#### **What are priors?**

In Bayesian statistics, priors represent our beliefs about parameter values before analyzing the data. These beliefs are mathematically expressed as probability distributions. Priors guide the model, especially when data are sparse or when the signal-to-noise ratio is low.

---

#### **Why priors matter:**

-   **Prevent overfitting**: Priors discourage extreme parameter estimates unless strongly supported by the data.

-   **Balance restrictiveness and flexibility**: Weakly informative priors let the data dominate while providing reasonable bounds.

-   **Leverage existing knowledge**: Informative priors incorporate previous research or domain expertise, improving accuracy in well-studied systems.

---

#### **How priors work in this analysis:**
We combine the priors (representing initial beliefs) with the likelihood of the observed data to compute posterior distributions, which reflect updated beliefs after observing the data.

The general formula is:

$$
\begin{aligned}
\text{Posterior} \propto \text{Likelihood} \times \text{Prior}
\end{aligned}
$$
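
To make the proportionality concrete, the sketch below approximates a posterior on a grid for a single parameter: the log-mean of a Poisson distribution with a normal prior. The observed counts are hypothetical, and the grid approach is only an illustration; the models in this notebook are fitted by MCMC through brms.

```r
# Hypothetical observed counts (not the experiment's data)
y <- c(0, 1, 1, 2, 0, 1)

# Grid of candidate values for the log-mean
alpha_grid <- seq(-3, 3, length.out = 601)

# Prior: Normal(0, 0.5), with 0.5 as the standard deviation
log_prior <- dnorm(alpha_grid, mean = 0, sd = 0.5, log = TRUE)

# Likelihood: product of Poisson densities, computed on the log scale
log_lik <- sapply(alpha_grid, function(a) sum(dpois(y, lambda = exp(a), log = TRUE)))

# Posterior is proportional to likelihood times prior; normalize over the grid
log_post <- log_lik + log_prior
posterior <- exp(log_post - max(log_post))
posterior <- posterior / sum(posterior)

post_mean_alpha <- sum(alpha_grid * posterior)
print(post_mean_alpha)
```

Working on the log scale and subtracting the maximum before exponentiating avoids numerical underflow when the likelihood is a product of many small probabilities.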

---

#### **Model family and formula**

Our response variable (pod count) is discrete. We model it using a Poisson distribution, which is suitable for count data and assumes that the mean equals the variance (we test this assumption after building the model). A log link function keeps the expected count positive:

$$
\begin{aligned}
& y_{i} \sim \text{Poisson}(\mu_{i}) \\
\end{aligned}
$$

The mean pod count $\mu_i$ is modeled as:

$$
\text{log}(\mu_{i}) = \alpha + \beta x_{i}
$$

Where:

-   **$y_{i}$**: Response variable for observation $i$

-   **$\mu_{i}$**: Expected *P. cristatus* pod count for observation $i$ (modeled on the log scale)

-   **$\alpha$**: Intercept, mean value of $\text{log}(\mu)$ when predictors are at baseline

-   **$\beta$**: Coefficient/slope for predictor $x_{i}$, representing the effect of $x_{i}$ on $\text{log}(\mu_i)$
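
A short simulation can show how these pieces fit together. The parameter values below are hypothetical and chosen only for illustration; they are not estimates from our data.

```r
set.seed(1)

alpha <- log(1)  # hypothetical intercept: baseline mean of 1 pod
beta  <- -0.5    # hypothetical effect of a binary predictor
x <- rep(c(0, 1), each = 500)

# Inverse of the log link: exponentiating keeps the expected count positive
mu <- exp(alpha + beta * x)

# Counts drawn from the Poisson distribution with those means
y <- rpois(length(x), lambda = mu)

# Group means should sit near exp(alpha) and exp(alpha + beta)
group_means <- tapply(y, x, mean)
print(round(group_means, 2))
print(round(c(exp(alpha), exp(alpha + beta)), 2))
```

Whatever values $\alpha$ and $\beta$ take, $\mu_i$ stays positive, which is the property the log link provides.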

---

#### Chosen priors and rationale

**Intercept ($\alpha$)**: The intercept ($\alpha$) reflects the baseline log-count of *P. cristatus* in the reference condition (Red Algae). Based on our experimental setup, the count on a microhabitat can never exceed 3 because there are no more than 3 pods per chamber, translating to:

$$
\begin{aligned}
\text{log}(3) = 1.10
\end{aligned}
$$

We set a weakly informative prior:
$$
\begin{aligned}
\alpha \sim N(0,0.5)
\end{aligned}
$$

Here $N(0, 0.5)$ is a normal distribution with a mean of 0 and a standard deviation of 0.5, matching the `normal(0, 0.5)` prior passed to `brm()` below. In a normal distribution, about 95% of values fall within 2 standard deviations of the mean, so most prior values lie roughly between -1 and 1 on the log scale. On the original scale, this translates to a median of $e^{0}=1$ and a range of roughly $e^{-1} \approx 0.37$ to $e^{1} \approx 2.72$ pods. This says that the baseline pod count can range from close to 0 up to nearly 3, the experimental maximum.
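
The prior range can be checked numerically. This sketch assumes that 0.5 is read as the standard deviation, which is how the `normal(0, 0.5)` priors in the `brm()` calls of this notebook are parameterized.

```r
prior_sd <- 0.5  # second argument of normal(0, 0.5) is the standard deviation

# Central 95% prior interval for the intercept on the log scale
log_interval <- qnorm(c(0.025, 0.975), mean = 0, sd = prior_sd)

# The same interval back-transformed to pod counts
count_interval <- exp(log_interval)

print(round(log_interval, 2))    # about -0.98 and 0.98
print(round(count_interval, 2))  # about 0.38 and 2.66
```

The upper end of the back-transformed interval stays just below the experimental maximum of 3 pods per chamber.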

---

**Slope ($\beta$)**: The slope reflects the rate of change in $\log(\mu)$ for a one-unit change in the predictor. On the original scale, this translates to:

$$
\begin{aligned}
\frac{y_{x+1}}{y_x} &= \frac{e^{(\alpha + \beta(x+1))}}{e^{(\alpha + \beta x)}} \\
&= \frac{e^{(\alpha + \beta x + \beta)}}{e^{(\alpha + \beta x)}} \\
&= e^{(\alpha + \beta x)} \times \frac{e^{\beta}}{e^{(\alpha + \beta x)}} \\
&= e^{\beta} \\
& \quad \quad \quad \text{OR} \\
& \text{Scaling factor} = e^{\beta}
\end{aligned}
$$


For example:

-   If $\beta = 0.1$, a one-unit increase in the predictor scales $\mu$ by $e^{0.1} \approx 1.105$ (about a 10.5% increase).

-   If $\beta = -0.1$, a one-unit increase scales $\mu$ by $e^{-0.1} \approx 0.905$ (about a 9.5% decrease).
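
The same conversion from a coefficient to a scaling factor can be done for any value of $\beta$. The coefficient values below are hypothetical examples.

```r
# Hypothetical coefficient values on the log scale
beta <- c(0.1, -0.1, 0.5, -0.5)

# Multiplicative effect on the expected count for a one-unit increase
scaling_factor <- exp(beta)

# The same effect expressed as a percent change
percent_change <- (scaling_factor - 1) * 100

print(data.frame(
  beta,
  scaling_factor = round(scaling_factor, 3),
  percent_change = round(percent_change, 1)
))
```

Note that equal and opposite coefficients are not symmetric in percent terms: their scaling factors are reciprocals of each other, not mirror images around zero.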

We don't know how our predictors will affect the response, so we use a weakly informative prior:

$$
\begin{aligned}
\beta \sim N(0,0.5)
\end{aligned}
$$

This prior assumes no effect of the predictor on average ($e^{0} = 1$) and allows moderate positive or negative effects, with about 95% of the prior mass spanning approximately $[-1, 1]$ on the log scale ($e^{-1} \approx 0.37$ to $e^{1} \approx 2.72$ on the original scale). So, a priori, a one-unit change in a predictor is expected to scale $\mu$ by a factor between roughly 0.37 and 2.72.

---

**Random effects ($\sigma$)**: Random effects account for variability between groups that we are not explicitly testing. In this model, Chamber is the random effect because we have repeated measures within the same Chambers. Measurements within the same Chamber are likely to be correlated due to shared environmental factors, spatial proximity, or other unobserved factors specific to the Chamber. Including a random intercept for Chamber accounts for this. To capture moderate variability, we assign the following prior to the between-Chamber standard deviation $\sigma$ (because a standard deviation cannot be negative, brms truncates this prior at zero):

$$
\begin{aligned}
\sigma \sim N(0,0.5)
\end{aligned}
$$

This ensures flexibility without overfitting.
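
As a rough sketch of what this prior implies, the simulation below draws between-Chamber standard deviations from a normal prior truncated at zero (a half-normal) and then draws one Chamber offset per draw. The number of draws and the seed are arbitrary choices for illustration.

```r
set.seed(42)
n_draws <- 10000

# A normal(0, 0.5) prior on a standard deviation, truncated at zero (half-normal)
sigma <- abs(rnorm(n_draws, mean = 0, sd = 0.5))

# One Chamber-level offset per draw, on the log scale
chamber_offset <- rnorm(n_draws, mean = 0, sd = sigma)

# Multiplicative effect of a Chamber on the expected count
chamber_factor <- exp(chamber_offset)

print(round(quantile(chamber_factor, c(0.025, 0.5, 0.975)), 2))
```

The quantiles show how far a single Chamber is allowed to shift the expected count up or down before any data are seen.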

---

#### **Visualizing priors**

To validate these priors, we run models sampling only from the priors (sample_prior = "only") and inspect their outputs to ensure they align with our expectations.
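
The idea behind a prior-only run can be sketched in base R. This is a simplified analogue, not the brms model itself: it ignores the Chamber random effect and the interaction, and it uses only an intercept and one treatment effect.

```r
set.seed(5678)
n_draws <- 10000

# Draw the intercept and one treatment effect from their normal(0, 0.5) priors
alpha <- rnorm(n_draws, mean = 0, sd = 0.5)
beta  <- rnorm(n_draws, mean = 0, sd = 0.5)

# Prior-implied expected counts for the baseline and one non-baseline microhabitat
mu_baseline  <- exp(alpha)
mu_treatment <- exp(alpha + beta)

# Prior predictive counts for the baseline group
y_sim <- rpois(n_draws, lambda = mu_baseline)

# Share of simulated counts above the experimental maximum of 3 pods
prop_above_3 <- mean(y_sim > 3)
print(prop_above_3)
print(round(quantile(mu_treatment, c(0.025, 0.5, 0.975)), 2))
```

If a large share of simulated counts fell above 3, the priors would be placing weight on outcomes the experiment cannot produce and would need tightening.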


>**Note:** We ran these models in RStudio for consistency because the rstan package does not always run reliably in Jupyter. The fitted models were saved from RStudio and are loaded below. The model-fitting code is left as comments for reference.

In [None]:
# # Set seed for reproducibility
# set.seed(5678)
# exp_priors <- brm(
#     family = poisson(link = "log"),
#     formula = Count ~ 1 + Microhabitat * Time + (1 | Chamber),
#     data = data_exp_clean,
#     prior = c(
#         prior(normal(0, 0.5), class = "Intercept"),
#         prior(normal(0, 0.5), class = "b"),
#         prior(normal(0, 0.5), class = "sd")
#     ),
#     sample_prior = "only",
#     save_pars = save_pars(all = TRUE),
#     control = list(adapt_delta = 0.99, max_treedepth = 12),
#     iter = 15000,
#     warmup = 5000,
#     chains = 2,
#     cores = parallel::detectCores(logical=FALSE),
#     backend = "rstan"
# )
# saveRDS(exp_priors, file = "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/exp_priors.rds")

In [None]:
# Load models
exp_priors <- readRDS("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/exp_priors.rds")

# Extract results.
prior_samples_exp <- as_draws_df(exp_priors)

In [None]:
# List of datasets and labels
prior_samples <- list(
    Count = prior_samples_exp
)

# Custom labels
custom_labels_priors <- c(
  "b_MicrohabitatSand" = "Sand",
  "b_MicrohabitatSertulariidae" = "Sertulariidae",
  "b_MicrohabitatHydrallmania" = "Hydrallmania",
  "b_Time" = "Time"
)

custom_labels_priors_intercept <- c(
  "b_Intercept" = "Intercept"
)

# Generate plots for predictors
prior_plots <- lapply(
    names(prior_samples),
    function(label) {
        generate_posterior_plot(
            prior_samples[[label]],
            regex_pars = c(
                "b_MicrohabitatSand",
                "b_MicrohabitatSertulariidae",
                "b_MicrohabitatHydrallmania",
                "b_Time"
            ),
            x_range = c(-5, 5),
            custom_labels = custom_labels_priors,
            axis_title_y = label %in% c("Count")
        )
    }
)
names(prior_plots) <- names(prior_samples)

# Generate plot for intercept
prior_plots_intercept <- lapply(
    names(prior_samples),
    function(label) {
        generate_posterior_plot(
            prior_samples[[label]],
            regex_pars = c("b_Intercept"),
            x_range = c(-10, 10),
            custom_labels = custom_labels_priors_intercept,
            axis_title_y = label %in% c("Count")
        )
    }
)
names(prior_plots_intercept) <- names(prior_samples)

# Add grey bars with labels for each plot using the top bar function
prior_plots_with_bars <- mapply(
    function(plot, label) {
        patchwork::wrap_elements(create_top_bar(label)) / plot +
            patchwork::plot_layout(heights = c(0.2, 1)) # Adjust heights to reduce spacing
    },
    prior_plots,
    names(prior_plots),
    SIMPLIFY = FALSE
)

prior_plots_intercept_with_bars <- mapply(
    function(plot, label) {
        patchwork::wrap_elements(create_top_bar(label)) / plot +
            patchwork::plot_layout(heights = c(0.2, 1)) # Adjust heights to reduce spacing
    },
    prior_plots_intercept,
    names(prior_plots_intercept),
    SIMPLIFY = FALSE
)

# Combine plots into a grid using the correct name ("Count") instead of "Pod Count"
plot_priors_exp <- patchwork::wrap_plots(
    prior_plots_with_bars[c("Count")],
    ncol = 1
) +
    patchwork::plot_layout(guides = "collect")

plot_priors_intercept_exp <- patchwork::wrap_plots(
    prior_plots_intercept_with_bars[c("Count")],
    ncol = 1
) +
    patchwork::plot_layout(guides = "collect")

# Add a unified x-axis label (if needed, otherwise remove this step)
plot_priors_exp <- plot_priors_exp +
    patchwork::plot_annotation(
        caption = "Expected value of the response (log scale)",
        theme = theme(plot.caption = element_text(hjust = 0.65, size = 10))
    )

plot_priors_intercept_exp <- plot_priors_intercept_exp +
    patchwork::plot_annotation(
        caption = "Expected value of the response (log scale)",
        theme = theme(plot.caption = element_text(hjust = 0.65, size = 10))
    )

# Save the final plots
ggsave("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_priors_exp.png", 
       plot = plot_priors_exp, width = 4.5, height = 6, units = "in", dpi = 300)

ggsave("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_priors_intercept_exp.png", 
       plot = plot_priors_intercept_exp, width = 4.5, height = 4.5, units = "in", dpi = 300)




In [None]:
# Convert images to base64 (assuming these return base64 data URIs)
plot_priors_exp <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_priors_exp.png")
plot_priors_intercept_exp <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_priors_intercept_exp.png")

# Create the HTML 
html_priors <- paste0("
  <style>
    .image-row {
      display: flex;
      gap: 20px;
      justify-content: center;
      align-items: flex-start;
    }
    .image-row img {
      max-width: 48%;
      height: auto;
      border: 1px solid #ccc;
    }
  </style>
<div class='image-row'>
  <img src='", plot_priors_exp, "' alt='Prior Plot'>
  <img src='", plot_priors_intercept_exp, "' alt='Intercept Prior Plot'>
</div>
")

# Display the HTML
IRdisplay::display_html(html_priors)



---

### 6. Run final models

Now that we have finalized the model parameters, we fit models using the actual data and compare them to null models to assess the significance of predictors.


>**Note:** We ran these models in RStudio for consistency because the rstan package does not always run reliably in Jupyter. The fitted models were saved from RStudio and are loaded below. The model-fitting code is left as comments for reference.

In [None]:
# # Set seed for reproducibility
# set.seed(5678)
# # null model (for comparison)
# expv0 <- brm(
#     family = poisson(link = "log"),
#     formula = Count ~ 1 + (1 | Chamber),
#     data = data_exp_clean,
#     prior = c(
#         prior(normal(0, 0.5), class = "Intercept"),
#         prior(normal(0, 0.5), class = "sd")
#     ),
#     sample_prior = TRUE,
#     save_pars = save_pars(all = TRUE),
#     control = list(adapt_delta = 0.99, max_treedepth = 12),
#     iter = 15000,
#     warmup = 5000,
#     chains = 2,
#     cores = parallel::detectCores(logical=FALSE),
#     backend = "rstan"
# )
# saveRDS(expv0, file = "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv0.rds")


# # Set seed for reproducibility
# set.seed(5678)
# # zero-inflated model (for comparison)
# expv1_ZI <- brm(
#     family = zero_inflated_poisson(link = "log"),
#     formula = Count ~ 1 + Microhabitat + Time + (1 | Chamber),
#     data = data_exp_clean,
#     prior = c(
#         prior(normal(0, 0.5), class = "Intercept"),
#         prior(normal(0, 0.5), class = "sd")
#     ),
#     sample_prior = TRUE,
#     save_pars = save_pars(all = TRUE),
#     control = list(adapt_delta = 0.99, max_treedepth = 12),
#     iter = 15000,
#     warmup = 5000,
#     chains = 2,
#     cores = parallel::detectCores(logical=FALSE),
#     backend = "rstan"
# )
# saveRDS(expv1_ZI, file = "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv1_ZI.rds")


# # Set seed for reproducibility
# set.seed(5678)
# # Model with no interaction effects
# expv1 <- brm(
#     family = poisson(link = "log"),
#     formula = Count ~ 1 + Microhabitat + Time + (1 | Chamber),
#     data = data_exp_clean,
#     prior = c(
#         prior(normal(0, 0.5), class = "Intercept"),
#         prior(normal(0, 0.5), class = "b"),
#         prior(normal(0, 0.5), class = "sd")
#     ),
#     sample_prior = TRUE,
#     save_pars = save_pars(all = TRUE),
#     control = list(adapt_delta = 0.99, max_treedepth = 12),
#     iter = 15000,
#     warmup = 5000,
#     chains = 2,
#     cores = parallel::detectCores(logical=FALSE),
#     backend = "rstan"
# )
# saveRDS(expv1, file = "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv1.rds")


# # Set seed for reproducibility
# set.seed(5678)
# # Model with interaction effects
# expv2 <- brm(
#     family = poisson(link = "log"),
#     formula = Count ~ 1 + Microhabitat * Time + (1 | Chamber),
#     data = data_exp_clean,
#     prior = c(
#         prior(normal(0, 0.5), class = "Intercept"),
#         prior(normal(0, 0.5), class = "b"),
#         prior(normal(0, 0.5), class = "sd")
#     ),
#     sample_prior = TRUE,
#     save_pars = save_pars(all = TRUE),
#     control = list(adapt_delta = 0.99, max_treedepth = 12),
#     iter = 15000,
#     warmup = 5000,
#     chains = 2,
#     cores = parallel::detectCores(logical=FALSE),
#     backend = "rstan"
# )
# saveRDS(expv2, file = "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv2.rds")


#### *Model comparison*

Use LOO to compare models and check to see which predictors perform best.
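
To illustrate what a leave-one-out comparison measures, the sketch below computes an exact, plug-in version on simulated data: each observation is predicted from a model fitted to all the others. This is a simplification under stated assumptions. brms instead approximates LOO from posterior draws (PSIS-LOO), and the data, effect size, and variable names here are hypothetical.

```r
set.seed(7)
n <- 100
x <- rep(c(0, 1), each = n / 2)             # hypothetical two-level predictor
y <- rpois(n, lambda = exp(0.2 + 1.0 * x))  # simulated counts

# For a Poisson model with only an intercept, or with one categorical predictor,
# the maximum-likelihood fitted mean is the (group) sample mean, so refitting is cheap.
loo_lpd_null <- sapply(seq_len(n), function(i) {
  dpois(y[i], lambda = mean(y[-i]), log = TRUE)
})
loo_lpd_group <- sapply(seq_len(n), function(i) {
  same_group <- x[-i] == x[i]
  dpois(y[i], lambda = mean(y[-i][same_group]), log = TRUE)
})

# Sum of held-out log predictive densities; higher (less negative) is better
elpd_null  <- sum(loo_lpd_null)
elpd_group <- sum(loo_lpd_group)
print(c(null = elpd_null, group = elpd_group, difference = elpd_group - elpd_null))
```

A model that includes a useful predictor assigns higher probability to held-out observations, which is the quantity `loo_compare()` ranks models by.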


>**Note:** We added the LOO criterion to these models in RStudio. The models were saved from RStudio and are loaded below. The code is left as comments for reference.

In [None]:
# # Add LOO to models
# expv0 <- add_criterion(expv0, "loo", moment_match = TRUE)
# expv1 <- add_criterion(expv1, "loo", moment_match = TRUE)
# expv2 <- add_criterion(expv2, "loo", moment_match = TRUE)

# Save models with loo so you don't have to do this again
# saveRDS(expv0, file = "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv0.rds")
# saveRDS(expv1, file = "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv1.rds")
# saveRDS(expv2, file = "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv2.rds")

In [None]:
# Load models from R
expv0 <- readRDS("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv0.rds")
expv1 <- readRDS("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv1.rds")
expv2 <- readRDS("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv2.rds")


In [None]:

# Compare models with loo
loo1 <- loo_compare(expv0, expv1, expv2)

In [None]:
# Convert to dataframe
df_loo1 <- as.data.frame(loo1) %>%
  rownames_to_column(var = "Model")

# Save loo as tables
# Model names are already stored in the "Model" column, so row names are not written
write.table(df_loo1, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_loo_exp.csv", sep = ",", row.names = FALSE, col.names = TRUE)


In [None]:
# Convert each data frame to a plain HTML table string
table_1_loo <- minimal_html_table(df_loo1, caption = "LOO values - Pod Count")

my_tabs_loo <- '
<style>
/* Basic container styling */
.tabs-container {
  width: 100%;
  margin: 1em 0;
}

/* Hide the radio inputs (we only show their labels as tabs) */
.tabs-container input[type="radio"] {
  display: none;
}

/* The “tab-label” styling: looks like a tab */
.tab-label {
  display: inline-block;
  padding: 10px;
  margin-right: 2px;
  background: #eee;
  border: 1px solid #ccc;
  cursor: pointer;
  border-bottom: none;
}

/* The active tab label */
.tab-label-active {
  background: #fff;
}

/* The panel that holds table content */
.tab-content {
  border: 1px solid #ccc;
  padding: 10px;
  display: none;
}

/* For each radio input, show its corresponding content when checked */
#tab1_loo:checked ~ #content1_loo {
  display: block;
}

/* Also style the label of the checked radio as “active” using the :checked + label technique */
#tab1_loo:checked + label[for="tab1_loo"] {
  background: #fff;
  border-bottom: none;
}
</style>

<div class="tabs-container">

  <!-- 1) Tab radio + label -->
  <input type="radio" name="tabs_loo" id="tab1_loo" checked>
  <label class="tab-label" for="tab1_loo">Table 1</label>

  <!-- Content for each tab -->
  <div class="tab-content" id="content1_loo">REPLACE_WITH_table_1</div>
</div>
'

# Now do the replacements for each table
my_tabs_loo <- gsub("REPLACE_WITH_table_1", table_1_loo, my_tabs_loo)

IRdisplay::display_html(my_tabs_loo)


#### **Visualize posteriors**

We extract the final model results and visualize the posteriors of our model parameters to get an idea of the significance of the results. This is what we did above when we were evaluating our priors.

In [None]:
# Extract results
posterior_samples_expv1 <- as_draws_df(expv1)
posterior_samples_expv2 <- as_draws_df(expv2)

In [None]:

# List of datasets and labels
predictor_samples <- list(
    Count = posterior_samples_expv1
)

predictor_samples_interaction <- list(
    Count = posterior_samples_expv2
)

# Baseline category data
baseline_data <- tibble(
  parameter = "Red_Algae",
  mean = 0,  # Centered at 0
  ci_low = -0.2,  # Example CI range
  ci_high = 0.2
)

# Order categories so "Red Algae" appears at the bottom
parameter_order <- c(
  "Red_Algae",
  "b_MicrohabitatSand",
  "b_MicrohabitatSertulariidae",
  "b_MicrohabitatHydrallmania"
)

#Custom labels
custom_labels_posteriors <- c(
  "Red_Algae" = "Red Algae",
  "b_MicrohabitatSand" = "Sand",
  "b_MicrohabitatSertulariidae" = "Sertulariidae",
  "b_MicrohabitatHydrallmania" = "Hydrallmania"
)


# Convert the y-axis parameters to factors for consistent alignment
baseline_data$parameter <- factor(baseline_data$parameter, levels = parameter_order)



# Generate plots for predictors
predictor_plots <- lapply(
    names(predictor_samples),
    function(label) {
        generate_posterior_plot(
            predictor_samples[[label]],
            regex_pars = c(
                "b_MicrohabitatSand",
                "b_MicrohabitatSertulariidae",
                "b_MicrohabitatHydrallmania"
            ),
            x_range = c(-2, 2),
            custom_labels = custom_labels_posteriors,
            axis_title_y = label %in% c("Count")
        ) +
        geom_point(
            data = baseline_data,
            aes(x = mean, y = parameter),
            inherit.aes = FALSE,
            color = "dodgerblue4",
            size = 2
        )
    }
)
names(predictor_plots) <- names(predictor_samples)


predictor_plots_interaction <- lapply(
    names(predictor_samples_interaction),
    function(label) {
        generate_posterior_plot(
            predictor_samples_interaction[[label]],
            regex_pars = c(
                "^b_MicrohabitatSand$",
                "^b_MicrohabitatSertulariidae$",
                "^b_MicrohabitatHydrallmania$"
            ),
            x_range = c(-2, 2),
            custom_labels = custom_labels_posteriors,
            axis_title_y = label %in% c("Count")
        ) +
        geom_point(
          data = baseline_data,
          aes(x = mean, y = parameter),
          inherit.aes = FALSE,
          color = "dodgerblue4",
          size = 2
        )
    }
)
names(predictor_plots_interaction) <- names(predictor_samples_interaction)




# Add grey bars with labels for each plot

predictor_plots_with_bars <- mapply(
    function(plot, label) {
        patchwork::wrap_elements(create_top_bar("Pod Count")) / plot +
            patchwork::plot_layout(heights = c(0.2, 1)) # Adjust heights to reduce spacing
    },
    predictor_plots,
    names(predictor_plots),
    SIMPLIFY = FALSE
)

predictor_plots_interaction_with_bars <- mapply(
    function(plot, label) {
        patchwork::wrap_elements(create_top_bar("Pod Count\n(with Interaction Effects)")) / plot +
            patchwork::plot_layout(heights = c(0.2, 1)) # Adjust heights to reduce spacing
    },
    predictor_plots_interaction,
    names(predictor_plots_interaction),
    SIMPLIFY = FALSE
)



# Combine plots into a single-column layout
plot_posteriors_expv1 <- patchwork::wrap_plots(
    predictor_plots_with_bars[c("Count")],
    ncol = 1
) +
    patchwork::plot_layout(guides = "collect")

plot_posteriors_expv2 <- patchwork::wrap_plots(
    predictor_plots_interaction_with_bars[c("Count")],
    ncol = 1
) +
    patchwork::plot_layout(guides = "collect")


# Add a unified x-axis label
plot_posteriors_expv1 <- plot_posteriors_expv1 +
    patchwork::plot_annotation(
        caption = "Expected value of the response (log scale)",
        theme = theme(plot.caption = element_text(hjust = 0.65, size = 10))
    )

plot_posteriors_expv2 <- plot_posteriors_expv2 +
    patchwork::plot_annotation(
        caption = "Expected value of the response (log scale)",
        theme = theme(plot.caption = element_text(hjust = 0.65, size = 10))
    )


ggsave("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_posteriors_expv1.png", plot = plot_posteriors_expv1, width = 4.5, height = 6, units = "in", dpi = 300)

ggsave("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_posteriors_expv2.png", plot = plot_posteriors_expv2, width = 4.5, height = 6, units = "in", dpi = 300)


In [None]:
# Convert images to base64
plot_posteriors_expv1 <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_posteriors_expv1.png")

plot_posteriors_expv2 <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/plot_posteriors_expv2.png")

# Create the HTML 
html_posteriors <- paste0("
  <style>
    .image-row {
      display: flex;
      gap: 20px;
      justify-content: center;
      align-items: flex-start;
    }
    .image-row img {
      max-width: 48%;
      height: auto;
      border: 1px solid #ccc;
    }
  </style>
<div class='image-row'>
  <img src='", plot_posteriors_expv1, "' alt='Posterior Plot'>
  <img src='", plot_posteriors_expv2, "' alt='Posterior Plot (with interactions)'>
</div>
")

# Display the HTML
IRdisplay::display_html(html_posteriors)



The graph of the posteriors gives us an idea of the significance of each predictor. We need to follow up with an evaluation of the model performance before we can trust these results.

---

### 7. Evaluate model performance

#### **Overdispersion test**

The standard Poisson model assumes that the mean equals the variance. If this assumption is violated, it will affect our final model choice above. Thus, we need to test for over-dispersion. 
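
As a point of reference for reading the ratio, the sketch below compares simulated counts with and without overdispersion. The distributions and parameter values are hypothetical and chosen only so that both samples share the same mean.

```r
set.seed(123)
n <- 5000

# Equidispersed counts: Poisson with mean 1
y_pois <- rpois(n, lambda = 1)

# Overdispersed counts: negative binomial with the same mean
# (variance = mu + mu^2 / size, so 3 here)
y_nb <- rnbinom(n, mu = 1, size = 0.5)

# Variance-to-mean ratio, the same statistic computed for our data
dispersion_ratio <- function(y) var(y) / mean(y)

ratio_pois <- dispersion_ratio(y_pois)
ratio_nb   <- dispersion_ratio(y_nb)
print(round(c(poisson = ratio_pois, negbin = ratio_nb), 2))
```

A ratio close to 1 is what Poisson data produce, while a ratio well above 1 points toward a model with an extra dispersion parameter.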


In [None]:
# Is our data OVER-DISPERSED? 
data_exp_clean <- read.csv("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/cleaned/data_exp_clean.csv")

# Calculate mean and variance of the observed data
mean_count <- mean(data_exp_clean$Count)
var_count <- var(data_exp_clean$Count)

# Check for overdispersion
dispersion_ratio <- var_count / mean_count
print(dispersion_ratio)

# If this ratio is much greater than 1, the data are likely overdispersed; a ratio near 1 is consistent with the Poisson assumption.

---

#### **Zero inflation test**

The standard Poisson model assumes no zero inflation. If this assumption is violated, it will affect our final model choice above. Thus, we need to test for zero-inflation.
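
A quick complementary check, separate from the model comparison used in this notebook, is to compare the observed share of zeros with the share a Poisson distribution with the same mean would produce. The sketch below runs this check on simulated data; the mixing proportion and mean are hypothetical.

```r
set.seed(99)
n <- 5000
lambda <- 1

# Plain Poisson counts
y_pois <- rpois(n, lambda)

# Zero-inflated counts: 30% structural zeros mixed with Poisson counts
is_structural_zero <- rbinom(n, size = 1, prob = 0.3) == 1
y_zi <- ifelse(is_structural_zero, 0L, rpois(n, lambda))

# Observed share of zeros minus the share expected under a Poisson with the same mean
excess_zeros <- function(y) mean(y == 0) - dpois(0, lambda = mean(y))

excess_pois <- excess_zeros(y_pois)
excess_zi   <- excess_zeros(y_zi)
print(round(c(poisson = excess_pois, zero_inflated = excess_zi), 3))
```

An excess near zero suggests the Poisson distribution already accounts for the zeros, while a clearly positive excess suggests zero inflation.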


In [None]:
# # Add LOO to model
# expv1_ZI <- add_criterion(expv1_ZI, "loo", moment_match = TRUE)

# Save model with loo so you don't have to do this again
# saveRDS(expv1_ZI, file = "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv1_ZI.rds")

In [None]:
# Load models
expv1_ZI <- readRDS("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv1_ZI.rds")
expv1 <- readRDS("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv1.rds")
expv2 <- readRDS("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/models/expv2.rds")


In [None]:

# Compare models with loo
loo_ZI <- loo_compare(expv1_ZI, expv1, expv2)

In [None]:
# Convert to dataframe
df_loo_ZI <- as.data.frame(loo_ZI) %>%
  rownames_to_column(var = "Model")

# Save loo as tables
# Model names are already stored in the "Model" column, so row names are not written
write.table(df_loo_ZI, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_loo_exp_ZI.csv", sep = ",", row.names = FALSE, col.names = TRUE)

In [None]:

# Convert each data frame to a plain HTML table string
table_1_loo_ZI <- minimal_html_table(df_loo_ZI, caption = "LOO values - Pod Count")

my_tabs_loo_ZI <- '
<style>
/* Basic container styling */
.tabs-container {
  width: 100%;
  margin: 1em 0;
}

/* Hide the radio inputs (we only show their labels as tabs) */
.tabs-container input[type="radio"] {
  display: none;
}

/* The “tab-label” styling: looks like a tab */
.tab-label {
  display: inline-block;
  padding: 10px;
  margin-right: 2px;
  background: #eee;
  border: 1px solid #ccc;
  cursor: pointer;
  border-bottom: none;
}

/* The active tab label */
.tab-label-active {
  background: #fff;
}

/* The panel that holds table content */
.tab-content {
  border: 1px solid #ccc;
  padding: 10px;
  display: none;
}

/* For each radio input, show its corresponding content when checked */
#tab1_loo_ZI:checked ~ #content1_loo_ZI {
  display: block;
}

/* Also style the label of the checked radio as “active” using the :checked + label technique */
#tab1_loo_ZI:checked + label[for="tab1_loo_ZI"] {
  background: #fff;
  border-bottom: none;
}
</style>

<div class="tabs-container">

  <!-- 1) Tab radio + label -->
  <input type="radio" name="tabs_loo_ZI" id="tab1_loo_ZI" checked>
  <label class="tab-label" for="tab1_loo_ZI">Table ZI</label>

  <!-- Content for each tab -->
  <div class="tab-content" id="content1_loo_ZI">REPLACE_WITH_table_1_ZI</div>
</div>
'

# Now do the replacements for each table
my_tabs_loo_ZI <- gsub("REPLACE_WITH_table_1_ZI", table_1_loo_ZI, my_tabs_loo_ZI)

IRdisplay::display_html(my_tabs_loo_ZI)


The zero-inflated model performs worst in the LOO comparison, so we find no evidence that our data are zero-inflated.

#### **Trace plots**

Visualize parameter sampling across iterations to confirm convergence. Each chain should wander around the same mean value without any strong upward or downward trend, so that the overlaid chains resemble a "fuzzy caterpillar" or a horizontal band.


In [None]:
# Generate trace plots for all models
trace_plots_expv1 <- generate_trace_plot(
  model = expv1,
  regex_pars = c("b_MicrohabitatSand", "b_MicrohabitatSertulariidae", "b_MicrohabitatHydrallmania"),
  plot_title = "Pod Count Model",
  axis_title_y = TRUE,
  axis_text_x = TRUE,
  axis_title_size = 10,  # Increase axis title size
  axis_text_size = 8    # Increase axis tick label size
)

trace_plots_expv2 <- generate_trace_plot(
  model = expv2,
  regex_pars = c("^b_MicrohabitatSand$", "^b_MicrohabitatSertulariidae$", "^b_MicrohabitatHydrallmania$"),
  plot_title = "Pod Count Model (with Interaction Effects)",
  axis_title_y = TRUE,
  axis_text_x = TRUE,
  axis_title_size = 10,  # Increase axis title size
  axis_text_size = 8    # Increase axis tick label size
)


ggsave("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/trace_plots_expv1.png", plot = trace_plots_expv1, width = 6, height = 3, units = "in", dpi = 300)

ggsave("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/trace_plots_expv2.png", plot = trace_plots_expv2, width = 6, height = 3, units = "in", dpi = 300)



In [None]:
# Convert images to base64 (if not already done)
trace_plots_expv1 <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/trace_plots_expv1.png")
trace_plots_expv2     <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/trace_plots_expv2.png")


# Create the HTML 
html_trace_plots <- paste0("
<style>
  .grid-container {
    display: grid;
    grid-template-columns: repeat(1, 1fr); /* 1 columns per row */
    gap: 3px;
    padding: 3px;
    justify-items: center;
  }

  .grid-container img {
    max-width: 600px;
    width: 100%;
    height: auto;
    border: 1px solid #ccc;
  }
</style>

<div class='grid-container'>
  <img src='", trace_plots_expv1, "' alt='Trace Plot'>
  <img src='", trace_plots_expv2, "' alt='Trace Plot (with interactions)'>
</div>
")

# Display the HTML
IRdisplay::display_html(html_trace_plots)



#### **Posterior predictive checks**

Simulate data from the fitted models and compare them to the observed data to verify goodness-of-fit. We do this using "pp_check". The observed data (black line/dots) should sit comfortably within the distribution of simulated data (colored areas or lines).
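
The idea behind a posterior predictive check can also be sketched numerically with a single test statistic. The example below is an illustration under stated assumptions: the "observed" counts are simulated, the posterior draws in `lambda_draws` are made up rather than taken from `expv1` or `expv2`, and the variance is an arbitrary choice of statistic. It is not what the custom `generate_pp_check()` wrapper does internally, which is not shown in this notebook.

```r
# Minimal sketch with simulated data and hypothetical posterior draws.
set.seed(3)

# Stand-in "observed" counts (simulated, overdispersed on purpose)
y_obs <- rnbinom(200, mu = 4, size = 1)

# Hypothetical posterior draws for a Poisson mean, centred on the sample mean
lambda_draws <- rnorm(1000, mean = mean(y_obs), sd = 0.15)
lambda_draws <- pmax(lambda_draws, 0.01)  # keep the Poisson mean positive

# One replicated data set per posterior draw; record its variance
var_rep <- vapply(
  lambda_draws,
  function(lambda) var(rpois(length(y_obs), lambda)),
  numeric(1)
)

# Share of replicated data sets at least as variable as the observed data
ppc_p_value <- mean(var_rep >= var(y_obs))
print(ppc_p_value)
```

A value close to 0 or 1 means the replicated data rarely reproduce the chosen statistic, which signals misfit. Here the Poisson replicates should be far less variable than the overdispersed stand-in data.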

In [None]:
# Generate posterior predictive check plots for all models
pp_check_plots_expv1 <-  generate_pp_check(
  model = expv1,
  nreps = 100,
  axis_title_y = TRUE,
  y_label = "Density",
  plot_title = "Pod Occurrence Model",
  axis_title_size = 10,  # custom title size
  axis_text_size = 8    # custom tick label size
)

pp_check_plots_expv2 <-  generate_pp_check(
  model = expv2,
  nreps = 100,
  axis_title_y = TRUE,
  y_label = "Density",
  plot_title = "Pod Occurrence (with Interactions) Model",
  axis_title_size = 10,  # custom title size
  axis_text_size = 8    # custom tick label size
)

ggsave("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/pp_check_plots_expv1.png", plot = pp_check_plots_expv1, width = 3, height = 3, units = "in", dpi = 300)

ggsave("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/pp_check_plots_expv2.png", plot = pp_check_plots_expv2, width = 3, height = 3, units = "in", dpi = 300)

In [None]:
# Convert images to base64
pp_check_plots_expv1 <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/pp_check_plots_expv1.png")
pp_check_plots_expv2 <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/pp_check_plots_expv2.png")


# Create the HTML (horizontal display)
html_pp_check_plots <- paste0("
  <style>
    .image-row {
      display: flex;
      gap: 20px;
      justify-content: center;
      align-items: flex-start;
    }
    .image-row img {
      max-width: 48%;
      height: auto;
      border: 1px solid #ccc;
    }
  </style>
<div class='image-row'>
  <img src='", pp_check_plots_expv1, "' alt='Pod Count PP Plot'>
  <img src='", pp_check_plots_expv2, "' alt='Pod Count (with Interactions) PP Plot'>
</div>
")

IRdisplay::display_html(html_pp_check_plots)



#### **Check convergence**

Check that all $\hat{R}$ values are close to 1, indicating good convergence.
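
To make the quantity concrete, the sketch below computes the classic (non-split) potential scale reduction factor from simulated chains. The function `basic_rhat()` is a hypothetical helper written for this illustration. The values reported by `rhat()` for a fitted model come from a more refined estimator, so the numbers are not expected to match exactly.

```r
# Minimal sketch on simulated chains (not draws from expv1 or expv2).
set.seed(4)

# Classic (non-split) potential scale reduction factor.
# `draws` is an iterations x chains matrix for one parameter.
basic_rhat <- function(draws) {
  n <- nrow(draws)
  within_var <- mean(apply(draws, 2, var))   # W: mean within-chain variance
  between_var <- n * var(colMeans(draws))    # B: between-chain variance
  pooled_var <- (n - 1) / n * within_var + between_var / n
  sqrt(pooled_var / within_var)
}

# Four chains sampling the same distribution
mixed <- matrix(rnorm(4000), ncol = 4)

# Same draws, but the fourth chain is stuck around a different value
stuck <- mixed
stuck[, 4] <- stuck[, 4] + 3

rhat_mixed <- basic_rhat(mixed)
rhat_stuck <- basic_rhat(stuck)

print(c(mixed = rhat_mixed, stuck = rhat_stuck))
```

Well-mixed chains should give a value very close to 1, while the chain that sits at a different value should push the statistic well above 1.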


In [None]:
# Create a helper function
extract_rhat <- function(model, model_name) {
  rhat(model) %>%
    as.data.frame() %>%
    rownames_to_column(var = "Parameter") %>%
    dplyr::filter(startsWith(Parameter, "b_")) %>%   # <-- keep only b_ terms
    dplyr::rename(Rhat = 2) %>%
    dplyr::mutate(Model = model_name) %>%
    dplyr::mutate(across(where(is.numeric), ~ signif(.x, digits = 3)))
}

# Extract for each model group
# Model without interaction effects
rhat1 <- extract_rhat(expv1, "Count")

# Model with interaction effects
rhat2 <- extract_rhat(expv2, "Count (interaction)")

In [None]:
# Save tables
write.table(rhat1, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_rhat_expv1.csv", sep = ",", row.names = FALSE, col.names = TRUE)
write.table(rhat2, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_rhat_expv2.csv", sep = ",", row.names = FALSE, col.names = TRUE)

In [None]:
# Convert each data frame to a plain HTML table string
table_1_rhat <- minimal_html_table(rhat1, caption = "Rhat values - Counts")
table_2_rhat <- minimal_html_table(rhat2, caption = "Rhat values - Counts with interaction effects")

my_tabs_rhat <- '
<style>
/* Basic container styling */
.tabs-container {
  width: 100%;
  margin: 1em 0;
}

/* Hide the radio inputs (we only show their labels as tabs) */
.tabs-container input[type="radio"] {
  display: none;
}

/* The “tab-label” styling: looks like a tab */
.tab-label {
  display: inline-block;
  padding: 10px;
  margin-right: 2px;
  background: #eee;
  border: 1px solid #ccc;
  cursor: pointer;
  border-bottom: none;
}

/* The active tab label */
.tab-label-active {
  background: #fff;
}

/* The panel that holds table content */
.tab-content {
  border: 1px solid #ccc;
  padding: 10px;
  display: none;
}

/* For each radio input, show its corresponding content when checked */
#tab1_rhat:checked ~ #content1_rhat,
#tab2_rhat:checked ~ #content2_rhat {
  display: block;
}

/* Also style the label of the checked radio as “active” using the :checked + label technique */
#tab1_rhat:checked + label[for="tab1_rhat"],
#tab2_rhat:checked + label[for="tab2_rhat"] {
  background: #fff;
  border-bottom: none;
}
</style>

<div class="tabs-container">

  <!-- 1) Tab radio + label -->
  <input type="radio" name="tabs_rhat" id="tab1_rhat" checked>
  <label class="tab-label" for="tab1_rhat">Table 1</label>

  <!-- 2) Tab radio + label -->
  <input type="radio" name="tabs_rhat" id="tab2_rhat">
  <label class="tab-label" for="tab2_rhat">Table 2</label>

  <!-- Content for each tab -->
  <div class="tab-content" id="content1_rhat">REPLACE_WITH_table_1</div>
  <div class="tab-content" id="content2_rhat">REPLACE_WITH_table_2</div>
</div>
'

# Now do the replacements for each table
my_tabs_rhat <- gsub("REPLACE_WITH_table_1", table_1_rhat, my_tabs_rhat)
my_tabs_rhat <- gsub("REPLACE_WITH_table_2", table_2_rhat, my_tabs_rhat)

IRdisplay::display_html(my_tabs_rhat)




#### **Check uncertainty**

Extract parameter estimates and their credible intervals to assess the strength of evidence for each predictor's effect on pod count. We check 85% and 95% credible intervals. Summaries are displayed in tables for all models.
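
The sketch below shows how equal-tailed intervals of this kind can be computed directly from posterior draws, and how a coefficient can be read as a multiplicative change in the expected count. It assumes a log link, as in a Poisson model, and uses made-up draws in `beta_draws` rather than draws from `expv1` or `expv2`. The helper `credible_interval()` is a hypothetical name introduced for this example.

```r
# Minimal sketch with hypothetical posterior draws.
set.seed(5)

# Hypothetical posterior draws for one coefficient on the log scale
beta_draws <- rnorm(4000, mean = 0.4, sd = 0.2)

# Equal-tailed credible interval with probability mass `prob`
credible_interval <- function(draws, prob) {
  tail_mass <- (1 - prob) / 2
  unname(quantile(draws, probs = c(tail_mass, 1 - tail_mass)))
}

ci_85 <- credible_interval(beta_draws, 0.85)
ci_95 <- credible_interval(beta_draws, 0.95)

# With a log link, exp(beta) is a multiplicative change in the expected count
rate_ratio_95 <- exp(ci_95)

print(rbind(ci_85 = ci_85, ci_95 = ci_95, rate_ratio_95 = rate_ratio_95))
```

The 85% interval always sits inside the 95% interval. An interval that excludes 0 on the log scale corresponds to a rate-ratio interval that excludes 1.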

In [None]:
# Extract summaries for each variable
extract_summary <- function(model, prob_85, prob_95) {
    summary_85 <- summary(model, prob = prob_85)
    summary_95 <- summary(model, prob = prob_95)

    as.data.frame(summary_85$fixed) %>%
        dplyr::select("Estimate", "Est.Error", "l-85% CI", "u-85% CI") %>%
        mutate(
            "l-95% CI" = summary_95$fixed$`l-95% CI`,
            "u-95% CI" = summary_95$fixed$`u-95% CI`
        ) %>%
        mutate(across(where(is.numeric), ~ signif(.x, digits = 3))) %>%
        rownames_to_column(var = "Parameter") # Add rownames as Parameter column
}

uncertainty1 <- extract_summary(expv1, prob_85 = 0.85, prob_95 = 0.95)
uncertainty2 <- extract_summary(expv2, prob_85 = 0.85, prob_95 = 0.95)


In [None]:
# Save tables
# Parameter names are already stored in the "Parameter" column, so row names are not written
write.table(uncertainty1, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_uncertainty_expv1.csv", sep = ",", row.names = FALSE, col.names = TRUE)

write.table(uncertainty2, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_uncertainty_expv2.csv", sep = ",", row.names = FALSE, col.names = TRUE)


In [None]:

# Convert each data frame to a plain HTML table string
table_1_uncertainty <- minimal_html_table(uncertainty1, caption = "Uncertainty values - Counts")
table_2_uncertainty <- minimal_html_table(uncertainty2, caption = "Uncertainty values - Counts with interaction effects")

my_tabs_uncertainty <- '
<style>
/* Basic container styling */
.tabs-container {
  width: 100%;
  margin: 1em 0;
}

/* Hide the radio inputs (we only show their labels as tabs) */
.tabs-container input[type="radio"] {
  display: none;
}

/* The “tab-label” styling: looks like a tab */
.tab-label {
  display: inline-block;
  padding: 10px;
  margin-right: 2px;
  background: #eee;
  border: 1px solid #ccc;
  cursor: pointer;
  border-bottom: none;
}

/* The active tab label */
.tab-label-active {
  background: #fff;
}

/* The panel that holds table content */
.tab-content {
  border: 1px solid #ccc;
  padding: 10px;
  display: none;
}

/* For each radio input, show its corresponding content when checked */
#tab1_uncertainty:checked ~ #content1_uncertainty,
#tab2_uncertainty:checked ~ #content2_uncertainty {
  display: block;
}

/* Also style the label of the checked radio as “active” using the :checked + label technique */
#tab1_uncertainty:checked + label[for="tab1_uncertainty"],
#tab2_uncertainty:checked + label[for="tab2_uncertainty"] {
  background: #fff;
  border-bottom: none;
}
</style>

<div class="tabs-container">

  <!-- 1) Tab radio + label -->
  <input type="radio" name="tabs_uncertainty" id="tab1_uncertainty" checked>
  <label class="tab-label" for="tab1_uncertainty">Table 1</label>

  <!-- 2) Tab radio + label -->
  <input type="radio" name="tabs_uncertainty" id="tab2_uncertainty">
  <label class="tab-label" for="tab2_uncertainty">Table 2</label>

  <!-- Content for each tab -->
  <div class="tab-content" id="content1_uncertainty">REPLACE_WITH_table_1</div>
  <div class="tab-content" id="content2_uncertainty">REPLACE_WITH_table_2</div>
</div>
'

# Now do the replacements for each table
my_tabs_uncertainty <- gsub("REPLACE_WITH_table_1", table_1_uncertainty, my_tabs_uncertainty)
my_tabs_uncertainty <- gsub("REPLACE_WITH_table_2", table_2_uncertainty, my_tabs_uncertainty)

IRdisplay::display_html(my_tabs_uncertainty)


---

#### **Check posterior probabilities**

Now we can evaluate our hypotheses using the posterior distributions of the model parameters. Remember, we are concerned with two things: 
1. Does the number of pods differ among microhabitats? (Microhabitat selection effect)
2. Does the number of pods change over time? (Time effect)

We will calculate the posterior probability of these effects.
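
As a small illustration of the calculation, the sketch below uses made-up draws in `beta_draws` rather than draws from the fitted models. The posterior probability of a direction is simply the share of draws on that side of zero. The 20% threshold in the last step is an arbitrary value chosen for the example and assumes coefficients on the log scale.

```r
# Minimal sketch with hypothetical posterior draws.
set.seed(6)

# Hypothetical posterior draws for a coefficient relative to the reference level
beta_draws <- rnorm(4000, mean = 0.3, sd = 0.2)

# Posterior probability of direction: share of draws on each side of zero
p_positive <- mean(beta_draws > 0)
p_negative <- mean(beta_draws < 0)

# Posterior probability that the effect exceeds a chosen threshold,
# here a 20% increase in the expected count (log(1.2) on the log scale)
p_above_20pct <- mean(beta_draws > log(1.2))

print(c(p_positive = p_positive, p_negative = p_negative, p_above_20pct = p_above_20pct))
```

Because the draws are continuous, the two directional probabilities sum to 1, so reporting one of them is enough in practice.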

In [None]:
##### **Posterior Probability of Microhabitat Effect**

draws <- as_draws_df(expv1)

# Posterior probability that Pod count on microhabitat is > Red Algae (reference microhabitat)
pp_Hydrallmania_positive <- mean(draws$b_MicrohabitatHydrallmania > 0)
pp_Sertulariidae_positive <- mean(draws$b_MicrohabitatSertulariidae > 0)
pp_Sand_positive <- mean(draws$b_MicrohabitatSand > 0)

# Posterior probability that Pod count on microhabitat is < Red Algae (reference microhabitat)
pp_Hydrallmania_negative <- mean(draws$b_MicrohabitatHydrallmania < 0)
pp_Sertulariidae_negative <- mean(draws$b_MicrohabitatSertulariidae < 0)
pp_Sand_negative <- mean(draws$b_MicrohabitatSand < 0)


# Create a summary table
pp_summary_expv1_microhabitat <- data.frame(
  Hypothesis = c(
    "P(Hydrallmania effect > 0)",
    "P(Sand effect > 0)",
    "P(Sertulariidae effect > 0)",
    "P(Hydrallmania effect < 0)",
    "P(Sand effect < 0)",
    "P(Sertulariidae effect < 0)"
  ),
  PosteriorProbability = c(
    pp_Hydrallmania_positive,
    pp_Sand_positive,
    pp_Sertulariidae_positive,
    pp_Hydrallmania_negative,
    pp_Sand_negative,
    pp_Sertulariidae_negative
  )
)

# Save table
write.table(pp_summary_expv1_microhabitat, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_postprob_expv1_microhabitat.csv", sep = ",", row.names = FALSE, col.names = TRUE)

In [None]:
# Generate HTML table
html_table <- minimal_html_table(pp_summary_expv1_microhabitat, caption = "Posterior probabilities for Microhabitat Effect")

# Display it as HTML
IRdisplay::display_html(html_table)

In [None]:
##### **Posterior Probability of Time Effect**

draws <- as_draws_df(expv1)

# Posterior probability that time has a **positive** effect
pp_time_positive <- mean(draws$b_Time > 0)

# Posterior probability that time has a **negative** effect
pp_time_negative <- mean(draws$b_Time < 0)


# Create a summary table
pp_summary_expv1_time <- data.frame(
  Hypothesis = c(
    "P(Time effect > 0)",
    "P(Time effect < 0)"
  ),
  PosteriorProbability = c(
    pp_time_positive,
    pp_time_negative
  )
)

# Save table
write.table(pp_summary_expv1_time, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_postprob_expv1_time.csv", sep = ",", row.names = FALSE, col.names = TRUE)


In [None]:
# Generate HTML table
html_table <- minimal_html_table(pp_summary_expv1_time, caption = "Posterior probabilities for Time Effect")

# Display it as HTML
IRdisplay::display_html(html_table)

In [None]:
##### **Posterior Probability of Microhabitat Effect (with interactions)**

draws <- as_draws_df(expv2)

# Posterior probability that Pod count on microhabitat is > Red Algae (reference microhabitat)
pp_Hydrallmania_positive <- mean(draws[["b_MicrohabitatHydrallmania"]] > 0)
pp_Sertulariidae_positive <- mean(draws[["b_MicrohabitatSertulariidae"]] > 0)
# Store the Sand main effect under the name used in the summary table below
pp_Sand_positive <- mean(draws[["b_MicrohabitatSand"]] > 0)

# Posterior probability that Pod count on microhabitat is < Red Algae (reference microhabitat)
pp_Hydrallmania_negative <- mean(draws[["b_MicrohabitatHydrallmania"]] < 0)
pp_Sertulariidae_negative <- mean(draws[["b_MicrohabitatSertulariidae"]] < 0)
pp_Sand_negative <- mean(draws[["b_MicrohabitatSand"]] < 0)


# Create a summary table
pp_summary_expv2_microhabitat <- data.frame(
  Hypothesis = c(
    "P(Hydrallmania effect > 0)",
    "P(Sand effect > 0)",
    "P(Sertulariidae effect > 0)",
    "P(Hydrallmania effect < 0)",
    "P(Sand effect < 0)",
    "P(Sertulariidae effect < 0)"

  ),
  PosteriorProbability = c(
    pp_Hydrallmania_positive,
    pp_Sand_positive,
    pp_Sertulariidae_positive,
    pp_Hydrallmania_negative,
    pp_Sand_negative,
    pp_Sertulariidae_negative
  )
)

# Save table
write.table(pp_summary_expv2_microhabitat, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_postprob_expv2_microhabitat.csv", sep = ",", row.names = FALSE, col.names = TRUE)


In [None]:
# Generate HTML table
html_table <- minimal_html_table(pp_summary_expv2_microhabitat, caption = "Posterior probabilities for Microhabitat Effect with interaction effects")

# Display it as HTML
IRdisplay::display_html(html_table)

In [None]:

##### **Posterior Probability of Time Effect (with interactions)**

draws <- as_draws_df(expv2)

# Posterior probability that time has a **positive** effect
pp_time_positive <- mean(draws$b_Time > 0)

# Posterior probability that time has a **negative** effect
pp_time_negative <- mean(draws$b_Time < 0)


# Create a summary table
pp_summary_expv2_time <- data.frame(
  Hypothesis = c(
    "P(Time effect > 0)",
    "P(Time effect < 0)"
  ),
  PosteriorProbability = c(
    pp_time_positive,
    pp_time_negative
  )
)

# Save table
write.table(pp_summary_expv2_time, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_postprob_expv2_time.csv", sep = ",", row.names = FALSE, col.names = TRUE)

In [None]:

# Generate HTML table
html_table <- minimal_html_table(pp_summary_expv2_time, caption = "Posterior probabilities for Time Effect with interaction effects")

# Display it as HTML
IRdisplay::display_html(html_table)

In [None]:
##### **Posterior Probability of Microhabitat*Time Effect**

draws <- as_draws_df(expv2)

# Posterior probability that Pod count on microhabitat over time is > Red Algae (reference microhabitat)
pp_Hydrallmania_Time_positive <- mean(draws[["b_MicrohabitatHydrallmania:Time"]] > 0)
pp_Sertulariidae_Time_positive <- mean(draws[["b_MicrohabitatSertulariidae:Time"]] > 0)
pp_Sand_Time_positive <- mean(draws[["b_MicrohabitatSand:Time"]] > 0)

# Posterior probability that Pod count on microhabitat over time is < Red Algae (reference microhabitat)
pp_Hydrallmania_Time_negative <- mean(draws[["b_MicrohabitatHydrallmania:Time"]] < 0)
pp_Sertulariidae_Time_negative <- mean(draws[["b_MicrohabitatSertulariidae:Time"]] < 0)
pp_Sand_Time_negative <- mean(draws[["b_MicrohabitatSand:Time"]] < 0)


# Create a summary table
pp_summary_expv2_microhabitat_time <- data.frame(
  Hypothesis = c(
    "P(Hydrallmania:Time effect > 0)",
    "P(Sand:Time effect > 0)",
    "P(Sertulariidae:Time effect > 0)",
    "P(Hydrallmania:Time effect < 0)",
    "P(Sand:Time effect < 0)",
    "P(Sertulariidae:Time effect < 0)"
  ),
  PosteriorProbability = c(
    pp_Hydrallmania_Time_positive,
    pp_Sand_Time_positive,
    pp_Sertulariidae_Time_positive,
    pp_Hydrallmania_Time_negative,
    pp_Sand_Time_negative,
    pp_Sertulariidae_Time_negative
  )
)

# Save table
write.table(pp_summary_expv2_microhabitat_time, "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/data/tables/table_postprob_expv2_microhabitat_time.csv", sep = ",", row.names = FALSE, col.names = TRUE)


In [None]:
# Generate HTML table
html_table <- minimal_html_table(pp_summary_expv2_microhabitat_time, caption = "Posterior probabilities for Microhabitat*Time Effect")

# Display it as HTML
IRdisplay::display_html(html_table)


---

### 8. Summarize results
On average, there are more individuals on Hydrallmania than on Red Algae, and fewer individuals on Sertulariidae than on Red Algae. They clearly prefer any branching microhabitat over Sand. However, Time was not a significant variable affecting Count for any substrate except Sand, meaning individuals were not moving appreciably between branching microhabitats once they were on them. Thus, their microhabitat preference was likely driven by the first microhabitat they landed on at the start of the experiment, or the first microhabitat they crawled onto from the Sand. This might be a product of the experimental setup (e.g., poor acclimation to chambers) or a natural clinging behavior.

We did not have enough material to test hydroids with and without bryozoans, unfortunately.

This experiment likely does not contribute much useful information, so perhaps it should be excluded from the final results?

#### **Make final figures**

Generate tidy figures for the Results section of our report/paper. We will limit the results to showing only Microhabitat categories and add the intercept category at 0 for comparison.

In [None]:

# Define the predictor list
predictors_exp <- list(
  list(
    name = "Microhabitat",
    model = "m2",
    baseline = tibble(parameter = "Red_Algae", mean = 0),
    order = c("b_MicrohabitatSand", "Red_Algae", "b_MicrohabitatSertulariidae", "b_MicrohabitatHydrallmania"),
    labels = c(
      "b_MicrohabitatSand" = "Sand",
      "Red_Algae" = "Red algae",
      "b_MicrohabitatSertulariidae" = "Sertulariidae",
      "b_MicrohabitatHydrallmania" = "Hydrallmania"
    ),
    regex_pars = c(
      "^b_MicrohabitatSand$",
      "^b_MicrohabitatSertulariidae$",
      "^b_MicrohabitatHydrallmania$"
    )
  ),
  list(
    name = "Time",
    model = "m2",
    baseline = NULL,
    order = c("b_Time"),
    labels = c("b_Time" = "Time"),
    regex_pars = c("^b_Time$")
  )
)

# Posterior samples by model group
posterior_samples_all <- list(
  Count = posterior_samples_expv1,
  Count_interaction = posterior_samples_expv2
)

# Extraction function (updated for optional baseline)
extract_effects <- function(posterior_df, predictor_cfg, model_label) {
  matched_cols <- posterior_df %>%
    dplyr::select(matches(paste(predictor_cfg$regex_pars, collapse = "|"))) %>%
    colnames()

  if (length(matched_cols) == 0) {
    stop(paste("No columns matched for predictor:", predictor_cfg$name))
  }

  draws <- posterior_df %>%
    dplyr::select(all_of(matched_cols)) %>%
    pivot_longer(cols = everything(), names_to = "parameter", values_to = "value") %>%
    mutate(
      label = predictor_cfg$labels[parameter],
      predictor = predictor_cfg$name,
      group = model_label
    )

  if (!is.null(predictor_cfg$baseline)) {
    baseline <- predictor_cfg$baseline %>%
      mutate(
        label = predictor_cfg$labels[parameter],
        predictor = predictor_cfg$name,
        group = model_label,
        value = mean
      )
    out <- bind_rows(draws, baseline)
  } else {
    out <- draws
  }

  return(out)
}

# Build combined plot data
plot_data_exp <- map_dfr(
  names(posterior_samples_all),
  function(model_label) {
    map_dfr(
      predictors_exp,
      function(predictor_cfg) {
        extract_effects(
          posterior_df = posterior_samples_all[[model_label]],
          predictor_cfg = predictor_cfg,
          model_label = model_label
        )
      }
    )
  }
)

# Relabel models for display
plot_data_exp$group <- factor(
  plot_data_exp$group,
  levels = c("Count", "Count_interaction"),
  labels = c("Additive model", "Interaction model")
)

# Set y-axis label order (ggplot2 draws the first factor level at the bottom of the axis)
label_order_exp <- c("Sand", "Red algae", "Sertulariidae", "Hydrallmania", "Time")
plot_data_exp$label <- factor(plot_data_exp$label, levels = label_order_exp)

# Final plot
plot_posteriors_expv1_expv2 <- ggplot(
  plot_data_exp, 
  aes(x = value, y = label, fill = predictor, color = predictor)
) +
  stat_halfeye(
    .width = c(0.85, 0.95),
    slab_alpha = 0.4,
    interval_size_range = c(0.75, 0),
    normalize = "groups",
    slab_linewidth = 0.6,  # outline width of the density slab
    point_size = 1.25
  ) +
  geom_point(
    data = filter(plot_data_exp, !is.na(value) & !is.na(mean)),
    aes(x = mean),
    inherit.aes = TRUE,
    shape = 21,
    size = 1,
    stroke = 0.3,
    fill = "white"  # white center with colored border
  ) +
  geom_vline(xintercept = 0, linewidth = 0.6, linetype = 3, color = "red") +  # `linewidth` replaces `size` for lines in ggplot2 >= 3.4.0
  facet_wrap(~ group, ncol = 2, labeller = label_value) +
  scale_fill_manual(
    values = c(
      "Microhabitat" = "forestgreen",
      "Time" = "darkgrey"
    )
  ) +
  scale_color_manual(
    values = c(
      "Microhabitat" = "forestgreen",
      "Time" = "darkgrey"
    )
  ) +
  labs(
    x = "Effect size (log scale)",  # count models use a log link, not a logit link
    y = NULL
  ) +
  theme_bw(base_size = 8) +
  theme(
    strip.text = element_text(face = "bold", size = 10),
    axis.text.y = element_text(size = 8),
    panel.spacing = unit(0.5, "lines"),
    panel.grid.major.x = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = "none"
  )



# Save plot
ggsave(
  "C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/final_plot_posteriors_expv1_expv2.png",
  plot = plot_posteriors_expv1_expv2,
  width = 7.5, height = 4.5, units = "in", dpi = 300
)


In [None]:
# Convert images to base64
final_plot_posteriors_expv1_expv2 <- knitr::image_uri("C:/Users/bmc82/Documents/UF/PhD_Projects/DISSERTATION_MANUSCRIPT/Chapter_3/chapter3_data_analysis/images/final_plot_posteriors_expv1_expv2.png")


# Create the HTML 
html_posteriors_final <- paste0("
  <style>
    .image-row {
      display: flex;
      gap: 20px;
      justify-content: center;
      align-items: flex-start;
    }
    .image-row img {
      max-width: 100%;
      height: auto;
      border: 1px solid #ccc;
    }
  </style>
<div class='image-row'>
  <img src='", final_plot_posteriors_expv1_expv2, "' alt='Final Posterior Plot'>
</div>
")

# Display the HTML
IRdisplay::display_html(html_posteriors_final)