# SuppMat 2: Effects of Network Reconstruction on Food-Web Structure

Tanya Strydom ([ORCID](https://orcid.org/0000-0001-6067-1349))  
February 26, 2026

Expanded results section.

# Effect of Body Size Sampling Method on Network Metrics

To assess whether the choice of body size sampling distribution (uniform, lognormal, or truncated lognormal) influences the estimated structure of ecological networks, we computed partial eta-squared (η²) values for each network metric within each reconstruction model. This approach isolates the effect of body size distribution while controlling for model-specific variation.

Across all models and network metrics, the effect of body size sampling method was very small. Most η² values were below 0.01, indicating that the choice of distribution had negligible influence on metrics such as connectance, generality, vulnerability, and the motif counts. The exceptions were a few metrics in the ATN model, where effects were larger (*e.g.,* η² ≈ 0.17 for the number of linear chains).

These findings support using any of the tested body size sampling approaches for our simulations, because the structural conclusions drawn from the networks are largely robust to this methodological choice. Consequently, analyses of network metrics and comparisons across reconstruction models are unlikely to be confounded by the specific form of the synthetic body size distribution.
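As a rough illustration of the effect-size calculation described above, the sketch below computes partial η² for a body size sampling factor within a single reconstruction model. The data are simulated, and the column names (`size_method`, `connectance`) and the helper `partial_eta_sq()` are hypothetical; this is not the code used for the analysis.

```r
# Hypothetical data: one metric (connectance) from ONE reconstruction model,
# simulated under three body size sampling schemes. All values are made up.
set.seed(1)
dat <- data.frame(
  size_method = factor(rep(c("uniform", "lognormal", "truncated_lognormal"),
                           each = 50)),
  connectance = rnorm(150, mean = 0.20, sd = 0.02)
)

# Partial eta-squared for one term: SS_effect / (SS_effect + SS_residual).
partial_eta_sq <- function(fit, term) {
  tab <- anova(fit)
  ss_effect <- tab[term, "Sum Sq"]
  ss_resid  <- tab["Residuals", "Sum Sq"]
  ss_effect / (ss_effect + ss_resid)
}

fit <- lm(connectance ~ size_method, data = dat)
eta_size <- partial_eta_sq(fit, "size_method")
eta_size
```

With a single factor, as here, partial η² and classical η² coincide; repeating the fit per metric and per model would give one value per bar in a plot like Figure S1.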

## Figure S1. Effect of Body Size Sampling Method on Network Metrics

![](attachment:figures/effect_size_bodysize_within_model.png)

Metrics where η² was notably higher are labelled directly on the bars. The figure highlights that, for the majority of network properties and models, η² values are extremely low (\<0.01), confirming that the body size distribution choice has negligible impact on network structure. A few exceptions appear for the ATN model, but these are limited to specific metrics (*e.g.,* number of linear chains).

# Multivariate analysis of network structure

We quantified food-web structure using a suite of macro-, meso-, and micro-scale network metrics capturing global topology, motif composition, and species-level interaction patterns (Table S1). Differences among reconstruction approaches were assessed using a multivariate analysis of variance (MANOVA), with model identity as a fixed factor and the full set of network metrics as response variables. Pillai’s trace was used to assess overall multivariate significance due to its robustness to violations of multivariate normality. To identify the multivariate axes driving differences among models, we performed canonical discriminant analysis (CDA) on the MANOVA model. Canonical variates represent orthogonal linear combinations of network metrics that maximize separation among reconstruction approaches. The contribution of individual metrics to each canonical variate was quantified using canonical structure coefficients (correlations between original metrics and canonical scores).
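The MANOVA and canonical discriminant steps can be sketched in base R as follows. This is a minimal illustration on simulated data, assuming three hypothetical models (`A`, `B`, `C`) and three metrics; the canonical variates are obtained here from an eigen-decomposition of $W^{-1}B$, which may differ in scaling and sign from the output of dedicated CDA software.

```r
set.seed(42)
# Hypothetical replicates: 3 reconstruction models x 40 networks, 3 metrics.
model <- factor(rep(c("A", "B", "C"), each = 40))
shift <- as.numeric(model) - 1
metrics <- cbind(
  connectance   = rnorm(120, 0.20 + 0.05 * shift,   0.02),
  generality    = rnorm(120, 1.00 + 0.30 * shift^2, 0.15),
  vulnerability = rnorm(120, 1.00 - 0.20 * shift,   0.10)
)

# MANOVA with model identity as the fixed factor, tested with Pillai's trace.
fit <- manova(metrics ~ model)
man <- summary(fit, test = "Pillai")
pillai <- man$stats[1, "Pillai"]

# Canonical discriminant analysis: eigen-decomposition of W^-1 B, where
# B and W are the among-model and residual SSCP matrices from the MANOVA.
B <- man$SS[[1]]
W <- man$SS[[2]]
eig <- eigen(solve(W) %*% B)
n_can <- min(ncol(metrics), nlevels(model) - 1)  # number of canonical variates
lambda <- Re(eig$values[seq_len(n_can)])
vecs <- Re(eig$vectors[, seq_len(n_can), drop = FALSE])

# Canonical scores, and structure coefficients as metric-score correlations.
scores <- scale(metrics, center = TRUE, scale = FALSE) %*% vecs
colnames(scores) <- paste0("Can", seq_len(n_can))
structure_coef <- cor(metrics, scores)
round(structure_coef, 2)
```

Pillai's trace can be recovered from the canonical eigenvalues as the sum of λ / (1 + λ), which offers a quick consistency check between the two steps.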

For visualization, canonical scores were plotted using linear discriminant analysis (LDA), which yields an equivalent discriminant subspace under equal group priors. Model separation in canonical space was visualized using convex hulls encompassing all network replicates for each reconstruction approach. Univariate analyses of variance and effect sizes (partial η²) were calculated for individual metrics and are reported in this supplement for descriptive comparison. Pairwise interaction turnover was quantified using link-based beta diversity, which measures dissimilarity in the identity of trophic interactions between networks and captures differences arising from species turnover or changes in interactions among shared species.
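To make the link-based beta diversity concrete, the sketch below compares two small webs by the identity of their links. The text does not state which dissimilarity index was used, so a Jaccard-type index is assumed here; the species, links, and the helper `link_beta()` are hypothetical.

```r
# Each food web is a character vector of "consumer->resource" links.
# Species names and links are hypothetical.
links_a <- c("fox->rabbit", "fox->mouse", "owl->mouse")
links_b <- c("fox->rabbit", "owl->mouse", "owl->vole")

# Jaccard-type link dissimilarity:
# 1 - (number of shared links) / (number of links found in either web).
# Assumption: the index actually used in the analysis may differ.
link_beta <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  1 - length(intersect(a, b)) / length(union(a, b))
}

link_beta(links_a, links_b)  # 2 shared links out of 4 distinct links -> 0.5
```

In this toy case one unshared link (`owl->vole`) involves a species absent from the first web, while the other (`fox->mouse`) is lost between species present in both, so the index reflects both sources of turnover named in the text.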

# Quantification of simulation outcomes and model concordance

To evaluate how reconstruction framework influenced inferred extinction dynamics, simulated community states were compared against observed or expected reference states using two complementary approaches. First, deviation in continuous network metrics (e.g., connectance, mean trophic level, modularity) was quantified using mean absolute difference (MAD). For each metric and time step, MAD was calculated as:

$$\mathrm{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| M_{i}^{sim} - M_{i}^{ref} \right|$$

where $M_{i}^{sim}$ is the simulated value, $M_{i}^{ref}$ is the corresponding observed or expected value, and the sum runs over the $n$ paired values being compared. MAD was chosen because it provides a scale-preserving measure of deviation that is less sensitive to extreme values than squared-error metrics and allows direct comparison across reconstruction frameworks.
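A minimal sketch of the MAD calculation is given below, using made-up connectance values. The function name `mean_abs_diff()` is hypothetical and was chosen to avoid confusion with base R's `mad()`, which computes the median absolute deviation instead.

```r
# Mean absolute difference between simulated and reference metric values.
mean_abs_diff <- function(sim, ref) {
  stopifnot(length(sim) == length(ref))
  mean(abs(sim - ref))
}

# Hypothetical connectance values for three paired comparisons.
sim_connectance <- c(0.20, 0.25, 0.30)
ref_connectance <- c(0.22, 0.25, 0.26)

mean_abs_diff(sim_connectance, ref_connectance)  # (0.02 + 0 + 0.04) / 3 = 0.02
```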

Second, agreement in predicted persistence outcomes was evaluated using a modified True Skill Statistic (TSS) at both node and link levels. At the node level, species were classified as present (persisting) or absent (extinct) in each simulated network and compared to their presence–absence status in the reference community.

At the link level, each possible species pair was classified according to the presence or absence of a trophic interaction in the simulated versus reference network. Thus, link-level evaluation quantified agreement in the retention or loss of specific trophic interactions, independent of overall species richness.

For both node- and link-level classifications, outcomes were assigned as true positives (TP), true negatives (TN), false positives (FP), or false negatives (FN), and TSS was calculated as:

$$\mathrm{TSS} = \mathrm{Sensitivity} + \mathrm{Specificity} - 1$$

where Sensitivity = TP/(TP + FN) and Specificity = TN/(TN + FP). TSS ranges from −1 to 1, with 1 indicating perfect agreement, 0 indicating performance no better than random expectation, and negative values indicating systematic disagreement. Because TSS is prevalence-independent, it is appropriate for extinction simulations in which class imbalance may occur (*e.g.,* many persisting species or many absent links).
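The node- and link-level classifications can be illustrated with a small sketch. The presence-absence vectors, adjacency matrices, and the helper `tss()` below are hypothetical, and the link-level example simply treats every cell of the adjacency matrix (including the diagonal) as a possible species pair.

```r
# True Skill Statistic from binary simulated and reference outcomes.
# Returns NaN if the reference contains no positives or no negatives.
tss <- function(sim, ref) {
  sim <- as.logical(sim)
  ref <- as.logical(ref)
  tp <- sum(sim & ref)
  tn <- sum(!sim & !ref)
  fp <- sum(sim & !ref)
  fn <- sum(!sim & ref)
  tp / (tp + fn) + tn / (tn + fp) - 1
}

# Node level: 1 = species persists, 0 = species extinct (hypothetical).
ref_nodes <- c(1, 1, 1, 0, 0, 1)
sim_nodes <- c(1, 1, 0, 0, 1, 1)
tss(sim_nodes, ref_nodes)  # sensitivity 3/4, specificity 1/2 -> 0.25

# Link level: adjacency matrices, rows = consumers, columns = resources.
ref_links <- matrix(c(0, 1, 1,
                      0, 0, 1,
                      0, 0, 0), nrow = 3, byrow = TRUE)
sim_links <- matrix(c(0, 1, 0,
                      0, 0, 1,
                      0, 1, 0), nrow = 3, byrow = TRUE)
tss(sim_links, ref_links)  # sensitivity 2/3, specificity 5/6 -> 0.5
```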

# Effects of Network Reconstruction on Food-Web Structure

## Table S1. Descriptive statistics of network metrics by model

```r
library(knitr)
library(tidyverse)
```

  --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  model      connectance_mean   connectance_sd   trophic_level_mean   trophic_level_sd   generality_mean   generality_sd   vulnerability_mean   vulnerability_sd     S1_mean       S1_sd     S2_mean       S2_sd     S4_mean       S4_sd     S5_mean       S5_sd
  -------- ------------------ ---------------- -------------------- ------------------ ----------------- --------------- -------------------- ------------------ ----------- ----------- ----------- ----------- ----------- ----------- ----------- -----------
  ADBM              0.2980289        0.0173274              14.9000           4.394688         0.9018448       0.0950970            0.8893876          0.0559394   0.0000000   0.0000000   1.1891795   0.4682134   1.4537193   0.6319601   1.3887004   0.3871614

  ATN               0.2250133        0.0170831              15.7300           6.889176         1.0308906       0.1385251            1.0673261          0.1042244   0.0291176   0.0502323   0.4557063   0.1981065   1.3411211   0.5991332   1.1685892   0.2930910

  PFIM              0.1185292        0.0231336              15.6525           6.238513         1.7226598       0.1528536            0.6326936          0.0763279   0.1044935   0.0352603   0.1390467   0.0305457   0.0644215   0.0456966   0.5227883   0.1453698

  log               0.1720401        0.0155925              19.9100          10.106545         0.6371490       0.0896016            0.6898121          0.1403929   0.7490114   0.2910247   0.1410731   0.0622885   0.5286292   0.2069172   0.5014183   0.2151592
  ratio                                                                                                                                                                                                                                              

  niche             0.1181557        0.0328685              18.4925          12.145317         1.1245242       0.1685898            0.6482552          0.1672472   0.1924341   0.1096531   0.1088552   0.0890663   0.1434281   0.0855302   0.3231550   0.1852910

  random            0.2213846        0.0449628              17.0675          12.154929         0.3263843       0.0833936            1.7046249          0.2168757   0.0107583   0.0127651   0.1808526   0.0937089   1.6233885   0.6556234   0.0296900   0.0213948
  --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


## Table S2. Canonical discriminant analysis

```r
readr::read_csv("tables/canonical_loadings.csv") %>%
  kable()
```

  Metric             Can1    Can2    Can3
  --------------- ------- ------- -------
  connectance       -0.75    0.36   -0.45
  trophic_level     -0.31   -0.75   -0.25
  generality         0.73    0.58    0.32
  vulnerability     -0.86   -0.20    0.38
  S1                 0.38   -0.60   -0.48
  S2                -0.35    0.62   -0.53
  S4                -0.81    0.14   -0.11
  S5                -0.12    0.74   -0.52


## Figure S1. Canonical Loadings

Canonical loadings for the first two canonical variates (CV1 and CV2; Can1 and Can2 in Table S2) from the canonical discriminant analysis of network metrics. Arrows indicate the contribution of each metric to the multivariate separation among reconstruction models. Colours denote the scale of each metric: Macro (green), Meso (orange), Micro (purple). Metric labels are shown for the most influential variables.

![](attachment:../figures/canonical_loadings_plot.png)

## PERMANOVA Variance Partitioning

To quantify the relative contributions of reconstruction framework and temporal turnover to variation in inferred network structure, we conducted permutational multivariate analysis of variance (PERMANOVA). Euclidean distance matrices were calculated from standardized (z-transformed) network metrics. Reconstruction framework (‘model’) and extinction phase (‘time’) were analysed separately to estimate their total contributions to variance, and in combination to assess interaction effects. Significance was assessed using 999 permutations.
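The logic of a one-way PERMANOVA on Euclidean distances can be sketched in base R as below. The text does not name the software used, so this is a hand-rolled illustration on simulated data rather than the analysis itself; the helper `permanova_stats()` and the three framework labels are hypothetical.

```r
set.seed(123)
# Hypothetical data: 3 frameworks x 20 networks, 4 metrics.
model <- factor(rep(c("A", "B", "C"), each = 20))
metrics <- matrix(rnorm(60 * 4), ncol = 4) + as.numeric(model)  # framework shift
z <- scale(metrics)               # z-transform each metric
d2 <- as.matrix(dist(z))^2        # squared Euclidean distances

# Pseudo-F and R2 computed directly from the squared distance matrix.
permanova_stats <- function(d2, groups) {
  n <- nrow(d2)
  a <- nlevels(groups)
  ss_total <- sum(d2[upper.tri(d2)]) / n
  ss_within <- sum(vapply(levels(groups), function(g) {
    idx <- which(groups == g)
    sub <- d2[idx, idx, drop = FALSE]
    sum(sub[upper.tri(sub)]) / length(idx)
  }, numeric(1)))
  ss_among <- ss_total - ss_within
  c(F = (ss_among / (a - 1)) / (ss_within / (n - a)),
    R2 = ss_among / ss_total)
}

obs <- permanova_stats(d2, model)

# Permutation test: shuffle group labels 999 times and recompute pseudo-F.
n_perm <- 999
f_perm <- replicate(n_perm, permanova_stats(d2, sample(model))["F"])
p_value <- (sum(f_perm >= obs["F"]) + 1) / (n_perm + 1)
obs
p_value
```

Because the observed statistic is counted among the permutations, the smallest p-value attainable with 999 permutations is 1/1000 = 0.001.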

### Robustness of model effects after temporal centering

To determine whether the dominance of reconstruction framework reflected absolute structural shifts among extinction phases, we repeated the analysis after centering network metrics within each time bin. This procedure removes mean temporal differences while preserving within-phase structural variation. Even after temporal centering, reconstruction framework explained 84.8% of multivariate variance (R² = 0.848, p ≤ 0.001, the minimum attainable with 999 permutations), exceeding the variance explained in the uncentered analysis. Thus, the strong influence of model identity is not attributable to temporal mean differences, but reflects intrinsic structural divergence among reconstruction frameworks.
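Centering within time bins amounts to subtracting each phase's mean from the values in that phase. A minimal sketch with simulated connectance values and hypothetical phase labels (`pre`, `during`, `early`, `late`):

```r
set.seed(7)
# Hypothetical connectance values for 25 networks in each of four phases.
time <- factor(rep(c("pre", "during", "early", "late"), each = 25),
               levels = c("pre", "during", "early", "late"))
connectance <- rnorm(100,
                     mean = rep(c(0.30, 0.15, 0.20, 0.25), each = 25),
                     sd = 0.03)

# ave() returns the mean of each observation's time bin, so the subtraction
# removes between-phase mean differences and keeps within-phase variation.
connectance_centered <- connectance - ave(connectance, time)

tapply(connectance_centered, time, mean)  # effectively zero in every phase
```

The within-phase variances are unchanged by this step, which is why the centred analysis still captures differences among frameworks.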

# Statistical Drivers of Network Variation

## Statistical Robustness and Assumptions

Assumptions of the factorial ANOVA were assessed via residual analysis. Variances were significantly heterogeneous (Levene’s test, p \< 0.001); however, the F-test is generally robust to this violation when the design is perfectly balanced (n = 100 per cell) and the sample is large (N = 2400), as is the case here. Visual inspection of Q-Q and residuals-versus-fitted plots indicated that the residual distributions were sufficiently symmetric for parametric analysis.
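These checks can be approximated in base R as sketched below. The data are simulated, the design is smaller than the one described, and the median-centred (Brown-Forsythe) variant of Levene's test is an assumption, since the text does not say which variant was used.

```r
set.seed(99)
# Hypothetical balanced design: 3 models x 2 phases, 100 networks per cell.
design <- expand.grid(rep = 1:100,
                      model = c("A", "B", "C"),
                      time = c("pre", "late"))
design$cell <- interaction(design$model, design$time)

# Residual spread differs among models to mimic heteroscedasticity.
sd_by_model <- c(A = 0.01, B = 0.03, C = 0.06)[as.character(design$model)]
design$metric <- rnorm(nrow(design), mean = 0.2, sd = sd_by_model)

# Levene-type test: one-way ANOVA on absolute deviations from cell medians.
abs_dev <- abs(design$metric - ave(design$metric, design$cell, FUN = median))
levene <- anova(lm(abs_dev ~ design$cell))
levene_p <- levene[1, "Pr(>F)"]

# Confirm the design is balanced (equal n in every model x time cell).
cell_n <- table(design$cell)

# Factorial ANOVA; the diagnostic plots are left commented out.
fit <- aov(metric ~ model * time, data = design)
# plot(fit, which = 1:2)  # residuals-vs-fitted and normal Q-Q plots
```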

## Figure S2. Temporal Trajectories of Network Structure by Model

Detailed shifts in network properties across the four extinction phases, categorized by organizational scale (Macro, Meso, Micro). Each line represents the mean value for a specific reconstruction framework, with error bars denoting standard error. This figure illustrates the “baseline” differences between models (such as the Niche model’s tendency to overestimate motif counts) and their divergent responses to species loss.

![](attachment:../figures/raw_time_structure.png)

## Table S3. Variance Partitioning of Framework, Time, and Interaction Effects.

Summary of the two-way factorial ANOVA results for all eight metrics. Values represent partial eta-squared ($\eta^{2}_{p}$), which quantifies the proportion of variance attributable to each term after excluding the variance explained by the other terms; values therefore need not sum to 1 across columns. The ‘Model’ term has the largest $\eta^{2}_{p}$ for every metric, indicating that framework choice is the primary determinant of network topology.

```r
readr::read_csv("../tables/ANOVA_Results.csv") %>%
  kable()
```

  Metric                                 Model    Time   Interaction
  ------------------------------------ ------- ------- -------------
  Connectance                            0.871   0.029         0.171
  Max trophic level                      0.821   0.322         0.324
  Generality                             0.959   0.230         0.405
  Vulnerability                          0.900   0.010         0.168
  No. of linear chains                   0.959   0.514         0.783
  No. of omnivory motifs                 0.955   0.673         0.782
  No. of direct competition motifs       0.973   0.915         0.863
  No. of apparent competition motifs     0.959   0.771         0.605
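For readers who want to see how the three columns relate to the ANOVA table, the sketch below computes partial η² for the Model, Time, and Interaction terms of a two-way factorial ANOVA. The data, effect sizes, and factor levels are simulated and hypothetical; only the formula SS_term / (SS_term + SS_residual) is taken as given.

```r
set.seed(2024)
# Hypothetical balanced design: 3 models x 4 phases, 50 networks per cell.
design <- expand.grid(rep = 1:50,
                      model = c("A", "B", "C"),
                      time = c("pre", "during", "early", "late"))
m_eff <- c(A = 0.00, B = 0.10, C = 0.20)[as.character(design$model)]
t_eff <- c(pre = 0.00, during = -0.03, early = -0.02,
           late = -0.01)[as.character(design$time)]
design$connectance <- 0.2 + m_eff + t_eff + rnorm(nrow(design), sd = 0.02)

# Two-way factorial ANOVA with interaction.
tab <- anova(lm(connectance ~ model * time, data = design))
ss <- tab[["Sum Sq"]]
names(ss) <- rownames(tab)

# Partial eta-squared per term: SS_term / (SS_term + SS_residual).
terms_of_interest <- c("model", "time", "model:time")
eta_p <- ss[terms_of_interest] / (ss[terms_of_interest] + ss["Residuals"])
round(eta_p, 3)
```

Because every term is scaled against the same residual sum of squares, the three values are not shares of a common total, which is why rows in Table S3 can sum to more than 1.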


## Figure S3. Model Disagreement (CV%) Across Extinction Phases

Trends in inter-model disagreement, quantified as the Coefficient of Variation (CV%) between framework means. The Y-axis is standardized across panels to facilitate comparison between organizational scales. A characteristic “dip” at the ‘during’ phase in several meso-scale metrics illustrates the structural canalization effect, where severe species loss forces a temporary convergence in model predictions.

![](attachment:../figures/cv.png)

## Table S4. Percentage Disagreement Between Frameworks Across Extinction Phases.

Calculated inter-model CV% for each metric at each time step. These data points underpin the “bubble sizes” in the main text’s executive summary and the trajectories in Figure S3. Note the reduction in CV% for linear chains and omnivory motifs during the peak extinction phase (During).

```r
readr::read_csv("../tables/Model_Agreement_CV.csv") %>%
  kable()
```

  ----------------------------------------------------------------------------
  statistic                         Pre       During        Early         Late
                             extinction   extinction   extinction   extinction
  ------------------------ ------------ ------------ ------------ ------------
  Connectance                  39.68855     32.15691     34.79191     40.54695

  Generality                   58.51606     41.01746     48.53126     51.75989

  Max trophic level            41.65860     31.46020     32.30812     39.32523

  No. of apparent              74.26024     83.35463     84.28280     79.62819
  competition motifs                                              

  No. of direct                80.49528     81.63039     79.94947     82.73417
  competition motifs                                              

  No. of linear chains        173.66964    131.61905    156.24100    160.33886

  No. of omnivory motifs      119.22142    102.05423    101.35425    123.04302

  Vulnerability                47.92763     41.32709     39.99020     47.67690
  ----------------------------------------------------------------------------
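The inter-model CV% is the standard deviation of the framework means divided by their mean, expressed as a percentage. The sketch below applies this to the connectance means from Table S1, which are pooled over extinction phases, so the result is only indicative and is not expected to match any single column of Table S4. Use of the sample standard deviation (n − 1 denominator) is an assumption.

```r
# Connectance means per framework, copied from Table S1 (pooled over phases).
framework_means <- c(ADBM = 0.2980289, ATN = 0.2250133, PFIM = 0.1185292,
                     log_ratio = 0.1720401, niche = 0.1181557,
                     random = 0.2213846)

# Coefficient of variation between framework means, in percent.
# Assumption: sd() uses the sample (n - 1) denominator.
cv_percent <- function(x) 100 * sd(x) / mean(x)

cv_connectance <- cv_percent(framework_means)
cv_connectance
```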


## Figure S4. Pairwise Framework Comparisons (Tukey HSD)

Heatmap showing significant differences between specific pairs of reconstruction frameworks across each extinction phase. Colours represent the magnitude and direction of the difference (estimate); asterisks ($*$) indicate statistical significance (p\<0.05). This identifies which specific models drive the high CV values seen in Figure 1.

![](attachment:../figures/ANOVA_tukey.png)
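A single cell of such a heatmap corresponds to one pairwise contrast from a Tukey HSD test. The sketch below runs the test for one metric within one phase, using simulated data and three hypothetical frameworks; column names in `pairs` are illustrative.

```r
set.seed(11)
# Hypothetical connectance values for 3 frameworks within a single phase.
phase_dat <- data.frame(
  model = factor(rep(c("A", "B", "C"), each = 30)),
  connectance = rnorm(90, mean = rep(c(0.30, 0.22, 0.12), each = 30),
                      sd = 0.02)
)

# One-way ANOVA followed by Tukey's honest significant difference test.
fit <- aov(connectance ~ model, data = phase_dat)
tuk <- TukeyHSD(fit, "model")$model

# Tidy summary: estimate gives magnitude and direction, "*" marks p < 0.05.
pairs <- data.frame(
  pair = rownames(tuk),
  estimate = tuk[, "diff"],
  p_adj = tuk[, "p adj"],
  signif = ifelse(tuk[, "p adj"] < 0.05, "*", "")
)
pairs
```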
