Code
This repository contains the code used to produce the results presented in the paper. All code files are located in the code folder of this repository.
Open the R project file (softimpute.Rproj) in the main repository folder. This will automatically set the working directory so the scripts run properly. From there, you can run the relevant scripts to reproduce the results. All scripts source code/common.R, which defines shared parameters, themes, and utility functions.
Abramov_et_al_spatial_prediction_analysis.R contains the main code producing the results in the paper. It covers:
- Island-scale prediction based on SVD
- Evaluation of predictive performance (based on binary and weighted metrics)
- Ecological inference, including distance decay analysis
- Visualizations of existing and potential missing links
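The prediction and evaluation steps listed above can be illustrated with a minimal sketch. It is not the script's actual implementation: the toy matrix, the rank `k`, the 0.5 score threshold, and the helper names `low_rank_scores` and `f_beta` are all assumptions made for illustration. The sketch uses base R's `svd()` and an F-beta score with beta = 0.5 (the `f05` in several figure file names suggests F0.5, but that reading is an assumption).

```r
set.seed(1)

# Hypothetical toy network: 6 plants x 5 pollinators, 1 = observed interaction.
A <- matrix(c(
  1, 1, 1, 1, 1,
  1, 1, 1, 1, 0,
  1, 1, 1, 0, 0,
  1, 1, 0, 0, 0,
  1, 0, 0, 0, 0,
  1, 1, 1, 1, 0
), nrow = 6, byrow = TRUE)

# Withhold a few observed links so they act as known "missing" links.
withheld <- sample(which(A == 1), size = 3)
A_train <- A
A_train[withheld] <- 0

# Rank-k reconstruction via truncated SVD (k is an illustrative choice).
low_rank_scores <- function(M, k) {
  s <- svd(M)
  k <- min(k, length(s$d))
  s$u[, 1:k, drop = FALSE] %*% diag(s$d[1:k], nrow = k) %*%
    t(s$v[, 1:k, drop = FALSE])
}

scores <- low_rank_scores(A_train, k = 2)

# Candidate links are cells that are 0 in the training matrix.
candidates <- which(A_train == 0)
predicted <- candidates[scores[candidates] > 0.5]  # threshold is an assumption

# F-beta score; beta = 0.5 weights precision more heavily than recall.
f_beta <- function(tp, fp, fn, beta = 0.5) {
  denom <- (1 + beta^2) * tp + beta^2 * fn + fp
  if (denom == 0) return(NA_real_)
  (1 + beta^2) * tp / denom
}

tp <- sum(predicted %in% withheld)     # withheld links that were recovered
fp <- sum(!(predicted %in% withheld))  # predicted links that were not withheld
fn <- sum(!(withheld %in% predicted))  # withheld links that were missed
score <- f_beta(tp, fp, fn)
```

Here every predicted link that was not withheld counts as a false positive, which is a conservative convention, since some of those predictions may be real but unobserved interactions.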
emln dataset no. 60: plant-pollinator interactions collected in the Canary Islands, archived in the emln R package. Used to build the layers of the network.
data/distance_between_sites_canary.csv: geographic distances between sites, used for distance-decay analysis. Taken from the original data publication.
results/predictions_island_scale.rds: the link prediction results. Used for further analysis in this script and in other scripts.
Plots (saved to results/paper_figs/):
| File | Figure |
|---|---|
| island_heatmap_f05.pdf | Fig. 2c |
| hist_f05a_legend_bottom.pdf | Fig. 2d |
| missing_interactions_degree2.pdf | Fig. 3 (composite: panels a, b, c) |
| map_missing_links_merged.pdf | Fig. 3a (individual panel) |
| pie_chart.pdf | Fig. 3b (individual panel) |
| degree_unobserved_links.pdf | Fig. 3c (individual panel) |
| isl_jaccard_distance.pdf | Fig. 4 |
| pr_roc.pdf | Fig. S11 |
| roc_curve.pdf | Fig. S12 |
| predicted_original.png | Fig. S13 |
| degree_occurrence.pdf | Fig. S14 |
| local_degree_predicted_links.pdf | Fig. S15 |
| degree_binning.pdf | Fig. S16 |
| plant_island_degree.pdf | Fig. S17 |
| netdensity_f05_nnse.pdf | Fig. S18 |
| netsize_f05_nnse.pdf | Fig. S19 |
This script runs a sensitivity analysis of the combined effect of different values of K and lambda.
results/predictions_island_scale.rds: the predictions produced by the Abramov_et_al_spatial_prediction_analysis.R script.
Plot: a boxplot of F1 scores for each combination of lambda and K tested. Not saved to a file.
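A grid-based sensitivity analysis of this kind might look like the sketch below. It is an illustration under stated assumptions, not the script's code: the toy matrix, the grid values, the number of replicates, the 0.5 threshold, and the helpers `soft_svd_scores` and `f1_score` are hypothetical. The project name hints at soft-impute, but the sketch uses only base R's `svd()` with soft-thresholded singular values rather than the softImpute package.

```r
set.seed(42)

# Hypothetical toy network: 8 plants x 6 pollinators with a nested structure.
A <- outer(1:8, 1:6, function(i, j) as.numeric(i + j <= 8))

# Rank-K reconstruction with singular values shrunk by lambda (soft threshold).
soft_svd_scores <- function(M, k, lambda) {
  s <- svd(M)
  k <- min(k, length(s$d))
  d <- pmax(s$d[1:k] - lambda, 0)
  s$u[, 1:k, drop = FALSE] %*% diag(d, nrow = k) %*%
    t(s$v[, 1:k, drop = FALSE])
}

# F1 of predicted links against the withheld (known missing) links.
f1_score <- function(predicted, truth) {
  tp <- sum(predicted %in% truth)
  if (tp == 0) return(0)
  precision <- tp / length(predicted)
  recall <- tp / length(truth)
  2 * precision * recall / (precision + recall)
}

# Every combination of K and lambda, repeated over 10 random withholdings.
grid <- expand.grid(K = c(1, 2, 3), lambda = c(0, 0.5, 1), rep = 1:10)
grid$f1 <- NA_real_

for (i in seq_len(nrow(grid))) {
  withheld <- sample(which(A == 1), size = 4)
  A_train <- A
  A_train[withheld] <- 0
  scores <- soft_svd_scores(A_train, grid$K[i], grid$lambda[i])
  candidates <- which(A_train == 0)
  predicted <- candidates[scores[candidates] > 0.5]  # assumed threshold
  grid$f1[i] <- f1_score(predicted, withheld)
}

# One box per (lambda, K) combination.
boxplot(f1 ~ lambda + K, data = grid, las = 2, ylab = "F1")
```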
This script contains the code for the site-scale analysis. It contains:
- Site-scale prediction
- Jaccard analysis
- Distance decay
- Ecological inference
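The Jaccard and distance-decay steps listed above can be sketched as follows. All data here are made up for illustration: the link sets, the site names, the distances, and the column names `site1`, `site2`, and `distance_km` are assumptions and do not reflect the actual layout of data/distance_between_sites_canary.csv. The linear model is also just one simple way to summarize the trend.

```r
# Hypothetical link sets per site (each string is a "plant|pollinator" link).
links <- list(
  site_a = c("p1|a1", "p1|a2", "p2|a1", "p3|a3"),
  site_b = c("p1|a1", "p2|a1", "p3|a3", "p4|a2"),
  site_c = c("p1|a2", "p4|a2", "p5|a4")
)

# Hypothetical pairwise distances in km (not the values in the CSV).
distances <- data.frame(
  site1 = c("site_a", "site_a", "site_b"),
  site2 = c("site_b", "site_c", "site_c"),
  distance_km = c(50, 200, 120),
  stringsAsFactors = FALSE
)

# Jaccard similarity: shared links divided by all links found in either site.
jaccard <- function(x, y) {
  length(intersect(x, y)) / length(union(x, y))
}

distances$jaccard <- mapply(
  function(a, b) jaccard(links[[a]], links[[b]]),
  distances$site1, distances$site2,
  USE.NAMES = FALSE
)

# Distance decay: a negative slope means similarity declines with distance.
fit <- lm(jaccard ~ distance_km, data = distances)
slope <- coef(fit)[["distance_km"]]
```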
emln dataset no. 60: plant-pollinator interactions collected in the Canary Islands, archived in the emln R package. Used to build the layers of the network.
data/distance_between_sites_canary.csv: geographic distances between sites, used to calculate correlations in the analysis. Taken from the original data publication.
results/predictions_site_scale.rds: the link prediction results at the site scale.
Plots:
hist_f1_site.pdf: Fig. S3
site_netsize_f1_nnse.pdf: Fig. S6
site_netdensity_f1_nnse.pdf: Fig. S7
site_heatmap_f1.pdf: Fig. S2
jaccard_site_f1.pdf: Fig. S4
cor_plot_site_dif_f1.pdf: Fig. S5
nnse_f1_scales.pdf: Fig. S1
This script compares prediction quality when the entire species pools of A and P are used for prediction versus when only the subset of species they share is used.
emln dataset no. 60: plant-pollinator interactions collected in the Canary Islands, archived in the emln R package. Used to build the layers of the network.
results/predictions_island_scale.rds: the link prediction results at the island scale (created only if the file does not already exist).
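The "only if it does not already exist" behaviour for the results file is a common caching pattern. The sketch below shows one way to write it; the helper `load_or_compute` is hypothetical, the stored data frame is a placeholder, and a temporary file stands in for results/predictions_island_scale.rds.

```r
# Hypothetical helper: reuse a cached .rds file, or compute and save it.
load_or_compute <- function(path, compute) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- compute()
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  saveRDS(result, path)
  result
}

# A temporary file stands in for the real results path.
cache_path <- file.path(tempdir(), "predictions_island_scale.rds")
unlink(cache_path)  # start from a clean state for this demonstration

# First call computes and saves; the data frame is a placeholder result.
first <- load_or_compute(
  cache_path,
  function() data.frame(link = "p1|a1", score = 0.9)
)

# Second call reads the cached file, so the compute function never runs.
second <- load_or_compute(
  cache_path,
  function() stop("not reached: the cached file already exists")
)
```

Deleting the cached file forces the predictions to be recomputed on the next run.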
Plots:
subset_analysis.pdf: Fig. 2a, b