Data for the Xylella fastidiosa remote sensing study
Nature Plants 2018
This repository contains the code and data needed to reproduce the article:
Zarco-Tejada, P.J., Camino, C., Beck, P.S.A., Calderon, R., Hornero, A., Hernández-Clemente, R., Kattenborn, T., Montes-Borrego, M., Susca, L., Morelli, M., Gonzalez-Dugo, V., North, P.R.J., Landa, B.B., Boscia, D., Saponari, M., Navas-Cortes, J.A., Pre-visual Xylella fastidiosa infection revealed in spectral plant-trait alterations, Nature Plants (2018)
DOI:10.1038/s41477-018-0189-7
The article is available at the following address.
Instructions
The repository provides the following code and data:
File | Download | Description |
---|---|---|
.R | Codes | R code to reproduce the analysis from the original data |
.csv | Raw data | Tables used in the Xylella fastidiosa remote sensing study |
Note: All analyses were done in R.
Update: Trees that were misclassified in the original Xf-database used for the manuscript were corrected on 04/Feb/2020 in the file available here, which includes a new field ("Year") that separates the evaluated trees by year. Results obtained with the updated database differ only slightly (less than 3% difference in the overall accuracy obtained with PSFT and the SVM model). Plant trait importance and the major conclusions published in the manuscript remain unchanged.
.R Files
To recreate the results, run the .R scripts in the codes folder. To do so, download the repository and open an R session with the working directory set to the root of the project.
The code reproduces the confusion matrix of Supplementary Table 4. To do so, it splits the data set into training and test data, runs the VIF, Wilks' lambda and ROC analyses, and then runs the classification and machine learning algorithms described in the article.
This code applies to Case A: asymptomatic (AS) vs. symptomatic trees (AF; affected).
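As an illustration of the output this script produces, the following minimal sketch builds a confusion matrix and its overall accuracy in base R. The observed and predicted labels are invented toy values, not data from the study, and the object names are hypothetical.

```r
# Toy labels for Case A (AS = asymptomatic, AF = affected); values are invented
observed  <- factor(c("AS", "AS", "AS", "AF", "AF", "AF", "AF", "AS"),
                    levels = c("AS", "AF"))
predicted <- factor(c("AS", "AS", "AF", "AF", "AF", "AS", "AF", "AS"),
                    levels = c("AS", "AF"))

# Rows are predictions, columns are field observations
cm <- table(Predicted = predicted, Observed = observed)

# Overall accuracy: correctly classified trees over all trees
overall_accuracy <- sum(diag(cm)) / sum(cm)

print(cm)
print(overall_accuracy)  # 0.75 for these toy labels
```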
The code reproduces the confusion matrix of Supplementary Table 5. As in the previous script, it splits the data set into training and test data, runs the VIF, Wilks' lambda and ROC analyses, and then runs the classification and machine learning algorithms described in the article.
This code applies to Case B: initial Xf-symptoms (IN, DS = 1) vs. advanced Xf-symptoms (AD, DS = 2, 3 and 4) severity levels.
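The ROC step mentioned above can be illustrated with a small base R sketch that computes the area under the ROC curve (AUC) from ranks, using the Mann-Whitney formulation. This is a swapped-in technique for illustration only: the scripts list the pROC package for this analysis. The function name `auc_rank` and the toy scores are assumptions.

```r
# AUC from ranks (Mann-Whitney U statistic divided by n_pos * n_neg).
# 'scores' are classifier scores, 'labels' the true classes,
# 'positive' the label treated as the positive class.
auc_rank <- function(scores, labels, positive) {
  is_pos <- labels == positive
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(scores)  # ties receive average ranks
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy example for Case B (IN = initial, AD = advanced); values are invented
scores <- c(0.1, 0.4, 0.35, 0.8)
labels <- c("IN", "IN", "AD", "AD")
auc <- auc_rank(scores, labels, positive = "AD")
print(auc)  # 0.75
```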
Note: Analysis-1.R and Analysis-2.R split the data set according to two criteria:
- a training sample (TR) containing 80% of the data collected over two years, selected at random for each disease severity class, and a testing or validation sample (TS) with the remaining 20% for testing the model.
- a training sample (TR) containing 90% of the data collected over two years, selected at random for each disease severity class, and a testing or validation sample (TS) with the remaining 10% for testing the model.
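The two split criteria above can be sketched as a random split within each class. The sketch below uses base R only, to avoid dependencies; the scripts themselves list the caret package for partitioning. The helper `stratified_split`, the column name `DS` and the toy data are hypothetical.

```r
set.seed(42)

# Toy data: 60 trees of one class and 40 of another (invented values)
toy <- data.frame(
  DS     = factor(rep(c("AS", "AF"), times = c(60, 40))),
  trait1 = rnorm(100),
  trait2 = rnorm(100)
)

# Draw a proportion p of the rows at random within each class
stratified_split <- function(data, class_col, p = 0.8) {
  idx_by_class <- split(seq_len(nrow(data)), data[[class_col]])
  train_idx <- unlist(lapply(idx_by_class, function(idx) {
    idx[sample.int(length(idx), size = round(p * length(idx)))]
  }), use.names = FALSE)
  list(TR = data[train_idx, , drop = FALSE],
       TS = data[-train_idx, , drop = FALSE])
}

parts80 <- stratified_split(toy, "DS", p = 0.8)  # 80% TR, 20% TS
parts90 <- stratified_split(toy, "DS", p = 0.9)  # 90% TR, 10% TS

table(parts80$TR$DS)  # class proportions are preserved in TR
```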
Prior to running the scripts, you may need to install the required packages, all of which are available on CRAN. The R scripts also show how to install them. To install the packages, open an R session and run the following commands:
```r
if (!require("fmsb"))  { install.packages("fmsb");  require("fmsb") }   # VIF analysis
if (!require("klaR"))  { install.packages("klaR");  require("klaR") }   # Wilks' lambda
if (!require("caret")) { install.packages("caret"); require("caret") }  # Data partition and LDA model
if (!require("e1071")) { install.packages("e1071"); require("e1071") }  # SVM model
if (!require("nnet"))  { install.packages("nnet");  require("nnet") }   # NN model
if (!require("pROC"))  { install.packages("pROC");  require("pROC") }   # ROC AUC analysis
```
The code reproduces the confusion matrix comparing the field evaluation and the remote sensing predictions with qPCR tests at two spatial scales:
- At parcel level (Figure 5).
- At orchard level (Table 6).
The code generates 50 non-linear SVM classification models using a radial basis function kernel and leave-one-out cross-validation (LOOCV). It then uses the SVM predictions to build a stochastic gradient boosting machine that tests the remote sensing-based PSFT model at parcel and orchard levels.
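To make the LOOCV idea concrete, here is a minimal sketch of leave-one-out cross-validation for a single radial-kernel SVM using the e1071 package. It does not reproduce the 50 models or the gradient boosting step of the script. The toy data, column names and default SVM settings are assumptions.

```r
library(e1071)
set.seed(1)

# Toy data: two well-separated classes (invented values)
n <- 40
toy <- data.frame(
  status = factor(rep(c("neg", "pos"), each = n / 2)),
  trait1 = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 3)),
  trait2 = c(rnorm(n / 2, mean = 0), rnorm(n / 2, mean = 3))
)

# LOOCV: fit on all rows but one, predict the held-out row, repeat for each row
pred <- character(n)
for (i in seq_len(n)) {
  fit <- svm(status ~ trait1 + trait2, data = toy[-i, ], kernel = "radial")
  pred[i] <- as.character(predict(fit, newdata = toy[i, , drop = FALSE]))
}

loocv_accuracy <- mean(pred == as.character(toy$status))
print(loocv_accuracy)
```

Each observation is predicted by a model that never saw it, so the resulting accuracy is an out-of-sample estimate at the cost of fitting one model per observation.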
A VIF function for stepwise variable selection.
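A stepwise VIF selection of this kind can be sketched in base R as follows. This is not the repository's function: the names `vif_values` and `vif_stepwise`, the threshold of 10 and the toy data are assumptions, and the scripts list the fmsb package for the VIF analysis. The sketch computes VIF = 1 / (1 - R²) for each variable regressed on the others and drops the variable with the highest VIF until all values fall below the threshold.

```r
set.seed(7)

# VIF of each column of X, from regressing it on all other columns
vif_values <- function(X) {
  sapply(names(X), function(v) {
    others <- X[, setdiff(names(X), v), drop = FALSE]
    r2 <- summary(lm(X[[v]] ~ ., data = others))$r.squared
    1 / (1 - r2)
  })
}

# Repeatedly drop the variable with the largest VIF until all are below threshold
vif_stepwise <- function(X, threshold = 10) {
  repeat {
    if (ncol(X) < 2) break
    v <- vif_values(X)
    if (max(v) < threshold) break
    X <- X[, names(v)[-which.max(v)], drop = FALSE]
  }
  names(X)
}

# Toy data: 'c' is almost a linear combination of 'a' and 'b'
toy <- data.frame(a = rnorm(200), b = rnorm(200))
toy$c <- toy$a + toy$b + rnorm(200, sd = 0.01)

kept <- vif_stepwise(toy, threshold = 10)
print(kept)
```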
Raw data
For access to the raw data, see the data folder.
Contact information
Pablo J. Zarco-Tejada
email: pablo.zarco@gmail.com
http://quantalab.ias.csic.es/
This repository follows the principles of reproducible research (Peng, 2011). When using the raw data, please cite the original publication.
Acknowledgments
We thank Z.G. Cerovic, J.Flexas, F.Morales, and P.Martín for scientific discussions, QuantaLab-IAS-CSIC for laboratory assistance, and G.Altamura, A.Ceglie, and D.Tavano for field support. The study was funded by the European Union’s Horizon 2020 research and innovation programme through grant agreements POnTE (635646) and XF-ACTORS (727987). The views expressed are purely those of the writers and may not in any circumstance be regarded as stating an official position of the European Commission.