Carob creates reproducible workflows that standardize primary agricultural research data from experiments and surveys. Standardization includes the use of a common file format, variable names, units and accepted values according to the terminag standard. Standardized data sets are aggregated into larger collections that can be used in further research. We do this by writing an R script for each individual dataset. See the website for more information.
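To make the idea of standardization concrete, here is a minimal sketch in base R of the three steps mentioned above: renaming variables, converting units, and checking accepted values. It does not use the carobiner API. The raw column names, the target names (`country`, `crop`, `yield`), the t/ha to kg/ha conversion, and the `accepted_crops` list are all assumptions made for illustration and are not taken from the terminag standard.

```r
# Hypothetical raw data as it might arrive from a field trial.
# All column names and values here are invented for illustration.
raw <- data.frame(
  Country = c("kenya", "KENYA"),
  CropName = c("Maize", "maize "),
  Yield_t_ha = c(2.5, 3.1),
  stringsAsFactors = FALSE
)

# Trim whitespace, lower-case, then capitalize the first letter.
capitalize <- function(x) {
  x <- tolower(trimws(x))
  paste0(toupper(substring(x, 1, 1)), substring(x, 2))
}

# Assumed list of accepted crop names (illustrative only).
accepted_crops <- c("maize", "wheat", "rice")

# Common variable names, a common unit (kg/ha), and cleaned values.
std <- data.frame(
  country = capitalize(raw$Country),
  crop = tolower(trimws(raw$CropName)),
  yield = raw$Yield_t_ha * 1000,  # t/ha to kg/ha
  stringsAsFactors = FALSE
)

# Fail early if a value falls outside the accepted set.
stopifnot(all(std$crop %in% accepted_crops))
```

An actual Carob script will follow the project's own conventions and helper functions; this sketch only shows the kind of transformation each script performs.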
Carob is an open-access Extract, Transform, and Load (ETL) framework, supported by CGIAR, that enables predictive analytics (machine learning, artificial intelligence) and other types of data analysis.
Contributions are welcome from anyone and can be made via pull requests. Feel free to improve these scripts or provide new ones. See the instructions on how to write a Carob script described here. You can also raise an issue. A good place to discover new data sets is the Gardian website or our to-do list.
Compiled versions of the dataset can be downloaded from carob-data.org and some will eventually be made available on the carob dataverse.
You can also compile your own version by cloning the repo and running:
remotes::install_github("reagro/carobiner")
ff <- carobiner::make_carob(path)
where path is the folder of the cloned repo (e.g. "d:/github/carob").