The ALFAM2 dataset (database) on ammonia loss through volatilization from field-applied manure. This repository contains the "ALFAM2 dataset", the code used to generate it from data files submitted by researchers who made the emission measurements, and those original data files.
- Most visitors are interested in downloading data. Look in data-output/03 for the latest version of the ALFAM2 dataset. Typically it makes sense to use the version available in the latest release, which will also correspond to a version available through Zenodo. (See the download sketch after this list.)
- For more details on the ALFAM2 project and access to project products, see http://alfam.dk
- For a database interface that can be used to subset (filter) by country, application method, or more, use this web app: https://biotransformers.shinyapps.io/ALFAM2/. The data available through this interface are generally from the latest release, but always check the version number.
- Zenodo versions can be found here
- Looking for the ALFAM2 model R package? You want the ALFAM2 repo: https://github.com/sashahafner/ALFAM2.
- Checking your submitted data? Find the link in this list.
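For those who just want the files, the sketch below shows one way to download the latest dataset files directly from GitHub in R. The raw.githubusercontent.com URL, including the branch name (`main`), is an assumption based on this repository's address; adjust it if the default branch or layout differs.

```r
# Hypothetical raw-file URLs assembled from the repository address and the
# data-output/03 path mentioned above; the branch name 'main' is assumed
base_url <- 'https://raw.githubusercontent.com/au-bce-ee/ALFAM2-data/main/data-output/03'

for (f in c('ALFAM2_interval.csv.gz', 'ALFAM2_plot.csv.gz')) {
  download.file(file.path(base_url, f), destfile = f, mode = 'wb')
}
```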
The dataset consists of two files: one with plot- and one with measurement-level observations. See the `headers` directory here for variable descriptions. The files can be merged on the two plot keys or identification codes: `pid` and `pmid`. Files are saved in a compressed format with the extension `.gz` (gzip) in order to reduce file size. The easiest way to get the data into R is to read the files directly with `data.table::fread()`. Alternatively, various utilities can be used to extract (unzip) the files, which are regular comma-separated ASCII text files.
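If decompressed copies are preferred, the files can be extracted first and then read with base R. The sketch below assumes the R.utils package is installed (not a requirement of this repository); any standalone gzip utility works equally well. Note that `fread()` itself also relies on R.utils to decompress `.gz` files.

```r
# A minimal sketch of extracting the compressed files before reading them;
# assumes the R.utils package is installed
R.utils::gunzip('ALFAM2_interval.csv.gz', remove = FALSE)
R.utils::gunzip('ALFAM2_plot.csv.gz', remove = FALSE)

# The extracted files are regular comma-separated text files
idat <- read.csv('ALFAM2_interval.csv')
pdat <- read.csv('ALFAM2_plot.csv')
```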
The ALFAM2 data are organized into uptake periods: 1 is for the original ALFAM work, 2 for the work described in this paper, and 3 for the current effort to expand the database. The latest version will always be in the subdirectory with the highest number (currently 3). Earlier versions are kept so that new data can be added without rebuilding older dataset versions, while maintaining the option of revising data submitted in an earlier uptake period. See the Data handling tips section below for more information on working with the data.
The best way to refer to this work if you use the data in a publication is to cite the version available through Zenodo. In addition to a digital object identifier (DOI), the Zenodo version includes an author list. Alternatively, individual releases can be cited. In either case, be sure to note the version number used in any citation. If you have used a version not included in a release or a Zenodo dataset, please contact me (here) and I will create the necessary release or dataset. Authors may also want to cite (and read!) this paper, but again, include the dataset version number in any citation. You can find the dataset version number in the release tags or in a text file here: data-output/03/data_version.txt.
For a description of the dataset, see this paper: https://doi.org/10.1016/j.agrformet.2017.11.027. For the ALFAM2 model for ammonia emission, with parameter estimation based on the ALFAM2 dataset, see this paper: https://doi.org/10.1016/j.atmosenv.2018.11.069.
The simplest way to load the data in R is with the `fread()` function from the data.table package.

```r
idat <- data.table::fread('ALFAM2_interval.csv.gz')
pdat <- data.table::fread('ALFAM2_plot.csv.gz')
```
Once these two data frames (tables) are created, they can be combined (if needed) with `base::merge()` or the data.table `merge()` method.

```r
cdat <- merge(idat, pdat, by = c('pid', 'pmid'))
```

(The `by` argument is optional here, but it is good practice to be aware of the columns used for merging.)
In Python, the `read_csv()` function from the pandas package can be used to read the compressed files directly.

```python
import pandas as pd
idat = pd.read_csv('ALFAM2_interval.csv.gz')
```
For some more information, see the `analysis` directory.
To inquire about submitting data to the ALFAM2 dataset, send a message to sasha.hafner AT bce.au.dk by email, or else use this form.
To check submitted data, find your completed template file in this list and follow the instructions you will see at the top of the linked file.
After cloning this repo, the process of creating the ALFAM2 database from submitted data can be completely repeated by running two scripts in R (https://www.r-project.org/) sequentially:

1. `scripts/02/main.R`
2. `scripts/03/main.R`
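A minimal sketch of what that looks like, assuming the scripts are sourced with the working directory set to the repository root (check the scripts themselves for the working directory they expect):

```r
# Rebuild the database by running the two main scripts in order; the assumption
# here is that the working directory is the root of the cloned repository
source('scripts/02/main.R')
source('scripts/03/main.R')
```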
The only obvious reason for doing so is to update the database after making a correction to submitted data or data processing code.
(Submitted data, as spreadsheet templates (usually) completed by data submitters, can be found in the `data-submitted` directory.)
If changes affect only the most recently submitted file, only the script for the latest uptake period (the highest-numbered one) needs to be run.
Because of changes in base R and add-on packages over time, R may return an error when updating earlier uptake periods, or even the latest period some time after the last update by the database developers (e.g., this problem was caused by a change in base R).
The record of R and add-on package versions in `logs/*/R_package_version.txt` should help sort out any of these problems.
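For example, one way to compare an installed package against the recorded versions (the `logs/03/R_package_version.txt` path below is hypothetical, simply following the pattern above for uptake period 3):

```r
# Compare the locally installed data.table version to the versions recorded
# when the database was last built; the exact log file path is assumed here
packageVersion('data.table')
readLines('logs/03/R_package_version.txt')
```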
- `analysis`: demo code for reading and plotting data and creating summaries
- `archive`: early versions of the database from before this repository was created
- `contributors`: list of individuals who have contributed data, used for author list
- `data-ALFAM1`: the original ALFAM database, developed by Sven G. Sommer et al., considered "uptake period 1", and read in by `scripts/02/make_database.R`
- `data-output`: ALFAM2 database files (the output), with the latest version in the subdirectory with the highest uptake number
- `data-submitted`: data as submitted by data submitters, in spreadsheet templates, in some cases with data processing etc. by database developers
- `docs`: HTML logs or summaries of submitted data for checking by submitters or others through this page: https://au-bce-ee.github.io/ALFAM2-data/
- `functions`: R functions for building the database, some from https://github.com/sashahafner/jumbled
- `headers`: detailed description of data file headers (variable names)
- `logs`: various logs created when the database is built
- `scripts`: the R scripts that are used directly to build the database
- `uptake-notes`: notes on some submitted files