Developer Guideline

`AMR` package Developer Guideline

Welcome to the Developer Guideline of the AMR R package. This guideline explains the repository workflows and how to update the elements of the package.

Contents

Introduction
Updating the AMR package
Other
- Reproducibility scripts
- S3 extensions

Introduction

Copyright

To start, it is important to know that this R package and all of its components are free, open-source software and licensed under the GNU General Public License (GPL) v2.0. Open-source software does not mean that there are no legal constraints. There are actually some profound ones, since GPL-2.0 in a nutshell means that this package:

- May be used for commercial and private purposes, but may not be used for patent purposes
- May be modified, although (1) modifications must also be released under the GPL-2.0 license when distributing the package, and (2) changes made to the code must be documented (using NEWS.md)
- May be distributed, although (1) source code must be made available when the package is distributed, and (2) a copy of the license and copyright notice must be included with the package
- Comes with a LIMITATION of liability, and with NO warranty

The full legal text is included in this repository.

General Git(Hub) Workflow

This repository uses Git hooks to support automated generation of R documentation, automated semantic software versioning, and automated export of our data sets for other software (MS Excel, SPSS, SAS, Stata, Apache Parquet, Apache Feather).

Pre-commit checks in `git`

All updates to the repository should be committed locally using git commit (or RStudio) and not through the GitHub website, since only local commits run our Git prehooks, which automate semantic versioning and R documentation updates.

When using git commit, a script will be run to increase the version number, update the date and R documentation. Note: This only works on Unix systems, such as macOS and Linux.

To set this up, run this command once when working locally in the repository:

git config --local core.hooksPath ".github/prehooks"

Now, when using git commit:

git commit -am "test commit"
# Running prehook...
# >>  Updating R documentation...
# >>  done.
# >>  
# >>  Updating semantic versioning and date...
# >>  - latest tag is 'v1.8.1', with 26 previous commits
# >>  - AMR pkg version set to 1.8.1.9027
# >>  - updated DESCRIPTION
# >>  - updated NEWS.md
# >>  
# [main 300b93e] (v1.8.1.9027) test commit
#  3 files changed, 3 insertions(+), 4 deletions(-)
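The version number in the sample output can be read as the latest tag followed by 9000 plus the number of commits since that tag, counting the new commit. That formula is only inferred from the output above (26 previous commits giving 9027), not taken from the prehook itself, and `dev_version` is a hypothetical helper. A minimal shell sketch under that assumption:

```shell
#!/bin/sh
# Hypothetical helper, not part of the AMR repository.
# Assumption inferred from the sample output:
#   development version = <tag without leading "v">.<9000 + previous commits + 1>
dev_version() {
  tag="$1"        # latest tag, e.g. "v1.8.1"
  previous="$2"   # number of commits since that tag, e.g. 26
  printf '%s.%d\n' "${tag#v}" "$((9000 + previous + 1))"
}

# In a real checkout, the two inputs could be obtained with:
#   tag=$(git describe --tags --abbrev=0)
#   previous=$(git rev-list --count "${tag}..HEAD")
dev_version "v1.8.1" 26
# prints 1.8.1.9027
```

The inputs are passed as arguments here so that the arithmetic can be checked without a Git repository.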

To skip the checks, you can use the argument --no-verify (or -n for short) with git commit, or add the text "no-checks" or "no-verify" to the commit message. This is useful for releasing new versions, since otherwise the version number in DESCRIPTION and NEWS.md would be overwritten.

# run checks:
git commit -am "small website fix"
# skip checks:
git commit -am "small website fix (no-checks)"
git commit -am "small website fix (no-verify)"
# note: -m must come last in a bundle, because it takes the message as its argument
git commit -anm "small website fix"
git commit --no-verify -am "small website fix"
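How the prehook inspects the commit message is not shown on this page. As a rough illustration only, a script could test a message for the two markers with a `case` pattern; `should_skip_checks` is a hypothetical name and not a function from this repository:

```shell
#!/bin/sh
# Hypothetical sketch: decide whether a commit message asks to skip the checks.
# Matching on "no-check" also covers "no-checks".
should_skip_checks() {
  case "$1" in
    *no-check*|*no-verify*) return 0 ;;  # marker found: skip the checks
    *) return 1 ;;                       # no marker: run the checks
  esac
}

for msg in "small website fix" "small website fix (no-checks)"; do
  if should_skip_checks "$msg"; then
    echo "skip: $msg"
  else
    echo "run:  $msg"
  fi
done
```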

In RStudio, where the git commit command runs in the background, the most convenient way is to add "no-checks" to the commit message.

GitHub Actions: website generation

The website (https://msberends.github.io/AMR) will be generated automatically if changes are pushed to the main branch. This is done using GitHub Actions, and the workflow file can be found here: .github/workflows/website.yaml. The website generation will be done in the latest Ubuntu LTS version and the current release version of R.

The website will be stored in the gh-pages branch.

Since a GitHub Action uses git pull to retrieve the repo contents, timestamps of files will not be preserved. This is a problem, since the ‘Data Set for Download’ vignette (https://msberends.github.io/AMR/articles/datasets.html) relies on timestamps to let users know when a data set was last updated. For this reason, the following code was added to the GitHub Action workflow file:

https://github.com/msberends/AMR/blob/d2edcf51adcb1b2e5dbba811dfc76549d10ffbf6/.github/workflows/website.yaml#L47-L54
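The linked lines are not reproduced on this page. One possible approach, sketched here under assumptions, is to set each file's modification time to the time of the last commit that touched it. `restore_mtime` is a hypothetical helper, the sketch assumes GNU `touch` and `stat` (as found on Ubuntu), and the commented loop only works when the checkout contains the full commit history:

```shell
#!/bin/sh
# Hypothetical sketch, not the code from the workflow file.
# Set a file's modification time to a Unix epoch timestamp (GNU touch syntax).
restore_mtime() {
  touch -d "@$2" "$1"
}

# In a checkout with full history, the timestamp of the last commit that
# touched a file could be applied to every tracked file like this:
#   git ls-files | while IFS= read -r file; do
#     restore_mtime "$file" "$(git log -1 --format=%ct -- "$file")"
#   done

# Demonstration on a temporary file with a fixed timestamp.
tmp=$(mktemp)
restore_mtime "$tmp" 1650000000
stat -c %Y "$tmp"
rm -f "$tmp"
```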

GitHub Actions: all workflow files

This repository contains five GitHub Actions workflow files, each for a different purpose:

File	Runs when	Runs on	Purpose
`check.yaml`	Every day at 1 AM; After every push to any branch	Ubuntu 22.04 (R 3.0 to R-devel); Latest Windows (R 3.6 to R-devel); Latest macOS (R 3.6 to R-devel)	Run `R CMD check`, including all unit tests
`check-pr.yaml`	In every pull request, including updates (not if author is repo member/owner)	Ubuntu 22.04 (R-release and R-devel); Latest Windows (R-release and R-devel); Latest macOS (R-release and R-devel)	Run `R CMD check`, including all unit tests
`codecovr.yaml`	After every push to any branch; In every pull request, including updates	Latest Ubuntu (R-release)	Check code coverage and upload to http://codecov.io/gh/msberends/AMR
`lintr.yaml`	After every push to any branch; In every pull request, including updates	Latest Ubuntu (R-release)	Check coding style according to Tidyverse convention
`website.yaml`	After every push to the 'main' branch	Latest Ubuntu (R-release)	Create website from scratch, with all examples

Updating the AMR Package

Add or update a language

Please read the separate Wiki page Add or Update a Language for Translation.

This process is also covered when committing a change, since data-raw/_pre_commit_hook.R contains the full workflow to update language files.

Update EUCAST/CLSI Guidelines

After updating these guidelines, be sure to add the new version numbers to R/aa_globals.R.

Clinical breakpoints

The clinical breakpoints from EUCAST and CLSI are stored in the data set clinical_breakpoints. To update this data set to include the latest guidelines, follow the instructions in data-raw/reproduction_of_clinical_breakpoints.R. There is no need to update the documentation manually, since all values in the documentation that refer to the clinical_breakpoints data set are parametrised (such as the names of included guidelines). Running devtools::document() is sufficient, though this is also part of the pre-commit hook.

This script will incorporate the last 10 years of the CLSI and the EUCAST guidelines.

Be sure to compare some results against the original files (e.g., those from EUCAST) to check that everything works as expected. For example, run scripts like this:

test_mics <- as.mic(c(0.256, 0.5, 1, 2, 4, 8, 16, 32, 64))

as.sir(test_mics, mo = "Escherichia coli", ab = "ciprofloxacin", guideline = "EUCAST")
as.sir(test_mics, mo = "Escherichia coli", ab = "ciprofloxacin", guideline = "CLSI")

as.sir(test_mics, mo = "Pseudomonas aeruginosa", ab = "ciprofloxacin", guideline = "EUCAST")
as.sir(test_mics, mo = "Pseudomonas aeruginosa", ab = "ciprofloxacin", guideline = "CLSI")

as.sir(test_mics, mo = "Streptococcus pneumoniae", ab = "amoxicillin", guideline = "EUCAST")
as.sir(test_mics, mo = "Streptococcus pneumoniae", ab = "amoxicillin", guideline = "CLSI")

EUCAST Inferred resistance / susceptibility

These rules are part of the Clinical Breakpoints tables from EUCAST and are only available via their MS Excel and PDF files; we have found no other source in a machine-readable format such as TXT or CSV. The rules (in the Notes sections of each page/sheet) must be added manually to data-raw/eucast_rules.tsv, although most rules can be copied from an earlier version of that file.

EUCAST Expert rules

Expert rules from EUCAST are only available via their MS Excel and PDF files; we have found no other source in a machine-readable format such as TXT or CSV. The rules must be added manually to data-raw/eucast_rules.tsv, although most rules can be copied from an earlier version of that file.

Be sure to update the version numbers in R/data.R and R/aa_globals.R afterwards.

EUCAST Dosage guidelines

EUCAST Dosage guidelines are stored in the data set dosage. As of 2022, EUCAST distributes its dosing guidelines only as PDF files. Adobe Acrobat is required to convert them to an Excel file. Follow the instructions in data-raw/reproduction_of_dosage.R to automatically update the data set.

Be sure to update the version numbers in R/data.R and R/aa_globals.R afterwards.

Update the microbial taxonomy

The microbial taxonomy is stored in the data set microorganisms. Updating this data set is almost 100% automated and can be done following the instructions in data-raw/reproduction_of_microorganisms.R. Note that it is required to download the full GBIF data set, which requires at least 10 GB of RAM to read into R.

Downloading data from LPSN requires an account. Registration is free and easy: visit https://lpsn.dsmz.de/downloads and click on Register at the bottom of the form.

There are a lot of unit tests in place to check its integrity after updating, but running a few manual checks never hurts:

as.mo("E. coli")
as.mo("eco")
as.mo("KLEPNE")

Update the antimicrobial agents

The package contains two data sets for antimicrobial agents: antibiotics and antivirals.

Antibiotics

The antibiotic agents are stored in the data set antibiotics. To update this data set, follow the instructions in data-raw/reproduction_of_antibiotics.R. This script is not fully automated and requires some manual work. The parts that update DDDs and ATC codes are fully automated, though.

Antivirals

The antiviral agents are stored in the data set antivirals. To update this data set, follow the instructions in data-raw/reproduction_of_antivirals.R. This script is fully automated.

Other

Reproducibility scripts

The data-raw folder contains all scripts and git history required for any other maintenance task such as updating data or finding out about package development history.

S3 extensions

The AMR package makes extensive use of S3 with self-defined data types, including support for other packages. Read about the S3 object system of R in the free Advanced R book by Hadley Wickham.

In short, S3 allows adding new data types (called classes) to a package, extending existing types such as character and Date. To add a new class labnumber as an extension of double, the basics work like this:

x <- c(20220001, 20220002, 20220003)
class(x) <- c("labnumber", "double")

# now print the object:
print(x)
#> [1] 20220001 20220002 20220003
#> attr(,"class")
#> [1] "labnumber" "double"

Now we add an S3 extension for print() to the package:

#' @export
print.labnumber <- function(x, ...) {
  x <- as.character(x)
  print(paste0("LAB-", substr(x, 1, 4), "-", substr(x, 5, 8)),
        quote = FALSE)
}

Which results in:

print(x)
#> [1] LAB-2022-0001 LAB-2022-0002 LAB-2022-0003

User visible classes

The AMR package contains 6 new classes (data types) using S3 extensions that users can create themselves with an as.xxx() function:

Class	Created with	Extension of	Full object class	Purpose	Defined in file
`ab`	`as.ab()`	`character`	`c("ab", "character")`	Printing of antibiotic and antimycotic codes, ensuring integrity of antimicrobial codes	`R/ab.R`
`av`	`as.av()`	`character`	`c("av", "character")`	Printing of antivirals, ensuring integrity of antiviral codes	`R/av.R`
`disk`	`as.disk()`	`integer`	`c("disk", "integer")`	Cleaning of disk diffusion values, and printing, assigning, extracting them	`R/disk.R`
`mic`	`as.mic()`	`factor`	`c("mic", "ordered", "factor")`	Cleaning of MIC values, using mathematical operators with them (over 80 extensions, such as `>`, `mean`, `log2`), and printing, assigning, extracting them	`R/mic.R`
`mo`	`as.mo()`	`character`	`c("mo", "character")`	Cleaning of microbial codes and names, and printing, assigning, extracting them	`R/mo.R`
`sir`	`as.sir()`	`factor`	`c("sir", "ordered", "factor")`	Interpreting and cleaning to SIR values, and printing, assigning, extracting them	`R/sir.R`

Non-user visible classes

Additionally, the AMR package contains 5 classes that are used internally and do not have an as.xxx() function:

Class	Created with	Extension of	Full object class	Purpose	Defined in file
`ab_selector`	antibiotic selectors, such as `carbapenems()`	`character`	`c("ab_selector", "character")`	Selecting/Filtering of antibiotic columns in data	`R/ab_selectors.R`
`ab_selector_any_all`	N/A	`logical`	`c("ab_selector_any_all", "logical")`	Using `==`, `!=`, `any()` and `all()` on antibiotic selectors	`R/ab_selectors.R`
`bug_drug_combinations`	`bug_drug_combinations()`	`data.frame`	At least `c("bug_drug_combinations", "data.frame")` but might inherit other classes, such as `tbl_df` of tibbles	Printing and formatting the result of `bug_drug_combinations()`	`R/bug_drug_combinations.R`
`custom_eucast_rules`	`custom_eucast_rules()`	`list`	`c("custom_eucast_rules", "list")`	Concatenating and printing custom EUCAST rules	`R/custom_eucast_rules.R`
`custom_mdro_guideline`	`custom_mdro_guideline()`	`list`	`c("custom_mdro_guideline", "list")`	Concatenating and printing custom MDRO rules	`R/mdro.R`

Support for other packages

The AMR package also extends foreign packages by providing S3 methods for functions of those packages. Usually, these functions have to be imported, but since the AMR package is designed to be independent of any other package, the S3 extensions are registered after the AMR package is loaded, as defined in R/zzz.R. The most important benefit is that even if those foreign packages no longer exist, the AMR package will work in exactly the same way, without CRAN complaining about incompatible support. This greatly improves the durability of our package.

The currently extended packages are cleaner, ggplot2, pillar, skimr, and vctrs. For that reason, these are also listed in the Enhances field of the DESCRIPTION file.

Foreign package	Foreign package function	Additional (input) class	Defined for class	Defined in
`pillar`	`pillar_shaft()`		`ab`	`R/ab.R`
`pillar`	`pillar_shaft()`		`av`	`R/av.R`
`pillar`	`pillar_shaft()`		`mo`	`R/mo.R`
`pillar`	`pillar_shaft()`		`sir`	`R/sir.R`
`pillar`	`pillar_shaft()`		`mic`	`R/mic.R`
`pillar`	`pillar_shaft()`		`disk`	`R/disk.R`
`pillar`	`type_sum()`		`ab`	`R/ab.R`
`pillar`	`type_sum()`		`av`	`R/av.R`
`pillar`	`type_sum()`		`mo`	`R/mo.R`
`pillar`	`type_sum()`		`sir`	`R/sir.R`
`pillar`	`type_sum()`		`mic`	`R/mic.R`
`pillar`	`type_sum()`		`disk`	`R/disk.R`
`cleaner`	`freq()`		`mo`	`R/mo.R`
`cleaner`	`freq()`		`sir`	`R/sir.R`
`skimr`	`get_skimmers()`		`mo`	`R/mo.R`
`skimr`	`get_skimmers()`		`sir`	`R/sir.R`
`skimr`	`get_skimmers()`		`mic`	`R/mic.R`
`skimr`	`get_skimmers()`		`disk`	`R/disk.R`
`ggplot2`	`autoplot()`		`sir`	`R/sir.R`
`ggplot2`	`autoplot()`		`mic`	`R/mic.R`
`ggplot2`	`autoplot()`		`disk`	`R/disk.R`
`ggplot2`	`autoplot()`		`resistance_predict`	`R/resistance_predict.R`
`ggplot2`	`fortify()`		`sir`	`R/sir.R`
`ggplot2`	`fortify()`		`mic`	`R/mic.R`
`ggplot2`	`fortify()`		`disk`	`R/disk.R`
`vctrs`	`vec_ptype2()`	`character`	`ab_selector`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`ab_selector`	`character`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`character`	`ab_selector`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`logical`	`ab_selector_any_all`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`ab_selector_any_all`	`logical`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`logical`	`ab_selector_any_all`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`character`	`ab`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`ab`	`character`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`character`	`ab`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`ab`	`character`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`character`	`av`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`av`	`character`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`character`	`av`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`av`	`character`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`character`	`mo`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`mo`	`character`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`character`	`mo`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`mo`	`character`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`integer`	`disk`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`disk`	`integer`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`integer`	`disk`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`disk`	`integer`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`double`	`disk`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`disk`	`double`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`character`	`disk`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`disk`	`character`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`character`	`mic`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`double`	`mic`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`mic`	`character`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`mic`	`double`	`R/vctrs.R`
`vctrs`	`vec_math()`		`mic`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`character`	`sir`	`R/vctrs.R`
`vctrs`	`vec_ptype2()`	`sir`	`character`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`character`	`sir`	`R/vctrs.R`
`vctrs`	`vec_cast()`	`sir`	`character`	`R/vctrs.R`

Badge for sharing anywhere:

AMR (for R). Developed at the University of Groningen in collaboration with non-profit organisations
Certe Medical Diagnostics and Advice Foundation and University Medical Center Groningen.
