Good Statistical Monitoring {gsm} R package

The {gsm} package provides a standardized Risk Based Quality Monitoring (RBQM) framework for clinical trials that pairs a flexible data pipeline with robust reports like the one shown below.

This README provides a high-level overview of {gsm}; see the package website for additional details.

Background

The {gsm} package performs risk assessments primarily focused on detecting differences in quality at the site level. "High quality" is defined as the absence of errors that matter. We interpret this as detecting potential issues related to critical data or processes across the major risk categories of safety, efficacy, disposition, treatment, and general quality, where each category consists of one or more risk assessments. Each risk assessment analyzes the data to flag sites with potential issues and provides a visualization to help the user understand the issue. Some relevant references are provided below.

Centralized Statistical Monitoring: 1, 2, 3
EMA/FDA Guidance on Risk Based Management: 1, 2, 3, 4
Risk Based Quality Management: 1, 2, 3
Related tools: 1, 2

Process Overview

The {gsm} package establishes a data pipeline for risk-based monitoring (RBM) using R. It provides a framework that allows users to assess and visualize site-level risk in clinical trial data, and currently includes assessments for the following domains:

Adverse Event Frequency
Serious Adverse Event Frequency
Protocol Deviation Frequency
Important Protocol Deviation Frequency
Lab Abnormality Frequency
Subject Discontinuation Frequency
Treatment Discontinuation
Query Rate
Query Age
Data Entry Lag
Data Change Rate
Screen Failure

All {gsm} assessments use a standardized six-step data pipeline:

Map (Optional) - Converts raw data to input data.
Transform - Converts input data to transformed data.
Analyze - Converts transformed data to analyzed data.
Threshold - Uses analyzed data to create one or more numeric thresholds.
Flag - Uses analyzed data and numeric thresholds to create flagged data.
Summarize - Selects key columns from flagged data to create summary data.
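
As a rough illustration of how the last five steps fit together, the sketch below walks a toy dataset through them in base R. It does not use {gsm}'s own functions: the data frame names, column names, the rate-ratio score, and the mean ± 1 SD thresholds are all assumptions made for this example, not the package's actual methods.

```r
# Toy input data, one row per subject. Column names are hypothetical and
# are not taken from the {gsm} data model. The optional Map step is skipped
# because the toy data is already in "input" shape.
dfInput <- data.frame(
  SubjectID = paste0("S", 1:8),
  SiteID    = c("A", "A", "A", "B", "B", "C", "C", "C"),
  Count     = c(2, 3, 1, 10, 12, 0, 1, 2),              # e.g. events per subject
  Exposure  = c(100, 120, 80, 100, 100, 90, 110, 100)   # e.g. days on study
)

# Transform: aggregate subject-level rows to one row per site.
dfTransformed <- aggregate(cbind(Count, Exposure) ~ SiteID, data = dfInput, FUN = sum)
dfTransformed$Rate <- dfTransformed$Count / dfTransformed$Exposure

# Analyze: score each site as the ratio of its rate to the overall study rate
# (an illustrative choice, not the statistical model used by the package).
dfAnalyzed <- dfTransformed
overallRate <- sum(dfAnalyzed$Count) / sum(dfAnalyzed$Exposure)
dfAnalyzed$Score <- dfAnalyzed$Rate / overallRate

# Threshold: derive a lower and an upper numeric threshold from the analyzed data.
vThreshold <- mean(dfAnalyzed$Score) + c(-1, 1) * sd(dfAnalyzed$Score)

# Flag: -1 below the lower threshold, 1 above the upper threshold, 0 otherwise.
dfFlagged <- dfAnalyzed
dfFlagged$Flag <- ifelse(
  dfFlagged$Score < vThreshold[1], -1,
  ifelse(dfFlagged$Score > vThreshold[2], 1, 0)
)

# Summarize: keep only the key columns.
dfSummary <- dfFlagged[, c("SiteID", "Score", "Flag")]
print(dfSummary)  # in this toy data, only site B is flagged
```

Keeping each step's output as its own data frame makes the intermediate results easy to inspect when a site is flagged unexpectedly.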

To learn more about {gsm}'s data pipeline, visit the Data Pipeline Vignette.

Reporting

RMarkdown/HTML reporting is built into {gsm} and provides a detailed overview of all risk assessments for a given trial. For example, an AE risk assessment looks like this:

Full reports for a sample trial run with {clindata} are provided below:

Quality Control

Since {gsm} is designed for use in a GCP framework, we have conducted extensive quality control as part of our development process. In particular, we do the following:

Qualification Workflow - All assessments have been Qualified as described in the Qualification Workflow Vignette. A Qualification Report Vignette is generated and attached to each release.
Unit Tests - Unit tests are written for all core functions.
Contributor Guidelines - Detailed contributor guidelines including step-by-step processes for code development and releases are provided as a vignette.
Data Model - Vignettes provide detailed descriptions of the data model.
Code Examples - The Cookbook Vignette provides a series of simple examples, and all functions include examples as part of Roxygen documentation.
Code Review - Code review is conducted using GitHub Pull Requests (PRs), and a log of all PRs is included in the Qualification Report Vignette.
Function Documentation - Detailed documentation for each function is maintained with Roxygen.
Package Checks - Standard package checks are run using GitHub Actions and must pass before PRs are merged.
Data Specifications - Machine-readable data specifications are maintained for all KRIs. Specifications are automatically added to relevant function documentation.
Continuous Integration - Continuous integration is provided via GitHub Actions.
Regression Testing - Extensive QC and testing are done before each release.
Code Formatting - Code is formatted with {styler} before each release.

Additional detail, including links to function documentation and vignettes, is available on the package website.
