A template R workflow for general data analysis

OxfordIHTM/ihtm-targets-template

This repository is a template for a Docker-containerised, {targets}-based, {renv}-enabled R workflow for general data analysis.

About the Project

Repository Structure

The project repository is structured as follows:

ihtm-targets-template
    |-- .github/
    |-- data/
    |-- data-raw/
    |-- outputs/
    |-- R/
    |-- reports/
    |-- renv/
    |-- renv.lock
    |-- .Rprofile
    |-- packages.R
    |-- _targets.R
  • .github/ contains workflows for project testing and automated deployment of outputs through continuous integration and continuous deployment (CI/CD) with GitHub Actions.

  • data/ contains intermediate and final data outputs produced by the workflow.

  • data-raw/ contains the raw datasets used in the project, usually either downloaded from source or added manually. This directory is empty because the raw datasets are restricted and are distributed only to eligible members of the project. It is kept in the repository to preserve the project directory structure and ensure that the workflow runs as expected. Collaborators who have permission to use the raw datasets should place their copies in this directory in their local versions of this repository.

  • outputs/ contains compiled reports and figures produced by the workflow.

  • R/ contains functions written specifically for use in this workflow.

  • reports/ contains literate code for R Markdown reports rendered in the workflow.

  • renv/ contains the files and directories that the renv package uses to maintain R package dependencies within the project. The directory renv/library is a library that contains all packages currently used by the project. This directory, and all files and sub-directories within it, are generated and managed by the renv package. Users should not edit these manually.

  • renv.lock is the renv lockfile, which records enough metadata about every package used in this project that it can be re-installed on a new machine. This file is generated by the renv package and should not be edited manually.

  • .Rprofile is the project R profile generated when renv is initialised for the first time. It is run automatically every time R is started within this project, and renv uses it to configure the R session to use the renv project library.

  • packages.R lists all R package dependencies required by the workflow.

  • _targets.R defines the steps in the workflow’s data ingest, data processing, data analysis, and reporting pipeline.
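
To make the roles of packages.R, R/, data-raw/ and reports/ concrete, a _targets.R file for this layout might look roughly like the sketch below. This is an illustration only, not the file shipped with the template: every target name and file name (data-raw/example.csv, reports/example_report.Rmd) is hypothetical, and loading R/ with tar_source() and rendering with tarchetypes::tar_render() are assumptions about how such a pipeline could be wired.

```r
# _targets.R: illustrative sketch only. All target names and file names
# below are hypothetical and are not taken from this template.

library(targets)

# Attach the package dependencies listed in packages.R
source("packages.R")

# Load the project-specific functions kept in R/
tar_source("R")

list(
  # Data ingest: track the raw file so that edits to it invalidate
  # every downstream target
  tar_target(raw_data_file, "data-raw/example.csv", format = "file"),
  tar_target(raw_data, read.csv(raw_data_file)),

  # Data processing: base R is used here; a project would normally call
  # its own functions from R/ instead
  tar_target(clean_data, na.omit(raw_data)),

  # Data analysis
  tar_target(summary_stats, summary(clean_data)),

  # Reporting: render an R Markdown report kept in reports/
  tarchetypes::tar_render(report, "reports/example_report.Rmd")
)
```

With a file along these lines in place, targets::tar_make() runs the pipeline and rebuilds only the targets whose inputs have changed.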

Reproducibility

R package dependencies

This project was built using R 4.4.0 and uses the renv framework to record R package dependencies and their versions. The packages and versions used are recorded in renv.lock, and the code used to manage dependencies is in renv/ and other files in the root project directory. After starting an R session in the project working directory, run renv::restore() to install the R package dependencies.
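
A minimal sketch of that restore step, assuming the R session is started in the project root where renv.lock lives. The helper r_version_matches() is hypothetical and is not part of renv; it only compares the running R version with the 4.4.0 stated above.

```r
# Hypothetical helper: TRUE when the running R version equals the version
# the project was built with (4.4.0 according to this README).
r_version_matches <- function(built_with = "4.4.0") {
  getRversion() == built_with
}

if (!r_version_matches()) {
  message(
    "Running R ", as.character(getRversion()),
    "; this project was built with R 4.4.0."
  )
}

# Install the package versions recorded in renv.lock, then report any
# remaining differences between the lockfile and the project library.
# The guards simply skip this step outside an renv-enabled project.
if (requireNamespace("renv", quietly = TRUE) && file.exists("renv.lock")) {
  renv::restore(prompt = FALSE)
  renv::status()
}
```

A version mismatch does not stop the restore in this sketch; it only prints a note, since renv can still install the recorded package versions on a different R version, although results may then differ.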