This repository contains R code for generating clinical genetic reports from the output of a Whole Genome Sequencing (WGS) pipeline. The main functionality transforms raw pipeline data into structured reports (PDF/HTML) that provide variant interpretation and annotations for clinical analysis.
- Automated report generation: Generates PDF reports based on patient-specific data.
- Customisable input parameters: Allows parameterised execution for single-case analysis.
- Data visualisation: Includes PCA plots and tabular summaries of variants.
- Standards compliant: Adheres to ACMG guidelines for variant interpretation.
We export 4 CSV datasets from the paths specified below:
- Core Dataset
  - Path: PanelAppData_genes_combined_core
  - Contains fields: id, Gene, confidence_level, mode_of_inheritance, name, disease_group, disease_sub_group, status
- Minimal Dataset
  - Path: PanelAppData_genes_combined_minimal
  - Contains fields: panel ID, Gene symbol
- Metadata Dataset
  - Path: PanelAppData_genes_combined_meta_names.csv
  - Contains fields: panel id, panel name
- Metadata Counts
  - Path: PanelAppData_genes_combined_meta_variable_counts.csv
  - Description: Contains metadata counts for each variable
  - Contains:
# A tibble: 52 × 2
Column_Name Unique_Counts
<chr> <int>
1 id 447
2 entity_type 1
3 Gene 6280
4 confidence_level 4
5 penetrance 4
6 mode_of_pathogenicity 12
7 publications 13102
8 evidence 3653
9 phenotypes 21636
10 mode_of_inheritance 15
# ℹ 42 more rows
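
A per-column count table like the one above could be produced along the following lines. This is a minimal base-R sketch, not the repository's own code: the toy data frame `panel_genes` and the helper `count_unique_values()` are hypothetical, and only the output column names (`Column_Name`, `Unique_Counts`) are taken from the table shown.

```r
# Hypothetical toy data standing in for the combined PanelApp gene table.
panel_genes <- data.frame(
  id = c(1L, 1L, 2L, 3L),
  Gene = c("BRCA1", "BRCA2", "BRCA1", "TP53"),
  confidence_level = c(3L, 3L, 2L, 3L),
  stringsAsFactors = FALSE
)

# Count the distinct values in every column (NA, if present, counts as a value).
count_unique_values <- function(df) {
  data.frame(
    Column_Name = names(df),
    Unique_Counts = unname(vapply(df, function(x) length(unique(x)), integer(1))),
    stringsAsFactors = FALSE
  )
}

meta_variable_counts <- count_unique_values(panel_genes)
print(meta_variable_counts)
```

The result is a plain data frame, so it can be written out with `write.csv(meta_variable_counts, ..., row.names = FALSE)` if a CSV is needed.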
Currently, private WGS data is excluded from version control by .gitignore entries for
output (some pipeline output for testing) and reports (the reports generated by Dante).
Lastly, the main analysis data is sourced from outside the repository itself. The location is currently hardcoded for testing:
data_path / file_list (path to Guru output .Rds).
.
├── README.md
├── Rmd
├── images
├── output
└── reports
..
└── guru_data_path_to_files
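
As an illustration of how `file_list` might be derived from `data_path` rather than hardcoded, here is a minimal base-R sketch. It is an assumption about usage, not the repository's code: the demo directory, the placeholder file names, and the `guru_data` object are hypothetical, and a temporary directory stands in for the real Guru output location.

```r
# Demo directory with placeholder files; in practice data_path would point
# at the Guru output directory outside the repository.
data_path <- file.path(tempdir(), "guru_demo")
dir.create(data_path, showWarnings = FALSE)
saveRDS(data.frame(sample = "ABC123"), file.path(data_path, "case_1.Rds"))
writeLines("not an Rds file", file.path(data_path, "notes.txt"))

# Collect every .Rds file under data_path (extension matched case-insensitively).
file_list <- list.files(
  data_path,
  pattern = "\\.rds$",
  ignore.case = TRUE,
  full.names = TRUE
)

# Fail early with a clear message if nothing was found.
if (length(file_list) == 0) {
  stop("No .Rds files found in: ", data_path)
}

# Read each file into a list, keyed by file name without its extension.
guru_data <- lapply(file_list, readRDS)
names(guru_data) <- tools::file_path_sans_ext(basename(file_list))
```

Failing early on an empty `file_list` avoids generating empty reports when the external path is wrong.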
- R (>= 4.0)
- LaTeX (e.g., TeX Live, MiKTeX) with xelatex
- R packages: rmarkdown, kableExtra, dplyr, digest, stringr, ggplot2
Install packages using:

    install.packages(c("rmarkdown", "kableExtra", "dplyr", "digest", "stringr", "ggplot2"))

Ensure the following input files are available:
- WGS pipeline output files from ACMGuru/GuRu (i.e. .Rds).
- Phenotypic and clinical metadata.
- Supporting annotation files (e.g., population reference data).
- Prepare input data: Place the required data files in the appropriate directories.
- Set parameters: Update parameters in the report.Rmd file or pass them dynamically via the R script.
- Run report generation, either from an R session:

      source("report_runner.R")

  or from a shell:

      Rscript report_runner.R

  This script processes the input data and generates reports for each case.
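
Passing parameters dynamically, as mentioned in the steps above, could look roughly like the sketch below. It relies on `rmarkdown::render()` and its `params`, `output_file`, and `output_dir` arguments; everything else is an assumption. The wrapper `render_case_report()` is hypothetical, the parameter names `sample_id` and `priority` would have to match the `params` declared in the YAML header of report.Rmd, and the location of report.Rmd relative to the working directory is not specified in this README.

```r
# Hypothetical wrapper around rmarkdown::render(); sample_id and priority are
# assumed parameter names and must match the params declared in report.Rmd.
render_case_report <- function(sample_id, priority, output_file,
                               rmd = "report.Rmd",
                               output_dir = "reports") {
  rmarkdown::render(
    input = rmd,
    output_file = output_file,
    output_dir = output_dir,
    params = list(sample_id = sample_id, priority = priority),
    envir = new.env()  # render each case in a fresh environment
  )
}

# Only attempt a render when the template and rmarkdown are both available.
if (file.exists("report.Rmd") && requireNamespace("rmarkdown", quietly = TRUE)) {
  render_case_report(
    sample_id = "ABC123",
    priority = 1L,
    output_file = "report_priority_1_sample_ABC123.pdf"
  )
}
```

Rendering in a fresh environment keeps objects from one case from leaking into the next when the function is called in a loop.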
- Individual PDF reports, named by sample ID and rank (e.g., report_priority_1_sample_ABC123.pdf).
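
The naming scheme in the example above can be reproduced with a small helper. This is a sketch inferred from the single example file name; `report_file_name()` is a hypothetical function, and the actual naming code in the repository may differ.

```r
# Build a report file name from a sample ID and its priority rank.
# The pattern is inferred from the example report_priority_1_sample_ABC123.pdf.
report_file_name <- function(sample_id, priority) {
  stopifnot(
    is.character(sample_id), length(sample_id) == 1, nzchar(sample_id),
    is.numeric(priority), length(priority) == 1, priority >= 1
  )
  sprintf("report_priority_%d_sample_%s.pdf", as.integer(priority), sample_id)
}

example_name <- report_file_name("ABC123", 1)
print(example_name)
```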
For issues, please create a GitHub issue or contact the repository maintainer.
