DylanLawless/dante

Dante for genetic report generation

This repository contains R code for generating clinical genetic reports from the output of a Whole Genome Sequencing (WGS) pipeline. The main functionality transforms raw pipeline data into structured reports (PDF/HTML) that provide variant interpretation and annotations for clinical analysis.

Overview

Key Features

  • Automated report generation: Generates PDF reports based on patient-specific data.
  • Customisable input parameters: Allows parameterised execution for single-case analysis.
  • Data visualisation: Includes PCA plots and tabular summaries of variants.
  • Standards compliant: Adheres to ACMG guidelines for variant interpretation.

Exported Datasets

Four CSV datasets are exported, at the paths listed below:

  1. Core Dataset

    • Path: PanelAppData_genes_combined_core
    • Contains fields:
      • id
      • Gene
      • confidence_level
      • mode_of_inheritance
      • name
      • disease_group
      • disease_sub_group
      • status
  2. Minimal Dataset

    • Path: PanelAppData_genes_combined_minimal
    • Contains fields:
      • panel ID
      • Gene symbol
  3. Metadata Dataset

    • Path: PanelAppData_genes_combined_meta_names.csv
    • Contains fields:
      • panel id
      • panel name
  4. Metadata Counts

    • Path: PanelAppData_genes_combined_meta_variable_counts.csv
    • Description: Contains the number of unique values for each variable (first 10 of 52 rows shown):

      # A tibble: 52 × 2
         Column_Name           Unique_Counts
         <chr>                         <int>
       1 id                              447
       2 entity_type                       1
       3 Gene                           6280
       4 confidence_level                  4
       5 penetrance                        4
       6 mode_of_pathogenicity            12
       7 publications                  13102
       8 evidence                       3653
       9 phenotypes                    21636
      10 mode_of_inheritance              15
      # ℹ 42 more rows
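
A summary like the Metadata Counts table can be produced by counting the distinct values in each column. The sketch below is an assumption about how such a table might be derived, not the code Dante actually uses. The function name count_unique_per_column and the toy data frame are hypothetical; only the output column names Column_Name and Unique_Counts are taken from the table above.

```r
# Hypothetical helper: count distinct values per column of a data frame.
count_unique_per_column <- function(df) {
  data.frame(
    Column_Name = names(df),
    Unique_Counts = vapply(df, function(x) length(unique(x)), integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Toy stand-in for the combined PanelApp gene table (values are made up).
toy <- data.frame(
  id = c(1, 1, 2),
  Gene = c("GENE_A", "GENE_B", "GENE_A"),
  confidence_level = c(3, 3, 3)
)

counts <- count_unique_per_column(toy)
print(counts)
```

In practice the toy data frame would be replaced by the real combined table, for example one loaded with read.csv().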

Tree

Currently, the private WGS data is excluded from version control via .gitignore. This covers output (some pipeline output for testing) and reports (the reports generated by Dante). The main analysis data is sourced from outside the repository itself. Its location is currently hardcoded for testing: data_path / file_list (path to the Guru output .Rds files).

.
├── README.md
├── Rmd
├── images
├── output
└── reports

..
└── guru_data_path_to_files
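
Because the input location is hardcoded, it may help to see what a data_path / file_list pair could look like. The following is a minimal sketch under assumptions: the directory name is taken from the tree above, while the .Rds file pattern, the helper name list_guru_files and the use of readRDS() on each file are illustrative guesses rather than Dante's actual code.

```r
# Hypothetical helper: list the Guru .Rds outputs found under a directory.
list_guru_files <- function(data_path) {
  list.files(data_path, pattern = "\\.[Rr]ds$", full.names = TRUE)
}

# Assumed location, one level above the repository (see the tree above).
data_path <- file.path("..", "guru_data_path_to_files")

if (dir.exists(data_path)) {
  file_list <- list_guru_files(data_path)
  # Each .Rds file holds one serialised R object; read them into a list.
  guru_data <- lapply(file_list, readRDS)
  names(guru_data) <- basename(file_list)
} else {
  message("Directory not found: ", data_path)
}
```

Keeping the path in a single variable makes it easier to later replace the hardcoded value with a parameter or an environment variable.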

Requirements

Software

  • R (>= 4.0)
  • LaTeX (e.g., TeX Live, MiKTeX) with xelatex

R Packages

  • rmarkdown
  • kableExtra
  • dplyr
  • digest
  • stringr
  • ggplot2

Install packages using:

install.packages(c("rmarkdown", "kableExtra", "dplyr", "digest", "stringr", "ggplot2"))
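
When some of these packages are already installed, reinstalling all of them is unnecessary. As an optional variation (not part of the original instructions), the sketch below reports only the missing ones; missing_packages is a hypothetical helper name, and the install call is left commented out so that nothing is downloaded unintentionally.

```r
required <- c("rmarkdown", "kableExtra", "dplyr", "digest", "stringr", "ggplot2")

# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment to install them:
  # install.packages(to_install)
} else {
  message("All required packages are installed.")
}
```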

Usage

Input Data

Ensure the following input files are available:

  • WGS pipeline output files from ACMGuru/GuRu (i.e. .Rds files).
  • Phenotypic and clinical metadata.
  • Supporting annotation files (e.g., population reference data).

Execution Steps

  1. Prepare input data: Place the required data files in the appropriate directories.

  2. Set parameters: Update parameters in the report.Rmd file or pass them dynamically via the R script.

  3. Run Report Generation:

    # From within an R session:
    source("report_runner.R")

    # Or from a shell:
    Rscript report_runner.R
    

    This script processes the input data and generates reports for each case.

Output

  • Individual PDF reports, named by sample ID and rank (e.g., report_priority_1_sample_ABC123.pdf).
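
The parameterised run and the naming pattern above can be sketched with rmarkdown::render(). This is an illustration under assumptions, not the contents of report_runner.R: the parameter names sample_id and priority, the helper names report_filename and render_case, and the location Rmd/report.Rmd are hypothetical, and report.Rmd would need to declare matching params in its YAML header.

```r
# Hypothetical helper: build a file name following the pattern shown above.
report_filename <- function(priority, sample_id) {
  sprintf("report_priority_%d_sample_%s.pdf", as.integer(priority), sample_id)
}

# Hypothetical wrapper: render one case into the reports directory.
# Assumes the Rmd file declares `sample_id` and `priority` under `params:`.
render_case <- function(sample_id, priority,
                        input = file.path("Rmd", "report.Rmd"),
                        output_dir = "reports") {
  rmarkdown::render(
    input = input,
    output_file = report_filename(priority, sample_id),
    output_dir = output_dir,
    params = list(sample_id = sample_id, priority = priority)
  )
}

print(report_filename(1, "ABC123"))
# [1] "report_priority_1_sample_ABC123.pdf"
```

With the rmarkdown package installed and a suitable report.Rmd in place, a single case could then be rendered with render_case("ABC123", 1).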

Support

For issues, please create a GitHub issue or contact the repository maintainer.
