From dfcfc32c26884f827c2f5d5c1e83760c63d2c255 Mon Sep 17 00:00:00 2001
From: qclayssen
Date: Thu, 13 Mar 2025 14:00:40 +1100
Subject: [PATCH 01/36] add document for https://trello.com/c/NEVSbxPK/1656-sash-planning-documentation

---
 docs/README.md | 750 +++++++++++++++++-
 docs/images/sash_overview_qc.png | Bin 0 -> 44637 bytes
 .../sash_workflow_overview_diagram_Vqc.pptx | Bin 0 -> 56740 bytes
 .../~$sash_workflow_overview_diagram_Vqc.pptx | Bin 0 -> 165 bytes
 docs/output.md | 111 ++-
 5 files changed, 829 insertions(+), 32 deletions(-)
 create mode 100644 docs/images/sash_overview_qc.png
 create mode 100644 docs/images/sash_workflow_overview_diagram_Vqc.pptx
 create mode 100644 docs/images/~$sash_workflow_overview_diagram_Vqc.pptx

diff --git a/docs/README.md b/docs/README.md
index 5ec0f466..b9a29687 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,8 +1,746 @@
-# umccr/sash: Documentation
+# Sash Workflow Overview
 
-The umccr/sash documentation is split into the following pages:
+![Summary](images/sash_overview_qc.png)
 
-- [Usage](usage.md)
-  - An overview of how the pipeline works, how to run it and a description of all of the different command-line flags.
-- [Output](output.md)
-  - An overview of the different results produced by the pipeline and how to interpret them.
+The **sash Workflow** comprises three primary pipelines:
+
+* **Somatic Small Variants (SNV somatic)**
+* **Somatic Structural Variants (SV somatic)**
+* **Germline Variants (SNV germline)**
+
+These pipelines utilise **Bolt**, a Python package designed for modular processing, and leverage outputs from the [**DRAGEN**](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) **Variant Caller** alongside the Hartwig Medical Foundation [**WiGiTS**](https://github.com/hartwigmedical/hmftools/tree/master) toolkit, run via Oncoanalyser.
Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation.
+
+# [**HMFtools WiGiTS**](https://github.com/hartwigmedical/hmftools/tree/master)
+
+**HMFtools WiGiTS** is a comprehensive open-source suite of genome and transcriptome analysis tools for cancer research and diagnostics. Developed by the Hartwig Medical Foundation (HMF), WiGiTS comprises various components for SNV calling, structural variant analysis, copy number analysis, and clinical reporting. UMCCR’s Sash workflow relies on specific WiGiTS components (run via the HMF **Oncoanalyser** Nextflow pipeline) to enhance variant calling and interpretation:
+
+- [**SAGE (Somatic Alterations in Genome)**](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md)**:** SNV/MNV/indel caller. SAGE performs tiered variant calling with increased sensitivity in regions of high prior likelihood. Notably, it targets a curated panel of \~10,000 known cancer hotspot mutations (from sources like the Cancer Genome Interpreter, CIViC, OncoKB) at the highest sensitivity tier. This allows recovery of low-allele-fraction variants in clinically relevant hotspots that the standard caller (like DRAGEN) might miss. SAGE outputs VCF files of additional somatic variants, with filters indicating confidence levels (e.g. hotspot, panel, high/low confidence). The Sash pipeline uses SAGE’s **somatic VCF output** to “rescue” missed tumor variants in important genes.
+
+- [**PURPLE**](https://github.com/hartwigmedical/hmftools/tree/master/purple)**:** A tool for copy number analysis, tumor purity and ploidy estimation, and identification of driver events. PURPLE integrates read depth ratios (from COBALT) and B-allele frequencies (from AMBER) to calculate allele-specific copy number across the genome.
It infers tumor **purity** (proportion of tumor cells in the sample) and **ploidy** (average chromosome copy number), and uses these to distinguish somatic from germline variants and to highlight key genomic events. In Sash, PURPLE’s output (copy number segments, purity/ploidy info, etc.) is parsed to inform filtering (e.g. flagging variants in loss-of-heterozygosity regions) and to provide metrics like tumor mutation burden (TMB) and microsatellite instability (MSI) for reporting.
+
+# Workflows
+
+## Somatic small variants (SNV/Indel, tumor)
+
+#### General
+
+In the **Somatic Small Variants** workflow, variant detection is performed using the **DRAGEN Variant Caller** together with **Oncoanalyser**, which relies on [**SAGE**](https://github.com/hartwigmedical/hmftools/tree/master/sage) (Somatic Alterations in Genome) and [**PURPLE**](https://github.com/hartwigmedical/hmftools/tree/master/purple) outputs. It is structured into four steps: **Rescue**, **Annotation**, **Filter**, and **Report**. The final outputs include an **HTML report** summarising the results.
+
+#### Summary
+
+1. **Rescue** variants using SAGE to recover low-frequency alterations in clinically important hotspots.
+2. **Annotate** variants with clinical and functional information using PCGR.
+3. **Filter** variants based on allele frequency (AF), supporting reads (AD) and population frequency (gnomAD AF), removing low-confidence and common variants while retaining those of potential clinical significance (hotspots, high-impact, etc.).
+4. **Report** the final annotated variants in comprehensive HTML reports (PCGR, Cancer Report, LINX, MultiQC).
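
As a rough illustration of step 3, the filter-with-rescue logic might look like the Python sketch below. The thresholds are the ones quoted in the Filter section of this document (tumor AF below 0.10, fewer than 4 supporting reads, gnomAD AF of 0.01 or more, 5 or more panel-of-normals hits; rescue on hotspots, PCGR tier 1 or 2, COSMIC count of 10 or more, TCGA count of 5 or more). The `Variant` class, its field names, the filter labels and the function names are hypothetical and are not Bolt's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    """Hypothetical, simplified view of one annotated somatic call."""
    tumor_af: float                  # tumor allele frequency (0..1)
    tumor_ad: int                    # reads supporting the alternate allele
    gnomad_af: float = 0.0           # population allele frequency
    pon_count: int = 0               # hits in the panel of normals
    hotspot: bool = False            # flagged as a known hotspot
    pcgr_tier: Optional[int] = None  # PCGR tier, if one was assigned
    cosmic_count: int = 0
    tcga_count: int = 0


def failed_filters(v: Variant) -> List[str]:
    """Return the (illustrative) names of the filters the variant fails."""
    failed = []
    if v.tumor_af < 0.10:
        failed.append("min_AF")
    if v.tumor_ad < 4:
        failed.append("min_AD")
    if v.gnomad_af >= 0.01:
        failed.append("gnomAD_common")
    if v.pon_count >= 5:
        failed.append("PON")
    return failed


def is_rescued(v: Variant) -> bool:
    """True when prior clinical evidence keeps the variant regardless."""
    return (
        v.hotspot
        or v.pcgr_tier in (1, 2)
        or v.cosmic_count >= 10
        or v.tcga_count >= 5
    )


def keep(v: Variant) -> bool:
    """A variant is retained if it passes every filter or is rescued."""
    return not failed_filters(v) or is_rescued(v)
```

Keeping the filter check and the rescue check as separate functions mirrors the way the document describes them: the filters record why a call looks unreliable, and the rescue decides whether that matters.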
+
+
+### Variant Calling integrations
+
+The **variant calling integrations** step uses variants from the **Somatic Alterations in Genome (SAGE)** variant caller, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed or filtered out. [SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage) targets known cancer hotspots (from sources like CGI, CIViC, OncoKB), prioritising predefined genomic regions of high clinical or biological relevance. This enables the recovery, in a VCF, of biologically significant variants that may have been missed otherwise.
+
+[https://github.com/hartwigmedical/hmftools/tree/master/sage](https://github.com/hartwigmedical/hmftools/tree/master/sage)
+[https://github.com/hartwigmedical/hmftools/tree/master/sage\#6-soft-filters](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters)
+
+* Employ the **SAGE tool** for targeted hotspot analysis to recover:
+    * Low-allele-frequency variants in hotspot genomic regions of clinical significance.
+* Hotspots are derived from:
+    * Cancer Genome Interpreter (CGI)
+    * CIViC \- Clinical interpretations of variants in cancer.
+    * OncoKB \- Precision Oncology Knowledge Base.
+* Outputs a VCF containing rescued variants.
+
+##### Inputs:
+
+* From DRAGEN: somatic small variant VCF
+    * ${meta.tumor\_id}.main.dragen.vcf.gz
+* From Oncoanalyser: SAGE VCF
+    * ${meta.tumor\_id}.main.sage.filtered.vcf.gz
+
+Input variants are restricted to chr 1..22 and chr X, Y, M.
+
+
+##### Output:
+
+* Rescue: VCF
+    * ${meta.tumor\_id}.rescued.vcf.g
+
+#### Details
+
+**Steps are:**
+
+1. 
**Select high-confidence SAGE calls in hotspot regions**
+    * Filter the SAGE output to retain only variants that pass quality filters and overlap with known hotspot regions.
+    * Hotspot regions are derived from databases such as:
+        * Cancer Genome Interpreter (CGI)
+        * CIViC (Clinical Interpretations of Variants in Cancer)
+        * OncoKB (Precision Oncology Knowledge Base)
+    * This ensures that only high-confidence variants in clinically relevant regions are considered.
+2. **Separate SAGE calls into existing and novel variants**
+    * Compare the input VCF and the filtered SAGE VCF to identify overlapping and unique variants.
+3. **Annotate existing somatic variant calls in the input VCF that are also present in the SAGE calls**
+    * Annotate variants that are re-called by SAGE:
+        * For each variant in the input VCF, check if it exists in the SAGE existing calls.
+        * For variants re-called by SAGE:
+            * If `SAGE FILTER=PASS` and input VCF `FILTER=PASS`:
+                * Set `INFO/SAGE_HOTSPOT` to indicate the variant is called by SAGE in a hotspot.
+            * If `SAGE FILTER=PASS` and input VCF `FILTER` is not `PASS`:
+                * Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is rescued by SAGE.
+                * Update `FILTER=PASS` to include the variant in the final analysis.
+            * If `SAGE FILTER` is not `PASS`:
+                * Append `SAGE_lowconf` to the `FILTER` field to flag low-confidence variants.
+    * Transfer SAGE `FORMAT` fields to the input VCF with a `SAGE_` prefix.
+4. **Combine annotated input VCF with novel SAGE calls**
+    * **Prepare novel SAGE calls. 
For each variant in the SAGE VCF missing from the input VCF:**
+        * Rename certain `FORMAT` fields in the novel SAGE VCF to avoid namespace collisions:
+            * For example, `FORMAT/SB` is renamed to `FORMAT/SAGE_SB`.
+        * Retain necessary `INFO` and `FORMAT` annotations while removing others to streamline the data.
+
+    **Finalize the rescued VCF file**
+
+    * The final VCF file includes:
+        * Original variants from the input VCF, annotated with SAGE information where applicable.
+        * Novel variants identified by SAGE in hotspot regions.
+        * Updated `FILTER` and `INFO` fields reflecting the rescue and annotation process.
+    * The rescued VCF provides a comprehensive set of variants for downstream analysis, prioritizing clinically significant mutations.
+
+### Annotation
+
+The **Annotation** step employs reference sources (GA4GH/GIAB problem region stratifications, GIAB high-confidence regions, gnomAD, Hartwig hotspots), the UMCCR panel of normals and the **Personal Cancer Genome Reporter** (PCGR) tool to enrich variants with detailed functional and clinical information using **ACMG** guidelines. PCGR classifies variants into **tiers** based on their clinical and biological significance and incorporates **mutational signature** analysis to provide insights into underlying mutational processes. To manage memory usage effectively, the input VCF file is divided into chunks, each containing up to 500,000 variants. Each chunk is processed independently through PCGR, and after annotation, the chunks are merged to produce an annotated VCF and TSV file.
+
+These annotations are used to decide which variants are retained or filtered in the next step.
+
+Summary:
+Use **PCGR** to enrich the VCF with:
+
+* Functional impact information (e.g., consequences, mutation hotspots).
+* Clinical relevance (e.g., tier classifications, mutational signatures).
+* Process VCF files in chunks of ≤500,000 variants each.
+* Merge annotated chunks into a unified VCF.
+
+##### Inputs:
+
+* Rescue VCF
+    * ${meta.tumor\_id}.main.sage.filtered.vcf.gz
+
+##### Output:
+
+* Annotated VCF
+    * ${meta.tumor\_id}.annotations.vcf.g
+
+#### Details
+
+**Steps are:**
+
+1. **Set FILTER to "PASS" for unfiltered variants**
+    * Iterate over the input VCF file and set the `FILTER` field to `PASS` for any variants that currently have no filter status (`FILTER` is `.` or `None`). This standardization is necessary for downstream tools.
+2. **Annotate the VCF against reference sources**
+    * Use **vcfanno** to add annotations to the VCF file:
+        * **gnomAD**
+        * **Hartwig Hotspots**
+        * **ENCODE Blacklist**
+        * **Genome in a Bottle High-Confidence Regions**: Mark high-confidence regions from the Genome in a Bottle benchmark.
+        * **Low and High GC Regions**: Mark regions with \<30% or \>65% GC content, compiled by GA4GH.
+        * **Bad Promoter Regions**: Annotate regions with poor coverage, compiled by GA4GH.
+3. **Annotate with UMCCR panel of normals counts**
+    * Use **vcfanno** and **bcftools** to annotate the VCF with counts from the **UMCCR panel of normals**, built from tumor-only Mutect2 calls from approximately 200 normal samples. This helps identify and filter out recurrent sequencing artifacts or germline variants.
+4. **Standardize the VCF fields**
+    * Add new `INFO` fields for use with **PCGR**:
+        * `TUMOR_AF`, `NORMAL_AF`: Tumor and normal allele frequencies.
+        * `TUMOR_DP`, `NORMAL_DP`: Tumor and normal read depths.
+    * Add the `AD` FORMAT field:
+        * `AD`: Allelic depths for the reference and alternate alleles.
+5. **Prepare VCF for PCGR annotation**
+    * Exclude unnecessary data from the VCF header, keeping only the INFO AF/DP fields.
+    * Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR.
+    * Set `FILTER` to `PASS` and remove all `FORMAT` and sample columns.
+
+6. 
**Run PCGR to annotate VCF against external sources**
+    * Use **PCGR** (Personal Cancer Genome Reporter) to annotate the VCF with clinical, functional, and biological information.
+    * Classify variants by tiers based on annotations and functional impact according to **ACMG** guidelines.
+    * Add `INFO` fields to the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `INTOGEN_DRIVER_MUT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`.
+    * External sources used during this step include **VEP**, **ClinVar**, **COSMIC**, **TCGA**, **ICGC**, **Open Targets Platform**, **CancerMine**, **DoCM**, **CBMDB**, **DisGeNET**, **Cancer Hotspots**, **dbNSFP**, **UniProt/SwissProt**, **Pfam**, **DGIdb**, and **ChEMBL**.
+7. **Transfer PCGR annotations to the full set of variants**
+    * Merge the PCGR annotations back into the original VCF file.
+    * Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available.
+    * Preserve the `FILTER` statuses and other annotations from the original VCF.
+8. **Filter variants to remove putative germline variants and artifacts while keeping known hotspots/actionable variants**
+    * **Keep variants**:
+        * Called by **SAGE** in known hotspots (CGI, CIViC, OncoKB) regardless of other evidence.
+        * With PCGR TIER 1 and 2 classifications, indicating strong or potential clinical significance according to ACMG guidelines.
+        * All driver mutations from:
+            * **IntOGen**
+            * mutation hotspots
+            * ClinVar pathogenic or uncertain significance
+            * COSMIC count ≥10
+            * TCGA pancancer count ≥5
+            * ICGC PCAWG count ≥3.
+    * **Apply filters to other variants**:
+        * Remove variants with `AF < 10%`.
+        * Remove common variants in gnomAD (`population AF ≥ 1%`), adding them to the germline set.
+        * Remove variants present in ≥5 samples of the Panel of Normals.
+        * Remove indels in "bad promoter" regions (as defined by GA4GH).
+        * Remove variants overlapping the ENCODE blacklist.
+        * Remove variants with variant depth `VD < 4`.
+        * Remove variants with `VD < 6` and overlapping a low complexity region.
+        * Remove **VarDict** strand-biased variants unless supported by other callers.
+9. **Report passing variants using PCGR, classified by the ACMG tier system**
+    * Generate the final report of variants classified according to clinical significance using **PCGR**, ready for downstream analysis.
+
+### Filter
+
+The **Filter** step applies a series of stringent filters to somatic variant calls in the VCF file, ensuring the retention of high-confidence and biologically meaningful variants.
+
+##### Inputs:
+
+* Annotated VCF
+    * ${meta.tumor\_id}.annotations.vcf.gz
+
+##### Output:
+
+* Filter VCF
+    * ${meta.tumor\_id}\*filters\_set.vcf.gz
+
+**Filters:**
+
+**1\. Technical Quality Filters**
+
+#### **1.1 Allele Frequency (20% are also excluded.**
+* This step reduces contamination from sequencing artifacts or undetected germline variants.
+
+**3\. Rescue and Clinical Significance Filters**
+
+These variants are retained even if they fail technical filters.
+
+#### **3.1 Hotspot Rescue**
+
+* Variants located in Hartwig, OncoKB, or other curated hotspot databases are retained, even if they fail other quality or frequency filters.
+
+#### **3.2 Reference Database Hit Count Rescue**
+
+* Variants with strong prior evidence in COSMIC, TCGA, or ICGC are retained, even if they fail standard filtering:
+    * COSMIC count ≥10
+    * TCGA pan-cancer count ≥5
+    * ICGC PCAWG count ≥5
+
+#### **3.3 ClinVar Pathogenicity Rescue**
+
+* Variants with any of the following ClinVar classifications are retained:
+    * Likely Pathogenic
+    * Pathogenic
+    * Uncertain Significance (VUS) with strong clinical evidence
+
+* Allele Frequency (AF) Filter:
+    * Excludes variants with a tumor allele frequency below a threshold of 0.1.
+* Allele Depth (AD) Filter:
+    * Removes variants with fewer than 4 supporting reads in the tumor sample.
+* Degraded Mappability AD Filter:
+    * Applies stricter thresholds in regions with low sequence complexity or poor mappability, where errors are more likely.
+    * Requires a minimum of 6 supporting reads in low sequence complexity (difficult) regions to retain the variant; variants with Tumor\_ad \< 6 are removed.
+* Non-GIAB AD Filter:
+    * Removes variants not confirmed by the Genome in a Bottle (= 0.01
+* Panel of Normals (PON) Germline Filter:
+    * Filters out variants with an allele frequency in the PON below 0.20.
+    * Additionally excludes variants that occur in 5 or more PON samples to mitigate germline contamination or recurrent artifacts (PON\_COUNT \>= 5).
+* Filter rescue:
+
+Variants meeting any of the following criteria are flagged as `CLINICAL_POTENTIAL_RESCUE` and are **not filtered out**:
+
+* **Reference Database Hit Counts**:
+    * Variants with a **COSMIC count** of ≥10.
+    * Variants with a **TCGA pan-cancer count** of ≥5.
+    * Variants with an **ICGC PCAWG count** of ≥5.
+* **ClinVar Significance**:
+    * Variants with ClinVar classifications matching the following categories are rescued:
+        * `conflicting_interpretations_of_pathogenicity`
+        * `likely_pathogenic`
+        * `pathogenic`
+        * `uncertain_significance`
+* **Mutation Hotspots**:
+    * Variants identified as hotspots in:
+        * `HMF_HOTSPOT`
+        * `PCGR_MUTATION_HOTSPOT`
+* **PCGR Tiers**:
+    * Variants classified as:
+        * `TIER_1`
+        * `TIER_2`
+
+### Report
+
+The **Report** step utilises the **Personal Cancer Genome Reporter (PCGR)**.
+
+##### Inputs:
+
+* PURPLE purity
+* Filter VCF
+* DRAGEN VCF
+
+##### Output:
+
+* PCGR report
+    * ${meta.tumor\_id}.pcgr\_acmg.grch38.html
+
+1. **Generate BCFtools Statistics on the Input VCF:**
+   The code runs a helper function (`bcftools_stats_prepare`) to create a modified version of the input VCF, adjusting quality scores so that `bcftools stats` can produce more meaningful outputs.
It then executes `bcftools stats` to gather statistics on variant quality and distribution, storing the results in a text file.
+2. **Calculate Allele Frequency Distributions:**
+   The `allele_frequencies` function uses external tools (bcftools, bedtools) to:
+    * Filter and normalize variants according to high-confidence regions.
+    * Extract allele frequency data from tumor samples.
+    * Produce both a global allele frequency summary and a subset of allele frequencies restricted to key cancer genes.
+3. **Compare Variant Counts From Two Variant Sets (DRAGEN vs. Bolt)**
+    * The code counts the total number and types of variants (SNPs, indels, others) passing filters in both the DRAGEN VCF and the filtered Bolt VCF.
+4. **Count Variants by Processing Stage**
+5. **Parse Purity and Ploidy Information (PURPLE Data)**
+6. **Run PCGR Annotation**
+
+## Somatic structural variants
+
+The **Somatic Structural Variants (SVs) pipeline** identifies and annotates **large-scale genomic alterations**, including **deletions, duplications, inversions, insertions, and translocations** in tumor samples. This step integrates outputs from the **DRAGEN Variant Caller**, **GRIDSS2** and PURPLE, applies filtering criteria, and prioritizes clinically significant structural variants. The analysis involves processing, annotating, and prioritizing variants to identify those with clinical and biological significance, in several key steps:
+
+### **Summary:**
+
+1. **GRIPSS filtering:**
+    * GRIPSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and RepeatMasker annotations. Some of the reference data, such as known\_fusions, are specific to UMCCR.
+2. **PURPLE**
+    * Combines the GRIPSS-filtered SV calls with copy number variation (CNV) data and tumor purity/ploidy estimates.
PURPLE adjusts SV breakpoints based on copy number transitions and robustly classifies events as somatic versus germline.
+3. **Annotation**
+    * Combine SV calls from GRIPSS with CNV data from PURPLE.
+    * Annotate variants using [SnpEff](https://github.com/pcingola/SnpEff).
+4. **Prioritisation**
+    * Prioritise SV annotations with a module based on [AstraZeneca-NGS simple\_sv\_annotation](https://github.com/AstraZeneca-NGS/simple_sv_annotation), using curated reference data including UMCCR panel genes, tumor suppressor gene lists, Hartwig known fusion pairs and [APPRIS](https://ngdc.cncb.ac.cn/databasecommons/database/id/323) data.
+    * Prioritise variants based on clinical relevance and support metrics.
+5. **Report**
+    * Cancer report
+    * MultiQC
+6. **Assign SV Types:**
+    * Classify SVs as duplications or deletions based on copy number thresholds.
+    * Split variants into separate files for structural variants (SVs) and copy number variants (CNVs).
+7. **Annotate and Prioritize Variants:**
+    * Use **SnpEff** to annotate variants with gene-level and functional impact information.
+    * Prioritize variants based on clinical relevance and support metrics.
+    * Generate TSV (tab-separated values) files summarizing the prioritized SVs and CNVs.
+8. **Generate Summary Reports:**
+    * Create TSV (tab-separated values) files summarizing the prioritized SVs and CNVs for downstream analysis and reporting.
+
+
+
+
+### Input files
+
+**Primary SV VCFs:**
+
+* GRIDSS2
+    * ${meta.tumor\_id}.gridss.vcf.gz
+
+### Details
+
+**Detailed steps:**
+
+1. **GRIPSS filtering:**
+    * Evaluate split-read and paired-end support; discard variants with low support.
+    * Apply panel-of-normals filtering to remove artifacts observed in normal samples.
+    * Retain variants overlapping known oncogenic fusion hotspots (using UMCCR-curated lists).
+    * Exclude variants in repetitive regions based on RepeatMasker annotations.
+2. 
**PURPLE:**
+    * Merge SV calls with CNV segmentation data.
+    * Estimate tumor purity and ploidy.
+    * Adjust SV breakpoints based on copy number transitions.
+    * Classify SVs as somatic or germline.
+3. **Annotation**
+    * Compile SV and CNV information into a unified VCF file.
+    * Extend the VCF header with PURPLE-related INFO fields (e.g., PURPLE\_baf, PURPLE\_copyNumber).
+    * Convert CNV records from TSV format into VCF records with appropriate SVTYPE tags (e.g., 'DUP' for duplications, 'DEL' for deletions).
+    * Run snpEff to annotate the unified VCF with functional information such as gene names, transcript effects, and coding consequences.
+4. **Prioritization**
+    * Run the prioritization module (forked from the AstraZeneca simple\_sv\_annotation tool) using reference data files including known fusion pairs, known fusion 5′ and 3′ lists, key genes, and key tumor suppressor genes.
+    * Classify variants:
+        * **Structural Variants (SVs):** Variants labeled with the source `sv_gridss`.
+        * **Copy Number Variants (CNVs):** Variants labeled with the source `cnv_purple`.
+    * Prioritise variants on a 4 tier system \- 1 (high) \- 2 (moderate) \- 3 (low) \- 4 (no interest):
+        * exon loss
+            * on cancer gene list (1)
+            * other (2)
+        * gene fusion
+            * paired (hits two genes)
+                * on list of known pairs (1) (curated by [HMF]())
+                * one gene is a known promiscuous fusion gene (1) (curated by [HMF]())
+                * on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2)
+                * other:
+                    * one or two genes on cancer gene list (2)
+                    * neither gene on cancer gene list (3)
+            * unpaired (hits one gene)
+                * on cancer gene list (2)
+                * others (3)
+        * **upstream or downstream (a specific type of fusion, e.g. 
one gene comes under the control of another gene's promoter and becomes over-expressed)**
+            * on cancer gene list genes (2)
+        * LoF or HIGH impact in a tumor suppressor
+            * on cancer gene list (2)
+            * other TS gene (3)
+        * other (4)
+    * Filter low-quality calls:
+        * Keep variants with sufficient read support (e.g., split reads).
+        * Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`.
+        * Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1.
+    * The module assigns a priority tier to each variant (ranging from Tier 1 for high priority to Tier 4 for no interest) and populates the INFO fields:
+        * SIMPLE\_ANN: A simplified annotation string that includes SV type, effect, involved genes, transcript(s), a description, and the assigned tier.
+        * SV\_TOP\_TIER: A numeric field indicating the highest priority tier for the variant.
+    * The unified VCF is then split into separate files for SVs and CNVs using bcftools, and TSV summary reports are generated.
+5. **Report**
+    * Cancer Report: Integrates the prioritized SV data with somatic SNVs, CNVs, and quality metrics to provide a comprehensive overview of the tumor’s genomic alterations. This report includes detailed tables, a fusion gene summary, and a Circos plot (produced by PURPLE) that visualizes copy number and SV data.
+    * MultiQC Report: Aggregates quality control metrics from GRIDSS2, PURPLE, LINX, and the annotation/prioritization steps, providing an overall assessment of data quality.
+
+6. **Obtain Input Structural Variants:**
+    * **Source Data:**
+        * Obtain the structural variant VCF file generated by **PURPLE**, which integrates data from **GRIDSS** (for SV detection) and **PURPLE** (for copy number analysis).
+        * The input includes both structural variants and copy number changes detected in the tumor sample.
+7. 
**Assign Structural Variant Types:**
+    * Classify variants by source (`sv_gridss` for SVs, `cnv_purple` for CNVs), prioritise them on the 4 tier system and filter low-quality calls, as described under **Prioritization** above.
+8. **Generate Summary Reports**
+
+
+
+## Germline small variants
+
+Select passing variants in the given [gene panel transcript regions](https://github.com/umccr/gene_panels/tree/main/germline_panel), built from the PMCC familial cancer clinic list, then generate the CPSR report.
+
+1. **Prepare**
+    1. **Selection of Passing Variants:**
+        1. Raw germline variant calls (e.g. 
from DRAGEN or an ensemble caller) are filtered to retain only those variants marked as PASS (or with no filter flag).
+    2. **Selection of Gene Panel Variants:**
+        1. The filtered variants are then further restricted to regions defined by a gene panel transcript regions file.
+2. **Report:** CPSR
+
+The CPSR (Cancer Predisposition Sequencing Report) includes the following:
+
+**Settings**:
+
+* Sample metadata
+* Report configuration
+* Virtual gene panel
+
+**Summary of Findings**:
+
+* Variant statistics
+
+**Variant Classification**:
+
+ClinVar and Non-ClinVar
+
+* Class 5 \- Pathogenic variants
+* Class 4 \- Likely Pathogenic variants
+* Class 3 \- Variants of Uncertain Significance (VUS)
+* Class 2 \- Likely Benign variants
+* Class 1 \- Benign variants
+* Biomarkers
+
+PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330):
+
+* **Tier 1 (High):** Highest priority variants with strong clinical relevance.
+* **Tier 2 (Moderate):** Variants with potential clinical significance.
+* **Tier 3 (Low):** Variants with uncertain significance.
+* **Tier 4 (No Interest):** Variants unlikely to be clinically relevant.
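
The **Prepare** step of the germline workflow described above (keep passing variants, then restrict them to the gene panel regions) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: records are plain dictionaries rather than real VCF records, panel regions follow the BED convention (0-based start, exclusive end), and the function names and the example coordinates are hypothetical, not taken from Bolt.

```python
from typing import Dict, List, Tuple

# (chromosome, 0-based start, exclusive end), as in a BED file
Region = Tuple[str, int, int]


def is_passing(filter_field: str) -> bool:
    """Treat PASS and an unset FILTER ('.' or empty) as passing."""
    return filter_field in ("PASS", ".", "")


def in_panel(chrom: str, pos: int, regions: List[Region]) -> bool:
    """Check whether a 1-based VCF position falls inside any panel region."""
    return any(c == chrom and start < pos <= end for c, start, end in regions)


def select_panel_variants(records: List[Dict], regions: List[Region]) -> List[Dict]:
    """Keep passing records, then keep only those inside the panel regions."""
    passing = [r for r in records if is_passing(r["filter"])]
    return [r for r in passing if in_panel(r["chrom"], r["pos"], regions)]


# Made-up region and records, purely for illustration
panel = [("chr17", 1000, 2000)]
calls = [
    {"chrom": "chr17", "pos": 1500, "filter": "PASS"},
    {"chrom": "chr17", "pos": 1500, "filter": "LowQual"},
    {"chrom": "chr1", "pos": 1500, "filter": "PASS"},
    {"chrom": "chr17", "pos": 1001, "filter": "."},
    {"chrom": "chr17", "pos": 1000, "filter": "PASS"},
]
selected = select_panel_variants(calls, panel)
```

In practice this selection is usually done with VCF-aware tooling rather than hand-written loops; the sketch only makes the order of the two selections explicit.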
+
+# Common Reports
+
+### [Cancer report](https://umccr.github.io/gpgr/)
+
+UMCCR cancer report containing:
+
+#### **Tumor Mutation Burden (TMB):**
+
+* **Data Source:** filtered somatic VCF
+* **Tool:** PURPLE
+
+#### **Mutational Signatures:**
+
+* **Data Source:** filtered SNV/CNV VCF
+* **Tool:** MutationalPatterns R package (via PCGR)
+
+#### **Contamination Score:**
+
+* **Data Source:** –
+* **Note:** No dedicated contamination metric is currently generated
+
+#### **Purity & Ploidy:**
+
+* **Data Source:** COBALT (providing read-depth ratios) and AMBER (providing B-allele frequency measurements)
+* **Tool:** PURPLE, which uses these inputs to compute sample purity (percentage of tumor cells) and overall ploidy (average copy number)
+
+#### **HRD Score:**
+
+* **Data Source:** HRD analysis output file (${meta.tumor\_id}.hrdscore.tsv)
+* **Tool:** DRAGEN
+
+#### **MSI (Microsatellite Instability):**
+
+* **Data Source:** Indels in microsatellite regions from SNV/CNV
+* **Tool:** PURPLE
+
+#### **Structural Variant Metrics:**
+
+* **Data Source:** GRIDSS/GRIPSS SV VCF and PURPLE CNV segmentation
+* **Tools:** GRIDSS/GRIPSS and PURPLE
+
+#### **Copy Number Metrics (Segments, Deleted Genes, etc.):**
+
+* **Data Source:** PURPLE CNV outputs (segmentation files, gene-level CNV TSV)
+* **Tool:** PURPLE
+
+### LINX
+
+The LINX report includes the following:
+* **Tables of Variants**:
+    * Breakends
+    * Links
+    * Driver Catalog
+* **Plots**:
+    * Cluster-Level Plots
+
+### MultiQC
+
+**General Stats**: Overview of QC metrics aggregated from all tools, providing high-level sample quality information.
+
+**DRAGEN**: Mapping metrics (mapped reads, paired reads, duplicated alignments, secondary alignments), WGS coverage (average depth, cumulative coverage, per-contig coverage), fragment length distributions, trimming metrics, and time metrics for pipeline steps.
+
+**PURPLE**: Sample QC status (PASS/FAIL), ploidy, tumor purity, polyclonality percentage, tumor mutational burden (TMB), microsatellite instability (MSI) status, and variant metrics for somatic and germline SNPs/indels.
+
+**BcfTools Stats**: Variant substitution types, SNP and indel counts, quality scores, variant depth, and allele frequency metrics for both somatic and germline variants.
+
+**DRAGEN-FastQC**: Per-base sequence quality, per-sequence quality scores, GC content (per-sequence and per-position), HRD score, sequence length distributions, adapter contamination, and sequence duplication levels.
+
+### PCGR
+
+The **Personal Cancer Genome Reporter (PCGR)** tool generates a comprehensive, interactive HTML report that consolidates filtered and annotated variant data, providing detailed insights into the somatic variants identified.
+
+**Key Metrics:**
+
+* **Variant Classification and Tier Distribution:** PCGR categorizes variants into tiers based on their clinical and biological significance. The report details the proportion of variants across different tiers, indicating their potential clinical relevance.
+* **Mutational Signatures:** The report includes analysis of mutational signatures, offering insights into the mutational processes active in the tumor.
+* **Copy Number Alterations (CNAs):** Genome-wide plots display regions of copy number gains and losses, highlighting significant alterations.
+* **Tumor Mutational Burden (TMB):** The report presents the TMB value, the number of mutations per megabase, which can have implications for immunotherapy eligibility.
+* **Microsatellite Instability (MSI) Status:** Assessment of MSI status is performed, relevant for certain cancer types and treatment decisions.
+* **Clinical Trials Information:** Information on relevant clinical trials is incorporated, offering potential therapeutic options based on the identified variants.
+
+**Note:** The PCGR tool is designed to process a maximum of 500,000 variants. If the input VCF file contains more than this limit, variants exceeding 500,000 will be filtered out.
+
+### CPSR Report
+
+The CPSR (Cancer Predisposition Sequencing Reporter) report includes the following:
+
+**Settings**:
+
+* Sample metadata
+* Report configuration
+* Virtual gene panel
+
+**Summary of Findings**:
+
+* Variant statistics
+
+**Variant Classification**:
+
+ClinVar and Non-ClinVar
+
+* Class 5 \- Pathogenic variants
+* Class 4 \- Likely Pathogenic variants
+* Class 3 \- Variants of Uncertain Significance (VUS)
+* Class 2 \- Likely Benign variants
+* Class 1 \- Benign variants
+* Biomarkers
+
+PCGR tiers according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330):
+
+* **Tier 1 (High):** Highest priority variants with strong clinical relevance.
+* **Tier 2 (Moderate):** Variants with potential clinical significance.
+* **Tier 3 (Low):** Variants with uncertain significance.
+* **Tier 4 (No Interest):** Variants unlikely to be clinically relevant.
+
+# Reference data
+
+### [UMCCR gene panels](https://github.com/umccr/gene_panels)
+
+### Genome annotations
+
+WiGiTS (hmftools)
+
+**Annotation Databases**:
+
+* **gnomAD**: Provides population allele frequencies to help distinguish common variants from rare ones.
+* **ClinVar**: Offers clinically curated variant information, aiding in the interpretation of potential pathogenicity.
+* **COSMIC**: Contains data on somatic mutations found in cancer, facilitating the identification of cancer-related variants.
+* **Gene Panels**: Focus analysis on specific sets of genes relevant to particular conditions or research interests.
+
+**Structural Variant Data**:
+
+* **SnpEff Databases**: Used for predicting the effects of variants on genes and proteins.
+* **Panel of Normals (PON)**: Helps filter out technical artifacts by comparing against a set of normal samples.
+* **RepeatMasker**: Identifies repetitive genomic regions to prevent false-positive variant calls.
+
+**PCGR reference data (databases/datasets):**
+
+***Version: v20220203***
+
+* [GENCODE](https://www.gencodegenes.org/) \- high quality reference gene annotation and experimental validation (release 39/19)
+* [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) \- Database of non-synonymous functional predictions (20210406)
+* [dbMTS](http://database.liulab.science/dbMTS) \- Database of alterations in microRNA target sites (v1.0)
+* [ncER](https://github.com/TelentiLab/ncER_datasets) \- Non-coding essential regulation score (genome-wide percentile rank) (v2)
+* [GERP](http://mendel.stanford.edu/SidowLab/downloads/gerp/) \- Genomic Evolutionary Rate Profiling (GERP) \- rejected substitutions (RS) score (v1)
+* [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021\_11)
+* [UniProtKB](http://www.uniprot.org) \- Comprehensive resource of protein sequence and functional information (2021\_04)
+* [gnomAD](http://gnomad.broadinstitute.org) \- Germline variant frequencies exome-wide (r2.1)
+* [dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) \- Database of short genetic variants (154)
+* [DoCM](http://docm.genome.wustl.edu) \- Database of curated mutations (release 3.2)
+* [CancerHotspots](http://cancerhotspots.org) \- A resource for statistically significant mutations in cancer (2017)
+* [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar) \- Database of genomic variants of clinical significance (20220103)
+* [CancerMine](http://bionlp.bcgsc.ca/cancermine/) \- Literature-mined database of tumor suppressor genes/proto-oncogenes (20211106)
+* [OncoTree](http://oncotree.mskcc.org/) \- Open-source ontology developed at MSK-CC for standardization of cancer type diagnosis (2021-11-02)
+* [DiseaseOntology](http://disease-ontology.org) \- Standardized ontology for human disease (20220131)
+* [EFO](https://github.com/EBISPOT/efo) \- Experimental Factor Ontology (v3.38.0)
+* [GWAS\_Catalog](https://www.ebi.ac.uk/gwas/) \- The NHGRI-EBI Catalog of published genome-wide association studies (20211221)
+* [CGI](http://cancergenomeinterpreter.org/biomarkers) \- Cancer Genome Interpreter Cancer Biomarkers Database (20180117)
+
+# Sash Module Outputs
+
+**Somatic SNVs**
+
+* File: `smlv_somatic/filter/{tid}.pass.vcf.gz`
+* Description: Contains somatic single nucleotide variants (SNVs) with filtering applied.
+
+**Somatic SVs**
+
+* File: `sv_somatic/prioritise/{tid}.sv.prioritised.vcf.gz`
+* Description: Contains somatic structural variants (SVs) with prioritization applied.
+
+**Somatic CNVs**
+
+* File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som.tsv.gz`
+* Description: Contains somatic copy number variant (CNV) data.
+
+**Somatic Gene CNVs**
+
+* File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som_gene.tsv.gz`
+* Description: Contains gene-level somatic copy number variant (CNV) data.
+
+**Germline SNVs**
+
+* File: `dragen_germline_output/{nid}.hard-filtered.vcf.gz`
+* Description: Contains germline single nucleotide variants (SNVs) with hard filtering applied.
+
+**PURPLE Purity, Ploidy, MS Status**
+
+* File: `purple/{tid}.purple.purity.tsv`
+* Description: Contains estimated tumor purity, ploidy, and microsatellite status.
+
+**PCGR JSON with TMB**
+
+* File: `smlv_somatic/report/pcgr/{tid}.pcgr_acmg.grch38.json.gz`
+* Description: Contains PCGR annotations, including tumor mutational burden (TMB).
+
+**DRAGEN HRD Score**
+
+* File: `dragen_somatic_output/{tid}.hrdscore.tsv`
+* Description: Contains the homologous recombination deficiency (HRD) score from the DRAGEN analysis.
+
+# FAQ
+
+\>Do we use PCGR for the rescue of SAGE variants?
+
+In the somatic small variant workflow, SAGE is used for variant calling, the variants are then annotated with PCGR, and finally they are filtered. Variants with high tier ranks are never filtered out.
+
+\>How are hypermutated samples handled in the current version, and is there any impact on derived metrics such as TMB or MSI?
+
+In the current version of Sash, hypermutated samples are identified using a threshold of 500,000 total somatic variants. If the variant count exceeds the threshold, the sample is flagged as hypermutated. When this occurs, variants that are not considered clinically relevant or located in hotspot regions are filtered out until the count meets the threshold. This affects the TMB and MSI calculated by PURPLE. For now, the PURPLE TMB and MSI values are still used in this edge case. A future release will be able to obtain correct TMB and MSI values from PURPLE.
+
+\>How are we handling non-standard chromosomes if present in the input VCFs (ALTs, chrM, etc.)?
+They are filtered out: variants are kept only on chr1 to chr22, chrX, chrY and chrM.
+
+\>Inputs for the cancer reporter \- have they changed (and what can we harmonize); e.g., where is the Circos plot from at this point?
+Circos plots come from PURPLE.
+
+\>We dropped the CACAO coverage reports; can we discuss how to utilize DRAGEN or WiGiTS coverage information instead?
+
+\>What TMB score is displayed in the cancer reporter?
+The TMB displayed is the one calculated by PCGR.
+
+\>What filtered VCF is the source for the mutational signatures?
+The filtered VCF is used for the mutational signatures.
+
+\>Where is the contamination score coming from currently?
+There does not appear to be a contamination score in Sash at the moment.
+
+\>Does the GRIPSS step do anything beyond what happens in Oncoanalyser?
+No. No GRIPSS settings differ other than the reference files.
+
+\>Is data from the somatic small variants workflow used for the SV workflow?
+iirc data from the somatic small variant workflow is not used in the sv workflow \ No newline at end of file diff --git a/docs/images/sash_overview_qc.png b/docs/images/sash_overview_qc.png new file mode 100644 index 0000000000000000000000000000000000000000..d4fa1a0b0e7e6fc61190339705537c4f868fd256 GIT binary patch literal 44637 zcmdSB1y>wT`u+{W0D}*%!6i5(KyY{W;OfO7~8JS-Lr+{-1vKNwgd82EqAVPIroiU03h754SN zuK~&ow}t`#`x+hK`0~RAe1N`@l{Hw}r!p`tdvBDGdx>cTcYbucw<3YImjh%%7~ zymet)n{I2-`OGW-uEUdwHUT%KUoMmS_bpQ`4VFrkw^K(MoVHckd_RSrAI`J&f&$3N zArR0%2T!cPlWZovQ9LjhRzw&IeL0X5z#4O#k^K9Lm%|VyG^}q6DhK~Rx1xqe|KA4csYb@#Iyh3gCHPMDPaQiI58&N|0$S_P0lsNq8tsmT8=77y|HpSttM2C$KkhHEkWPr*Q=p#D$Y|Nq`7}C&Uq%jo=KixL(`|y?s@8Yx=EvuIrlleTDyT(CXxCRHeEJ{b>h3jSL`O>m>r!ulhC@gNL)r* zrEGq=RHmsMp?8CFTK}o*i3s>1apPt$=VO5Cqc_^@Koq`^|HIABSXzU{#FBvudwDyWJhRea%nvT`wyH3#nb(LC7B>Gmp-kZDF=^+dtQydz zn$h{!3AJhjBCIV$H>x?Tw#iduP2V*8!$tJQgfMr3d4K?}knc@=%;jJV@$z369cuH@ z{GT3fxt%wu_?N zdHya%Rwl(e^YlSo(#WvQ{SVp42UwwXnq`Ec=DB@=;rE|J|-GTL6~{^)I#-Ds6=Gh5hdi_2b- zTpFcBB$%DYW!G-ytNqbZo!f>3(oI{5O7Wy~AHvgDGlf7CnpIE)4&^_iCJi&dtwG?E z*YIqqDe6$3Sh(QBNuSU4-;CV8&gZt{&U@g!$DA_hGwU=iHCiXJtA?_NEp_<3>$<~4 zVTJO`wdB{#Ck4rI&9wHl*kmrNhcsjdV zG-sGtAdE}Es|v-+{ac{nH@J%BGxEvGgj+uBSiSF$cvdNT#qSCI3%>eJ?R+p-MjyyT zds_YTj<2t04Mnq{Z?QukR^$x%W!->ALdZ9aj=@vLF$fs!B+uHOmGf1b8%RsWj1F*O z1HkJah#~e^ZFO^ktMu}_zcN?SYp_tzmoAWuiPv%(JJZe+g-_+^$NKWe>}?E8aPr_5 zE|Xp4%KC%l&S=W{?nKm=;G)yxuIEnAQ%b#or^kDRIl`%If#Ca%GzO?OIiJXIPq~cz zVzqukfp`Q|1lyDu=I~^5043GFTsn>ARk$F~Wh7kUM>!nVer{8jH;G1glzstT3xfld zO1}7#e|`1r_^*%8uWm=TN0JhR<syPH+n-yn-HQyce!4}ZmJZ+&?=QE<@}}W3mist82n9KrIaJs7miV)wmw*@sPA#S z7FtFbEiCOqe1K_3l1cw<5D1R2xe{W1(&+4-#fnT+(DyymmM)5H&+y!qPzR1PN37hc zi+20?E1O>F(iHyP<-Yk4ZQlk~;Q5L0LC=r48#WqbDv9n`7l^mV9!0-OV4Z;zg=@4B zFi#h!m8YGeB49<9R*SwAUtT1)GZ2Q;0fS$ToHWzuYhYhFuV()`AT=cB2AF%)`8COt z$BWGAS~-?5shaaPft9lc1oZV;dVBCv|HqRaL|P=BCYvG?oEp=!xy`D_9TJ`oAy~DB zBWbKTGAgL(=HEZ@j%+6sG&{LTIs_kS?K51V`=zodVD?o>B&F4xkCi7|PG!$)_j-0` 
zlBg;ekDK^AQ^-XTjs$ubdbEDzOzL!MEovR%g)aO+Z{ENU*2b1Gl#2U3Ub>)9QVhy7 zx%wLr0yo0n3ZqWo%2o`cmu^{4OOa&RViN3d#7uzBay47I?6Nm0;S=N{??!lwniUFv zmS*VRGYO-QdgjaG%D&{rWi<`sgPz6IFB}9$QYWG@L}Kq4QNc;cF4us$4};1nBOS3y zRh?Gem8llJVbwzt;&1GxQWl}N5SM&QQtR+|clkCCu`DGrkQ7IlR!P6h@9M&O&g#27 z*rFQWRu%MzGjc*c`>vN6fnzL{IT(w{Vs4*VzKZ7|l+2H^?q~z=l2WylfZgO|jh@kZ zytFRr4F1l3Hee0=Eh`rc_f|NPq(J0a{3^)Zh$C>f+wv{?C-F$lQe_I%&#&XXg40a~ zA}JQlu7tmj{2mw_o-G(~*Cp{ARs3XX(iriID4Fqf`lc4qCvoJl7PdQ_juI^QGw(M7 zALJyE3By=+ZC@vA%zufCA`|K?F!SUQ=oo1c-TinVAvnrWeWmiH!4wq+0(bLx{Y%Pi zGF2nSS2bv)&UDDLpHW!L-a$azt1hv~v!rxD^0s@jnQD#}O+si3+QGvfX@ql%s^}iZ z6a~Z?MIzdE6a}I>CxbEYxf_hOKO?#X1ND3{4eL@^Daaby$X70Ek?pKP>h#>J#|(kqSD1$<$h zBwTg`CNQbI%qMJ9zdsSsc9yCXsMg}XJ{42(UPqnrSGDm3!cJri(ex0~=%4Yk>)VDbU5?tfb&Ml(IJG;AnTrfMZwpQOeHvEWx=T`+`xbbb8D?N95PR3M425tHvUv|yx>S&&vRVHP{s z9R;V>u3U?9vOErVQq!(|Lg=K!fkG?e&s6=Kkv* z#pbzoN~x7tG@q@(n0?v>1oaMKk@q0@S=x3g7a}C1!s+)|O4OWBCw&-S)YRJ0Ls1h> zNchd)fWc9gKPF@}*jc`2y<)4VZ|b%0f4jl>mUCvVN*AB!9jwl1XY$|-izlmh5=V6S z>DJKO^p*&9)6W~Qh9(8RwsA(}T#jpIf659@;{!*t_1gV9e6EWdpa)`X6M0g#$m_{) z@?oimvm{B*W2J*+u=wOlYJAz~M>3K`gIIUYK3FRRaC!NxF2cQenL}+o1G{hwxDIeE zuntjITZ}=3=OsD*sw2N~*Vu7cv}(zTl7O&=9KI(T4nP zO}0DW5`pIEN0S^1X>BOif5I3 zPp=rG@;FLA&LaOzN;>JEl}4+@{l3w8Ek2wGk8GSjHRw$oz zb+Zv{&Y%%P%=)yPMCb0fn$F=hTCcHIcCJXmgTtwF(Mb4Kzsvh}p~;rrbZEK7`tg4M zZSw1D%p^V<;LrK;1LZ)Z!1K-0a(h^XRLEPaRNOiW?IMIEu`HFMN54D`Qz=pr)Q;R#8ZLPVevD-abK(I?!Y~(i*D4UdMu=Dx6x!Aj*=-U&x07c| zQ=Vc6@8aOU8ONwG?2-gWyuvyLo%0bUbeA=&TArT5SS(f+%bJe;MR{FYsj6&r7Ao@Y z)u?0!7!`(Sn82vI;(ZM%*O9ci>mf}1evtL-q??>2Y6{FDM+34ht%W03ene$1i6&gw zrvasekm<~XXxDIEJb0Q?wM1qE*w$>OJkj<#)o565MF@7SQ4gpITUlxUjl*ph*pen0@`;f9H_Cp&^@Jj2^)^|!y>`PJ#h;!zoTaB1BS{WbpF=S4pMLHq znTDh1#wR@bFsGlc6d+>UxdD;ltqYj0Y^Lxtr`zT(CnN5JQV#!<*RV&3r6TFatsff5 zUP5AFzXoI0I^1SM$ij|(vz^S~#`UbuM9379NO@T72na zHe%%`{PmIya6HCoXMtyL8n`dm8Pqp12*(mrkj3pA^e2GNc=rMXaj5A9VqQ~}b^<*~ zANf?uT@fwt>);9vM2+c|@Cmceyg_MXb()5{*;5$g%?8UUO?KA+|LsAT ze4_Iij*l^xnCB&UrvI$^5qN|Gv=Z!Q%Otf$TW<4l9N1{(yfgZ+6HeXSpVLP0&`iE9 
zxqKFK;~m`AK=LVw~c}|bGqyGIz34Zz;EK;UK8-=FI(pHm5E0X)Qyz<8F|`1EvQIiVW2$6 zrGL6^X}Avl`**34lpYckgVYy}Lw{*p{SeAkqwT!)1J^UQ3;ifqo8qx#wUe@A^mmpa zo{^nZ{7^pEuD(%%~CD&-1qB67lz;a@Jr0_!dnFjRJ8TO(#Bb4Ck=IMhOPOJ9r1AhQlJp454M^@2gB7$oq)KAB+rg3)t5j`OZX_XvdYc zP%fB85whS@%r$1?1cR`hWP5u`VPIJXV%avxxShH`#TGlcOZj|V<}8m&t(DUS%k{Jk zg3T{fkjlryA*Ac1&?%!*k$JZMX!|5*IsP%S6(kEc_thgVuh8Wrs_KA-ew@J6H=;&u z^FBTNgP$kmM2Kvt(YA7=X~wA0NRZ-Qhwx|zTl2v**}i9~(V#B5H3YVxu4Xyt_I7Do z)g|bq>d-^_&j@RH=y3$}FgR`%irw}4YLRfBD07XObu?&gcvmcg?IHUZp1f=9)8Y38 zrmmo*M7Q=f5Q(t4K|tCJJf*fsRe{5wG^Zs)^hA!rvgE0N;61h=vGbIy;)IRqaPR3S z8_HE&qms~dqFF8r@UF+!h}JH;mJfA~E9?QA$GkL8yEBjR|fvnQ-R2s z(zHn#LvJXGlGzu}(+z@%>!DF`=2dtZw-P%njP2HRj!qEA6l(0SNWQYl{^>K z3e|nbdsg|TYabql+FQCFMXCF{?%q2Xc8JUd8oG_>mi`uU{FUFizqGY-?Bn0!d3Eva z6wCGAT{#n#(DLbxph=;0Bm$lPJkH?ZA8~`LJG)^X5tr#A`JqvY>IZswQlF8oVQ5bzU&|97KzN`lrRCOd>{uo?D8r=Vom#B{@O5~vVnk!m>O@BdXt|R+j7}4$cc<1p0# zEoHcvcVIk%VlTIYw$zf`^yf;4_8$yXvYV;!b15Lr0k z34tBhBO9VLKULHDwTNE%6uVnoO6gV&*d+#n&N7X%_dtwbF{ooP!?LH<0KI(FSC| z=Rs25&;(NZfKW}+{nVLDb6nkiNI{gA*Yi&`GX1l<2VfdVk8!bt?BJ~gslRL23bx~R zi*8i9{M9H^-o4%`_MNT1Y(x_3@_a(|5^E{v)MW9cV}4&_-gFlOtVz+ipfa)0gQgY# z(}`yNA3g5WN*TOE~Lr%;lC)Pz2AaWm8ZX-{7goo>nTE66hP4vXDC(^GM||? 
z&Ou{ybfG^>cG5Xd6Mr*sTJgCok^QnP;55NUqP*2t3NHE`^W>sHs8R5TCF+J#TH%5V z;hhC;2L(yaVjgN?uLe%{jAyK z`Z9Lt$^z&VOYFOl8*UCIlGq~I1q3dGa))BL`QO)Zt(A)E#kwdU}C6sI`r><+j_if zzclhER^hOIcRWTiEE#)`#p~>bo$do+}rg`7>wC>|u~U9xKH4?qu~wrM~98yAS+ZKE|y~KfAqP z!VGA22_q+Lvfg zucq4LlHnaj`ah^KE8>{cnca`&fqz6G2V*o75l88uzhR3CE`n2~YKqWZ22%;-+vv9; z|JF@e<4VA5h{wc>{}b_bPg#OuucV86uyS⁣ETGmJ9I$vii?3e;YkFdD49jFicN+ zV!EYL_R-I8{_9iErV+s8QC3G87&8-1vjTY=nqjwUM~!|dH9XSn=q#zaX*dD4EYY=D zlQORQQzV-r$z&fJJT*AIxbSdx%QTTqJ07$3i@A`Od_(!Nj@#K;9C1MCiVzaP2CDTG z#n|9ycumq`q#Vqv>4VS*)gP0Hwh!PF*$*W^=0O}JJCfZHmA30VfBUX^Zl`-fi%E}; z4I}s6hA30mFzi>j(4cy8c9~DR}G62v4msi zAf0*4NpsYJ)XX@ty-Z;q;QaC{pA^?0x7ZIQ6>pE}`AmOm6h!TCBc=)1_hW=##|=Cw z(V2V}PZDVU(|^XHj9EMJVR_c(<1>FpQ}ac4Eb~OUMr|03U_qro{j0dV%@I?n2D5jU zosLxa#LmP@mjy#LYrLn-xKo^J*zSkBF%g)QgG%YjLs}@FjYYdD-qn9%-Ps$x^8QGp z+!;JejFsSzUd;hTc3hHijpK7u2xmeESu4$eI zx{k}W#!K|Ip}-#J{$@uvip+6#*J{+twR_0}FQoi6hF`V;0Xne&O`QA-+r+_~`sLzk z-xJ$s9KQAZXAjePi>`FJuxKyuGKJ{(W+-3W4KinTNr7>OXjj!z1gqfGdm2tH337D; zbRTd@40WIJko>$J8}=THR4-6meooj~$+TJUrcZ{xL_T9axtK)F0gV0{SD zU*rNnP1IG>DD!wSg ziZHl=V2hRtK0Ss~r_o|}wl6kI~lol7LuJ#mEi zB@&|j!;&D66w?A^uWdERNq(<=w|g!uXXYFxt>h!&mK!zf)*d1AV1>ygm4n7g)qq;A zqP#Gi`M;1-OsG%H4b}Y>>RoX~wa&ZI7tG{_1a6mXcNw0Qj3}gNt@9Uer<|wQ%n|T= zW4+BbuVrhx0$`^QfIktFE^NXPd+AvE;$L{4>vS6wsMKImI_h zSivCSbPH|(FeXvUB=R?6fUk-8HbhJt&!LHVjkTX4deWZjSu>oC*P&4D<8%iW&s zRBAVXzQq+;2L+H(0Q{TGEj6A2!X5HbQYOGn0o^AA3ETf{gbqEA_7gm;NVnKaQ&AWd zX^D`}VM1p8xg+tIy>HLV?}Vf0ME%H-VS@rhh3&Y_s3D-h$d_V%s~dBI+|1dbT^}L z8H=~vjOI#JK0aRJH^U*IN@w%)z|v+m*)IMA0;h7i04@mB79fKLRXR;?%twFgF$hc6 zV>Vkz-| zyd(3wNMNVg&~0`wmTz_0Q_Li!SEY^^$#_f0IT-s&x2S*Db@B^O$RU8=T0WSYUM0~f zp_2)IA$Su@{Eo%<=DD=m|KW6~Kq{{HFq-f!tb(|Saut1G z5Ysujer2Z<0bX99+qA^v-C+#_nr%z$9IQO2X@7(Wh&vdc)u8kSz^l4UcyLitaish0 zU_2JRQdPxl{ua?tWZp$U;5Aa4V>UL$!~MI`T;t7a(ho7^2}VGBBjdZ?4J(nyq6e>9 zqKM{FfQ7A`kiwwxW~BOt^wRbI>d#U05o=!Y1H# z{XL%X7KvImxfEz`2-|BO0JrydYxv&oq=Cn401bfLf}B(Y#A*+)NL_$lB$my{*8}Il z=@FPzT=qZawi8bAmL(@LIdMZ9J%-~cfDyuC*vW6b-VANkYxhiW0)lkhP~vwE;xnd9 
z#sHzd!vnjf!z7iRJ*ZS@GMl!C0KFOpEZ$4eCk%yXfk`%P8`lCaZ!QlPd0D*?HJ#Z+ zJ!&0Rd2xRgD-ehNaO`?4Rx1x+K@5yqyR*rK1#1v{96|3wE%kDbY7{a zyD0;WUZ|qinPMzg*byNqD$d!x?p;Go) zVEwUEfE5dnnI8FCt@^4QHshf0#6|$s!gM@|SVf{mu@M>Y7K2aL6A8WT@H#JMKucj3pUie&Jr#TW;hoyW_Q(OwW<46r?FqK!YI}RR`LKE6Tv2 zV7OHjkq?&e`de7mTtrmyN zsQJ4;^44fEpHoDqB8X-8=5(vD&}p6}OYVhO!+<_8X@&)oLrWpt-A=hr%CF6a<9kTZ z7xdkhz_wwzx#TIrGLUo;i>d6YhSxH1n)n1xrP57@Q;XVFowBhhz5+m+8dQIW)bRzOFy>_Al zM&Ws&rA`L(a6B$216N><<(DwdYovn+)kHeo=M4#-9SaB>2_jsWUw9uvMZAFN=>+Hn;!w5p}~B{ z2&_7iKu#1}K83Lec{~*I^`h~7#s+7CEZwZ#<2XC=JEClzdbh|xB8?pIB|Q%2=;c{B zfZ&K;nhMHO8($)yq1pgXOXETeFyeD*TXtzpC+8lua?_EaH4xFaiO=t91l3HLSve}C zw2G|YBfU0(<*HnIbT1c;%q(!@_xFPEnUek}d?hckz(3sZsuaP#m9+(dcsFbqoTWB0dDJeBJoesX78NXlVJ7DWUKDcEl8YklF;3(L zdEktAe87CB3kiv+2!e9GnN0IilyTIk+sV4frPo(@6g;C-S$!ZEWP9;SLNW!$k?`(Y zWDFB^Fl!Imk81rkHf{}6DPX!wu|lN6pl124JzegksFjcxO34+OA{8XOBrZ%4Tf2Nb zb5I))0LOr9|9yl@KO>X~Ry3u%UREB+dP%WGy9C09O(7#rCy|~F%uVOuyr~qYiUrSY zC`449+lb$$Atq$@r3s$3_XP{FrFK3=B5~;R$@Zz} zZV;&wX>D)On&-w?!y2*a6sZF+usqR0A6^1D9w19B-IS6&NZ5!R5MV@tox}1pPaQjN zcN?YzP1EA0L3DX`nPJURELPjPiv0GyH!W4K@^p7bUzJz@OTDjbU48wxnk2#+>1=-mMiT+{ z`lh_VfF{nY&K4E$O1=VKgDP2ns}C61|JBv?JzbrThm$C%^h0#l z(uaI96?T&8`4(5ASEdH&iMYPuRzBz{J1-P@KlqpV&a`}*Shp3T4du*w8P|SyT!zYx zF_TE|KNI8TjoiN`#*koqmuh#%*0Bi_r+KYG@@_d+^gS8r%~ngeVEMt{KLyE6%^ou2~COgIZd7c!~PQKHb4ru>-Q^}3c(T~55P1k z$Mmv+xvt@0W}+g;imm;5KjSXay$6LNg-^MCf>J}iAprsVPjDd)6;Re+VIaQ!8JM3; z0lA?BVdm&Vo*K*afatrikv(sFAdgc}G%kJv1;R$!8=JZ@F@|3g=jf05HjI`0n3Je? zf${ljB8uveN*wJ9%>2HT>g3>Pxj8Pl{VnKE(+0#%IRwT~c>N)TxKlW68ruExePBGQ zIhdngb_WDxFtMCnyZFH~#nPKvq%wbqKrp24{lvnTyMI<78xr$}KXMa(1_da+jA$kI zC=Bc-PcuidVnyk5Q*dlfm+RJK$D-q!osci3L_Td5*JBlaUns_Ashh=ZY8NKgw3zwZUP+MWDXSo9}CswIc-%!?_u@5?!A{LLOkWZRtYEi!n` zmF#$I>b8lPZ6|X0-}^-_n1CqLYIIeK^voY!FSc`bS+df|4TA15j+KZ6DdfVK?l1#? 
z?LIu+>?g%m@!^!J-*jho{6c@OzLT0q-eob8-(ub9Qp|B&FqMw%N2>YNd4Ip|_`yup zGUklfBS`1!!GFHccmYego!kg6@SVtKd;ZId0<8VDvE(w9zV@dr{Mrw4x8H z0FF#PmIP1toElG^Z1jGjos22tzy><7K2XD^U10% z*5@pK>-tUj=j*&Hb$V4A#q^tQL&Wpl;q$#S_{)mLHf&eRk<22@Ua3;uCdKIA?5+~_ zM9pC`{VBo_(t4{U+#Uj*2L(Os4ESMmkt*lcWah?kH2r}nE#^aa^HKBip$Yt z*603EC@nY-rI3Wy#&pxA)Vt@jAI>r?Q%XYLqOT~A%z8kQv+LsB4t72_C-jCJDwll7 z|NZ?-z;6F)uJi_#Sp@#jr}o4FfAEuO-}kS}B`srnR4z3XexoTjwSHsn8krho^B=d< z4TmHn3HH~5eTbVK8MMFk3mrcu{e5hl&s@BzU&S}Px-VB+Tw~gXlSDgM`x~_;c27w) zl1g7uoU1Z{Fic^@y~~{O&iUr1IOmpaJjB^rl)bBDP3`jvo2tI^+LLyS99tC|ierB{3|8y5qN@BSh*IauAX>6Bl!xqcmYR)xCj2KKjHygeYOsr+4Vk__tRj)H` z(ruUD7ko#4e;xYA zdwP?BZR=zRCTXW9&;B%fqMyoZrLQ5Uel0~$iYp%p(EZ5K_Io3a{kymG{i3E8>y=oZ~NY~ff>cY+uCwUwEPjZC7h*L=C-eshF5!*6GAb#Z@HpF z6;YcQqQ#ki`z!xka*mmk{6wN%!hsO&N~t`O!i*@I%;#}TFngmzkT7T%RqU|j=X7)= z@%f6BXl5trD#oUAt@HYOQVD*|DpMxkeWBNwu&MCSf^xQy&o0Zvp~ISo=hMM7PTW0F zm}s0B1og>p4A{HNrLp``W?A_ZM=H?jacm$6t6!;A2khm62r~MjOs!=1YI7j!w|l}% zzH`zrAh#6CB<SHRpE!7MzL*QW$M|eE_0x4f)BLi0 zg4h49|25ADLWTEtTDv6NW+VAnOeAkX0&z|RoLw1Eot5O|u zI0tIfT4w7FiyyaBv5OU$G`GjJ+hfY+bN? 
zb8|UFUvU3yHJ2BH;ds|^@9Dg`Gkhkp*lEa7xqPzUQ1T1);8bRdD@*8asqJ@ITP3#) zNtfZoP}y<)H))Q`Nq$$GJo=q~EJT?r$1C3VD8;n6>X*yU32c4Li8vii`_sBFB}i~% z)8BvDhVHB0-kqyxkDkNdyfw|9%;fijvqGmFGPj1_Dr9Okg~usmvt}AWE9m>}B$Hw` z0SgK~lIrgGIS6ccy6*qBeRJM=NmBda$43 zpC3(tT>v8ovyx4}B{7;xDps%6mA=`xeQlH2{$pEZQDH|O<@Y+txc-&6?!eE@?)oV1 zn91DUjhy%_0{*u)ry1u{23>1joZMLc`6GxKV=a{dLmS>ML8lL&B8jGpiBIiV| zgK7`HeR#A9KDex#;PPa+E(uO_Qexc55r#>YTm+8l!HdY+tOn%&N9?xr=^K$9yY_R+H5@b~D zg18QPMMoumQqfsWzk$n6NXnAvI`7wwOL|PJf3~mYsuFJVGy9;)B~i*G{9HyuyC}nh zup7S$1VpOir8=q)+rz-tUfl%Mj2KXIq`bY4*94dwhf^5e15OV26jRN`D9`H75qbb?MvrRjuN=*M&P6=euYIIeVrQ)>cBjdsH> zU5;%A4Z}Y{ZzPYUadA3PL?=~TG2wFS^v8jLqET@a)FP7GtqO2cFzYL4<>5)4Zj2O zv3+8rd?vfRAImilfB3at}N?H^#!%$ zbot_L{VJ)TLnKq7&whvaQE(oaf_9v;8$H@Oqy^+WkW8$>U0!M##Fq<_z^XCKsGB@JSl>}5fCcNLjZYpt<|kUCW$taxn>RR3%~Xgs6Rv2`8(WwP$~PdUg|MF)O@)WtR@8Jd=zTjm-iYJp0eNnU4{^{bl zm8-ay&r7(|>=VXUbiwD}6Da$y9AOV67+&$ZT7{#(S})(=d1E%buJEQ^`#l6TJOM{u zF#IF2GCC@QR$NCp$7ggTn-_RUQ zLS>zSwCo7Lz`B|fG!*_2ROmn0|R!sjw;9d8}*!t zw{ac%6k%vG$k)|gz=zr@ML)2c6xI?qI)wdo`FH*5$J-Q*)ljbXR$_y>H4`7veAMdI}#-PnsU7$n@$uHvrOm3fG5@cn+~Tzd1oE+%kX@! 
zD~asl?<#Q;I?K2yV5+*^9>I3=2Gn_$1)DCHz1R;dc8LK6jH4OpGVQ=Sdx{VSJT!om ze-M`(F5Mu_EgB+?^|GVxgLF$;3FQGEDpGDIBLKWuOk@sFXldXDA@`GpYz@Wzs(8;z z;)`hjq?aBoH}D7m9ws$zjGsb)k%L4EK0@qd+;g)Zsq*d=PBInH&mV5K;)eN8en$({ z(a5E8%jcUx!TixM!FWAuwZ?rX+TrjVlbueYeU7cJ`vmi-Zs`4QqS=Ei8lM5V!sg3V zI{Vhy&X{M}DVtlKayw8>Sl2RDD8v(df;~#NY($kVb}BMy7Kw~p)j_`^fL)JV1m_5nkr^{btnqWUuVx+dAz^g;*c4V1I8Xacus3TmgoR5)hW85pFU)8OkF=->TYslUuBvM%s}R@SD8 zkAO&0jkDR(o<2*Kao(^unwBVr|FXZvg=HT_m%KWpWS11mo5VH0khiG|0Z><2%}PN% zA%|7`0LPV7JwLzpuea)oWzomq4HV0`r_~zpR&ti~=?hR%G$Xk6i9kFdgb^I;LPoTd zlF_A0m^3r9i9#h~bX4%2H{~4#G)HYNjz7y#)S^1>K@|nd3Nz2MrC~DBr22qsv=#5cqaEoW@>sxlNW+T<)Pq7>S=ONCn}Kro1zWY=R~xX^5jS&iXdXStJ8eo zD{j5o7Td;st->u6S@>`VRS=E*w;ec#e7!J)%DYv|e@6Hha&r z60|S)zgnH6;--_Dg*zG0Y|O?|Rw)GpBeErUSFx;#k@Z94OtDNDbDyX7U2f)?&y)j1 zG)lDLY{kaJQe^9bguAKM+e6@TY(MaX1QY}#ITYyZl%s2G?DzZPbkAxm849k$Byg(x zK?UwV?^?;OC|gDNk|aTOVVhODGzFx3oGAsGl+y+$TmpCfBsV+K&Q{O}-|vl4WLHSb z;(>i(VfP5p4RZx+t%bk#G;_eU#Z)(K?>OcP%KvgxD4KRrx@QKmFZ{P#gKYQ8) zvBxnsF% zYnTCfWmmvJ$sb_e?*wjD{b0Y^I&ZtZT>9E$*A;22^B(HZO#5sqXlIrn6%_FNqC`Gl zWg*hPfXU1ng#*vC?Gb?ee%f|{p^fYP!Th7BCm2!Jj`NqeK45(YCYp|hSIyGFK2(5r z4F$udZv1&y*{W@s&Dzy)JQL3hNg>iB0fTVH(Fv#nu@evz{plN|S-Sg2f~J z+vpDfIls~5o|wxS)~PjuHPqcL*zm1z0h~QzVIeB3Z}0t20M{k5)fZm317LC34wlcv zHSIB_7>}pu16UQ12e9_d*|dw4;Uu@~`N;=pOaQb!4d5FBwF@3p1nl6Pk!}LcoBcnz z?3Usw@;cRaCazK$G|GWcw+rxi156O==z1`R6o`mQRy&pPb`J;vH`UgoR{(z>9-}4` zzy{e(2LVrPj97uk`nERQ(BtkW$x{o3FD*t#jV%tWz7sVCa=Vk8S0+WmBHcUynsY*L z?w*4010u)Re2J(*!1)fW_?7GcFK_d^=VGqWZ@U7Pms>%)`U?;|GFT8#QwNY8dI{fJ zoOrpW)&cva`r0~##VXzF&yJlGhtaVaUQ8Rl|OXFV-05;IWgw?4E%rF5|6b zuNjkmt2$=@MUQAUubT~?+p!Dr^mLnt<3Z@2bRsoUG^T#-$7lb!kFGD2>~OvUNOIu6 zi%K3Y`W*|H!zs|6tuA`)$zPp{hf}0%P~|(YW*~&c(WqiKZ!y60#s$11K;J%*O&XPZ zv6a$ApdllhkO7ZXq&qucBIdHY0l?&ZO>85kQI2}6t2Hqa=pIv7I916)>susXG@H@# zH|+2p*p|Nev(o%PnTZ}yG`H^`4J?^nn3KUs3(QQOzKKKk#PbIJS&8&kaSzB(FNgD+y@ zGIY17OSPNM6=5SeSkxgh>p(f~?hUKkt}8Cx^3VWM#_u$u<<{^MpHfU1<4 zN+XAN!Swl^qhiJz8vQv7)#pf=5JPr88 
z0G``RL$y5=2J`JlBa-I){c)%~%Kszmt-`8oyKqrr5|fh73DVsHf^>H`(v36-f*{@9 zf;7^gbcl3^h@^B$i6C84>p{PNuf31C7QI|dGxvg9I?v-mG!D6!08w3!r0 zQLoQ_y#*7`ZFAVH8<$xEE0m491Yn&K6it9ZWU-qUPu%|i+##Pwyh=lA2jrc(I-6Iv zr)kUvgZavXfC)48@+~NM8Gdr(vN-$$iZ0tg&;~LOY8hc^)({ue>m;ytx`$F(^>MA} zc5~1Om`e${TBc;b%(;r!Mb&Q;@lI}$e~vDxFjqB~r$P0Chj-?LwQlUic<wp0$=j8AEPaxG@mE#3*?We^trW zFk{zj&?;M(oSZ8<@n?|F;Zc}k{78ds4$bSbeix`5fRjV7OiF0vwni4a0)+2JBYB%d zOgKqa>E zL9Jm1Z43SuRYx`4+Z{T|@rXi5E%9*p4-@>t`-`>PaPg*t_P3njT+z2?5H{lWmpk9! zm)k(jNu0=emS1d%)ivSS834b_M6bqJc=zWQ2aky1j<~4^pzsutU=@p7!Vf~GMFwvRfmx-1i5m2i&0hme7SX8Mw@)#H#_&Rl;>|3{*99aR9LsYyf` z`2tg;&X<>r-D^-Ui<*1g0tbWGB8p6A+~iJAbz)$85MoXAT$I^LwLf#Q#lwC!8)TAQ z?SN&4@)emrm{kqSKFEuZ<&7R@JVIHPU5t zEqifmHn%7J==ajA8pc{5e<30NbBDEhZ_zU{SJMFz)zh;5kNM{&vdS-JEoO)JuTAbN z~mvMJczzWqWCKP4y z*>mi3xyl_EWEmvR6JBm)cG_6ef$Oox0BNE=6gTN&V)Ij}9;c+x+*DOn-}K&ObR@Q< zW$b7TYn0u$hN!rJ!Xj?x@<>j0HaGv5wl7pEY*!8!Q2BC>&EDXX?EJ_lgvWsLm@&IX zbfB8L>3$(H=f>TeZ@xYJji_a*x?Y`yEQjKG#fO_ls~+@0K1b;F&EC(tP4_hLJ3RvS zRQE{97e&tD6ofxlK6_lO38Q4bS)O0l%#G@;cRwUrp~8FpstuEGqV*Mn?NN;Aqy6*U zn&v z@j?mtKCmS!eN6NoCSQG&xX(#%oBPUE)aEDi!qA4DEfuy%3M4mw+IaU2xhIgh5EO@6 zJLPx*udnx6>=(Ezcr;U)aY6*1#|LG(rFamNsCQB4 zZtz;KQ3dW7V81?@uQ1ub(+YVBd4kl4FULJzO-&#Q*)AAU9)TGt7O!Zge)DdoV1YE)$|C>EdY+e~$`!M{jW)soQ-gU}@1C zLvk_ZKEB{;eZAY_diIAyG$x*a!E-3e%C>Kwph8%;L4N29M(VE4?d>*GB8{|zM%n2= z1M^M2X|N<841aK=leI(vL=;SJSbLn8bV{@s@T%%2ZocoHJ6@De{4&8BrQV!w!a_2a zihd{h967x9Qg(gh`-gGwyKy{1@wMByz$BQMbQs_wwNQ zwvL88{~phZJPYfzsYcOOZ9tecH>VS*F0QsT8sALFUP7!4mR9FebtJN?B+n1!jSSK; zBAXP%vBtv0+5c&8L;YIc--~7QDYMA{^&s(EGgzlFic`!GdYLMS;pF3qq-SYDxW1?@ zb~BjKR#bN%(dp#sWo2X`h8X=QWI4_U8K^pGANA*kIy6H`grsC70!AbI_Zs z+tj{5es*Cua67XItI$?8zK)W`JJ)*bdkFu|FqWbRgW>7MA7BI1Whk!fsd6>1*`R>v zb81xOtP{WQ>`&+~PvVp|9nTmm=$)feKej%(TQaloXs?aD@SD?HI-?zp&j(-ITua!M z*wVhUwcV?}C*>Q4I{(NZd)$Rh{0U-@3|}szhiSiv42{`z8DX}HU05yKFVOPX{Z87{ zrTL+YtN22VcwDhSD$)V}t9fT(Pko=CEx--G7_vgYv`W)TW&Ltw?YvZEC&m`$brBBs7@8Z8f18mZ*~TJhV*hTUR9rE#eFwy)%zxO3ck? 
zsYha^{9kdFm)z%586#8E@102SAi``d0gIjZo89%dr5 z@y#KQs%Jj_3(}cQVU0<&FI<31%htYVU2g zh=p%w3Y{o8wz&dc(WM{1ZSvjaqH)|_gemkXDnrSp8Kt?A(G;|zE2N5Pr+S8$7?7`E zZGl8x_qgsKgCI}v8(+I7)d$^=z!f?<)moyDsi%b*Bs)ByFTCyarlY!~#p-X6ET$EaiYcmer zGZYOm8IlWDllDcgV)>`Jp=s{jJh^=_oA8zi0#peWqs5w28T5o?Z+3wC?;z_XEQpR) zRZAgOOxD~&V0de2|3jl5<-%*!evM4J1PDlFim7%lp-uG0f{|t!mK||70f{|FcIK#z z9QCjwWbD~RK;pD8=}wTm-?1yHkMi@N#9~kT9!NEC7|hA8)H-)c!xk`R-CQnr?Jqzm zc({Fk`0IRonyGCrv($l~u%aR@;D9ng)T|XDa-C*FtaVLL%R3O)jK3AGR(0g4x;Fv`f3zJF9yyxOtZ_9xpYmqlg3c~N+--!#Ui{ z-o&I-?Z|fnF(kp1Z<#XRe3+=Ko~hkQTDI&Du`>U1W>3+M!p5h_>vL4MaY zF&GpQPOxtX>9K?xC_w$x)QyH5`b4=JE6XlOeqs~_M$siYPOb+^&2Z!}3T_53Rzd+n zAuOX{0-N$cY42@_OQGQ5)A?)`hEFk@#82B>>V)v&e4jmF&IwXgRJ*H6guN^b)LX`a zW=T7){JPIn5~beTa-edGn0?=|l3^;KdA2mUhz$7_k0GG-NK5%SHpMslsDm{oA?(2( ztnLVY#8rP~5oLvdR4zdNnfVk$zwNVMshD(yg@7txu!M!CxsKThs#dRL@{CfdwcNQP zd$c+0Eb6;HVo5_sC@S!!f!&2in2*{7dQMVUl^-V(P3QgDcau5ox-3e^Fq}0eYCONx zQhoIR%OJR8%NYsC9V@egqx=`oG`HDB`?*Li;ZPSQc#I>~+)6H64O30M!7}nE&)1n? zGQIucw0&e{jl@hx*gb)8h{Uc@od`pOS(sc>)cdBdP_sh>ZZO6ITf%u3-KZ|Pa_lsu zCOj#+o*bduhgM7R23Bi$1#+$uKN3!q?olD_8#;=P>C#hiNnLI^eAa`eoJ_zV-f-x9 zhz)Rw;OwBEJV*0(g^EvQW8sl8S~-fL5V>MC?&1rnSuaYJ)JMsX?T~<2t{5%ED-ut| z7)yp1{?|L~Ujlb+Pxb~yztHSQGU*q78`rcE9n9*=>3ybr5Z85Fjs4O8?&lo!>{+7W zM@ml9>t(jLRgRzVg~*z3qqz6#7e)#@m;2C8nRGhw=sxQebHdmWcJ}w=_Ksr1ddYBv zk*)T==EE%`97HLs1{MbL25Wl0eYGD1i!*8mSf<yz2vEi5^;XQYImU0wFSo{+!qjf6Q!S!gIcbcFh6Pvq!XiE$*QLlSSU}@J2-N29T%nL{BIZ zILl;DQNzfiaO(+T*gMBRX$f^y(`=Y;e`x*IUCT*Nf;-GNDTv8M5fLn*{S$$ry}5TZCvlRC#EE3 zTK{B8KMR8Gcl!t6j_@H!Rovf8#prB>E%UUsWQy|k!<8MSL>(n!p2iGm5$7ZH&Q^=?4Vg0p z;G=)GP9ic38&WF7n_Y3qmPT<9%b%Q$jKq7dsB4B>J+7CWvOD9+vX}q(3hn&}%rkm= zE@CQWN`d%B;q~e8Li5OPJ5i}9#syIyv-2&cRK}iuuG9?8{*QUCrgP(M1suhCDW5KbB&2SECxV%8Y%$pn^Twc*FMhZbq{{UoT~bt~Z_L)m zb1IbI37GYAzBcilUS6j;8GK_bR75Sg2=f<>a7|grlYh6s30<&dyWw4lk%oF# zEiFGG9JIO5_kYxoHA#IqwY4sFUGd|r(>ETE-N~?DYd-){YKHL$>9lfu>)dNWZD-dP zh9~$uKNF8L@NlhW%bSdI6e@@*{wpexmtmwDen1W;j4ES_%88Gl_q<}v#O0;p5sGaQ 
z*F85E;`enthdxGZVb;0vbeieT@gkA5D)mqkaRNo7XZ`UAUlsYW=P#`V6Z!LvPL$_i z1?39Cq>9_D)(!$GEPJ%pR}gCZeN4@rxiAUI4Ns-T>yM$cQ4XJKPoB9n%Qp>5!oSF8 ze?;J9>>!0@-9Zr610yzxSMOg!rB761EMr?>o{C_NQ?Re}YiJUK5dx4S*;dcun-EWc zmw<-JxtJ2dNucwyON-k2=gDLAROE5p&<&A7%_BaoCPH5;b$aRU604(fM)# zpUbU>Pqr$z*=yev!o^WckfpJ9K!bP<#N_$3ZQm%Pj zALv?0C^12O(co3;ES^T0@z3MEPvkEr%o@p$*C}4VeE&nMHSxIKZqhs@_8P9T>rx8q z3z`rbQ5wmtp~aP%l2Uqs>F?AN{Wx?Nt9FaGq*4-PNeD8?fe`xWn>}GPx zR)@DuQkVYovx*buH}5LZ2yh4pDuUi8=50FoiJa;wjQ9HW(loA}0SncrkqxJWcqlT3 zoUr!Ucw_#^?dVa$cQ=^uF-(?~Ez5S)YcAxHOkEt<37X7)G0O0HZDscdV1E@M04h63}9Q`6|40Ux`#Les;lJyq%K(`@PP9?MXj7|=g{p@(REt-aMrgP zn7n<7xg=5Q0dZ*zbVc6QIee_ZjG=gCjqn=}dcx%h3pG188-Hy9Fg{=X$wE^RR#V&` z-}*W9>Pz@`%_BQdHGkU-OwZW6rZ+$d<+`mi@Jj4SeKYGMM&kc_6hG4L>jt&QE%c z;;<$#3z5TEj_L5V3S55>^1DlD-zL#l`K#$1`JvLAJiN#^ssY(=3BzbOkw4W-fPoZ> z=$z?!i#+zM+3J8xbt+I`t-#~6u)5n2!kiD^uqL(cE;jmwEuv)QQV!(2sV`?j=xx|o z-r-UMKo}bBCXsaz=a37~X-RxDmEXiK>zx6L01!k5*{h#T>$mP4+d1v=eXBFojtVB^ zPt6hb!J-hFO&{^^0el7L73k)}_|rjF`G&PVe)b;ohiUngIT4kc!IKHu-zf+ZW34#n z-8#C$bE4}j7OT|QP10YBntK1}$23v^Y!+=|!5C27-SHw{t*G`bq2LPpqrp>7NU=te z9bZ?P{&TYASFa~>-ptpOWmS5N@C&ZDqz|)W3|p*m?bRYOcAeb)^_gtN3jfs7>NRG4 z2V0A}6EsCBwH2rlotTF8P+24X3^JDES+xE#<1Yl!PqT3xg8~4w@k$jkz>`y=rc(zf z z8;+Jy5R4dR+2|r31Du30^nunuiH(3jjBKFUf5JNsgf&YY{vFOHH+|Q0eY)S-G;SW8 zKWSu-89TG&^&Bd16T^Qmn|RRa?RRsST|Yq^Jyg#>C)&nRF@Bzh4*vx~97qDLJW-3W+V64yHpDIi7D+O*ty~+UNq0`0o zDQ?u->R^}~eb?JXbbOZ%r|zd*HR9S=8<9`?hN+2EF|%RI&9k^S^m?JKRz zPx7ZuhLY`^q>!{R?vkRT8(b_+#)uFaC66W^9k3Q8@Ze8@rO)Lio(a%yKG>X10l^VY zRmu{8k>+aY<_WXReyot9VDP)4DjV#7B>M{89}7h%s#cRvW6kiS>B>;WGB;(Y$dX@u z<_>_6(%3zAep5W^+0ra-J3Tz-Z`s{0kiGP@um^x!l_~&Zb-uxV5DvAvK{=ToGj}3D zLJRi>=xO6u08(-|;!UV}@TO|%P3`S$>^td6nW&xeEV6frW-jCG5l>qZXB!>KxcGOs z-c|t04)ZEa$8&Zga<9vcExZzm&>VHL#t?;I#t5vRnUayj0R|Jn@rUMCM|Zj+NXo9L_Gi@@+! 
zC@_djG#TfK0d4!Q#(Mvv*L0K5pZw@#wE)mjGxFLsz@yfgdJPc00eq_& ztr~lJqI?d<-av`aG||hvVU~Jv3hK-vB%9$CD`mG}dQiOW>kFG01ob2-RN+>~mPWgNXU`FjFZ+^8wqbZ>JF81~BQX z;}H&RUtkICkTnmm5G?kmU+igR#;ai2YxWxes3J^8(ktXutsHJ+jh)U}L8x5Y8U-W> z0eajAHBgY)ASxvsr8*!FPcsMuOYtGBcD)*NR$A#O{5728N3Eu2;zLKeSb$cA&rwi~ z?|*k?190Z`f%y0hIGgvZ%kzQ^N-=JH!;|1+iTK}TZN307l<%n!_&E(=}oj00INPbL`7X8Y)`f01oUmxRzv^%-ZCPGlq`KylA^ zTxhK90&Fpxvi^%5K%0uIQ#TCw;gHgMi9Yt0W2IcXLge;TlNg_vbQVX8jP)=j833of zkE%&c-kcFg*K|Io9B*JOEmzBz0L0OEPo1%T1#hd(E0R&E5-9D1T5_Kivl5~&vLQmw zNTd`@;<+4#EW)EMS#988bpq_HB}L<8<&krxXsy;;5Rag(_{*|~;t1RDHrD$I(e@vp z4U4d~E<6_gZ{z*#IuJXtolRh(8%mmLT&72iRn^G~T15a(2$c)aeW#E+3;;%2#M~2*B}oqipecdQ@-_ z>7L~@Vuxu-7bZoNmXrrRqk|DUV0IV+cBdG7H-ql!|^5ZgMNR{Rvt0PMd-7#mEDgo8?$vULut{AT#hP$=rf zL6)sXN}c)?rppsl2>=JDvOVXMLa=2SOl8ufpJDS&qGl6I{W0EpH~}MgqOpKE&`oHV12s>P6u81J;&$hNc|fz3?|~}B z!w-*Ly%nkb`K@+902{o&E5|Rxju4;McZu!-Z}@b7M_plkm_Ml7n4@O}$+hseXi0kq zhUAi<01nE8I`IJ-cfWHa1tF=P%)HU^4Pq;l3guRBY1Sr%4b!^_;xrAqw(83%;-Okg z!$HSwUR4LyVPfl8|JwEvu72~`PsjzQKNp%fet)@AiSn-5+rrsMAIR+OGz<9f;xlfz z9&4=Oz5U4f8=S{BoJ_JyaiI1tcFp>eRfIP(hT9>l!HHM?=QN z-ubU8(j4tCswM#WEWUGC)27L=U58uySftPJPv`Q3&aTuzi-)9H9<+j-tb4_)+PVJ3JQ#3T*QJ26vgwOTRHn5_N|oy0aSC)WnvK_2&vbdR&&{RH;X8Hf4%P(`4A~(u^d-Bn3_h z1QK{(OUxYJj>lN}x4S{H0d78W=TeCJ^?Gf4WlK3cf5X&nuXljIY@Zqh#eX;#t1VXM z-)%;4w7EqOX%abyuDRfd4Y7k=G%#K)(y&Z-OW}6er}1>sFE)sQbnvMm>uc=`^gMRs z)(AxXDPbVK-uT@9?MonyQb0`xn;PZpFuad(j9HCl8Ca7s#9aJ_?$_szS+y=Q&s4Jb zX*-DwBXQ|C*SnC|;)j9BHNUHQDu#_Kt)x|dq`4P2)=yfnrM5!Qj_$O%J)+p^hLMHc_W5SB^i`3^RJRP1CZnUHq^kUJ z$Qqn6;3xh~H6elbl$|zLrbvelVSf-+B*;nE8-Zazi%zAf2LrA+l1T^&rzxBOH20YY z+KN?(>q^`!^`x~vDS~o*t`Jw1u~(Ldk4Ey%Z{k)5Qy2Q8gBytZahtU%41lRes0~rv z;~(y6ir%k6O}aEi+PhbEUAECILeLE?8N6Q6VR92(f5MFGYn&(ClPmBm;C~afC0ish zEBO}mT0zi>A^;Ifhb$$~VEV0ueRJsaeFRVZyC!+1-yIc$I0*1ODp-MpTM$S(Ohm=W zj(=5k)PAY^2!KAAH!?1|P-`LvJ>my6Pck`?Rq2Q?$qbKTLjUX-fnzpx+s(I!CBDE* zC|F%h0}b|xR8h>gV7E__#mf97a36IKDG6h};gv4_)7@{}9jJ=+CA#=L%%8J6kZ}8Q z4OpwPBaP%i%EPiFZ~Wgk?34|7^z+DDNav-9&Va5uHmA#ototf5wh_q9UIwD8AUNmy 
z>Lc?ZsJU%@i3~;2Ar}g|@4P!*(K^Wb@{Zq4J6;Q_7n>KPUX2@14joPbQH(lo;{0b8 zPpZrDfs|CaGbVn7c@D=i4Uqx%=o0nmlJ;Yk9GN#o3?W#8fD5hq-qVy)zLI+N(;5!I z4K45~&Ujv|KPBWsrf6%flsG&*@(%|M`iM-BMPJJ z<7nw3?%z{L(YRP~eP1~=2dPA1jl!`}Aw8ZQgPQ+d667Jo@OUd#p6n-;zh{~;hwIk{ zr%_wjyB0rJWLpv|qJOC7gyn~}PW=;RgPG4C7H~V07nk}gb&6^*s8u3Z>tBBi@LyhJ zfg1WxuCqVC{?1h}1OjRIHW{$8?|b;Pk@pK91K|7+8PiWA88_9ND!|I3pM zcdFu->|KMH&$C8|%|D8yvZTb2kG_BW?olFgG*qhF42LlCe@{P&>fn7@`4V_az8*dJ zc!(6$$$VwH9U0PtA}B5QkOn@yq7?+WQtxE!csgGxM05H_tB_%bpNiF)1AU^47oiE0 z(ch8!`>J9OZIzJ!d)|jA{{OG7(jh(qNad^bPJxFLsNGl}sprX{_285Wv+V!x?r$K- zSTNhf2*5RCH>i)0HEMZTlTLX^y_ur0dRA61_wVkVy4YUld-|X+1z`IN`(7S`^!ve- z1hnI^8c7>X=c1tncIT$&dvj1++6Y26QdN+7@3Fk}3BUr~qu|Q%nf{&%U|QZpi(MYB z0W=VpH-2IF7QbIGG)+_hY5HEL1`rNKwF*m;IIWI zoUoq#1cku=KE7Y1*i!80%tBS|CmzRgzbyoTFf9-x7@Z0w52Q`Vgi@gw`wOyPPJd2b z^#Rz77Iyu6aU_>BhvQV%lp7ZPo%IOGB|DGGt)O({wZ{-e3?aywhzVQ5J!aI_oMa0|$Kj=amlUsctXY!KfF*7+USkDyFra#&{?yB0J zz}J@<{AeD;&X8a_Nyu+Sl}_Pe)>H_imY;ktRrTF;hf2CiYe(9Sl8!_pj9Pt5E^I3S zpI}JH>fG$=4~+bR3mHx^$j;Bsy+*5oJ{{S$eO7tG0G1YHCCP|8OPjKrN_}lU{{X z&Z({j5OK7+taSqsF^}B@Ev_O}**V+nm*1(w^VD3m;Xh8xllC)ksVNa2{w0J!5XSk^ zSoOP~Gwk)~wR*XNG7)#QM!=oCy5QnFS?jZzdKPrqtz+F^|Ni1g$`s6a5Vn_V6rp7U zSG8l1Xpf5mGPfp>pBnr2z1Ia9J?M{e`}^|&`}4*3lvY9eCpPW!l%G-{2vX<&`1yAWsQ>w2u7$VGN6A)! 
z?EOK}t7{Be#c-Q-hX4{QXnF!Mf}|FJ^#!CV(pR>V#hCNRsV+a)qd8KjGHCxlYfulDwPNHjh7R7Jz%wg z^4c88S@}0p2^S-iywB&p_WMtODs6v*EwHNsvQ)F>NNIGt`FbW<$l$iENV>36H=F6M z_#1H=E?#%q*uuntyF?8Uzq`b49MKB};eX>i=zsu<2Be^L$cJ2y0t?<6%k>8weLc!= zoQp;lNE^#O2W@L%WSoAc6O`5qlR5EKcvyA&Q;Md5=1}=bt6ZxLsAw43`Tr`43>qEm zGCA+$_w5X6zw%#Bwm0wCu4cFXO7!{t)2qoi8Z#@MR$2@qMY#uAKWfl>3NenhitNq) zpj+is^eSA~87>tAa+9B$9LQ>9%G`F2)BT;n@Hldz0oq8-TPi-0q}n}}EtAIP z#t4|CJe^}@pAAG`D3ojWSr_BS&9-vvrlLj!phk!#;WqCSMB}<`%=aWFG-(Ip{qfPD)yyG1$?7!u$ z5Qg3D?7aJTaW~u@-S@-Y1i2Znbbds`mE&f|cZn#jMRa8rut$Qs2cTMV2fn6@o*qdm zGD$tg*&P@>c>1!+kX<5mz`DUeBXab*!1&)U(sBo?+drj0KGYz~#E+XQWmNt97!OZU z38^o{)yXPX`tK(t)V1caTcA!SpreE==|d$Ysr?H^mNpw}AU+bqCn z{jW#o@N0psBv|_oozLLi&la3JKqYq%)J!T}5NHcsXgigsu>XBA^Po=qsy3Y}W8rff zurm~bbv#+D?EdOkXCUI3@C8f&Xxi4BA9l6ti}yhw2jhC)=m5Z#_OqGiLyv`DpIXEV zr-}OC0oS^=<6)NHr0dT<2*X_8Tu*ub8!=+)1Yw_ZP`isE1W9y3t$gF}7A>j&mWg;v=FXg>yYm1g^^F9Fg9Wa)*> zUnsbET4Ow<3FBgpH0}OdLoPTwGanW}g-HbX1Mpzf)|fpQ>h*%nFm@{T>9QBF)3fLi zV-f(2ZxC%LrZQWx=Lg*YPb}8TJjVq0^En%nS4$pS8VE2jWBy;?CloH429I9J)Uoen z#N>O~Y&A#NQJd?{rRzlHr`g}N$J{nwVJhQ_oDQbDjC$3ppwK#`4}1YZDX!A8l>yB0 zgI8*|M1R+31RFwS{jgjykQtn3*yM%QbK2SGE%x8p(+mn^s8Kv%yfFwrYf_3la0SjdBNUMGQ$(SU|m-xn6cPP=D+(HHZsCR z{ipM0dwWab7ed;7Bx^aSs@i+ERgMgMnpLk9*Z2o`CTupFew`QZki>typ$~2#97=t% zKw@KAjr$5*$%$0q-+~GW?`+$D7hXc0Yw(pnXFYCmTOs`k7P)BRuP>ELEU6X7rJb3I zY2lyFU33RV>NQDcx?8(mqrDiIl4=_1@LR#(FrI=07ucM z6mlSh_8B05<-+>Bk(Elzg>TrWx^+4zEuiECNM5W!sDV#aiuPP~#rI@1p^T(~W@9~( zg>k?n(W2trzpEy_h5INC`Zb1gaIBZF&cjs67*un5kG_K)AofQr(D@6WO#v@s5+Sc@ z*HFLh>{kyfKhf8@*|~%D|9+haF)!4O!9+xK6jB)w(;QhnH2mBh1%&}wV>zmoX3nPH zvaVNsbUvgxD!m$P8vN!zok}_WZA_68aMOYLpn+i4G~Rj*XpERJ`5*D9pPe!M*q6-x>0Pjd5i1MA14M9rOSqhjpQsTlHteuv-f*63HC`}SP0 zV^9dV^*$917C067UCB)XtBr~lQ-d&Xj*+GKBA|vh7angMc15@A)w&FI6HBQ90ylw) zpZI{Mmk8IaWXn{O%|W~Hb+7fxR7Ee%AqdrTUoCsz*j-{df*_xx{c=70v7a8`(F5hv zH5#KVWc_VtUJ(n?{&Lf5=7G68s(@}~Ok&*G=h_)C?ousJs90c<0H&D)8@E)8KUDu7 zSl?b>{K`#`oXj&%>r~`uJMAZP)~eJy0-L!v=tT&!4Ta7UAnnY=dPK&zx9C2$SuYum 
zbz`>;{D7@d3gRNgRv|fhtwMMu%&gHscB!B}e!r2d3UurEmtxe+Ln%CY18q2t-C7iX zlA`Kk6PzFIm{0QKA4h0$oMvu&@>yWx zz4?t`w7KP0_v*%q!vKB=w4)|M9*J9nzcKe<1M!ca%5v*_51C3bvm;QOLQVA@rpu=# z9|=ogrS*Z@h3D?n&(*HrEC7_^jhKU_4SZl;f=et=L$eaP6FZ;kO5z1hLS{?HfdVQ` zYqm5nY#a>sBazEIP@w>h*M7iSy2cX&#(xM}BbKSF&)9MET1(-Y7{XX8ri{240V-+qGfNQrxT6{!ipL`+88HItoL-0gtd#nSbR>dFe zKYj}91sPqnXGoLBG(Cd6A|>0%I>0hb&Do;&@)VGz#;aXeja!Lna66oUxcEWnia78L z6e&15ZrlLWxC0A(tDu|Ycr|4g_Y;jK!3kwg)swP8UIsEjSA7|66;5+lUjDp|Z$8Xf zwb1dBTRjdnLh{I;x)`GCQ5`6*)AG(*a zFS(7dIcWHtj?3z=nBS(ai2D&fY{2IeWO1OYsRszRfIRUWQ7j)H0;%j*%NZbU z1Mju`=3*&gG)Fk|8Hj2AfKhJ;x;T(*T6kB2zy;K5O5EH^9RU>9%J9&xH}l25EAS$I zIHEf4&Z83%^%>RY~~s?ZUnm}mjG0t~nQ3jkm5 z)eJafD@2INp!(Ln9G^&Kd0}Gkx^v7GHidZe(ASR>m zyEG$ovNZ>1a!R*UB8Q2{aQpeZeY;QCT;537n*ZHNZYh4Haot8g+6Axy()&`ZC>e$U zdJ5B$iwW<}H?YW%J`1}u6YLSx-FbFY>bUy(7;=B)zrDixwd3p0$YlN6gf+wC4Il38 zmxS>#$v-1(xR8X4gsifYd~5lSnWiZ%xU+pV4^vuCs@zBIMy|fgzqS-2%EEKYuHUxc zUMk6{WisO79*7g9AD%pBIx88>Vz+NOup8I1pD)NuDvdQ9>+!n|hoN%k_H`EWbJ6P^ ztwG$gJfg4(Lebl=4AI;ED&0d|;hNBG-;L%cvENAav;FH{$K|{vOfDw&HmuXldLkwK zc#JMt3FkdjGl6DFu8dQ!{q_fXRBB z*5y<$gZH>40VN4IoCN%>8h`c-PeK!Ti+VhXODvVVRrO2s6>|UFeCWxMQLg)XmZJ3{ zHLJz??)G{38eu@g96j%+=cWHi=F|-T3n7e#2a(U6^KCN!{mscIk}Y1Lx|u#M5ZmqL+Wg+eb??|` zIWD~7OO8AXB7OGix5LK?DX%AlbbxE%rR#f7YWUarRDSa$quzqrz;r(^cYuB|_x!4Bz<*@sU3BI+Am29O+Z9nrY z6S}dn2rxf(Z+KpF4=GrCD%ax(hKJ-#b$nMVd zqqkqu)pyOAPg}HEzT4%x#n)4sGl{g5G{5Duk_NZzW(K{IzG{QSqQ13)Em=w6uFWn* z*HmLz%gQnfO_-oar^BvmlJTvu(9pZO*Ay{)&i<;AASe62> zV{%bTNL7!3>ykFb(p1^llpQy@`dGkdr)KDC;?+W_-2&ag=}PCT?>`j;l@HO^egtJea`PZD9(=4BN~h2Jji4X4JJ`); zDK2$tc;EKndnTEatxhFyLgVPgtw9I)Z@7n28=>W(d{=3nCwbU29Mi9Dh_3XG2Wr** z%DJp8cC-XEb{GfQ_B1eQsAF4!{elbwQ!GJvhm>Wzm0UQ)(T)||k1q)$VF?Q7_90bF>w+sTs;rWoo_%`!aua$v*xxMV>R8j>xRpNn`#Y24Jm*}(B3ys5FkXcWD&k zqDIPQfV{!Y3>bHa@pNO}*R41iFMugK{27Rg>Y*#=EYC{z)Pn7Jq~BlFBs2WV@=|@mQ?qc2RPp;6^M=^sP{jK(bBgVGbLezy3t z{qw}1=RNr0Y}KIL0YVQ`BAjwgQsnY}O;ur!C6Dhl%!^=G03*>Lmq?v1JlkVwU{i;m znWa|c3RU}Hz+0ydmCh=X#BeTUB37w{vljK#ugMuKXJqEp!He;2-B@pkWS(|CqVxad 
zUsd>;Y5Gg%ZkiIWNe1<*^eU*Px4JEsw59ob%Z5(eAMR9vQhLTrZv5v6`8pS8Pm<#n z{`e{fn8QRN^exE#08nJt_{Yd&T06^WwmNVQlKISzLfmz7=w<&3r9XXf&dc~MBqeJW zK14&eA-bQ#vf=?9GoZ3L<|WDCe7>zl`&zZdxEwC+V~j1Ci>p0ZmPUszp7Gv|Nkg~s z^t>;wAN!}i6vDtA>~v7~Kzt;0xZku=oBYm%ozzM`RC&frnU8!29ETLra|t20J)V)u zDk%l5q{tna*6(qLAgL^?rDdi!+T&d z#gXvjMH=Vqbk`%gg;f5&%dP`T!Q!L5J?m~2CM}ec4V$OTHObJ;;HRHlP_2;;KYy_| zY`t{MN+vb)^K9c&|HPte09j9g^T(fMuD+=uk2-rT6k?9G(E#@ZX8fu%JuvZ+CJM7i z22e?5OXTn%Kf1#h|Gtl&-YOq3)2(3*;>9Z;Gg)}+q*$`2%dP|D#S)E(>-f8%_eA4? z&;*Q7b}za=FM@bR>SP!3+~d{g4DwJ*u zB)kcKS!Fw3aA{xdly*qZYt{{!a>d7)^k#*DkZ>I{IMb;m(nINv(aRM#nsONJK;E>? zG?VWI+UKuV=**xZf;6Yh>$)zb3420yz?e?N$t8Jw*!i0A`g5!mQd38SYw7^95{EJG za~is0Sqg&E$7iAIt!I(tfoK}Q!c=dPk2sDMwl(vM5#dNSAp84`^^tf{-bL$kr~b}v zw8zQTOTDj5_fq*dB3vUM)5i%B*Eq$FB8EPg6FLO~(dPW7TqQZ``_MxXUE(I5t8+C0 zYV0=%J1C&kCZx#FT4ib=o4h&}cqZ$~Z8j~&F;E{6^J!Qq-n^^gZ7DiL(yT7!8cE4A zM*7>0x4D;CCKrdP+fe1#^w%-O<)Rs^_U5)czbU?GD;$OY)UIm~Ki+n(jE8ZJ9hN9a zY7t!iiL;}7Cace1(4+q$LQD$v@Y>w(D?(LvY|$QP!?O^!LHBenXD{fb!aM5u3CnS& zM9`F6w^|WD8Uag;B$4ZGjeNMJ#d-jl9>N-J-P(!}b{DvWU`0+IN`QBQK(C5m3vE|Y z3WeeHQ=4a2C-M8QhbnkFTGUXQoiMFGJkA8&Db=0zINzbV-MdaCtQn%2R}Pq5NpiPF zkuL>_x}P)uoXOBa@MSwqQ_xs)MJ79M(lUopnz*$U;r_T_RQbGp&qi>O7xY>&G>90P z^9f|LEtrmgkZ^yV8qeR$q_e)KhPRF+OE`Znn>Apn^xPp7 zx4mk#<{&Hl8BBmqg@a#!z1t)$vDo9PJ=)@+QP;qTgAYH|&@SwQqAk@eub8JFTv5^? z8asP*Bn{^noHL-CJH60L6?Z=nrZqk2}Tg*f!K9t|pqYfVF8d%~| ziq_MmLWsQrBolsA!HLVW9?}uiL~+C4$8|{5yGVIsyiobCV1ogFU-yo&j}_EXxP0uV zYNTlL7%rtgzbB0c1r<)nseuD$J?+2B&qEOkHxU1#pDIc0v%sHcMR3o8uXEGgmt zsl&*i%B>Be3a9>uhW`B({I~#4dAa{?Dn&^H0izqObYF#ehzqhBx zmW8x0{GXz(JRZuvdou=O%Zz=Cu`ANp$5OVjMV4$6C406kQOHtb-?t|cWzQ~zP$@gf zzE{>rwnU|(p8B16-sgRPeflfS%~R>pJHI<7XG_0*sAog`jDq|1a@WYu6x? 
z_};sq8tzV4yaATLF&{3s{$}c=HYSdS1iVKn6kRYUvC_P3m|6W7x>d{vfDPmY!o%<> zDfpFhkijt9k5Dpy2HyN6$g7|+j%p1nf4B!( zAzix}90zxRNF?*5Fs%!e|JJn}&;DP)ew;p7{R?JH1L1N52#uS+_GtUSoDWlrQmgXA z4>uqUU+ZDFz7T%!-qjwa_>{1ZgZ^0-6Sx=_2+rawAXM5kt^+>{DySJKy#xF(HFT^< zhtGgh!nFPNlo_xC{B-@DnAE_tVH_kJD;Ov`hLjeQ9&5H;+nRQXScKqF4D&dlCYAep z(SPLBYB5uRa{&^qcd^^#Z+7u?+OB zjuV*#12?~Fz`BO!#6B=b=Ioiwe0X121_dnsP`>&LGfnR&8)C-^7;acA1zj$09$!gZ zy?c3cLt;?r;#9oGZMt-#KQkC}($>0}R&aLcX&{j)APtL)|n0#@_F4Nt(XrWYL3 za}Gt|4q&7_kB=x6J&(~hMSx(@XXUrrkLiuo6oRs=x-h#w=MMjQZHKZC@aqR|&vtmA ztl@i2pwJty?yP$gQG;lR%|#z)@Ibn`?^Sn;VW~N89j%TK7s^u(46Y)Qn-x}a=itbk#K{R0SGj^ z0>%MctbMQj7LBJU@b3I)=salOY7&PE3t{LjJOW8Erg$$)yT8XOAQ;SH^m|fx0-a1_ zb+Fth+`g>LPi}W2sL(6ay-AHW%GbBK=Ic&2IhMm5xme7ai?uv^14DN+eoAO#nxA;b z!4qYb48yI2O#HrGnN_;#9%uCN?XD%Z}XkPQD&UTCHLAE7`f?HRJs zT|W3zn{HkZ^7oq-^;ymPmTTM4H*H6bU0PTygC1(sf`DX6rZ_qGZt~T$q#QS=i6C-d ziG;VBZDHPwMmZm+49#Mds?m(33HrLZ_~K^iH=Se_QaFx5Wl;>D0=Hul zm8oOjO42=3F*iYisqk~2dOPqX82@716cJD_xn~!)Kh?6S#lH-r(fr+kTm zXtRRtXwbaD5_SdTV6r#F&jjZ)8V>L;{-stBo7$b&TGdJe2t61`0Qyl+%Vpv;T2OV@ zZrgj|k2bSEBD_o4PmtZ~WH@R6r#)M9P|krKW&qh8^nw#vyBwL8MP)cH4nXiX% zJw-k6>YOYnCV$Bq8ub=m70{#~ShRk30_tGdx9^6|A7Y~$CI+o|_n}=Ij}wipp8xv+ zb^-L%D-;^Fn#)asnXkVvX`Psx-;GWg8zW>sHw-^#QoO<@VY*Zbh;aMStBfypZj;{d zlc9h>gN<1c_#PBLCEgH4EFix&?3^WhsMpk39E97!Z7cx*wl#k(c3ibNqamel3BhLmvcxoYL&S}XT>TCBDwP8%Jax~85F~>f=`If zKLp>j)(&Cffe2X-GSr}t)hUdEhcqi>r!^Ms4KcnA_~gG=))(4T>#cviTCXIZ1rc!$ zFGUaV_ch(pz9;0da!j{-*DxuPspt_|DaZ4k0$X$dS>u{j6l&x?2NM&YYBh=eyra9q zNxB7D8O4=})WMF?Ilszc8YzT)C7>3x3vzpmYAx7-mfvham=zD_?bG8Aj*>VIc}nK6 z5A<);)bt}4!67RuT>kCqNTdsfA>Tp&(g67Ym8U?|r7VH|&<|uO|oT7$)BE z-v=KcMZ%cs4=}AU6exBB-0JDDLv&>gq&oVkEZijuCIV7OPs68zkW%BpLBb2Sv%I|0 zJg<0p_+BjkjdfT@^%G=F{?J`bt*K%YMqb6FDrE6V%SyJ>y7dvW^~Wr7|2ef=0&p(B z$krEqX$0p5U&_LBsO1>S6*8~2mfZ>Z|Vkuh4R|D?jd z*h`-cUd-~|*n{ui;Dzo2fvS6V$ZUrE6v1Dt8TmoHtyB00M5BUv5BKf1O=68w@q}!iv z-Bg?MD){rbwJx-vD8o<+G4{(Hzo?gKe%lJ=ZZ$XX+wDigce|CP` z4I}0zp5%fcqA`1>G-tNqGV|{x^EPMMOE)$iKm*aog|?sZ;(1V}ZvlD<{ITmG&BQGm 
zWvZYX^CNf>%50!DT($&l7HoamTukKh^chY{yIczS*pxI=%R|6g!qJ`UDKf&~&)&ib z0sQOyAXfMmCTjLYLzCtLNE6U)(@Qy=`RPgNNa*TlGj!ZI@=JV}sq_A2BUiLba8tKa`&B1mbC0#dfBdo#H%28KH<3U?B z%{#O8k?lb+VN%3~=??1U+X{2>tS|asl5JjupVGV%{!6B3ILeNN=cPr+J6f<$<|R&D z0foDQ!=e=BoCev65n-{{Ev^wNt5p)dKV?lg)-|F;^?T&z??~=o*y>o01ytAzJ4+em z2H-hfe0;?=uM7Z6oA0s~eNibce=q1PRzCbYYwA(4C>ZO_`lAM%B7@JnTf}#50ja=} z(Or=7aCWW%1&}S0FW+dS7bzqamcx61vJwogJ%G%gU;O#o2SD%OBclb=BID`SG(IKo zd!lP^O|c4Z-zl0EYEc(;Wj~E%iX6^`X;_KwFbrMb<}Z>vDSW!zERux25W~kt9g?YQ z7DkMO-LXGGNsJ&l!*ovQ;uGgN?$ip)Lz0g;NMPN;79UL@SFjHvasv5;h)2q7wgRP#r&vZ+!&?MCO8FfXw>gLS)zCH z6+=1PYCErtOlV^M~sb$Bh5_9*ItYq>*}d6Az21TnUXHb zRzEAagWLEG6Ij$boPx+Pe9R5~tzmleer!~I;* z*Y_k{p$)7Fh&}xbp2ev8Y~)xHJaGAqc9YPm6sz(C z&X8*z-5sv|QRAXhHc_>Xy)Z>1a|j7SpB|MHDW2#zct1PwnQ1{)y^*$>a_P;Gm(7J} z55*>jXh2`;ubkJJ$`Nil653ua?WTUOUG4~)gKYoEUtI(g7XdvX3RMenG68I<#)*EB5FGQv9bCTU7V2FdJdv@4cHp`y|?5WiQW(^D$X z7$de;f6j?WYZrit)5AL&Epd1&g;(>)1^se=0)LVJT{vb_U$!*uX==w zq~=7Yd_?O55Gf`2Bs8FmTlQ$#bsW?ls~wnQL|5oTJX$ z3J!5h!nEB6^%>{Ge-1*~zwxt3_8S>RUy*r$Cy#yYz-^aZyPjk_IAUoou3cKiB?ui1 z{k}N%0^3=h*#nYJ>?h`@5~iG{J?))$h9u-CZ;3ESGA^C;~0kx5-oq7W2xdqRl)AK_R|3;Pjf9~r{S+aI^ZT>?dz7qPytN_3*AdAvS(8;00q*d53p?X_lqcKo}_P=UY^eBHp3A|uY( zOUJIJ#e|zjO!PMHFZZncIfXhSr|&Ye5zm=k8!9eaHVl3L%f1ZN$gNc=`@7bWUyk~< z!r=Gri&Tb+)u{!QIY!eNXxySiGZE`P=v3!eF^q~F{-ZR&5FzkPLdI*U2BJ829w!TF zo@PPPpchM8|2wK@X=zFngG#0tne$p-*<*X9=yQMf!j39uj8#Maxx@3{sl@H1K~T8J z_3{~8`s{|<+cUF#LwUIcPfC1TOCh8|UtV=d^r{MR}ByD!fxW?t%PsBgUt99ePdK6LHv3rir`c}%!5&LOX9{>vw^ z+=t{wc+tg^|6i;`L8<&Zfs|Vy9jB3m!5>a^+lSn4^ZNOpIu2U_fI8ix**5aOe*Hl~ zxHbJ-sw^@3_K4@pGum8I4sTJ$F z&&Y@=AbvFLv7%b|yMl?lC#*s9zOm+hmmax4OUc3REB4*56I|J2O;xpc7P3suP8pi_ z<_8$z+o!UhTF^|C8sd+fZ*)tywxs{KMolOqkt+kdqybb4NJ6{PuNxXYzonhE+E}TF zZY=!zxi1xj?28rCG(xfU-A*+*A!g4yvtTGYHBk^%L1~@WvMFmhCSSj9l!JbRFFJ9q zM8C3TG&m-0Q#3bT&tIM= z+LTfj{ALqA`5$T)$f{jmF;3>HY*CgeT#(618?h7%n!8kFGJ9S?sW@n`1Yf@t->Y?x zhc{@x_jcbOrF~PrYff`K=mg+6;nY7y$*u6bbEzKPkR>2D~7sn24X(d-1Sx-Fc8!AwpjsDhC<; 
zz3gGj+se9qx#cZt;tt(hNBz7_*ax&u6XQP1-TWOY>vzYPCyd&93jiuCdhxY0-uv_G zgL=12sZW1Ph}+Xfhc22)o}wWK?INut1O2O?Iz7)88i!>>!CfcPI>u8q;G1^4NweWl0r9XJ;Z-#bp!~ zB_{Z*nw!qg66b~0y^UWplxP3vTrTJ725RO#;GT1PB0O*FaFSc-Sus81QI*Mg0V#U# z&=z*Rn~tb;4x{C$pnaCdnQ0m)HC68dX9K`3RlsCZFhD@=R2iW{^f{w0kK>lTn1fWB zU`*nakLe9*u=pgPRne-nM_1x{bup6)G(EHm2&K)an)*{Brx}_lVhgLeF$PBhN_E(% z`|EH7iW}Jz^snS$ampG^D;!7^Zc8(&MlkMZ;XcY&y8u5|lCvK3SQ!dS4Zzh1q^Jzk zM7L10yIX2Ga;l%_My}KHwHI@s17q5+O=``_vixw@A+=jdE*sR_;v&HgeMfjgJhO#Y zSnE^8w6COx$-39%jM_dP>A0Lt`W zgITt>{nlfP)~hav$^-h`Ge+-yH8cfBb6!_z(mrhaX46S&(}R#X95H6A#_ouUe8oi! zTxIVmf;KP-6Qe&ZQ+lc${iJ%zYkk>7BkeVm?lgfyr$5Ak=?&tjVlJ5xrreg0O0-8i z9!m&qOa`VQWPkJ`?BBF7-LtGl}5RFY!fc8OTC0ris>2ZWpods+BX8 zxg=YU`Nrr^&HlRFOYs-i-B~)QZ65=ocGaxK?i-%9CH%1`zS4~|WD<)AjBlTUxj(j2_KznSVj$$!4YzKyDZNs`U^kp8XA)yzy` zDZc7}Y>vB28SuR2X~6iHYk_@mshAAi3F&pJNQOQnZ8M8M;JvBF}S~W`BLBCq7Cz+5_@<`SvO{iGH>NIzh>R} zt41XuuI4IT2}u~&H`iZfNAR^BI;ptVX4vOkKBsiDKZx#Wk3OgL1uk~lOa}iW=<)$n zHqus}DC3~=3D(UGa8Lm$UyZ^2RMT$|>)B!YT%S^>d2PJJz`01n)c^OdgSMiM`gf5e z+V?TZM@`=yXOONgsoPx*Z|(iI3@p{x&B6_qNG%$d9q^F-fo5)${?Acu=0XL>?85xg zmJ}g0p0ej0-Mvf^9jqK3PPz5OF6QGIBHztDwqc9Q;?||Mhdu(e<{#B1g^kL;Z3rlI zM%_^AT&GHse@VN^$fjLAK6m&+TTG6AFpj*;yOh& z)Ls>LO5tJzb7spO3$z=cQ(oC5`pz?~Mx)|h9gQsdTD{Mm5Gv!}>`H6~s$#P;FpqaP z1{c~w#?-t-A#cCY&5YASjQylb)Kl4PKy@D>3o65HajquMe|q9KDE&L<#sN1wh0bg79 z{BX1C2k!&-m0X5Ju9kiI%#{Nw&h5tc-$Itf%e|glyvne2A@j#cQc_RA)I}?IOg{lW zNEY)vhl7L^xkZa|j94<8#8-AwP37PG&ApBH3pK!m`co5e?V}apjw5ekeD#*aSEa&L z^TH3^Bl4Ho*g5m@XZpHmhdamAcO%YdvNmO+j}xLg_VsURx?I^JiH!P;x+1t4Ff+U2 zx{HVkD_J45JA9_lS>k@8##M~{g3FOL_dItllRHL(__fYNHfg=RPw`_2>rDKCtGDYL zW67WjOWK zA%49kJb6UF6^Dd%3B+CC9RJgZnC6DeruHivDz8(itBdHk#m?c_-GFS@R(HV1HTBA= z-0h|mT;_c?$s)L>>dv|`sq(@R(c9Zj`=H~DtfjfS0ILdnexBMi`AH4=zJom56DBTm z#_fnt?%+=7oOOI}L`uZz0N@0f1WwQCreEv3(8N?oUO*6Q9@@xikV)ayqSN=+6F<_K z^VK{AE4k0l;j1|1rthXas~&g9kTy=pu<3_iSu>&O8f~sl4H!tTL+vLKaaFrI*gQVI zcxx*7Ge#cQxEJWW zPNTwc9TmW79%U%yf&SW!%xv1HK%R|mm&vy5C(vm#X_;%1G|B|~XmYQN7En6uj=H1b 
z`_@=k9}i7ycg_b3<+Ug~X%<#9QGbwr(1;?Cwa>B34>@$O>2x@A*|0~R8OOdY#|nt8 z^AwAsxNTKtyZSTed1#cUl^&m+U3y>_FjsFGpO@g;&Z^F;#vr+KmValo0(tdqlP9N~ zlIF^&+T2k3x3{(h?Wt1q;&gi(*YcfOi1M00w(F`8GIBhg^?hwNjTT;ITjJ92OAK<=oiV-ZS= z2kEPy@WaCf3A{_gNja4X^+bm1GJ98hgswX_N!5zU=*V`Hv$Tols_UJ|PnclY3R=;a z*l;m>d0yUoNmX#ks5Afm`(rZh#`EM$hMv%+3)@QDCOru9$EJq63c|rjv?6$MgNhrP&pKGXF}V&+lT4c5MpMk7T>PmUc8w| zW92##_i6Kp$d)a8H8D?Lvaa$^m107*^`NCqhws&oO*4-qJ7=A^gy-v~QH$OsnR2k- z#__pRA;>WFlBP6!Iv<&m%7yXjYnZ=z^Pk2Jp)?C}IWhHd+Fl3y4jIIn*{n~T34D>yvS&3#PA7w` zqZwxg7lp2wBHtxY*eO+7)g;t16b0=by19iZc|X;XtU3p&rrtn<jowYC9(S6_7dQ_+P!KVRdPw1Y1nSc8({MDP#WQ#p+F{pgL$ z<%B9&d^7bWd;eB;7(oR@ctj+gduF~=P9JN7xn|9tLaIDMRt3nq4N{6MtRb~1ej+zJ z0s8r-kxfQw;ViW5Woxd{(F&XwVv6nTXNgT7?3FaV%tG~0TP6PRz<=0*MT7_4Hxx)i z2zN6wtgJVIn7h3wybni{M4@R!JlDFKVTgKc_wa(Unmfy1NfKX>{+C6N#O3~R1L)k_ z`*^nV-V@>aWjvq1{yiq!c_?>x4K)mphwrTy!BLkZp>gBuOYwYke%QP2c*1p2o=UT0 zakeqhk-L>q$u5*Gps;~+qI*i|G1I~onVP~Dq~10r$LEf=Z#OnUwU?ZP`H6Yf0RK-Rg#!8M4JD*qp_|^;F{%G3ih^;W{mV z2+~Bv?=_qT(hd4RKpa+#D6zN}Kqu0xB==-6qfy})pTUh4h3yvMg8coKr!EHL4q)^#*T2iwb`H#q&r&bO4%sP3iS;gow9V;um*k4E+ zJ;T~r!>@cI{zUH7lTtpOb*j*HPKxBfoFmMMUS@ZIZ_>^4X=+$+#|ZYODixJ^%eE1b z{Hl*ewPM~dE>DAmsOG;nvUHLCZZ4d}_~MHSIfgQ%VhaK)Qe z@20_8&8CUSyP+-neW<$>YwI$4{i91N1d$O(Wb~0lu(;;$o!ICjjbL#CM4Akd5$2;| z;QMGo=bh@-*X{Os4d#6I1?QAU_YG+W)Up*n>Yq-@E1xn28v%kkqgi;psAF^|ievI^ z6bJ^P}0rX$0(6u+VbfBaC$NB$K#Q$Jk_z%Bc8NVS3%zzMl1G+;n>BY7| zNjPt5BYc`Q`VL^o*nzc$EnYm|-IceiuTSIOH95(Lw?nLM^K6ZBlUNU{(^p?E5WCW+ zwnOD=jfdKT&sf054kil)m#Dv>rBj(>DoiWnR?83}c&Faii+xi#>5d=L1{LRd=19PC zXfM?jS-Mk8)>eLjmFCTun?^shOfIhJ0GiJu{Enx|X}~;6?j+tAS|&x-Y>YN%a31z6 zH5czU2OhFAK|m&&MwV)5NS4Q)=?F$rjm{Hh*^jj8n@irTlgfhEESFgJNoR4GO$oaC zuy7kn%$Xvaqn8Z9I+L5w@neC0WR9r)JyQn!zw9%s!2TA?e`}`y_8a`aS6f?0ItNQL zBVz};f9pQ~d|>#;F#FGrR2j!$F-Q*+^dk5ZFsH-SmX~NhO>?xnTm1YE99czN9A-(a zZT9W8RYYtn6cX+3{=N--1+V*1$XjPCWt6uBOHQ04SlHM9arAl%5bi$9Rp z(X-LBe^`inht5q`NL**l<>6PsI zGTAXnx+bZc;fs|igU)g1U&cgUWkeec7XYAF9uVNye~!ujiN>Xux&zio^6`i8qYwXW 
z*$F|7Jx3cw^=4ef2x50P=A82+m9QY;i9Z0iuVow8d(X_620%T8EiivlVLYLDtz0Rk z(vi#aLsrkv$VtW7Y0Q{`8%ItThrUk%S;5$)rMEZ7$LsN7i!KubS;3JxHy57lNyAu1 zLJ#Rg!q4m8&iBYpjV@iu3DM`H<(vEG@u5LeyZ6`m(*31@J&Vq_l{Zh?o(JA>M8zG{ zkCvyWQ#blgm{?bO_kLozs_MbSn#9A~kRdPM&+Xp5CDMsWPBm$RvRCYo5vR^x5*;0F zP0H0=HJ(G`#gI{0NExr!MZ9;fjk7Z+4_!~nN)Ih*!oB(PNKC}V!bMBO`IYXsYX`b^ z_Q4k?#?-K?tLClY!AinRMNUe^@zUwS{{7UL&(UO;6o&Y)Do82#KC)VT8-3UXcB|Zv zHt*EpMZ-{Gdv?z9OHvuv&e8e3C4=rXU*)$mYmd3Lq=(I&1-zeovB9ry44EE8ey5{QLt#RSvo@UFoCLj`D}ax%im#pE6Dz(_^F z@U!%Sf>Id#HL?Bl$2x3=0SezKQHSPl%+rxIR-WyUA@+MvxGNNH+S-kp1 z1waWsW3*_z1+Nv29jp4p!+XS#cP`8gWd!p3ZCqw;u zcdv)CNL^jJz_Q%V-R|TtZR3=Pk6uCkgIM32fE*U&0QA+=Yd!5N zc)YvXnm(z#pL66nt(P#%QuhWiN)L~0%FwWvx& z^%hprRfm;^GSD8o5`e#~tD=gAddfq76l%Ad9W7d~1fOZ7U{}Al zYppaaf5M>`XZT{EHb25Fq1Nv%xx+L}^5>>W2o!>R+Ea;>;OKTCM{$!rl&Z?f47DtT z+N|hl5&Ll>u-NPx>!>0fsw%gT53pdYdtFS-zS_`rdpOdjOcD~XCMq==R<0MI;ODU0 zwtYUrD_H3`BP<#}XDHDlBA!3(>()W`RQWQaQ_zl5pSG&|J|2}JHu~A~TlBFDU?wJiPA{^lqvaxy(?gfl2@mF&tn$LBunvWe;In~*wA4Xs}r}jfqYg>nK!u2 ziUshl6GASBEj01^zso$y!a*dV17AQWQXo?1lL;_LV}I*SL!})WI+T8#6XjJ_CfLPe zR!NEBixMUe!}e{0nTnU@->YkuUi&Nu7_5~UNQ!PE|z9v1o*AXj8w150t00wXX-)4 zY^R#H6jK{w41yQ9ooW7^Sl(! 
z5J-NYOK|FUZu0f;S8k0RU$GZRF(?BR_FOgbIW{qn3~5dNOmqpF{BI}%3nyqBC{d@~ zs-*p$Rt@vTxsI_X&$@NKd$L{YPxVJf983)MlTvBndVgVQCO{jIyGtxb`^5Y_r*pM zzWq<$9o;_S32?{rOp`^xuR@x7I+bPF%j^_5B9if@1MCvf97;NINKk~L$7Q)PIEW{w z16bAl+(f^cu3e)g{UV$JjGtp#eGDW_}zGeQ|)M$0k#V(&h*0={Tm?)aDc*p z(!{}(&IE<4kXigY44t9!_QA}QL2dbqXgT9N0#d=3cQxe~_k~WqVQmNKs9T7 z!pN16y-S7Dn=3$P`yqc9HQi}-s#aMj=(rIo(zcJfBYN?~h<6GI=Gjwd*bVmdxp3)e z&)Xhndr={MoqvML42Y~&rW0D)QALw^#T*CfCZPJ)(X>(BMb5S>P=CzalA6VIeFuEuFE ze!#lECNlj+&{4>&KN;B03Q=% z4xX?3gCaZonl7BR62V18UU!Tja*SXt-UMzO88s1qJD5L{LB-;_b!9}F>w%?%#)Ldv zjqQ(8_?Xgw6(RF1rwz+!z5uFMLjCENXS;is2+Gq+_{Y(UJi2TNxmI#P&Yz005fPGn zxLI!Z81^fwk>CWRNNIl8T7RRm*)=1=*wHN1Njr=o#q_kCO9RGYX(8N^x#UBqhY9I& zauId)Urp@B`Al}k2qgjRU&n)Ir3*Bl-_SZrX0g?>KSx8)uB;g=3A1p$cWyDY*A{ER zsLUFeNZB}21aF0d3J4(ihcG;k)Qj@J%WRBD*-J2Uxr{At2f+2ea3C*Qdb&J59smXX zGrXfa>Pu^^aKt%16{Vt{%TILB%0e?N|70~)*AK)PuPA!u8*CjaFj z(quWd#{#iNv9H>!C1A(^B@M5wKN6^(AZ?Yh@G%2d5rZ}|5#e6q;JI|kGJ3TAe2=XO zWHtREW!6CYtj;v|l(aMggq{a=CPteu5qLakViO8QXiAgBdPz};H$#?>41hV+|0lmR z*2VSYF2`!)yQf-!JR|9=wrft8@Le#k`FGHLH!7c+nBj98Vwx-Q*yuciAx0fc z7t+p20#h^?aRm#x9DU_c@G45#0VC~v{vTE5_Usl8SIz*@FV+Y=(uj()BTM^T!KeA? zW<}m^5L)=vk0;DpBP|F`8h2sleqZO9M=)5VV{mbGHW8n-KCz z{ET37DZYIA>z0faTlK96*{Za5EgIS%77Cv%m743W(T%M84Z2coo~B<>@W`NhmW}Th z?)Unh!DRO2gC%4I7k8H3gM1Vv*_2SP9bfRT*GK1%SVCFUT zbEX|$`~R{aLmMs~>BmKxN4b=%ZiQ5&?Y8<^=oBqd#keqx4YggYed23&=tyAzvI>O7 zxe=fosFt&qlmb|bJyrtF z#3~p@M~GAyV=62Mm-Hp3rj}6{Vlg1-hRTi+X1?Sqw1S^&)E^G0hcy_qX-|f8i*Wi|5U4A0Li;w5Ga@^TD_sN$F_d!5Q^N|h zWtTr%J2g2cCO;Y~F_?H0C+fci)%uEIYp16-TqGA91CBI6cDVPg<~$7_q|dnD>AO0= zyMBRL!Nn(o4U_dE}=qdp{hZjhs<`# zlog(RXy#s9@RFF0bveZOQWJN_^2l>wAq7fNr|b$KTwr2WpJLrF3$Sfwuf&&hSeIPVOO4c~ z=?&8Ih52jNU)pl&l!~_>AflK9BD8`D)Ubwd39f*NcOZg;yg=>H)T@Gnuwdqfp`q}F zT8_lpAQ5D2UQ}1A_2!p`PyME4=t^^G+Nij`c%&*$E9wW*mvl9dWI#p-7j`wdWmHqy z#6w!5alFaLgo23KRiiQsvgWFUhGpp)a3r-dJ0%A&QG!RuqwglMZMII|3bWhZ*jC5! 
z8fDHsPEM??l=tGYkgW6KFT!n=vgMCV$uGn~-@)5wbC``cH-)t?sEos*%V3h8%54Kz zx)FUEIA*`8;E5Kcm^hV3E@44X%-`=%6&Jpzu!o!P7)hSvfm-U=x>OkV0vXaiKE_a9 zU3%+w%!!xWv$JQ9?Y zM@!{l6xK=MwV{#LW3xam=l|*&f(*ca6$SD0WPhUhNdI`wjIi`4D&67OQ!w*??j&Qd zDET!YGKn`K|4gnt<6QA>NYVq#e->Dw%mdXvZ3N=?{N*lMtp~3}4%`0yA(h4hW4ELl z`as^~?Ao4aK)miIV6@AhIm)prD0Bdisl8M6v~v9+z__P^rTiISkPnP3=@#1s- zl79xf2wz+ei;gg)Kn~X4{VqANoZCk zkkAhyRG5%0aPwPGu-ho%sLbM0mkfw#?rrjS;?+L*e5A9;NbV>>3N+X8a7 zjQQ>>@lE}rmH#%&SOb6k#zq4<{(2t*W%!O6W+GXq*GMo&CYfsFzX}m=6g16Iprtkq zM~?J}2<*%y(!msW<4&Zyz(0$(@OJPwoQOF$Z>}iZby~*5(qKkHMQ6Qm%Wxuj;cW-cG zE#D^d)o3C2%yM*~cAQ2b(|8abN=(7P3`WS9t;GD1i`>Ywz`VG2hhC%uZ;0n__ zok=AXse->!aO#CW?LoB_j7xanAM>a-p3<{3=yynB>lQwr%l^zV3N+yf%B~*o){ljxHy^m^`HRoZO~Sce1#=*5 zjGXaJFM(&V;^Ok^*9nfh(Ug*9Aj(?F^KC(B1)OUp!7Jd+Rm@n&fN=4Rbn%T%$}vPa z-altsHYK=w-*k;YIgCaJ5iZp61l3Dg zd5xm@H7k0mg!OLVIys+wX>Gm=D^L!8&I{2=nv#0pGLRsZ!haJ5M_aQ)!xf zMekybu=8Ur5_g2qhx2MEhDQS*a*-quXzKdy<9X!zaa=L|MD?RB;Owf}D$l3_jY2D| zOSnWDCRrf?$g59)hh74Bm{)~>g{~*GZF@Jk)w!oCJZFt4rEj^UNY^y|2@;9@H6)3d z{8JB+?ZA}6))xF|;#cSsg=!j;f&!`s(ItX$Ti@8rB zl-=riunEkq@w+1&g0HjUkF=C2iVpoJgBY=U}Q^Jxi<2L;t|cMb_|jJjBQ@EFJy8ePPnMt)_}BiLK0ANQiJE@oQbit{C$_ZGmvU!&kY2NTg=(N zMn7j;ZV#sdleH7jQxAG#9K{{zFt?&34>$=;Mq}5{8DA;wy5B1S2}w_pIc@gy9had; z>wMVvE{xFMSu4(}Xf{_>RsiC{DI$mN@XQZ?>brA;+{5oOgcVrJ6J&)1r{tXGYaeM^ z0|QvVT{lg6VkNaV2ihAmPP3^PQwR1af5AX~w1_I9>6dmOeXW9>5+234*9oOdUxg0C zvje=`p{ASivD3%-*tE(=X#pY*@px1r z7(ruM)5CCjD|AT0gSDtpC8{AKPwIp>2L(u^J~Gn<+zNh-4XYSX8d$6Q+g`btrnn!I zew!CY>9zLYMq}=-ZuAvR5ejVc73;2#>cc*;HE>y|OO-Mq1wGZj2j5WlRujoX6KAe;xqwOIP|zSU{X~17!U2 zmLKL3_Z09aOU0#TQ4k>!Z5UCOUI`J51e1>MNX0DL^>qn|*{(vlNn3r>yFWw_T0ebq zh9k!IM)fFx+xg8oK3r2qj5&04cKy4vl`0%vevdlD@ckIpzaH4%pL|OJTm+t?68d(N z3Yy7Ckdi= zQDRaK1xwc1)Zbikfy zI-Xn_l!qpsJ}sn_sZN1`uqdW5ktQBVAuk!j`i-DuvFc>Pn_(V`hGnJGdPy*>M6&h@ z&Lju{Hj0d1u07j=joBEkvaM*?BxMP%v*wQnSP>GkKQSuC+}hq6hsZ`&3Yc*PK))eY zO}1e(6-^Z-G5r&|xRc3nWPeo9XnXxS4Gr9nUK6~^M~it=k5f;{1}m3$6P>H{Q1Q8t z#n>5L58gOZKo)Qqz&mA;JniscCm}DCSZQM#h^SvJj7~^yL6cY-eRGk&mCtX>5&Zlr 
z=_aXysJ3S|{&@uj&p`G3@F(x&Q@yM!J~_4OhclO2x{q|#8PW2=1~M2y;$>5b#-eEY zzBWe53SoGTyu$pvEx>QstYLw4S#-J2Y8KGwWXWZOPIUyal>i}V<=kX=xg;LwkLJQo zuuGqT?6p6D&UfVC>ok3#?3Jz?+xi!X+V_fQDT1W zspB$xFi>OjR4Oi^^mlSXtFCs>;xLRw6d~=q*U#3UtYNV0z>nS`R-9O0>Od^*6ty2n z-m7!;&k#mbVHIFs8fG_Ydnj|(8Dk9o`BmIf1?c(oJG>L3LpblxEiN`b zGF`@UH&)~m9o+5N)9^_G2_$q60MfpPUPtor%UmRg%+|l-*%yM*ZmZLi;yG>eWvAZv^ah6ZZWm z#V7w5Kuvh$Igv?#i=ewwno@ds#o)ZWlnTHN6>f0|yWgbMakYn)<+qzDAFi95ajA1y zNGUc%-S<{j68#n%+DgGr4nyM0pQO@8>$50 zF||vpXLoP_;KB3bP~V!ZJFNoR(Aa1BB*G3_)mBB}t*e}}{H=3v@93`6XW#xh^7@|R zqjcNnP(>OcXVp&IpS$hU@cZrjwe9C`HvPXEP!8@B{P_MR(oaDE0Pp}H0RNFp|FWvWZZ_g{etJ0br1^&PsD>YU0EPnIrYs3;ix5x=f5;66jEz3G~lQ! z+grLa@7Y47LI~j}HT0LOvS%m2?#FYo#XLJjIQb1VC^L+|Y`IACwywt-a!U@!C^cfV zO8M&9Z0c_^FDZn;W+;db?Fcp}*@Mvez(>>l7dZbFgNW@a1!aE?P>HYr04V=%Uig2b z?=nT(x_}2V-dP1S#@E9z-z!rBpj9pP_4s3XBDz8xONclv(Bw{5#(w~pu~gqKer-+8N=H=&Y=BO4v)BM$u{2Pn1wqr&eW*OA(hlje|D z_P(QWGT0)+%LF)bGcux-Ze7}6_4B_GL zX*>#;<#FHYZRHXuWVT!?XFrj0-{xjR%GH|XZwjp`099?CZANq>lGlqI9hGB3IYkBt zZgVJEODGFHoE|1Y?mku6KvbKtjkmL(rqWPR0RIb{FCgYbY)niFOPaCZP>O#{KDLZW zpqLZe1uVp4Uw_UOCmlaC)fb!>7G+}O=Cu(4j2A5Y2-(rImalIPptQ&UFo=i04vlRN z-WTQhEn3`=Vo){rT zz@PGnO#g)DqTaob;(TgGip15^2&u$oE<36_L1U8eVu8Nxx+3=FRfOLkmiZ1lEea7n ze}Vg7p}jO0f7ki9hV3u#ssA0?|1{)C>ATrDIsPZd(!ag>Kfa~NnJ%b)dIa!W?OUAw zE7GcfP-Z8@zCMvh!05wuunpqF{L_t$IsbqIudi>d1|R7HR|NyK&R`7{{Lu`(tQ(fi z>X9W>5R*sLkzI*_4D{zK`#CMsL9J+4+c*(DYyb5zbQbRG}&|Yh2(30F(twssWv0Hd8LEY5HLIitl2LO*5Of zo-b7Xj*O~{x#>EwrGNMwGSoP#&spqBcaWQCnOw325(~0dhNHe5TO*PMJF_msY-_Xh>_oa(-%}gT z>jA&&WQ@3Rtdf-!#Zh&0Q{Sy!HLf*i)md29tI&QBI#;8SKYq9$Z&BE_kXb^iqr__>_J%(XHuh(xqe9 z&T|n}F9)$8xsYxxweV=%u!M6_afIf;#?`L%)MTEAFnrVx(p|k+Q#3qS#C&f@zY}$Q zvC=X*wK7K2P~`?}VWXI|guXfDIGQ=F2BD0HGctyM63Xh;M$cDF0hvW?Odf=5;5;gq zN{?3*=niSO5GwaAQ8oNYeiKxDIAf5EddzlOIniko;!KirWW{r4kc>tjwl-K8es7pL z$mNzP~?i^ z){)H+U$$<|Ge(ipMZ3W4)LFyTA-FTn=nTo9m7 zeQ3Ojs%rkk(b#Lae0YJH-Ti>rfO)8(qb&H+^6Ppr@jQR-&cYkI?j0H|1Rgq}LVF~t zxXsW9zCe+WvjP`YIn5q*E;nVbHH$mCoYBW*bQwW}B1&})U&Q8A489VaA7xiOOcYVt 
zF8L`fd>8Zb${s3zp=_Gsyt0Q*j}zLpjRWt6SXOC(wHOhaT?n!mH#{H^zh5G-K@=SB+y{X$NCh7C6a7^v+kWSlT5VCEz;Pi@QV>LB}m|BqB@JFe8?GQKF z^l92Xg;-$_v;`3~%l6%Ps{c|MFi&lzRvw}WjZO*|^&=&kZAyuE*s^H0G3v0SD^!26 z_8_RLpLGv@YG(RGADpaROOJlgdwW_nUf7acZV%E^>qvu5y?R=ibMP#w?97?8F?o1; zL#}`0^blOfV6(lpEblsx^0dKbAHYk{E=e}mwPP}U8c8Jp3=KbvJc4yk zc-$>tXrz3INk&4YlQix!&f1@6{1B21K)uqtDW2=k*!P!s{;sZvpCG`ms&il-$2hy4 zDa0qD2%fB^rA&TkDutr&_sByTJC;u;d4L&Whd|7u6&N!@%BPoHKp!FYn{iMmq;_s0 z6w?!{sM}OwgZ$Z9%a3C)I=ikbG8r7ujy(G5{=4r5XxwrFFRjM7_?uo zt5Bs_0f3P5a%S=h*4PZ&yY-!~elJQzg|dd-^5$VHCkY&BG}@*DEDSsEntz<75`;G8 z_574io+GqD%rr9jsk{^;`4!5*ST~zTH~||5To^ zHjdqBtv^6Ht{_@M%+K73eP5_UfcUlI2Vo~P3YTtk(rIVRg|KQarcQ6)S2a7`J|7r7tkUD_9nx%C4Lebh7I^iY>ZqPS|pM!~hU-302c}}c9R1rv6Rm+y8nX@o>%xNNC zeY|?}e*MdtYZZLGJl9`Ir}5Ww|KHIO(?97*Z6kVv6~T*c)(2tHM08&|MWoY@_K${M z0`>VJFiMyX^jsCv)io=5?2b7fneIFR$=bG20;iR89UZF36wJ&dN1>?q`(=A|W!7a# zy%j2}_x^ga<$W`GMn%xqaTC9v6)dGic%?a3TdmEZ9v|1wT}a8+n%pn6HR%pz)IrnW z21|XL#PexvlTo(%>0mM|GA!3->km#v+YMa~k(@8Mx{Gmw>I)7h71rm&ot*uh>J?+! 
z1=gMAdU?ZG9hEeo&w=90U=9rz$;Rpw;6R4$7J{-2xOyC;?}kaJ4aVRvxZ+wbxVDpu z^4hq1A8$J}}5!+J^kdHC$;y1r#x5Iil@%EX;MEp6kE3D`8* zwdw;sSBK6^Q@0>Xd&2{Y*+XPo-mK)%SseINF#P!2J zRNaMw^87prBRee^3(@-XQNK4GNZ6YKJ+WS3Di#w8oq8xpUSIqxA;IE@qKDxP^-a;< zQHsGb+B5G#Bke;s&U%x3e9IGYw7MZb*&pOA%%hLMx5#(D5H;+Q(AT_}dGbEfv_PfS zFseWuFKPh2>>5&NlYf_WwEvKGEPrL48;;wCW~9-%MN1lUkz*8T$6X!bSIFO9zxsT*AdIy){ik;@B=}U_(C{hFIPGjxQcCJ-o{)bD+7j&*M(1a@`c_t=}rDxAo)pZDv%f zy{K-5=TQgv&z=NwmyD60%y)=HykZ3Ovg(uc(FGr>HK>tbg(NRN!jrBDj4|HQgQ*>~ za5AWODTx+K>OT5#t@RYUwJSmgO2a{A=+iZWt;lwds!m9fuY@h1kK~)VicjtjO5ty_a5_b1@ z0r$)!k+qzCBg$f;24(syXz-wFz&Q4f#9V=qPRn_CJ24LfSGaabi0SD6- zT=U4--8y*4bztrLv9NE=YfElgLIx`YWLyV`I&oj{PP`ZJZ>}|%-RW0b#&))cPTA7U z41zVaCws&>=1P_)*0ofhTv<{5fO0Tot`COiPpHBQMIRv)gp5fR7oHSD>_kG05j+>n%iET_ zz(BDr9oGoIH{Y00)ifsWB%wJqtIAW6AZhqh!|x=6BZbosZw~~p(?fIIuqwP^PJlNN z#iP*1AlYS%36k5$pG*HJcnF<|A7eLehEhU!{6g2#bS~EQ)jZ`+zD3>Ua~9ttV!xAO zZd2?IDpuI2V|~$*gD{PEkrE3*h`%C297QiFl$`7Pfr8&kto6=Z_zmm{39Agq(vjd` z4?hAE!Wty16DTYWp<$1j7Yht*hV&j-2pK$V;WG)7J`XmV<=*c3dAlNu$M^Y#BZKCU z_Vb-yVJ({AywCn9jxt{xq-~IWTqWm433=j{+V<7 z_OFFt=xD}J<6j}zgZ$q!FY`Z`_fl=cdh>xuwfAGfs|Sf@AAo>SKGInH%`x*eAl$tRC&HWZ7w z!-0ruv?}kGbE5>RFLK!~g<}&97e$ZJ7onQ5ish{53GHW>3T5R`5migfxpMh16sPLO zuYvaS5OvL-h-RH|fr`40nv6nP_p2(-3pm+tM;t1$7I)Bg@bxDH^%R*5Hut^PN{9Js zWsN8=md6nV2que`8snbJn8K|)YY*bAa%RlptZKHd)WJ)`jwa-r;pYXn^qZ_2BSTRf z$P>?ZTha;2&1G^@TO8A>tn&mZ|`E5_%m#Mtrin-Q}>)f@h1IeX+hPyiF zPK-s%IZ>65YqDv1B%@4#Y7^gANI(fugI>g9jBwX*+t;Q{Pps}euP*63Dk$w+v zxK{?vak6#3mPPAn7I7WO_0dzmPTCvBt8I}t9bX`2vF(-c5xQ1^fs zz!1WL03?urp}t@SQv-_e`A$>x~ToQr8%c}KQM+ig&Lj5n4 z34Bq4MSG37`Tz%0wgJJla1P*s;6dJ+pU%2_ zH-jMfBVyHJI6Ul1g?QNrb?_>klM7)S}^zxz=zt{-DJ zkE3-qqqw|WNR4)spM_8!$=)p;7H9(fuE*uDWyqsY+8CF2yP2;q>XL09zHKcy=2x#N zntpff9g(|}7-;42b>e(N|EpEhQMg)B6fgh)MZEu>@>u>!dFofz8>}c_e0n$du084z zx=6p6DIV2f#543H@c83S)hfD4$E@q~`E@NkGI_^nDJYte$(-g3A!?{= zCB46n^GBnKEY~Y3U&vmQI2er5l+0AJZ)ABhMa88enic3~*Es1KNT$S$1? 
zot2!QUe1K&ak2ewjU)rF{XQg?jbfzT_1FF7v}t6ee9mR%wJu(!addB`n6o$%g?sDm zwu!D@4iM~@aAs?GS}Z$JWGz(#%ZHb7w>?xzv54qA(7KufV3u<3QeHh4F)jWcF?Mdw z$hSVvmKLJvE;|BQY>`l|{M5Il1&5L1knYsRy!d>|4!2<)X*y^0kuW1K%#AAwBQh&E z`G|fmc$^z5hls>=582dRI=8z18?(&qjR)r`s0u3+RefH`iseFwM42Jfa1nhrY>mGp z{N7;{Xrg~m6s{OOZIBg_ATLHMK~W5XEt31)coiVjjr&|#D&z?c8ZXTBVu12(71J10 zm_o;6yW0KHsN9eTkEQ6@^mJusF=~m{rbtb01Xa*Mj3yd#hZrjZ+hGD*Z7|W+dhSB& zdG<-iWl#C62-6!(tg6YC&ZZ;bnla_QP=${(5{!J>&={;<*rv;S`MP(e;jmy%IasP; z+7+!O%MzxG(H5Ft{^1#`RNg2JcSQPggmX0{fuLIhoDHn#pi|TOQPSG{ZNx2e)Get5 z0nn{@oQPmJ8c!G^#4m>YHb80`lF;w0LjWP%ceg|}+dvbv@bR7qXfV0AHDyK8FPCov z)tQc8pfq5t`kiT?F-=wP$AssQTx3h<7(Ez6WE0ywWqce3Ta$&Qcb|tl`LR@KIMzm{1; z{YNWKi(1qcpHRl2rIr5Kb+g@td1vN{1%(7A)&81hGE_L?6m)eS&OKX}pSx z&0hQVd#&V;f~gD4pa$I+L{g<@qIqW@uM!i0LvP&ECIKWb&Tss`NjDFRy{gDpYmLky zhx1-ymOx}H+b}>7-6R~KHivm{Z|S%|h9^OHE^5;`t^D44`YG?I>^P_WmZDkTqlN*R5X`H^@Ndx# z-_>3fj*RVvYy-AF=5g9;GMji0`_R+uE_&8zI7(T+D0*&pvEDGW2Dc+)L)vh!Zc4Fo zTWzAS7n%^#KSe-M9~1x#Qlhszd|mZ=U#GfyAGf$#f_lkmvXQp^TJjp9&O9>f(NGk| zeI`VE+k}-rWW*Ym9!7A%G4B^0@E*{K5daL)Rv9Jd(q+Ckoil@CLjjXySkHfOz(`T$ z*v6TqC|2xORR{pU)wMK0(Nt0%r(8bz!?$uGzMGaA4xkR$IVcusNGvssh=^MvwaNqR?@>Bs|i#B0A0?jJ-0KVWYRb!?H?%Kt^) zTSmvxB>L)4z+%IdIl?E;JBinNy z*{%X%$?!Q!xV;{b8&bu-?W6IGJ`~-KFYFC9hXy6BF6KLMtL$!CPR74B7$=ap(1VIt z#21n)E;KvYW4xGktKVODRl2ZGZsG~rGTDj6cZ+DF+LH{Cdn_qIK7Lj{>qjr}Q)n_h z0`z}h=#q>WA}xI_*SlAwyV{YZ-eujDS!`v$6e<=a54Zc1{d;Y`hvw)Y(k!)@ zp?xjpYWY{M2H3l`y1%9BJEN(y{s&0Dj7q4@5$8?bQJMs9U9X??nXD0Q zqHb8i_krS2$W&6QUYc-+RWY6Lb7)hZiowkyP{4uvyoS0`r_v4^im>l8MmA*rxgYhw zDKIW-NW2L5<=k6T#%DJ#E2E6BRDdwPimxojD^LM%IgfBplhCqC*W+%Q85~)G%~{Vq zVuPHl#UW`y<$5GGvu8{a*LIn>h57fxwumr#(pG9wbnog-8qEE>&rhV&QsRz5o`6~t zp4Tqw3=MRggtg`>jh-fLZBWn~p}Xx*G&~pe_C{-d8lqg1a{%moYFg@+%ob_lHdE4Xh4z!b z&-V^fQ2!>2oTdM*AZ;F<$1E=4AJ840aG0_x7Or6XahB>gI}xA$kR|AFSh$j2si#i_kl9->*PB%PlkB0k4GizlO*P^iwO~<-kqb>K0wyZ zevsUTGELRS(P^ zMHL$jCwm1N*k&`;mloS8-}mE*=B(BO|IB_to}WX^mCJ+cbR1QmZOb0oP3L}KKLMs( zgzlw#D%-DdN51rIcpsu{#&l#g1hnmq9qvV}S5>;xoCr@raoCc&520^m-=A|YuMZxc 
zaY}ya;iP=&ik&wPw5aCg^l~CVqm0?=z5vF0+qMnjs3S$MiI^R-ccx@mT5`YB4xPJE z$egw9CONH$9WwG%HSwwUj-T7H2R{(M4W?Onw?IUZ+j-b0%eNw&cIKpKlSSJTNHc-S|6KuqLU@Ih# zp~WSI#e{w16LH5V3XLZ&9S+AN5A8a#!mqX*L160QP}Sx3={Mh8P?R<(p0t=@+WCnK z_Q}3^x)j9s-aLMe`$|D^3s5PJn?oF-Mja!bAzDr?^LjQm3)gL>rWDka*!jELfi(&% z@12_`05Hiuz!v0+!dC4;@DNdi3t>C2D-Uz(f{T6SX9{WcljQ-q?e0)^MOFvfG2{==xQu zpmMm5^~JXaLKWzkfIxlGvDt%4(o)!|D6YW2Mp6wc({O z?Raz7L)A2;Fm==waA1;hHY?q5vF^)s4S9bC+3rD-qehImVp7W4#ANmqi&_2#UL^M4kK|fI9@!1C7QA|CG8-Pv;XW-WV$tm%NXY&`&9p9y^Xkbgt zxVUQ*$X(Gn8idkT*0r~O$?|I68_vYm&Q{h?oKND;S_`$gIoMm&)#3W89hXVTb(hEG zmwyr+zdO=?nVx@KFLjtF-KyqHM*C>1+GIT2XYToVVr37_E@dDwl`=C~LPuus`kZ9< zbeUdGRJk}pQTa_^Hk|aB#dRsU%T#!f5il0Hs?9yU5G*+nUJA(F zArx71CDq}^xA7C?6+8ZT2x8t7D&>=H(=n|#yWJfzos`(>UlYS8HA?>;<-c@~OqGm> zHYjW`!{oqT^wpeB+1h8Sevr9CZIM-OwYA*zt-PM`>vywEL5Z5r$;u{AKb9lgig0pL z<&hv#Q-@)Q!@tqpxZxvbKj{ua_+{KFOc;Ecbj@gH6j{tzI3AiTv7k<@IF)bG>X(;x zw>rqL@@Nqwvw2j4a4m1I>#y2pSyXALBY-qmr*GWX2voPCCM_!BAzhH-Vdnb!Mq5&c z$MXyuE#F%pdC0-{T@B21<__ykhD5oY{wh_T(!>|g{DODkGUNjnd%e;v_FiJo08fi* zPLn5O8?a6ICZWD}{58KCA#T6pn(^3U2G_#6+tE)16_G-oK5uj~fin(gE#A8sJHc%J z^>_QcG7)A)Aw)UR_ARNzKszI=xPQC+lOoek@p108RL8v~ayG@mMOyiD)iB~;)OV^3IRHYIPzyRk z7J$HeU=Tv|F939=w>pAf>_s22%8E!8jM(WxAK1#4$OUkA6Cn1bV{E7K?X{U-e1_Wd zi1Std%J005y7G%x3S8WRiV3;_O*k0#h|l1d`v#$`**VqTU0%@JKyKtcNaT|h)wWVw zg?Af6tg%*;`e~4+6}COE(V9?TGrmTu!QNL8E9|pWMST14mb+J8@MCMVJH3en14<>W zs9DdmWATzMpKruD zE?f|ISFd&%E(xFbtfc1kz3{vZ$2Pxyv0~wf>Vqb-08N{ws5%pjAznK{CnGWYA-O~b z-Lw)$Y1DX9@>eAfS*{UpJ2esqrHnYi{czr^?fU6Z`>&Y|T#AmRE7g}E56dm}M0d`w zgi+FaqFFKA2xLZ2%eB|GHG7>%#@?rh{Lu1oJsOl4=rd_(epKs zso|iQ%*-JEUTnLk91*qHfPT}umVrgfh31H zb7s4@>F`1qT&vAR#BY4kyfSYR^)i{|)k918`*qTL>DUAF`HkmCtO&R*W2%_%LX$+% zqA!f*8%#;!lE2o3olE0-@pD{}d0gLGUibd_M6S|W+a(Z)`Q-l#%;)(tWc$zDLOsKP zu(UtiLhE`7B0tas!F|gfwoI$Pa|@Z-*d=&ns8nvqu&3^}b+UI2`);w)%jOr{7qMi# z%XUqRs`i~%j6}V?%ep$kQ5G9?$eoiDWiqo*xZJC|xL@9y4$mcYO%6w7>QHg@=HDb7 z%9w)8($I0U8^iH%G<3}JaiId^0KD<)X zohC8YO=t#!PjlL~tzM2_ty&{5LOF||usxBI 
zR1iki8AGJ^zhUH!bMPq8#@y;i<5afpD1TrN^_wb4E5I{|$Yhi9=P`zd4Up~hgVdbK zIj}d%%`&R3z5bBF?Yn2tDJMNo>(mG}HCKuP{ow?eXz)pqw(b=hVjgY_NwjJRB^p|c z+yLLb)B0s8LVG+@ZWml&a!y4>Kk_a*NKWMUxqCwAWkV}S9(gsG?P}Y z1#`oVl6qH%iH+3>nN`Q@>}3`mK=MHekZ%cm7aBP|BRxj_(Svz*zMm>mev*EoSm>km zbAf!x97Z=$B8@4XHGq6rDXAJyOhE+Zh9DRPsU170QAS?8cNkV2n(KOZ49iO4pl(@| zh|wUpCFm46h`2cohSG)lJ+ggh_+>(sD#d!sO;g@v_ocB?lx9b;j7-#X zbgfFlSC8siyNv@f1QgPW6^ItKAe_;1Y$D6^=$6Zar|o^L4+tr|}r&(4n@Z|%TmccLEr>zWCBL^p893^2kT z{6oL5%tJg|DC5AYE*?oY=%von*GGNjDn4h-bJLKS5I4H?b`I(O*LsQUk|xx%G?Z-= zP`13#uOz~qhw}Y9-ALFjrMn?9mz6)7_bc4=K6ETqs;4*E1)THQYyDi^RVqgeiC$_# zKo3Nz?@I#s>*C-K9^DnUny|D}6eqNig6Em%PzRaMe{ zek1Za5W>|Ypp>{IE_kay@sqafDxOO#>n4jPTUM+6YO98B%BJ03!vW~g?f_C7N&hE$IP)J2 zSNlgYQ5AW3E6z1!DQblwg zKDNX0SdVK5HMN!G>rSim7Msi;t%GUp45J4gRb5&p!$1U|F3M3^{n^rK+VOt#`mm|v zRM$z%c+|NLjbO$87CCtbJis{fv)6>hN_GQ*<<@H1J_p9tXmynTT#Q~@6}(kimBL)@ z%+TFy>fO18$^pIkt;_^23-)$J9_ZD>Y3(VMbLxK`<^ zl7i|vHD{u(`?zH~IJJJUTrhNHwK_FwU28TXV_0)7esAya!gYWbFneN7B@(;6+b46p zb$%YUGF<6yZYce20eF}u;A>$67af$`Zax|aH0v8{dFL6k+bh5k#QBZPLgt^B2u|JC zm~FrmQFHDSSI*#?>{#^pRF~>x(t^}nIhN7n-G*xFt`$T2qaLze)2Kit@6{?e4sSrv z&x&dc88H){BFi*yDXN2eATFtYGKB@;=WdaNHF$J+^e7^-sa^jRqGPE^EP3ni+>`qL8PIXHRp{eaIRl-E2>fjByah0LK3Bb8Ev(OhbWBL5`rt(hiR` z)Mpatt~bJMWoDBL)+zw>SLvXg7Mne8Y_cIK!#zU-QamWaUfC0rouWn@4>=}uiMFbS z_CQIFL~S%t@XNKcEOOqsXGd8HXL_b=LtBvUxhw zKXI zMD1uLv@?uk$eXlsOn)N}q+@5ewBD2LYHO8s$&UdCorO16ToZoHE?b^tG67hv7#JH8gqe&m5%(sK&*d@X!%DH zjwUJ1iGdNp5iYtNYX0eO?YL$glGdahn+1e_F-2^;=4!XkZpHU_?t8Wr`gQV};^gc8{x)R)3>*Igo*qaUUW4 zhkAS6%Im+4B_XdI9c>A6~}`CAs^uaD6PcyUs{jh_!ns8-g`c zqa9Q_Rjfadg(H^E_4A4ejY^SxvEk?%tyF=X!smv0TtDV;`TAX7^}+*Iwlw&n*=oW1vg z$OUElg^Y$sgndp}xQaS2S07IkVY>NZKEYbTdX09E4^L;i3>5d6@#;yV@S(aHv(%Oi z`g{(PDZ~FEMnxODSJGj=XjVqGIMS8izDXvl7n7lwqTg#1?$_ME_<+$eKD1Q){$(Kl z^j-ZTuc}?SjlUOqo>l}d;V}HyjP>W?!uE=m&ZA#3kQCj`lPXZ#3?67vjW^^~>CFGd%e>N!Vw@$umh>z&V!ovH5hOBeVp#~d2-J==5g}Gyt^omt!2szGJd)D zECL-6=r=<9$DZS|1XRL|gI8z+ERxhk)OBz4%=|s7TCkdV+ 
zC+_pPQ#7+W5*Avz4IM)fs)E1L2y8-K&=1~~ocu3*N8_;w+F2|*_wPUmNBRh-7L&Nw zNW7Qy7pw!n5uMHObQ?Cb&TT8Dbh4%R#7p_)?;OKbx`hQV0OP zj-@5~#)?zr@<#=mNw_4fWJ;8}1Z;N+Q1hlbXdXgCYS^&LZEA;JW*?uH$^z`W`$bBu z*j5b(q;_3GvXd&f6pDxK72{DqhM+{Xh9lBA#`Rv;xMq7Sn|QZM>T5``NT%Qi|LD- zw^jZw?~nUSG$*Xj<3y2;3wKttvgizNzfx)sJ2h7%5Iwokk~Xfecz5oH;4 z%+qN^7cElK-aN?#CH(Ut+5dg?<{#e8pRDu$I&q6w>+Bt4KzEQQ>|ZGY{uZ}rW9ww> zsPwm_`+qP04dYeMZ3=*_Bi+(VzejB-umel<+MT-aM3b*02O#)Y<2nBJQB=HhDg|Gp zh$`h~Bfd+!B&d)zl#!BqwcIycem-1!eC^A%jDn&JDLSk%&(@#QYd<*+wl%!HPzn>8 z!$#ldu=boZxE+SP?Hh>U&xpXd#K40OJ?sk~JdmZSsjbLesfk83?a;%2=Il5t@t139 zu#n*IYuC2q!G2Nwif1!YR!y`pYi1!Z&dz-fz=qDJbCFjINMkT$N?&?a zOqM;tpb(@`58X=4sV7FECE%jZ$Qj39^u^D&_EnK6x^1bz`eywh%Al<@;&a4xQgPCT zlL(MMIILVNCJ9|P~(K6k# z)8b<(B#gJf)og{8P=$!M9L2C_if@4Cm@hJlI-@v3h4cQJ~+Y1RB5$!!p8_xZkaP!HwfEJ1%T~0j{$L zVSdq>)E$&2`0egjbwH12j!nFs?_Bxvs7s=)1}n6x`J%|J(qsSIQg~)=G$=8_%5&}N zBv`SIb%we|KkN0ack)_Lp`~{t-{FvZ@#_}pl@e7C^;IwEJ`N1>y<=o|3YPw{A=mIb z1OJC~kGC$-P#|LQ_9m53#XL!npM52ZJnwNC!F=VXMrmK@H&D$_wAFgE-IRn@-NB1? 
zCWDk|1Rmku%Z`O;%K+V)AH%_oMI_XhP>+FgR0Cfqrx}sHZ-}Y1#kXxMx>#U85i6UU$z@z{>~e$rB(axE-y3(@+;S-T5x}<{ z2pHd@y!NzuZ!A`1;VJ+1ayj2>ep+ghvd1lBMT8M8g?t|xg1Rb$Vl~g)|2@3`#)myy z5loDS{Glu-Sk{;!|Mf@FSwb4;E(OTd&Rv?M$F0m&#o}UpQTo0~O?PFb2D>@#lPqQt zb%R8hI3WZzjW<$KmXvsMu*3mozFDA*PB>(SEn3tOKO4=nm~N33-|4~k#u^;=B}f16 ztoO!G4TavbgI-_DG@sxwh=}UB^%#h`<``LE9dJExSrz#RFR*<7X%x3)wKLBw@c-Wh z427ZnCphr8v?2cm4gR(QM{JD+7z=#xgV1J=S&VlINg}Flvf^-TVF8k!)q3VgrWHK} zxvj3S)QDg(BEg;1^Y7gAi%)VV{y~PMaFsR!Bv0t12$)}DF3Yjx>7)JAQFDXvSkGDy zi_f8CWlvCJSnA5q4dC*{eK>;OLhB4OSryNJmyXL?Xh0@p^0$CYw3>4kt+C~3G5`9a zPW|!|(z)IE6^{4L`}bq=tGie{6cxN2iylNPYmWN#Gw+jmeJf}W-@Vz#TtGZA5k6dU z(1dm;_A>`tM3bxTKOG4F-(M&GP}2X!t`q;*3k?5ue*8m7`#;Nd@UNci;D6Z-f2=wG z!e_((F7w47D#`!zz3?YY@L#MK%=2_fFo0K$A*lb71w}^R(aG51KOw{4s{RfP{l8=^ z`k0X+w!Cv)nd@RGc*7DBhnE9|a?0Zg1=M$mETqH~CB0w5lZD{*%}8>hy5sK-cCG9A zy%4=`zAre7!3`@S!kmnj$Z~;hOio2GkWWfg8ribLrcU?ddU-8%88CS_@2%K4Nr=zL z3Y@+@_NSI>;ulCwgnr7HA4HrDkz4(#pn6#=nCz&-?3oF&PtV3f19D%``+Cd1qgyMZ z?TSGt(r|Xc$db>(GNZQAU{*EJ3QMW+ zC8=ilR?7$2iT~Ylyfm?LjRIVH(wy_Z5I+9Bd;g&DKXmMw=7-HLC+d5bz^0)52cH>3 z_&k$gW^a5&2gI!PbO92vRDERxeJScWr55^qAp08Z^`WP z>P5%!K;Fm6&C>zPlvIgJ4j7;D#A2GNvB9M z-Ws~>I|$heSO`nt39YJ>QlU&-#5Up_*EQcFdbTEa9XX_fx#~?=eMENB*EiLVD_+Dc zXHVOgP7sw2B(xw<-)`g$48qVf!IDULo#nbeEri!&=*{gVCUs23bL&HhmdnOnh*LDg ze^EGVF4BjT)dN=zgkNkB?&~G5_2y?~fa>7m;9goI$(r^JA{nl;I%F1wZbTW}xK`R5 zsVH$rtRAQndUQ&@3WbTU=cPYRwR>OyE$OpuDKJ+t(G*;)cMB9fUB#@%biZ{A$U`RN z#8xX0W+1eV#E;k%EQ%8zO0VK)qai~A6Y`o z9gEKEX6~~$0(Xf-*b{+v(eOcjz8u7eT9uxdA8H z=`bEGSABaF|7bUIL|^Jr}A?PsAWV-r+7Va&PqJnBaXbtGbRm|&OC-e4BrwmkC ze9h7k2gL;oMULPd*uu!0c-Y2^%xRMhjCIZ9JFM6g=8{s7xDGkw$?;ZBKjZZ%NF;?4 z`&Ci3B=9M|@3c_g+~E56Ftu`)hx@vKa-2j9=1OKzHT??r9w$frEa5sG8Y;zihD_AV zwikb3NC`Xbqn&;$eH`BS?#Fm?93D$F-Qdxh7fCQZ@L5|S3h^jDtLhd(yRG=tM|ko0 zIJ_?!tkp(t@guluaY*RrZ@(KvJB#CHB)&G3!a@u~AH^ousa!$lR>`82Y~!HwvNGyY zb{?jH15Ny<+iG~pClnio9D!~J_ZnKI!ArQYedA0QxNjbDqqxWEtGLT*(2w6gs!oa$ zZ7yWhvde0DqmNZiCr2z-u59%PwOR^~JCjr6sq@&5gvFlulRsEtDHnV9CK1wx(LU}c 
zqw13yi7IZ|W~v8TU6I0#XEVV1RTtq8DH0u_%W}bo0?;juRn99tKmXi;RUNc;8kEWM zTXa@bW$DSMgKrYXy+MkIr#WuuP*vr^@E4gRzmkg-co;_4B*8rq(G ze-8eHM)FsKb;7;QRd-+i*K@&1U?8Wb)-X$r+q7J4m*_ChB&tIfRLdbx@%?#}(4_r^ zk8A7al+q9kt!WcSNiex@PGKabt{-+@<+Yv#Kr-CTioxsEOvoeXA$oQP6a+1F@#ERD zieZXvg&fFA6O?GyKFlSsRA-1b_DCb}^1yZC3MdAgFC+N16_a|GxVl(9C>OZ$PWU*| z_$cIdmYjDEo%?Y4-1GncG@bpEYUf{U(E5+l+5g=H<`1#z|M_0{6DRa9)(aBw%WI85 zdR;id-$&Sg3|;>|!v2Rhf2J;fWgEcb`jh8H-HHg#{Spf%4othE-)|Q??iOU8k%FP0 zqUoDt4I28eZ9{ef({ZfSiOHhEw+1uh4l~n`K9a+$@v~4SosFb^O?@4#NEhqN@dgHI zwqz`)g@(<|Xw6dVU!XJiC}Q@DokI<8H?N1;I?4p2;8mMBntC?hu{2%f7qj3Lobih% z51K{paLSp3YG`&TP_IT!WhZB#u5?g`T-TRq6a=k5XA`kz%M^jAlByE7&SbK2 zP@FG{+7spchWrV7c{av$zn)xW{;U=IVO_ucdQR53{JMi;IYhr@#Xy^H!TG?9zZrOo zn~x9eAS@qz7%8ryUl2KQaC#DM9I=uI@1LN~UQ@+@3)rhm)7t3-?YhkXW8-%@-%Jh# zgSbl^l!Zx`A&idQlx?5?6&IqAR&Xr)SfX^T+1^`GUohRkGZ|SH;@2pTqpu$cc+$Q0Vvq)%GJtG2;(gF1~ zw2D29YdsCPI2>{gg_p7G-vlM(282JqeJy0ba~~5f6v%x-*b$c{6JG$6*18<>Fe%Jq zQz#Y4HWAUoW)T6syz1eHBy!L{U3)d51UJVW)Va#P`p%z%q!eGC@QP0L&U~Z`uhqHO zW=3(&Y+dLlW62!atFs0NY3}*hW{n6Lt%XiOh*3u|WSfG(=M|%@svQo}6Ium^eE5AA zBARHV1G*u-j4L3Bpv#ouDE$NH#1U zB=e}kuFI`GxHOi>Is@GZ`?GFgaF7|KPpb>%JeFaf!`%&2V#o0Gsb-@TZ$E5|!UMTT z8&PktSa`!!A&S6}*h~U@xHgv+eP_NwGmnu5V(wBU13EFU=&{P?d@cx0Z4gCj(5;Yq z>n;XrvgW5sE{HOkj0sTQ@{$BEk0h2!>cwN(5qg`}|J&l{6)KFz|Ssl^1*OYsp; zJ%kutT{|_YUrw%;G_cd@@mVP4YV&FlnK)W|E6yBRh*@_Ab-o_CB(G zPs4eiWS{4anuwZpk`q4n_74%md)@Zj8D=PWdAWeSkg*iUHU6`*>CNanf(hxV{zus*X8kByFaIW2ZScF6DZbn`St#$J1(Z-ct!v~5>!#NZC{ z{rAiH5>Wwi09Lf-_Rq|Ek9=;6s61kLsJD79Xkz5l-Jr}!L+4BTAL)2#OkQ@TikLX2 z9O>b8kx9=t1Eot)<8-PeFIZE*Qn<^;;8r?`g6wL82Y+(iDz_!M=U%F?_HO2krj7Fr z`+!W;HG|RU`mz6bQaqH(WO63XrSDVr3&M;XWWCWAv(^)S27D+H(|s|0MM09P6sC z+|?WVil;EWJQ2_NK}NXS+|d1PV^!iw@S($%2V(2XS*QM@gUY}eO=5G^o27)7$kFrn zdr0QlCXu@h0aeS*2}D;Vxv+Og+@upKFLr9&Y`uUOknO-2aM4{fo2zU8{s=h$jxxKz z45mP7cG=|{3~^XEB=rVh@5P+{QbHC$1Kc)(DIkHJVeuOm13|tOL_E)tcGePDea_&w zL(PM6%t1TB-%{x_)6V%Iap?OGW)+~5^N)}K>Qy0!tyutozaBtZHNeQ$P~O4T&XLK< z%-Pz&M&I0u(ZbHy^keB`3xF&sCLsm@0q#c+cmsTF10qG;%uN6Q8JR!65@-Mf2r2;R 
zRsp6?2Q~u11pu!mf6}N+g5do}SrLT#-`W7<@}aT_c9{3Vs+zinrlFCsiK&^n zg`<sTvM?hdua7buaczi-)QgTY_kF>n}g2JNWlG3u;y84F3rskH`-oE~U!J*-i z(b>8AUki&%%PZSEyLn}b3UH+!of6)sCs23=HpP`7l@fv z-D{g%!b}75aOBH)9gS3S`#fs;i!%WgdG##+=KaURt8l4xB)NzIRYjQ{?(>76(EAJO z7keyP@uRTCnY8uxm$ceFqGO=cV84#4v-m_`!Y$08Z$ zwUhF%fc1=(TexV3X@8j#pLd4Ci)ahm5x+NYb<#|1f^KW#Gr)0QVP#f-ZC?m#I!C%v zeGp3TPu5|mYyRnfO#dTqfdvKrh607mBW^(N+qe^=mI^q3pv>l4v+RZPh|Pgni1D{mRyAmbONKb#6OGZ~LapI( z({^vmiSdBOluGm0(`40&y2ZRQW+J(4ej;q16#H`cBfD$v(Rbs?W0q<#6Ja#w6UBxh zjsbws0PFB@i;;PA-3n}iNHNvdOzCrH`1Lyf&OlX}z+A3n!b|1GH{*Nv<-xZKm-C^i z?6yJkMScwZ+Hjr*I1Tz)?R4pDc45YS``e$u6}}P2Zk_T^RKQ-u;?B_3LWrW$uoX9~ zyg-+B`6Tvt@t!{b$V6*Tt%@Vp)xwcHto^ItA^Sa4V|raoQQ1hU<}-)(O=Ee?8JhJwIdgmHhB)}SF*(6~|rx5W!$g+K%FCtn&pEA#Dmpt9&jMHb;vXE|MYwh3A?!M6$ zsZhpLvlC-U-u*tIL`3k)s?4>rwJ&VwaQK6--OyKehwLSrOHx-U^}S2_A&ssld$%hX zna#Tj68Mv|#=CG!BW1@swtRISIaZ8`{z70bjl9;wo?f7$zQoHbCIABPs&Vjjjp>-A ze$MDS{rx)Sl-K+LKzz)9@h}zdJ&PQUx~Z)lL9;d{=|J=?##mD2pyB3a%ZoCba1{BN zeoKGcy}DI-%-(X`wkBQ*i(2pK!Hx8>u7h*E1sFq5jesGb4UpoqiWlg;R(t?z<@!r6 z_}CXWC`04}`cCsun_OMd#Ba=+&V7og%w1&p&R` zEDoCg?mFe!DXzHIOO??%LLSVm9E7lsNxtw9=>zww_XY6tD#d>=h2Xai zyDmE1NY#cQRA#wEsdy1)gCOb90s7$9IHVZ%^+6}*+fN@lw(oIut@qDw1$^E{@6?apuEndu)ovUc)+U~6 z&5fe9Stk?hrPfP4MqFV)-m(eU*)3Hez}T(&UgNR$9j7Bjq;syCPp*V*>bPU zO=QheH7pu9Eew^=ZWR*xWrbTdPDjiB>{_ZAFE#R&_l8n)EFJ23MrV!7ssq%|iKXR& z=2piTv4K`BUk#y>bu>UoUI*+R!MhqR`^6i&(0Q&7ve=_QwJpTL_C-5_$NE|$RUhK{ zES+wKrz{V~TbhClZ5bqzbe%nzx~o^D27IykoT&mYoOpo?CwW>{8fc)8_8%=4K~0+| z%7UlZjnE|G2btl+0`{G4mz4%q;!p1Fb(Xb`pW{z-lu5TEv>HqED8p#r8^bHg;wnaS z(IH3QW?4)JH)|gvytJ!`(1s{e5DO6oGwrRKu+Y{t%hlE8idn}{1>uHkhj?t0MW|1j zDIAd2+2x_Yu9|vL-N3dtL)y`(LpeiCF^cqH$@Gy|a8yWxlsVZNM#lDbGqD)@Y| zAoAPpLoY3)$jdvl(!RX0VlsoxvwJlVeUqr4_)v9JOkMjuc#XHyYRkW3Dt$|uJ^QTArqSfVJc;(#s*d=yOE8pDsl|yZ*P{~a-mZzf!!HhT+QeD1 zl;8N0V(&)3LVTsIZ4>rBUHRT1TKrnKo-!nt7hMtsCU;Nq? 
z)>K#i`T_WUyI(okD3z>U^ON!nVUuRkae!aBc>%Jz z_%xZ2z*}Ic#}!R$gL1(3lejW#ZJ3Y#{so|96x9KG#V0kdQN3YoWURP!@=EdWK4ayX z2>Y>jbx>+jZu!JGQ4stVOb&!zowsUT~a3K{+ z)U$Lfv&Ogw&ssskVO6Ek-#);PO;^rx^9F3*_jJYf4X}`NL`P&U_vDL~zx?anxBIC{ zz*!4%C#9wQ5xyq|ey-yh1?3bUab>N}R+Bp#q8TL~-lVth;u4)`*5L94gF%tPc z;~piCU#mG4;@F-z*w;2ohd#Fpp6b%0O4IPj_=xBY6ZR{|$c)50pRYZq+qyVoB|~7s zr3hD|&7VI2V3YJ*_!_T8{5wN{UBLYv+{-2EY2!U-3L#R{oC0HojgFhcEp_m=5iQn; zVB6kPr+yI6VWBrN|DK-h^BvJMPG$IniofgrtBkQ&zN+IZQtIr>fTr_C9cNOm`;{dm zk3{p=Xty7RQi^3tZj)%^P-L)<^9JqrhrOMjIP1Er%eKebSU1MM)e7$Q01UXdu4m%>fTd!Pr?+x zOXG#okeUuo$5rM<#v;lFU+yLOCP!~ce|Fmw8B|O>-|hi6H~!*6qH@{KcisMCvC2;V z?(3M~s}p8R&3K)+T*YrwxZlK$ois9s!DBhEAmxpi1T2?NNjKkxQ@ZYYH$MPE$qgrT zDu=F22*S%l$%h{^Z(`Re1D*_q($;8$S&$1%6;+hXPYRk6S%~92Z`rU2@CQ0Ed04oA?8NW$8EEVS~ojhS0=oXCq z0dQ7gana(yH3KjWBme*y$~xa|Qr&0plMlcnUHb=MSLLYHPAluIL{V$$#*@Yim+<+Z zZ@aVDlKloQBE+dVP>xTLTFpgcdC!@lL)V;JQ7tX_wErDhrLANBo#?b2y$HJ?S)|P* z5r*^Y0hoP!*?H2Wi@z4Yq{H(amEr@CBU}9e@Nb)X=McC?jTTEN{{U=N-F^U~6!(hc z>hkwg2tls+T$|GT&Ego=pP?^I(bmFjkl`VcUe70+cj3hcY~6f?r=1WKdEW?otgB!P z@cQSvE>cM3L*f$4f_Vk8zhxNIegOW9PQIcGmCm0fDhz`UHV%L;&YjlAZR1Z ze9XgJ(FlB}TzaT0koABQ*%apwz~Jfalk%XG;;WnsbGNhoE8_7N;>X4V&p5E$&S_8!Vf^;=AiEF#@Gh{qyYFs-R(WgHkEMn)wlysv3DRu`?poUa|vg0y=>Z18-D=SUnXDP(R^`yr#zWn zA%S%Hm>J+apewTLup{LiOyUDDbUMc9Wt+dWmQElBcgowdx3ghM7nu1zeZbu)rsR2P zr{WS2&!Wk{dienmkg@#$fIoZyUJZ&@c6~b@pcu6`_&q+u{pTahv$=H*xQbDG=K4ZVK9Y<>1EEqvefeFnAkeWdCFs*CO^ZTrx< zePg`7`;b)%{H6y@Z%e+H0Uv+?RV6Qp>!87!x1e0JV%1hHdq?81+)#4kQTQK)((!^@ zrkf4!#zQe)pN(z@O5oaf&Kq$^n(LZ_^$)zqMUL<<_!!amu6Ro3`ZwsN+r>9Xt|lXV zratGbh^M4c7WBc9#7U;^>bUnneJ-6V4^z^a8B3{5ur=L#e!@2RQh2t1=8oTL(e_*w zQur9KW7_WvcU)O+se_z`XJo86#*v><)4gpYQOz>VZ)S_NewyNPxy>`f!JC>q5|bSC z0dPkB89PE-R@*WN!i_xwzLUG?^?%ws>$s@4?%@xDAkxwT(jwB`i~^F<-5}lFF@Us$ z0)o;KB2v=b9nvM8(lOFCyhpwFhw|KepZov&96kp)Gqb<5d#|(bc2GTM$Qa z((-0^Dh!vec4OeQ1stCRgJ*(?KojD=4PAm7(l0@sqzaHGWvBzpzY?-b0d%G)^Y^cy zTpA;}cTTLLri#VGiC@Hug-mcMD1VV56z`zAmpb;=^b&*(V_&P1P6~XNnI!u4)*vmT z;Xo;Z+mgIrtMVtpV1D!?-x<`$dpu6B-FZKvZ^6og^X|zENo+h+yxVV)K 
zE_6&e*^m5|dGSv3WE{{Zb}vC45koNSBuk5cP$7&jLA#_oDeik1q4g#C<-4u>7ZbE2 zly%b;lWp4rH@B7T?It6>R`FS&&~ciS`cjyH?xZ95_N)4OT3&))CdXZZvY|K8;MoF^2N<1uFQmB>r?8C~xziXvpIKnz z3qKgDx)>K)qN$7J3Tb#bZ8r8%hfw;U#-slZTj>%iTZBKnF8b0~fvfCdd*nIc+?gz* z8E3r1q?%BFD_@9z0}B;Pv(*&aTBP-GQO`!a!s%y~IN3GIP44364?~$4o4HgpbafVG zC1nYpodlxAkI$eMI-_yTr>cI>;tygz>6W(Oe9&$QV#En^zn2=3I+B8O?sS|t=a!oa z4SGH|1St`Dk9el!fW*8t19t2Y<%&p;KN%9fNEPJNuuz1Uu4B$jeyOn`i1UBL%%h4G zV96v?bI^{{7YxC!k%tN{N+qkMez0obV17HMHr&m@ z5ti%Ohp~H4(6!2f#K$ia;x?;kadCL`O~9CRBCf^aqqg)#xK&&z7Ob$|zjJFxCuvSM zVulPFR{p9}-^SulS?Cu1S$P!SB0;Ms3F$ec4``7En2pXQXjcj*SPD5Jhgs>Mlk&7m zrpctkULEmEH>it}GUaGQkkB$D&F)c;9D*+gFWjMF4&C!Pt>ppRREj7kcC;?~#2s6u zV)V8c@X-?`=MoFGG&-?^76MW&yewriw@QW>AH@jH))i@1B~^PXhJ~dlYB;K9DTKzZ z)a#<$jo4h&A5INb2c~xatXAN*j2d%Nsh7b=_eYFw^rLlo>&e)=FDc2bkJ`HxH!t+z zR&^4FfEU!su|DZn`kdWEF%m!VZRlmuZrD)QU^;&u4XcvA1f}Z~pXNRV7r&eOJm}QT zz@-=@R|&-k<+!y#c9HDGYD3#QnC0C4#30JWfExi7)lN2QzD9M?<3dq7PHu=nr>$wk z$iu+8O(}NmP1;mzl$>}LnL+Zw!0_lL2nxkN+c;lOfBa+`Ulxf#(6BGXlMgDZyQ}J7 zLW;a*MnA^;l(o@=8kXQj*ZolZk&}lKR#G#)I6>bdxJ4WJZ75grv-nY1CRG~A^4s${WVU;RpF(9F{eb@X`H24H3P6Z3E#fLZT9oQ_Ca*R-I{rDd_v{&64FG7k!11dt@reM^Wd|+24RwFRD)rq zw}K^xr-SO}0RKM!!`HJVv)wS{72cErvwA#5d85-$mF(ziL(QJV>V*O*w63S~{1! 
z1SyZ5pPz&0QUpC!YXv!=%YC8e{1Y)6o(!2RHwG5WS~GILWyP`U=xb+q^&_ulaclx+ zD?w&;SD$c7$wK^7_7I}3xGLuY)c2;dwiEr0{rcj9l^cWQtX^h7$l^YCDZ#jJFeC6=dVg{0l{f)>C0PN-CNC}c3!djCsGAC=oE^O( zqrP7s2Sv91_`2Th7}ChUTsj-nxv0Cv=S9)VURtFdYp0(_Jx7f9GOcm>EvqI{r1$&8 z;{0kTMCUj>Q-iTOy%Wdz)%dzyX(DwGvvp;dwE4FkL^i^X+hQ^fDI%Q*x$H%yqn-x$ z5^~Msc&%SsWz6V4J=}BSHl_9z=F?01m_G&a$#(lJSrpD|CsCfYGCr`RwEAQu=l(*u zglBbPUuZ4|!;|e&K54u$i7-)uW-}MOesZEZ=~KH#gQLomAYzsFz=it$2?LM$BrY|H zJM3ZF)8wX2_*C1PaFq$p$PM$VgSoENjWf6F2V(m~pls%!%uS)PB)#>A{F5EI{^=jt z>>COQ1Z*GU7N#ES&Tmdwm1v}dQv4Xn*Oe%8C8SQ$tn+0}*Iz}9(3yV-KHk4)>PBIS7x`c$O(L9IvA3~FKE z;jAcBx#cv|KbyGhNTZ=lxf`u9F&&prg7SH^u0TM#F1Er)l^P?HjpcXy%EfmH+#QdK zRdtfxW5C7M^{X@vCyxQSH)n+@B6iGqOvCZT2boPWPiriQr@JWK>LmyRXz_ApR@ao) zuTA@vUr6ezqG`h}2Gw5+T2~Tm>9U^N6?@UNoaXy6uGEwd-B(Msh})yOwX50l*b?=8 zd%Vw@PgT&)H%BPywUl~E$=R73ss8+^g%no2C}!<2u<#;%bz^8@(`;mD)_JNzZ9;ad zNGR+QWM50MF_+cKXKHB@E;wF(_-R_yN@VDRdhxg7yyX6ML^bUGFl@DC(p^YMjR(YO z{t|>@b_trm$C`X=jA%NKf?XbqZb(U$B6$=p+e})ToHbv;&*FlLtZz&cP#>)FNpfdN;skZ2%*I3MBV2mFpw8&`+n|#m} z2&ItVnt=bbyFY;E?9WkCpVyCvzp!t*uHkB$XKJgLyZL-pEjR`sil^Kb2vPB70`I;B7Hw4ItqL_bdj1CN;fW;DnD9OqI-fwbK#h92@h!YAx6<_~f)wqER<-k5>Rs?45y$5`MuJimAWsDECX z5a=n7nggDr?g@fAPz+$6&C&`?`e+^-i+4D<%2mDaL@}_GlFT4-c4T(zvoOquYxZaa zSQ4oeZJ#OhujU{r^%RdE@sJgh`;J{^Wk#sf8IQ~$jN z#naCm$)65|c;nMsF5} zqo;<=<}^(wdm@!;ReCpr9yx@W@VU>a)E|C*2UIN4&e{4HpsPH9O$~!r;eopL(zpbD zd!oAoPS!mpuJ37IABL<ECuuv}XG z6XHZ7>Gds|oW0?}41KDYw4|~ll@#BewHcJqy{9FyE6TObhAPWLpii9*U)mgWxS&}Q zca2(DONWn&XL6+3$5;ueU2ak$6z+>C-)pL!6dgT)%lP7(I!iiDL!Rda-l%mIYiE#z zt*^+2nU2{N2JDjLot!kn2*XVx{S%pJxlBf-VV0+L@B!S$%yxbuH%EkA&s+6I)Eu*MXv& z29f4REUc}HSr0?y2XyDXz*~!ZHSd$U>ld2W{>L~gw_gPI2634SA{OqG<4+Vie{E<;c=PAAr7}Sxhse8deXuc~Vhp@nMhgAP1e` z*RHB4t+%hR*q* z()m`F+lCf7u7*LfBCofcv-xrx^^1FSkO})d@9uv_TEK6@%5A^Ma+-fL+N_mhuOom}b-bu!P?pkPui;Kb$66ws?jz4T!0MEzd(i<&P;Jr&hhQH? 
zu+<+LIiJ_>XFx7TE_|WV%hQ3rPEPtX(YE{yhS$1Qq zoa%)`0ExsV4~I%8BYtP8pDun|6f#x!nu^|o<<4hT6;z2gH*1K+4a2nw1h13<_O$18eK0-*hzAVKx>fDRwv^&ZmKGJCvptlCr) zUvJZLx}2h4U0(rZ+QJ9Y5v80hgO97gC4g-E=f&advFe1$`8kj}0sxTi7L@dz90Otz z$tCDExG?ANTmy1+0Vu|xmay?HypNL7El>|t|Gx5WnY2C~); zX^Q-vH$HO4R0A&mG}{9pq|cARP$pnIzeD%jzK(6z2ZH7KWVb$$L&{@p=)g<-vAu!= zCRXs0CtEw&OT>proqYv_1w$#n5nWe;>rG~Zyom-1Y7~)&MNauyXWH~?EZl?}!WIob zz>{;gi5bjhx}T`radfL3J>;+HW6M;b71HjNQr~52z$l+_l@*R!V8Uvqc}47^I?)*P z85It-lZi6X@ZQ`$C65l#E~J4W8UuGj#a@~&yBtrW?nAo4F5Dg?GF7s_Q-0Ucd_=k4 z0$;*sB|_FI*K&EDO$S)GR_RU_+-AdQyAs?rU>I)S7v)G1n$9Q>%S03>e$G zQ^x(OZksLHL)B|{NUll{dFaI#Epdp)o-td>3|H>k+$~vPK%*BdX@)!KmJeqXDTEx> z<7rJ3__um1sWXt8+2aQgT4G4<>mof)lcd^P*P58|Wev*nsU(aPbmE4uN)oOTPC!l*n_y4&6>y4OpJQt-AZS zOMo#9^z3hFq`N`KIWS+9lw)=X>~R6g+1+}hqvY8`7*X|E1ooMb?#XV&Iq=EF7>KFV zB}fn-90NJmc7gnsEFFe<3BrZ+KLiZWh><`yVTTx~ZXE~xNfDJjfmQ5PRtdG5o9nIh- zChrb1i;97n&)wxO!gb3RNGFf2ceZ&jw(x-R{V}pm`WBNv>Q#NJJVP5!RxZgaG`Km5 zmiS=0Qy^#O_ul9bKMUNpeH@|oeiUJ-=C(5WMFqmr#gqy`*b6M~Zy9xaGxK$9jdM`D z7A5PhGX^JVCqc*EodzbjsdJuwPnU%PJac!A^;kSr#X<}2`3R{Gue!oVdmc@V6F&2j z$YvBoIw$kMPmnqeaG?4oM)T#oI|m_TNxmuWjFVY{u$_BJwY+?DXHKOsQpzGv>5CAS z!t9yCUgwiYq8sgBx8y2_9uW`_e6%`aJLhx?*IK8ipi1QpAGj^UDz@Xg%IkjnNJU$8 zVt#DS6?+lYBie#FWtM4A(0w8w@X$`40>@GDybF+YuUX&lcKfw$xQ^mBx$TKu`W); zN<69TMJAP$0)I)E&rf1X$x0HPc)C4{m+l-M-D_X;T=tAN@&dk~A)vL=Z}MkO9vxTj z!rJHK!sd3rzWtzWjyHyfHN9Q^?L|&WQQITtcFakClE*wd+NzCA$EPEgpsApfQZHff z=Eg2;du~RT<6v`H#M@|O^9UVFf?QZ~TgXqMfx9y=4`Z(2`8oHHVED=-liQ>#(=RLA zXZrFyFT6=}q3C(snfGQXcg8~0Bn)TeT4Wq@Mx{+JHg-I=-nUP3Mxd}zSL43Hb#6oV5gU-Xl=Ow1lzyDc#Uqlv1+Ctw!1A*lYc+ zCfXtM5uz9Z&xMI<_$n92NB?x$%i>Yf76b6U`8PkpIDP zJ_iqFxXU%DVCmuPM!t#6uy?eLi*vqZURZWJj^-V`dKA|YjK`OOX&(;4@~v|cr| zQyCiJco^X>9FY>HMn2^_gyD_O$lL01_L&;I*&3JnKozLPz3TpaO(pHP8gsBW+iPCi zdBk)1c^4=Bw{0VrPGy>9IA%@0$9iltJ_7C)HpGw~V1H8IL-r!RVh>s@quF7Z(lkRJ z*2UU;>X2)f&3!Y0n{Z>3hr88L=YE~EtjunH<{fr3mu$5z+%^*7Y&`;JO8$PN^noWh zAo;5k?sDw!y5$2owdv!q5`7S0E2dtT?*DEXeq@t0oE9oa!UOHzBr$> 
zt?Fw^nwaU>>8C;e)}|=pod!bE2Lfy})uG=~LV3MQA&80W5R6HS*1OURONJu1v~W*` z3cPFSP@mFvp>x=DFV}aqTk8rr(TU)kho9x|;0bBy`gNGS`2xW30L_VlH80Wa76_Tw3n5&xo zdV8mW&gi{OqV0Q7MTN9O!qUn_;iwV~tlZn)?at1_@|^5EvX5+Pimjt`eX$0VgByQyfVsyVxo`66JCDCog2=j9hUDR=}N5F!Zh zb^*jT(R`g;NC@~8!3O*#;{dPN+ByOcXBb%-{rvsosnku4&yhqNIQ4*9ujU}j#&(h) zpXRABG3{Jh&9Ja)m`KQ#4DE1s<`Q}>$7AmjL0>JO@;F3ZR%1l@Z zY3@m`;XN!QGBfb--Z@!0N>0&DoN`^HxIc%=5@9yzQe%-1FUOh%C(~5taOtSA!Rv$a z2H|Bt!a)+pLN5`QE@TU_hLHk?l582Y9DLM)#i9^WJzW^IHY?DE!B(Iaxl|uie!rh`nMrU$<{(S zNU;!m$=3MnlYNp?(op6>H!5FZoJQnaCZ2vKIuHsDzWpU?O6hXhzVgwh{Ze8SlYmq{ zXti`fe*v7x=Hh+rgNOTOpv1mqNj65K+%fZNZE$<+BlV%Q>M`iu`*j)hg$4Cv_fDnZ z3m&T%SJtI=4#gxwmhx{i>LS(r5hy=XR2f&oGiDKrtrgHor|Q& zHJ7mH_E{fzV$miF7nJ+@>f1+5;M2vuqXwH9+EpZ`MBh$xuzhM!`KTMgv@Kb7f!(iyo<&>;9s7eJ{c1=s z>i#qbRlgq*#ZUyso2U4ka|t;?i1X3Gc?e~o9g#vf{)`w@+T#4Ri1-}K0OdOqG$uK$ z+XPK)lT+y=OtLE3VIG!>JfSZi?*)s;HDgkXgAT$Nv5JCgpBL(n3G-k) za2FwsJmDFhZWm_2T<~o=Wb3pT@-t;REvqShNB}O-ac_=6FysZye1Q&6}BzdrF-;Qx$hH7#4rFGHv^~P7~a% zA%V+<*Z$1fe-rt(+0B&Z1DOT&W~#tk$-U`JLe^-)=!5~h0>e6X1zKbCp$Unp#lpfm z$viPE_9n@LN&%g*p558mF9r%<=fN)@_c2gnpUswirlo2_nDpTZARrze%)?=ss>hX| zlz5SsYmeeEWIZi6OY8nDkw5*$sm;TbIF&su?^4Cra?Tcs52vh+*wW}53cU5%0&|vb zbrvsu%&+`t0gH5!9`UaGXq^A?ZI1a@-5lexn_+c`6IzF+40g+(`)**{Q;Z^z*lM5J zRvtaiCLShXP|{-0d}aA{h_K@2<7vdO(DK4y8HzVuX8Z!e9S$SpNVLmEo`cOQ(^bbB ztwHP^J5Z&h%pJ~55YD7sxpG*yg0x}!R_`H2iClrlyE@BhOu3>y;WYNQ5^4J617A=D zpZn{4$^5jcuIBjqR1)u$;V6_p5u`M>kZ0T`9+-f-_UME*de-%5P(a)0sl%5#BR7H8 z1ABg}mk$CGB+_sKU-9!h2QhsatId-_bML}AW9OZYo40r^;%pBu(Q}VmHIY}ckNCZp z2|1Xn(>PHwKr8d1cvHqs&lBr4NR);`TPwl02hZ2U-oibj$Am{{XC3eMCX)!!lC_Rk zeVtg4xcl+dkc|;jdh3nPPg*<5=u(U%b7gfgb9o)9B~nBhdEUIjK zt%~&Kb()7oYm(>dohOTQsay8~sKWKUGh&L}5XZLsWXwq7*S0_Qc-OsA|B$%f)V^ID z;9O+fikuc9?Qz4+Ez2`xgcC}XTQpUhu<4#)sOt2z<~T$ytt_c0^*w>+2W66bacWpG z@9Bh!*I!m_F6wcak02M)(mUDgOj_}JJfMWocaPw9ii%Ts#M`^Kupbekjk-neTE-b# zOk}`C5vSj5XbgDzmUD4QmlR)OlRM!NiB3T3$jA435wrJ`g}706>v0&VI5|AXbGz*m zZrI89B+j!1;i^-X$8>RcsKlgP5D;6{0XI!#-rkNf;X(es` 
zS#&7RN%Og=72vJBMxxuHUHI}<22I+yg%NsU6G5~S4CR(U-4R`|>z?oOL}HuMN3is8 z=_eSj|BCA6Z6S>(gWqU$hDeXoCrmd>Eu7whGm#$oNh4nSwZem77oU5MWiqT?wmUamh{S}BOfN;yO-lA> z^*J`&oiW08?1E9b;~n}>J5vZ>n762luJcdS5Nf?K>ayzZkve#JS6vxL46hCilPJ1G zAgKS4`lU7WIC3kn@j(suLE^xBQW~``^`2DJyVyvQW1Sjw@dA2j#tE}6fXwipZnXZ3 zH1YGf{{Kl6|C1*ECr$iMn)shI@jq$ef6~O?NfYB2;gPw3G*Jee*ncYK`p+zl6dm0x zjU2uQFGB5+K=)%Ncr;%d;$jN(Vi1<6!f7I-b6JOb7(>4H$Ta=V;|KSx8}-$#17rCqt4h=i&0~+D*HpeMBq0mYh{GYtja(o7$kIw0vbvXp&BWog z+}YdgltxD?vo9Xfdp|1rYxh{`W#GPNh)SA;N|0nasZ=0c=MK$ny#a72DE zD*D*e`6g3iqlaQkAvO|^UHPDBuf9#c^MWA#mf$u61oh^7mixB~T^Gy|S3J*RJq1uD zJ5*jQ;(%l*R3cD9_#P}Wntx?59~0CeQ*O*4*rM^s<*c!FhSZZ-?KiaF*|f(&49<~0 zKMSK>ioRSucf%ved)RoyXfVkrYuE2~Is6I<4&E(Kv2ya}SnhrR+HBW9rS%9=+fnwV&NfvkG7!{cA zhAeveK(n3zzBm4CR%bIKm;Yj-l6&jE-HUo1a9Od|h_VFwTn#JlyMZ@~H#Fj(u2HW8IEx~A|)c#=de(Gl@Wo6?dvl7!}<0$zcw{%E$SQb`Fd}$#c=W^Ip;mw z9=>B27^xOpmrkZbwh`T?>wkxDCl~6o>z$Pg&iNuXoC_bxstr817TUbVS$MuJYH8Jj z*7mq8kRS8%c}~3Sdn}{o@zHU{k~h8SPl%qC^VTU5p1-wMG_HM8gXe4^nAmlviw}|K zL3EnGBkeQY=tCK{8tku6fZ$gB(^GcEQc2sxxkJFz;dRmam z_@d|(IHMIvgX#l#QNNy$R?<{AT{ujOEt9OG8I_MFjNY`iHVA0J5T(NFgQjw4TlV(p zWL}o8F#&_q)05u#{&-;`jYeb&`?5;U($6eZ_Q1Q;1Qee$};NZ*5w;pMR;fzH2W5(oEBo? 
zrDo2VKEX9|#2m3?_U=4^(e`A8&d&n%#`MwxMD`gRWa39+2N&6EniPbg_L!sz%pW)h zDq@g!8khE&hGKZe*S}J?rczf~cx!Y+iR47|iM@$oiS$L#4NoDFuEDG~Ml2hB>TSg_ z#Vb4SYZFdieB$NU#W0BK9SS=lls#2#D}Gc$G7D2!_{;d(2N;v`-@RGq0Oy}pXL}=O zGY2ypYYA&(o9}%y>}5CLwHim{C;;|+U}m(rkcg<8ybqy$g2nyY`sb!YZlZ_Z$Z-cVDVKZ_IeP>xwR~rbf-jncgQx+DV1BR%5-cdpT^4)gB(`vl@Ug2p0%#^>?y^n3!%;?c~3)9?q~kE z1Jbm7pK9JKYvT}h)A>&#UJ5$Lftjr9fr|Ri@f~8&2f+r$HWA6)V!D^LEG7|yTub}2Pg@`5bLq2&pFCrJNkH>BeX6ov;fAN=-%d~QE zXWKDibxUTcgk)3m`Zx*Ys7HklBKSUJ$2SK{IiGy{%J{gKt@Z8jZR1WXEqmG7dP<@k zG&QD%cXC3N>I!Z)?+L2Y{NgL+QxKgmBf4J^?1uwrl$_FCj|r&E2~YG3vf5ebD~(1hOYihhx>*g~;WdhUv*8xQ;n`VE`uCpC>9 zirLUSw!ht}!nrdTHm535XFhvhFX5N)A3aP<2AV31q}8Js@x|MeswHJ;pxRo4Y_(HJ!ksIRk9IY4HxQYd*nA+_60Y36Gj$1}KpbfI=_9cQL)8M=j#N=e zG0-9fN}S8L_}?&xHXrs9mSFaZJJ;mjs_o3_G0?j2sBaS~lO7p3yXn!ADa#FtoOphJk0Zy4f3&40p`7Jda;J)i;V3!{*J+CJmhxfA>1rqcEiOGIT zs0-NGzVFBV@IC*h`S8cZmErI@OS{j}SiBaUJNNL~#&7AVr|>{f<)=`e)pa z$Z^+k@v2PHW8c+G)a$s9;{S~M;f41)?qy%Bkq>~|L%xn{c>2${AF)}kz8(XWS3R%hz$&<0)U^IJN#6$MPc{&voAQXs1`a zRlWbh`!R~I^ZtGScN+Z#@8`7j%V4_-HF%wTJz(EePWu3ll;5X@AKN&Je;K~-I_!GX zuPfM?`G3OxJ_guz?%x+5YVmK1e-%ONpUB?>FkKx-mj8y#^6w_y-_bCue}(?hv;K0e z`TP2~Y4dN;SJ9LHsXl;3=a&P;_TP}NA`)FEUyti^g~qb`h5V|eU@FTrU3vNo=V zw7DYN+W!Ul$FTT?c|D_lU%3tLzu^9M8NTvw|L@6d9{+}X4f)Cs`*rel zZ{}B(JLvV3{L414!_Ud$KPva{{p4QZ+MoZe*WBxQy1w~(1u-lAnWyVFc3mf2-@(2j zl+^r1seYNMekt2ucE-QYTY2q&6v3P#^{fFa$D`GNdx(F_bW50!al1XNDYxM21R+Vju*G ID^L*v06NkZ4FCWD literal 0 HcmV?d00001 diff --git a/docs/output.md b/docs/output.md index 734dd68c..27a5dacd 100644 --- a/docs/output.md +++ b/docs/output.md @@ -1,57 +1,112 @@ -# umccr/sash: Output +# Sash Workflow Overview -## Introduction +## Summary -This document describes the output produced by the pipeline. Most of the plots are taken from the MultiQC report, which summarises results at the end of the pipeline. +The **Sash Workflow** comprises three primary pipelines: **Somatic Small Variants**, **Somatic Structural Variants**, and **Germline Variants**. 
These pipelines utilize **Bolt**, a Python package designed for modular processing, and leverage outputs from the **DRAGEN Variant Caller** alongside **HMFtools in Oncoanalyser**. Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation, and HTML reports for research and curation. -The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level results directory. +## Workflows - +### Somatic Small Variants -## Pipeline overview +#### General -The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes data using the following steps: +In the **Somatic Small Variants** workflow, variant detection is performed using the **DRAGEN Variant Caller** and **Oncoanalyser (SAGE, Purple)** outputs. It’s structured into four steps: **Rescue**, **Annotation**, **Filter**, and **Report**. The final outputs include an **HTML report** summarizing the results. -- [FastQC](#fastqc) - Raw read QC -- [MultiQC](#multiqc) - Aggregate report describing results and QC from the whole pipeline -- [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution +#### Summary -### FastQC +1. **Rescue** variants using SAGE to recover low-frequency alterations in clinically important hotspots. +2. **Annotate** variants with clinical and functional information using PCGR. +3. **Filter** variants based on allele frequency (AF), supporting reads (AD), and population frequency (gnomAD AF), removing low-confidence and common variants. +4. **Report** final annotated variants in a comprehensive HTML format. + +### Details + +#### BOLT_SMLV_SOMATIC_RESCUE + +
+Output files + +- `output/` + - `output/${meta.tumor_id}.rescued.vcf.gz`: Rescued somatic VCF file. + - `output/${meta.tumor_id}.rescued.vcf.gz.tbi`: Index file for the rescued VCF. + +
+ +The `BOLT_SMLV_SOMATIC_RESCUE` process rescues somatic variants using the BOLT tool. The output includes the rescued VCF file and its index. + +#### BOLT_SMLV_SOMATIC_ANNOTATE
Output files -- `fastqc/` - - `*_fastqc.html`: FastQC report containing quality metrics. - - `*_fastqc.zip`: Zip archive containing the FastQC report, tab-delimited data file and plot images. +- `output/` + - `output/${meta.tumor_id}.annotations.vcf.gz`: Annotated somatic VCF file. + - `output/${meta.tumor_id}.annotations.vcf.gz.tbi`: Index file for the annotated VCF.
-[FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) gives general quality metrics about your sequenced reads. It provides information about the quality score distribution across your reads, per base sequence content (%A/T/G/C), adapter contamination and overrepresented sequences. For further reading and documentation see the [FastQC help pages](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/). +The `BOLT_SMLV_SOMATIC_ANNOTATE` process annotates somatic variants using the BOLT tool. The output includes the annotated VCF file and its index. -![MultiQC - FastQC sequence counts plot](images/mqc_fastqc_counts.png) +#### BOLT_SMLV_SOMATIC_FILTER + +
+Output files -![MultiQC - FastQC mean quality scores plot](images/mqc_fastqc_quality.png) +- `output/` + - `output/${meta.tumor_id}*pass.vcf.gz`: Filtered somatic VCF file. + - `output/${meta.tumor_id}*pass.vcf.gz.tbi`: Index file for the filtered VCF. + - `output/${meta.tumor_id}*filters_set.vcf.gz`: VCF file with filters set. -![MultiQC - FastQC adapter content plot](images/mqc_fastqc_adapter.png) +
-> **NB:** The FastQC plots displayed in the MultiQC report shows _untrimmed_ reads. They may contain adapter sequence and potentially regions with low quality. +The `BOLT_SMLV_SOMATIC_FILTER` process filters somatic variants using the BOLT tool. The output includes the filtered VCF file, its index, and the VCF file with filters set. -### MultiQC +#### PAVE_SOMATIC
Output files -- `multiqc/` - - `multiqc_report.html`: a standalone HTML file that can be viewed in your web browser. - - `multiqc_data/`: directory containing parsed statistics from the different tools used in the pipeline. - - `multiqc_plots/`: directory containing static images from the report in various formats. +- `output/` + - `output/${meta.tumor_id}.pave.vcf.gz`: PAVE somatic VCF file. + - `output/${meta.tumor_id}.pave.vcf.gz.tbi`: Index file for the PAVE VCF.
 -[MultiQC](http://multiqc.info) is a visualization tool that generates a single HTML report summarising all samples in your project. Most of the pipeline QC results are visualised in the report and further statistics are available in the report data directory. +The `PAVE_SOMATIC` process runs PAVE on the somatic variants. The output includes the PAVE VCF file and its index. + +### Sash Module Outputs + +**1. Somatic SNVs:** -Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQC. The pipeline has special steps which also allow the software versions to be reported in the MultiQC output for future traceability. For more information about how to use MultiQC reports, see . +- `smlv_somatic/filter/{tid}.pass.vcf.gz`: Contains somatic single nucleotide variants (SNVs) with filtering applied. + +**2. Somatic SVs:** + +- `sv_somatic/prioritise/{tid}.sv.prioritised.vcf.gz`: Contains somatic structural variants (SVs) with prioritization applied. + +**3. Somatic CNVs:** + +- `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som.tsv.gz`: Contains somatic copy number variations (CNVs) data. + +**4. Somatic Gene CNVs:** + +- `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som_gene.tsv.gz`: Contains gene-level somatic copy number variations (CNVs) data. + +**5. Germline SNVs:** + +- `dragen_germline_output/{nid}.hard-filtered.vcf.gz`: Contains germline single nucleotide variants (SNVs) with hard filtering applied. + +**6. Purple Purity, Ploidy, MS Status:** + +- `purple/{tid}.purple.purity.tsv`: Contains estimated tumor purity, ploidy, and microsatellite status. + +**7. PCGR JSON with TMB:** + +- `smlv_somatic/report/pcgr/{tid}.pcgr_acmg.grch38.json.gz`: Contains PCGR annotations, including tumor mutational burden (TMB). + +**8. DRAGEN HRD Score:** + +- `dragen_somatic_output/{tid}.hrdscore.tsv`: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis. 
### Pipeline information @@ -66,3 +121,7 @@ Results generated by MultiQC collate pipeline QC from supported tools e.g. FastQ [Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage. + +## Conclusion + +This document provides an overview of the output files generated by the pipeline. For more detailed information about each step and the tools used, please refer to the respective documentation and help pages. \ No newline at end of file From 800ffb8797ca27af03c4da1683ce25b42570a259 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Mon, 17 Mar 2025 15:54:13 +1100 Subject: [PATCH 02/36] add general doc workflow --- docs/README.md | 64 +++++++++--------- .../sash_workflow_overview_diagram_Vqc.pptx | Bin 56740 -> 56532 bytes .../Slide1.png | Bin 0 -> 42808 bytes .../~$sash_workflow_overview_diagram_Vqc.pptx | Bin 165 -> 0 bytes 4 files changed, 32 insertions(+), 32 deletions(-) create mode 100644 docs/images/sash_workflow_overview_diagram_qc/Slide1.png delete mode 100644 docs/images/~$sash_workflow_overview_diagram_Vqc.pptx diff --git a/docs/README.md b/docs/README.md index b9a29687..9e8cac48 100644 --- a/docs/README.md +++ b/docs/README.md @@ -2,21 +2,23 @@ ![Summary](images/sash_overview_qc.png) -The **sash Workflow** comprises three primary pipelines: +The **sash Workflow** is a genomic analysis framework comprising three primary pipelines: -* **Somatic Small Variants (SNV somatic)** -* **Somatic Structural Variants (SV somatic)** -* **Germline Variants (SNV germline)** +- **Somatic Small Variants (SNV somatic):** Detects single nucleotide variants (SNVs) and indels in tumor samples, emphasizing clinical relevance. 
+- **Somatic Structural Variants (SV somatic):** Identifies large-scale genomic alterations (deletions, duplications, etc.) and integrates copy number data. +- **Germline Variants (SNV germline):** Focuses on inherited variants linked to cancer predisposition. These pipelines utilise **Bolt**, a Python package designed for modular processing, and leverage outputs from the [**DRAGEN**](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) **Variant Caller** alongside and the Hartwig Medical Foundation **WiGiTS** toolkit (via [Oncoanalyser]() [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) in Oncoanalyser. Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation. # [**HMFtools WiGiTs**](https://github.com/hartwigmedical/hmftools/tree/master) -**HMFtools WiGiTS** is a comprehensive open-source suite of genome and transcriptome analysis tools for cancer research and diagnostics​. Developed by the Hartwig Medical Foundation (HMF), WiGiTS comprises various components for SNV calling, structural variant analysis, copy number analysis, and clinical reporting. UMCCR’s Sash workflow relies on specific WiGiTS components (run via the HMF **OncoAnalyser** Nextflow pipeline​)t o enhance variant calling and interpretation: +**HMFtools WiGiTS** is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. Key components used in Sash include: -- [**SAGE (Somatic Alterations in Genome)**](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md)**:** SNV/MNV/Indel caller​. SAGE performs tiered variant calling with increased sensitivity in regions of high prior likelihood. Notably, it targets a curated panel of \~10,000 known cancer hotspot mutations (from sources like the Cancer Genome Interpreter, CIViC, OncoKB) at the highest sensitivity tier​. 
This allows recovery of low-allele-fraction variants in clinically relevant hotspots that the standard caller (like DRAGEN) might miss. SAGE outputs VCF files of additional somatic variants, with filters indicating confidence levels (e.g. hotspot, panel, high/low confidence). The Sash pipeline uses SAGE’s **somatic VCF output** to “rescue” missed tumor variants in important genes. +- **[SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md):** + A tiered SNV/indel caller targeting ~10,000 cancer hotspots (e.g., OncoKB, CIViC) to recover low-frequency variants missed by DRAGEN. Outputs a VCF with confidence tiers (hotspot, panel, high/low confidence). -- [**PURPLE**](https://github.com/hartwigmedical/hmftools/tree/master/purple)**:** A tool for copy number analysis, tumor purity and ploidy estimation, and identification of driver events​. PURPLE integrates read depth ratios (from COBALT) and B-allele frequencies (from AMBER) to calculate allele-specific copy number across the genome. It infers tumor **purity** (proportion of tumor cells in sample) and **ploidy** (average chromosome copy number), and uses these to distinguish somatic vs. germline variants and to highlight key genomic events. In Sash, PURPLE’s output (copy number segments, purity/ploidy info, etc.) is parsed to inform filtering (e.g. flagging variants in loss-of-heterozygosity regions) and to provide metrics like tumor mutation burden (TMB) and microsatellite instability (MSI)​ for reporting. +- **[PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple):** + Estimates tumor **purity** (tumor cell fraction) and **ploidy** (average copy number), integrates copy number data, and calculates **TMB** (tumor mutation burden) and **MSI** (microsatellite instability). 
 # Workflows @@ -24,7 +26,7 @@ These pipelines utilise **Bolt**, a Python package designed for modular processi #### General -In the **Somatic Small Variants** workflow, variant detection is performed using the **DRAGEN Variant Caller** and **Oncoanalyser** that is relaing on **Somatic Alterations in Genome[(SAGE)](https://github.com/hartwigmedical/hmftools/tree/master/sage),and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple)>)** outputs. It’s structured into four steps: **Rescue**, **Annotation**, **Filter**, and **Report**. The final outputs include an **HTML report** summarising the results. +In the **Somatic Small Variants** workflow, variant detection is performed using the **DRAGEN Variant Caller** and **Oncoanalyser**, which relies on [SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/tree/master/sage) and [PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple) outputs. It’s structured into four steps: **Integrations**, **Annotation**, **Filter**, and **Report**. The final outputs include an **HTML report** summarising the results. #### Summary @@ -33,13 +35,12 @@ In the **Somatic Small Variants** workflow, variant detection is performed using 3. **Filter** variants based on quality and frequency criteria (e.g., allele frequency, read depth, population frequency), while retaining those of potential clinical significance (hotspots, high-impact, etc.).Filter variants based on allele frequency (AF), supporting reads (AD), and population frequency (gnomAD AF), removing low-confidence and common variants. 4. **Report** final annotated variants in a comprehensive HTML report (PCGR, CANCER REPORT, LINX, multiqc) format. 
- ### Variant Calling integrations The **variant calling integrations** step use variants fromemploys the **Somatic Alterations in Genome (SAGE)** variant callertool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed filtered out. [SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage) focuses on **targets known cancer hotspots (from sources like CGI, CIViC, OncoKB)Targeted Hotspot. Analysis**, prioritising predefined genomic regions of high clinical or biological relevance. This enables the integration callingrecovery of biologically significant variants in a VCF that may have been missed otherwise. -[https://github.com/hartwigmedical/hmftools/tree/master/sage](https://github.com/hartwigmedical/hmftools/tree/master/sage) -[https://github.com/hartwigmedical/hmftools/tree/master/sage\#6-soft-filters](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters) +[sage](https://github.com/hartwigmedical/hmftools/tree/master/sage) +[#6-soft-filters](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters) * Employ the **SAGE tool** for targeted hotspot analysis to recover: * Low-allele-frequency variants in hotspots genomic regions of clinical significance. @@ -58,7 +59,6 @@ The **variant calling integrations** step use variants fromemploys the **Somatic Filter on chr 1..22 and chr X,Y,M - ##### Output: * Rescue: VCF @@ -712,35 +712,35 @@ WiGiTS (hmftools) * File: `dragen_somatic_output/{tid}.hrdscore.tsv` * Description: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis. -FAQ +# FAQ -\>Do we use PCGR for the rescue of sage? +### Q: Do we use PCGR for the SAGE rescue? -In Somatic SV, we used sage to make variant calling then we did annotation of the variant using PCGR, then we filtered the variant. If variants have high-tier ranks, they are not filtered out whatsoever +**A:** In the somatic small variant workflow, SAGE calls variants first, PCGR then annotates them, and the annotated variants are filtered. 
Variants with high-tier ranks are never filtered out. -\>how are hypermutated samples handled in the current version, and is there any impact on derived metrics such as TMB or MSI? +### Q: How are hypermutated samples handled in the current version, and is there any impact on derived metrics such as TMB or MSI? -In the current version of Sash, hypermutated samples are identified based on a threshold 500,000 of total somatic variant counts. For instance, if the variant count exceeds the threshold , the sample is flagged as hypermutated. When this occurs we will filter variant that are not considered that don’t have clinical impact, in hotspot region, until we meet the threshold. We that wil impact the TMB and MSI calculated by purple. For Now we are using the TMB and MSI of purple is this edges case. New reale will be hable to get correct TMB and MSI from purple +**A:** In the current version of Sash, a sample is flagged as hypermutated when its total somatic variant count exceeds a threshold of 500,000. When this occurs, variants that have no clinical impact or do not fall in hotspot regions are filtered out until the count meets the threshold. This filtering affects the TMB and MSI calculated by PURPLE. For now, the PURPLE TMB and MSI values are still used in this edge case. A future release is expected to obtain correct TMB and MSI values from PURPLE. -\>how are we handling non-standard chromosomes if present in the input VCFs (ALTs, chrM, etc)? -Filter out as we Filter on chr 1..22 and chr X,Y,M +### Q: How are we handling non-standard chromosomes if present in the input VCFs (ALTs, chrM, etc.)? 
+**A:** They are filtered out, because only chromosomes 1 to 22, X, Y and M are retained. -\> inputs for the cancer reporter \- have they changed (and what can we harmonize); e.g., where is the Circos plot from at this point? -Circos plots come Purple +### Q: Inputs for the cancer reporter \- have they changed (and what can we harmonize); e.g., where is the Circos plot from at this point? +**A:** Circos plots come from PURPLE. -\>we dropped the CACAO coverage reports; can we discuss how to utilize DRAGEN or WiGITS coverage information instead? +### Q: We dropped the CACAO coverage reports; can we discuss how to utilize DRAGEN or WiGiTS coverage information instead? -\>what TMB score is displayed in the cancer reporter? -The TMB display is the on calculated by pcgr +### Q: What TMB score is displayed in the cancer reporter? +**A:** The TMB displayed is the one calculated by PCGR. -\>what filtered VCF is the source for the mutational signatures? -We use the filtred VCF for mutational signatures +### Q: What filtered VCF is the source for the mutational signatures? +**A:** The filtered VCF is used for mutational signatures. -\>Where is the contamination score coming from currently? -I don’t think there is contamination at the moment in sash +### Q: Where is the contamination score coming from currently? +**A:** Sash does not appear to report a contamination score at the moment. -\>Do the GRIPSS step do something more than what's happening in oncoanalyser ? - no different settings are applied to GRIPSS other than reference files +### Q: Does the GRIPSS step do anything beyond what happens in Oncoanalyser? +**A:** No. Apart from reference files, no different settings are applied to GRIPSS. -\>Does the data from Somatic Small Variantsworkflow are use for the SV ? -iirc data from the somatic small variant workflow is not used in the sv workflow \ No newline at end of file +### Q: Is data from the Somatic Small Variants workflow used in the SV workflow? 
+**A:** iirc data from the somatic small variant workflow is not used in the sv workflow \ No newline at end of file diff --git a/docs/images/sash_workflow_overview_diagram_Vqc.pptx b/docs/images/sash_workflow_overview_diagram_Vqc.pptx index 961862f19a462ba0d3201ffb9b55ddffc982941b..09a04b2a8f203fc396cbae791dd3c90406feecd3 100644 GIT binary patch delta 15815 zcmaKT1yo$kvhLvSt^+}X1a}K0ID`Pf-2(&&A-HS;0fGl6K#)LihY;M|-Ge(M_@Dy} z%$q;=-E;0)Pgsk!s&`jcch&A)Usd;1UNhQEBO3Kfbu@Hx5GDv41OhRGgpYS!=TShQ z6@n^i7F1wKXNjLU{3LaQLAuG#nyn^(BC=bN$bgq0O*;68Sme`W;VE`Dw{hbaMOKO8 zE}O-!(&o+xKyG2q(dWjr#FlVOJy%Fn<&FpCTe`x&YTxlm(8ZY1ruJ_I&Rk|2ywk&X z?*oKeObkzPifhMzkxvF~Ebr-ZhH1rOX!6=@seJ&*n{etM0ih4&9MQgLcxzTp527a0ZJero`O;gi=*r+T6}J4od7Ac>5aTdCE@WIS zF;BA^&}qe))=_EQ?_sB|{Jp`qksuo=L`|Pow$Vy9omfB@CyMGq4Bsyo4hfwWZdv-u zwG%5e=M5c4-x3XJ{J8dwxP`W!glUxfrNuRZytAM-$LHnSXgmz86VGnHLK3Jh&j^0X zRt{kk4MXj9yWlsafeU7Zf_M_az^KoEcoYi8wMhiODG$6SVV?dDW|=``4APbEKR~Wu zlsf&oTAo{Kk-J_wtO&UjWzvZl#bQvVhs$*BAHFR1PmbxxUT({4*JnLwy~Cfa$_uvE zbYUcP$fbg8SW;9=psS$c3@M4k<1&IkMb%N*kLZA_L#|qe?(jO%u^c*i?ick1?sXSY z^mgV&AN0TC=R2-lty?yZrLT2196CQp(-o7BXYykw80NA~CQEjq?+tO5lgezfCPuk> zDW~p-i*7M#vg30G&8O`vsk1q+-lT`Qy_aFCd5P+D17cgKK6O)fD&2#DrGErU{}`oz zR4)$D$-6laDrS=>X_ICfKGM&BT1NN<`*BU(*w7nr^o4`StPsYlpS1C)mjX%OP-(~$ z^Et|n(XBSK`zoQEhY8fSYPltqcn7X0eQuChe)7O;Df{V#-{aO#eiwW`R)?7=cmHDf z9QZn>)zWNPX7(-YU~BW1IZwUm$X9;5{ciaRJB%4vT)=eOEK>OOCFVei2nntArc<5`W}r@8>NH?$PWzn9ilTkj<< zlOMO%GjpGu{q~cTdsU#33?D{&4@$gF;!`y~k%q)zQwq6%5lVz&W=k3TP7E0^J(9EX zS^{PU_V#+~5-5|8E#?eA`LvTijZj=PHnP_>aOAIH!m?@5L7aA2Pg^U5B%e%}>s)h*Jd;G`|D5@{%MHI5CtB@H-iKCGD7%N z&2PBIHq^&#;IZU=v!OvZ=~y#Sr?M4d+npWRL8!v`IRI}k|0ckH@%g2vN-*&doejmH zsFOmgb=dARvhnhZWk-_32h3}NM__m2hd8NjP|QTZw?*r({GJ)xO3&+THKT+KjFv$_ z9ixZ$IysI_FFuV9tA)r&snpqlwNBWZ#eDNtC(uEwo;0)8;^E)2_GNa^b3+Op%|IMte&OXAg|f2D>w_(Gc&&Pj~$-=CK+}OF9v^5!JxLak(q;>WTWz?`5*ZyS4es| z?-eEnuU`hCKuDgDSIa&oK>O*xh*Hh=WLpfP%{x=QVvmgu0rLu0E4pE#K=1=+kPUnY zwJQe%Jcj|1aan?_`?EuE>}RSG&X_3GVD}9i$jUEV2(CXIqRz99a|caG$87nz+#Ik_`Md`BtWRGHT^DVhc}~_ 
zqP%BPp+5GJiSM~Gog!7nq-4Pr4R_oY6JwTQ0Ia>1sI#D|rClyv~juRZ?o z-^`0ntO_epB{RnhXn(Dg0CS^w7~khVF!;Pcfb^zYjNEy|XajdHn-r}a{fmYs9p~!m z4RnS+n}GR-)N33%ZSg|rGfIL3i4=VE(RD?WYG9dnC=DcWs5kHY3wJzh@(D<*+I zsby!Y2gp?2?q*}!HuLD;pQliWEAc0` z{(V)plpy|Ggrfsj$5BT7XM)ifAO@Vmo1k+lWb&(F_tolP)^lwOsjN6ZADyL?4;j97 zjpM1+-~(tzz(czK@r~1HeP6WmL4sZ3QBxyigrt|gE`LWL6BSK-&~vy_>kGq@7I`wE z*~F?9;mGwZch@Blt@(Y3X*TChs3A4ih_&w7qwW~N@C=^tn^-u7>Uk375JGt%TKukiRo7leKrbMIwC`!$v+3S!^AvK@AwC^!S= zFR-@dj@}Q!XD5CIJfHQHa`s>1#$l^#Bq2{SUYLFpoGf7!_uSUyjRWNDn1;EGe07d& z#Ij!FN{xH(rcK1y*fd>g3aFbW`8sE~YGO7ilp)wamO)z2$kWIfaf zS|4`)tbWLH54@E1SaMS^*~!0@@P76%5n+#fvjOJ_5MIC56BcDXLrMFgAdntaw2@i!%g7%jfE_QNpqTl0HZ)_35V(I^UeA*_?{}Sp#r1ePS%?}6wxHv z!ldQzwf6GRuCWbkz&7sbPhVvzYbG3_M74o;Ed8S~do1?HZ>8z~v3=I zF{tI~wwo$y2xEQne6sgZUI{t>K~~|x=1ELDT^kcjXw!2%LRczYUFFbvB1?0spvL#O zMd|k20kE{nR+Bw%RU500C-ERC6(I}0*-p?!TXNb9;E@1O4t&lE9xmvb{rdBF- zH>rVqN_7+Iu59X)gcnyDuAsvfyF2Ls4d_Wvtane-d=xJaT%yA0x7WO%U7qI8yF14H z-QNJ~(W9D#V#ilx_$6lEkNYtj87!_;VWB0?^>uTr{1KX`W2JfflJ>b6ExwsoGA6GY zW?a8}Jd#VHxqGIRs5Jbjm__2IG-0m&wOdrq+Dlk$jw3WW&_=bcbt?0T!1>$Tu}v8ssg@?XHKY z7tyfBw)~tIMHT+o*C!;qMJGuN!pd8D=zr{43d@8!QR-g>J&x8hI?AqzrIqmf8e^ct zHrM^Wd(S&gEGbvXCX9)zQV;8JX_I2sli}U9MaUTfq#vOdwQCRw@pq<77WF1|(JTSz zTt2?}4$4l2^-aj6tjEWSw0}Z?2oGC;vfeA7fM+73$KPPn;UL^0ArcbtP8?VltK;>v5e`QO<7NcK_8&}2RPWyJRJU10H+7&bmh_UkFd z*l*8Vc0vOqihB3{T=NI4wM5SEu!?a3%INQi@=bHnyu8ViX5&+=Ve(i0K(+Dy?FBYU zN8HgD{Leuz$(-&%Xp7$=>&wRj!<$4T#jnl=@0=5r2vyj8tW{X=P^P193!+YL?;doA zABHVP0?r-RkF?~cU3kl0WS5GxZhPa;C*tgS%CQg$YKP$xZq#FipbYNTMSDPji0)g0 z%nvpNCc`d;OS~?BV%>w{f6#kMVN;^^=!lqZd5dW+%OBKaTQw)4ct*2ELQ zHK#-pWb#{kN1C)wtc?oNUVYxkeX4!p9z&5;NtBW4VdeA~Qnk2H&T3LLr)Q_N#!{^QX<(R)5nKAa>)cfCQ;P_cL%*#NN^oCO zrTKxGL{-|IlLXUkrTh;y{s$uTG4{(Ro1%y(j;#COe&ef<$-(%J=%qH_j* zY$K5EL2kY;C5)R6i5h%qdnCu>s}j_m_5+pQEvdr=9tIxtcqM)Ev3z=>8aT9#b!1(u z5umW*yd@{vAV_}B!rCYx(?O<~ING5A7tZK8i zu72jQN5EvTA@K9KbF;wS6RS3TOS)||X|WMmi?qge)Yled(vlNSi!SMdtVJwx&WZj> z&%a;>gtvD78Yl%a$RKBS@0<^A`%e)d_=h84MQ&s+XbL>!a_9uet;b`yNc9`f+?MY} 
z3a5}APZw^DGgN*5bGwraHB zU0Fa(-uFX~8!KS`^0W~V@3%_KZYB#qR}`rUNW~eF^tW&IIq&&O8x_GIJvz$eaGo^P zdCSDJeJL@syd!jpu~T9P&|Vc<)U9_<`x&7HaQnM`VE)U>0cm$}MZ8Uc*}k_DxJGb^ z$57?^4MxmQB?+c={EOY@*Gkg$4y7Lpi5x~*MqH0~>l#|>kXu=kV8-iMk2PG16jHGH z{d-_xhHJOY4vX}*&Wf3-Jn1O7tJLw^2UHPt!Hs$f^s1NHPf_9}I@jsx{eYzac?3uS zC@8%#2F^6WE>Ns{+_cIaM+9h4r0^Ktj^?j@+>RD{y7kNHw$#zMzx$K0OI;sOqweg& zitqjLtou<`7L7F_;j$leIUWzD0>X;8n7{%<_lVqJ1ae1hObAf6t{%Kx?uI=yurdzV zAfE-ewH{Op@7kFwZFXQ(4(*Mnl|Xo80DdaT5?oQdD zcKGBfXsZ|j`k32iaY2m$0fRp6UGXqAX*&c+J|3PD4d+g?+CtYkS|2sR6%2A_T)Vn7 z?Tg;>3D8i0IVjI~+7mx}x4FuR0C|(&wftP#)}UNm#?K^lG@-lSkZgQ6y>b_?@o(|} zr}X$_1gPij^K-2Y0&DRt3?Ma4%CaOkQB{l@L|Bj$sy}yZwpDjpQ7Y6MyC3b;>oLt_ z>Yn8n(6Dwww1`~e%3a*B1UCc*zDm|#iO*9yrz1C1Zw8MfrDc+@=rD_1+imAKK-tBbK0PoL2n{b`P^>KvJYW->|PV-mu?`WTXi82E~e zpTU^1Y%NrLg}ponmZCeuG+DtyMZ_O{ICjpwfSZ(zED~ zOzxHK-D+Z)H)mYurtnDjll#YXdOdMeEKh)mJHX>BstvIq#ygIQY!! zWKtf;jx2x&ARC@oZ3kWbu{c_a_bZb4g*eViP6dXv{CaqtSn|16y(5XmaI%U`!n5G+ zcU9>0jF;nPu)>WqSheQ#DgvY>sP5p)#GE4#91oJp13$|2K@aim5Et1l)AV0T&CbZH_A6;Zk^gr7|_#8Ual5c z2cZmdOw|L`!fQ9@5fAr;uvgM{b}ehGPCux{ui8c4KXDLpR&bdWA`qgo zH!-iefdDP0Y#KyXlndWWJ&(g1zAGx{-E6Yp@E>`_O?K{LTF*hV4z-YZb!AZBWS(zR z=9tLio#rxP@UkXymm?Jq(`-126+oUgfQ`>N_w*XK#>JdV52lOnFr~W78nsJ)TW#(+ z413NkTA1%Gt};O@kC-<~iH;bQyBBQ}lDSGTrqx+Q=iXo#zEj10dMVEECzB33e#E?4 zF1_#P79spAn^f4RvB8HvsqaPqM|zLW$~3qcbbRYYd*M;-eECIV`Roeq7_g|~UAKw` zk=l0u(<|4N#qc2fx(=nnv()6KB3bP7%96Tl-_k(L%|Y1SgaZwW$OTgAU)&ap!zT{g zS^Jc}K^7%CNW=g}E)LokbvI?d0seXW`jl^19k0oZN20uo5~( zg9M?JWlh3RlWzKWPi9-dwC;J#3T)qbr*wFDMN7MF_L9ShPlPVAqQF**S9&m`*~PRD z#1X7|JMVJp@SyZNCoe$Dp>+Ui0dK+sL@q`P5uZ1f~o$QtHZQBwTeNk|! 
zKsQ^UEpd8Vh;fbfrD4QXfb=+HdiT5^M`#O6$#O-m$b9(TR6*v@>iQ3Whij;u5=x(G zj-%ZSiV2;{0{6x<@UBb`-sfFS(clZj(gJMn(Z2-`UL_J@3VJcZX)k$oALMm`c@+fc z%4zFhS14fY6?cQ8myD#Zushzjp%o&@d08gdv`Li6tnl+$4G&DkmWgM1oIcj6%{I-{ zx90WyHk-^K|0VHucwSMo|Pk8=~eeJXC1Zpv%pseeRKmwPJWtJ8)?Y z9vc}2Z-Jw&zXDy#g=^*}Uo8K)xE3 znsS<-c(|&|1Q&6hf687F-A12zz}*g;(gT=mo9So zZAg_?{z<}Fl2_k?+DgFi4kQBw7-tZ4=F@;8K%d~^ramTJ6Mw8N(Zd-ia(S?xg*A7w z)CC6q2*iv6mM-OzC?3k5-7Mf+&|FGY5;=)VB0v$f5+|-U$>Oo)QN7fJ=}+ARHzFlY zI{KZU?V+UmU|hJoXQ{KXePvNUkiTer7^ZI*Zt~6>L;v+v_}Duk$RBfHR*@rfW-W;r zrcY2fb|GD<3r(&sT@GyTStMOsOEPpRNRe0e0hnnpRIP*+fCA3ZhU!!Z9|jK@-;zam@KD+@a^Ldp>RJ;1#;K)M!c z5fVU#6~;3+c7|H9>wGS>8Y(x^xs8>qqeAc-iz5C$v~p=5sEcOkDvKnemTzm}3ZXQ$ zqyCHl1)Sae6qndy06ty-5%r9W(;{FxCj+RH@$7LnxfI_?uJRclYZ!?>4eApr>kl8Y zJkqu+F66?_d~(*zWgks1sn_xR7`v?~a-*E{e8yE8c||Z1oG9KwQ;{JyU!of`lc;Hd zOm=XF^$Spej0}1@$ED+&Z(iR6j7oe|V1hmuzm6EOkJ2`tkG~@kI1gvFSHOJdf39z$ zs?&2{agufGddb+D3xC{v>J}*?hkjsCFPWK=zp|=Wt=IN07XQsaFKdb;N>w*L;}%7q zkCGX;IaIcxP0UPPYw|)ylsn;(nL4h@l*+TFXjSz(U;_1XcjyG8rmr{xR4%Q_Bcqg_ z=;gh*cqQ`&lN%40f?FqAO|*S{*K2J74~dxG!1evXP#eFaQzqr7`;u)0C|`NbUao6) z5f-X`%D1#BO9SPt(HB%CBk?cUU-hxKXAOp+)K1fRjJ;Ddf*OPe;lbiN+Snl7i{*XU z#eV=vy`3{95%oO*Rdm*}a0#~GWd`BY65Ez@NoPFjp$73th#$BRH0SGn**jg%shT(M zb}kDJpJ=(po+sO&d3t~F7B>_=`d*g)G^lvV6w$i1XS8?gS-p)%ZoD<4 zf?iFAE5udJIJ>TAeRv&F9S*=Ms7N}h+U~S)^BAKhlaVXoO{9bSyM&>hF#O!Z9Uvqj z`6dEaqLYbnzE=`BlBsagKDe!j_qzVvSJex?WoQxn8a?o7zuwA3>MMOH(w%cz;40tg z|A|Z6DNnlZ9qoCpx-0T+Mm~=@>$RAPxK;2|YBV%7kt;XeP7|2mUH*f^M#5&a*J(|fI1u*N5soA|2BAAJCl6i{(ksP0<#ov4jpsqFI2CXU|$Jw28REEkb|2) zR>w)~s3-#YmUe^cm4$ET)Fob&IcX}Wz2FrYb>k&MOd}Bz_!mTW&iig!++gwDzqiLgIO7+Ro*tc-QI*EKuWmNW#wi`%=HeIB~L5|IneGc%g4I)qbje4wQf$p z_gC#(%1gX*do95AA_5c(QZ4}oxZH04fFDLp13hW;R|>CQt+mJB+X8*-7BGIP%Im zUzNK%)c!}nKjS%nP;!qUTQ^=%p>slr@bKB&t))2)eV9$*R8ZVDH1G;B^% z+14`~QKv^492JMzVJ~_HTA=@PBxL& z6WJ>G2Fw$j>=GAZ?ab6hK<@9!V>m`BIS+l~HHH-1HdsbE4Ux@8=J zDo5BG#=WBiQ*-0->LETCPmyY2-<8@Jp7d~E&>lmn!+%TwUX%pj_rNn`4oU@gmUyX? 
znBF=!c6>0clS^IHz*=^#uPordal&@A$ZKPuE7HZCP-1QuK0SZk)_qn1-povwP2F%1 zy=^PeKbChBlr8l7!z##)rKI}=_s4o@XQk$hCxLWr@{=3{DEt3>jtbJ zC_eFyJijlIjVpwi0IM-ta@9fScWXD9{REc3wZKY3g_Id7&jY%V(Gu9etj!a7utwQd z0`rDu$)8qs%MtmwGWZIYtG{rrtW$I^FkcQQlKP3BNS%r`v{X;f{k`8bAn7WLebYNO zJHd`iUNB6WlYu{wds3DJCiB4Ma|K{)1>UDPyA=!vIMn$7g-Lm;IiHlYXTnS+q?$9a zo@M-`S9H>Ae61)tTYDy_1Eu5Y*nBHH%^;yfzH&e7U0ScV>)%%eO;La{L>t0#d(t$I zeyXr3rpV818>YUB-4+(N{+=j%Vrpx%Tbm8{nwA#JDfpIq&?Gwh^`}zWlX7%!Jd|q% z#p~*1X`2i$V6b>|Z2{6vOPTu3qEE8t?P_v8k+U8C6tk^OGf7wPTk?&nVE44T{XEgmuIJcOI-O!99X#4UWJ>3x)e#`39R#Q%8XoJbSibhD zMP^gxnf}!KgFqz)rNy~NtWC~dvE^yjKEwmm{9kMmCZJ|B>96jm=epI@V%KXM!B{l}bX_U79m|S|S`Vv^79c5ktYC2aBmxW)np|I*OT4bbmlb5WIUYo82D@3j^Md3;U~mi#RGpfcP%2 z(+sB&px-ZA5ui?_BQcI~1THv`FJ&FsTW%!q+b{2JKWNG|s_%$-9Efy1XORq(wK@xO z|Ip`WpuCa3M(U$4k=U(pr<%d!)jZG9Q0h*3W&`5GtBYBmoP%$LPp3eJ&eXA5jy% z;%^8u;V-0ml{+^AM*j|6f#`yzOi^fLx4bMscpP4*VV@y>0PHh?vL%|nmdrE zSN9fHW{+SgMxKQ2xR-*|w&26Y8?1+w(k@RR)SP$r{cyQN&F~kp?3Z0TDOgoQ1B5~` zaa&P$kHE1A5anhhiyWcLSTCB~V%~f3<&owFzuEj6fB>QYDUI%{443<`Z5$EHC!8)J zLZG{mobBk0jj@xWnMxurj`4MA9Y3u!FW?z1{M~SNfMn|oDdqx;eBm&Ye&1WQr7iLD z;k8ddP5I8sd0n(4Ti9UwaCMqWejNYJS+_tvMvnu7--ozD5sdcuc1IJCfyTu914SMj z!;dOufIy5W^U)G@f57CWH2A;lAb5Lh1FcKr*j@|nAA5)(GlQ@ z!{Q-3JgVoYd%^u}f_WaUxYy}hX55Xq-?ssD@ii^^ZPw*W<+-Vui2U<9fBcSYL zeq_=BT=PYmUZP53nAhvEofV!qls_3lBx_SO&U>cuIobjU&<4Zl$vr9(eoq+zxd--m zm~wy=0`xRp4vrO%07aHZaqimu{7=6B2b-0x4f0&;KjsFX#=*QFK8OM^A}|6(WBNF+3_>Q%iHD?T|bBO7_W5||f|f1i*CjK{X$-)fy!AXT%7 zya2+@pR_=kC$}5o5Fm{#Eyy9G1{G+K^+jys5I(0;zXW;+mhtvmr_r#mE1S} zMa=KpjA#{&VV>3Q+u&xdDv$#QuBIO&5J_k!SX5N zH8+;k@qPkS>Lr{l@~7Tbo_I_Y23StPa6&1M!My@j$W}p^ho{J1$`(e{_Azlq?(3q| z!}gYCEEzF(b6_(*H0bf^1u*@)ssEEOf#l7RvF*7-3m3)Dl4WD7xQ%NxSElzTx<0z= zh0$=$Sh?r(?->+!7ug&&Jj@?KgAHA4`fz>=`oIqkAF9($V9v;dPUsXk%*5TTpR;qj z^GAR_OJqf6vP{de`aJCQ9pur;TzSp8M`x2mjy^Vv8Vjy-Q_{Qc1A@wGlQ212MJSrB z6NzF~hxUibJKQZdp5u!Zqz7)teo}K#8m$^o$48U6s2>?6#Fdy=q4bL0t73kG3jRgz z&P#6aVB2ZFSeJOmV6(08km3jR8&Q=%Rw9K1i-E@OuJ=+|h}r)}yMMs0bG~TR=vg3f 
zc)$@lBOxF`k%|Ce@1{Sj-CrU=a}`Y@Tht4g$y8L>&ytnV6us9@;~d+MkYT39y_4Jq z{WP$IjD*{7XDjEd-`@virFKl3UBg+@-0pzuc_h{#_p^2p` zLkf`CnScNt&i2nYYi$HR*~F}e1obgx0AdoRTlK9D$X$$+?T19!Km_QrbVn$rE#Gf- ziq43!0#7RY1^v&qy_iY!QRGV5!~XEyUn5tV`#`LTH)U~$%uveBEv^{cSSs(WyKnCM z|9_0iNIvAnkl~L9zDpc(-F$FxPrdkZp!i!sVjeUXFht&4;+Ons}bK#}X1O zsD&1csrkwWAmOfc!QbnF8P?|WsDf6AJFL!50!m9`I@->)zi^2V<(8Gi212#(<^dyE>KQoi!Q`vKHtuoy0pD?^oT54m% zyL3N|y%j)r6wORN<6azQ);EcTlQ0c1721qB=gb6A!JZk^Q#R#Sn<-H9E5C{D&c1h+ zY$SZ7?XDTPmk)t@kr71|nmg7~1*Y6&C4m>_fb($#h|vvNdQ18z4IC{kv6))V+CWQ~ z>*oKH<9@?%X>XBX7SDwPG5sh zz;EO&D#dB{L-63u{3OtobuCAB4tDHnAGA1$xB;g&mC>4)Z@LltBy)?_OCIO@2Iwo@ z?=I|m$B66O;S(1ztI{0%&F8?G>O$7~etdn&4znAR+VH7t1~wD}_>n(y5DfD4(1J-I z-IH1)RhnxWZ{f?Q@5PxIvEj}gDFmAgJGzT%>IQ)ew*+qFXGy_6T;v6hR-;?fyUqIH?A0;5&gAFsNlTDf`Fv-n za({h7*TL!j!!J!q1zMvoVftUVRfb9oIiEGS&xh6o%%H68Tb#Xo55B{40&WGzO4_o8 z>jjq#_kY|M$3&9~)}eo8mcQ{siS~QX}Q!* z*c|T{IhIpjUH=UMdR9uLlcuxRp1fAPIeD-~ow=nMsyPGa4s?F2CVj-SG+fN$R6<%j<*zD9=3I^~viQN{p8!+raUjH(H$K@|6>J_u6KEv>vN1%%!|4RvY>i zQ}3qZeq;EJXcsOGvthrLsuZ5e@C5^jjpQ$GzZ0Tlihc z)(nf?G%@M6%|xHAFrs94e8f-P<}>??Tc|>flGBIAs-HfxQpZ?l-k=FQ?nDTD>W{k(>EU&Ij_)`u^m; z(ySLmuC4g5e*p5~Cq0l8#fcU*)4u)vx7-yW(vw}-eeSPxjA`*Uto!&BL|!|wGklWkelzo`VX*qlTJ#L5{Q$U7ysJzNbV( zN+{HweM0e`P`>V?E%w%kUPgQu{?!Q5^dq_PGw;x5ulptFnBQ*6LV(ExZw2d}hAhZ8 zpU1W_)U-O*b{JKRn_JdbVo-u9;{pqUr}ojeHHO};^|qa5X5`20<06mQnB$rq?|R6z zM{ez0T>b{Iz-tW~cXRi4J@9eOsSmS@L;0E>zyD|vh1REbR5HyFP4yl|lA5+eBCZYIlQIoUvc`lL>uQvH zSnaI#WrfIi-@f7M%RHR5nIgg6(~`g9T}QwQ&UPGdIp^zuRp#^MkEo5neK( z+Rua*vP4P0%dLK(I@r!?Qx4aTY}n#r7@Q;!``vBbCWdg9K2z*I-G1iLtUZ(}%(gj% z8T}nldlYFKBj8e+a-+1%5pN9I6mBOoly~+;x1iy*ZObSM`^KDVBebM+)!GTmuRUP zWFjW?-eV#~+2_u-hy`ONDOAQ?fUrOWI@9JLU~LGbg|X=5CCfE`||s=U5kR-Y`Q3SE!GxPzy9n9t(h|*SyrO=-7bE;5NwGpKK53* z+%NxM+xwrb4pckTnqyZV_0FR}hp zqsCHLH*3>M?hJ3j6ssC)&E2JkiELCKt7}!(+PTN!WkE5UM2M$C=@+Hx&!jI+sTA}c z-w&;E2(XO_>G2m+0!-Lu|YrK4pV~+f4PV-)2YMVw;nVK}dTs#TLzpIH8zu8>&t1CD!lzW_YF-nbdim4$@d#d0) zY45qmY>~F-Q@~_zykmCtyO*1}BhizHzWCjnwX^lZv18uLUrxX1T1Vc^9=Bg=9pD14 
z4%Gc(QY#ERERw?qlooxbEz zIXx=*)_QsRFSR`#B+v{s>4zDyZ|)>fP>Xn)sy)XuMq(f@2<n8bM}pB+p}ZcIz4 z;Acgm;dny%evraQ?TzrV;<%L-xa_dAsRGM$IlBV!IcRi^a;2#4)lmgHaq{jX`ACc( zU(Dlcr1iILXM-<=e8+>gc4(Ng2-CV#~9O{R9Th(!F8#*xE*^~ z7Y)H5o@+NQvgAN8kM z>GE+<^oy_M&ch6&Q&J%&DvU7+IhL|M3I;hp7XM7;QgDAfThx19 z4ty{>ASRKX!PM;tW4EsPBs1mE*-1h9lqBE-yR-emn4qo%;3q_wXrm(f(>KK)z5UUd z=AGyy>QC~FP(M4}5H0JU#-n-s3BQ>Z$Thlne4eSZGkly14B`18ve#zcg}Jk;b@kH6 zC4sEHdoZ5t;+w>4`fu$1_87Xdj&?4@Rl;K?O%p>>HJ%+f#}E?2+GjkjzY96B*DJf~ z^cnk5R<))4m4Tzz;GwXT^#=r*z$aPEfc>|3k^O=hjqR4Oz3h#uh`HmsZ&+kX?!Q(a zhwgUK3=*bLWQ+MG_{I!>96LOkbibHHS>{{S168+-5aRT4c%*&^(AqmF&vV>df} zXqt85)d(t))T)+C7Sr&0d{S(XzCbP7s^eIseO1Lx)+#!o)FY4uC9t zMIg{XFV#^{c|rfTQdmgjgfWNAVi^*I*AAL|ihe}S0(Uj>xfC?E=?jDPv1t;Wu@+s;#H-y=Z6|%k{j5^En zmy}$jg^W(oqAu}5tY#S@Skru{V4=UHK9Z~u`%C5{Nlux+$f60v*K9pw=lGzt1Vi3#O7pS);e;>a!#e|5=5<{*miShrv)H2Jzos4t*sIhi` z*Hb-33yGVfMTOWy4(ypB!*j%t?+di3=}ySrCy>}*T&UU3e;M%uaUq$D|McQRQs)U# z3*Y>G%s7V$*;zoY+~sfc{T%Jz@inf0n;!GDkn3NxsK_H+UiT-Eu_a=NHc~o`JH%x{ z6V=i8FVo7e{~F`w_qVCegNrmH{ohGE{r^6ab;bK@*(Kyc0|Fq|bJCECMOxJ4K*(_* z%U?BDmuOL=Lm{_I^z69*TU}<*ze~+PAgX^G0ZGk8L1>nqqqIWIm*i1;Am5i{Ay&(@ zDD{wlWmX8&GA(LcKSX+&js7oyA`1v_P(dJ)|B&E30BKofq5JFfkqlY?2g5z&cA1s_ jucJct(P4l<#Qz~6WdstiLQBX6;sUK8-_XYRKfV78I!&d@ftVLt)Q378Cye}yO%|Eft>IR#3JhO8>(JwytjDT?Lhynmc5#w zHF{%i)i2cd&2Gl}T`utecJZc>jrFH%MdatRPxLoqmsX%hnV-4?eqgdxhs6$eBwmge zykGyl^j2A;#6*r|C0#(tR0?UpaxEQA&~cS<{H!htt<6TJOW zt_zE8UgvhJk5Q^r@dPJU@uG#spl1*x_4CWUZybW8;X(P9y%<>n36I`CV)U(Bp6JF5 z^7g9Mdt`gGr(x)h-h;4~%}vyXuoOzaT&1dPusOyS^oGfplBc`+8om29l(eg{7yMi^ zNeK9z=RdB(yy++!?Zd0K)&(982)_E=Yq4sLb6WC+^u@A!H38dk2bX^S_?)@l}n{n(~iT_nTeU=8TIpsco~;F;J-G_L9EBdJNAZZin25eTQv@7=uqH<-JcrMBS*5 zVV=%G)=S3nEVq$oSqwQ>BOkGJAGM_`7=&5t0& z>aJqNe#JCnuUMTQ&)d`7*1yI%QSUz4QQ$0llT;+8T@8LeEHe; zGFZCbQt&iBOJY6L<~5@o0bN=1wo-}48F9e3x~dO38Zm=x(HfitMOt)v=t|+lM#gqF zwF$EKRcXgxk)IpJ$-is-GN`tWjX5yEq38F3_gq6}&fnu&0h>RcJ*si-Hkw8abRAtP_vBrVsFi<&;RM ziBnj{*tM({qrgfR15f>hsbg=}jk&4`aOF^2W1 
zP;bXtH^1B)A)(Bt0(z8Xa`E791VK4Mm+)w5#GS+Ka_Q(KGNw+M+Lgs(pkJazc&=Ex@yW3@v; z138PzGu8%zL@q%CMZY}?p0ZzWj>I}?m^TIpOti!YMw(s^n8_+rH@G4>XsNvGc-T7_Br%L(m;I;vd2jSBre!6)9 zc?U*_#l5Ox7Ct0Et&0~s@>HQ&=Cin}_?)!;GB<_;Iuq(9QstBjS6WcrWBsglCmCn=-e#eSma`^FA6e(g7}Qz45v?d(pFVbMTPEeekCJAw3eza@ z6rrc?r1fzZq1@lY-&Mx!R_mpkah_tnTd)v{I%C7-SLw5!dwv^SYaeDep(KMZZ1sv9 z%NuQ6sSUsR63o?#^w(HI6vy$a2-zQzA!UJUkW)?;$Zwmes(r34VaSQgRMmG|4AiQa zU`ixNnG-+ckLwiTiVaB!VukEEuOY4vf8URr!Gut7LLr3#Q&s(L>rbojI2Vwr%AC-U zs`5e(pFkF3Ffpt3(E45YKp+>ss*@N~6rdrS^tP`{>dE+``MK+Nn#?y$G5DuRW`bGxq&tjC9@zg)|=xM2Lh#c>fV{FeDoNVzKCkFe^D^wd{ysb>}0C0K3Tf3Twg~SF;Ek~mVz=Iu29;vxdf0G zJ>QPxXiDE??R;=UHG(Q+cc&!YI$qn8IJxVLmHv(7`*^gXQjA=3n8kQ8lGdUGb0?d z#qLS}+TSk@3s1Pl2i%9azM-ut(yRjfd>Hik+`kAEIHl=6;o0OgGk_ORI);iIeT||` zrA8H|e6XlKHbrkJTst&+8zt9aG}TGGp$J&u=QV&f&qUN6!GSG>(v2oAES?zfhMx_esaXQxMIltlhp??bk8~-?WvFJ|x%K0HHeI=7ALz zW00rOzc=)0ussapfP+kDwzSyIY(ZrFv&Xn(%WNN`hQqgG<425J4ikJPKNS*otKE|}GA4iOEZwW3~;HE-2 zb?0#cH#i4<=<>1&%zD$%r!p|d zU>R~wn5L_ozmVbH_P8wHvhuq`HTL z@M)Ls8np(_%Vc6j`GM|5)Tr$a)**{Fp3#w`ZJ?Q77SRbf2*JX-YyguvQRpO*iQ)W^ z4+j4`Cf?=`D+;K`HcZdvJFQOfrT2AMZ~O8VelxdpF?n{6#1`|ue7)j{wVT~2X}oOP-ss9w zI1fJEBhOBy1>dL%$WQ&CT_7HH$Z1wO+!RW*A^GP30zyqa%B>EfQZKq!bM??fgl; zA}V;fPrxssWrXcMqwDYwbksYi z9uC62&3y>8Q|LU3?T@>ttQ;V6v7v9l1qw*!^aR;O#07HV?0fUCr zd9Az9%|*v8w{%d(hPqT?`CWo+Fh&PZaEN*ssQCefJPT+s98{^^U3en#Vs?ckO5=Uk zVGd!PpC6IJg?-&|=nqzV56WQ!x$OzVS2-IdC(BphUO4DOV_BCuro;V@*p!bxzYZ3B zlk1EkPs``A_PgbFcC8qRRo1xwEN$z1zE8aLO?c3~Jvh~v2o~_j-LEq$h$-8d@TMw~ z5ikTC$&B<$n6rw+Yz8hEE3pE|vDJqd+$_F-K=yaLHaK5-|Lgb9fkKB1X&+M#9cAMH z>zX=#2fxvvRt2r4PS%Xp0j5Y1od}$5Ql^=k*ibB}g%qg#EH$h*_1Pb@7v{uR1Eni3 z(KCw_3l)Bl*1Tls+ac~<8%C2OKi_F3eJl?JIQ;8mgPM5o_OAUJtM;eNb;)(ZAI`t@3K+vAAM+$atYK`oA$|pmGG+dPD=xM{`t`N4m01VB zUlh8`iCPOsyKldCM+QeU@DmPx4p6c~^kG9lk^Dq6)iI@Biv z*s+XBbU(D~(sRyhUJ3|#Idx}OvL*!UGcCYs3LVJI*rvFB3{OjU2qlvXi(Wc7?~{BA zcjl9~LQgR>M52fGxZk3-)tq)K)U*+cp6z74h(k7Yl#AY$ZHXM}YK+ypNs@5#zUUuy z+8J)Bi?DLzQv7W06UftEwxHYtkW2Ugtl=l#;t1t4$f@LwA@N}M-s}(bF?#+QJ-`Xm;${nBrfF#e?68(FPwU 
z`0RRDD{npqo0ifN^|7Sl=Hv8cdN|jS5iJ=N85pP+@DCBb#Okl?6Lm{@&2~@^Fni)H zztBKOIjid=^ha4=jcO)h`ydo$NAkl0lhG1?fkKxdLPtoTM&ro{n)JYWu-7_{vEKRX z`^=~hJ~+JON#N(JPO|NXHC=f9D5sM&=dv!Z>KI$%gH`byXpHCXw9KPgIZ}V4_fa-R zW3(HT^iGc$plt8%>}PEGmZ;r&XP!Ye?oclcoRPr~{W zfAr0-qiz7A_YWKotM4wdzLI^A5ccDmE{oWhmND#c)w1tK!pXbL=>Az1tHg?(%NW6Q z*}fZWj2C-)MO4_8GgQF%X7%V=AIZ`-uACaZij}8ryMwCsa!h}II-vyk3Dm(sWktXv zdCjQU7^Ep-!)Nll7Q3vvy3Fn}kKVai;*Z7sgtaE)>uMQ-EC<89^g-#xxyLWoZOiDf zwel!wBe21#zPD>(E{O78rI>Tl zSdmKg@aJ5hJ2N$PjfRbtz-nX*pjFGMed@`v*AKQ8f8d}yjDzf}8So?V=hW%~Xj^n$ zRS^^pdUv^9GE(QafjV0T6?jZ3w04={5WK{qu+vgbF{mhKIl@^5*hjp2z&iCaXk`V5 zDKyVQq~|*Ph>TZQdfdbmboeWyLoK&YKtBfwJuwzSzvR>!)C>&_6%>w~Y3*KTEZozQ z-*ztc8hvP++8DQ))MMKMxCgg#5~?;&T&o#xz6|dEqJvckO*yO`LdkMS$EFV4Jx*o@?>Q2CC-b#scLpM5AY zg*a!j^^$N9$_S@0wITEec%u)r3A(<*dN?IKti2XWrNOJ4)Z{L9HS_npWQ$z4eoi*< zvT5tC)iOeKH$Ma)yk+5be?@ywQ4;f8H_UGvs%j&btLFv9`!ey+W8||^CG`1wz?l=8 zsA9c*y#H6gN?Gfdw*LsxFghcq*VLQl>)p;)F`=sVvR~^%P5djvCY7{2TP)ndXq?|H z2rlt4NUpC)v`WxUHg6%ABn-(qfBs1ELX>X5QuoXpYI9&DzAEX}Z7yk!UEI3M?tqMf zeQQs&`}EByTFx?0GG+-@B)*?AtqOMj?9OhwV?<8?k|Z{F$VI4UXOeY(mBY-t3*<^# z!R_Y>k!J_IPDV*)kKeUSMk!4cY`hIK`%ym$ouO%1^GG|*A2O`MWK!F%I#=PKS1C0I z9J;%HJUBA*eJQ)}@khDy)Se(ND-{=_-Ym2kp5js#-aAbrMJ562BcvEb8X4qJeJVAP zQM8}Hkl%U;|EO%*V!K#FctErpHQ0gBAfnsD0!X${AW zO&AEvSuQbNL4!nt^o=42^pvFvSZ6T5M?HXpZaJFapiSMqMt9?^qn}#Fa~DDE!IU)j zJ6-Fo1x_z6us%k4*N3Z1XtC-08qRO|aJ2yD4kB9mX_1HBFnrynmMIwRVG;2U^1PJS zO}5ETh2(coJd%DLe;)A-GX~kV1i=WI;h=A7WpGef(^Mbw#h!Ds`}V&Y^Ic4h><;R1#qh#jzzGansuo!%W|`44Rgw zoB6&otN1mG9|8MPN{n^>oQk-ys8?O$255|o-*=h^BorDWA6kSJrpvf8mzc)(>-gm! 
z3<}vgC3ySS;&CJng)Imd(5gWPYN_L^U=48cK2d5F0JuD0&g>V66K{;Z=Ro#pCWrsb?n z7Z0eo)7+3IQPfbABS%|1k0q*ktO!7?GZ;czhc%VKgk`dXA6DJjY~Y~fhmi*u5kLWq z2Jt}AAaGE^IKnIMg)-oQc7Hw6ouK+6p}x9JKvP!j9kA2B81-`3+N$J~%M z)<+#p|J3!MZ^)wEdT?%l=5q(oDGlslm0a~f-<>wz1BLmpRp1`k3Akl#LoiGnq~dnd zx_)7^ym^yV2nRq-H=u;O_{Z^dK8cl!dpMq>{zy6Ep8m}~~IzvinX zy^>RUc`&6EggIkNQM{n|SL?^yg&v3ddNp!V3M_ z$j8u8$(#j+)HIg7E-d;)r7xRi0UhX)g_A|m+GgWJsU-l!&2H=dj{J>O{?YbP0Cl5d z(|u`F{_Xn>yKVq$zof{?3_p#^+D2Af&#`oD^zw;>L=BlR1l?OFr)%Dz(@fvHzI+Bf-)w@&* zuyb3F*PsS$X@>RamQ#F?z>Rum z&`POFY-nQa#LL{8>*m2>S~A^a#GAIiu#3dn<(}a$RccEB!K4C_&SLdQyqGZYp(=7lM{oYt!_KrU35PqV~a;fXiM41sMUi2UXUl_ zEU+#%3+!f?-*mu1l;HmkF3GD)2)zkA+^|0^Q@FQ3VibZaytD8ABwXY_emB0GL)aN- za8Q5zB#bE2*(oCC6&^4jbw+!d9e9Zs(^g$tch>Xgae;lBv30qAvG-*B$%(F~$71|W zqqq|mhmc*(I|e%tVJ_;sG5vSJ&T!DDtW-G2#2OA7i}2d0GCGgqPnSxF5OG3}>Ftd( zawq^gz%uH}kF|o5Z#Jlcz_=op(Om?cb{$^@SwXr#YKhXYZ z*% z_xfRP(yo%r&4HS3;-9A7QQX9_fi&OZzfEToLw#;b)&h#ZokzW2n*^#~|3G`t@j~Z4 zUIBX#%L&Klrrk}-JbsfBGIG)e><{o)7U!GY$Wz0=@QUgaL^$)PHDC1+kA47%nl;a* zHkH%6^=5oO=0!s*?3sqXUZd3a_q`>G#l70I@^ZvXZUN!~yu4w>!J~L*G*bSJPS4(k zQm{bjLLl1K?!=)ANQOZ%vg(codsJ0u*(ee}- z#Xn;QxH4BSnWwM8@QKI3^H{H;^`f4-h!ZAltUC{OKTGQV;~IHRcRUoxLN!!zqlMKh zWx`2Pxm%2{b_Jt)l3O}iYPGe}xG}RSR68~{TieK6zd$P{Ww*^7i!%Ob({k!tj3GkI zhZppC22?aV(kg%YaKkcfgFJ>a7;2&B8Mp=LXdKLV22^!zjFL1CP(*__4Cx8VEgRlG z&|ujp+`Vfb;WKSNG&JIYzK6gXRpFpqv#R@I6>!z}rQZ`iLtMhzQ5p^Bcrk()8+4CZ zA^dLaBNGL_L$X#$epVu=*w`NGN$bt}n?aA-s;L^2TxPwU(>6g?uDv=b`(JXFdXfMQ zg#tROtgG><893AnN!te6UaM-VeC zJ%D5Axq`G$kPbm+7pDUGs5H{1o969t>GEj>Y&1TIeLTrcSamA7vhnyi$r}lut&HpT zLsOHjy9+O@Cq-v;dOh6}H5KFre&h*u*Q^o z=5x{vuA?~!qw)L!kfe}H{RGGMd%OoEvBz-WDwlO8`@Vx`3#tSM>CQr-Q1Du|RIq-F zl;HXHXbep>!RoZk8nBB(Z8*w+~ z4iNh$&K-EvWH1l&x;DhBI$oTj2~hN7%By*0zaF?Meo-v&W+n`#5|W?1L1LVgm<6i(7B>>EE<_}2&7qaBR$H{pTr+v@ z@7^X393g(poePHKD;Ol5KXvWiJsnx@+osO$-iL#1mw_?0-3x>^6{r98c2le6EV8BK z_RYwlc0=SdrNP0c?E(XYrd`fQqjV-uv>BnmFz!-<<5f#g=L&X3PkhR(@2rvc$De9Pbiu9!KyaWO#~vJnhnV#`UVi`Vo}JD=4fD@h8zBID 
z?Bj&tXDQbP@?&#;s7F-@YxjL=824^--6V^_H>cD~CX6%V;g`&1?5CEFbQo4i6YRvsB=)0Ul2AG{Gpq!& z5qP!xdyM8@ zJXMoiH~ED!dGD?VNQ*W_^KCme?+Ax}w>fUJ(>fmB+--WdhOvy@Pjkr_21VfolwX4f zJE9f|7@dUjq8-#ey)#kk*$V=CjV|H})0`)CHE~7@99()Uij=I(D~fM3QaF#~Hq><3 zHd=^!VAzj^!9jKo1^#vFbwE>fWj&c@o|e(8;Bl(%r#7DMq;i|HB!_Pebbz&tTw+xXMOZehlAu=9OG;H_S0&PgtBx3 zC6^8=M^JK&BivTn7`+EQu8`k~-KVX>q!{3!5FpbFycihu2r2_ki_BqQt;NH@s0Bnx zG|A~I!cSgv1qYS>NoQ`s7KMXm2Vj?gSWHF13{UeNctbhhCjGXg1vqSxf2;^W*hj{r z4+o0~BWrfcm3w!N-QIQFzh$XIZyDHI;nkbzGQ{(l zwB6rmjqMeo>x>a;H7RekG*o2LCoR=fNAjNy>NE=5>OF5j^}U}07Sj-e=+^evAS&cO zcxfP5!nt|8aza=lB-{{R08+QrQ9Yg3cqwP)|6EAa#q9P|lW1b-iQ?i-pv{-fDag{4 z!ia9aQ5bH&I(g6mxqL~8fTGrPRynp9ktF+;#EA2Ub>*3=eg0*6H%RA}} zD%FEy)}qU)i9AcDuJt`v3o|W9)b-*a%>=ME=c-V#QW>jI$_pK-qBDTC9VQLc#Vo;t+=3B+hW8wAH z4_9&OI%Eos@o6p=X){mi?SBxGV)q+*qIrr_Tg?6xJ|;r)uf1*3!T5?xB~xb$`MGuX z4Yy&8L?Lg-I43nB1kkVKp1-I;d+uF?VMr+fmi4(pC#(2z?XI^CE z&6)lLFsRS9 zGjW5KsU^?w-N)CrSL-<9hH0vd-NJk|-`FJx`~$}^zv6$Hs`g}z02NteR6Kjv1Na7# z9Sc}JIpRNT{Jjk&Na4Y0jiKkq#+UN)+;&&6dN*YyayV^Idsg)Alj-fqWP70%n|W8Q_nJXF`K{>R;djWRdha)b;qau94U8X zmEd^8$KRt}@H|Fy+;PpI)t&^)&- zC|mq$q1+<3<1P25_)T+TiY_7Bn`REY)C$(Ld!GL$?pnPd!wxE=RB~mUAlF{C39qqi zqkDn>R-sL>KzF1ueR&C%xgo*Cdnc&GyYQUd92ivO>7a#A{lA6W=NyE^DhfW#TfGF@z{T+D4ZC6vL3gOX*Jnq1JNJ6D=q&l35 zHD&1|h&8335NoZCA!{nXm%~9RPu5zS(4fnS0oXt}F>Ig$o)uan7d8>mZ`dtlu$QF3 z)d2XjzkcuN-P3*Gs#pCHL%%^bkCUL7cg2Lsp7+-?chH`w(cQzQ>PMFR{Dhj3)dolr z&rQ^W;P3dlD!bV;k{LVoahS^aP`YfwG4W z@I$@9s@eaAL1L$9?4?jIiaVV(_Pv+6R0~9mKcv)W38UYpf7IjRisWjxvn;ypX|-Ee zn*Q5hyFyvXs|>Haaz|3g5~BiZ;--V|Yu&2PZ}bm13L+;wGi)?_0>o*RH;Q-FK2uho zRTOd5oXQRkuBgjIgm_&S8y6m3u{9U@17ER`;@=v6!LG1=BlOwXs;tBl4kGX|j|1CH zLdDNjOI@5#teuMdt)tXme?Imt6ff?utQt0ZN--Kt`R6zK23aRTao=Ns&-#~{5uJe6 z9=pQ3HGJPBYzs8Hq^3Hnx}Q66kOM9Ct^9^<$vV3PMX{=T+ABvKcgq+B{hQt;20%28 z{?nqtdcfel&rOx1dpwP+VCHchKUjEe*o>jn<`s!MzH+*c>W>oN;>F4;^GONJo0Bd$ z$TSz`bBb`)wQU~iBQzY^ooR#vcCNC2LY*<(qMfk5e?J4sTsoK#aa5I+ffje7Uc7 z&i($Uzw@G<-Z}1%I_7smLV5ysK^}4IE|mgVQjsnEa8O5}P=OtHb|{Tgz^v1EXKSAW 
z5Z(KJx-dM3inevVTPs+QBlwp8FUNGvyjBg2G*I{4WOV6s0`CX&0dF8>)MI0=u94+> z;;0t+xQ^+gSHv^LBT+%U0dBH^nlN*+-lV5YL;HFbY}*5GTo~yfRNQNz=c48MZ zT1YFU{zp8!dG%sA@X?Xym<6FUab!QN1HkRC>zgyQw3886mLWnQs%;%%<41)207ko4 zZ%=RaJDbzm+`8|#vn`w2>d$$O$q)^xvLCj=w~gRxgvR?{;Cl5|f58q4MYN7b5THFP zs~)|(MJP(o;2=zJMbR}>{;@6c1h(1TR*i7sXJcg#f?=Za*FgB{<+=yLmDP5;9pnb( zI2M3~B0ARxbSC^e!MW)No@Q`e`RXu&hAQ+He9nXT9H8GE>e(uA_7)%rmOX2W6jTnH zoxBQBcZmVgE#B(b%4c&c=~T03jQVkI!)0dFIb#O&1uw&Ohye=aqZbtPVSi(Y*mP&gl#%r zT`Q+B_d(zHiJvzU&4#p;PIwT`M1frr>(%FlWy6UkJS+y8@FuB zIL^zwZYr-*EBd;P)MS}FyeCx0kc-yAlPrS58BghtIr_^SC5E$taMlU8{5pM`FfRAs zbf(=EA{DHln=%1`u8|r+axT}PuX>hhA6pbd5zAzO2;roZ+JuAXf%8Fd^mD*=jd zS_6F?O$h%KxX1htW;J3jBi8o?4((agZ4vC9UiR$^0QRyB>w&V(<|b?P8b;Oh5Kr{* z%KYxE9*U^BnGDz~!$DGH;A8-5>IeK=vs@TH97F<)Jx6$}(Gn0#4IKGvgdJuc~y z$9^ObxyWn1Tv=}w=vX`V=+1^aE3kvaw%9i))zYgagl ztj8<=OdO1x8J1p}qj(ddSja7f4y6ksOIN;)@M5}Xw7dxY zO^eeVrm^th)3c7YYS*d^f{eR{k*Bmu%Ha9*_0mk9?1D_W1(lOk(p=v-$cSgEdHib!D$KZy)|*__LRy zr>6QHJh%7LaL`iJT}_A#_~`HqcCxl&E_ijc{W{cU`sfCiK#^WX>Euhhx?((OGzLn9E;Kxwf;&a>*k$jp}#?#qo14Zjqjczrsq8U~D$HUX0 z;~#wheC84Qmj?8|(>7Fp!~3ixzvWM@FxmKT4!I#GDrJV_NJV{&Z(B`tabE8lW|=h5 zHBAt2itbof5PPcJW)~Lic(KfJ9W^Of^bjYmb{6%4C1olioz&r>U@pEN4r*Rc2CLoD z*~J{x0EFO=QUIZUs$q!Ot`eTj&))Lo(t+CUBsyEO#kc*mB_Q90qaW;+-PdPZv{yYZ z!K$6jm*x%UM}A&I-~9dLXV)GiG(bb0<(7RZRx=^R4%JDy$ z!nE@<8b-$&?+4?wF~#tbaV3lTii3lG__HjGcKL=9PEU+mT)r_@_G{F&``i&*h5xdD z`4XQu1c?z)R=@4BDRd;-AM>{}jb4wJ!;^e3L)EPL729%dla=b_p7D1^CT%@m-teiU zr1`84S^u8wUV9$aNzSP7tJ_jRQN}pGIsP6RD4HKcM91X9n4%9Q|Me!!Zb!Q1Z#g*Y!xZ2SgvS+|jRjm2MD2&}x746Jvf28P?< zAj=^2$F%y(^C~ste&_7Y75dmg{*imHVvjF(4t*Yv3XQlasFC6>| z4F*Yo-e)p}7UaH+5@YDO#T&9^G z*#vOVRkI|VJ9q`*p3%tdyL7!X<%jr_38C0m#8@&+{Tt zYc)V<;^%>2;@kcnpwJBA)anWw8Ci;Y;0(2ccn)LBy)uoC&8d|`V#=^`3GDrTwz>{0 z@$m>MymeD+tZ&kSC~_66U2x!c^Kn3ukVH?Opse@3)ntwbKugY-*Gp)YPjf(lIb1^fdCTzFnPNqBTt zJk?{-({*{LsG;MohRf@}k&%tlO4CQp#~N9^t?W$xw&o*-ylNW{K>iJvl=QZybRfiU zJ)2}tnACwYXVs=^7&%GBi7~H!>6gBTSwZ>ki|t7m^W4z4glCSmVi~}K@0^nXHfsQw 
zTW9yJu-}PDKhpzl#Ci)#7io#r`mz*`(0sZ_%rYwC%O(-sc52+P@?eCLi7*x(BaXq> zj-V6!oEQ)rB4PgZ_g8|?Lq5Les!D0$!KnKB{v|p@Tn`hH(YIXH(=&_oMAMqj!V(hN z{|rb{b?f{q%m8vto3ljEF2noWhOj_xXBXZ&tav+?)N4IpkVRpqXe#Hlxls zWNdlKqe?k5E>1gQB2R5Ed{Z$!Jl@Q9xA@Tk}3t&7^Vg(A+t+(#c}Lb z9;Z%0HI;sC*egx8@5t}7SX#o(3{ds&nx43L%IZ<#k)Gu2mgnA>`K`Iu9UfPb;ykkc zak>|Bl8>u#@Xdewp;?1kFon)DP&5+aNmt?fCemWoA}8YYA1*JowCR+I%@@s`S~BOJ zCit}Jyg!A;>yyz>JGTDtOeTxY9qcgod}D5O3N;!!r_y+BNgYZJ%iyd$KMa2LdT^rP ziw)o5s9|qaa@FqXkCycNkL6;5XLwdgBa^W=6zcc-y;aiH&sJet8)SdQd;nK?i<%^e z_?aR^B1Qxny1RSxc-r_ncsaPaD!SUbar?VC$9@`e?BXSUeG>tDKfW^4RY65HMBjmm zY5Ief&RUezpixdMGi3rOOGE$eJ-N+0+#pal-<-~Fn0!geDgT0zO;hhx*>Pg-Y?qX2 zZa2DR5Z^mp>JYXBq2o;%KwvT~X`yg?9{1Dhn5&vKdu+>6zH3h%xm#>on_S_y0t~PB z!B8u}?FV^NPFPxl<~LmnB^Re18NmL=AT^)O;`>KhLAC^N-acyANA#X| zt6y3zCiF6BplzhvjrR9m>UZPefyrA70$)E;+NZ>U}jdMP#xMK~x=hGq{ zMpxf6BGwZ(4Xn0}&@gMlborxRPW6l-ENa05vwrM#Y)XavJ*P;0Co)@D*_d?~=c2Lg zb9umx^`$4~fS%ClMC_XW>sH5A79(lBFfWZVXQeYkm6^i^Viv+@Op9TPU(ns`2lC!) zG^&3hDc@NVR+;-Ial4O%S;pnLp)Yc$7}m;%9+z9S)n05d$XZg2!;sP&pqy$H5ao+2 z(hT@)grXXMM8UVY6r*LJXxPex#oN*?K`l2Sax!>b`=$?oZko(S z7Fy8-{*nE8>T2iNVnNebD50L@(!+UHzWF$H{j?GjW-!0EOQIdLmbsH;Px?^Ua%j&Z zM3KK9jkUC$hI7nKe@_Jh!Qm(%4P_)`F2tVt-vPXkhfz%=4v6ZQ2(mdk!~&fGVl;FfjjJH@vP1CMNg+Yw&yle>A&EiE zkn+H%5Yu^T2<<#BG65IloRbL=(S->)o5qKLCn%B0xDmIAAg&W)$i@PHXVnDoA-^Y{ zBOeI=C&Yq~P4XaHOa2SmBB1`Of593A{H5?Oh&siC+^+I3sEvSt#=l?=0($BF3!WpO zkjdZBbpjjma+(Ke58^Scjx+%2m}W+VoMHYoEJ-tz$fLH9GFy6xARy0QH2<#r#O?3O zaWgNFvE2X8-px@$@aHIz2|OU2a|X!no{-2nRpj=-zYFm=|L)G;O?!hNNE1?jXNQCT zF5R7>gwQQeB27TV7oM|V{$~&ALH|U$fk0&c^c#`udk^`&@EWNGLb&)6sTnfX&HZ=J z+!iR2>slZm7pNf_)0mL)MM`Ak9?0<`Go+^nAL}0qtB{gKChEUc5`^HsLk58e{wYM{ zYWpDEON>;1pQMNi+5fA89K?Bv$?Shh|9@z#4*lQHg8#Xy|CWLX>QqlaApC!tBXTUi WA;n9SxEvr35G@V}Hh)da_LL} diff --git a/docs/images/sash_workflow_overview_diagram_qc/Slide1.png b/docs/images/sash_workflow_overview_diagram_qc/Slide1.png new file mode 100644 index 0000000000000000000000000000000000000000..7f1e437467a5291acdf58118eff93b49f343db06 GIT binary patch 
literal 42808 zcmeGE^Lrjq+xHD8X>8lJ8{2HqIB9I#ww=bd8ryc#*lKJx#ydUF>w50{{R^I7o_6at zbIdl!I%cg|+P|Og-U!8Sk_fQ4upl5H2+~qw${-+Mzd=AiQ=q|sD;iQt5x@b|Nm)_^ zxd$jS z%mM`bf9}x&j{p8}0RMn*|IhJD9_armF%Rs2?*{vw_vL@jK~w&HjbJW87dXJ!OKCa* zkMix`KhS6vkaG|aArNUXVHJ1K^K6LD)XiDW_yR%<7!98WI9{V(P}!|Q5HAa?zZ2jkE}qT zKQS>hG}wO*X%RwVY!Pra!a~T%;NYPDIRv602Z(Nc`9GJK2|!V}~AoZ`WF z%J(9v1P1M9Duo>P>P-^S(6VX9AMFb($Kxqsg%UV-6b%2{S=|u=K(M$z z-XHZKiodyouHtdpR%z4)hQ<=`Hd`$-GBuWE{EkG3nr76f-RTR0n#tkM= zpPmg4N(KhXXph}-+T>gy7LoDQ3KJb&r9!2=W!rb%>SE2ROgf2@=6gwSC6eIVZ(W=* ztp9BfIdP%pp$xFA!3eaGI3gmJ0)(b%X- zz8>3>MF=4T1))-STYfRFHmfW3rsCbI^W=_{(IubGA9i0}PvuZE6uh2)FC0jp-9Fs9 z7@z{DHizp*bGqI9lotp8IgzJ437qowl3SZ-87b=e@5xvZGZNSXa&xL3l7ZKAU}x!; z$MG==*?j~w=EKMD@9fGe?Jjgy?`TxtbUNKxT+dfd=87woi)H3ZJHX^>D5MdV4To_Ty=*%c~4|saktvExW&5xp+K*OtQoE;(DGP zGe^+RH=D<06o<=RKb7$ZA^}gM!9ZvX9%qh#50BH~n$>m*`$I|MN1|T{kvr&ukcj2!xh1A#%8@a zo7as|CYhQWC^3m@Mv}+}o$4FDe3sbf$9sp%nS|kPf5=zMh2JfRgnVV*rg-e;isi}_ zDK%``&DIeoP{#fc_UNn>vOnLR_Tp^D<-V~NYP<37UIp}E^GL#<^7`(M_fyDpAOx!X zr`Hq1qX;=W@rAdcKTXgFeBmyU&*U%~iEWaAtuq4JEF5}R8+eZAoggG+vgmZ$%}Vu} ztM1q7v>vEmWh^W=J2@c{@Vn*J!)BmR4RNgsCo{eFaHy`tf&J4;ma=H6vE0NIJ=M zGL7Y6B6a!s0UQxQq{sKY-Swhk{A@DU?*qr;i!yMN5Y8;E`s zJ{T|10|Y%+x)=hMgQ6&x0$PPNuo zRS>1ngN{TF`!V|OC32naH^{j0kZGF<-!$HrfBjewJi6T6j(#cd1GudT{u zw?M6r756xk{iD%&H_h8(yEzVY#t>Kx-EQwT)2WQTk+>uwgfHvjQJ77?OtSY-2^L`P z#ijA6!;s^S1w6A=E7kHZK%g3hq3xHdxpMD@u=L9MF-Jj87RtY8a@zfp_6We)R0AOw zTZ8rwqyB*m-p>sdZA92uk#;a3d=B+DYlWD{<945^=gkr5{G<4C5Bm*@rWH#_$|EsP z7cJY0e?PAWkVc_^zJ$Z$Tt*udBZOrd*=4^AmO9F!fddJTeLcmw&CENU$*GW%&SDhG zXf0J+$S}~JBh7*f7W=+@%C~9 za|g^bI}YHJbN1$Y;j^M(XkT8%9nSOc(!>=p%07r0*UhO^-#Z)@nMPFO6eFW6)yP zp<}i|{)6@oYbX{74s-tl3Wn;07>|TR7%C+$1o5z8Mu4nr7?^pP#W#EWh(1xFrntLJ zV&vio`F-DRB;$pViq1U4a5i;7$eGJm7DQw57u;I=oaUK4c`SCv!5BRYf|G35yu5vv zeK-m0>aJSp^4&QPpWN5#^a4${u5tE}t$=SPPNzMQEu1!`1~+1nYpkl?4wdw zN^4;(ZO7n#WtG9(my9FK*3TmcR-B?JZmFFq|C&&7P;*q3WVsTv62GVp4yNoh^u-Swe=#q%VP48sj8lWq@TfFhUT%Kx!?M zcp~Ouel55*Gn5!JP}=IF;K2Y#H->;Rxo1G9>n3SvAb-sL0UA1Qq5OBf2I 
zY{yRt>P-OV@TiXbPI4J+S2{9QgHd=N;jsIwN_C~6K#^6Zk;Ri5C6c4o+6|0pO+y_FyDMM26HG{H_>_(1AML7}vifM(~ zX$zkK%7(R1 z!YB4aa=V9A@mf5)uVFH*&U6BrJYaz`c3Yk4R8l?1h$Z$1Ull+XdC!+mzk(oRmX)~9 zryq?D&SWSu%y>RtPQhf*@`0*(EtNSVkAQe%demW5$|-lQkK$|$<;Q2IbnZeE8UMk~ z(*uEe?3tRl6=PxwgU=je1S*55g8mk1zxf-DRxqDb zJd)V2EcRyF0LMTST#($F!kkwqX^JemNJCjsMB_0RpAcS93R3O2aZRLF8f(REkw-g6 zWmwijS0iAIspZa7d>Y;}i8`{HzU zsX!>W)#-Zhv#C;zKCRM~K##}Ib@*vEQ>{j^i>uY`aY0a$kpF&|xXEb#MY;z`u+|LA z?Q^Xy=B+JhB$pqtc5?qOb z?}KN5xBbIpRfN%W7W9J~WtZLZLkiq>D{bZJupQDV$yd!K33!D!Ip9@S@mY7joYGx< z$Q>sGt9(NNY2SK3^{~GDBwnV09s3kA8lpr2@>#nIND@Vt(L4^~Ay>4~tP%B2X$%%O zDa2-MLLG4vz+O|me|I3#BUGvbcc$Gn1|gD%+~7p+2}?=%ia=aRgY=`qbb(jTxw z3K&=ri1pX>vFnmQqOloW_PpP}-3-$8V?>`sYL8OL9ILE-=Nl(6smtGNS2Fsh@i)6X zAcRzwCaup%p{=)&P)NJ?Nuej)q*wC7z?ji2cE_Z|>U8r^Qs!@$*CX!hA35B+zEFg- zY?>1}oGs1=q}!;0)|o0XcPiB=A3ytPLJ2WsE1u4k&VsUE&rOb4dP2#Ve8Q}}E~kTQ zcvIk+3~9)VNLkDNa)&a-ei_)V2*Sr`A@g{{Pd2N52;KW@9}ve%g^ zAME5AUHzAslXmMk$CvuK7oZSxp%}KLsDvT!71z-sN+ z42OwfF8Eobf9Fd{^w&3Bn%*r~Ls@ru&gmIZhZIgbUAPCG_QsMz1PVFkXlO>IJNMmM zL(|74dA+|dzE$3qoR%*JU8j#%o9_jSuNNK8f=+KMQ)lXxi&pmz16ot-E?FFhj~#Bd zR0`>*%S4_|tB;j4-6w)iCrKmWM?}6isz)4Crq%1s_Akk@M>DE+zn_vQuNc;vi7XCZ z7U`a&Fkh=Y2GXqpebSAFWAA)Q6|ihKTWsDlF7TEaGC89<%oGT?K4(r!Yy_V>P4q!?k)Xi^3<5Zk)O{`!>0cWqZl`omN>u1H)FBnQ{jynR4%OeoS5TLYPr zT4F^s7?l=x8P77*=w>+C2<3QG^=IdILJSCHvaVjSZzJhq9-zPNmuvL&SbCvnWKzIR z>%}5acaZ7S=!=m1aanM_Ku3aJ*`MWMa2b%11VW$?IkSY`x4WGESsy{0O(#!|g-!Q| z;06Pq$#(hrE9}j5)wS0yy(L>ZCa3<|^nEgD_^^NAAiET!xWjN8L=b_T$`=kiNC@*t zRD9u+o;zOYsjF6bC}R)q$m8TPvtYMZ49wm;dK}W2b5g6*pu#3k1tvL|Hh|rX#D@8| z##t(e778bi*RArOj7a@2G{bt;VJgY|w=p{w7dbFqmI_QvVXlYv8(DYG_xN#TR`N+i zw0rbKMeHu>3bd(Add-YPcWz@?$9Tsngs|9^qWd6P|XcgW`rP9cbIbtj`{*e38h_xX8RHoG9 z&_5_`r`w}eqP*C8vv;*s4^t=FgiU5TNlLLT>{IXlaL;qa$)wsnmnFSte&1EXfb`My zx|PjqmCES-cAP&~qEfTmDiO2;Gh>xoZ!*E!%8bWCarZ%{&6bhJWGrTsJ{pfeH4ujM zc|Wf(JCh`V)7u`eK4gX8zKM(!{tRgmq{>^#9*iVKBDkA2)H>#GD~v-E83EPfXpIn# z72O&9U;u9r-)9sZQu9F3Qg$4a`m)-G*RJim5T0^+R23}#(oY&g6$KIS)Dum@r>Vv@ 
zDLTm_utH|05$G|jun)Q5Sixt+l?kJWappg}@rPohT9{ux8l`h4O@(@9L3YC!9Q}ZA zS1#%>MYyu>Cxph=?lL*EGW#@QMG@wABa%Q+UB^R|pHRjHO(OatuH?C~ci!p!8j(&B zKK`YX`I+vBd0jjCg%4lAdr9;#^uc%3gFNF{!_8%ezKL()C%;Q=z42t}Szp6xHCan% z8+ufI)*Fu*W=kVlx4y85?||s|GQXi}gwYhYn952{KvJfR)yPsHYq_wXu3T#@X@6$L zZ;b65b%oba+LB|Bvopc2eKVn3EDD9k2~F-6jbCp&btvA5Z$$A}k!w-o>r_xzeus7^>qkO5+ zp(+}!7fsv=A1y35NuS8fOKQNz*f5rb5;MdyC$Zja^9-sSVi2wr&9a6d;DRAhP@TC& zkK}9`4x`=?p)#To;fB>y55)I04D9k`3VX{TuA2qo^fK_}$z~8^Gw~;80hu~0By#IgA^J$1P~`C z?}zovQtcnO94H0veVCN%8>cP;5Va&}hVUb*Ii-I?G3X_VCd=>ArxmTqo*Cts@Ase3{m!leK;0{w#%-V<7=ug_%eAb24H7O zJcosZ*d59kbq6_5YSbOs@an&Zoi6aqgAzcs$gz|&XB57Q3eZ2eLiQbr*Ylgl6-LLwU$X!TQA zR9=fc_qZit^j^rLL>&gM?vLx$em^TIqfUznjZW`$nh#9~dNulJ_1@L7uU>{tT>cFD zN|9b~l6!k3m5kHZvgKL1vrh5q^Df>JAXs?at1LP_Wwk6|Hw%69OS$^d5rZ0?TPAr% zi+zxK-fHw(oAX!CFXP83HfY3cZ^OrCZBFQnz_Nhr5x&9J$vCNZSLVxGZ6kP*!inBN zxPkhJ-3U(Ma?}K+1yjeCjK9hh5+Gs}hgwAl^hJ#oge;=mJV^Irkw`q^%kI?fyDYJV zF=iJ5FStU;IL{t|NE0B#ZcH}>r>4J0>=DsLDG1DjzJNkv%Ds}OG5CjI9uuRX_EVrS z6hNvO=|Z-T+2MN}oYnM5Feua-sodn7j>v-Bdk$fIu?U(a3{d8`igKTFhVtQ=1caF5@B(v21WZtu%+`=IT+V^lNT!rX73_qqfEh= zf<-f_C`|}w+a>7lD<_QTYg^pW!(RU~rz>nPuE^tLqjlfN`|@>AyD(#Vgr;Pc0yWf99k^H_uA=|JXX`DuuW8aq+i; zhVc91ABA;aJz-GB5~e4tI*1B5rhd4dY)#-W(Q)P!#@)}Va_IO#F_-5(zZ#(b^l9F+i? 
zbSiJ=HzRsQM5Dz#srC-PaWY5iQ2+7WjG$zn=zP0NCh^K7e1czU+U$ZzuD|vsLFPiE zP~K&QOK+yLVm!Y^5E$khBzGm|%xA@;5QU9Bj0u^;9Aa5HCV?Lolj^YQK?8*F`+D7| zdg^fZI{i+zTua9Z=Luxhl?ORg9l-&3PD`0kg2-=Gjkhvqx@58RKJ#hvQ+L-{ z)!#w|7K64i?bvQmS)=z1NR-K@V8Q`Dl9BYygaT4{a&Ml8h;>7XbHYB6eK}16x8Rm{ z$eChp|}n=iJRbi>z`A z_V#)$$x`F)EC{`R5!4rH;$u%Xl<{pe}rsmRLR+^*Doukxy=(AE&0@` zD<5@e4VAuJ(%#*#seWvlix`~|-|RfoQHOEeq7hAkqNLjHmYgbkc!dbQ4j+W2;A%t}uwYp6Vn|`HR>o^-8EtV`R3_~)R zNS`1yF%pP0+%pM3ftt~6^K{Sj*j(7CXMj1AE~#2|jur2R+XP^TUdDn?uIFHT134FW z_&jV3_i!Vg@b$_ASjhay$G24OO#b55Iv8u7;R)pVINeUipi1O1NMq2W~W|n9oI>^vR(0LuVd9jiuQ{9wnKs+&#x?F%E!%T@&PZEEHYoN{@G2+BWq4A^#j0I_R%17 zp=_^V1ZcBv2S*k8wVA}V*t5eTLQ6F*dpj@xcAE&U|EjDTXv5sYQDfbwU$N17X zW@$-hv-fb1p$gear|90YT*&h2HLn(cj)<=4c6Tig==$oDDO^SOK)iQ{u#yPi=YuO& zWQO=Z*|$boyWj0T`6%W(r`N0b)&x`vJ==NQ>|Sr1Vd-a`sn1e@`9WZ}&Jh5$y8a5T z+vSbHl6Y+Hb5RE^80h~4eE-N23%88~5YW4$f#RNS=TWx6L9M{Wc4~*k906gh7YP{| zA&iKR+&~GLnV7gk1zuljE*05+jQv;zFtqYV@&r>^*{{Ysy5Jp@o*3(Af=rJ14Piz`(yuP`w_2{D)?m*>iysxWap zzz@mdb4u3(VEv%Odw2WUj+|w%WMq|qS}~&q z0KtqzB>LDSW^gmWx@QRFV+rJ2x$JHLsL>>u(0%FSR3KqyMMokUJ`*yVYyVLsi+7* zm|_vXvdHi9G9>@{O1M8XKH_ddW}yEJP?xZVIO9Kq5r9m**6LuCOs({P{c_j~1R>yY#D4&!27pHfn>_$>JOFs${^Q9!RExao&RUZdK)}|TZPweI zjug4~0YnZcyF@?$ie-I7tg9CP6SksNTA-a+H1JVJ-$Bb>#ELq`oi1Nrj7JXMk9!15R>e) zg_7z7<%;=%H<%~FL6GIz08VLXrAQ6*hXB*8H#R_Q9);hTz+}*}UahC_=QZE_N?7M& z3Ub7JD&YOZ?9Y4L^TX*%m3_8|)${%s;DBs;^?6K+c|u3i+1p-MT>!#L4AKguc)eI} zk!E`wi6zJ!ED(>n-0o56e@f@HBV@wo2I?ftF9fzJak18ie_{bKap66~fZn-c z*)|MrN8=ZhDhdmrUjd`8^Gl$AE{T3(aFfTKDMnJ*m?HqD+A&JJ2O&a~(pgPZe9xAu zRb&715VisEBOwp=yX*lF8Av2i1RRJTELUqUr`52dZHQc2tT$f(kMZ|M>RMCzu6<5A z(CBmzu$c5^IK($MBSe+nfBCJJ=zSiivw8C*ecK!loJ+{TLA{`ZIT1!)uX=g_ecQ_% z+_>9oLgq6>gdJbt^?bt%W1F>H5I~)(Y~}#oqR$Ybizk;xb$J9j%Z4~p1tWtstokn< zC1tppY*G-A zhMDOu^;!C)@f?1y1>@&cv)uCFk8fFAEW_$9ZPu%-sDPeN6-Whh4hN0Xz7OnjR-T7V zD%Gl0D8V9S7AjO_vF2geL1}$%_ocJTQJ-|3&sSzi=bv2}Xe(sZ)0NaPbcs!<14cjI zUuRq6^{XGId9UJkexBCL+{O48xQOVCO9CDfbg+5`wJmVr-X3IT*_~c6Fl2XYVqFcg 
literal 0 HcmV?d00001 diff --git a/docs/images/~$sash_workflow_overview_diagram_Vqc.pptx b/docs/images/~$sash_workflow_overview_diagram_Vqc.pptx deleted file mode 100644 index bb00bd0f8570becd198e970ef3182994a2bc6f87..0000000000000000000000000000000000000000 GIT binary patch literal 0 HcmV?d00001 literal 165 zcmWd#EKSWT$;?x5&Pl8+E>6v3P#^{fFa$D`GNdx(F_bW50!al1XNDYxM21R+Vju*G ID^L*v06NkZ4FCWD From 75433b8f5a519592882b022128702c4e26a49e11 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Fri, 21 Mar 2025 16:36:43 +1100 Subject: [PATCH 03/36] Add info readme --- README.md | 108 +++++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 94 insertions(+), 14 deletions(-) diff --git a/README.md b/README.md index 5e0e83c2..f8589ae8
100644 --- a/README.md +++ b/README.md @@ -1,24 +1,104 @@ -# sash +# nf-core/sash -**scwatts/sash** is the UMCCR post-processing WGS workflow. +sash is the UMCCR post-processing WGS workflow. The workflow takes DRAGEN small variant calls and oncoanalyser results as input to perform annotation, prioritisation, rescue and filtering, and reporting for the WGS variant data. Additionally, sash runs several sensors for biomarker assessment and genomic characterisation including HRD status, mutational signatures, purity/ploidy, MSI, and TMB. -## Table of contents +While the sash workflow utilises a range of tools and software, it is most closely coupled with bolt, a Python package that implements the UMCCR post-processing logic and supporting functionality. -* [Requirements](#requirements) -* [Usage](#usage) +The **Sash** pipeline has three main workflows: -## Requirements +1. **Somatic Small Variants (SNV somatic)** + - Integrates DRAGEN calls with [**SAGE**](https://github.com/hartwigmedical/hmftools/tree/master/sage) to rescue mutations at known hotspots. + - Annotates and filters variants using the **PCGR** framework to classify them into tiers (ACMG guidelines). + - Produces a comprehensive HTML report of clinically relevant mutations, mutation burden (TMB), MSI status, and more. -* Java -* Nextflow ≥22.10.6 -* Docker +2. **Somatic Structural Variants (SV somatic)** + - Integrates outputs from GRIDSS2 with **PURPLE** (for purity, ploidy, and CNVs). + - Annotates breakpoints (SnpEff) and prioritizes events (known oncogenic fusions, copy losses/gains) using panel-of-normals and known fusion references. -## Usage +3. **Germline Variants (SNV germline)** + - Filters germline calls from DRAGEN to a known cancer predisposition gene list. + - Annotates with **CPSR** for pathogenicity classification, generating an HTML report with ClinVar and ACMG-based interpretations (e.g., likely pathogenic, uncertain significance).
+ +Reports + +--- +## Sample sheet input + +```csv +id,subject_name,sample_name,filetype,filepath +subject_a.example,subject_a,sample_germline,dragen_germline_dir,/path/to/dragen_germline/ +subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_somatic/ +subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/ +``` + +--- +## Quick Start ```bash nextflow run scwatts/sash \ - -profile docker \ - --input samplesheet.csv \ - --ref_data_path reference_data/ \ - --outdir output/ + -profile docker \ + --input samplesheet.csv \ + --ref_data_path /path/to/reference_data/ \ + --outdir output/ ``` + +- `--input` specifies a CSV file listing your tumor/normal samples and any pre-existing Oncoanalyser outputs. +- `--ref_data_path` points to a directory containing reference resources (genome FASTA, PCGR/CPSR data bundle, hotspot lists, etc.). +- `-profile docker` runs the pipeline with Docker. Use `singularity` or `conda` if Docker is not available. + +Results are organized into subfolders for SNVs, SVs, germline calls, and final HTML reports (PCGR, CPSR). A `MultiQC` report aggregates quality metrics. + +--- + +## Installation + +For installation instructions, **[please see our tutorial page](https://nf-co.re/usage/installation)**. +You will need: +- **Nextflow** (≥22.10.0) +- A container engine (e.g., **Docker** or **Singularity**) or a Conda environment +- **Java 11 or later** for running Nextflow + +Confirm that all dependencies are installed and that the reference data has downloaded correctly before proceeding. Incorrect reference files or mismatched genome builds (e.g., b37 vs GRCh38) are a common source of errors [@nextflow_docs]. + +--- + +## Documentation + +- **[Usage Instructions](docs/usage.md)**: Detailed parameters, sample sheet format, and output descriptions.
+- **Oncoanalyser**: [github.com/nf-core/oncoanalyser](https://github.com/nf-core/oncoanalyser) + + +--- + +## Pipeline Steps + +Below is a simplified overview of the main pipeline stages (each stage may have multiple processes): + +1. **Somatic SNV** + - Merge DRAGEN VCF & SAGE VCF → Annotate with PCGR → Filter → HTML report +2. **Somatic SV** + - Integrate structural calls → PURPLE for CNVs/purity → Annotate (SnpEff) → Filter → Prioritize → Summaries in MultiQC +3. **Germline** + - Filter by known predisposition genes → CPSR classification → Germline report (HTML/TSV) + +The pipeline concludes with a final MultiQC run to aggregate logs and QC. + + +## Contributions & Support + +Contributions are welcome. For issues or feature requests: +1. Check [open issues on GitHub](https://github.com/nf-core/sash/issues) +2. If the issue is new, submit a detailed report including logs and the sample sheet. + +For user support, join the **nf-core Slack** community. Before reporting a pipeline bug, verify your environment and the integrity of your reference data.
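Before filing a report, a quick sanity check of the sample sheet can rule out input problems. The following is a minimal sketch, assuming the five-column header and the three `filetype` values shown in the sample sheet example; `check_samplesheet` is a hypothetical helper name and is not part of sash.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sash): sanity-check a sample sheet.
# Assumes the header and filetype values from the README example.
check_samplesheet() {
  local sheet="$1"
  awk -F',' '
    NR == 1 {
      # The first line must match the documented header exactly.
      if ($0 != "id,subject_name,sample_name,filetype,filepath") {
        print "unexpected header: " $0; bad = 1
      }
      next
    }
    # Every data row needs exactly five comma-separated fields.
    NF != 5 { print "line " NR ": expected 5 fields, got " NF; bad = 1; next }
    # Only the filetype values shown in the example are accepted here.
    $4 != "dragen_germline_dir" && $4 != "dragen_somatic_dir" && $4 != "oncoanalyser_dir" {
      print "line " NR ": unknown filetype " $4; bad = 1
    }
    END { exit bad }
  ' "$sheet"
}
```

Calling `check_samplesheet samplesheet.csv` returns a non-zero exit status and prints the offending lines when something looks wrong. It does not check that the listed paths exist, and sash itself may accept inputs this sketch rejects.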
+ +--- + +## Citation + +If you use **nf-core/sash** for your analysis, please cite: + +- **Nextflow**: [doi:10.1038/nbt.3820](https://doi.org/10.1038/nbt.3820) +- **nf-core**: [doi:10.1038/s41587-020-0439-x](https://doi.org/10.1038/s41587-020-0439-x) +- **PCGR**: [doi:10.1186/s12859-019-3220-4](https://doi.org/10.1186/s12859-019-3220-4) +- **Hartwig WiGiTS** (SAGE, PURPLE, LINX): [@hartwigmedicalfoundation_hmftools](https://github.com/hartwigmedical/hmftools) \ No newline at end of file From f2ee7881d69bebeb8d24fff58046e0b77e640db5 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Fri, 21 Mar 2025 16:59:51 +1100 Subject: [PATCH 04/36] add output.md file --- docs/output.md | 367 ++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 320 insertions(+), 47 deletions(-) diff --git a/docs/output.md b/docs/output.md index 27a5dacd..ded9c31f 100644 --- a/docs/output.md +++ b/docs/output.md @@ -1,5 +1,79 @@ -# Sash Workflow Overview - +# Sash Output + +## Introduction + +This document outlines the key results and files produced by the UMCCR SASH (post-processing WGS tumor/normal) pipeline. After a run, the pipeline organizes output files by analysis module under a directory for each tumor/normal pair (identified by run ID and sample names). The main outputs include annotated variant reports for somatic and germline variants, copy number and structural variant analyses, and a comprehensive MultiQC report for quality control. All paths below are relative to the top-level results directory of a given run. 
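As a quick way to confirm that a run produced its top-level reports, the HTML files can be listed from the results directory. This is a minimal sketch, assuming the `[RUN_ID]/[sample]/` layout this document describes; `list_reports` is a hypothetical helper name, not a sash command.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of sash): list HTML reports that sit
# directly under <results_dir>/<RUN_ID>/<sample>/.
list_reports() {
  local results_dir="$1"
  # Depth 3 corresponds to results_dir/RUN_ID/sample/report.html;
  # files nested deeper in module subdirectories are ignored.
  find "$results_dir" -maxdepth 3 -type f -name '*.html' | sort
}
```

For example, `list_reports results/` prints one path per line. An empty result suggests the reporting steps did not complete or that the directory layout differs from the one assumed here.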
+## Pipeline overview + +- [Sash Output](#sash-output) + - [Introduction](#introduction) + - [Pipeline overview](#pipeline-overview) + - [Directory Structure](#directory-structure) + - [Summary](#summary) + - [Workflows](#workflows) + - [Somatic Small Variants](#somatic-small-variants) + - [General](#general) + - [Summary](#summary-1) + - [Details](#details) + - [bolt smlv somatic rescue](#bolt-smlv-somatic-rescue) + - [BOLT\_SMLV\_SOMATIC\_ANNOTATE](#bolt_smlv_somatic_annotate) + - [BOLT\_SMLV\_SOMATIC\_FILTER](#bolt_smlv_somatic_filter) + - [SOMATIC\_SNV\_REPORTS](#somatic_snv_reports) + - [Somatic Structural Variants](#somatic-structural-variants) + - [General](#general-1) + - [Summary](#summary-2) + - [SV Annotation](#sv-annotation) + - [SV Prioritization](#sv-prioritization) + - [Germline Variants](#germline-variants) + - [General](#general-2) + - [Summary](#summary-3) + - [Germline Preparation](#germline-preparation) + - [Germline Reports](#germline-reports) + - [Reports](#reports) + - [Cancer Report](#cancer-report) + - [LINX Reports](#linx-reports) + - [PURPLE Reports](#purple-reports) + - [PCGR Reports](#pcgr-reports) + - [SIGRAP Reports](#sigrap-reports) + - [CPSR Reports](#cpsr-reports) + - [MultiQC Reports](#multiqc-reports) + +## Directory Structure + +```bash +[RUN_ID]/[sample]/ +├── .cancer_report.html +├── .cpsr.html +├── .pcgr.html +├── _linx.html +├── .multiqc.html +├── cancer_report/ +│ ├── img/ +│ └── cancer_report_tables/ +│ ├── hrd/ +│ ├── json/ +│ ├── purple/ +│ └── sigs/ +├── linx/ +│ ├── germline_annotations/ +│ ├── somatic_annotations/ +│ └── somatic_plots/ +├── multiqc_data/ +├── purple/ +├── smlv_germline/ +│ ├── prepare/ +│ └── report/ +├── smlv_somatic/ +│ ├── report/ +│ ├── annotate/ +│ ├── filter/ +│ └── rescue/ +└── sv_somatic/ + ├── annotate/ + └── prioritise/ +``` + ## Summary The **Sash Workflow** comprises three primary pipelines: **Somatic Small Variants**, **Somatic Structural Variants**, and **Germline Variants**.
These pipelines utilize **Bolt**, a Python package designed for modular processing, and leverage outputs from the **DRAGEN Variant Caller** alongside **HMFtools in Oncoanalyser**. Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation, and HTML reports for research and curation. @@ -10,7 +84,7 @@ The **Sash Workflow** comprises three primary pipelines: **Somatic Small Variant #### General -In the **Somatic Small Variants** workflow, variant detection is performed using the **DRAGEN Variant Caller** and **Oncoanalyser (SAGE, Purple)** outputs. It’s structured into four steps: **Rescue**, **Annotation**, **Filter**, and **Report**. The final outputs include an **HTML report** summarizing the results. +In the **Somatic Small Variants** workflow, variant detection is performed using the **DRAGEN Variant Caller** and **Oncoanalyser (SAGE, Purple)** outputs. It's structured into four steps: **Rescue**, **Annotation**, **Filter**, and **Report**. The final outputs include an **HTML report** summarizing the results. #### Summary @@ -21,107 +95,306 @@ In the **Somatic Small Variants** workflow, variant detection is performed using ### Details -#### BOLT_SMLV_SOMATIC_RESCUE +#### bolt smlv somatic rescue
Output files -- `output/` - - `output/${meta.tumor_id}.rescued.vcf.gz`: Rescued somatic VCF file. - - `output/${meta.tumor_id}.rescued.vcf.gz.tbi`: Index file for the rescued VCF. +- `smlv_somatic/rescue/` + - `.rescued.vcf.gz`: Rescued somatic VCF file containing previously filtered variants at known hotspots.
-The `BOLT_SMLV_SOMATIC_RESCUE` process rescues somatic variants using the BOLT tool. The output includes the rescued VCF file and its index. +The `BOLT_SMLV_SOMATIC_RESCUE` process rescues somatic variants using the BOLT tool. The output is the rescued VCF file, which recovers potentially important variants that may have been filtered out in earlier steps because of borderline quality metrics. #### BOLT_SMLV_SOMATIC_ANNOTATE
Output files -- `output/` - - `output/${meta.tumor_id}.annotations.vcf.gz`: Annotated somatic VCF file. - - `output/${meta.tumor_id}.annotations.vcf.gz.tbi`: Index file for the annotated VCF. +- `smlv_somatic/annotate/` + - `.annotations.vcf.gz`: Annotated somatic VCF file with functional and clinical annotations.
-The `BOLT_SMLV_SOMATIC_ANNOTATE` process annotates somatic variants using the BOLT tool. The output includes the annotated VCF file and its index. +The `BOLT_SMLV_SOMATIC_ANNOTATE` process annotates somatic variants using the BOLT tool. The output includes the annotated VCF file enriched with gene information, variant effect predictions, and other annotations to aid in variant interpretation. #### BOLT_SMLV_SOMATIC_FILTER
Output files -- `output/` - - `output/${meta.tumor_id}*pass.vcf.gz`: Filtered somatic VCF file. - - `output/${meta.tumor_id}*pass.vcf.gz.tbi`: Index file for the filtered VCF. - - `output/${meta.tumor_id}*filters_set.vcf.gz`: VCF file with filters set. +- `smlv_somatic/filter/` + - `.filters_set.vcf.gz`: VCF file with filters set but all variants retained. + - `.pass.vcf.gz`: Filtered somatic VCF file containing only PASS variants. + - `.pass.vcf.gz.tbi`: Index file for the filtered VCF.
-The `BOLT_SMLV_SOMATIC_FILTER` process filters somatic variants using the BOLT tool. The output includes the filtered VCF file, its index, and the VCF file with filters set. +The `BOLT_SMLV_SOMATIC_FILTER` process filters somatic variants using the BOLT tool. The output includes both a VCF with all variants but filter tags applied, and a filtered VCF containing only variants that pass all quality filters. -#### PAVE_SOMATIC +#### SOMATIC_SNV_REPORTS
Output files -- `output/` - - `output/${meta.tumor_id}.pave.vcf.gz`: PAVE somatic VCF file. - - `output/${meta.tumor_id}.pave.vcf.gz.tbi`: Index file for the PAVE VCF. +- `smlv_somatic/report/` + - `.somatic.bcftools_stats.txt`: Statistical summary of somatic variants. + - `.somatic.variant_counts_process.json`: Variant count metrics at each processing step. + - `.somatic.variant_counts_type.yaml`: Variant counts by variant type. + - `af_tumor.txt`: Information about variant allele frequencies in tumor. + - `af_tumor_keygenes.txt`: Variant allele frequencies in key cancer-related genes. + - `pcgr/`: Directory containing PCGR report outputs.
-The `PAVE_SOMATIC` process processes somatic variants using the PAVE tool. The output includes the PAVE VCF file and its index. +The reporting process generates statistical summaries and specialized reports for somatic SNVs, including PCGR HTML reports for clinical interpretation. + +### Somatic Structural Variants + +#### General + +The **Somatic Structural Variants** workflow identifies and analyzes large genomic rearrangements such as deletions, duplications, inversions, and translocations. It processes outputs from GRIDSS, PURPLE, and LINX to provide comprehensive SV analysis. + +#### Summary + +1. **Annotate** SVs with gene context and potential functional impacts. +2. **Prioritize** SVs based on cancer relevance and gene disruption potential. +3. **Report** clinically relevant SVs with gene fusion predictions and visualization. + +#### SV Annotation + +
+Output files -### Sash Module Outputs +- `sv_somatic/annotate/` + - `.annotated.vcf.gz`: Annotated structural variant VCF file. -**1. Somatic SNVs:** +
-- `smlv_somatic/filter/{tid}.pass.vcf.gz`: Contains somatic single nucleotide variants (SNVs) with filtering applied. +This process adds gene annotations and functional impact predictions to structural variants. The annotated VCF contains information about genes affected by breakpoints, potential fusion events, and other biologically relevant details. -**2. Somatic SVs:** +#### SV Prioritization -- `sv_somatic/prioritise/{tid}.sv.prioritised.vcf.gz`: Contains somatic structural variants (SVs) with prioritization applied. +
+Output files -**3. Somatic CNVs:** +- `sv_somatic/prioritise/` + - `.cnv.prioritised.tsv`: Prioritized copy number variations in tabular format. + - `.sv.prioritised.tsv`: Prioritized structural variants in tabular format. + - `.sv.prioritised.vcf.gz`: Prioritized structural variants in VCF format. -- `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som.tsv.gz`: Contains somatic copy number variations (CNVs) data. +
-**4. Somatic Gene CNVs:** +This process ranks structural variants based on their potential clinical relevance, creating filterable lists for review. It separately handles copy number variations and other structural variants for easier interpretation. -- `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som_gene.tsv.gz`: Contains gene-level somatic copy number variations (CNVs) data. +### Germline Variants -**5. Germline SNVs:** +#### General -- `dragen_germline_output/{nid}.hard-filtered.vcf.gz`: Contains germline single nucleotide variants (SNVs) with hard filtering applied. +The **Germline Variants** workflow analyzes inherited variants from the normal sample to identify potential cancer predisposition genes and variants that may influence treatment decisions. -**6. Purple Purity, Ploidy, MS Status:** +#### Summary -- `purple/{tid}.purple.purity.tsv`: Contains estimated tumor purity, ploidy, and microsatellite status. +1. **Prepare** germline variants from DRAGEN normal sample outputs. +2. **Report** potentially actionable germline variants through CPSR. -**7. PCGR JSON with TMB:** +#### Germline Preparation -- `smlv_somatic/report/pcgr/{tid}.pcgr_acmg.grch38.json.gz`: Contains PCGR annotations, including tumor mutational burden (TMB). +
+Output files -**8. DRAGEN HRD Score:** +- `smlv_germline/prepare/` + - `.prepared.vcf.gz`: Prepared germline VCF file for annotation. + +
+ +This process prepares germline variants for downstream annotation and reporting. It applies normalization, left-alignment, and other preprocessing steps to ensure consistent variant representation. + +#### Germline Reports + +
+Output files + +- `smlv_germline/report/` + - `.annotations.vcf.gz`: Annotated germline VCF. + - `.germline.bcftools_stats.txt`: Statistical summary of germline variants. + - `.germline.variant_counts_type.yaml`: Variant counts by type. + - `cpsr/`: Directory containing CPSR outputs. + - `.cpsr.grch38.json.gz`: Structured CPSR data. + - `.cpsr.grch38.pass.tsv.gz`: Filtered CPSR variants in tabular format. + - `.cpsr.grch38.snvs_indels.tiers.tsv`: Tiered variants by clinical significance. + - Other CPSR output files. + +
+ +The germline reporting process focuses on identifying variants in cancer predisposition genes and producing a comprehensive CPSR (Cancer Predisposition Sequencing Reporter) report. + +### Reports + +#### Cancer Report + +
+Output files + +- `.cancer_report.html`: Main cancer report HTML file. +- `cancer_report/` + - `.snvs.normalised.vcf.gz`: Normalized SNVs used in the report. + - `img/`: Images used in the cancer report. + - `cancer_report_tables/`: Tabular data supporting the report. + - `_-qc_summary.tsv.gz`: Quality control summary. + - `_-report_inputs.tsv.gz`: Report configuration inputs. + - `hrd/`: Homologous Recombination Deficiency analysis from multiple methods. + - `json/`: JSON-formatted report data. + - `purple/`: Copy number information. + - `sigs/`: Mutational signature analysis (SBS, DBS, indels). + +
-- `dragen_somatic_output/{tid}.hrdscore.tsv`: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis. +The cancer report integrates findings from all analysis modules into a comprehensive HTML report for clinical interpretation. It includes tumor characteristics, key somatic alterations, mutational signatures, and therapy recommendations. -### Pipeline information +#### LINX Reports
Output files -- `pipeline_info/` - - Reports generated by Nextflow: `execution_report.html`, `execution_timeline.html`, `execution_trace.txt` and `pipeline_dag.dot`/`pipeline_dag.svg`. - - Reports generated by the pipeline: `pipeline_report.html`, `pipeline_report.txt` and `software_versions.yml`. The `pipeline_report*` files will only be present if the `--email` / `--email_on_fail` parameter's are used when running the pipeline. - - Reformatted samplesheet files used as input to the pipeline: `samplesheet.valid.csv`. +- `_linx.html`: LINX visualization report. +- `linx/` + - `germline_annotations/`: Germline structural variant analysis. + - `.linx.germline.breakend.tsv`: Germline breakend annotations. + - `.linx.germline.clusters.tsv`: Germline SV clusters. + - `.linx.germline.disruption.tsv`: Gene disruptions by germline SVs. + - `.linx.germline.driver.catalog.tsv`: Potential driver germline SVs. + - `.linx.germline.links.tsv`: Links between germline SVs. + - `.linx.germline.svs.tsv`: Germline structural variants. + - `linx.version`: LINX version information. + - `somatic_annotations/`: Somatic structural variant analysis. + - `.linx.breakend.tsv`: Somatic breakend annotations. + - `.linx.clusters.tsv`: Somatic SV clusters. + - `.linx.driver.catalog.tsv`: Potential driver somatic SVs. + - `.linx.drivers.tsv`: Driver SV details. + - `.linx.fusion.tsv`: Gene fusion predictions. + - `.linx.links.tsv`: Links between somatic SVs. + - `.linx.svs.tsv`: Somatic structural variants. + - Visualization data files (vis_*). + - `linx.version`: LINX version information. + - `somatic_plots/`: Visualizations of somatic structural variants.
-[Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage. +LINX reports provide detailed analysis of structural variants, including gene fusions, disruptions, and visualization of complex rearrangements. The HTML visualization report offers interactive exploration of structural variants and their potential functional impacts. -## Conclusion +#### PURPLE Reports + +
+Output files + +- `purple/` + - `.purple.cnv.gene.tsv`: Gene-level copy number variations. + - `.purple.cnv.somatic.tsv`: Segment-level somatic copy number variations. + - `.purple.driver.catalog.germline.tsv`: Potential germline driver variants. + - `.purple.driver.catalog.somatic.tsv`: Potential somatic driver variants. + - `.purple.germline.deletion.tsv`: Germline deletion information. + - `.purple.purity.range.tsv`: Range of possible purity values. + - `.purple.purity.tsv`: Tumor purity, ploidy, and microsatellite status. + - `.purple.qc`: Quality control metrics. + - `.purple.segment.tsv`: Genomic segmentation data. + - `.purple.somatic.clonality.tsv`: Clonality analysis of somatic variants. + - `.purple.somatic.hist.tsv`: Somatic variant histograms. + - `.purple.somatic.vcf.gz`: Somatic variants VCF with copy number annotations. + - `.purple.sv.germline.vcf.gz`: Germline structural variants. + - `.purple.sv.vcf.gz`: Somatic structural variants. + - `circos/`: Circos visualization data files. + - `.ratio.circos`: Normal sample coverage ratio data. + - `.baf.circos`: B-allele frequency data. + - `.cnv.circos`: Copy number data. + - `.indel.circos`: Indel visualization data. + - `.link.circos`: SV links visualization data. + - `.snp.circos`: SNP visualization data. + - Configuration and input files for Circos. + - `plot/`: Additional visualization data. + - `purple.version`: PURPLE version information. + +
+ +PURPLE reports provide copy number analysis, tumor purity estimation, and whole genome doubling assessment. The circos directory contains data for generating circular genome plots that visualize genomic alterations across the entire genome. + +#### PCGR Reports + +
+Output files + +- `.pcgr.html`: PCGR HTML report. +- `smlv_somatic/report/pcgr/` + - `.pcgr_acmg.grch38.flexdb.html`: Flexible database PCGR report. + - `.pcgr_acmg.grch38.json.gz`: Structured PCGR data in JSON format. + - `.pcgr_acmg.grch38.mp_input.vcf.gz`: Input VCF for mutational pattern analysis. + - `.pcgr_acmg.grch38.mutational_signatures.tsv`: Mutational signature analysis. + - `.pcgr_acmg.grch38.pass.tsv.gz`: Filtered variants in tabular format. + - `.pcgr_acmg.grch38.pass.vcf.gz`: Filtered variants in VCF format. + - `.pcgr_acmg.grch38.snvs_indels.tiers.tsv`: Tiered variants by clinical significance. + - `.pcgr_acmg.grch38.vcf.gz`: All variants in VCF format. + - `.pcgr_config.rds`: PCGR configuration. + +
+ +PCGR (Personal Cancer Genome Reporter) reports provide clinical interpretation of somatic variants, including therapy matches, clinical trial eligibility, and tumor mutational burden assessment. + +#### SIGRAP Reports + +
+Output files + +- `cancer_report/cancer_report_tables/sigs/` + - `_-dbs.tsv.gz`: Double base substitution signature analysis. + - `_-indel.tsv.gz`: Indel signature analysis. + - `_-snv_2015.tsv.gz`: SNV signature analysis using 2015 signatures. + - `_-snv_2020.tsv.gz`: SNV signature analysis using 2020 signatures. +- `cancer_report/cancer_report_tables/json/sigs/`: JSON-formatted signature data. + +
+ +SIGRAP reports provide mutational signature analysis, identifying patterns associated with specific mutational processes or exposures. The pipeline analyzes single base substitutions (SBS), double base substitutions (DBS), and indel signatures using both the 2015 and 2020 reference signature sets. + +#### CPSR Reports + +
+Output files + +- `.cpsr.html`: CPSR HTML report. +- `smlv_germline/report/cpsr/` + - `.cpsr.grch38.custom_list.bed`: Custom gene list in BED format. + - `.cpsr.grch38.json.gz`: Structured CPSR data in JSON format. + - `.cpsr.grch38.pass.tsv.gz`: Filtered variants in tabular format. + - `.cpsr.grch38.pass.vcf.gz`: Filtered variants in VCF format. + - `.cpsr.grch38.snvs_indels.tiers.tsv`: Tiered variants by clinical significance. + - `.cpsr.grch38.vcf.gz`: All variants in VCF format. + - `.cpsr_config.rds`: CPSR configuration. + +
+ +CPSR (Cancer Predisposition Sequencing Reporter) focuses on germline variants in known cancer predisposition genes, providing a comprehensive report of inherited cancer risk variants. + +#### MultiQC Reports + +
+Output files + +- `.multiqc.html`: Main MultiQC report. +- `multiqc_data/`: Supporting data for the MultiQC report. + - `dragen_frag_len.txt`: Fragment length metrics. + - `dragen_map_metrics.txt`: Mapping metrics. + - `dragen_ploidy.txt`: Ploidy estimation metrics. + - `dragen_time_metrics.txt`: Processing time metrics. + - `dragen_trimmer_metrics.txt`: Read trimming metrics. + - `dragen_vc_metrics.txt`: Variant calling metrics. + - `dragen_wgs_cov_metrics.txt`: WGS coverage metrics. + - `multiqc.log`: MultiQC log file. + - `multiqc_bcftools_stats.txt`: BCFtools statistics. + - `multiqc_data.json`: MultiQC data in JSON format. + - `multiqc_general_stats.txt`: General statistics. + - `purple.txt`: PURPLE metrics. + +
-This document provides an overview of the output files generated by the pipeline. For more detailed information about each step and the tools used, please refer to the respective documentation and help pages. \ No newline at end of file +MultiQC aggregates quality metrics from all pipeline components into a single HTML report, providing an overview of sample quality and analysis performance. \ No newline at end of file From b8227fbd719c131d83b0a73accc7837f3cf15066 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Mon, 24 Mar 2025 12:05:26 +1100 Subject: [PATCH 05/36] linting --- README.md | 7 ++++--- assets/samplesheet.csv | 8 +++++++- 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index f8589ae8..22eb75fc 100644 --- a/README.md +++ b/README.md @@ -4,11 +4,12 @@ sash is the UMCCR post-processing WGS workflow. The workflow takes DRAGEN small While the sash workflow utilises a range of tools and software, it is most closely coupled with bolt, a Python package that implements the UMCCR post-processing logic and supporting functionality. -The **Sash** pipeline has three main workflows: +![Summary](docs/images/sash_overview_qc.png) +The **Sash** pipeline has three main workflows (details in [Docs](docs/README.md)): 1. **Somatic Small Variants (SNV somatic)** - Integrate DRAGEN calls with [**SAGE**](https://github.com/hartwigmedical/hmftools/tree/master/sage) to integrate mutations in hotspot. - - Annotates and filters variants using the **PCGR** framework to classify them into tiers (ACMG guidelines). + - Annotates and filters variants using the **PCGR** framework to classify them into tiers ([ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330) guidelines). - Produces a comprehensive HTML report of clinically relevant mutations, mutation burden (TMB), MSI status, and more. 2. **Somatic Structural Variants (SV somatic)** @@ -77,7 +78,7 @@ Below is a simplified overview of the main pipeline stages (each stage may have 1. 
**Somatic SNV** - Merge DRAGEN VCF & SAGE VCF → Annotate with PCGR → Filter → HTML report 2. **Somatic SV** - - Integrate structural calls → PURPLE for CNVs/purity → Annotate (SnpEff) → Filter → Prioritize → Summaries in MultiQC + - Integrate structural calls → PURPLE for CNVs/purity → Annotate (SnpEff) → Filter → Prioritize → Summaries in MultiQC 3. **Germline** - Filter by known predisposition genes → CPSR classification → Germline report (HTML/TSV) diff --git a/assets/samplesheet.csv b/assets/samplesheet.csv index 8e141d97..c4cc70d5 100644 --- a/assets/samplesheet.csv +++ b/assets/samplesheet.csv @@ -1 +1,7 @@ -id,subject_name,sample_name,sample_type,filetype,filepath +id,subject_name,sample_name,filetype,filepath +subject_a.example,subject_a,sample_germline,dragen_germline_dir,/path/to/dragen_germline/ +subject_a.example,subject_a,sample_germline,dragen_germline_dir,/path/to/dragen_germline/ +subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_somatic/ +subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_somatic/ +subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/ +subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/ \ No newline at end of file From d30363ea011afb96b5ede5b7cb2993a20d07a91a Mon Sep 17 00:00:00 2001 From: qclayssen Date: Mon, 24 Mar 2025 12:05:35 +1100 Subject: [PATCH 06/36] add doc --- docs/README.md | 732 +++++++++++++++++++++++++------------------------ 1 file changed, 372 insertions(+), 360 deletions(-) diff --git a/docs/README.md b/docs/README.md index 9e8cac48..13530521 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,3 +1,8 @@ +- [Usage](usage.md) + - An overview of how the pipeline works, how to run it and a description of all of the different command-line flags. +- [Output](output.md) + - An overview of the different results produced by the pipeline and how to interpret them. 
+ # Sash Workflow Overview ![Summary](images/sash_overview_qc.png) @@ -8,25 +13,36 @@ The **sash Workflow** is a genomic analysis framework comprising three primary p - **Somatic Structural Variants (SV somatic):** Identifies large-scale genomic alterations (deletions, duplications, etc.) and integrates copy number data. - **Germline Variants (SNV germline):** Focuses on inherited variants linked to cancer predisposition. -These pipelines utilise **Bolt**, a Python package designed for modular processing, and leverage outputs from the [**DRAGEN**](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) **Variant Caller** alongside and the Hartwig Medical Foundation **WiGiTS** toolkit (via [Oncoanalyser]() [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) in Oncoanalyser. Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation. +These pipelines utilise **Bolt**, a Python package designed for modular processing, and leverage outputs from the [**DRAGEN**](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) **Variant Caller** alongside and the Hartwig Medical Foundation **WiGiTS** toolkit (via [Oncoanalyser](https://github.com/nf-core/oncoanalyser)) [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) in Oncoanalyser. Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation. -# [**HMFtools WiGiTs**](https://github.com/hartwigmedical/hmftools/tree/master) +# [**HMFtools WiGiTs**](https://github.com/hartwigmedical/hmftools/tree/master) **HMFtools WiGiTS** is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. 
Key components used in Sash include: - **[SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md):** A tiered SNV/indel caller targeting ~10,000 cancer hotspots (e.g., OncoKB, CIViC) to recover low-frequency variants missed by DRAGEN. Outputs a VCF with confidence tiers (hotspot, panel, high/low confidence). -- **[PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple):** +- **[PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple):** Estimates tumor **purity** (tumor cell fraction) and **ploidy** (average copy number), integrates copy number data, and calculates **TMB** (tumor mutation burden) and **MSI** (microsatellite instability). +# Pipeline Inputs + +## Dragen + + def vcf = file(meta.dragen_somatic_dir).toUriString() + "/${meta.tumor_id}.hard-filtered.vcf.gz" + +## Oncoanalyser + + def subpath = "/gridss/${meta.tumor_id}.gridss.vcf.gz" + def subpath = "/sage/somatic/${meta.tumor_id}.sage.somatic.vcf.gz" + # Workflows ## **Somatic Small Variants (SNV/Indel, Tumor)Somatic small variants** #### General -In the **Somatic Small Variants** workflow, variant detection is performed using the **DRAGEN Variant Caller** and **Oncoanalyser** that is relaing on **Somatic Alterations in Genome[(SAGE)](https://github.com/hartwigmedical/hmftools/tree/master/sage),and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple)>)** outputs. It’s structured into four steps: **Integrations**, **Annotation**, **Filter**, and **Report**. The final outputs include an **HTML report** summarising the results. +In the **Somatic Small Variants** workflow, variant detection is performed using the **DRAGEN Variant Caller** and **Oncoanalyser**, which relies on **Somatic Alterations in Genome ([SAGE](https://github.com/hartwigmedical/hmftools/tree/master/sage)) and [PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple)** outputs. 
It’s structured into four steps: **Integrations**, **Annotation**, **Filter**, and **Report**. The final outputs include an **HTML report** summarising the results. #### Summary @@ -37,71 +53,67 @@ In the **Somatic Small Variants** workflow, variant detection is performed using ### Variant Calling integrations -The **variant calling integrations** step use variants fromemploys the **Somatic Alterations in Genome (SAGE)** variant callertool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed filtered out. [SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage) focuses on **targets known cancer hotspots (from sources like CGI, CIViC, OncoKB)Targeted Hotspot. Analysis**, prioritising predefined genomic regions of high clinical or biological relevance. This enables the integration callingrecovery of biologically significant variants in a VCF that may have been missed otherwise. - -[sage](https://github.com/hartwigmedical/hmftools/tree/master/sage) -[#6-soft-filters](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters) +The **variant calling integrations** step uses variants from the **Somatic Alterations in Genome (SAGE)** variant caller, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed or filtered out. [SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage) targets known cancer hotspots (from sources like CGI, CIViC, OncoKB), prioritising predefined genomic regions of high clinical or biological relevance, and applies its own [soft filters](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables biologically significant variants that may otherwise have been missed to be integrated into the VCF. 
-* Employ the **SAGE tool** for targeted hotspot analysis to recover: - * Low-allele-frequency variants in hotspots genomic regions of clinical significance. -* Hotspots are derived from: - * Cancer Genome Interpreter (CGI) - * CIViC \- Clinical interpretations of variants in cancer. - * OncoKB \- Precision Oncology Knowledge Base. -* Outputs a VCF containing rescued variants. +- Low-allele-frequency variants in hotspot genomic regions of clinical significance. +- Hotspots are derived from: + - Cancer Genome Interpreter (CGI) + - CIViC \- Clinical interpretations of variants in cancer. + - OncoKB \- Precision Oncology Knowledge Base. +- Outputs a VCF containing rescued variants. ##### Inputs: -* From DRAGEN: somatic small variant callerVCF - * ${meta.tumor\_id}.main.dragen.vcf.gz -* From oncoanalyser: SAGE VCF - * ${meta.tumor\_id}.main.sage.filtered.vcf.gz +- From DRAGEN: somatic small variant VCF + - ${tumor\_id}.main.dragen.vcf.gz +- From oncoanalyser: SAGE VCF + - ${tumor\_id}.main.sage.filtered.vcf.gz Filter on chr 1..22 and chr X,Y,M ##### Output: -* Rescue: VCF - * ${meta.tumor\_id}.rescued.vcf.g +- Rescue: VCF + - ${tumor\_id}.rescued.vcf.g #### Details **Steps are:** 1. 
**Select High-Confidence SAGE Calls in Hotspot Regions to ensure only high-confidence variants in clinically relevant regions are considered:** - * **Filter the SAGE output to retain only variants that pass quality filters and overlap with known hotspot regions.** - * **Hotspot regions are derived from databases such as:** - * **Cancer Genome Interpreter (CGI)** - * **CIViC (Clinical Interpretations of Variants in Cancer)** - * **OncoKB (Precision Oncology Knowledge Base)** - * **This ensures that only high-confidence variants in clinically relevant regions are considered.** + - **Filter the SAGE output to retain only variants that pass quality filters and overlap with known hotspot regions.** + - **Hotspot regions are derived from databases such as:** + - **Cancer Genome Interpreter (CGI)** + - **CIViC (Clinical Interpretations of Variants in Cancer)** + - **OncoKB (Precision Oncology Knowledge Base)** + - **This ensures that only high-confidence variants in clinically relevant regions are considered.** 2. **Separate SAGE calls into existing and novel variants** - * **Compare the input VCF and the filtered SAGE VCF to identify overlapping and unique variants.** + - **Compare the input VCF and the filtered SAGE VCF to identify overlapping and unique variants.** 3. 
**Annotate existing somatic variant calls also present in the SAGE calls in the input VCF** - * **Annotate variants that are re-called by SAGE:** - * **For each variant in the input VCF, check if it exists in the SAGE existing calls.** - * **For variants re-called by SAGE:** - * **If `SAGE FILTER=PASS` and input VCF `FILTER=PASS`:** - * **Set `INFO/SAGE_HOTSPOT` to indicate the variant is called by SAGE in a hotspot.** - * **If `SAGE FILTER=PASS` and input VCF `FILTER` is not `PASS`:** - * **Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is rescued by SAGE.** - * **Update `FILTER=PASS` to include the variant in the final analysis.** - * **If `SAGE FILTER` is not `PASS`:** - * **Append `SAGE_lowconf` to the `FILTER` field to flag low-confidence variants.** - * **Transfer SAGE `FORMAT` fields to the input VCF with a `SAGE_` prefix** + - **Annotate variants that are re-called by SAGE:** + - **For each variant in the input VCF, check if it exists in the SAGE existing calls.** + - **For variants re-called by SAGE:** + - **If `SAGE FILTER=PASS` and input VCF `FILTER=PASS`:** + - **Set `INFO/SAGE_HOTSPOT` to indicate the variant is called by SAGE in a hotspot.** + - **If `SAGE FILTER=PASS` and input VCF `FILTER` is not `PASS`:** + - **Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is rescued by SAGE.** + - **Update `FILTER=PASS` to include the variant in the final analysis.** + - **If `SAGE FILTER` is not `PASS`:** + - **Append `SAGE_lowconf` to the `FILTER` field to flag low-confidence variants.** + - **Transfer SAGE `FORMAT` fields to the input VCF with a `SAGE_` prefix** 4. **Combine annotated input VCF with novel SAGE calls** - * **Prepare novel SAGE calls. 
For each variant in the SAGE VCF missing from the input VCF::** - * **Rename certain `FORMAT` fields in the novel SAGE VCF to avoid namespace collisions:** - * **For example, `FORMAT/SB` is renamed to `FORMAT/SAGE_SB`.** - * **Retain necessary `INFO` and `FORMAT` annotations while removing others to streamline the data.** + - **Prepare novel SAGE calls. For each variant in the SAGE VCF missing from the input VCF::** + - **Rename certain `FORMAT` fields in the novel SAGE VCF to avoid namespace collisions:** + - **For example, `FORMAT/SB` is renamed to `FORMAT/SAGE_SB`.** + - **Retain necessary `INFO` and `FORMAT` annotations while removing others to streamline the data.** **Summary Finalize the rescued of VCF file integration** - * **The final VCF file includes:** - * **Original variants from the input VCF, annotated with SAGE information where applicable.** - * **Novel variants identified by SAGE in hotspot regions.** - * **Updated `FILTER` and `INFO` fields reflecting the rescue and annotation process.** - * **The rescued VCF provides a comprehensive set of variants for downstream analysis, prioritizing clinically significant mutations.** + - **The final VCF file includes:** + - **Original variants from the input VCF, annotated with SAGE information where applicable.** + - **Novel variants identified by SAGE in hotspot regions.** + - **Updated `FILTER` and `INFO` fields reflecting the rescue and annotation process.** + - **The rescued VCF provides a comprehensive set of variants for downstream analysis, prioritizing clinically significant mutations.** ### Annotation @@ -112,77 +124,77 @@ The **Annotation** consists of three processes:step employs Reference Sources (G Summary: Use **PCGR** to enrich the VCF with: -* Functional impact information (e.g., consequences, mutation hotspots). -* Clinical relevance (e.g., tier classifications, mutational signatures). -* Process VCF files in chunks ≤500,000 variants each. -* Merge annotated chunks into a unified VCF. 
+- Functional impact information (e.g., consequences, mutation hotspots). +- Clinical relevance (e.g., tier classifications, mutational signatures). +- Process VCF files in chunks ≤500,000 variants each. +- Merge annotated chunks into a unified VCF. ##### Inputs: -* Small variant vcfRescue VCF - * ${meta.tumor\_id}.main.sage.filtered.vcf.gz +- Rescue VCF + - ${tumor\_id}.main.sage.filtered.vcf.gz ##### Output: -* Annotated VCF - * ${meta.tumor\_id}.annotations.vcf.g +- Annotated VCF + - ${tumor\_id}.annotations.vcf.g Details: **Steps are:** 1. **Set FILTER to "PASS" for unfiltered variants** - * Iterate over the input VCF file the `FILTER` field to `PASS` for any variants that currently have no filter status (`FILTER` is `.` or `None`). This standardization is necessary for downstream tools. + - Iterate over the input VCF file and set the `FILTER` field to `PASS` for any variants that currently have no filter status (`FILTER` is `.` or `None`). This standardization is necessary for downstream tools. 2. **Annotate the VCF against reference sources** - * Use **vcfanno** to add annotations to the VCF file: - * **gnomAD** - * **Hartwig Hotspots** - * **ENCODE Blacklist** - * **Genome in a Bottle High-Confidence Regions**: Mark high-confidence regions from the Genome in a Bottle benchmark. - * **Low and High GC Regions**: Mark regions with \30% or \65% GC content, compiled by GA4GH. - * **Bad Promoter Regions**: Annotate regions with poor coverage, compiled by GA4GH. + - Use **vcfanno** to add annotations to the VCF file: + - **gnomAD** + - **Hartwig Hotspots** + - **ENCODE Blacklist** + - **Genome in a Bottle High-Confidence Regions**: Mark high-confidence regions from the Genome in a Bottle benchmark. + - **Low and High GC Regions**: Mark regions with \<30% or \>65% GC content, compiled by GA4GH. + - **Bad Promoter Regions**: Annotate regions with poor coverage, compiled by GA4GH. 3. 
**Annotate with UMCCR panel of normals counts** - * Use **vcfanno** and **bcftools** to annotate the VCF with counts from the **UMCCR panel of normals**, built from tumor-only Mutect2 calls from approximately 200 normal samples. This helps identify and filter out recurrent sequencing artifacts or germline variants. + - Use **vcfanno** and **bcftools** to annotate the VCF with counts from the **UMCCR panel of normals**, built from tumor-only Mutect2 calls from approximately 200 normal samples. This helps identify and filter out recurrent sequencing artifacts or germline variants. 4. **Standardize the VCF fields** - * Add new `INFO` fields for use with **PCGR**: -* `TUMOR_AF`, `NORMAL_AF`: Tumor and normal allele frequencies. -* `TUMOR_DP`, `NORMAL_DP`: Tumor and normal read depths. -* Add the `AD` FORMAT field: -* `AD`: Allelic depths for the reference and alternate alleles. + - Add new `INFO` fields for use with **PCGR**: +- `TUMOR_AF`, `NORMAL_AF`: Tumor and normal allele frequencies. +- `TUMOR_DP`, `NORMAL_DP`: Tumor and normal read depths. +- Add the `AD` FORMAT field: +- `AD`: Allelic depths for the reference and alternate alleles. 5. **Prepare VCF for PCGR annotation** - * Exclude unnecessary data from the VCF header keeping on INFO AF/DP . - * Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR. - * Set `FILTER` to `PASS` and remove all `FORMAT` and sample columns. + - Exclude unnecessary data from the VCF header keeping on INFO AF/DP . + - Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR. + - Set `FILTER` to `PASS` and remove all `FORMAT` and sample columns. 6. **Run PCGR to annotate VCF against external sources** - * Use **PCGR** (Personal Cancer Genome Reporter) to annotate the VCF with clinical, functional, and biological information. - * Classify variants by tiers based on annotations and functional impact according to **ACMG** guidelines. 
- * Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `INTOGEN_DRIVER_MUT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. - * External sources used during this step include **VEP**, **ClinVar**, **COSMIC**, **TCGA**, **ICGC**, **Open Targets Platform**, **CancerMine**, **DoCM**, **CBMDB**, **DisGeNET**, **Cancer Hotspots**, **dbNSFP**, **UniProt/SwissProt**, **Pfam**, **DGIdb**, and **ChEMBL**. + - Use **PCGR** (Personal Cancer Genome Reporter) to annotate the VCF with clinical, functional, and biological information. + - Classify variants by tiers based on annotations and functional impact according to **ACMG** guidelines. + - Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `INTOGEN_DRIVER_MUT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. + - External sources used during this step include **VEP**, **ClinVar**, **COSMIC**, **TCGA**, **ICGC**, **Open Targets Platform**, **CancerMine**, **DoCM**, **CBMDB**, **DisGeNET**, **Cancer Hotspots**, **dbNSFP**, **UniProt/SwissProt**, **Pfam**, **DGIdb**, and **ChEMBL**. 7. **Transfer PCGR annotations to the full set of variants** - * merge the PCGR annotations back into the original VCF file. - * Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available. - * Preserve the `FILTER` statuses and other annotations from the original VCF. + - merge the PCGR annotations back into the original VCF file. + - Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available. + - Preserve the `FILTER` statuses and other annotations from the original VCF. 8. 
**Filter variants to remove putative germline variants and artefacts while keeping known hotspots/actionable variants** - * **Keep variants**: - * Called by **SAGE** in known hotspots (CGI, CIViC, OncoKB) regardless of other evidence. - * With PCGR TIER 1 and 2 classifications, indicating strong or potential clinical significance according to ACMG guidelines. - * All driver mutations from: - * **IntOGen** - * mutation hotspots - * ClinVar pathogenic or uncertain significance - * COSMIC count ≥10 - * TCGA pancancer count ≥5 - * ICGC PCAWG count ≥3. - * **Apply filters to other variants**: - * Remove variants with `AF < 10%`. - * Remove common variants in gnomAD (`population AF ≥ 1%`), adding them to the germline set. - * Remove variants present in ≥5 samples of the Panel of Normals. - * Remove indels in "bad promoter" regions (as defined by GA4GH). - * Remove variants overlapping the ENCODE blacklist. - * Remove variants with variant depth `VD < 4`. - * Remove variants with `VD < 6` and overlapping a low complexity region. - * Remove **VarDict** strand-biased variants unless supported by other callers. + - **Keep variants**: + - Called by **SAGE** in known hotspots (CGI, CIViC, OncoKB) regardless of other evidence. + - With PCGR TIER 1 and 2 classifications, indicating strong or potential clinical significance according to ACMG guidelines. + - All driver mutations from: + - **IntOGen** + - mutation hotspots + - ClinVar pathogenic or uncertain significance + - COSMIC count ≥10 + - TCGA pancancer count ≥5 + - ICGC PCAWG count ≥3. + - **Apply filters to other variants**: + - Remove variants with `AF < 10%`. + - Remove common variants in gnomAD (`population AF ≥ 1%`), adding them to the germline set. + - Remove variants present in ≥5 samples of the Panel of Normals. + - Remove indels in "bad promoter" regions (as defined by GA4GH). + - Remove variants overlapping the ENCODE blacklist. + - Remove variants with variant depth `VD < 4`.
+ - Remove variants with `VD < 6` and overlapping a low complexity region. + - Remove **VarDict** strand-biased variants unless supported by other callers. 9. **Report passing variants using PCGR, classified by the ACMG tier system** 10. Generate the final report of variants classified according to clinical significance using **PCGR**, ready for downstream analysis. @@ -196,13 +208,13 @@ The **Filter** step applies a series of stringent filters to somatic variant cal Inputs: -* Annotated VCF - * ${meta.tumor\_id}.annotations.vcf.gz +- Annotated VCF + - ${meta.tumor\_id}.annotations.vcf.gz #### Output: -* Filter VCF - * ${meta.tumor\_id}\*filters\_set.vcf.gz +- Filter VCF + - ${meta.tumor\_id}\*filters\_set.vcf.gz **Filters:** @@ -210,41 +222,41 @@ Inputs: #### **1.1 Allele Frequency (20% are also excluded.** -* **This step reduces contamination from sequencing artefacts or undetected germline variants.** +- **Variants present in more than 5 normal samples from the UMCCR Panel of Normals are removed.** +- **Variants with a PoN AF \>20% are also excluded.** +- **This step reduces contamination from sequencing artefacts or undetected germline variants.** ### **3\. 
Rescue and Clinical Significance Filters** @@ -252,58 +264,58 @@ Inputs: ### **3.1 Hotspot Rescue** -* **Variants located in Hartwig, OncoKB, or other curated hotspot databases are retained, even if they fail other quality or frequency filters.** +- **Variants located in Hartwig, OncoKB, or other curated hotspot databases are retained, even if they fail other quality or frequency filters.** #### **3.2 Reference Database Hit Count Rescue** -* **Variants with strong prior evidence in COSMIC, TCGA, or ICGC are retained, even if they fail standard filtering:** - * **COSMIC count ≥10** - * **TCGA pan-cancer count ≥5** - * **ICGC PCAWG count ≥5** +- **Variants with strong prior evidence in COSMIC, TCGA, or ICGC are retained, even if they fail standard filtering:** + - **COSMIC count ≥10** + - **TCGA pan-cancer count ≥5** + - **ICGC PCAWG count ≥5** #### **3.3 ClinVar Pathogenicity Rescue** -* **Variants classified in ClinVar as:** - * **Likely Pathogenic** - * **Pathogenic** - * **Uncertain Significance (VUS) with strong clinical evidence** - -* Allele Frequency (AF) Filter: - * Excludes variants with a tumor allele frequency below a threshold of 0.1. -* Allele Depth (AD) Filter: - * Removes variants with fewer than 4 supporting reads in the tumor sample. -* Degraded Mappability AD Filter: - * Applies stricter thresholds in regions with low sequence complexity or poor mappability, where errors are more likely. - * Requires a minimum of 6 supporting reads in low-sequence complexity regions(difficult region) to retain the variant. Tumor\_ad \ 6 -* Non-GIAB AD Filter: - * Removes variants not confirmed by the Genome in a Bottle (= 0.01 -* Panel of Normals (PON) Germline Filter: - * Filters out variants with an allele frequency in the PON below 0.20. - * Additionally excludes variants that occur in more than 5 PON samples to mitigate germline contamination or recurrent artifacts. 
PON\_COUNT \>= 5 -* FIlter rescue variant: +- **Variants classified in ClinVar as:** + - **Likely Pathogenic** + - **Pathogenic** + - **Uncertain Significance (VUS) with strong clinical evidence** + +- Allele Frequency (AF) Filter: + - Excludes variants with a tumor allele frequency below a threshold of 0.1. +- Allele Depth (AD) Filter: + - Removes variants with fewer than 4 supporting reads in the tumor sample. +- Degraded Mappability AD Filter: + - Applies stricter thresholds in regions with low sequence complexity or poor mappability, where errors are more likely. + - Requires a minimum of 6 supporting reads in low-sequence complexity regions(difficult region) to retain the variant. Tumor\_ad \ 6 +- Non-GIAB AD Filter: + - Removes variants not confirmed by the Genome in a Bottle (= 0.01 +- Panel of Normals (PON) Germline Filter: + - Filters out variants with an allele frequency in the PON below 0.20. + - Additionally excludes variants that occur in more than 5 PON samples to mitigate germline contamination or recurrent artifacts. PON\_COUNT \>= 5 +- FIlter rescue variant: Variants meeting these criteria are flagged as `CLINICAL_POTENTIAL_RESCUE` are **NOT filtered out** -* **Reference Database Hit Counts**: - * Variants with a **COSMIC count** of ≥10. - * Variants with a **TCGA pan-cancer count** of ≥5. - * Variants with an **ICGC PCAWG count** of ≥5. -* **ClinVar Significance**: - * Variants with ClinVar classifications matching the following categories are rescued: - * `conflicting_interpretations_of_pathogenicity` - * `likely_pathogenic` - * `pathogenic` - * `uncertain_significance` -* **Mutation Hotspots**: - * Variants identified as hotspots in: - * `HMF_HOTSPOT` - * `PCGR_MUTATION_HOTSPOT` -* **PCGR Tiers**: - * Variants classified as: - * `TIER_1` - * `TIER_2` +- **Reference Database Hit Counts**: + - Variants with a **COSMIC count** of ≥10. + - Variants with a **TCGA pan-cancer count** of ≥5. + - Variants with an **ICGC PCAWG count** of ≥5. 
+- **ClinVar Significance**: + - Variants with ClinVar classifications matching the following categories are rescued: + - `conflicting_interpretations_of_pathogenicity` + - `likely_pathogenic` + - `pathogenic` + - `uncertain_significance` +- **Mutation Hotspots**: + - Variants identified as hotspots in: + - `HMF_HOTSPOT` + - `PCGR_MUTATION_HOTSPOT` +- **PCGR Tiers**: + - Variants classified as: + - `TIER_1` + - `TIER_2` ### Report @@ -311,24 +323,24 @@ The **Report** step utilises the **Personal Cancer Genome Reporter (PCGR)** Inputs: -* PURPLE purity -* Filter VCF -* DRAGEN VCF +- PURPLE purity +- Filter VCF +- DRAGEN VCF #### Output: -* PCGR cancer report - * ${meta.tumor\_id}.pcgr\_acmg.grch38.html +- PCGR cancer report + - ${meta.tumor\_id}.pcgr\_acmg.grch38.html 1. **Generate BCFtools Statistics on the Input VCF:** The code runs a helper function (`bcftools_stats_prepare`) to create a modified version of the input VCF, adjusting quality scores so that `bcftools stats` can produce more meaningful outputs. It then executes `bcftools stats` to gather statistics on variant quality and distribution, storing the results in a text file. 2. **Calculate Allele Frequency Distributions:** The `allele_frequencies` function uses external tools (bcftools, bedtools) to: - * Filter and normalize variants according to high-confidence regions. - * Extract allele frequency data from tumor samples. - * Produce both a global allele frequency summary and a subset of allele frequencies restricted to key cancer genes. + - Filter and normalize variants according to high-confidence regions. + - Extract allele frequency data from tumor samples. + - Produce both a global allele frequency summary and a subset of allele frequencies restricted to key cancer genes. 3. **Compare Variant Counts From Two Variant Sets (DRAGEN vs. BOLT)** - * The code counts the total number and types of variants (SNPs, Indels, Others) passing filters in both a DRAGEN VCF and the FILTER BOLT VCF.
+ - The code counts the total number and types of variants (SNPs, Indels, Others) passing filters in both a DRAGEN VCF and the FILTER BOLT VCF. 4. **Count Variants by Processing Stage** 5. **Parse Purity and Ploidy Information (Purple Data)** 6. **Run PCGR Annotation** @@ -340,25 +352,25 @@ The **Somatic Structural Variants (SVs) pipeline** identifies and annotates **la ### **Summary:** 1. **GRIPDSS filtering:** - * GRIPDSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and repeat masker annotations data are the specific to umccr like known\_fusions + - GRIPDSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and repeat masker annotations data are the specific to umccr like known\_fusions 2. PURPLE - * Combines the GRIPSS-filtered SV calls with copy number variation (CNV) data and tumor purity/ploidy estimates. PURPLE adjusts SV breakpoints based on copy number transitions and robustly classifies events as somatic versus germline. + - Combines the GRIPSS-filtered SV calls with copy number variation (CNV) data and tumor purity/ploidy estimates. PURPLE adjusts SV breakpoints based on copy number transitions and robustly classifies events as somatic versus germline. 3. Annotation - * Combines SV calls from GRIPSS with CNV data from PURPLE - * Annotate variant using [SnpEff](https://github.com/pcingola/SnpEff) + - Combines SV calls from GRIPSS with CNV data from PURPLE + - Annotate variant using [SnpEff](https://github.com/pcingola/SnpEff) 4. 
Prioritisation - * Prioritise SV annotation based on [AstraZeneca-NGS](https://github.com/AstraZeneca-NGS/simple_sv_annotation) using curated reference data including UMCCR panel genes, tumor suppressor gene lists, Hartwig known fusion pairs, [appris](https://ngdc.cncb.ac.cn/databasecommons/database/id/323) data>) - * Prioritise variants based on clinical relevance and support metrics + - Prioritise SV annotation based on [AstraZeneca-NGS](https://github.com/AstraZeneca-NGS/simple_sv_annotation) using curated reference data including UMCCR panel genes, tumor suppressor gene lists, Hartwig known fusion pairs, [appris](https://ngdc.cncb.ac.cn/databasecommons/database/id/323) + - Prioritise variants based on clinical relevance and support metrics 5. Report - * Cancer report - * MultiQC + - Cancer report + - MultiQC 6. **Assign SV Types:** - * Classify SVs as duplications or deletions based on copy number thresholds. - * Split variants into separate files for structural variants (SVs) and copy number variants (CNVs). + - Classify SVs as duplications or deletions based on copy number thresholds. + - Split variants into separate files for structural variants (SVs) and copy number variants (CNVs). 7. **Annotate and Prioritize Variants:** - * Use **SnpEff** to annotate variants with gene-level and functional impact information. - * Prioritize variants based on clinical relevance and support metrics. - * Generate TSV (tab-separated values) files summarizing the prioritized SVs and CNVs. + - Use **SnpEff** to annotate variants with gene-level and functional impact information. + - Prioritize variants based on clinical relevance and support metrics. + - Generate TSV (tab-separated values) files summarizing the prioritized SVs and CNVs. 8. **Generate Summary Reports:** 9. Create TSV (tab-separated values) files summarizing the prioritized SVs and CNVs for downstream analysis and reporting.
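The support-based exclusion rules for Tier 3 and Tier 4 calls, listed under the prioritisation details of this document, can be sketched in Python as follows. This is an illustrative sketch only, not the prioritisation module's actual code: `SvCall` and `keep_sv` are hypothetical names, the threshold written as `SR 5` is assumed to mean `SR < 5`, and the allele-frequency condition is assumed to trigger when either `AF0` or `AF1` is below 0.1.

```python
from dataclasses import dataclass


@dataclass
class SvCall:
    """Hypothetical container for the fields the filter rules refer to."""
    tier: int    # 1 (high) to 4 (no interest)
    sr: int      # split-read support
    pr: int      # paired-read support
    af0: float   # allele frequency at the first breakend
    af1: float   # allele frequency at the second breakend


def keep_sv(call: SvCall) -> bool:
    """Return True if the call survives the low-support filters."""
    # Assumption: the rules only ever exclude Tier 3 and Tier 4 calls.
    if call.tier <= 2:
        return True
    # Rule 1 (assumed reading): SR < 5 and PR < 5.
    if call.sr < 5 and call.pr < 5:
        return False
    # Rule 2: SR < 10, PR < 10, and a low allele frequency.
    low_af = call.af0 < 0.1 or call.af1 < 0.1
    if call.sr < 10 and call.pr < 10 and low_af:
        return False
    return True


calls = [
    SvCall(tier=1, sr=0, pr=0, af0=0.0, af1=0.0),
    SvCall(tier=3, sr=4, pr=4, af0=0.5, af1=0.5),
    SvCall(tier=4, sr=8, pr=3, af0=0.05, af1=0.5),
]
kept = [c for c in calls if keep_sv(c)]
```

Keeping the rules in one predicate makes the thresholds easy to compare against the documented values if they change.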
@@ -369,101 +381,101 @@ The **Somatic Structural Variants (SVs) pipeline** identifies and annotates **la ### **Primary SV VCFs:** - * GRIDSS2 - * ${meta.tumor\_id}.gridss.vcf.gz + - GRIDSS2 + - ${meta.tumor\_id}.gridss.vcf.gz ### Details ### **Detailed Steps:** 1. **GRIPSS filtering:** - * Evaluate split-read and paired-end support; discard variants with low support. - * Apply panel-of-normals filtering to remove artefacts observed in normal samples. - * Retain variants overlapping known oncogenic fusion hotspots (using UMCCR-curated lists). - * Exclude variants in repetitive regions based on Repeat Masker annotations. + - Evaluate split-read and paired-end support; discard variants with low support. + - Apply panel-of-normals filtering to remove artefacts observed in normal samples. + - Retain variants overlapping known oncogenic fusion hotspots (using UMCCR-curated lists). + - Exclude variants in repetitive regions based on Repeat Masker annotations. 2. **Purple:** - * **Merge SV calls with CNV segmentation data.** - * **Estimate tumor purity and ploidy.** - * **Adjust SV breakpoints based on copy number transitions.** - * **Classify SVs as somatic or germline.** + - **Merge SV calls with CNV segmentation data.** + - **Estimate tumor purity and ploidy.** + - **Adjust SV breakpoints based on copy number transitions.** + - **Classify SVs as somatic or germline.** 3. 
**Annotation** - * **Compile SV and CNV information into a unified VCF file.** - * **Extend the VCF header with PURPLE-related INFO fields (e.g., PURPLE\_baf, PURPLE\_copyNumber).** - * **Convert CNV records from TSV format into VCF records with appropriate SVTYPE tags (e.g., 'DUP' for duplications, 'DEL' for deletions).** - * **Run snpEff to annotate the unified VCF with functional information such as gene names, transcript effects, and coding consequences.** + - **Compile SV and CNV information into a unified VCF file.** + - **Extend the VCF header with PURPLE-related INFO fields (e.g., PURPLE\_baf, PURPLE\_copyNumber).** + - **Convert CNV records from TSV format into VCF records with appropriate SVTYPE tags (e.g., 'DUP' for duplications, 'DEL' for deletions).** + - **Run snpEff to annotate the unified VCF with functional information such as gene names, transcript effects, and coding consequences.** 4. **Prioritization** - * **Run the prioritization module (forked from the AstraZeneca simple\_sv\_annotation tool) using reference data files including known fusion pairs, known fusion 5′ and 3′ lists, key genes, and key tumor suppressor genes.** - * **Classify Variants:** - * **Structural Variants (SVs):** Variants labeled with the source `sv_gridss`. - * **Copy Number Variants (CNVs):** Variants labeled with the source `cnv_purple`. 
- * **Prioritise variants on a 4 tier system \- 1 (high) \- 2 (moderate) \- 3 (low) \- 4 (no interest):** -* **exon loss** - * **on cancer gene list (1)** - * **other (2)** -* **gene fusion** - * **paired (hits two genes)** - * **on list of known pairs (1) (curated by HMF)** - * **one gene is a known promiscuous fusion gene (1) (curated by HMF)** - * **on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2)** - * **other:** - * **one or two genes on cancer gene list (2)** - * **neither gene on cancer gene list (3)** - * **unpaired (hits one gene)** - * **on cancer gene list (2)** - * **others (3)** -* **upstream or downstream (a specific type of fusion, e.g. one gene comes under the control of another gene's promoter and becomes over-expressed)** - * **on cancer gene list genes (2)** -* **LoF or HIGH impact in a tumor suppressor** - * **on cancer gene list (2)** - * **other TS gene (3)** -* **other (4)** - * **Filter Low-Quality Calls:** - * **Keep variants with sufficient read support (e.g., split reads, `SR`).** - * **Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`.** - * **Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1.** - * **The module assigns a priority tier to each variant (ranging from Tier 1 for high priority to Tier 4 for no interest) and populates the INFO fields:** - * **SIMPLE\_ANN: A simplified annotation string that includes SV type, effect, involved genes, transcript(s), a description, and the assigned tier.** - * **SV\_TOP\_TIER: A numeric field indicating the highest priority tier for the variant.** - * **The unified VCF is then split into separate files for SVs and CNVs using bcftools, and TSV summary reports are generated.** + - **Run the prioritization module (forked from the AstraZeneca simple\_sv\_annotation tool) using reference data files including known fusion pairs, known fusion 5′ and 3′ lists,
key genes, and key tumor suppressor genes.** + - **Classify Variants:** + - **Structural Variants (SVs):** Variants labeled with the source `sv_gridss`. + - **Copy Number Variants (CNVs):** Variants labeled with the source `cnv_purple`. + - **Prioritise variants on a 4 tier system \- 1 (high) \- 2 (moderate) \- 3 (low) \- 4 (no interest):** +- **exon loss** + - **on cancer gene list (1)** + - **other (2)** +- **gene fusion** + - **paired (hits two genes)** + - **on list of known pairs (1) (curated by HMF)** + - **one gene is a known promiscuous fusion gene (1) (curated by HMF)** + - **on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2)** + - **other:** + - **one or two genes on cancer gene list (2)** + - **neither gene on cancer gene list (3)** + - **unpaired (hits one gene)** + - **on cancer gene list (2)** + - **others (3)** +- **upstream or downstream (a specific type of fusion, e.g. one gene comes under the control of another gene's promoter and becomes over-expressed)** + - **on cancer gene list genes (2)** +- **LoF or HIGH impact in a tumor suppressor** + - **on cancer gene list (2)** + - **other TS gene (3)** +- **other (4)** + - **Filter Low-Quality Calls:** + - **Keep variants with sufficient read support (e.g., split reads, `SR`).** + - **Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`.** + - **Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1.** + - **The module assigns a priority tier to each variant (ranging from Tier 1 for high priority to Tier 4 for no interest) and populates the INFO fields:** + - **SIMPLE\_ANN: A simplified annotation string that includes SV type, effect, involved genes, transcript(s), a description, and the assigned tier.** + - **SV\_TOP\_TIER: A numeric field indicating the highest priority tier for the variant.** + - **The unified VCF is then split into separate files for SVs and CNVs
using bcftools, and TSV summary reports are generated.** 1. **Report** - * **Cancer Report: Integrates the prioritized SV data with somatic SNVs, CNVs, and quality metrics to provide a comprehensive overview of the tumor’s genomic alterations. This report includes detailed tables, a fusion gene summary, and a Circos plot (produced by PURPLE) that visualizes copy number and SV data.** - * **MultiQC Report: Aggregates quality control metrics from GRIDSS2, PURPLE, LINX, and the annotation/prioritization steps, providing an overall assessment of data quality.** + - **Cancer Report: Integrates the prioritized SV data with somatic SNVs, CNVs, and quality metrics to provide a comprehensive overview of the tumor’s genomic alterations. This report includes detailed tables, a fusion gene summary, and a Circos plot (produced by PURPLE) that visualizes copy number and SV data.** + - **MultiQC Report: Aggregates quality control metrics from GRIDSS2, PURPLE, LINX, and the annotation/prioritization steps, providing an overall assessment of data quality.** 2. **Obtain Input Structural Variants:** - * **Source Data:** - * Obtain the structural variant VCF file generated by **PURPLE**, which integrates data from **GRIDSS** (for SV detection), **PURPLE** (for copy number analysis). - * The input includes both structural variants and copy number changes detected in the tumor sample. + - **Source Data:** + - Obtain the structural variant VCF file generated by **PURPLE**, which integrates data from **GRIDSS** (for SV detection), **PURPLE** (for copy number analysis). + - The input includes both structural variants and copy number changes detected in the tumor sample. 3. 
**Assign Structural Variant Types:** - * **Classify Variants:** - * **Structural Variants (SVs): Variants labeled with the source sv\_gridss.** - * **Copy Number Variants (CNVs): Variants labeled with the source cnv\_purple.** - * **Prioritise variants on a 4 tier system \- 1 (high) \- 2 (moderate) \- 3 (low) \- 4 (no interest):** -* **exon loss** - * **on cancer gene list (1)** - * **other (2)** -* **gene fusion** - * **paired (hits two genes)** - * **on list of known pairs (1) (curated by HMF)** - * **one gene is a known promiscuous fusion gene (1) (curated by HMF)** - * **on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2)** - * **other:** - * **one or two genes on cancer gene list (2)** - * **neither gene on cancer gene list (3)** - * **unpaired (hits one gene)** - * **on cancer gene list (2)** - * **others (3)** -* **upstream or downstream (a specific type of fusion, e.g. one gene comes under the control of another gene's promoter and becomes over-expressed)** - * **on cancer gene list genes (2)** -* **LoF or HIGH impact in a tumor suppressor** - * **on cancer gene list (2)** - * **other TS gene (3)** -* **other (4)** + - **Classify Variants:** + - **Structural Variants (SVs): Variants labeled with the source sv\_gridss.** + - **Copy Number Variants (CNVs): Variants labeled with the source cnv\_purple.** + - **Prioritise variants on a 4 tier system \- 1 (high) \- 2 (moderate) \- 3 (low) \- 4 (no interest):** +- **exon loss** + - **on cancer gene list (1)** + - **other (2)** +- **gene fusion** + - **paired (hits two genes)** + - **on list of known pairs (1) (curated by HMF)** + - **one gene is a known promiscuous fusion gene (1) (curated by HMF)** + - **on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2)** + - **other:** + - **one or two genes on cancer gene list (2)** + - **neither gene on cancer gene list (3)** + -
**unpaired (hits one gene)** + - **on cancer gene list (2)** + - **others (3)** +- **upstream or downstream (a specific type of fusion, e.g. one gene comes under the control of another gene's promoter and becomes over-expressed)** + - **on cancer gene list genes (2)** +- **LoF or HIGH impact in a tumor suppressor** + - **on cancer gene list (2)** + - **other TS gene (3)** +- **other (4)** * - * **Filter Low-Quality Calls:** + - **Filter Low-Quality Calls:** **Apply Quality Filters:** - * **Keep variants with sufficient read support (e.g., split reads, `SR`).** - * **Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`.** - * **Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1.** + - **Keep variants with sufficient read support (e.g., split reads, `SR`).** + - **Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`.** + - **Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1.** 1. **Generate Summary Reports** @@ -483,31 +495,31 @@ The CPSR (Cancer Predisposition Sequencing Report) includes the following: **Settings**: -* Sample metadata -* Report configuration -* Virtual gene panel +- Sample metadata +- Report configuration +- Virtual gene panel **Summary of Findings**: -* Variant statistics +- Variant statistics **Variant Classification**: ClinVar and Non-ClinVar -* Class 5 \- Pathogenic variants -* Class 4 \- Likely Pathogenic variants -* Class 3 \- Variants of Uncertain Significance (VUS) -* Class 2 \- Likely Benign variants -* Class 1 \- Benign variants -* Biomarkers +- Class 5 \- Pathogenic variants +- Class 4 \- Likely Pathogenic variants +- Class 3 \- Variants of Uncertain Significance (VUS) +- Class 2 \- Likely Benign variants +- Class 1 \- Benign variants +- Biomarkers PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): -* **Tier 1 (High):** Highest priority variants with strong clinical relevance.
-* **Tier 2 (Moderate):** Variants with potential clinical significance. -* **Tier 3 (Low):** Variants with uncertain significance. -* **Tier 4 (No Interest):** Variants unlikely to be clinically relevant. +- **Tier 1 (High):** Highest priority variants with strong clinical relevance. +- **Tier 2 (Moderate):** Variants with potential clinical significance. +- **Tier 3 (Low):** Variants with uncertain significance. +- **Tier 4 (No Interest):** Variants unlikely to be clinically relevant. # Common Reports @@ -517,51 +529,51 @@ UMCCR cancer report containing: **Tumor Mutation Burden (TMB):** -* **Data Source:** filtered somatic VCF -* **Tool:** PURPLE +- **Data Source:** filtered somatic VCF +- **Tool:** PURPLE #### **Mutational Signatures:** -* **Data Source:** filtered SNV/CNV VCF -* **Tool:** MutationalPatterns R package (via PCGR) +- **Data Source:** filtered SNV/CNV VCF +- **Tool:** MutationalPatterns R package (via PCGR) #### **Contamination Score:** -* **Data Source:** – -* **Note:** No dedicated contamination metric is currently generated +- **Data Source:** – +- **Note:** No dedicated contamination metric is currently generated #### **Purity & Ploidy:** -* **Data Source:** COBALT (providing read-depth ratios) and AMBER (providing B-allele frequency measurements) -* **Tool:** PURPLE, which uses these inputs to compute sample purity (percentage of tumor cells) and overall ploidy (average copy number) +- **Data Source:** COBALT (providing read-depth ratios) and AMBER (providing B-allele frequency measurements) +- **Tool:** PURPLE, which uses these inputs to compute sample purity (percentage of tumor cells) and overall ploidy (average copy number) #### **HRD Score:** -* **Data Source:** HRD analysis output file (${meta.tumor\_id}.hrdscore.tsv) -* **Tool:** DRAGEN +- **Data Source:** HRD analysis output file (${meta.tumor\_id}.hrdscore.tsv) +- **Tool:** DRAGEN #### **MSI (Microsatellite Instability):** -* **Data Source:** Indels in microsatellite regions from 
SNV/CNV -* **Tool:** PURPLE +- **Data Source:** Indels in microsatellite regions from SNV/CNV +- **Tool:** PURPLE #### **Structural Variant Metrics:** -* **Data Source:** GRIDSS/GRIPSS SV VCF and PURPLE CNV segmentation -* **Tools:** GRIDSS/GRIPSS and PURPLE +- **Data Source:** GRIDSS/GRIPSS SV VCF and PURPLE CNV segmentation +- **Tools:** GRIDSS/GRIPSS and PURPLE #### **Copy Number Metrics (Segments, Deleted Genes, etc.):** -* **Data Source:** PURPLE CNV outputs (segmentation files, gene-level CNV TSV) -* **Tool:** PURPLE +- **Data Source:** PURPLE CNV outputs (segmentation files, gene-level CNV TSV) +- **Tool:** PURPLE The LINX report includes the following: -* **Tables of Variants**: - * Breakends - * Links - * Driver Catalog -* **Plots**: - * Cluster-Level Plots +- **Tables of Variants**: + - Breakends + - Links + - Driver Catalog +- **Plots**: + - Cluster-Level Plots ### MultiQC @@ -581,12 +593,12 @@ The LINX report includes the following: **Key Metrics:** -* **Variant Classification and Tier Distribution:** PCGR categorizes variants into tiers based on their clinical and biological significance. The report details the proportion of variants across different tiers, indicating their potential clinical relevance. -* **Mutational Signatures:** The report includes analysis of mutational signatures, offering insights into the mutational processes active in the tumor. -* **Copy Number Alterations (CNAs):** Visual representations of CNAs are provided, highlighting significant gains and losses across the genome. Genome-wide plots display regions of copy number gains and losses. -* **Tumor Mutational Burden (TMB):** Calculations of TMB are included, which can have implications for immunotherapy eligibility. The report presents the TMB value, representing the number of mutations per megabase. -* **Microsatellite Instability (MSI) Status:** Assessment of MSI status is performed, relevant for certain cancer types and treatment decisions. 
-* **Clinical Trials Information:** Information on relevant clinical trials is incorporated, offering potential therapeutic options based on the identified variants. +- **Variant Classification and Tier Distribution:** PCGR categorizes variants into tiers based on their clinical and biological significance. The report details the proportion of variants across different tiers, indicating their potential clinical relevance. +- **Mutational Signatures:** The report includes analysis of mutational signatures, offering insights into the mutational processes active in the tumor. +- **Copy Number Alterations (CNAs):** Visual representations of CNAs are provided, highlighting significant gains and losses across the genome. Genome-wide plots display regions of copy number gains and losses. +- **Tumor Mutational Burden (TMB):** Calculations of TMB are included, which can have implications for immunotherapy eligibility. The report presents the TMB value, representing the number of mutations per megabase. +- **Microsatellite Instability (MSI) Status:** Assessment of MSI status is performed, relevant for certain cancer types and treatment decisions. +- **Clinical Trials Information:** Information on relevant clinical trials is incorporated, offering potential therapeutic options based on the identified variants. **Note:** The PCGR tool is designed to process a maximum of 500,000 variants. 
If the input VCF file contains more than this limit, variants exceeding 500,000 will be filtered ou @@ -596,31 +608,31 @@ The CPSR (Cancer Predisposition Sequencing Report) includes the following: **Settings**: -* Sample metadata -* Report configuration -* Virtual gene panel +- Sample metadata +- Report configuration +- Virtual gene panel **Summary of Findings**: -* Variant statistics +- Variant statistics **Variant Classification**: ClinVarc and Non-ClinVar -* Class 5 \- Pathogenic variants -* Class 4 \- Likely Pathogenic variants -* Class 3 \- Variants of Uncertain Significance (VUS) -* Class 2 \- Likely Benign variants -* Class 1 \- Benign variants -* Biomarkers +- Class 5 \- Pathogenic variants +- Class 4 \- Likely Pathogenic variants +- Class 3 \- Variants of Uncertain Significance (VUS) +- Class 2 \- Likely Benign variants +- Class 1 \- Benign variants +- Biomarkers PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): -* **Tier 1 (High):** Highest priority variants with strong clinical relevance. -* **Tier 2 (Moderate):** Variants with potential clinical significance. -* **Tier 3 (Low):** Variants with uncertain significance. -* **Tier 4 (No Interest):** Variants unlikely to be clinically relevant. +- **Tier 1 (High):** Highest priority variants with strong clinical relevance. +- **Tier 2 (Moderate):** Variants with potential clinical significance. +- **Tier 3 (Low):** Variants with uncertain significance. +- **Tier 4 (No Interest):** Variants unlikely to be clinically relevant. # Reference data @@ -632,39 +644,39 @@ WiGiTS (hmftools) **Annotation Databases**: -* **gnomAD**: Provides population allele frequencies to help distinguish common variants from rare ones. -* **ClinVar**: Offers clinically curated variant information, aiding in the interpretation of potential pathogenicity. -* **COSMIC**: Contains data on somatic mutations found in cancer, facilitating the identification of cancer-related variants. 
-* **Gene Panels**: Focuses analysis on specific sets of genes relevant to particular conditions or research interests. +- **gnomAD**: Provides population allele frequencies to help distinguish common variants from rare ones. +- **ClinVar**: Offers clinically curated variant information, aiding in the interpretation of potential pathogenicity. +- **COSMIC**: Contains data on somatic mutations found in cancer, facilitating the identification of cancer-related variants. +- **Gene Panels**: Focuses analysis on specific sets of genes relevant to particular conditions or research interests. **Structural Variant Data**: -* **SnpEff Databases**: Used for predicting the effects of variants on genes and proteins. -* **Panel of Normals (PON)**: Helps filter out technical artifacts by comparing against a set of normal samples. -* **RepeatMasker**: Identifies repetitive genomic regions to prevent false-positive variant calls. +- **SnpEff Databases**: Used for predicting the effects of variants on genes and proteins. +- **Panel of Normals (PON)**: Helps filter out technical artifacts by comparing against a set of normal samples. +- **RepeatMasker**: Identifies repetitive genomic regions to prevent false-positive variant calls. 
**Databases/datasets PCGR Reference Data:** ***Version: v20220203*** -* [GENCODE](https://www.gencodegenes.org/) \- high quality reference gene annotation and experimental validation (release 39/19) -* [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) \- Database of non-synonymous functional predictions (20210406 () -* [dbMTS](http://database.liulab.science/dbMTS) \- Database of alterations in microRNA target sites (v1.0) -* [ncER](https://github.com/TelentiLab/ncER_datasets) \- Non-coding essential regulation score (genome-wide percentile rank) (v2) -* [GERP](http://mendel.stanford.edu/SidowLab/downloads/gerp/) \- Genomic Evolutionary Rate Profiling (GERP) \- rejected substitutions (RS) score (v1) -* [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021\_11 () -* [UniProtKB](http://www.uniprot.org) \- Comprehensive resource of protein sequence and functional information (2021\_04) -* [gnomAD](http://gnomad.broadinstitute.org) \- Germline variant frequencies exome-wide (r2.1 () -* [dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) \- Database of short genetic variants (154) -* [DoCM](http://docm.genome.wustl.edu) \- Database of curated mutations (release 3.2) -* [CancerHotspots](http://cancerhotspots.org) \- A resource for statistically significant mutations in cancer (2017) -* [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar) \- Database of genomic variants of clinical significance (20220103) -* [CancerMine](http://bionlp.bcgsc.ca/cancermine/) \- Literature-mined database of tumor suppressor genes/proto-oncogenes (20211106 () -* [OncoTree](http://oncotree.mskcc.org/) \- Open-source ontology developed at MSK-CC for standardization of cancer type diagnosis (2021-11-02) -* [DiseaseOntology](http://disease-ontology.org) \- Standardized ontology for human disease (20220131) -* [EFO](https://github.com/EBISPOT/efo) \- Experimental Factor Ontology (v3.38.0) -* [GWAS\_Catalog](https://www.ebi.ac.uk/gwas/) \- The NHGRI-EBI Catalog of published genome-wide 
association studies (20211221) -* [CGI](http://cancergenomeinterpreter.org/biomarkers) \- Cancer Genome Interpreter Cancer Biomarkers Database (20180117) +- [GENCODE](https://www.gencodegenes.org/) \- high quality reference gene annotation and experimental validation (release 39/19) +- [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) \- Database of non-synonymous functional predictions (20210406 () +- [dbMTS](http://database.liulab.science/dbMTS) \- Database of alterations in microRNA target sites (v1.0) +- [ncER](https://github.com/TelentiLab/ncER_datasets) \- Non-coding essential regulation score (genome-wide percentile rank) (v2) +- [GERP](http://mendel.stanford.edu/SidowLab/downloads/gerp/) \- Genomic Evolutionary Rate Profiling (GERP) \- rejected substitutions (RS) score (v1) +- [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021\_11 () +- [UniProtKB](http://www.uniprot.org) \- Comprehensive resource of protein sequence and functional information (2021\_04) +- [gnomAD](http://gnomad.broadinstitute.org) \- Germline variant frequencies exome-wide (r2.1 () +- [dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) \- Database of short genetic variants (154) +- [DoCM](http://docm.genome.wustl.edu) \- Database of curated mutations (release 3.2) +- [CancerHotspots](http://cancerhotspots.org) \- A resource for statistically significant mutations in cancer (2017) +- [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar) \- Database of genomic variants of clinical significance (20220103) +- [CancerMine](http://bionlp.bcgsc.ca/cancermine/) \- Literature-mined database of tumor suppressor genes/proto-oncogenes (20211106 () +- [OncoTree](http://oncotree.mskcc.org/) \- Open-source ontology developed at MSK-CC for standardization of cancer type diagnosis (2021-11-02) +- [DiseaseOntology](http://disease-ontology.org) \- Standardized ontology for human disease (20220131) +- [EFO](https://github.com/EBISPOT/efo) \- Experimental Factor Ontology (v3.38.0) +- 
[GWAS\_Catalog](https://www.ebi.ac.uk/gwas/) \- The NHGRI-EBI Catalog of published genome-wide association studies (20211221) +- [CGI](http://cancergenomeinterpreter.org/biomarkers) \- Cancer Genome Interpreter Cancer Biomarkers Database (20180117) ### @@ -674,43 +686,43 @@ WiGiTS (hmftools) **Somatic SNVs** -* File: `smlv_somatic/filter/{tid}.pass.vcf.gz` -* Description: Contains somatic single nucleotide variants (SNVs) with filtering applied. +- File: `smlv_somatic/filter/{tid}.pass.vcf.gz` +- Description: Contains somatic single nucleotide variants (SNVs) with filtering applied. **Somatic SVs** -* File: `sv_somatic/prioritise/{tid}.sv.prioritised.vcf.gz` -* Description: Contains somatic structural variants (SVs) with prioritization applied. +- File: `sv_somatic/prioritise/{tid}.sv.prioritised.vcf.gz` +- Description: Contains somatic structural variants (SVs) with prioritization applied. **Somatic CNVs** -* File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som.tsv.gz` -* Description: Contains somatic copy number variations (CNVs) data. +- File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som.tsv.gz` +- Description: Contains somatic copy number variations (CNVs) data. **Somatic Gene CNVs** -* File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som_gene.tsv.gz` -* Description: Contains gene-level somatic copy number variations (CNVs) data. +- File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som_gene.tsv.gz` +- Description: Contains gene-level somatic copy number variations (CNVs) data. **Germline SNVs** -* File: `dragen_germline_output/{nid}.hard-filtered.vcf.gz` -* Description: Contains germline single nucleotide variants (SNVs) with hard filtering applied. +- File: `dragen_germline_output/{nid}.hard-filtered.vcf.gz` +- Description: Contains germline single nucleotide variants (SNVs) with hard filtering applied. 
 **Purple Purity, Ploidy, MS Status**

-* File: `purple/{tid}.purple.purity.tsv`
-* Description: Contains estimated tumor purity, ploidy, and microsatellite status.
+- File: `purple/{tid}.purple.purity.tsv`
+- Description: Contains estimated tumor purity, ploidy, and microsatellite status.

 **PCGR JSON with TMB**

-* File: `smlv_somatic/report/pcgr/{tid}.pcgr_acmg.grch38.json.gz`
-* Description: Contains PCGR annotations, including tumor mutational burden (TMB).
+- File: `smlv_somatic/report/pcgr/{tid}.pcgr_acmg.grch38.json.gz`
+- Description: Contains PCGR annotations, including tumor mutational burden (TMB).

 **DRAGEN HRD Score**

-* File: `dragen_somatic_output/{tid}.hrdscore.tsv`
-* Description: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis.
+- File: `dragen_somatic_output/{tid}.hrdscore.tsv`
+- Description: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis.

 # FAQ
From f4ae804ced8b948ffe25f397d116251eb5094ab5 Mon Sep 17 00:00:00 2001
From: qclayssen
Date: Mon, 24 Mar 2025 12:05:49 +1100
Subject: [PATCH 07/36] uptdate usage

---
 docs/usage.md | 28 ++++++++--------------------
 1 file changed, 8 insertions(+), 20 deletions(-)

diff --git a/docs/usage.md b/docs/usage.md
index 37f6aac6..1699d1ad 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -14,32 +14,20 @@ You will need to create a samplesheet with information about the samples you wou
 --input '[path to samplesheet file]'
 ```

-### Multiple runs of the same sample
-
-The `sample` identifiers have to be the same when you have re-sequenced the same sample more than once e.g. to increase sequencing depth. The pipeline will concatenate the raw reads before performing any downstream analysis. Below is an example for the same sample sequenced across 3 lanes:
-
-```console
-sample,fastq_1,fastq_2
-CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
-CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz
-CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz
-```
-
 ### Full samplesheet

 The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet can have as many columns as you desire, however, there is a strict requirement for the first 3 columns to match those defined in the table below.

 A final samplesheet file consisting of both single- and paired-end data may look something like the one below. This is for 6 samples, where `TREATMENT_REP3` has been sequenced twice.

-```console
-sample,fastq_1,fastq_2
-CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz
-CONTROL_REP2,AEG588A2_S2_L002_R1_001.fastq.gz,AEG588A2_S2_L002_R2_001.fastq.gz
-CONTROL_REP3,AEG588A3_S3_L002_R1_001.fastq.gz,AEG588A3_S3_L002_R2_001.fastq.gz
-TREATMENT_REP1,AEG588A4_S4_L003_R1_001.fastq.gz,
-TREATMENT_REP2,AEG588A5_S5_L003_R1_001.fastq.gz,
-TREATMENT_REP3,AEG588A6_S6_L003_R1_001.fastq.gz,
-TREATMENT_REP3,AEG588A6_S6_L004_R1_001.fastq.gz,
+```csv
+id,subject_name,sample_name,filetype,filepath
+subject_a.example,subject_a,sample_germline,dragen_germline_dir,/path/to/dragen_germline/
+subject_a.example,subject_a,sample_germline,dragen_germline_dir,/path/to/dragen_germline/
+subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_somatic/
+subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_somatic/
+subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/
+subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/
 ```

 | Column | Description |
From 9683ae9b828e24c61d837aa3384dded9c5486492 Mon Sep 17 00:00:00 2001
From: qclayssen
Date: Thu, 27 Mar 2025 11:37:34 +1100
Subject: [PATCH 08/36] add adr

---
 docs/adr.md | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)
 create mode 100644 docs/adr.md

diff --git a/docs/adr.md b/docs/adr.md
new file mode 100644
index 00000000..b1a7ea88
--- /dev/null
+++ b/docs/adr.md
@@ -0,0 +1,44 @@
+# ADR #1: Implement VCF Chunking and Parallelization in Sash Workflow for PCGR Processing
+
+**Status**: In Progress
+**Date**: 2024-11-07
+**Deciders**: Oliver Hofmann, Stephen Watts, Quentin Clayssen
+**Technical Story**: Based on the limitations of PCGR in handling large variant datasets within the sash workflow, specifically impacting hypermutated samples.
+
+## Context
+[PCGR](https://sigven.github.io/pcgr/) (Personal Cancer Genome Reporter) currently has a variant processing limit of 500,000 variants per run. In the sash workflow, hypermutated samples often exceed this variant limit. PCGR has its own filtering steps, but an additional filtering step was also introduced in Bolt. By using VCF chunking and parallel processing, we can ensure that these large datasets are analyzed effectively without exceeding the PCGR variant limit, leading to more complete annotation and a more scalable pipeline.
+
+## Decision
+To address the limitations of PCGR when handling hypermutated samples, we WILL implement the following:
+
+1. **Split VCF Files into Chunks**: Input VCF files MUST be divided into chunks, each containing no more than 500,000 variants. This ensures that each chunk remains within PCGR’s processing capacity.
+
+2. **Parallelize Processing**: Each chunk MUST be processed concurrently through PCGR to optimize processing time. The annotated outputs from all chunks MUST be merged to create a unified dataset.
+
+3. **Integrate into Bolt Annotation**: The chunking and parallelization changes MUST be implemented in the Bolt annotation module to ensure seamless and scalable processing for large variant datasets.
+
+4. **Efficiency Consideration**: For now, there MAY be a loss of efficiency for larger variant sets due to the fixed resources allocated for annotation. Further resource adjustments SHOULD be evaluated in the future.
+
+## Consequences
+
+### Positive Consequences
+- **Improved Efficiency**: This approach allows large variant datasets to be processed within PCGR's constraints, enhancing efficiency and ensuring more comprehensive analysis.
+- **Scalability**: Chunking and parallel processing make the sash workflow more scalable for hypermutated samples, accommodating larger datasets.
+
+### Negative Consequences
+- **Complexity**: Adding chunking and merging processes WILL increase complexity in data handling and in ensuring integrity across all merged data.
+- **Resource Demand**: Parallel processing MAY increase resource consumption, affecting system performance and requiring further resource management.
+
+## Remaining Challenges
+While the proposed approach mitigates the current limitations of PCGR, it MAY not fully resolve the issues for hypermutated samples with exceptionally high variant counts. Additional solutions MUST be explored, such as:
+
+- **Additional Filtering Criteria**: Applying additional filters to reduce the variant count where applicable.
+- **Alternative Reporting Methods**: Exploring more scalable reporting approaches that COULD handle higher variant loads.
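
The chunk, annotate, and merge flow that this ADR describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the Bolt implementation: `split_vcf_lines`, `annotate_in_chunks`, and the `annotate` callable (a stand-in for one PCGR run over one chunk) are hypothetical names, the VCF is assumed to fit in memory as a list of lines, and all header lines are assumed to precede the records.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List

# Per-run variant limit quoted in the ADR for PCGR.
MAX_VARIANTS_PER_CHUNK = 500_000


def split_vcf_lines(
    lines: Iterable[str], max_variants: int = MAX_VARIANTS_PER_CHUNK
) -> Iterator[List[str]]:
    """Yield chunks that repeat the VCF header and hold at most max_variants records."""
    header: List[str] = []
    records: List[str] = []
    for line in lines:
        if line.startswith("#"):
            header.append(line)  # meta lines and the #CHROM line
            continue
        records.append(line)
        if len(records) == max_variants:
            yield header + records
            records = []
    if records:  # final, possibly smaller, chunk
        yield header + records


def annotate_in_chunks(
    lines: Iterable[str],
    annotate: Callable[[List[str]], List[str]],  # hypothetical per-chunk annotator
    max_variants: int = MAX_VARIANTS_PER_CHUNK,
    workers: int = 4,
) -> List[str]:
    """Annotate chunks concurrently, then merge them back into one VCF."""
    chunks = list(split_vcf_lines(lines, max_variants))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so record order is kept.
        annotated = list(pool.map(annotate, chunks))
    if not annotated:
        return []
    # Keep the header from the first chunk only, then append every chunk's records.
    merged = [line for line in annotated[0] if line.startswith("#")]
    for chunk in annotated:
        merged.extend(line for line in chunk if not line.startswith("#"))
    return merged
```

Threads are used here only because a real annotator would spend its time waiting on an external process; a workflow engine would more likely schedule each chunk as its own task, which is where the fixed-resource concern in item 4 comes from.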
+
+## Status
+**Status**: In Progress
+
+## Links
+- [Related PR for VCF Chunking and Parallelization Implementation](https://github.com/scwatts/bolt/pull/2)
+- [PCGR Documentation on Variant Limit](https://sigven.github.io/pcgr/articles/running.html#large-input-sets-vcf)
+- Discussion on Hypermutated Samples Handling
From 2d3353435fd1aba02335b9a5f212a5a0656a790e Mon Sep 17 00:00:00 2001
From: qclayssen
Date: Thu, 27 Mar 2025 11:38:19 +1100
Subject: [PATCH 09/36] reorganise doc and linting

---
 docs/README.md | 758 +------------------------------------------------
 docs/detail.md | 765 +++++++++++++++++++++++++++++++++++++++++++++++++
 docs/usage.md | 1 +
 3 files changed, 770 insertions(+), 754 deletions(-)
 create mode 100644 docs/detail.md

diff --git a/docs/README.md b/docs/README.md
index 13530521..c03d035a 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,758 +1,8 @@
+- [Details](details.md)
+  - In details of the pipeline steps
 - [Usage](usage.md)
   - An overview of how the pipeline works, how to run it and a description of all of the different command-line flags.
 - [Output](output.md)
   - An overview of the different results produced by the pipeline and how to interpret them.
-
-# Sash Workflow Overview
-
-![Summary](images/sash_overview_qc.png)
-
-The **sash Workflow** is a genomic analysis framework comprising three primary pipelines:
-
-- **Somatic Small Variants (SNV somatic):** Detects single nucleotide variants (SNVs) and indels in tumor samples, emphasizing clinical relevance.
-- **Somatic Structural Variants (SV somatic):** Identifies large-scale genomic alterations (deletions, duplications, etc.) and integrates copy number data.
-- **Germline Variants (SNV germline):** Focuses on inherited variants linked to cancer predisposition.
- -These pipelines utilise **Bolt**, a Python package designed for modular processing, and leverage outputs from the [**DRAGEN**](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) **Variant Caller** alongside and the Hartwig Medical Foundation **WiGiTS** toolkit (via [Oncoanalyser](https://github.com/nf-core/oncoanalyser)) [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) in Oncoanalyser. Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation. - -# [**HMFtools WiGiTs**](https://github.com/hartwigmedical/hmftools/tree/master) - -**HMFtools WiGiTS** is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. Key components used in Sash include: - -- **[SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md):** - A tiered SNV/indel caller targeting ~10,000 cancer hotspots (e.g., OncoKB, CIViC) to recover low-frequency variants missed by DRAGEN. Outputs a VCF with confidence tiers (hotspot, panel, high/low confidence). - -- **[PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple):** - Estimates tumor **purity** (tumor cell fraction) and **ploidy** (average copy number), integrates copy number data, and calculates **TMB** (tumor mutation burden) and **MSI** (microsatellite instability). 
- -# Pipeline Inputs - -## Dragen - - def vcf = file(meta.dragen_somatic_dir).toUriString() + "/${meta.tumor_id}.hard-filtered.vcf.gz" - -## Oncoanalyser - - def subpath = "/gridss/${meta.tumor_id}.gridss.vcf.gz" - def subpath = "/sage/somatic/${meta.tumor_id}.sage.somatic.vcf.gz" - -# Workflows - -## **Somatic Small Variants (SNV/Indel, Tumor)Somatic small variants** - -#### General - -In the **Somatic Small Variants** workflow, variant detection is performed using the **DRAGEN Variant Caller** and **Oncoanalyser** that is relaing on **Somatic Alterations in Genome [SAGE)](https://github.com/hartwigmedical/hmftools/tree/master/sage),and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple))** outputs. It’s structured into four steps: **Integrations**, **Annotation**, **Filter**, and **Report**. The final outputs include an **HTML report** summarising the results. - -#### Summary - -1. **Rescue** variants using SAGE to recover low-frequency alterations in clinically important hotspots. -2. **Annotate** variants with clinical and functional information using PCGR. -3. **Filter** variants based on quality and frequency criteria (e.g., allele frequency, read depth, population frequency), while retaining those of potential clinical significance (hotspots, high-impact, etc.).Filter variants based on allele frequency (AF), supporting reads (AD), and population frequency (gnomAD AF), removing low-confidence and common variants. -4. **Report** final annotated variants in a comprehensive HTML report (PCGR, CANCER REPORT, LINX, multiqc) format. - -### Variant Calling integrations - -The **variant calling integrations** step use variants fromemploys the **Somatic Alterations in Genome (SAGE)** variant callertool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed filtered out. 
[SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage) focuses on **targets known cancer hotspots (from sources like CGI, CIViC, OncoKB) Targeted Hotspot**. Analysis, prioritising predefined genomic regions of high clinical or biological relevance with his own [filter](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the integration calling of biologically significant variants in a VCF that may have been missed otherwise. - -- Low-allele-frequency variants in hotspots genomic regions of clinical significance. -- Hotspots are derived from: - Cancer Genome Interpreter (CGI) - - CIViC \- Clinical interpretations of variants in cancer. - - OncoKB \- Precision Oncology Knowledge Base. -- Outputs a VCF containing rescued variants. - -##### Inputs: - -- From DRAGEN: somatic small variant callerVCF - - ${tumor\_id}.main.dragen.vcf.gz -- From oncoanalyser: SAGE VCF - - ${tumor\_id}.main.sage.filtered.vcf.gz - - Filter on chr 1..22 and chr X,Y,M - -##### Output: - -- Rescue: VCF - - ${tumor\_id}.rescued.vcf.g - -#### Details - -**Steps are:** - -1. **Select High-Confidence SAGE Calls in Hotspot Regions to ensure only high-confidence variants in clinically relevant regions are considered:** - - **Filter the SAGE output to retain only variants that pass quality filters and overlap with known hotspot regions.** - - **Hotspot regions are derived from databases such as:** - - **Cancer Genome Interpreter (CGI)** - - **CIViC (Clinical Interpretations of Variants in Cancer)** - - **OncoKB (Precision Oncology Knowledge Base)** - - **This ensures that only high-confidence variants in clinically relevant regions are considered.** -2. **Separate SAGE calls into existing and novel variants** - - **Compare the input VCF and the filtered SAGE VCF to identify overlapping and unique variants.** -3. 
**Annotate existing somatic variant calls also present in the SAGE calls in the input VCF** - - **Annotate variants that are re-called by SAGE:** - - **For each variant in the input VCF, check if it exists in the SAGE existing calls.** - - **For variants re-called by SAGE:** - - **If `SAGE FILTER=PASS` and input VCF `FILTER=PASS`:** - - **Set `INFO/SAGE_HOTSPOT` to indicate the variant is called by SAGE in a hotspot.** - - **If `SAGE FILTER=PASS` and input VCF `FILTER` is not `PASS`:** - - **Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is rescued by SAGE.** - - **Update `FILTER=PASS` to include the variant in the final analysis.** - - **If `SAGE FILTER` is not `PASS`:** - - **Append `SAGE_lowconf` to the `FILTER` field to flag low-confidence variants.** - - **Transfer SAGE `FORMAT` fields to the input VCF with a `SAGE_` prefix** -4. **Combine annotated input VCF with novel SAGE calls** - - **Prepare novel SAGE calls. For each variant in the SAGE VCF missing from the input VCF::** - - **Rename certain `FORMAT` fields in the novel SAGE VCF to avoid namespace collisions:** - - **For example, `FORMAT/SB` is renamed to `FORMAT/SAGE_SB`.** - - **Retain necessary `INFO` and `FORMAT` annotations while removing others to streamline the data.** - - **Summary Finalize the rescued of VCF file integration** - - - **The final VCF file includes:** - - **Original variants from the input VCF, annotated with SAGE information where applicable.** - - **Novel variants identified by SAGE in hotspot regions.** - - **Updated `FILTER` and `INFO` fields reflecting the rescue and annotation process.** - - **The rescued VCF provides a comprehensive set of variants for downstream analysis, prioritizing clinically significant mutations.** - -### Annotation - -The **Annotation** consists of three processes:step employs Reference Sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots),UMCCR panel of normals and theand 
the **Personal Cancer Genome Reporter** (PCGR) tool to enrich variants with detailed functional and with clinical information using **ACMG** guidelines. PCGR classifies variants into **tiers** based on their clinical and biological significance and incorporates **mutational signature** analysis to provide insights into underlying mutational processes. To manage memory usage effectively, the input VCF file is divided into chunks, each containing up to 500,000 variants. Each chunk is processed independently through PCGR, and after annotation, the chunks are merged to produce an annotated VCF and TSV file. - -#### These annotations are used to decide which variants are retained or filtered in the next step - -Summary: -Use **PCGR** to enrich the VCF with: - -- Functional impact information (e.g., consequences, mutation hotspots). -- Clinical relevance (e.g., tier classifications, mutational signatures). -- Process VCF files in chunks ≤500,000 variants each. -- Merge annotated chunks into a unified VCF. - -##### Inputs: - -- Small variant vcfRescue VCF - - ${tumor\_id}.main.sage.filtered.vcf.gz - -##### Output: - -- Annotated VCF - - ${tumor\_id}.annotations.vcf.g - -Details: - -**Steps are:** - -1. **Set FILTER to "PASS" for unfiltered variants** - - Iterate over the input VCF file the `FILTER` field to `PASS` for any variants that currently have no filter status (`FILTER` is `.` or `None`). This standardization is necessary for downstream tools. -2. **Annotate the VCF against reference sources** - - Use **vcfanno** to add annotations to the VCF file: - - **gnomAD** - - **Hartwig Hotspots** - - **ENCODE Blacklist** - - **Genome in a Bottle High-Confidence Regions**: Mark high-confidence regions from the Genome in a Bottle benchmark. - - **Low and High GC Regions**: Mark regions with \30% or \65% GC content, compiled by GA4GH. - - **Bad Promoter Regions**: Annotate regions with poor coverage, compiled by GA4GH. -3. 
**Annotate with UMCCR panel of normals counts** - - Use **vcfanno** and **bcftools** to annotate the VCF with counts from the **UMCCR panel of normals**, built from tumor-only Mutect2 calls from approximately 200 normal samples. This helps identify and filter out recurrent sequencing artifacts or germline variants. -4. **Standardize the VCF fields** - - Add new `INFO` fields for use with **PCGR**: -- `TUMOR_AF`, `NORMAL_AF`: Tumor and normal allele frequencies. -- `TUMOR_DP`, `NORMAL_DP`: Tumor and normal read depths. -- Add the `AD` FORMAT field: -- `AD`: Allelic depths for the reference and alternate alleles. -5. **Prepare VCF for PCGR annotation** - - Exclude unnecessary data from the VCF header keeping on INFO AF/DP . - - Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR. - - Set `FILTER` to `PASS` and remove all `FORMAT` and sample columns. - -6. **Run PCGR to annotate VCF against external sources** - - Use **PCGR** (Personal Cancer Genome Reporter) to annotate the VCF with clinical, functional, and biological information. - - Classify variants by tiers based on annotations and functional impact according to **ACMG** guidelines. - - Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `INTOGEN_DRIVER_MUT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. - - External sources used during this step include **VEP**, **ClinVar**, **COSMIC**, **TCGA**, **ICGC**, **Open Targets Platform**, **CancerMine**, **DoCM**, **CBMDB**, **DisGeNET**, **Cancer Hotspots**, **dbNSFP**, **UniProt/SwissProt**, **Pfam**, **DGIdb**, and **ChEMBL**. -7. **Transfer PCGR annotations to the full set of variants** - - merge the PCGR annotations back into the original VCF file. - - Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available. 
- - Preserve the `FILTER` statuses and other annotations from the original VCF. -8. **Filter variants to remove putative germline variants and artefactsartifacts while keeping known hotspots/actionable variants** - - **Keep variants**: - - Called by **SAGE** in known hotspots (CGI, CIViC, OncoKB) regardless of other evidence. - - With PCGR TIER 1 and 2 classifications, indicating strong or potential clinical significance according to ACMG guidelines. - - All driver mutations from; - - **IntOGen** - - mutation hotspots - - ClinVar pathogenic or uncertain significance - - COSMIC count ≥10 - - TCGA pancancer count ≥5 - - ICGC PCAWG count ≥3. - - **Apply filters to other variants**: - - Remove variants with `AF 10%`. - - Remove common variants in gnomAD (<`population AF ≥ 1%`), adding them to the germline set. - - Remove variants present in ≥5 samples of the Panel of Normals. - - Remove indels in "bad promoter" regions (as defined by GA4GH). - - Remove variants overlapping the ENCODE blacklist. - - Remove variants with variant depth `VD 4`. - - Remove variants with `VD < 6` and overlapping a low complexity region. - - Remove **VarDict** strand-biased variants unless supported by other callers. -9. **Report passing variants using PCGR, classified by the ACMG tier system** -10. Generate the final report of variants classified according to clinical significance using **PCGR**, ready for downstream analysis. - - #### - -#### - -### Filter - -The **Filter** step applies a series of stringent filters to somatic variant calls in the VCF file, ensuring the retention of high-confidence and biologically meaningful variants. - -Inputs: - -- Annotated VCF - - ${meta.tumor\_id}.annotations.vcf.gz - -#### Output: - -- Filter VCF - - ${meta.tumor\_id}\*filters\_set.vcf.gz - -**Filters:** - -**1\. 
Technical Quality Filters** - -#### **1.1 Allele Frequency (20% are also excluded.** -- **This step reduces contamination from sequencing artefacts or undetected germline variants.** - -### **3\. Rescue and Clinical Significance Filters** - -**These variants are retained even if they fail technical filters.** - -### **3.1 Hotspot Rescue** - -- **Variants located in Hartwig, OncoKB, or other curated hotspot databases are retained, even if they fail other quality or frequency filters.** - - #### **3.2 Reference Database Hit Count Rescue** - -- **Variants with strong prior evidence in COSMIC, TCGA, or ICGC are retained, even if they fail standard filtering:** - - **COSMIC count ≥10** - - **TCGA pan-cancer count ≥5** - - **ICGC PCAWG count ≥5** - - #### **3.3 ClinVar Pathogenicity Rescue** - -- **Variants classified in ClinVar as:** - - **Likely Pathogenic** - - **Pathogenic** - - **Uncertain Significance (VUS) with strong clinical evidence** - -- Allele Frequency (AF) Filter: - - Excludes variants with a tumor allele frequency below a threshold of 0.1. -- Allele Depth (AD) Filter: - - Removes variants with fewer than 4 supporting reads in the tumor sample. -- Degraded Mappability AD Filter: - - Applies stricter thresholds in regions with low sequence complexity or poor mappability, where errors are more likely. - - Requires a minimum of 6 supporting reads in low-sequence complexity regions(difficult region) to retain the variant. Tumor\_ad \ 6 -- Non-GIAB AD Filter: - - Removes variants not confirmed by the Genome in a Bottle (= 0.01 -- Panel of Normals (PON) Germline Filter: - - Filters out variants with an allele frequency in the PON below 0.20. - - Additionally excludes variants that occur in more than 5 PON samples to mitigate germline contamination or recurrent artifacts. 
PON\_COUNT \>= 5 -- FIlter rescue variant: - -Variants meeting these criteria are flagged as `CLINICAL_POTENTIAL_RESCUE` are **NOT filtered out** - -- **Reference Database Hit Counts**: - - Variants with a **COSMIC count** of ≥10. - - Variants with a **TCGA pan-cancer count** of ≥5. - - Variants with an **ICGC PCAWG count** of ≥5. -- **ClinVar Significance**: - - Variants with ClinVar classifications matching the following categories are rescued: - - `conflicting_interpretations_of_pathogenicity` - - `likely_pathogenic` - - `pathogenic` - - `uncertain_significance` -- **Mutation Hotspots**: - - Variants identified as hotspots in: - - `HMF_HOTSPOT` - - `PCGR_MUTATION_HOTSPOT` -- **PCGR Tiers**: - - Variants classified as: - - `TIER_1` - - `TIER_2` - -### Repor - -The **Report** step utilises the **Personal Cancer Genome Reporter (PCGR)** - -Inputs: - -- Purple purity -- Filter VCF -- Dragen VCF - -#### Output: - -- PCGRCancer repor - - ${meta.tumor\_id}.pcgr\_acmg.grch38.html - -1. **Generate BCFtools Statistics on the Input VCF:** - The code runs a helper function (`bcftools_stats_prepare`) to create a modified version of the input VCF, adjusting quality scores so that `bcftools stats` can produce more meaningful outputs. It then executes `bcftools stats` to gather statistics on variant quality and distribution, storing the results in a text file. -2. **Calculate Allele Frequency Distributions:** - The `allele_frequencies` function uses external tools (bcftools, bedtools) to: - - Filter and normalize variants according to high-confidence regions. - - Extract allele frequency data from tumor samples. - - Produce both a global allele frequency summary and a subset of allele frequencies restricted to key cancer genes. -3. **Compare Variant Counts From Two Variant Sets (DRAGEN vs. BOLT)** - - The code counts the total number and types of variants (SNPs, Indels, Others) passing filters in both a DRAGEN VCF and the FILTER BOLT VCF. -4. 
**Count Variants by Processing Stage** -5. **Parse Purity and Ploidy Information (Purple Data)** -6. **Run PCGR Annotation** - -## Somatic structural variants - -The **Somatic Structural Variants (SVs) pipeline** identifies and annotates **large-scale genomic alterations**, including **deletions, duplications, inversions, insertions, and translocations** in tumor samples. This step integrates outputs from **DRAGEN Variant Caller**, **GRIDSS2**, using PURPLE applies filtering criteria, and prioritizes clinically significant structural variants.The analysis of somatic structural variants (SVs) involves processing, annotating, and prioritizing variants to identify those with clinical and biological significance. This process uses outputs from tools like PURPLE and GRIDSS and involves several key steps: - -### **Summary:** - -1. **GRIPDSS filtering:** - - GRIPDSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and repeat masker annotations data are the specific to umccr like known\_fusions -2. PURPLE - - Combines the GRIPSS-filtered SV calls with copy number variation (CNV) data and tumor purity/ploidy estimates. PURPLE adjusts SV breakpoints based on copy number transitions and robustly classifies events as somatic versus germline. -3. Annotation - - Combines SV calls from GRIPSS with CNV data from PURPLE - - Annotate variant using [SnpEff](https://github.com/pcingola/SnpEff) -4. Prioritisation - - Prioritise SV annotation based on [AstraZeneca-NGS](https://github.com/AstraZeneca-NGS/simple_sv_annotation) using curated reference data including umccr panel genes, tumor suppressor gene lists, hartwig known fusion pairs, [appris](https://ngdc.cncb.ac.cn/databasecommons/database/id/323) - - Prioritise variants based on clinical relevance and support metric -5. Repor - - Cancer repor - - Multiqc -6. **Assign SV Types:** - - Classify SVs as duplications or deletions based on copy number thresholds. 
- - Split variants into separate files for structural variants (SVs) and copy number variants (CNVs). -7. **Annotate and Prioritize Variants:** - - Use **SnpEff** to annotate variants with gene-level and functional impact information. - - Prioritize variants based on clinical relevance and support metrics. - - Generate TSV (tab-separated values) files summarizing the prioritized SVs and CNVs. -8. **Generate Summary Reports:** -9. Create TSV (tab-separated values) files summarizing the prioritized SVs and CNVs for downstream analysis and reporting. - - - - - ### **Input Files** - - ### **Primary SV VCFs:** - - - GRIDSS2 - - ${meta.tumor\_id}.gridss.vcf.gz - -### Details - -### **Detailed Steps:** - -1. **GRIPSS filtering:** - - Evaluate split-read and paired-end support; discard variants with low support. - - Apply panel-of-normals filtering to remove artefacts observed in normal samples. - - Retain variants overlapping known oncogenic fusion hotspots (using UMCCR-curated lists). - - Exclude variants in repetitive regions based on Repeat Masker annotations. -2. **Purple:** - - **Merge SV calls with CNV segmentation data.** - - **Estimate tumor purity and ploidy.** - - **Adjust SV breakpoints based on copy number transitions.** - - **Classify SVs as somatic or germline.** -3. **Annotation** - - **Compile SV and CNV information into a unified VCF file.** - - **Extend the VCF header with PURPLE-related INFO fields (e.g., PURPLE\_baf, PURPLE\_copyNumber).** - - **Convert CNV records from TSV format into VCF records with appropriate SVTYPE tags (e.g., 'DUP' for duplications, 'DEL' for deletions).** - - **Run snpEff to annotate the unified VCF with functional information such as gene names, transcript effects, and coding consequences.** -4. 
**Prioritization** - - **Run the prioritization module (forked from the AstraZeneca simple\_sv\_annotation tool) using reference data files including known fusion pairs, known fusion 5′ and 3′ lists, key genes, and key tumor suppressor genes.** - - **Classify Variants:** - - **Structural Variants (SVs):** Variants labeled with the source `sv_gridss`. - - **Copy Number Variants (CNVs):** Variants labeled with the source `cnv_purple`. - - **Prioritise variants on a 4 tier system \- 1 (high) \- 2 (moderate) \- 3 (low) \- 4 (no interest):** -- **exon loss** - - **on cancer gene list (1)** - - **other (2)** -- **gene fusion** - - **paired (hits two genes)** - - **on list of known pairs (1) (curated by [HMF]()** - - **one gene is a known promiscuous fusion gene (1) (curated by [HMF]()** - - **on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2)** - - **other:** - - **one or two genes on cancer gene list (2)** - - **neither gene on cancer gene list (3)** - - **unpaired (hits one gene)** - - **on cancer gene list (2)** - - **others (3)** -- **upstream or downstream (a specific type of fusion, e.g. 
one gene is got into control of another gene's promoter and get over-expressed ()** - - **on cancer gene list genes (2)** -- **LoF or HIGH impact in a tumor suppressor** - - **on cancer gene list (2)** - - **other TS gene (3)** -- **other (4)** - - **Filter Low-Quality Calls:** - - **Keep variants with sufficient read support (e.g., split reads ().** - - **Exclude Tier 3 and Tier 4 variants where `SR 5` and `PR < 5`.** - - **Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (<`AF0` or `AF1`) are below 0.1.** - - **The module assigns a priority tier to each variant (ranging from Tier 1 for high priority to Tier 4 for no interest) and populates the INFO fields:** - - **SIMPLE\_ANN: A simplified annotation string that includes SV type, effect, involved genes, transcript(s), a description, and the assigned tier.** - - **SV\_TOP\_TIER: A numeric field indicating the highest priority tier for the variant.** - - **The unified VCF is then split into separate files for SVs and CNVs using bcftools, and TSV summary reports are generated.** -1. **Report** - - **Cancer Report: Integrates the prioritized SV data with somatic SNVs, CNVs, and quality metrics to provide a comprehensive overview of the tumor’s genomic alterations. This report includes detailed tables, a fusion gene summary, and a Circos plot (produced by PURPLE) that visualizes copy number and SV data.** - - **MultiQC Report: Aggregates quality control metrics from GRIDSS2, PURPLE, LINX, and the annotation/prioritization steps, providing an overall assessment of data quality.** - -2. **Obtain Input Structural Variants:** - - **Source Data:** - - Obtain the structural variant VCF file generated by **PURPLE**, which integrates data from **GRIDSS** (for SV detection), **PURPLE** (for copy number analysis). - - The input includes both structural variants and copy number changes detected in the tumor sample. -3. 
**\\Assign Structural Variant Types:** - - **Classify Variants:** - - **Structural Variants (SVs): Variants labeled with the source sv\_gridss.** - - **Copy Number Variants (CNVs): Variants labeled with the source cnv\_purple.** - - **Prioritise variants on a 4 tier system \- 1 (high) \- 2 (moderate) \- 3 (low) \- 4 (no interest):** -- **exon loss** - - **on cancer gene list (1)** - - **other (2)** -- **gene fusion** - - **paired (hits two genes)** - - **on list of known pairs (1) (curated by [HMF]()** - - **one gene is a known promiscuous fusion gene (1) (curated by [HMF]()** - - **on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2)** - - **other:** - - **one or two genes on cancer gene list (2)** - - **neither gene on cancer gene list (3)** - - **unpaired (hits one gene)** - - **on cancer gene list (2)** - - **others (3)** -- **upstream or downstream (a specific type of fusion, e.g. one gene is got into control of another gene's promoter and get over-expressed ()** - - **on cancer gene list genes (2)** -- **LoF or HIGH impact in a tumor suppressor** - - **on cancer gene list (2)** - - **other TS gene (3)** -- **other (4)** -* - - **Filter Low-Quality Calls:** - **Apply Quality Filters:** - - **Keep variants with sufficient read support (e.g., split reads ().** - - **Exclude Tier 3 and Tier 4 variants where `SR 5` and `PR < 5`.** - - **Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (<`AF0` or `AF1`) are below 0.1.** -1. **Generate Summary Reports** - - - -## Germline small variants - -Filtering Select passing variants in the given [gene panel transcript regions](https://github.com/umccr/gene_panels/tree/main/germline_panel) made with PMCC familial cancer clinic list then make CPSR report. - -1. **Prepare** - 1. **Selection of Passing Variants:** - 1. Raw germline variant calls (e.g. 
from DRAGEN or an ensemble caller) are filtered to retain only those variants marked as PASS (or with no filter flag) - 2. **Selection of Gene Panel Variants:** - 1. The filtered variants are then further restricted to regions defined by a gene panel transcript regions file. -2. Report: CPSR - -The CPSR (Cancer Predisposition Sequencing Report) includes the following: - -**Settings**: - -- Sample metadata -- Report configuration -- Virtual gene panel - -**Summary of Findings**: - -- Variant statistics - -**Variant Classification**: - -ClinVarc and Non-ClinVar - -- Class 5 \- Pathogenic variants -- Class 4 \- Likely Pathogenic variants -- Class 3 \- Variants of Uncertain Significance (VUS) -- Class 2 \- Likely Benign variants -- Class 1 \- Benign variants -- Biomarkers - -PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): - -- **Tier 1 (High):** Highest priority variants with strong clinical relevance. -- **Tier 2 (Moderate):** Variants with potential clinical significance. -- **Tier 3 (Low):** Variants with uncertain significance. -- **Tier 4 (No Interest):** Variants unlikely to be clinically relevant. 
- -# Common Reports - -### [Cancer report](https://umccr.github.io/gpgr/) - -UMCCR cancer report containing: - -**Tumor Mutation Burden (TMB):** - -- **Data Source:** filtered somatic VCF -- **Tool:** PURPLE - -#### **Mutational Signatures:** - -- **Data Source:** filtered SNV/CNV VCF -- **Tool:** MutationalPatterns R package (via PCGR) - -#### **Contamination Score:** - -- **Data Source:** – -- **Note:** No dedicated contamination metric is currently generated - -#### **Purity & Ploidy:** - -- **Data Source:** COBALT (providing read-depth ratios) and AMBER (providing B-allele frequency measurements) -- **Tool:** PURPLE, which uses these inputs to compute sample purity (percentage of tumor cells) and overall ploidy (average copy number) - -#### **HRD Score:** - -- **Data Source:** HRD analysis output file (${meta.tumor\_id}.hrdscore.tsv) -- **Tool:** DRAGEN - -#### **MSI (Microsatellite Instability):** - -- **Data Source:** Indels in microsatellite regions from SNV/CNV -- **Tool:** PURPLE - -#### **Structural Variant Metrics:** - -- **Data Source:** GRIDSS/GRIPSS SV VCF and PURPLE CNV segmentation -- **Tools:** GRIDSS/GRIPSS and PURPLE - -#### **Copy Number Metrics (Segments, Deleted Genes, etc.):** - -- **Data Source:** PURPLE CNV outputs (segmentation files, gene-level CNV TSV) -- **Tool:** PURPLE - -The LINX report includes the following: -- **Tables of Variants**: - - Breakends - - Links - - Driver Catalog -- **Plots**: - - Cluster-Level Plots - -### MultiQC - -**General Stats**: Overview of QC metrics aggregated from all tools, providing high-level sample quality information. - -**DRAGEN**: Mapping metrics (mapped reads, paired reads, duplicated alignments, secondary alignments), WGS coverage (average depth, cumulative coverage, per-contig coverage), fragment length distributions, trimming metrics, and time metrics for pipeline steps. 
- -**PURPLE**: Sample QC status (PASS/FAIL), ploidy, tumor purity, polyclonality percentage, tumor mutational burden (TMB), microsatellite instability (MSI) status, and variant metrics for somatic and germline SNPs/indels. - -**BcfTools Stats**: Variant substitution types, SNP and indel counts, quality scores, variant depth, and allele frequency metrics for both somatic and germline variants. - -**DRAGEN-FastQC**: Per-base sequence quality, per-sequence quality scores, GC content (per-sequence and per-position), HRD score, sequence length distributions, adapter contamination, and sequence duplication levels. - -### PCGR - -**Personal Cancer Genome Reporter (PCGR)** tool to generate a comprehensive, interactive HTML report that consolidates filtered and annotated variant data, providing detailed insights into the somatic variants identified. - -**Key Metrics:** - -- **Variant Classification and Tier Distribution:** PCGR categorizes variants into tiers based on their clinical and biological significance. The report details the proportion of variants across different tiers, indicating their potential clinical relevance. -- **Mutational Signatures:** The report includes analysis of mutational signatures, offering insights into the mutational processes active in the tumor. -- **Copy Number Alterations (CNAs):** Visual representations of CNAs are provided, highlighting significant gains and losses across the genome. Genome-wide plots display regions of copy number gains and losses. -- **Tumor Mutational Burden (TMB):** Calculations of TMB are included, which can have implications for immunotherapy eligibility. The report presents the TMB value, representing the number of mutations per megabase. -- **Microsatellite Instability (MSI) Status:** Assessment of MSI status is performed, relevant for certain cancer types and treatment decisions. 
-- **Clinical Trials Information:** Information on relevant clinical trials is incorporated, offering potential therapeutic options based on the identified variants. - -**Note:** The PCGR tool is designed to process a maximum of 500,000 variants. If the input VCF file contains more than this limit, variants exceeding 500,000 will be filtered ou - -### CPSR Repor - -The CPSR (Cancer Predisposition Sequencing Report) includes the following: - -**Settings**: - -- Sample metadata -- Report configuration -- Virtual gene panel - -**Summary of Findings**: - -- Variant statistics - -**Variant Classification**: - -ClinVarc and Non-ClinVar - -- Class 5 \- Pathogenic variants -- Class 4 \- Likely Pathogenic variants -- Class 3 \- Variants of Uncertain Significance (VUS) -- Class 2 \- Likely Benign variants -- Class 1 \- Benign variants -- Biomarkers - -PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): - -- **Tier 1 (High):** Highest priority variants with strong clinical relevance. -- **Tier 2 (Moderate):** Variants with potential clinical significance. -- **Tier 3 (Low):** Variants with uncertain significance. -- **Tier 4 (No Interest):** Variants unlikely to be clinically relevant. - -# Reference data - -### [UMCCR Genes panels](https://github.com/umccr/gene_panels) - -### Genome annotations - -WiGiTS (hmftools) - -**Annotation Databases**: - -- **gnomAD**: Provides population allele frequencies to help distinguish common variants from rare ones. -- **ClinVar**: Offers clinically curated variant information, aiding in the interpretation of potential pathogenicity. -- **COSMIC**: Contains data on somatic mutations found in cancer, facilitating the identification of cancer-related variants. -- **Gene Panels**: Focuses analysis on specific sets of genes relevant to particular conditions or research interests. - -**Structural Variant Data**: - -- **SnpEff Databases**: Used for predicting the effects of variants on genes and proteins. 
-- **Panel of Normals (PON)**: Helps filter out technical artifacts by comparing against a set of normal samples. -- **RepeatMasker**: Identifies repetitive genomic regions to prevent false-positive variant calls. - -**Databases/datasets PCGR Reference Data:** - -***Version: v20220203*** - -- [GENCODE](https://www.gencodegenes.org/) \- high quality reference gene annotation and experimental validation (release 39/19) -- [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) \- Database of non-synonymous functional predictions (20210406 () -- [dbMTS](http://database.liulab.science/dbMTS) \- Database of alterations in microRNA target sites (v1.0) -- [ncER](https://github.com/TelentiLab/ncER_datasets) \- Non-coding essential regulation score (genome-wide percentile rank) (v2) -- [GERP](http://mendel.stanford.edu/SidowLab/downloads/gerp/) \- Genomic Evolutionary Rate Profiling (GERP) \- rejected substitutions (RS) score (v1) -- [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021\_11 () -- [UniProtKB](http://www.uniprot.org) \- Comprehensive resource of protein sequence and functional information (2021\_04) -- [gnomAD](http://gnomad.broadinstitute.org) \- Germline variant frequencies exome-wide (r2.1 () -- [dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) \- Database of short genetic variants (154) -- [DoCM](http://docm.genome.wustl.edu) \- Database of curated mutations (release 3.2) -- [CancerHotspots](http://cancerhotspots.org) \- A resource for statistically significant mutations in cancer (2017) -- [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar) \- Database of genomic variants of clinical significance (20220103) -- [CancerMine](http://bionlp.bcgsc.ca/cancermine/) \- Literature-mined database of tumor suppressor genes/proto-oncogenes (20211106 () -- [OncoTree](http://oncotree.mskcc.org/) \- Open-source ontology developed at MSK-CC for standardization of cancer type diagnosis (2021-11-02) -- [DiseaseOntology](http://disease-ontology.org) \- 
Standardized ontology for human disease (20220131) -- [EFO](https://github.com/EBISPOT/efo) \- Experimental Factor Ontology (v3.38.0) -- [GWAS\_Catalog](https://www.ebi.ac.uk/gwas/) \- The NHGRI-EBI Catalog of published genome-wide association studies (20211221) -- [CGI](http://cancergenomeinterpreter.org/biomarkers) \- Cancer Genome Interpreter Cancer Biomarkers Database (20180117) - -### - -### - -# Sash Module Outputs: - -**Somatic SNVs** - -- File: `smlv_somatic/filter/{tid}.pass.vcf.gz` -- Description: Contains somatic single nucleotide variants (SNVs) with filtering applied. - -**Somatic SVs** - -- File: `sv_somatic/prioritise/{tid}.sv.prioritised.vcf.gz` -- Description: Contains somatic structural variants (SVs) with prioritization applied. - -**Somatic CNVs** - -- File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som.tsv.gz` -- Description: Contains somatic copy number variations (CNVs) data. - -**Somatic Gene CNVs** - -- File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som_gene.tsv.gz` -- Description: Contains gene-level somatic copy number variations (CNVs) data. - -**Germline SNVs** - -- File: `dragen_germline_output/{nid}.hard-filtered.vcf.gz` -- Description: Contains germline single nucleotide variants (SNVs) with hard filtering applied. - -**Purple Purity, Ploidy, MS Status** - -- File: `purple/{tid}.purple.purity.tsv` -- Description: Contains estimated tumor purity, ploidy, and microsatellite status. - -**PCGR JSON with TMB** - -- File: `smlv_somatic/report/pcgr/{tid}.pcgr_acmg.grch38.json.gz` -- Description: Contains PCGR annotations, including tumor mutational burden (TMB). - -**DRAGEN HRD Score** - -- File: `dragen_somatic_output/{tid}.hrdscore.tsv` -- Description: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis. - -# FAQ - -### Q: Do we use PCGR for the rescue of sage? 
- -**A:** In Somatic SV, we used sage to make variant calling then we did annotation of the variant using PCGR, then we filtered the variant. If variants have high-tier ranks, they are not filtered out whatsoever - -### Q: how are hypermutated samples handled in the current version, and is there any impact on derived metrics such as TMB or MSI? - -**A:** In the current version of Sash, hypermutated samples are identified based on a threshold 500,000 of total somatic variant counts. For instance, if the variant count exceeds the threshold , the sample is flagged as hypermutated. When this occurs we will filter variant that are not considered that don’t have clinical impact, in hotspot region, until we meet the threshold. We that wil impact the TMB and MSI calculated by purple. For Now we are using the TMB and MSI of purple is this edges case. New reale will be hable to get correct TMB and MSI from purple - -### Q: how are we handling non-standard chromosomes if present in the input VCFs (ALTs, chrM, etc)? -**A:** Filter out as we Filter on chr 1..22 and chr X,Y,M - -### Q: inputs for the cancer reporter \- have they changed (and what can we harmonize); e.g., where is the Circos plot from at this point? -**A:** Circos plots come Purple - -### Q: we dropped the CACAO coverage reports; can we discuss how to utilize DRAGEN or WiGiTS coverage information instead? - -### Q: what TMB score is displayed in the cancer reporter? -**A:** The TMB display is the on calculated by pcgr - -### Q: what filtered VCF is the source for the mutational signatures? -**A:** We use the filtred VCF for mutational signatures - -### Q: Where is the contamination score coming from currently? -**A:** I don’t think there is contamination at the moment in sash - -### Q: Do the GRIPSS step do something more than what's happening in oncoanalyser ? 
-**A:** no different settings are applied to GRIPSS other than reference files - -### Q: Does the data from Somatic Small Variantsworkflow are use for the SV ? -**A:** iirc data from the somatic small variant workflow is not used in the sv workflow \ No newline at end of file +- [Architectural decision record (ADR)](adr.md) + - describes a choice the team makes about a significant aspect of the software architecture they're planning to build \ No newline at end of file diff --git a/docs/detail.md b/docs/detail.md new file mode 100644 index 00000000..6e20f084 --- /dev/null +++ b/docs/detail.md @@ -0,0 +1,765 @@ +- [Usage](usage.md) + - An overview of how the pipeline works, how to run it and a description of all of the different command-line flags. +- [Output](output.md) + - An overview of the different results produced by the pipeline and how to interpret them. + +# Sash Workflow Overview + +![Summary](images/sash_overview_qc.png) + +The sash Workflow is a genomic analysis framework comprising three primary pipelines: + +- Somatic Small Variants (SNV somatic): Detects single nucleotide variants (SNVs) and indels in tumor samples, emphasizing clinical relevance. +- Somatic Structural Variants (SV somatic): Identifies large-scale genomic alterations (deletions, duplications, etc.) and integrates copy number data. +- Germline Variants (SNV germline): Focuses on inherited variants linked to cancer predisposition. + +These pipelines utilise Bolt, a Python package designed for modular processing, and leverage outputs from the [DRAGEN](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) Variant Caller alongside the Hartwig Medical Foundation [HMFtools WiGiTS](https://github.com/hartwigmedical/hmftools/tree/master) toolkit, run via [Oncoanalyser](https://github.com/nf-core/oncoanalyser). 
Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation. + +# [HMFtools WiGiTS](https://github.com/hartwigmedical/hmftools/tree/master) + +HMFtools WiGiTS is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. Key components used in Sash include: + +- [SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md): + A tiered SNV/indel caller targeting ~10,000 cancer hotspots (e.g., OncoKB, CIViC) to recover low-frequency variants missed by DRAGEN. Outputs a VCF with confidence tiers (hotspot, panel, high/low confidence). + +- [PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple): + Estimates tumor purity (tumor cell fraction) and ploidy (average copy number), integrates copy number data, and calculates TMB (tumor mutation burden) and MSI (microsatellite instability). + +# Pipeline Inputs + +## Dragen + +{tumor_id}.hard-filtered.vcf.gz + +## Oncoanalyser + +### Gridss +{tumor_id}.gridss.vcf.gz + +### SAGE +{tumor_id}.sage.somatic.vcf.gz +### VIRUSBreakend +virusbreakend directory +### Cobalt + +Cobalt directory used by purple +### Amber +Amber directory used by purple + +# Workflows + +## Somatic Small Variants (SNV/Indel, Tumor) + +#### General + +In the Somatic Small Variants workflow, variant detection relies on outputs from the DRAGEN Variant Caller and from Oncoanalyser, specifically [SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/tree/master/sage) and [PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple). It is structured into four steps: Integrations, Annotation, Filter, and Report. The final outputs include an HTML report summarising the results. + +#### Summary + +1. Rescue variants using SAGE to recover low-frequency alterations in clinically important hotspots. +2. 
Annotate variants with clinical and functional information using PCGR. +3. Filter variants on quality and frequency criteria (allele frequency (AF), supporting reads (AD), and gnomAD population frequency), removing low-confidence and common variants while retaining those of potential clinical significance (hotspots, high-impact, etc.). +4. Report final annotated variants in comprehensive HTML reports (PCGR, Cancer Report, LINX, MultiQC). + +### Variant Calling integrations + +The variant calling integrations step uses variants from the Somatic Alterations in Genome (SAGE) variant caller, which is more sensitive than DRAGEN, particularly for variants with low allele frequency that might otherwise have been missed or filtered out. [SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage) targets known cancer hotspots (from sources like CGI, CIViC, OncoKB), prioritising predefined genomic regions of high clinical or biological relevance, and applies its own [soft filters](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This allows biologically significant variants that would otherwise have been missed to be integrated into the VCF. + +- Low-allele-frequency variants in hotspot genomic regions of clinical significance. +- Hotspots are derived from: + - Cancer Genome Interpreter (CGI) + - CIViC \- Clinical interpretations of variants in cancer. + - OncoKB \- Precision Oncology Knowledge Base. +- Outputs a VCF containing rescued variants. + +##### Inputs: + +- From DRAGEN: somatic small variant caller VCF + - ${tumor\_id}.main.dragen.vcf.gz +- From oncoanalyser: SAGE VCF + - ${tumor\_id}.main.sage.filtered.vcf.gz + + Filter on chr 1..22 and chr X,Y,M + +##### Output: + +- Rescue: VCF + - ${tumor\_id}.rescued.vcf.g + +#### Details + +Steps are: + +1. 
Select high-confidence SAGE calls in hotspot regions: + - Filter the SAGE output to retain only variants that pass quality filters and overlap with known hotspot regions. + - Hotspot regions are derived from databases such as: + - Cancer Genome Interpreter (CGI) + - CIViC (Clinical Interpretations of Variants in Cancer) + - OncoKB (Precision Oncology Knowledge Base) + - This ensures that only high-confidence variants in clinically relevant regions are considered. +2. Separate SAGE calls into existing and novel variants + - Compare the input VCF and the filtered SAGE VCF to identify overlapping and unique variants. +3. Annotate input VCF calls that are also present in the SAGE calls + - Annotate variants that are re-called by SAGE: + - For each variant in the input VCF, check if it exists in the existing SAGE calls. + - For variants re-called by SAGE: + - If `SAGE FILTER=PASS` and input VCF `FILTER=PASS`: + - Set `INFO/SAGE_HOTSPOT` to indicate the variant is called by SAGE in a hotspot. + - If `SAGE FILTER=PASS` and input VCF `FILTER` is not `PASS`: + - Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is rescued by SAGE. + - Update `FILTER=PASS` to include the variant in the final analysis. + - If `SAGE FILTER` is not `PASS`: + - Append `SAGE_lowconf` to the `FILTER` field to flag low-confidence variants. + - Transfer SAGE `FORMAT` fields to the input VCF with a `SAGE_` prefix. +4. Combine annotated input VCF with novel SAGE calls + - Prepare novel SAGE calls. For each variant in the SAGE VCF missing from the input VCF: + - Rename certain `FORMAT` fields in the novel SAGE VCF to avoid namespace collisions: + - For example, `FORMAT/SB` is renamed to `FORMAT/SAGE_SB`. + - Retain necessary `INFO` and `FORMAT` annotations while removing others to streamline the data.
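
The re-call rules in step 3 and the `FORMAT` renaming in step 4 can be sketched as two small functions. This is an illustrative sketch only, not Bolt's actual code: the names `annotate_recalled_variant` and `prefix_sage_format_keys`, the representation of `FILTER` as a list of strings, and the choice to drop `PASS` when `SAGE_lowconf` is appended are all assumptions made for the example.

```python
def _is_pass(filters):
    """Treat an empty list, '.', or a lone 'PASS' as passing."""
    return all(f in ("PASS", ".") for f in filters)


def annotate_recalled_variant(input_filters, sage_filters):
    """Return (new FILTER values, INFO flags to set) for a variant in both VCFs.

    Hypothetical helper: filters are plain lists of strings, not VCF records.
    """
    if _is_pass(sage_filters):
        flags = {"SAGE_HOTSPOT"}
        if not _is_pass(input_filters):
            # The input call had failed a filter, so SAGE rescues it.
            flags.add("SAGE_RESCUE")
        return ["PASS"], flags
    # The SAGE call did not pass: keep existing failures, flag low confidence.
    # Assumption: PASS is dropped, since PASS cannot sit beside other filters.
    kept = [f for f in input_filters if f not in ("PASS", ".")]
    return kept + ["SAGE_lowconf"], set()


def prefix_sage_format_keys(keys):
    """Map SAGE FORMAT keys to SAGE_-prefixed names, e.g. SB -> SAGE_SB."""
    return {key: f"SAGE_{key}" for key in keys}
```

Under these assumptions, a call that failed an input filter but passes SAGE comes back as `PASS` with both `SAGE_HOTSPOT` and `SAGE_RESCUE` set, while a call that SAGE itself did not pass gains `SAGE_lowconf` and no flags.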
+ + Summary of the rescued VCF + + - The final VCF file includes: + - Original variants from the input VCF, annotated with SAGE information where applicable. + - Novel variants identified by SAGE in hotspot regions. + - Updated `FILTER` and `INFO` fields reflecting the rescue and annotation process. + - The rescued VCF provides a comprehensive set of variants for downstream analysis, prioritizing clinically significant mutations. + +### Annotation + +The Annotation step employs reference sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots), the UMCCR panel of normals, and the Personal Cancer Genome Reporter (PCGR) tool to enrich variants with detailed functional and clinical information using ACMG guidelines. PCGR classifies variants into tiers based on their clinical and biological significance and incorporates mutational signature analysis to provide insights into underlying mutational processes. To manage memory usage effectively, the input VCF file is divided into chunks, each containing up to 500,000 variants. Each chunk is processed independently through PCGR, and after annotation, the chunks are merged to produce an annotated VCF and TSV file. + +These annotations are used to decide which variants are retained or filtered in the next step. + +Summary: +Use PCGR to enrich the VCF with: + +- Functional impact information (e.g., consequences, mutation hotspots). +- Clinical relevance (e.g., tier classifications, mutational signatures). +- Process VCF files in chunks ≤500,000 variants each. +- Merge annotated chunks into a unified VCF. + +##### Inputs: + +- Rescue VCF + - ${tumor\_id}.main.sage.filtered.vcf.gz + +##### Output: + +- Annotated VCF + - ${tumor\_id}.annotations.vcf.g + +Details: + +Steps are: + +1. 
Set FILTER to "PASS" for unfiltered variants + - Iterate over the input VCF file and set the `FILTER` field to `PASS` for any variants that currently have no filter status (`FILTER` is `.` or `None`). This standardization is necessary for downstream tools. +2. Annotate the VCF against reference sources + - Use vcfanno to add annotations to the VCF file: + - gnomAD + - Hartwig Hotspots + - ENCODE Blacklist + - Genome in a Bottle High-Confidence Regions: Mark high-confidence regions from the Genome in a Bottle benchmark. + - Low and High GC Regions: Mark regions with \<30% or \>65% GC content, compiled by GA4GH. + - Bad Promoter Regions: Annotate regions with poor coverage, compiled by GA4GH. +3. Annotate with UMCCR panel of normals counts + - Use vcfanno and bcftools to annotate the VCF with counts from the UMCCR panel of normals, built from tumor-only Mutect2 calls from approximately 200 normal samples. This helps identify and filter out recurrent sequencing artifacts or germline variants. +4. Standardize the VCF fields + - Add new `INFO` fields for use with PCGR: +- `TUMOR_AF`, `NORMAL_AF`: Tumor and normal allele frequencies. +- `TUMOR_DP`, `NORMAL_DP`: Tumor and normal read depths. +- Add the `AD` FORMAT field: +- `AD`: Allelic depths for the reference and alternate alleles. +5. Prepare VCF for PCGR annotation + - Exclude unnecessary data from the VCF header, keeping only INFO AF/DP. + - Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR. + - Set `FILTER` to `PASS` and remove all `FORMAT` and sample columns. + +6. Run PCGR to annotate VCF against external sources + - Use PCGR (Personal Cancer Genome Reporter) to annotate the VCF with clinical, functional, and biological information. + - Classify variants by tiers based on annotations and functional impact according to ACMG guidelines. 
+ - Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `INTOGEN_DRIVER_MUT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. + - External sources used during this step include VEP, ClinVar, COSMIC, TCGA, ICGC, Open Targets Platform, CancerMine, DoCM, CBMDB, DisGeNET, Cancer Hotspots, dbNSFP, UniProt/SwissProt, Pfam, DGIdb, and ChEMBL. +7. Transfer PCGR annotations to the full set of variants + - Merge the PCGR annotations back into the original VCF file. + - Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available. + - Preserve the `FILTER` statuses and other annotations from the original VCF. +8. Filter variants to remove putative germline variants and artifacts while keeping known hotspots/actionable variants + - Keep variants: + - Called by SAGE in known hotspots (CGI, CIViC, OncoKB) regardless of other evidence. + - With PCGR TIER 1 and 2 classifications, indicating strong or potential clinical significance according to ACMG guidelines. + - With driver or clinical evidence from any of: + - IntOGen + - mutation hotspots + - ClinVar pathogenic or uncertain significance + - COSMIC count ≥10 + - TCGA pancancer count ≥5 + - ICGC PCAWG count ≥3. + - Apply filters to other variants: + - Remove variants with `AF < 10%`. + - Remove common variants in gnomAD (`population AF ≥ 1%`), adding them to the germline set. + - Remove variants present in ≥5 samples of the Panel of Normals. + - Remove indels in "bad promoter" regions (as defined by GA4GH). + - Remove variants overlapping the ENCODE blacklist. + - Remove variants with variant depth `VD < 4`. + - Remove variants with `VD < 6` and overlapping a low complexity region. + - Remove VarDict strand-biased variants unless supported by other callers. +9. Report passing variants using PCGR, classified by the ACMG tier system +10. 
Generate the final report of variants classified according to clinical significance using PCGR, ready for downstream analysis. + + #### + +#### + +### Filter + +The Filter step applies a series of stringent filters to somatic variant calls in the VCF file, ensuring the retention of high-confidence and biologically meaningful variants. + +Inputs: + +- Annotated VCF + - ${meta.tumor\_id}.annotations.vcf.gz + +#### Output: + +- Filter VCF + - ${meta.tumor\_id}\*filters\_set.vcf.gz + +Filters: + +1\. Technical Quality Filters + +#### 1.1 Allele Frequency (20% are also excluded. +- This step reduces contamination from sequencing artefacts or undetected germline variants. + +### 3\. Rescue and Clinical Significance Filters + +These variants are retained even if they fail technical filters. + +### 3.1 Hotspot Rescue + +- Variants located in Hartwig, OncoKB, or other curated hotspot databases are retained, even if they fail other quality or frequency filters. + + #### 3.2 Reference Database Hit Count Rescue + +- Variants with strong prior evidence in COSMIC, TCGA, or ICGC are retained, even if they fail standard filtering: + - COSMIC count ≥10 + - TCGA pan-cancer count ≥5 + - ICGC PCAWG count ≥5 + + #### 3.3 ClinVar Pathogenicity Rescue + +- Variants classified in ClinVar as: + - Likely Pathogenic + - Pathogenic + - Uncertain Significance (VUS) with strong clinical evidence + +- Allele Frequency (AF) Filter: + - Excludes variants with a tumor allele frequency below a threshold of 0.1. +- Allele Depth (AD) Filter: + - Removes variants with fewer than 4 supporting reads in the tumor sample. +- Degraded Mappability AD Filter: + - Applies stricter thresholds in regions with low sequence complexity or poor mappability, where errors are more likely. + - Requires a minimum of 6 supporting reads in low-sequence complexity regions(difficult region) to retain the variant. 
Tumor\_ad \< 6 +- Non-GIAB AD Filter: + - Removes variants not confirmed by the Genome in a Bottle (= 0.01 +- Panel of Normals (PON) Germline Filter: + - Filters out variants with an allele frequency in the PON below 0.20. + - Additionally excludes variants that occur in 5 or more PON samples to mitigate germline contamination or recurrent artifacts. PON\_COUNT \>= 5 +- Filter rescue variants: + +Variants meeting these criteria are flagged as `CLINICAL_POTENTIAL_RESCUE` and are NOT filtered out: + +- Reference Database Hit Counts: + - Variants with a COSMIC count of ≥10. + - Variants with a TCGA pan-cancer count of ≥5. + - Variants with an ICGC PCAWG count of ≥5. +- ClinVar Significance: + - Variants with ClinVar classifications matching the following categories are rescued: + - `conflicting_interpretations_of_pathogenicity` + - `likely_pathogenic` + - `pathogenic` + - `uncertain_significance` +- Mutation Hotspots: + - Variants identified as hotspots in: + - `HMF_HOTSPOT` + - `PCGR_MUTATION_HOTSPOT` +- PCGR Tiers: + - Variants classified as: + - `TIER_1` + - `TIER_2` + +### Reports + +The Report step utilises the Personal Cancer Genome Reporter (PCGR). + +Inputs: + +- PURPLE purity +- Filter VCF +- DRAGEN VCF + +#### Output: + +- PCGR report + - ${meta.tumor\_id}.pcgr\_acmg.grch38.html + +1. Generate BCFtools Statistics on the Input VCF: + The code runs a helper function (`bcftools_stats_prepare`) to create a modified version of the input VCF, adjusting quality scores so that `bcftools stats` can produce more meaningful outputs. It then executes `bcftools stats` to gather statistics on variant quality and distribution, storing the results in a text file. +2. Calculate Allele Frequency Distributions: + The `allele_frequencies` function uses external tools (bcftools, bedtools) to: + - Filter and normalize variants according to high-confidence regions. + - Extract allele frequency data from tumor samples. 
+ - Produce both a global allele frequency summary and a subset of allele frequencies restricted to key cancer genes. +3. Compare Variant Counts From Two Variant Sets (DRAGEN vs. BOLT) + - The code counts the total number and types of variants (SNPs, Indels, Others) passing filters in both the DRAGEN VCF and the filtered Bolt VCF. +4. Count Variants by Processing Stage +5. Parse Purity and Ploidy Information (PURPLE Data) +6. Run PCGR Annotation + +## Somatic structural variants + +The Somatic Structural Variants (SVs) pipeline identifies and annotates large-scale genomic alterations, including deletions, duplications, inversions, insertions, and translocations in tumor samples. It integrates outputs from the DRAGEN Variant Caller, GRIDSS2 and PURPLE, applies filtering criteria, and prioritizes structural variants with clinical and biological significance. The process involves several key steps: + +### Summary: + +1. GRIPSS filtering: + - GRIPSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and RepeatMasker annotations. Some of the reference data, such as known\_fusions, are specific to UMCCR. +2. PURPLE + - Combines the GRIPSS-filtered SV calls with copy number variation (CNV) data and tumor purity/ploidy estimates. PURPLE adjusts SV breakpoints based on copy number transitions and robustly classifies events as somatic versus germline. +3. Annotation + - Combines SV calls from GRIPSS with CNV data from PURPLE + - Annotates variants using [SnpEff](https://github.com/pcingola/SnpEff) +4. 
Prioritisation + - Prioritise SV annotations following the [AstraZeneca-NGS simple\_sv\_annotation](https://github.com/AstraZeneca-NGS/simple_sv_annotation) approach, using curated reference data including UMCCR panel genes, tumor suppressor gene lists, Hartwig known fusion pairs, and [APPRIS](https://ngdc.cncb.ac.cn/databasecommons/database/id/323) + - Prioritise variants based on clinical relevance and support metrics +5. Report + - Cancer report + - MultiQC +6. Assign SV Types: + - Classify SVs as duplications or deletions based on copy number thresholds. + - Split variants into separate files for structural variants (SVs) and copy number variants (CNVs). +7. Annotate and Prioritize Variants: + - Use SnpEff to annotate variants with gene-level and functional impact information. + - Prioritize variants based on clinical relevance and support metrics. + - Generate TSV (tab-separated values) files summarizing the prioritized SVs and CNVs. +8. Generate Summary Reports: + - Create TSV (tab-separated values) files summarizing the prioritized SVs and CNVs for downstream analysis and reporting. + + ### Input Files + + ### Primary SV VCFs: + + - GRIDSS2 + - ${meta.tumor\_id}.gridss.vcf.gz + +### Detailed Steps: + +1. GRIPSS filtering: + - Evaluate split-read and paired-end support; discard variants with low support. + - Apply panel-of-normals filtering to remove artifacts observed in normal samples. + - Retain variants overlapping known oncogenic fusion hotspots (using UMCCR-curated lists). + - Exclude variants in repetitive regions based on RepeatMasker annotations. +2. PURPLE: + - Merge SV calls with CNV segmentation data. + - Estimate tumor purity and ploidy. + - Adjust SV breakpoints based on copy number transitions. + - Classify SVs as somatic or germline. +3. Annotation + - Compile SV and CNV information into a unified VCF file. + - Extend the VCF header with PURPLE-related INFO fields (e.g., PURPLE\_baf, PURPLE\_copyNumber). 
+ - Convert CNV records from TSV format into VCF records with appropriate SVTYPE tags (e.g., 'DUP' for duplications, 'DEL' for deletions). + - Run snpEff to annotate the unified VCF with functional information such as gene names, transcript effects, and coding consequences. +4. Prioritization + - Run the prioritization module (forked from the AstraZeneca simple\_sv\_annotation tool) using reference data files including known fusion pairs, known fusion 5′ and 3′ lists, key genes, and key tumor suppressor genes. + - Classify Variants: + - Structural Variants (SVs): Variants labeled with the source `sv_gridss`. + - Copy Number Variants (CNVs): Variants labeled with the source `cnv_purple`. + - Prioritise variants on a 4 tier system \- 1 (high) \- 2 (moderate) \- 3 (low) \- 4 (no interest): +- exon loss + - on cancer gene list (1) + - other (2) +- gene fusion + - paired (hits two genes) + - on list of known pairs (1) (curated by [HMF](https://resources.hartwigmedicalfoundation.nl)) + - one gene is a known promiscuous fusion gene (1) (curated by [HMF](https://resources.hartwigmedicalfoundation.nl)) + - on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2) + - other: + - one or two genes on cancer gene list (2) + - neither gene on cancer gene list (3) + - unpaired (hits one gene) + - on cancer gene list (2) + - others (3) +- upstream or downstream (a specific type of fusion, e.g. one gene comes under the control of another gene's promoter and becomes over-expressed (oncogene) or under-expressed (tumor suppressor gene)) + - on cancer gene list genes (2) +- LoF or HIGH impact in a tumor suppressor + - on cancer gene list (2) + - other TS gene (3) +- other (4) + - Filter Low-Quality Calls: + - Keep variants with sufficient read support (e.g., split reads, `SR`). + - Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`. + - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1. 
+ - The module assigns a priority tier to each variant (ranging from Tier 1 for high priority to Tier 4 for no interest) and populates the INFO fields: + - SIMPLE\_ANN: A simplified annotation string that includes SV type, effect, involved genes, transcript(s), a description, and the assigned tier. + - SV\_TOP\_TIER: A numeric field indicating the highest priority tier for the variant. + - The unified VCF is then split into separate files for SVs and CNVs using bcftools, and TSV summary reports are generated. +1. Report + - Cancer Report: Integrates the prioritized SV data with somatic SNVs, CNVs, and quality metrics to provide a comprehensive overview of the tumor’s genomic alterations. This report includes detailed tables, a fusion gene summary, and a Circos plot (produced by PURPLE) that visualizes copy number and SV data. + - MultiQC Report: Aggregates quality control metrics from GRIDSS2, PURPLE, LINX, and the annotation/prioritization steps, providing an overall assessment of data quality. + +2. Obtain Input Structural Variants: + - Source Data: + - Obtain the structural variant VCF file generated by PURPLE, which integrates data from GRIDSS (for SV detection), PURPLE (for copy number analysis). + - The input includes both structural variants and copy number changes detected in the tumor sample. +3. Assign Structural Variant Types: + - Classify Variants: + - Structural Variants (SVs): Variants labeled with the source sv\_gridss. + - Copy Number Variants (CNVs): Variants labeled with the source cnv\_purple. +4. 
Prioritise variants on a 4 tier system: + **1 (high)** - **2 (moderate)** - **3 (low)** - **4 (no interest)** + - exon loss + - on cancer gene list (1) + - other (2) + - gene fusion + - paired (hits two genes) + - on list of known pairs (1) (curated by [HMF](https://resources.hartwigmedicalfoundation.nl)) + - one gene is a known promiscuous fusion gene (1) (curated by [HMF](https://resources.hartwigmedicalfoundation.nl)) + - on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2) + - other: + - one or two genes on cancer gene list (2) + - neither gene on cancer gene list (3) + - unpaired (hits one gene) + - on cancer gene list (2) + - others (3) + - upstream or downstream (a specific type of fusion, e.g. one gene comes under the control of another gene's promoter and becomes over-expressed (oncogene) or under-expressed (tumor suppressor gene)) + - on cancer gene list genes (2) + - LoF or HIGH impact in a tumor suppressor + - on cancer gene list (2) + - other TS gene (3) + - other (4) +5. Filter Low-Quality Calls: + Apply Quality Filters: + - Keep variants with sufficient read support (e.g., split reads, `SR`). + - Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`. + - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1. +6. Generate Summary Reports + + + +## Germline small variants + +Select passing variants within the given [gene panel transcript regions](https://github.com/umccr/gene_panels/tree/main/germline_panel), which are built from the PMCC familial cancer clinic list, then generate the CPSR report. + +1. Prepare + 1. Selection of Passing Variants: + 1. Raw germline variant calls (e.g. from DRAGEN or an ensemble caller) are filtered to retain only those variants marked as PASS (or with no filter flag). + 2. Selection of Gene Panel Variants: + 1. The filtered variants are then further restricted to regions defined by a gene panel transcript regions file. +2. 
Report: CPSR + +The CPSR (Cancer Predisposition Sequencing Reporter) report includes the following: + +Settings: + +- Sample metadata +- Report configuration +- Virtual gene panel + +Summary of Findings: + +- Variant statistics + +Variant Classification: + +ClinVar and Non-ClinVar + +- Class 5 \- Pathogenic variants +- Class 4 \- Likely Pathogenic variants +- Class 3 \- Variants of Uncertain Significance (VUS) +- Class 2 \- Likely Benign variants +- Class 1 \- Benign variants +- Biomarkers + +PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): + +- Tier 1 (High): Highest priority variants with strong clinical relevance. +- Tier 2 (Moderate): Variants with potential clinical significance. +- Tier 3 (Low): Variants with uncertain significance. +- Tier 4 (No Interest): Variants unlikely to be clinically relevant. + +# Common Reports + +### [Cancer report](https://umccr.github.io/gpgr/) + +UMCCR cancer report containing: + +#### Tumor Mutation Burden (TMB): + +- Data Source: filtered somatic VCF +- Tool: PURPLE + +#### Mutational Signatures: + +- Data Source: filtered SNV/CNV VCF +- Tool: MutationalPatterns R package (via PCGR) + +#### Contamination Score: + +- Data Source: – +- Note: No dedicated contamination metric is currently generated + +#### Purity & Ploidy: + +- Data Source: COBALT (providing read-depth ratios) and AMBER (providing B-allele frequency measurements) +- Tool: PURPLE, which uses these inputs to compute sample purity (percentage of tumor cells) and overall ploidy (average copy number) + +#### HRD Score: + +- Data Source: HRD analysis output file (${meta.tumor\_id}.hrdscore.tsv) +- Tool: DRAGEN + +#### MSI (Microsatellite Instability): + +- Data Source: Indels in microsatellite regions from SNV/CNV +- Tool: PURPLE + +#### Structural Variant Metrics: + +- Data Source: GRIDSS/GRIPSS SV VCF and PURPLE CNV segmentation +- Tools: GRIDSS/GRIPSS and PURPLE + +#### Copy Number Metrics (Segments, Deleted Genes, etc.): + +- Data Source: PURPLE CNV 
outputs (segmentation files, gene-level CNV TSV) +- Tool: PURPLE + +The LINX report includes the following: +- Tables of Variants: + - Breakends + - Links + - Driver Catalog +- Plots: + - Cluster-Level Plots + +### MultiQC + +General Stats: Overview of QC metrics aggregated from all tools, providing high-level sample quality information. + +DRAGEN: Mapping metrics (mapped reads, paired reads, duplicated alignments, secondary alignments), WGS coverage (average depth, cumulative coverage, per-contig coverage), fragment length distributions, trimming metrics, and time metrics for pipeline steps. + +PURPLE: Sample QC status (PASS/FAIL), ploidy, tumor purity, polyclonality percentage, tumor mutational burden (TMB), microsatellite instability (MSI) status, and variant metrics for somatic and germline SNPs/indels. + +BcfTools Stats: Variant substitution types, SNP and indel counts, quality scores, variant depth, and allele frequency metrics for both somatic and germline variants. + +DRAGEN-FastQC: Per-base sequence quality, per-sequence quality scores, GC content (per-sequence and per-position), HRD score, sequence length distributions, adapter contamination, and sequence duplication levels. + +### PCGR + +Personal Cancer Genome Reporter (PCGR) tool to generate a comprehensive, interactive HTML report that consolidates filtered and annotated variant data, providing detailed insights into the somatic variants identified. + +Key Metrics: + +- Variant Classification and Tier Distribution: PCGR categorizes variants into tiers based on their clinical and biological significance. The report details the proportion of variants across different tiers, indicating their potential clinical relevance. +- Mutational Signatures: The report includes analysis of mutational signatures, offering insights into the mutational processes active in the tumor. 
+- Copy Number Alterations (CNAs): Visual representations of CNAs are provided, highlighting significant gains and losses across the genome. Genome-wide plots display regions of copy number gains and losses. +- Tumor Mutational Burden (TMB): Calculations of TMB are included, which can have implications for immunotherapy eligibility. The report presents the TMB value, representing the number of mutations per megabase. +- Microsatellite Instability (MSI) Status: Assessment of MSI status is performed, relevant for certain cancer types and treatment decisions. +- Clinical Trials Information: Information on relevant clinical trials is incorporated, offering potential therapeutic options based on the identified variants. + +Note: The PCGR tool is designed to process a maximum of 500,000 variants. If the input VCF file contains more than this limit, variants exceeding 500,000 will be filtered out. + +### CPSR Report + +The CPSR (Cancer Predisposition Sequencing Reporter) report includes the following: + +Settings: + +- Sample metadata +- Report configuration +- Virtual gene panel + +Summary of Findings: + +- Variant statistics + +Variant Classification: + +ClinVar and Non-ClinVar + +- Class 5 \- Pathogenic variants +- Class 4 \- Likely Pathogenic variants +- Class 3 \- Variants of Uncertain Significance (VUS) +- Class 2 \- Likely Benign variants +- Class 1 \- Benign variants +- Biomarkers + +PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): + +- Tier 1 (High): Highest priority variants with strong clinical relevance. +- Tier 2 (Moderate): Variants with potential clinical significance. +- Tier 3 (Low): Variants with uncertain significance. +- Tier 4 (No Interest): Variants unlikely to be clinically relevant. 
+ +# Reference data + +### [UMCCR Genes panels](https://github.com/umccr/gene_panels) + +### Genome annotations + +WiGiTS (hmftools) + +Annotation Databases: + +- gnomAD: Provides population allele frequencies to help distinguish common variants from rare ones. +- ClinVar: Offers clinically curated variant information, aiding in the interpretation of potential pathogenicity. +- COSMIC: Contains data on somatic mutations found in cancer, facilitating the identification of cancer-related variants. +- Gene Panels: Focuses analysis on specific sets of genes relevant to particular conditions or research interests. + +Structural Variant Data: + +- SnpEff Databases: Used for predicting the effects of variants on genes and proteins. +- Panel of Normals (PON): Helps filter out technical artifacts by comparing against a set of normal samples. +- RepeatMasker: Identifies repetitive genomic regions to prevent false-positive variant calls. + +Databases/datasets PCGR Reference Data: + +*Version: v20220203* + +- [GENCODE](https://www.gencodegenes.org/) \- high quality reference gene annotation and experimental validation (release 39/19) +- [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) \- Database of non-synonymous functional predictions (20210406 () +- [dbMTS](http://database.liulab.science/dbMTS) \- Database of alterations in microRNA target sites (v1.0) +- [ncER](https://github.com/TelentiLab/ncER_datasets) \- Non-coding essential regulation score (genome-wide percentile rank) (v2) +- [GERP](http://mendel.stanford.edu/SidowLab/downloads/gerp/) \- Genomic Evolutionary Rate Profiling (GERP) \- rejected substitutions (RS) score (v1) +- [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021\_11 () +- [UniProtKB](http://www.uniprot.org) \- Comprehensive resource of protein sequence and functional information (2021\_04) +- [gnomAD](http://gnomad.broadinstitute.org) \- Germline variant frequencies exome-wide (r2.1 () +- 
[dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) \- Database of short genetic variants (154) +- [DoCM](http://docm.genome.wustl.edu) \- Database of curated mutations (release 3.2) +- [CancerHotspots](http://cancerhotspots.org) \- A resource for statistically significant mutations in cancer (2017) +- [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar) \- Database of genomic variants of clinical significance (20220103) +- [CancerMine](http://bionlp.bcgsc.ca/cancermine/) \- Literature-mined database of tumor suppressor genes/proto-oncogenes (20211106 () +- [OncoTree](http://oncotree.mskcc.org/) \- Open-source ontology developed at MSK-CC for standardization of cancer type diagnosis (2021-11-02) +- [DiseaseOntology](http://disease-ontology.org) \- Standardized ontology for human disease (20220131) +- [EFO](https://github.com/EBISPOT/efo) \- Experimental Factor Ontology (v3.38.0) +- [GWAS\_Catalog](https://www.ebi.ac.uk/gwas/) \- The NHGRI-EBI Catalog of published genome-wide association studies (20211221) +- [CGI](http://cancergenomeinterpreter.org/biomarkers) \- Cancer Genome Interpreter Cancer Biomarkers Database (20180117) + +### + +### + +# Sash Module Outputs: + +Somatic SNVs + +- File: `smlv_somatic/filter/{tid}.pass.vcf.gz` +- Description: Contains somatic single nucleotide variants (SNVs) with filtering applied. + +Somatic SVs + +- File: `sv_somatic/prioritise/{tid}.sv.prioritised.vcf.gz` +- Description: Contains somatic structural variants (SVs) with prioritization applied. + +Somatic CNVs + +- File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som.tsv.gz` +- Description: Contains somatic copy number variations (CNVs) data. + +Somatic Gene CNVs + +- File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som_gene.tsv.gz` +- Description: Contains gene-level somatic copy number variations (CNVs) data. 
+ +Germline SNVs + +- File: `dragen_germline_output/{nid}.hard-filtered.vcf.gz` +- Description: Contains germline single nucleotide variants (SNVs) with hard filtering applied. + +Purple Purity, Ploidy, MS Status + +- File: `purple/{tid}.purple.purity.tsv` +- Description: Contains estimated tumor purity, ploidy, and microsatellite status. + +PCGR JSON with TMB + +- File: `smlv_somatic/report/pcgr/{tid}.pcgr_acmg.grch38.json.gz` +- Description: Contains PCGR annotations, including tumor mutational burden (TMB). + +DRAGEN HRD Score + +- File: `dragen_somatic_output/{tid}.hrdscore.tsv` +- Description: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis. + +# FAQ + +### Q: Do we use PCGR for the rescue of SAGE? + +A: In the somatic small variant workflow, SAGE calls are used to rescue variants, the variants are then annotated using PCGR, and finally they are filtered. Variants with high tier ranks are never filtered out. + +### Q: How are hypermutated samples handled in the current version, and is there any impact on derived metrics such as TMB or MSI? + +A: In the current version of Sash, hypermutated samples are identified using a threshold of 500,000 total somatic variants. If the variant count exceeds the threshold, the sample is flagged as hypermutated. When this occurs, variants that have no clinical impact and are not in hotspot regions are filtered out until the count meets the threshold. This affects the TMB and MSI calculated by PURPLE. For now, the PURPLE TMB and MSI are still used in this edge case; a new release will be able to obtain correct TMB and MSI values from PURPLE. + +### Q: How are we handling non-standard chromosomes if present in the input VCFs (ALTs, chrM, etc)? +A: They are filtered out, as we only keep chr 1..22 and chr X, Y, M. + +### Q: Inputs for the cancer reporter \- have they changed (and what can we harmonize); e.g., where is the Circos plot from at this point? 
+A: Circos plots come from PURPLE. + +### Q: We dropped the CACAO coverage reports; can we discuss how to utilize DRAGEN or WiGiTS coverage information instead? + +### Q: What TMB score is displayed in the cancer reporter? +A: The TMB displayed is the one calculated by PCGR. + +### Q: What filtered VCF is the source for the mutational signatures? +A: We use the filtered VCF for mutational signatures. + +### Q: Where is the contamination score coming from currently? +A: As far as we know, there is no contamination score in sash at the moment. + +### Q: Does the GRIPSS step do anything more than what happens in oncoanalyser? +A: No, no different settings are applied to GRIPSS other than the reference files. + +### Q: Is data from the Somatic Small Variants workflow used for the SV workflow? +A: As far as we recall, data from the somatic small variant workflow is not used in the SV workflow. \ No newline at end of file diff --git a/docs/usage.md b/docs/usage.md index 1699d1ad..011d2574 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -1,3 +1,4 @@ +# TODO # umccr/sash: Usage > _Documentation of pipeline parameters is generated automatically from the pipeline schema and can no longer be found in markdown files._ From c768668ff023751f55d5cf90fd835285ee63222a Mon Sep 17 00:00:00 2001 From: qclayssen Date: Thu, 27 Mar 2025 11:39:18 +1100 Subject: [PATCH 10/36] fix typo --- docs/{detail.md => details.md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/{detail.md => details.md} (100%) diff --git a/docs/detail.md b/docs/details.md similarity index 100% rename from docs/detail.md rename to docs/details.md From 81c5919d955f63c1816cec55f06c6678b44118fc Mon Sep 17 00:00:00 2001 From: qclayssen Date: Thu, 27 Mar 2025 13:23:51 +1100 Subject: [PATCH 11/36] linting add inputs --- docs/details.md | 121 ++++++++++++++++++++++++++---------------------- 1 file changed, 65 insertions(+), 56 deletions(-) diff --git a/docs/details.md b/docs/details.md index 6e20f084..3df158bb 100644 --- a/docs/details.md +++ b/docs/details.md 
@@ -1,8 +1,3 @@ -- [Usage](usage.md) - - An overview of how the pipeline works, how to run it and a description of all of the different command-line flags. -- [Output](output.md) - - An overview of the different results produced by the pipeline and how to interpret them. - # Sash Workflow Overview ![Summary](images/sash_overview_qc.png) @@ -15,42 +10,59 @@ The sash Workflow is a genomic analysis framework comprising three primary pipel These pipelines utilise Bolt, a Python package designed for modular processing, and leverage outputs from the [DRAGEN](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) Variant Caller alongside and the Hartwig Medical Foundation WiGiTS toolkit (via [Oncoanalyser](https://github.com/nf-core/oncoanalyser)) [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) in Oncoanalyser. Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation. -# [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) +## [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) HMFtools WiGiTS is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. Key components used in Sash include: -- [SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md): +- [SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md): A tiered SNV/indel caller targeting ~10,000 cancer hotspots (e.g., OncoKB, CIViC) to recover low-frequency variants missed by DRAGEN. Outputs a VCF with confidence tiers (hotspot, panel, high/low confidence). 
- [PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple): Estimates tumor purity (tumor cell fraction) and ploidy (average copy number), integrates copy number data, and calculates TMB (tumor mutation burden) and MSI (microsatellite instability). -# Pipeline Inputs +- [Cobalt](https://github.com/hartwigmedical/hmftools/blob/master/cobalt/README.md): +Cobalt calculates read-depth ratios from sequencing data, providing essential input for copy number analysis. Its outputs are used by PURPLE to generate accurate copy number profiles across the genome. + +- [Amber](https://github.com/hartwigmedical/hmftools/blob/master/amber/README.md): +Amber computes B-allele frequencies, which are critical for estimating tumor purity and ploidy. The Amber directory contains these measurements, supporting PURPLE's integrated analysis. + +## Pipeline Inputs + +### Dragen + +`{tumor_id}.hard-filtered.vcf.gz` + +### Oncoanalyser + +#### [GRIDSS/GRIPSS](https://github.com/PapenfussLab/gridss) -## Dragen +`{tumor_id}.gridss.vcf.gz` +Description: This VCF contains structural variant calls produced by GRIDSS2. -{tumor_id}.hard-filtered.vcf.gz +#### SAGE -## Oncoanalyser +`{tumor_id}.sage.somatic.vcf.gz` -### Gridss -{tumor_id}.gridss.vcf.gz" +#### [VIRUSBreakend](https://github.com/PapenfussLab/gridss/blob/master/VIRUSBreakend_Readme.md) -### SAGE -{tumor_id}.sage.somatic.vcf.gz" -### Virusbraken -virusbreakend directory -### Cobalt +- Directory: `virusbreakend` +- Description: Contains outputs from VIRUSBreakend, used for detecting viral integration events. -Cobalt directory used by purple -### Amber -Amber directory used by purple +#### Cobalt -# Workflows +- Directory: `Cobalt` +- Description: Contains read-depth ratio data required for copy number analysis by PURPLE. + +#### Amber + +- Directory: `Amber` +- Description: Contains B-allele frequency measurements used by PURPLE to estimate tumor purity and ploidy.
+ +## Workflows ## Somatic Small Variants (SNV/Indel, Tumor)Somatic small variants -#### General +### General In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser that is relaing on Somatic Alterations in Genome [SAGE)](https://github.com/hartwigmedical/hmftools/tree/master/sage),and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple)) outputs. It’s structured into four steps: Integrations, Annotation, Filter, and Report. The final outputs include an HTML report summarising the results. @@ -72,19 +84,19 @@ The variant calling integrations step use variants fromemploys the Somatic Alter - OncoKB \- Precision Oncology Knowledge Base. - Outputs a VCF containing rescued variants. -##### Inputs: +#### Inputs - From DRAGEN: somatic small variant caller VCF - - ${tumor\_id}.main.dragen.vcf.gz + - `${tumor_id}.main.dragen.vcf.gz` - From oncoanalyser: SAGE VCF - - ${tumor\_id}.main.sage.filtered.vcf.gz + - `${tumor_id}.main.sage.filtered.vcf.gz` Filter on chr 1..22 and chr X,Y,M -##### Output: +##### Output - Rescue: VCF - - ${tumor\_id}.rescued.vcf.g + - `${tumor_id}.rescued.vcf.g` #### Details @@ -142,12 +154,12 @@ Use PCGR to enrich the VCF with: ##### Inputs: - Small variant vcfRescue VCF - - ${tumor\_id}.main.sage.filtered.vcf.gz + - `${tumor_id}.main.sage.filtered.vcf.gz` ##### Output: - Annotated VCF - - ${tumor\_id}.annotations.vcf.g + - `${tumor_id}.annotations.vcf.g` Details: @@ -167,10 +179,10 @@ Steps are: - Use vcfanno and bcftools to annotate the VCF with counts from the UMCCR panel of normals, built from tumor-only Mutect2 calls from approximately 200 normal samples. This helps identify and filter out recurrent sequencing artifacts or germline variants. 4. Standardize the VCF fields - Add new `INFO` fields for use with PCGR: -- `TUMOR_AF`, `NORMAL_AF`: Tumor and normal allele frequencies. -- `TUMOR_DP`, `NORMAL_DP`: Tumor and normal read depths. 
-- Add the `AD` FORMAT field: -- `AD`: Allelic depths for the reference and alternate alleles. + - `TUMOR_AF`, `NORMAL_AF`: Tumor and normal allele frequencies. + - `TUMOR_DP`, `NORMAL_DP`: Tumor and normal read depths. + - Add the `AD` FORMAT field: + - `AD`: Allelic depths for the reference and alternate alleles. 5. Prepare VCF for PCGR annotation - Exclude unnecessary data from the VCF header keeping on INFO AF/DP . - Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR. @@ -208,9 +220,6 @@ Steps are: 9. Report passing variants using PCGR, classified by the ACMG tier system 10. Generate the final report of variants classified according to clinical significance using PCGR, ready for downstream analysis. - #### - -#### ### Filter @@ -219,12 +228,12 @@ The Filter step applies a series of stringent filters to somatic variant calls i Inputs: - Annotated VCF - - ${meta.tumor\_id}.annotations.vcf.gz + - `${meta.tumor_id}.annotations.vcf.gz` -#### Output: +#### Output - Filter VCF - - ${meta.tumor\_id}\*filters\_set.vcf.gz + - `${meta.tumor_id}\*filters_set.vcf.gz` Filters: @@ -296,14 +305,14 @@ These variants are retained even if they fail technical filters. - Removes variants with fewer than 4 supporting reads in the tumor sample. - Degraded Mappability AD Filter: - Applies stricter thresholds in regions with low sequence complexity or poor mappability, where errors are more likely. - - Requires a minimum of 6 supporting reads in low-sequence complexity regions(difficult region) to retain the variant. Tumor\_ad \ 6 + - Requires a minimum of 6 supporting reads in low-sequence complexity regions(difficult region) to retain the variant. Tumor_ad \ 6 - Non-GIAB AD Filter: - Removes variants not confirmed by the Genome in a Bottle (= 0.01 + - Excludes variants with a population allele frequency greater than 0.01, based on gnomAD data. 
Gnomad_af \>= 0.01 - Panel of Normals (PON) Germline Filter: - Filters out variants with an allele frequency in the PON below 0.20. - - Additionally excludes variants that occur in more than 5 PON samples to mitigate germline contamination or recurrent artifacts. PON\_COUNT \>= 5 + - Additionally excludes variants that occur in more than 5 PON samples to mitigate germline contamination or recurrent artifacts. PON_COUNT \>= 5 - FIlter rescue variant: Variants meeting these criteria are flagged as `CLINICAL_POTENTIAL_RESCUE` are NOT filtered out @@ -340,7 +349,7 @@ Inputs: #### Output: - PCGRCancer repor - - ${meta.tumor\_id}.pcgr\_acmg.grch38.html + - ${meta.tumor_id}.pcgr_acmg.grch38.html 1. Generate BCFtools Statistics on the Input VCF: The code runs a helper function (`bcftools_stats_prepare`) to create a modified version of the input VCF, adjusting quality scores so that `bcftools stats` can produce more meaningful outputs. It then executes `bcftools stats` to gather statistics on variant quality and distribution, storing the results in a text file. @@ -362,7 +371,7 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc ### Summary: 1. GRIPSS filtering: - - GRIPSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and repeat masker annotations data are the specific to umccr like known\_fusions + - GRIPSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and repeat masker annotations data are the specific to umccr like known_fusions 2. PURPLE - Combines the GRIPSS-filtered SV calls with copy number variation (CNV) data and tumor purity/ploidy estimates. PURPLE adjusts SV breakpoints based on copy number transitions and robustly classifies events as somatic versus germline. 3. 
Annotation @@ -389,7 +398,7 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc ### Primary SV VCFs: - GRIDSS2 - - ${meta.tumor\_id}.gridss.vcf.gz + - ${meta.tumor_id}.gridss.vcf.gz ### Details @@ -407,11 +416,11 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - Classify SVs as somatic or germline. 3. Annotation - Compile SV and CNV information into a unified VCF file. - - Extend the VCF header with PURPLE-related INFO fields (e.g., PURPLE\_baf, PURPLE\_copyNumber). + - Extend the VCF header with PURPLE-related INFO fields (e.g., PURPLE_baf, PURPLE_copyNumber). - Convert CNV records from TSV format into VCF records with appropriate SVTYPE tags (e.g., 'DUP' for duplications, 'DEL' for deletions). - Run snpEff to annotate the unified VCF with functional information such as gene names, transcript effects, and coding consequences. 4. Prioritization - - Run the prioritization module (forked from the AstraZeneca simple\_sv\_annotation tool) using reference data files including known fusion pairs, known fusion 5′ and 3′ lists, key genes, and key tumor suppressor genes. + - Run the prioritization module (forked from the AstraZeneca simple_sv_annotation tool) using reference data files including known fusion pairs, known fusion 5′ and 3′ lists, key genes, and key tumor suppressor genes. - Classify Variants: - Structural Variants (SVs): Variants labeled with the source `sv_gridss`. - Copy Number Variants (CNVs): Variants labeled with the source `cnv_purple`. @@ -441,8 +450,8 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - Exclude Tier 3 and Tier 4 variants where `SR 5` and `PR < 5`. - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (<`AF0` or `AF1`) are below 0.1. 
- The module assigns a priority tier to each variant (ranging from Tier 1 for high priority to Tier 4 for no interest) and populates the INFO fields: - - SIMPLE\_ANN: A simplified annotation string that includes SV type, effect, involved genes, transcript(s), a description, and the assigned tier. - - SV\_TOP\_TIER: A numeric field indicating the highest priority tier for the variant. + - SIMPLE_ANN: A simplified annotation string that includes SV type, effect, involved genes, transcript(s), a description, and the assigned tier. + - SV_TOP_TIER: A numeric field indicating the highest priority tier for the variant. - The unified VCF is then split into separate files for SVs and CNVs using bcftools, and TSV summary reports are generated. 1. Report - Cancer Report: Integrates the prioritized SV data with somatic SNVs, CNVs, and quality metrics to provide a comprehensive overview of the tumor’s genomic alterations. This report includes detailed tables, a fusion gene summary, and a Circos plot (produced by PURPLE) that visualizes copy number and SV data. @@ -454,8 +463,8 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - The input includes both structural variants and copy number changes detected in the tumor sample. 3. Assign Structural Variant Types: - Classify Variants: - - Structural Variants (SVs): Variants labeled with the source sv\_gridss. - - Copy Number Variants (CNVs): Variants labeled with the source cnv\_purple. + - Structural Variants (SVs): Variants labeled with the source sv_gridss. + - Copy Number Variants (CNVs): Variants labeled with the source cnv_purple. 4. 
Prioritise variants on a 4 tier system: **1 (high)** - **2 (moderate)** - **3 (low)** - **4 (no interest)** - exon loss @@ -556,7 +565,7 @@ Tumor Mutation Burden (TMB): #### HRD Score: -- Data Source: HRD analysis output file (${meta.tumor\_id}.hrdscore.tsv) +- Data Source: HRD analysis output file (${meta.tumor_id}.hrdscore.tsv) - Tool: DRAGEN #### MSI (Microsatellite Instability): @@ -671,8 +680,8 @@ Databases/datasets PCGR Reference Data: - [dbMTS](http://database.liulab.science/dbMTS) \- Database of alterations in microRNA target sites (v1.0) - [ncER](https://github.com/TelentiLab/ncER_datasets) \- Non-coding essential regulation score (genome-wide percentile rank) (v2) - [GERP](http://mendel.stanford.edu/SidowLab/downloads/gerp/) \- Genomic Evolutionary Rate Profiling (GERP) \- rejected substitutions (RS) score (v1) -- [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021\_11 () -- [UniProtKB](http://www.uniprot.org) \- Comprehensive resource of protein sequence and functional information (2021\_04) +- [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021_11 () +- [UniProtKB](http://www.uniprot.org) \- Comprehensive resource of protein sequence and functional information (2021_04) - [gnomAD](http://gnomad.broadinstitute.org) \- Germline variant frequencies exome-wide (r2.1 () - [dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) \- Database of short genetic variants (154) - [DoCM](http://docm.genome.wustl.edu) \- Database of curated mutations (release 3.2) @@ -682,7 +691,7 @@ Databases/datasets PCGR Reference Data: - [OncoTree](http://oncotree.mskcc.org/) \- Open-source ontology developed at MSK-CC for standardization of cancer type diagnosis (2021-11-02) - [DiseaseOntology](http://disease-ontology.org) \- Standardized ontology for human disease (20220131) - [EFO](https://github.com/EBISPOT/efo) \- Experimental Factor Ontology (v3.38.0) -- [GWAS\_Catalog](https://www.ebi.ac.uk/gwas/) \- The NHGRI-EBI Catalog of published 
genome-wide association studies (20211221) +- [GWAS_Catalog](https://www.ebi.ac.uk/gwas/) \- The NHGRI-EBI Catalog of published genome-wide association studies (20211221) - [CGI](http://cancergenomeinterpreter.org/biomarkers) \- Cancer Genome Interpreter Cancer Biomarkers Database (20180117) ### From 7e55c71a2f925376d6434a892bb2e0bb47726ac2 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Thu, 27 Mar 2025 15:36:50 +1100 Subject: [PATCH 12/36] linting --- docs/details.md | 48 +++--------------------------------------------- 1 file changed, 3 insertions(+), 45 deletions(-) diff --git a/docs/details.md b/docs/details.md index 3df158bb..824c9689 100644 --- a/docs/details.md +++ b/docs/details.md @@ -62,7 +62,7 @@ Description: This VCF contains structural variant calls produced by GRIDSS2. ## Somatic Small Variants (SNV/Indel, Tumor)Somatic small variants -### General +#### General In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser that is relaing on Somatic Alterations in Genome [SAGE)](https://github.com/hartwigmedical/hmftools/tree/master/sage),and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple)) outputs. It’s structured into four steps: Integrations, Annotation, Filter, and Report. The final outputs include an HTML report summarising the results. @@ -424,47 +424,6 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - Classify Variants: - Structural Variants (SVs): Variants labeled with the source `sv_gridss`. - Copy Number Variants (CNVs): Variants labeled with the source `cnv_purple`. 
- - Prioritise variants on a 4 tier system \- 1 (high) \- 2 (moderate) \- 3 (low) \- 4 (no interest): -- exon loss - - on cancer gene list (1) - - other (2) -- gene fusion - - paired (hits two genes) - - on list of known pairs (1) (curated by [HMF]() - - one gene is a known promiscuous fusion gene (1) (curated by [HMF]() - - on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2) - - other: - - one or two genes on cancer gene list (2) - - neither gene on cancer gene list (3) - - unpaired (hits one gene) - - on cancer gene list (2) - - others (3) -- upstream or downstream (a specific type of fusion, e.g. one gene is got into control of another gene's promoter and get over-expressed () - - on cancer gene list genes (2) -- LoF or HIGH impact in a tumor suppressor - - on cancer gene list (2) - - other TS gene (3) -- other (4) - - Filter Low-Quality Calls: - - Keep variants with sufficient read support (e.g., split reads (). - - Exclude Tier 3 and Tier 4 variants where `SR 5` and `PR < 5`. - - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (<`AF0` or `AF1`) are below 0.1. - - The module assigns a priority tier to each variant (ranging from Tier 1 for high priority to Tier 4 for no interest) and populates the INFO fields: - - SIMPLE_ANN: A simplified annotation string that includes SV type, effect, involved genes, transcript(s), a description, and the assigned tier. - - SV_TOP_TIER: A numeric field indicating the highest priority tier for the variant. - - The unified VCF is then split into separate files for SVs and CNVs using bcftools, and TSV summary reports are generated. -1. Report - - Cancer Report: Integrates the prioritized SV data with somatic SNVs, CNVs, and quality metrics to provide a comprehensive overview of the tumor’s genomic alterations. 
This report includes detailed tables, a fusion gene summary, and a Circos plot (produced by PURPLE) that visualizes copy number and SV data. - - MultiQC Report: Aggregates quality control metrics from GRIDSS2, PURPLE, LINX, and the annotation/prioritization steps, providing an overall assessment of data quality. - -2. Obtain Input Structural Variants: - - Source Data: - - Obtain the structural variant VCF file generated by PURPLE, which integrates data from GRIDSS (for SV detection), PURPLE (for copy number analysis). - - The input includes both structural variants and copy number changes detected in the tumor sample. -3. Assign Structural Variant Types: - - Classify Variants: - - Structural Variants (SVs): Variants labeled with the source sv_gridss. - - Copy Number Variants (CNVs): Variants labeled with the source cnv_purple. 4. Prioritise variants on a 4 tier system: **1 (high)** - **2 (moderate)** - **3 (low)** - **4 (no interest)** - exon loss @@ -492,9 +451,8 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - Keep variants with sufficient read support (e.g., split reads (). - Exclude Tier 3 and Tier 4 variants where `SR 5` and `PR < 5`. - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (<`AF0` or `AF1`) are below 0.1. - 1. Generate Summary Reports - - + - Structural Variants (SVs): Variants labeled with the source sv_gridss. + - Copy Number Variants (CNVs): Variants labeled with the source cnv_purple. ## Germline small variants From d644b226c629fb2513ff46bf6ddb954a1fa95f6a Mon Sep 17 00:00:00 2001 From: qclayssen Date: Thu, 27 Mar 2025 16:21:15 +1100 Subject: [PATCH 13/36] more linting --- docs/details.md | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/docs/details.md b/docs/details.md index 824c9689..7dfe9eb1 100644 --- a/docs/details.md +++ b/docs/details.md @@ -210,7 +210,7 @@ Steps are: - ICGC PCAWG count ≥3. 
- Apply filters to other variants: - Remove variants with `AF 10%`. - - Remove common variants in gnomAD (<`population AF ≥ 1%`), adding them to the germline set. + - Remove common variants in gnomAD (`population AF ≥ 1%`), adding them to the germline set. - Remove variants present in ≥5 samples of the Panel of Normals. - Remove indels in "bad promoter" regions (as defined by GA4GH). - Remove variants overlapping the ENCODE blacklist. @@ -228,21 +228,21 @@ The Filter step applies a series of stringent filters to somatic variant calls i Inputs: - Annotated VCF - - `${meta.tumor_id}.annotations.vcf.gz` + - `${tumor_id}.annotations.vcf.gz` #### Output - Filter VCF - - `${meta.tumor_id}\*filters_set.vcf.gz` + - `${tumor_id}\*filters_set.vcf.gz` Filters: 1\. Technical Quality Filters -#### 1.1 Allele Frequency (= 0.01 - Panel of Normals (PON) Germline Filter: @@ -349,7 +349,7 @@ Inputs: #### Output: - PCGRCancer repor - - ${meta.tumor_id}.pcgr_acmg.grch38.html + - ${tumor_id}.pcgr_acmg.grch38.html 1. Generate BCFtools Statistics on the Input VCF: The code runs a helper function (`bcftools_stats_prepare`) to create a modified version of the input VCF, adjusting quality scores so that `bcftools stats` can produce more meaningful outputs. It then executes `bcftools stats` to gather statistics on variant quality and distribution, storing the results in a text file. @@ -380,8 +380,8 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc 4. Prioritisation - Prioritise SV annotation based on [AstraZeneca-NGS](https://github.com/AstraZeneca-NGS/simple_sv_annotation) using curated reference data including umccr panel genes, tumor suppressor gene lists, hartwig known fusion pairs, [appris](https://ngdc.cncb.ac.cn/databasecommons/database/id/323) - Prioritise variants based on clinical relevance and support metric -5. Repor - - Cancer repor +5. Report + - Cancer report - Multiqc 6. 
Assign SV Types: - Classify SVs as duplications or deletions based on copy number thresholds. @@ -398,7 +398,7 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc ### Primary SV VCFs: - GRIDSS2 - - ${meta.tumor_id}.gridss.vcf.gz + - ${tumor_id}.gridss.vcf.gz ### Details @@ -448,9 +448,9 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - other (4) 5. Filter Low-Quality Calls: Apply Quality Filters: - - Keep variants with sufficient read support (e.g., split reads (). + - Keep variants with sufficient read support (e.g., split reads (SR) ≥ 5 and paired reads (PR) ≥ 5). - Exclude Tier 3 and Tier 4 variants where `SR 5` and `PR < 5`. - - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (<`AF0` or `AF1`) are below 0.1. + - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1. - Structural Variants (SVs): Variants labeled with the source sv_gridss. - Copy Number Variants (CNVs): Variants labeled with the source cnv_purple. 
@@ -523,7 +523,7 @@ Tumor Mutation Burden (TMB): #### HRD Score: -- Data Source: HRD analysis output file (${meta.tumor_id}.hrdscore.tsv) +- Data Source: HRD analysis output file (${tumor_id}.hrdscore.tsv) - Tool: DRAGEN #### MSI (Microsatellite Instability): @@ -634,18 +634,18 @@ Databases/datasets PCGR Reference Data: *Version: v20220203* - [GENCODE](https://www.gencodegenes.org/) \- high quality reference gene annotation and experimental validation (release 39/19) -- [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) \- Database of non-synonymous functional predictions (20210406 () +- [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) \- Database of non-synonymous functional predictions (20210406 (v4.2)) - [dbMTS](http://database.liulab.science/dbMTS) \- Database of alterations in microRNA target sites (v1.0) - [ncER](https://github.com/TelentiLab/ncER_datasets) \- Non-coding essential regulation score (genome-wide percentile rank) (v2) - [GERP](http://mendel.stanford.edu/SidowLab/downloads/gerp/) \- Genomic Evolutionary Rate Profiling (GERP) \- rejected substitutions (RS) score (v1) -- [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021_11 () +- [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021_11 (v35.0)) - [UniProtKB](http://www.uniprot.org) \- Comprehensive resource of protein sequence and functional information (2021_04) -- [gnomAD](http://gnomad.broadinstitute.org) \- Germline variant frequencies exome-wide (r2.1 () +- [gnomAD](http://gnomad.broadinstitute.org) \- Germline variant frequencies exome-wide (r2.1 (October 2018)) - [dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) \- Database of short genetic variants (154) - [DoCM](http://docm.genome.wustl.edu) \- Database of curated mutations (release 3.2) - [CancerHotspots](http://cancerhotspots.org) \- A resource for statistically significant mutations in cancer (2017) - [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar) \- Database of genomic variants of 
clinical significance (20220103) -- [CancerMine](http://bionlp.bcgsc.ca/cancermine/) \- Literature-mined database of tumor suppressor genes/proto-oncogenes (20211106 () +- [CancerMine](http://bionlp.bcgsc.ca/cancermine/) \- Literature-mined database of tumor suppressor genes/proto-oncogenes (20211106 (v42)) - [OncoTree](http://oncotree.mskcc.org/) \- Open-source ontology developed at MSK-CC for standardization of cancer type diagnosis (2021-11-02) - [DiseaseOntology](http://disease-ontology.org) \- Standardized ontology for human disease (20220131) - [EFO](https://github.com/EBISPOT/efo) \- Experimental Factor Ontology (v3.38.0) From 41f68c45f42bc3f9b94d014dc4833f4a1a36d6b3 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Thu, 27 Mar 2025 16:36:21 +1100 Subject: [PATCH 14/36] separation --- docs/details.md | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/docs/details.md b/docs/details.md index 7dfe9eb1..a469209a 100644 --- a/docs/details.md +++ b/docs/details.md @@ -10,6 +10,8 @@ The sash Workflow is a genomic analysis framework comprising three primary pipel These pipelines utilise Bolt, a Python package designed for modular processing, and leverage outputs from the [DRAGEN](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) Variant Caller alongside and the Hartwig Medical Foundation WiGiTS toolkit (via [Oncoanalyser](https://github.com/nf-core/oncoanalyser)) [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) in Oncoanalyser. Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation. +--- + ## [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) HMFtools WiGiTS is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. 
Key components used in Sash include: @@ -26,6 +28,8 @@ Cobalt calculates read-depth ratios from sequencing data, providing essential in - [Amber](https://github.com/hartwigmedical/hmftools/blob/master/amber/README.md): Amber computes B-allele frequencies, which are critical for estimating tumor purity and ploidy. The Amber directory contains these measurements, supporting PURPLE's integrated analysis. +--- + ## Pipeline Inputs ### Dragen @@ -57,6 +61,8 @@ Description: This VCF contains structural variant calls produced by GRIDSS2. - Directory: `Amber` - Description: Contains B-allele frequency measurements used by PURPLE to estimate tumor purity and ploidy. + +--- ## Workflows @@ -495,6 +501,8 @@ PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): - Tier 3 (Low): Variants with uncertain significance. - Tier 4 (No Interest): Variants unlikely to be clinically relevant. +--- + # Common Reports ### [Cancer report](https://umccr.github.io/gpgr/) @@ -576,7 +584,7 @@ Key Metrics: Note: The PCGR tool is designed to process a maximum of 500,000 variants. If the input VCF file contains more than this limit, variants exceeding 500,000 will be filtered ou -### CPSR Repor +### CPSR Report The CPSR (Cancer Predisposition Sequencing Report) includes the following: @@ -608,6 +616,8 @@ PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): - Tier 3 (Low): Variants with uncertain significance. - Tier 4 (No Interest): Variants unlikely to be clinically relevant. 
+--- + # Reference data ### [UMCCR Genes panels](https://github.com/umccr/gene_panels) @@ -652,9 +662,7 @@ Databases/datasets PCGR Reference Data: - [GWAS_Catalog](https://www.ebi.ac.uk/gwas/) \- The NHGRI-EBI Catalog of published genome-wide association studies (20211221) - [CGI](http://cancergenomeinterpreter.org/biomarkers) \- Cancer Genome Interpreter Cancer Biomarkers Database (20180117) -### - -### +--- # Sash Module Outputs: @@ -698,6 +706,8 @@ DRAGEN HRD Score - File: `dragen_somatic_output/{tid}.hrdscore.tsv` - Description: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis. +--- + # FAQ ### Q: Do we use PCGR for the rescue of sage? From 437eeb3cff7d4184f03a2eccf2690d6a5f622f53 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Thu, 27 Mar 2025 16:44:16 +1100 Subject: [PATCH 15/36] remove useless sumaary --- README.md | 15 --------------- 1 file changed, 15 deletions(-) diff --git a/README.md b/README.md index 22eb75fc..37cc296c 100644 --- a/README.md +++ b/README.md @@ -68,23 +68,8 @@ Skeptically confirm that all dependencies are installed and reference data is co - **[Usage Instructions](docs/usage.md)**: Detailed parameters, sample sheet format, and output descriptions. - **Oncoanalyser**: [github.com/nf-core/oncoanalyser](https://github.com/nf-core/oncoanalyser) - --- -## Pipeline Steps - -Below is a simplified overview of the main pipeline stages (each stage may have multiple processes): - -1. **Somatic SNV** - - Merge DRAGEN VCF & SAGE VCF → Annotate with PCGR → Filter → HTML report -2. **Somatic SV** - - Integrate structural calls → PURPLE for CNVs/purity → Annotate (SnpEff) → Filter → Prioritize → Summaries in MultiQC -3. **Germline** - - Filter by known predisposition genes → CPSR classification → Germline report (HTML/TSV) - -The pipeline concludes with a final MultiQC run to aggregate logs and QC. - - ## Contributions & Support Contributions are welcomed. 
For issues or feature requests: From 7f86249d0a3e35ad2180fa1a2f648cb5080a1007 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Thu, 27 Mar 2025 16:50:56 +1100 Subject: [PATCH 16/36] rescue -> integration --- README.md | 2 +- docs/details.md | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 37cc296c..0cc58fda 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # nf-core/sash -sash is the UMCCR post-processing WGS workflow. The workflow takes DRAGEN small variant calls and oncoanalyser results as input to perform annotation, prioritisation, rescue and filtering, and reporting for the WGS variant data. Additionally, sash runs several sensors for biomarker assessment and genomic characterisation including HRD status, mutational signatures, purity/ploidy, MSI, and TMB. +sash is the UMCCR post-processing WGS workflow. The workflow takes DRAGEN small variant calls and oncoanalyser results as input to perform annotation, prioritisation, integration and filtering, and reporting for the WGS variant data. Additionally, sash runs several sensors for biomarker assessment and genomic characterisation including HRD status, mutational signatures, purity/ploidy, MSI, and TMB. While the sash workflow utilises a range of tools and software, it is most closely coupled with bolt, a Python package that implements the UMCCR post-processing logic and supporting functionality. diff --git a/docs/details.md b/docs/details.md index a469209a..39b68260 100644 --- a/docs/details.md +++ b/docs/details.md @@ -88,7 +88,7 @@ The variant calling integrations step use variants fromemploys the Somatic Alter Cancer Genome Interpreter (CGI) - CIViC \- Clinical interpretations of variants in cancer. - OncoKB \- Precision Oncology Knowledge Base. -- Outputs a VCF containing rescued variants. +- Outputs a VCF containing integrated variants. 
#### Inputs @@ -159,7 +159,7 @@ Use PCGR to enrich the VCF with: ##### Inputs: -- Small variant vcfRescue VCF +- Small variant vcf Rescue VCF - `${tumor_id}.main.sage.filtered.vcf.gz` ##### Output: From 4cdde390f0d20cb7fc05f5ceda766cfb73af645b Mon Sep 17 00:00:00 2001 From: qclayssen Date: Fri, 28 Mar 2025 16:41:24 +1100 Subject: [PATCH 17/36] rephrase linting --- README.md | 30 ++++++++++++++++-------------- docs/details.md | 16 +++++++--------- 2 files changed, 23 insertions(+), 23 deletions(-) diff --git a/README.md b/README.md index 0cc58fda..832e7cd0 100644 --- a/README.md +++ b/README.md @@ -23,7 +23,8 @@ The **Sash** pipeline has three main workflows (details in [Docs](docs/README.md Reports --- -## Sample sheet input + +## Sample sheet input ```csv id,subject_name,sample_name,filetype,filepath @@ -33,6 +34,7 @@ subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyse ``` --- + ## Quick Start ```bash @@ -43,9 +45,9 @@ nextflow run scwatts/sash \ --outdir output/ ``` -- `--input` specifies a CSV file listing your tumor/normal samples and any pre-existing Oncoanalyser outputs. -- `--ref_data_path` points to a directory containing reference resources (genome FASTA, PCGR/CPSR data bundle, hotspot lists, etc.). -- `-profile docker` runs the pipeline with Docker. Use `singularity` or `conda` if Docker is not available. +- `--input` specifies a CSV file listing your tumor/normal samples and any pre-existing Oncoanalyser outputs. +- `--ref_data_path` points to a directory containing reference resources (genome FASTA, PCGR/CPSR data bundle, hotspot lists, etc.). +- `-profile docker` runs the pipeline with Docker. Use `singularity` or `conda` if Docker is not available. Results are organized into subfolders for SNVs, SVs, germline calls, and final HTML reports (PCGR, CPSR). A `MultiQC` report aggregates quality metrics. 
@@ -53,11 +55,11 @@ Results are organized into subfolders for SNVs, SVs, germline calls, and final H ## Installation -For installation instructions, **[please see our tutorial page](https://nf-co.re/usage/installation)**. +For installation instructions, **[please see our tutorial page](https://nf-co.re/usage/installation)**. You will need: -- **Nextflow** (≥22.10.0) -- A container engine (e.g., **Docker** or **Singularity**) or a Conda environment -- **Java 8/11** for running Nextflow +- **Nextflow** (≥22.10.0) +- A container engine (e.g., **Docker** or **Singularity**) or a Conda environment +- **Java 8/11** for running Nextflow Skeptically confirm that all dependencies are installed and reference data is correctly downloaded before proceeding. Erroneous references or mismatched genome builds (e.g., b37 vs GRCh38) are a common source of confusion [@nextflow_docs]. @@ -73,8 +75,8 @@ Skeptically confirm that all dependencies are installed and reference data is co ## Contributions & Support Contributions are welcomed. For issues or feature requests: -1. Check [open issues on GitHub](https://github.com/nf-core/sash/issues) -2. If it’s new, submit a detailed report with logs and sample sheet. +1. Check [open issues on GitHub](https://github.com/nf-core/sash/issues) +2. If it’s new, submit a detailed report with logs and sample sheet. For user support, join the **nf-core Slack** community. Always verify your environment and reference integrity before blaming pipeline scripts. @@ -84,7 +86,7 @@ For user support, join the **nf-core Slack** community. 
Always verify your envir If you use **nf-core/sash** for your analysis, please cite: -- **Nextflow**: [doi:10.1038/nbt.3820](https://doi.org/10.1038/nbt.3820) -- **nf-core**: [doi:10.1038/s41587-020-0439-x](https://doi.org/10.1038/s41587-020-0439-x) -- **PCGR**: [doi:10.1186/s12859-019-3220-4](https://doi.org/10.1186/s12859-019-3220-4) -- **Hartwig WiGiTS** (SAGE, PURPLE, LINX): [@hartwigmedicalfoundation_hmftools](https://github.com/hartwigmedical/hmftools) \ No newline at end of file +- **Nextflow**: [doi:10.1038/nbt.3820](https://doi.org/10.1038/nbt.3820) +- **nf-core**: [doi:10.1038/s41587-020-0439-x](https://doi.org/10.1038/s41587-020-0439-x) +- **PCGR**: [doi:10.1186/s12859-019-3220-4](https://doi.org/10.1186/s12859-019-3220-4) +- **Hartwig WiGiTS** (SAGE, PURPLE, LINX): [@hartwigmedicalfoundation_hmftools](https://github.com/hartwigmedical/hmftools) diff --git a/docs/details.md b/docs/details.md index 39b68260..d17f550b 100644 --- a/docs/details.md +++ b/docs/details.md @@ -14,10 +14,10 @@ These pipelines utilise Bolt, a Python package designed for modular processing, ## [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) -HMFtools WiGiTS is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. Key components used in Sash include: +HMFtools WiGiTS is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. Key components used in Oncoanalyser, by sash include: - [SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md): - A tiered SNV/indel caller targeting ~10,000 cancer hotspots (e.g., OncoKB, CIViC) to recover low-frequency variants missed by DRAGEN. Outputs a VCF with confidence tiers (hotspot, panel, high/low confidence). 
+ A tiered SNV/indel caller targeting ~10,000 cancer hotspots ([Cancer Genome Interpreter](https://www.cancergenomeinterpreter.org/home), [CIViC](http://civic.genome.wustl.edu/), [OncoKB](https://oncokb.org/)) to recover low-frequency variants missed by DRAGEN. Outputs a VCF with confidence tiers (hotspot, panel, high/low confidence). - [PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple): Estimates tumor purity (tumor cell fraction) and ploidy (average copy number), integrates copy number data, and calculates TMB (tumor mutation burden) and MSI (microsatellite instability). @@ -70,7 +70,7 @@ Description: This VCF contains structural variant calls produced by GRIDSS2. #### General -In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser that is relaing on Somatic Alterations in Genome [SAGE)](https://github.com/hartwigmedical/hmftools/tree/master/sage),and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple)) outputs. It’s structured into four steps: Integrations, Annotation, Filter, and Report. The final outputs include an HTML report summarising the results. +In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser that is relaing on Somatic Alterations in Genome [SAGE](https://github.com/hartwigmedical/hmftools/tree/master/sage),and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple) outputs. It’s structured into four steps: Integrations, Annotation, Filter, and Report. The final outputs include an HTML report summarising the results. 
#### Summary @@ -81,11 +81,11 @@ In the Somatic Small Variants workflow, variant detection is performed using the ### Variant Calling integrations -The variant calling integrations step use variants fromemploys the Somatic Alterations in Genome (SAGE) variant callertool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed filtered out. [SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage) focuses on targets known cancer hotspots (from sources like CGI, CIViC, OncoKB) Targeted Hotspot. Analysis, prioritising predefined genomic regions of high clinical or biological relevance with his own [filter](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the integration calling of biologically significant variants in a VCF that may have been missed otherwise. +The variant calling integrations step use variants from the **Somatic Alterations in Genome** ([SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage)) variant caller tool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed, or filtered out. SAGE focuses on targets known cancer hotspots prioritising predefined genomic regions of high clinical or biological relevance with his own [filter](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the integration calling of biologically significant variants in a VCF that may have been missed otherwise. - Low-allele-frequency variants in hotspots genomic regions of clinical significance. - Hotspots are derived from: - Cancer Genome Interpreter (CGI) + - Cancer Genome Interpreter (CGI) - CIViC \- Clinical interpretations of variants in cancer. - OncoKB \- Precision Oncology Knowledge Base. - Outputs a VCF containing integrated variants. 
@@ -113,10 +113,8 @@ Steps are: - Hotspot regions are derived from databases such as: - Cancer Genome Interpreter (CGI) - CIViC (Clinical Interpretations of Variants in Cancer) - - OncoKB (Precision Oncology Knowledge Base) - - This ensures that only high-confidence variants in clinically relevant regions are considered. -2. Separate SAGE calls into existing and novel variants - - Compare the input VCF and the filtered SAGE VCF to identify overlapping and unique variants. + - OncoKB (Precision Oncology Knowledge Base) + - Compare the input VCF and the SAGE VCF to identify overlapping and unique variants. 3. Annotate existing somatic variant calls also present in the SAGE calls in the input VCF - Annotate variants that are re-called by SAGE: - For each variant in the input VCF, check if it exists in the SAGE existing calls. From eaa0206e14c1ea7b11d48fbd1b38df833f113cf5 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Mon, 31 Mar 2025 10:00:18 +1100 Subject: [PATCH 18/36] Add tools and linting --- docs/details.md | 78 ++++++++++++++++++++++++++++--------------------- 1 file changed, 45 insertions(+), 33 deletions(-) diff --git a/docs/details.md b/docs/details.md index d17f550b..39b44d57 100644 --- a/docs/details.md +++ b/docs/details.md @@ -1,4 +1,4 @@ -# Sash Workflow Overview +# sash workflow details ![Summary](images/sash_overview_qc.png) @@ -28,6 +28,19 @@ Cobalt calculates read-depth ratios from sequencing data, providing essential in - [Amber](https://github.com/hartwigmedical/hmftools/blob/master/amber/README.md): Amber computes B-allele frequencies, which are critical for estimating tumor purity and ploidy. The Amber directory contains these measurements, supporting PURPLE's integrated analysis. 
+--- +## Other Tools + +### [SIGRAP](https://github.com/umccr/sigrap) + +#### [Personal Cancer Genome Reporter (PCGR)](https://github.com/sigven/pcgr/tree/v1.4.1) + +#### [Cancer Predisposition Sequencing Report (CPSR)](https://github.com/sigven/cpsr) + +### [Genomics Platform Group Reporting(GPGR)](https://github.com/umccr/gpgr) for Cancer Report + +### [Linx](https://github.com/umccr/linxreport) + --- ## Pipeline Inputs @@ -66,22 +79,23 @@ Description: This VCF contains structural variant calls produced by GRIDSS2. ## Workflows -## Somatic Small Variants (SNV/Indel, Tumor)Somatic small variants +## Somatic Small Variants (SNV/Indel, Tumor) #### General -In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser that is relaing on Somatic Alterations in Genome [SAGE](https://github.com/hartwigmedical/hmftools/tree/master/sage),and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple) outputs. It’s structured into four steps: Integrations, Annotation, Filter, and Report. The final outputs include an HTML report summarising the results. +In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser that is relying on Somatic Alterations in Genome [SAGE](https://github.com/hartwigmedical/hmftools/tree/master/sage), and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple) outputs. It’s structured into four steps: Integrations, Annotation, Filter, and Report. The final outputs include an HTML report summarising the results. #### Summary -1. Rescue variants using SAGE to recover low-frequency alterations in clinically important hotspots. +1. Integration of SAGE variants to recover low-frequency mutations in hotspots. 2. Annotate variants with clinical and functional information using PCGR. -3. 
Filter variants based on quality and frequency criteria (e.g., allele frequency, read depth, population frequency), while retaining those of potential clinical significance (hotspots, high-impact, etc.).Filter variants based on allele frequency (AF), supporting reads (AD), and population frequency (gnomAD AF), removing low-confidence and common variants. -4. Report final annotated variants in a comprehensive HTML report (PCGR, CANCER REPORT, LINX, multiqc) format. +3. Filter variants based on quality and frequency criteria (allele frequency, read depth, population frequency), while retaining those of potential clinical significance (hotspots, high-impact, etc.). +4. Filter variants based on allele frequency (AF), supporting reads (AD), and population frequency (gnomAD AF), removing low-confidence and common variants. +5. Report final annotated variants in a comprehensive HTML report (PCGR, CANCER REPORT, LINX, MultiQC) format. ### Variant Calling integrations -The variant calling integrations step use variants from the **Somatic Alterations in Genome** ([SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage)) variant caller tool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed, or filtered out. SAGE focuses on targets known cancer hotspots prioritising predefined genomic regions of high clinical or biological relevance with his own [filter](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the integration calling of biologically significant variants in a VCF that may have been missed otherwise. +The variant calling integrations step uses variants from the **Somatic Alterations in Genome** ([SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage)) variant caller, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed or filtered out.
SAGE targets known cancer hotspots, prioritising predefined genomic regions of high clinical or biological relevance, and applies its own [soft filters](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the integration into the VCF of biologically significant variants that may otherwise have been missed. - Low-allele-frequency variants in hotspots genomic regions of clinical significance. - Hotspots are derived from: @@ -102,7 +116,7 @@ The variant calling integrations step use variants from the **Somatic Alteration ##### Output - Rescue: VCF - - `${tumor_id}.rescued.vcf.g` + - `${tumor_id}.rescued.vcf.gz` #### Details @@ -115,35 +129,34 @@ Steps are: - CIViC (Clinical Interpretations of Variants in Cancer) - OncoKB (Precision Oncology Knowledge Base) - Compare the input VCF and the SAGE VCF to identify overlapping and unique variants. -3. Annotate existing somatic variant calls also present in the SAGE calls in the input VCF +2. Annotate existing somatic variant calls in the input VCF that are also present in the SAGE calls - Annotate variants that are re-called by SAGE: - For each variant in the input VCF, check if it exists in the SAGE existing calls. - For variants re-called by SAGE: - If `SAGE FILTER=PASS` and input VCF `FILTER=PASS`: - Set `INFO/SAGE_HOTSPOT` to indicate the variant is called by SAGE in a hotspot. - If `SAGE FILTER=PASS` and input VCF `FILTER` is not `PASS`: - - Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is rescued by SAGE. + - Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is integrated from SAGE. - Update `FILTER=PASS` to include the variant in the final analysis. - If `SAGE FILTER` is not `PASS`: - Append `SAGE_lowconf` to the `FILTER` field to flag low-confidence variants. - Transfer SAGE `FORMAT` fields to the input VCF with a `SAGE_` prefix -4. Combine annotated input VCF with novel SAGE calls - - Prepare novel SAGE calls.
For each variant in the SAGE VCF missing from the input VCF:: +3. Combine annotated input VCF with novel SAGE calls + - Prepare novel SAGE calls. For each variant in the SAGE VCF missing from the input VCF: - Rename certain `FORMAT` fields in the novel SAGE VCF to avoid namespace collisions: - For example, `FORMAT/SB` is renamed to `FORMAT/SAGE_SB`. - Retain necessary `INFO` and `FORMAT` annotations while removing others to streamline the data. - Summary Finalize the rescued of VCF file integration + Summary Finalize the integration of VCF file integration - The final VCF file includes: - Original variants from the input VCF, annotated with SAGE information where applicable. - Novel variants identified by SAGE in hotspot regions. - Updated `FILTER` and `INFO` fields reflecting the rescue and annotation process. - - The rescued VCF provides a comprehensive set of variants for downstream analysis, prioritizing clinically significant mutations. ### Annotation -The Annotation consists of three processes:step employs Reference Sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots),UMCCR panel of normals and theand the Personal Cancer Genome Reporter (PCGR) tool to enrich variants with detailed functional and with clinical information using ACMG guidelines. PCGR classifies variants into tiers based on their clinical and biological significance and incorporates mutational signature analysis to provide insights into underlying mutational processes. To manage memory usage effectively, the input VCF file is divided into chunks, each containing up to 500,000 variants. Each chunk is processed independently through PCGR, and after annotation, the chunks are merged to produce an annotated VCF and TSV file. 
+The Annotation step consists of three processes. It employs reference sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots), the UMCCR panel of normals and the PCGR tool to enrich variants with detailed functional and clinical information using ACMG guidelines. PCGR classifies variants into tiers based on their clinical and biological significance and incorporates mutational signature analysis to provide insights into underlying mutational processes. To manage memory usage effectively, the input VCF file is divided into chunks, each containing up to 500,000 variants. Each chunk is processed independently through PCGR, and after annotation, the chunks are merged to produce an annotated VCF and TSV file. #### These annotations are used to decide which variants are retained or filtered in the next step @@ -155,12 +168,12 @@ Use PCGR to enrich the VCF with: - Process VCF files in chunks ≤500,000 variants each. - Merge annotated chunks into a unified VCF. -##### Inputs: +##### Inputs - Small variant vcf Rescue VCF - `${tumor_id}.main.sage.filtered.vcf.gz` -##### Output: +##### Output - Annotated VCF - `${tumor_id}.annotations.vcf.g` @@ -213,7 +226,7 @@ Steps are: - TCGA pancancer count ≥5 - ICGC PCAWG count ≥3. - Apply filters to other variants: - - Remove variants with `AF 10%`. + - Remove variants with AF < 10%. - Remove common variants in gnomAD (`population AF ≥ 1%`), adding them to the germline set. - Remove variants present in ≥5 samples of the Panel of Normals. - Remove indels in "bad promoter" regions (as defined by GA4GH). @@ -224,7 +237,6 @@ Steps are: 9. Report passing variants using PCGR, classified by the ACMG tier system 10. Generate the final report of variants classified according to clinical significance using PCGR, ready for downstream analysis.
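The keep/remove rules in step 8 can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Variant` class and its field names are hypothetical and do not mirror bolt's code, the AF rule is read as removing variants below 10% (the reading given by the filter table elsewhere in this patch series), and the ICGC threshold follows the list above (≥3), whereas the later table uses ≥5. Region-based rules (ENCODE blacklist, bad promoters, strand bias) are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical per-variant summary; field names are illustrative only."""
    tumor_af: float                 # tumor allele frequency, 0 to 1
    tumor_vd: int                   # variant depth (supporting reads)
    gnomad_af: float = 0.0          # gnomAD population allele frequency
    pon_count: int = 0              # samples in the panel of normals carrying it
    low_complexity: bool = False    # overlaps a low complexity region
    sage_hotspot: bool = False      # called by SAGE in a known hotspot
    pcgr_tier: Optional[int] = None
    cosmic_count: int = 0
    tcga_count: int = 0
    icgc_count: int = 0


def has_clinical_potential(v):
    """Variants matching any 'keep' rule bypass the removal filters."""
    return (
        v.sage_hotspot
        or v.pcgr_tier in (1, 2)
        or v.cosmic_count >= 10
        or v.tcga_count >= 5
        or v.icgc_count >= 3  # assumption: threshold from the step list
    )


def filter_reasons(v):
    """Return the list of removal reasons; an empty list means the variant is kept."""
    if has_clinical_potential(v):
        return []
    reasons = []
    if v.tumor_af < 0.10:           # assumption: AF below 10% is removed
        reasons.append("low_af")
    if v.gnomad_af >= 0.01:
        reasons.append("common_gnomad")
    if v.pon_count >= 5:
        reasons.append("pon")
    min_vd = 6 if v.low_complexity else 4
    if v.tumor_vd < min_vd:
        reasons.append("low_vd")
    return reasons
```

Returning the reasons rather than a boolean mirrors how VCF `FILTER` values accumulate, which makes it easier to see why a given variant was dropped.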
- ### Filter The Filter step applies a series of stringent filters to somatic variant calls in the VCF file, ensuring the retention of high-confidence and biologically meaningful variants. @@ -278,18 +290,18 @@ Filters: #### 2.2 Panel of Normals (PoN) Germline Filter - Variants present in more than 5 normal samples from the UMCCR Panel of Normals are removed. -- Variants with a PoN AF \>20% are also excluded. +- Variants with a PoN AF >20% are also excluded. - This step reduces contamination from sequencing artefacts or undetected germline variants. -### 3\. Rescue and Clinical Significance Filters +### 3. Clinical Significance Filters -These variants are retained even if they fail technical filters. +Variants with clinical significance are retained even if they fail technical filters. -### 3.1 Hotspot Rescue +### 3.1 Hotspot - Variants located in Hartwig, OncoKB, or other curated hotspot databases are retained, even if they fail other quality or frequency filters. - #### 3.2 Reference Database Hit Count Rescue + #### 3.2 Reference Database Hit Count - Variants with strong prior evidence in COSMIC, TCGA, or ICGC are retained, even if they fail standard filtering: - COSMIC count ≥10 @@ -352,7 +364,7 @@ Inputs: #### Output: -- PCGRCancer repor +- PCGR Cancer repor - ${tumor_id}.pcgr_acmg.grch38.html 1. 
Generate BCFtools Statistics on the Input VCF: @@ -485,11 +497,11 @@ Variant Classification: ClinVarc and Non-ClinVar -- Class 5 \- Pathogenic variants -- Class 4 \- Likely Pathogenic variants -- Class 3 \- Variants of Uncertain Significance (VUS) -- Class 2 \- Likely Benign variants -- Class 1 \- Benign variants +- Class 5 - Pathogenic variants +- Class 4 - Likely Pathogenic variants +- Class 3 - Variants of Uncertain Significance (VUS) +- Class 2 - Likely Benign variants +- Class 1 - Benign variants - Biomarkers PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): @@ -662,7 +674,7 @@ Databases/datasets PCGR Reference Data: --- -# Sash Module Outputs: +# sash module outputs: Somatic SNVs @@ -714,7 +726,7 @@ A: In Somatic SV, we used sage to make variant calling then we did annotation of ### Q: how are hypermutated samples handled in the current version, and is there any impact on derived metrics such as TMB or MSI? -A: In the current version of Sash, hypermutated samples are identified based on a threshold 500,000 of total somatic variant counts. For instance, if the variant count exceeds the threshold , the sample is flagged as hypermutated. When this occurs we will filter variant that are not considered that don’t have clinical impact, in hotspot region, until we meet the threshold. We that wil impact the TMB and MSI calculated by purple. For Now we are using the TMB and MSI of purple is this edges case. New reale will be hable to get correct TMB and MSI from purple +A: In the current version of sash, hypermutated samples are identified using a threshold of 500,000 total somatic variants. If the variant count exceeds the threshold, the sample is flagged as hypermutated. When this occurs, we filter variants that 1. don’t have clinical impact, 2. in hotspot region, until we meet the threshold. That will impact the TMB and MSI calculated by PURPLE. For now, we are using the TMB and MSI from PURPLE in this edge case.
A new release will be able to get the correct TMB and MSI from PURPLE. ### Q: how are we handling non-standard chromosomes if present in the input VCFs (ALTs, chrM, etc)? A: Filter out as we Filter on chr 1..22 and chr X,Y,M @@ -728,7 +740,7 @@ A: Circos plots come Purple A: The TMB displayed is the one calculated by PCGR ### Q: what filtered VCF is the source for the mutational signatures? -A: We use the filtred VCF for mutational signatures +A: We use the filtered VCF for mutational signatures ### Q: Where is the contamination score coming from currently? A: I don’t think there is contamination at the moment in sash From 2841c64b55d0d7323ef776176103a2e384299c12 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Tue, 1 Apr 2025 17:20:25 +1100 Subject: [PATCH 19/36] add tables --- docs/details.md | 141 +++++++++++------------------------------------- 1 file changed, 32 insertions(+), 109 deletions(-) diff --git a/docs/details.md b/docs/details.md index 39b44d57..e013c068 100644 --- a/docs/details.md +++ b/docs/details.md @@ -74,7 +74,7 @@ Description: This VCF contains structural variant calls produced by GRIDSS2. - Directory: `Amber` - Description: Contains B-allele frequency measurements used by PURPLE to estimate tumor purity and ploidy. - + --- ## Workflows @@ -241,6 +241,7 @@ Steps are: The Filter step applies a series of stringent filters to somatic variant calls in the VCF file, ensuring the retention of high-confidence and biologically meaningful variants. + Inputs: - Annotated VCF @@ -253,105 +254,29 @@ Inputs: Filters: -1\. Technical Quality Filters - -#### 1.1 Allele Frequency (AF) Filter - -- Removes variants with tumor allele frequency (AF) \10% to exclude low-confidence mutations. -- Exception: Variants located in known cancer hotspots (Hartwig, OncoKB) are not filtered, even if their AF is below 10%. - - #### 1.2 Allele Depth (AD) Filter - -- Variants with fewer than 4 supporting reads in the tumor sample are excluded.
-- Higher Depth Requirement in Low-Complexity Regions: - - Variants located in low-complexity regions (defined by GIAB) require at least 6 supporting reads to be retained. - - #### 1.3 Non-GIAB AD Filter - -- Variants that are not in high-confidence Genome in a Bottle (GIAB) regions must meet a stricter depth requirement. -- Ensures high-quality variant detection outside of well-characterized genomic regions. - - #### 1.4 Strand Bias Filter - -- Variants showing extreme strand bias (i.e., all ALT reads are on one strand while REF reads are distributed) are removed. -- Exceptions are made for multi-caller support – if the variant is detected by multiple callers, it is retained. - - #### 1.5 Exclusion of Problematic Genomic Regions - -- Variants overlapping the ENCODE Blacklist or known low-complexity regions are filtered out. -- Indels found in bad promoter regions (as defined by GA4GH) are also removed. - -2\. Population Frequency & Panel of Normals (PoN) Filters - -#### 2.1 Population Frequency (gnomAD) Filter - -- Variants with a gnomAD maximum population allele frequency (AF) ≥1% are filtered as likely germline variants. - - #### 2.2 Panel of Normals (PoN) Germline Filter - -- Variants present in more than 5 normal samples from the UMCCR Panel of Normals are removed. -- Variants with a PoN AF >20% are also excluded. -- This step reduces contamination from sequencing artefacts or undetected germline variants. - -### 3. Clinical Significance Filters - -Variants with clinical significance are retained even if they fail technical filters. - -### 3.1 Hotspot - -- Variants located in Hartwig, OncoKB, or other curated hotspot databases are retained, even if they fail other quality or frequency filters. 
- - #### 3.2 Reference Database Hit Count - -- Variants with strong prior evidence in COSMIC, TCGA, or ICGC are retained, even if they fail standard filtering: - - COSMIC count ≥10 - - TCGA pan-cancer count ≥5 - - ICGC PCAWG count ≥5 - - #### 3.3 ClinVar Pathogenicity Rescue - -- Variants classified in ClinVar as: - - Likely Pathogenic - - Pathogenic - - Uncertain Significance (VUS) with strong clinical evidence - -- Allele Frequency (AF) Filter: - - Excludes variants with a tumor allele frequency below a threshold of 0.1. -- Allele Depth (AD) Filter: - - Removes variants with fewer than 4 supporting reads in the tumor sample. -- Degraded Mappability AD Filter: - - Applies stricter thresholds in regions with low sequence complexity or poor mappability, where errors are more likely. - - Requires a minimum of 6 supporting reads in low-sequence complexity regions(difficult region) to retain the variant. Tumor_ad \ 6 -- Non-GIAB AD Filter: - - Removes variants not confirmed by the Genome in a Bottle (GIAB) consortium if their allele depth falls below thresholds for challenging regions. -- Population Frequency Filter (gnomAD): - - Excludes variants with a population allele frequency greater than 0.01, based on gnomAD data. Gnomad_af \>= 0.01 -- Panel of Normals (PON) Germline Filter: - - Filters out variants with an allele frequency in the PON below 0.20. - - Additionally excludes variants that occur in more than 5 PON samples to mitigate germline contamination or recurrent artifacts. PON_COUNT \>= 5 -- FIlter rescue variant: - -Variants meeting these criteria are flagged as `CLINICAL_POTENTIAL_RESCUE` are NOT filtered out - -- Reference Database Hit Counts: - - Variants with a COSMIC count of ≥10. - - Variants with a TCGA pan-cancer count of ≥5. - - Variants with an ICGC PCAWG count of ≥5. 
-- ClinVar Significance: - - Variants with ClinVar classifications matching the following categories are rescued: - - `conflicting_interpretations_of_pathogenicity` - - `likely_pathogenic` - - `pathogenic` - - `uncertain_significance` -- Mutation Hotspots: - - Variants identified as hotspots in: - - `HMF_HOTSPOT` - - `PCGR_MUTATION_HOTSPOT` -- PCGR Tiers: - - Variants classified as: - - `TIER_1` - - `TIER_2` - +1. Technical Quality Filters + +| **Filter Type** | **Threshold/Criteria** | +|-------------------------------------------|------------------------------------------------| +| **Allele Frequency (AF) Filter** | Tumor AF < 10% (0.10) | +| **Allele Depth (AD) Filter** | Fewer than 4 supporting reads (6 in low-complexity regions) | +| **Non-GIAB AD Filter** | Stricter thresholds outside GIAB high-confidence regions | +| **Strand Bias Filter** | Extreme strand bias (criteria not numerically defined) | +| **Problematic Genomic Regions Filter** | Overlap with ENCODE blacklist, bad promoter, or low-complexity regions | +| **Population Frequency (gnomAD) Filter** | gnomAD AF ≥ 1% (0.01) | +| **Panel of Normals (PoN) Germline Filter** | Present in ≥ 5 normal samples or PoN AF > 20% (0.20) | +| **COSMIC Database Hit Count Filter** | COSMIC count ≥ 10 | +| **TCGA Pan-cancer Count Filter** | TCGA count ≥ 5 | +| **ICGC PCAWG Count Filter** | ICGC count ≥ 5 | + +### 2. 
Clinical Significance execeptions + +| Exception Category | Criteria | +|------------------------------|-------------------------------------------------------------------------------------------------------------------| +| **Reference Database Hit Count** | COSMIC count ≥10 OR TCGA pan-cancer count ≥5 OR ICGC PCAWG count ≥5 | +| **ClinVar Pathogenicity** | ClinVar classification of `conflicting_interpretations_of_pathogenicity`, `likely_pathogenic`, `pathogenic`, or `uncertain_significance` | +| **Mutation Hotspots** | Annotated as `HMF_HOTSPOT` OR `PCGR_MUTATION_HOTSPOT` | +| **PCGR Tier Exception** | Classified as `TIER_1` OR `TIER_2` | ### Reports The Report step utilises the Personal Cancer Genome Reporter (PCGR) @@ -364,7 +289,7 @@ Inputs: #### Output: -- PCGR Cancer repor +- PCGR Cancer report - ${tumor_id}.pcgr_acmg.grch38.html 1. Generate BCFtools Statistics on the Input VCF: @@ -409,16 +334,14 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc 8. Generate Summary Reports: 9. Create TSV (tab-separated values) files summarizing the prioritized SVs and CNVs for downstream analysis and reporting. - ### Input Files - - ### Primary SV VCFs: +### Input File - - GRIDSS2 - - ${tumor_id}.gridss.vcf.gz +- GRIDSS2 + - ${tumor_id}.gridss.vcf.gz ### Details -### Detailed Steps: +### Detailed Steps 1. GRIPSS filtering: - Evaluate split-read and paired-end support; discard variants with low support. @@ -440,7 +363,7 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - Classify Variants: - Structural Variants (SVs): Variants labeled with the source `sv_gridss`. - Copy Number Variants (CNVs): Variants labeled with the source `cnv_purple`. -4. Prioritise variants on a 4 tier system: +5. 
Prioritise variants on a 4 tier system: **1 (high)** - **2 (moderate)** - **3 (low)** - **4 (no interest)** - exon loss - on cancer gene list (1) @@ -462,7 +385,7 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - on cancer gene list (2) - other TS gene (3) - other (4) -5. Filter Low-Quality Calls: +6. Filter Low-Quality Calls: Apply Quality Filters: - Keep variants with sufficient read support (e.g., split reads (SR) ≥ 5 and paired reads (PR) ≥ 5). - Exclude Tier 3 and Tier 4 variants where `SR 5` and `PR < 5`. From 12a0ae8a7aeaaa2c614c4cf06c607e3739b6561c Mon Sep 17 00:00:00 2001 From: qclayssen Date: Wed, 2 Apr 2025 11:53:53 +1100 Subject: [PATCH 20/36] remove redundancy --- docs/details.md | 51 +++++++++++++++---------------------------------- 1 file changed, 15 insertions(+), 36 deletions(-) diff --git a/docs/details.md b/docs/details.md index e013c068..a588d980 100644 --- a/docs/details.md +++ b/docs/details.md @@ -214,28 +214,6 @@ Steps are: - merge the PCGR annotations back into the original VCF file. - Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available. - Preserve the `FILTER` statuses and other annotations from the original VCF. -8. Filter variants to remove putative germline variants and artefactsartifacts while keeping known hotspots/actionable variants - - Keep variants: - - Called by SAGE in known hotspots (CGI, CIViC, OncoKB) regardless of other evidence. - - With PCGR TIER 1 and 2 classifications, indicating strong or potential clinical significance according to ACMG guidelines. - - All driver mutations from; - - IntOGen - - mutation hotspots - - ClinVar pathogenic or uncertain significance - - COSMIC count ≥10 - - TCGA pancancer count ≥5 - - ICGC PCAWG count ≥3. - - Apply filters to other variants: - - Remove variants with AF > 10%. - - Remove common variants in gnomAD (`population AF ≥ 1%`), adding them to the germline set. 
- - Remove variants present in ≥5 samples of the Panel of Normals. - - Remove indels in "bad promoter" regions (as defined by GA4GH). - - Remove variants overlapping the ENCODE blacklist. - - Remove variants with variant depth `VD 4`. - - Remove variants with `VD < 6` and overlapping a low complexity region. - - Remove VarDict strand-biased variants unless supported by other callers. -9. Report passing variants using PCGR, classified by the ACMG tier system -10. Generate the final report of variants classified according to clinical significance using PCGR, ready for downstream analysis. ### Filter @@ -265,9 +243,9 @@ Filters: | **Problematic Genomic Regions Filter** | Overlap with ENCODE blacklist, bad promoter, or low-complexity regions | | **Population Frequency (gnomAD) Filter** | gnomAD AF ≥ 1% (0.01) | | **Panel of Normals (PoN) Germline Filter** | Present in ≥ 5 normal samples or PoN AF > 20% (0.20) | -| **COSMIC Database Hit Count Filter** | COSMIC count ≥ 10 | -| **TCGA Pan-cancer Count Filter** | TCGA count ≥ 5 | -| **ICGC PCAWG Count Filter** | ICGC count ≥ 5 | +| **COSMIC Database Hit Count Filter** | COSMIC count < 10 | +| **TCGA Pan-cancer Count Filter** | TCGA count < 5 | +| **ICGC PCAWG Count Filter** | ICGC count < 5 | ### 2. 
Clinical Significance execeptions @@ -275,7 +253,7 @@ Filters: |------------------------------|-------------------------------------------------------------------------------------------------------------------| | **Reference Database Hit Count** | COSMIC count ≥10 OR TCGA pan-cancer count ≥5 OR ICGC PCAWG count ≥5 | | **ClinVar Pathogenicity** | ClinVar classification of `conflicting_interpretations_of_pathogenicity`, `likely_pathogenic`, `pathogenic`, or `uncertain_significance` | -| **Mutation Hotspots** | Annotated as `HMF_HOTSPOT` OR `PCGR_MUTATION_HOTSPOT` | +| **Mutation Hotspots** | Annotated as `HMF_HOTSPOT`, `PCGR_MUTATION_HOTSPOT` and SAGE Hotspots(CGI, CIViC, OncoKB) | | **PCGR Tier Exception** | Classified as `TIER_1` OR `TIER_2` | ### Reports @@ -601,42 +579,42 @@ Databases/datasets PCGR Reference Data: Somatic SNVs -- File: `smlv_somatic/filter/{tid}.pass.vcf.gz` +- File: `smlv_somatic/filter/{tumor_id}.pass.vcf.gz` - Description: Contains somatic single nucleotide variants (SNVs) with filtering applied. Somatic SVs -- File: `sv_somatic/prioritise/{tid}.sv.prioritised.vcf.gz` +- File: `sv_somatic/prioritise/{tumor_id}.sv.prioritised.vcf.gz` - Description: Contains somatic structural variants (SVs) with prioritization applied. Somatic CNVs -- File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som.tsv.gz` +- File: `cancer_report/cancer_report_tables/purple/{tumor_id}-purple_cnv_som.tsv.gz` - Description: Contains somatic copy number variations (CNVs) data. Somatic Gene CNVs -- File: `cancer_report/cancer_report_tables/purple/{sid}_{tid}-purple_cnv_som_gene.tsv.gz` +- File: `cancer_report/cancer_report_tables/purple/{tumor_id}-purple_cnv_som_gene.tsv.gz` - Description: Contains gene-level somatic copy number variations (CNVs) data. 
Germline SNVs -- File: `dragen_germline_output/{nid}.hard-filtered.vcf.gz` +- File: `dragen_germline_output/{normal_id}.hard-filtered.vcf.gz` - Description: Contains germline single nucleotide variants (SNVs) with hard filtering applied. Purple Purity, Ploidy, MS Status -- File: `purple/{tid}.purple.purity.tsv` +- File: `purple/{tumor_id}.purple.purity.tsv` - Description: Contains estimated tumor purity, ploidy, and microsatellite status. PCGR JSON with TMB -- File: `smlv_somatic/report/pcgr/{tid}.pcgr_acmg.grch38.json.gz` +- File: `smlv_somatic/report/pcgr/{tumor_id}.pcgr_acmg.grch38.json.gz` - Description: Contains PCGR annotations, including tumor mutational burden (TMB). DRAGEN HRD Score -- File: `dragen_somatic_output/{tid}.hrdscore.tsv` +- File: `dragen_somatic_output/{tumor_id}.hrdscore.tsv` - Description: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis. --- @@ -649,7 +627,7 @@ A: In Somatic SV, we used sage to make variant calling then we did annotation of ### Q: how are hypermutated samples handled in the current version, and is there any impact on derived metrics such as TMB or MSI? -A: In the current version of sash, hypermutated samples are identified based on a threshold 500,000 of total somatic variant counts. For instance, if the variant count exceeds the threshold , the sample is flagged as hypermutated. When this occurs we will filter variant that 1. don’t have clinical impact, 2. in hotspot region, until we meet the threshold. That will impact the TMB and MSI calculated by purple. For Now we are using the TMB and MSI of purple is this edges case. New reale will be able to get correct TMB and MSI from purple +A: In the current version of sash, hypermutated samples are identified based on a threshold 500,000 of total somatic variant counts. For instance, if the variant count exceeds the threshold , the sample is flagged as hypermutated. When this occurs we will filter variant that 1. don’t have clinical impact, 2. 
in hotspot region, until we meet the threshold. That will impact the TMB and MSI calculated by purple. For Now we are using the TMB and MSI of purple is this edges case. New release will be able to get correct TMB and MSI from purple. ### Q: how are we handling non-standard chromosomes if present in the input VCFs (ALTs, chrM, etc)? A: Filter out as we Filter on chr 1..22 and chr X,Y,M @@ -657,7 +635,8 @@ A: Filter out as we Filter on chr 1..22 and chr X,Y,M ### Q: inputs for the cancer reporter \- have they changed (and what can we harmonize); e.g., where is the Circos plot from at this point? A: Circos plots come Purple -### Q: we dropped the CACAO coverage reports; can we discuss how to utilize DRAGEN or WiGiTS coverage information instead? +### Q: we dropped the CACAO coverage reports. can we discuss how to utilize DRAGEN or WiGiTS coverage information instead? + ### Q: what TMB score is displayed in the cancer reporter? A: The TMB display is the on calculated by pcgr From eeafb170481e9619f09722419ae279c8342333bc Mon Sep 17 00:00:00 2001 From: qclayssen Date: Mon, 7 Apr 2025 09:16:50 +1000 Subject: [PATCH 21/36] remove redundancy --- docs/details.md | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/docs/details.md b/docs/details.md index a588d980..1f9074ff 100644 --- a/docs/details.md +++ b/docs/details.md @@ -97,13 +97,6 @@ In the Somatic Small Variants workflow, variant detection is performed using the The variant calling integrations step use variants from the **Somatic Alterations in Genome** ([SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage)) variant caller tool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed, or filtered out. 
SAGE focuses on targets known cancer hotspots prioritising predefined genomic regions of high clinical or biological relevance with its [filter](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the integration calling of biologically significant variants in a VCF that may have been missed otherwise. -- Low-allele-frequency variants in hotspots genomic regions of clinical significance. -- Hotspots are derived from: - - Cancer Genome Interpreter (CGI) - - CIViC \- Clinical interpretations of variants in cancer. - - OncoKB \- Precision Oncology Knowledge Base. -- Outputs a VCF containing integrated variants. - #### Inputs - From DRAGEN: somatic small variant caller VCF @@ -239,13 +232,13 @@ Filters: | **Allele Frequency (AF) Filter** | Tumor AF < 10% (0.10) | | **Allele Depth (AD) Filter** | Fewer than 4 supporting reads (6 in low-complexity regions) | | **Non-GIAB AD Filter** | Stricter thresholds outside GIAB high-confidence regions | -| **Strand Bias Filter** | Extreme strand bias (criteria not numerically defined) | | **Problematic Genomic Regions Filter** | Overlap with ENCODE blacklist, bad promoter, or low-complexity regions | | **Population Frequency (gnomAD) Filter** | gnomAD AF ≥ 1% (0.01) | | **Panel of Normals (PoN) Germline Filter** | Present in ≥ 5 normal samples or PoN AF > 20% (0.20) | | **COSMIC Database Hit Count Filter** | COSMIC count < 10 | | **TCGA Pan-cancer Count Filter** | TCGA count < 5 | | **ICGC PCAWG Count Filter** | ICGC count < 5 | + ### 2. 
Clinical Significance execeptions @@ -255,6 +248,8 @@ Filters: | **ClinVar Pathogenicity** | ClinVar classification of `conflicting_interpretations_of_pathogenicity`, `likely_pathogenic`, `pathogenic`, or `uncertain_significance` | | **Mutation Hotspots** | Annotated as `HMF_HOTSPOT`, `PCGR_MUTATION_HOTSPOT` and SAGE Hotspots(CGI, CIViC, OncoKB) | | **PCGR Tier Exception** | Classified as `TIER_1` OR `TIER_2` | + + ### Reports The Report step utilises the Personal Cancer Genome Reporter (PCGR) From b387700d5d0e419384626f2c829167d17026d496 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Tue, 8 Apr 2025 17:37:06 +1000 Subject: [PATCH 22/36] missed changed in docs --- docs/details.md | 81 ++++++++++++++++++------------------------------- 1 file changed, 29 insertions(+), 52 deletions(-) diff --git a/docs/details.md b/docs/details.md index 1f9074ff..b25fc4c5 100644 --- a/docs/details.md +++ b/docs/details.md @@ -26,7 +26,7 @@ HMFtools WiGiTS is an open-source suite for cancer genomics developed by the Har Cobalt calculates read-depth ratios from sequencing data, providing essential input for copy number analysis. Its outputs are used by PURPLE to generate accurate copy number profiles across the genome. - [Amber](https://github.com/hartwigmedical/hmftools/blob/master/amber/README.md): -Amber computes B-allele frequencies, which are critical for estimating tumor purity and ploidy. The Amber directory contains these measurements, supporting PURPLE's integrated analysis. +Amber computes B-allele frequencies, which are critical for estimating tumor purity and ploidy. The Amber directory contains these measurements, supporting PURPLE's re-call analysis. 
--- ## Other Tools @@ -41,6 +41,10 @@ Amber computes B-allele frequencies, which are critical for estimating tumor pur ### [Linx](https://github.com/umccr/linxreport) +### [GRIDSS/GRIPSS](https://github.com/PapenfussLab/gridss) + +#### [Virusbraken](https://github.com/PapenfussLab/gridss/blob/master/VIRUSBreakend_Readme.md) + --- ## Pipeline Inputs @@ -56,23 +60,23 @@ Amber computes B-allele frequencies, which are critical for estimating tumor pur `{tumor_id}.gridss.vcf.gz` Description: This VCF contains structural variant calls produced by GRIDSS2. -#### SAGE +#### [SAGE](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md) `{tumor_id}.sage.somatic.vcf.gz` #### [Virusbraken](https://github.com/PapenfussLab/gridss/blob/master/VIRUSBreakend_Readme.md) -- Directory: `virusbreakend` +- Directory: `virusbreakend/` - Description:Contains outputs from Virusbraken, used for detecting viral integration events. -#### Cobalt +#### [Cobalt](https://github.com/hartwigmedical/hmftools/blob/master/cobalt/README.md) -- Directory: `Cobalt` +- Directory: `cobalt/` - Description: Contains read-depth ratio data required for copy number analysis by PURPLE. -#### Amber +#### [Amber](https://github.com/hartwigmedical/hmftools/blob/master/amber/README.md) -- Directory: `Amber` +- Directory: `amber/` - Description: Contains B-allele frequency measurements used by PURPLE to estimate tumor purity and ploidy. --- @@ -83,19 +87,19 @@ Description: This VCF contains structural variant calls produced by GRIDSS2. #### General -In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser that is relying on Somatic Alterations in Genome [SAGE](https://github.com/hartwigmedical/hmftools/tree/master/sage), and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple) outputs. It’s structured into four steps: Integrations, Annotation, Filter, and Report. 
The final outputs include an HTML report summarising the results. +In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser that is relying on Somatic Alterations in Genome [SAGE](https://github.com/hartwigmedical/hmftools/tree/master/sage), and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple) outputs. It’s structured into four steps: Re-callings, Annotation, Filter, and Report. The final outputs include an HTML report summarising the results. #### Summary -1. Integration of SAGE variants to recover low-frequency mutations in hotspots. +1. re-callings of SAGE variants to recover low-frequency mutations in hotspots. 2. Annotate variants with clinical and functional information using PCGR. 3. Filter variants based on quality and frequency criteria (allele frequency, read depth, population frequency), while retaining those of potential clinical significance (hotspots, high-impact, etc.). 4. 4.Filter variants based on allele frequency (AF), supporting reads (AD), population frequency (gnomAD AF), removing low-confidence and common variants. 5. Report final annotated variants in a comprehensive HTML report (PCGR, CANCER REPORT, LINX, MultiQC) format. -### Variant Calling integrations +### Variant Calling re-callings -The variant calling integrations step use variants from the **Somatic Alterations in Genome** ([SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage)) variant caller tool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed, or filtered out. SAGE focuses on targets known cancer hotspots prioritising predefined genomic regions of high clinical or biological relevance with its [filter](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the integration calling of biologically significant variants in a VCF that may have been missed otherwise. 
+The variant calling re-callings step use variants from the **Somatic Alterations in Genome** ([SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage)) variant caller tool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed, or filtered out. SAGE focuses on targets known cancer hotspots prioritising predefined genomic regions of high clinical or biological relevance with its [filter](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the re-callings calling of biologically significant variants in a VCF that may have been missed otherwise. #### Inputs @@ -117,19 +121,14 @@ Steps are: 1. Select High-Confidence SAGE Calls in Hotspot Regions to ensure only high-confidence variants in clinically relevant regions are considered: - Filter the SAGE output to retain only variants that pass quality filters and overlap with known hotspot regions. - - Hotspot regions are derived from databases such as: - - Cancer Genome Interpreter (CGI) - - CIViC (Clinical Interpretations of Variants in Cancer) - - OncoKB (Precision Oncology Knowledge Base) - Compare the input VCF and the SAGE VCF to identify overlapping and unique variants. 2. Annotate existing somatic variant calls also present in the SAGE calls in the input VCF - - Annotate variants that are re-called by SAGE: - For each variant in the input VCF, check if it exists in the SAGE existing calls. - - For variants re-called by SAGE: + - For variants integrateed by SAGE: - If `SAGE FILTER=PASS` and input VCF `FILTER=PASS`: - Set `INFO/SAGE_HOTSPOT` to indicate the variant is called by SAGE in a hotspot. - If `SAGE FILTER=PASS` and input VCF `FILTER` is not `PASS`: - - Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is integrated from SAGE. + - Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is re-call from SAGE. 
- Update `FILTER=PASS` to include the variant in the final analysis. - If `SAGE FILTER` is not `PASS`: - Append `SAGE_lowconf` to the `FILTER` field to flag low-confidence variants. @@ -140,31 +139,15 @@ Steps are: - For example, `FORMAT/SB` is renamed to `FORMAT/SAGE_SB`. - Retain necessary `INFO` and `FORMAT` annotations while removing others to streamline the data. - Summary Finalize the integration of VCF file integration - - - The final VCF file includes: - - Original variants from the input VCF, annotated with SAGE information where applicable. - - Novel variants identified by SAGE in hotspot regions. - - Updated `FILTER` and `INFO` fields reflecting the rescue and annotation process. - ### Annotation -The Annotation consists of three step processes, employs Reference Sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots), UMCCR panel of normals and the PCGR tool to enrich variants with detailed functional and with clinical information using ACMG guidelines. PCGR classifies variants into tiers based on their clinical and biological significance and incorporates mutational signature analysis to provide insights into underlying mutational processes. To manage memory usage effectively, the input VCF file is divided into chunks, each containing up to 500,000 variants. Each chunk is processed independently through PCGR, and after annotation, the chunks are merged to produce an annotated VCF and TSV file. - -#### These annotations are used to decide which variants are retained or filtered in the next step - -Summary: -Use PCGR to enrich the VCF with: - -- Functional impact information (e.g., consequences, mutation hotspots). -- Clinical relevance (e.g., tier classifications, mutational signatures). -- Process VCF files in chunks ≤500,000 variants each. -- Merge annotated chunks into a unified VCF. 
+The Annotation consists of three step processes, employs Reference Sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots), UMCCR panel of normals and the PCGR tool to enrich variants with [classification](https://sigven.github.io/pcgr/articles/variant_classification.html) and with clinical information. +**These annotations are used to decide which variants are retained or filtered in the next step** ##### Inputs -- Small variant vcf Rescue VCF - - `${tumor_id}.main.sage.filtered.vcf.gz` +- Small variant VCF + - `${tumor_id}.rescued.vcf.gz` ##### Output @@ -194,13 +177,13 @@ Steps are: - Add the `AD` FORMAT field: - `AD`: Allelic depths for the reference and alternate alleles. 5. Prepare VCF for PCGR annotation - - Exclude unnecessary data from the VCF header keeping on INFO AF/DP . + - Make minimal VCF header keeping on INFO AF/DP, and contigs size . - Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR. - Set `FILTER` to `PASS` and remove all `FORMAT` and sample columns. 6. Run PCGR to annotate VCF against external sources - - Use PCGR (Personal Cancer Genome Reporter) to annotate the VCF with clinical, functional, and biological information. - - Classify variants by tiers based on annotations and functional impact according to ACMG guidelines. + - Use PCGR to annotate the VCF + - Classify variants by tiers based on annotations and functional impact according to AMP/ASCO/CAP guidelines. - Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `INTOGEN_DRIVER_MUT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. - External sources used during this step include VEP, ClinVar, COSMIC, TCGA, ICGC, Open Targets Platform, CancerMine, DoCM, CBMDB, DisGeNET, Cancer Hotspots, dbNSFP, UniProt/SwissProt, Pfam, DGIdb, and ChEMBL. 7. 
Transfer PCGR annotations to the full set of variants @@ -280,7 +263,7 @@ Inputs: ## Somatic structural variants -The Somatic Structural Variants (SVs) pipeline identifies and annotates large-scale genomic alterations, including deletions, duplications, inversions, insertions, and translocations in tumor samples. This step integrates outputs from DRAGEN Variant Caller, GRIDSS2, using PURPLE applies filtering criteria, and prioritizes clinically significant structural variants.The analysis of somatic structural variants (SVs) involves processing, annotating, and prioritizing variants to identify those with clinical and biological significance. This process uses outputs from tools like PURPLE and GRIDSS and involves several key steps: +The Somatic Structural Variants (SVs) pipeline identifies and annotates large-scale genomic alterations, including deletions, duplications, inversions, insertions, and translocations in tumor samples. This step re-calls outputs from DRAGEN Variant Caller, GRIDSS2, using PURPLE applies filtering criteria, and prioritizes clinically significant structural variants. ### Summary: @@ -297,15 +280,6 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc 5. Report - Cancer report - Multiqc -6. Assign SV Types: - - Classify SVs as duplications or deletions based on copy number thresholds. - - Split variants into separate files for structural variants (SVs) and copy number variants (CNVs). -7. Annotate and Prioritize Variants: - - Use SnpEff to annotate variants with gene-level and functional impact information. - - Prioritize variants based on clinical relevance and support metrics. - - Generate TSV (tab-separated values) files summarizing the prioritized SVs and CNVs. -8. Generate Summary Reports: -9. Create TSV (tab-separated values) files summarizing the prioritized SVs and CNVs for downstream analysis and reporting. 
### Input File @@ -336,7 +310,7 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - Classify Variants: - Structural Variants (SVs): Variants labeled with the source `sv_gridss`. - Copy Number Variants (CNVs): Variants labeled with the source `cnv_purple`. -5. Prioritise variants on a 4 tier system: +5. Prioritise variants on a 4 tier system using [prioritize_sv](https://github.com/umccr/vcf_stuff/blob/master/scripts/prioritize_sv.): **1 (high)** - **2 (moderate)** - **3 (low)** - **4 (no interest)** - exon loss - on cancer gene list (1) @@ -365,6 +339,9 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1. - Structural Variants (SVs): Variants labeled with the source sv_gridss. - Copy Number Variants (CNVs): Variants labeled with the source cnv_purple. +7. Report: + - Make Multiqc and cancer report + ## Germline small variants From f1733cbad3e5a955a43d990821047062c77aa9f7 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Tue, 8 Apr 2025 17:53:56 +1000 Subject: [PATCH 23/36] linting --- docs/details.md | 31 +++++++++++++++---------------- 1 file changed, 15 insertions(+), 16 deletions(-) diff --git a/docs/details.md b/docs/details.md index b25fc4c5..65a3f084 100644 --- a/docs/details.md +++ b/docs/details.md @@ -91,11 +91,10 @@ In the Somatic Small Variants workflow, variant detection is performed using the #### Summary -1. re-callings of SAGE variants to recover low-frequency mutations in hotspots. +1. re-callings SAGE variants to recover low-frequency mutations in hotspots. 2. Annotate variants with clinical and functional information using PCGR. 3. Filter variants based on quality and frequency criteria (allele frequency, read depth, population frequency), while retaining those of potential clinical significance (hotspots, high-impact, etc.). -4. 
4.Filter variants based on allele frequency (AF), supporting reads (AD), population frequency (gnomAD AF), removing low-confidence and common variants. -5. Report final annotated variants in a comprehensive HTML report (PCGR, CANCER REPORT, LINX, MultiQC) format. +4. Report final annotated variants in a comprehensive HTML report(PCGR, CANCER REPORT, LINX, MultiQC) ### Variant Calling re-callings @@ -115,9 +114,7 @@ The variant calling re-callings step use variants from the **Somatic Alterations - Rescue: VCF - `${tumor_id}.rescued.vcf.gz` -#### Details - -Steps are: +#### Steps 1. Select High-Confidence SAGE Calls in Hotspot Regions to ensure only high-confidence variants in clinically relevant regions are considered: - Filter the SAGE output to retain only variants that pass quality filters and overlap with known hotspot regions. @@ -154,9 +151,7 @@ The Annotation consists of three step processes, employs Reference Sources (GA4G - Annotated VCF - `${tumor_id}.annotations.vcf.g` -Details: - -Steps are: +#### Steps 1. Set FILTER to "PASS" for unfiltered variants - Iterate over the input VCF file the `FILTER` field to `PASS` for any variants that currently have no filter status (`FILTER` is `.` or `None`). This standardization is necessary for downstream tools. @@ -196,7 +191,7 @@ Steps are: The Filter step applies a series of stringent filters to somatic variant calls in the VCF file, ensuring the retention of high-confidence and biologically meaningful variants. -Inputs: +#### Inputs - Annotated VCF - `${tumor_id}.annotations.vcf.gz` @@ -206,7 +201,7 @@ Inputs: - Filter VCF - `${tumor_id}\*filters_set.vcf.gz` -Filters: +#### Filters 1. 
Technical Quality Filters @@ -237,16 +232,20 @@ Filters: The Report step utilises the Personal Cancer Genome Reporter (PCGR) -Inputs: +#### Inputs - Purple purity -- Filter VCF +- Filterd VCF + - `${tumor_id}\*filters_set.vcf.gz` - Dragen VCF + - `${tumor_id}.main.dragen.vcf.gz` #### Output: - PCGR Cancer report - - ${tumor_id}.pcgr_acmg.grch38.html + - `${tumor_id}.pcgr_acmg.grch38.html` + +#### Steps 1. Generate BCFtools Statistics on the Input VCF: The code runs a helper function (`bcftools_stats_prepare`) to create a modified version of the input VCF, adjusting quality scores so that `bcftools stats` can produce more meaningful outputs. It then executes `bcftools stats` to gather statistics on variant quality and distribution, storing the results in a text file. @@ -259,7 +258,7 @@ Inputs: - The code counts the total number and types of variants (SNPs, Indels, Others) passing filters in both a DRAGEN VCF and the FILTER BOLT VCF. 4. Count Variants by Processing Stage 5. Parse Purity and Ploidy Information (Purple Data) -6. Run PCGR Annotation +6. Run PCGR ## Somatic structural variants @@ -354,7 +353,7 @@ Filtering Select passing variants in the given [gene panel transcript regions](h 1. The filtered variants are then further restricted to regions defined by a gene panel transcript regions file. 2. 
Report: CPSR -The CPSR (Cancer Predisposition Sequencing Report) includes the following: +### The CPSR (Cancer Predisposition Sequencing Report) Settings: From c721156e8240c2f90d6ee546db6f7024e96abfdc Mon Sep 17 00:00:00 2001 From: qclayssen Date: Wed, 9 Apr 2025 14:41:49 +1000 Subject: [PATCH 24/36] fix redundancy typo --- docs/details.md | 65 +++++++++++++++++-------------------------------- 1 file changed, 23 insertions(+), 42 deletions(-) diff --git a/docs/details.md b/docs/details.md index 65a3f084..5a7104dc 100644 --- a/docs/details.md +++ b/docs/details.md @@ -43,7 +43,7 @@ Amber computes B-allele frequencies, which are critical for estimating tumor pur ### [GRIDSS/GRIPSS](https://github.com/PapenfussLab/gridss) -#### [Virusbraken](https://github.com/PapenfussLab/gridss/blob/master/VIRUSBreakend_Readme.md) +### [Virusbraken](https://github.com/PapenfussLab/gridss/blob/master/VIRUSBreakend_Readme.md) --- @@ -138,7 +138,7 @@ The variant calling re-callings step use variants from the **Somatic Alterations ### Annotation -The Annotation consists of three step processes, employs Reference Sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots), UMCCR panel of normals and the PCGR tool to enrich variants with [classification](https://sigven.github.io/pcgr/articles/variant_classification.html) and with clinical information. +The Annotation consists of three step processes, employs Reference Sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots), UMCCR panel of normals and the PCGR tool to enrich variants with [classification](https://sigven.github.io/pcgr/articles/variant_classification.html) and clinical information. 
**These annotations are used to decide which variants are retained or filtered in the next step** ##### Inputs @@ -186,11 +186,10 @@ The Annotation consists of three step processes, employs Reference Sources (GA4G - Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available. - Preserve the `FILTER` statuses and other annotations from the original VCF. -### Filter +### Filter The Filter step applies a series of stringent filters to somatic variant calls in the VCF file, ensuring the retention of high-confidence and biologically meaningful variants. - #### Inputs - Annotated VCF @@ -216,7 +215,6 @@ The Filter step applies a series of stringent filters to somatic variant calls i | **COSMIC Database Hit Count Filter** | COSMIC count < 10 | | **TCGA Pan-cancer Count Filter** | TCGA count < 5 | | **ICGC PCAWG Count Filter** | ICGC count < 5 | - ### 2. Clinical Significance execeptions @@ -227,7 +225,6 @@ The Filter step applies a series of stringent filters to somatic variant calls i | **Mutation Hotspots** | Annotated as `HMF_HOTSPOT`, `PCGR_MUTATION_HOTSPOT` and SAGE Hotspots(CGI, CIViC, OncoKB) | | **PCGR Tier Exception** | Classified as `TIER_1` OR `TIER_2` | - ### Reports The Report step utilises the Personal Cancer Genome Reporter (PCGR) @@ -240,7 +237,7 @@ The Report step utilises the Personal Cancer Genome Reporter (PCGR) - Dragen VCF - `${tumor_id}.main.dragen.vcf.gz` -#### Output: +#### Output - PCGR Cancer report - `${tumor_id}.pcgr_acmg.grch38.html` @@ -264,7 +261,7 @@ The Report step utilises the Personal Cancer Genome Reporter (PCGR) The Somatic Structural Variants (SVs) pipeline identifies and annotates large-scale genomic alterations, including deletions, duplications, inversions, insertions, and translocations in tumor samples. 
This step re-calls outputs from DRAGEN Variant Caller, GRIDSS2, using PURPLE applies filtering criteria, and prioritizes clinically significant structural variants. -### Summary: +### Summary 1. GRIPSS filtering: - GRIPSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and repeat masker annotations data are the specific to umccr like known_fusions @@ -334,17 +331,28 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc 6. Filter Low-Quality Calls: Apply Quality Filters: - Keep variants with sufficient read support (e.g., split reads (SR) ≥ 5 and paired reads (PR) ≥ 5). - - Exclude Tier 3 and Tier 4 variants where `SR 5` and `PR < 5`. + - Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`. - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1. - Structural Variants (SVs): Variants labeled with the source sv_gridss. - Copy Number Variants (CNVs): Variants labeled with the source cnv_purple. 7. Report: - - Make Multiqc and cancer report - + - Generate MultiQC and cancer report outputs ## Germline small variants -Filtering Select passing variants in the given [gene panel transcript regions](https://github.com/umccr/gene_panels/tree/main/germline_panel) made with PMCC familial cancer clinic list then make CPSR report. +Filtering Select passing variants in the given [gene panel transcript regions](https://github.com/umccr/gene_panels/tree/main/germline_panel) made with PMCC familial cancer clinic list then make CPSR report. + +#### Inputs + +- Dragen VCF + - `${normal_id}.hard-filtered.vcf.gz` + +#### Output + +- CPSR report + - `${normal_id}.cpsr.grch38.html` + +#### Steps 1. Prepare 1. Selection of Passing Variants: @@ -353,36 +361,6 @@ Filtering Select passing variants in the given [gene panel transcript regions](h 1. 
The filtered variants are then further restricted to regions defined by a gene panel transcript regions file. 2. Report: CPSR -### The CPSR (Cancer Predisposition Sequencing Report) - -Settings: - -- Sample metadata -- Report configuration -- Virtual gene panel - -Summary of Findings: - -- Variant statistics - -Variant Classification: - -ClinVarc and Non-ClinVar - -- Class 5 - Pathogenic variants -- Class 4 - Likely Pathogenic variants -- Class 3 - Variants of Uncertain Significance (VUS) -- Class 2 - Likely Benign variants -- Class 1 - Benign variants -- Biomarkers - -PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): - -- Tier 1 (High): Highest priority variants with strong clinical relevance. -- Tier 2 (Moderate): Variants with potential clinical significance. -- Tier 3 (Low): Variants with uncertain significance. -- Tier 4 (No Interest): Variants unlikely to be clinically relevant. - --- # Common Reports @@ -498,6 +476,9 @@ PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330): - Tier 3 (Low): Variants with uncertain significance. - Tier 4 (No Interest): Variants unlikely to be clinically relevant. 
+--- +# Coverage + --- # Reference data From ebf295457d6ea5b12e123dcf7ec9524de98554e9 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Wed, 9 Apr 2025 15:54:39 +1000 Subject: [PATCH 25/36] add table of content --- docs/details.md | 52 ++++++++++++++++++++++++++++++++----------------- 1 file changed, 34 insertions(+), 18 deletions(-) diff --git a/docs/details.md b/docs/details.md index 5a7104dc..8cece356 100644 --- a/docs/details.md +++ b/docs/details.md @@ -1,5 +1,22 @@ # sash workflow details +## Table of Contents +- [Overview](#overview) +- [HMFtools WiGiTs](#hmftools-wigits) +- [Other Tools](#other-tools) +- [Pipeline Inputs](#pipeline-inputs) +- [Workflows](#workflows) + - [Somatic Small Variants](#somatic-small-variants) + - [Somatic Structural Variants](#somatic-structural-variants) + - [Germline Small Variants](#germline-small-variants) +- [Common Reports](#common-reports) +- [sash Module Outputs](#sash-module-outputs) +- [FAQ](#faq) +- [Key Metrics](#key-metrics) +- [CPSR Report](#cpsr-report) + +## Overview + ![Summary](images/sash_overview_qc.png) The sash Workflow is a genomic analysis framework comprising three primary pipelines: @@ -12,7 +29,7 @@ These pipelines utilise Bolt, a Python package designed for modular processing, --- -## [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) +## HMFtools WiGiTs HMFtools WiGiTS is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. Key components used in Oncoanalyser, by sash include: @@ -29,6 +46,7 @@ Cobalt calculates read-depth ratios from sequencing data, providing essential in Amber computes B-allele frequencies, which are critical for estimating tumor purity and ploidy. The Amber directory contains these measurements, supporting PURPLE's re-call analysis. --- + ## Other Tools ### [SIGRAP](https://github.com/umccr/sigrap) @@ -83,7 +101,7 @@ Description: This VCF contains structural variant calls produced by GRIDSS2. 
## Workflows -## Somatic Small Variants (SNV/Indel, Tumor) +### Somatic Small Variants #### General @@ -257,11 +275,11 @@ The Report step utilises the Personal Cancer Genome Reporter (PCGR) 5. Parse Purity and Ploidy Information (Purple Data) 6. Run PCGR -## Somatic structural variants +### Somatic Structural Variants The Somatic Structural Variants (SVs) pipeline identifies and annotates large-scale genomic alterations, including deletions, duplications, inversions, insertions, and translocations in tumor samples. This step re-calls outputs from DRAGEN Variant Caller, GRIDSS2, using PURPLE applies filtering criteria, and prioritizes clinically significant structural variants. -### Summary +#### Summary 1. GRIPSS filtering: - GRIPSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and repeat masker annotations data are the specific to umccr like known_fusions @@ -277,14 +295,12 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc - Cancer report - Multiqc -### Input File +#### Inputs - GRIDSS2 - ${tumor_id}.gridss.vcf.gz -### Details - -### Detailed Steps +#### Steps 1. GRIPSS filtering: - Evaluate split-read and paired-end support; discard variants with low support. @@ -338,7 +354,7 @@ The Somatic Structural Variants (SVs) pipeline identifies and annotates large-sc 7. Report: - Generate MultiQC and cancer report outputs -## Germline small variants +### Germline Small Variants Filtering Select passing variants in the given [gene panel transcript regions](https://github.com/umccr/gene_panels/tree/main/germline_panel) made with PMCC familial cancer clinic list then make CPSR report. 
@@ -363,7 +379,7 @@ Filtering Select passing variants in the given [gene panel transcript regions](h --- -# Common Reports +## Common Reports ### [Cancer report](https://umccr.github.io/gpgr/) @@ -374,37 +390,37 @@ Tumor Mutation Burden (TMB): - Data Source: filtered somatic VCF - Tool: PURPLE -#### Mutational Signatures: +#### Mutational Signatures - Data Source: filtered SNV/CNV VCF - Tool: MutationalPatterns R package (via PCGR) -#### Contamination Score: +#### Contamination Score - Data Source: – - Note: No dedicated contamination metric is currently generated -#### Purity & Ploidy: +#### Purity & Ploidy - Data Source: COBALT (providing read-depth ratios) and AMBER (providing B-allele frequency measurements) - Tool: PURPLE, which uses these inputs to compute sample purity (percentage of tumor cells) and overall ploidy (average copy number) -#### HRD Score: +#### HRD Score - Data Source: HRD analysis output file (${tumor_id}.hrdscore.tsv) - Tool: DRAGEN -#### MSI (Microsatellite Instability): +#### MSI (Microsatellite Instability) - Data Source: Indels in microsatellite regions from SNV/CNV - Tool: PURPLE -#### Structural Variant Metrics: +#### Structural Variant Metrics - Data Source: GRIDSS/GRIPSS SV VCF and PURPLE CNV segmentation - Tools: GRIDSS/GRIPSS and PURPLE -#### Copy Number Metrics (Segments, Deleted Genes, etc.): +#### Copy Number Metrics (Segments, Deleted Genes, etc.) - Data Source: PURPLE CNV outputs (segmentation files, gene-level CNV TSV) - Tool: PURPLE @@ -571,7 +587,7 @@ DRAGEN HRD Score --- -# FAQ +## FAQ ### Q: Do we use PCGR for the rescue of sage? 
From 16ba1dc4a05c14e2344ad94edd73e907793f9da2 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Wed, 9 Apr 2025 15:55:29 +1000 Subject: [PATCH 26/36] fix table --- docs/details.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/docs/details.md b/docs/details.md index 8cece356..18e78293 100644 --- a/docs/details.md +++ b/docs/details.md @@ -12,8 +12,6 @@ - [Common Reports](#common-reports) - [sash Module Outputs](#sash-module-outputs) - [FAQ](#faq) -- [Key Metrics](#key-metrics) -- [CPSR Report](#cpsr-report) ## Overview From f491a4fe514d32b71cf8886c5370a1f722208ab0 Mon Sep 17 00:00:00 2001 From: qclayssen Date: Tue, 15 Apr 2025 11:55:13 +1000 Subject: [PATCH 27/36] fix filrer and info --- docs/details.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/details.md b/docs/details.md index 18e78293..7f29d8b8 100644 --- a/docs/details.md +++ b/docs/details.md @@ -195,7 +195,7 @@ The Annotation consists of three step processes, employs Reference Sources (GA4G 6. Run PCGR to annotate VCF against external sources - Use PCGR to annotate the VCF - Classify variants by tiers based on annotations and functional impact according to AMP/ASCO/CAP guidelines. - - Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `INTOGEN_DRIVER_MUT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. + - Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. - External sources used during this step include VEP, ClinVar, COSMIC, TCGA, ICGC, Open Targets Platform, CancerMine, DoCM, CBMDB, DisGeNET, Cancer Hotspots, dbNSFP, UniProt/SwissProt, Pfam, DGIdb, and ChEMBL. 7. Transfer PCGR annotations to the full set of variants - merge the PCGR annotations back into the original VCF file. @@ -218,7 +218,7 @@ The Filter step applies a series of stringent filters to somatic variant calls i #### Filters -1. 
Technical Quality Filters +variant that do not meet this criteria will not be considered unless [Clinical Significance execeptions](#2-clinical-significance-execeptions) | **Filter Type** | **Threshold/Criteria** | |-------------------------------------------|------------------------------------------------| From e64df39f38cf68f3cd975ad81f60e879f6d2e68c Mon Sep 17 00:00:00 2001 From: qclayssen Date: Tue, 15 Apr 2025 11:56:45 +1000 Subject: [PATCH 28/36] linting --- docs/details.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/details.md b/docs/details.md index 7f29d8b8..f8348810 100644 --- a/docs/details.md +++ b/docs/details.md @@ -232,7 +232,7 @@ variant that do not meet this criteria will not be considered unless [Clinical S | **TCGA Pan-cancer Count Filter** | TCGA count < 5 | | **ICGC PCAWG Count Filter** | ICGC count < 5 | -### 2. Clinical Significance execeptions +#### Clinical Significance execeptions | Exception Category | Criteria | |------------------------------|-------------------------------------------------------------------------------------------------------------------| From 85e58f35a59ef4d344bcc4426e09464a92ca571f Mon Sep 17 00:00:00 2001 From: qclayssen Date: Tue, 15 Apr 2025 12:00:14 +1000 Subject: [PATCH 29/36] typo --- docs/details.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/details.md b/docs/details.md index f8348810..fd23cedb 100644 --- a/docs/details.md +++ b/docs/details.md @@ -218,7 +218,7 @@ The Filter step applies a series of stringent filters to somatic variant calls i #### Filters -variant that do not meet this criteria will not be considered unless [Clinical Significance execeptions](#2-clinical-significance-execeptions) +variant that do not meet this criteria will not be considered unless [Clinical Significance Exceptions](#2-clinical-significance-Exceptions) | **Filter Type** | **Threshold/Criteria** | 
|-------------------------------------------|------------------------------------------------| @@ -232,7 +232,7 @@ variant that do not meet this criteria will not be considered unless [Clinical S | **TCGA Pan-cancer Count Filter** | TCGA count < 5 | | **ICGC PCAWG Count Filter** | ICGC count < 5 | -#### Clinical Significance execeptions +#### Clinical Significance Exceptions | Exception Category | Criteria | |------------------------------|-------------------------------------------------------------------------------------------------------------------| From 759e9f066db07f67c9dc0a0742701c172f5e446f Mon Sep 17 00:00:00 2001 From: qclayssen Date: Tue, 15 Apr 2025 17:08:45 +1000 Subject: [PATCH 30/36] correct filter --- docs/details.md | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/details.md b/docs/details.md index fd23cedb..06300084 100644 --- a/docs/details.md +++ b/docs/details.md @@ -188,19 +188,18 @@ The Annotation consists of three step processes, employs Reference Sources (GA4G - Add the `AD` FORMAT field: - `AD`: Allelic depths for the reference and alternate alleles. 5. Prepare VCF for PCGR annotation - - Make minimal VCF header keeping on INFO AF/DP, and contigs size . + - Make minimal VCF header keeping INFO AF/DP and contigs size. - Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR. - Set `FILTER` to `PASS` and remove all `FORMAT` and sample columns. - 6. Run PCGR to annotate VCF against external sources - - Use PCGR to annotate the VCF + - Use PCGR to annotate the VCF. - Classify variants by tiers based on annotations and functional impact according to AMP/ASCO/CAP guidelines. - - Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. 
- - External sources used during this step include VEP, ClinVar, COSMIC, TCGA, ICGC, Open Targets Platform, CancerMine, DoCM, CBMDB, DisGeNET, Cancer Hotspots, dbNSFP, UniProt/SwissProt, Pfam, DGIdb, and ChEMBL. + - Add the following INFO fields to the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. + - (External sources include VEP, ClinVar, COSMIC, TCGA, ICGC, Open Targets Platform, CancerMine, DoCM, CBMDB, DisGeNET, Cancer Hotspots, dbNSFP, UniProt/SwissProt, Pfam, DGIdb, and ChEMBL.) 7. Transfer PCGR annotations to the full set of variants - - merge the PCGR annotations back into the original VCF file. - - Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available. - - Preserve the `FILTER` statuses and other annotations from the original VCF. + - Merge the PCGR annotations back into the original VCF. + - Ensure that all variants—including those not selected for PCGR annotation—have their relevant clinical annotations. + - Preserve the original FILTER statuses and other annotations. 
### Filter @@ -214,11 +213,11 @@ The Filter step applies a series of stringent filters to somatic variant calls i #### Output - Filter VCF - - `${tumor_id}\*filters_set.vcf.gz` + - `${tumor_id}*filters_set.vcf.gz` #### Filters -variant that do not meet this criteria will not be considered unless [Clinical Significance Exceptions](#2-clinical-significance-Exceptions) +variants that do not meet these criteria will not be considered unless [Clinical Significance Exceptions](#2-clinical-significance-Exceptions) | **Filter Type** | **Threshold/Criteria** | |-------------------------------------------|------------------------------------------------| @@ -228,9 +227,7 @@ variant that do not meet this criteria will not be considered unless [Clinical S | **Problematic Genomic Regions Filter** | Overlap with ENCODE blacklist, bad promoter, or low-complexity regions | | **Population Frequency (gnomAD) Filter** | gnomAD AF ≥ 1% (0.01) | | **Panel of Normals (PoN) Germline Filter** | Present in ≥ 5 normal samples or PoN AF > 20% (0.20) | -| **COSMIC Database Hit Count Filter** | COSMIC count < 10 | -| **TCGA Pan-cancer Count Filter** | TCGA count < 5 | -| **ICGC PCAWG Count Filter** | ICGC count < 5 | + #### Clinical Significance Exceptions @@ -240,6 +237,9 @@ variant that do not meet this criteria will not be considered unless [Clinical S | **ClinVar Pathogenicity** | ClinVar classification of `conflicting_interpretations_of_pathogenicity`, `likely_pathogenic`, `pathogenic`, or `uncertain_significance` | | **Mutation Hotspots** | Annotated as `HMF_HOTSPOT`, `PCGR_MUTATION_HOTSPOT` and SAGE Hotspots(CGI, CIViC, OncoKB) | | **PCGR Tier Exception** | Classified as `TIER_1` OR `TIER_2` | +| **COSMIC Database Hit Count Filter** | COSMIC count >= 10 | +| **TCGA Pan-cancer Count Filter** | TCGA count >= 5 | +| **ICGC PCAWG Count Filter** | ICGC count >= 5 | ### Reports From e607ea7045325434db13a255864a745201964b0a Mon Sep 17 00:00:00 2001 From: qclayssen Date: Wed, 16 Apr 2025 
13:29:25 +1000 Subject: [PATCH 31/36] linting --- docs/details.md | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/docs/details.md b/docs/details.md index 06300084..1efcc487 100644 --- a/docs/details.md +++ b/docs/details.md @@ -188,18 +188,19 @@ The Annotation consists of three step processes, employs Reference Sources (GA4G - Add the `AD` FORMAT field: - `AD`: Allelic depths for the reference and alternate alleles. 5. Prepare VCF for PCGR annotation - - Make minimal VCF header keeping INFO AF/DP and contigs size. + - Make minimal VCF header keeping on INFO AF/DP, and contigs size . - Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR. - Set `FILTER` to `PASS` and remove all `FORMAT` and sample columns. + 6. Run PCGR to annotate VCF against external sources - - Use PCGR to annotate the VCF. + - Use PCGR to annotate the VCF - Classify variants by tiers based on annotations and functional impact according to AMP/ASCO/CAP guidelines. - - Add the following INFO fields to the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. - - (External sources include VEP, ClinVar, COSMIC, TCGA, ICGC, Open Targets Platform, CancerMine, DoCM, CBMDB, DisGeNET, Cancer Hotspots, dbNSFP, UniProt/SwissProt, Pfam, DGIdb, and ChEMBL.) + - Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. + - External sources used during this step include VEP, ClinVar, COSMIC, TCGA, ICGC, Open Targets Platform, CancerMine, DoCM, CBMDB, DisGeNET, Cancer Hotspots, dbNSFP, UniProt/SwissProt, Pfam, DGIdb, and ChEMBL. 7. Transfer PCGR annotations to the full set of variants - - Merge the PCGR annotations back into the original VCF. 
- - Ensure that all variants—including those not selected for PCGR annotation—have their relevant clinical annotations. - - Preserve the original FILTER statuses and other annotations. + - merge the PCGR annotations back into the original VCF file. + - Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available. + - Preserve the `FILTER` statuses and other annotations from the original VCF. ### Filter @@ -213,11 +214,11 @@ The Filter step applies a series of stringent filters to somatic variant calls i #### Output - Filter VCF - - `${tumor_id}*filters_set.vcf.gz` + - `${tumor_id}\*filters_set.vcf.gz` #### Filters -variants that do not meet these criteria will not be considered unless [Clinical Significance Exceptions](#2-clinical-significance-Exceptions) +variant that do not meet this criteria will not be considered unless [Clinical Significance Exceptions](#2-clinical-significance-Exceptions) | **Filter Type** | **Threshold/Criteria** | |-------------------------------------------|------------------------------------------------| From f4f6f25f91a755be31658b08a86d943821e20ebc Mon Sep 17 00:00:00 2001 From: qclayssen Date: Tue, 29 Apr 2025 11:03:51 +1000 Subject: [PATCH 32/36] linting and improve writing --- docs/details.md | 515 ++++++++++++++++++++++-------------------------- 1 file changed, 238 insertions(+), 277 deletions(-) diff --git a/docs/details.md b/docs/details.md index 1efcc487..ea8e1457 100644 --- a/docs/details.md +++ b/docs/details.md @@ -11,6 +11,8 @@ - [Germline Small Variants](#germline-small-variants) - [Common Reports](#common-reports) - [sash Module Outputs](#sash-module-outputs) +- [Coverage](#coverage) +- [Reference Data](#reference-data) - [FAQ](#faq) ## Overview @@ -23,77 +25,74 @@ The sash Workflow is a genomic analysis framework comprising three primary pipel - Somatic Structural Variants (SV somatic): Identifies large-scale genomic alterations (deletions, duplications, 
etc.) and integrates copy number data. - Germline Variants (SNV germline): Focuses on inherited variants linked to cancer predisposition. -These pipelines utilise Bolt, a Python package designed for modular processing, and leverage outputs from the [DRAGEN](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) Variant Caller alongside and the Hartwig Medical Foundation WiGiTS toolkit (via [Oncoanalyser](https://github.com/nf-core/oncoanalyser)) [HMFtools WiGiTs](https://github.com/hartwigmedical/hmftools/tree/master) in Oncoanalyser. Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation. +These pipelines utilize Bolt (a Python package designed for modular processing) and leverage outputs from the [DRAGEN](https://sapac.illumina.com/products/by-type/informatics-products/dragen-secondary-analysis.html) Variant Caller alongside the [Hartwig Medical Foundation (HMF) tools](https://github.com/hartwigmedical/hmftools/tree/master) integrated via [Oncoanalyser](https://github.com/nf-core/oncoanalyser). Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation and HTML reports for research and curation. --- -## HMFtools WiGiTs +## HMFtools -HMFtools WiGiTS is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. Key components used in Oncoanalyser, by sash include: +HMFtools is an open-source suite for cancer genomics developed by the Hartwig Medical Foundation. Key components used in sash include: - [SAGE (Somatic Alterations in Genome)](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md): - A tiered SNV/indel caller targeting ~10,000 cancer hotspots ([Cancer Genome Interpreter](https://www.cancergenomeinterpreter.org/home), [CIViC](http://civic.genome.wustl.edu/), [OncoKB](https://oncokb.org/)) to recover low-frequency variants missed by DRAGEN. 
Outputs a VCF with confidence tiers (hotspot, panel, high/low confidence). + A tiered SNV/indel caller targeting cancer hotspots from databases including [Cancer Genome Interpreter](https://www.cancergenomeinterpreter.org/home), [CIViC](http://civic.genome.wustl.edu/), and [OncoKB](https://oncokb.org/) to recover low-frequency variants missed by DRAGEN. Outputs a VCF with confidence tiers (hotspot, panel, high/low confidence). - [PURPLE](https://github.com/hartwigmedical/hmftools/tree/master/purple): Estimates tumor purity (tumor cell fraction) and ploidy (average copy number), integrates copy number data, and calculates TMB (tumor mutation burden) and MSI (microsatellite instability). - [Cobalt](https://github.com/hartwigmedical/hmftools/blob/master/cobalt/README.md): -Cobalt calculates read-depth ratios from sequencing data, providing essential input for copy number analysis. Its outputs are used by PURPLE to generate accurate copy number profiles across the genome. + Calculates read-depth ratios from sequencing data, providing essential input for copy number analysis. Its outputs are used by PURPLE to generate accurate copy number profiles across the genome. - [Amber](https://github.com/hartwigmedical/hmftools/blob/master/amber/README.md): -Amber computes B-allele frequencies, which are critical for estimating tumor purity and ploidy. The Amber directory contains these measurements, supporting PURPLE's re-call analysis. + Computes B-allele frequencies, which are critical for estimating tumor purity and ploidy. The Amber directory contains these measurements, supporting PURPLE's analysis. --- ## Other Tools ### [SIGRAP](https://github.com/umccr/sigrap) +A framework for running PCGR and other genomic reporting tools. 
-#### [Personal Cancer Genome Reporter (PCGR)](https://github.com/sigven/pcgr/tree/v1.4.1) +### [Personal Cancer Genome Reporter (PCGR)](https://github.com/sigven/pcgr/tree/v1.4.1) +Tool for comprehensive clinical interpretation of somatic variants, providing tiered classifications and extensive annotation. -#### [Cancer Predisposition Sequencing Report (CPSR)](https://github.com/sigven/cpsr) +### [Cancer Predisposition Sequencing Report (CPSR)](https://github.com/sigven/cpsr) +Tool for predisposition variant analysis and reporting in germline samples. -### [Genomics Platform Group Reporting(GPGR)](https://github.com/umccr/gpgr) for Cancer Report +### [Genomics Platform Group Reporting (GPGR)](https://github.com/umccr/gpgr) +UMCCR-developed R package for generating cancer genomics reports. -### [Linx](https://github.com/umccr/linxreport) +### [Linx](https://github.com/hartwigmedical/hmftools/tree/master/linx) +Tool for structural variant annotation and visualization to classify complex rearrangements. ### [GRIDSS/GRIPSS](https://github.com/PapenfussLab/gridss) +Structural variant caller (GRIDSS) and accompanying filtering tool (GRIPSS) for high-confidence SV detection. -### [Virusbraken](https://github.com/PapenfussLab/gridss/blob/master/VIRUSBreakend_Readme.md) +### [VIRUSBreakend](https://github.com/PapenfussLab/gridss/blob/master/VIRUSBreakend_Readme.md) +Tool for detecting viral integration events in human genome sequencing data. --- ## Pipeline Inputs -### Dragen - -`{tumor_id}.hard-filtered.vcf.gz` +### DRAGEN +- `{tumor_id}.hard-filtered.vcf.gz`: Somatic variant calls from DRAGEN pipeline. ### Oncoanalyser #### [GRIDSS/GRIPSS](https://github.com/PapenfussLab/gridss) - -`{tumor_id}.gridss.vcf.gz` -Description: This VCF contains structural variant calls produced by GRIDSS2. +- `{tumor_id}.gridss.vcf.gz`: VCF containing structural variant calls produced by GRIDSS2. 
#### [SAGE](https://github.com/hartwigmedical/hmftools/blob/master/sage/README.md) +- `{tumor_id}.sage.somatic.vcf.gz`: Somatic SNV/indel calls from SAGE. -`{tumor_id}.sage.somatic.vcf.gz` - -#### [Virusbraken](https://github.com/PapenfussLab/gridss/blob/master/VIRUSBreakend_Readme.md) - -- Directory: `virusbreakend/` -- Description:Contains outputs from Virusbraken, used for detecting viral integration events. +#### [VIRUSBreakend](https://github.com/PapenfussLab/gridss/blob/master/VIRUSBreakend_Readme.md) +- Directory: `virusbreakend/`: Contains outputs from VIRUSBreakend, used for detecting viral integration events. #### [Cobalt](https://github.com/hartwigmedical/hmftools/blob/master/cobalt/README.md) - -- Directory: `cobalt/` -- Description: Contains read-depth ratio data required for copy number analysis by PURPLE. +- Directory: `cobalt/`: Contains read-depth ratio data required for copy number analysis by PURPLE. #### [Amber](https://github.com/hartwigmedical/hmftools/blob/master/amber/README.md) - -- Directory: `amber/` -- Description: Contains B-allele frequency measurements used by PURPLE to estimate tumor purity and ploidy. +- Directory: `amber/`: Contains B-allele frequency measurements used by PURPLE to estimate tumor purity and ploidy. --- @@ -102,51 +101,46 @@ Description: This VCF contains structural variant calls produced by GRIDSS2. ### Somatic Small Variants #### General - -In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser that is relying on Somatic Alterations in Genome [SAGE](https://github.com/hartwigmedical/hmftools/tree/master/sage), and [Purple](https://github.com/hartwigmedical/hmftools/tree/master/purple) outputs. It’s structured into four steps: Re-callings, Annotation, Filter, and Report. The final outputs include an HTML report summarising the results. 
+In the Somatic Small Variants workflow, variant detection is performed using the DRAGEN Variant Caller and Oncoanalyser (relying on SAGE and PURPLE outputs). It's structured into four steps: Re-calling, Annotation, Filter, and Report. The final outputs include an HTML report summarizing the results. #### Summary - -1. re-callings SAGE variants to recover low-frequency mutations in hotspots. +1. Re-calling SAGE variants to recover low-frequency mutations in hotspots. 2. Annotate variants with clinical and functional information using PCGR. -3. Filter variants based on quality and frequency criteria (allele frequency, read depth, population frequency), while retaining those of potential clinical significance (hotspots, high-impact, etc.). -4. Report final annotated variants in a comprehensive HTML report(PCGR, CANCER REPORT, LINX, MultiQC) +3. Filter variants based on quality and frequency criteria, while retaining those of potential clinical significance. +4. Generate comprehensive HTML reports (PCGR, Cancer Report, LINX, MultiQC). -### Variant Calling re-callings +### Variant Calling Re-calling -The variant calling re-callings step use variants from the **Somatic Alterations in Genome** ([SAGE](https://github.com/hartwigmedical/hmftools/tree/sage-v1.0/sage)) variant caller tool, which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency that might have been missed, or filtered out. SAGE focuses on targets known cancer hotspots prioritising predefined genomic regions of high clinical or biological relevance with its [filter](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the re-callings calling of biologically significant variants in a VCF that may have been missed otherwise. 
+The variant calling re-calling step uses variants from [SAGE](https://github.com/hartwigmedical/hmftools/tree/master/sage), which is more sensitive than DRAGEN in detecting variants, particularly those with low allele frequency. SAGE focuses on cancer hotspots, prioritizing predefined genomic regions of high clinical or biological relevance with its [filtering system](https://github.com/hartwigmedical/hmftools/tree/master/sage#6-soft-filters). This enables the re-calling of biologically significant variants that may have been missed otherwise. #### Inputs - -- From DRAGEN: somatic small variant caller VCF +- From DRAGEN: Somatic small variant caller VCF - `${tumor_id}.main.dragen.vcf.gz` -- From oncoanalyser: SAGE VCF +- From Oncoanalyser: SAGE VCF - `${tumor_id}.main.sage.filtered.vcf.gz` + + Filtered on chromosomes 1-22, X, Y, and M. - Filter on chr 1..22 and chr X,Y,M - -##### Output - -- Rescue: VCF +#### Output +- Re-calling: VCF - `${tumor_id}.rescued.vcf.gz` #### Steps - -1. Select High-Confidence SAGE Calls in Hotspot Regions to ensure only high-confidence variants in clinically relevant regions are considered: +1. Select High-Confidence SAGE Calls in Hotspot Regions: - Filter the SAGE output to retain only variants that pass quality filters and overlap with known hotspot regions. - Compare the input VCF and the SAGE VCF to identify overlapping and unique variants. -2. Annotate existing somatic variant calls also present in the SAGE calls in the input VCF - - For each variant in the input VCF, check if it exists in the SAGE existing calls. - - For variants integrateed by SAGE: - - If `SAGE FILTER=PASS` and input VCF `FILTER=PASS`: - - Set `INFO/SAGE_HOTSPOT` to indicate the variant is called by SAGE in a hotspot. - - If `SAGE FILTER=PASS` and input VCF `FILTER` is not `PASS`: - - Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is re-call from SAGE. - - Update `FILTER=PASS` to include the variant in the final analysis. 
- - If `SAGE FILTER` is not `PASS`: - - Append `SAGE_lowconf` to the `FILTER` field to flag low-confidence variants. - - Transfer SAGE `FORMAT` fields to the input VCF with a `SAGE_` prefix -3. Combine annotated input VCF with novel SAGE calls +2. Annotate existing somatic variant calls also present in the SAGE calls in the input VCF: + - For each variant in the input VCF, check if it exists in the SAGE existing calls. + - For variants integrated by SAGE: + - If `SAGE FILTER=PASS` and input VCF `FILTER=PASS`: + - Set `INFO/SAGE_HOTSPOT` to indicate the variant is called by SAGE in a hotspot. + - If `SAGE FILTER=PASS` and input VCF `FILTER` is not `PASS`: + - Set `INFO/SAGE_HOTSPOT` and `INFO/SAGE_RESCUE` to indicate the variant is re-called from SAGE. + - Update `FILTER=PASS` to include the variant in the final analysis. + - If `SAGE FILTER` is not `PASS`: + - Append `SAGE_lowconf` to the `FILTER` field to flag low-confidence variants. + - Transfer SAGE `FORMAT` fields to the input VCF with a `SAGE_` prefix. +3. Combine annotated input VCF with novel SAGE calls: - Prepare novel SAGE calls. For each variant in the SAGE VCF missing from the input VCF: - Rename certain `FORMAT` fields in the novel SAGE VCF to avoid namespace collisions: - For example, `FORMAT/SB` is renamed to `FORMAT/SAGE_SB`. @@ -154,51 +148,46 @@ The variant calling re-callings step use variants from the **Somatic Alterations ### Annotation -The Annotation consists of three step processes, employs Reference Sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots), UMCCR panel of normals and the PCGR tool to enrich variants with [classification](https://sigven.github.io/pcgr/articles/variant_classification.html) and clinical information. 
-**These annotations are used to decide which variants are retained or filtered in the next step** - -##### Inputs +The Annotation process employs Reference Sources (GA4GH/GIAB problem region stratifications, GIAB high confidence regions, gnomAD, Hartwig hotspots), UMCCR panel of normals (built from approximately 200 normal samples), and the PCGR tool to enrich variants with [classification](https://sigven.github.io/pcgr/articles/variant_classification.html) and clinical information. +**These annotations are used to decide which variants are retained or filtered in the next step.** +#### Inputs - Small variant VCF - `${tumor_id}.rescued.vcf.gz` -##### Output - +#### Output - Annotated VCF - - `${tumor_id}.annotations.vcf.g` + - `${tumor_id}.annotations.vcf.gz` #### Steps - -1. Set FILTER to "PASS" for unfiltered variants - - Iterate over the input VCF file the `FILTER` field to `PASS` for any variants that currently have no filter status (`FILTER` is `.` or `None`). This standardization is necessary for downstream tools. -2. Annotate the VCF against reference sources +1. Set FILTER to "PASS" for unfiltered variants: + - Iterate over the input VCF file and set the `FILTER` field to `PASS` for any variants that currently have no filter status (`FILTER` is `.` or `None`). +2. Annotate the VCF against reference sources: - Use vcfanno to add annotations to the VCF file: - - gnomAD + - gnomAD (version 2.1) - Hartwig Hotspots - ENCODE Blacklist - - Genome in a Bottle High-Confidence Regions: Mark high-confidence regions from the Genome in a Bottle benchmark. - - Low and High GC Regions: Mark regions with \30% or \65% GC content, compiled by GA4GH. - - Bad Promoter Regions: Annotate regions with poor coverage, compiled by GA4GH. -3. Annotate with UMCCR panel of normals counts - - Use vcfanno and bcftools to annotate the VCF with counts from the UMCCR panel of normals, built from tumor-only Mutect2 calls from approximately 200 normal samples. 
This helps identify and filter out recurrent sequencing artifacts or germline variants. -4. Standardize the VCF fields + - Genome in a Bottle High-Confidence Regions (v4.2.1) + - Low and High GC Regions (< 30% or > 65% GC content, compiled by GA4GH) + - Bad Promoter Regions (compiled by GA4GH) +3. Annotate with UMCCR panel of normals counts: + - Use vcfanno and bcftools to annotate the VCF with counts from the UMCCR panel of normals. +4. Standardize the VCF fields: - Add new `INFO` fields for use with PCGR: - - `TUMOR_AF`, `NORMAL_AF`: Tumor and normal allele frequencies. - - `TUMOR_DP`, `NORMAL_DP`: Tumor and normal read depths. + - `TUMOR_AF`, `NORMAL_AF`: Tumor and normal allele frequencies. + - `TUMOR_DP`, `NORMAL_DP`: Tumor and normal read depths. - Add the `AD` FORMAT field: - - `AD`: Allelic depths for the reference and alternate alleles. -5. Prepare VCF for PCGR annotation - - Make minimal VCF header keeping on INFO AF/DP, and contigs size . + - `AD`: Allelic depths for the reference and alternate alleles. +5. Prepare VCF for PCGR annotation: + - Make minimal VCF header keeping only INFO AF/DP, and contigs size. - Move tumor and normal `FORMAT/AF` and `FORMAT/DP` annotations to the `INFO` field as required by PCGR. - Set `FILTER` to `PASS` and remove all `FORMAT` and sample columns. - -6. Run PCGR to annotate VCF against external sources - - Use PCGR to annotate the VCF +6. Run PCGR (v1.4.1) to annotate VCF against external sources: - Classify variants by tiers based on annotations and functional impact according to AMP/ASCO/CAP guidelines. - Add `INFO` fields into the VCF: `TIER`, `SYMBOL`, `CONSEQUENCE`, `MUTATION_HOTSPOT`, `TCGA_PANCANCER_COUNT`, `CLINVAR_CLNSIG`, `ICGC_PCAWG_HITS`, `COSMIC_CNT`. - - External sources used during this step include VEP, ClinVar, COSMIC, TCGA, ICGC, Open Targets Platform, CancerMine, DoCM, CBMDB, DisGeNET, Cancer Hotspots, dbNSFP, UniProt/SwissProt, Pfam, DGIdb, and ChEMBL. -7. 
Transfer PCGR annotations to the full set of variants
-    - merge the PCGR annotations back into the original VCF file.
+7. Transfer PCGR annotations to the full set of variants:
+    - Merge the PCGR annotations back into the original VCF file.
     - Ensure that all variants, including those not selected for PCGR annotation, have relevant clinical annotations where available.
     - Preserve the `FILTER` statuses and other annotations from the original VCF.
@@ -207,190 +196,168 @@ The Annotation consists of three step processes, employs Reference Sources (GA4G
 The Filter step applies a series of stringent filters to somatic variant calls in the VCF file, ensuring the retention of high-confidence and biologically meaningful variants.
 #### Inputs
-
 - Annotated VCF
   - `${tumor_id}.annotations.vcf.gz`
 #### Output
-
-- Filter VCF
-  - `${tumor_id}\*filters_set.vcf.gz`
+- Filtered VCF
+  - `${tumor_id}*filters_set.vcf.gz`
 #### Filters
-variant that do not meet this criteria will not be considered unless [Clinical Significance Exceptions](#2-clinical-significance-Exceptions)
+Variants that do not meet these criteria will be filtered out unless they qualify for [Clinical Significance Exceptions](#clinical-significance-exceptions):
 | **Filter Type** | **Threshold/Criteria** |
 |-------------------------------------------|------------------------------------------------|
-| **Allele Frequency (AF) Filter** | Tumor AF < 10% (0.10) |
+| **Allele Frequency (AF) Filter** | Tumor AF < 10% (0.10) |
 | **Allele Depth (AD) Filter** | Fewer than 4 supporting reads (6 in low-complexity regions) |
-| **Non-GIAB AD Filter** | Stricter thresholds outside GIAB high-confidence regions |
+| **Non-GIAB AD Filter** | Stricter thresholds outside GIAB high-confidence regions |
 | **Problematic Genomic Regions Filter** | Overlap with ENCODE blacklist, bad promoter, or low-complexity regions |
-| **Population Frequency (gnomAD) Filter** | gnomAD AF ≥ 1% (0.01) |
-| **Panel of Normals (PoN) Germline Filter** | Present in ≥ 5 normal samples or PoN AF > 20% (0.20) |
-
+| **Population Frequency (gnomAD) Filter** | gnomAD AF ≥ 1% (0.01) |
+| **Panel of Normals (PoN) Germline Filter**| Present in ≥ 5 normal samples or PoN AF > 20% (0.20) |
 #### Clinical Significance Exceptions
-| Exception Category | Criteria |
-|------------------------------|-------------------------------------------------------------------------------------------------------------------|
-| **Reference Database Hit Count** | COSMIC count ≥10 OR TCGA pan-cancer count ≥5 OR ICGC PCAWG count ≥5 |
+| Exception Category | Criteria |
+|-----------------------------------|-------------------------------------------------------------------------|
+| **Reference Database Hit Count** | COSMIC count ≥10 OR TCGA pan-cancer count ≥5 OR ICGC PCAWG count ≥5 |
 | **ClinVar Pathogenicity** | ClinVar classification of `conflicting_interpretations_of_pathogenicity`, `likely_pathogenic`, `pathogenic`, or `uncertain_significance` |
-| **Mutation Hotspots** | Annotated as `HMF_HOTSPOT`, `PCGR_MUTATION_HOTSPOT` and SAGE Hotspots(CGI, CIViC, OncoKB) |
-| **PCGR Tier Exception** | Classified as `TIER_1` OR `TIER_2` |
-| **COSMIC Database Hit Count Filter** | COSMIC count >= 10 |
-| **TCGA Pan-cancer Count Filter** | TCGA count >= 5 |
-| **ICGC PCAWG Count Filter** | ICGC count >= 5 |
+| **Mutation Hotspots** | Annotated as `HMF_HOTSPOT`, `PCGR_MUTATION_HOTSPOT`, or SAGE Hotspots (CGI, CIViC, OncoKB) |
+| **PCGR Tier Exception** | Classified as `TIER_1` OR `TIER_2` |
 ### Reports
-The Report step utilises the Personal Cancer Genome Reporter (PCGR)
+The Report step utilizes the Personal Cancer Genome Reporter (PCGR) and other tools to generate comprehensive reports.
#### Inputs - -- Purple purity -- Filterd VCF - - `${tumor_id}\*filters_set.vcf.gz` -- Dragen VCF +- Purple purity data +- Filtered VCF + - `${tumor_id}*filters_set.vcf.gz` +- DRAGEN VCF - `${tumor_id}.main.dragen.vcf.gz` #### Output - - PCGR Cancer report - `${tumor_id}.pcgr_acmg.grch38.html` #### Steps - 1. Generate BCFtools Statistics on the Input VCF: - The code runs a helper function (`bcftools_stats_prepare`) to create a modified version of the input VCF, adjusting quality scores so that `bcftools stats` can produce more meaningful outputs. It then executes `bcftools stats` to gather statistics on variant quality and distribution, storing the results in a text file. + - Run `bcftools stats` to gather statistics on variant quality and distribution. 2. Calculate Allele Frequency Distributions: - The `allele_frequencies` function uses external tools (bcftools, bedtools) to: - Filter and normalize variants according to high-confidence regions. - Extract allele frequency data from tumor samples. - Produce both a global allele frequency summary and a subset of allele frequencies restricted to key cancer genes. -3. Compare Variant Counts From Two Variant Sets (DRAGEN vs. BOLT) - - The code counts the total number and types of variants (SNPs, Indels, Others) passing filters in both a DRAGEN VCF and the FILTER BOLT VCF. -4. Count Variants by Processing Stage -5. Parse Purity and Ploidy Information (Purple Data) -6. Run PCGR +3. Compare Variant Counts From Two Variant Sets (DRAGEN vs. BOLT): + - Count the total number and types of variants (SNPs, Indels, Others) passing filters in both the DRAGEN VCF and the Filtered BOLT VCF. +4. Count Variants by Processing Stage. +5. Parse Purity and Ploidy Information (Purple Data). +6. Run PCGR to generate the final report. 
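The Filters and Clinical Significance Exceptions tables in the Filter section above can be read as a two-stage decision: collect every filter a variant fails, then keep the variant anyway if it meets any exception. The following is a minimal sketch of that reading, not the bolt implementation. The `Variant` class, its field names, and the filter labels (`min_AF`, `min_AD`, and so on) are hypothetical; only the numeric thresholds come from the tables. The Non-GIAB AD filter is left out because its thresholds are not stated, and low-complexity overlap is assumed to raise the read threshold only, since the tables do not say how the two low-complexity rules interact.

```python
from dataclasses import dataclass
from typing import List

# Thresholds copied from the Filters and Clinical Significance Exceptions tables.
MIN_TUMOR_AF = 0.10
MIN_ALT_READS = 4
MIN_ALT_READS_LOW_COMPLEXITY = 6
GNOMAD_AF_CUTOFF = 0.01
PON_SAMPLES_CUTOFF = 5
PON_AF_CUTOFF = 0.20
RESCUE_CLINVAR = {
    "conflicting_interpretations_of_pathogenicity",
    "likely_pathogenic",
    "pathogenic",
    "uncertain_significance",
}
RESCUE_TIERS = {"TIER_1", "TIER_2"}


@dataclass
class Variant:
    """Hypothetical flat view of one annotated somatic call (not a bolt type)."""
    tumor_af: float
    alt_reads: int
    gnomad_af: float = 0.0
    pon_samples: int = 0
    pon_af: float = 0.0
    low_complexity: bool = False
    problem_region: bool = False  # assumed: ENCODE blacklist or bad promoter overlap
    cosmic_count: int = 0
    tcga_count: int = 0
    icgc_count: int = 0
    clinvar: str = ""
    hotspot: bool = False
    tier: str = ""


def failed_filters(v: Variant) -> List[str]:
    """Return the (illustrative) names of every filter the variant fails."""
    failed = []
    if v.tumor_af < MIN_TUMOR_AF:
        failed.append("min_AF")
    min_reads = MIN_ALT_READS_LOW_COMPLEXITY if v.low_complexity else MIN_ALT_READS
    if v.alt_reads < min_reads:
        failed.append("min_AD")
    if v.problem_region:
        failed.append("problem_region")
    if v.gnomad_af >= GNOMAD_AF_CUTOFF:
        failed.append("gnomad_common")
    if v.pon_samples >= PON_SAMPLES_CUTOFF or v.pon_af > PON_AF_CUTOFF:
        failed.append("pon")
    return failed


def is_clinically_significant(v: Variant) -> bool:
    """True if any row of the exceptions table applies."""
    return (
        v.cosmic_count >= 10
        or v.tcga_count >= 5
        or v.icgc_count >= 5
        or v.clinvar in RESCUE_CLINVAR
        or v.hotspot
        or v.tier in RESCUE_TIERS
    )


def keep(v: Variant) -> bool:
    """Keep a variant if it passes every filter or qualifies for an exception."""
    return not failed_filters(v) or is_clinically_significant(v)
```

Returning the list of failed filters rather than a single boolean mirrors how a VCF `FILTER` column can carry several labels at once, which makes it easier to see why a rescued variant would otherwise have been dropped.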
### Somatic Structural Variants The Somatic Structural Variants (SVs) pipeline identifies and annotates large-scale genomic alterations, including deletions, duplications, inversions, insertions, and translocations in tumor samples. This step re-calls outputs from DRAGEN Variant Caller, GRIDSS2, using PURPLE applies filtering criteria, and prioritizes clinically significant structural variants. #### Summary - 1. GRIPSS filtering: - - GRIPSS filtering refines the structural variant calls from Oncoanalyser using read counts, panel-of-normals, known fusion hotspots, and repeat masker annotations data are the specific to umccr like known_fusions -2. PURPLE - - Combines the GRIPSS-filtered SV calls with copy number variation (CNV) data and tumor purity/ploidy estimates. PURPLE adjusts SV breakpoints based on copy number transitions and robustly classifies events as somatic versus germline. -3. Annotation - - Combines SV calls from GRIPSS with CNV data from PURPLE - - Annotate variant using [SnpEff](https://github.com/pcingola/SnpEff) -4. Prioritisation - - Prioritise SV annotation based on [AstraZeneca-NGS](https://github.com/AstraZeneca-NGS/simple_sv_annotation) using curated reference data including umccr panel genes, tumor suppressor gene lists, hartwig known fusion pairs, [appris](https://ngdc.cncb.ac.cn/databasecommons/database/id/323) - - Prioritise variants based on clinical relevance and support metric -5. Report - - Cancer report - - Multiqc + - Refines the structural variant calls using read counts, panel-of-normals, known fusion hotspots, and repeat masker annotations. +2. PURPLE: + - Combines the GRIPSS-filtered SV calls with copy number variation (CNV) data and tumor purity/ploidy estimates. +3. Annotation: + - Combines SV calls with CNV data and annotates using [SnpEff](https://github.com/pcingola/SnpEff). +4. 
Prioritization: + - Prioritizes SV annotations based on [AstraZeneca-NGS](https://github.com/AstraZeneca-NGS/simple_sv_annotation) using curated reference data. +5. Report: + - Generates cancer report and MultiQC output. #### Inputs - - GRIDSS2 - - ${tumor_id}.gridss.vcf.gz + - `${tumor_id}.gridss.vcf.gz` #### Steps - 1. GRIPSS filtering: - Evaluate split-read and paired-end support; discard variants with low support. - - Apply panel-of-normals filtering to remove artefacts observed in normal samples. + - Apply panel-of-normals filtering to remove artifacts observed in normal samples. - Retain variants overlapping known oncogenic fusion hotspots (using UMCCR-curated lists). - Exclude variants in repetitive regions based on Repeat Masker annotations. -2. Purple: +2. PURPLE: - Merge SV calls with CNV segmentation data. - Estimate tumor purity and ploidy. - Adjust SV breakpoints based on copy number transitions. - Classify SVs as somatic or germline. -3. Annotation +3. Annotation: - Compile SV and CNV information into a unified VCF file. - Extend the VCF header with PURPLE-related INFO fields (e.g., PURPLE_baf, PURPLE_copyNumber). - Convert CNV records from TSV format into VCF records with appropriate SVTYPE tags (e.g., 'DUP' for duplications, 'DEL' for deletions). - - Run snpEff to annotate the unified VCF with functional information such as gene names, transcript effects, and coding consequences. -4. Prioritization + - Run SnpEff to annotate the unified VCF with functional information such as gene names, transcript effects, and coding consequences. +4. Prioritization: - Run the prioritization module (forked from the AstraZeneca simple_sv_annotation tool) using reference data files including known fusion pairs, known fusion 5′ and 3′ lists, key genes, and key tumor suppressor genes. - Classify Variants: - Structural Variants (SVs): Variants labeled with the source `sv_gridss`. - Copy Number Variants (CNVs): Variants labeled with the source `cnv_purple`. -5. 
Prioritise variants on a 4 tier system using [prioritize_sv](https://github.com/umccr/vcf_stuff/blob/master/scripts/prioritize_sv.): - **1 (high)** - **2 (moderate)** - **3 (low)** - **4 (no interest)** - - exon loss - - on cancer gene list (1) - - other (2) - - gene fusion - - paired (hits two genes) - - on list of known pairs (1) (curated by [HMF](https://resources.hartwigmedicalfoundation.nl)) - - one gene is a known promiscuous fusion gene (1) (curated by [HMF](https://resources.hartwigmedicalfoundation.nl)) - - on list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2) - - other: - - one or two genes on cancer gene list (2) - - neither gene on cancer gene list (3) - - unpaired (hits one gene) - - on cancer gene list (2) - - others (3) - - upstream or downstream (a specific type of fusion, e.g. one gene is got into control of another gene's promoter and get over-expressed (oncogene or underexpressed (tsgene)) - - on cancer gene list genes (2) - - LoF or HIGH impact in a tumor suppressor - - on cancer gene list (2) - - other TS gene (3) - - other (4) +5. 
Prioritize variants on a 4-tier system using [prioritize_sv](https://github.com/umccr/vcf_stuff/blob/master/scripts/prioritize_sv.): + - **1 (high)** - **2 (moderate)** - **3 (low)** - **4 (no interest)** + - Exon loss: + - On cancer gene list (1) + - Other (2) + - Gene fusion: + - Paired (hits two genes): + - On list of known pairs (1) (curated by [HMF](https://resources.hartwigmedicalfoundation.nl)) + - One gene is a known promiscuous fusion gene (1) (curated by [HMF](https://resources.hartwigmedicalfoundation.nl)) + - On list of [FusionCatcher](https://github.com/ndaniel/fusioncatcher/blob/master/bin/generate_known.py) known pairs (2) + - Other: + - One or two genes on cancer gene list (2) + - Neither gene on cancer gene list (3) + - Unpaired (hits one gene): + - On cancer gene list (2) + - Others (3) + - Upstream or downstream: A specific type of fusion where one gene comes under the control of another gene's promoter, potentially leading to overexpression (oncogene) or underexpression (tumor suppressor gene): + - On cancer gene list genes (2) + - LoF or HIGH impact in a tumor suppressor: + - On cancer gene list (2) + - Other TS gene (3) + - Other (4) 6. Filter Low-Quality Calls: - Apply Quality Filters: - - Keep variants with sufficient read support (e.g., split reads (SR) ≥ 5 and paired reads (PR) ≥ 5). - - Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`. - - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1. - - Structural Variants (SVs): Variants labeled with the source sv_gridss. - - Copy Number Variants (CNVs): Variants labeled with the source cnv_purple. + - Apply Quality Filters: + - Keep variants with sufficient read support (e.g., split reads (SR) ≥ 5 and paired reads (PR) ≥ 5). + - Exclude Tier 3 and Tier 4 variants where `SR < 5` and `PR < 5`. + - Exclude Tier 3 and Tier 4 variants where `SR < 10`, `PR < 10`, and allele frequencies (`AF0` or `AF1`) are below 0.1. 7. 
Report: - - Generate MultiQC and cancer report outputs + - Generate MultiQC and cancer report outputs. ### Germline Small Variants Filtering Select passing variants in the given [gene panel transcript regions](https://github.com/umccr/gene_panels/tree/main/germline_panel) made with PMCC familial cancer clinic list then make CPSR report. #### Inputs - -- Dragen VCF +- DRAGEN VCF - `${normal_id}.hard-filtered.vcf.gz` #### Output - - CPSR report - `${normal_id}.cpsr.grch38.html` #### Steps - -1. Prepare - 1. Selection of Passing Variants: - 1. Raw germline variant calls (e.g. from DRAGEN or an ensemble caller) are filtered to retain only those variants marked as PASS (or with no filter flag) - 2. Selection of Gene Panel Variants: - 1. The filtered variants are then further restricted to regions defined by a gene panel transcript regions file. -2. Report: CPSR +1. Prepare: + - Selection of Passing Variants: + - Raw germline variant calls from DRAGEN are filtered to retain only those variants marked as PASS (or with no filter flag). + - Selection of Gene Panel Variants: + - The filtered variants are further restricted to regions defined by the [gene panel transcript regions file](https://github.com/umccr/gene_panels/tree/main/germline_panel), based on the PMCC familial cancer clinic list. +2. Report: + - Generate CPSR (Cancer Predisposition Sequencing Report) summarizing germline findings. 
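The two Prepare sub-steps of the germline workflow above (keep passing calls, then restrict to the gene panel) can be sketched as below. This is an illustration under stated assumptions rather than the pipeline's code: records are modelled as plain `(chrom, pos, filter)` tuples instead of parsed VCF lines, the panel regions are assumed to use BED-style 0-based, end-exclusive coordinates, and the function names are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

Record = Tuple[str, int, Optional[str]]  # (chrom, 1-based position, FILTER value)
Region = Tuple[str, int, int]            # (chrom, 0-based start, end-exclusive)


def is_passing(filter_value: Optional[str]) -> bool:
    """PASS, or no filter flag at all ('.', empty, or missing)."""
    return filter_value in (None, "", ".", "PASS")


def in_panel(chrom: str, pos: int, regions: Iterable[Region]) -> bool:
    """True if a 1-based position falls inside any BED-style region."""
    return any(c == chrom and start < pos <= end for c, start, end in regions)


def select_panel_variants(
    records: Iterable[Record], regions: Iterable[Region]
) -> List[Record]:
    """Apply both Prepare sub-steps: passing calls first, then panel regions."""
    regions = list(regions)  # allow the regions to be scanned once per record
    return [
        rec for rec in records
        if is_passing(rec[2]) and in_panel(rec[0], rec[1], regions)
    ]
```

A linear scan over the regions is enough for a small gene panel; a genome-wide region set would normally call for an interval index or a dedicated tool instead.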
 ---
 ## Common Reports
-### [Cancer report](https://umccr.github.io/gpgr/)
+### [Cancer Report](https://umccr.github.io/gpgr/)
 UMCCR cancer report containing:
-Tumor Mutation Burden (TMB):
-
+#### Tumor Mutation Burden (TMB)
 - Data Source: filtered somatic VCF
 - Tool: PURPLE
 #### Mutational Signatures
-
 - Data Source: filtered SNV/CNV VCF
 - Tool: MutationalPatterns R package (via PCGR)
@@ -400,27 +367,22 @@ Tumor Mutation Burden (TMB):
 - Note: No dedicated contamination metric is currently generated
 #### Purity & Ploidy
-
 - Data Source: COBALT (providing read-depth ratios) and AMBER (providing B-allele frequency measurements)
 - Tool: PURPLE, which uses these inputs to compute sample purity (percentage of tumor cells) and overall ploidy (average copy number)
 #### HRD Score
-
 - Data Source: HRD analysis output file (${tumor_id}.hrdscore.tsv)
 - Tool: DRAGEN
 #### MSI (Microsatellite Instability)
-
 - Data Source: Indels in microsatellite regions from SNV/CNV
 - Tool: PURPLE
 #### Structural Variant Metrics
-
 - Data Source: GRIDSS/GRIPSS SV VCF and PURPLE CNV segmentation
 - Tools: GRIDSS/GRIPSS and PURPLE
 #### Copy Number Metrics (Segments, Deleted Genes, etc.)
-
 - Data Source: PURPLE CNV outputs (segmentation files, gene-level CNV TSV)
 - Tool: PURPLE
@@ -446,7 +408,7 @@ DRAGEN-FastQC: Per-base sequence quality, per-sequence quality scores, GC conten
 ### PCGR
-Personal Cancer Genome Reporter (PCGR) tool to generate a comprehensive, interactive HTML report that consolidates filtered and annotated variant data, providing detailed insights into the somatic variants identified.
+Personal Cancer Genome Reporter (PCGR) tool generates a comprehensive, interactive HTML report that consolidates filtered and annotated variant data, providing detailed insights into the somatic variants identified.
 Key Metrics:
@@ -457,62 +419,71 @@ Key Metrics:
 - Microsatellite Instability (MSI) Status: Assessment of MSI status is performed, relevant for certain cancer types and treatment decisions.
 - Clinical Trials Information: Information on relevant clinical trials is incorporated, offering potential therapeutic options based on the identified variants.
-Note: The PCGR tool is designed to process a maximum of 500,000 variants. If the input VCF file contains more than this limit, variants exceeding 500,000 will be filtered ou
+Note: The PCGR tool is designed to process a maximum of 500,000 variants. If the input VCF file contains more than this limit, variants exceeding 500,000 will be filtered out.
 ### CPSR Report
 The CPSR (Cancer Predisposition Sequencing Report) includes the following:
 Settings:
-
 - Sample metadata
 - Report configuration
 - Virtual gene panel
 Summary of Findings:
-
 - Variant statistics
 Variant Classification:
-ClinVarc and Non-ClinVar
-
-- Class 5 \- Pathogenic variants
-- Class 4 \- Likely Pathogenic variants
-- Class 3 \- Variants of Uncertain Significance (VUS)
-- Class 2 \- Likely Benign variants
-- Class 1 \- Benign variants
+ClinVar and Non-ClinVar variants:
+- Class 5 - Pathogenic variants
+- Class 4 - Likely Pathogenic variants
+- Class 3 - Variants of Uncertain Significance (VUS)
+- Class 2 - Likely Benign variants
+- Class 1 - Benign variants
 - Biomarkers
 PCGR TIER according to [ACMG](https://www.ncbi.nlm.nih.gov/pubmed/27993330):
-
 - Tier 1 (High): Highest priority variants with strong clinical relevance.
 - Tier 2 (Moderate): Variants with potential clinical significance.
 - Tier 3 (Low): Variants with uncertain significance.
 - Tier 4 (No Interest): Variants unlikely to be clinically relevant.
 ---
-# Coverage
----
+## Coverage
+
+The sash workflow utilizes coverage metrics from DRAGEN to evaluate the sequencing quality and depth across target regions. Coverage analysis includes:
-# Reference data
+- Mean coverage across targeted genomic regions
+- Percentage of target regions covered at various depth thresholds (10X, 20X, 50X, 100X)
+- Coverage uniformity metrics
+- Gap analysis for regions with insufficient coverage
-### [UMCCR Genes panels](https://github.com/umccr/gene_panels)
+These metrics are integrated into the MultiQC report, providing a comprehensive overview of sequencing quality and coverage.
-### Genome annotations
+---
+
+## Reference Data
-WiGiTS (hmftools)
+### [UMCCR Gene Panels](https://github.com/umccr/gene_panels)
+Curated gene panels for specific analyses, including the germline cancer predisposition gene panel used in the Germline Small Variants workflow.
-Annotation Databases:
+### Genome Annotations
-- gnomAD: Provides population allele frequencies to help distinguish common variants from rare ones.
-- ClinVar: Offers clinically curated variant information, aiding in the interpretation of potential pathogenicity.
+#### HMFtools Reference Data
+- Ensembl reference data (GRCh38)
+- Somatic driver catalogs
+- Known fusion gene pairs
+- Driver gene panels
+
+#### Annotation Databases:
+- gnomAD (v2.1): Provides population allele frequencies to help distinguish common variants from rare ones.
+- ClinVar (20220103): Offers clinically curated variant information, aiding in the interpretation of potential pathogenicity.
 - COSMIC: Contains data on somatic mutations found in cancer, facilitating the identification of cancer-related variants.
 - Gene Panels: Focuses analysis on specific sets of genes relevant to particular conditions or research interests.
-Structural Variant Data:
-
+#### Structural Variant Data:
 - SnpEff Databases: Used for predicting the effects of variants on genes and proteins.
 - Panel of Normals (PON): Helps filter out technical artifacts by comparing against a set of normal samples.
 - RepeatMasker: Identifies repetitive genomic regions to prevent false-positive variant calls.
@@ -521,101 +492,91 @@ Databases/datasets PCGR Reference Data: *Version: v20220203* -- [GENCODE](https://www.gencodegenes.org/) \- high quality reference gene annotation and experimental validation (release 39/19) -- [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) \- Database of non-synonymous functional predictions (20210406 (v4.2)) -- [dbMTS](http://database.liulab.science/dbMTS) \- Database of alterations in microRNA target sites (v1.0) -- [ncER](https://github.com/TelentiLab/ncER_datasets) \- Non-coding essential regulation score (genome-wide percentile rank) (v2) -- [GERP](http://mendel.stanford.edu/SidowLab/downloads/gerp/) \- Genomic Evolutionary Rate Profiling (GERP) \- rejected substitutions (RS) score (v1) -- [Pfam](http://pfam.xfam.org) \- Collection of protein families/domains (2021_11 (v35.0)) -- [UniProtKB](http://www.uniprot.org) \- Comprehensive resource of protein sequence and functional information (2021_04) -- [gnomAD](http://gnomad.broadinstitute.org) \- Germline variant frequencies exome-wide (r2.1 (October 2018)) -- [dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) \- Database of short genetic variants (154) -- [DoCM](http://docm.genome.wustl.edu) \- Database of curated mutations (release 3.2) -- [CancerHotspots](http://cancerhotspots.org) \- A resource for statistically significant mutations in cancer (2017) -- [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar) \- Database of genomic variants of clinical significance (20220103) -- [CancerMine](http://bionlp.bcgsc.ca/cancermine/) \- Literature-mined database of tumor suppressor genes/proto-oncogenes (20211106 (v42)) -- [OncoTree](http://oncotree.mskcc.org/) \- Open-source ontology developed at MSK-CC for standardization of cancer type diagnosis (2021-11-02) -- [DiseaseOntology](http://disease-ontology.org) \- Standardized ontology for human disease (20220131) -- [EFO](https://github.com/EBISPOT/efo) \- Experimental Factor Ontology (v3.38.0) -- [GWAS_Catalog](https://www.ebi.ac.uk/gwas/) \- The 
NHGRI-EBI Catalog of published genome-wide association studies (20211221) -- [CGI](http://cancergenomeinterpreter.org/biomarkers) \- Cancer Genome Interpreter Cancer Biomarkers Database (20180117) +- [GENCODE](https://www.gencodegenes.org/) - high quality reference gene annotation and experimental validation (release 39/19) +- [dbNSFP](https://sites.google.com/site/jpopgen/dbNSFP) - Database of non-synonymous functional predictions (20210406 (v4.2)) +- [dbMTS](http://database.liulab.science/dbMTS) - Database of alterations in microRNA target sites (v1.0) +- [ncER](https://github.com/TelentiLab/ncER_datasets) - Non-coding essential regulation score (genome-wide percentile rank) (v2) +- [GERP](http://mendel.stanford.edu/SidowLab/downloads/gerp/) - Genomic Evolutionary Rate Profiling (GERP) - rejected substitutions (RS) score (v1) +- [Pfam](http://pfam.xfam.org) - Collection of protein families/domains (2021_11 (v35.0)) +- [UniProtKB](http://www.uniprot.org) - Comprehensive resource of protein sequence and functional information (2021_04) +- [gnomAD](http://gnomad.broadinstitute.org) - Germline variant frequencies exome-wide (r2.1 (October 2018)) +- [dbSNP](http://www.ncbi.nlm.nih.gov/SNP/) - Database of short genetic variants (154) +- [DoCM](http://docm.genome.wustl.edu) - Database of curated mutations (release 3.2) +- [CancerHotspots](http://cancerhotspots.org) - A resource for statistically significant mutations in cancer (2017) +- [ClinVar](https://www.ncbi.nlm.nih.gov/clinvar) - Database of genomic variants of clinical significance (20220103) +- [CancerMine](http://bionlp.bcgsc.ca/cancermine/) - Literature-mined database of tumor suppressor genes/proto-oncogenes (20211106 (v42)) +- [OncoTree](http://oncotree.mskcc.org/) - Open-source ontology developed at MSK-CC for standardization of cancer type diagnosis (2021-11-02) +- [DiseaseOntology](http://disease-ontology.org) - Standardized ontology for human disease (20220131) +- [EFO](https://github.com/EBISPOT/efo) - 
Experimental Factor Ontology (v3.38.0) +- [GWAS_Catalog](https://www.ebi.ac.uk/gwas/) - The NHGRI-EBI Catalog of published genome-wide association studies (20211221) +- [CGI](http://cancergenomeinterpreter.org/biomarkers) - Cancer Genome Interpreter Cancer Biomarkers Database (20180117) --- -# sash module outputs: - -Somatic SNVs +## sash Module Outputs +### Somatic SNVs - File: `smlv_somatic/filter/{tumor_id}.pass.vcf.gz` -- Description: Contains somatic single nucleotide variants (SNVs) with filtering applied. - -Somatic SVs +- Description: Contains somatic single nucleotide variants (SNVs) with filtering applied (VCF format). +### Somatic SVs - File: `sv_somatic/prioritise/{tumor_id}.sv.prioritised.vcf.gz` -- Description: Contains somatic structural variants (SVs) with prioritization applied. - -Somatic CNVs +- Description: Contains somatic structural variants (SVs) with prioritization applied (VCF format). +### Somatic CNVs - File: `cancer_report/cancer_report_tables/purple/{tumor_id}-purple_cnv_som.tsv.gz` -- Description: Contains somatic copy number variations (CNVs) data. - -Somatic Gene CNVs +- Description: Contains somatic copy number variations (CNVs) data (TSV format). +### Somatic Gene CNVs - File: `cancer_report/cancer_report_tables/purple/{tumor_id}-purple_cnv_som_gene.tsv.gz` -- Description: Contains gene-level somatic copy number variations (CNVs) data. - -Germline SNVs +- Description: Contains gene-level somatic copy number variations (CNVs) data (TSV format). +### Germline SNVs - File: `dragen_germline_output/{normal_id}.hard-filtered.vcf.gz` -- Description: Contains germline single nucleotide variants (SNVs) with hard filtering applied. - -Purple Purity, Ploidy, MS Status +- Description: Contains germline single nucleotide variants (SNVs) with hard filtering applied (VCF format). +### Purple Purity, Ploidy, MS Status - File: `purple/{tumor_id}.purple.purity.tsv` -- Description: Contains estimated tumor purity, ploidy, and microsatellite status. 
-PCGR JSON with TMB
+### PCGR JSON with TMB
 - File: `smlv_somatic/report/pcgr/{tumor_id}.pcgr_acmg.grch38.json.gz`
-- Description: Contains PCGR annotations, including tumor mutational burden (TMB).
-
-DRAGEN HRD Score
+- Description: Contains PCGR annotations, including tumor mutational burden (TMB) (JSON format).
+### DRAGEN HRD Score
 - File: `dragen_somatic_output/{tumor_id}.hrdscore.tsv`
-- Description: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis.
+- Description: Contains homologous recombination deficiency (HRD) score from DRAGEN analysis (TSV format).
 ---
 ## FAQ
-### Q: Do we use PCGR for the rescue of sage?
-
-A: In Somatic SV, we used sage to make variant calling then we did annotation of the variant using PCGR, then we filtered the variant. If variants have high-tier ranks, they are not filtered out whatsoever
-
-### Q: how are hypermutated samples handled in the current version, and is there any impact on derived metrics such as TMB or MSI?
-
-A: In the current version of sash, hypermutated samples are identified based on a threshold 500,000 of total somatic variant counts. For instance, if the variant count exceeds the threshold , the sample is flagged as hypermutated. When this occurs we will filter variant that 1. don’t have clinical impact, 2. in hotspot region, until we meet the threshold. That will impact the TMB and MSI calculated by purple. For Now we are using the TMB and MSI of purple is this edges case. New release will be able to get correct TMB and MSI from purple.
+### Q: Do we use PCGR for the rescue of SAGE?
+A: In Somatic SV, we use SAGE for variant calling, then annotate the variants using PCGR, followed by filtering. Variants with high-tier ranks (TIER_1 or TIER_2) are not filtered out regardless of other criteria.
-### Q: how are we handling non-standard chromosomes if present in the input VCFs (ALTs, chrM, etc)?
-A: Filter out as we Filter on chr 1..22 and chr X,Y,M
+### Q: How are hypermutated samples handled in the current version, and is there any impact on derived metrics such as TMB or MSI?
+A: In the current version of sash, hypermutated samples are identified based on a threshold of 500,000 total somatic variant counts. If the variant count exceeds this threshold, the sample is flagged as hypermutated. When this occurs, we will filter variants that: 1) don't have clinical impact, 2) aren't in hotspot regions, until we meet the threshold. This impacts the TMB and MSI calculations by PURPLE. Currently, we are using the TMB and MSI values from PURPLE in these edge cases. A future release will provide correct TMB and MSI calculations from PURPLE.
-### Q: inputs for the cancer reporter \- have they changed (and what can we harmonize); e.g., where is the Circos plot from at this point?
-A: Circos plots come Purple
+### Q: How are we handling non-standard chromosomes if present in the input VCFs (ALTs, chrM, etc)?
+A: We filter on chromosomes 1-22 and chromosomes X, Y, M. All other non-standard chromosomes and contigs are filtered out.
-### Q: we dropped the CACAO coverage reports. can we discuss how to utilize DRAGEN or WiGiTS coverage information instead?
+### Q: What inputs for the cancer reporter - have they changed (and what can we harmonize); e.g., where is the Circos plot from at this point?
+A: Circos plots are generated by PURPLE.
+### Q: We dropped the CACAO coverage reports. Can we discuss how to utilize DRAGEN or HMFtools coverage information instead?
+A: DRAGEN coverage metrics are now integrated into the MultiQC report, providing a comprehensive overview of sequencing quality and coverage across the genome. We are exploring further integration of HMFtools coverage analysis for future releases.
-### Q: what TMB score is displayed in the cancer reporter?
-A: The TMB display is the on calculated by pcgr +### Q: What TMB score is displayed in the cancer reporter? +A: The TMB displayed is calculated by PCGR. -### Q: what filtered VCF is the source for the mutational signatures? -A: We use the filtered VCF for mutational signatures +### Q: What filtered VCF is the source for the mutational signatures? +A: We use the filtered VCF (after applying quality filters but retaining clinically significant variants) for mutational signatures analysis. ### Q: Where is the contamination score coming from currently? -A: I don’t think there is contamination at the moment in sash +A: Currently, sash does not calculate a dedicated contamination metric. Tumor purity estimation from PURPLE serves as the primary indicator of sample quality. -### Q: Do the GRIPSS step do something more than what's happening in oncoanalyser ? -A: no different settings are applied to GRIPSS other than reference files +### Q: Do the GRIPSS steps do something more than what's happening in Oncoanalyser? +A: No, the same GRIPSS parameters are applied, with the only difference being the reference files used. -### Q: Does the data from Somatic Small Variantsworkflow are use for the SV ? -A: iirc data from the somatic small variant workflow is not used in the sv workflow \ No newline at end of file +### Q: Does the data from Somatic Small Variants workflow get used for the SV analysis? +A: No, the somatic small variant workflow data is not used in the structural variant (SV) workflow. These are independent analyses that run in parallel. 
\ No newline at end of file From 8baed92600a70a3e7910fb753d05157799e407f1 Mon Sep 17 00:00:00 2001 From: Quentin Clayssen Date: Mon, 8 Sep 2025 11:07:25 +1000 Subject: [PATCH 33/36] fixing input doc + linting --- assets/samplesheet.csv | 3 --- docs/output.md | 14 ++++++-------- docs/usage.md | 29 +++++++++++++++++------------ 3 files changed, 23 insertions(+), 23 deletions(-) diff --git a/assets/samplesheet.csv b/assets/samplesheet.csv index c4cc70d5..5adeb527 100644 --- a/assets/samplesheet.csv +++ b/assets/samplesheet.csv @@ -1,7 +1,4 @@ id,subject_name,sample_name,filetype,filepath subject_a.example,subject_a,sample_germline,dragen_germline_dir,/path/to/dragen_germline/ -subject_a.example,subject_a,sample_germline,dragen_germline_dir,/path/to/dragen_germline/ -subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_somatic/ subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_somatic/ subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/ -subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/ \ No newline at end of file diff --git a/docs/output.md b/docs/output.md index ded9c31f..50203b2c 100644 --- a/docs/output.md +++ b/docs/output.md @@ -62,18 +62,16 @@ This document outlines the key results and files produced by the UMCCR SASH (pos ├── purple/ ├── smlv_germline/ │ └── prepare/ -| └── report/ -├── smlv_somatic/ │ └── report/ -│ └── annotate/ -│ └── filter/ +├── smlv_somatic/ +│ ├── report/ +│ ├── annotate/ +│ ├── filter/ │ └── rescue/ └── sv_somatic/ - └── annotate/ + ├── annotate/ └── prioritise/ ``` - -i ## Summary The **Sash Workflow** comprises three primary pipelines: **Somatic Small Variants**, **Somatic Structural Variants**, and **Germline Variants**. These pipelines utilize **Bolt**, a Python package designed for modular processing, and leverage outputs from the **DRAGEN Variant Caller** alongside **HMFtools in Oncoanalyser**. 
Each pipeline is tailored to a specific type of genomic variant, incorporating filtering, annotation, and HTML reports for research and curation. @@ -397,4 +395,4 @@ CPSR (Cancer Predisposition Sequencing Reporter) focuses on germline variants in -MultiQC aggregates quality metrics from all pipeline components into a single HTML report, providing an overview of sample quality and analysis performance. \ No newline at end of file +MultiQC aggregates quality metrics from all pipeline components into a single HTML report, providing an overview of sample quality and analysis performance. diff --git a/docs/usage.md b/docs/usage.md index 011d2574..812bc8d0 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -9,7 +9,7 @@ ## Samplesheet input -You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with 3 columns, and a header row as shown in the examples below. +You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. Use this parameter to specify its location. It must be a CSV file with a header row containing the following columns: `id,subject_name,sample_name,filetype,filepath`. ```bash --input '[path to samplesheet file]' @@ -17,27 +17,32 @@ You will need to create a samplesheet with information about the samples you wou ### Full samplesheet -The pipeline will auto-detect whether a sample is single- or paired-end using the information provided in the samplesheet. The samplesheet can have as many columns as you desire, however, there is a strict requirement for the first 3 columns to match those defined in the table below. +Provide one row per available input directory for a given analysis `id`. All rows sharing the same `id` must have the same `subject_name`. 
The `filetype` column must be one of: -A final samplesheet file consisting of both single- and paired-end data may look something like the one below. This is for 6 samples, where `TREATMENT_REP3` has been sequenced twice. +- `dragen_germline_dir`: directory containing DRAGEN germline outputs (normal sample) +- `dragen_somatic_dir`: directory containing DRAGEN somatic tumor/normal outputs (tumor sample) +- `oncoanalyser_dir`: directory containing HMFtools/Oncoanalyser outputs (e.g., Purple, Linx) + +Example: ```csv id,subject_name,sample_name,filetype,filepath subject_a.example,subject_a,sample_germline,dragen_germline_dir,/path/to/dragen_germline/ -subject_a.example,subject_a,sample_germline,dragen_germline_dir,/path/to/dragen_germline/ -subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_somatic/ subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_somatic/ subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/ -subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/ ``` -| Column | Description | -| --------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| `sample` | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores (`_`). | -| `fastq_1` | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". | -| `fastq_2` | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension ".fastq.gz" or ".fq.gz". | +Column descriptions: + +| Column | Description | +| --------------- | ----------- | +| `id` | Analysis identifier used to group multiple rows belonging to the same subject/run. 
| +| `subject_name` | Subject identifier. Must be identical across all rows for a given `id`. | +| `sample_name` | Sample identifier. Used to set `normal_id` for `dragen_germline_dir` and `tumor_id` for `dragen_somatic_dir`. | +| `filetype` | One of `dragen_germline_dir`, `dragen_somatic_dir`, `oncoanalyser_dir`. | +| `filepath` | Absolute or relative path to the corresponding directory. | -An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline. +An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline and matches the format above. ## Running the pipeline From b9b8c9fada9c34fa11e28332864fcd75e5a97a25 Mon Sep 17 00:00:00 2001 From: Quentin Clayssen Date: Mon, 8 Sep 2025 11:19:32 +1000 Subject: [PATCH 34/36] work usage --- docs/usage.md | 66 +++++++++++++++++++++++++++++++++++++++------------ 1 file changed, 51 insertions(+), 15 deletions(-) diff --git a/docs/usage.md b/docs/usage.md index 812bc8d0..f9e852ec 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -1,15 +1,18 @@ -# TODO # umccr/sash: Usage > _Documentation of pipeline parameters is generated automatically from the pipeline schema and can no longer be found in markdown files._ ## Introduction - +SASH is UMCCR’s post-processing pipeline for tumor/normal WGS analyses. It consumes DRAGEN outputs and optional nf-core/oncoanalyser (WiGiTS) results to perform small-variant rescue, annotation, filtering, structural variant integration, CNV calling (PURPLE), and reporting (PCGR/CPSR, LINX, MultiQC, cancer report). + +- Requires Nextflow >= 22.10.6 and a container engine (Docker/Singularity/Apptainer/Podman). +- Uses GRCh38 reference data defined in `conf/refdata.config` accessed via `--ref_data_path`. +- Inputs are provided via a CSV samplesheet; no FASTQ inputs are expected by SASH. ## Samplesheet input -You will need to create a samplesheet with information about the samples you would like to analyse before running the pipeline. 
Use this parameter to specify its location. It must be a CSV file with a header row containing the following columns: `id,subject_name,sample_name,filetype,filepath`. +Create a CSV samplesheet describing your DRAGEN and Oncoanalyser input directories. Pass it with `--input`. It must have a header row with columns: `id,subject_name,sample_name,filetype,filepath`. ```bash --input '[path to samplesheet file]' @@ -17,7 +20,7 @@ You will need to create a samplesheet with information about the samples you wou ### Full samplesheet -Provide one row per available input directory for a given analysis `id`. All rows sharing the same `id` must have the same `subject_name`. The `filetype` column must be one of: +Provide one row per available input directory for a given analysis `id`. Rows sharing the same `id` must have the same `subject_name`. The `filetype` must be one of: - `dragen_germline_dir`: directory containing DRAGEN germline outputs (normal sample) - `dragen_somatic_dir`: directory containing DRAGEN somatic tumor/normal outputs (tumor sample) @@ -44,17 +47,38 @@ Column descriptions: An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline and matches the format above. +### Required directory contents + +SASH expects specific files inside each directory referenced in the samplesheet. Paths below are relative to each directory path you provide in `filepath`. + +- `dragen_somatic_dir` + - `${tumor_id}.hard-filtered.vcf.gz` and index `${tumor_id}.hard-filtered.vcf.gz.tbi` + - `${tumor_id}.hrdscore.csv` +- `dragen_germline_dir` + - `${normal_id}.hard-filtered.vcf.gz` +- `oncoanalyser_dir` (from nf-core/oncoanalyser) + - `amber/` and `cobalt/` directories + - `gridss/${tumor_id}.gridss.vcf.gz` + - `sage/somatic/${tumor_id}.sage.somatic.vcf.gz` (+ `.tbi`) + - `virusbreakend/` directory + +Note: SASH runs PURPLE itself; precomputed PURPLE outputs are not required as inputs. 
+ ## Running the pipeline -The typical command for running the pipeline is as follows: +Quickstart command: ```bash -nextflow run umccr/sash --input samplesheet.csv --outdir --genome GRCh37 -profile docker +nextflow run umccr/sash \ + --input samplesheet.csv \ + --ref_data_path /path/to/reference_data_root \ + --outdir results/ \ + -profile docker ``` This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles. -Note that the pipeline will create the following files in your working directory: +This creates the following in your working directory: ```bash work # Directory containing the nextflow working files @@ -63,9 +87,7 @@ work # Directory containing the nextflow working files # Other nextflow hidden files, eg. history of pipeline runs and old logs. ``` -If you wish to repeatedly use the same parameters for multiple runs, rather than specifying each flag in the command, you can specify these in a params file. - -Pipeline settings can be provided in a `yaml` or `json` file via `-params-file `. +If you wish to reuse parameters across runs, specify them in a `yaml` or `json` file via `-params-file `. > ⚠️ Do not use `-c ` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args). > The above pipeline run specified with a params file in yaml format: @@ -77,15 +99,25 @@ nextflow run umccr/sash -profile docker -params-file params.yaml with `params.yaml` containing: ```yaml -input: './samplesheet.csv' -outdir: './results/' -genome: 'GRCh37' -input: 'data' +input: 'samplesheet.csv' +ref_data_path: '/path/to/reference_data_root' +outdir: 'results/' <...> ``` You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch). 
+### Reference data + +SASH reads all required resources from a single base directory passed via `--ref_data_path`. The internal paths and versions are defined in `conf/refdata.config` and include: + +- Genome FASTA/FAI/DICT: `genomes/GRCh38_umccr/...` +- HMF reference data (WiGiTS): `hmf_reference_data/hmftools//...` +- PCGR bundle: `databases/pcgr/v/` +- UMCCR resources: panels, known fusions, PoNs, config files + +Organise your reference data root to match these relative paths, or provide a site config that overrides them. See `subworkflows/local/prepare_reference.nf` and `conf/refdata.config` for details. + ### Updating the pipeline When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline: @@ -98,7 +130,7 @@ nextflow pull umccr/sash It is a good idea to specify a pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since. -First, go to the [umccr/sash releases page](https://github.com/umccr/sash/releases) and find the latest pipeline version - numeric only (eg. `1.3.1`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.3.1`. Of course, you can switch to another version by changing the number after the `-r` flag. +First, go to the [umccr/sash releases page](https://github.com/umccr/sash/releases) and find the latest pipeline version (e.g. `0.5.0`). Then specify this when running with `-r` (single hyphen), for example `-r 0.5.0`. 
This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future. For example, at the bottom of the MultiQC reports. @@ -169,6 +201,10 @@ To use a different container from the default container or conda environment spe A pipeline might not always support every possible argument or option of a particular tool used in pipeline. Fortunately, nf-core pipelines provide some freedom to users to insert additional parameters that the pipeline does not include by default. +## Outputs + +See detailed descriptions in `docs/output.md`. Top-level results include PCGR/CPSR HTML reports, cancer report, LINX/PURPLE artefacts, and a MultiQC summary. + ## Running in the background Nextflow handles job submissions and supervises the running jobs. The Nextflow process must run until the pipeline is finished. From 475b6c788c54d8696377f446676a37c175016411 Mon Sep 17 00:00:00 2001 From: Quentin Clayssen Date: Thu, 2 Oct 2025 12:34:57 +1000 Subject: [PATCH 35/36] add info usage --- docs/usage.md | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-) diff --git a/docs/usage.md b/docs/usage.md index f9e852ec..9b3f7a2c 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -4,12 +4,33 @@ ## Introduction -SASH is UMCCR’s post-processing pipeline for tumor/normal WGS analyses. It consumes DRAGEN outputs and optional nf-core/oncoanalyser (WiGiTS) results to perform small-variant rescue, annotation, filtering, structural variant integration, CNV calling (PURPLE), and reporting (PCGR/CPSR, LINX, MultiQC, cancer report). +`sash` is the UMCCR post-processing WGS workflow. The workflow takes DRAGEN small variant calls and oncoanalyser results as input to perform annotation, prioritisation, rescue and filtering, and reporting for the WGS variant data. 
Additionally, `sash` runs several sensors for biomarker assessment and genomic characterisation including HRD status, mutational signatures, purity/ploidy, MSI, and TMB. + +The general processes `sash` runs include: + +- `gpgr` for generating the summary Cancer Report +- `PCGR` to report processed small somatic variants (annotated, rescued, filtered, prioritised) +- `CPSR` to report processed small germline variants (filtered, annotated, prioritised) +- `linxreport` to collate SV annotations and plots from LINX +- `MultiQC` for reporting various WGS statistics / metrics for QC +- `SAGE` variant calling to supplement DRAGEN small somatic variants +- `PURPLE` for TMB, MSI, CNV calling, and purity / ploidy estimation +- `HRDetect` and `CHORD` for HRD inference +- `MutationalPatterns` to fit mutational signatures +- `PAVE` for somatic variant annotation with MNV filtering + +While the `sash` workflow utilises a range of tools and software, it is most closely coupled with [bolt](https://github.com/scwatts/bolt), a Python package that implements the UMCCR post-processing logic and supporting functionality. - Requires Nextflow >= 22.10.6 and a container engine (Docker/Singularity/Apptainer/Podman). - Uses GRCh38 reference data defined in `conf/refdata.config` accessed via `--ref_data_path`. - Inputs are provided via a CSV samplesheet; no FASTQ inputs are expected by SASH. +## Requirements + +- Java +- Nextflow ≥22.10.6 +- A container engine, such as Docker, Singularity, Apptainer, or Podman + ## Samplesheet input Create a CSV samplesheet describing your DRAGEN and Oncoanalyser input directories. Pass it with `--input`. It must have a header row with columns: `id,subject_name,sample_name,filetype,filepath`. 
From 89750ed224686defe48ba82f063713333f4e78a2 Mon Sep 17 00:00:00 2001 From: Quentin Clayssen Date: Thu, 2 Oct 2025 12:42:00 +1000 Subject: [PATCH 36/36] reshape usage --- docs/usage.md | 237 ++++++++++++++++++-------------------------------- 1 file changed, 84 insertions(+), 153 deletions(-) diff --git a/docs/usage.md b/docs/usage.md index 9b3f7a2c..0be3bb51 100644 --- a/docs/usage.md +++ b/docs/usage.md @@ -1,53 +1,34 @@ # umccr/sash: Usage -> _Documentation of pipeline parameters is generated automatically from the pipeline schema and can no longer be found in markdown files._ +> Parameter documentation is generated automatically from `nextflow_schema.json`. Run `nextflow run umccr/sash --help` +> or use [nf-core/launch](https://nf-co.re/launch) for an interactive form. ## Introduction -`sash` is the UMCCR post-processing WGS workflow. The workflow takes DRAGEN small variant calls and oncoanalyser results as input to perform annotation, prioritisation, rescue and filtering, and reporting for the WGS variant data. Additionally, `sash` runs several sensors for biomarker assessment and genomic characterisation including HRD status, mutational signatures, purity/ploidy, MSI, and TMB. +umccr/sash is UMCCR’s post-processing pipeline for tumour/normal WGS analyses. It consumes DRAGEN secondary-analysis +outputs together with nf-core/oncoanalyser WiGiTS artefacts to perform small-variant rescue, annotation, filtering, +structural variant integration, PURPLE CNV calling, and reporting (PCGR, CPSR, GPGR cancer report, LINX, MultiQC). 
-The general processes `sash` runs include: - -- `gpgr` for generating the summary Cancer Report -- `PCGR` to report processed small somatic variants (annotated, rescued, filtered, prioritised) -- `CPSR` to report processed small germline variants (filtered, annotated, prioritised) -- `linxreport` to collate SV annotations and plots from LINX -- `MultiQC` for reporting various WGS statistics / metrics for QC -- `SAGE` variant calling to supplement DRAGEN small somatic variants -- `PURPLE` for TMB, MSI, CNV calling, and purity / ploidy estimation -- `HRDetect` and `CHORD` for HRD inference -- `MutationalPatterns` to fit mutational signatures -- `PAVE` for somatic variant annotation with MNV filtering - -While the `sash` workflow utilises a range of tools and software, it is most closely coupled with [bolt](https://github.com/scwatts/bolt), a Python package that implements the UMCCR post-processing logic and supporting functionality. - -- Requires Nextflow >= 22.10.6 and a container engine (Docker/Singularity/Apptainer/Podman). -- Uses GRCh38 reference data defined in `conf/refdata.config` accessed via `--ref_data_path`. -- Inputs are provided via a CSV samplesheet; no FASTQ inputs are expected by SASH. - -## Requirements - -- Java -- Nextflow ≥22.10.6 -- A container engine, such as Docker, Singularity, Apptainer, or Podman +- Requires Nextflow ≥ 22.10.6 and a container engine (Docker/Singularity/Apptainer/Podman) or Conda. +- Uses GRCh38 reference data resolved from `--ref_data_path` (see [Reference data](#reference-data)). +- Expects inputs via a CSV samplesheet describing DRAGEN and Oncoanalyser directories; no FASTQ inputs are needed. ## Samplesheet input -Create a CSV samplesheet describing your DRAGEN and Oncoanalyser input directories. Pass it with `--input`. It must have a header row with columns: `id,subject_name,sample_name,filetype,filepath`. - -```bash ---input '[path to samplesheet file]' -``` - -### Full samplesheet +Pass a CSV with `--input`.
Each row represents one directory staged by upstream pipelines for a given analysis `id`. +Rows sharing the same `id` are grouped into a single tumour/normal run. -Provide one row per available input directory for a given analysis `id`. Rows sharing the same `id` must have the same `subject_name`. The `filetype` must be one of: +### Column definitions -- `dragen_germline_dir`: directory containing DRAGEN germline outputs (normal sample) -- `dragen_somatic_dir`: directory containing DRAGEN somatic tumor/normal outputs (tumor sample) -- `oncoanalyser_dir`: directory containing HMFtools/Oncoanalyser outputs (e.g., Purple, Linx) +| Column | Description | +| -------------- | ----------- | +| `id` | Unique analysis identifier grouping rows belonging to the same tumour/normal pair. | +| `subject_name` | Subject identifier; must be identical for all rows with the same `id`. | +| `sample_name` | DRAGEN sample label. Used to derive tumour (`dragen_somatic_dir`) and normal (`dragen_germline_dir`) identifiers. | +| `filetype` | One of the supported directory types below. | +| `filepath` | Absolute or relative path to the directory containing the expected files. | -Example: +Example row set: ```csv id,subject_name,sample_name,filetype,filepath @@ -56,34 +37,25 @@ subject_a.example,subject_a,sample_somatic,dragen_somatic_dir,/path/to/dragen_so subject_a.example,subject_a,sample_somatic,oncoanalyser_dir,/path/to/oncoanalyser/ ``` -Column descriptions: - -| Column | Description | -| --------------- | ----------- | -| `id` | Analysis identifier used to group multiple rows belonging to the same subject/run. | -| `subject_name` | Subject identifier. Must be identical across all rows for a given `id`. | -| `sample_name` | Sample identifier. Used to set `normal_id` for `dragen_germline_dir` and `tumor_id` for `dragen_somatic_dir`. | -| `filetype` | One of `dragen_germline_dir`, `dragen_somatic_dir`, `oncoanalyser_dir`. 
| -| `filepath` | Absolute or relative path to the corresponding directory. | - -An [example samplesheet](../assets/samplesheet.csv) has been provided with the pipeline and matches the format above. +An example sheet is included at `assets/samplesheet.csv`. ### Required directory contents -SASH expects specific files inside each directory referenced in the samplesheet. Paths below are relative to each directory path you provide in `filepath`. +Paths below are relative to the value of `filepath` for each row. The pipeline targets nf-core/oncoanalyser ≥ 2.2.0 +exports. - `dragen_somatic_dir` - - `${tumor_id}.hard-filtered.vcf.gz` and index `${tumor_id}.hard-filtered.vcf.gz.tbi` - - `${tumor_id}.hrdscore.csv` + - `.hard-filtered.vcf.gz` and `.hard-filtered.vcf.gz.tbi` + - Optional: `.hrdscore.csv` (ingested into the cancer report when present) - `dragen_germline_dir` - - `${normal_id}.hard-filtered.vcf.gz` -- `oncoanalyser_dir` (from nf-core/oncoanalyser) - - `amber/` and `cobalt/` directories - - `gridss/${tumor_id}.gridss.vcf.gz` - - `sage/somatic/${tumor_id}.sage.somatic.vcf.gz` (+ `.tbi`) + - `.hard-filtered.vcf.gz` +- `oncoanalyser_dir` + - `amber/` and `cobalt/` directories (coverage inputs for PURPLE) + - `sage_calling/somatic/.sage.somatic.vcf.gz` (+ `.tbi`) + - `esvee/.esvee.ref_depth.vcf.gz` and accompanying directory (used to seed eSVee calling) - `virusbreakend/` directory -Note: SASH runs PURPLE itself; precomputed PURPLE outputs are not required as inputs. +> SASH runs PURPLE internally; precomputed PURPLE outputs are not required as inputs. ## Running the pipeline @@ -97,149 +69,108 @@ nextflow run umccr/sash \ -profile docker ``` -This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles. - -This creates the following in your working directory: +This launches the pipeline with the `docker` configuration profile. 
The following appear in the working directory: -```bash -work # Directory containing the nextflow working files - # Finished results in specified location (defined with --outdir) -.nextflow_log # Log file from Nextflow -# Other nextflow hidden files, eg. history of pipeline runs and old logs. +``` +work/ # Nextflow working files +results/ # Pipeline outputs (as specified by --outdir) +.nextflow.log # Nextflow run log ``` -If you wish to reuse parameters across runs, specify them in a `yaml` or `json` file via `-params-file `. +### Parameter files and profiles -> ⚠️ Do not use `-c ` to specify parameters as this will result in errors. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), other infrastructural tweaks (such as output directories), or module arguments (args). -> The above pipeline run specified with a params file in yaml format: +Reuse parameter sets via `-params-file params.yaml`: ```bash nextflow run umccr/sash -profile docker -params-file params.yaml ``` -with `params.yaml` containing: - ```yaml input: 'samplesheet.csv' -ref_data_path: '/path/to/reference_data_root' +ref_data_path: '/data/refdata' outdir: 'results/' -<...> -``` - -You can also generate such `YAML`/`JSON` files via [nf-core/launch](https://nf-co.re/launch). - -### Reference data - -SASH reads all required resources from a single base directory passed via `--ref_data_path`. The internal paths and versions are defined in `conf/refdata.config` and include: - -- Genome FASTA/FAI/DICT: `genomes/GRCh38_umccr/...` -- HMF reference data (WiGiTS): `hmf_reference_data/hmftools//...` -- PCGR bundle: `databases/pcgr/v/` -- UMCCR resources: panels, known fusions, PoNs, config files - -Organise your reference data root to match these relative paths, or provide a site config that overrides them. See `subworkflows/local/prepare_reference.nf` and `conf/refdata.config` for details.
- -### Updating the pipeline - -When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline: - -```bash -nextflow pull umccr/sash ``` -### Reproducibility +> ⚠️ Avoid using `-c` to pass pipeline parameters. `-c` should only point to Nextflow config files for resource tuning, +> executor settings or module overrides (see below). -It is a good idea to specify a pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since. +You can generate YAML/JSON parameter files through [nf-core/launch](https://nf-co.re/launch) or Nextflow Tower. -First, go to the [umccr/sash releases page](https://github.com/umccr/sash/releases) and find the latest pipeline version (e.g. `0.5.0`). Then specify this when running with `-r` (single hyphen), for example `-r 0.5.0`. +## Reference data -This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future. For example, at the bottom of the MultiQC reports. +All resources are resolved relative to `--ref_data_path` using `conf/refdata.config`. Confirm the directory contains the +expected subpaths (versions may change between releases): -To further assist in reproducbility, you can use share and re-use [parameter files](#running-the-pipeline) to repeat pipeline runs with the same settings without having to write out a command with every single parameter. 
+- `genomes/GRCh38_umccr/` – GRCh38 FASTA, FAI and dict files plus sequence metadata. +- `hmf_reference_data/` – WiGiTS bundle with PURPLE GC profiles, eSVee panel-of-normals, SAGE hotspot resources, LINX + transcripts and driver catalogues. +- `databases/pcgr/` – PCGR/CPSR annotation bundle. +- `umccr/` – bolt configuration files, driver panels, MultiQC templates, GPGR assets. +- `misc/` – panel-of-normals, APPRIS annotations, snpEff cache and other supporting data. -> 💡 If you wish to share such profile (such as upload as supplementary material for academic publications), make sure to NOT include cluster specific paths to files, nor institutional specific profiles. +Refer to `docs/details.md` for a deeper breakdown of required artefacts. -## Core Nextflow arguments - -> **NB:** These options are part of Nextflow and use a _single_ hyphen (pipeline parameters use a double-hyphen). +## Nextflow configuration ### `-profile` -Use this parameter to choose a configuration profile. Profiles can give configuration presets for different compute environments. - -Several generic profiles are bundled with the pipeline which instruct the pipeline to use software packaged using different methods (Docker, Singularity, Podman, Shifter, Charliecloud, Apptainer, Conda) - see below. - -> We highly recommend the use of Docker or Singularity containers for full pipeline reproducibility, however when this is not possible, Conda is also supported. - -Note that multiple profiles can be loaded, for example: `-profile test,docker` - the order of arguments is important! -They are loaded in sequence, so later profiles can overwrite earlier profiles. - -If `-profile` is not specified, the pipeline will run locally and expect all software to be installed and available on the `PATH`. This is _not_ recommended, since it can lead to different results on different machines dependent on the computer enviroment. 
- -- `test` - - A profile with a complete configuration for automated testing - - Includes links to test data so needs no other parameters -- `docker` - - A generic configuration profile to be used with [Docker](https://docker.com/) -- `singularity` - - A generic configuration profile to be used with [Singularity](https://sylabs.io/docs/) -- `podman` - - A generic configuration profile to be used with [Podman](https://podman.io/) -- `shifter` - - A generic configuration profile to be used with [Shifter](https://nersc.gitlab.io/development/shifter/how-to-use/) -- `charliecloud` - - A generic configuration profile to be used with [Charliecloud](https://hpc.github.io/charliecloud/) -- `apptainer` - - A generic configuration profile to be used with [Apptainer](https://apptainer.org/) -- `conda` - - A generic configuration profile to be used with [Conda](https://conda.io/docs/). Please only use Conda as a last resort i.e. when it's not possible to run the pipeline with Docker, Singularity, Podman, Shifter, Charliecloud, or Apptainer. +Profiles configure software packaging and cluster backends. Bundled profiles include `test`, `docker`, `singularity`, +`podman`, `shifter`, `charliecloud`, `apptainer` and `conda`. Combine multiple profiles with commas (later entries +override earlier ones). If no profile is supplied, Nextflow expects all software on `$PATH`, which is discouraged. ### `-resume` -Specify this when restarting a pipeline. Nextflow will use cached results from any pipeline steps where the inputs are the same, continuing from where it got to previously. For input to be considered the same, not only the names must be identical but the files' contents as well. For more info about this parameter, see [this blog post](https://www.nextflow.io/blog/2019/demystifying-nextflow-resume.html). - -You can also supply a run name to resume a specific run: `-resume [run-name]`. Use the `nextflow log` command to show previous run names. +Resume cached work by adding `-resume`. 
Nextflow matches stages using both file names and content; keep inputs identical +for cache hits. Supply a run name to resume a specific execution: `-resume [run-name]`. Use `nextflow log` to list +previous runs. ### `-c` -Specify the path to a specific config file (this is a core Nextflow command). See the [nf-core website documentation](https://nf-co.re/usage/configuration) for more information. +`-c custom.config` loads additional Nextflow configuration (e.g. executor queues, resource overrides, institutional +profiles). See the [nf-core configuration docs](https://nf-co.re/docs/usage/configuration) for examples. ## Custom configuration ### Resource requests -Whilst the default requirements set within the pipeline will hopefully work for most people and with most input data, you may find that you want to customise the compute resources that the pipeline requests. Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the steps in the pipeline, if the job exits with any of the error codes specified [here](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/base.config#L18) it will automatically be resubmitted with higher requests (2 x original, then 3 x original). If it still fails after the third attempt then the pipeline execution is stopped. +Default resources suit typical datasets, but you can override CPUs/memory/time through custom config files. Many modules +honour nf-core’s automatic retry logic: certain exit codes trigger resubmission at 2× and 3× the original resources +before failing the run. Refer to the nf-core guides on +[max resources](https://nf-co.re/docs/usage/configuration#max-resources) and +[tuning workflow resources](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources).
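As a rough illustration of the resource override described above, the sketch below writes a small config file that could be passed with `-c`. The process name `EXAMPLE_PROCESS`, the resource values and the file names `custom.config` and `params.yaml` are placeholders chosen for this example, not names taken from umccr/sash; check the pipeline's module definitions for the real process names.

```bash
# Write a hypothetical resource override file.
# EXAMPLE_PROCESS is a placeholder; substitute a real process name from the pipeline.
cat > custom.config <<'EOF'
process {
    withName: 'EXAMPLE_PROCESS' {
        cpus   = 8
        memory = 32.GB
        time   = 12.h
    }
}
EOF

# Pass the file at launch (shown as a comment, not executed here).
# params.yaml is an assumed parameter file name:
#   nextflow run umccr/sash -profile docker -params-file params.yaml -c custom.config
```

Keeping pipeline parameters in the parameter file and only resource settings in `custom.config` follows the separation this document recommends for `-c`.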
-To change the resource requests, please see the [max resources](https://nf-co.re/docs/usage/configuration#max-resources) and [tuning workflow resources](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources) section of the nf-core website. +### Custom containers -### Custom Containers +nf-core pipelines default to Biocontainers/Bioconda images. You can override container or conda package selections in +config to use patched or institutional builds. See the +[updating tool versions](https://nf-co.re/docs/usage/configuration#updating-tool-versions) section for patterns. -In some cases you may wish to change which container or conda environment a step of the pipeline uses for a particular tool. By default nf-core pipelines use containers and software from the [biocontainers](https://biocontainers.pro/) or [bioconda](https://bioconda.github.io/) projects. However in some cases the pipeline specified version maybe out of date. +### Custom tool arguments -To use a different container from the default container or conda environment specified in a pipeline, please see the [updating tool versions](https://nf-co.re/docs/usage/configuration#updating-tool-versions) section of the nf-core website. - -### Custom Tool Arguments - -A pipeline might not always support every possible argument or option of a particular tool used in pipeline. Fortunately, nf-core pipelines provide some freedom to users to insert additional parameters that the pipeline does not include by default. +If you need to provide additional tool parameters beyond those exposed by pipeline options, set `process.ext.args` +(overrides per-module) or leverage module-specific hooks documented in nf-core. Review `conf/modules.config` for +supported overrides in umccr/sash. ## Outputs -See detailed descriptions in `docs/output.md`. Top-level results include PCGR/CPSR HTML reports, cancer report, LINX/PURPLE artefacts, and a MultiQC summary. 
+See `docs/output.md` for a full description of generated artefacts (PCGR/CPSR HTML, cancer report, LINX, PURPLE, MultiQC +and supporting statistics). ## Running in the background -Nextflow handles job submissions and supervises the running jobs. The Nextflow process must run until the pipeline is finished. +Nextflow supervises submitted jobs; keep the Nextflow process alive for the pipeline to finish. Options include: -The Nextflow `-bg` flag launches Nextflow in the background, detached from your terminal so that the workflow does not stop if you log out of your session. The logs are saved to a file. - -Alternatively, you can use `screen` / `tmux` or similar tool to create a detached session which you can log back into at a later time. -Some HPC setups also allow you to run nextflow within a cluster job submitted your job scheduler (from where it submits more jobs). +- `nextflow run ... -bg` to launch detached and log to `.nextflow.log`. +- Using `screen`, `tmux` or similar to keep sessions alive. +- Submitting Nextflow itself through your scheduler (e.g. `sbatch`), where it will launch child jobs. ## Nextflow memory requirements -In some cases, the Nextflow Java virtual machines can start to request a large amount of memory. -We recommend adding the following line to your environment to limit this (typically in `~/.bashrc` or `~./bash_profile`): +The Nextflow JVM can request substantial RAM on large runs. Set an upper bound via environment variables, typically in +`~/.bashrc` or `~/.bash_profile`: ```bash -NXF_OPTS='-Xms1g -Xmx4g' +export NXF_OPTS='-Xms1g -Xmx4g' ``` + +Adjust limits to suit your environment.
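The background-launch and memory settings above can be combined in a small wrapper script. This is a minimal sketch under stated assumptions: `params.yaml` and `sash.log` are hypothetical file names, the `docker` profile is just one of the bundled options, and the guard simply skips the launch when `nextflow` is not on `PATH`.

```bash
#!/usr/bin/env bash
# Cap the Nextflow JVM heap for this session (values taken from the text above).
export NXF_OPTS='-Xms1g -Xmx4g'

# Launch detached only when Nextflow is available.
# params.yaml and sash.log are assumed names for this sketch.
if command -v nextflow >/dev/null 2>&1; then
    nextflow run umccr/sash \
        -profile docker \
        -params-file params.yaml \
        -resume \
        -bg > sash.log
else
    echo "nextflow not found on PATH; skipping launch" >&2
fi
```

When submitting through a scheduler instead, the same `export` line would typically go at the top of the job script and `-bg` would be dropped, since the scheduler keeps the process alive.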