A reproducible Python pipeline for genomic data analysis. It performs single-cell copy number variation calling by learning the underlying tumour evolution history with SCICoNE, a state-of-the-art phylogenetic tree reconstruction method. The pipeline is built with Python, the Conda environment management system and the Snakemake workflow management system. It starts from the raw sequencing files and a settings file that holds the parameter configuration. After the analysis, it produces a report and multiple figures to inform the treatment decision for the cancer patient.
The pipeline makes use of scgenpy, a package that exposes functions for preprocessing, postprocessing and plotting data, allowing you to interact with the data outside the pipeline context.
- Clone the repository
- Install and update using pip:
    pip install -e .
- Install SCICoNE.
- Prepare the configuration file according to your analysis
- Run snakemake with:
    snakemake --configfile your_config_file
- (Optional) Refer to https://snakemake.readthedocs.io to customise your snakemake invocation for your environment
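
The run step above can also be scripted. The following is a minimal sketch, not part of the pipeline itself: the helper names `build_snakemake_command` and `run_pipeline` are hypothetical, and it assumes `snakemake` is on your `PATH` and that you call it from the repository root. The `--cores` and `--dry-run` flags are standard Snakemake options that this README does not mention; adjust or drop them to suit your setup.

```python
import shlex
import subprocess


def build_snakemake_command(configfile, cores=1, dry_run=False):
    """Build the argument list for a snakemake run (hypothetical helper)."""
    command = ["snakemake", "--configfile", str(configfile), "--cores", str(cores)]
    if dry_run:
        # Ask Snakemake to list the planned jobs without executing them.
        command.append("--dry-run")
    return command


def run_pipeline(configfile, cores=1, dry_run=False):
    """Run snakemake and raise CalledProcessError if it exits with an error."""
    command = build_snakemake_command(configfile, cores=cores, dry_run=dry_run)
    print("Running:", shlex.join(command))
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only print the command here; call run_pipeline(...) to actually execute it.
    print(shlex.join(build_snakemake_command("your_config_file", dry_run=True)))
```

Starting with a dry run is a cheap way to check that the settings file is picked up before committing compute time to the full analysis.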
You are very welcome to contribute! You can start with the existing issues or open new ones. Make sure your changes pass the CI checks, and use the pre-commit hook defined in the project to meet the code style. If you are adding new functionality, add the corresponding tests as well to keep the code coverage high.
This project is licensed under the Apache License. See the LICENSE file for details.