Profiling expansions in koala population seq data 🐨
The pipeline described in this repository was created to profile the polyglutamine and polyalanine repeat expansion lengths in the RUNX2, ZIC2, FOXL2, and ARX genes across a sequenced koala population. It is associated with the Repeat expansions associated with human disease are present in diverse organisms pub from Arcadia Science.
First create a conda environment using the envs/environment.yml
file and activate your new environment
This repository uses snakemake to run the pipeline and conda to manage software environments and installations. You can find operating system-specific instructions for installing miniconda here. After installing conda, run the following commands to create and activate an environment for running this pipeline:
conda env create -f envs/environment.yml -n koala
conda activate koala
The inputs/samples.tsv
file contains all of the necessary information for the samples that will go through this workflow. If you want to run this workflow on a different set of files, you will need to edit this table. Each bam file of interest should have a single row in the table. These two columns are required for running the pipeline:
AWSFileName
: File name from this table.
AWSFolderName
: Corresponding folder from this table.
snakemake -p -j 8 --use-conda
-p
tells Snakemake to print the shell commands that it runs for each rule, which can be useful for understanding what is happening/debugging.
-j
tells Snakemake how many cores to use for executing jobs in parallel. Adjust to your system.
--use-conda
tells Snakemake to utilize Conda environments to manage dependencies for individual rules.
The --rerun-incomplete
flag can be appended to the above command if the pipeline terminates prematurely for some reason and needs to be restarted.
A table of the polyA expansion lengths for RUNX2, ZIC2, FOXL2, and ARX. PolyQ expansion is also counted for RUNX2.
sample | length | gene_aminoacid |
---|---|---|
Kyogle_F_50433 | 22 | RUNX2_a |
Kyogle_F_50433 | 18 | ZIC2_a |
Kyogle_F_50433 | 16 | FOXL2_a |
Kyogle_F_50433 | 19 | ARX_a |