Cadmium paper scripts

These scripts download, process, analyze and generate the figures for the "Integrating MNase-seq and RNA-seq time series data to study chromatin and transcription dynamics under cadmium stress" paper.

Setup

Create a conda environment with the libraries the scripts require:

conda env create --file cadmium_env.yml

Set up a config.py for the scripts, using config.example.py as a starting point.

Download the MEME-Suite and its motif database.

Download and extract the R64 (sacCer3) reference genome.

Make sure config.py points to the correct paths:

FIMO_PATH = '/path/to/fimo'
FIMO_GENOME_FSA = 'path/to/sacCer3/genome.fsa'
MACISAAC_MEME_PATH = 'path/to/macisacc_yeastdata/fimo/macisaac_yeast.v1.meme'
SACCER3_REFERENCE = 'path/to/extract/sacCer3/files/'
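
Before running anything, it can help to confirm that these settings point at real files and directories. The following is a minimal sketch, not part of the repository: `missing_paths` and `settings_from_module` are hypothetical helper names, and the only assumption about config.py is that it defines the four variables listed above.

```python
import os

# Path settings named in this README. The helpers below are hypothetical
# and are not shipped with the scripts.
PATH_SETTINGS = (
    "FIMO_PATH",
    "FIMO_GENOME_FSA",
    "MACISAAC_MEME_PATH",
    "SACCER3_REFERENCE",
)


def missing_paths(settings):
    """Return the names in `settings` (name -> path) whose path does not exist."""
    return [name for name, path in settings.items() if not os.path.exists(path)]


def settings_from_module(module, names=PATH_SETTINGS):
    """Collect the named attributes of a config module into a dict."""
    return {name: getattr(module, name) for name in names}
```

After `import config`, calling `missing_paths(settings_from_module(config))` would return an empty list when all four paths exist, and otherwise the names of the settings that still need attention.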

Usage

Command-line usage: Scripts can be run directly from the command line. In this case, set USE_SLURM = False in config.py and run the scripts in order:

python 1_data_initialize.py
python 2_data_preprocessing.py
python 3_chrom_metrics.py
python 4_analysis.py
python 5_figures.py
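
The scripts can also be chained from a small driver that stops as soon as one step fails, so later steps never run on incomplete output. This is a sketch under assumptions: `run_steps` is a hypothetical helper, and the order shown assumes the numeric prefixes in the file names give the intended sequence.

```python
import subprocess
import sys

# Script names as listed in this README, ordered by their numeric prefix
# (an assumption about the intended sequence).
PIPELINE_STEPS = [
    "1_data_initialize.py",
    "2_data_preprocessing.py",
    "3_chrom_metrics.py",
    "4_analysis.py",
    "5_figures.py",
]


def run_steps(steps, runner=subprocess.run):
    """Run each script with the current interpreter; stop at the first failure.

    Returns the list of steps that finished with exit code 0.
    """
    completed = []
    for step in steps:
        result = runner([sys.executable, step])
        if result.returncode != 0:
            break
        completed.append(step)
    return completed
```

Calling `run_steps(PIPELINE_STEPS)` from the directory that holds the scripts returns the steps that completed, which makes it easy to see where a run stopped.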

Slurm submission: Scripts can also be submitted to a Slurm queue. In this case, set USE_SLURM = True and fill in the following variables in config.py:

USE_SLURM = True
SLURM_WORKING_DIR
CONDA_PATH
CONDA_ENV

Submit to slurm queue using sbatch:

sbatch -D </path/to/slurm/logs> \
    --export=SLURM_WORKING_DIR=</path/to/python/scripts/>,CONDA_PATH=<path/to/conda.sh>,CONDA_ENV=<conda_env_name> \
    scripts/run_pipeline.sh
    scripts/run_pipeline.sh
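
When submitting repeatedly, the same command can be assembled in Python to avoid typos in the comma-separated variable list. The sketch below assumes the three variables are passed through sbatch's `--export` option; `build_sbatch_command` is a hypothetical helper and not part of the repository.

```python
def build_sbatch_command(log_dir, working_dir, conda_path, conda_env,
                         script="scripts/run_pipeline.sh"):
    """Return the sbatch argument list for submitting the pipeline script.

    The variable names match those listed in this README; passing them via
    --export is an assumption about how run_pipeline.sh reads them.
    """
    exports = ",".join([
        f"SLURM_WORKING_DIR={working_dir}",
        f"CONDA_PATH={conda_path}",
        f"CONDA_ENV={conda_env}",
    ])
    # -D sets the working directory of the batch job.
    return ["sbatch", "-D", log_dir, f"--export={exports}", script]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where sbatch is available.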