This repository has been archived by the owner on Jul 1, 2024. It is now read-only.

SARSCoV2_Secstruct_Cons

Code accompanying the preprint "RNA genome conservation and secondary structure in SARS-CoV-2 and SARS-related viruses": https://doi.org/10.1101/2020.03.27.012906

Overview

The top-level Python scripts run the conservation and secondary structure analyses:

  • conservation.py finds conserved intervals in SARS-related viruses and SARS-CoV-2 sequences
  • unstructured.py finds unstructured intervals, and conserved unstructured intervals
  • rnaz_analysis.py analyzes the RNAz screen data and compiles conserved structured intervals
  • alifoldz_analysis.py prepares alignment windows for rscape and alifoldz analysis, and compares alifoldz hits with those from RNAz
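To make "conserved intervals" concrete, here is a minimal sketch of one way to find them in a gap-aligned set of sequences. It is an illustration only, not the implementation in conservation.py: the function name `conserved_intervals`, the `min_len` parameter, and the strict "every sequence identical, no gaps" criterion are all assumptions made for this example.

```python
def conserved_intervals(alignment, min_len=4):
    """Return half-open (start, end) column ranges where all sequences agree.

    `alignment` is a list of equal-length aligned sequences (hypothetical input
    format). `min_len` is an illustrative threshold, not a value from the
    repository.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    intervals = []
    run_start = None  # first column of the current conserved run, if any
    for col in range(length):
        column = {seq[col].upper() for seq in alignment}
        # A column counts as conserved if every sequence has the same
        # nucleotide there and none of them has a gap.
        conserved = len(column) == 1 and "-" not in column
        if conserved and run_start is None:
            run_start = col
        elif not conserved and run_start is not None:
            if col - run_start >= min_len:
                intervals.append((run_start, col))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and length - run_start >= min_len:
        intervals.append((run_start, length))
    return intervals


# Toy alignment: the sequences differ only at column 4.
toy = ["ACGUACGUAC", "ACGUUCGUAC", "ACGUACGUAC"]
print(conserved_intervals(toy, min_len=4))  # [(0, 4), (5, 10)]
```

A real analysis would likely read the alignment from a file (for example with Biopython) and might tolerate some mismatches; the strict criterion here just keeps the sketch short.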

The alignments folder includes starting alignments of SARS-related and SARS-CoV-2 sequences.

The rnaz_data folder includes output from a genome-wide RNAz screen on SARS-related viruses.

The alifoldz folder includes output from alifoldz analysis.

The rscape folder includes output from R-scape analysis.

The scanfold_data folder includes ScanFold output from Andrews et al., bioRxiv 2020.

The example_results folder includes example output files from the top-level Python scripts; running the scripts should reproduce them.

Prerequisites

Python packages, listed in requirements.txt (install with pip install -r requirements.txt):

  • scipy
  • numpy
  • biopython
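Before running the scripts, it can help to confirm that these packages are importable in the active environment. The snippet below is a small convenience sketch and is not part of the repository; the helper name `missing_packages` is hypothetical. Note that biopython is imported under the module name `Bio`.

```python
import importlib.util

# Map each package name (as installed with pip) to the module name used on import.
REQUIRED = {"scipy": "scipy", "numpy": "numpy", "biopython": "Bio"}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required Python packages were found.")
```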

External Daslab dependencies:

  • arnie
  • CONTRAfold 2.0 (used for secondary structure calculations)

External packages:

  • R-scape v1.4.0