This repository has been archived by the owner on Sep 15, 2022. It is now read-only.
A collection of scripts for the detection and visualisation of recombinants from SARS-CoV-2 sequence data.

fischer-hub/recombination

Recombination

A collection of scripts for the detection and visualisation of recombinants from SARS-CoV-2 sequence data.

Visualisation

To run the visualisation script, call visual.sh from the visual directory and pass the directory containing your FASTA or VCF files as the first argument, e.g.:

~/recombination/visual$ ./visual.sh ../data

An output directory can be provided as an optional second argument (the default is recombination/output).
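
As a rough illustration of this calling convention, the sketch below shows one common way a shell script can accept a required input directory and an optional output directory with a default. It is not taken from visual.sh: the function name resolve_dirs is hypothetical, and how the real script resolves the default path is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of positional-argument handling; not the code of visual.sh.
resolve_dirs() {
    if [ "$#" -lt 1 ]; then
        echo "usage: resolve_dirs INPUT_DIR [OUTPUT_DIR]" >&2
        return 1
    fi
    input_dir=$1
    # Fall back to the default named in the README when no second argument is given.
    output_dir=${2:-recombination/output}
    printf '%s\t%s\n' "$input_dir" "$output_dir"
}
```

Calling resolve_dirs ../data prints the input directory followed by the default output directory, while resolve_dirs ../data ../results prints both paths as given.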

If you call visual.sh on one or more FASTA files, the script tries to run covSonar on them to generate a VCF file for visualisation.
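
The following sketch shows how input files in a directory could be told apart by extension, so that FASTA files can be routed to a conversion step and VCF files used directly. The helper names classify_input and list_inputs and the list of accepted extensions are assumptions for illustration; the actual detection logic in visual.sh may differ, and no covSonar call is shown.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a file as "fasta", "vcf" or "other" by its extension.
# The extension list is an assumption, not taken from visual.sh.
classify_input() {
    case "$1" in
        *.fasta|*.fa|*.fna) echo fasta ;;
        *.vcf)              echo vcf ;;
        *)                  echo other ;;
    esac
}

# Print "<type><TAB><file name>" for every regular file in the given directory.
list_inputs() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        printf '%s\t%s\n' "$(classify_input "$f")" "$(basename "$f")"
    done
}
```

Files reported as fasta would first need to be converted to VCF, while files reported as vcf could go straight to annotation.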

Since covSonar does not provide the allele frequency field in its VCF output, the allele frequency is estimated from the AC and AN fields of the VCF file.
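
A minimal sketch of such an estimate is shown below, assuming the frequency is taken as AC divided by AN (allele count over total number of called alleles, as these INFO keys are defined in the VCF specification). The function name estimate_af is hypothetical, the sketch assumes one ALT allele per record with single-valued AC and AN entries, and it is not necessarily how the repository's scripts compute the value.

```shell
#!/usr/bin/env bash
# Sketch: print CHROM, POS, REF, ALT and an estimated AF = AC/AN per VCF record.
# Assumes the INFO column (column 8) holds single-valued AC= and AN= entries.
estimate_af() {
    awk -F'\t' '
        /^#/ { next }                          # skip meta and header lines
        {
            ac = ""; an = ""
            n = split($8, info, ";")           # INFO entries are separated by ";"
            for (i = 1; i <= n; i++) {
                if (info[i] ~ /^AC=/) ac = substr(info[i], 4)
                if (info[i] ~ /^AN=/) an = substr(info[i], 4)
            }
            # Skip records where the estimate is undefined.
            if (ac == "" || an == "" || an + 0 == 0) next
            printf "%s\t%s\t%s\t%s\t%.3f\n", $1, $2, $4, $5, ac / an
        }
    ' "$1"
}
```

For a record with AC=3 and AN=4 the estimate is 0.750. For a VCF built from a single haploid sequence, AN is 1, so the estimate can only be 0 or 1.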

The VCF files are then annotated using snpEff and formatted with SnpSift for later visualisation by the Heatmap.R script.

NOTE: Both covSonar and snpEff use the SARS-CoV-2 isolate Wuhan-Hu-1 complete genome as the reference for variant detection.

NEW:

  • Added support for multi-FASTA files, e.g.: ~/recombination/visual$ ./visual.sh -m my_multi_fasta_file.fasta
  • Added support for metadata sheets (TSV), e.g.: ~/recombination/visual$ ./visual.sh -m my_multi_fasta_file.fasta -t my_metadata_file.tsv. If a metadata file is provided, the heatmap sorts sequences by the date associated with them in that file.
  • The script now makes a heatmap for all sequences merged into one VCF, which shows the different allele frequencies, and also a heatmap with all sequences shown separately, where the allele frequency is either 1 or 0.
  • Genomemap plots are currently not working. Please use the code from branch old to generate these plots. NOTE: None of the features listed under NEW are available there.
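
The date-based ordering mentioned for metadata sheets can be sketched as follows. The layout of the metadata file is not specified here, so the sketch assumes a tab-separated file with one header line, the sequence name in column 1 and an ISO 8601 date (YYYY-MM-DD) in column 2. The function name names_by_date is hypothetical and the sketch is independent of how Heatmap.R performs the sorting.

```shell
#!/usr/bin/env bash
# Sketch: list sequence names ordered by their associated date.
# Assumed layout (not confirmed by the README): TSV with a header line,
# column 1 = sequence name, column 2 = date in YYYY-MM-DD format.
names_by_date() {
    tab=$(printf '\t')
    # ISO 8601 dates sort chronologically when compared as plain text;
    # ties are broken by sequence name for a stable, reproducible order.
    tail -n +2 "$1" | LC_ALL=C sort -t "$tab" -k2,2 -k1,1 | cut -f1
}
```

Dates in other formats (for example DD.MM.YYYY) would need to be converted before a plain text sort gives chronological order.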
