Sequence alignment with MAFFT or Muscle #3

rvosa · 2020-04-11T11:50:05Z

The experience reported in nextstrain/ncov#268 shows that doing the MSA in one big run eventually becomes prohibitive. This was not a problem for the set of 400 GenBank genomes, but it will become one as those submissions increase (or when we add GISAID data).
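One way to avoid a single big run is to decompose the input into smaller batches first. The following is a minimal sketch of that step only, assuming plain FASTA input; the function names `read_fasta` and `batch_records` are hypothetical, the batch size is arbitrary, and nothing in this issue specifies how a decomposed alignment would actually be set up. Aligning each batch and combining the per-batch alignments is not attempted here.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def batch_records(records: Iterable[Record], batch_size: int) -> List[List[Record]]:
    """Group records into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    batches, current = [], []
    for record in records:
        current.append(record)
        if len(current) == batch_size:
            batches.append(current)
            current = []
    if current:
        batches.append(current)  # final, possibly smaller, batch
    return batches
```

With a file on disk this could be used as `batch_records(read_fasta(open(path)), 100)`, writing each batch to its own FASTA file before submitting it to the aligner.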

MAFFT has the virtue of being the current standard (used e.g. by Rambaut et al.), but it might be slower than Muscle (this is only @rvosa's subjective experience). Both can be run on the CIPRES cluster. Test and decide.
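For the "test" part, a rough local timing harness could look like the sketch below. It is an assumption-laden illustration, not the method used in this issue: `time_command` is a hypothetical helper, the runs are local rather than on CIPRES, and the example command lines assume a local `mafft` and a MUSCLE v3 style `muscle -in ... -out ...` invocation (newer MUSCLE versions use different options).

```python
import subprocess
import sys
import time
from typing import List, Optional


def time_command(cmd: List[str], stdout_path: Optional[str] = None) -> float:
    """Run cmd to completion and return the elapsed wall-clock seconds.

    If stdout_path is given, standard output is written to that file
    (MAFFT prints its alignment to stdout); otherwise it is discarded.
    Raises subprocess.CalledProcessError if the command fails.
    """
    start = time.perf_counter()
    if stdout_path is None:
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    else:
        with open(stdout_path, "w") as out:
            subprocess.run(cmd, check=True, stdout=out)
    return time.perf_counter() - start


# Assumed local invocations (not taken from this issue); adjust to the
# installed tool versions before use.
FASTA = "data/genomes/sars-cov-2.fasta"
MAFFT_CMD = ["mafft", "--anysymbol", FASTA]
MUSCLE_CMD = ["muscle", "-in", FASTA, "-out", "output.muscle"]
```

Calling `time_command(MAFFT_CMD, "output.mafft")` and `time_command(MUSCLE_CMD)` on the same input would give one wall-clock figure per tool; a single run each is only a crude comparison.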

rvosa · 2020-04-13T17:19:48Z

cipresrun \
     -y data/cipres_appinfo.yml \
     -t MAFFT_XSEDE \
     -p vparam.anysymbol_=1 \
     -i data/genomes/sars-cov-2.fasta \
     -o data/genomes/output.mafft

rvosa added this to the Full workflow milestone Apr 11, 2020

This was referenced Apr 11, 2020:

Rerunnable workflow as CWL #5 (Open)

Push container to Docker hub #6 (Closed)

rvosa changed the title ~~Decide on alignment: MAFFT vs Muscle, decomposed or one big run?~~ Sequence alignment with MAFFT or Muscle Apr 11, 2020

rvosa closed this as completed Apr 13, 2020
