A script that aligns fasta files using mafft and evaluates the alignments using Gblocks. It returns a score that indicates each alignment's quality, based on the Gblocks result.
This script depends on external software to run, including mafft and Gblocks.
The local paths to these tools must be defined in the config file.
All of the above tools and packages can be installed via the conda environment.
Argument | Description |
---|---|
-c | (full) path to config file |
-t | type of sequences used [protein, dna] |
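For illustration, the two arguments above could be parsed as in the following minimal sketch. This is not necessarily how AlScores.py handles them: the names `build_parser`, `config` and `seq_type` are hypothetical, and marking both options as required is an assumption.

```python
import argparse


def build_parser():
    """Build a parser for the two documented options (-c and -t)."""
    parser = argparse.ArgumentParser(
        description="Align fasta files with mafft and score them with Gblocks."
    )
    # Assumption: both options are mandatory; the README does not say so.
    parser.add_argument("-c", dest="config", required=True,
                        help="(full) path to config file")
    parser.add_argument("-t", dest="seq_type", required=True,
                        choices=["protein", "dna"],
                        help="type of sequences used")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is
# reproducible when run on its own.
args = build_parser().parse_args(["-c", "config.txt", "-t", "dna"])
print(args.config, args.seq_type)
```

Restricting `-t` with `choices` makes argparse reject any value other than `protein` or `dna` before the analysis starts.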
AlScore needs a config file to run. This file contains the paths to the required tools (mafft, Gblocks) as well as the path to the directory that contains the fasta files to be analysed. Note that all fasta files must end with the suffix .fasta.
Please change the provided config.txt file accordingly before running your own analysis.
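Because only files ending in .fasta are picked up, it can help to check the input directory before running the analysis. The snippet below is a small sketch of such a check, not part of AlScores.py itself; the function name `find_fasta_files` is hypothetical, and the directory passed to it is whichever one you set in your config file.

```python
from pathlib import Path


def find_fasta_files(directory):
    """Split the files in `directory` into .fasta files and everything else."""
    fasta, other = [], []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # ignore sub-directories
        # The suffix check is case-sensitive: "x.FASTA" or "x.fa" do not match.
        if path.suffix == ".fasta":
            fasta.append(path)
        else:
            other.append(path)
    return fasta, other
```

Calling `find_fasta_files("path/to/fasta_dir")` returns two lists, so any file in the second list can be renamed before the run if it is meant to be analysed.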
python AlScores.py -c config.txt -t dna
The output file will be called final_scores.tsv and will contain two columns: filename and score.
The score ranges from 0 to 1, with a higher score meaning better alignment quality.
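As a sketch of how the output could be inspected afterwards, the following reads final_scores.tsv and ranks the alignments from best to worst. It assumes the file is tab-separated with the two columns described above; whether the file starts with a header row is an assumption, so it is exposed as the hypothetical `has_header` parameter.

```python
import csv


def load_scores(path, has_header=True):
    """Return (filename, score) pairs sorted from highest to lowest score."""
    scores = []
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        if has_header:
            next(reader, None)  # assumption: first row holds column names
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            scores.append((row[0], float(row[1])))
    # Higher score means better alignment quality, so sort in descending order.
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

For example, `load_scores("final_scores.tsv")[:10]` would give the ten best-scoring alignments.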
Who
Paschalis Natsidis, PhD candidate (p.natsidis@ucl.ac.uk);
Where
Telford Lab, UCL;
ITN IGNITE;
When
October 2019;