BatchPAML

Introduction

This script identifies positive selection on a specific branch for batches of gene families using the PAML branch-site model.

Requirements

Biopython == 1.79
paml == 4.9j
macse == v2.04
Pandas == 1.24

PS: Other versions should work as well.

How to use

python BatchPAML.py -h

Pattern "universe"

For general use, you need to provide a file containing two columns, without a header. The first row below only labels the columns and should not appear in the file:

Family_name	MSA_file
Family1	Family1_aligned.fasta
Family2	Family2_aligned.fasta

I recommend using macse to build the alignments.
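
As an illustration of the two-column list described above, the following is a minimal sketch that builds such a file from a directory of alignments. The function name `write_family_list`, the `_aligned.fasta` suffix convention, and the tab separator are assumptions made for this example; they are not part of BatchPAML itself.

```python
from pathlib import Path


def write_family_list(msa_dir, out_path, suffix="_aligned.fasta"):
    """Write one 'family<TAB>msa_path' line per alignment found in msa_dir.

    Hypothetical helper: the suffix and the tab separator are assumptions.
    """
    rows = []
    for msa in sorted(Path(msa_dir).glob(f"*{suffix}")):
        # Family1_aligned.fasta -> Family1
        family = msa.name[: -len(suffix)]
        rows.append(f"{family}\t{msa}")
    # No header line is written, matching the format described above.
    Path(out_path).write_text("\n".join(rows) + ("\n" if rows else ""))
    return rows
```

Calling `write_family_list("alignments", "families.tsv")` would list every file in `alignments` whose name ends in `_aligned.fasta`, one family per line.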

Pattern "orthofinder"

This pattern was designed for the single-copy families identified by OrthoFinder. You just need to prepare a fasta file containing all the CDS sequences that correspond to the protein sequences used in OrthoFinder, and specify some result files from OrthoFinder.
PS: The species tree constructed by OrthoFinder is sufficient. There is no need to use a gene tree for each gene family.

Output

The result file contains two columns, without a header:

Family_name	p-value
Family1	1.0
Family2	0.5921
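
A result file of this shape can be post-processed with a few lines of Python. The sketch below is only an example: the function names and the 0.05 threshold are hypothetical, and it assumes the two columns are separated by whitespace (for instance a tab).

```python
def read_results(path):
    """Return {family_name: p_value} from a two-column, header-less file.

    Assumes whitespace-separated columns; this is an assumption, not a
    documented property of the BatchPAML output.
    """
    results = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) != 2:
                continue  # skip blank or malformed lines
            family, p_value = fields
            results[family] = float(p_value)
    return results


def significant_families(results, alpha=0.05):
    """Return (family, p_value) pairs with p_value < alpha, smallest first."""
    hits = [(fam, p) for fam, p in results.items() if p < alpha]
    return sorted(hits, key=lambda item: item[1])
```

When many families are tested at once, a multiple-testing correction is usually worth considering before interpreting the filtered list.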

Advantages

Converts MSA files in fasta format to PAML format automatically.
Unroots rooted trees automatically.
Runs in parallel using multiple threads.
Allows the codon table to be specified manually (using the NCBI table number).
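
To show what the fasta-to-PAML conversion involves, here is a minimal sketch using only the standard library. It is not the code used inside BatchPAML; the function names are hypothetical, and it assumes the sequential PHYLIP-like layout that PAML reads, where a header line gives the number of sequences and the alignment length, and each name is separated from its sequence by at least two spaces.

```python
def read_fasta(text):
    """Parse fasta text into an ordered list of (name, sequence) pairs."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Keep only the first word of the header as the sequence name.
            name = (line[1:].split() or [""])[0]
            records.append([name, ""])
        elif records:
            records[-1][1] += line
    return [(name, seq) for name, seq in records]


def fasta_to_paml(text):
    """Render an aligned fasta as a sequential PHYLIP-like block."""
    records = read_fasta(text)
    if not records:
        raise ValueError("no sequences found")
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences are not aligned to the same length")
    lines = [f" {len(records)} {lengths.pop()}"]
    for name, seq in records:
        # Two spaces separate the name from the sequence.
        lines.append(f"{name}  {seq}")
    return "\n".join(lines) + "\n"
```

For example, `fasta_to_paml(open("Family1_aligned.fasta").read())` returns the converted text, which can then be written to a file for codeml.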

Notes

You need to mark the foreground branch in the tree manually using "#1". Please do not insert a space between the species name and the marker.
Example:
(((Human#1, chimpanzee),Fish),Fly);✓
(((Human #1, chimpanzee),Fish),Fly);✗
Although this script searches for the PAML binary in PATH and should be cross-platform, I recommend specifying the path to the binary manually.
The multiple sequence alignment (MSA) must be in fasta format.
If the MSA has a potential frameshift, the family will be skipped.
Intermediate files are stored in 'BatchPAML_Results' in the working directory.
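
The foreground marker rule in the notes above can be checked before launching a batch. This is a small sketch under the assumption that the tree is available as a Newick string; the function `check_foreground_marker` is hypothetical and not part of BatchPAML.

```python
import re


def check_foreground_marker(newick, marker="#1"):
    """Return a list of problems with the foreground marker (empty if none)."""
    problems = []
    if marker not in newick:
        problems.append(f"no {marker} marker found")
    # Whitespace directly before the marker, as in "Human #1", is not allowed.
    if re.search(r"\s" + re.escape(marker), newick):
        problems.append(f"whitespace before {marker}")
    return problems
```

With the two example trees from the notes, the first returns an empty list and the second reports the whitespace before the marker.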

Future

Accept MSA files in PAML format directly
Fix the frameshift problem
More flexible file storage paths
Possibly become part of a comparative genomics analysis pipeline

Contact

If you have any problems or suggestions, please feel free to contact me at njbxhzy at hotmail.com

