PaPaRa 2.0 with MPI

Implementation of the PaPaRa 2.0 algorithm, with rudimentary MPI support. See http://sco.h-its.org/exelixis/web/software/papara/index.html for details.

Basic Instructions

Build with sh build_papara2.sh, which produces the executable papara, or, if you want MPI support, with sh build_papara2_mpi.sh, which produces the executable papara_mpi. To compile the sources you need a reasonably recent version of the Boost libraries (www.boost.org).

Invoke PaPaRa using

./papara -t <ref tree> -s <phylip RA> -q <fasta QS>

The phylip file (option -s <phylip RA>) must contain the reference alignment (RA), consistent with the reference tree (option -t <ref tree>). The FASTA file (option -q <fasta QS>) contains the unaligned query sequences (QS). Optionally, all sequences that are in <phylip RA> but do not occur in <ref tree> are also interpreted as QS.

The alignment parameters can be changed with the optional option -p <user options>. <user options> is a string of the form <gap_open>:<gap_extend>:<mismatch>:<match_cgap>, so the default parameters given in the paper correspond to -p -3:-1:2:-3.
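To make the layout of the -p string concrete, here is a minimal Python sketch that builds and parses it. The helper names format_user_options and parse_user_options are hypothetical and not part of PaPaRa, and the sketch assumes that all four scores are integers, as they are in the default -3:-1:2:-3.

```python
# Hypothetical helpers (not part of PaPaRa) for the -p <user options> string.
# Assumption: all four scores are integers, as in the default -3:-1:2:-3.
FIELD_NAMES = ("gap_open", "gap_extend", "mismatch", "match_cgap")


def format_user_options(gap_open=-3, gap_extend=-1, mismatch=2, match_cgap=-3):
    """Join the scores as <gap_open>:<gap_extend>:<mismatch>:<match_cgap>."""
    values = (gap_open, gap_extend, mismatch, match_cgap)
    return ":".join(str(int(v)) for v in values)


def parse_user_options(text):
    """Split a -p string into a dict; reject anything that is not four fields."""
    fields = text.split(":")
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected 4 colon-separated fields, got %d" % len(fields))
    return dict(zip(FIELD_NAMES, (int(f) for f in fields)))


if __name__ == "__main__":
    print(format_user_options())            # the defaults given in the paper
    print(parse_user_options("-3:-1:2:-3"))
```

Validating the string before a long run avoids discovering a malformed -p value only after the job has been queued.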

The output alignment is written to papara_alignment.default; you can change the file suffix "default" by supplying a run name with the option -n. You can invoke the multithreaded version by adding the option -j <num threads>.
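The options described so far can be combined as in the following Python sketch, which assembles the argument list for a single run and derives the output file name from the run name. The function names and the example file names are hypothetical; only the flags -t, -s, -q, -p, -n and -j and the papara_alignment.<run name> pattern come from the text above.

```python
# Hypothetical helpers (not part of PaPaRa); only the flags and the output
# name pattern are taken from the instructions above.
def build_papara_command(tree, ref_alignment, queries,
                         user_options=None, run_name=None, threads=None):
    """Return the argument list for one ./papara invocation."""
    cmd = ["./papara", "-t", tree, "-s", ref_alignment, "-q", queries]
    if user_options is not None:
        cmd += ["-p", user_options]
    if run_name is not None:
        cmd += ["-n", run_name]
    if threads is not None:
        cmd += ["-j", str(threads)]
    return cmd


def expected_output_name(run_name="default"):
    """Name of the alignment file PaPaRa writes for the given run name."""
    return "papara_alignment." + run_name


if __name__ == "__main__":
    # Example file names are placeholders.
    print(" ".join(build_papara_command("ref.tre", "ref.phy", "qs.fasta",
                                        run_name="test1", threads=4)))
    print(expected_output_name("test1"))
```

The list form can be passed directly to subprocess.run, which sidesteps shell quoting of file names.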

MPI Support

This version is intended for cluster usage if you want to align multiple files of roughly the same size at the same time. It is ideal for equally sized chunks of bigger alignments.

The MPI version is invoked using

mpirun -n T ./papara_mpi -t <ref tree> -s <phylip RA> -q <fasta QS 1>[,<fasta QS 2>...]

where T is the number of MPI nodes to run on. Note that each node can still use multiple threads, so you can combine this with the -j option.

The FASTA query files must be separated by commas, without whitespace. This means that your file names cannot contain commas. If you specify fewer files than MPI nodes, the surplus nodes will do nothing. Caveat: if you specify more files than nodes, the surplus files will not be processed! This will hopefully change in the future; right now, this is a crude first MPI version of PaPaRa. Also be aware that the logging output of all nodes is interleaved and thus usually unreadable. Use the per-node log files to see what each node outputs.
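Because surplus query files are silently skipped, it can help to check the file list before launching. The following Python sketch builds the mpirun argument list and refuses inputs that the rules above rule out. The function name build_mpi_command and the example file names are hypothetical; the checks simply restate the constraints described in this section.

```python
# Hypothetical helper (not part of PaPaRa) that restates the constraints above:
# no commas in file names, and no more query files than MPI nodes.
def build_mpi_command(tree, ref_alignment, query_files, num_nodes, threads=None):
    """Return the argument list for one mpirun ./papara_mpi invocation."""
    if not query_files:
        raise ValueError("at least one query file is required")
    for name in query_files:
        if "," in name:
            raise ValueError("file name contains a comma: %r" % name)
    if len(query_files) > num_nodes:
        # Surplus files would not be processed, so fail early instead.
        raise ValueError("%d query files but only %d MPI nodes"
                         % (len(query_files), num_nodes))
    cmd = ["mpirun", "-n", str(num_nodes), "./papara_mpi",
           "-t", tree, "-s", ref_alignment,
           "-q", ",".join(query_files)]  # comma-separated, no whitespace
    if threads is not None:
        cmd += ["-j", str(threads)]
    return cmd


if __name__ == "__main__":
    # Example file names are placeholders.
    print(" ".join(build_mpi_command("ref.tre", "ref.phy",
                                     ["chunk1.fasta", "chunk2.fasta"], 2)))
```

Choosing num_nodes equal to the number of query files keeps every node busy and guarantees that every file is processed.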
