A Toolkit for ARB to Integrate Custom Databases and Externally built Phylogenies
GetDatabase
ListAccessions.txt
MFS.tree
MFS_Align.fasta
MFS_Field_labels.txt
MFS_UID.fasta
MFS_UID.tree
MFS_import_filter.ift
MFS_metaData.txt
README
Supplementary_Final.pdf
TreeLabels_Mapped_New.txt
Treelabels_Orig.txt
buildFilter.py
cddsrv.cn3
cddsrv.cn4
getAccession.py
rename_tree_leaves.py

README

Tutorials using ARB_Toolkit are available via:
Supplementary_Final.pdf
http://www.ece.drexel.edu/gailr/EESI/tutorial.php

Description:
Researchers are continually amassing biological sequence data. The computational approaches that ecologists use to organize this data (e.g., alignment and phylogeny construction) typically scale nonlinearly in execution time with the size of the dataset. Because many molecular studies involve massive datasets, this is often a bottleneck in processing experimental data. To keep up, ecologists must choose between continually upgrading expensive in-house computer hardware and outsourcing the most demanding computations to the cloud. Outsourcing is attractive because it is the less expensive option, but it does not necessarily allow direct user interaction with the data for exploratory analysis. Desktop analytical tools such as ARB are indispensable for that purpose, but they do not necessarily offer a convenient way to coordinate and integrate datasets between local and outsourced destinations. Researchers are therefore left with an undesirable tradeoff between computational throughput and analytical capability.

To mitigate this tradeoff, we introduce a software package that combines the interactive exploratory tools offered by ARB with the computational throughput of cloud-based resources. Our pipeline serves as middleware between the desktop and the cloud: it lets researchers build local custom databases containing sequences and metadata from multiple resources, and it provides a method for linking data outsourced for computation back to the local database. A tutorial implementation of the toolkit is provided in the supplementary material.
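
Example (illustrative only):
As a rough illustration of what "linking data outsourced for computation back to the local database" can involve, the sketch below relabels the leaves of an externally built Newick tree so that they match identifiers used in a local database. It is not the code in rename_tree_leaves.py. The function name, the label-to-ID mapping, and the sample tree are all hypothetical, and the sketch assumes unquoted leaf labels.

```python
import re


def rename_leaves(newick: str, mapping: dict) -> str:
    """Return a copy of a Newick string with leaf labels replaced via `mapping`.

    Hypothetical helper: labels missing from `mapping` are left unchanged,
    and quoted labels are not handled.
    """
    # A leaf label directly follows '(' or ',' and runs until a Newick delimiter.
    # Internal node labels follow ')' and are therefore not touched.
    leaf = re.compile(r"(?<=[(,])\s*([^():,;\s]+)")
    return leaf.sub(lambda m: mapping.get(m.group(1), m.group(1)), newick)


# Hypothetical example: original tree labels mapped to database identifiers.
label_to_uid = {"seqA": "UID0001", "seqB": "UID0002"}
tree = "(seqA:0.1,(seqB:0.2,seqC:0.3):0.4);"
renamed = rename_leaves(tree, label_to_uid)
print(renamed)
```

Keeping unmapped labels unchanged, rather than raising an error, makes it easy to spot sequences that have no counterpart in the local database after the relabeled tree is imported.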

Contributors:
Steve Essinger
Erin Reichenberger
Chris Blackwood
Gail Rosen