Harvey Feng edited this page Jul 9, 2015 · 29 revisions

eXpress-D

In eXpress-D, we've implemented a distributed EM solution to the fragment assignment problem using Spark, a data analytics framework that scales by using compute clusters in datacenters ("the cloud"). eXpress-D is based on the eXpress model but achieves better accuracy because it uses batch EM for optimization.
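As a rough illustration of what batch EM for fragment assignment looks like, here is a minimal single-machine sketch in Python. It is not eXpress-D's code: it ignores target lengths, alignment likelihoods, and the Spark distribution layer, and the names `batch_em` and `compat` are hypothetical, chosen only for this example. Each fragment lists the targets it is compatible with; every iteration makes a full pass over all fragments (the "batch" part) before updating the abundance estimates.

```python
def batch_em(compat, n_targets, n_iters=50):
    """Estimate relative target abundances with batch EM.

    compat: list where compat[i] is the list of target indices that
            fragment i could have originated from (hypothetical input format).
    Simplifying assumption: all targets have equal length, so the
    assignment probability is proportional to abundance alone.
    """
    # Start from a uniform abundance estimate.
    rho = [1.0 / n_targets] * n_targets
    for _ in range(n_iters):
        counts = [0.0] * n_targets
        # E-step: split each fragment across its compatible targets
        # in proportion to the current abundance estimates.
        for targets in compat:
            total = sum(rho[t] for t in targets)
            for t in targets:
                counts[t] += rho[t] / total
        # M-step: renormalize the expected counts into abundances.
        n = sum(counts)
        rho = [c / n for c in counts]
    return rho


# Two fragments map uniquely to target 0, one uniquely to target 1,
# and one is ambiguous between the two.
fragments = [[0], [0], [0, 1], [1]]
estimates = batch_em(fragments, n_targets=2)
```

In this toy input the ambiguous fragment is gradually assigned mostly to target 0, since the uniquely mapped fragments already favor it. In a distributed setting the E-step loop is the part that would be spread across a cluster, with the per-target counts summed before the M-step.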

The sections below offer guides to running eXpress-D locally and on EC2. More advanced users can check out the tuning and configuration guides, which document the eXpress-D, Spark, and JVM parameters that can be used to optimize eXpress-D performance.

User Documentation

All issues and source code changes (e.g., commits and pull requests) are tracked on GitHub. For any questions about these guides, or about eXpress-D in general, you can post on the eXpress users group.

Setting Up and Running eXpress-D

Running on EC2: Launch EC2 clusters and run eXpress-D on them.

Notes on Configuration and Tuning

Publications

Roberts A, Feng H, and Pachter L (2013). Fragment assignment in the cloud with eXpress-D. BMC Bioinformatics. [link]

Roberts A (2013). Thesis: Ambiguous fragment assignment for high-throughput sequencing experiments. EECS Department, University of California, Berkeley. [link]

Roberts A and Pachter L (2013). Streaming fragment assignment for real-time analysis of sequencing experiments. Nature Methods. [link]
