In eXpress-D, we've implemented a distributed EM solution to the fragment assignment problem using Spark, a data analytics framework that scales by leveraging compute clusters in datacenters ("the cloud"). eXpress-D is based on the eXpress model but achieves better accuracy because it uses batch EM for optimization.
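To make the idea of batch EM for fragment assignment concrete, here is a minimal single-machine sketch in Python. It is not eXpress-D's implementation: the function name `batch_em`, the toy input, and the simplified model (no target lengths, bias, or sequencing-error terms) are all assumptions made for illustration. Each iteration passes over every fragment (the "batch"), splits it across its compatible targets in proportion to the current abundance estimates, and then re-estimates the abundances.

```python
from collections import defaultdict


def batch_em(alignments, targets, iterations=100):
    """Estimate relative target abundances from ambiguous alignments.

    alignments: one non-empty list per fragment, naming the targets that
                fragment is compatible with (hypothetical toy input format).
    targets:    all target names.
    """
    # Start from a uniform abundance estimate.
    theta = {t: 1.0 / len(targets) for t in targets}
    n = len(alignments)

    for _ in range(iterations):
        expected = defaultdict(float)

        # E-step: fractionally assign every fragment to its compatible
        # targets, weighted by the current abundance estimates.
        for compatible in alignments:
            total = sum(theta[t] for t in compatible)
            for t in compatible:
                expected[t] += theta[t] / total

        # M-step: new abundance = share of expected fragment counts.
        theta = {t: expected[t] / n for t in targets}

    return theta


# Two fragments map uniquely to A, one uniquely to B, one to both.
theta = batch_em([["A"], ["A"], ["A", "B"], ["B"]], ["A", "B"])
print(theta)  # converges toward A = 2/3, B = 1/3
```

In this sketch the E-step treats each fragment independently given the current estimates, which is the kind of per-fragment work a cluster framework can spread across machines.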
The sections below offer guides to running eXpress-D locally and on EC2. More advanced users can check out the tuning and configuration guides, which document the eXpress-D, Spark, and JVM parameters that can be used to optimize eXpress-D performance.
All issues and source code changes (e.g., commits and pull requests) are tracked on GitHub. For any questions about these guides, or about eXpress-D in general, you can post on the eXpress users group.
Running on EC2: Launch EC2 clusters and run eXpress-D on them.
Roberts A, Feng H, and Pachter L (2013). Fragment assignment in the cloud with eXpress-D. BMC Bioinformatics. [link]
Roberts A (2013). Thesis: Ambiguous fragment assignment for high-throughput sequencing experiments. EECS Department, University of California, Berkeley. [link]
Roberts A and Pachter L (2013). Streaming fragment assignment for real-time analysis of sequencing experiments. Nature Methods. [link]