Since T.H. Morgan and his associates in the famous Fly Room began their foundational work in genetics using the mighty fruit fly Drosophila melanogaster, numerous strains of D. melanogaster with diverse genetic backgrounds have been used in laboratories worldwide. These include transgenic, deficiency, RNAi, genome-editing, and balancer strains, as well as wild-type strains (e.g., Oregon-R, w1118, Canton-S). The genetic backgrounds of these strains differ from that of the reference strain ISO1 in ways that are unknown. These uncharacterized differences confound the interpretation of experiments that investigate the genotype-phenotype relationship using non-reference laboratory strains. To address this problem, we introduce the Drosophila Laboratory Pangenome Database (DLPD), an ever-growing collection of reference genome assemblies of popular D. melanogaster laboratory strains. We will release eleven genome assemblies initially and add more in the future, depending on feedback from the community.
Do you work with a popular strain of D. melanogaster that doesn't have a high-quality reference genome assembly? Please submit the following form to request that we sequence your strain of interest for inclusion in our database: Google Forms DLPD Request (Strains requested by multiple labs will be prioritized)
You can find the genome assemblies at the following Google Drive link: https://drive.google.com/drive/folders/1NiBAB0Nvd9a2Wd0-d5jWRBmXSUuGFpvj
Stay tuned: we plan to host these assemblies on a genome browser soon for ease of use and access. Meanwhile, please send any requests for the raw reads to tdmillar@tamu.edu or mahul@tamu.edu.
Strain | N50* (Mb) | L50* | Significance |
---|---|---|---|
ISO1 (v6.53) | 21.4 | 3 | The primary reference assembly for D. melanogaster |
BL5905 | 24.21 | 3 | w1118 wild-type strain |
BL3605 | 24.21 | 4 | w1118 wild-type strain |
BL5 | 22.97 | 4 | Oregon-R-C wild-type strain |
BL64349 | 24.18 | 3 | Canton-S wild-type strain |
BL36303 | 24.16 | 3 | phiC31 integrase-mediated transformation |
BL36304 | 23.93 | 4 | phiC31 integrase-mediated transformation |
BL54591 | 23.63 | 3 | Expresses Cas9 protein under control of nanos regulatory sequences |
BL25211 | 24.46 | 3 | Used in modENCODE functional genomics experiments |
BL8765** | 24.58 | 3 | GAL4 expression in the nervous system and CyO balancer |
BL3954** | 22.26 | 4 | GAL4 expression driven by Act5C promoter, TM6B balancer |
BL36283** | 22.91 | 4 | piggyBac mobilization, FRT site, balancers FM7a and TM3 |
BL4737 | 24.31 | 3 | D. simulans strain. Produces fertile female offspring when crossed with D. melanogaster |
*N50 and L50 are measures used to evaluate the contiguity of genome assemblies. The contig N50, given here in megabase pairs (Mb), quantifies how well the assembly process has pieced the genome together: 50% of the total assembly length is contained in contigs (pieces) of length N50 or longer. The L50 is the smallest number of contigs whose combined length makes up that same 50% of the assembly. A higher N50 and a lower L50 indicate a more contiguous assembly. ISO1 reference assembly statistics are included for comparison.
**Hi-C contact data are being used to phase and improve the de novo genome assemblies of the balancer chromosomes in these strains.
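For readers who want to check the N50 and L50 values on a downloaded assembly, the definitions in the footnote above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used to produce the table: the function names `fasta_lengths` and `n50_l50` and the file name in the usage note are hypothetical, and the code assumes an uncompressed FASTA file with one record per contig.

```python
from typing import Iterable, List, Tuple


def fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the sequence length of every record in a FASTA stream.

    Hypothetical helper: assumes plain-text FASTA, one record per contig.
    """
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50_l50(contig_lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths in base pairs."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig is required")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: the cumulative sum always reaches half")


# Toy example: total length 100, so the 50% mark is reached by the
# second-longest contig (40 + 30 = 70 >= 50).
toy_n50, toy_l50 = n50_l50([40, 30, 20, 10])
print(toy_n50, toy_l50)  # 30 2
```

To apply this to an assembly, one could open the file and pass the handle in, e.g. `with open("assembly.fasta") as fh: print(n50_l50(fasta_lengths(fh)))`, where `assembly.fasta` stands in for whichever file you downloaded. The result is in base pairs; divide the N50 by 1,000,000 to compare it with the Mb values in the table.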
We are writing a manuscript describing DLPD, and we'll post the citation here once the paper is ready. Meanwhile, you can use this resource for your work. Please let us know if you want to publish results utilizing this resource before we have a manuscript.