Robert James Schaefer, PhD
- Email: firstname.lastname@example.org
- Github: github.com/schae234
- Twitter: @CSciBio
- ORCID: 0000-0001-9455-5805
Appointments and Positions
- 2017 - Current: Founder of Linkage Analytics, Denver, CO
- 2018 - Current: Postdoctoral Associate, Department of Population Medicine, University of Minnesota
- 2016-2018: USDA NIFA Postdoctoral Research Fellow, Department of Veterinary Population Medicine, University of Minnesota
- 2015 - 18: USDA NIFA AFRI Postdoctoral Fellow, University of Minnesota, St Paul, MN
- 2010 - 15: Ph.D., Biomedical Informatics and Computational Biology, University of Minnesota, Minneapolis, MN
- 2006 - 10: B.S., Computer Science, University of Minnesota, Institute of Technology, Minneapolis, MN
Community and Service
- Guest Associate Editor for Frontiers in Integrative Genomics and Network Biology in Livestock and other Domestic Animals
- Host and Mentor for Mozilla Open Leadership Program
- Co-Chair of FAANG Integrative Genomics and Network Biology Working Group
- Site Host for Mozilla Global Code Sprint (4 years)
- Figshare Ambassador
Funding
- 2018 (PI): $7,445 - XSEDE (TG-MCB180114) - Cloud based containers and workflows for genomic tools in the horse.
- 2017 - Current (Co-PI): USDA NIFA (2017-67015-26296) - Tools to Link Phenotype to Genotype in the Horse.
- 2015 - 2018 (PI) : $150,000 - USDA NIFA Postdoctoral Fellowship - Discovering causal variants for complex disease using functional networks in the horse
Research Interests
One consequence of next-generation DNA sequencing technology has been its rapid adoption in non-model organisms. This surge in technology has enabled us to examine highly complex traits in non-model organisms that have a substantial effect on human well-being. However, techniques for analyzing and applying information derived from next-generation sequencing data do not always translate directly to non-model species. My research interests lie in developing novel approaches that couple next-generation sequencing data with machine learning and statistical modeling to extract meaningful insight into complex biological systems.
Discovering causal variants for complex disease using functional networks in the horse
The recent availability of high-throughput technologies in agricultural animals provides an opportunity to advance our understanding of complex, agriculturally important traits. Genome-wide association studies (GWAS) have identified thousands of loci linked to agriculturally important traits; however, in most cases the causal gene remains unknown. Assessing a single data type can miss complex models that involve variation across multiple levels of biological regulation. Integrating several sources of unbiased genomic information allows for efficient ranking of interesting candidate regions discovered by GWAS. We are building tools that integrate available sources of genomic data in the horse into a multi-staged data integration model for prioritizing QTL candidate genes. Using these tools, we are investigating Equine Metabolic Syndrome (EMS) in a disease-specific (case-control) meta-dimensional model by integrating whole-genome SNP data, muscle and adipose RNAseq, and metabolomic data from horses phenotyped for EMS.
We hypothesize that an integrated, network-based approach will better explain the genotype-to-phenotype relationship of EMS than any single dataset alone. Linking phenotype to causal genes is critical to understanding the biology underlying traits and, in the context of disease, to identifying potential preventative measures and therapeutic targets. The results of this study have the potential to substantially expand our understanding of the molecular and genetic factors that contribute to the pathophysiology of EMS and to improve our ability to predict disease risk. Furthermore, since this approach is generalizable to any phenotype of interest, our long-term goal is to develop tools that allow integration of genomic and other high-dimensional datasets to better understand complex phenotypic traits, and to extend them to other agricultural animals. We will deploy these tools using CyVerse as a developmental platform, ensuring that any research group generating association data will be able to use our tools.
Integrating Co-Expression Networks with GWAS to Detect Causal Genes For Agronomically Important Traits
High-throughput technologies in agricultural species provide an opportunity to advance our understanding of complex, agronomically important traits. Genome-wide association studies (GWAS) have identified thousands of loci linked to these traits; however, in most cases the causal genes remain unknown. Analysis of a single data type is typically unsatisfactory in explaining complex traits that exhibit variation across multiple levels of biological regulation. To address these issues, we developed a computational framework called Camoco (Co-analysis of molecular components) that systematically integrates loci identified by GWAS with gene co-expression networks to identify a focused set of candidate loci with functional coherence. This framework analyzes the overlap between candidate loci generated from GWAS and the co-expression interactions that occur between them, and it addresses several biological considerations important for integrating diverse data types. On average, using this integrated approach, candidate gene lists identified by GWAS were reduced by two orders of magnitude. By incorporating co-expression network information, we can rapidly evaluate hundreds of GWAS experiments, producing focused sets of candidates with both strong associations with the phenotype of interest and evidence for functional coherence in the co-expression network. Identifying these candidates in a systematic and integrated manner is an important step toward resolving genes responsible for agriculturally important traits.
Developing a high density genotyping chip and imputation resource for the domestic horse
Despite the extraordinary decline in the cost of genotyping, the costs associated with sequencing large populations of individuals are still prohibitive. Additionally, because the vast majority of the genome is identical between individuals, most population-scale studies have focused only on the regions of the genome that vary. In the domestic horse, we sequenced the genomes of 156 horses to survey genetic diversity within and between 32 different breeds. Out of the more than 32 million variants we discovered, we used signal processing techniques to select sets of 2 million and 670 thousand variants that provided the most information. Working with Thermo Fisher Scientific, a commercial genotyping manufacturer, we designed a commercial array that will be available to the entire equine community. Additionally, since we chose the most informative variants, we can leverage the structure of the genome to perform genotype imputation on the thousands of arrays that currently exist at lower densities. We have shown that, with a sufficiently large number of samples, hundreds of thousands of genetic markers can be effectively imputed starting from the current density of 54 thousand markers.
Awards and Honors
- November 2018
- Awarded the Equine Genome Workshop Travel Grant to present research at the 2019 Plant and Animal Genome Conference
- January 2017
- Awarded UMII Updraft grant to attend a Network Biology Meeting in Cold Spring Harbor
- Awarded Microbial and Plant Genomics Institute travel grant to present research at the Maize Genetics Conference
- November 2016 Awarded FAANG Travel Grant for the 2017 Plant and Animal Genome Conference
- June 2016 Invited to Mozilla All Hands Meeting to discuss open science
- March 2016 UMII-Updraft Data wrangling grant proposal funded
- January 2016 Awarded USDA NIFA Postdoctoral Fellowship
- November 2015 Awarded FAANG Workshop Award for PAG 2016
- June 2015 Awarded Microbial and Plant Genomics Institute (MPGI) Travel Grant Award
- May 2015 Awarded COGS Travel Award to travel to Havemeyer Equine Gene Mapping Meeting
- November 2014 Awarded Neal A. Jorgenson Travel Award for the Equine Workshop to present research at the 2015 Plant and Animal Genomics meeting.
- April 2014 Received Doctoral Dissertation Fellowship (DDF) from the University of Minnesota Graduate Fellowship Committee.
- December 2013 Awarded Equine Genome Workshop Travel Award to present research at the 2014 Plant and Animal Genomics meeting.
- April 2013 Received travel grant from the iHUB collaborative for student exchange at the Leibniz Plant Science Institute in Gatersleben, Germany.
- February 2012 Received travel grant from the Microbial and Plant Genomics Institute to travel to the annual Maize Genetics Conference and present a poster.
- September 2010, September 2011 Received the University of Minnesota Interdisciplinary Informatics Initiative fellowship.
Previous Professional Positions and Appointments
- 2013-2014: Teaching Assistant, Dept. of Computer Science, CSCI 3003: Introduction to Computation in Biology, University of Minnesota, MN
- 2008-2010: Research Assistant, Dept. of Animal Science and Veterinary Medicine, University of Minnesota, MN
Publications
- Sabine Felkel, Claus Vogl, Doris Rigler, Viktoria Dobretsberger, Bhanu P Chowdhary, Ottmar Distl, Ruedi Fries, Vidhya Jagannathan, Jan E Janečka, Tosso Leeb, Gabriella Lindgren, Molly McCue, Julia Metzger, Markus Neuditschko, Thomas Rattei, Terje Raudsepp, Stefan Rieder, Carl-Johan Rubin, Robert Schaefer, Christian Schlötterer, Georg Thaller, Jens Tetens, Brandon Velie, Gottfried Brem, and Barbara Wallner. The horse Y chromosome as an informative marker for tracing sire lines. (Accepted)
- Samantha Beeson, Robert Schaefer, Victor Mason, Molly E McCue. Robust remapping of equine SNP array coordinates to EquCab3. Animal Genetics. (In-Press)
- Robert J Schaefer, Jean-Michel Michno, Joseph Jeffers, Owen Hoekenga, Brian Dilkes, Ivan Baxter, Chad Myers. Integrating co-expression networks with GWAS to prioritize causal genes in maize. The Plant Cell. DOI: https://doi.org/10.1101/221655
- Felipe Avila, James R. Mickelson, Robert J. Schaefer, Molly E. McCue. Genome-wide signatures of selection reveal genes associated with performance in American Quarter Horse subpopulations. Frontiers in Genetics - Livestock Genomics. DOI: https://doi.org/10.3389/fgene.2018.00249
- Victor Mason, Robert Schaefer, Molly McCue, Tosso Leeb, Vinzenz Gerber. eQTL Discovery and their Association with Severe Equine Asthma in European Warmblood Horses. BMC Genomics. DOI: https://doi.org/10.1186/s12864-018-4938-9
- S.A. Durward-Akhurst, R.J. Schaefer, J.R. Mickelson, M.E. McCue. Understanding genetic variation in the equine population. Journal of Equine Veterinary Science. DOI: https://doi.org/10.1016/j.jevs.2017.03.088
- Robert J Schaefer, Mikkel Schubert, Ernest Bailey, Danika L. Bannasch, Eric Barrey, Gila Kahila Bar-Gal, Gottfried Brem, Samantha A. Brooks, Ottmar Distl, Ruedi Fries, Carrie J. Finno, Vinzenz Gerber, Bianca Haase, Vidhya Jagannathan, Ted Kalbfleisch, Tosso Leeb, Gabriella Lindgren, Maria Susana Lopes, Nuria Mach, Artur da Câmara Machado, James N. MacLeod, Annette McCoy, Julia Metzger, Cecilia Penedo, Sagi Polani, Stefan Rieder, Imke Tammen, Jens Tetens, Georg Thaller, Andrea Verini-Supplizi, Claire M. Wade, Barbara Wallner, Ludovic Orlando, James R. Mickelson, Molly E. McCue. Development of a high-density, 2M SNP genotyping array and 670k SNP imputation array for the domestic horse. BMC Genomics. 2017; DOI: https://doi.org/10.1101/112979.
- Barbara Wallner, Nicola Palmieri, Claus Vogl, Doris Rigler, Elif Bozlak, Thomas Druml, Vidhya Jagannathan, Tosso Leeb, Ruedi Fries, Jens Tetens, Georg Thaller, Julia Metzger, Ottmar Distl, Gabriella Lindgren, Carl-Johan Rubin, Leif Andersson, Robert Schaefer, Molly McCue, Markus Neuditschko, Stefan Rieder, Christian Schlötterer, Gottfried Brem. Y Chromosome Uncovers the Recent Oriental Origin of Modern Stallions. Current Biology, 2017; DOI: http://doi.org/10.1016/j.cub.2017.05.086
- Lin Li, Roman Briskine, Robert Schaefer, Patrick S. Schnable, Chad L. Myers, Lex E. Flagel, Nathan M. Springer and Gary J. Muehlbauer. Co-expression network analysis of duplicate genes in maize (Zea mays L.) reveals no subgenome bias. BMC Genomics. 4 November 2016. doi: https://doi.org/10.1186/s12864-016-3194-0
- Robert J. Schaefer, Jean-Michel Michno, Chad L. Myers. Unraveling gene function in agricultural species using gene co-expression networks. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms. 30 July 2016. http://dx.doi.org/10.1016/j.bbagrm.2016.07.016.
- Mitra AK, Stessman HAF, Schaefer RJ, Wang W, Myers CL, Van Ness BG and Beiraghi S (2016) Fine-Mapping of 18q21.1 Locus Identifies Single Nucleotide Polymorphisms Associated with Nonsyndromic Cleft Lip with or without Cleft Palate. Front. Genet. 7:88. doi: https://doi.org/10.3389/fgene.2016.00088
- Robert J. Schaefer, Roman Briskine, Nathan M. Springer, Chad L. Myers. Discovering Functional Modules Across Diverse Maize Transcriptional Datasets Using COB, The Co-expression Browser. PLoS ONE 12 June 2014 doi: https://doi.org/10.1371/journal.pone.0099193.
- Mikkel Schubert, Luca Ermini, Clio Der Sarkissian, Hákon Jónsson, Aurélien Ginolhac, Robert Schaefer, Michael D. Martin, Ruth Fernández, Martin Kircher, Molly McCue, Eske Willerslev, Ludovic Orlando. Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX. Nature Protocols 9, 1056-1082 (2014) https://doi.org/10.1038/nprot.2014.063.
- Annette McCoy, Robert Schaefer, Jessica Peterson, Peter Morrell, Megan Slamka, James Mickelson, Stephanie Valberg, Molly McCue. Evidence of Positive Selection for a Glycogen Synthase (GYS1) Mutation in Domestic Horse Populations. Journal of Heredity. 2013. doi: https://doi.org/10.1093/jhered/est075
- Terje Raudsepp, Molly E. McCue, Pranab J. Das, Lauren Dobson, Monika Vishnoi, Krista L. Fritz, Robert Schaefer, Aaron K. Rendahl, James N. Derr, Charles C. Love, Dickson D. Varner, Bhanu P. Chowdhary. Genome-Wide Association Study Implicates Testis-Sperm Specific FKBP6 as a Susceptibility Locus for Impaired Acrosome Reaction in Stallions. PLoS Genet 2012 8:e1003139. doi: https://doi.org/10.1371/journal.pgen.1003139
- Ruth Swanson-Wagner, Roman Briskine, Robert Schaefer, Matthew B. Hufford, Jeffrey Ross-Ibarra, Chad L. Myers, Peter Tiffin, and Nathan M. Springer. Reshaping of the maize transcriptome by domestication. PNAS 2012 109:11878-11883 doi: https://doi.org/10.1073/pnas.1201961109
Meetings and Talks
- (talk/demo) Havemeyer Horse Genetics Workshop, Pavia, Italy. Processing tens of millions of genotypes with HapDab and analyzing tissue specific gene co-expression networks with Camoco.
- (Software demo) Plant and Animal Genome Conference, San Diego, CA. Identifying High Priority Candidate Genes from GWAS using Co-Expression Networks.
- (talk) Hamline Bio Seminar. From the laptop to the field: using computer science to unravel gene function in agricultural plants and animals.
- (talk) 9th International Conference on Canine and Feline Genetics and Genomics. SNP Chip Data Imputation: Equine Example.
- (talk) NIFA PD Meeting, Washington DC. From corn to horses: crossing kingdoms and translating computational methods developed in one agricultural species to another.
- (talk) ISAG, Dublin, Ireland
- (poster) ISAG, Dublin, Ireland. Unraveling gene function using co-expression networks in the domestic horse (Talk and poster) https://doi.org/10.6084/m9.figshare.5203732
- (poster) ISAG, Dublin, Ireland. Design and use of the MNEc670k SNP array for precision SNP imputation to millions of markers in 15 horse breeds https://doi.org/10.6084/m9.figshare.5203852
- (poster) Maize Genetics Conference, St Louis, MO. Integrating Co-Expression Networks with GWAS to Detect Causal Genes Driving Elemental Accumulation in Maize
- (talk) Plant and Animal Genome Conference, San Diego, CA. Unraveling gene function using gene co-expression networks in the domestic horse
- (poster) Unraveling gene function using gene co-expression networks in the domestic horse. CSHL Networks Meeting, Cold Spring Harbor, NY. March 2017.
- (poster) Plant and Animal Genome Conference, San Diego, CA. Camoco: a computational framework for inter-relating GWAS loci and unraveling gene function using co-expression networks
- (poster) College of Veterinary Medicine Research Days, St. Paul, MN. Unraveling gene function using co-expression networks and haplotype maps in the domestic horse
- (poster) Postdoctoral Association Poster Symposium, Minneapolis, MN. Surveying the transcriptional landscape of skeletal muscle and adipose tissue biology to identify breed-related insulin sensitivity and other metabolic traits in the domestic horse
- (poster) 4th Annual BICB Industry Symposium, Minneapolis, MN. Improving the organization of collective data resources using BotBot
- (poster) Maize Genetics Conference, Jacksonville, FL. Camoco: Systematically Integrating co-expression networks to detect causal genes for genome wide association studies in Zea mays
- (poster) Plant and Animal Genome Conference, San Diego, CA. A Functional Network in the Horse: Discovering Causal Variants for Complex Disease
- (poster) Plant and Animal Genome Conference, San Diego, CA. Systematic Integration of GWAS with Co-Expression Networks to Detect Causal Genes for Elemental Accumulation in Zea Mays and Arabidopsis thaliana Using Camoco
- (talk) PhD Defense, Minneapolis, MN. Integrating co-expression networks with GWAS to detect causal genes for agronomically important traits
- (poster) BICB Industry Symposium, Minneapolis, MN. LocusPocus: Genomic coordinates as a fundamental datatype in python
- (poster) CSHL Networks Meeting, Cold Spring Harbor, NY. Integrating co-expression networks with GWAS to detect causal genes for agronomically important traits
- (talk) Maize Genetics Conference, St. Charles, IL. Integrating tissue specific co-expression networks with NAM GWAS reduces candidate gene sets by orders of magnitude
- (talk) Equine Genetics Workshop, Plant and Animal Genomics, San Diego, CA. Selection of Tagging SNPs and imputation efficiency of the 670K commercial SNP chip
- (talk) Donald Danforth Plant Science Center Seminar. Computational techniques for characterizing agronomically important traits in non-model species. St. Louis, MO. 2014.
- (talk) Doctoral Dissertation Fellowship Graduate Fellow Seminar. Discovering Agriculturally Important Genes in Non-model Species Using Biological "Social Networks". Minneapolis, MN. 2014.
- (talk) Biomedical Informatics and Computational Biology Industry Symposium. Computational techniques for characterizing agriculturally important genes in non-model species. Minneapolis, MN. 2014.
- (poster) Maize Genetics Meeting. Systems biology approaches for integrating datasets identifying genes related to iron bioavailability in maize. Beijing, China. 2014.
- (talk) Equine Genome Workshop, Plant And Animal Genomics, San Diego, CA. Haplotype Discovery and an Imputation Resource for the Domestic Horse.
- (talk) Leibniz Institute for Plant Genetics, Gatersleben, Germany. Leveraging Functional Networks to Better Characterize Candidate Genes in Plant Biology. iHub Student Exchange. 2013.
- (poster) Maize Genetics Conference, St Charles, IL. Detecting Causal Genes for Maize Agronomic Traits Using CoExpression Networks. 2013.
- (talk) Oral Prelims, Minneapolis, MN. Detecting causal genes for maize agronomic traits using co-expression networks
- (poster) Maize Genetics Meeting, Portland, OR. COB: The CoExpression Browser -- A Web Application For Integrating and Browsing Genome Scale Transcriptional Networks. 2012.
- Acknowledged in PLoS Genetics Article. A high density SNP array for the domestic horse and extant Perissodactyla: utility for association mapping, genetic diversity, and phylogeny studies. PLoS Genet.
- (poster) Plant and Animal Genome Meeting. Genome-wide association studies and gene expression profiling in Thoroughbred stallions with acrosomal dysfunction. Terje Raudsepp, Molly E. McCue, Pranab J. Das, Monika Vishnoi, Krista L. Fritz, Robert Schaefer, Aaron K. Rendahl, Steve Brinsko, Charles C. Love, James R. Mickelson, Bhanu P. Chowdhary, Dickson D. Varner. January 2011.