Machine Learning Meetup Presentation
The presentation given at the Zurich Machine Learning Meetup Group, titled "Spark for High-throughput, Scalable, Quantitative Analysis of Genome-Scale Datasets".
Recent improvements in the rate and quality of gene sequencing have resulted in a flood of sequence and marker data. Extracting meaning from these datasets has, however, been limited by the ability to analyze the results and compare them with quantitative phenotypes. We demonstrate how Resilient Distributed Datasets, the new model present in Spark, enable quantitative analysis and real-time exploration of hundreds of phenotypes with millions of samples.