2019-web/data/talks/33.yaml
# Talk details are specified in YAML files
# YAML was selected because we can use multi-line strings and add
# comments in the file.
speaker_name: "Abhishek Gupta"
talk_title: "Why that great machine learning research can’t be reproduced and how to fix it"
# At least 1 tag is necessary!!
talk_tags:
- "Machine Learning & Data Science"
talk_abstract: "Ever gotten excited about a piece of new machine learning research that you saw come out on arXiv or your favorite research lab’s blog, hoping it would finally solve that last bit of optimization you need in your own work and make you the ML superstar of your team? But after spending days trying to get the same results, you end up failing despite having tried everything in the paper, including looking through their GitHub page, contacting the authors, etc. If this sounds familiar, you’re not alone! Every day, researchers and practitioners alike spend countless hours trying to replicate results from new ML research, but inevitably lose precious time and compute resources failing to achieve the reported results. We’re facing a massive reproducibility crisis in the field of machine learning. Tools for developing machine learning (ML) based solutions have become much easier to use; AutoML and Keras are two of many examples. At the same time, there are a lot more public datasets available, aimed at socially oriented research. With more people entering the field from diverse training backgrounds, not all of them necessarily adhere to rigorous standards of scientific research. This is evidenced by recent calls by the technical research community at conferences like NeurIPS. We see that a lack of reproducibility in ML research will be a key hindrance to the meaningful use of R&D resources. There is currently no comprehensive framework for doing reproducible machine learning. We, as Pythonistas, can do something to help with this!
Drawing on my own work in this domain and the work of the intern cohort that worked on the Reproducibility in Machine Learning project this summer at the Montreal AI Ethics Institute, let’s talk through some of the social and technical aspects of this problem and how you can apply the principles from this talk to become the superhero of your ML team, elevating the quality of its work and helping others build on top of it. We’ll walk through these principles and apply them to a case study to understand how this simple yet effective mechanism can help address many of the issues that we face in the field. Our framework combines existing tooling with policy applied to solution design, data collection, model development, data and model legacy, and deployment performance tracking."
about_author: "Abhishek Gupta is the founder of the Montreal AI Ethics Institute (https://montrealethics.ai) and a Machine Learning Engineer at Microsoft, where he serves on the CSE AI Ethics Review Board. His research focuses on applied technical and policy methods to address ethical, safety and inclusivity concerns in using AI in different domains. He has built the largest community-driven public consultation group on AI Ethics in the world, which has made significant contributions to the Montreal Declaration for Responsible AI, the G7 AI Summit, the AHRC and WEF Responsible Innovation framework, and the European Commission Trustworthy AI Guidelines. His work on public competence building in AI Ethics has been recognized by governments from North America, Europe, Asia and Oceania."
talk_metadata:
- "**Date:** Sunday Nov. 17"
- "**Location:** Round Room (PyData Track)"
- "**Begin time:** 16:05"
- "**Duration:** 25 minutes"