jason-dai/cvpr2020

Automated ML Workflow for Distributed Big Data Using Analytics Zoo


Speaker

Jason Dai

Schedule

2-5PM (Pacific Time), June 19, 2020

Description

Applying machine learning (ML) techniques to distributed big data analytics plays a central role in today’s intelligent applications and systems. These problem settings have pushed the field to address issues of data scale that were almost inconceivable even a decade ago for AI researchers. In addition, building machine learning applications for these big data problems can also be a laborious and knowledge-intensive process for ML engineers.

To address these challenges, we have open-sourced Analytics Zoo, which helps users build and productionize end-to-end ML workflows for distributed big data in an automated fashion. Using Analytics Zoo, users can simply build conventional Python notebooks on their laptops (with possible AutoML support), which can then automatically scale out to large clusters and process large amounts of data in a distributed fashion.

This tutorial will present how to implement an automated ML workflow for big data (with a focus on supporting computer vision models and pipelines) by seamlessly integrating different technologies, including deep learning frameworks (e.g., TensorFlow, Keras, PyTorch), distributed analytics frameworks (e.g., Apache Spark, Apache Flink, Apache Kafka, Ray), and AutoML techniques (such as hyperparameter optimization). In addition, it will share real-world experience and "war stories" from users who have adopted Analytics Zoo to address the challenges of applying ML techniques to distributed big data analytics.
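
As a rough illustration of the hyperparameter optimization mentioned above, the sketch below runs a plain random search on a single machine. It does not use the Analytics Zoo API: `validation_loss`, `random_search`, and the search space are hypothetical names chosen for this example, and the objective is a toy function standing in for "train a model and report its validation loss".

```python
import random


def validation_loss(learning_rate, batch_size):
    # Hypothetical stand-in for training a model and returning its validation loss.
    # This toy function has its minimum at learning_rate=0.01, batch_size=64.
    return (learning_rate - 0.01) ** 2 * 1e4 + ((batch_size - 64) / 64) ** 2


def random_search(objective, search_space, num_trials, seed=0):
    # Sample configurations uniformly at random and keep the one with the lowest loss.
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(num_trials):
        config = {name: rng.choice(values) for name, values in search_space.items()}
        loss = objective(**config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


# Assumed search space, for illustration only.
search_space = {
    "learning_rate": [0.001, 0.003, 0.01, 0.03, 0.1],
    "batch_size": [16, 32, 64, 128],
}

best_config, best_loss = random_search(validation_loss, search_space, num_trials=50)
print(best_config, best_loss)
```

Each trial here is independent of the others, which is what makes this kind of search a natural candidate for running across a cluster; how that distribution is actually done is covered in the tutorial itself, not in this sketch.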

Tutorial

Link
