- Responsible teacher: Hong-Linh Truong
- Other teachers/Assistants: Phuong Pham and Tri Nguyen
- Lecture 1
- Slides: Robustness, Reliability, Resilience and Elasticity (R3E) for Big Data/Machine Learning Systems
- Key reading 1: R3E - An Approach to Robustness, Reliability, Resilience and Elasticity Engineering for End-to-End Machine Learning Systems
- Key reading 2: The New Frontier of Machine Learning Systems
- Lecture 2
- Slides: Benchmarking, Monitoring, Validation and Experimenting for Big Data and Machine Learning Systems
- Key reading 1: Benchmarking big data systems: A survey
- Key reading 2: MLPerf Training Benchmark
- Key reading 3: Data Validation for Machine Learning
- Key reading 4: Developments in MLflow: A System to Accelerate the Machine Learning Lifecycle and ModelDB: a system for machine learning model management
- Key reading 5: Putting Machine Learning into Production Systems
- Site 1: AI Matrix
- Lecture 3
- Slides: Coordination Models and Techniques for Big Data and Machine Learning Systems
- Key reading 1: Cirrus: a Serverless Framework for End-to-end ML Workflows
- Key reading 2: Towards ML Engineering: A Brief History Of TensorFlow Extended (TFX)
- Key reading 3: Orchestrating Big Data Analysis Workflows in the Cloud: Research Challenges, Survey, and Future Directions
- Key reading 4: KeystoneML: Optimizing Pipelines for Large-Scale Advanced Analytics
- Key reading 5: Jeff Smith. 2018. Machine Learning Systems: Designs that scale (1st. ed.). Manning Publications Co., USA.
- Key reading 6: Prediction-Serving Systems
- Lecture 4
- Slides: Machine Learning with Edge Systems
- Key reading 1: Serving deep neural networks at the cloud edge for vision applications on mobile platforms
- Key reading 2: From the Edge to the Cloud: Model Serving in ML.NET
- Key reading 3: Machine Learning at Facebook: Understanding Inference at the Edge
- Key reading 4: Distributing Deep Neural Networks with Containerized Partitions at the Edge
If you need the slide sources for your own teaching, please contact Linh Truong.
- Machine Learning experiment management
- Observability and Monitoring
- Machine Learning Serving
- Edge ML Pipeline
- Quality of Analytics for ML
- Common tasks with Edge ML
- Students propose their own project ideas, which is an important aspect of a research-oriented course. If a student cannot propose an idea, the teacher will suggest some concrete ones.
- The final project demonstration should be organized like an "event" where all students demonstrate their work and discuss the experiences gained in their projects.
- Demos: