This hands-on course teaches the tools and methods used by data scientists, from researching solutions to scaling prototypes up to Spark clusters. It exposes students to the entire data science pipeline, from data acquisition to extracting valuable insights, applied to real-world problems.
Questions and discussions about the course are gathered on Mattermost: https://mattermost-dslab.epfl.ch
- Slides: week 1
- Python Quick Reference: notebook
- Exercises: download - view on GitHub
- Slides: week 2
- Solutions to last week's exercises: download - view on GitHub
- Exercises - Set #1: download - view on GitHub
- Exercises - Set #2: download - view on GitHub
- Slides: week 3
- Solutions to last week's exercises #1: download - view on GitHub
- Solutions to last week's exercises #2: download - view on GitHub
- Exercises: Instructions
- Docker cheat sheet for getting started and for reference: https://github.com/wsargent/docker-cheat-sheet
- Slides presenting the graded homework: week 4
- Solutions to last week's exercises: solutions
- Setup instructions: Instructions
- Feedback on graded homework: Feedback
- Solutions for the graded homework: Notebook
- Slides: week 5
- Exercises: Instructions
- Slides: week 6
- Solutions to last week's exercises: solutions (right-click and copy the URL to import it into Zeppelin)
- Setup instructions: Instructions
- Slides: week 7
- Exercises: Instructions
- Solutions to last week's exercises: solutions
- Slides: week 8
- Exercises: Instructions
- Solutions to last week's exercises: solutions
- Homework 3: repository - start by reading the README
- Slides: week 10
- Exercises: Instructions
- Solutions to last week's exercises: solutions
- Homework 4: repository - start by reading the README
- Slides: Final assignment
- Final assignment: repository - start by reading the README