SilvrDuck/dslab2018.github.io

 
 

Description

This hands-on course teaches the tools and methods used by data scientists, from researching solutions to scaling prototypes up to Spark clusters. It exposes students to the entire data science pipeline, from data acquisition to extracting valuable insights, applied to real-world problems.

Questions

Public questions and discussions about the course are gathered in the course repository.

Virtual Machine

Lab Sessions

Week 1 - 21.02.2018 - Module 1 - Python for data scientists 1/4

Week 2 - 28.02.2018 - Module 1 - Python for data scientists 2/4

Week 3 - 07.03.2018 - Module 1 - Python for data scientists 3/4

Week 4 - 14.03.2018 - Module 1 - Python for data scientists 4/4

Week 5 - 21.03.2018 - Module 2 - Distributed computing with Hadoop 1/2

Week 6 - 28.03.2018 - Module 2 - Distributed computing with Hadoop 2/2

Week 7 - 11.04.2018 - Module 3 - Distributed processing with Apache Spark 1/3

Week 8 - 18.04.2018 - Module 3 - Distributed processing with Apache Spark 2/3

Week 9 - 25.04.2018 - Module 3 - Distributed processing with Apache Spark 3/3

Week 10 - 02.05.2018 - Module 4 - Real-time data acquisition and processing 1/2

Week 11 - 09.05.2018 - Module 4 - Real-time data acquisition and processing 2/2

Week 12 - 16.05.2018 - Module 5 - Final Project 1/3

Week 13 - 23.05.2018 - Module 5 - Final Project 2/3

Week 14 - 30.05.2018 - Module 5 - Final Project 3/3

About

Website for the EPFL Lab in Data Science 2018
