datakolektiv/dsss2022

DATA SCIENCE SUMMER SCHOOL 2022 :: MACHINE LEARNING IN R

This is a DataKolektiv repository for our MACHINE LEARNING IN R Data Science Summer School 2022.
contact: hello@datakolektiv.com

The Summer School will be hosted at Startit Center Belgrade every Saturday in June 2022, 09:00 - 18:00 CET.
Asynchronous consultations and work will be carried out via Slack and GitHub from 4. to 30. June 2022.

PROGRAM

WEEK 1.


WEEK 2.

  • Saturday, 11. June, 09:00 - 18:00 CET, Startit Center, Belgrade
    • 09:00 - 12:30. Linear and Multiple Linear Regression [review]
    • 14:30 - 18:00. Binomial Logistic and Multinomial Logistic Regression for Classification Problems [review1], [review2]
  • Asynchronous (Slack, GitHub), Monday, 13. June - Friday, 17. June
    • Case Study 1: Churn Prediction
    • Control for Overfit 1: Regularized Linear and Generalized Linear Models [review1], [review2], [review3]

WEEK 3.

  • Saturday, 18. June, 09:00 - 18:00 CET, Startit Center, Belgrade
    • 09:00 - 12:30. Cross-Validation and Regularization in Classification Problems [review]; Model Selection (ROC analysis) [review] (note: this review uses Python code; we will have ours in R)
    • 14:30 - 18:00. Decision Trees (CART) [review]
  • Asynchronous (Slack, GitHub), Monday, 20. June - Friday, 24. June
    • Case Study 2: Price Prediction in the Real Estate Market
    • Control for Overfit 2: Cross-Validation and Regularization in Regression Models

WEEK 4.

  • Saturday, 25. June, 09:00 - 18:00 CET, Startit Center, Belgrade
    • 09:00 - 12:30. Random Forests for Regression and Classification Problems [review1], [review2]
    • 14:30 - 18:00. Gradient Boosting: XGBoost Model for Regression and Classification Problems [review1] (note: perhaps a bit heavy on math), [review2]
  • Asynchronous (Slack, GitHub), Monday, 27. June - Thursday, 30. June
    • Case Study 3: Web Content Popularity Prediction
    • Case Study 4: Complete Model Layout and Fine-Tuning XGBoost for Regression and Classification Problems