My personal repository for all of my work for the Statistical Learning and Visualization course at UU in the fall of 2021.

Kevin-Patyk/Supervised-Learning-and-Visualization

Supervised Learning and Visualization

This repository is where I store all of my R scripts and HTML files from practicals and assignments for this course at Utrecht University.

Contents

Each folder contains an .Rmd file and the corresponding HTML file rendered from it. Any images or data used by the code are in the Data and Images folders.

Course Goals:

At the end of this course, students are able to apply and interpret the theories, principles, methods and techniques related to contemporary data science, and understand and explain different approaches to data analysis:

  • Apply data wrangling and preprocessing techniques to tidy data sets.
  • Apply, implement, understand and explain methods and techniques that are associated with statistical learning, including regression, trees, clustering, classification techniques and learning ensembles in R.
  • Evaluate the performance of these techniques with appropriate performance measures.
  • Select appropriate techniques to solve specific data science problems.
  • Motivate and explain the choice of techniques to investigate data problems.
  • Implement and understand generic data science tools, such as bootstrapping, cross-validation, bagging, boosting and error evaluation in R.
  • Interpret and evaluate the results of analyses and explain these techniques in simple terminology to a broad audience.
  • Understand and explain the basic principles of data visualization and the grammar of graphics.
  • Construct appropriate visualizations for each data analysis technique in R.

Course Content:

Supervised learning is such an integral part of contemporary data science that you will most likely use it dozens of times a day without knowing it. In this class you will learn about the most effective supervised learning techniques and acquire the skills to make them work for you. We will not only discuss the theoretical underpinnings of supervised learning, but also focus on building the skills and experience needed to rapidly apply these techniques to new problems.

During this course, participants will actively learn how to apply the main statistical methods in data analysis and how to use machine learning algorithms and visualization techniques. The course has a strong practical focus: rather than concentrating on the mathematics and background of the techniques discussed, you will gain hands-on experience in applying them to real data and interpreting the results.

This course provides a broad introduction to supervised learning and visualization. Topics include:

  • Data manipulation and data wrangling with R.
  • Data visualization.
  • Exploratory data analysis.
  • Regression and classification.
  • Non-linear modeling.
  • Bagging, boosting, and ensemble learning.

Students will learn to adopt these techniques in the way they think about data analysis problems. This course leaves students better equipped for a career in research (e.g. junior researcher or research assistant) or for further education, such as a (research) Master's program or a PhD.