MAT555E-Statistical Data Analysis for Comp. Sciences course by Gül İnan.

MAT555E-Statistical Data Analysis for Computational Sciences

Course Instructor: Gül İnan

Course Summary:

MAT555E is a graduate-level course that introduces commonly used statistical methods for inference and prediction problems in data analysis. The course will bring together statistical theory and data analysis through examples. This course is designed such that:

  • The methods covered will include supervised learning algorithms with a focus on regression and classification problems and unsupervised learning algorithms with a focus on clustering problems,
  • Extensions of these methods to high-dimensional settings will also be discussed, and
  • Application of these methods to data analysis problems and their software implementation will be done via Python.

At the end of the semester, the students are expected:

  • To be fluent in the fundamental principles behind several statistical methods,
  • To be able to apply statistical methods to real life problems and data sets, and
  • To be prepared for more advanced coursework or scientific research in machine learning and related fields.

Course Prerequisites:

Since the course also touches on the mathematical and statistical theory behind the methods and uses Python for implementation, this course requires some background in each of these areas.

Tentative Course Plan:

We will closely follow the weekly schedule given below. However, weekly class schedules are subject to change depending on the progress we make as a class.

Week 1. Introduction to multivariate data analysis terminology and related linear algebra concepts.

Week 2. Simple linear regression.

Week 3. Multiple linear regression.

Week 4. Regression as a supervised learning problem.

Week 5. Regularization methods for regression problems. Ridge regression and lasso.

Week 6. Cross-validation. Unsupervised pre-processing. Grid search and hyper-parameter tuning.

Week 7. Remaining topics for regression problems.

Week 8. ITU Fall Break.

Week 9. Introduction to classification. Logistic regression.

Week 10. Linear discriminant analysis. Quadratic discriminant analysis.

Week 11. Naive Bayes. K-nearest neighbors.

Week 12. Tree-based methods. Bagging, random forests, and boosting.

Week 13. Unsupervised learning. Principal component analysis. Factor analysis.

Week 14. Clustering methods.

Week 15. Final review and examples.

Popular repositories

  1. MAT555E-Spring23.github.io (public)

    Website for MAT555E-Statistical Data Analysis for Comp. Sciences taught by Gül İnan at ITU Math Eng in Spring 2023.
