Machine Learning and Data Mining projects for a course taken at Université de Technologie de Compiègne (UTC), France.

AlexisDrch/Data-Mining

SY09 Data-Mining & Machine Learning - UTC

Credits: 6
Lecture hours: 2h/week
Project hours: 2h/week
Instructor: Prof. Benjamin Quost
Assessment: four projects (60%), one written exam (40%)

Course overview

This course presents modern techniques for analyzing large data sets and develops basic tools for data mining. It aims to give students the main theory behind data mining and machine learning. The first part covers exploratory data analysis, in which students use visual tools (plots, charts) and methods such as Principal Component Analysis to summarize the main characteristics of a data set and to visualize similarity and distance between populations. The second part covers unsupervised and supervised learning, including pattern detection methods. Students learn Bayesian decision theory, linear and quadratic regression, and decision trees, and implement the related loss functions and classifiers. They see a range of machine learning models and learn to choose the most robust or efficient one for the distribution and nature of the data. In short, students learn to describe the information that large volumes of data carry and to justify the use of a particular method in a real application.

Projects Description

All projects are implemented in R (2017-2018).

● Project 1

Descriptive statistics and Principal Component Analysis: basic analysis of data sets with R, including correlations and the influence of factors; manual application of PCA, then use of R tools to apply it to different data sets.
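
As an illustration of the "manual, then built-in" approach described for this project, here is a minimal sketch in base R. It is not taken from the project code: the built-in `iris` data set and the variable names are assumptions chosen only for the example. It computes PCA from the eigen-decomposition of the covariance matrix and compares the result with `prcomp`.

```r
# Manual PCA on the numeric columns of the built-in iris data set
# (iris is only an illustrative choice, not a data set named by the course).
X  <- as.matrix(iris[, 1:4])
Xc <- scale(X, center = TRUE, scale = FALSE)    # center each column

# Eigen-decomposition of the sample covariance matrix
eig <- eigen(cov(Xc))
scores_manual <- Xc %*% eig$vectors             # coordinates on the principal axes

# Share of the total variance carried by each axis
var_explained <- eig$values / sum(eig$values)

# Same analysis with the built-in tool
pca <- prcomp(X, center = TRUE, scale. = FALSE)

# The variances agree, and the coordinates agree up to the sign of each axis
same_variances <- isTRUE(all.equal(eig$values, pca$sdev^2))
same_scores    <- isTRUE(all.equal(abs(unname(scores_manual)), abs(unname(pca$x))))
```

The sign of each principal axis is arbitrary, which is why the comparison is made on absolute values.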

● Project 2

Automatic Classification: data visualization via AFTD (Analyse Factorielle d’un Tableau de Distances, i.e. factorial analysis of a distance table), showing that it leads to the same results as PCA; hierarchical classification; K-means implementation.
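
The comparison between AFTD and PCA can be sketched in base R as follows. This is an assumption-laden illustration, not the project code: it uses `cmdscale` (classical multidimensional scaling) as a stand-in for AFTD, the built-in `iris` data as example input, and the built-in `hclust` and `kmeans` functions in place of the hand-written implementations the project asks for.

```r
# Centered numeric data (iris is an illustrative choice only)
X <- scale(as.matrix(iris[, 1:4]), center = TRUE, scale = FALSE)
D <- dist(X)                          # table of Euclidean distances

# AFTD, here via classical multidimensional scaling, versus PCA
aftd <- cmdscale(D, k = 2)            # 2-D coordinates recovered from distances only
pca  <- prcomp(X)$x[, 1:2]            # first two principal components

# With Euclidean distances the two coincide up to the sign of each axis
max_gap <- max(abs(abs(aftd) - abs(pca)))

# Hierarchical classification (Ward criterion), cut into 3 groups
hc        <- hclust(D, method = "ward.D2")
groups_hc <- cutree(hc, k = 3)

# K-means with several random starts (built-in function, fixed seed)
set.seed(1)
km <- kmeans(X, centers = 3, nstart = 10)

# Cross-tabulate the two partitions to see how far they agree
agreement <- table(hierarchical = groups_hc, kmeans = km$cluster)
```

The equivalence holds here because the distance table is Euclidean; with another dissimilarity measure the two representations would generally differ.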

● Project 3

Discrimination, Bayesian decision theory: implementation of the Euclidean classifier and the KNN algorithm with performance evaluation; work on the Bayes rule, comparing theoretical and practical results.
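
A possible shape for the two classifiers, written in base R. Everything named here (`sq_dist`, `euclid_fit`, `euclid_predict`, `knn_predict`, the `iris` data, the train/test split and `k = 5`) is hypothetical and chosen for illustration; the project's own code may be organized differently.

```r
# Illustrative train/test split on iris (not a data set named by the course)
set.seed(42)
idx    <- sample(nrow(iris), 100)
Xtrain <- as.matrix(iris[idx, 1:4]);  ytrain <- iris$Species[idx]
Xtest  <- as.matrix(iris[-idx, 1:4]); ytest  <- iris$Species[-idx]

# Squared Euclidean distances between the rows of A and the rows of B
sq_dist <- function(A, B) {
  outer(rowSums(A^2), rowSums(B^2), "+") - 2 * A %*% t(B)
}

# Euclidean classifier: one mean vector per class (rows are named by class)
euclid_fit <- function(X, y) {
  apply(X, 2, function(col) tapply(col, y, mean))
}

# Assign each new point to the class with the nearest mean
euclid_predict <- function(centers, Xnew) {
  nearest <- apply(sq_dist(Xnew, centers), 1, which.min)
  factor(rownames(centers)[nearest], levels = rownames(centers))
}

# KNN: majority vote among the k nearest training points
knn_predict <- function(Xtrain, ytrain, Xnew, k = 5) {
  d <- sq_dist(Xnew, Xtrain)
  votes <- apply(d, 1, function(row) {
    nn <- order(row)[seq_len(k)]
    names(which.max(table(ytrain[nn])))   # ties go to the first level
  })
  factor(votes, levels = levels(ytrain))
}

# Performance evaluation: proportion of correct predictions on the test set
centers    <- euclid_fit(Xtrain, ytrain)
acc_euclid <- mean(euclid_predict(centers, Xtest) == ytest)
acc_knn    <- mean(knn_predict(Xtrain, ytrain, Xtest, k = 5) == ytest)
```

A single split gives a noisy estimate; repeating the split several times and averaging the accuracies is one common way to make the comparison more reliable.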

● Project 4

Discrimination: implementation of discriminant analysis (linear, quadratic, and naive Bayes classifiers) and logistic regression (linear and quadratic); use of decision tree libraries and tests on real data.
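
The three discriminant analysis variants differ only in how the class covariance matrices are modeled, which the following base R sketch tries to make explicit. It covers only that part of the project (not logistic regression or decision trees), assumes Gaussian classes, and uses hypothetical names (`gda_fit`, `gda_predict`) and the `iris` data purely for illustration.

```r
# Gaussian discriminant analysis with three covariance models:
#   "qda": one full covariance matrix per class
#   "lda": a single pooled covariance matrix shared by all classes
#   "nb" : one diagonal covariance matrix per class (naive Bayes)
gda_fit <- function(X, y, model = c("qda", "lda", "nb")) {
  model   <- match.arg(model)
  classes <- levels(y)
  n       <- nrow(X)
  nk      <- as.vector(table(y))
  mus  <- lapply(classes, function(k) colMeans(X[y == k, , drop = FALSE]))
  covs <- lapply(classes, function(k) cov(X[y == k, , drop = FALSE]))
  if (model == "lda") {
    pooled <- Reduce(`+`, Map(function(S, m) (m - 1) * S, covs, nk)) /
      (n - length(classes))
    covs <- rep(list(pooled), length(classes))
  } else if (model == "nb") {
    covs <- lapply(covs, function(S) diag(diag(S), nrow = nrow(S)))
  }
  list(classes = classes, mus = mus, covs = covs, priors = nk / n)
}

# Assign each row of Xnew to the class with the highest log-posterior score
gda_predict <- function(fit, Xnew) {
  scores <- sapply(seq_along(fit$classes), function(k) {
    S      <- fit$covs[[k]]
    d2     <- mahalanobis(Xnew, center = fit$mus[[k]], cov = S)
    logdet <- as.numeric(determinant(S, logarithm = TRUE)$modulus)
    log(fit$priors[k]) - 0.5 * d2 - 0.5 * logdet
  })
  factor(fit$classes[max.col(scores, ties.method = "first")],
         levels = fit$classes)
}

# Training accuracy of each variant on iris (illustrative data only)
X <- as.matrix(iris[, 1:4]); y <- iris$Species
acc <- sapply(c(lda = "lda", qda = "qda", nb = "nb"), function(m) {
  mean(gda_predict(gda_fit(X, y, m), X) == y)
})
```

Accuracy measured on the training data is optimistic; a held-out test set or cross-validation would be needed to compare the variants fairly.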
