alessandrocuda/carvana

Carvana: a Data Mining Project 2019/2020

Data Mining (DM) Project of the DM course at Department of Computer Science of University of Pisa.

Abstract

Carvana is a start-up business launched by a well-established American company. Its goal is to completely change the way people buy, finance, and trade their used vehicles by replacing physical infrastructure with technology and top-of-the-line scientific models. This project presents an analysis of the dataset published on kaggle.com for the Data Mining 2019/2020 Project. The aim is to build a model that advises future customers on whether a purchase is likely to be a good or a bad buy.

The project is divided into two parts:

MAIN TASKS

  • Data Understanding: Explore the dataset with the analytical tools studied and write a concise “data understanding” report that describes the data semantics, assesses data quality, and examines the distribution of the variables and their pairwise correlations.

  • Clustering analysis: Explore the dataset using various clustering techniques. Carefully describe your decisions for each algorithm and the advantages provided by the different approaches.

  • Classification: Explore the dataset using classification trees. Use them to predict the target variable.

  • Association Rules: Explore the dataset using frequent pattern mining and association rule extraction. Then use the rules to predict a variable, either to replace missing values or to predict the target variable.

All the details can be found in the report at this link.
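To give a rough idea of how association rules can be used to predict a variable, here is a minimal, brute-force sketch in plain Python. It is not the code used in the report. The toy transactions, the item names (`transmission=...`, `size=...`, `bad_buy=...`), the thresholds, and the helper functions are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions: each record is a set of "attribute=value" items.
# These values are invented for illustration, not taken from the Carvana dataset.
transactions = [
    {"transmission=auto", "size=medium", "bad_buy=no"},
    {"transmission=auto", "size=medium", "bad_buy=no"},
    {"transmission=auto", "size=compact", "bad_buy=yes"},
    {"transmission=manual", "size=compact", "bad_buy=yes"},
    {"transmission=auto", "size=medium", "bad_buy=no"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support, max_size=3):
    """Enumerate all itemsets up to `max_size` with support >= min_support.

    Brute force: fine for a toy example, too slow for a real dataset,
    where an algorithm such as Apriori or FP-Growth would be used instead.
    """
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                found[frozenset(candidate)] = s
    return found


def rules_for_target(transactions, target_prefix, min_support, min_confidence):
    """Build rules `antecedent -> target item`, sorted by confidence, then support."""
    rules = []
    for itemset, s in frequent_itemsets(transactions, min_support).items():
        targets = [item for item in itemset if item.startswith(target_prefix)]
        if len(targets) != 1 or len(itemset) < 2:
            continue  # keep only itemsets with exactly one target item and a non-empty antecedent
        antecedent = itemset - {targets[0]}
        confidence = s / support(antecedent, transactions)
        if confidence >= min_confidence:
            rules.append((antecedent, targets[0], s, confidence))
    return sorted(rules, key=lambda r: (-r[3], -r[2]))


def predict(record, rules, default):
    """Return the consequent of the first (best-ranked) rule whose antecedent matches."""
    for antecedent, consequent, _, _ in rules:
        if antecedent <= record:
            return consequent
    return default


rules = rules_for_target(transactions, "bad_buy=", min_support=0.3, min_confidence=0.8)
for antecedent, consequent, s, c in rules:
    print(f"{sorted(antecedent)} -> {consequent} (support={s:.2f}, confidence={c:.2f})")

new_record = {"transmission=manual", "size=compact"}
print(predict(new_record, rules, default="bad_buy=no"))
```

The same idea applies to filling in missing values: pass the prefix of the attribute to be imputed instead of the target prefix.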

A SURVEY THROUGH DIFFERENT CLASSIFICATION METHODS

An additional task for the project: compare the classification results of the decision tree with those of KNN and Naive Bayes, also analysing the runtime of the training and test phases.

All the details can be found in the report at this link.
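A comparison of this kind could be set up roughly as follows. This is only a sketch, assuming scikit-learn is available; it is not the code used in the report. The synthetic dataset from `make_classification`, the hyperparameters, and the `benchmark` helper are assumptions made for illustration, and the real analysis would load the Carvana data instead.

```python
import time

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): replace with the preprocessed Carvana features.
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Hyperparameters are arbitrary illustrative choices, not tuned values.
models = {
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "naive Bayes": GaussianNB(),
}


def benchmark(model, X_train, y_train, X_test, y_test):
    """Fit and evaluate one model, timing the training and test phases separately."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    fit_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = model.predict(X_test)
    predict_seconds = time.perf_counter() - start

    return {
        "accuracy": accuracy_score(y_test, predictions),
        "fit_seconds": fit_seconds,
        "predict_seconds": predict_seconds,
    }


results = {
    name: benchmark(model, X_train, y_train, X_test, y_test)
    for name, model in models.items()
}

for name, r in results.items():
    print(
        f"{name:14s} accuracy={r['accuracy']:.3f} "
        f"fit={r['fit_seconds'] * 1e3:.1f} ms "
        f"predict={r['predict_seconds'] * 1e3:.1f} ms"
    )
```

Timing the two phases separately matters because the methods distribute their cost differently: KNN does little work at training time and most of it at prediction time, whereas a decision tree typically pays more up front. Single-run timings are noisy, so repeating the measurement and averaging would give more reliable numbers.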

Authors

License

Copyright 2019 © Alessandro Cudazzo - Giulia Volpi - Flavia Achena - Aleksandra Maslennikova
