PitCoder/DataMining

Data Mining

This repository contains the final project for the Data Mining course (2018-1) at ESCOM IPN. It covers the topics studied during the course.

Content

  • ID3 Algorithm Overview
  • Project Codebase
  • Project Screenshots and Demo
  • License

Data Mining using the fundamentals of Machine Learning with decision trees

An Introduction to Machine Learning With Decision Trees

ID3 Algorithm Overview

ID3 stands for Iterative Dichotomizer 3. The algorithm was invented by Ross Quinlan. It builds a decision tree from a fixed set of examples, and the resulting tree is then used to classify future samples.

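The basic idea is to construct the decision tree with a top-down, greedy search over the given data set: at every tree node each remaining attribute is tested, and the one with the highest information gain (the largest reduction in entropy) is chosen for the split.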
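The greedy search described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this repository: the function names, the nested-dictionary tree representation, and the toy rows (with made-up columns `plan`, `device`, and `skips_ads`) are all assumptions introduced for the example and are not taken from the Stopify database.

```python
import math
from collections import Counter


def entropy(rows, target):
    """Shannon entropy (in bits) of the target column over the given rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset, target)
    return entropy(rows, target) - remainder


def id3(rows, attributes, target):
    """Build a decision tree as nested dicts: {attribute: {value: subtree}}."""
    labels = [row[target] for row in rows]
    # All examples share one label: this node becomes a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left to test: fall back to the majority label.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: pick the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree


# Hypothetical toy data, invented only for this illustration.
rows = [
    {"plan": "free", "device": "mobile", "skips_ads": "no"},
    {"plan": "free", "device": "desktop", "skips_ads": "no"},
    {"plan": "premium", "device": "mobile", "skips_ads": "yes"},
    {"plan": "premium", "device": "desktop", "skips_ads": "yes"},
]

tree = id3(rows, ["plan", "device"], "skips_ads")
print(tree)  # {'plan': {'free': 'no', 'premium': 'yes'}}
```

In this toy data set `plan` separates the labels perfectly (gain of 1 bit) while `device` carries no information (gain of 0), so the search splits on `plan` and stops.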

What Are Decision Trees?

Simply put, a decision tree is a tree in which each branch node represents a choice between a number of alternatives and each leaf node represents a decision.
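To make the structure concrete, here is a small Python sketch of how such a tree could be stored and walked to reach a decision. The nested-dictionary layout, the `classify` helper, and the attribute names and labels are hypothetical and chosen only for illustration. They do not describe how this project stores its trees.

```python
def classify(tree, sample):
    """Follow the branches that match the sample until a leaf is reached."""
    # A dict is a branch node: {attribute: {value: subtree}}.
    # Anything else is a leaf node holding the decision.
    while isinstance(tree, dict):
        attribute = next(iter(tree))
        branches = tree[attribute]
        value = sample[attribute]
        if value not in branches:
            return None  # Value never seen for this attribute: no decision.
        tree = branches[value]
    return tree


# Hand-written example tree with made-up attributes and decisions.
example_tree = {
    "plan": {
        "free": "show_ads",
        "premium": {
            "device": {
                "mobile": "offline_mode",
                "desktop": "high_quality",
            }
        },
    }
}

print(classify(example_tree, {"plan": "premium", "device": "mobile"}))  # offline_mode
print(classify(example_tree, {"plan": "free", "device": "desktop"}))    # show_ads
```

Each dictionary level corresponds to one choice between alternatives, and the strings at the bottom are the decisions stored in the leaves.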

Why use it in Machine Learning?

A decision tree is a type of supervised learning algorithm (with a predefined target variable) that is mostly used in classification problems and works for both categorical and continuous input and output variables. It is one of the most widely used and practical methods for inductive inference. (Inductive inference is the process of reaching a general conclusion from specific examples.)

What Does This Project Cover?

As the main project of the course, the ID3 algorithm is applied to data stored in the database of a music streaming application named Stopify. The goal is to obtain a clearer view of the overall structure of the information and to generate basic business knowledge for decision making by mining and processing the raw data.

Project Codebase

The project codebase is contained in the folder "stopify-application". It includes both the backend and the frontend. For more information, feel free to take a look at it!

Project Screenshots and Demo

Entity-Relation Diagram of Stopify Database

The entity-relation diagram describes how the database is structured, defining its entities and the relations between them.

Entity Relation Diagram

Table Index and Data Selection

The table index shows the currently selected table from the database and its attributes. Here the user selects the attributes to which the analysis will be applied.

Data Selection

Parameters and Functions Setup

Here the parameters and functions are set, establishing the criteria that define how the decision tree will be built.

Parameters and Functions Setup

Decision Tree Generation

Once the setup has been defined, the corresponding decision tree is generated. The information displayed within the tree gives helpful insight into the selected data.

Tree Generation

Detailed Report Generation

Finally, the user has the option to generate a more detailed report of the ID3 generation process. All data generated while the tree is built is sent from the server and displayed in a log on the client side.

Report Generation

Demonstration of How the ID3 Algorithm Operates Internally

ID3 Algorithm GIF

License

License
