This repository contains the final project for the Data Mining course (2018-1) at ESCOM IPN. It covers the topics studied during the course.
- ID3 Algorithm Overview
- ID3 Project Codebase
- Project Screenshots and Demo
- License
ID3 stands for Iterative Dichotomizer 3. The ID3 algorithm was invented by Ross Quinlan. It builds a decision tree from a fixed set of examples, and the resulting tree is used to classify future samples.
The basic idea is to construct the decision tree with a top-down, greedy search through the given data set, testing each attribute at every tree node.
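The greedy search can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this project: it assumes the usual ID3 criterion (choose the attribute with the highest information gain, measured with Shannon entropy), and the function names, the `plays` records and the `skipped` target column are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(rows, target):
    """Shannon entropy (in bits) of the target column over the given rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    total = len(rows)
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row for row in rows if row[attribute] == value]
        remainder += (len(subset) / total) * entropy(subset, target)
    return entropy(rows, target) - remainder


def id3(rows, attributes, target):
    """Build a tree as nested dicts: {attribute: {value: subtree_or_label}}."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:      # pure node: every row has the same label
        return labels[0]
    if not attributes:             # nothing left to test: majority vote
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: test every attribute and keep the most informative one.
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree


# Hypothetical listening records, invented for illustration only.
plays = [
    {"genre": "rock", "plan": "free", "skipped": "yes"},
    {"genre": "rock", "plan": "premium", "skipped": "yes"},
    {"genre": "pop", "plan": "free", "skipped": "no"},
    {"genre": "pop", "plan": "premium", "skipped": "no"},
    {"genre": "jazz", "plan": "free", "skipped": "yes"},
    {"genre": "jazz", "plan": "premium", "skipped": "no"},
]

tree = id3(plays, ["genre", "plan"], "skipped")
print(tree)
```

In this toy data set `genre` separates the records far better than `plan`, so it becomes the root. Only the `jazz` branch is still mixed, so it is the only one that needs a second test on `plan`.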
Simply put, a decision tree is a tree in which each branch represents a choice among a number of alternatives and each leaf node represents a decision.
A decision tree is a type of supervised learning algorithm (with a predefined target variable) that is mostly used in classification problems and works for both categorical and continuous input and output variables. It is one of the most widely used and practical methods for inductive inference. (Inductive inference is the process of reaching a general conclusion from specific examples.)
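To show how such a tree is used once it exists, the sketch below walks a small hand-written tree from the root to a leaf. The nested-dictionary representation, the `classify` helper and the attribute names are assumptions made for this example and are not taken from the project codebase.

```python
def classify(tree, sample, default=None):
    """Follow the branches that match the sample until a leaf is reached."""
    while isinstance(tree, dict):
        # Each internal node has the shape {attribute: {value: subtree_or_label}}.
        attribute, branches = next(iter(tree.items()))
        value = sample.get(attribute)
        if value not in branches:  # value never seen while building the tree
            return default
        tree = branches[value]
    return tree                    # a leaf is a plain label, i.e. the decision


# Hypothetical tree: inner dicts are choices, strings are decisions.
skip_tree = {
    "genre": {
        "rock": "yes",
        "pop": "no",
        "jazz": {"plan": {"free": "yes", "premium": "no"}},
    }
}

print(classify(skip_tree, {"genre": "jazz", "plan": "premium"}))
print(classify(skip_tree, {"genre": "metal", "plan": "free"}, default="unknown"))
```

Returning a default for unseen values is one possible design choice. Another common option is to fall back to the majority label of the training rows that reached that node.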
As the main project of the course, this algorithm is applied to data stored in the database of a music streaming application named Stopify. The purpose is to obtain a clearer view of the overall information structure and to generate basic business knowledge for decision making by mining and processing the raw data.
The project codebase is contained in the "stopify-application" folder and includes both the backend and the frontend. For more information, feel free to take a look at it!
The entity-relationship diagram describes how the database is structured, defining the entities and their relationships.
The table index shows the currently selected database table and its attributes. Here the user selects the attributes to which the analysis will be applied.
Here the parameters and functions are set, establishing the criteria that define how the decision tree will be built.
Once the setup has been defined, the corresponding decision tree is generated. The information displayed within the tree provides helpful insight into the selected data.
Finally, the user has the option to generate a more detailed report of the ID3 generation process. All data generated during ID3 creation is sent from the server and displayed in a log on the client side.
- MIT license
- Copyright 2018 © Eric Alejandro López Ayala.