alimoayedi/Type2-Diabetes-Case-Study

Type2_Diabetes_Case_Study

A Project of Pattern Recognition

  1. Introduction. Artificial Intelligence (AI) is the science of imitating human intelligence in machines by programming them to act, learn, and make decisions as humans do. Machines can be trained to build cause-and-effect structures that let them make decisions. AI is a very broad topic, however, covering the integration of intelligent systems. Machine Learning (ML), a subset of AI, is the science of learning from data. Algorithms and models are built on hidden patterns within datasets, and the automated detection of such patterns is the subject of pattern recognition. In this field, different algorithms are used to detect hidden relations, and the discovered relations can later be used by machines to make decisions. The process of feeding a set of data into a system is called learning, and the data used for it is known as training data. When the training data is labeled, training is supervised. When labels are missing for any reason, algorithms use unsupervised learning techniques instead. In both cases a model is built from the training data and then used to classify future unseen data, or test data. Thus, regardless of the learning technique, the ultimate goal is to decide on the most probable class of the test data. Several algorithms are widely used to model data. Most have a probabilistic nature and try to predict classes using densities and probabilities. Naïve Bayes, Support Vector Machines (SVMs), and logistic regression are examples of learning algorithms. It should be kept in mind that a model should both perform as well as possible and generalize its decisions to new, unseen data.
  2. Case Study. As a practical case study, a type 2 diabetes dataset used for identifying diabetic patients is considered. The dataset contains 1611 attributes, 4322 samples, and two classes, positive and negative. The attributes are a mix of variable types (nominal, continuous, and binary) and contain missing values, so preprocessing is an unavoidable step. The following sections start with preprocessing and continue with fitting different classification models. Four classification algorithms are used and their results compared: Naïve Bayes, logistic regression, Support Vector Machine (SVM), and k-Nearest-Neighbor (kNN). Weka, a free machine learning software package written in Java, is used for preprocessing, and Python is used for model construction.
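
The workflow described above (fill in missing values, then fit a classifier and score it on unseen data) can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's code: the project does its preprocessing in Weka, while the sketch below does it in Python. The data is synthetic, the three-column layout, the mean/mode imputation rule, the mixed distance function, and every function name are hypothetical, and kNN stands in for any of the four classifiers.

```python
import math
import random
from collections import Counter

# Hypothetical layout: column 0 is continuous, column 1 nominal, column 2 binary.
NOMINAL_COLS = {1, 2}


def fit_fill_values(rows):
    """Learn one fill value per column from the training rows only."""
    fills = []
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        if j in NOMINAL_COLS:
            fills.append(Counter(present).most_common(1)[0][0])  # mode
        else:
            fills.append(sum(present) / len(present))  # mean
    return fills


def impute(rows, fills):
    """Replace every missing entry (None) with the learned fill value."""
    return [[fills[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def fit_ranges(rows):
    """Range of each continuous column, used to scale distances."""
    ranges = {}
    for j in range(len(rows[0])):
        if j not in NOMINAL_COLS:
            col = [r[j] for r in rows]
            ranges[j] = (max(col) - min(col)) or 1.0
    return ranges


def distance(a, b, ranges):
    """Mixed distance: 0/1 mismatch for nominal columns, scaled gap otherwise."""
    total = 0.0
    for j, (x, y) in enumerate(zip(a, b)):
        if j in NOMINAL_COLS:
            total += 0.0 if x == y else 1.0
        else:
            total += ((x - y) / ranges[j]) ** 2
    return math.sqrt(total)


def knn_predict(train_X, train_y, x, ranges, k=5):
    """Majority vote among the k training samples closest to x."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: distance(p[0], x, ranges))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Synthetic stand-in data (NOT the diabetes dataset), with ~10% missing entries.
rng = random.Random(0)


def make_row():
    positive = rng.random() < 0.5
    row = [
        rng.gauss(8.0 if positive else 5.0, 1.0),
        rng.choice(["a", "b", "c"]),
        int(rng.random() < (0.7 if positive else 0.3)),
    ]
    row = [None if rng.random() < 0.1 else v for v in row]
    return row, "positive" if positive else "negative"


data = [make_row() for _ in range(200)]
train, test = data[:150], data[150:]
train_y = [y for _, y in train]
test_y = [y for _, y in test]

# Fit preprocessing on the training split only, then apply it to both splits.
fills = fit_fill_values([x for x, _ in train])
train_X = impute([x for x, _ in train], fills)
test_X = impute([x for x, _ in test], fills)
ranges = fit_ranges(train_X)

predictions = [knn_predict(train_X, train_y, x, ranges) for x in test_X]
accuracy = sum(p == y for p, y in zip(predictions, test_y)) / len(test_y)
print(f"hold-out accuracy: {accuracy:.2f}")
```

The fill values and ranges are learned from the training split alone, so the test samples stay unseen until prediction, which matches the generalization requirement stated in the introduction.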
