Temi Aina edited this page Jun 5, 2023 · 4 revisions

About

Welcome to the PimaDiabetesOutcome wiki! This project develops a logistic regression model using Python and MS Azure Automated ML on the well-known Pima Diabetes dataset. Several versions were created, each handling missing values in the feature columns differently.

Versions

The main version, accessible here, uses the original dataset as is, with missing values in the feature columns left recorded as 0s.
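To see how much of the data the zero placeholders affect, they can be counted per column. The snippet below is a minimal sketch, not the project's own code: the column names follow the commonly distributed CSV of the dataset, the list of columns where 0 is treated as missing is an assumption, and the four-row sample is made up for illustration.

```python
import pandas as pd

# Assumption: columns where a 0 cannot be a real measurement.
# Pregnancies is left out because 0 is a valid value there.
ZERO_AS_MISSING = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]

# Made-up rows standing in for the real dataset.
sample = pd.DataFrame({
    "Pregnancies": [6, 1, 8, 0],
    "Glucose": [148, 85, 0, 137],
    "BloodPressure": [72, 66, 64, 0],
    "SkinThickness": [35, 29, 0, 35],
    "Insulin": [0, 0, 0, 168],
    "BMI": [33.6, 26.6, 23.3, 43.1],
    "Outcome": [1, 0, 1, 1],
})


def count_zero_placeholders(df, columns=ZERO_AS_MISSING):
    """Return the number of 0 entries in each of the given columns."""
    return (df[columns] == 0).sum()


zero_counts = count_zero_placeholders(sample)
print(zero_counts)
```

On the real data, the same function could be applied to the DataFrame loaded from the project's CSV file.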

In the branch version, available here, every row with a missing value in a feature column was dropped from the dataset to observe the impact on the model.
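Dropping incomplete rows could look roughly like the sketch below. It assumes that a 0 in any of the listed columns marks a missing value; the column list, the helper name `drop_rows_with_missing`, and the sample rows are illustrative and not taken from the branch itself.

```python
import pandas as pd

# Assumption: columns where 0 stands for a missing measurement.
ZERO_AS_MISSING = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]

# Made-up rows standing in for the real dataset.
sample = pd.DataFrame({
    "Glucose": [148, 0, 183, 89],
    "BloodPressure": [72, 66, 64, 66],
    "SkinThickness": [35, 29, 0, 23],
    "Insulin": [94, 0, 0, 94],
    "BMI": [33.6, 26.6, 23.3, 28.1],
    "Outcome": [1, 0, 1, 0],
})


def drop_rows_with_missing(df, columns=ZERO_AS_MISSING):
    """Keep only the rows in which none of the given columns is 0."""
    complete = (df[columns] != 0).all(axis=1)
    return df[complete].reset_index(drop=True)


complete_rows = drop_rows_with_missing(sample)
print(len(sample), "rows before,", len(complete_rows), "rows after")
```

Comparing the row counts before and after shows how much data this strategy discards, which is the main trade-off against keeping or imputing the placeholder values.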

A third branch explores an alternative approach. This version, found here, uses random imputation to handle the missing values in the feature columns, replacing the zero values with random values.
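The page does not say how the random values are drawn. One common variant, shown here purely as an assumption, replaces each 0 with a value sampled at random from the observed (non-zero) entries of the same column. The helper name `random_impute`, the column list, and the sample rows are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the real dataset.
sample = pd.DataFrame({
    "Glucose": [148, 0, 183, 89],
    "Insulin": [0, 94, 0, 168],
    "Outcome": [1, 0, 1, 0],
})


def random_impute(df, columns, seed=0):
    """Replace 0s in each column with random draws from that column's non-zero values."""
    rng = np.random.default_rng(seed)  # seeded so the result is reproducible
    out = df.copy()                    # leave the caller's DataFrame untouched
    for col in columns:
        missing = out[col] == 0
        observed = out.loc[~missing, col].to_numpy()
        if missing.any() and observed.size > 0:
            out.loc[missing, col] = rng.choice(observed, size=int(missing.sum()))
    return out


imputed = random_impute(sample, ["Glucose", "Insulin"])
print(imputed)
```

Sampling from observed values keeps every imputed entry within the range actually seen in the column. Drawing from a fitted distribution or a fixed numeric range would be other ways to read "random values".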

Additionally, a fourth branch was developed using MS Azure. For detailed information on this branch, please refer to the corresponding documentation here.

Feel free to explore each version to see how the different strategies affect the logistic regression model.
