Machine-Learning

Lab Report 01: Linear Regression on Diabetes Dataset

Overview

This project applies Linear Regression to the Pima Indians Diabetes dataset to predict whether a person has diabetes (Outcome = 1) or not (Outcome = 0). The analysis includes data preprocessing, model training, and evaluation using common classification metrics.

Dataset

The dataset used is diabetes.csv, containing eight feature columns and one target column:

Pregnancies
Glucose
BloodPressure
SkinThickness
Insulin
BMI
DiabetesPedigreeFunction
Age
Outcome (Target: 1 = diabetic, 0 = non-diabetic)

Preprocessing Steps

Replace Zeros with Column Means:
In the columns Glucose, BloodPressure, SkinThickness, Insulin, and BMI, a zero is biologically implausible and is treated as a missing measurement. Each zero is replaced with the mean of the non-zero values in the same column.
Extreme Value Substitution:
- Set the glucose value of the first row to the maximum glucose in the dataset.
- Set the glucose value of every row with the minimum age to the minimum glucose in the dataset.
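
The two preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `preprocess`, the constant `ZERO_AS_MISSING`, and the demo rows are hypothetical, and the sketch assumes the maximum and minimum glucose are taken after the zero replacement and before either substitution.

```python
import pandas as pd

# Columns where a zero is treated as a missing measurement.
ZERO_AS_MISSING = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of df; the input frame is left untouched."""
    out = df.copy()

    # Step 1: replace zeros with the mean of the non-zero values per column.
    for col in ZERO_AS_MISSING:
        values = out[col].astype(float)
        non_zero_mean = values[values != 0].mean()
        out[col] = values.mask(values == 0, non_zero_mean)

    # Step 2: extreme value substitution.
    # Assumption: both extremes are computed before any substitution happens.
    max_glucose = out["Glucose"].max()
    min_glucose = out["Glucose"].min()
    out.iloc[0, out.columns.get_loc("Glucose")] = max_glucose
    out.loc[out["Age"] == out["Age"].min(), "Glucose"] = min_glucose
    return out


# Made-up rows for illustration only; these are not taken from diabetes.csv.
demo = pd.DataFrame({
    "Glucose": [0, 100, 140, 120],
    "BloodPressure": [70, 0, 80, 90],
    "SkinThickness": [20, 30, 0, 40],
    "Insulin": [0, 0, 100, 200],
    "BMI": [25.0, 0.0, 35.0, 30.0],
    "Age": [50, 21, 33, 21],
})
cleaned = preprocess(demo)
print(cleaned)
```

If the first row is also one of the minimum-age rows, the order of the two substitutions decides its final glucose value; here the minimum-age rule is applied last.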

Model Training

Model: LinearRegression from scikit-learn.
Train/Test Split: 80% training and 20% testing.
Target: The model predicts a continuous value, which is then rounded to the nearest integer to obtain a 0 or 1 class label.
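
A rough sketch of the training setup described above, using synthetic stand-in data so it runs without diabetes.csv. The random seed, the `random_state` value, and the clipping step are assumptions added for illustration. The report only says the predictions are rounded, but a linear model's output is unbounded, so plain rounding could in principle yield labels such as -1 or 2.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the real dataset): 200 rows, 8 features,
# and a binary target loosely tied to the first feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# 80% training / 20% testing; random_state is an assumption for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

# Continuous predictions; nothing constrains them to the [0, 1] interval.
y_score = model.predict(X_test)

# Round to the nearest integer, then clip so every label is 0 or 1.
# The clip is an added safeguard, not something stated in the report.
y_pred = np.clip(np.rint(y_score), 0, 1).astype(int)
print(y_pred[:10])
```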

Evaluation Metrics

After rounding the predictions, the model is evaluated using:

Accuracy
Precision
Recall
F1-Score
Confusion Matrix
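
For reference, the first four metrics can all be derived from the four cells of a binary confusion matrix. The sketch below assumes the layout [[TN, FP], [FN, TP]], which is the convention scikit-learn's `confusion_matrix` uses for labels 0 and 1. The helper name `classification_metrics` and the counts passed to it are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def classification_metrics(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Precision: of everything predicted positive, how much was truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything truly positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, not results from the notebook.
metrics = classification_metrics(tn=50, fp=10, fn=5, tp=35)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The zero-denominator guards return 0.0 when the model predicts no positives or the test set contains none; that fallback is a design choice of this sketch.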

Example Output

Confusion Matrix:
[[79 21]
 [27 27]]
Accuracy: 0.688
Precision: 0.563
Recall: 0.500
F1-Score: 0.530

Note

Linear Regression is not typically used for classification tasks. While it provides a rough baseline, more appropriate models (such as Logistic Regression or Decision Trees) are generally better suited for binary classification.
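
As a hedged illustration of the alternative mentioned above, the sketch below swaps in scikit-learn's `LogisticRegression` on synthetic stand-in data. It is not part of the lab report's notebook; the data, the seed, and `max_iter=1000` are assumptions. Unlike rounded linear-regression output, the classifier returns 0/1 labels directly, and its probability estimates always lie in [0, 1].

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (not the real dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# max_iter is raised as a precaution against convergence warnings.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)               # class labels, 0 or 1
y_proba = clf.predict_proba(X_test)[:, 1]  # estimated P(Outcome = 1)
print(y_pred[:10])
print(y_proba[:5])
```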
