- G. Venkata Sai Ram (22501A0557)
- Aakash Kodali (22501A0501)
- G. Rani (22501A0562)
- B.T. Thanusri (22501A0528)
- A. Mrudula (23505A0501)
- Develop a predictive model to identify potential liver dysfunction using machine learning.
- Utilize real-world medical data to analyze liver function parameters effectively.
- Improve diagnostic accuracy and enable early intervention in liver-related disorders.
- Gain experience in using Python libraries for data science and healthcare analytics.
- Data Collection: Acquired a liver patient dataset from Kaggle containing 583 records with 10 medical features.
- Data Preprocessing:
  - Handled missing values (e.g., imputation in the `Albumin_and_Globulin_Ratio` column).
  - Encoded categorical values such as `Gender`.
  - Scaled features using `StandardScaler` for model efficiency.
- Model Building:
  - Used Logistic Regression for binary classification.
  - Trained the model with a stratified train-test split so that both splits preserve the original class proportions.
- Evaluation:
  - Evaluated using Accuracy, Precision, Recall, F1-Score, AUC-ROC, and Confusion Matrix.
- Result:
  - Achieved reliable prediction accuracy; more advanced algorithms could improve it further in future iterations.
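
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the data frame is synthetic (random values, not patient records), only a subset of the columns is used, and the `target` column, the label rule, the median imputation strategy, the one-hot encoding of `Gender`, and the 75/25 split are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["Age", "Total_Bilirubin", "Albumin_and_Globulin_Ratio"]
CATEGORICAL = ["Gender"]


def make_synthetic_frame(n=200, seed=0):
    """Build a small stand-in frame; values are random, not real patient data."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "Age": rng.integers(18, 80, size=n),
        "Gender": rng.choice(["Male", "Female"], size=n),
        "Total_Bilirubin": rng.gamma(2.0, 1.0, size=n),
        "Albumin_and_Globulin_Ratio": rng.normal(1.0, 0.3, size=n),
    })
    # Blank out a few ratio values to mimic the missing entries in that column.
    df.loc[df.index[:4], "Albumin_and_Globulin_Ratio"] = np.nan
    # Hypothetical label rule so the classifier has some signal to learn.
    noise = rng.normal(0.0, 0.5, size=n)
    df["target"] = (df["Total_Bilirubin"] + noise > 2.0).astype(int)
    return df


def build_pipeline():
    """Impute and scale numeric columns, encode Gender, then fit the classifier."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # assumed strategy
        ("scale", StandardScaler()),
    ])
    pre = ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
    ])
    return Pipeline([("pre", pre), ("clf", LogisticRegression(max_iter=1000))])


def evaluate(df):
    """Stratified split, fit on the training part, score on the held-out part."""
    X, y = df[NUMERIC + CATEGORICAL], df["target"]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=42)
    model = build_pipeline().fit(X_tr, y_tr)
    pred = model.predict(X_te)
    proba = model.predict_proba(X_te)[:, 1]  # probability of class 1
    return {
        "accuracy": accuracy_score(y_te, pred),
        "precision": precision_score(y_te, pred, zero_division=0),
        "recall": recall_score(y_te, pred, zero_division=0),
        "f1": f1_score(y_te, pred, zero_division=0),
        "roc_auc": roc_auc_score(y_te, proba),
        "confusion_matrix": confusion_matrix(y_te, pred),
    }


metrics = evaluate(make_synthetic_frame())
```

Wrapping the imputer and scaler in a `Pipeline` means their statistics are learned from the training split only, which keeps information from the test split out of the preprocessing step.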
- Age: Patient age
- Gender: Male/Female
- Total_Bilirubin: Amount of bilirubin in blood
- Direct_Bilirubin: Conjugated bilirubin level
- Alkaline_Phosphotase: Enzyme level, important liver function indicator
- Alamine_Aminotransferase (ALT): Enzyme related to liver inflammation
- Aspartate_Aminotransferase (AST): Enzyme, often used in liver tests
- Total_Protiens: Protein level in the blood
- Albumin: Main protein in blood plasma
- Albumin_and_Globulin_Ratio: Protein ratio indicating liver health
- Dataset (Target):
  - 1: No liver dysfunction
  - 2: Possible liver dysfunction
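
As a hedged sketch of how these columns might be loaded and checked, the snippet below reads a CSV, verifies that the listed column names are present, and recodes `Dataset` into a 0/1 `target` column. The sample rows are made up for illustration, `load_liver_frame` and `target` are hypothetical names, and the 1/2 label meanings simply follow the listing above; it is worth confirming them against the dataset's own documentation before relying on the recoding.

```python
import io

import pandas as pd

# Column names as listed above (spellings kept exactly as given).
EXPECTED_COLUMNS = [
    "Age", "Gender", "Total_Bilirubin", "Direct_Bilirubin",
    "Alkaline_Phosphotase", "Alamine_Aminotransferase",
    "Aspartate_Aminotransferase", "Total_Protiens", "Albumin",
    "Albumin_and_Globulin_Ratio", "Dataset",
]

# Three made-up rows standing in for the real file; values are illustrative only.
SAMPLE_CSV = """Age,Gender,Total_Bilirubin,Direct_Bilirubin,Alkaline_Phosphotase,Alamine_Aminotransferase,Aspartate_Aminotransferase,Total_Protiens,Albumin,Albumin_and_Globulin_Ratio,Dataset
45,Male,0.9,0.2,180,25,30,6.8,3.4,1.0,1
52,Female,3.1,1.4,310,90,110,6.1,2.7,,2
38,Male,0.7,0.1,160,20,22,7.2,3.9,1.2,1
"""


def load_liver_frame(source):
    """Read the CSV, check the schema, and add a 0/1 target column."""
    df = pd.read_csv(source)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Recode per the listing above: 1 -> 0, 2 -> 1 (assumed label meanings).
    df["target"] = df["Dataset"].map({1: 0, 2: 1})
    return df


df = load_liver_frame(io.StringIO(SAMPLE_CSV))
missing_ratio = int(df["Albumin_and_Globulin_Ratio"].isna().sum())
```

With the real file, `source` would be the path to the downloaded CSV, and `missing_ratio` gives a quick count of the rows that need imputation.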