# BUILDING ML MODEL TO PREDICT SEPSIS

> BUSINESS UNDERSTANDING


Sepsis is a life-threatening condition that arises when the body's response to an infection becomes exaggerated, causing widespread inflammation, organ damage, and potentially organ failure. Because it can be triggered by many kinds of infection and progresses quickly, prompt medical intervention is essential. Symptoms include fever, rapid heart rate, breathing difficulties, low blood pressure, and altered mental status. Treatment typically involves antibiotics, intravenous fluids, and supportive care. Early detection is critical to a favorable prognosis.

This project focuses on the early detection of sepsis, framed as a classification problem. Timely identification markedly improves patient outcomes. The objective is to build a robust machine learning model for sepsis classification and deploy it as a web application using FastAPI, so that predictions can be served in real time.


- Goal: Build an ML model to predict sepsis
- Null hypothesis:
- Alternative Hypothesis:

#### Analytical Questions
1. 
2. 



> DATA UNDERSTANDING


- ID: number to represent patient ID
- PRG: Plasma glucose
- PL: Blood Work Result-1 (mu U/ml)
- PR: Blood Pressure (mm Hg)
- SK: Blood Work Result-2 (mm)
- TS: Blood Work Result-3 (mu U/ml)
- M11: Body mass index (weight in kg/(height in m)^2)
- BD2: Blood Work Result-4 (mu U/ml)
- Age: Patient's age (years)
- Insurance: Whether the patient holds a valid insurance card
- Sepsis: Positive if a patient in the ICU will develop sepsis, Negative otherwise
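
The data dictionary above can double as a quick schema check once a file is loaded. The following is a minimal sketch, not part of the original notebook: the helper name `missing_columns` and the one-row sample frame are hypothetical, and only the column names come from the list above.

```python
import pandas as pd

# Column names taken from the data dictionary above.
EXPECTED_COLUMNS = [
    "ID", "PRG", "PL", "PR", "SK", "TS",
    "M11", "BD2", "Age", "Insurance", "Sepsis",
]


def missing_columns(df, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from df, in order."""
    return [col for col in expected if col not in df.columns]


# Made-up one-row frame for illustration only (not real patient data).
sample = pd.DataFrame(
    {"ID": ["P001"], "PRG": [6], "Age": [50], "Sepsis": ["Positive"]}
)
print(missing_columns(sample))
```

An empty list means every documented column is present; anything else is worth investigating before preprocessing begins.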


### 1. Imports

In [None]:
# Data manipulation packages
import pandas as pd
import numpy as np

#Data visualization packages
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

# Preprocessing
from scipy.stats import chi2_contingency
from scipy.stats import ttest_ind
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, FunctionTransformer
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import StratifiedKFold

# Handling class imbalance by oversampling
from imblearn.over_sampling import SMOTE 

# Models
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
#from xgboost import XGBClassifier

# Metrics
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, roc_auc_score, roc_curve
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV

# Other Packages
import warnings
warnings.filterwarnings("ignore")

import joblib
import os

### 2. Loading datasets

In [None]:
# Load the test dataset
test_dataset = pd.read_csv('datasets/Paitients_Files_Test.csv')
test_dataset.head()

In [None]:
# Load the train dataset
train_dataset = pd.read_csv('datasets/Paitients_Files_Train.csv')
train_dataset.head()
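
After loading, a first look at size, missing values, duplicate rows, and class balance helps decide whether steps such as oversampling are needed. The sketch below is an illustration under assumptions: `summarize` is a hypothetical helper, and the toy frame stands in for the loaded data so the cell runs without the CSV files.

```python
import pandas as pd


def summarize(df, target="Sepsis"):
    """Collect basic facts about a frame: shape, missing cells, duplicate rows."""
    summary = {
        "shape": df.shape,
        "missing": int(df.isna().sum().sum()),      # total missing cells
        "duplicates": int(df.duplicated().sum()),   # fully repeated rows
    }
    # Class proportions are only reported when the target column is present.
    if target in df.columns:
        summary["class_share"] = df[target].value_counts(normalize=True).to_dict()
    return summary


# Toy data for illustration only; the values are made up.
toy = pd.DataFrame({
    "PRG": [6, 1, None, 1],
    "Age": [50, 31, 32, 31],
    "Sepsis": ["Positive", "Negative", "Positive", "Negative"],
})
print(summarize(toy))
```

In the notebook itself, the same call could be applied as `summarize(train_dataset)` and `summarize(test_dataset)`.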