
<h1 style='background-color:Green; font-family:newtimeroman; font-size:250%; text-align:center; border-radius: 15px 50px;' > Auto-Sklearn </h1>


The Auto-Sklearn architecture is composed of three phases: meta-learning, Bayesian optimization, and ensemble selection. The key idea of the meta-learning phase is to reduce the search space by learning from models that performed well on similar datasets. Next, the Bayesian optimization phase takes the search space produced by the meta-learning step and fits Bayesian models to find the optimal pipeline configuration. Finally, an ensemble is built by reusing the most accurate models found during Bayesian optimization. The figure below shows the Auto-Sklearn architecture.



<img src="https://miro.medium.com/max/1000/1*w8qIzewO97qdqmiZi69Maw.jpeg" width="1200px">
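
The ensemble selection phase described above can be sketched as greedy forward selection with replacement. This is a minimal illustration under stated assumptions, not Auto-Sklearn's actual implementation: `greedy_ensemble_selection` is a hypothetical helper, accuracy is used as the selection metric, and the validation probabilities are made up.

```python
import numpy as np

def greedy_ensemble_selection(val_probs, y_val, ensemble_size):
    """Greedy forward selection with replacement (hypothetical helper).

    val_probs: array of shape (n_models, n_samples, n_classes)
    y_val: integer class indices of shape (n_samples,)
    Returns the list of chosen model indices (repeats are allowed).
    """
    chosen = []
    running_sum = np.zeros_like(val_probs[0])
    for _ in range(ensemble_size):
        best_idx, best_acc = None, -1.0
        for idx in range(len(val_probs)):
            # Averaged probabilities if model `idx` joined the ensemble
            avg = (running_sum + val_probs[idx]) / (len(chosen) + 1)
            acc = np.mean(avg.argmax(axis=1) == y_val)
            if acc > best_acc:
                best_idx, best_acc = idx, acc
        chosen.append(best_idx)
        running_sum += val_probs[best_idx]
    return chosen

# Made-up validation labels and per-model class probabilities
y_val = np.array([0, 1, 2, 1])
val_probs = np.array([
    # model 0: wrong only on the last sample
    [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.5, 0.4, 0.1]],
    # model 1: right on the first and last samples
    [[0.6, 0.2, 0.2], [0.5, 0.3, 0.2], [0.4, 0.3, 0.3], [0.2, 0.7, 0.1]],
    # model 2: right on the two middle samples
    [[0.3, 0.4, 0.3], [0.2, 0.6, 0.2], [0.2, 0.2, 0.6], [0.4, 0.3, 0.3]],
])

selected = greedy_ensemble_selection(val_probs, y_val, ensemble_size=2)
print(selected)  # [0, 1]
```

Model 1 is weaker than model 0 on its own, but it is picked second because it corrects the one sample model 0 gets wrong. This is why an ensemble can beat its best single member.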

## **Fetal Health Classification**




<img src="https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSCrD9J44VL76nK3t4Az0dHxyJ5R_tidokCamYpZG_t81xzLLPY92i35kVR7MeUgB1Zcys&usqp=CAU" width="500px">





## Data Set Information:

Reduction of child mortality is reflected in several of the United Nations' Sustainable Development Goals and is a key indicator of human progress.
The UN expects that by 2030, countries end preventable deaths of newborns and children under 5 years of age, with all countries aiming to reduce under‑5 mortality to at least as low as 25 per 1,000 live births.

Parallel to the notion of child mortality is, of course, maternal mortality, which accounted for 295,000 deaths during and following pregnancy and childbirth (as of 2017). The vast majority of these deaths (94%) occurred in low-resource settings, and most could have been prevented.

In light of the above, cardiotocograms (CTGs) are a simple and affordable way to assess fetal health, allowing healthcare professionals to take action to prevent child and maternal mortality. The equipment works by sending ultrasound pulses and reading their response, thus shedding light on fetal heart rate (FHR), fetal movements, uterine contractions and more.



## Dataset in this link


[Here](https://www.kaggle.com/andrewmvd/fetal-health-classification)

In [None]:
pip install auto-sklearn

In [None]:
pip install xlrd

In [None]:
pip install autoviz

## Import Libraries

In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from autosklearn.classification import AutoSklearnClassifier

## Load the Fetal Health dataset

In [None]:
df= pd.read_csv('../input/fetal-health-classification/fetal_health.csv')
df.head()

In [None]:
df.info()

## Describe Dataset

In [None]:
df.describe().T

## Check Missing Data

In [None]:
df.isna()

In [None]:
df.isna().sum(axis=0)

## Visualization 

In [None]:
from autoviz.AutoViz_Class import AutoViz_Class
AV = AutoViz_Class()
target='fetal_health'
dfv = AV.AutoViz(filename="",sep=',', depVar=target, dfte=df, header=0, verbose=1, 
                 lowess=False, chart_format='svg', max_rows_analyzed=150000, max_cols_analyzed=30)

## ProfileReport Data

In [None]:
import pandas_profiling as pp
pp.ProfileReport(df)

In [None]:
# Pass the column by keyword; recent seaborn versions no longer accept it positionally
sns.countplot(x='fetal_health', data=df)

In [None]:
plt.figure(figsize=(20,10))
sns.heatmap(df.corr(), annot=True, fmt='.0%')

## Assigning values to features as X and target as y


In [None]:
X=df.drop(["fetal_health"],axis=1)
y=df["fetal_health"]

In [None]:
X[:5]

In [None]:
y[:5]

## Set up a standard scaler for the features

In [None]:
col_names = list(X.columns)
s_scaler = preprocessing.StandardScaler()
X_df= s_scaler.fit_transform(X)
X_df = pd.DataFrame(X_df, columns=col_names)   
X_df.describe().T

In [None]:
X_df.head()
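
Fitting the scaler on the full table, as above, lets the mean and variance of the future test rows influence the training features. A minimal sketch of the alternative ordering is shown here: split first, then fit the scaler on the training rows only. The data is synthetic, and the names `X_demo`, `y_demo` and the column names are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature table (hypothetical column names)
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.normal(loc=50, scale=10, size=(200, 3)),
                      columns=["f1", "f2", "f3"])
y_demo = pd.Series(rng.integers(1, 4, size=200), name="target")

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=23
)

# Fit on the training rows only, then reuse those statistics for the test rows
scaler = StandardScaler()
X_tr_scaled = pd.DataFrame(scaler.fit_transform(X_tr),
                           columns=X_tr.columns, index=X_tr.index)
X_te_scaled = pd.DataFrame(scaler.transform(X_te),
                           columns=X_te.columns, index=X_te.index)

print(X_tr_scaled.mean().round(6))  # close to 0 for every column
print(X_te_scaled.mean().round(3))  # near 0, but not exactly
```

The test columns are only approximately centered, which is expected: they are transformed with statistics the model could actually have known at training time.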

## Split into train and test sets

In [None]:
x_train, x_test, y_train, y_test = train_test_split(X_df, y, test_size=0.25, random_state=23)

In [None]:
print('Training Shape x:', x_train.shape)
print('Testing Shape x:', x_test.shape)
print('*****___________*****___________*****')
print('Training Shape y:', y_train.shape)
print('Testing Shape y:', y_test.shape)
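
When the target classes are imbalanced, a plain random split can leave the rare classes under-represented in the test set. The sketch below shows the `stratify` argument of `train_test_split`, which keeps the class proportions equal in both parts. The labels are synthetic and the 80/16/4 mix is an assumption for illustration, not the distribution of this dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic imbalanced labels: 160 of class 1, 32 of class 2, 8 of class 3
y_demo = np.array([1] * 160 + [2] * 32 + [3] * 8)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=23, stratify=y_demo
)

# Class counts for labels 1, 2, 3 in each part
train_counts = np.bincount(y_tr)[1:].tolist()
test_counts = np.bincount(y_te)[1:].tolist()
print(train_counts)  # [120, 24, 6]
print(test_counts)   # [40, 8, 2]
```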

## Auto-Sklearn Initialization

In [None]:
%%time
# time_left_for_this_task: total time limit in seconds for the whole search
# per_run_time_limit: time limit in seconds for each single model fit
# ensemble_size: number of models added to the final ensemble
# initial_configurations_via_metalearning: number of meta-learned configurations
#   used to warm-start the Bayesian optimization (0 disables meta-learning)
model = AutoSklearnClassifier(time_left_for_this_task=300, 
                              per_run_time_limit=9, 
                              ensemble_size=1, 
                              initial_configurations_via_metalearning=0)
# Init training
model.fit(x_train, y_train)

In [None]:
model.score(x_train, y_train)

In [None]:
model.score(x_test, y_test)

In [None]:
print(model.sprint_statistics())

## Accuracy Score

In [None]:
from sklearn.metrics import confusion_matrix, accuracy_score
y_pred = model.predict(x_test)
cm = confusion_matrix(y_test, y_pred)
print('CM:\n', cm)
print(f'Accuracy: {accuracy_score(y_test, y_pred) * 100:.2f} %')

In [None]:
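# Rows are true labels, columns are predicted labels (scikit-learn convention)
conf_matrix = confusion_matrix(y_test, y_pred)

print(f'Confusion Matrix: \n{conf_matrix}\n')

sns.heatmap(conf_matrix, annot=True, fmt='d')

## Performance Measures

In [None]:
# fetal_health has three classes, so the matrix is collapsed to a binary problem.
# Assumption: the lowest label (1, first row/column) is the negative class,
# and all remaining labels together form the positive class.
tn = conf_matrix[0, 0]
fp = conf_matrix[0, 1:].sum()
tp = conf_matrix[1:, 1:].sum()
fn = conf_matrix[1:, 0].sum()

total = tn + fp + tp + fn
real_positive = tp + fn
real_negative = tn + fp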

## All Measurements

In [None]:
accuracy  = (tp + tn) / total # Accuracy Rate
precision = tp / (tp + fp) # Positive Predictive Value
recall    = tp / (tp + fn) # True Positive Rate
f1score  = 2 * precision * recall / (precision + recall)
specificity = tn / (tn + fp) # True Negative Rate
error_rate = (fp + fn) / total # Misclassification Rate
prevalence = real_positive / total
miss_rate = fn / real_positive # False Negative Rate
fall_out = fp / real_negative # False Positive Rate

print(f'Accuracy    : {accuracy}')
print(f'Precision   : {precision}')
print(f'Recall      : {recall}')
print(f'F1 score    : {f1score}')
print(f'Specificity : {specificity}')
print(f'Error Rate  : {error_rate}')
print(f'Prevalence  : {prevalence}')
print(f'Miss Rate   : {miss_rate}')
print(f'Fall Out    : {fall_out}')
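
For a target with more than two classes, the same measures can also be computed per class in a one-vs-rest fashion. The sketch below assumes a square confusion matrix with true labels in rows and predicted labels in columns; `one_vs_rest_metrics` is a hypothetical helper and the matrix values are made up.

```python
import numpy as np

def one_vs_rest_metrics(cm):
    """Per-class precision, recall and specificity (hypothetical helper).

    cm: square confusion matrix, rows = true labels, columns = predicted labels.
    A class that is never predicted would cause a division by zero here.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp      # predicted as the class, actually another
    fn = cm.sum(axis=1) - tp      # actually the class, predicted as another
    tn = cm.sum() - tp - fp - fn  # everything not involving the class
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Made-up 3-class confusion matrix
cm_demo = [[50, 3, 2],
           [4, 20, 1],
           [1, 2, 17]]

metrics = one_vs_rest_metrics(cm_demo)
for name, values in metrics.items():
    print(f"{name:12s}: {np.round(values, 3)}")
```

Each array holds one value per class, in the same order as the rows of the matrix.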

## Classification Report

In [None]:
from sklearn.metrics import classification_report
# scikit-learn expects the true labels first, then the predictions
print(classification_report(y_test, y_pred))

## Save the model

In [None]:
import joblib

In [None]:
joblib.dump(model, 'model2.pkl')
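
A saved model can be restored with `joblib.load`. The sketch below round-trips a small scikit-learn classifier as a stand-in for the Auto-Sklearn model and checks that the restored object predicts the same labels. The data, the estimator and the file name `demo_model.pkl` are assumptions for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in estimator trained on synthetic data
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 4))
y_demo = (X_demo[:, 0] > 0).astype(int)
clf = LogisticRegression().fit(X_demo, y_demo)

# Save to a temporary directory and load it back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pkl")
    joblib.dump(clf, path)
    restored = joblib.load(path)

same_predictions = np.array_equal(clf.predict(X_demo), restored.predict(X_demo))
print(same_predictions)  # True
```

Pickle-based files should only be loaded from trusted sources, and ideally with the same library versions that were used to save them.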