In [None]:
import numpy as np
import pandas as pd 
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

**Auto-Sklearn is an open-source library for performing AutoML in Python. It makes use of the popular Scikit-Learn machine learning library for data transforms and machine learning algorithms and uses a Bayesian Optimization search procedure to efficiently discover a top-performing model pipeline for a given dataset.**

In this tutorial, you will discover how to use Auto-Sklearn for AutoML with Scikit-Learn machine learning algorithms in Python.

After completing this tutorial, you will know:

- Auto-Sklearn is an open-source library for AutoML with scikit-learn data preparation and machine learning models.
- How to use Auto-Sklearn to automatically discover top-performing models for classification tasks.
- How to use Auto-Sklearn to automatically discover top-performing models for regression tasks.

**Installing the required dependencies**

In [None]:
!sudo apt-get install swig -y
!pip install Cython numpy

In [None]:
!pip install pipelineprofiler

In [None]:
!pip install auto-sklearn

In [None]:
import pandas as pd
import numpy as np
import PipelineProfiler
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from autosklearn.classification import AutoSklearnClassifier

In [None]:
data = pd.read_csv("../input/heart-disease-uci/heart.csv")

In [None]:
data

In [None]:
data.shape

In [None]:
data.info()

In [None]:
data.describe()

In [None]:
data.isnull().sum()

There are no null values in this dataset, so we can start training.

In [None]:
target_col = "target"
X = data.drop(columns=[target_col])  # feature columns: everything except the label
y = data[target_col]                 # label column

In [None]:
# Hold out 20% of the rows for testing; fixing random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
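
To make the effect of `test_size=0.20` concrete, here is a small sketch of how the row counts work out. It assumes that scikit-learn rounds the test count up and gives the remaining rows to the training set when only a fractional `test_size` is passed. The helper `split_sizes` and the row count of 103 are hypothetical and are not taken from the heart-disease data.

```python
import math


def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a fractional test_size.

    Assumption: the test count is rounded up and the training set
    receives whatever is left.
    """
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test


# Hypothetical dataset with 103 rows
n_train, n_test = split_sizes(103, 0.20)
print(n_train, n_test)
```

On the real data, comparing `X_train.shape` and `X_test.shape` is a quick way to confirm the sizes.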

In [None]:
model = AutoSklearnClassifier(time_left_for_this_task=2*60)

This gives Auto-Sklearn a total budget of 120 seconds to search for and train models.
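
The total budget is not the only setting that controls the search. As a hedged sketch, the settings below also cap the time for any single pipeline with `per_run_time_limit` and fix `seed`. The values 30 and 42 and the names `automl_settings` and `budgeted_model` are illustrative assumptions, not choices made in this notebook. The import is guarded only so that the snippet still runs where auto-sklearn is not installed.

```python
# Illustrative settings; the values are assumptions, not tuned recommendations.
automl_settings = {
    "time_left_for_this_task": 2 * 60,  # total search budget, in seconds
    "per_run_time_limit": 30,           # cap for fitting one candidate pipeline, in seconds
    "seed": 42,                         # seed passed to the search
}

try:
    from autosklearn.classification import AutoSklearnClassifier

    # Constructing the estimator does not start the search; fit() does.
    budgeted_model = AutoSklearnClassifier(**automl_settings)
except ImportError:
    # auto-sklearn is not available in this environment.
    budgeted_model = None

print(automl_settings)
```

Keeping the per-run limit well below the total budget leaves room for several candidate pipelines to be evaluated within the two minutes.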

In [None]:
model.fit(X_train, y_train)

In [None]:
# Predict on the held-out rows and compare against the true labels
y_pred = model.predict(X_test)
testing_accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy score: {testing_accuracy}")
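
The accuracy printed by the cell above is the fraction of test rows whose predicted label matches the true label. A number like this is easier to judge next to a simple baseline, such as always predicting the most frequent class. The sketch below shows both calculations in plain Python. The functions `accuracy` and `majority_baseline` and the toy labels are hypothetical and are not taken from the heart-disease data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Hypothetical labels for illustration only
toy_true = [1, 0, 1, 1, 0, 1, 0, 1]
toy_pred = [1, 0, 0, 1, 0, 1, 1, 1]

print(accuracy(toy_true, toy_pred))   # 6 of 8 predictions are correct
print(majority_baseline(toy_true))    # 5 of 8 labels belong to the majority class
```

On the real split, `majority_baseline(list(y_test))` gives the score a trivial model would reach, which puts the Auto-Sklearn result in context.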

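**The model reaches an accuracy of about 85% on the test set. Because the search is time-limited, the exact score may vary between runs.
Printing `model.show_models()` lists the models that make up the final ensemble.**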

In [None]:
print(model.show_models())

In [None]:
print(model.sprint_statistics())

In [None]:
profiler_data = PipelineProfiler.import_autosklearn(model)
PipelineProfiler.plot_pipeline_matrix(profiler_data)

**The pipeline matrix produced by the cell above shows the results graphically, so we can compare the performance of the different models.
It also shows a flowchart of the steps in each pipeline.**