In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
!pip install tensorflow_decision_forests

In [None]:
import tensorflow_decision_forests as tfdf
import pandas as pd
from sklearn.model_selection import train_test_split

# Examining Data

This database contains 76 attributes, but all published experiments use a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to date. The "goal" field refers to the presence of heart disease in the patient. It is integer-valued from 0 (no presence) to 4. In the `heart.csv` file used here, the label is stored in the `target` column and takes only two values, 0 and 1.

In [None]:
df = pd.read_csv("../input/heart-disease-uci/heart.csv")
df.head()

In [None]:
df.info()

Values of the `target` column:

* 0 --> No Risk of Heart Attack
* 1 --> Risk of Heart Attack

In [None]:
df["target"].value_counts()
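
The raw counts are easier to interpret as proportions, because the share of the most frequent class is the accuracy a model would reach by always predicting that class. The sketch below shows the idea on a small made-up series (the values are illustrative and are not taken from `heart.csv`); on the real data, `target` would be replaced by `df["target"]`.

```python
import pandas as pd

# Toy stand-in for df["target"]; these values are made up for illustration.
target = pd.Series([1, 1, 1, 0, 0, 1, 0, 1, 1, 0], name="target")

# Relative class frequencies instead of raw counts.
proportions = target.value_counts(normalize=True)

# Accuracy of a trivial model that always predicts the most frequent class.
majority_class = proportions.idxmax()
baseline_accuracy = proportions.max()

print(proportions)
print(f"Always predicting {majority_class} gives accuracy {baseline_accuracy:.2f}")
```

Any model evaluated later should beat this baseline to be considered useful.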

# TensorFlow Decision Forests

**About decision forests**

Decision forests are a family of machine learning algorithms with quality and speed competitive with (and often favorable to) neural networks, especially when you’re working with tabular data. They’re built from many decision trees, which makes them easy to use and understand - and you can take advantage of a plethora of interpretability tools and techniques that already exist today.

TF-DF provides a slew of state-of-the-art Decision Forest training and serving algorithms such as random forests, gradient-boosted trees, CART, (Lambda)MART, DART, Extra Trees, greedy global growth, oblique trees, one-side-sampling, categorical-set learning, random categorical learning, out-of-bag evaluation and feature importance, and structural feature importance.

Below, we first divide the dataset into X_train, X_test, y_train, y_test.

In [None]:
X = df.drop(["target"], axis=1)
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=2, stratify=y)
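
The `stratify=y` argument asks `train_test_split` to keep the class proportions the same in both parts of the split. A minimal check of that behaviour, on a made-up toy frame (the names `toy`, `X_toy`, and `y_toy` are hypothetical and not part of the notebook), might look like this:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy data (made up): 100 rows, 30% of them in the positive class.
toy = pd.DataFrame({"feature": range(100), "target": [1] * 30 + [0] * 70})
X_toy = toy.drop(["target"], axis=1)
y_toy = toy["target"]

# Same arguments as in the notebook cell above, applied to the toy data.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.3, random_state=2, stratify=y_toy
)

# With stratification, the positive rate is preserved in both splits.
train_rate = y_tr.mean()
test_rate = y_te.mean()
print(f"train positive rate: {train_rate:.2f}, test positive rate: {test_rate:.2f}")
```

The same comparison can be run on `y_train.mean()` and `y_test.mean()` from the real split.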

Before converting the data for TensorFlow, we recombine the features and labels with the concat() function: X_train with y_train, and X_test with y_test. We assign the results to two variables, train_df and test_df.

In [None]:
train_df = pd.concat([X_train, y_train], axis=1)
test_df = pd.concat([X_test, y_test], axis=1)
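
Recombining this way is safe because `pd.concat(..., axis=1)` aligns rows on the index, and `train_test_split` keeps the original index labels on the shuffled rows. The sketch below illustrates the alignment with two small made-up objects (`X_part` and `y_part` are hypothetical names) whose rows are deliberately in different orders:

```python
import pandas as pd

# Toy objects (made up) that share index labels but list them in different orders.
X_part = pd.DataFrame({"age": [63, 41, 57]}, index=[7, 2, 5])
y_part = pd.Series([1, 0, 1], index=[5, 7, 2], name="target")

# axis=1 places the objects side by side and matches rows by index label,
# not by position.
combined = pd.concat([X_part, y_part], axis=1)

# Row 7 keeps its own label (0), even though it is first in X_part
# and second in y_part.
print(combined)
```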

The TF-DF Keras models are trained here on TensorFlow datasets rather than directly on a pandas DataFrame, so we convert the data with the **pd_dataframe_to_tf_dataset** function. As parameters, we pass the train_df and test_df DataFrames created above and name "target", the column we want to predict, as the label. After this step, our TensorFlow datasets are ready.

In [None]:
# Convert the dataset into a TensorFlow dataset.
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_df, label="target")
test_ds = tfdf.keras.pd_dataframe_to_tf_dataset(test_df, label="target")

## Models in the TensorFlow Decision Forests Library

With the help of the **get_all_models()** function, we can see the available models that we can use in tfdf.

In [None]:
tfdf.keras.get_all_models()

### RandomForestModel

In [None]:
# Train the model
model = tfdf.keras.RandomForestModel()
model.fit(train_ds)

In [None]:
# Evaluate the model
model.compile(metrics=["accuracy"])
print(model.evaluate(test_ds))

### GradientBoostedTreesModel

In [None]:
# Train the model
model = tfdf.keras.GradientBoostedTreesModel()
model.fit(train_ds)

In [None]:
# Evaluate the model
model.compile(metrics=["accuracy"])
print(model.evaluate(test_ds))

### CartModel

In [None]:
# Train the model
model = tfdf.keras.CartModel()
model.fit(train_ds)

In [None]:
# Evaluate the model
model.compile(metrics=["accuracy"])
print(model.evaluate(test_ds))