## Introduction
Greetings and welcome! In today's lesson, we explore a crucial and yet often overlooked topic: Bridging TensorFlow and Scikit-Learn through Keras Wrappers. Despite TensorFlow's powerful capabilities for machine learning, it doesn't offer the extensive utilities for model selection and evaluation that Scikit-learn provides. Tools such as cross-validation, grid search, and various metrics are invaluable. But how can we use them with TensorFlow models? The answer lies in Wrappers, specifically the KerasClassifier Wrapper, which will be our primary focus today.

## Understanding KerasClassifier
KerasClassifier is a wrapper designed to bridge the functionality of TensorFlow's Keras deep learning module with Scikit-learn's powerful tools for model selection and evaluation. Keras, an integral part of TensorFlow, is an open-source software library that provides a Python interface for artificial neural networks. For newer versions of TensorFlow, these wrappers are included in a separate library called scikeras, which specifically facilitates the integration between TensorFlow and Scikit-learn. To understand it better, let's dive into the implementation:

The first step is to define a build function that returns a model to the wrapper. This model is suited for the Iris dataset we have been using throughout this course:

```Python
import tensorflow as tf

def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model
```
Next, we import the wrapper and create an instance of KerasClassifier. Its model parameter takes a function that constructs and compiles a model. Additional arguments, such as epochs and verbose, are passed on to fit when the model is trained.

```Python
from scikeras.wrappers import KerasClassifier

model = KerasClassifier(model=build_model, epochs=30, verbose=0)
```
In this instance, we use the build_model function we defined earlier. epochs is set to 30, the number of passes over the training data used to train the model, and verbose is set to 0, meaning no output is shown during training.
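To make the role of the build function concrete, here is a minimal pure-Python sketch of the pattern. It is not scikeras code: `TinyWrapper` and `FakeModel` are hypothetical stand-ins, used only to illustrate that a wrapper of this kind stores the build function and the training arguments, then builds a fresh model and forwards those arguments each time fit is called.

```python
class FakeModel:
    """Hypothetical stand-in for a compiled Keras model."""

    def __init__(self):
        self.fit_calls = []

    def fit(self, X, y, **kwargs):
        # Record the keyword arguments the wrapper forwarded
        self.fit_calls.append(kwargs)
        return self


class TinyWrapper:
    """Illustrative only; the real KerasClassifier does much more."""

    def __init__(self, model, **fit_kwargs):
        self.model = model            # a function that builds a model
        self.fit_kwargs = fit_kwargs  # e.g. epochs=30, verbose=0
        self.model_ = None

    def fit(self, X, y):
        # Build a fresh, untrained model on every call to fit
        self.model_ = self.model()
        self.model_.fit(X, y, **self.fit_kwargs)
        return self


wrapper = TinyWrapper(model=FakeModel, epochs=30, verbose=0)
wrapper.fit([[0.1, 0.2]], [0])
first_model = wrapper.model_
wrapper.fit([[0.1, 0.2]], [0])

print(wrapper.model_.fit_calls)       # [{'epochs': 30, 'verbose': 0}]
print(wrapper.model_ is first_model)  # False
```

Passing a function rather than an already built model matters for the next section: each round of cross-validation needs a model that starts from untrained weights.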

## K-Fold Cross-validation with KerasClassifier
One of the many powerful utilities Scikit-learn provides is K-Fold Cross-validation. It's a resampling procedure used to evaluate machine learning models on a limited data sample. The dataset is divided into K folds (subsets) of roughly equal size. The model is then trained on K-1 folds, and the remaining fold is used as the test fold. This process is repeated K times, each time with a different test fold.
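The splitting procedure described above can be sketched in a few lines of plain Python. The `kfold_indices` helper below is a hypothetical illustration that uses contiguous, unshuffled folds; Scikit-learn's own KFold class remains the reference implementation and offers more options.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) pairs for contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


splits = list(kfold_indices(n_samples=10, n_splits=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Every sample lands in the test fold exactly once, and in the training portion of all the other rounds.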

Now, we will apply K-Fold Cross-validation to our model using Scikit-learn's cross_val_score function. For this, we load the preprocessed Iris dataset with the load_preprocessed_data function, which is implemented in a separate file named data_preprocessing.py:

```Python
from data_preprocessing import load_preprocessed_data
from sklearn.model_selection import cross_val_score, KFold

# Load preprocessed Iris dataset
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Define KFold
kfold = KFold(n_splits=5)

# Compute the accuracy scores
scores = cross_val_score(model, X_train, y_train, cv=kfold)
```
In this example, we first create a KFold object kfold with 5 splits. This object is then passed as the cv parameter to the cross_val_score function that computes the cross-validation scores for our model using the training data and the specified K-Fold cross-validator.

When no scoring argument is given, cross_val_score calls the estimator's own score method, which for classifiers such as KerasClassifier reports accuracy. Our model works with Scikit-learn because the KerasClassifier wrapper ensures the TensorFlow Keras model conforms to the Scikit-learn estimator interface, allowing seamless integration for tasks such as cross-validation, hyperparameter tuning, and other model evaluation strategies.
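As a rough illustration of what conforming to the estimator interface buys us, the sketch below runs a hand-written cross-validation loop over any object that offers fit, predict, and score. Both `MajorityClassifier` and `manual_cross_val_score` are hypothetical names invented for this example; the real cross_val_score handles cloning, scoring options, and parallelism that are left out here.

```python
import copy


class MajorityClassifier:
    """Hypothetical estimator: always predicts the most common training label."""

    def fit(self, X, y):
        self.majority_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]

    def score(self, X, y):
        # Accuracy: fraction of predictions that match the true labels
        predictions = self.predict(X)
        return sum(p == t for p, t in zip(predictions, y)) / len(y)


def manual_cross_val_score(estimator, X, y, n_splits):
    """Simplified stand-in for cross_val_score using contiguous folds."""
    fold_size = len(X) // n_splits  # assumes len(X) is divisible by n_splits
    fold_scores = []
    for k in range(n_splits):
        lo, hi = k * fold_size, (k + 1) * fold_size
        X_test, y_test = X[lo:hi], y[lo:hi]
        X_train, y_train = X[:lo] + X[hi:], y[:lo] + y[hi:]
        # Work on a copy so every fold starts from an unfitted estimator
        model = copy.deepcopy(estimator)
        model.fit(X_train, y_train)
        fold_scores.append(model.score(X_test, y_test))
    return fold_scores


X_demo = [[i] for i in range(6)]
y_demo = [0, 0, 1, 0, 0, 0]
demo_scores = manual_cross_val_score(MajorityClassifier(), X_demo, y_demo, n_splits=3)
print(demo_scores)  # [1.0, 0.5, 1.0]
```

The loop never inspects what kind of model it is given. Anything with the expected methods can take part, which is the role the wrapper plays for a Keras model.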

## Displaying Cross-validation Scores
Cross-validation scores offer a good estimate of how well our model will perform on unseen data. Because the model is tested on several different subsets of the data, the scores also help reveal overfitting that a single train/test split might hide. In this case, the scores represent the accuracy of the model on each fold. After carrying out cross-validation using cross_val_score, it's time to print the scores:

```Python
print("Cross-validation scores:\n", scores)
```
The output will be:

```sh
Cross-validation scores:
 [0.80952381 0.85714286 0.61904762 0.66666667 0.57142857]
```
This output indicates differing performance across different folds, reflecting the model's ability to generalize across different subsets of the Iris dataset. The variation in scores suggests that further tuning or model enhancements could lead to improvement in overall performance.
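A common way to summarize per-fold scores is to report their mean and spread. The short sketch below does this for the five values from the output shown above, using only the standard library; the variable names are chosen for this example.

```python
from statistics import mean, stdev

# Per-fold accuracies copied from the output shown above
fold_scores = [0.80952381, 0.85714286, 0.61904762, 0.66666667, 0.57142857]

mean_accuracy = mean(fold_scores)
spread = stdev(fold_scores)  # sample standard deviation across folds

print(f"Mean accuracy: {mean_accuracy:.3f} (+/- {spread:.3f})")
# Mean accuracy: 0.705 (+/- 0.123)
```

A spread this large relative to the mean is another way of seeing the fold-to-fold variation noted above.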

## Lesson Summary and Practice
Great work! You now understand how to bridge TensorFlow and Scikit-learn's functionalities through wrappers such as KerasClassifier. This strategy plays a vital role in machine learning workflows, especially in model selection and evaluation, where Scikit-learn excels.

To master these techniques, practice is crucial. That's exactly what we are going to do next, with a series of coding exercises designed to solidify your understanding and skills. Happy coding!



## Bridging Libraries for Evaluation

In this task, you will see how to utilize the KerasClassifier wrapper to bridge TensorFlow and Scikit-Learn.

By running this code, you'll evaluate a TensorFlow model using K-Fold cross-validation provided by Scikit-Learn. This task demonstrates the seamless integration between the two powerful libraries, allowing for efficient model evaluation.

Simply run the code to observe the cross-validation scores output.

```python
import tensorflow as tf
from data_preprocessing import load_preprocessed_data
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import cross_val_score, KFold
# Suppress TensorFlow warnings
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)

# Load preprocessed Iris dataset
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Function to build the model
def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model


# Wrap the model with KerasClassifier
model = KerasClassifier(model=build_model, epochs=30, verbose=0)

# Evaluate using 5-fold cross-validation
kfold = KFold(n_splits=5)

# Perform cross-validation
scores = cross_val_score(model, X_train, y_train, cv=kfold)

# Display the cross-validation scores
print("Cross-validation scores:\n", scores)

```

To modify the KFold cross-validation from 5 splits to 7 and increase the number of epochs in the `KerasClassifier` to 50, we just need to update the respective parameters in the code.

Here's the updated code:

```python
import tensorflow as tf
from data_preprocessing import load_preprocessed_data
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import cross_val_score, KFold
# Suppress TensorFlow warnings
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)

# Load preprocessed Iris dataset
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Function to build the model
def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Wrap the model with KerasClassifier, set epochs to 50
model = KerasClassifier(model=build_model, epochs=50, verbose=0)

# Evaluate using 7-fold cross-validation
kfold = KFold(n_splits=7)

# Perform cross-validation
scores = cross_val_score(model, X_train, y_train, cv=kfold)

# Display the cross-validation scores
print("Cross-validation scores:\n", scores)
```

### Changes Made:
1. **KFold Splits**:
   - 🛠️ **Before**: `n_splits=5`
   - 🛠️ **After**: `n_splits=7`

2. **KerasClassifier Epochs**:
   - 🛠️ **Before**: `epochs=30`
   - 🛠️ **After**: `epochs=50`

### What These Changes Do:
- **KFold Cross-Validation (`n_splits=7`)**:
   - The dataset is split into 7 parts instead of 5. This means that for each iteration, the model will be trained on 6/7 of the data and validated on 1/7, repeating this process 7 times with different validation sets.

- **Increased Epochs (`epochs=50`)**:
   - The model will train for 50 epochs in each fold, giving it more opportunities to learn the data patterns.

### Conclusion
By increasing the number of KFold splits and epochs, the model's training and validation process is adjusted to potentially yield more stable and accurate results across different data splits. The increased epochs provide a more thorough training within each fold.
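To put rough numbers on the two changes, the sketch below compares fold sizes and total training effort for both settings. The training-set size of 105 samples is an assumption made for illustration (the actual size depends on what `load_preprocessed_data` returns), and `cv_workload` is a hypothetical helper.

```python
def cv_workload(n_samples, n_splits, epochs):
    """Return per-fold sizes and total epochs for one cross-validation run."""
    # Assumes n_samples divides evenly by n_splits to keep the arithmetic simple
    test_size = n_samples // n_splits
    train_size = n_samples - test_size
    total_epochs = n_splits * epochs  # one full training run per fold
    return train_size, test_size, total_epochs


N_SAMPLES = 105  # assumed training-set size, not taken from the lesson

before = cv_workload(N_SAMPLES, n_splits=5, epochs=30)
after = cv_workload(N_SAMPLES, n_splits=7, epochs=50)

print("5 folds, 30 epochs:", before)  # (84, 21, 150)
print("7 folds, 50 epochs:", after)   # (90, 15, 350)
```

Under this assumption, each model trains on slightly more data but is validated on fewer samples per fold, and the whole run costs more than twice as many epochs.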

## Adjust Cross Validation Parameters
Great progress so far! Let's make some changes to deepen your understanding. Modify the number of splits of our KFold cross validation from 5 to 7. Also, set the number of epochs in our KerasClassifier to 50.

These changes will help you see how easy it is to adjust our model wrapper and work with sklearn.

```python
import tensorflow as tf
from data_preprocessing import load_preprocessed_data
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import cross_val_score, KFold
# Suppress TensorFlow warnings
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)

# Load preprocessed Iris dataset
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Function to build the model
def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Wrap the model with KerasClassifier
model = KerasClassifier(model=build_model, epochs=30, verbose=0)

# Evaluate using 5-fold cross validation
kfold = KFold(n_splits=5)

# Perform cross-validation
scores = cross_val_score(model, X_train, y_train, cv=kfold)

# Display the cross-validation scores
print("Cross-validation scores:\n", scores)
```

## Fix K-Fold Cross-Validation Bug

You've done well in learning how to bridge TensorFlow and Scikit-learn using Keras Wrappers. Now, let's test your debugging skills. Below is the code designed to evaluate a TensorFlow model using K-Fold cross-validation, but it contains a bug.

A common mistake has been introduced in the code, and your task is to identify and correct it. Debugging is crucial for mastering machine learning workflows, so let's dive in!

```python
import tensorflow as tf
from data_preprocessing import load_preprocessed_data
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import cross_val_score, KFold
# Suppress TensorFlow warnings
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)

# Load preprocessed Iris dataset
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Function to build the model
def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Wrap the model with KerasClassifier
model = KerasClassifier(model=build_model, epochs=30, verbose=0)

# Evaluate using 5-fold cross validation 
kfold = KFold(n_splits=5)

# Perform cross-validation with KerasClassifier
scores = cross_val_score(build_model, X_train, y_train, cv=kfold)

# Display the cross-validation scores
print("Cross-validation scores:\n", scores)

```

The issue in the code lies in the way the `cross_val_score` function is being called. Specifically, the `cross_val_score` function expects an instance of the `KerasClassifier`, but the code is passing the `build_model` function instead. This mistake will raise a `TypeError` because `build_model` is not an estimator that Scikit-learn can use directly.

### **Fix:**
Pass the `model` (the `KerasClassifier` instance) to `cross_val_score` instead of `build_model`.

Here is the corrected code:

```python
import tensorflow as tf
from data_preprocessing import load_preprocessed_data
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import cross_val_score, KFold
# Suppress TensorFlow warnings
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)

# Load preprocessed Iris dataset
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Function to build the model
def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Wrap the model with KerasClassifier
model = KerasClassifier(model=build_model, epochs=30, verbose=0)

# Evaluate using 5-fold cross-validation 
kfold = KFold(n_splits=5)

# Perform cross-validation with KerasClassifier
scores = cross_val_score(model, X_train, y_train, cv=kfold)

# Display the cross-validation scores
print("Cross-validation scores:\n", scores)
```

### **Explanation of the Fix:**

- **Original Issue**: The `cross_val_score` function was incorrectly passed the `build_model` function instead of the `model` instance. Since `build_model` is just a function and not an estimator, this would cause an error when Scikit-learn tries to use it for cross-validation.

- **Correct Approach**: Pass the `model` (which is an instance of `KerasClassifier`) to `cross_val_score`. This ensures that Scikit-learn can properly handle the Keras model for cross-validation.

### **Outcome**:
With this fix, the code will correctly perform K-Fold cross-validation on the TensorFlow model wrapped by `KerasClassifier`, and it will output the cross-validation scores as expected.
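The difference between the two objects can be checked without training anything. The sketch below is a simplified illustration and not Scikit-learn's actual validation logic: `WrappedModel` is a hypothetical stand-in for the `KerasClassifier` instance, and `looks_like_estimator` only tests for the `fit` method that an estimator is expected to provide.

```python
def build_model():
    """Placeholder for the model-building function; returns nothing here."""
    return None


class WrappedModel:
    """Hypothetical stand-in for a KerasClassifier instance."""

    def fit(self, X, y):
        return self


def looks_like_estimator(obj):
    # An estimator must at least expose a callable fit method
    return callable(getattr(obj, "fit", None))


print(looks_like_estimator(build_model))     # False
print(looks_like_estimator(WrappedModel()))  # True
```

A plain function has no `fit` method, so it cannot stand in for an estimator, whereas the wrapped object can.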

## Integrate Scikit-Learn with TensorFlow

Now that you have seen it working in practice, let's see if you can implement part of the code by yourself.

In this exercise, you will complete the code where the model is wrapped into a KerasClassifier. This step is necessary to seamlessly integrate our model with the Scikit-Learn modules.

Fill in the missing parts denoted by TODO comments.

```python
import tensorflow as tf
from data_preprocessing import load_preprocessed_data
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import cross_val_score, KFold
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)

# Load preprocessed Iris dataset
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Function to build the model
def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# TODO: Wrap the model with KerasClassifier set with 30 epochs

# Evaluate using 5-fold cross-validation
kfold = KFold(n_splits=5)

# Perform cross-validation
scores = cross_val_score(model, X_train, y_train, cv=kfold)

# Display the cross-validation scores
print("Cross-validation scores:\n", scores)

```

To complete the code, you need to wrap the `build_model` function in a `KerasClassifier` and specify the number of epochs. The wrapped model will be used for cross-validation with Scikit-Learn's `cross_val_score`.

Here’s how you can fill in the missing part:

```python
import tensorflow as tf
from data_preprocessing import load_preprocessed_data
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import cross_val_score, KFold
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)

# Load preprocessed Iris dataset
X_train, X_test, y_train, y_test = load_preprocessed_data()

# Function to build the model
def build_model():
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(4,)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(3, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# Wrap the model with KerasClassifier, set to 30 epochs
model = KerasClassifier(model=build_model, epochs=30, verbose=0)

# Evaluate using 5-fold cross-validation
kfold = KFold(n_splits=5)

# Perform cross-validation
scores = cross_val_score(model, X_train, y_train, cv=kfold)

# Display the cross-validation scores
print("Cross-validation scores:\n", scores)
```

### Explanation:

- **Wrapping the Model**:
   - `KerasClassifier(model=build_model, epochs=30, verbose=0)`:
     - `model=build_model`: This tells `KerasClassifier` to use the `build_model` function to create the model.
     - `epochs=30`: This sets the number of epochs to 30 for each fold during cross-validation.
     - `verbose=0`: This suppresses the output during training for cleaner logs.

- **Cross-Validation**:
   - `cross_val_score(model, X_train, y_train, cv=kfold)`: 
     - This function performs cross-validation using the Keras model wrapped as a Scikit-learn estimator, providing the scores for each fold.

### Outcome:

When you run the code, it will perform 5-fold cross-validation on the TensorFlow model, and you'll see the cross-validation scores printed, which indicate the accuracy for each fold. This integration allows you to leverage the strengths of both TensorFlow and Scikit-learn.

## Bridging TensorFlow and Scikit-Learn