<a href="https://colab.research.google.com/github/Ash100/Biopython/blob/main/Keras_Models_for_General_Machine_Learning_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##Using Your Keras Models with Scikit Learn for Optimization
I am **Dr. Ashfaq Ahmad**, and this notebook was created for teaching and research purposes. With readers from the field of biology in mind, I have tried my best to keep it as simple as possible. For detailed instructions and explanations, please watch the video tutorials at **https://www.youtube.com/@Bioinformaticsinsights**

This notebook is based on the book **Deep Learning with Python** by Jason Brownlee.

#Overview of this Notebook
Keras is a popular library for deep learning in Python. The scikit-learn library is a general-purpose machine learning library built upon the SciPy stack for numerical computation. Here we will learn how to:

- Evaluate models using resampling methods such as k-fold cross-validation.
- Evaluate model hyperparameters using a grid search.


In this section we will work through examples of using the *KerasClassifier wrapper* for a classification neural network created in Keras and used with the scikit-learn library.

#1. Evaluation of Model with Cross-Validation
The KerasClassifier and KerasRegressor wrapper classes take an argument that names the function to call to create your model. The original Keras wrappers called this argument *build_fn*; SciKeras, which this notebook installs, calls it *model* and, in current releases, still accepts *build_fn* as a deprecated alias. First we define a function *create_model()* that creates a simple multilayer neural network for the problem.
Next, we pass this function to the KerasClassifier class through that argument.
We also pass the additional arguments epochs=150 and batch_size=10, which are forwarded to the *fit()* function called internally by the KerasClassifier class. In this example we use the scikit-learn StratifiedKFold to perform 10-fold stratified cross-validation. This technique provides a robust estimate of the performance of a machine learning model on unseen data.
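
To make stratified splitting concrete, here is a minimal sketch that applies StratifiedKFold to a small synthetic label vector. The 65/35 class balance and the names `X_demo` and `y_demo` are assumptions made for illustration; this is not the diabetes data, and no neural network is trained.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic labels for illustration only: 65 negatives and 35 positives
y_demo = np.array([0] * 65 + [1] * 35)
# Placeholder features; StratifiedKFold only looks at the labels
X_demo = np.zeros((len(y_demo), 8))

kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)

fold_sizes = []
fold_positives = []
for train_idx, test_idx in kfold.split(X_demo, y_demo):
    # Record the size of each held-out fold and how many positives it contains
    fold_sizes.append(len(test_idx))
    fold_positives.append(int(y_demo[test_idx].sum()))

for i, (size, pos) in enumerate(zip(fold_sizes, fold_positives), start=1):
    print("fold %2d: %d test samples, %d positive" % (i, size, pos))
```

Each held-out fold receives three or four of the 35 positive samples, so every fold stays close to the overall class ratio. This is what distinguishes stratified k-fold from a plain random split.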

#Required Installations

In [None]:
!pip install --upgrade keras

In [None]:
!pip install --upgrade scikit_learn

Install only one of the following, depending on whether you will run on a CPU or a GPU.

In [None]:
!pip install scikeras[tensorflow-cpu]

In [None]:
!pip install scikeras[tensorflow]      # gpu compute platform


In [1]:
# MLP for Pima Indians Dataset with 10-fold cross validation via sklearn
from keras.models import Sequential
from keras.layers import Dense
# KerasClassifier comes from SciKeras; the old tensorflow.keras.wrappers.scikit_learn module was removed in recent TensorFlow releases
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score
import numpy

In [3]:
# Function to create model, required for KerasClassifier
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

In [None]:
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load pima indians dataset
dataset = numpy.loadtxt("/content/sample_data/diabetes_1.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]
# create model (SciKeras names the build-function argument `model`; `build_fn` is its older, deprecated name)
model = KerasClassifier(model=create_model, epochs=150, batch_size=10, verbose=0)
# evaluate using 10-fold cross validation
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(model, X, Y, cv=kfold)
print(results.mean())
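
cross_val_score returns one score per fold, so it can help to look at the spread as well as the mean. The sketch below illustrates this with stand-ins: a scikit-learn LogisticRegression on synthetic data replaces the Keras model and the diabetes file so that it runs in a few seconds. The names `X_demo`, `y_demo`, `clf_demo` and `kfold_demo` are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in data (assumption): 200 samples, 8 features, 2 classes
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=7)

# Stand-in estimator (assumption): any scikit-learn compatible classifier can be used here
clf_demo = LogisticRegression(max_iter=1000)

kfold_demo = StratifiedKFold(n_splits=10, shuffle=True, random_state=7)
scores = cross_val_score(clf_demo, X_demo, y_demo, cv=kfold_demo)

# One accuracy value per fold; report the mean and the spread
print("fold accuracies:", np.round(scores, 3))
print("mean accuracy: %.3f (std %.3f)" % (scores.mean(), scores.std()))
```

The same two print statements can be applied to the `results` array of the Keras example above, since the wrapped model is evaluated through the same scikit-learn interface.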

#2. Grid Search Deep Learning Model Parameters
We already know that we can provide arguments to the *fit()* function. The function that builds the model can also take arguments, and we can use these arguments to **further customize the construction of the model**.
In this example we use a grid search to evaluate different configurations for our neural network model and report the combination that provides the best estimated performance.
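
To get a feel for how much work an exhaustive grid search involves, the sketch below enumerates a grid with Python's itertools.product. The candidate values match the grid search code in this section. The name `grid_demo` is made up for this illustration, and the five folds per combination are an assumption (this is the default of GridSearchCV in recent scikit-learn releases when *cv* is not given).

```python
from itertools import product

# Candidate values, matching the grid search in this section
grid_demo = {
    "optimizer": ["rmsprop", "adam"],
    "init": ["glorot_uniform", "normal", "uniform"],
    "epochs": [50, 100, 150],
    "batch_size": [5, 10, 20],
}

# Build every combination of one value per hyperparameter
names = list(grid_demo)
combinations = [dict(zip(names, values)) for values in product(*grid_demo.values())]

# Assumption: 5-fold cross-validation is run for each combination
n_folds = 5
total_fits = len(combinations) * n_folds

print("combinations:", len(combinations))
print("model fits:", total_fits)
print("first combination:", combinations[0])
```

With these values the grid contains 54 combinations, so 270 models are trained under the 5-fold assumption. This is why a grid search over a neural network can take a long time even on a small dataset.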

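The *create_model()* function is defined to take two arguments, *optimizer* and *init*, both of which have sensible default values. Varying them, together with the number of epochs and the batch size, lets us evaluate different optimization and weight initialization schemes for our model.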

In [None]:
# fix random seed for reproducibility
import numpy
seed = 7
numpy.random.seed(seed)

# Load the necessary modules
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# KerasClassifier comes from SciKeras; tensorflow.keras.wrappers.scikit_learn was removed in recent TensorFlow releases
from scikeras.wrappers import KerasClassifier
from sklearn.model_selection import GridSearchCV

# Function to create model, required for KerasClassifier
def create_model(optimizer='adam', init='glorot_uniform'):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer=init, activation='relu'))
    model.add(Dense(8, kernel_initializer=init, activation='relu'))
    model.add(Dense(1, kernel_initializer=init, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# load pima indians dataset
dataset = numpy.loadtxt("/content/sample_data/diabetes_1.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]

# create model (SciKeras names the build-function argument `model`)
model = KerasClassifier(model=create_model, verbose=0)

# grid search epochs, batch size, optimizer and weight initializer
optimizers = ['rmsprop', 'adam']
init = ['glorot_uniform', 'normal', 'uniform']
epochs = [50, 100, 150]
batches = [5, 10, 20]
# the model__ prefix tells SciKeras to route a value to the matching create_model() argument
param_grid = dict(model__optimizer=optimizers, epochs=epochs, batch_size=batches, model__init=init)
grid = GridSearchCV(estimator=model, param_grid=param_grid)
grid_result = grid.fit(X, Y)
grid_result = grid.fit(X, Y)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

#What do we get?
The grid search shows that our model performs best with a batch size of 5 and 150 epochs.