<a href="https://colab.research.google.com/github/Chirag314/keras-tuner/blob/main/keras_tuner_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [5]:
!pip install -U keras_tuner
!pip install -U tensorflow-addons

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting keras_tuner
  Downloading keras_tuner-1.1.3-py3-none-any.whl (135 kB)
Collecting kt-legacy
  Downloading kt_legacy-1.0.4-py3-none-any.whl (9.6 kB)
Collecting jedi>=0.10
  Downloading jedi-0.18.1-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: jedi, kt-legacy, keras-tuner
Successfully installed jedi-0.18.1 keras-tuner-1.1.3 kt-legacy-1.0.4
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tensorflow-addons
  Downloading tensorflow_addons-0.18.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Installing collected packages: tensorflow-addons
Successfully installed tensorflow-addons-0.18.0


In [11]:
# KerasTuner example from the Kaggle book
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
import tensorflow_addons as tfa
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import LeakyReLU
from keras.layers import Activation
from keras.optimizers import SGD, Adam
from keras.wrappers.scikit_learn import KerasClassifier
from keras.callbacks import EarlyStopping, ModelCheckpoint
pad_sequences = keras.preprocessing.sequence.pad_sequences
imdb = keras.datasets.imdb
# Keep only the 10,000 most frequent words
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)
# Hold out 30% of the training reviews for validation
train_data, val_data, train_labels, val_labels = train_test_split(
    train_data, train_labels, test_size=0.3, shuffle=True, random_state=0)


In this dataset, every word is already mapped to an integer index. We only need to shift the existing indices to make room for four reserved codes: padding (so that every review can be brought to the same length), the start of a sequence, an unknown word, and an unused slot.

In [12]:
word_index = imdb.get_word_index()
# The first indices are reserved, so shift every word index by 3
word_index = {k: (v + 3) for k, v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2
word_index["<UNUSED>"] = 3
reverse_word_index = {value: key for key, value in word_index.items()}

def decode_review(text):
  # Join with spaces so the decoded words do not run together
  return ' '.join([reverse_word_index.get(i, '?') for i in text])
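
To make the index shift concrete, here is a minimal sketch on a toy vocabulary. The four-word `toy_word_index` is a hypothetical stand-in for the dictionary returned by `imdb.get_word_index()`, and the encoded review is made up for illustration.

```python
# Toy stand-in for imdb.get_word_index(); the real index is far larger.
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}

# Shift every index by 3 to free slots 0-3 for the special tokens.
word_index = {word: idx + 3 for word, idx in toy_word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2
word_index["<UNUSED>"] = 3

reverse_word_index = {idx: word for word, idx in word_index.items()}

def decode_review(encoded):
    """Map a list of integer codes back to a space-separated string."""
    return " ".join(reverse_word_index.get(i, "?") for i in encoded)

# A made-up encoded review: start token, four words, one unknown, two pads.
encoded = [1, 4, 5, 6, 7, 2, 0, 0]
decoded = decode_review(encoded)
print(decoded)  # <START> the movie was great <UNK> <PAD> <PAD>
```

After the shift, "the" moves from index 1 to index 4, so it no longer collides with the start-of-sequence code.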


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb_word_index.json


The next step is to define a custom attention block. Attention is the foundation of transformer models and one of the most influential ideas in recent neural NLP.

In [16]:
from tensorflow.keras.layers import Activation, Dense, Dropout
from tensorflow.keras.layers import Flatten, RepeatVector, dot, multiply, Permute, Lambda
K = keras.backend

def attention(layer):
  # Attention-style pooling: collapse (batch, timesteps, units) into (batch, units)
  _, _, units = layer.shape.as_list()
  weights = Dense(1, activation='tanh')(layer)   # one score per timestep
  weights = Flatten()(weights)                   # (batch, timesteps)
  weights = Activation('softmax')(weights)       # normalize over timesteps
  weights = RepeatVector(units)(weights)         # (batch, units, timesteps)
  weights = Permute([2, 1])(weights)             # (batch, timesteps, units)
  representation = multiply([layer, weights])
  # Weighted sum over the time axis
  representation = Lambda(lambda x: K.sum(x, axis=-2),
                          output_shape=(units,))(representation)
  return representation
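
The arithmetic inside this block can be followed on a single example without any deep learning library. The sketch below is a plain-Python illustration, not the Keras code itself: `attention_pool`, `w`, and `b` are hypothetical names, with `w` and `b` standing in for the weights that `Dense(1)` would learn.

```python
import math

def attention_pool(sequence, w, b=0.0):
    """Collapse a (timesteps x units) sequence into one vector of length units.

    w and b are hypothetical stand-ins for the learned Dense(1) weights.
    """
    # One tanh score per timestep (the Dense(1, activation='tanh') step).
    scores = [math.tanh(sum(x * wi for x, wi in zip(step, w)) + b)
              for step in sequence]
    # Softmax over timesteps turns the scores into weights that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum over the time axis (the multiply + sum step).
    units = len(sequence[0])
    pooled = [sum(wt * step[u] for wt, step in zip(weights, sequence))
              for u in range(units)]
    return pooled, weights

# Three timesteps with two units each; the values are made up.
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(sequence, w=[0.5, -0.5])
print(weights)  # the first timestep gets the largest weight
print(pooled)   # one value per unit
```

The output has one value per unit regardless of how many timesteps go in, which is what lets the block feed a variable-length sequence into dense layers.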


In [17]:
def get_optimizer(option=0, learning_rate=0.001):
    if option==0:
        return tf.keras.optimizers.Adam(learning_rate)
    elif option==1:
        return tf.keras.optimizers.SGD(learning_rate, 
                                       momentum=0.9, nesterov=True)
    elif option==2:
        return tfa.optimizers.RectifiedAdam(learning_rate)
    elif option==3:
        return tfa.optimizers.Lookahead(
                   tf.optimizers.Adam(learning_rate), sync_period=3)
    elif option==4:
        return tfa.optimizers.SWA(tf.optimizers.Adam(learning_rate))
    elif option==5:
        return tfa.optimizers.SWA(
                   tf.keras.optimizers.SGD(learning_rate, 
                                       momentum=0.9, nesterov=True))
    else:
        return tf.keras.optimizers.Adam(learning_rate)

Having defined these two helper functions, we now come to the most important one: the function that builds a different neural architecture for each set of hyperparameters. We do not pass each tunable value as a separate argument; the function receives only the hp object, through which KerasTuner supplies every hyperparameter we declare. Besides hp, the function takes the vocabulary size and the padded length as fixed arguments (shorter reviews are filled with dummy values up to that length, and longer ones are cut).
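
The padding rule described above can be sketched in a few lines of plain Python. `pad_or_truncate` is a hypothetical helper written for illustration, and it pads and cuts at the end of the sequence. In the notebook itself this job belongs to Keras `pad_sequences`, which by default pads and truncates at the beginning unless `padding='post'` and `truncating='post'` are passed.

```python
def pad_or_truncate(sequence, pad_length, pad_value=0):
    """Return a copy of sequence with exactly pad_length items."""
    # Cut sequences that are too long...
    trimmed = list(sequence[:pad_length])
    # ...and fill short ones with the padding code (0 stands for <PAD>).
    return trimmed + [pad_value] * (pad_length - len(trimmed))

short_seq = pad_or_truncate([1, 14, 22], pad_length=5)
long_seq = pad_or_truncate([1, 14, 22, 16, 43, 530, 973], pad_length=5)
print(short_seq)  # [1, 14, 22, 0, 0]
print(long_seq)   # [1, 14, 22, 16, 43]
```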

In [20]:
layers=keras.layers
models=keras.models
def create_tunable_model(hp, vocab_size=10000,pad_length=256):
  #instantiate model params
  embedding_size=hp.Int('embedding_size',min_value=8,max_value=512,step=8)
  spatial_dropout=hp.Float('spatial_dropout',min_value=0, max_value=0.5,step=0.05)
  conv_layers = hp.Int('conv_layers', min_value=1, max_value=5, step=1)
  rnn_layers = hp.Int('rnn_layers', min_value=1,max_value=5, step=1)
  dense_layers = hp.Int('dense_layers', min_value=1,max_value=3, step=1)
  conv_filters = hp.Int('conv_filters', min_value=32, max_value=512, step=32)
  conv_kernel = hp.Int('conv_kernel', min_value=1, max_value=8, step=1)
  concat_dropout = hp.Float('concat_dropout', min_value=0,max_value=0.5, step=0.05)
  dense_dropout = hp.Float('dense_dropout', min_value=0, max_value=0.5, step=0.05)
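
To get a feel for how large this search space is, the following sketch counts the grid points implied by the ranges above. It assumes that both bounds of each range are inclusive; dropout rates are written in hundredths so the arithmetic stays in integers. `n_values` and `search_space` are hypothetical names used only here.

```python
def n_values(min_value, max_value, step):
    """Count the grid points min_value, min_value + step, ..., max_value."""
    return (max_value - min_value) // step + 1

# (min, max, step) copied from the hp.Int / hp.Float calls above.
# Assumption: both bounds are inclusive. Dropout rates are in hundredths.
search_space = {
    "embedding_size": (8, 512, 8),
    "spatial_dropout": (0, 50, 5),
    "conv_layers": (1, 5, 1),
    "rnn_layers": (1, 5, 1),
    "dense_layers": (1, 3, 1),
    "conv_filters": (32, 512, 32),
    "conv_kernel": (1, 8, 1),
    "concat_dropout": (0, 50, 5),
    "dense_dropout": (0, 50, 5),
}

sizes = {name: n_values(*bounds) for name, bounds in search_space.items()}

total = 1
for size in sizes.values():
    total *= size

print(sizes)
print(f"{total:,} combinations")  # 817,766,400 combinations
```

With hundreds of millions of combinations from these nine hyperparameters alone, an exhaustive grid search is out of reach, which is why a tuner that samples the space is used instead.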

Now we can define the actual model.

In [19]:
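  # Continuation of the body of create_tunable_model (note the indentation):
  # pad_length, vocab_size, and the hp values only exist inside that function.
  # Run at the top level of a cell, these lines raise a NameError.
  inputs=layers.Input(name='inputs',shape=[pad_length])
  layer=layers.Embedding(vocab_size,embedding_size,input_length=pad_length)(inputs)
  layer=layers.SpatialDropout1D(spatial_dropout)(layer)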