<a href="https://colab.research.google.com/github/kilos11/Learn-Keras-for-Deep-Neural-Networks-by-JoJo-Moolayil/blob/main/The_Path_Ahead.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#**CNN**

CNNs are the class of DL algorithms used for computer vision tasks such
as classifying an image or a video and detecting an object within an
image, or even within a region of an image. CNNs were a huge
breakthrough in the field of computer vision because they required far
less hand-crafted image processing than the other prevalent techniques
of the time and still performed exceptionally well; the improvement in
image classification accuracy was substantial. Building a CNN is also
simple in Keras, where all the logical components are neatly
abstracted. Keras provides CNN layers, and the overall process of
developing CNN models is quite similar to what we learned while
developing regression and classification models.
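
To make the two core CNN operations less abstract, here is a minimal NumPy sketch of a convolution followed by max pooling. It is only an illustration, not how Keras implements its layers: the helper names `conv2d_valid` and `max_pool2d`, the toy 6 x 6 image, and the hand-written kernel are all assumptions made for this example. In a real CNN the kernel values are learned during training.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the patch under it and sum the result
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

# Toy 6 x 6 "image": dark left half, bright right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hypothetical kernel that responds where brightness rises from left to right
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

features = np.maximum(conv2d_valid(image, kernel), 0.0)  # ReLU activation
pooled = max_pool2d(features)

print(features[0])  # [0. 3. 3. 0.] -> strongest response at the edge
print(pooled)       # 2 x 2 map, every entry 3.0
```

The feature map lights up only where the kernel's pattern (a dark-to-bright edge) occurs, and pooling halves each spatial dimension while keeping that response.
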
To give a brief understanding of the process, we will walk through a
small example with its implementation. The following code snippet is a
‘hello world’ equivalent for CNNs. We will use the MNIST data (i.e., a
collection of images of handwritten digits). The objective is to
classify each image as one of the digits from [0,1,2,3,4,5,6,7,8,9].
The data is already available in the Keras datasets module. Though the
topic is entirely new, the comments within the code snippet will provide
you with a basic idea of the model design.

In [None]:
!pip install tensorflow



In [None]:
#Importing the necessary packages
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
#Importing the CNN related layers as described in Chapter 2
from tensorflow.keras.layers import Conv2D, MaxPooling2D
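#to_categorical lives in tensorflow.keras.utils; the np_utils alias
#keeps the later calls unchanged
from tensorflow.keras import utils as np_utils
import tensorflow as tf

#Printing the Keras version
print("Keras version:", tf.keras.__version__)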

In [None]:
#Loading data from Keras datasets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
#Defining the height, width, and number of samples
#Each image is a 28 x 28 grayscale matrix (the channel axis is added later)
training_samples, height, width = x_train.shape
testing_samples,_,_ = x_test.shape
print("Training Samples:",training_samples)
print("Testing Samples:",testing_samples)
print("Height: "+str(height)+" x Width:"+ str(width))

In [None]:
#Let's have a look at a sample image in the training data
plt.imshow(x_train[0],cmap='gray', interpolation='none')
#We now have to engineer the image data into the right form.
#For CNN, we need the data in Height x Width x Channels form.
#Since the image is in grayscale, we will use channel = 1
channel = 1
x_train = x_train.reshape(training_samples, height,
width,channel).astype('float32')
x_test = x_test.reshape(testing_samples, height, width,
channel).astype('float32')
#To improve the training process, we need to normalize the
#values. Pixel intensities range from 0 to 255, so dividing
#by 255 scales every value into [0, 1]
x_train = x_train/255
x_test = x_test/255
#Total number of digits = 10 (the numbers 0-9), so ten classes
target_classes = 10
n_classes = target_classes
# convert integer labels into one-hot vectors
y_train = np_utils.to_categorical(y_train, n_classes)
y_test = np_utils.to_categorical(y_test, n_classes)
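
As a rough illustration of what the scaling and one-hot steps above produce, the following NumPy sketch reproduces them by hand. The `one_hot` helper and the sample pixel and label values are hypothetical stand-ins for this example; in the notebook itself the work is done by `to_categorical`.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Return a (len(labels), n_classes) matrix with a single 1 per row."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, n_classes), dtype="float32")
    # Row i gets a 1 in the column given by labels[i]
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical stand-ins for a few MNIST pixel values and digit labels
pixels = np.array([0, 128, 255], dtype="uint8")
labels = [5, 0, 9]

scaled = pixels.astype("float32") / 255   # values now lie in [0, 1]
targets = one_hot(labels, n_classes=10)

print(scaled.min(), scaled.max())  # 0.0 1.0
print(targets[0])                  # a 1 at index 5, zeros elsewhere
```

Each label becomes a row of ten values, which matches the ten softmax outputs of the final layer.
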

In [None]:
#Designing the CNN Model
model = Sequential()
model.add(Conv2D(64, (5, 5), input_shape=(height,width ,1),
activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(n_classes, activation='softmax'))

# Compile model
model.compile(loss='categorical_crossentropy',
optimizer='adam', metrics=['accuracy'])

# Fit the model
model.fit(x_train, y_train, validation_data=(x_test, y_test),
epochs=10, batch_size=200)
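
To see how the image shrinks as it passes through the layers above, the arithmetic can be sketched in plain Python. This assumes the Keras defaults that the model relies on implicitly (stride 1 and no padding for `Conv2D`, non-overlapping windows for `MaxPooling2D`); the helper functions are hypothetical names introduced for this example.

```python
def conv_output(size, kernel):
    """Spatial size after a stride-1 convolution with no padding."""
    return size - kernel + 1

def pool_output(size, pool):
    """Spatial size after non-overlapping pooling."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

size = 28
size = conv_output(size, 5)   # 24
size = pool_output(size, 2)   # 12
size = conv_output(size, 3)   # 10
size = pool_output(size, 2)   # 5
flattened = size * size * 64  # 5 * 5 * 64 = 1600 inputs to the Dense layer

total = (conv_params(5, 1, 64) + conv_params(3, 64, 64)
         + dense_params(flattened, 128) + dense_params(128, 10))
print(flattened, total)  # 1600 244810
```

Most of the trainable parameters sit in the first `Dense` layer, not in the convolutions. If the assumptions hold, `model.summary()` should report the same total.
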

In [None]:
#Finally, let's evaluate the model performance.
#With metrics=['accuracy'], evaluate returns [loss, accuracy]
loss, accuracy = model.evaluate(x_test, y_test, verbose=0)
print("loss = "+str(loss))
print("accuracy = "+str(accuracy))
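
The accuracy reported by `model.evaluate` can be understood as the share of images whose most probable class matches the true label. A small NumPy sketch with made-up probabilities (the arrays below are hypothetical, not real model output) shows the computation:

```python
import numpy as np

# Hypothetical softmax outputs for three images over three classes
probabilities = np.array([[0.1, 0.7, 0.2],
                          [0.8, 0.1, 0.1],
                          [0.3, 0.3, 0.4]])
# One-hot ground truth for the same three images
one_hot_labels = np.array([[0, 1, 0],
                           [1, 0, 0],
                           [0, 1, 0]])

predicted = probabilities.argmax(axis=1)   # most probable class per image
actual = one_hot_labels.argmax(axis=1)     # undo the one-hot encoding
accuracy = np.mean(predicted == actual)

print(predicted)  # [1 0 2]
print(accuracy)   # 2 of the 3 predictions are correct
```

The same `argmax` step applied to `model.predict(x_test)` would turn the ten softmax outputs into a predicted digit per image.
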

#**RNN**
The next step in DL after having explored CNN is to start exploring RNN,
popularly known as “sequence models.” The name became popular
because RNNs make use of sequential information. So far, all the DNNs
that we have explored process training data with the assumption that
there is no relationship between any two training samples. However, this
assumption breaks down for many problems that we can solve using data.
Consider the predictive text feature in your iOS or Android phone; the
prediction of the next word is highly dependent on the last few words
you already typed. That’s where the sequence model comes into the
picture. RNNs can also be understood as neural networks with memory. An
RNN connects a layer to itself and thereby gets access to two or more
consecutive input samples when producing the final output. This property
is unique to RNNs, and as research on them grew, they delivered
remarkable success in the field of natural language understanding.
All the legacy natural language processing techniques could now be
significantly improved with RNNs. The rise of chatbots, improved
autocorrect in text messaging, suggested replies in e-mail clients and
other apps, and machine translation (i.e., translating text from a
source language to a target language, Google Translate being the classic
example) have all been propelled by the adoption of RNNs. There are also
different types of RNN that overcome the limitations of the basic RNN
architecture and take performance on natural language processing–related
tasks a notch higher. The most popular of these are LSTM (long
short-term memory) and GRU (gated recurrent unit) networks.
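
The idea of a layer connected to itself can be sketched in a few lines of NumPy. This is a plain recurrent step, not an LSTM or GRU, and the weights below are hypothetical fixed values chosen for illustration; a real network learns them during training.

```python
import numpy as np

def rnn_forward(inputs, w_x, w_h, b):
    """Run a plain recurrent layer over `inputs` and return the final state."""
    h = np.zeros(w_h.shape[0])
    for x_t in inputs:
        # The new state mixes the current input with the previous state
        h = np.tanh(w_x @ x_t + w_h @ h + b)
    return h

# Hypothetical fixed weights: 1 input feature, 2 hidden units
w_x = np.array([[0.5], [-0.3]])
w_h = np.array([[0.1, 0.2], [0.0, 0.4]])
b = np.zeros(2)

sequence_a = [np.array([1.0]), np.array([0.0]), np.array([0.0])]
sequence_b = sequence_a[::-1]   # same values, reversed order

h_a = rnn_forward(sequence_a, w_x, w_h, b)
h_b = rnn_forward(sequence_b, w_x, w_h, b)
print(h_a, h_b)  # the final states differ, so the order of inputs matters
```

Because the state `h` is fed back at every step, the two sequences end in different states even though they contain the same values. That dependence on order is the "memory" described above.
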
Similar to what we did for CNN, we will have a look at a simple (hello
world equivalent) sample implementation for RNN/LSTM networks.
The following code snippet performs a binary classification on the IMDB
reviews dataset within Keras. It is a use case where we are provided with
user reviews (text data) and an associated outcome as Positive or Negative.

In [None]:
#Import the necessary packages
from tensorflow.keras.datasets import imdb
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Embedding
from tensorflow.keras.preprocessing import sequence

In [None]:
#Setting a max cap for the number of distinct words
top_words = 5000
#Loading the training and test data from keras datasets
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=top_words)

#Since the length of each text will be varying
#We will pad the sequences (i.e. text) to get a uniform length
#throughout
max_text_length = 500
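x_train = sequence.pad_sequences(x_train, maxlen=max_text_length)
#The test reviews need the same padding before validation and evaluation
x_test = sequence.pad_sequences(x_test, maxlen=max_text_length)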
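
To illustrate what padding does to a single review, here is a minimal pure-Python sketch of the default `pad_sequences` behavior: zeros are added at the front, and over-long sequences keep only their last `maxlen` tokens. The `pad_pre` helper is a hypothetical name for this example and assumes `maxlen` is at least 1.

```python
def pad_pre(tokens, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` tokens."""
    tokens = list(tokens)[-maxlen:]
    return [value] * (maxlen - len(tokens)) + tokens

print(pad_pre([11, 22, 33], 5))        # [0, 0, 11, 22, 33]
print(pad_pre([1, 2, 3, 4, 5, 6], 4))  # [3, 4, 5, 6]
```

After this step every review has the same length, so the whole dataset fits in one rectangular array.
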



In [None]:
#Design the network
embedding_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_length,
input_length=max_text_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
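
A quick way to get a feel for the size of this network is to count its trainable parameters by hand. The sketch below uses the standard formulas for embedding, LSTM, and dense layers with bias terms; the helper names are hypothetical, and the figures assume the layer sizes used above (5000 words, 32-dimensional embeddings, 100 LSTM units).

```python
def embedding_params(vocabulary, dimensions):
    """One vector of length `dimensions` per word in the vocabulary."""
    return vocabulary * dimensions

def lstm_params(input_dim, units):
    # Four weight sets (input, forget, and output gates plus the cell
    # candidate), each with input weights, recurrent weights, and a bias
    return 4 * ((input_dim + units) * units + units)

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

total = embedding_params(5000, 32) + lstm_params(32, 100) + dense_params(100, 1)
print(total)  # 213301
```

The embedding table accounts for roughly three quarters of the parameters, which is why capping the vocabulary with `top_words` matters.
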

In [None]:
#Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam',
metrics=['accuracy'])
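
The `binary_crossentropy` loss chosen above rewards confident correct predictions and penalizes confident wrong ones. A minimal pure-Python sketch of the formula follows; the function and the sample values are illustrative assumptions, not the Keras implementation.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean negative log-likelihood of 0/1 labels under predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)   # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right < confident_wrong)  # True
```
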

In [None]:
#Fit the model
model.fit(x_train, y_train, validation_data=(x_test, y_test),
epochs=3, batch_size=64)

In [None]:
#Evaluate the accuracy on the test dataset:
scores = model.evaluate(x_test, y_test, verbose=0)
print("Accuracy:",scores[1])

#**CNN + RNN**

Another interesting area to explore within DL is the intersection of CNN
and RNN. Sounds confusing? Just imagine you could combine the power
of CNN (i.e., understanding images) and that of RNN (i.e., understanding
natural text); what could the combination look like? You could describe
a picture with words. That’s right, by combining RNN and CNN together,
we could help computers describe an image with natural-style text. The
process is called image captioning. Today, if you search google.com for
a query like “yellow cars,” the results will actually show a ton of
yellow cars. If you imagine that the captions for these images were
written by humans and then indexed by search engines, you would be
mistaken. With humans, we can’t scale the process of captioning to
billions of images per day; it is simply not viable. You need a smarter
way to do that. Image captioning with CNN+RNN has brought a breakthrough
not only in image search for search engines but also in several other
products we use in our day-to-day lives. One of the most striking
outcomes of the intersection of RNN and CNN was smart glasses (called
DuLight by Baidu): a camera attached to reading glasses that could
describe what the surroundings looked like. This was a great product for
visually impaired people. Today, we have a smaller version of that
implemented in a few apps that can be installed on a phone and work with
the phone camera.