<a href="https://colab.research.google.com/github/Baris000-eng/NLP_Coding_Works_and_Projects/blob/main/Deep_Learning_For_NLP.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [21]:
# Deep Learning For NLP

# The Basics of Perceptron Model

# Artificial Neural Networks (ANNs) have a basis in biology.

# The biological neurons have the following structure: There are several dendrites that feed into the body of the neuron cells.
# An electrical signal is passed through the dendrites. This signal goes to the cell body and is outputted through an axon.
# Then, the connection with the other neuron is performed.

# An artificial neuron also has inputs and outputs. Input indexing starts at 0 (i.e. Input 0 is the first input,
# Input 1 is the second input, and so on). The inputs of an artificial neuron are the values of the features. Each input is
# multiplied by its own weight, and the weights are typically initialized randomly. The weighted inputs are then summed
# together with a bias term, and the resulting value (Z) is passed to an activation function.

# There are many activation functions to choose from.

# An activation function example (a step function): If Z, the weighted sum of the inputs plus the bias, is positive, then return 1.
# If Z is negative, then return 0. The bias term also handles the case where every input is 0: without it, Z would always be 0
# in that case, regardless of the weights. This activation function is pretty dramatic, since small changes are not reflected.
# For instance, if Z changes from 0.6 to 0.7, the function still outputs 1, because Z is still positive. Likewise, for a dramatic
# change on the negative side (i.e. from -1 to -1000), the function outputs 0 for both the initial and final values.

# The mathematical representation of the perceptron model:
# Z = sum_{i=0}^{n-1} (w_i * x_i) + b

# In the above equation, n is the number of inputs, w_i is the weight specific to input i, x_i is the input itself, and b is the
# bias term. Since the indexing starts at 0, the sum runs from i = 0 to i = n - 1.

# The simplest version of this perceptron model has 2 inputs and 1 output; in general, a perceptron can have any number of inputs.
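
# The perceptron described above can be sketched in a few lines of NumPy. This is only a minimal illustration:
# the input, weight, and bias values are made-up, and treating Z = 0 as "not positive" is an assumption.

```python
import numpy as np

def step_activation(z):
    """Return 1 if z is positive, otherwise 0 (z == 0 is treated as not positive)."""
    return 1 if z > 0 else 0

def perceptron(x, w, b):
    """Compute Z = sum(w_i * x_i) + b, then apply the step activation."""
    z = float(np.dot(w, x) + b)
    return z, step_activation(z)

# Hypothetical values for a perceptron with 2 inputs and 1 output.
x = np.array([1.0, 2.0])    # inputs (feature values)
w = np.array([0.5, -0.25])  # weights (normally initialized randomly)
b = 0.1                     # bias term

z, output = perceptron(x, w, b)
print(z, output)

# With all-zero inputs, Z equals the bias, so the bias alone decides the output.
z_zero, output_zero = perceptron(np.zeros(2), w, b)
print(z_zero, output_zero)
```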



In [2]:
# Introduction to Neural Networks

# Multiple Perceptrons Network: It includes various layers of single perceptrons connected to each other through their inputs and outputs.
# Layer Types:
# 1-) Input Layer
  # * The input layer holds the real values from the data.
# 2-) Hidden Layer
  # * Hidden layers are the layers between the input and output layers.
  # * A network with 3 or more hidden layers is usually considered a 'deep network'.
# 3-) Output Layer
  # * The output layer gives the final estimate of the output.

# As we go through more layers, the abstraction level increases.

# Abstraction in the multilayer perceptron context refers to the level of detail or complexity in the features that the neural network
# is learning to represent. In deep learning, the first layers of a neural network tend to capture low-level or basic features, while
# deeper layers capture more abstract and high-level features.
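
# A forward pass through such a network can be sketched as below. The layer sizes (4 -> 12 -> 12 -> 3), the random
# weights, and the choice of tanh for the hidden layers are all assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

def dense(x, w, b, activation=None):
    """One fully connected layer: activation(x @ w + b)."""
    z = x @ w + b
    return z if activation is None else activation(z)

# Hypothetical layer sizes: input layer, two hidden layers, output layer.
sizes = [4, 12, 12, 3]
weights = [rng.normal(size=(n_in, n_out)) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

x = rng.random((5, 4))                          # input layer: 5 samples, 4 feature values each
h1 = dense(x, weights[0], biases[0], np.tanh)   # hidden layer 1
h2 = dense(h1, weights[1], biases[1], np.tanh)  # hidden layer 2
out = dense(h2, weights[2], biases[2])          # output layer (raw scores)

print(x.shape, h1.shape, h2.shape, out.shape)
```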



In [3]:
# We might need a more fine-grained activation function that responds to small changes in the input, unlike the simple step function.
# This is where the sigmoid function comes into play.

# Sigmoid function ===> f(x) = 1 / (1 + e^(-x))

# Sigmoid as the activation function ===> f(Z) = 1 / (1 + e^(-Z)), where Z = w*x+b
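
# A small sketch of the sigmoid in NumPy, using made-up Z values. Unlike the step function, a change in Z from 0.6 to 0.7
# produces a slightly different output.

```python
import numpy as np

def sigmoid(z):
    """f(z) = 1 / (1 + e^(-z))"""
    return 1.0 / (1.0 + np.exp(-z))

z_values = np.array([-1.0, 0.0, 0.6, 0.7])  # hypothetical values of Z = w*x + b
activations = sigmoid(z_values)
print(activations)
```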


# Hyperbolic Tangent Activation Function (tanh or tanh(z))
# cosh(x) = (e^x + e^(-x)) / 2
# sinh(x) = (e^x - e^(-x)) / 2
# tanh(x) = sinh(x) / cosh(x)

# The hyperbolic tangent function takes values between -1 and 1, while the sigmoid function takes values between 0 and 1.
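
# The definitions above can be checked numerically. This sketch builds tanh from sinh and cosh and compares it with
# NumPy's built-in np.tanh; the sample points are arbitrary.

```python
import numpy as np

def tanh_from_definition(x):
    """tanh(x) = sinh(x) / cosh(x), written out with exponentials."""
    sinh = (np.exp(x) - np.exp(-x)) / 2
    cosh = (np.exp(x) + np.exp(-x)) / 2
    return sinh / cosh

x_points = np.linspace(-3, 3, 7)  # arbitrary sample points
manual_tanh = tanh_from_definition(x_points)

print(np.allclose(manual_tanh, np.tanh(x_points)))  # the two versions agree
print(manual_tanh.min(), manual_tanh.max())         # values stay between -1 and 1
```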

#--------------------------------------------------------------------------------------------------------------------#
# Rectified Linear Unit (ReLU): This is a relatively simple activation function which has a formula of max(0,z).
# Please note that the 'z' in this formula is equal to w*x+b.

# ReLU is a common default choice for hidden layers and tends to perform well in many situations.
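
# A minimal sketch of ReLU applied to z = w*x+b. The values of w, b, and x are made-up for illustration.

```python
import numpy as np

def relu(z):
    """max(0, z), applied element-wise."""
    return np.maximum(0, z)

w, b = 2.0, -1.0                          # hypothetical weight and bias
x_in = np.array([-1.0, 0.0, 0.5, 2.0])    # hypothetical inputs
z_lin = w * x_in + b                      # [-3, -1, 0, 3]
relu_out = relu(z_lin)
print(relu_out)                           # negative values are clipped to 0
```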

# The softmax function is usually used as the activation of the final (output) layer in order to get a classification output:
# it turns the raw outputs into class probabilities that sum to 1.
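
# A minimal softmax sketch for a single sample. The raw scores are made-up, and subtracting the maximum before
# exponentiating is a common numerical-stability trick rather than part of the definition.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # does not change the result, avoids overflow
    exps = np.exp(shifted)
    return exps / exps.sum()

raw_scores = np.array([2.0, 1.0, 0.1])  # hypothetical raw outputs for 3 classes
class_probs = softmax(raw_scores)
print(class_probs, class_probs.sum(), np.argmax(class_probs))
```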

# Keras Basics

# The Iris dataset contains measurements of flower petals and sepals; each sample is labeled with one of three classes (3 flower species).

In [5]:
# Performing necessary imports
import numpy as np
from sklearn.datasets import load_iris


# Loading the iris dataset
iris = load_iris()
print(iris)
print()
print()
print()
print(type(iris))

{'data': array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2],
       [5.4, 3.9, 1.7, 0.4],
       [4.6, 3.4, 1.4, 0.3],
       [5. , 3.4, 1.5, 0.2],
       [4.4, 2.9, 1.4, 0.2],
       [4.9, 3.1, 1.5, 0.1],
       [5.4, 3.7, 1.5, 0.2],
       [4.8, 3.4, 1.6, 0.2],
       [4.8, 3. , 1.4, 0.1],
       [4.3, 3. , 1.1, 0.1],
       [5.8, 4. , 1.2, 0.2],
       [5.7, 4.4, 1.5, 0.4],
       [5.4, 3.9, 1.3, 0.4],
       [5.1, 3.5, 1.4, 0.3],
       [5.7, 3.8, 1.7, 0.3],
       [5.1, 3.8, 1.5, 0.3],
       [5.4, 3.4, 1.7, 0.2],
       [5.1, 3.7, 1.5, 0.4],
       [4.6, 3.6, 1. , 0.2],
       [5.1, 3.3, 1.7, 0.5],
       [4.8, 3.4, 1.9, 0.2],
       [5. , 3. , 1.6, 0.2],
       [5. , 3.4, 1.6, 0.4],
       [5.2, 3.5, 1.5, 0.2],
       [5.2, 3.4, 1.4, 0.2],
       [4.7, 3.2, 1.6, 0.2],
       [4.8, 3.1, 1.6, 0.2],
       [5.4, 3.4, 1.5, 0.4],
       [5.2, 4.1, 1.5, 0.1],
       [5.5, 4.2, 1.4, 0.2],
     

In [6]:
# Print the description of the iris dataset.
print(iris.DESCR)

.. _iris_dataset:

Iris plants dataset
--------------------

**Data Set Characteristics:**

    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
                
    :Summary Statistics:

                    Min  Max   Mean    SD   Class Correlation
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20   0.76    0.9565  (high!)

    :Missing Attribute Values: None
    :Class Distribution: 33.3% for each of 3 classes.
    :Creator: R.A. Fisher
    :Donor: Michael Marshall (MARSHALL%PLU@io.arc.nasa.gov)
    :

In [7]:
X_features = iris.data
print(X_features)

[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]
 [5.4 3.9 1.7 0.4]
 [4.6 3.4 1.4 0.3]
 [5.  3.4 1.5 0.2]
 [4.4 2.9 1.4 0.2]
 [4.9 3.1 1.5 0.1]
 [5.4 3.7 1.5 0.2]
 [4.8 3.4 1.6 0.2]
 [4.8 3.  1.4 0.1]
 [4.3 3.  1.1 0.1]
 [5.8 4.  1.2 0.2]
 [5.7 4.4 1.5 0.4]
 [5.4 3.9 1.3 0.4]
 [5.1 3.5 1.4 0.3]
 [5.7 3.8 1.7 0.3]
 [5.1 3.8 1.5 0.3]
 [5.4 3.4 1.7 0.2]
 [5.1 3.7 1.5 0.4]
 [4.6 3.6 1.  0.2]
 [5.1 3.3 1.7 0.5]
 [4.8 3.4 1.9 0.2]
 [5.  3.  1.6 0.2]
 [5.  3.4 1.6 0.4]
 [5.2 3.5 1.5 0.2]
 [5.2 3.4 1.4 0.2]
 [4.7 3.2 1.6 0.2]
 [4.8 3.1 1.6 0.2]
 [5.4 3.4 1.5 0.4]
 [5.2 4.1 1.5 0.1]
 [5.5 4.2 1.4 0.2]
 [4.9 3.1 1.5 0.2]
 [5.  3.2 1.2 0.2]
 [5.5 3.5 1.3 0.2]
 [4.9 3.6 1.4 0.1]
 [4.4 3.  1.3 0.2]
 [5.1 3.4 1.5 0.2]
 [5.  3.5 1.3 0.3]
 [4.5 2.3 1.3 0.3]
 [4.4 3.2 1.3 0.2]
 [5.  3.5 1.6 0.6]
 [5.1 3.8 1.9 0.4]
 [4.8 3.  1.4 0.3]
 [5.1 3.8 1.6 0.2]
 [4.6 3.2 1.4 0.2]
 [5.3 3.7 1.5 0.2]
 [5.  3.3 1.4 0.2]
 [7.  3.2 4.7 1.4]
 [6.4 3.2 4.5 1.5]
 [6.9 3.1 4.

In [9]:
y_labels = iris.target
print(y_labels)
print()
print(type(y_labels))

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]

<class 'numpy.ndarray'>


In [10]:
from keras.utils import to_categorical
print(y_labels.shape)
y_labels = to_categorical(y_labels)
print(y_labels.shape)

(150,)
(150, 3)
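
# The shape change from (150,) to (150, 3) is one-hot encoding. A rough NumPy equivalent is sketched below on a few
# made-up labels; it is not the Keras implementation, only an illustration of the idea.

```python
import numpy as np

sample_labels = np.array([0, 1, 2, 1, 0])  # hypothetical integer class labels
num_classes = 3

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes)[sample_labels]
print(one_hot.shape)
print(one_hot)

# argmax along each row recovers the original integer labels.
recovered = one_hot.argmax(axis=1)
print(recovered)
```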


In [12]:
# Divide the dataset into training and test sets (33% of the samples are held out for testing).

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X_features, y_labels, random_state = 42, test_size = 0.33)
print()
print('------------------------------------------------------------------------------------------------------')
print('The X_train dataset is as below: ')
print()
print(X_train)
print()
print('The X_test dataset is as below: ')
print()
print(X_test)
print()
print('The y_train dataset is as below: ')
print()
print(y_train)
print()
print('The y_test dataset is as below: ')
print(y_test)
print()
print('------------------------------------------------------------------------------------------------------')



------------------------------------------------------------------------------------------------------
The X_train dataset is as below: 

[[5.7 2.9 4.2 1.3]
 [7.6 3.  6.6 2.1]
 [5.6 3.  4.5 1.5]
 [5.1 3.5 1.4 0.2]
 [7.7 2.8 6.7 2. ]
 [5.8 2.7 4.1 1. ]
 [5.2 3.4 1.4 0.2]
 [5.  3.5 1.3 0.3]
 [5.1 3.8 1.9 0.4]
 [5.  2.  3.5 1. ]
 [6.3 2.7 4.9 1.8]
 [4.8 3.4 1.9 0.2]
 [5.  3.  1.6 0.2]
 [5.1 3.3 1.7 0.5]
 [5.6 2.7 4.2 1.3]
 [5.1 3.4 1.5 0.2]
 [5.7 3.  4.2 1.2]
 [7.7 3.8 6.7 2.2]
 [4.6 3.2 1.4 0.2]
 [6.2 2.9 4.3 1.3]
 [5.7 2.5 5.  2. ]
 [5.5 4.2 1.4 0.2]
 [6.  3.  4.8 1.8]
 [5.8 2.7 5.1 1.9]
 [6.  2.2 4.  1. ]
 [5.4 3.  4.5 1.5]
 [6.2 3.4 5.4 2.3]
 [5.5 2.3 4.  1.3]
 [5.4 3.9 1.7 0.4]
 [5.  2.3 3.3 1. ]
 [6.4 2.7 5.3 1.9]
 [5.  3.3 1.4 0.2]
 [5.  3.2 1.2 0.2]
 [5.5 2.4 3.8 1.1]
 [6.7 3.  5.  1.7]
 [4.9 3.1 1.5 0.2]
 [5.8 2.8 5.1 2.4]
 [5.  3.4 1.5 0.2]
 [5.  3.5 1.6 0.6]
 [5.9 3.2 4.8 1.8]
 [5.1 2.5 3.  1.1]
 [6.9 3.2 5.7 2.3]
 [6.  2.7 5.1 1.6]
 [6.1 2.6 5.6 1.4]
 [7.7 3.  6.1 2.3]
 [5.5 
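
# The idea behind a shuffled train-test split can be sketched with NumPy alone. This is not scikit-learn's implementation
# and will not reproduce the same rows; simple_train_test_split is a hypothetical helper, the data is a placeholder, and
# rounding the test size up is an assumption (it gives 50 test rows out of 150, matching the evaluation further below).

```python
import math
import numpy as np

def simple_train_test_split(X, y, test_size, seed):
    """Shuffle the row indices, then cut them into a test part and a train part."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    n_test = math.ceil(test_size * len(X))  # assumption: round the test size up
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Placeholder data with the same shape as the iris features and labels.
X_demo = np.arange(150 * 4).reshape(150, 4)
y_demo = np.repeat([0, 1, 2], 50)

X_tr, X_te, y_tr, y_te = simple_train_test_split(X_demo, y_demo, test_size=0.33, seed=42)
print(X_tr.shape, X_te.shape, y_tr.shape, y_te.shape)
```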

In [13]:
# MinMaxScaler rescales each feature so that its values fit into the range between 0 and 1.
from sklearn.preprocessing import MinMaxScaler


lst = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50]
lst = np.array(lst)

# Dividing by the maximum value (50) keeps each element of lst in the range from 0 to 1.
lst = lst / 50
print(lst)



[0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
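
# Dividing by the maximum is not quite the same as min-max scaling: the smallest value above ends up at 0.1, not 0.
# The sketch below contrasts the two on the same numbers, using the usual min-max formula (x - min) / (max - min).

```python
import numpy as np

values = np.array([5, 10, 15, 20, 25, 30, 35, 40, 45, 50])

divided_by_max = values / values.max()
min_max_scaled = (values - values.min()) / (values.max() - values.min())

print(divided_by_max)   # smallest element stays above 0
print(min_max_scaled)   # smallest element maps to 0, largest to 1
```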


In [14]:
scalar_object = MinMaxScaler()
scalar_object.fit(X_train) # Fit the scaler on the training data only, so that the test data does not influence the scaling.
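
# Roughly, fitting stores the per-column minimum and maximum of the training data, and transforming applies them to any
# data set. The sketch below mimics this with NumPy on made-up numbers (it is not the scikit-learn code); it also shows
# that test values can land outside the 0-1 range when they exceed the training maximum.

```python
import numpy as np

# Hypothetical training and test data with 2 features.
train_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
test_demo = np.array([[2.5, 35.0]])

# "fit": learn the column-wise minimum and maximum from the training data only.
col_min = train_demo.min(axis=0)
col_max = train_demo.max(axis=0)

def min_max_transform(X):
    """'transform': rescale with the statistics learned from the training data."""
    return (X - col_min) / (col_max - col_min)

scaled_train_demo = min_max_transform(train_demo)
scaled_test_demo = min_max_transform(test_demo)
print(scaled_train_demo)
print(scaled_test_demo)  # the second value is above 1, since 35 > training maximum 30
```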

In [18]:
scaled_X_train = scalar_object.transform(X_train)
scaled_X_test = scalar_object.transform(X_test)

print('The scaled X_train data is as below: ')
print()
print(scaled_X_train)
print()
print()
print()
print('The scaled X_test data is as below: ')
print()
print(scaled_X_test)
print()
print()
print('Data types of the scaled X_train and X_test data are as below: ')
print()
print(type(scaled_X_train))
print(type(scaled_X_test))

# Scaling the features to roughly the 0-1 range helps neural networks, in the sense that biases and weights do not grow too large.
# Since the scaler was fitted on X_train only, a few scaled X_test values may fall slightly outside this range.

The scaled X_train data is as below: 

[[0.41176471 0.40909091 0.55357143 0.5       ]
 [0.97058824 0.45454545 0.98214286 0.83333333]
 [0.38235294 0.45454545 0.60714286 0.58333333]
 [0.23529412 0.68181818 0.05357143 0.04166667]
 [1.         0.36363636 1.         0.79166667]
 [0.44117647 0.31818182 0.53571429 0.375     ]
 [0.26470588 0.63636364 0.05357143 0.04166667]
 [0.20588235 0.68181818 0.03571429 0.08333333]
 [0.23529412 0.81818182 0.14285714 0.125     ]
 [0.20588235 0.         0.42857143 0.375     ]
 [0.58823529 0.31818182 0.67857143 0.70833333]
 [0.14705882 0.63636364 0.14285714 0.04166667]
 [0.20588235 0.45454545 0.08928571 0.04166667]
 [0.23529412 0.59090909 0.10714286 0.16666667]
 [0.38235294 0.31818182 0.55357143 0.5       ]
 [0.23529412 0.63636364 0.07142857 0.04166667]
 [0.41176471 0.45454545 0.55357143 0.45833333]
 [1.         0.81818182 1.         0.875     ]
 [0.08823529 0.54545455 0.05357143 0.04166667]
 [0.55882353 0.40909091 0.57142857 0.5       ]
 [0.41176471 0.227272

In [19]:
from keras.models import Sequential
from keras.layers import Dense

In [26]:
sequential_model = Sequential()

# First hidden layer: 12 neurons, 4-dimensional input, rectified linear unit is the activation function
sequential_model.add(Dense(12, input_dim = 4, activation = 'relu'))

# Second hidden layer: 12 neurons, rectified linear unit is the activation function.
# Its input size (12) is inferred from the previous layer, so input_dim is not needed here.
sequential_model.add(Dense(12, activation = 'relu'))

# Output layer: each neuron holds the probability of belonging to a particular class, which is why we one-hot encoded the labels.
sequential_model.add(Dense(3, activation = 'softmax'))

print(sequential_model)
print(type(sequential_model))

sequential_model.compile(loss='categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])


<keras.src.engine.sequential.Sequential object at 0x7d639acc0490>
<class 'keras.src.engine.sequential.Sequential'>
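
# The 'categorical_crossentropy' loss compares one-hot labels with predicted probabilities. The NumPy sketch below
# illustrates the formula on made-up predictions (the eps clipping is an assumption to avoid log(0), not the Keras
# source). Uniform predictions over 3 classes give ln(3), about 1.0986, which is close to the loss in the first epochs.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true_demo = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # one-hot labels

uniform_pred = np.full((3, 3), 1 / 3)     # an undecided model
confident_pred = np.array([[0.98, 0.01, 0.01],
                           [0.01, 0.98, 0.01],
                           [0.01, 0.01, 0.98]])  # a confident, correct model

loss_uniform = categorical_crossentropy(y_true_demo, uniform_pred)
loss_confident = categorical_crossentropy(y_true_demo, confident_pred)
print(loss_uniform, loss_confident)
```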


In [28]:
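# Note: summary() prints the table itself and returns None,
# which is why 'None' appears after the table in the output below.
model_summary = sequential_model.summary()
print(model_summary)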

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_15 (Dense)            (None, 12)                60        
                                                                 
 dense_16 (Dense)            (None, 12)                156       
                                                                 
 dense_17 (Dense)            (None, 3)                 39        
                                                                 
Total params: 255 (1020.00 Byte)
Trainable params: 255 (1020.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
None
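
# The 'Param #' column above can be reproduced by hand: a Dense layer has one weight per (input, neuron) pair plus one
# bias per neuron. dense_params is a hypothetical helper, and the 4 bytes per parameter is an assumption (32-bit floats)
# that matches the 1020 bytes reported.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus biases (n_units)."""
    return n_inputs * n_units + n_units

# (inputs, neurons) for the three Dense layers: 4 -> 12 -> 12 -> 3
layer_shapes = [(4, 12), (12, 12), (12, 3)]
param_counts = [dense_params(n_in, n_units) for n_in, n_units in layer_shapes]
total_params = sum(param_counts)

print(param_counts)      # [60, 156, 39]
print(total_params)      # 255
print(total_params * 4)  # 1020 bytes, assuming 4 bytes per parameter
```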


In [29]:
sequential_model.fit(scaled_X_train, y_train, epochs = 300, verbose = 2)

Epoch 1/300
4/4 - 1s - loss: 1.1217 - accuracy: 0.2200 - 1s/epoch - 254ms/step
Epoch 2/300
4/4 - 0s - loss: 1.1143 - accuracy: 0.2300 - 18ms/epoch - 5ms/step
Epoch 3/300
4/4 - 0s - loss: 1.1068 - accuracy: 0.2300 - 19ms/epoch - 5ms/step
Epoch 4/300
4/4 - 0s - loss: 1.1000 - accuracy: 0.2200 - 18ms/epoch - 4ms/step
Epoch 5/300
4/4 - 0s - loss: 1.0933 - accuracy: 0.2400 - 14ms/epoch - 4ms/step
Epoch 6/300
4/4 - 0s - loss: 1.0871 - accuracy: 0.3000 - 14ms/epoch - 4ms/step
Epoch 7/300
4/4 - 0s - loss: 1.0809 - accuracy: 0.2900 - 14ms/epoch - 3ms/step
Epoch 8/300
4/4 - 0s - loss: 1.0751 - accuracy: 0.2600 - 13ms/epoch - 3ms/step
Epoch 9/300
4/4 - 0s - loss: 1.0690 - accuracy: 0.2700 - 14ms/epoch - 4ms/step
Epoch 10/300
4/4 - 0s - loss: 1.0626 - accuracy: 0.3000 - 16ms/epoch - 4ms/step
Epoch 11/300
4/4 - 0s - loss: 1.0563 - accuracy: 0.3300 - 16ms/epoch - 4ms/step
Epoch 12/300
4/4 - 0s - loss: 1.0504 - accuracy: 0.3400 - 15ms/epoch - 4ms/step
Epoch 13/300
4/4 - 0s - loss: 1.0446 - accuracy: 

<keras.src.callbacks.History at 0x7d639819fee0>

In [33]:
predict_x = sequential_model.predict(scaled_X_test)
print(predict_x)

[[2.21595448e-03 9.10821140e-01 8.69629309e-02]
 [9.98083949e-01 1.91604684e-03 3.27572529e-08]
 [2.15246033e-07 5.22168027e-03 9.94778156e-01]
 [1.65880576e-03 8.14364374e-01 1.83976769e-01]
 [1.25785579e-03 7.92195201e-01 2.06546888e-01]
 [9.95942116e-01 4.05781856e-03 1.74157719e-07]
 [7.72106787e-03 9.79186177e-01 1.30926529e-02]
 [1.45989898e-05 4.55390699e-02 9.54446375e-01]
 [7.36773538e-04 4.03388470e-01 5.95874727e-01]
 [5.99228358e-03 9.69845951e-01 2.41617002e-02]
 [1.03231745e-04 2.03765199e-01 7.96131551e-01]
 [9.95638549e-01 4.36117779e-03 3.34005392e-07]
 [9.98422623e-01 1.57723494e-03 3.59373651e-08]
 [9.96161163e-01 3.83850723e-03 2.37517142e-07]
 [9.97687817e-01 2.31205719e-03 7.15739290e-08]
 [1.24881766e-03 8.51767123e-01 1.46984056e-01]
 [2.99871067e-06 1.70870833e-02 9.82909918e-01]
 [6.73606573e-03 9.66370821e-01 2.68930607e-02]
 [2.40819063e-03 8.91937375e-01 1.05654448e-01]
 [3.96938549e-06 1.83103364e-02 9.81685579e-01]
 [9.94184077e-01 5.81550598e-03 4.779759

In [40]:
predicted_classes = np.argmax(predict_x,axis=1)
print('Class assignment: ' + str(predicted_classes))

Class assignment: [1 0 2 1 1 0 1 2 2 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0 0 0 1 0 0 2 1
 0 0 0 2 1 1 0 0 1 1 2 1 2]
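
# To see what argmax does, the sketch below applies it to the first three probability rows printed by predict() above.
# Each row's largest probability determines the class, which matches the first three class assignments.

```python
import numpy as np

# First three rows of the prediction output shown earlier.
first_rows = np.array([
    [2.21595448e-03, 9.10821140e-01, 8.69629309e-02],
    [9.98083949e-01, 1.91604684e-03, 3.27572529e-08],
    [2.15246033e-07, 5.22168027e-03, 9.94778156e-01],
])

# axis=1 picks the index of the largest probability within each row.
first_classes = np.argmax(first_rows, axis=1)
print(first_classes)  # [1 0 2]
```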


In [39]:
actual_classes = y_test.argmax(axis=1)
print('Actual classes: ' + str(actual_classes))

Actual classes: [1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0 0 0 1 0 0 2 1
 0 0 0 2 1 1 0 0 1 2 2 1 2]


In [47]:
# Model performance evaluation
from sklearn.metrics import classification_report, accuracy_score, confusion_matrix

In [48]:
# confusion matrix (stored under a new name so that the imported confusion_matrix function is not overwritten)
conf_matrix = confusion_matrix(actual_classes, predicted_classes)
print(conf_matrix)

[[19  0  0]
 [ 0 14  1]
 [ 0  1 15]]
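
# In the matrix above, rows are the actual classes and columns are the predicted classes. The sketch below builds such a
# matrix by hand with NumPy for the first 10 test samples only (taken from the printed class arrays above), so its counts
# are smaller than the full matrix.

```python
import numpy as np

# First 10 entries of the actual and predicted classes printed earlier.
actual_first10 = np.array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1])
predicted_first10 = np.array([1, 0, 2, 1, 1, 0, 1, 2, 2, 1])

n_classes = 3
small_matrix = np.zeros((n_classes, n_classes), dtype=int)

# Add 1 to cell (actual, predicted) for every sample.
np.add.at(small_matrix, (actual_first10, predicted_first10), 1)
print(small_matrix)

# Off-diagonal cells are mistakes: here one class-1 sample was predicted as class 2.
mismatches = np.flatnonzero(actual_first10 != predicted_first10)
print(mismatches)
```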


In [49]:
# accuracy score (stored under a new name so that the imported accuracy_score function is not overwritten)
accuracy = accuracy_score(actual_classes, predicted_classes)
print('The overall accuracy score is: ' + str(accuracy))

The overall accuracy score is: 0.96


In [50]:
# classification report (stored under a new name so that the imported classification_report function is not overwritten)
report = classification_report(actual_classes, predicted_classes)
print(report)

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       0.93      0.93      0.93        15
           2       0.94      0.94      0.94        16

    accuracy                           0.96        50
   macro avg       0.96      0.96      0.96        50
weighted avg       0.96      0.96      0.96        50
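
# The numbers in the report can be derived from the confusion matrix printed earlier. The sketch below recomputes
# accuracy, precision, recall, and F1 with NumPy; it is a hand calculation for illustration, not the scikit-learn code.

```python
import numpy as np

# Confusion matrix from above: rows are actual classes, columns are predicted classes.
cm = np.array([[19, 0, 0],
               [0, 14, 1],
               [0, 1, 15]])

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)  # correct / all predicted as that class
recall = true_positives / cm.sum(axis=1)     # correct / all actually in that class
f1 = 2 * precision * recall / (precision + recall)
overall_accuracy = true_positives.sum() / cm.sum()

print(np.round(precision, 2))
print(np.round(recall, 2))
print(np.round(f1, 2))
print(overall_accuracy)
```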



In [51]:
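# saving the deep learning model
# Note: HDF5 ('.h5') is treated as a legacy format by recent Keras versions, which recommend the native '.keras' format.
sequential_model.save('myfirstDLmodel.h5')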

  saving_api.save_model(


In [52]:
# loading the saved deep learning model
from keras.models import load_model

new_model = load_model('myfirstDLmodel.h5')
print(new_model)
print(type(new_model))

<keras.src.engine.sequential.Sequential object at 0x7d63880aa2c0>
<class 'keras.src.engine.sequential.Sequential'>


In [56]:
predicted_y_values = new_model.predict(scaled_X_test)
print('Predicted y values are: \n' + str(predicted_y_values))
new_predicted_classes = np.argmax(predicted_y_values,axis=1)
print()
print()
print('Class assignment: \n' + str(new_predicted_classes))

Predicted y values are: 
[[2.21595448e-03 9.10821140e-01 8.69629309e-02]
 [9.98083949e-01 1.91604684e-03 3.27572529e-08]
 [2.15246033e-07 5.22168027e-03 9.94778156e-01]
 [1.65880576e-03 8.14364374e-01 1.83976769e-01]
 [1.25785579e-03 7.92195201e-01 2.06546888e-01]
 [9.95942116e-01 4.05781856e-03 1.74157719e-07]
 [7.72106787e-03 9.79186177e-01 1.30926529e-02]
 [1.45989898e-05 4.55390699e-02 9.54446375e-01]
 [7.36773538e-04 4.03388470e-01 5.95874727e-01]
 [5.99228358e-03 9.69845951e-01 2.41617002e-02]
 [1.03231745e-04 2.03765199e-01 7.96131551e-01]
 [9.95638549e-01 4.36117779e-03 3.34005392e-07]
 [9.98422623e-01 1.57723494e-03 3.59373651e-08]
 [9.96161163e-01 3.83850723e-03 2.37517142e-07]
 [9.97687817e-01 2.31205719e-03 7.15739290e-08]
 [1.24881766e-03 8.51767123e-01 1.46984056e-01]
 [2.99871067e-06 1.70870833e-02 9.82909918e-01]
 [6.73606573e-03 9.66370821e-01 2.68930607e-02]
 [2.40819063e-03 8.91937375e-01 1.05654448e-01]
 [3.96938549e-06 1.83103364e-02 9.81685579e-01]
 [9.94184077e-0