<a href="https://colab.research.google.com/github/esragcetnky/Edureka-LSTM/blob/main/Edureka_%7C_LSTM_Explained.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

I created this notebook to learn and practice LSTM.

I studied this topic from the Edureka video "LSTM Explained | What Is LSTM | Deep Learning Training | Edureka".

Link : https://www.youtube.com/watch?v=zD13uQIgac8

# 1.What is NLP?

* NLP (natural language processing) is concerned with the interactions between computers and human language
* it covers how to program computers to process and analyze large amounts of natural language data

# 2.Ways to Process Text Data

* 1.machine learning
* 2.deep learning
  * LSTM
  * Neural Network
  * RNN
  * Transfer network


# 3.Recurrent Neural Network

* designed to recognize the sequential characteristics of data and use those patterns to predict the next likely scenario
* sequential data : data in which the order of the elements matters, such as the words in a sentence
* states
* X : inputs 
      X=[URI, IS, A, REALLY, GOOD, MOVIE]
* y : training values
* y̅ : predicted values
* a : state (activation) at each step, which summarizes the inputs seen so far
      a[0]= URI
      a[1]= URI IS
      .
      .
      .
      a[N]= URI IS A REALLY GOOD MOVIE
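* u : input weights, the same at every time step
* w : recurrent weights, applied to the previous state
* v : output weights, applied to the current state
* activation function : tanh
      a<t>  = tanh ( (X<t> * u) + (a<t-1> * w) )
      y̅<t> = sigmoid ( ( a<t> * v ) + bias ) or softmax ( ( a<t> * v ) + bias )
      loss = np.sum((y̅ - y)*theta)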
* embedding layer
      embedding matrix
      which is like the filters or kernels in a CNN
      it helps to reduce size
![](https://drive.google.com/uc?export=view&id=17FfSgboFdXUFz4H65NW3ixRm8kVQ4Iw6)
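
The update rules above can be sketched in NumPy as follows. This is a minimal illustration, not the code from the video: the sizes, the random weights, and the helper name `rnn_forward` are assumptions made for the example, and the inputs are random vectors standing in for embedded words.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_forward(X, u, w, v, b):
    """Run a simple RNN over X (shape: steps x input_dim); return all states and the final prediction."""
    a = np.zeros(w.shape[0])           # initial state, all zeros
    states = []
    for x_t in X:                      # one time step per input vector
        a = np.tanh(x_t @ u + a @ w)   # new state from current input and previous state
        states.append(a)
    y_hat = sigmoid(a @ v + b)         # prediction computed from the last state
    return np.array(states), y_hat

rng = np.random.default_rng(0)
steps, input_dim, hidden_dim = 6, 4, 3      # hypothetical sizes
X = rng.normal(size=(steps, input_dim))     # stand-in for 6 embedded words
u = rng.normal(size=(input_dim, hidden_dim))
w = rng.normal(size=(hidden_dim, hidden_dim))
v = rng.normal(size=hidden_dim)
b = 0.0

states, y_hat = rnn_forward(X, u, w, v, b)
print(states.shape, float(y_hat))
```

The same `u`, `w`, and `v` are reused at every step, which is what lets the network handle sequences of any length.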

# 4.LSTM

 * provides a solution to the vanishing gradient problem.
 * an lstm is built by modifying the hidden layers of an rnn.
 * vanishing gradient problem : in very deep networks, the gradients used to update the weights shrink step by step until they vanish, so the network can no longer update its weights and may even stop working altogether.
 * lstm has feedback connections. It can process not only single data points but also entire sequences of data.
 * lstm uses 3 gates to solve these problems
    1. forget gate : removes information that is no longer needed from the cell state.
    2. input gate : loads the necessary new information into the cell state.
    3. output gate : decides which part of the cell state is passed on as the output of the current step.

![](https://drive.google.com/uc?export=view&id=1glo-eAOhIovNgauQ5IozknbUG9cZZEzj)
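
The three gates can be sketched as a single LSTM time step in NumPy. This follows the standard LSTM formulation rather than anything shown in the video; the weight layout (one matrix per gate acting on the concatenated previous state and input), the sizes, and the name `lstm_step` are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W and b hold one weight matrix / bias per gate, keyed by name."""
    z = np.concatenate([h_prev, x_t])        # previous hidden state + current input
    f = sigmoid(z @ W["f"] + b["f"])         # forget gate: what to drop from the cell state
    i = sigmoid(z @ W["i"] + b["i"])         # input gate: how much new information to write
    g = np.tanh(z @ W["g"] + b["g"])         # candidate values to write
    o = sigmoid(z @ W["o"] + b["o"])         # output gate: what to expose as output
    c_t = f * c_prev + i * g                 # updated cell state
    h_t = o * np.tanh(c_t)                   # updated hidden state (the output of this step)
    return h_t, c_t

rng = np.random.default_rng(0)
input_dim, hidden_dim = 5, 4                 # hypothetical sizes
W = {k: rng.normal(scale=0.1, size=(hidden_dim + input_dim, hidden_dim)) for k in "figo"}
b = {k: np.zeros(hidden_dim) for k in "figo"}

h, c = np.zeros(hidden_dim), np.zeros(hidden_dim)
for x_t in rng.normal(size=(10, input_dim)):  # a sequence of 10 random input vectors
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape, c.shape)
```

The cell state is updated additively (`f * c_prev + i * g`) instead of being pushed through a squashing function at every step, which is the usual explanation for why gradients survive longer than in a plain RNN.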

# 5.Implementing LSTM

## 5.1 Import Libraries & Dataset

*Import libraries*

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

*Import dataset*

In [None]:
data=pd.read_csv("../input/us-baby-names/NationalNames.csv")

## 5.2 Analyze Dataset

In [None]:
data.info()

In [None]:
data.shape

In [None]:
data.head()

## 5.3 Preprocessing Data

*Label encoder for gender column*

In [None]:
data['Gender']=data['Gender'].astype('category').cat.codes
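
As a small illustration of what `cat.codes` does, the sketch below encodes a toy column. The sample values are made up, and it assumes the Gender column holds the labels `F` and `M`. Categories are sorted before codes are assigned, so `F` becomes 0 and `M` becomes 1.

```python
import pandas as pd

toy = pd.Series(["F", "M", "F", "M", "M"])   # made-up sample, not the real dataset
cat = toy.astype("category")

codes = cat.cat.codes.tolist()
mapping = dict(enumerate(cat.cat.categories))  # code -> original label
print(codes)
print(mapping)
```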

In [None]:
data.head()

*We only need unique names*

In [None]:
# average the encoded gender per name, giving one row per unique name
df=data.groupby('Name')['Gender'].mean().reset_index()

In [None]:
df.shape

In [None]:
df.head()

*Convert the Gender column to int*

In [None]:
# astype('int') truncates: any mean below 1 becomes 0
df['Gender']=df['Gender'].astype('int')
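
Because `astype('int')` truncates, every name whose mean is below 1 is labeled 0 (female), even when most of its rows are male. One possible alternative, sketched here on a made-up frame, is to label each name by the gender with the larger total `Count`. The column names `Name`, `Gender`, and `Count` are assumed to match the dataset, and the frame `toy` is hypothetical.

```python
import pandas as pd

# made-up rows: Gender is already encoded (0 = female, 1 = male)
toy = pd.DataFrame({
    "Name":   ["Alex", "Alex", "Mary", "John", "John"],
    "Gender": [1,      0,      0,      1,      0],
    "Count":  [900,    100,    500,    800,    5],
})

# share of male babies per name, weighted by Count
male = (toy["Gender"] * toy["Count"]).groupby(toy["Name"]).sum()
total = toy.groupby("Name")["Count"].sum()
labels = (male / total >= 0.5).astype(int).reset_index(name="Gender")
print(labels)
```

With the truncating approach, `John` in this toy frame would be labeled 0 because one of its two rows is female; the weighted version labels it 1.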

In [None]:
df.head()

In [None]:
import string

*We need the list of lowercase letters to convert names to numbers*

In [None]:
letters=list(string.ascii_lowercase)
letters

*We can use vocab to encode letters to numbers*

In [None]:
vocab=dict(zip(letters,range(1,27)))
vocab

*We can use r_vocab to decode words*

In [None]:
r_vocab=dict(zip(range(1,27),letters))
r_vocab

*This function converts every letter in the Name column to a number and saves the result in the dataframe*

In [None]:
def word_to_number():
  # replace each name with the list of its letter codes (a=1 ... z=26);
  # assigning the whole column avoids pandas' chained-assignment pitfalls
  df['Name']=df['Name'].apply(lambda name: [vocab[ch] for ch in name.lower()])

In [None]:
# to convert our names to list of equivalent numbers
word_to_number()
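
To make the encoding concrete, the sketch below encodes one name and decodes it again with the two dictionaries. It rebuilds `vocab` and `r_vocab` so that it runs on its own; the helper names `encode` and `decode` are hypothetical.

```python
import string

letters = list(string.ascii_lowercase)
vocab = dict(zip(letters, range(1, 27)))     # 'a' -> 1 ... 'z' -> 26 (0 is left free for padding)
r_vocab = dict(zip(range(1, 27), letters))   # 1 -> 'a' ... 26 -> 'z'

def encode(name):
    """Turn a name into a list of letter codes."""
    return [vocab[ch] for ch in name.lower()]

def decode(seq):
    """Turn a list of letter codes back into a lowercase name, skipping padding zeros."""
    return "".join(r_vocab[n] for n in seq if n != 0)

seq = encode("Anna")
print(seq)            # [1, 14, 14, 1]
print(decode(seq))    # anna
```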

*Let's see what our dataframe looks like after encoding*

In [None]:
df.head()

*We need to determine the number of boxes (time steps) in the LSTM. Each letter is sent to one box, so the length of the names determines how many boxes we need. If we use the length of the longest name, most names will be padded with lots of zeros, which can hurt accuracy.*

*We can look at the histogram of name lengths and then decide on the best number of boxes*

In [None]:
X=df['Name'].values
Y=df['Gender'].values

In [None]:
name_length=[len(name) for name in X]

In [None]:
len(name_length)

In [None]:
plt.hist(name_length,bins=20)
plt.show()
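
Besides reading the histogram, the cut-off can be checked numerically. The sketch below uses made-up name lengths (the real values live in `name_length`) and reports what fraction of names fit into a given number of boxes; the helper `coverage` is hypothetical.

```python
def coverage(lengths, maxlen):
    """Fraction of names whose length is at most maxlen."""
    return sum(1 for n in lengths if n <= maxlen) / len(lengths)

sample_lengths = [3, 4, 5, 5, 6, 6, 6, 7, 7, 8, 9, 11]   # made-up example values

for maxlen in (6, 8, 10):
    print(maxlen, round(coverage(sample_lengths, maxlen), 2))
```

A length that covers most names while leaving out only a few long ones keeps the amount of padding small.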

*We decided to use 10 boxes, so the next step is to convert each name to a row of 10 numbers*

In [None]:
from keras.preprocessing.sequence import pad_sequences
x=pad_sequences(df['Name'].values,
                maxlen=10,
                padding='pre')

In [None]:
x

In [None]:
x.shape
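
What `pad_sequences` does with `maxlen=10` and `padding='pre'` can be imitated in a few lines of plain Python. This is a hand-written stand-in for illustration, not the Keras implementation; it assumes the Keras default of also truncating from the front (`truncating='pre'`), and the name `pad_pre` is hypothetical.

```python
def pad_pre(seq, maxlen=10, value=0):
    """Left-pad seq with `value` up to maxlen; keep only the last maxlen items if it is longer."""
    seq = list(seq)[-maxlen:]                     # drop items from the front when too long
    return [value] * (maxlen - len(seq)) + seq    # fill the front with padding

print(pad_pre([1, 14, 14, 1]))          # short name: zeros are added in front
print(pad_pre(list(range(1, 13))))      # 12 codes: the first two are dropped
```

Names longer than 10 letters therefore lose their first letters, which is worth keeping in mind when picking the number of boxes.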

## 5.4 Creating Model

In [None]:
from keras.layers import Input,Embedding,Dense,LSTM
from keras.models import Model

In [None]:
vocab_size=len(vocab)+1
vocab_size

In [None]:
# input layer
inp=Input(shape=(10,))
# embedding layer 
emn=Embedding(input_dim=vocab_size,
              output_dim =5 )(inp)
# lstm layers
lstm1=LSTM(units=32,
           return_sequences=True)(emn)
lstm2=LSTM(units=64)(lstm1)

out=Dense(units=1,
          activation='sigmoid')(lstm2)

my_model=Model(inputs=inp,
               outputs=out)

In [None]:
my_model.summary()
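
The parameter counts printed by `my_model.summary()` can be reproduced by hand. The sketch below uses the usual formulas for these layer types (an LSTM has four weight blocks, each with input weights, recurrent weights, and a bias); the function names are hypothetical, and the totals assume the default Keras settings used in the model above.

```python
def embedding_params(vocab_size, dim):
    return vocab_size * dim

def lstm_params(input_dim, units):
    # 4 blocks (forget, input, candidate, output), each: input weights + recurrent weights + bias
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    return input_dim * units + units

counts = {
    "embedding": embedding_params(27, 5),   # 26 letters + 1 padding index
    "lstm1": lstm_params(5, 32),
    "lstm2": lstm_params(32, 64),
    "dense": dense_params(64, 1),
}
print(counts, sum(counts.values()))
```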

## 5.5 Compile & Train Model

In [None]:
my_model.compile(optimizer='adam',
                 loss='binary_crossentropy',
                 metrics=['acc'])
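
The `binary_crossentropy` loss chosen above can be written out by hand to see what the model is minimizing. This is a plain-Python sketch of the standard formula, not the Keras implementation; the clipping constant `eps` and the example values are assumptions.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)     # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels = [1, 0, 1]                         # made-up labels (1 = male, 0 = female)
good = binary_crossentropy(labels, [0.9, 0.1, 0.8])
bad = binary_crossentropy(labels, [0.2, 0.7, 0.4])
print(good, bad)
```

Confident correct predictions give a loss close to 0, while confident wrong predictions are penalized heavily.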

In [None]:
his=my_model.fit(x,Y,epochs=10, batch_size=256,validation_split=0.2)
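
`validation_split` takes the last 20% of the rows as they are ordered, without shuffling them first. Since `df` comes out of `groupby` sorted by name, the validation set would consist of names from the end of the alphabet. One possible remedy, sketched with stand-in arrays, is to shuffle `x` and `Y` with the same permutation before calling `fit`; the seed and the toy arrays are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)          # fixed seed so the split is reproducible

# stand-ins for the padded names `x` and the labels `Y`
x_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)

perm = rng.permutation(len(x_demo))      # one permutation for both arrays
x_shuffled, y_shuffled = x_demo[perm], y_demo[perm]

# rows and labels still belong together after shuffling
print(x_shuffled[:3], y_shuffled[:3])
```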

## 5.6 Visualize Result

In [None]:
# matplotlib >= 3.6 renamed this style; on older versions use 'seaborn-darkgrid'
plt.style.use('seaborn-v0_8-darkgrid')

*Accuracy and Validation Accuracy*

In [None]:
fig, ax=plt.subplots(nrows=1,ncols=1,figsize=(10,5))
ax.plot(his.history['acc'],label='Accuracy')
ax.plot(his.history['val_acc'],label='Validation Accuracy')
ax.legend()
plt.show()

*Loss and Validation Loss*

In [None]:
fig, ax=plt.subplots(nrows=1,ncols=1,figsize=(10,5))
ax.plot(his.history['loss'],label='Loss')
ax.plot(his.history['val_loss'],label='Validation Loss')
ax.legend()
plt.show()

## 5.7 Predict for Random Name

In [None]:
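def predict_name(name):
  test_name=name.lower()
  # encode the letters; characters outside a-z are skipped instead of raising KeyError
  seq=[vocab[ch] for ch in test_name if ch in vocab]
  x_test=pad_sequences([seq],maxlen=10)
  # predict returns an array of shape (1, 1); take the single probability
  y_pred=my_model.predict(x_test)[0][0]
  # Gender was encoded as F=0, M=1, so values below 0.5 mean female
  if y_pred < 0.5:
    print("Name is female...")
  else:
    print("Name is male...")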

In [None]:
predict_name('Ugur')

In [None]:
predict_name('Ayse')

In [None]:
predict_name('Mustafa')

In [None]:
predict_name('Natasha')

# 6.LSTM Use Cases
* named entity recognition
* sentiment analysis
* machine translation
