<a href="https://colab.research.google.com/github/kamrulhuda/TensorflowTutorial/blob/main/LSTM_Basics.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Long Short-Term Memory Model**
The Long Short-Term Memory (LSTM) is an abstraction of how computer memory works. It is "bundled" with whatever processing unit is implemented in the Recurrent Network, although outside of its flow, and is responsible for keeping, reading, and outputting information for the model. The way it works is simple: you have a linear unit, which is the information cell itself, surrounded by three logistic gates responsible for maintaining the data. One gate is for inputting data into the information cell, one is for outputting data from the information cell, and the last one is to keep or forget data depending on the needs of the network.

Thanks to that, it not only solves the problem of keeping states, because the network can choose to forget data whenever information is not needed; it also eases the vanishing-gradient problem, since the information cell is linear and the logistic gates have a simple, well-behaved derivative.
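
To make the remark about the derivative concrete, here is a minimal sketch in plain Python. The helper names `sigmoid` and `sigmoid_derivative` are illustrative and do not come from the notebook. It shows that the derivative of the logistic function can be written in terms of the function's own output, which is what makes it cheap to compute during backpropagation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, the activation used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x: float) -> float:
    """Derivative of the logistic function: s * (1 - s), where s = sigmoid(x)."""
    s = sigmoid(x)
    return s * (1.0 - s)


# The derivative peaks at x = 0 and shrinks toward 0 for large |x|
for x in (-4.0, 0.0, 4.0):
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  derivative={sigmoid_derivative(x):.4f}")
```

The derivative never exceeds 0.25, so the gates alone do not keep gradients large; it is the linear information cell that lets a gradient pass through many time steps.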

**Long Short-Term Memory Architecture**
The Long Short-Term Memory is composed of a linear unit surrounded by three logistic gates. The names for these gates vary from place to place, but the most usual names for them are:

- the "Input" or "Write" Gate, which handles the writing of data into the information cell
- the "Output" or "Read" Gate, which handles the sending of data back onto the Recurrent Network
- the "Keep" or "Forget" Gate, which handles the maintaining and modification of the data stored in the information cell
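
The three gates can be sketched as a single time step in NumPy. This is a minimal illustration of the commonly used LSTM formulation, not the Keras implementation: the function `lstm_step`, the weight names `W`, `U`, `b`, the gate ordering inside the weight matrices, and the `tanh` squashing of the candidate values and of the cell output are all assumptions added here for illustration.

```python
import numpy as np


def sigmoid(x):
    """Logistic function applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """Run one LSTM time step.

    x: (batch, features); h_prev, c_prev: (batch, units)
    W: (features, 4 * units); U: (units, 4 * units); b: (4 * units,)
    """
    z = x @ W + h_prev @ U + b
    # Split the pre-activations into four equal blocks (illustrative ordering)
    i, f, g, o = np.split(z, 4, axis=1)
    i = sigmoid(i)   # "Input" / "Write" gate
    f = sigmoid(f)   # "Keep" / "Forget" gate
    o = sigmoid(o)   # "Output" / "Read" gate
    g = np.tanh(g)   # candidate values to write into the cell
    c = f * c_prev + i * g   # linear update of the information cell
    h = o * np.tanh(c)       # data sent back to the recurrent network
    return h, c


rng = np.random.default_rng(0)
features, units = 6, 4
W = rng.normal(scale=0.1, size=(features, 4 * units))
U = rng.normal(scale=0.1, size=(units, 4 * units))
b = np.zeros(4 * units)

x = np.array([[3.0, 2.0, 2.0, 2.0, 2.0, 2.0]])
h0 = np.zeros((1, units))
c0 = np.zeros((1, units))

h1, c1 = lstm_step(x, h0, c0, W, U, b)
print(h1.shape, c1.shape)
```

Note that the cell update `c = f * c_prev + i * g` is a weighted sum rather than a nested nonlinearity, which is the "linear unit" described above.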

In [1]:
import numpy as np
import pandas as pd
import tensorflow as tf
print(tf.__version__)

2.4.1


In [3]:
LSTM_CELL_SIZE = 4

# Initial state as a pair of zero tensors: (memory state, carry state)
state = (tf.zeros([1, LSTM_CELL_SIZE]),) * 2
state

(<tf.Tensor: shape=(1, 4), dtype=float32, numpy=array([[0., 0., 0., 0.]], dtype=float32)>,
 <tf.Tensor: shape=(1, 4), dtype=float32, numpy=array([[0., 0., 0., 0.]], dtype=float32)>)

In [4]:
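# return_sequences=True returns the output at every time step;
# return_state=True also returns the final memory and carry states
lstm = tf.keras.layers.LSTM(LSTM_CELL_SIZE, return_sequences=True, return_state=True)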

lstm.states= state

print(lstm.states)

(<tf.Tensor: shape=(1, 4), dtype=float32, numpy=array([[0., 0., 0., 0.]], dtype=float32)>, <tf.Tensor: shape=(1, 4), dtype=float32, numpy=array([[0., 0., 0., 0.]], dtype=float32)>)


In [5]:
sample_input = tf.constant([[3, 2, 2, 2, 2, 2]], dtype=tf.float32)

batch_size = 1
sequence_max_length = 1
number_of_features = 6

# Keras LSTM layers expect input shaped (batch, time steps, features)
new_shape = (batch_size, sequence_max_length, number_of_features)

inputs = tf.reshape(sample_input, new_shape)



In [6]:
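# Pass the initial state explicitly: Keras reads lstm.states on its own only for stateful layers
output, final_memory_state, final_carry_state = lstm(inputs, initial_state=list(state))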

In [7]:
print('Output:      ', tf.shape(output))
print('Final Memory State:      '  , tf.shape(final_memory_state))
print('Final Carry State:      '  , tf.shape(final_carry_state))

Output:       tf.Tensor([1 1 4], shape=(3,), dtype=int32)
Final Memory State:       tf.Tensor([1 4], shape=(2,), dtype=int32)
Final Carry State:       tf.Tensor([1 4], shape=(2,), dtype=int32)
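
The shapes above follow from the single time step in the input. As a hedged sketch of how they change with a longer sequence, the snippet below feeds three time steps of random data (an assumption for illustration, not the notebook's input) through a fresh layer and checks that the last step of the sequence output equals the final memory state.

```python
import numpy as np
import tensorflow as tf

cell_size = 4
layer = tf.keras.layers.LSTM(cell_size, return_sequences=True, return_state=True)

# Illustrative input: 1 sequence, 3 time steps, 6 features per step
x = tf.random.normal([1, 3, 6])
seq_out, memory_state, carry_state = layer(x)

print("Sequence output shape:", tuple(seq_out.shape))       # (batch, time steps, units)
print("Memory state shape:   ", tuple(memory_state.shape))  # (batch, units)
print("Carry state shape:    ", tuple(carry_state.shape))   # (batch, units)

# The output at the last time step is the final memory state
last_step = seq_out[:, -1, :]
matches = np.allclose(last_step.numpy(), memory_state.numpy())
print("Last output equals final memory state:", matches)
```

With `return_sequences=False` the layer would return only that last step, so the sequence dimension would disappear from the output.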
