# LSTM Intuition

An LSTM cell consists of a cell state and three gates: a forget gate, an input gate, and an output gate.

In every LSTM cell, the inputs are the previous hidden state (h_{i-1}), the previous cell state (c_{i-1}), and the current input (X_i).

The outputs are the current hidden state (h_i) and the current cell state (c_i).

In [2]:
def get_LSTM_cell(previous_hidden_state, previous_cell_state, current_input):
    # Skeleton only: the two return values are computed step by step in the sections below.
    return current_hidden_state, current_cell_state

## LSTM Flow

### 1. Forget gate layer
The forget gate decides what information should be kept or thrown away.

In the forget gate, the previous hidden state and the current input go through a sigmoid function.
The sigmoid maps each value into the range 0 to 1. The closer a value is to 0, the more is forgotten; the closer to 1, the more is kept.

In [3]:
def get_LSTM_cell(previous_hidden_state, previous_cell_state, current_input):
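    # Simplification: a standard LSTM concatenates these vectors and gives each gate its own learned weights and bias.
    combine = previous_hidden_state + current_input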
    # forget gate
    forget_gate_vector = sigmoid_fn(combine)
    return current_hidden_state, current_cell_state
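
`sigmoid_fn` is not defined in this section. As a minimal sketch, assuming the states are NumPy arrays, it could look like the following. The input values are made up purely to show how large negative inputs map close to 0 (forget) and large positive inputs map close to 1 (keep).

```python
import numpy as np

def sigmoid_fn(x):
    """Elementwise logistic sigmoid: maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical pre-activation values for a 3-unit forget gate.
combine = np.array([-6.0, 0.0, 6.0])
forget_gate_vector = sigmoid_fn(combine)  # approximately [0.0025, 0.5, 0.9975]

# Hypothetical previous cell state; the gate scales each entry independently.
previous_cell_state = np.array([2.0, 2.0, 2.0])
kept = previous_cell_state * forget_gate_vector

print(forget_gate_vector)
print(kept)
```

The first entry of `kept` is almost erased, the second is halved, and the third passes through nearly unchanged.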

### 2. Input gate layer
The input gate decides which new information is added to the cell state.

Process A = generate the input gate result

The previous hidden state and the current input are passed into a sigmoid function, which maps each value into the range 0 to 1. The closer a value is to 0, the less important it is.

Process B = generate the candidate

The previous hidden state and the current input are passed into a tanh function, which squashes each value into the range -1 to 1 and helps regulate the network.

Process A * Process B

The sigmoid output decides which information from the tanh output is important to keep.

In [4]:
def get_LSTM_cell(previous_hidden_state, previous_cell_state, current_input):
    combine = previous_hidden_state + current_input
    # forget gate
    forget_gate_vector = sigmoid_fn(combine)
    # input gate
    input_gate_output = sigmoid_fn(combine)
    candidate = tanh_fn(combine)
    input_gate_vector = input_gate_output * candidate
    return current_hidden_state, current_cell_state
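
To make the gating effect concrete, here is a small sketch assuming NumPy. The helper definitions and the two pre-activation arrays are hypothetical; they use different values for the gate and the candidate, as would happen in a trained LSTM where each has its own weights.

```python
import numpy as np

def sigmoid_fn(x):
    """Elementwise logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh_fn(x):
    """Elementwise hyperbolic tangent, output in (-1, 1)."""
    return np.tanh(x)

# Hypothetical pre-activations for a 3-unit layer.
gate_preactivation = np.array([-5.0, 0.0, 5.0])
candidate_preactivation = np.array([2.0, 2.0, -2.0])

input_gate_output = sigmoid_fn(gate_preactivation)  # Process A
candidate = tanh_fn(candidate_preactivation)        # Process B
input_gate_vector = input_gate_output * candidate   # elementwise product

print(input_gate_vector)
```

The first candidate value is almost fully suppressed because its gate is near 0, the second is halved, and the third passes through nearly intact with its negative sign preserved.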

### 3. Cell State
The previous cell state is multiplied elementwise by the forget gate vector. Values in the cell state are dropped when they are multiplied by gate values near 0.
The input gate vector is then added to this result, which updates the cell state with new values that the network finds relevant.

In [5]:
def get_LSTM_cell(previous_hidden_state, previous_cell_state, current_input):
    combine = previous_hidden_state + current_input
    # forget gate
    forget_gate_vector = sigmoid_fn(combine)
    # input gate
    input_gate_output = sigmoid_fn(combine)
    candidate = tanh_fn(combine)
    input_gate_vector = input_gate_output * candidate
    # cell state
    current_cell_state = (previous_cell_state * forget_gate_vector) + input_gate_vector
    return current_hidden_state, current_cell_state
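
The cell state update can be checked by hand on a few numbers. The sketch below assumes NumPy and uses made-up gate values chosen so that each position shows a different case: drop, keep, and partial keep plus a write.

```python
import numpy as np

# Hypothetical values, picked for easy arithmetic.
previous_cell_state = np.array([1.0, -2.0, 0.5])
forget_gate_vector = np.array([0.0, 1.0, 0.5])   # drop, keep, halve
input_gate_vector = np.array([0.3, 0.0, -0.25])  # new information to write

current_cell_state = (previous_cell_state * forget_gate_vector) + input_gate_vector

# Position 0: 1.0 * 0.0 + 0.3   = 0.3   (old value dropped, new value written)
# Position 1: -2.0 * 1.0 + 0.0  = -2.0  (old value kept unchanged)
# Position 2: 0.5 * 0.5 - 0.25  = 0.0   (old value halved, then offset)
print(current_cell_state)
```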

### 4. Output gate
The output gate decides what the next hidden state should be.

Process A

Pass the previous hidden state and the current input into a sigmoid function.

Process B

Pass the new cell state into a tanh function.

Multiply the result of Process A by the result of Process B to decide what information the hidden state should carry.

In [6]:
def get_LSTM_cell(previous_hidden_state, previous_cell_state, current_input):
    combine = previous_hidden_state + current_input
    # forget gate
    forget_gate_vector = sigmoid_fn(combine)
    # input gate
    input_gate_output = sigmoid_fn(combine)
    candidate = tanh_fn(combine)
    input_gate_vector = input_gate_output * candidate
    # cell state
    current_cell_state = (previous_cell_state * forget_gate_vector) + input_gate_vector
    # output gate
    output_gate_vector = sigmoid_fn(combine)
    current_hidden_state = tanh_fn(current_cell_state) * output_gate_vector
    return current_hidden_state, current_cell_state
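
`get_LSTM_cell` above adds the hidden state and the input and feeds the same `combine` to every gate, so all three sigmoid gates come out identical. In a standard LSTM, each gate applies its own learned weights and bias to the concatenation of the two vectors. The following is a rough sketch of that variant, assuming NumPy; `lstm_cell`, `init_params`, the `params` dictionary, and its key names are all hypothetical and the weights are random rather than trained.

```python
import numpy as np

def sigmoid_fn(x):
    """Elementwise logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(previous_hidden_state, previous_cell_state, current_input, params):
    """One LSTM step with separate (hypothetical) weights per gate."""
    # Concatenate [h_{i-1}, X_i] instead of adding them.
    combine = np.concatenate([previous_hidden_state, current_input])
    # Each gate has its own weight matrix and bias.
    forget_gate_vector = sigmoid_fn(params["W_f"] @ combine + params["b_f"])
    input_gate_output = sigmoid_fn(params["W_i"] @ combine + params["b_i"])
    candidate = np.tanh(params["W_c"] @ combine + params["b_c"])
    output_gate_vector = sigmoid_fn(params["W_o"] @ combine + params["b_o"])
    # Same update rules as in the walkthrough above.
    current_cell_state = (previous_cell_state * forget_gate_vector
                          + input_gate_output * candidate)
    current_hidden_state = np.tanh(current_cell_state) * output_gate_vector
    return current_hidden_state, current_cell_state

def init_params(hidden_size, input_size, rng):
    """Small random weights and zero biases (illustrative, not trained)."""
    params = {}
    for gate in ("f", "i", "c", "o"):
        params[f"W_{gate}"] = rng.normal(
            scale=0.1, size=(hidden_size, hidden_size + input_size))
        params[f"b_{gate}"] = np.zeros(hidden_size)
    return params

rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3
params = init_params(hidden_size, input_size, rng)

# Unroll the cell over a made-up sequence of 5 inputs.
h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
sequence = rng.normal(size=(5, input_size))
for x in sequence:
    h, c = lstm_cell(h, c, x, params)

print(h.shape, c.shape)
```

The loop shows how the two outputs of one step become the `previous_hidden_state` and `previous_cell_state` inputs of the next step.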