# LSTM (Long Short-Term Memory)
LSTM is a special variant of the recurrent neural network (RNN) that mitigates the
exploding and vanishing gradient problems that make plain RNNs hard to train. It performs
unreasonably well on many sequence tasks, such as translation (many inputs to many outputs)
and classification (many inputs to one output), thanks to its flexible structure and its
built-in notion of order in time.
## LSTM logic
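As a rough sketch, the cell built in the Torch snippets below can be summarized by the following update equations. The notation is assumed here rather than taken from the original text: $x_t$ is the current input, $h_{t-1}$ and $c_{t-1}$ are the previous hidden and cell states, $W_*$ and $U_*$ are the weights of the two `nn.Linear` layers feeding each gate, $b_*$ is their combined bias, $\sigma$ is the logistic sigmoid, and $\odot$ is the element-wise product.

```latex
\begin{aligned}
i_t         &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t         &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t         &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c)  && \text{(candidate cell state)} \\
c_t         &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(new cell state)} \\
h_t         &= o_t \odot \tanh(c_t)                && \text{(new hidden state)}
\end{aligned}
```

The forget gate scales how much of the old cell state is kept, the input gate scales how much of the candidate is written, and the output gate scales how much of the squashed cell state is exposed as the hidden state.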

## LSTM with torch
All the snippets below are implemented in Torch7, a powerful open-source library for
neural networks, using its `nn` and `nngraph` packages.
### LSTM cell in a plain way

```lua
require 'nn'
require 'nngraph'

-- There are three gates in a single LSTM cell: the input, forget,
-- and output gates.
------------------------------------
-- Define the inputs and dimension of the linear module
------------------------------------
ninputs = 4; noutputs = 4
x = nn.Identity()()
h = nn.Identity()()
c = nn.Identity()()

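-- Each call creates two fresh Linear layers, so every gate gets its own weights.
-- The hidden state h has noutputs entries, hence Linear(noutputs, noutputs).
function input_linear_sum(x, h)
    return nn.CAddTable()({
        nn.Linear(ninputs, noutputs)(x),
        nn.Linear(noutputs, noutputs)(h)
    })
end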
-- Define the input gate (Lua has no `new` keyword; just call the function)
input_gate = nn.Sigmoid()(input_linear_sum(x, h))
-- Define the forget gate
forget_gate = nn.Sigmoid()(input_linear_sum(x, h))
-- Define the output gate
output_gate = nn.Sigmoid()(input_linear_sum(x, h))
-- Define the candidate cell state
c_in  = nn.Tanh()(input_linear_sum(x, h))
-- Update cell state
c_new = nn.CAddTable()({
    nn.CMulTable()({forget_gate,c}),
    nn.CMulTable()({input_gate, c_in})
    })
-- Compute output
h_new = nn.CMulTable()({output_gate, nn.Tanh()(c_new)})
-- Define the model graph
lstm_model = nn.gModule({x,h,c},{h_new,c_new})
```

### LSTM cell in a neat way
Wrap all the variables of an LSTM cell in a single function.

```lua
require 'nn'
require 'nngraph'

--------------------------------------
-- Define LSTM cell in one function
--------------------------------------
-- Input and hidden dimensions, as in the plain version above
ninputs = 4; noutputs = 4

function lstm(x, h, c)
    -- Local helper: each call creates fresh Linear layers, one pair per gate
    local function input_linear_sum(x, h)
        return nn.CAddTable()({
            nn.Linear(ninputs, noutputs)(x),
            nn.Linear(noutputs, noutputs)(h)
        })
    end
    -- Define 3 gates (input, output, forget)
    local input_gate = nn.Sigmoid()(input_linear_sum(x, h))
    local output_gate = nn.Sigmoid()(input_linear_sum(x, h))
    local forget_gate = nn.Sigmoid()(input_linear_sum(x, h))
    -- Squash (transform) the input with tanh
    local c_in  = nn.Tanh()(input_linear_sum(x, h))
    -- Update cell state
    local c_new = nn.CAddTable()({
        nn.CMulTable()({forget_gate, c}),
        nn.CMulTable()({input_gate, c_in})
    })
    local h_new = nn.CMulTable()({output_gate, nn.Tanh()(c_new)})
    -- Return new h and c
    return h_new, c_new
end

```
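To see how such a cell might be used, the sketch below wraps the same construction in an `nn.gModule` and steps it over a short random sequence. It is an illustration under stated assumptions rather than a reference implementation: `make_lstm_cell` is a hypothetical helper name introduced here, the sizes (4 inputs, 3 hidden units, 5 time steps) are arbitrary, and only the forward pass is shown.

```lua
require 'torch'
require 'nn'
require 'nngraph'

-- Hypothetical helper (not part of the original text): builds one LSTM cell
-- as an nn.gModule mapping {x, h, c} -> {h_new, c_new}.
local function make_lstm_cell(ninputs, noutputs)
    local x = nn.Identity()()
    local h = nn.Identity()()
    local c = nn.Identity()()
    -- Each call creates fresh Linear layers, so every gate has its own weights
    local function input_linear_sum()
        return nn.CAddTable()({
            nn.Linear(ninputs, noutputs)(x),
            nn.Linear(noutputs, noutputs)(h)
        })
    end
    local input_gate  = nn.Sigmoid()(input_linear_sum())
    local forget_gate = nn.Sigmoid()(input_linear_sum())
    local output_gate = nn.Sigmoid()(input_linear_sum())
    local c_in        = nn.Tanh()(input_linear_sum())
    local c_new = nn.CAddTable()({
        nn.CMulTable()({forget_gate, c}),
        nn.CMulTable()({input_gate, c_in})
    })
    local h_new = nn.CMulTable()({output_gate, nn.Tanh()(c_new)})
    return nn.gModule({x, h, c}, {h_new, c_new})
end

-- Arbitrary sizes chosen for illustration
local ninputs, noutputs, seq_len = 4, 3, 5
local cell = make_lstm_cell(ninputs, noutputs)

-- Start from zero hidden and cell states
local h = torch.zeros(noutputs)
local c = torch.zeros(noutputs)
for t = 1, seq_len do
    local x_t = torch.randn(ninputs)           -- stand-in for real input data
    local out = cell:forward({x_t, h, c})
    -- Clone because the module reuses its output buffers on the next call
    h, c = out[1]:clone(), out[2]:clone()
end
print(h)
```

Calling the same module at every step means the weights are shared across time for the forward pass. Training through time would additionally require keeping one clone of the cell per step with shared parameters, which this sketch leaves out.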
