#### basic RNN cells and wrappers
http://www.wildml.com/2016/08/rnns-in-tensorflow-a-practical-guide-and-undocumented-features

Source code: see rnn_cell.py and contrib/rnn_cell.
    
As of the time of this writing, the basic RNN cells and wrappers are:

+ BasicRNNCell – A vanilla RNN cell.
+ GRUCell – A Gated Recurrent Unit cell.
+ BasicLSTMCell – An LSTM cell based on the paper *Recurrent Neural Network Regularization*. No peephole connections or cell clipping.
+ LSTMCell – A more complex LSTM cell that allows for optional peephole connections and cell clipping.
+ MultiRNNCell – A wrapper to combine multiple cells into a multi-layer cell.
+ DropoutWrapper – A wrapper to add dropout to input and/or output connections of a cell.
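
When several cells are stacked with MultiRNNCell, the TensorFlow upgrade note quoted in the example later in this section asks for `MultiRNNCell([LSTMCell(...) for _ in range(num_layers)])` instead of `MultiRNNCell([LSTMCell(...)] * num_layers)`. The plain-Python sketch below shows the difference between the two list constructions. `ToyCell` is a hypothetical stand-in, not a TensorFlow class; the sketch only illustrates object identity and does not reproduce TensorFlow's variable handling.

```python
class ToyCell:
    """Hypothetical stand-in for an RNN cell object (not a TensorFlow class)."""

    def __init__(self, num_units):
        self.num_units = num_units


# List multiplication repeats a reference: all four entries are ONE object.
shared = [ToyCell(64)] * 4

# A list comprehension calls the constructor four times: four distinct objects.
separate = [ToyCell(64) for _ in range(4)]

num_shared = len({id(c) for c in shared})
num_separate = len({id(c) for c in separate})

print(num_shared)    # 1
print(num_separate)  # 4
```

With the list comprehension, each layer of the stack gets its own cell object, which is what the upgrade note asks for.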

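According to the API documentation, the constructor's `num_units` argument has no default value and must be supplied by the network designer. The documentation describes the argument as follows: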

num_units: int, The number of units in the LSTM cell

This description is not very illuminating, because the API documentation never directly defines what "units in the LSTM cell" means.

To settle the question, we turn to the TensorFlow source. The implementation lives in core_rnn_cell_impl.py, which contains the following code:
```python
class BasicLSTMCell(RNNCell):
  """Basic LSTM recurrent network cell.
  The implementation is based on: http://arxiv.org/abs/1409.2329.
  We add forget_bias (default: 1) to the biases of the forget gate in order to
  reduce the scale of forgetting in the beginning of the training.
  It does not allow cell clipping, a projection layer, and does not
  use peep-hole connections: it is the basic baseline.
  For advanced models, please use the full LSTMCell that follows.
  """

  def __init__(self, num_units, forget_bias=1.0, input_size=None,
               state_is_tuple=True, activation=tanh, reuse=None):
    """Initialize the basic LSTM cell.
    Args:
      num_units: int, The number of units in the LSTM cell.
      forget_bias: float, The bias added to forget gates (see above).
      input_size: Deprecated and unused.
      state_is_tuple: If True, accepted and returned states are 2-tuples of
        the `c_state` and `m_state`.  If False, they are concatenated
        along the column axis.  The latter behavior will soon be deprecated.
      activation: Activation function of the inner states.
      reuse: (optional) Python boolean describing whether to reuse variables
        in an existing scope.  If not `True`, and the existing scope already has
        the given variables, an error is raised.
    """
    if not state_is_tuple:
      logging.warn("%s: Using a concatenated state is slower and will soon be "
                   "deprecated.  Use state_is_tuple=True.", self)
    if input_size is not None:
      logging.warn("%s: The input_size parameter is deprecated.", self)
    self._num_units = num_units
    self._forget_bias = forget_bias
    self._state_is_tuple = state_is_tuple
    self._activation = activation
    self._reuse = reuse

  @property
  def output_size(self):
    return self._num_units
```

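Note that the `output_size` property simply returns `num_units`, so we can infer that `num_units` determines the dimensionality of the LSTM cell's output vector:
for each sample in a batch, the output at every time step is a vector of length `num_units`. More broadly, `num_units` can be read as the capacity of the network's feature representation:
the more complex the features that must be distinguished, the more dimensions are needed to represent them.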
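
To make the role of `num_units` concrete, here is a minimal NumPy sketch of a single basic LSTM step (no peepholes, clipping, or projection). It is not TensorFlow's implementation: the function name `lstm_step`, the gate ordering, and the random weights are assumptions chosen for illustration. The point is only that the output `h` and the cell state `c` both have shape `(batch_size, num_units)`, whatever the input dimension is.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, kernel, bias, forget_bias=1.0):
    """One step of a basic LSTM cell (illustrative sketch, not TF code)."""
    # One affine map on [x, h_prev] yields all four gate pre-activations.
    z = np.concatenate([x, h_prev], axis=1) @ kernel + bias
    # Gate ordering is an assumption of this sketch:
    # input gate, candidate values, forget gate, output gate.
    i, j, f, o = np.split(z, 4, axis=1)
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    h = sigmoid(o) * np.tanh(c)
    return h, c


batch_size, input_dim, num_units = 2, 8, 64
rng = np.random.default_rng(0)

x = rng.standard_normal((batch_size, input_dim))
h0 = np.zeros((batch_size, num_units))
c0 = np.zeros((batch_size, num_units))

# The weight matrix maps (input_dim + num_units) inputs to 4 * num_units gates.
kernel = 0.1 * rng.standard_normal((input_dim + num_units, 4 * num_units))
bias = np.zeros(4 * num_units)

h1, c1 = lstm_step(x, h0, c0, kernel, bias)
print(h1.shape, c1.shape)  # (2, 64) (2, 64)
```

Changing `input_dim` only changes the number of rows in `kernel`; the shapes of `h1` and `c1` depend on `num_units` alone.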

```python
import tensorflow as tf
import numpy as np

tf.reset_default_graph()

# Create input data: a batch of 2 sequences, 10 time steps, 8 features per step
X = np.random.randn(2, 10, 8)

# The second example is of length 6, so zero out its padding steps
X[1, 6:, :] = 0
X_lengths = [10, 6]

# ============================
# tf.nn.rnn_cell --> tf.contrib.rnn
# cell = tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)
# cell = tf.contrib.rnn.DropoutWrapper(cell=cell, output_keep_prob=0.5)
# If before you were using: MultiRNNCell([LSTMCell(...)] * num_layers), change to: MultiRNNCell([LSTMCell(...) for _ in range(num_layers)]).
# cell = tf.contrib.rnn.MultiRNNCell(cells=[cell] * 4, state_is_tuple=True)
# cell = tf.contrib.rnn.MultiRNNCell([cell for _ in range(5)])

# ============================
# Build a 4-layer stack, creating a fresh cell object for every layer
cells = []
for _ in range(4):
    # num_units: int, The number of units in the LSTM cell
    cell = tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)
    cell = tf.contrib.rnn.DropoutWrapper(cell=cell, output_keep_prob=0.5)
    cells.append(cell)
cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)

# ============================
# is_training = True
# NUM_LAYERS = 4

# # cell = tf.contrib.rnn.LSTMCell(num_units=64, state_is_tuple=True)
# cell = tf.contrib.rnn.BasicLSTMCell(NUM_LAYERS)
# if is_training:
#     cell = tf.contrib.rnn.DropoutWrapper(cell=cell, output_keep_prob=0.5)

# for _ in range(NUM_LAYERS):
#     cells.append(cell)
# cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)

outputs, last_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)

result = tf.contrib.learn.run_n(
    {"outputs": outputs, "last_states": last_states},
    n=1,
    feed_dict=None)

# Output shape is (batch_size, max_time, num_units)
print(result[0]["outputs"].shape)
print(result[0]["outputs"])
assert result[0]["outputs"].shape == (2, 10, 64)

# Outputs for the second example past length 6 should be 0
assert (result[0]["outputs"][1, 7, :] == np.zeros(cell.output_size)).all()

# Final hidden state h of the first layer
print(result[0]["last_states"][0].h.shape)
print(result[0]["last_states"][0].h)
```

Instructions for updating:
graph_actions.py will be deleted. Use tf.train.* utilities instead. You can use learn/estimators/estimator.py as an example.
(2, 10, 64)
[[[ 0.0014168   0.00064998  0.         ..., -0.          0.00074619
    0.00175768]
  [ 0.00136874  0.          0.00150519 ..., -0.00395315  0.          0.00420968]
  [ 0.00180548  0.          0.00125465 ..., -0.00368526  0.0026687
    0.00566239]
  ..., 
  [ 0.00759778 -0.00582007  0.         ...,  0.         -0.00136309
   -0.00707219]
  [ 0.0127089  -0.01148308  0.01078373 ...,  0.00687817 -0.         -0.        ]
  [ 0.         -0.00485191  0.         ...,  0.         -0.         -0.        ]]

 [[-0.00129557  0.          