<a href="https://colab.research.google.com/github/ShinAsakawa/2019cnps/blob/master/notebooks/2019cnps_elmo_tesorflow_hub.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Original: <https://tfhub.dev/google/elmo/2>
Paper: <https://arxiv.org/abs/1802.05365>
Project: <https://allennlp.org/elmo>

# ELMo Hands-on Exercise

- Embeddings from a language model trained on the 1 Billion Word Benchmark.

## Overview
Computes contextualized word representations using character-based word representations and bidirectional LSTMs, as described in the paper "Deep contextualized word representations" [^1].

This module accepts input either as raw text strings or as tokenized text strings.

The module outputs fixed embeddings at each LSTM layer, a learnable aggregation of the 3 layers, and a fixed mean-pooled vector representation of the input.

This complex architecture achieved state-of-the-art results on several benchmarks when the paper was published. Note that the module is far more computationally expensive than word embedding modules that only perform embedding lookups, so using an accelerator is recommended.

### Trainable parameters
The module exposes 4 trainable scalar weights for layer aggregation.

### Example use

```python
import tensorflow_hub as hub

# trainable=True exposes the layer-aggregation weights for training.
elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
embeddings = elmo(
    ["the cat is on the mat", "dogs are in the fog"],
    signature="default",
    as_dict=True)["elmo"]
```

---

```python
elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
tokens_input = [["the", "cat", "is", "on", "the", "mat"],
                ["dogs", "are", "in", "the", "fog", ""]]
tokens_length = [6, 5]
embeddings = elmo(
    inputs={
        "tokens": tokens_input,
        "sequence_len": tokens_length
    },
    signature="tokens",
    as_dict=True)["elmo"]
```

We set the `trainable` parameter to `True` when creating the module so that the 4 scalar weights (as described in the paper) can be trained. In this setting, the module still keeps all other parameters fixed.
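In the paper, the task-specific representation of token $k$ is $\gamma \sum_j s_j \mathbf{h}_{k,j}$, where the $s_j$ are softmax-normalized layer weights and $\gamma$ rescales the whole vector. With three layers this gives three weights plus one scale, which presumably corresponds to the 4 trainable scalars mentioned above. The following is a minimal NumPy sketch of that aggregation, not the module's own code: the function names, the toy dimensions, and the assumption that all three layers share one dimensionality are illustrative only.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()

def aggregate_layers(layers, layer_logits, gamma):
    """Combine per-layer activations into one representation.

    layers:       array of shape [num_layers, batch, max_length, dim]
    layer_logits: array of shape [num_layers]; unnormalized layer weights
    gamma:        scalar that rescales the weighted sum
    """
    weights = softmax(np.asarray(layer_logits, dtype=np.float64))
    # Contract the layer axis: sum_j weights[j] * layers[j]
    mixed = np.tensordot(weights, layers, axes=([0], [0]))
    return gamma * mixed

# Hypothetical activations: 3 layers, batch of 2, 6 tokens, 4 dims (toy size).
rng = np.random.default_rng(0)
layers = rng.normal(size=(3, 2, 6, 4))

# With equal logits and gamma = 1, the result is the plain average of the layers.
elmo_like = aggregate_layers(layers, layer_logits=[0.0, 0.0, 0.0], gamma=1.0)
print(elmo_like.shape)
```

Training only these scalars lets a downstream task choose how much to rely on each layer without updating the much larger LSTM weights.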

### Inputs
The module defines [two signatures](https://www.tensorflow.org/hub/basics#applying_a_module): `default` and `tokens`.

With the default signature, the module takes untokenized sentences as input. The input tensor is a string tensor with shape `[batch_size]`. The module tokenizes each string by **splitting on spaces**.
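Splitting on spaces has a practical consequence: punctuation stays attached to the neighbouring word. The sketch below imitates that behaviour in plain Python to make it visible. `split_on_spaces` is a hypothetical helper, not part of the module, and how the module treats repeated or leading spaces is not covered here.

```python
def split_on_spaces(sentence):
    """Tokenize by splitting on the space character only (no punctuation handling)."""
    return sentence.split(" ")

tokens = split_on_spaces("The cat sat on the mat.")
print(tokens)  # ['The', 'cat', 'sat', 'on', 'the', 'mat.']
```

If punctuation should be a separate token, one option is to tokenize the text beforehand and pass the result through the `tokens` signature instead.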

With the `tokens` signature, the module takes tokenized sentences as input. The inputs are a string tensor with shape `[batch_size, max_length]` and an int32 tensor with shape `[batch_size]` giving the length of each sentence. The length input is necessary to exclude padding when sentences vary in length.
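Building the two inputs by hand is error-prone once a batch has more than a few sentences. A minimal sketch of a padding helper follows. `pad_token_batch` is a hypothetical name, and padding with the empty string simply follows the example earlier in this notebook.

```python
def pad_token_batch(sentences, pad_token=""):
    """Pad tokenized sentences to a common length.

    Returns (tokens, lengths): tokens is a rectangular list of lists and
    lengths holds the original, unpadded length of each sentence.
    """
    lengths = [len(s) for s in sentences]
    max_length = max(lengths) if lengths else 0
    tokens = [list(s) + [pad_token] * (max_length - len(s)) for s in sentences]
    return tokens, lengths

tokens_input, tokens_length = pad_token_batch([
    ["the", "cat", "is", "on", "the", "mat"],
    ["dogs", "are", "in", "the", "fog"],
])
print(tokens_input[1])  # ['dogs', 'are', 'in', 'the', 'fog', '']
print(tokens_length)    # [6, 5]
```

The two return values can then be passed as the `tokens` and `sequence_len` entries of the `inputs` dictionary.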

### Outputs
The output dictionary contains:

* `word_emb`: the character-based word representations with shape `[batch_size, max_length, 512]`.
* `lstm_outputs1`: the first LSTM hidden state with shape `[batch_size, max_length, 1024]`.
* `lstm_outputs2`: the second LSTM hidden state with shape `[batch_size, max_length, 1024]`.
* `elmo`: the weighted sum of the 3 layers, where the weights are trainable. This tensor has shape `[batch_size, max_length, 1024]`.
* `default`: a fixed mean-pooling of all contextualized word representations with shape `[batch_size, 1024]`.
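To make the relationship between a `[batch_size, max_length, dim]` output and a `[batch_size, dim]` pooled output concrete, here is a small NumPy sketch of mean pooling that skips padded positions. It illustrates the kind of operation described for `default`. It is not the module's implementation, and `masked_mean_pool` and the toy values are assumptions made for the example.

```python
import numpy as np

def masked_mean_pool(embeddings, lengths):
    """Average token vectors over time, ignoring padded positions.

    embeddings: array of shape [batch_size, max_length, dim]
    lengths:    true (unpadded) length of each sentence, shape [batch_size]
    """
    embeddings = np.asarray(embeddings, dtype=np.float64)
    lengths = np.asarray(lengths)
    max_length = embeddings.shape[1]
    # mask[b, t] is 1.0 for real tokens and 0.0 for padding.
    mask = (np.arange(max_length)[None, :] < lengths[:, None]).astype(np.float64)
    summed = (embeddings * mask[:, :, None]).sum(axis=1)
    # Guard against division by zero for empty sentences.
    return summed / np.maximum(lengths, 1)[:, None]

# Toy stand-in for a model output: 2 sentences, 3 positions, 2 dims.
# The last position of the second sentence is padding.
toy = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
                [[2.0, 2.0], [4.0, 4.0], [9.0, 9.0]]])
pooled = masked_mean_pool(toy, lengths=[3, 2])
print(pooled.shape)  # (2, 2)
```

Here the padded vector `[9.0, 9.0]` does not affect the second row of `pooled`, which is why the sentence lengths matter for batches of unequal length.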

In [0]:
import tensorflow_hub as hub
elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

In [0]:
embeddings = elmo(
    ["the cat is on the mat", "dogs are in the fog"],
    signature="default",
    as_dict=True)["elmo"]


In [0]:
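# Only the last bare expression in a cell is displayed, so print each value.
# `embeddings` is a symbolic tensor here; no values exist until it is run in a session.
print(type(embeddings))
print(embeddings.shape)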
#type(embeddings[0,0,0])
#print(embeddings[0,0,0])
print(embeddings.name)

In [0]:
#elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
tokens_input = [["the", "cat", "is", "on", "the", "mat"],
                ["dogs", "are", "in", "the", "fog", ""]]
tokens_length = [6, 5]
embeddings = elmo(
    inputs={
        "tokens": tokens_input,
        "sequence_len": tokens_length
    },
    signature="tokens",
    as_dict=True)["elmo"]


In [0]:
import tensorflow as tf
import tensorflow_hub as hub

# Create graph and finalize (finalizing optional but recommended).
g = tf.Graph()
with g.as_default():
  # We will be feeding 1D tensors of text into the graph.
  text_input = tf.placeholder(dtype=tf.string, shape=[None])
  # Note: this cell loads the Universal Sentence Encoder rather than ELMo.
  embed = hub.Module("https://tfhub.dev/google/universal-sentence-encoder/2")
  embedded_text = embed(text_input)
  init_op = tf.group([tf.global_variables_initializer(), tf.tables_initializer()])
g.finalize()

# Create session and initialize.
session = tf.Session(graph=g)
session.run(init_op)

In [0]:
result = session.run(embedded_text, feed_dict={text_input: ["Hello world"]})

In [0]:
result

In [0]:
elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
#help(elmo)

### References

[^1]: Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer. Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018.

