# Hash Table Implementation in TensorFlow

Hash tables in TensorFlow implement the interface `tf.contrib.lookup.LookupInterface`. The basic implementation is `tf.contrib.lookup.HashTable`, which is immutable: it is initialized once from a fixed set of key-value pairs, and those values are then only read during learning. This makes it very useful for defining vocabularies.

In [1]:
import tensorflow as tf
sess = tf.InteractiveSession()

In [2]:
keys = tf.constant(["a", "b", "c"])
values = tf.range(3)


table = tf.contrib.lookup.HashTable(
    tf.contrib.lookup.KeyValueTensorInitializer(keys, values), -1)  # -1 is the default for missing keys
out = table.lookup(tf.constant("b"))
table.init.run()  # initialize the table before evaluating the lookup
print("Gathered index: " + str(out.eval()))

Gathered index: 1
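
The `-1` passed to `HashTable` is the value returned for keys that are not in the table, which the cell above does not exercise. As a rough mental model only, the following plain-Python sketch mimics that behavior. It is not TensorFlow code: `StaticTable` and `table_model` are hypothetical names introduced here for illustration.

```python
from types import MappingProxyType


class StaticTable:
    """Hypothetical plain-Python model of an immutable lookup table."""

    def __init__(self, keys, values, default_value):
        # Freeze the mapping so it cannot change after initialization.
        self._data = MappingProxyType(dict(zip(keys, values)))
        self._default = default_value

    def lookup(self, queries):
        # Unknown keys map to the default value instead of raising.
        return [self._data.get(q, self._default) for q in queries]


table_model = StaticTable(["a", "b", "c"], range(3), default_value=-1)
result = table_model.lookup(["b", "z"])
print(result)  # [1, -1]
```

Here `"b"` resolves to its index, while the unseen key `"z"` falls back to the default, which is what makes `-1` usable as an "unknown token" marker in a vocabulary.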


The limitation of `HashTable` is that it cannot be updated during learning: no new keys can be added. With a fixed, pre-initialized set of labels it is a viable choice. If new keys must be added during learning, TensorFlow provides `MutableHashTable` and `MutableDenseHashTable` instead. They are used in a similar way, but inserting or updating key-value pairs requires explicitly defined operations.

In [3]:
keys = tf.constant(["ciao", "come", "stai"], tf.string)
values = tf.constant([1., 2., 3.], tf.float32)
query = tf.constant(["come", "stai"], tf.string)
embeddings = tf.constant([[12.4, 17.2, 0.77], [1.11, 2.22, 3.77], [0.11, 4.39, 11.1], [90.11, 112.39, 1.1]])


table = tf.contrib.lookup.MutableDenseHashTable(key_dtype=tf.string,
                                                value_dtype=tf.float32,  # int64 values not supported for string keys
                                                default_value=-1,
                                                empty_key=tf.constant("pad"))  # reserved marker, must not be used as a real key


insert_op = table.insert(keys, values)  # in a mutable table, insert/update is an operation, so it can also be run during learning
sess.run(insert_op)


ind = table.lookup(query)
ind = tf.cast(ind, tf.int64)  # values are stored as floats, so cast them to integer row indices

out = tf.gather(params=embeddings, indices=ind)
print("Gathered values:")
print(sess.run(out))

Gathered values:
[[  1.09999999e-01   4.38999987e+00   1.11000004e+01]
 [  9.01100006e+01   1.12389999e+02   1.10000002e+00]]
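
The cell above chains three steps: insert key-value pairs, look up the queries, and use the results as row indices into `embeddings`. A minimal plain-Python sketch of the same flow is given below. It is only an analogy and does not use TensorFlow; `MutableTable`, `table_model`, `indices`, and `rows` are hypothetical names introduced for illustration.

```python
class MutableTable:
    """Hypothetical plain-Python model of a mutable lookup table."""

    def __init__(self, default_value):
        self._data = {}
        self._default = default_value

    def insert(self, keys, values):
        # Adds new keys and overwrites the values of existing ones.
        self._data.update(zip(keys, values))

    def lookup(self, queries):
        # Unknown keys map to the default value.
        return [self._data.get(q, self._default) for q in queries]


embeddings = [[12.4, 17.2, 0.77], [1.11, 2.22, 3.77],
              [0.11, 4.39, 11.1], [90.11, 112.39, 1.1]]

table_model = MutableTable(default_value=-1.0)
table_model.insert(["ciao", "come", "stai"], [1.0, 2.0, 3.0])

# Values are floats, so convert them to integer row indices before gathering.
indices = [int(v) for v in table_model.lookup(["come", "stai"])]
rows = [embeddings[i] for i in indices]
print(rows)  # [[0.11, 4.39, 11.1], [90.11, 112.39, 1.1]]
```

In this Python model a missing key would yield the index `-1`, which silently selects the last row, so a real pipeline should probably check for the default value before gathering.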
