## Dataset

In [1]:
import tensorflow as tf

In [2]:
sess = tf.InteractiveSession()

In [3]:
dataset1 = tf.contrib.data.Dataset.from_tensor_slices(tf.random_uniform([4, 10]))
print(dataset1.output_types)  # ==> "tf.float32"
print(dataset1.output_shapes)  # ==> "(10,)"

<dtype: 'float32'>
(10,)


In [4]:
dataset2 = tf.contrib.data.Dataset.from_tensor_slices(
   (tf.random_uniform([4]),
    tf.random_uniform([4, 100], maxval=100, dtype=tf.int32)))
print(dataset2.output_types)  # ==> "(tf.float32, tf.int32)"
print(dataset2.output_shapes)  # ==> "((), (100,))"

(tf.float32, tf.int32)
(TensorShape([]), TensorShape([Dimension(100)]))


In [5]:
dataset3 = tf.contrib.data.Dataset.zip((dataset1, dataset2))
print(dataset3.output_types)  # ==> "(tf.float32, (tf.float32, tf.int32))"
print(dataset3.output_shapes)  # ==> "((10,), ((), (100,)))"


(tf.float32, (tf.float32, tf.int32))
(TensorShape([Dimension(10)]), (TensorShape([]), TensorShape([Dimension(100)])))
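
The nesting in the output above can be reproduced with ordinary Python lists. This is only a plain-Python analogy (no TensorFlow involved, and the literal values are made up); it shows that zipping a dataset of vectors with a dataset of pairs yields elements shaped like `(vector, (scalar, vector))`.

```python
# Plain-Python analogy, not the TensorFlow API: lists stand in for datasets.
dataset1 = [[0.0] * 10 for _ in range(4)]               # 4 elements, each a 10-vector
dataset2 = list(zip([0.5] * 4,                          # 4 scalars ...
                    [[1] * 100 for _ in range(4)]))     # ... paired with 4 100-vectors
dataset3 = list(zip(dataset1, dataset2))                # nested: (d1, (d2a, d2b))

# Unpack one element following the same nesting as dataset3.output_shapes.
vector, (scalar, ints) = dataset3[0]
print(len(vector), type(scalar).__name__, len(ints))
```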


It is often convenient to give names to each component of an element, for example if they represent different features of a training example.

In [6]:
dataset = tf.contrib.data.Dataset.from_tensor_slices(
   {"X": tf.random_uniform([4]),
    "label": tf.random_uniform([4, 100], maxval=100, dtype=tf.int32)})
print(dataset.output_types)  # ==> "{'X': tf.float32, 'label': tf.int32}"
print(dataset.output_shapes)  # ==> "{'X': (), 'label': (100,)}"

{'X': tf.float32, 'label': tf.int32}
{'X': TensorShape([]), 'label': TensorShape([Dimension(100)])}
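
To make the slicing behaviour concrete, here is a minimal plain-Python sketch of what `from_tensor_slices` does with a dictionary: every component is cut along its first dimension, and each element keeps the component names. The helper `slice_elements` is hypothetical and only models the idea; it is not part of TensorFlow.

```python
def slice_elements(components):
    """Yield one dict per index of the shared leading dimension.

    Hypothetical helper: `components` maps a feature name to a list whose
    length is the number of elements, like the dict passed in cell In [6].
    """
    lengths = {len(values) for values in components.values()}
    if len(lengths) != 1:
        raise ValueError("all components must share the same leading dimension")
    for i in range(lengths.pop()):
        yield {name: values[i] for name, values in components.items()}


features = {
    "X": [0.1, 0.2, 0.3, 0.4],                  # 4 scalars
    "label": [[1, 2], [3, 4], [5, 6], [7, 8]],  # 4 vectors of length 2
}
elements = list(slice_elements(features))
print(elements[0])
```

Each element is a dictionary with the same keys as the input, which is why the reported types and shapes above are keyed by `'X'` and `'label'`.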


### Creating an iterator

Once you have built a Dataset to represent your input data, the next step is to create an Iterator to access elements from that dataset.

The Dataset API (tf.contrib.data in the TensorFlow version used here, tf.data in later releases) currently supports the following iterators, in increasing order of sophistication:
* one-shot
* initializable
* reinitializable 
* feedable

#### One-shot iterator

A one-shot iterator is the simplest form of iterator: it supports only a single pass through a dataset and needs no explicit initialization.

In [7]:
dataset = tf.contrib.data.Dataset.range(10)
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()
for i in range(10):
    value = sess.run(next_element)  # returns the current value and advances the iterator
    print('{} ---- value:\t {}'.format(i, value))
    assert i == value

0 ---- value:	 0
1 ---- value:	 1
2 ---- value:	 2
3 ---- value:	 3
4 ---- value:	 4
5 ---- value:	 5
6 ---- value:	 6
7 ---- value:	 7
8 ---- value:	 8
9 ---- value:	 9
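
The loop above stops exactly at the last element. In TensorFlow, one more `sess.run(next_element)` on an exhausted iterator raises `tf.errors.OutOfRangeError`. The sketch below is a plain-Python analogy of that behaviour, with `StopIteration` standing in for the TensorFlow error; the function name `one_shot` is made up for illustration.

```python
def one_shot(n):
    """Plain-Python stand-in for a one-shot iterator over range(n)."""
    return iter(range(n))


it = one_shot(3)
values = [next(it) for _ in range(3)]  # consume every element once

try:
    next(it)                 # one step past the end
    exhausted = False
except StopIteration:        # TensorFlow raises tf.errors.OutOfRangeError here
    exhausted = True

print(values, exhausted)
```

Like a one-shot iterator, the Python iterator cannot be rewound; a second pass needs a new iterator.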


#### Initializable iterator

An initializable iterator requires you to run an explicit iterator.initializer operation before using it. In exchange for this inconvenience, it enables you to parameterize the definition of the dataset, using one or more tf.placeholder() tensors that can be fed when you initialize the iterator. 

In [8]:
max_value = tf.placeholder(tf.int64, shape=[])
dataset = tf.contrib.data.Dataset.range(max_value)

iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()

# Initialize an iterator over a dataset with 10 elements.
sess.run(iterator.initializer, feed_dict={max_value: 10})
for i in range(10):
    value = sess.run(next_element)
    print('{} ---- value:\t {}'.format(i, value))
    assert i == value

0 ---- value:	 0
1 ---- value:	 1
2 ---- value:	 2
3 ---- value:	 3
4 ---- value:	 4
5 ---- value:	 5
6 ---- value:	 6
7 ---- value:	 7
8 ---- value:	 8
9 ---- value:	 9


In [9]:
# Initialize the same iterator over a dataset with 15 elements.
sess.run(iterator.initializer, feed_dict={max_value: 15})
for i in range(15):
    value = sess.run(next_element)
    print('{} ---- value:\t {}'.format(i, value))
    assert i == value

0 ---- value:	 0
1 ---- value:	 1
2 ---- value:	 2
3 ---- value:	 3
4 ---- value:	 4
5 ---- value:	 5
6 ---- value:	 6
7 ---- value:	 7
8 ---- value:	 8
9 ---- value:	 9
10 ---- value:	 10
11 ---- value:	 11
12 ---- value:	 12
13 ---- value:	 13
14 ---- value:	 14
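
The two runs above reuse one iterator with different values of max_value. A minimal plain-Python model of that contract, assuming a hypothetical class `InitializableRange` (not a TensorFlow type), looks like this: the iterator has no data until it is initialized, and each initialization can supply a new parameter.

```python
class InitializableRange:
    """Plain-Python stand-in for an initializable iterator over range(max_value)."""

    def __init__(self):
        self._it = None  # no data source until initialize() is called

    def initialize(self, max_value):
        # Plays the role of running iterator.initializer with a feed_dict.
        self._it = iter(range(max_value))

    def get_next(self):
        if self._it is None:
            raise RuntimeError("iterator has not been initialized")
        return next(self._it)


iterator = InitializableRange()
iterator.initialize(10)
first_pass = [iterator.get_next() for _ in range(10)]

iterator.initialize(15)  # same iterator object, different parameter
second_pass = [iterator.get_next() for _ in range(15)]
print(len(first_pass), len(second_pass))
```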


#### Reinitializable iterator

A reinitializable iterator can be initialized from multiple different Dataset objects, provided they share the same structure (element types and shapes).

In [22]:
n_training = 20
n_test = 10

In [26]:
# Define training and validation datasets with the same structure.
training_dataset = tf.contrib.data.Dataset.range(n_training).map(
    lambda x: x + tf.random_uniform([], -10, 10, tf.int64))

print(training_dataset)

<MapDataset shapes: (), types: tf.int64>


In [27]:
validation_dataset = tf.contrib.data.Dataset.range(n_test)
print(validation_dataset)

<RangeDataset shapes: (), types: tf.int64>


In [37]:
# A reinitializable iterator is defined by its structure. 
# We could use the `output_types` and `output_shapes` properties of either `training_dataset`
# or `validation_dataset` here, because they are compatible.
iterator = tf.contrib.data.Iterator.from_structure(training_dataset.output_types,
                                           training_dataset.output_shapes)
next_element = iterator.get_next()

training_init_op = iterator.make_initializer(training_dataset)
validation_init_op = iterator.make_initializer(validation_dataset)

# Run 20 epochs in which the training dataset is traversed, followed by the
# validation dataset.
for epoch in range(20):
    # Initialize an iterator over the training dataset.
    print('== Epoch {} =='.format(epoch))
    sess.run(training_init_op)
    for i in range(n_training):
        a = sess.run(next_element)
        print('training\t{}\t{}'.format(i,a))

    # Initialize an iterator over the validation dataset.
    sess.run(validation_init_op)
    for i in range(n_test):
        a = sess.run(next_element)
        print('validation\t{}\t{}'.format(i,a))

== Epoch 0 ==
training	0	-5
training	1	4
training	2	-5
training	3	12
training	4	7
training	5	3
training	6	15
training	7	7
training	8	0
training	9	0
training	10	19
training	11	3
training	12	12
training	13	13
training	14	11
training	15	20
training	16	15
training	17	26
training	18	11
training	19	25
validation	0	0
validation	1	1
validation	2	2
validation	3	3
validation	4	4
validation	5	5
validation	6	6
validation	7	7
validation	8	8
validation	9	9
== Epoch 1 ==
training	0	-3
training	1	-1
training	2	10
training	3	-3
training	4	-1
training	5	2
training	6	4
training	7	4
training	8	10
training	9	11
training	10	12
training	11	4
training	12	5
training	13	20
training	14	20
training	15	20
training	16	14
training	17	9
training	18	22
training	19	22
validation	0	0
validation	1	1
validation	2	2
validation	3	3
validation	4	4
validation	5	5
validation	6	6
validation	7	7
validation	8	8
validation	9	9
== Epoch 2 ==
training	0	5
training	1	8
training	2	-7
training	3	2
training	4	3
training	5	3
training	6	1
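
The epoch loop above switches one iterator between two sources. The following plain-Python sketch models that pattern under simplifying assumptions: the class `ReinitializableIterator` is hypothetical, the "structure" is reduced to a single Python type, and the check happens when an element is read rather than when the initializer is built. Each initializer re-creates its source, which is why the perturbed training values differ from epoch to epoch while the validation values stay fixed.

```python
import random


class ReinitializableIterator:
    """Plain-Python model: one iterator that can be re-pointed at different sources."""

    def __init__(self, element_type):
        self._element_type = element_type  # simplified stand-in for types/shapes
        self._it = None

    def make_initializer(self, make_source):
        # Return a callable that, when run, restarts iteration over a fresh source.
        def init():
            self._it = iter(make_source())
        return init

    def get_next(self):
        if self._it is None:
            raise RuntimeError("run an initializer first")
        value = next(self._it)
        if not isinstance(value, self._element_type):
            raise TypeError("element does not match the iterator's declared type")
        return value


n_training, n_validation = 20, 10
iterator = ReinitializableIterator(int)
training_init = iterator.make_initializer(
    lambda: (x + random.randrange(-10, 10) for x in range(n_training)))
validation_init = iterator.make_initializer(lambda: range(n_validation))

history = []
for epoch in range(2):
    training_init()
    train = [iterator.get_next() for _ in range(n_training)]
    validation_init()
    valid = [iterator.get_next() for _ in range(n_validation)]
    history.append((train, valid))
```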

https://www.tensorflow.org/guide/datasets

https://github.com/Andreea-G/tensorflow_examples/blob/master/rnn_model.py