# TensorFlow - Dataset API

---



In [0]:
import tensorflow as tf
import numpy as np

## Creating a TensorFlow Dataset from existing tensors
Create a dataset from a **list**, a **NumPy array** or a **tensor** using `tf.data.Dataset.from_tensor_slices`. Each element of the resulting dataset is one slice of the input along its first dimension.

In [3]:
# From a Python list
a = [1, 2, 3]
dataset_a = tf.data.Dataset.from_tensor_slices(a)
print(dataset_a)

# From a NumPy array (the integer dtype follows NumPy's platform default)
b = np.array([4, 5, 6])
dataset_b = tf.data.Dataset.from_tensor_slices(b)
print(dataset_b)

# From a TensorFlow tensor
c = tf.constant([7, 8, 9])
dataset_c = tf.data.Dataset.from_tensor_slices(c)
print(dataset_c)

<TensorSliceDataset shapes: (), types: tf.int32>
<TensorSliceDataset shapes: (), types: tf.int64>
<TensorSliceDataset shapes: (), types: tf.int32>
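
The inputs above are one-dimensional, so every element is a scalar. As a minimal sketch of what happens with a two-dimensional input, the example below slices a small array (the name `matrix` and its values are made up for illustration) and inspects the result with `element_spec`. Each row becomes one dataset element.

```python
import numpy as np
import tensorflow as tf

# Illustrative 2-D input: 3 rows, 2 columns (values are arbitrary).
matrix = np.array([[1, 2], [3, 4], [5, 6]], dtype=np.int32)
dataset_matrix = tf.data.Dataset.from_tensor_slices(matrix)

# The input is sliced along its first dimension, so each element is one row.
print(dataset_matrix.element_spec)

rows = [row.numpy().tolist() for row in dataset_matrix]
print(rows)
```

The dataset therefore has as many elements as the input has rows, and each element has the shape of a single row.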


Iterate through a dataset **entry by entry** using a `for ... in` loop.

In [6]:
a = [1, 2, 3]
dataset_a = tf.data.Dataset.from_tensor_slices(a)

for pos, item in enumerate(dataset_a):
  print('item {}'.format(pos), item)

item 0 tf.Tensor(1, shape=(), dtype=int32)
item 1 tf.Tensor(2, shape=(), dtype=int32)
item 2 tf.Tensor(3, shape=(), dtype=int32)
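
Each item yielded by the loop is a `tf.Tensor`. As a small sketch, assuming eager execution (the TensorFlow 2 default), the example below converts the items to plain Python integers with `.numpy()` and uses `take` to look at only the first few entries. The variable names are illustrative.

```python
import tensorflow as tf

dataset_a = tf.data.Dataset.from_tensor_slices([1, 2, 3])

# take(2) yields a new dataset holding only the first two elements.
first_two = [int(item.numpy()) for item in dataset_a.take(2)]
print(first_two)

# Iterating over the full dataset and unwrapping each tensor.
all_items = [int(item.numpy()) for item in dataset_a]
print(all_items)
```

`take` does not modify `dataset_a`, so the second loop still sees all three entries.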


Create **batches** from a dataset using `batch`.

In [5]:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
dataset_a = tf.data.Dataset.from_tensor_slices(a)

dataset_batch = dataset_a.batch(batch_size=3)

for pos, batch in enumerate(dataset_batch):
  print('batch {}:'.format(pos), batch)



batch 0: tf.Tensor([1 2 3], shape=(3,), dtype=int32)
batch 1: tf.Tensor([4 5 6], shape=(3,), dtype=int32)
batch 2: tf.Tensor([7 8 9], shape=(3,), dtype=int32)
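
In the example above the dataset size is a multiple of the batch size. As a hedged sketch of the other case, the snippet below uses ten elements (an illustrative choice) and compares the default behaviour with `drop_remainder=True`, which discards an incomplete final batch.

```python
import tensorflow as tf

# Ten elements do not divide evenly into batches of three.
dataset_a = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Default: the last batch contains the single leftover element.
batches_all = [batch.numpy().tolist() for batch in dataset_a.batch(3)]
print(batches_all)

# drop_remainder=True: the incomplete last batch is dropped.
batches_full = [
  batch.numpy().tolist()
  for batch in dataset_a.batch(3, drop_remainder=True)
]
print(batches_full)
```

Dropping the remainder can be useful when later steps expect every batch to have the same shape.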


## Combining two tensors into a joint dataset
Create a **joint dataset**, in which the elements of two tensors are paired one-to-one, using either `tf.data.Dataset.zip` or `tf.data.Dataset.from_tensor_slices`. With `from_tensor_slices`, both tensors must have the same size along their first dimension.


In [20]:
# First create two separate datasets, then join them (zip)
tensor_a = tf.random.uniform(shape=(4, 3), minval=0, maxval=1, dtype=tf.float64)
dataset_a = tf.data.Dataset.from_tensor_slices(tensor_a)

tensor_b = tf.random.uniform(shape=(4, ), minval=0, maxval=4, dtype=tf.int64)
dataset_b = tf.data.Dataset.from_tensor_slices(tensor_b)

dataset_c = tf.data.Dataset.zip((dataset_a, dataset_b))

for item in dataset_c:
  print('x:', item[0].numpy(), 'y:', item[1].numpy())

x: [0.32045744 0.33842543 0.91074693] y: 1
x: [0.89966699 0.30803455 0.49515623] y: 2
x: [0.30847655 0.66568763 0.14313214] y: 2
x: [0.9777679  0.60444822 0.65572032] y: 3


In [21]:
# Directly create a joint dataset (from_tensor_slices)
tensor_a = tf.random.uniform(shape=(4, 3), minval=0, maxval=1, dtype=tf.float64)
tensor_b = tf.random.uniform(shape=(4, ), minval=0, maxval=4, dtype=tf.int64)

dataset_c = tf.data.Dataset.from_tensor_slices((tensor_a, tensor_b))

for item in dataset_c:
  print('x:', item[0].numpy(), 'y:', item[1].numpy())

x: [0.38623001 0.77751202 0.44822608] y: 2
x: [0.06361701 0.0336372  0.23347828] y: 2
x: [0.82468352 0.07897202 0.57829134] y: 0
x: [0.32838318 0.55132781 0.26043303] y: 2
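
One reason to build a joint dataset is that later transformations act on whole pairs. The sketch below illustrates this with `shuffle`, which is not used elsewhere in this section. To make the pairing easy to check, the features and labels are deterministic, made-up values in which each label equals the first feature of its row.

```python
import numpy as np
import tensorflow as tf

# Illustrative data: the label of each row equals its first feature value.
features = np.arange(12, dtype=np.float64).reshape(4, 3)
labels = features[:, 0].astype(np.int64)  # 0, 3, 6, 9

dataset_joint = tf.data.Dataset.from_tensor_slices((features, labels))

# Shuffling reorders the (x, y) pairs as units.
dataset_shuffled = dataset_joint.shuffle(buffer_size=4, seed=1)

pairs = [(x.numpy(), int(y.numpy())) for x, y in dataset_shuffled]
for x, y in pairs:
  print('x:', x, 'y:', y)
```

Whatever order the shuffle produces, every `y` still matches the first entry of its `x`, which would not be guaranteed if the two datasets were shuffled separately.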


Apply **feature scaling** with `map` to rescale the feature values to the range [-1, +1].
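
As a minimal sketch of this idea, assuming the features lie in [0, 1] as in the joint datasets above, the transformation `x * 2 - 1` maps them to [-1, +1] while the labels pass through unchanged. The small tensors below are illustrative values chosen so the result is easy to verify by hand.

```python
import tensorflow as tf

# Illustrative features in [0, 1] and integer labels.
features = tf.constant([[0.0, 0.5, 1.0], [0.25, 0.75, 0.125]], dtype=tf.float64)
labels = tf.constant([0, 1], dtype=tf.int64)
dataset_joint = tf.data.Dataset.from_tensor_slices((features, labels))

# map applies the function to every (x, y) pair; only x is rescaled.
dataset_scaled = dataset_joint.map(lambda x, y: (x * 2 - 1.0, y))

scaled = [(x.numpy().tolist(), int(y.numpy())) for x, y in dataset_scaled]
for x, y in scaled:
  print('x:', x, 'y:', y)
```

`map` returns a new dataset, so `dataset_joint` itself keeps the original, unscaled values.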

