
# TensorFlow Input Pipeline

The `tf.data` API provides a quick way to build complex input pipelines from simple, reusable pieces in just a few lines of code. It can also handle large amounts of data by streaming elements through the pipeline instead of requiring everything to be processed at once, which helps on low-end machines.

It does this by wrapping the data in a `tf.data.Dataset` object and applying a series of operations to it, commonly described as ETL: Extract, Transform, Load.
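As a rough mental model of the three ETL stages, the plain-Python sketch below uses generators so that each record flows through the pipeline one at a time. It is only an analogy and not how `tf.data` is implemented; the helper names `extract`, `transform`, and `load` are hypothetical.

```python
from itertools import islice

def extract(source):
    """Extract: yield raw records one at a time."""
    for record in source:
        yield record

def transform(records, scale):
    """Transform: drop non-positive values, then scale the rest."""
    for record in records:
        if record > 0:
            yield record / scale

def load(records, batch_size):
    """Load: group records into lists of at most batch_size items."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

raw = [19, 3, 90, 34, 67, 89, 23, -1, 678]
batches = list(load(transform(extract(raw), scale=678), batch_size=2))
print(batches)
```

Because every stage is a generator, nothing is computed until `list(...)` pulls batches from the end of the chain, which is the same pull-based idea that lets a dataset pipeline work on data larger than memory.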

In [1]:
import tensorflow as tf

In [2]:
daily_customer_number = [19,3,90,34,67,89,23,-1,678]

In [3]:
daily_customer_number

[19, 3, 90, 34, 67, 89, 23, -1, 678]

In [5]:
tf_dataset = tf.data.Dataset.from_tensor_slices(daily_customer_number)

In [6]:
tf_dataset

<TensorSliceDataset element_spec=TensorSpec(shape=(), dtype=tf.int32, name=None)>

In [7]:
# Print each element as a tf.Tensor
for customer in tf_dataset:
  print(customer)

tf.Tensor(19, shape=(), dtype=int32)
tf.Tensor(3, shape=(), dtype=int32)
tf.Tensor(90, shape=(), dtype=int32)
tf.Tensor(34, shape=(), dtype=int32)
tf.Tensor(67, shape=(), dtype=int32)
tf.Tensor(89, shape=(), dtype=int32)
tf.Tensor(23, shape=(), dtype=int32)
tf.Tensor(-1, shape=(), dtype=int32)
tf.Tensor(678, shape=(), dtype=int32)


In [8]:
# Convert each tensor to a NumPy value with .numpy()
for customer in tf_dataset:
  print(customer.numpy())

19
3
90
34
67
89
23
-1
678


In [9]:
# Iterate over NumPy values directly with as_numpy_iterator()
for customer in tf_dataset.as_numpy_iterator():
  print(customer)

19
3
90
34
67
89
23
-1
678


In [10]:
# Print the first 3 elements
for customer in tf_dataset.take(3):
  print(customer.numpy())

19
3
90


In [14]:
# Keep only positive values (drops the invalid -1 entry)
tf_filtered_data = tf_dataset.filter(lambda x: x>0)
for customer in tf_filtered_data.as_numpy_iterator():
  print(customer)

Instructions for updating:
Lambda fuctions will be no more assumed to be used in the statement where they are used, or at least in the same block. https://github.com/tensorflow/tensorflow/issues/56089


19
3
90
34
67
89
23
678


In [19]:
# Normalize by the largest value in the data (678)
tf_normalized_data = tf_dataset.map(lambda x: x/678)
for customer in tf_normalized_data.as_numpy_iterator():
  print(customer)

0.028023598820058997
0.004424778761061947
0.13274336283185842
0.05014749262536873
0.09882005899705015
0.13126843657817108
0.03392330383480826
-0.0014749262536873156
1.0
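The normalization above divides by a hard-coded 678, which only works while 678 remains the largest value. A minimal sketch, assuming the raw list is small enough to scan in Python first, derives the scale from the data instead; the names `max_value` and `values` are illustrative.

```python
import tensorflow as tf

daily_customer_number = [19, 3, 90, 34, 67, 89, 23, -1, 678]
dataset = tf.data.Dataset.from_tensor_slices(daily_customer_number)

# Derive the scale from the data rather than hard-coding it
max_value = max(daily_customer_number)

# Dividing an int32 tensor with "/" performs true division and yields floats
normalized = dataset.map(lambda x: x / max_value)
values = list(normalized.as_numpy_iterator())
print(values)
```

Note that the invalid -1 entry still produces a negative value here, so filtering before normalizing is usually the safer order.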


In [20]:
# Multiply each element by 78
tf_multiply_data = tf_dataset.map(lambda x: x*78)
for customer in tf_multiply_data.as_numpy_iterator():
  print(customer)

1482
234
7020
2652
5226
6942
1794
-78
52884


In [21]:
# Shuffle elements using a buffer of size 2
tf_shuffle_data = tf_dataset.shuffle(2)
for customer in tf_shuffle_data.as_numpy_iterator():
  print(customer)

3
90
34
19
89
23
-1
67
678


In [22]:
# Shuffle elements using a buffer of size 4
tf_shuffle_data = tf_dataset.shuffle(4)
for customer in tf_shuffle_data.as_numpy_iterator():
  print(customer)

3
90
19
34
89
23
678
67
-1
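The two shuffle outputs stay close to the original order because `shuffle(buffer_size)` only samples from a small buffer of upcoming elements. The plain-Python sketch below models that documented behaviour (fill a buffer, emit a random element from it, refill from the input); it is an illustration under that assumption, not TensorFlow's actual implementation, and `buffered_shuffle` is a hypothetical name.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Model of buffer-based shuffling: sample from a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    output = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)  # still filling the buffer
            continue
        index = rng.randrange(len(buffer))
        output.append(buffer[index])  # emit a random buffered element
        buffer[index] = item          # replace it with the next input element
    while buffer:
        # Input exhausted: drain what is left in random order
        output.append(buffer.pop(rng.randrange(len(buffer))))
    return output

data = [19, 3, 90, 34, 67, 89, 23, -1, 678]
print(buffered_shuffle(data, buffer_size=2, seed=0))
print(buffered_shuffle(data, buffer_size=len(data), seed=0))
```

In this model an element can move at most `buffer_size - 1` positions toward the front, so a buffer at least as large as the dataset is needed for a uniform shuffle.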


In [24]:
# Group elements into batches of 2
for customer in tf_dataset.batch(2):
  print(customer.numpy())

[19  3]
[90 34]
[67 89]
[23 -1]
[678]
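The last batch above holds a single element because nine items do not divide evenly by two. When every batch must have the same shape, `batch` accepts a `drop_remainder` argument; a minimal sketch, with `full_batches` as an illustrative name:

```python
import tensorflow as tf

daily_customer_number = [19, 3, 90, 34, 67, 89, 23, -1, 678]
dataset = tf.data.Dataset.from_tensor_slices(daily_customer_number)

# drop_remainder=True discards a final batch that has fewer than 2 elements
full_batches = [
    batch.tolist()
    for batch in dataset.batch(2, drop_remainder=True).as_numpy_iterator()
]
print(full_batches)  # → [[19, 3], [90, 34], [67, 89], [23, -1]]
```

The trade-off is that the trailing 678 is silently dropped, so this option suits cases where fixed batch shapes matter more than seeing every element.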


In [28]:
# ETL in a single chained expression: filter, normalize, shuffle, batch
tf_dataset = tf.data.Dataset.from_tensor_slices(daily_customer_number)
tf_data_Dataset = tf_dataset.filter(lambda x: x > 0).map(lambda y: y / 678).shuffle(2).batch(2)
for customer in tf_data_Dataset.as_numpy_iterator():
  print(customer)

[0.00442478 0.13274336]
[0.0280236  0.05014749]
[0.09882006 0.13126844]
[1.        0.0339233]
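To reuse the same chain on other data, the steps can be wrapped in a function. The sketch below is one possible arrangement rather than a prescribed pattern; `build_pipeline` and its parameters are hypothetical names, and the `seed` argument is passed to `shuffle` only to make runs repeatable.

```python
import tensorflow as tf

def build_pipeline(values, scale, buffer_size=2, batch_size=2, seed=None):
    """Hypothetical helper: filter, normalize, shuffle, and batch a list of counts."""
    return (
        tf.data.Dataset.from_tensor_slices(values)
        .filter(lambda x: x > 0)          # drop invalid, non-positive entries
        .map(lambda x: x / scale)         # scale values by the given constant
        .shuffle(buffer_size, seed=seed)  # small-buffer shuffle
        .batch(batch_size)                # group into batches
    )

daily_customer_number = [19, 3, 90, 34, 67, 89, 23, -1, 678]
pipeline = build_pipeline(daily_customer_number, scale=678, seed=42)
batches = list(pipeline.as_numpy_iterator())
for batch in batches:
    print(batch)
```

The batch contents vary with the shuffle seed, but the same eight normalized values always appear across the four batches.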
