# Build TensorFlow Input Pipelines

There are two distinct ways to create a dataset:
* A data source constructs a `Dataset` from data stored in memory or in one or more files.
* A data transformation constructs a dataset from one or more `tf.data.Dataset` objects.

The `tf.data` API introduces a `tf.data.Dataset` abstraction that represents a sequence of elements, in which each element consists of one or more components. For example, in an image pipeline, an element might be a single training example, with a pair of tensor components representing the image and its label.
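As a rough illustration of elements and components, the sketch below builds a dataset of (image, label) pairs and inspects its `element_spec`. The arrays `images` and `labels` are hypothetical stand-in data (all-zero 28x28 arrays), not part of any real image pipeline.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: four 28x28 "images" and one integer label each.
images = np.zeros((4, 28, 28), dtype=np.float32)
labels = np.array([0, 1, 2, 3], dtype=np.int64)

# Each element of the dataset is an (image, label) pair: two tensor components.
dataset = tf.data.Dataset.from_tensor_slices((images, labels))

# element_spec describes the shape and dtype of every component of an element.
image_spec, label_spec = dataset.element_spec
print(image_spec)
print(label_spec)

# Iterating yields one element (a tuple of components) at a time.
first_image, first_label = next(iter(dataset))
print(first_image.shape, first_label.numpy())
```

Here the dataset has four elements, and each element carries an image component of shape `(28, 28)` and a scalar label component.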

In [3]:
import os
import pathlib
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib inline

np.set_printoptions(precision=4)

## Basic mechanics

Start with a data source:
* from data in memory, use `tf.data.Dataset.from_tensors()` or `tf.data.Dataset.from_tensor_slices()`
* from data stored in a file in TFRecord format, use `tf.data.TFRecordDataset`
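The data sources listed above can be sketched as follows. This is a minimal example under stated assumptions: the nested list `data`, the file name `example.tfrecord`, and the raw byte records are all made up for illustration; real TFRecord files usually hold serialized `tf.train.Example` messages.

```python
import os
import tempfile

import tensorflow as tf

data = [[1, 2], [3, 4], [5, 6]]  # hypothetical in-memory data

# from_tensors wraps the whole input as ONE element of shape (3, 2).
whole = tf.data.Dataset.from_tensors(data)

# from_tensor_slices slices along the first axis: THREE elements of shape (2,).
sliced = tf.data.Dataset.from_tensor_slices(data)

whole_elements = [e.numpy().tolist() for e in whole]
sliced_elements = [e.numpy().tolist() for e in sliced]
print(len(whole_elements), len(sliced_elements))

# Write a tiny TFRecord file to a temporary directory, then read it back.
# The file name and record contents are illustrative only.
path = os.path.join(tempfile.mkdtemp(), "example.tfrecord")
with tf.io.TFRecordWriter(path) as writer:
    writer.write(b"first record")
    writer.write(b"second record")

# TFRecordDataset yields each record as a scalar string tensor.
records = [r.numpy() for r in tf.data.TFRecordDataset([path])]
print(records)
```

The practical difference between the two in-memory constructors is whether the first axis of the input is treated as the element axis; `from_tensor_slices` is usually what is wanted for a collection of examples.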

Once we have a `Dataset` object, we can transform it into a new `Dataset` by chaining method calls on the `tf.data.Dataset` object:
* per-element transformation: `Dataset.map()`
* multi-element transformation: `Dataset.batch()`
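The two kinds of transformation above can be chained as in the sketch below. The source dataset (`Dataset.range(6)`) and the squaring function are arbitrary choices for illustration.

```python
import tensorflow as tf

# A small source dataset with elements 0, 1, 2, 3, 4, 5.
dataset = tf.data.Dataset.range(6)

# Per-element transformation: map applies the function to each element.
squared = dataset.map(lambda x: x * x)

# Multi-element transformation: batch groups consecutive elements together.
# With 6 elements and a batch size of 4, the last batch holds only 2 elements.
batched = squared.batch(4)

batches = [b.numpy().tolist() for b in batched]
print(batches)  # [[0, 1, 4, 9], [16, 25]]
```

Each call returns a new `Dataset` and leaves the original untouched, so `dataset`, `squared`, and `batched` can all be iterated independently.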