# Building a Data Pipeline with DataLoader
This tutorial shows how to load data with DataLoader, a tool for handling multiple datasets. It covers:
- An introduction to `Dataset` and `Loader`
- Constructing a data iterator

`Dataset` is an object that searches for data files in one or more given directories.
It has three APIs for selecting files:

- `Dataset.include(*args)` selects files using standard glob patterns.
- `Dataset.include_reg(*args)` and `Dataset.exclude(*args)` select or reject files using regular expressions.
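
The difference between the two matching styles can be seen with Python's standard library alone. The sketch below is only an illustration built on `fnmatch` and `re`; the file names are made up, and it does not claim to reproduce how `Dataset` works internally.

```python
import fnmatch
import re

# Hypothetical file names, for illustration only.
files = [
    "set5/hr/baby.png",
    "set5/lr/baby.png",
    "set5/hr/readme.txt",
    "misc/chr_01.png",
]

# Glob-style pattern (the style `include` accepts): shell wildcards.
png_files = fnmatch.filter(files, "*.png")

# Regular expression (the style `include_reg` and `exclude` accept).
# \bhr\b matches "hr" only as a whole word, so "chr_01" is not selected.
hr_pattern = re.compile(r"\bhr\b")
hr_files = [f for f in files if hr_pattern.search(f)]

print(png_files)
print(hr_files)
```

A glob pattern is convenient for filtering by extension, while a regular expression with word boundaries is a better fit for picking out sub-folders such as `hr` and `lr`.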

`Loader` is an object constructed from `Dataset` objects.

If `Loader` is constructed from `hr_data` only, it generates the **LR** data by bicubic downscaling of the HR data.
If `Loader` is constructed from both `hr_data` and `lr_data`, it pairs the two datasets.

The data iterator is obtained from `Loader.make_one_shot_iterator(...)`, which can be called multiple times.
Each item is a `dict` containing `hr`, `lr` and `name` entries.
The shape of the LR data is specified by `batch_shape`; the HR data has the same shape with its height and width multiplied by `scale`.

`batch_shape` also supports 5-D shapes for videos, such as `[4, 3, 3, 16, 16]` (N, T, C, H, W).
The channel placement is determined by the `DATA_FORMAT` of the backend:
- For **pytorch**, it's `"channels_first"`.
- For **tensorflow**, it's `"channels_last"`.
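
The relation between the LR and HR shapes can be written down as a small helper. This is a minimal sketch, not part of VSR: the name `hr_batch_shape` is hypothetical, and the `"channels_last"` layouts (N, H, W, C) and (N, T, H, W, C) are assumed here rather than taken from the library.

```python
def hr_batch_shape(batch_shape, scale, data_format="channels_first"):
    """Return the HR shape for an LR `batch_shape`; only H and W are scaled."""
    shape = list(batch_shape)
    if len(shape) not in (4, 5):
        raise ValueError("batch_shape must be 4-D (images) or 5-D (videos)")
    if data_format == "channels_first":
        h_axis, w_axis = -2, -1  # (..., C, H, W)
    elif data_format == "channels_last":
        h_axis, w_axis = -3, -2  # (..., H, W, C), assumed layout
    else:
        raise ValueError("unknown data_format: %r" % (data_format,))
    shape[h_axis] *= scale
    shape[w_axis] *= scale
    return tuple(shape)


print(hr_batch_shape([4, 3, 16, 16], scale=4))
print(hr_batch_shape([4, 3, 3, 16, 16], scale=4))
print(hr_batch_shape([4, 16, 16, 3], scale=4, data_format="channels_last"))
```

The batch, time and channel axes are left untouched, so a 16x16 LR patch with `scale=4` corresponds to a 64x64 HR patch regardless of the backend.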

from VSR.DataLoader import Dataset, Loader

data1 = Dataset('../../Tests/data').include('*.png')  # glob all PNG files under ../../Tests/data
data2 = Dataset('../../Tests/data')  # glob all supported files under ../../Tests/data
data2_hr = data2.include_reg(r'\bhr\b')  # files matching the whole word "hr"
data2_lr = data2.include_reg(r'\blr\b')  # files matching the whole word "lr"

loader1 = Loader(hr_data=data1, scale=4, threads=1)  # HR only: LR is generated bicubically
loader2 = Loader(hr_data=data2_hr, lr_data=data2_lr)  # HR and LR datasets are paired

itr1 = loader1.make_one_shot_iterator(batch_shape=[4, 3, 16, 16], steps=2, shuffle=True)
for item in itr1:
    label = item['hr']
    feature = item['lr']
    print(label.shape, feature.shape)
    print(item['name'])

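# Use a single-channel color space for both LR and HR (note the channel size of 1 in batch_shape below)
loader2.set_color_space('lr', 'L')
loader2.set_color_space('hr', 'L')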
itr2 = loader2.make_one_shot_iterator(batch_shape=[4, 1, 16, 16], steps=2, shuffle=True)
for item in itr2:
    label = item['hr']
    feature = item['lr']
    print(label.shape, feature.shape)
    print(item['name'])

(4, 3, 64, 64) (4, 3, 16, 16)
['img_005_SRF_2_LR', 'c_10', 'img1', 'img_005_SRF_2_LR']
(4, 3, 64, 64) (4, 3, 16, 16)
['img0', 'xiuxian003', 'xiuxian003', 'img_001_SRF_2_LR']
(4, 1, 16, 16) (4, 1, 16, 16)
['xiuxian003', 'xiuxian001', 'xiuxian003', 'xiuxian003']
(4, 1, 16, 16) (4, 1, 16, 16)
['xiuxian002', 'xiuxian001', 'xiuxian001', 'xiuxian001']


- [x] Next: [Calling Model with Executor](3.%20Calling%20model%20with%20executor.ipynb)