# The uoimdb Class
The `uoimdb` class provides a convenient abstraction for working with a database of images and loading them into image pipelines. Upon instantiation, it builds a Pandas DataFrame from the list of images matching the file pattern `os.path.join(IMAGE_BASE_DIR, IMAGE_FILE_PATTERN)`. The DataFrame holds every image and its associated timestamp. Because extracting file timestamps can be slow on large datasets, the DataFrame is cached to a pickle file located at `os.path.join(DATA_LOCATION, 'imdb-{md5(abs_file_pattern)}.pkl')`.
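As a rough illustration of how such a cache path could be derived, here is a minimal sketch. The helper name `cache_path`, the example directory `data`, and the pattern `*.jpg` are hypothetical. It also assumes the hash is the hex MD5 digest of the UTF-8 encoded absolute pattern, which the library may compute differently.

```python
import hashlib
import os

def cache_path(data_location, image_base_dir, image_file_pattern):
    """Build a cache filename keyed on the absolute glob pattern (hypothetical helper)."""
    abs_file_pattern = os.path.abspath(
        os.path.join(image_base_dir, image_file_pattern))
    # Assumption: the hash is the hex MD5 digest of the UTF-8 encoded pattern.
    digest = hashlib.md5(abs_file_pattern.encode('utf-8')).hexdigest()
    return os.path.join(data_location, 'imdb-{}.pkl'.format(digest))

# Different base directories or patterns map to different cache files.
print(cache_path('data', '../../images', '*.jpg'))
```

Keying the cache on the absolute pattern means that pointing the database at a different image directory produces a separate cache file rather than reusing a stale one.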

By default, all pipelines are initialized with a bgr2rgb filter and a resize filter. To customize this, pass a callable that initializes pipelines via the `pipeline_init` parameter, or pass `False` to disable initialization. Initialization can also be disabled in the config file with `PIPELINE_INIT: False`. To disable only resizing, set `RESIZE: 1` or `RESIZE: False`.
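The rules above can be summarized as a small decision function. This is only an illustrative sketch, not the library's implementation: `default_steps` is a hypothetical helper, the config is modelled as a plain dict, filters are represented by name strings, and the signature of the custom callable (taking the config) is an assumption.

```python
def default_steps(config, pipeline_init=None):
    """Return names of the filters a new pipeline would start with (illustrative only)."""
    # Initialization is skipped if disabled by argument or by config.
    if pipeline_init is False or config.get('PIPELINE_INIT', True) is False:
        return []
    # Assumption: a custom initializer receives the config and decides the steps.
    if callable(pipeline_init):
        return pipeline_init(config)
    steps = ['bgr2rgb']
    resize = config.get('RESIZE', 1)
    # A RESIZE of 1 or False means "keep the original size".
    if resize and resize != 1:
        steps.append('resize({})'.format(resize))
    return steps

print(default_steps({'RESIZE': 0.5}))
print(default_steps({'RESIZE': False}))
print(default_steps({'PIPELINE_INIT': False}))
```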

In [1]:
import uoimdb as uo

In [2]:
imdb = uo.uoimdb(IMAGE_BASE_DIR='../../images')
print(imdb.df.shape)
imdb.df.head()

ScannerError: while scanning an alias
  in "<byte string>", line 10, column 21:
    IMAGE_FILE_PATTERN: * # the glob file pattern for ge ... 
                        ^
expected alphabetic or numeric character, but found ' '
  in "<byte string>", line 10, column 22:
    IMAGE_FILE_PATTERN: * # the glob file pattern for get ... 
                         ^

## Loading images
The main function of the imdb is to feed images into pipelines. There are a few ways to do this.
```python
imdb.load_around(src, window=(5,5)) # feed in a single src and get images around it
imdb.load_images(srcs) # feed in an iterator of srcs
imdb.feed_images(imgs) # feed in an iterator of images
```
Each of these returns a pipeline with the specified images ready to be loaded. They are equivalent to calling `imdb.pipeline().feed(...)`.
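To make the idea of loading images "around" a source concrete, here is a minimal sketch of the windowing logic on a plain list. The helper `window_around` is hypothetical. It assumes that `window` means (number before, number after) and that the source itself is included; the actual semantics of `load_around` may differ.

```python
def window_around(srcs, src, window=(5, 5)):
    """Return the entries of `srcs` within `window` positions of `src`."""
    before, after = window
    i = srcs.index(src)
    start = max(i - before, 0)  # clamp at the start of the list
    return srcs[start:i + after + 1]

# Hypothetical file names standing in for the database index.
srcs = ['img{:03d}.jpg'.format(i) for i in range(20)]
print(window_around(srcs, 'img010.jpg', window=(2, 2)))
# → ['img008.jpg', 'img009.jpg', 'img010.jpg', 'img011.jpg', 'img012.jpg']
```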

## Pipelines
Pipelines represent chains of operations on a sequence of images. 

In [None]:
import matplotlib.pyplot as plt

srcs = imdb.df.index[:10] # the first 10 images in the database
for im in imdb.load_images(srcs).grey():
    plt.imshow(im, cmap='gray') # show the greyscale output with a grey colormap
    plt.show()

You can create a pipeline and then re-use it on multiple sets of images.

In [None]:
# creates the pipeline
pipeline = imdb.pipeline().grey()

for i in [0, 100, 200]:
    print(i)
    srcs = imdb.df.index[i:i+10]
    for im in pipeline.feed(srcs=srcs):
        plt.imshow(im, cmap='gray')
        plt.show()

Internally, pipes are generators that feed into each other, so they can chain complex image operations while keeping memory usage low.
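The generator-chaining idea can be sketched in a few lines. This is a toy model under stated assumptions, not the library's code: the `Pipeline` class, its `pipe` method, and the `to_grey` and `invert` filters are all hypothetical, and "images" are plain lists of pixel values so the example runs without extra dependencies.

```python
class Pipeline:
    """Toy lazy pipeline: each step wraps the previous generator."""

    def __init__(self, steps=()):
        self.steps = list(steps)

    def pipe(self, func):
        # Return a new pipeline so the original stays reusable.
        return Pipeline(self.steps + [func])

    def feed(self, items):
        stream = iter(items)
        for func in self.steps:
            stream = self._apply(func, stream)
        return stream

    @staticmethod
    def _apply(func, stream):
        # Generator: pulls one item at a time from the upstream step.
        for item in stream:
            yield func(item)

def to_grey(pixels):
    # pixels: list of (r, g, b) tuples; average the channels.
    return [sum(p) // 3 for p in pixels]

def invert(pixels):
    return [255 - v for v in pixels]

# Build once, then feed several batches through the same pipeline.
pipeline = Pipeline().pipe(to_grey).pipe(invert)
batch_a = [[(0, 0, 0), (255, 255, 255)]]
batch_b = [[(30, 60, 90)]]
for batch in (batch_a, batch_b):
    for out in pipeline.feed(batch):
        print(out)
```

Because every step is a generator, only one image moves through the chain at a time, and nothing is computed until the result is iterated.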

### Retrieving values
