# Demonstration of intake-omnisci

This is an [intake](https://intake.readthedocs.io/en/latest/) plugin for OmniSci databases. It allows the user to specify data sources
via human-readable YAML catalogs, and then transparently load them and begin analyzing data.

The catalog used in this notebook defines its sources with a URI pointing to an OmniSci database and SQL expressions.

We begin by loading the [catalog](./catalog.yml) file:

In [None]:
import intake
catalog = intake.open_catalog('catalog.yml')

### Inspecting the catalog

We can interactively inspect the items in the catalog:

In [None]:
list(catalog)

We can also display the individual catalog items to get more information:

In [None]:
catalog.flights

With the catalog loaded, we can read full datasets into memory.

In [None]:
fault_df = catalog.faults.read()

In [None]:
fault_df.head()

### Catalog source

This package also includes an intake source that is itself a catalog.
It generates one data source for each table in a database:

In [None]:
tables = catalog.metis
list(tables)
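
To make the idea of "one data source per table" concrete, here is a minimal sketch that uses only the standard library. It assumes an in-memory SQLite database as a stand-in for OmniSci, and `table_sources` is a hypothetical helper, not part of intake-omnisci. It only illustrates how a catalog can be derived from the list of tables.

```python
import sqlite3

# Stand-in database (assumption: SQLite instead of OmniSci) with two tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (id INTEGER)")
conn.execute("CREATE TABLE faults (id INTEGER)")


def table_sources(conn):
    """Hypothetical helper: map each table name to a query that reads it."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    # One entry per table, analogous to one catalog item per table.
    return {name: f'SELECT * FROM "{name}"' for name in names}


sources = table_sources(conn)
print(list(sources))
```

Listing the keys of `sources` plays the same role as `list(tables)` above: it shows which tables are available without reading any of their rows.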

### Lazy evaluation of expressions

Loading a table into memory is fine for small datasets, but it does not scale to larger ones.
We would like to build queries from an intake source and have them execute lazily.

To accomplish this, we provide a way to get an ibis expression from a source:

In [None]:
ibis_expr = tables.ca_roads_tiger.to_ibis()
ibis_expr.head().execute()
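
The lazy pattern used above can be sketched without any database server. The example below is an illustration only: `LazyQuery` is a hypothetical class, not the ibis API, and SQLite stands in for OmniSci. It shows the general technique, where each method call only records a step and nothing runs until `execute()` is called.

```python
import sqlite3


class LazyQuery:
    """Hypothetical lazy expression: records steps, runs nothing until execute()."""

    def __init__(self, conn, table, where=None, limit=None):
        self.conn = conn
        self.table = table
        self.where = where
        self.limit = limit

    def filter(self, condition):
        # Combine with any earlier condition; no query is sent yet.
        where = f"({self.where}) AND ({condition})" if self.where else condition
        return LazyQuery(self.conn, self.table, where, self.limit)

    def head(self, n=5):
        # Record a row limit; still no query is sent.
        return LazyQuery(self.conn, self.table, self.where, n)

    def compile(self):
        # Turn the recorded steps into a single SQL string.
        sql = f"SELECT * FROM {self.table}"
        if self.where:
            sql += f" WHERE {self.where}"
        if self.limit is not None:
            sql += f" LIMIT {self.limit}"
        return sql

    def execute(self):
        # Only here does the database do any work.
        return self.conn.execute(self.compile()).fetchall()


# Stand-in data (assumption: a made-up "roads" table in SQLite).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roads (id INTEGER, length_km REAL)")
conn.executemany(
    "INSERT INTO roads VALUES (?, ?)", [(i, i * 1.5) for i in range(1000)]
)

expr = LazyQuery(conn, "roads").filter("length_km > 10").head(3)
sql = expr.compile()
rows = expr.execute()
print(sql)
print(rows)
```

Because the filter and the limit are compiled into one query, the database returns only the three requested rows instead of the whole table, which is the benefit the lazy approach is meant to provide for large datasets.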