# Getting Started with OmniSciDB

## OmniSci Database Setup

These examples use a local [OmniSci database](https://www.omnisci.com/platform/omniscidb), so the first step is to get one up and running. There are several [options for installation](https://docs.omnisci.com/installation-and-configuration/installation), but for this tutorial we'll use the Open Source CPU installation with Docker.

In [None]:
# Once you have docker running locally,
# pull the docker image
! docker pull omnisci/core-os-cpu
# create a local storage directory (-p avoids an error if it already exists)
! mkdir -p ${HOME}/omnisci-storage
# run docker container
! docker run -d --name omnisci -p 6274:6274 -v ${HOME}/omnisci-storage:/omnisci-storage omnisci/core-os-cpu

> Note: If you receive an error message saying the container name is already in use, remove the old container using  
`docker rm old_container_id`

The Docker container automatically launched an OmniSci database to which we can now connect.
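
The database may need a few seconds after `docker run` returns before it accepts connections. The following is a minimal sketch for waiting on it, assuming the port mapping used above (6274 on localhost). The helper name `wait_for_port` is hypothetical and is not part of Docker, OmniSci, or Ibis.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful TCP connection only shows that something is listening;
            # it does not prove the database has finished initializing.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port('127.0.0.1', 6274)` before connecting is one way to avoid a connection error on a freshly started container.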

## Connecting to the database using Ibis

We will use Ibis to create and manage our connection to the database. Ibis lets us construct complex data analytics using a Pandas-like API. It converts our analytics methods into a SQL query and pushes the computational burden of that query to the server. In this way, users can query extremely large databases on remote servers without heavy local computation.
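
To make the idea of pushing computation to the server concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for a remote database. This is an assumption made purely for illustration: it involves neither OmniSci nor Ibis, and the `sales` table and its rows are made up. The aggregation runs inside the database engine, and only the small result comes back to the client.

```python
import sqlite3

# In-memory SQLite database standing in for a remote server (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 5.0), ("west", 7.5)],
)

# The GROUP BY is evaluated by the database; the client receives one row per region.
query = "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region"
totals = conn.execute(query).fetchall()
print(totals)  # [('east', 15.0), ('west', 7.5)]
conn.close()
```

With three rows the saving is negligible, but the same pattern means a table with billions of rows can be summarized without ever being copied to the client.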

>**Note**: Unless you're using the conda environment from this repo, you'll also need to install the OmniSci backend for Ibis  
`pip install ibis-framework[omnisci]`  

In [None]:
import ibis
# set up the credentials to the OmniSci db inside of docker
creds = {
    'user': 'admin',
    'password': 'HyperInteractive',
    'host': '127.0.0.1',
    'port': 6274,
    'dbname': 'omnisci'
}
# connect to the database using Ibis
omnisci_client = ibis.omniscidb.connect(user=creds['user'], password=creds['password'],
                                        host=creds['host'], port=creds['port'], database=creds['dbname'])

### Exploring the database using Ibis

Let's use the client to take a look at the database. (For a more in-depth look at Ibis functionality, check out the Ibis tutorials at [...]().)  
We can quickly get a list of the tables available in the database.  

In [None]:
omnisci_client.list_tables()

Now we will create a reference to the 'omnisci_states' table.

In [None]:
states = omnisci_client.table('omnisci_states')
states

You'll notice that when you inspect `states` you see a schema object, not actual results.
Ibis doesn't load the data or perform any computation at this point. Instead, it leaves the data in the
backend defined in the connection, and we _ask_ the backend to perform the computations.

This is valuable when working with big data, where the client side cannot handle the
volume of data until it has been reduced.
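
The deferred behavior described above can be sketched with a toy class. This is not how Ibis is implemented: `LazyTable` and its methods are hypothetical, and the column names `abbr` and `name` are assumptions made for illustration. The point is only that building an expression records what to do, and nothing is computed or sent anywhere until it is explicitly requested.

```python
class LazyTable:
    """Toy deferred expression: records operations, renders SQL only on request."""

    def __init__(self, name, columns=("*",), predicates=()):
        self.name = name
        self.columns = tuple(columns)
        self.predicates = tuple(predicates)

    def select(self, *columns):
        # Returns a new expression; no backend is contacted here.
        return LazyTable(self.name, columns, self.predicates)

    def filter(self, predicate):
        # Adds a condition to the recorded expression, still without executing it.
        return LazyTable(self.name, self.columns, self.predicates + (predicate,))

    def compile(self):
        # Render the recorded operations as a SQL string.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.name}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(self.predicates)
        return sql


# Hypothetical column names, used only to show the shape of the expression.
expr = LazyTable("omnisci_states").select("abbr", "name").filter("abbr = 'TX'")
print(expr.compile())  # SELECT abbr, name FROM omnisci_states WHERE abbr = 'TX'
```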

If you'd prefer for the backend to run the computation immediately, you can set  
`ibis.options.interactive = True`.
  
Let's take a quick look at the information Ibis has for this table without actually pulling the data locally:

In [None]:
# get the table info
display(states.info())

# get the table metadata
display(states.metadata())

Ibis converts our expression into a SQL query. Let's take a look at the SQL it generates.

In [None]:
states.compile()

The table has 52 rows, which is small enough for us to handle locally, so we can go ahead and execute the query. This brings back the requested table as is (we haven't asked the backend to perform any calculations yet).

In [None]:
# execute the query
states_df = states.execute()
print(f'Return Type: {type(states_df)}')
states_df

Now we can immediately continue our data analytics using a Pandas DataFrame, or we can modify our Ibis query to perform some calculations before pulling back data.