# Intake-Postgres Plugin: Basic Usage


### Setup
1. Start a PostgreSQL server. If Docker is installed, an easy way to do this is with the following command:
    ```
    docker run -p 5432:5432 mdillon/postgis:9.6-alpine
    ```
    Wait until the line _"LOG:  database system is ready to accept connections"_ appears.
1. In the same conda environment as this notebook, install `pandas`, `sqlalchemy`, and `psycopg2`. Optionally, also install `postgresql`, which provides the `psql` client used later in this notebook (it is only needed for the client tools, not to run the database server):
    ```
    conda install pandas sqlalchemy psycopg2 postgresql
    ```
1. Install the _intake-postgres_ plugin:
    ```
    conda install -c intake intake-postgres
    ```
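
Instead of watching the server log by hand, readiness can be approximated by polling the port. The following is a minimal sketch using only the standard library. `wait_for_port` is a hypothetical helper name, and an open TCP port only suggests that something is listening, not that PostgreSQL has finished initializing, so the log line remains the more reliable signal.

```python
import socket
import time


def wait_for_port(host="localhost", port=5432, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that something accepts connections on
    the port; it does not speak the PostgreSQL protocol.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the connection is refused
            # or does not complete within `interval` seconds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call (assumes the Docker container above maps port 5432):
# wait_for_port("localhost", 5432, timeout=60.0)
```

A `True` result here is a reasonable cue to move on to the next steps; if the first database call still fails, waiting a few more seconds is usually enough.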

### Basic Usage

First, import the necessary modules:

In [1]:
## For inserting test data
import pandas as pd
from sqlalchemy import create_engine

## For using Intake
from intake.catalog import Catalog

Insert some data into a database table:

In [2]:
engine = create_engine('postgresql://postgres@localhost:5432/postgres')
df = pd.DataFrame({'first_name': ['Joe', 'Cindy', 'Bob'],
                   'last_name': ['Schmoe', 'Sherman', 'Bullock']},
                 index=[1, 2, 3])
df.to_sql('person', engine, if_exists='replace')
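
To see what `to_sql` writes without a running server, the same call can be tried against an in-memory SQLite database. This is only a sketch: SQLite stands in for PostgreSQL here (an assumption made so the snippet runs anywhere), and `conn` and `stored` are illustrative names. It shows that the DataFrame index is stored as an extra column named `index`, which is why that column appears in the query results later on.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for the PostgreSQL server.
conn = sqlite3.connect(":memory:")

df = pd.DataFrame({'first_name': ['Joe', 'Cindy', 'Bob'],
                   'last_name': ['Schmoe', 'Sherman', 'Bullock']},
                  index=[1, 2, 3])

# index=True is the default, so the unnamed index becomes a column called "index".
df.to_sql('person', conn, if_exists='replace')

# Read the table back to check what was stored.
stored = pd.read_sql('select * from person', conn)
print(list(stored.columns))  # ['index', 'first_name', 'last_name']
```

Passing `index=False` to `to_sql` would skip the extra column, at the cost of losing the index values.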

Verify that the data was written by connecting to the database directly with the `psql` command-line tool:

In [3]:
# Verify the data was written
!psql -h localhost -U postgres -c 'select * from person;'

You are connected to database "postgres" as user "postgres" on host "localhost" at port "5432".
Null display is "NULL".
Timing is on.
Target width is 1000000000.
Expanded display is used automatically.
Pager usage is off.
 index | first_name | last_name 
-------+------------+-----------
     1 | Joe        | Schmoe
     2 | Cindy      | Sherman
     3 | Bob        | Bullock
(3 rows)

Time: 0.644 ms


Write out a __basic\_catalog.yml__ file with the appropriate schema:

In [4]:
%%writefile basic_catalog.yml
plugins:
  source:
    - module: intake_postgres

sources:
  - name: people
    driver: postgres
    args:
      uri: 'postgresql://postgres@localhost:5432/postgres'
      sql_expr: 'select * from person'
    
  - name: last_names
    driver: postgres
    args:
      uri: 'postgresql://postgres@localhost:5432/postgres'
      sql_expr: 'select last_name from person'

Overwriting basic_catalog.yml
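
Both sources repeat the same connection URI. When a catalog has many entries, one option is to generate the file text from Python instead of writing it by hand. The sketch below uses only standard-library string formatting. `render_catalog`, `HEADER`, and `ENTRY_TEMPLATE` are hypothetical names, not part of Intake, and the output simply mirrors the layout of the catalog file shown above.

```python
URI = 'postgresql://postgres@localhost:5432/postgres'

HEADER = """plugins:
  source:
    - module: intake_postgres

sources:
"""

ENTRY_TEMPLATE = """  - name: {name}
    driver: postgres
    args:
      uri: '{uri}'
      sql_expr: '{sql_expr}'
"""


def render_catalog(queries, uri=URI):
    """Render catalog YAML text for a mapping of source name -> SQL expression.

    Hypothetical helper: it does no YAML escaping, so names and queries must
    not contain single quotes.
    """
    entries = [ENTRY_TEMPLATE.format(name=name, uri=uri, sql_expr=sql)
               for name, sql in queries.items()]
    return HEADER + "\n".join(entries)


catalog_text = render_catalog({
    'people': 'select * from person',
    'last_names': 'select last_name from person',
})
print(catalog_text)
```

The resulting text can be saved as `basic_catalog.yml` with an ordinary file write, which keeps the URI defined in one place.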


Access the catalog with Intake:

In [5]:
catalog = Catalog('basic_catalog.yml')
catalog

<intake.catalog.base.Catalog at 0x10617dba8>

Inspect the metadata about the sources (optional):

In [6]:
catalog['people'].get().discover()

{'datashape': None,
 'dtype': [('index', dtype('int64')),
  ('first_name', dtype('O')),
  ('last_name', dtype('O'))],
 'npartitions': 1,
 'shape': (3,)}

In [7]:
catalog['last_names'].get().discover()

{'datashape': None,
 'dtype': [('last_name', dtype('O'))],
 'npartitions': 1,
 'shape': (3,)}

Read the data from the sources:

In [8]:
catalog['people'].get().read()

|   | index | first_name | last_name |
|---|-------|------------|-----------|
| 0 | 1     | Joe        | Schmoe    |
| 1 | 2     | Cindy      | Sherman   |
| 2 | 3     | Bob        | Bullock   |
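
The `index` column in this result is the DataFrame index that `to_sql` stored when the test data was inserted; it is not added by Intake. If the original layout is wanted back, it can be restored with `set_index`. The sketch below builds a stand-in DataFrame by hand with the same contents as the output above (an assumption, so that it runs without the catalog or the database); `people` and `restored` are illustrative names.

```python
import pandas as pd

# Stand-in for the result of catalog['people'].get().read(), built by hand.
people = pd.DataFrame({'index': [1, 2, 3],
                       'first_name': ['Joe', 'Cindy', 'Bob'],
                       'last_name': ['Schmoe', 'Sherman', 'Bullock']})

# Promote the stored "index" column back to the DataFrame index.
restored = people.set_index('index')
restored.index.name = None  # the original index was unnamed

print(restored.loc[2, 'first_name'])  # Cindy
```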


In [9]:
catalog['last_names'].get().read()

|   | last_name |
|---|-----------|
| 0 | Schmoe    |
| 1 | Sherman   |
| 2 | Bullock   |
