## Setting up database connectivity

1. Create a file in your home directory containing the database credentials for your target environment. 
For example:
```
~ vim ~/.dslab_user.cred
```
The file should have the following contents, with appropriate values substituted for HOSTNAME, PORT, USER, DATABASE, and PASSWORD:
```
[database_creds]
host: HOSTNAME
port: PORT
user: USER
database: DATABASE
password: PASSWORD
```
2. Set the permissions of this file to `700` (`u+rwx`) so that only you can access it.
```
~ chmod 700 ~/.dslab_user.cred
```
Listing the file should then show output similar to the following:
```
~ ls -l ~/.dslab_user.cred 
-rwx------  1 USER  720748206  93 Jun 29 17:27 $HOME/.dslab_user.cred
```
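
The permission check above can also be done from Python before the credentials are read. The following is a minimal sketch, assuming a POSIX system; the helper name `is_private_to_owner` is hypothetical and not part of the original setup.

```python
import os
import stat


def is_private_to_owner(path):
    """Return True if neither group nor others have any permission bits on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


cred_file = os.path.join(os.path.expanduser('~'), '.dslab_user.cred')

# Only warn when the file exists and is accessible by someone other than the owner.
if os.path.exists(cred_file) and not is_private_to_owner(cred_file):
    print("Warning: {} is accessible by other users; run chmod 700 on it.".format(cred_file))
```

A file with mode `700` passes this check, while a file with mode `644` does not.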

## Creating database connection string

```
import os

import pandas
import pandas.io.sql as psql
import psycopg2

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2

USER_CRED_FILE = os.path.join(os.path.expanduser('~'), '.dslab_user.cred')

def fetchDBCredentials(dbcred_file=USER_CRED_FILE):
    """
       Read database access credentials from dbcred_file
       (default: $HOME/.dslab_user.cred) and return a connection string.
    """
    # Read database credentials from the user-supplied file
    conf = configparser.ConfigParser()
    conf.read(dbcred_file)
    # host, port, user, database, password
    host = conf.get('database_creds', 'host')
    port = conf.get('database_creds', 'port')
    user = conf.get('database_creds', 'user')
    database = conf.get('database_creds', 'database')
    password = conf.get('database_creds', 'password')

    # Build the connection string expected by psycopg2.connect()
    conn_str = "dbname='{database}' user='{user}' host='{host}' port='{port}' password='{password}'".format(
        database=database,
        host=host,
        port=port,
        user=user,
        password=password
    )
    return conn_str
```
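
Building the connection string by hand can break if a value (typically the password) contains a quote character. As a possible alternative, the sketch below returns the credentials as a dictionary that can be passed to `psycopg2.connect()` as keyword arguments. It assumes Python 3; the helper name `fetch_db_params` is hypothetical and not part of the original notebook.

```python
import configparser


def fetch_db_params(dbcred_file):
    """Hypothetical helper: read the [database_creds] section into a dict
    whose keys match the keyword arguments of psycopg2.connect()."""
    # interpolation=None keeps '%' characters in values (e.g. passwords) literal.
    conf = configparser.ConfigParser(interpolation=None)
    # conf.read() returns the list of files it managed to parse.
    if not conf.read(dbcred_file):
        raise IOError("Could not read credentials file: {}".format(dbcred_file))
    section = conf['database_creds']
    return {
        'dbname': section['database'],
        'user': section['user'],
        'host': section['host'],
        'port': section['port'],
        'password': section['password'],
    }


# Usage (requires psycopg2 and a reachable database):
#   import os, psycopg2
#   cred_file = os.path.join(os.path.expanduser('~'), '.dslab_user.cred')
#   conn = psycopg2.connect(**fetch_db_params(cred_file))
```

Unlike a silent `conf.read()`, this version raises an error when the credentials file is missing or unreadable, which tends to be easier to diagnose than a later "no section" error.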

## Test your connection to the database

```
conn = psycopg2.connect(fetchDBCredentials())
df = psql.read_sql("""select random() as x, random() as y from generate_series(1, 10) q;""", conn)
df.head()
```

|   | x | y |
|---|---|---|
| 0 | 0.412527 | 0.181654 |
| 1 | 0.91989 | 0.524397 |
| 2 | 0.79738 | 0.93413 |
| 3 | 0.547616 | 0.706693 |
| 4 | 0.545286 | 0.483169 |


If you see a table like the one above, with an index column, the columns `x` and `y`, and 5 rows, your connection to the database was successful. The values are random, so yours will differ.

## Opening Connections to Multiple Clusters

If you want to open multiple connections (say, one for your GPDB cluster and one for your HAWQ cluster), you can create another file similar to `~/.dslab_user.cred`, populate it with the appropriate credentials, and supply this file as input to the `fetchDBCredentials()` function shown above.

For instance, suppose you created another file, `~/.dslab_user.cred.gpdb`, containing the credentials for your GPDB cluster. You can then open a connection to this cluster with psycopg2 as follows:
```
db_cred_file_gpdb = os.path.join(os.path.expanduser('~'), '.dslab_user.cred.gpdb')
conn_gpdb = psycopg2.connect(fetchDBCredentials(db_cred_file_gpdb))
df = psql.read_sql("""select random() as x, random() as y from generate_series(1, 10) q;""", conn_gpdb)
df.head()
```
In all subsequent instances where you want to query data on GPDB, use `conn_gpdb` in place of `conn` in your code.
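
When more than two clusters are involved, it may be convenient to keep the connections in a dictionary keyed by cluster name. The following is a minimal sketch of that idea; the helper `open_connections` and the cluster names are hypothetical. The credential reader and the connect function are passed in as arguments, so in practice you would supply `fetchDBCredentials` and `psycopg2.connect`.

```python
import os


def open_connections(cred_files, fetch_credentials, connect):
    """Open one connection per named cluster.

    cred_files:        dict mapping a cluster name to its credentials file path
    fetch_credentials: function turning a file path into a connection string
    connect:           function turning a connection string into a connection
    """
    connections = {}
    for name, path in cred_files.items():
        # Expand a leading '~' so paths can be written relative to the home directory.
        full_path = os.path.expanduser(path)
        connections[name] = connect(fetch_credentials(full_path))
    return connections


# Usage (requires psycopg2, fetchDBCredentials, and reachable databases):
#   conns = open_connections(
#       {'default': '~/.dslab_user.cred', 'gpdb': '~/.dslab_user.cred.gpdb'},
#       fetchDBCredentials,
#       psycopg2.connect,
#   )
#   df = psql.read_sql("select 1 as one;", conns['gpdb'])
```

Passing the two functions as arguments keeps the helper independent of any particular driver and makes it easy to exercise without a live database.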