## Connecting to the GeoSpock SQL Access cluster

There are several ways to connect a Jupyter Notebook to your GeoSpock Presto cluster for SQL access. This section gives three examples, each reading a username and password from a local file and taking the host as a parameter. Each example also runs a short query to confirm that results are returned as expected.

 - Using presto-python-client
 - Using sqlalchemy
 - Using ipython-sql

### Setting up and reading credentials

Update the host (and the port, if necessary) to match your deployment, and save your GeoSpock credentials on a single line, as a username and password separated by a space, in a `.geospockcredentials.txt` file in your home directory.

In [64]:
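from os.path import expanduser

# Read "username password" from the first line of the credentials file
home = expanduser("~")
with open(home + "/.geospockcredentials.txt", "r") as file:
    x = file.readlines()

# strip() drops the trailing newline so it does not end up in the password
username, password = x[0].strip().split(" ")
host = "<your deployment host>"
port = 8446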
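
A credentials file can easily pick up a trailing newline or extra whitespace, so it helps to strip the line before splitting it. The following is a minimal sketch of a more tolerant reader; the helper name `read_credentials` and its `path` parameter are hypothetical and not part of any GeoSpock tooling.

```python
from pathlib import Path


def read_credentials(path):
    """Return (username, password) from a file holding 'username password'.

    Hypothetical helper for illustration only.
    """
    first_line = Path(path).read_text().splitlines()[0]
    # split(None, 1) splits on any run of whitespace, at most once,
    # so a password that itself contains spaces is kept whole.
    username, password = first_line.strip().split(None, 1)
    return username, password
```

It could be called as `username, password = read_credentials(Path.home() / ".geospockcredentials.txt")`.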

### Connecting using presto-python-client

To use presto-python-client, first run `pip install presto-python-client` at your command prompt.

In [71]:
import pandas as pd
import prestodb
import prestodb.dbapi as presto

# Open an HTTPS connection using basic authentication
conn = presto.Connection(
    host=host,
    port=port,
    user=username,
    http_scheme='https',
    auth=prestodb.auth.BasicAuthentication(username, password))
cur = conn.cursor()

cur.execute('SELECT * FROM geospock.banks.default LIMIT 10')
result = cur.fetchall()

# Sort the rows by the third column, in descending order, before display
df = pd.DataFrame(sorted(result, key=lambda x: x[2], reverse=True))
display(df)

| | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | Wells Fargo | unknown | Wells Fargo Bank | 34.840997 | -82.363497 |
| 1 | Wells Fargo | unknown | Wells Fargo Bank | 34.851535 | -82.395671 |
| 2 | Suntrust Bank | unknown | SunTrust Bank | 34.840418 | -82.363554 |
| 3 | Suntrust | Suntrust | SunTrust Bank | 34.885805 | -82.353804 |
| 4 | Suntrust | Suntrust | SunTrust Bank | 34.776407 | -82.314119 |
| 5 | Regions Bank Drive Through | unknown | Regions Bank | 34.850946 | -82.397663 |
| 6 | Regions Bank | Regions Bank | Regions Bank | 34.836923 | -82.367071 |
| 7 | Bank of America | unknown | Bank of America | 34.82621 | -82.396032 |
| 8 | Bank of America | unknown | Bank of America | 34.830817 | -82.370332 |
| 9 | Bank of America | Bank of America | Bank of America | 34.881311 | -82.358911 |
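
The result above has numeric column headers (0 to 4) because `fetchall()` returns bare rows. Under the Python DB-API (PEP 249), a cursor exposes column metadata through its `description` attribute, where the first item of each entry is the column name. The sketch below pairs rows with those names. The helper name `rows_to_records` is hypothetical, and the sample values are shortened stand-ins for what a live cursor would return.

```python
def rows_to_records(rows, description):
    """Pair each row with the column names from a DB-API cursor description."""
    # Per PEP 249, the first item of each description entry is the column name.
    columns = [entry[0] for entry in description]
    return [dict(zip(columns, row)) for row in rows]


# Stand-in values shaped like cur.description and cur.fetchall()
description = [("bank_brand", "varchar"), ("latitude", "double"), ("longitude", "double")]
rows = [["Chase", 40.769781, -73.98182]]
records = rows_to_records(rows, description)
```

With a live cursor, something like `rows = cur.fetchall()` followed by `pd.DataFrame(rows_to_records(rows, cur.description))` should give a DataFrame with named columns. Reading `cur.description` after fetching is the safer order, on the assumption that the client only fills it in once results have arrived.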


### Connecting using sqlalchemy

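To use sqlalchemy, first run `pip install sqlalchemy 'pyhive[presto]'` at your command prompt. SQLAlchemy does not ship a Presto dialect itself; the `presto://` URL scheme used below is registered by PyHive.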

In [70]:
import pandas as pd
from sqlalchemy.engine import create_engine

# Build the Presto connection URL; 'protocol': 'https' makes the driver use HTTPS
engine = create_engine(
    'presto://{username}:{password}@{host}:{port}'.format(
        username=username, password=password, host=host, port=port),
    connect_args={'protocol': 'https'})

query = 'select * from geospock.banks.default limit 10'

df = pd.read_sql(query, engine)
df.head()

| | bank_brand | bank_brand_4 | bank_brand_6 | latitude | longitude |
|---|---|---|---|---|---|
| 0 | Chase | unknown | JPMorgan Chase Bank | 40.769781 | -73.98182 |
| 1 | Bank of America | unknown | Bank of America | 40.771105 | -73.981835 |
| 2 | Chase | unknown | JPMorgan Chase Bank | 40.774518 | -73.981135 |
| 3 | Chase | unknown | JPMorgan Chase Bank | 40.777675 | -73.978887 |
| 4 | Chase | unknown | JPMorgan Chase Bank | 40.779807 | -73.976776 |
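
Because the username and password are interpolated directly into a URL, characters such as `@`, `:` or `/` in a password would break the connection string. The sketch below escapes them with the standard library's `urllib.parse.quote_plus`. The function name `build_presto_url` and the sample values are hypothetical.

```python
from urllib.parse import quote_plus


def build_presto_url(username, password, host, port):
    """Build a presto:// URL with the credentials percent-encoded."""
    return "presto://{}:{}@{}:{}".format(
        quote_plus(username), quote_plus(password), host, port
    )


# Placeholder credentials and host, for illustration only
url = build_presto_url("alice", "p@ss:word/1", "example.invalid", 8446)
```

The resulting string can be passed to `create_engine` in place of the hand-formatted URL, keeping `connect_args={'protocol': 'https'}` as before.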


### Connecting using ipython-sql

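To use ipython-sql, first run `pip install ipython-sql` at your command prompt. ipython-sql connects through SQLAlchemy, so a Presto dialect (for example the one provided by PyHive) must also be installed.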

In [72]:
%load_ext sql

%config SqlMagic.autocommit=False



In [73]:
%sql presto://{username}:{password}@{host}:{port}/geospock?protocol=https

%sql select * from geospock.banks.default limit 10

Done.


| bank_brand | bank_brand_4 | bank_brand_6 | latitude | longitude |
|---|---|---|---|---|
| Chase | unknown | JPMorgan Chase Bank | 40.7697814 | -73.9818199 |
| Bank of America | unknown | Bank of America | 40.7711048 | -73.9818347 |
| Chase | unknown | JPMorgan Chase Bank | 40.7745185 | -73.9811352 |
| Chase | unknown | JPMorgan Chase Bank | 40.7776752 | -73.978887 |
| Chase | unknown | JPMorgan Chase Bank | 40.7798071 | -73.9767764 |
| Chase | unknown | JPMorgan Chase Bank | 40.7652782 | -73.9642202 |
| Bank of America | unknown | Bank of America | 40.7651106 | -73.9637106 |
| Chase | unknown | JPMorgan Chase Bank | 40.7427016 | -73.95287643258229 |
| Chase | unknown | JPMorgan Chase Bank | 40.74261155 | -73.91838275696541 |
| Chase | unknown | JPMorgan Chase Bank | 40.7677332 | -73.9563296 |
