## Connecting to Presto

Creating a connection requires three arguments: `host`, `port`, and `user`. Optional arguments such as `source` identify the origin of a query. A common use is to record which service, tool, or piece of code sent the query.

Let's create a connection:

In [2]:
import prestodb.dbapi as presto

conn = presto.Connection(host="presto", port=8080, user="demo")
cur = conn.cursor()
cur

<prestodb.dbapi.Cursor at 0x7fb0d80f95b0>
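
As a minimal sketch of the `source` argument mentioned above, the helpers below collect the connection arguments and tag them with the name of the caller. The helper names `connection_args` and `open_connection` and the value `"semantic-mapping-notebook"` are illustrative assumptions, not part of the client library; only `host`, `port`, `user`, and `source` come from the text above.

```python
def connection_args(source, host="presto", port=8080, user="demo"):
    """Return keyword arguments for presto.Connection, tagged with a source."""
    if not source:
        raise ValueError("source should name the calling service or tool")
    return {"host": host, "port": port, "user": user, "source": source}


def open_connection(source):
    """Open a connection whose queries are attributed to `source`."""
    # Imported lazily so connection_args() can be used without the client installed.
    import prestodb.dbapi as presto

    return presto.Connection(**connection_args(source))


# Hypothetical source name for this notebook.
args = connection_args("semantic-mapping-notebook")
print(args["source"])
```

Calling `open_connection("semantic-mapping-notebook").cursor()` would then replace the connection cell above, assuming the same `presto` host is reachable.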

## Employee Gender Data

The employees database contains information on employee gender. For example:

In [3]:
cur.execute("SELECT emp_no,gender FROM mysql.employees.employees LIMIT 5")
rows = cur.fetchall()

import pandas as pd
from IPython.display import display

df = pd.DataFrame(rows)
display(df)

,0,1
0,10001,M
1,10002,F
2,10003,M
3,10004,M
4,10005,M
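
The output above labels its columns `0` and `1` because the DataFrame is built from bare rows. One possible way to recover readable column names, assuming the cursor exposes the DB-API 2.0 `description` attribute after a query, is sketched below. The helper `column_names` and the sample description are hypothetical and only illustrate the shape of the data.

```python
def column_names(description):
    """Extract column names from a DB-API 2.0 style cursor description."""
    # Each entry is a sequence whose first item is the column name.
    return [column[0] for column in description]


# Hypothetical description shaped like what a cursor might report for the
# query above; the type names here are illustrative only.
example_description = [
    ("emp_no", "integer", None, None, None, None, None),
    ("gender", "varchar", None, None, None, None, None),
]
names = column_names(example_description)
print(names)
```

In the notebook this could be used as `pd.DataFrame(rows, columns=column_names(cur.description))`, provided `cur.description` is populated once the rows have been fetched.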


## Semantic Mapping

Gender is stored in the database as the codes `M` and `F`. A semantic mapping translates these codes to `Male` and `Female`, which may be more meaningful to an operational user.

In [5]:
cur.execute("SELECT emp_no, CASE gender WHEN 'M' THEN 'Male' WHEN 'F' THEN 'Female' END AS sex FROM mysql.employees.employees LIMIT 5")
rows = cur.fetchall()

df = pd.DataFrame(rows)
display(df)

,0,1
0,10001,Male
1,10002,Female
2,10003,Male
3,10004,Male
4,10005,Male
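
When more than one column needs such a mapping, the `CASE` expression can be generated from a dictionary instead of being written by hand. The sketch below is one way to do that; the helpers `sql_literal` and `case_expression` are hypothetical names introduced for illustration, and the column and alias are assumed to be trusted identifiers.

```python
def sql_literal(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def case_expression(column, mapping, alias):
    """Build a simple CASE expression that maps stored codes to labels."""
    # `column` and `alias` are inserted as-is, so they must be trusted identifiers.
    branches = " ".join(
        f"WHEN {sql_literal(code)} THEN {sql_literal(label)}"
        for code, label in mapping.items()
    )
    return f"CASE {column} {branches} END AS {alias}"


gender_labels = {"M": "Male", "F": "Female"}
expr = case_expression("gender", gender_labels, "sex")
query = f"SELECT emp_no, {expr} FROM mysql.employees.employees LIMIT 5"
print(query)
```

The generated `query` matches the hand-written statement above and can be passed to `cur.execute` in the same way.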


## Semantic View

Semantic mappings can be persisted into a view:

In [9]:
cur.execute("CREATE OR REPLACE VIEW hive.default.employee_genders AS SELECT emp_no, CASE gender WHEN 'M' THEN 'Male' WHEN 'F' THEN 'Female' END AS sex FROM mysql.employees.employees")
cur.fetchall()

[[True]]

## Querying Semantic View

The semantic view can now be queried:

In [10]:
cur.execute("SELECT * FROM hive.default.employee_genders LIMIT 5")
rows = cur.fetchall()

df = pd.DataFrame(rows)
display(df)

,0,1
0,10001,Male
1,10002,Female
2,10003,Male
3,10004,Male
4,10005,Male


## Querying Semantic Values

The semantic view can be queried by semantic value. That is, we can find female employees by filtering on `'Female'`, even though the underlying datastore stores the value as `'F'`.

In [12]:
cur.execute("SELECT * FROM hive.default.employee_genders WHERE sex='Female' LIMIT 5")
rows = cur.fetchall()

df = pd.DataFrame(rows)
display(df)

,0,1
0,10002,Female
1,10006,Female
2,10007,Female
3,10009,Female
4,10010,Female
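
The same idea can be tried without a Presto cluster. The sketch below uses Python's built-in `sqlite3` module as a stand-in, with a made-up three-row `employees` table, to show that filtering the view on `'Female'` returns rows whose stored value is `'F'`. SQLite is an assumption made for portability here; it is not part of the setup used above.

```python
import sqlite3

# In-memory database standing in for the MySQL source table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (emp_no INTEGER, gender TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "M"), (2, "F"), (3, "F")],  # made-up rows for illustration
)

# The view applies the same semantic mapping as the Presto view.
conn.execute(
    "CREATE VIEW employee_genders AS "
    "SELECT emp_no, CASE gender WHEN 'M' THEN 'Male' WHEN 'F' THEN 'Female' END AS sex "
    "FROM employees"
)

# Query by semantic value.
females = conn.execute(
    "SELECT emp_no, sex FROM employee_genders WHERE sex = 'Female' ORDER BY emp_no"
).fetchall()

# The underlying table still holds the original code.
stored = conn.execute("SELECT gender FROM employees WHERE emp_no = 2").fetchone()[0]

print(females)
print(stored)
```

Note that a `CASE` expression without an `ELSE` branch yields `NULL` for any code other than `M` or `F`, so such rows would not match a filter on either label.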
