# SQLAlchemy OmniSci


OmniSciDB is an open source SQL engine that its developers describe as the world's fastest. It powers the OmniSci platform and can also accelerate third-party analytic apps.

OmniSciDB optimizes the memory and compute layers for performance and was designed to keep hot data in GPU memory for the fastest possible access. Other GPU database systems store the data in CPU memory and move it to the GPU only at query time, trading the gains from GPU parallelism for transfer overhead over the PCIe bus.

OmniSciDB avoids this transfer inefficiency by caching the most recently touched data in High Bandwidth Memory on the GPU, which offers up to 10x the bandwidth of CPU DRAM and far lower latency. OmniSciDB is also designed to exploit efficient inter-GPU communication infrastructure such as NVIDIA NVLink when available.

SQLAlchemy OmniSci is a dialect that lets you use SQLAlchemy with OmniSciDB.

SQLAlchemy is a powerful library that provides a common API for working with different database systems.


## Installation

You can install sqlalchemy-omnisci using conda or pip:

```bash
$ conda install -y sqlalchemy-omnisci
```

or

```bash
$ pip install sqlalchemy-omnisci
```

## Usage

`sqlalchemy-omnisci` is a `sqlalchemy` dialect, so you don't need to import it directly.
Just import `sqlalchemy` and create a connection using a URL with the following structure:

`omnisci://<user>:<pass>@<host>:<port>/<db>?protocol=<protocol>`
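
As a minimal sketch of how the pieces of that URL fit together, the helper below assembles the string from its parts. `build_omnisci_url` is a hypothetical name introduced here for illustration, not part of `sqlalchemy-omnisci`. It percent-encodes the credentials with the standard library, which is the usual way to keep characters such as `@` in a password from being read as URL delimiters.

```python
from urllib.parse import quote_plus


def build_omnisci_url(user, password, host, port, db, protocol):
    """Assemble an omnisci:// connection URL (hypothetical helper)."""
    # Percent-encode the credentials so characters such as "@" or "/"
    # are not mistaken for URL delimiters.
    return (
        f"omnisci://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{db}?protocol={protocol}"
    )


url = build_omnisci_url(
    "demouser", "HyperInteractive", "metis.mapd.com", 443, "mapd", "https"
)
print(url)
```

The resulting string can be passed to `create_engine` as is.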


```python
import sqlalchemy as sqla
from sqlalchemy import create_engine
import pandas as pd

# Build the engine from a URL with the structure shown above.
engine = create_engine(
    "omnisci://demouser:HyperInteractive@"
    "metis.mapd.com:443/mapd?protocol=https"
)

con = engine.connect()
```

```python
con.dialect.get_table_names(con)
```

```text
['flights_donotmodify',
 'contributions_donotmodify',
 'tweets_nov_feb',
 'zipcodes_orig',
 'zipcodes',
 'demo_vote_clean',
 'us_faults',
 'zipcodes_2017',
 'us_county_level_tiger_edges_2018',
 'ca_roads_tiger',
 'input_node',
 'uk_wells',
 'RentalListings',
 'github',
 'ships_ais',
 'asf_msg_graph']
```
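
SQLAlchemy also offers a dialect-independent way to list tables and columns through its inspector. The sketch below uses an in-memory SQLite engine and two hand-declared tables as stand-ins, so it runs without a server. Whether the OmniSci dialect supports every inspector method is an assumption this README does not confirm; only `get_table_names` is shown above.

```python
import sqlalchemy as sqla

# In-memory SQLite stands in for an OmniSciDB server (assumption for this sketch).
engine = sqla.create_engine("sqlite://")

# Two small hypothetical tables, created only so there is something to inspect.
metadata = sqla.MetaData()
sqla.Table("flights", metadata, sqla.Column("flight_year", sqla.SmallInteger))
sqla.Table("zipcodes", metadata, sqla.Column("zipcode", sqla.String(5)))
metadata.create_all(engine)

# The inspector asks the engine's dialect for schema information.
inspector = sqla.inspect(engine)
table_names = sorted(inspector.get_table_names())
column_names = [col["name"] for col in inspector.get_columns("flights")]

print(table_names)
print(column_names)
```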

```python
metadata = sqla.MetaData()

# Reflect the column definitions from the database
# instead of declaring them by hand.
flights_donotmodify = sqla.Table(
    'flights_donotmodify',
    metadata,
    autoload=True,
    autoload_with=engine
)
flights_donotmodify
```

Output (truncated):

```text
Table('flights_donotmodify', MetaData(), Column('flight_year', SMALLINT(), table=<flights_donotmodify>), Column('flight_month', SMALLINT(), table=<flights_donotmodify>), Column('flight_dayofmonth', SMALLINT(), table=<flights_donotmodify>), Column('flight_dayofweek', SMALLINT(), table=<flights_donotmodify>), Column('deptime', SMALLINT(), table=<flights_donotmodify>), Column('crsdeptime', SMALLINT(), table=<flights_donotmodify>), Column('arrtime', SMALLINT(), table=<flights_donotmodify>), Column('crsarrtime', SMALLINT(), table=<flights_donotmodify>), Column('uniquecarrier', VARCHAR(length=52), table=<flights_donotmodify>), Column('flightnum', SMALLINT(), table=<flights_donotmodify>), Column('tailnum', VARCHAR(length=52), table=<flights_donotmodify>), Column('actualelapsedtime', SMALLINT(), table=<flights_donotmodify>), Column('crselapsedtime', SMALLINT(), table=<flights_donotmodify>), Column('airtime', SMALLINT(), table=<flights_donotmodify>), Column('arrdelay', SMALLINT(), table=<flight
```

```python
query = sqla.select([flights_donotmodify]).limit(1)
str(query.compile())
```

Output (truncated):

```text
'SELECT flights_donotmodify.flight_year, flights_donotmodify.flight_month, flights_donotmodify.flight_dayofmonth, flights_donotmodify.flight_dayofweek, flights_donotmodify.deptime, flights_donotmodify.crsdeptime, flights_donotmodify.arrtime, flights_donotmodify.crsarrtime, flights_donotmodify.uniquecarrier, flights_donotmodify.flightnum, flights_donotmodify.tailnum, flights_donotmodify.actualelapsedtime, flights_donotmodify.crselapsedtime, flights_donotmodify.airtime, flights_donotmodify.arrdelay, flights_donotmodify.depdelay, flights_donotmodify.origin, flights_donotmodify.dest, flights_donotmodify.distance, flights_donotmodify.taxiin, flights_donotmodify.taxiout, flights_donotmodify.cancelled, flights_donotmodify.cancellationcode, flights_donotmodify.diverted, flights_donotmodify.carrierdelay, flights_donotmodify.weatherdelay, flights_donotmodify.nasdelay, flights_donotmodify.securitydelay, flights_donotmodify.lateaircraftdelay, flights_donotmodify.dep_timestamp, flights_donotmodify.
```

```python
results = con.execute(query).fetchall()
results
```

```text
[(2008, 1, 15, 2, 1640, 1652, 1813, 1819, 'YV', 7279, 'N37208', 93, 87, 62, -6, -12, 'IAD', 'CAE', 401, 3, 28, 0, None, 0, None, None, None, None, None, datetime.datetime(2008, 1, 15, 16, 40), datetime.datetime(2008, 1, 15, 18, 13), 'Mesa Airlines', 'Corporation', 'CANADAIR', datetime.date(1998, 2, 9), 'CL-600-2B19', 'Valid', 'Fixed Wing Multi-Engine', 'Turbo-Jet', 1997, 'Washington Dulles International', 'Chantilly', 'VA', 'USA', 38.94453048706055, -77.455810546875, 'Columbia Metropolitan', 'Columbia', 'SC', 'USA', 33.938838958740234, -81.11953735351562, -8622341.0, 4713729.5, -9030186.0, 4020592.75)]
```

```python
df = pd.DataFrame(results)
# Take the column names from the first result row.
df.columns = results[0].keys()
df
```

| | flight_year | flight_month | flight_dayofmonth | flight_dayofweek | deptime | crsdeptime | arrtime | crsarrtime | uniquecarrier | flightnum | ... | dest_name | dest_city | dest_state | dest_country | dest_lat | dest_lon | origin_merc_x | origin_merc_y | dest_merc_x | dest_merc_y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2008 | 1 | 15 | 2 | 1640 | 1652 | 1813 | 1819 | YV | 7279 | ... | Columbia Metropolitan | Columbia | SC | USA | 33.938839 | -81.119537 | -8622341.0 | 4713729.5 | -9030186.0 | 4020592.75 |


### Filtering

```python
query = (
    sqla.select([flights_donotmodify])
    .where(flights_donotmodify.columns.flight_year == 2008)
)
str(query.compile())
```

Output (truncated):

```text
'SELECT flights_donotmodify.flight_year, flights_donotmodify.flight_month, flights_donotmodify.flight_dayofmonth, flights_donotmodify.flight_dayofweek, flights_donotmodify.deptime, flights_donotmodify.crsdeptime, flights_donotmodify.arrtime, flights_donotmodify.crsarrtime, flights_donotmodify.uniquecarrier, flights_donotmodify.flightnum, flights_donotmodify.tailnum, flights_donotmodify.actualelapsedtime, flights_donotmodify.crselapsedtime, flights_donotmodify.airtime, flights_donotmodify.arrdelay, flights_donotmodify.depdelay, flights_donotmodify.origin, flights_donotmodify.dest, flights_donotmodify.distance, flights_donotmodify.taxiin, flights_donotmodify.taxiout, flights_donotmodify.cancelled, flights_donotmodify.cancellationcode, flights_donotmodify.diverted, flights_donotmodify.carrierdelay, flights_donotmodify.weatherdelay, flights_donotmodify.nasdelay, flights_donotmodify.securitydelay, flights_donotmodify.lateaircraftdelay, flights_donotmodify.dep_timestamp, flights_donotmodify.
```

```python
# Compile with the engine's dialect and inline the literal values
# instead of leaving them as bound parameters.
sql = str(query.compile(engine, compile_kwargs={"literal_binds": True}))
print(sql)
```

Output (truncated):

```text
SELECT flights_donotmodify.flight_year, flights_donotmodify.flight_month, flights_donotmodify.flight_dayofmonth, flights_donotmodify.flight_dayofweek, flights_donotmodify.deptime, flights_donotmodify.crsdeptime, flights_donotmodify.arrtime, flights_donotmodify.crsarrtime, flights_donotmodify.uniquecarrier, flights_donotmodify.flightnum, flights_donotmodify.tailnum, flights_donotmodify.actualelapsedtime, flights_donotmodify.crselapsedtime, flights_donotmodify.airtime, flights_donotmodify.arrdelay, flights_donotmodify.depdelay, flights_donotmodify.origin, flights_donotmodify.dest, flights_donotmodify.distance, flights_donotmodify.taxiin, flights_donotmodify.taxiout, flights_donotmodify.cancelled, flights_donotmodify.cancellationcode, flights_donotmodify.diverted, flights_donotmodify.carrierdelay, flights_donotmodify.weatherdelay, flights_donotmodify.nasdelay, flights_donotmodify.securitydelay, flights_donotmodify.lateaircraftdelay, flights_donotmodify.dep_timestamp, flights_donotmodify.a
```
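
Because the wide table hides the `WHERE` clause, the effect of `literal_binds` is easier to see on a small table. The sketch below is not taken from the dialect: it declares a hypothetical two-column `flights` table by hand and uses SQLAlchemy's default compiler, so no server is needed. It calls `Table.select()` rather than `sqla.select([...])` only to avoid depending on the calling convention of a particular SQLAlchemy release.

```python
import sqlalchemy as sqla

# Hypothetical stand-in for the reflected flights table.
metadata = sqla.MetaData()
flights = sqla.Table(
    "flights",
    metadata,
    sqla.Column("flight_year", sqla.SmallInteger),
    sqla.Column("origin", sqla.String(52)),
)

query = flights.select().where(flights.c.flight_year == 2008)

# Default compilation keeps the value as a bound parameter.
default_sql = str(query.compile())

# literal_binds renders the value directly into the SQL string.
inline_sql = str(query.compile(compile_kwargs={"literal_binds": True}))

print(default_sql)
print(inline_sql)
```

With the default compiler, the first string keeps the value as a bound parameter (`:flight_year_1`), while the second inlines `2008`. Passing the engine to `compile()`, as in the cell above, additionally renders the statement with that engine's dialect.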

### Using with Pandas

```python
sql = "SELECT * from flights_donotmodify LIMIT 10"
pd.read_sql(sql, engine)
```

| | flight_year | flight_month | flight_dayofmonth | flight_dayofweek | deptime | crsdeptime | arrtime | crsarrtime | uniquecarrier | flightnum | ... | dest_name | dest_city | dest_state | dest_country | dest_lat | dest_lon | origin_merc_x | origin_merc_y | dest_merc_x | dest_merc_y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2008 | 1 | 29 | 2 | 1433 | 1436 | 1458 | 1503 | YV | 2853 | ... | Los Angeles International | Los Angeles | CA | USA | 33.942535 | -118.408073 | -12468680.0 | 3953075.25 | -13181127.0 | 4021088.75 |
| 1 | 2008 | 1 | 29 | 2 | 818 | 821 | 850 | 850 | YV | 2728 | ... | Long Beach (Daugherty ) | Long Beach | CA | USA | 33.817722 | -118.151611 | -12468680.0 | 3953075.25 | -13152577.0 | 4004352.75 |
| 2 | 2008 | 1 | 29 | 2 | 1125 | 1125 | 1137 | 1155 | YV | 2730 | ... | Long Beach (Daugherty ) | Long Beach | CA | USA | 33.817722 | -118.151611 | -12468680.0 | 3953075.25 | -13152577.0 | 4004352.75 |
| 3 | 2008 | 1 | 29 | 2 | 1444 | 1447 | 1500 | 1507 | YV | 2732 | ... | Long Beach (Daugherty ) | Long Beach | CA | USA | 33.817722 | -118.151611 | -12468680.0 | 3953075.25 | -13152577.0 | 4004352.75 |
| 4 | 2008 | 1 | 29 | 2 | 1746 | 1746 | 1807 | 1805 | YV | 2734 | ... | Long Beach (Daugherty ) | Long Beach | CA | USA | 33.817722 | -118.151611 | -12468680.0 | 3953075.25 | -13152577.0 | 4004352.75 |
| 5 | 2008 | 1 | 29 | 2 | 2115 | 2115 | 2141 | 2139 | YV | 2739 | ... | Long Beach (Daugherty ) | Long Beach | CA | USA | 33.817722 | -118.151611 | -12468680.0 | 3953075.25 | -13152577.0 | 4004352.75 |
| 6 | 2008 | 1 | 29 | 2 | 946 | 949 | 1342 | 1350 | YV | 2941 | ... | Memphis International | Memphis | TN | USA | 35.042416 | -89.976669 | -12468680.0 | 3953075.25 | -10016157.0 | 4169647.0 |
| 7 | 2008 | 1 | 29 | 2 | 1950 | 1951 | 2340 | 2351 | YV | 2946 | ... | Memphis International | Memphis | TN | USA | 35.042416 | -89.976669 | -12468680.0 | 3953075.25 | -10016157.0 | 4169647.0 |
| 8 | 2008 | 1 | 29 | 2 | 1752 | 1752 | 1940 | 1919 | YV | 2864 | ... | Rogue Valley International | Medford | OR | USA | 42.374229 | -122.873497 | -12468680.0 | 3953075.25 | -13678215.0 | 5217203.0 |
| 9 | 2008 | 1 | 29 | 2 | 1100 | 1103 | 1143 | 1204 | YV | 2769 | ... | Monterey Peninsula | Monterey | CA | USA | 36.586983 | -121.842949 | -12468680.0 | 3953075.25 | -13563495.0 | 4381693.0 |


## References

- https://www.omnisci.com/platform/omniscidb