# Examples for Kinetica ODBC

The following steps will test:
1. Creating a table with DDL.
2. Saving a DataFrame to a table.
3. Loading a table into a DataFrame.

For this to work, the Kinetica ODBC driver must be configured in `/etc/odbcinst.ini` and the connection (DSN) in `/etc/odbc.ini`.

See also:
* [PyODBC Wiki](https://github.com/mkleehammer/pyodbc/wiki)
* [Pandas SQL I/O](https://pandas.pydata.org/pandas-docs/stable/io.html#io-sql)

### Import kodbc_io script.

In [1]:
# Local libraries should automatically reload
%reload_ext autoreload
%autoreload 1

# to access Kinetica Jupyter I/O functions
import sys
sys.path.append('../KJIO') 

%aimport kodbc_io
SCHEMA = 'TEST'



### Function get_odbc()

Convenience function that returns a Kinetica ODBC connection.
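The source of `get_odbc()` is not shown in this notebook, so the following is only a minimal sketch of what such a helper might look like, not the actual `kodbc_io` implementation. The DSN name `KINETICA`, the `uid`/`pwd` parameters, and the helper `build_connection_string()` are assumptions made for illustration; the DSN must match an entry in `/etc/odbc.ini`.

```python
def build_connection_string(dsn, uid=None, pwd=None):
    """Assemble an ODBC connection string from a DSN and optional credentials."""
    parts = ['DSN={}'.format(dsn)]
    if uid is not None:
        parts.append('UID={}'.format(uid))
    if pwd is not None:
        parts.append('PWD={}'.format(pwd))
    return ';'.join(parts)


def get_odbc(dsn='KINETICA', uid=None, pwd=None):
    """Open a pyodbc connection (hypothetical stand-in for kodbc_io.get_odbc)."""
    # Imported lazily so build_connection_string() works without the driver.
    import pyodbc
    return pyodbc.connect(build_connection_string(dsn, uid, pwd), autocommit=True)
```

With `autocommit=True`, DDL and inserts take effect without an explicit `commit()`, which matches how the cells below use the connection.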

### Create test table

In [2]:
_sql = '''
create or replace replicated table {}.test_odbc (
    str_col    varchar(16) not null,
    double_col double not null,     
    float_col  float not null,    
    int_col    integer not null
)
'''.format(SCHEMA)

_cnxn = kodbc_io.get_odbc()
_cnxn.execute(_sql)
_cnxn.close()

print('Created table: test_odbc')

Connected to GPUdb ODBC Server (6.2.0.12.20180720232954)
Created table: test_odbc


### Create test dataset

In [3]:
import numpy as np
import pandas as pd

# Create a dataframe from a dict of series. 
_test_df = pd.DataFrame({ 
    'str_col' : ['A', 'B', 'C', 'D'],
    #'cat_col' : pd.Categorical(["test","train","test","train"]),
    'double_col' : 1.,
    #'ts_col' : pd.date_range('1/1/2000', periods=4),
    'float_col' : pd.Series(range(4), dtype='float32'),
    'int_col' : np.array(np.random.randn(4)*10, dtype='int32')
    })

_test_df.head()

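|   | str_col | double_col | float_col | int_col |
|---|---------|------------|-----------|---------|
| 0 | A       | 1.0        | 0.0       | 7       |
| 1 | B       | 1.0        | 1.0       | -22     |
| 2 | C       | 1.0        | 2.0       | 16      |
| 3 | D       | 1.0        | 3.0       | 0       |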


### Insert rows into test_odbc table

In [4]:
_sql = '''
insert into test_odbc (str_col, double_col, float_col, int_col)
values (?, ?, ?, ?)
'''

_records = _test_df.to_records(index=False)
_cnxn = kodbc_io.get_odbc()
_cursor = _cnxn.cursor()

for _row in _records:
    _cursor.execute(_sql, *_row.item())

_cnxn.close()
print('Inserted {} rows.'.format(len(_records)))

Connected to GPUdb ODBC Server (6.2.0.12.20180720232954)
Inserted 4 rows.
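The loop above issues one `execute()` call per row. As an alternative, the rows could be sent in a single batch with the DB-API `executemany()` method, which `pyodbc` cursors also provide. The sketch below uses a hypothetical helper, `insert_rows()`, and demonstrates it against an in-memory SQLite database so it runs without a Kinetica server; whether batching is faster with the Kinetica driver is not something this notebook measures.

```python
import sqlite3


def insert_rows(cnxn, sql, rows):
    """Insert all rows with one executemany() call; returns the row count."""
    # Parameters must be plain tuples of Python values.
    params = [tuple(row) for row in rows]
    cursor = cnxn.cursor()
    cursor.executemany(sql, params)
    cnxn.commit()  # harmless if the connection is already in autocommit mode
    return len(params)


# Stand-in for the ODBC connection, used only for this demonstration.
demo_cnxn = sqlite3.connect(':memory:')
demo_cnxn.execute(
    'create table test_odbc '
    '(str_col text, double_col real, float_col real, int_col integer)')

inserted = insert_rows(
    demo_cnxn,
    'insert into test_odbc (str_col, double_col, float_col, int_col) '
    'values (?, ?, ?, ?)',
    [('A', 1.0, 0.0, 7), ('B', 1.0, 1.0, -22)])
print('Inserted {} rows.'.format(inserted))
```

In this notebook the call would look like `insert_rows(_cnxn, _sql, [r.item() for r in _records])`, where `.item()` converts each NumPy record to a tuple of native Python values, as the loop above already does.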


### Read table to dataframe

Here we use the Pandas `read_sql()` function to run a query and load the result into a DataFrame.

In [5]:
import pandas as pd

def kodbc_read_sql(_sql):
    # Open a connection, run the query, and always close the connection.
    _cnxn = kodbc_io.get_odbc()
    try:
        return pd.read_sql(_sql, _cnxn)
    finally:
        _cnxn.close()

In [6]:
_sql_df = kodbc_io.get_df('select * from test_odbc')
_sql_df

Connected to GPUdb ODBC Server (6.2.0.12.20180720232954)


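|   | str_col | double_col | float_col | int_col |
|---|---------|------------|-----------|---------|
| 0 | A       | 1.0        | 0.0       | 7       |
| 1 | B       | 1.0        | 1.0       | -22     |
| 2 | C       | 1.0        | 2.0       | 16      |
| 3 | D       | 1.0        | 3.0       | 0       |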


### Check type conversions

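Kinetica types are converted to Pandas datatypes. Note that the `float` and `integer` columns come back as `float64` and `int64` rather than the 32-bit types used when the test dataset was created.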

In [7]:
_sql_df.dtypes

str_col        object
double_col    float64
float_col     float64
int_col         int64
dtype: object
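If the narrower 32-bit types matter downstream, the columns can be cast back after loading. The following is a small sketch using `DataFrame.astype()`; the mapping `NARROW_DTYPES` and the helper `restore_dtypes()` are hypothetical names introduced here, and the demo frame stands in for the `_sql_df` returned by the query.

```python
import pandas as pd

# Hypothetical mapping from column name to the dtype used in the test dataset.
NARROW_DTYPES = {'float_col': 'float32', 'int_col': 'int32'}


def restore_dtypes(df, dtypes=NARROW_DTYPES):
    """Return a copy of df with the listed columns cast to narrower dtypes."""
    # Only cast columns that are actually present in the frame.
    present = {col: dt for col, dt in dtypes.items() if col in df.columns}
    return df.astype(present)


# Stand-in for the DataFrame loaded from test_odbc.
demo_df = pd.DataFrame({
    'str_col': ['A', 'B'],
    'double_col': [1.0, 1.0],
    'float_col': [0.0, 1.0],
    'int_col': [7, -22],
})
narrow_df = restore_dtypes(demo_df)
print(narrow_df.dtypes)
```

Casting is done on a copy, so the original frame keeps the types returned by the driver.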