# IntegratedML applied to biomedical data, using PyODBC
This notebook demonstrates the following:
- Connecting to InterSystems IRIS through a PyODBC connection
- Creating, training, and executing (with the PREDICT() function) an IntegratedML machine learning model applied to breast cancer tumor diagnoses
- Inserting machine learning predictions into a new SQL table
- Executing a relatively complex SQL query that contains the IntegratedML PREDICT() and PROBABILITY() functions, and using their results to filter and sort the output

### ODBC and PyODBC Resources
Connecting to a database is often more than half the battle when developing SQL-heavy applications, especially if you are unfamiliar with the tools or, more importantly, with the particular database system. If you are just getting started with PyODBC and InterSystems IRIS, this notebook and the resources below may help you get up to speed.

https://gettingstarted.intersystems.com/development-setup/odbc-connections/

https://irisdocs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=BNETODBC_support#BNETODBC_support_pyodbc

https://stackoverflow.com/questions/46405777/connect-docker-python-to-sql-server-with-pyodbc

https://stackoverflow.com/questions/44527452/cant-open-lib-odbc-driver-13-for-sql-server-sym-linking-issue

In [2]:
# make the notebook full screen
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

### 1. Install system packages for ODBC

In [1]:
!apt-get update
!apt-get install -y gcc
!apt-get install -y tdsodbc unixodbc-dev
!apt-get install -y unixodbc-bin
!apt-get clean


#### Use this command to troubleshoot a failed pyodbc installation:
!pip install --upgrade --global-option=build_ext --global-option="-I/usr/local/include"  --global-option="-L/usr/local/lib" pyodbc

In [2]:
!pip install pyodbc

Collecting pyodbc
  Downloading pyodbc-4.0.32.tar.gz (280 kB)
     |████████████████████████████████| 280 kB 2.1 MB/s
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: pyodbc
  Building wheel for pyodbc (setup.py) ... done
  Created wheel for pyodbc: filename=pyodbc-4.0.32-cp36-cp36m-linux_x86_64.whl size=280778 sha256=3d0b1025f722595d0fe533a57ee7fc7ca998df1624ee0cab994c19ecd07a52de
  Stored in directory: /root/.cache/pip/wheels/88/b8/31/f860f814f8621bb0807b257e53f640202b8216ee9f2d4b4f31
Successfully built pyodbc
Installing collected packages: pyodbc
Successfully installed pyodbc-4.0.32


In [3]:
# remove any existing ODBC configuration files (-f ignores files that do not exist)
!rm -f /etc/odbcinst.ini
!rm -f /etc/odbc.ini

In [4]:
!ln -s /tf/odbcinst.ini /etc/odbcinst.ini
!ln -s /tf/odbc.ini /etc/odbc.ini

In [5]:
!cat /tf/odbcinst.ini

[InterSystems ODBC35]
UsageCount=1
Driver=/tf/libirisodbcu35.so
Setup=/tf/libirisodbcu35.so
SQLLevel=1
FileUsage=0
DriverODBCVer=02.10
ConnectFunctions=YYN
APILevel=1
DEBUG=1
CPTimeout=<not pooled>



In [8]:
!cat /tf/odbc.ini

[user]
Driver=InterSystems ODBC35
Protocol=TCP
Host=irisimlsvr
Port=51773
Namespace=USER
UID=SUPERUSER
Password=SYS
Description=Sample namespace
Query Timeout=0
Static Cursors=0
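Because `odbc.ini` uses INI syntax, its entries can be read programmatically, for example to compare them with the values used later in the connection string. The following is a minimal sketch using Python's standard `configparser`; the names `ODBC_INI` and `read_dsn` are hypothetical, and the text is copied from the output above rather than read from `/tf/odbc.ini`.

```python
import configparser

# Text copied from the `cat /tf/odbc.ini` output above.
ODBC_INI = """\
[user]
Driver=InterSystems ODBC35
Protocol=TCP
Host=irisimlsvr
Port=51773
Namespace=USER
UID=SUPERUSER
Password=SYS
Description=Sample namespace
Query Timeout=0
Static Cursors=0
"""

def read_dsn(ini_text, dsn_name):
    """Return the settings of one DSN section as a plain dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(ini_text)
    return dict(parser[dsn_name])

settings = read_dsn(ODBC_INI, "user")
print(settings["Driver"], settings["Host"], settings["Port"])
```

Note that this DSN lists port 51773, while step 3 of this notebook connects to port 1972; which value is correct depends on how your IRIS instance is deployed.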



In [9]:
!odbcinst -j

unixODBC 2.3.4
DRIVERS............: /etc/odbcinst.ini
SYSTEM DATA SOURCES: /etc/odbc.ini
FILE DATA SOURCES..: /etc/ODBCDataSources
USER DATA SOURCES..: /root/.odbc.ini
SQLULEN Size.......: 8
SQLLEN Size........: 8
SQLSETPOSIROW Size.: 8


### 2. Verify you see "InterSystems ODBC35" in the drivers list

In [10]:
import pyodbc
print(pyodbc.drivers())

['InterSystems ODBC35']
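If you would rather fail early with a readable message than discover a missing driver at connection time, the check can be wrapped in a small helper. This is only a sketch: `require_driver` is a hypothetical name, and a literal list stands in for `pyodbc.drivers()` so that the example runs without an ODBC installation.

```python
def require_driver(available, wanted="InterSystems ODBC35"):
    """Raise a descriptive error if `wanted` is not among the registered drivers."""
    if wanted not in available:
        raise RuntimeError(
            "ODBC driver %r is not registered; found %r. "
            "Check /etc/odbcinst.ini." % (wanted, list(available))
        )
    return wanted

# In the notebook you would pass pyodbc.drivers(); a literal list is used here
# so the sketch runs without an ODBC installation.
print(require_driver(["InterSystems ODBC35"]))
```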


### 3. Get an ODBC connection 

In [12]:
import pyodbc
import time

#input("Hit any key to start")
dsn = 'IRIS QuickML demo via PyODBC'  # descriptive label only; not used by the connection string
# The connection string names the driver directly (a DSN-less connection),
# so the [user] entry in /etc/odbc.ini is not consulted here.
server = 'irisimlsvr' #'192.168.99.101'
port = '1972' #'9091'
database = 'USER'
username = 'SUPERUSER'
password = 'SYS'
cnxn = pyodbc.connect('DRIVER={InterSystems ODBC35};SERVER='+server+';PORT='+port+';DATABASE='+database+';UID='+username+';PWD='+ password)

# Ensure strings are read and written as UTF-8.
cnxn.setdecoding(pyodbc.SQL_CHAR, encoding='utf8')
cnxn.setdecoding(pyodbc.SQL_WCHAR, encoding='utf8')
cnxn.setencoding(encoding='utf8')
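The connection string above is assembled by concatenation, which is easy to get wrong when a setting changes. As an illustration, the same string can be built from named settings; `build_connection_string` is a hypothetical helper, not part of pyodbc.

```python
def build_connection_string(driver, server, port, database, uid, pwd):
    """Assemble a DSN-less ODBC connection string from named settings."""
    settings = [
        ("DRIVER", "{%s}" % driver),  # braces protect the space in the driver name
        ("SERVER", server),
        ("PORT", str(port)),
        ("DATABASE", database),
        ("UID", uid),
        ("PWD", pwd),
    ]
    return ";".join("%s=%s" % (key, value) for key, value in settings)

conn_str = build_connection_string(
    "InterSystems ODBC35", "irisimlsvr", 1972, "USER", "SUPERUSER", "SYS"
)
print(conn_str)
# In the notebook: cnxn = pyodbc.connect(conn_str)
```

The result is the same string that the cell above passes to `pyodbc.connect`.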

### 4. Get a cursor; start the timer

In [13]:
cursor = cnxn.cursor()
start = time.perf_counter()  # wall-clock timer; time.clock() was removed in Python 3.8
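If you want to time individual steps rather than the whole notebook, one option is a small context manager built on `time.perf_counter()`. This is a sketch only; `timed` and `timings` are hypothetical names, and `time.sleep` stands in for a database call.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    """Store the wall-clock seconds spent inside the with-block in results[label]."""
    begin = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - begin

timings = {}
with timed("sleep", timings):
    time.sleep(0.05)  # stand-in for cursor.execute(...)
print("%s: %.3f s" % ("sleep", timings["sleep"]))
```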

### 5. Specify the training data and give the model a name

In [14]:
dataTable = 'Biomedical.BreastCancer'
dataTablePredict = 'Result02'
dataColumn =  'Diagnosis'
dataColumnPredict = "PredictedDiagnosis"
modelName = "bc" # choose a name; it must be unique on the server
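These names are inserted into SQL text with `%` string formatting, because SQL parameter markers bind values, not table, column, or model names. If any name could come from outside the notebook, it may be worth validating it first. The sketch below assumes plain, optionally schema-qualified identifiers; `checked_identifier` is a hypothetical helper.

```python
import re

# Assumption: plain identifiers, optionally schema-qualified (e.g. Biomedical.BreastCancer).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")

def checked_identifier(name):
    """Return `name` unchanged if it looks like a simple SQL identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError("unsafe SQL identifier: %r" % name)
    return name

# The four names used in this notebook all pass the check.
for name in ("Biomedical.BreastCancer", "Result02", "Diagnosis", "bc"):
    checked_identifier(name)
```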

### 6. Train and predict

In [15]:
# Define the model: predict the Diagnosis column from the training table
cursor.execute("CREATE MODEL %s PREDICTING (%s) FROM %s" % (modelName, dataColumn, dataTable))
# Train the model
cursor.execute("TRAIN MODEL %s FROM %s" % (modelName, dataTable))
# Create a results table, then store the predicted and actual diagnosis for the first 20 rows
cursor.execute("CREATE TABLE %s (%s VARCHAR(100), %s VARCHAR(100))" % (dataTablePredict, dataColumnPredict, dataColumn))
cursor.execute("INSERT INTO %s SELECT TOP 20 PREDICT(%s) AS %s, %s FROM %s" % (dataTablePredict, modelName, dataColumnPredict, dataColumn, dataTable))
cnxn.commit()

Error: ('HY000', '[HY000] [Iris ODBC][State : HY000][Native Code 400]\n[libirisodbcu35.so]\n[SQLCODE: <-400>:<Fatal error occurred>]\r\n[Location: <ServerLoop>]\r\n[%msg: <ERROR #2816: %ML No data supplied>] (400) (SQLExecDirectW)')
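The cell above runs several statements, and an error message like this one does not say which statement raised it. One way to make that visible is to execute the statements one at a time and attach the failing SQL to the exception. The sketch below uses a hypothetical `run_statements` helper and a `FakeCursor` stand-in with a simulated failure; it does not reproduce or explain the server error shown above.

```python
def run_statements(cursor, statements):
    """Execute statements in order; on failure, report which statement failed."""
    for number, sql in enumerate(statements, start=1):
        try:
            cursor.execute(sql)
        except Exception as exc:
            raise RuntimeError("statement %d failed: %s" % (number, sql)) from exc

class FakeCursor:
    """Stand-in for a pyodbc cursor so the sketch runs without a database."""
    def __init__(self):
        self.executed = []

    def execute(self, sql):
        if sql.startswith("TRAIN"):
            raise ValueError("simulated server error")
        self.executed.append(sql)

fake = FakeCursor()
try:
    run_statements(fake, [
        "CREATE MODEL bc PREDICTING (Diagnosis) FROM Biomedical.BreastCancer",
        "TRAIN MODEL bc FROM Biomedical.BreastCancer",
    ])
except RuntimeError as err:
    message = str(err)
    print(message)
```

In the notebook you would pass the real `cursor` and the statements from the cell above, then call `cnxn.commit()` once they have all succeeded.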

### 7. Show the prediction results

In [18]:
import pandas as pd
from IPython.display import display

df1 = pd.read_sql("SELECT * from %s ORDER BY ID" % dataTablePredict, cnxn)
display(df1)

| | PredictedDiagnosis | Diagnosis |
|---|---|---|
| 0 | M | M |
| 1 | M | M |
| 2 | M | M |
| 3 | M | M |
| 4 | M | M |
| 5 | M | M |
| 6 | M | M |
| 7 | M | M |
| 8 | M | M |
| 9 | M | M |
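A quick way to summarize a result like this is the fraction of rows where the predicted and actual diagnosis agree. The sketch below uses a hypothetical `agreement_rate` helper on the ten rows shown above, written as plain tuples. Because these rows come from the same table the model was trained on, the figure is a sanity check, not an estimate of accuracy on unseen data.

```python
def agreement_rate(rows):
    """Fraction of (predicted, actual) pairs whose two values are equal."""
    if not rows:
        raise ValueError("no rows to compare")
    matches = sum(1 for predicted, actual in rows if predicted == actual)
    return matches / len(rows)

# The ten rows displayed above, as (PredictedDiagnosis, Diagnosis) pairs.
shown_rows = [("M", "M")] * 10
print(agreement_rate(shown_rows))
```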


### 8. Show a complicated query
The IntegratedML functions PREDICT() and PROBABILITY() can appear almost anywhere in a SQL query, which gives you a great deal of flexibility.
Below, we SELECT columns along with the result of the PROBABILITY function, and filter on the result of the PREDICT function. Finally, ORDER BY sorts on the output of PROBABILITY.

In [30]:
# Use modelName for both PROBABILITY() and PREDICT() instead of hard-coding "bc"
query = """
    SELECT ID, PROBABILITY(%s FOR 'M') AS Probability, Diagnosis FROM %s
    WHERE MeanArea BETWEEN 300 AND 600 AND MeanRadius > 5 AND PREDICT(%s) = 'M'
    ORDER BY Probability
""" % (modelName, dataTable, modelName)
df2 = pd.read_sql(query, cnxn)
display(df2)

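| | ID | Probability | Diagnosis |
|---|---|---|---|
| 0 | 298 | 0.516043 | M |
| 1 | 41 | 0.544233 | M |
| 2 | 136 | 0.583259 | M |
| 3 | 74 | 0.675478 | M |
| 4 | 147 | 0.944923 | M |
| 5 | 216 | 0.964738 | M |
| 6 | 42 | 0.971131 | M |
| 7 | 172 | 0.98059 | M |
| 8 | 45 | 0.996879 | M |
| 9 | 436 | 0.997486 | M |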
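The numeric bounds in this query are literals. If you want to vary them, a common approach is to keep the identifiers in the SQL text and pass the values as `?` parameters, which pyodbc supports and `pd.read_sql` forwards through its `params` argument. The sketch below only builds the SQL text; `build_filter_query` is a hypothetical helper, and whether IRIS accepts parameter markers in each of these positions is an assumption to verify against your server.

```python
def build_filter_query(table, model, label="M"):
    """Return SQL text with `?` markers for the numeric bounds used in the filter."""
    return (
        "SELECT ID, PROBABILITY(%s FOR '%s') AS Probability, Diagnosis FROM %s "
        "WHERE MeanArea BETWEEN ? AND ? AND MeanRadius > ? AND PREDICT(%s) = '%s' "
        "ORDER BY Probability" % (model, label, table, model, label)
    )

sql = build_filter_query("Biomedical.BreastCancer", "bc")
params = [300, 600, 5]  # MeanArea lower bound, MeanArea upper bound, MeanRadius minimum
print(sql)
# In the notebook: df2 = pd.read_sql(sql, cnxn, params=params)
```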


### 9. Close and clean up

In [None]:
cursor.close()
cnxn.close()
end = time.perf_counter()
print("Total elapsed time (seconds):")
print(end - start)
#input("Hit any key to end")
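If an earlier cell raises an exception, this cleanup cell may never run and the connection stays open. One way to guard against that in a script is `contextlib.closing`, which calls `close()` when the block exits, even on error. The sketch below uses a hypothetical `FakeConnection` stand-in so that it runs without a database.

```python
from contextlib import closing

class FakeConnection:
    """Stand-in for a pyodbc connection so the sketch runs without a database."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

conn = FakeConnection()
try:
    with closing(conn):
        raise RuntimeError("simulated failure in the middle of the notebook")
except RuntimeError:
    pass
print(conn.closed)  # True: close() ran even though the block raised
```

With a real connection, the equivalent would be `with closing(pyodbc.connect(...)) as cnxn:` around the work that uses it.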