<html>
    <p> <font size = 10 >Jupyter Magic with SQL</font> </p>
</html>

## Table of contents 
1. [Loading sql extension](#load) 
2. [Connection to the database](#connect)
3. [Querying](#query)
4. [Using variables](#variables)
5. [Storing the result set in a variable](#dataset)
6. [pyodbc without Jupyter Magic](#pyodbc)

# Loading the SQL extension <a class="anchor" id="load"></a>

First, install the ipython-sql library and the pyodbc dependency, import the working libraries, then load the extension that enables "SQL Magic".

In [None]:
!pip3 install ipython-sql

In [None]:
!pip install pyodbc

In [None]:

import pandas as pd 
import pyodbc
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

In [None]:
%load_ext sql

# Connecting to the database <a class="anchor" id="connect"></a>

Change the connection string to point at your own database before running the commands.

The connection to the SQL Server database goes through SQLAlchemy (a SQL toolkit and Object Relational Mapper for Python).
Connection string format: `mssql+pyodbc://user:password@server/database?driver=<driver name from /etc/odbcinst.ini>`
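Special characters in the password or spaces in the driver name have to be URL-escaped before they go into the connection string. The sketch below shows one way to assemble such a URL with the standard library; `build_mssql_url` and all the credential values are hypothetical, and the `driver=` query key follows the form used in the SQLAlchemy documentation.

```python
from urllib.parse import quote_plus


def build_mssql_url(user, password, server, database, driver):
    """Assemble a SQLAlchemy URL for mssql+pyodbc (hypothetical helper)."""
    # quote_plus escapes characters such as '@' and '!' and turns the
    # spaces in the ODBC driver name into '+'.
    return (
        f"mssql+pyodbc://{quote_plus(user)}:{quote_plus(password)}"
        f"@{server}/{database}?driver={quote_plus(driver)}"
    )


# Example values only; replace them with your own server and credentials.
url = build_mssql_url(
    user="sa",
    password="p@ss!word",
    server="localhost",
    database="AdventureWorks2019",
    driver="ODBC Driver 17 for SQL Server",
)
print(url)
```

The resulting string can then be passed to `%sql` in place of the DSN-based URL.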


In [None]:
%sql mssql+pyodbc://@localsqldsn

# Querying <a class="anchor" id="query"></a>

Let's start with a simple query. <br>
A short query fits on one line:

In [None]:
f = 4
r = 5
f+r

In [18]:
%sql SELECT TOP 10 *  FROM [Sales].[SalesOrderDetail]

 * mssql+pyodbc://@localsqldsn
Done.


SalesOrderID,SalesOrderDetailID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,LineTotal,rowguid,ModifiedDate
43659,1,4911-403C-98,1,776,1,2024.994,0.0,2024.994,B207C96D-D9E6-402B-8470-2CC176C42283,2011-05-31 00:00:00
43659,2,4911-403C-98,3,777,1,2024.994,0.0,6074.982,7ABB600D-1E77-41BE-9FE5-B9142CFC08FA,2011-05-31 00:00:00
43659,3,4911-403C-98,1,778,1,2024.994,0.0,2024.994,475CF8C6-49F6-486E-B0AD-AFC6A50CDD2F,2011-05-31 00:00:00
43659,4,4911-403C-98,1,771,1,2039.994,0.0,2039.994,04C4DE91-5815-45D6-8670-F462719FBCE3,2011-05-31 00:00:00
43659,5,4911-403C-98,1,772,1,2039.994,0.0,2039.994,5A74C7D2-E641-438E-A7AC-37BF23280301,2011-05-31 00:00:00
43659,6,4911-403C-98,2,773,1,2039.994,0.0,4079.988,CE472532-A4C0-45BA-816E-EEFD3FD848B3,2011-05-31 00:00:00
43659,7,4911-403C-98,1,774,1,2039.994,0.0,2039.994,80667840-F962-4EE3-96E0-AECA108E0D4F,2011-05-31 00:00:00
43659,8,4911-403C-98,3,714,1,28.8404,0.0,86.5212,E9D54907-E7B7-4969-80D9-76BA69F8A836,2011-05-31 00:00:00
43659,9,4911-403C-98,1,716,1,28.8404,0.0,28.8404,AA542630-BDCD-4CE5-89A0-C1BF82747725,2011-05-31 00:00:00
43659,10,4911-403C-98,6,709,1,5.7,0.0,34.2,AC769034-3C2F-495C-A5A7-3B71CDB25D4E,2011-05-31 00:00:00


If the query spans several lines, you can store it in a variable and execute that:

In [19]:
db_query = '''
SELECT VendorID, [250] AS Emp1, [251] AS Employee, [256] AS Emp3, [257] AS Emp4, [260] AS Emp5  
FROM   
(SELECT PurchaseOrderID, EmployeeID, VendorID  
FROM Purchasing.PurchaseOrderHeader) p  
PIVOT  
(  
COUNT (PurchaseOrderID)  
FOR EmployeeID IN  
( [250], [251], [256], [257], [260] )  
) AS pvt  
ORDER BY pvt.VendorID;  '''

In [20]:
%sql $db_query

 * mssql+pyodbc://@localsqldsn
Done.


VendorID,Emp1,Employee,Emp3,Emp4,Emp5
1492,2,5,4,4,4
1494,2,5,4,5,4
1496,2,4,4,5,5
1498,2,5,4,4,4
1500,3,4,4,5,4
1504,2,5,5,4,5
1506,2,4,5,5,5
1508,2,4,4,6,5
1510,2,4,4,5,5
1514,2,4,4,5,4


# Using variables <a class="anchor" id="variables"></a>


In [None]:
customerid = 11000

Two syntaxes reference a Python variable inside a query: `:variable` and `$variable`.

In [None]:
%sql select top 5 *  from [Sales].[SalesOrderHeader] where CustomerID = :customerid

In [None]:
%sql select top 5 * from [Sales].[SalesOrderHeader] where CustomerID  = $customerid

In [None]:
type(customerid)

In [None]:
tablename = '[Sales].[SalesOrderHeader]'

In [None]:
%sql select top 100 * from $tablename 
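The two forms are not interchangeable. As far as the ipython-sql documentation describes it, `:variable` is sent to the database as a bound parameter, while `$variable` is pasted into the query text before the query runs. That is why a table name only works with `$`. The sketch below imitates both behaviors with the standard library, assuming an in-memory SQLite table as a stand-in for `[Sales].[SalesOrderHeader]` and `string.Template` as a stand-in for the `$` expansion.

```python
import sqlite3
from string import Template

# In-memory stand-in for the AdventureWorks table (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SalesOrderHeader (SalesOrderID INTEGER, CustomerID INTEGER)")
conn.executemany(
    "INSERT INTO SalesOrderHeader VALUES (?, ?)",
    [(1, 11000), (2, 11000), (3, 11001)],
)

customerid = 11000
tablename = "SalesOrderHeader"

# ':customerid' style: the value travels separately from the SQL text,
# so the driver handles quoting and typing.
bound_rows = conn.execute(
    "SELECT SalesOrderID FROM SalesOrderHeader "
    "WHERE CustomerID = :customerid ORDER BY SalesOrderID",
    {"customerid": customerid},
).fetchall()

# '$tablename' style: plain text substitution before the query is sent.
# Identifiers such as table names cannot be bound, only substituted.
sql_text = Template("SELECT COUNT(*) FROM $tablename").substitute(tablename=tablename)
row_count = conn.execute(sql_text).fetchone()[0]

print(bound_rows, sql_text, row_count)
```

Because substitution edits the SQL text itself, it is best reserved for trusted values such as table names defined in the notebook; values that come from users are safer as bound parameters.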

# Storing the result set in a variable <a class="anchor" id="dataset"></a>

In [None]:
db_query = '''
select top 100 * from [Sales].[SalesOrderHeader] '''

In [None]:
result = %sql $db_query

In [None]:
result[0]

Check the type of the result set, then convert it to a pandas DataFrame to display it in full:

In [None]:
type(result)

In [None]:
res = pd.DataFrame(data = result.dicts())

In [None]:
res

Print a single row by index. Indexing starts at 0, so `result[10]` is the eleventh row:

In [None]:
result[10]

Print one cell from the result set:
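A cell is addressed by row first, then by column. The sketch below shows the idea on an in-memory SQLite table (an assumption made so the example runs without SQL Server; the sample rows are taken from nowhere but this sketch). With ipython-sql, the same double indexing on `result`, for example `result[1][1]`, should behave similarly because the result set is a list of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# sqlite3.Row lets a row be indexed by position or by column name.
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE SalesOrderHeader (SalesOrderID INTEGER, SalesOrderNumber TEXT)")
conn.executemany(
    "INSERT INTO SalesOrderHeader VALUES (?, ?)",
    [(43659, "SO43659"), (43660, "SO43660")],
)
rows = conn.execute(
    "SELECT SalesOrderID, SalesOrderNumber FROM SalesOrderHeader ORDER BY SalesOrderID"
).fetchall()

second_row = rows[1]                        # second row, index 1
cell_by_position = rows[1][1]               # row 1, column 1
cell_by_name = rows[1]["SalesOrderNumber"]  # row 1, column looked up by name

print(tuple(second_row), cell_by_position, cell_by_name)
```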

Iterate over the result set:

In [None]:
for i in result.dicts():
    print (i['SalesOrderNumber'])

The query result is easy to visualize, here as a bar chart:

In [None]:
pl = %sql select top 5 SalesPersonID, sum(TotalDue) as SalesAmount from $tablename group by SalesPersonID order by sum(TotalDue) desc;

In [None]:
pl.bar()

# pyodbc without Jupyter Magic <a class="anchor" id="pyodbc"></a>

In [None]:
import pyodbc
import pandas as pd
conn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=AdventureWorks2019;Trusted_Connection=yes')
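An ODBC connection string is a list of `KEY=value` pairs separated by semicolons. Long strings are easier to read and edit when the pairs are kept in a dictionary and joined at the end. The following is a minimal sketch of that idea; `build_odbc_connection_string` is a hypothetical helper, not part of pyodbc.

```python
def build_odbc_connection_string(parts):
    """Join KEY=value pairs with ';' (hypothetical helper, not a pyodbc API)."""
    return ";".join(f"{key}={value}" for key, value in parts.items())


conn_str = build_odbc_connection_string(
    {
        "DRIVER": "{ODBC Driver 17 for SQL Server}",  # braces protect the spaces
        "SERVER": "localhost",
        "DATABASE": "AdventureWorks2019",
        "Trusted_Connection": "yes",
    }
)
print(conn_str)
```

The result is the same string that is passed to `pyodbc.connect` in the cell above.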

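In [None]:
import os
import pyodbc
# Password comes from an environment variable (AZURE_SQL_PWD is a hypothetical name)
conn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=2022dsiwd.database.windows.net;DATABASE=AdventureWorks2019;UID=admin2022dsiwd;PWD=' + os.environ['AZURE_SQL_PWD'])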

In [None]:
sql = "select top 100 * from [Sales].[SalesOrderHeader] "
data = pd.read_sql(sql,conn)

In [None]:
data

In [None]:
type(data)
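When pandas is not needed, the rows can be read through a cursor instead. The sketch below uses the standard DB-API pattern that pyodbc also follows: `cursor.execute` with `?` placeholders, `cursor.description` for the column names, and `fetchall` for the rows. It runs against an in-memory SQLite table, which is an assumption made to keep the example self-contained; the table contents are invented for the sketch.

```python
import sqlite3

# Stand-in for the pyodbc connection (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderHeader (SalesOrderID INTEGER, CustomerID INTEGER, TotalDue REAL)"
)
conn.executemany(
    "INSERT INTO SalesOrderHeader VALUES (?, ?, ?)",
    [(1, 11000, 10.5), (2, 11000, 20.0), (3, 11001, 5.0)],
)

cursor = conn.cursor()
# '?' marks a positional parameter; the values are passed separately.
cursor.execute(
    "SELECT SalesOrderID, TotalDue FROM SalesOrderHeader "
    "WHERE CustomerID = ? ORDER BY SalesOrderID",
    (11000,),
)

# The first element of each description entry is the column name.
columns = [col[0] for col in cursor.description]
records = [dict(zip(columns, row)) for row in cursor.fetchall()]

print(records)
```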

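In [None]:
import os
# Azure SQL connection, defined before the cells that use conn2.
# The password is read from an environment variable (AZURE_SQL_PWD is a hypothetical name).
conn2 = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=2022dsiwd.database.windows.net;UID=admin2022dsiwd;DATABASE=AdventureWorks2020;PWD=' + os.environ['AZURE_SQL_PWD'])

In [None]:
sql = "SELECT * FROM [dbo].[titanic2022]"
data = pd.read_sql(sql, conn2)
data

In [None]:
# One [Pclass, Survived] pair per row, as a nested list
foo = data[['Pclass', 'Survived']].values.tolist()
foo

In [None]:
# Every combination of a Survived value with a Pclass value.
# Tuples are used because lists are unhashable and cannot go into a set.
foo = [(d, f) for d in data['Survived'].values for f in data['Pclass']]
set(foo)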