# SQL & Python

Data analysis with SQL in Python: the path from raw data to analysis involves several steps, including data cleaning and pre-processing, where SQL can be very useful.

There are several Python packages available that interface with SQL databases. 

1. SQLite: the `sqlite3` module in Python's standard library is the basic way to work with SQL databases
2. SQL Server
3. PostgreSQL, etc.


## 1. SQLite

In [1]:
import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt
import sqlite3 #sqlite3 module provides SQL interface 

In [8]:
#Create a database

#create a connection object that represents the database (the file is created if it does not exist yet)
conn = sqlite3.connect('mydb.db') 

In [9]:
#Create a table in your database
#A SQL query to create a table
query = 'CREATE TABLE table1(id int, x1 real, x2 real, y real)'

In [10]:
#Execute the query
conn.execute(query)
conn.commit()

In [11]:
#Insert some data into the table
conn.execute("INSERT INTO table1(id, x1,x2,y) VALUES(1, 2, 8, 9)")

<sqlite3.Cursor at 0x26b698a51f0>

In [12]:
#Insert more data
conn.execute("INSERT INTO table1(id, x1,x2,y) VALUES(2, 4, 5, 11), (3, 0, 1.4, 10.8)")

<sqlite3.Cursor at 0x26b698a5570>
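The INSERT statements above write their values directly into the SQL string and are not followed by a commit. When the values come from Python variables, `?` placeholders together with `executemany()` are usually the safer choice. A minimal sketch, assuming an in-memory database (`:memory:`) and the name `demo_conn` so that `mydb.db` is left untouched:

```python
import sqlite3

# In-memory database: nothing is written to disk (assumption for this sketch)
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE table1(id int, x1 real, x2 real, y real)")

# The same three rows that are inserted above
rows = [(1, 2, 8, 9), (2, 4, 5, 11), (3, 0, 1.4, 10.8)]

# "?" placeholders let sqlite3 bind the values instead of formatting them into the SQL text
demo_conn.executemany(
    "INSERT INTO table1(id, x1, x2, y) VALUES (?, ?, ?, ?)", rows
)

# Make the inserts permanent; without a commit they are lost when the connection closes
demo_conn.commit()

row_count = demo_conn.execute("SELECT COUNT(*) FROM table1").fetchone()[0]
first_row = demo_conn.execute("SELECT * FROM table1 WHERE id = 1").fetchone()
print(row_count, first_row)
```

With a file-based database such as `mydb.db`, the same `commit()` call is what makes the inserted rows visible to other connections.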

In [13]:
#to retrieve 2 rows
conn.execute('SELECT * FROM table1').fetchmany(2) 

[(1, 2.0, 8.0, 9.0), (2, 4.0, 5.0, 11.0)]

In [14]:
temp = conn.execute("SELECT * FROM table1").fetchall()
temp

[(1, 2.0, 8.0, 9.0), (2, 4.0, 5.0, 11.0), (3, 0.0, 1.4, 10.8)]

In [15]:
temp[:2]

[(1, 2.0, 8.0, 9.0), (2, 4.0, 5.0, 11.0)]
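The fetch methods used above all read from a cursor, and each call continues where the previous one stopped. The sketch below illustrates this with `fetchone()` and `fetchall()`, and also sets `sqlite3.Row` as the row factory so that columns can be read by name. The in-memory database and the name `demo_conn` are assumptions made so the example runs on its own:

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_conn.row_factory = sqlite3.Row  # rows allow access by column name as well as by position
demo_conn.execute("CREATE TABLE table1(id int, x1 real, x2 real, y real)")
demo_conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [(1, 2, 8, 9), (2, 4, 5, 11), (3, 0, 1.4, 10.8)],
)

cur = demo_conn.execute("SELECT * FROM table1 ORDER BY id")

first = cur.fetchone()     # the first row only
rest = cur.fetchall()      # every row that has not been fetched yet
leftover = cur.fetchall()  # the cursor is exhausted, so this is an empty list

print(first["y"], len(rest), leftover)
```

Because the cursor is consumed as it is read, a result that is needed more than once is best stored in a variable, as done with `temp` above.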

## 2. SQL Server

Connecting to a SQL Server database using the pyodbc module

In [None]:
pip install pyodbc 

In [16]:
import pyodbc 

#### connect()

- connect() creates a new database session and returns an instance of the Connection class, which can be used to create a cursor object for running SQL queries against the database

To connect to a database, call connect() and pass the relevant parameters to it
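The parameters are passed to connect() as a single ODBC connection string made of `key=value;` pairs. As a rough sketch, the string can be assembled from its parts before connecting. The helper `build_connection_string` is hypothetical (it is not part of pyodbc), and the driver, server, and database values are the ones used in this notebook:

```python
def build_connection_string(driver, server, database, trusted=True):
    """Assemble an ODBC connection string from its parts (hypothetical helper)."""
    parts = {
        "Driver": "{" + driver + "}",  # ODBC driver names are wrapped in braces
        "Server": server,
        "Database": database,
    }
    if trusted:
        # Windows authentication instead of a user name and password
        parts["Trusted_Connection"] = "yes"
    return "".join(f"{key}={value};" for key, value in parts.items())


# A raw string keeps the backslash in the server name literal
conn_str = build_connection_string("SQL Server", r"K\SQLEXPRESS", "mytestdb")
print(conn_str)

# With pyodbc and the ODBC driver installed, the session would then be opened with:
# conn = pyodbc.connect(conn_str)
```

Keeping the pieces separate makes it easier to switch server or database without editing one long string.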


In [17]:
conn = pyodbc.connect('Driver={SQL Server};'
                      r'Server=K\SQLEXPRESS;'  # raw string: "\S" is not a valid escape sequence
                      'Database=mytestdb;'
                      'Trusted_Connection=yes;')

In [18]:
# Create a cursor object to perform database operations
cursor = conn.cursor()

In [19]:
#Execute a query
# A sample query of all data from "dbo.mytesttable" on the SQL Server database "mytestdb":
cursor.execute('SELECT * FROM mytestdb.dbo.mytesttable')

#retrieve query results
for row in cursor:
    print(row)

(1, 'Kyra', 'Michel')


### Using pandas read_sql_query

The query result is returned as a pandas DataFrame.

In [20]:
import pandas as pd

table = pd.read_sql_query('SELECT * FROM mytestdb.dbo.mytesttable', conn)
table

   ix firstname lastname
0   1      Kyra   Michel
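`read_sql_query` also accepts a `params` argument, so a filter value can be bound to the query rather than pasted into the SQL text. The sketch below uses an in-memory SQLite database as a stand-in, so it runs without a SQL Server instance. The connection name `demo_conn` and the second row are made up for illustration, and the `?` placeholder style is the one used by sqlite3 (pyodbc uses the same style):

```python
import sqlite3

import pandas as pd

# Stand-in for the SQL Server table (assumption: same three columns as mytesttable)
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE mytesttable(ix int, firstname text, lastname text)")
demo_conn.executemany(
    "INSERT INTO mytesttable VALUES (?, ?, ?)",
    [(1, "Kyra", "Michel"), (2, "Sample", "Person")],  # second row is invented
)

# The value in params is bound to the "?" placeholder by the database driver
df = pd.read_sql_query(
    "SELECT * FROM mytesttable WHERE ix = ?", demo_conn, params=(1,)
)
print(df)

demo_conn.close()
```

The returned DataFrame takes its column names from the query, so an explicit column list in the SELECT controls which columns appear.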


In [21]:
# Close the cursor and connection 
cursor.close()
conn.close()

## 3. PostgreSQL

Connecting to a PostgreSQL database is similar to connecting to SQL Server.

In a Jupyter notebook, connect to a PostgreSQL database by calling the connect() method of the psycopg2 module and passing the relevant parameters.

Steps:


In [None]:
#quick installation
pip install -U pip 
pip install psycopg2-binary 

In [22]:
import psycopg2


#### Create a connection object:

conn = psycopg2.connect(
    host="localhost",
    database="database_name",
    user="postgres",
    password="your_password")

#### Open a cursor to run SQL commands against the database:

cursor = conn.cursor()

#### Execute sample query:

cursor.execute("SELECT * FROM my_data")

#### Retrieve query results:

query = cursor.fetchall()

query

#### Close the cursor and the connection to the server:

cursor.close()

conn.close()
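The steps above (connect, open a cursor, execute, fetch, close) follow the same pattern for every database module used in this notebook. One possible way to make sure the cursor and the connection are always closed, even when a query fails, is `contextlib.closing`. In this sketch, `fetch_all` is a hypothetical helper, and sqlite3 stands in for psycopg2 so the code runs without a PostgreSQL server:

```python
import sqlite3
from contextlib import closing


def fetch_all(connect, query):
    """Run one query and return all rows; cursor and connection are closed afterwards."""
    with closing(connect()) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(query)
            return cursor.fetchall()


# sqlite3 is used here only so the sketch is runnable without a server
rows = fetch_all(lambda: sqlite3.connect(":memory:"), "SELECT 1 AS one, 'a' AS letter")
print(rows)

# With psycopg2 installed, the same helper could be called along these lines:
# rows = fetch_all(lambda: psycopg2.connect(host="localhost", database="database_name",
#                                           user="postgres", password="your_password"),
#                  "SELECT * FROM my_data")
```

Passing a function that creates the connection, rather than an open connection, keeps the opening and closing of the session in one place.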
