# SQL and Python

One of the primary reasons a data scientist or ML engineer studies SQL is to query data from a database and analyze it. Sometimes, though, an analysis requires more than SQL can offer, e.g. creating visualizations or sending the processed data to an API service. In those cases we need to access the SQL data from a programming language that can handle visualization, ML, API requests, and so on.

Another reason to learn how to connect to SQL from Python is that we often need to create databases ourselves. Suppose you have data inside JSON files: you usually can't load it into database tables as-is. You need to process the JSON first, and only then can it be fed into a database.
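To make the JSON point concrete, here is a minimal sketch that uses only the standard library. The JSON payload, the table names (`artist`, `album`) and the column names are all made up for illustration; real files would be read with `json.load` and will have their own structure.

```python
import json
import sqlite3

# Hypothetical nested JSON; the names and titles are sample values only.
raw = """
[
  {"id": 1, "name": "AC/DC", "albums": [{"title": "Sample Album One"}]},
  {"id": 2, "name": "Accept", "albums": [{"title": "Sample Album Two"},
                                          {"title": "Sample Album Three"}]}
]
"""
records = json.loads(raw)

# Flatten the nested structure into rows, one list per target table.
artist_rows = [(r["id"], r["name"]) for r in records]
album_rows = [(r["id"], a["title"]) for r in records for a in r["albums"]]

# An in-memory SQLite database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE album (artist_id INTEGER REFERENCES artist(artist_id), title TEXT)"
)
conn.executemany("INSERT INTO artist VALUES (?, ?)", artist_rows)
conn.executemany("INSERT INTO album VALUES (?, ?)", album_rows)
conn.commit()

# Once the data is relational, ordinary SQL works on it.
album_counts = conn.execute(
    "SELECT a.name, COUNT(*) FROM artist a "
    "JOIN album b ON a.artist_id = b.artist_id "
    "GROUP BY a.name ORDER BY a.name"
).fetchall()
print(album_counts)
```

The processing step here is the two list comprehensions: the nested `albums` list cannot sit in a single flat row, so it is split into its own table keyed by the artist id.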

The discussion of Python and SQL revolves around two things:

1. Connecting to an existing SQL database and fetching data into Python
2. Creating a new database and populating it with tables using Python


## Connecting to an existing database and fetching data

We can use `sqlalchemy` to connect to many kinds of databases. It is a SQL toolkit and ORM (object-relational mapper), heavily used by web developers as well as data engineers.

In [1]:
from sqlalchemy.ext.automap import automap_base 
from sqlalchemy import create_engine
Base = automap_base() 
engine = create_engine('sqlite:///../../data/music.db')

In [2]:
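## Automap reflects the tables in a db and maps them to Python classes
## (from SQLAlchemy 1.4 on, reflect=True is deprecated in favour of Base.prepare(autoload_with=engine))
Base.prepare(engine, reflect=True)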
Base.classes.keys()

['Album',
 'Artist',
 'Customer',
 'Employee',
 'Genre',
 'Invoice',
 'InvoiceLine',
 'Track',
 'MediaType',
 'Playlist']
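If you only need table and column names rather than mapped classes, SQLAlchemy's inspector is an alternative to automap. The sketch below is not part of the original notebook: it builds a throwaway in-memory SQLite database with a single `Artist` table (an assumption made so the example runs without `music.db`) and then lists what it finds.

```python
from sqlalchemy import create_engine, inspect, text

# In-memory SQLite database used as a stand-in for music.db.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(
        text("CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)")
    )

# The inspector reports schema metadata without mapping any classes.
inspector = inspect(engine)
table_names = inspector.get_table_names()
column_names = [col["name"] for col in inspector.get_columns("Artist")]

print(table_names)
print(column_names)
```

One reason to reach for the inspector: automap only generates classes for tables that have a primary key, and it tends to treat pure association tables as relationships rather than classes, so `Base.classes.keys()` may show fewer names than the database has tables.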

In [3]:
## One can use the engine object to query data even without using Automap
data = engine.execute("Select * from Artist limit 5;")

In [4]:
data.fetchall()

[(1, 'AC/DC'),
 (2, 'Accept'),
 (3, 'Aerosmith'),
 (4, 'Alanis Morissette'),
 (5, 'Alice In Chains')]

In [5]:
engine.execute("Select * from Employee limit 5;").fetchall()

[(1, 'Adams', 'Andrew', 'General Manager', None, '1962-02-18 00:00:00', '2002-08-14 00:00:00', '11120 Jasper Ave NW', 'Edmonton', 'AB', 'Canada', 'T5K 2N1', '+1 (780) 428-9482', '+1 (780) 428-3457', 'andrew@chinookcorp.com'),
 (2, 'Edwards', 'Nancy', 'Sales Manager', 1, '1958-12-08 00:00:00', '2002-05-01 00:00:00', '825 8 Ave SW', 'Calgary', 'AB', 'Canada', 'T2P 2T3', '+1 (403) 262-3443', '+1 (403) 262-3322', 'nancy@chinookcorp.com'),
 (3, 'Peacock', 'Jane', 'Sales Support Agent', 2, '1973-08-29 00:00:00', '2002-04-01 00:00:00', '1111 6 Ave SW', 'Calgary', 'AB', 'Canada', 'T2P 5M5', '+1 (403) 262-3443', '+1 (403) 262-6712', 'jane@chinookcorp.com'),
 (4, 'Park', 'Margaret', 'Sales Support Agent', 2, '1947-09-19 00:00:00', '2003-05-03 00:00:00', '683 10 Street SW', 'Calgary', 'AB', 'Canada', 'T2P 5G3', '+1 (403) 263-4423', '+1 (403) 263-4289', 'margaret@chinookcorp.com'),
 (5, 'Johnson', 'Steve', 'Sales Support Agent', 2, '1965-03-03 00:00:00', '2003-10-17 00:00:00', '7727B 41 Ave', 'Cal
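Note that `engine.execute()` belongs to the SQLAlchemy 1.x API and was removed in SQLAlchemy 2.0. A version-independent way to run raw SQL is to open a connection explicitly and wrap the statement in `text()`. The sketch below assumes a small in-memory `Artist` table with three sample rows instead of the real `music.db`, so it can run anywhere.

```python
from sqlalchemy import create_engine, text

# In-memory SQLite database standing in for music.db.
engine = create_engine("sqlite://")

# engine.begin() opens a connection and commits when the block exits cleanly.
with engine.begin() as conn:
    conn.execute(
        text("CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)")
    )
    conn.execute(
        text("INSERT INTO Artist (ArtistId, Name) VALUES (:id, :name)"),
        [
            {"id": 1, "name": "AC/DC"},
            {"id": 2, "name": "Accept"},
            {"id": 3, "name": "Aerosmith"},
        ],
    )

# engine.connect() is enough for read-only queries.
with engine.connect() as conn:
    result = conn.execute(
        text(
            "SELECT ArtistId, Name FROM Artist "
            "WHERE ArtistId <= :max_id ORDER BY ArtistId"
        ),
        {"max_id": 2},  # bound parameter, never string-formatted into the SQL
    )
    # Materialize the rows before the connection closes.
    rows = [tuple(row) for row in result]

print(rows)
```

Passing values as bound parameters (`:max_id`) instead of building the SQL string by hand lets the driver handle quoting and avoids SQL injection.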

In [6]:
## With automap, one can also access metadata such as column names
Artist = Base.classes.Artist
Album = Base.classes.Album

In [7]:
Artist.__table__.columns.keys()

['ArtistId', 'Name']

In [8]:
## One can also use the ORM query API (a very popular method when developing web applications).
## Feel free to read more about query types here: https://docs.sqlalchemy.org/en/14/orm/tutorial.html#querying
from sqlalchemy.orm import Session
session = Session(engine)
for artist in session.query(Artist).limit(10):
    print(artist.ArtistId, artist.Name)

1 AC/DC
2 Accept
3 Aerosmith
4 Alanis Morissette
5 Alice In Chains
6 Antônio Carlos Jobim
7 Apocalyptica
8 Audioslave
9 BackBeat
10 Billy Cobham
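The same query API also supports filtering and sorting. The following sketch is an illustration rather than part of the original notebook: it declares a small `Artist` model by hand (instead of using automap) on an in-memory database with four sample rows, and it assumes SQLAlchemy 1.4 or newer for `sqlalchemy.orm.declarative_base` and for using `Session` as a context manager.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

# Named ModelBase to avoid clashing with the automap Base used above.
ModelBase = declarative_base()


class Artist(ModelBase):
    """Hand-written model mirroring the Artist table's two columns."""

    __tablename__ = "Artist"
    ArtistId = Column(Integer, primary_key=True)
    Name = Column(String)


engine = create_engine("sqlite://")  # in-memory stand-in for music.db
ModelBase.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all(
        [
            Artist(ArtistId=1, Name="AC/DC"),
            Artist(ArtistId=2, Name="Accept"),
            Artist(ArtistId=3, Name="Aerosmith"),
            Artist(ArtistId=9, Name="BackBeat"),
        ]
    )
    session.commit()

    # filter() adds a WHERE clause, order_by() an ORDER BY, limit() a LIMIT.
    query = (
        session.query(Artist)
        .filter(Artist.Name.like("A%"))
        .order_by(Artist.Name.desc())
        .limit(2)
    )
    names = [artist.Name for artist in query]

print(names)
```

With an automapped class such as `Base.classes.Artist`, the `filter`, `order_by` and `limit` calls are written the same way.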


In [8]:
### What if you want the data in a pandas DataFrame?
import pandas as pd
df_sql = pd.read_sql("Select * from artist;", engine)

In [9]:
df_sql

| | ArtistId | Name |
|---|---|---|
| 0 | 1 | AC/DC |
| 1 | 2 | Accept |
| 2 | 3 | Aerosmith |
| 3 | 4 | Alanis Morissette |
| 4 | 5 | Alice In Chains |
| ... | ... | ... |
| 270 | 271 | Mela Tenenbaum, Pro Musica Prague & Richard Kapp |
| 271 | 272 | Emerson String Quartet |
| 272 | 273 | C. Monteverdi, Nigel Rogers - Chiaroscuro; Lon... |
| 273 | 274 | Nash Ensemble |
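`Select *` pulls the whole table into memory, which is fine for a few hundred rows. For larger tables it usually helps to push the filtering into SQL or to read the result in chunks. The sketch below assumes a four-row in-memory `Artist` table and a plain `sqlite3` connection (pandas accepts these as well as SQLAlchemy engines); note that the `?` placeholder style is specific to the `sqlite3` driver.

```python
import sqlite3

import pandas as pd

# Small in-memory table standing in for the real Artist table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Artist VALUES (?, ?)",
    [(1, "AC/DC"), (2, "Accept"), (3, "Aerosmith"), (4, "Alanis Morissette")],
)
conn.commit()

# Filter in SQL and pass the value as a bound parameter.
df_filtered = pd.read_sql(
    "SELECT ArtistId, Name FROM Artist WHERE ArtistId <= ? ORDER BY ArtistId",
    conn,
    params=(2,),
)
print(df_filtered)

# With chunksize, read_sql returns an iterator of DataFrames instead of one.
chunk_sizes = [
    len(chunk)
    for chunk in pd.read_sql("SELECT * FROM Artist", conn, chunksize=2)
]
print(chunk_sizes)
```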


## Creating a database and ingesting flat files as tables

The second task that Python handles well alongside SQL is creating databases and the tables within them.

We will learn the following:

1. Ingesting a single table into a database from a flat file
2. Ingesting multiple related tables into a database from flat files
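As a rough preview of both tasks, here is one possible sketch using pandas and `sqlite3`. It is not necessarily the approach taken later in this material. The CSV contents are made-up sample data held in `io.StringIO` objects, which stand in for real file paths, and the table and column names are assumptions modelled on the music database above.

```python
import io
import sqlite3

import pandas as pd

# Made-up flat files; real code would pass file paths to pd.read_csv.
artist_csv = io.StringIO("ArtistId,Name\n1,AC/DC\n2,Accept\n")
album_csv = io.StringIO(
    "AlbumId,Title,ArtistId\n"
    "1,Sample Album One,1\n"
    "2,Sample Album Two,2\n"
    "3,Sample Album Three,2\n"
)

conn = sqlite3.connect(":memory:")

# 1. A single table from a flat file.
pd.read_csv(artist_csv).to_sql("Artist", conn, index=False, if_exists="replace")

# 2. A second table that refers to the first through the ArtistId column.
pd.read_csv(album_csv).to_sql("Album", conn, index=False, if_exists="replace")

# The shared column lets the two tables be joined.
joined = pd.read_sql(
    "SELECT ar.Name, COUNT(al.AlbumId) AS n_albums "
    "FROM Artist ar JOIN Album al ON ar.ArtistId = al.ArtistId "
    "GROUP BY ar.Name ORDER BY ar.Name",
    conn,
)
print(joined)
```

One caveat with this shortcut: `to_sql` infers column types from the DataFrame and does not declare primary or foreign keys, so the relationship between the tables exists only by convention. When the constraints matter, the tables can be created with explicit `CREATE TABLE` statements first and the rows appended afterwards.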