# SQL Connections

#### NOTE: We highly recommend exploring libraries built for your specific SQL engine. Search for your database name plus "python" and the top results should include a suitable API.

* [MySQL](https://www.google.com/search?q=mysql+python)
* [PostgreSQL](https://www.google.com/search?q=postgresql+python)
* [MS SQL Server](https://www.google.com/search?q=MSSQLserver+python)
* [Oracle](https://www.google.com/search?q=oracle+python)
* [MongoDB](https://www.google.com/search?q=mongodb+python)

Let's review pandas' capabilities using SQLite, which comes built in with Python.

## Example SQL Database (temporary in your RAM)

You will need to install sqlalchemy with:

    pip install sqlalchemy
    
to follow along. To understand how to make a connection to your own database, review https://docs.sqlalchemy.org/en/13/core/connections.html and the list of supported dialects at https://docs.sqlalchemy.org/en/14/dialects/
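
As a rough sketch of what a connection string looks like, SQLAlchemy URLs follow the pattern `dialect+driver://username:password@host:port/database`. The user, password, host, and database name below are hypothetical placeholders. `make_url` only parses the string, so no database or driver is needed to run this:

```python
from sqlalchemy.engine.url import make_url

# Hypothetical PostgreSQL URL; replace every part with your own details.
url = make_url('postgresql+psycopg2://my_user:my_password@localhost:5432/my_db')

print(url.drivername)      # dialect and driver
print(url.host, url.port)  # where the database server lives
print(url.database)        # which database to open
```

Passing the same string to `create_engine` would build an engine for that database, provided the driver named in the URL (here psycopg2) is installed.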

___

In [4]:
import pandas as pd
import numpy as np

We'll use sqlalchemy to connect to the SQL engine, starting with a temporary database that lives in your RAM.

In [2]:
from sqlalchemy import create_engine

In [3]:
temp_db = create_engine('sqlite:///:memory:')

# This creates a temporary SQLite database that lives in your computer's RAM
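
One way to confirm that a fresh in-memory database really is empty is to list its tables. This is a minimal sketch that assumes SQLAlchemy's `inspect` helper; it builds its own engine so it can run on its own:

```python
from sqlalchemy import create_engine, inspect

# A brand-new in-memory SQLite database
temp_db = create_engine('sqlite:///:memory:')

# Ask the engine which tables exist; nothing has been written yet
table_names = inspect(temp_db).get_table_names()
print(table_names)
```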

In [5]:
# this temporary database is currently empty

df = pd.DataFrame(data=np.random.randint(low=1, high=100, size=(4, 4)), columns=['a', 'b', 'c', 'd'])

In [6]:
df

| | a | b | c | d |
|---|---|---|---|---|
| 0 | 88 | 76 | 59 | 39 |
| 1 | 88 | 46 | 36 | 28 |
| 2 | 26 | 30 | 1 | 31 |
| 3 | 25 | 25 | 75 | 53 |


In [8]:
# create a new table in my database

df.to_sql(name='new_table', con=temp_db)

# The main parameters are name (the table name) and con (the connection).
# A third argument, 'if_exists', controls what happens when a table with that
# name already exists: the call can fail, overwrite the table, or append to it.

In [9]:
# Now, let's imagine I try to run this cell twice

df.to_sql(name='new_table', con=temp_db)

ValueError: Table 'new_table' already exists.

In [12]:
# if_exists : {'fail', 'replace', 'append'}, default 'fail'
#     How to behave if the table already exists.
#
# 'fail'    - raise a ValueError (the default, as seen above)
# 'replace' - drop the table in your SQL database, then write your dataframe as a new table
# 'append'  - insert the dataframe's rows into the existing table
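
To see the `if_exists` options in action, here is a small self-contained sketch. It uses its own throwaway engine and a tiny two-row dataframe (both made up for illustration) and counts the rows after each write:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///:memory:')
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# First write creates the table (index=False skips the extra index column)
df.to_sql(name='new_table', con=engine, index=False)

# 'append' adds the same two rows again
df.to_sql(name='new_table', con=engine, index=False, if_exists='append')
rows_after_append = len(pd.read_sql_query('SELECT * FROM new_table', con=engine))

# 'replace' drops the table and writes only the dataframe's rows
df.to_sql(name='new_table', con=engine, index=False, if_exists='replace')
rows_after_replace = len(pd.read_sql_query('SELECT * FROM new_table', con=engine))

print(rows_after_append, rows_after_replace)
```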

Now that this database exists and has a table in it, let's read information back from the SQL database. There are two main ways to do this.

- Read in an entire table at once. Avoid this for very large tables, because pandas holds dataframes in RAM: a 50 gigabyte table cannot be read in unless you have at least 50 gigabytes of RAM.

- Pass a SQL query to pandas. The database performs the query first, and pandas returns the results as a dataframe.
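
When a table is too large to load in one go, one option is the `chunksize` argument of `pd.read_sql_query`, which yields the result as a series of smaller dataframes. The sketch below assumes a hypothetical `big_table` with only ten rows, just to show the mechanics:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///:memory:')

# Hypothetical "large" table, kept tiny here for illustration
pd.DataFrame({'a': range(10)}).to_sql('big_table', con=engine, index=False)

total = 0
chunk_sizes = []
# With chunksize set, pandas returns an iterator of dataframes
for chunk in pd.read_sql_query('SELECT a FROM big_table', con=engine, chunksize=4):
    chunk_sizes.append(len(chunk))
    total += int(chunk['a'].sum())

print(chunk_sizes, total)
```

Each chunk can be aggregated or written elsewhere before the next one is fetched, so the full table never has to be held as a single dataframe.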

In [13]:
new_df = pd.read_sql(sql='new_table', con=temp_db)

In [14]:
new_df

| | index | a | b | c | d |
|---|---|---|---|---|---|
| 0 | 0 | 88 | 76 | 59 | 39 |
| 1 | 1 | 88 | 46 | 36 | 28 |
| 2 | 2 | 26 | 30 | 1 | 31 |
| 3 | 3 | 25 | 25 | 75 | 53 |
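
The extra `index` column appears because `to_sql` stores the dataframe's index as a regular column by default. If you want it back as the index, one approach is the `index_col` argument of `pd.read_sql`. A minimal sketch with a made-up two-column dataframe:

```python
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///:memory:')
df = pd.DataFrame({'a': [88, 26], 'b': [76, 30]})

# The dataframe index is written as a column named 'index'
df.to_sql(name='new_table', con=engine)

# Read the whole table and turn that column back into the index
restored = pd.read_sql(sql='new_table', con=engine, index_col='index')
print(restored)
```

Alternatively, passing `index=False` to `to_sql` avoids writing the index in the first place.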


In [15]:
pd.read_sql_query(sql='SELECT a, c FROM new_table', con=temp_db)

| | a | c |
|---|---|---|
| 0 | 88 | 59 |
| 1 | 88 | 36 |
| 2 | 26 | 1 |
| 3 | 25 | 75 |
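
Because the query runs inside the database, you can also filter rows there instead of loading everything into pandas. The sketch below is one way to do it, assuming SQLAlchemy's `text` construct with a named parameter (`min_a` is a hypothetical name) passed through the `params` argument. The table is rebuilt from the `a` and `c` values shown above so the example runs on its own:

```python
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine('sqlite:///:memory:')
pd.DataFrame({'a': [88, 88, 26, 25], 'c': [59, 36, 1, 75]}).to_sql(
    'new_table', con=engine, index=False
)

# :min_a is a named placeholder filled in from params
query = text('SELECT a, c FROM new_table WHERE a > :min_a')
filtered = pd.read_sql_query(sql=query, con=engine, params={'min_a': 50})
print(filtered)
```

Passing values through `params` rather than formatting them into the SQL string lets the database driver handle quoting.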
