## Connecting a MySQL database to a Python notebook
The following tutorial was prepared by Elizabeth Widman. Feedback is welcome at <e.a.widman1@gmail.com>.

* [Step 1: Import and install mysql.connector](#first-bullet)
* [Step 2: Connecting to your remote MySQL database](#second-bullet)
* [Step 3: Pull data into a pandas dataframe](#third-bullet)
* [Example of main program flow](#fourth-bullet)

### Step 1: Import and install mysql.connector <a class="anchor" id="first-bullet"></a>

I used the installer package, which lets you click through the installation in a GUI. See the Connector/Python documentation:

https://dev.mysql.com/doc/connector-python/en/

You may also need to install the connector inside your virtual environment. I used:

conda install -c anaconda mysql-connector-python=2.0.4
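
After installing, it can help to confirm that the notebook's environment actually sees the package. The following is a minimal sketch, assuming Python 3; `connector_installed` is a hypothetical helper name, not part of the connector itself.

```python
import importlib.util


def connector_installed(module_name="mysql.connector"):
    """Return True if module_name can be found in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when the parent package (here, "mysql") is itself missing.
        return False


if connector_installed():
    print("mysql.connector is available in this environment")
else:
    print("mysql.connector not found; install it into this environment first")
```

If the check fails inside the notebook but the package is installed elsewhere, the notebook kernel is probably running in a different environment.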

### Step 2: Connecting to your remote MySQL database <a class="anchor" id="second-bullet"></a>

Below is the script I used to connect to the remote database. You will need to edit it to use your own username, password, port, and so on. Note: you must also have a working connection to the database from the terminal. For example:

mysql --login-path=insight < script_name.sql

Modify the command above to match your own database connection. The database maintainer also needs to add your public SSH key to the database's access permissions.

Note: After some time the remote connection may time out, so if things are not working correctly in your notebook, be sure to check your connection in the terminal.
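
For timeouts on the notebook side, one option is to check the connection object before reusing it. This is a rough sketch, assuming `cnx` is a `mysql.connector` connection object, which provides `is_connected()` and `reconnect()`; the helper name `ensure_connected` and its default retry values are assumptions chosen for illustration.

```python
def ensure_connected(cnx, attempts=3, delay=2):
    """Reconnect cnx if the server has dropped it.

    Returns True if a reconnect was needed, False if the link was still alive.
    """
    if cnx.is_connected():
        return False
    # reconnect() retries up to `attempts` times, waiting `delay` seconds between tries.
    cnx.reconnect(attempts=attempts, delay=delay)
    return True
```

This only helps when the notebook's own connection has gone stale. If the terminal-side connection is down, it still has to be restored there first.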

In [1]:
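# Import mysql.connector and pandas into the notebook
import mysql.connector
from mysql.connector import errorcode  # makes sure mysql.connector.errorcode is loaded
import pandas as pd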

In [2]:
# Function connects to MySQL database
def connectMySQL(db_name):
    #configurations of your system
    config = {
            'user': 'insight',
            'password': 'insight',
            'host': '127.0.0.1',
            'port': '3307',
            'database': db_name
            }
    

    # open database connection
    cnx = None
    try:
        cnx = mysql.connector.connect(**config)    
    except mysql.connector.Error as err:
        if err.errno == mysql.connector.errorcode.ER_ACCESS_DENIED_ERROR:
            print("Something is wrong with your user name or password")
        elif err.errno == mysql.connector.errorcode.ER_BAD_DB_ERROR:
            print("Database does not exist")
        else:
            print(err)
        raise
    
    print('Connected to {} database'.format(db_name))
    return cnx
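
Hard-coding a username and password in a notebook makes it easy to share them by accident. As a hedged alternative, the config dictionary could be built from environment variables. The variable names (`MYSQL_USER`, `MYSQL_PASSWORD`, `MYSQL_HOST`, `MYSQL_PORT`) and the helper `config_from_env` are assumptions for illustration; the defaults mirror the values used in the function above.

```python
import os


def config_from_env(db_name, env=None):
    """Build a mysql.connector config dict from environment variables."""
    env = os.environ if env is None else env
    return {
        "user": env.get("MYSQL_USER", "insight"),
        "password": env.get("MYSQL_PASSWORD", ""),
        "host": env.get("MYSQL_HOST", "127.0.0.1"),
        # Environment values are strings, so convert the port explicitly.
        "port": int(env.get("MYSQL_PORT", "3307")),
        "database": db_name,
    }
```

The resulting dictionary can be passed on in the same way as before, with `mysql.connector.connect(**config_from_env(db_name))`.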

### Step 3: Pull data into a pandas dataframe <a class="anchor" id="third-bullet"></a>

In [3]:
#Function that takes a cursor object (created from the database connection) and a table name,
#selects that table from the database and pulls it into a pandas dataframe with column names

def df_with_headers(cur, table_name):
    #Run a SQL command on cursor object
    do_this="DESCRIBE {}".format(table_name)
    cur.execute(do_this)
    
    #The method .fetchall() fetches all (or all remaining) rows of a query result set 
    #and returns a list of tuples. 
    #If no more rows are available, it returns an empty list.
    
    #DESCRIBE returns one row per column: name, type, nullability, key, default value, extra
    stuff = cur.fetchall()
    #turn it into a pd dataframe
    df=pd.DataFrame(list(stuff))
    
    #Take the first column of df (the field names) as a list of column names
    col_names=[str(x) for x in list(df[0])]
    #join the names into a comma-separated string for the SELECT statement
    col_list=",".join(col_names)
    
    #Create an sql statement using the list of column names
    do_this2="SELECT {} FROM {}".format(col_list,table_name)
    
    #run the sql statement on the cursor object
    cur.execute(do_this2)
    
    #fetch remaining rows and return as tuples
    stuff2 = cur.fetchall()
    
    df2=pd.DataFrame(list(stuff2),columns=col_names)
    
    return df2
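
The column names can also be read from the cursor itself after a single `SELECT`, because DB-API cursors expose result metadata in `.description`. The sketch below is an alternative to the function above, not a replacement for it. It assumes Python 3 and uses the standard-library `sqlite3` driver as a stand-in so it runs without a MySQL server; `fetch_table` and the demo table are hypothetical.

```python
import re
import sqlite3  # stand-in DB-API driver so the sketch runs without a MySQL server


def fetch_table(cur, table_name):
    """Return (column_names, rows) for table_name using one SELECT."""
    # Table names cannot be passed as query parameters, so validate before formatting.
    if not re.fullmatch(r"[A-Za-z0-9_]+", table_name):
        raise ValueError("unexpected table name: {!r}".format(table_name))
    cur.execute("SELECT * FROM {}".format(table_name))
    # Each entry in cur.description describes one result column; item 0 is its name.
    column_names = [col[0] for col in cur.description]
    return column_names, cur.fetchall()


demo = sqlite3.connect(":memory:")
demo_cur = demo.cursor()
demo_cur.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
demo_cur.execute("INSERT INTO customers VALUES (1, 'Ada')")
columns, rows = fetch_table(demo_cur, "customers")
print(columns, rows)  # ['id', 'name'] [(1, 'Ada')]
demo.close()
```

With a MySQL cursor, the same pair can be turned into a dataframe with `pd.DataFrame(rows, columns=columns)`.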

### Example of main program flow  <a class="anchor" id="fourth-bullet"></a>

In [6]:
# Example of main program flow 

#cnx is a MySQLConnection object connected to the named database
cnx=connectMySQL('database_name')

#.cursor() instantiates objects that can execute operations such as SQL statements. 
# Cursor objects interact with the MySQL server using a MySQLConnection object.
cur = cnx.cursor()

#Pull out a table
customers = df_with_headers(cur, "table_name")

In [None]:
#Remember to close cursor and database connection when done
# done with cursor
cur.close()             

# done with database
cnx.close()
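
It is easy to forget the cleanup step, especially when a cell raises an error partway through. One possible pattern is to wrap the connection and cursor in `contextlib.closing`, which calls `.close()` automatically. This sketch assumes Python 3 and again uses `sqlite3` as a stand-in; `run_query` is a hypothetical helper, and `connect` is any zero-argument function that returns a connection.

```python
from contextlib import closing
import sqlite3  # stand-in driver; any object with .cursor() and .close() works


def run_query(connect, query):
    """Open a connection, run one query, and close everything afterwards."""
    with closing(connect()) as cnx:
        with closing(cnx.cursor()) as cur:
            cur.execute(query)
            return cur.fetchall()


rows = run_query(lambda: sqlite3.connect(":memory:"), "SELECT 1 + 1")
print(rows)  # [(2,)]
```

In this tutorial the first argument would be something like `lambda: connectMySQL('database_name')`.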