# IMPORTS

- pandas -> data manipulation
- sqlalchemy -> database connection
- pymysql -> MySQL driver used by sqlalchemy
- openpyxl -> handling excel files
- os -> to access environment variables
- dotenv -> to keep secret values in .env

In [1]:
import pandas as pd
from sqlalchemy import create_engine
import pymysql
import openpyxl
import os
from dotenv import load_dotenv

import mysql.connector

# Connect to local DB

Using the dotenv lib to keep my user/password out of the notebook so I don't accidentally commit them to GitHub, even though we are only working with a local Docker image.
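
A quick guard can catch a missing or misnamed entry in `.env` before the connection call fails with a less obvious error. This is only a sketch: `require_env` is a hypothetical helper, not part of dotenv, and the stand-in values exist only so the snippet runs on its own. In the notebook the values would come from `load_dotenv()`.

```python
import os


def require_env(*names):
    """Return the values of the given environment variables, or raise if any is unset or empty."""
    missing = [name for name in names if not os.getenv(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(os.environ[name] for name in names)


# Stand-in values (assumption) so the sketch is self-contained;
# setdefault leaves any real values already loaded from .env untouched.
os.environ.setdefault("DB_USER", "example_user")
os.environ.setdefault("DB_PASSWORD", "example_password")

db_user, db_password = require_env("DB_USER", "DB_PASSWORD")
```

Failing early here means a typo in `.env` shows up as a clear message instead of an authentication error from the database.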

## [First Attempt]
Welp, looks like there were several issues we had to deal with:
- First, the lib `mysql-connector` does not work nicely with the version of MySQL (v9.2) I am running. MySQL 8.0 and later default to a new authentication plugin (`caching_sha2_password`). Had to uninstall `mysql-connector` and install `mysql-connector-python`.
- Even after installing the new lib the issues persisted. I eventually closed everything, deleted my venv, and recreated it. That seemed to fix the issue.

In [9]:
load_dotenv()

db_user = os.getenv("DB_USER")
db_password = os.getenv("DB_PASSWORD")
db_host = "127.0.0.1"
db_name = "data_science"
engine = mysql.connector.connect(user=db_user, password=db_password, host=db_host, database=db_name)

# Run simple command to test connection
cur = engine.cursor()
cur.execute("SELECT CURDATE()")
row = cur.fetchone()
print("Current Date is: {0}".format(row[0]))

Current Date is: 2025-03-18


In [None]:
# Read csv file into a dataframe (forward slash avoids the invalid "\d" escape sequence)
df_csv = pd.read_csv("data/dog_license_2017_data.csv")
print(df_csv.head())

# Attempt to load dataframe into DB table
df_csv.to_sql(name="dog_license_2017_data", con=engine, if_exists="replace", index=False)

Let's try some simple `SELECT` statements.

In [None]:
cur.execute("SELECT * FROM dog_license_2017_data")
rows = cur.fetchall()

for row in rows:
    print(row)

In [11]:
cur.execute("SELECT * FROM dog_license_2017_data ORDER BY ValidDate")
row = cur.fetchone()

print(row)

('Dog Individual Spayed Female', 'LABRADOR RETRIEVER', 'BLACK', 'MARTICIA', 15101, 2017, None)


Welp, looks like more issues. `DataFrame.to_sql` only accepts a SQLAlchemy engine/connection or a `sqlite3` connection, so the raw `mysql.connector` connection will not work. Trying again with `sqlalchemy` and `pymysql`.
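
To see the supported path in isolation, here is a minimal sketch that writes a DataFrame through a `sqlite3` connection (the one raw DBAPI connection pandas accepts) and reads it back. The table name `dog_license_demo` and the two columns are made up for illustration and are not the real CSV schema.

```python
import sqlite3

import pandas as pd

# Small stand-in frame; column names are illustrative assumptions.
df = pd.DataFrame(
    {"Breed": ["LABRADOR RETRIEVER", "BEAGLE"], "ValidYear": [2017, 2017]}
)

# An in-memory SQLite database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
df.to_sql("dog_license_demo", con=conn, if_exists="replace", index=False)

# Read the table back to confirm the write succeeded.
df_back = pd.read_sql("SELECT * FROM dog_license_demo", conn)
print(df_back)
conn.close()
```

The same `to_sql` / `read_sql` round trip should work against MySQL once `con` is a SQLAlchemy engine rather than a driver-level connection.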

## [Second Attempt]

In [None]:
import pandas as pd
import pymysql
from sqlalchemy import create_engine

load_dotenv()

db_user = os.getenv("DB_USER")
db_password = os.getenv("DB_PASSWORD")
db_host = "127.0.0.1"
db_name = "data_science"

# Build the connection URL with an f-string instead of string concatenation
engine = create_engine(f"mysql+pymysql://{db_user}:{db_password}@{db_host}/{db_name}")
df_csv.to_sql("dog_license_2017_data", con=engine, if_exists="replace", index=False)
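
One thing to watch with a hand-built URL: a password containing characters such as `@` or `/` will be misparsed unless it is percent-encoded first. A minimal sketch using the standard library's `quote_plus` follows; `build_mysql_url` is a hypothetical helper and the credentials are placeholders, not the real ones.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, database):
    """Build a mysql+pymysql URL, percent-encoding the user and password."""
    return f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}@{host}/{database}"


# Placeholder credentials (assumption) with characters that need escaping.
url = build_mysql_url("example_user", "p@ss/word", "127.0.0.1", "data_science")
print(url)  # mysql+pymysql://example_user:p%40ss%2Fword@127.0.0.1/data_science
```

The resulting string can be passed to `create_engine` in place of the concatenated one.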