# CORE: Books
Jude Maico Jr.

Consider the following "flat" file that a start-up has just started using for its first customers: Client's Original File. They quickly realized that saving this information in .csv format will not meet their needs as they grow. First, consider how you would design a relational database to meet their needs. Be sure to consider conventions of normalization and what information should be separated.

# Part 1: Design an ERD
Create an ERD (figure out how many tables to include and the relationships between them) to represent a database that tracks users and their favorite books. Here are some considerations as you design the database:

- For the purposes of this assignment, you may assume that each book only has one author (or that we are only tracking the primary author), but that the same author may have written multiple books.
- Each user should have a first name, last name, and email.
- We will be saving a list of each user's favorite books.
- Each book should have a title and an author. (The author's whole name can be one attribute.)
- Note that each user will have multiple favorite books, and a book could certainly be the favorite of many users.
- Use MySQL Workbench to design the ERD.
- Hint: When you link two tables with a many-to-many relationship, MySQL Workbench will automatically create a joiner table for you! It will also automatically mark the foreign keys as primary keys, which you will want to uncheck.
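
As a rough illustration of where the considerations above might lead, the sketch below writes one possible schema as DDL and runs it against an in-memory SQLite database from Python's standard library, so it can be tried without a MySQL server. The table and column names (`users`, `authors`, `books`, `favorites`, `author_id`, and so on) are assumptions chosen for illustration; your ERD may name things differently, and MySQL Workbench would generate MySQL-specific DDL rather than this SQLite dialect.

```python
import sqlite3

# Illustrative schema only: names and types are assumptions,
# not the assignment's required answer.
SCHEMA = """
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    email      TEXT NOT NULL
);
CREATE TABLE authors (
    id          INTEGER PRIMARY KEY,
    author_name TEXT NOT NULL
);
CREATE TABLE books (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES authors(id)
);
-- Joiner table: only foreign keys, no primary key of its own.
CREATE TABLE favorites (
    user_id INTEGER NOT NULL REFERENCES users(id),
    book_id INTEGER NOT NULL REFERENCES books(id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that were created, in alphabetical order.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(table_names)
```

The one-to-many relationship (an author has many books) lives in `books.author_id`, while the many-to-many relationship between users and books is carried entirely by the `favorites` joiner table.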

<img src='books.png'>

# Part 2: Create the database in Python
Continue working in Jupyter Notebook with the ERD image.

Rather than creating the database in MySQL Workbench with forward engineering, we are going to develop our Python skills by creating the database in Python with PyMySQL, which you practiced in the "MySQL with Python" lesson.

Note that working with MySQL via Python will be a required component of the belt exam, so getting comfortable with it now will help prepare you!

You will need to create a connection. This time, you may wish to call the database "books".

Normally, you would have to take the time to transform the original .csv file from your client into the appropriate normalized tables. However, for this task, the transformation steps have been completed for you and you are provided a .csv for each table you will need. (Note that you will be learning and practicing efficient ways to make these transformations next week!)

The four files you will need to add as tables to your database are:

1. users

2. books

3. authors

4. favorites

Note that these files may not perfectly match the schema you designed. Notice how they differ, but move forward with these tables even if they are not exactly the same as your original plan. (Notably, the provided tables do not have created_at and updated_at attributes.)
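
One quick way to see how the provided files differ from a plan is to compare column sets. The sketch below is only an illustration: the `planned` columns are a hypothetical design, and the `provided` columns are typed in by hand to match the .csv files shown later in this notebook. In practice you could build `provided` from each DataFrame's `columns` attribute instead.

```python
# Hypothetical planned schema (an assumption for illustration).
planned = {
    "users": {"id", "first_name", "last_name", "email", "created_at", "updated_at"},
    "books": {"id", "title", "author_id", "created_at", "updated_at"},
}

# Columns actually present in the provided files.
provided = {
    "users": {"id", "first_name", "last_name", "email"},
    "books": {"id", "title", "author_id"},
}

# For each table, list the planned columns that the provided file lacks.
missing = {
    table: sorted(planned[table] - provided[table])
    for table in planned
}
print(missing)
```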

Once you have added these tables to your database, it is available to query from MySQL Workbench OR in your Jupyter Notebook using SQLAlchemy!

## Testing the Database
After creating your 4 tables, you should run the "SHOW TABLES;" query in your notebook.

As a final step in this task, write a query at the end of your Jupyter Notebook to list the titles of all of John Doe's favorite books. Note that the exact SQL syntax will depend on how you named your tables.

In [1]:
import pandas as pd
import pymysql
pymysql.install_as_MySQLdb()
from urllib.parse import quote_plus as urlquote
from sqlalchemy import create_engine
from sqlalchemy_utils import create_database, database_exists

In [2]:
connection = 'mysql+pymysql://root:DataRespT1229@localhost/books'
engine = create_engine(connection)
engine

Engine(mysql+pymysql://root:***@localhost/books)
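
The `quote_plus` import above (aliased as `urlquote`) is useful when a password contains characters such as `@` or `/`, which would otherwise be misread as part of the URL. The following is a minimal sketch of that idea; the helper name `build_connection_url` and the example credentials are hypothetical and are not part of the assignment.

```python
from urllib.parse import quote_plus


def build_connection_url(username, password, host, database):
    """Return a SQLAlchemy-style MySQL URL with the credentials percent-encoded."""
    return (
        f"mysql+pymysql://{quote_plus(username)}:{quote_plus(password)}"
        f"@{host}/{database}"
    )


# Hypothetical credentials, for illustration only.
example_url = build_connection_url("root", "p@ss/word!", "localhost", "books")
print(example_url)
```

The resulting string can be passed to `create_engine` in the same way as the hand-written connection string above.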

In [3]:
if database_exists(connection):
    print('It exists!')
else:
    create_database(connection)
    print("Database created!")

Database created!


In [4]:
# users
users = pd.read_csv('users.csv')
users

|   | id | first_name | last_name | email |
|---|----|------------|-----------|-------|
| 0 | 1 | John | Doe | JD@books.com |
| 1 | 2 | Robin | Smith | Robin@books.com |
| 2 | 3 | Gloria | Rodriguez | grodriquez@books.com |


In [5]:
# books
books = pd.read_csv('books.csv')
books

|   | id | title | author_id |
|---|----|-------|-----------|
| 0 | 1 | The Shining | 1 |
| 1 | 2 | It | 1 |
| 2 | 3 | The Great Gatsby | 2 |
| 3 | 4 | The Call of the Wild | 3 |
| 4 | 5 | Pride and Prejudice | 4 |
| 5 | 6 | Frankenstein | 5 |


In [6]:
# authors
authors = pd.read_csv('authors.csv')
authors

|   | id | author_name |
|---|----|-------------|
| 0 | 1 | Stephen King |
| 1 | 2 | F.Scott Fitgerald |
| 2 | 3 | Jack London |
| 3 | 4 | Jane Austen |
| 4 | 5 | Mary Shelley |


In [7]:
# favorites
favorites = pd.read_csv('favorites.csv')
favorites

|   | user_id | book_id |
|---|---------|---------|
| 0 | 1 | 1 |
| 1 | 1 | 2 |
| 2 | 1 | 3 |
| 3 | 2 | 4 |
| 4 | 2 | 5 |
| 5 | 3 | 5 |
| 6 | 3 | 6 |


In [8]:
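# Add each DataFrame to the database as a table.
# index=False keeps the DataFrame's row index from being written as an extra column.
users.to_sql('users', engine, if_exists='replace', index=False)
books.to_sql('books', engine, if_exists='replace', index=False)
authors.to_sql('authors', engine, if_exists='replace', index=False)
favorites.to_sql('favorites', engine, if_exists='replace', index=False)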

7

In [9]:
# check that all four tables show up
q = """
SHOW TABLES;
"""
pd.read_sql(q, engine)

|   | Tables_in_books |
|---|-----------------|
| 0 | authors |
| 1 | books |
| 2 | favorites |
| 3 | users |


In [3]:
# titles of all of John Doe's favorite books
q2 = '''
SELECT b.title
FROM books AS b
JOIN favorites AS f ON b.id = f.book_id
WHERE f.user_id = (
    SELECT u.id
    FROM users AS u
    WHERE u.first_name = 'John' AND u.last_name = 'Doe'
);
'''
pd.read_sql(q2, engine)

|   | title |
|---|-------|
| 0 | The Shining |
| 1 | It |
| 2 | The Great Gatsby |
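
The same result can also be reached without a subquery by joining `users`, `favorites`, and `books` directly. The sketch below is an illustration rather than the assignment's required answer: it rebuilds the three tables shown above in an in-memory SQLite database from the standard library, so it runs without a MySQL server. The `?` placeholders are SQLite's parameter style; the join itself is plain SQL that should carry over to MySQL, though that is not tested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, first_name TEXT, last_name TEXT, email TEXT);
CREATE TABLE books (id INTEGER, title TEXT, author_id INTEGER);
CREATE TABLE favorites (user_id INTEGER, book_id INTEGER);
""")

# Same rows as the DataFrames displayed earlier in this notebook.
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
    (1, "John", "Doe", "JD@books.com"),
    (2, "Robin", "Smith", "Robin@books.com"),
    (3, "Gloria", "Rodriguez", "grodriquez@books.com"),
])
conn.executemany("INSERT INTO books VALUES (?, ?, ?)", [
    (1, "The Shining", 1), (2, "It", 1), (3, "The Great Gatsby", 2),
    (4, "The Call of the Wild", 3), (5, "Pride and Prejudice", 4),
    (6, "Frankenstein", 5),
])
conn.executemany("INSERT INTO favorites VALUES (?, ?)", [
    (1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (3, 5), (3, 6),
])

# Walk from the user, through the joiner table, to the books.
join_query = """
SELECT b.title
FROM users AS u
JOIN favorites AS f ON f.user_id = u.id
JOIN books AS b ON b.id = f.book_id
WHERE u.first_name = ? AND u.last_name = ?
ORDER BY b.id;
"""
titles = [row[0] for row in conn.execute(join_query, ("John", "Doe"))]
print(titles)
```

Unlike the `=` subquery form, the join does not raise an error if two users happen to share the same name; it simply returns the favorites of both.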


# Part 3: Exporting the database and committing to GitHub
Now that you've created your database and verified that it works, open MySQL Workbench and use the Export Database tool to save the .sql file for your database in your assignment repository.