# READ THIS BEFORE RUNNING THIS NOTEBOOK.
## Create a Postgres DB Named 'jeoparty'
1. Use pgAdmin to create a PostgreSQL database named 'jeoparty'.
2. Open the Query Tool for the 'jeoparty' database and run the SQL script 'jeo_tables.sql'.
   - This file is in the same folder as this notebook.
   - The script creates the table that will be populated with data pulled from the JSON file.
   - Table names, table columns, and dataframe columns are all lowercase.
   - PostgreSQL folds unquoted identifiers to lowercase, so any name containing capital letters would have to be double-quoted in every query.
3. Add your PostgreSQL username and password (the ones you use to connect in pgAdmin) to 'jeoparty_passwords.py' before running any cells in this notebook.
   - This file is also in the same folder as this notebook.
   - In the repository, both values default to 'postgres'.
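
The lowercase naming rule above can be enforced in code before any dataframe is written to the database. The sketch below is one possible way to do it; the helper name `to_pg_identifier` and the sample column names are hypothetical and are not part of this repository.

```python
import re

def to_pg_identifier(name: str) -> str:
    """Return a lowercase, underscore-separated version of a column name."""
    # Replace each run of non-word characters (spaces, punctuation) with one underscore.
    cleaned = re.sub(r"\W+", "_", name.strip())
    # Drop leading/trailing underscores, then lowercase the result.
    return cleaned.strip("_").lower()

# Hypothetical column names, used only to illustrate the helper.
columns = ["Name", "Games Played", "total_winnings"]
normalized = [to_pg_identifier(col) for col in columns]
print(normalized)  # ['name', 'games_played', 'total_winnings']
```

With a dataframe, the same helper could be applied with `df.columns = [to_pg_identifier(c) for c in df.columns]`.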

In [1]:
#Dependencies
import pandas as pd
from sqlalchemy import create_engine
from jeoparty_passwords import jeo_username
from jeoparty_passwords import jeo_password
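
For reference, the two imports above only require that 'jeoparty_passwords.py' defines the names `jeo_username` and `jeo_password`. A minimal sketch of that file, assuming the default values described at the top of this notebook, might look like this (the actual file in the repository may contain additional comments or settings):

```python
# jeoparty_passwords.py (sketch; the real file ships with the repository)
# Replace both values with your own PostgreSQL credentials.
jeo_username = "postgres"
jeo_password = "postgres"
```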

In [2]:
# JSON file containing information for the 1000 winningest Jeopardy! contestants, pulled from
# https://cluebase.readthedocs.io/en/latest/# , a Jeopardy! API with excellent documentation.
contestant_file = "../Resources/contestants1000.json"

In [3]:
#Read the JSON with pandas.read_json and keep only the 'data' column (a Series holding one dict per contestant).
player_df = pd.read_json(contestant_file, orient='values')['data']
player_df.head(3)

0    {'id': 208, 'name': 'Ken Jennings', 'notes': '...
1    {'id': 75, 'name': 'James Holzhauer', 'notes':...
2    {'id': 204, 'name': 'David Madden', 'notes': '...
Name: data, dtype: object
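
To make the shape of the file clearer, here is a small sketch that pulls the 'data' list out of a response using only the standard library. The inline sample is an assumption about the file's layout: the 'data' key and the contestant values come from the output above, while the "status" wrapper key is assumed and the real file may differ.

```python
import json

# Assumed layout: a top-level object whose 'data' key holds a list of contestants.
# The "status" key is an assumption for illustration only.
sample_response = """
{
  "status": "success",
  "data": [
    {"id": 208, "name": "Ken Jennings", "games_played": 94, "total_winnings": 2522700},
    {"id": 75, "name": "James Holzhauer", "games_played": 33, "total_winnings": 2464216}
  ]
}
"""

payload = json.loads(sample_response)
records = payload["data"]  # list of dicts, one per contestant
print(len(records), records[0]["name"])  # 2 Ken Jennings
```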

In [4]:
#pandas.json_normalize is used to 'flatten' the json in the dataframe.
#'contestant_id' will be used as the index.
player_df = pd.json_normalize(player_df).set_index('id') 
player_df.index.names = ['contestant_id']
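
As an illustration of what the cell above does, the sketch below runs the same two steps on a small hand-written list of records. The variable names `records` and `flat_df` are hypothetical; the values are taken from this notebook's own output. Had the records contained nested objects, `json_normalize` would have expanded them into dot-separated column names.

```python
import pandas as pd

# Two contestant records, copied from the output shown in this notebook.
records = [
    {"id": 208, "name": "Ken Jennings", "games_played": 94, "total_winnings": 2522700},
    {"id": 75, "name": "James Holzhauer", "games_played": 33, "total_winnings": 2464216},
]

# Turn the list of dicts into columns, then use 'id' as the index.
flat_df = pd.json_normalize(records).set_index("id")
# Rename the index so it matches the lowercase column name used in the database.
flat_df.index.name = "contestant_id"
print(flat_df)
```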

In [5]:
#Cleaned and stunningly beautiful dataframe.
player_df.head(3)

contestant_id,name,notes,games_played,total_winnings
208,Ken Jennings,"a software engineer from Salt Lake City, Utah",94,2522700
75,James Holzhauer,"a professional sports gambler from Las Vegas, ...",33,2464216
204,David Madden,"a student originally from Ridgewood, New Jersey",29,432400


In [6]:
# Create Connection to 'jeoparty' postgres database
# This connection string will use information from the 'jeoparty_passwords.py' file.
connection_string = f"{jeo_username}:{jeo_password}@localhost:5432/jeoparty"
engine = create_engine(f'postgresql://{connection_string}')
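
One caveat with building the connection string by hand: a password containing characters such as '@' or '/' would break the URL. A possible workaround, sketched below, is to percent-encode the credentials with the standard library first. The helper name `build_postgres_url` and the sample password are hypothetical; the host, port, and database defaults mirror the cell above.

```python
from urllib.parse import quote_plus

def build_postgres_url(username, password, host="localhost", port=5432, database="jeoparty"):
    """Build a PostgreSQL URL with percent-encoded credentials."""
    # quote_plus escapes characters that would otherwise be read as URL delimiters.
    return f"postgresql://{quote_plus(username)}:{quote_plus(password)}@{host}:{port}/{database}"

# Hypothetical password, chosen only to show the escaping.
print(build_postgres_url("postgres", "p@ss/word"))
# postgresql://postgres:p%40ss%2Fword@localhost:5432/jeoparty
```

The resulting string could then be passed to `create_engine` in place of the f-string above.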

In [7]:
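# Confirm that the 'contestants' table exists.
# Engine.table_names() was removed in SQLAlchemy 2.0; the inspector works in 1.4 and later.
from sqlalchemy import inspect
inspect(engine).get_table_names()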

['categories', 'contestants']

In [8]:
#Append data in dataframe to 'contestants' table
player_df.to_sql(name='contestants', con=engine, if_exists='append', index=True)
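
Because `if_exists='append'` adds rows to whatever is already in the table, running the cell above a second time will either duplicate rows or fail on a key constraint, depending on how 'jeo_tables.sql' defines the table. The sketch below shows the append-and-verify pattern against an in-memory SQLite database so it can run without PostgreSQL. The table schema here is an assumption that mirrors the dataframe columns; it is not copied from 'jeo_tables.sql'.

```python
import sqlite3
import pandas as pd

# In-memory SQLite stands in for the 'jeoparty' PostgreSQL database.
conn = sqlite3.connect(":memory:")
# Assumed schema for illustration; the real one lives in 'jeo_tables.sql'.
conn.execute(
    """CREATE TABLE contestants (
           contestant_id INTEGER PRIMARY KEY,
           name TEXT,
           notes TEXT,
           games_played INTEGER,
           total_winnings INTEGER
       )"""
)

# Two rows copied from the dataframe preview in this notebook.
demo_df = pd.DataFrame(
    {
        "name": ["Ken Jennings", "David Madden"],
        "notes": [
            "a software engineer from Salt Lake City, Utah",
            "a student originally from Ridgewood, New Jersey",
        ],
        "games_played": [94, 29],
        "total_winnings": [2522700, 432400],
    },
    index=pd.Index([208, 204], name="contestant_id"),
)

# Append to the existing table; the index is written as the 'contestant_id' column.
demo_df.to_sql(name="contestants", con=conn, if_exists="append", index=True)

# Read the row count back to confirm the load.
row_count = conn.execute("SELECT COUNT(*) FROM contestants").fetchone()[0]
print(row_count)  # 2
```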