# US Presidential Election Analysis: Electoral College, Popular Vote, or Both?

## Objective
This notebook is the third step in a larger effort to analyze historical US presidential election data. It builds Data Mart (DM) objects that combine the data from steps 1 and 2 and will power the dashboard(s).
1. [ ] Create Data Mart object(s) from the tables available in the `elections` Postgres DB
2. [ ] Write the DM object(s) to CSV file(s) for import into Google Data Studio, Superset, or other BI tools

### Notes
- I've been unable to connect my local Postgres database to Google Data Studio because of an unidentified network issue. This notebook therefore writes the DM object(s) to CSV file(s) instead of writing them back to the Postgres database, where the BI tool could not reach them.

## 1. Setup

### 1.1 Import Modules

In [5]:
from db_tools import DBC
import getpass
import pandas as pd

### 1.2 Set Parameters 

In [49]:
# Define parameters for the database connection, schema name, data mart query, and output CSV file
# Prompt for the password with getpass so it is not echoed or stored in the notebook
db_config = {
    'user': 'postgres',
    'host': 'localhost',
    'port': '5432',
    'dbname': 'elections',
    'password': getpass.getpass()
}
# Data Warehouse schema
schema = 'dwh'
# Query that defines data mart object
dm_sql = f"""
select v.year as votes__year
    , v.state as votes__state
    , v.is_total as votes__is_total
    , v.total_electoral_votes
        as votes__total_electoral_votes
    , v.president_electoral_votes
        as votes__president_electoral_votes
    , v.president_electoral_rank
        as votes__president_electoral_rank
    , vs.region as votes__state__region
    , vs.division as votes__state__division
    , round((vs.area_land + vs.area_water) / 1e6, 2)
        as votes__state__area
    , c.name as candidate__name
    , c.party as candidate__party
    , c.party_2 as candidate__party_2
    , c.state as candidate__state
    , c.state_2 as candidate__state_2
from {schema}.votes v 
left join {schema}.state vs on v.state = vs.state
join {schema}.candidate c on v.candidate_id = c.candidate_id 
"""
csv_file = "presidential_election_dm_object.csv"
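
The query keeps every row of `votes` through the left join to `state`, so a vote row with no matching state (for example, a null `state`) gets nulls in the `votes__state__*` columns. The inner join to `candidate` drops any vote row without a matching candidate. The sketch below mimics that behavior in pandas; the rows are made up and only the join logic mirrors `dm_sql`.

```python
import pandas as pd

# Hypothetical toy rows, not taken from the elections database
votes = pd.DataFrame({
    "year": [2000, 2000, 2000],
    "state": ["A", None, "B"],
    "candidate_id": [1, 1, 99],  # 99 has no match in candidate
})
state = pd.DataFrame({"state": ["A", "B"], "region": [2, 3]})
candidate = pd.DataFrame({"candidate_id": [1], "name": ["Candidate X"]})

dm = (
    votes
    .merge(state, on="state", how="left")              # keep all vote rows
    .merge(candidate, on="candidate_id", how="inner")  # drop unmatched candidates
)

# region is now a float column because the unmatched row holds NaN
print(dm)
```

One difference to keep in mind: pandas matches null keys to each other in a merge, whereas SQL never matches null to null. It makes no difference here because the toy `state` table has no null key.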



## 2. Create Data Mart Object as CSV file

### 2.1 Define DB Connection

In [48]:
dbc = DBC(db_config)

### 2.2 Build DM Object

In [None]:
df_dm = dbc.select_query_to_df(dm_sql, close=True)
df_dm.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4110 entries, 0 to 4109
Data columns (total 14 columns):
 #   Column                            Non-Null Count  Dtype  
---  ------                            --------------  -----  
 0   votes__year                       4110 non-null   int64  
 1   votes__state                      4028 non-null   object 
 2   votes__is_total                   4110 non-null   bool   
 3   votes__total_electoral_votes      4110 non-null   int64  
 4   votes__president_electoral_votes  4110 non-null   int64  
 5   votes__president_electoral_rank   4110 non-null   int64  
 6   votes__state__region              4028 non-null   float64
 7   votes__state__division            4028 non-null   float64
 8   votes__state__area                4028 non-null   float64
 9   candidate__name                   4110 non-null   object 
 10  candidate__party                  3399 non-null   object 
 11  candidate__party_2                234 non-null    object 
 12  candid
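
`DBC.select_query_to_df` comes from the local `db_tools` module, which is not shown in this notebook. As a rough illustration of what such a helper presumably does, the sketch below runs a query and loads the result into a DataFrame with `pandas.read_sql_query`, using an in-memory SQLite database as a stand-in for Postgres. The helper name `query_to_df`, the table, and the rows are all hypothetical.

```python
import sqlite3
import pandas as pd

def query_to_df(conn, sql, close=False):
    """Hypothetical stand-in for DBC.select_query_to_df: run sql, return a DataFrame."""
    try:
        return pd.read_sql_query(sql, conn)
    finally:
        if close:
            conn.close()

# In-memory SQLite database with made-up rows, standing in for Postgres
conn = sqlite3.connect(":memory:")
conn.execute("create table votes (year integer, state text, electoral_votes integer)")
conn.executemany(
    "insert into votes values (?, ?, ?)",
    [(2000, "A", 10), (2000, None, 20)],
)
conn.commit()

df = query_to_df(
    conn,
    "select year as votes__year, state as votes__state from votes",
    close=True,
)
df.info()
```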

### 2.3 Write DM Object to CSV file

In [50]:
df_dm.to_csv(csv_file, index=False)
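
Because CSV carries no type information, it can be worth reading the file back to confirm the row count and to see how nulls and dtypes survive the round trip. A minimal sketch on a made-up frame, writing to a temporary directory rather than the real `csv_file`:

```python
import os
import tempfile
import pandas as pd

# Hypothetical frame with the same kinds of columns as the DM object
df = pd.DataFrame({
    "votes__year": [2000, 2000],
    "votes__state": ["A", None],
    "votes__is_total": [False, True],
    "votes__state__region": [2.0, None],
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dm_object.csv")
    df.to_csv(path, index=False)  # index=False keeps the RangeIndex out of the file
    df_back = pd.read_csv(path)   # dtypes are re-inferred from the text

print(df_back.dtypes)
```

Nulls are written as empty fields and come back as missing values, so the non-null counts should match those reported by `df_dm.info()`.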