# 2016 and 2020 presidential elections in the Bronx

This notebook looks at 2016 and 2020 election night results to see how Bronxites voted in the last two presidential elections. The objectives are to visualize the data and identify trends, if any.

Note: Column labels for the 2020 dataset were inconsistent, and I corrected them manually.

In [1]:
# importing libraries
import pandas as pd

In [2]:
# reading files

df_2016 = pd.read_csv("data/2016-Citywide President Vice President Citywide EDLevel.csv")
df_2020 = pd.read_csv("data/2020-Citywide President Vice President Citywide EDLevel.csv")

## Cleaning datasets

We want to keep only data for **1)** the Bronx (assembly districts 77 to 87), and **2)** the Democratic and Republican candidates.

In [3]:
# filtering only bronx data
df_2016_bx = df_2016[df_2016["County"] == "Bronx"]
df_2020_bx = df_2020[df_2020["County"] == "Bronx"]
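
Since the Bronx is described above as assembly districts 77 to 87, the county filter can be cross-checked against an AD-range filter. The sketch below uses a small made-up frame (the `toy` rows are hypothetical, not taken from the real files) and assumes `AD` holds integers.

```python
import pandas as pd

# Hypothetical rows standing in for the citywide file.
toy = pd.DataFrame({
    "County": ["Bronx", "Bronx", "Kings", "Bronx"],
    "AD": [77, 87, 42, 80],
    "ED": [1, 12, 5, 3],
})

by_county = toy[toy["County"] == "Bronx"]
by_ad = toy[toy["AD"].between(77, 87)]  # between() is inclusive on both ends

# If the two definitions agree, both filters select the same rows.
same_rows = by_county.index.equals(by_ad.index)
print(same_rows)
```

If the two filters ever disagree on the real data, that would point to a mislabeled county or an unexpected AD value worth inspecting.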

In [4]:
# keeping only rows for the Democratic and Republican candidates

# 2016: Trump vs Clinton
df_2016_bx_prez = df_2016_bx[df_2016_bx["Unit Name"].str.contains("Trump|Clinton", case=False, na=False)]

# 2020: Trump vs Biden
df_2020_bx_prez = df_2020_bx[df_2020_bx["Unit Name"].str.contains("Trump|Biden", case=False, na=False)]

In [5]:
# create a new column that combines AD and ED digits
# for consistency with codes used for mapping

# work on explicit copies of the filtered frames so that adding a column
# does not trigger pandas' SettingWithCopyWarning
df_2016_bx_prez = df_2016_bx_prez.copy()
df_2020_bx_prez = df_2020_bx_prez.copy()

df_2016_bx_prez["ed_map_for_viz"] = df_2016_bx_prez["AD"].astype(str) + df_2016_bx_prez["ED"].astype(str).str.zfill(3)
df_2020_bx_prez["ed_map_for_viz"] = df_2020_bx_prez["AD"].astype(str) + df_2020_bx_prez["ED"].astype(str).str.zfill(3)
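
To make the code format concrete, here is a minimal sketch of the same AD + zero-padded ED concatenation on made-up values. The `toy` frame is hypothetical, and it assumes `AD` and `ED` are stored as integers.

```python
import pandas as pd

# Hypothetical AD/ED pairs; the ED is padded to three digits before joining.
toy = pd.DataFrame({"AD": [77, 80, 87], "ED": [1, 45, 112]})
toy["ed_map_for_viz"] = toy["AD"].astype(str) + toy["ED"].astype(str).str.zfill(3)

codes = toy["ed_map_for_viz"].tolist()
print(codes)  # ['77001', '80045', '87112']
```

Padding the ED keeps every code at five characters, so AD 77 / ED 1 cannot be confused with a shorter or differently split code.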


## Getting total votes per candidate

In [6]:
def consol_candidate(df, column_name, candidate_name):
    """
    Filter a DataFrame down to the rows for one candidate.

    Parameters:
    df = Pandas DataFrame
    column_name (str) = name of the column to search
    candidate_name (str) = keyword or candidate name to search for (case-insensitive)

    Returns a DataFrame containing only the matching rows.
    """

    filtered_df = df[df[column_name].str.contains(candidate_name, case=False, na=False)]
    return filtered_df

In [7]:
col_name = "Unit Name"

# creating dfs per candidate per election year
bx_trump_2016 = consol_candidate(df_2016_bx_prez, col_name, "Trump")
bx_clinton_2016 = consol_candidate(df_2016_bx_prez, col_name, "Clinton")
bx_trump_2020 = consol_candidate(df_2020_bx_prez, col_name, "Trump")
bx_biden_2020 = consol_candidate(df_2020_bx_prez, col_name, "Biden")

In [8]:
def count_votes(df, column_name, column_value):
    """
    Total the votes within each group, for example per election district.

    Parameters:
    df = Pandas DataFrame
    column_name (str) = name of the column to group by
    column_value (str) = name of the column holding the vote counts to sum
    """

    totals_df = df.groupby(column_name)[column_value].sum().reset_index()
    return totals_df
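
The grouping matters when a candidate has more than one row per election district. The sketch below assumes that can happen, for example one row per party line; this is an assumption about the file, and the candidate and party labels are hypothetical.

```python
import pandas as pd

# Hypothetical rows: the same candidate appears twice in district 77001.
toy = pd.DataFrame({
    "ed_map_for_viz": ["77001", "77001", "77002"],
    "Unit Name": ["Candidate A (Party X)", "Candidate A (Party Y)", "Candidate A (Party X)"],
    "Tally": [10, 3, 7],
})

# Same pattern as count_votes: one summed row per district.
totals = toy.groupby("ed_map_for_viz")["Tally"].sum().reset_index()
print(totals)
```

After grouping, district 77001 has a single row with 13 votes instead of two separate rows.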

In [9]:
col1 = "ed_map_for_viz"
col2 = "Tally"

# summing votes per candidate per election year per election district
bx_trump_2016 = count_votes(bx_trump_2016, col1, col2)
bx_clinton_2016 = count_votes(bx_clinton_2016, col1, col2)
bx_trump_2020 = count_votes(bx_trump_2020, col1, col2)
bx_biden_2020 = count_votes(bx_biden_2020, col1, col2)

In [10]:
# renaming columns to prepare for merging

bx_trump_2016.rename(columns={col2: "Trump"}, inplace=True)
bx_clinton_2016.rename(columns={col2: "Clinton"}, inplace=True)
bx_trump_2020.rename(columns={col2: "Trump"}, inplace=True)
bx_biden_2020.rename(columns={col2: "Biden"}, inplace=True)

In [11]:
# merging dfs per election year

totals_2016 = pd.merge(bx_trump_2016, bx_clinton_2016, on="ed_map_for_viz")
totals_2020 = pd.merge(bx_trump_2020, bx_biden_2020, on="ed_map_for_viz")
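
`pd.merge` defaults to an inner join, so a district present in only one of the two frames would be dropped silently. A minimal sketch of one way to check for that, using an outer join with `indicator=True` on made-up frames (the `left` and `right` values are hypothetical):

```python
import pandas as pd

left = pd.DataFrame({"ed_map_for_viz": ["77001", "77002"], "Trump": [5, 8]})
right = pd.DataFrame({"ed_map_for_viz": ["77001", "77003"], "Clinton": [40, 33]})

# Default inner join: only districts present in both frames survive.
inner = pd.merge(left, right, on="ed_map_for_viz")

# Outer join with an indicator column shows where each row came from.
check = pd.merge(left, right, on="ed_map_for_viz", how="outer", indicator=True)
unmatched = check[check["_merge"] != "both"]
print(unmatched)
```

An empty `unmatched` frame on the real data would confirm that the inner join loses no districts.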

In [12]:
# convert our columns to int

totals_2016[["Trump", "Clinton"]] = totals_2016[["Trump", "Clinton"]].astype(int)
totals_2020[["Trump", "Biden"]] = totals_2020[["Trump", "Biden"]].astype(int)
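
The cast above runs after the votes have already been summed. If `Tally` had been read as text, for instance because of thousands separators (an assumption; the notebook does not show the column dtypes), the counts would need to become numbers before the `groupby` step. A rough sketch of that conversion on made-up values:

```python
import pandas as pd

# Hypothetical tallies stored as text, one with a thousands separator.
raw = pd.Series(["1,204", "87", "15"], name="Tally")

# Strip the separators, then convert to a numeric dtype.
clean = pd.to_numeric(raw.str.replace(",", "", regex=False))
total = int(clean.sum())
print(total)  # 1306
```

Checking `df_2016["Tally"].dtype` right after loading would show whether this step is needed at all.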

In [13]:
# getting the vote margin: Trump votes minus the Democratic candidate's votes
# a negative value means the Democratic candidate led in that district

totals_2016["votes_value_for_viz"] = totals_2016["Trump"] - totals_2016["Clinton"]
totals_2020["votes_value_for_viz"] = totals_2020["Trump"] - totals_2020["Biden"]
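
As a worked example of the sign convention, the sketch below computes the margin for two made-up districts and labels which candidate led. The vote counts and the `leader` column are hypothetical and only illustrate how to read `votes_value_for_viz`.

```python
import pandas as pd

toy = pd.DataFrame({
    "ed_map_for_viz": ["77001", "77002"],
    "Trump": [20, 150],
    "Clinton": [300, 90],
})

# Republican minus Democratic votes, as in the cell above.
toy["votes_value_for_viz"] = toy["Trump"] - toy["Clinton"]

# Negative margin: Clinton led. Positive margin: Trump led.
toy["leader"] = toy["votes_value_for_viz"].apply(
    lambda v: "Trump" if v > 0 else ("Clinton" if v < 0 else "tie")
)
print(toy)
```

Here 77001 gets a margin of -280 (Clinton ahead) and 77002 a margin of 60 (Trump ahead).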

In [14]:
# save to csv

totals_2016.to_csv("2016-votes.csv", encoding="UTF-8", index=False)
totals_2020.to_csv("2020-votes.csv", encoding="UTF-8", index=False)
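
One thing to watch when these files are read back for mapping: `pd.read_csv` will infer `ed_map_for_viz` as an integer column unless told otherwise. The sketch below round-trips a made-up frame through an in-memory buffer (standing in for the CSV file) and assumes the mapping side expects the codes as text.

```python
import io
import pandas as pd

# Hypothetical totals; the buffer stands in for a file such as "2020-votes.csv".
totals = pd.DataFrame({
    "ed_map_for_viz": ["77001", "87112"],
    "Trump": [20, 150],
    "Biden": [300, 90],
})
buffer = io.StringIO()
totals.to_csv(buffer, index=False)

# Default read: the district codes are inferred as integers.
buffer.seek(0)
default_read = pd.read_csv(buffer)

# Explicit dtype: the district codes stay as text.
buffer.seek(0)
as_text = pd.read_csv(buffer, dtype={"ed_map_for_viz": str})
print(default_read["ed_map_for_viz"].tolist(), as_text["ed_map_for_viz"].tolist())
```

Keeping the codes as strings avoids type mismatches when joining against district identifiers in a shapefile or GeoJSON.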