# Political Party Prediction Based on Votes

#### As a fun little example, we'll use a public data set of how US congressmen voted on 16 different issues in the year 1984. Let's see if we can figure out their political party based on their votes alone, using a deep neural network!

#### For those outside the United States: the two main political parties in the USA are "Democrat" and "Republican". In modern times they represent progressive and conservative ideologies, respectively.

#### Politics in 1984 weren't quite as polarized as they are today, but we should still be able to get over 90% accuracy without much trouble.

### The main point of this project is implementing neural networks in Keras 

In [2]:
# Let's start by importing the dependency and loading the raw CSV file with pandas,
# then build a DataFrame from it with column labels
import pandas as pd

feature_names = ["party", "handicapped-infants", "water-project-cost-sharing",
                 "adoption-of-the-budget-resolution", "physician-free-freeze",
                 "el-salvador-aid", "religions-groups-in-schools",
                 "anti-satellite-test-ban", "aid-to-nicaraguan-contras",
                 "mx-missle", "immigration", "synfuels-corporation-cutback",
                 "education-spending", "superfund-right-to-sue", "crime",
                 "duty-free-exports", "export-administration-act-south-africa"
                 ] 

vote_data = pd.read_csv("/content/drive/MyDrive/Political party Prediction based on vote/house-votes-84.data.txt", na_values=["?"], names = feature_names)
vote_data.head()

Unnamed: 0,party,handicapped-infants,water-project-cost-sharing,adoption-of-the-budget-resolution,physician-free-freeze,el-salvador-aid,religions-groups-in-schools,anti-satellite-test-ban,aid-to-nicaraguan-contras,mx-missle,immigration,synfuels-corporation-cutback,education-spending,superfund-right-to-sue,crime,duty-free-exports,export-administration-act-south-africa
0,republican,n,y,n,y,y,y,n,n,n,y,,y,y,y,n,y
1,republican,n,y,n,y,y,y,n,n,n,n,n,y,y,y,n,
2,democrat,,y,y,,y,y,n,n,n,n,y,n,y,y,n,n
3,democrat,n,y,y,n,,y,n,n,n,n,y,n,y,n,n,y
4,democrat,y,y,y,n,y,y,n,n,n,n,y,,y,y,y,y


## We can use describe() to get a feel of how the data looks in aggregate

In [3]:
vote_data.describe()

Unnamed: 0,party,handicapped-infants,water-project-cost-sharing,adoption-of-the-budget-resolution,physician-free-freeze,el-salvador-aid,religions-groups-in-schools,anti-satellite-test-ban,aid-to-nicaraguan-contras,mx-missle,immigration,synfuels-corporation-cutback,education-spending,superfund-right-to-sue,crime,duty-free-exports,export-administration-act-south-africa
count,435,423,387,424,424,420,424,421,420,413,428,414,404,410,418,407,331
unique,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
top,democrat,n,y,y,n,y,y,y,y,y,y,n,n,y,y,n,y
freq,267,236,195,253,247,212,272,239,242,207,216,264,233,209,248,233,269


## We can see that there's some missing data to deal with here: some politicians abstained on some votes or just weren't present when the vote was taken. We will just drop the rows with missing data to keep it simple, but in practice I'd want to first make sure that doing so didn't introduce any sort of bias into the analysis. For example, if one party abstains more than another, that could be problematic.

In [4]:
vote_data.dropna(inplace=True)
vote_data.describe()

Unnamed: 0,party,handicapped-infants,water-project-cost-sharing,adoption-of-the-budget-resolution,physician-free-freeze,el-salvador-aid,religions-groups-in-schools,anti-satellite-test-ban,aid-to-nicaraguan-contras,mx-missle,immigration,synfuels-corporation-cutback,education-spending,superfund-right-to-sue,crime,duty-free-exports,export-administration-act-south-africa
count,232,232,232,232,232,232,232,232,232,232,232,232,232,232,232,232,232
unique,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
top,democrat,n,n,y,n,y,y,y,y,n,y,n,n,y,y,n,y
freq,124,136,125,123,119,128,149,124,119,119,128,152,124,127,149,146,189
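
To make the bias concern above concrete, here is a minimal sketch of how one might compare the share of rows that `dropna()` would remove for each party. The `toy` frame and its values are made up for illustration; on the real data the same idea would be applied to `vote_data` before the drop. The two `describe()` outputs above already hint at an imbalance: Democrats fall from 267 to 124 rows, while Republicans fall from 168 (435 - 267) to 108 (232 - 124).

```python
import pandas as pd

# Toy stand-in for vote_data (hypothetical values, not the real records)
toy = pd.DataFrame({
    "party": ["democrat", "democrat", "republican", "republican"],
    "crime": ["y", None, "n", "y"],
    "immigration": ["y", None, "y", "n"],
})

# Flag every row that dropna() would remove (at least one missing vote)
toy["has_missing"] = toy.drop(columns="party").isna().any(axis=1)

# Fraction of each party's rows that would be dropped
dropped_share = toy.groupby("party")["has_missing"].mean()
print(dropped_share)
```

If the shares differ a lot between parties, dropping rows changes the class balance, and imputing the missing votes or treating "missing" as its own category may be worth considering instead.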


### The neural network needs normalized numbers, not strings, to work. The next step is to replace all the y's and n's with 1's and 0's, and to represent the parties with 1's and 0's as well

In [5]:
vote_data.replace(("y", "n"), (1, 0), inplace=True)
vote_data.replace(("democrat", "republican"), (1, 0), inplace=True)

In [6]:
vote_data.head()

Unnamed: 0,party,handicapped-infants,water-project-cost-sharing,adoption-of-the-budget-resolution,physician-free-freeze,el-salvador-aid,religions-groups-in-schools,anti-satellite-test-ban,aid-to-nicaraguan-contras,mx-missle,immigration,synfuels-corporation-cutback,education-spending,superfund-right-to-sue,crime,duty-free-exports,export-administration-act-south-africa
5,1,0,1,1,0,1,1,0,0,0,0,0,0,1,1,1,1
8,0,0,1,0,1,1,1,0,0,0,0,0,1,1,1,0,1
19,1,1,1,1,0,0,0,1,1,1,0,1,0,0,0,1,1
23,1,1,1,1,0,0,0,1,1,1,0,0,0,0,0,1,1
25,1,1,0,1,0,0,0,1,1,1,1,0,0,0,0,1,1
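
As an alternative sketch of the same encoding step, the mapping can be spelled out with explicit dictionaries and `Series.map`, which makes it obvious which label becomes 1 and which becomes 0. The `toy` frame, `vote_map`, and `party_map` are hypothetical names used only for this illustration; the notebook itself uses `replace` as shown above.

```python
import pandas as pd

# Toy stand-in for vote_data after dropna() (hypothetical values)
toy = pd.DataFrame({
    "party": ["democrat", "republican"],
    "crime": ["y", "n"],
    "immigration": ["n", "y"],
})

# Explicit mappings: 1 = "yes" vote / democrat, 0 = "no" vote / republican
vote_map = {"y": 1, "n": 0}
party_map = {"democrat": 1, "republican": 0}

encoded = toy.copy()
encoded["party"] = encoded["party"].map(party_map)

# Apply the vote mapping to every column except the label
vote_cols = [c for c in encoded.columns if c != "party"]
encoded[vote_cols] = encoded[vote_cols].apply(lambda col: col.map(vote_map))
print(encoded)
```

One design note: `map` turns any value that is missing from the dictionary into `NaN`, so an unexpected string shows up as a gap rather than slipping through unencoded.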


### The next thing is to extract the features and labels in the form that Keras will expect

In [7]:
vote_data[feature_names].head()

Unnamed: 0,party,handicapped-infants,water-project-cost-sharing,adoption-of-the-budget-resolution,physician-free-freeze,el-salvador-aid,religions-groups-in-schools,anti-satellite-test-ban,aid-to-nicaraguan-contras,mx-missle,immigration,synfuels-corporation-cutback,education-spending,superfund-right-to-sue,crime,duty-free-exports,export-administration-act-south-africa
5,1,0,1,1,0,1,1,0,0,0,0,0,0,1,1,1,1
8,0,0,1,0,1,1,1,0,0,0,0,0,1,1,1,0,1
19,1,1,1,1,0,0,0,1,1,1,0,1,0,0,0,1,1
23,1,1,1,1,0,0,0,1,1,1,0,0,0,0,0,1,1
25,1,1,0,1,0,0,0,1,1,1,1,0,0,0,0,1,1


In [8]:
all_features = vote_data.drop(["party"], axis=1)

In [9]:
all_features.head()

Unnamed: 0,handicapped-infants,water-project-cost-sharing,adoption-of-the-budget-resolution,physician-free-freeze,el-salvador-aid,religions-groups-in-schools,anti-satellite-test-ban,aid-to-nicaraguan-contras,mx-missle,immigration,synfuels-corporation-cutback,education-spending,superfund-right-to-sue,crime,duty-free-exports,export-administration-act-south-africa
5,0,1,1,0,1,1,0,0,0,0,0,0,1,1,1,1
8,0,1,0,1,1,1,0,0,0,0,0,1,1,1,0,1
19,1,1,1,0,0,0,1,1,1,0,1,0,0,0,1,1
23,1,1,1,0,0,0,1,1,1,0,0,0,0,0,1,1
25,1,0,1,0,0,0,1,1,1,1,0,0,0,0,1,1


In [10]:
all_features = all_features.values
all_classes = vote_data["party"].values

In [11]:
all_features

array([[0, 1, 1, ..., 1, 1, 1],
       [0, 1, 0, ..., 1, 0, 1],
       [1, 1, 1, ..., 0, 1, 1],
       ...,
       [0, 0, 0, ..., 1, 0, 1],
       [0, 0, 1, ..., 1, 0, 1],
       [0, 0, 1, ..., 0, 0, 1]])

In [12]:
all_classes

array([1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1,
       0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0,
       0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0,
       1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1,
       1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0,
       0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 0,
       1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1,
       0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1,
       1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1,
       0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1,
       0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1])
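
Before training anything, it can help to know what accuracy a trivial "always guess the majority party" rule would reach, so that the network's score has something to be compared against. The sketch below uses a small made-up `labels` array in place of `all_classes`. On the cleaned data above, 124 of the 232 rows are Democrats, so that baseline would be roughly 124 / 232 ≈ 0.53.

```python
import numpy as np

# Hypothetical labels standing in for all_classes (1 = democrat, 0 = republican)
labels = np.array([1, 0, 1, 1, 0, 1])

# Count how often each class occurs: index 0 = republican, index 1 = democrat
counts = np.bincount(labels, minlength=2)

# Accuracy of always predicting the most common class
baseline_accuracy = counts.max() / counts.sum()
print(counts, baseline_accuracy)
```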

## Now that we have changed the values into a form Keras can use for prediction, let's build the model using Keras

In [15]:
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Sequential
from sklearn.model_selection import cross_val_score

def create_model():
  model = Sequential()
  # 16 feature inputs (votes) going into a 32-unit layer
  model.add(Dense(32, input_dim=16, kernel_initializer="normal", activation="relu"))
  # Another hidden layer of 16 units
  model.add(Dense(16, kernel_initializer="normal", activation="relu"))
  # Output layer with a binary classification (Democrat or Republican political party)
  model.add(Dense(1, kernel_initializer="normal", activation="sigmoid"))
  # Compile model
  model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
  return model

from scikeras.wrappers import KerasClassifier

# Wrap our Keras model in an estimator compatible with scikit-learn.
# scikeras renamed ``build_fn`` to ``model``, so we pass ``model=`` to avoid the deprecation warning.
estimator = KerasClassifier(model=create_model, epochs=100, verbose=0)
# Now we can use scikit-learn's cross_val_score to evaluate this model the same way as any other estimator
cv_scores = cross_val_score(estimator, all_features, all_classes, cv=10)
cv_scores.mean()


0.9438405797101449
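
To show what `cross_val_score(..., cv=10)` is doing behind the scenes, here is a rough sketch of the same 10-fold loop written out by hand. For speed it swaps in scikit-learn's `LogisticRegression` for the Keras network, and it runs on synthetic data (`X`, `y`) invented for this example, so its scores say nothing about the real vote data. It assumes, as scikit-learn documents for classifiers with an integer `cv`, that the folds are stratified by class.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in data: 200 "legislators", 16 binary "votes" (not real records)
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 16))
y = X[:, 0]  # toy label that simply copies the first vote

# Split into 10 folds that keep the class proportions similar in each fold
skf = StratifiedKFold(n_splits=10)

fold_scores = []
for train_idx, test_idx in skf.split(X, y):
    # Train a fresh model on 9 folds, then score it on the held-out fold
    clf = LogisticRegression()
    clf.fit(X[train_idx], y[train_idx])
    fold_scores.append(clf.score(X[test_idx], y[test_idx]))

# The reported figure is the mean accuracy over the 10 held-out folds
mean_score = float(np.mean(fold_scores))
print(len(fold_scores), mean_score)
```

The point of the loop is that every row is used for testing exactly once and never by a model that saw it during training, which is why the mean is a fairer estimate than accuracy on the training data.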