# On Your Own - Own Your Own Knowledge
This workbook focuses on performing metric MDS on voting data. We have pairwise information on how often two politicians voted differently on a number of bills. A value of zero means the two politicians voted exactly the same way. A value of, say, 15 indicates that they voted differently very frequently.

Your goal is to take this voting information, which is a high-dimensional dissimilarity matrix, and reduce it to 2 dimensions so we can plot it and see who votes similarly to one another.
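As a rough illustration of what such a dissimilarity matrix looks like, the sketch below uses the 5×5 block for the first five politicians taken from the table output in section 2 of this workbook. The names `D` and `labels` are illustrative and not part of the original notebook. It checks the properties one would normally expect: the matrix is square, symmetric, and has zeros on the diagonal.

```python
import numpy as np

# Illustrative 5x5 excerpt of the voting dissimilarities (first five politicians).
labels = ["Hunt(R)", "Sandman(R)", "Howard(D)", "Thompson(D)", "Freylinghuysen(R)"]
D = np.array([
    [0, 8, 15, 15, 10],
    [8, 0, 17, 12, 13],
    [15, 17, 0, 9, 16],
    [15, 12, 9, 0, 14],
    [10, 13, 16, 14, 0],
])

# Basic sanity checks for a dissimilarity matrix.
is_square = D.shape[0] == D.shape[1]
is_symmetric = np.array_equal(D, D.T)
has_zero_diagonal = bool(np.all(np.diag(D) == 0))
print(is_square, is_symmetric, has_zero_diagonal)

# Locate the pair that disagreed most often in this excerpt.
i, j = np.unravel_index(np.argmax(D), D.shape)
print(labels[i], labels[j], D[i, j])
```

If any of these checks failed on the full data, it would be worth inspecting the file before fitting MDS.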


## Import Necessary Libraries

In [1]:
import random
import math

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn.datasets as dt
from numpy.linalg import eig
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
from skimage.io import imshow
from sklearn.manifold import MDS
from sklearn.metrics.pairwise import manhattan_distances, euclidean_distances

%matplotlib inline
%load_ext ipydex.displaytools

## 1. Read in `voting.csv`
- Check the shape
- Print out the top five rows

In [2]:
vote_df = pd.read_csv('./voting.csv')
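
The cell above reads the file but does not yet check the shape or print the top rows. A minimal sketch of those two checks follows. It uses a hypothetical three-politician excerpt held in memory (`csv_text`, `demo_df`) instead of the real `voting.csv`, and assumes the file has the layout implied by the later sections: a `Column 1` column of names followed by one column per politician.

```python
import io

import pandas as pd

# Hypothetical excerpt standing in for voting.csv (assumed layout).
csv_text = """Column 1,Hunt(R),Sandman(R),Howard(D)
Hunt(R),0,8,15
Sandman(R),8,0,17
Howard(D),15,17,0
"""

demo_df = pd.read_csv(io.StringIO(csv_text))

# Shape is (rows, columns); the name column counts as one column.
print(demo_df.shape)

# head() returns the first five rows by default (here there are only three).
print(demo_df.head())
```

On the real data the same two calls would be `vote_df.shape` and `vote_df.head()`.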


## 2. Save the list of names in "Column 1" aside, and then delete the column from the dataframe.

In [3]:
names = vote_df['Column 1']
# Pass the column by keyword; the positional axis argument is no longer accepted by recent pandas versions.
vote_df1 = vote_df.drop(columns='Column 1')
vote_df1.head()


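|   | Hunt(R) | Sandman(R) | Howard(D) | Thompson(D) | Freylinghuysen(R) | Forsythe(R) | Widnall(R) | Roe(D) | Heltoski(D) | Rodino(D) | Minish(D) | Rinaldo(R) | Maraziti(R) | Daniels(D) | Patten(D) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 8 | 15 | 15 | 10 | 9 | 7 | 15 | 16 | 14 | 15 | 16 | 7 | 11 | 13 |
| 1 | 8 | 0 | 17 | 12 | 13 | 13 | 12 | 16 | 17 | 15 | 16 | 17 | 13 | 12 | 16 |
| 2 | 15 | 17 | 0 | 9 | 16 | 12 | 15 | 5 | 5 | 6 | 5 | 4 | 11 | 10 | 7 |
| 3 | 15 | 12 | 9 | 0 | 14 | 12 | 13 | 10 | 8 | 8 | 8 | 6 | 15 | 10 | 7 |
| 4 | 10 | 13 | 16 | 14 | 0 | 8 | 9 | 13 | 14 | 12 | 12 | 12 | 10 | 11 | 11 |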


## 3. Create a MDS Model
- Do you need to create a distance matrix?  Why or why not?
- Try 2 dimensions so you can easily plot results.

MDS requires *only* a dissimilarity matrix, and the data is already provided in that form. Applying a Euclidean distance function to data that already consists of pairwise distances would not make much sense.

In [4]:
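#1. Compute a distance matrix using Euclidean distances between the rows
#   Caution: vote_df1 already holds pairwise dissimilarities, so this step is
#   not required; vote_df1 could instead be passed to the MDS model directly.
dist_euclid = euclidean_distances(vote_df1)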

#2. Create the MDS model
mds = MDS(metric=True, n_components=2, dissimilarity='precomputed', random_state=0)

#3. Get the embeddings
pts = mds.fit_transform(dist_euclid)
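
Since the voting data is already a dissimilarity matrix, one alternative is to skip the Euclidean step and pass the matrix straight to the model. The sketch below does this on the 5×5 excerpt for the first five politicians. The names `D`, `mds_direct`, and `pts_direct` are illustrative, and the coordinates it produces will differ from the output shown in this workbook, which was computed from `dist_euclid`.

```python
import numpy as np
from sklearn.manifold import MDS

# Illustrative 5x5 excerpt of the voting dissimilarities (first five politicians).
D = np.array([
    [0, 8, 15, 15, 10],
    [8, 0, 17, 12, 13],
    [15, 17, 0, 9, 16],
    [15, 12, 9, 0, 14],
    [10, 13, 16, 14, 0],
], dtype=float)

# dissimilarity='precomputed' tells MDS that D already contains pairwise
# dissimilarities, so no distance function is applied to it.
mds_direct = MDS(n_components=2, dissimilarity='precomputed', random_state=0)
pts_direct = mds_direct.fit_transform(D)

print(pts_direct.shape)    # one 2-D point per politician
print(mds_direct.stress_)  # lower stress means the 2-D layout fits D better
```

Depending on the installed scikit-learn version, this call may emit deprecation or future-default warnings; those do not affect the idea being illustrated.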



## 4. Recast the points into a dataframe

In [5]:
pts_df = pd.DataFrame(pts, columns=('Dim 1', 'Dim 2'))
pts_df.head()

|   | Dim 1 | Dim 2 |
|---|---|---|
| 0 | 21.319534 | -11.431254 |
| 1 | 14.776346 | -22.64584 |
| 2 | -14.54918 | -2.760352 |
| 3 | -7.180575 | -9.444718 |
| 4 | 13.51974 | -7.27082 |


## 5. Add the list of names that used to be in column 1 back into the dataframe

In [6]:
pts_df['names'] = names
pts_df.head()

|   | Dim 1 | Dim 2 | names |
|---|---|---|---|
| 0 | 21.319534 | -11.431254 | Hunt(R) |
| 1 | 14.776346 | -22.64584 | Sandman(R) |
| 2 | -14.54918 | -2.760352 | Howard(D) |
| 3 | -7.180575 | -9.444718 | Thompson(D) |
| 4 | 13.51974 | -7.27082 | Freylinghuysen(R) |
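
Assigning a Series as a new column works here because `names` and `pts_df` share the same default integer index. The sketch below, using made-up values (`pts_demo`, `names_demo`), shows what can happen when the indexes differ, and one way to assign by position instead.

```python
import pandas as pd

# Hypothetical points with the default index 0, 1, 2.
pts_demo = pd.DataFrame({'Dim 1': [1.0, 2.0, 3.0], 'Dim 2': [0.5, -0.5, 0.0]})

# Hypothetical names whose index does NOT match the points.
names_demo = pd.Series(['A', 'B', 'C'], index=[10, 11, 12])

# Series assignment aligns on index labels, so nothing matches here.
pts_demo['aligned'] = names_demo

# Converting to a NumPy array assigns purely by position.
pts_demo['by_position'] = names_demo.to_numpy()

print(pts_demo)
```

If names ever show up as `NaN` after a step like this, a mismatched index is a likely cause.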


## 6. Use the following code to plot the results.
- You may need to adjust this plotting code (for example, the column names and the axis ranges) so that it matches your dataframe.

In [9]:
import plotly.graph_objects as go

layout = dict(plot_bgcolor='white',
              margin=dict(t=20, l=20, r=20, b=20),
              xaxis=dict(title='Dim 1',
                         range=[-15, 15],
                         linecolor='#d9d9d9',
                         showgrid=False,
                         mirror=True),
              yaxis=dict(title='Dim 2',
                         range=[-15, 15],
                         linecolor='#d9d9d9',
                         showgrid=False,
                         mirror=True))

data = go.Scatter(x=pts_df['Dim 1'],
                  y=pts_df['Dim 2'],
                  text=pts_df['names'],
                  textposition='top right',
                  textfont=dict(color='#E58606'),
                  mode='markers+text',
                  marker=dict(color='#5D69B1', size=8),
                  name='politicians')

fig = go.Figure(data=data, layout=layout)

fig.show()
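
The layout above hard-codes both axis ranges to `[-15, 15]`, while some of the embedded points listed earlier in this workbook fall outside that window (for example, a `Dim 1` value of about 21.3). One possible adjustment is to derive the ranges from the data with a small margin. The helper `padded_range` and its `pad_fraction` parameter are hypothetical names introduced for this sketch.

```python
import numpy as np

# The five embedded points shown earlier in this workbook.
pts_demo = np.array([
    [21.319534, -11.431254],
    [14.776346, -22.64584],
    [-14.54918, -2.760352],
    [-7.180575, -9.444718],
    [13.51974, -7.27082],
])

def padded_range(values, pad_fraction=0.1):
    """Return [min, max] of values, widened by pad_fraction of the span on each side."""
    lo, hi = float(np.min(values)), float(np.max(values))
    pad = (hi - lo) * pad_fraction
    return [lo - pad, hi + pad]

x_range = padded_range(pts_demo[:, 0])
y_range = padded_range(pts_demo[:, 1])
print(x_range, y_range)
```

These lists could then replace the fixed values in the layout, for example `xaxis=dict(title='Dim 1', range=x_range, ...)`.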

## 7.  What do you notice about the resulting graph?  Does anything surprise you?