# Textualism Project

## The Team:
### *Project Manager: Aidan Duffy<br />Computer Science Division: Alex Cegarra (Algorithm Development), Evan Lohn (Algorithm Development & Visualization)<br /> Humanities Division: Tonya Nguyen (Document Research)*

# Introduction

The textualism team was tasked with developing a language processing program that determines the meaning of specific, controversial language used in the Constitution. We used Python to parse documents that we gathered from several eras of United States history, specifically landmark Supreme Court and several Circuit Court case decisions and opinions. These documents allowed us to determine a more precise legal definition of the words. Our program reads these documents, creates and merges word vectors, and then visualizes them in order to show the evolution of these words' meaning.

We were unable to finish the project entirely this semester, given our small team, but we have made substantial progress. We focused primarily on the second amendment, as it is a major topic of discussion and debate. We plan to continue with other parts of the Constitution, such as the fourth amendment, next semester.

# Step 0: Install Packages

Please download and install the following packages prior to running the rest of the program: <br />
*gensim (very important) <br />
web (in our repository) <br /> nltk <br /> nltkdata <br /> tqdm <br /> plotly (if you would like to see the visualization for yourself)*
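
As a quick sanity check before running the notebook, a sketch like the following reports which of these packages cannot be imported. The import names are assumptions based on the import statements used later in this notebook (for example, the `web` package is imported as `word_embeddings_benchmarks_master.web`), and `find_missing` is a hypothetical helper, not part of the project.

```python
import importlib.util

# Import names assumed from the import statements used later in this notebook.
REQUIRED = ["gensim", "nltk", "tqdm", "plotly", "word_embeddings_benchmarks_master"]

def find_missing(module_names):
    """Return the top-level module names that cannot be found on this system."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

Anything it reports as missing needs to be installed (or, for the repository-local package, placed on the Python path) before continuing.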

# Step 1:
## Document Research
The humanities division of the textualism team laid the initial groundwork for the project. Tonya and Aidan discussed initially how we should approach this topic, since there are so many ways to find documents from all of these eras in our nation's history. At first, we thought we should try to gather as much as possible in any field, whether that be legal documents, such as court decisions or amicus briefs, news articles, or letters (more so targeted for the older time periods). We also thought we could research several parts of the Constitution at one time. However, this not only proved to be incredibly daunting, it also had two major flaws. <br />

First, the type of language used in these various types of documents would be drastically different. Legal documents contain a lot of jargon and are incredibly formal, but they target the meaning of the words directly and attempt to interpret them as well as they can. News articles, whether modern ones from websites or blogs or older ones from newspapers (yes, we know we can still use newspapers! But, nevertheless...), are significantly less formal in tone and use simpler language that the average layperson (or CS student) could understand. On the other hand, they typically have their own agenda to put forward on the topic. In our case, since we primarily researched the second amendment, we did not want to cloud our research with people who think we should ban all guns nor with the idea that every child should be handed a firearm after they exit the womb. Finally, since we researched backwards, letters were incredibly difficult to come across, and they also rarely target the specific meaning of words like those used in the Constitution. We may, however, return to these as we research older eras like the years surrounding the writing of the Constitution. Because all of these sources would have totally different types of language, may or may not have an agenda, or may not even be found for the eras we are looking into, we knew that we had to narrow things down quite a bit. <br />

In order to narrow things down, we decided to specifically target legal documents, mostly court cases and the judges' opinions. This includes the majority, concurring, and dissenting opinions, so that we could receive all points of view on the given topic. We believed that this would be the most effective manner of researching because the judges (or justices in the case of the Supreme Court) try to directly interpret the words' meaning with as little bias as possible. After researching, of course, we did discover that there are clear partisan lines even today within the SCOTUS. Several justices, such as Ruth Bader Ginsburg, believe that the Constitution does not and has never guaranteed any US citizen the right to any type of firearm unless they are explicitly in an organized, state-regulated militia. While she was certainly not alone on the Court (three other justices agreed in 2008), the majority of the Court held that year that the second amendment guarantees the individual's right to own a firearm, regardless of service in any militia. After seeing this, especially because this specific case (District of Columbia v. Heller) had a close 5-4 vote and the justices were on completely opposite ends of the spectrum, we knew that we had to include all of the opinions. There were other cases like this, but I believe that this was a great example of why the courts provide a lot of helpful information about the meaning of these words. <br />

Secondly, attempting to research all of the controversial and debated parts of the Constitution was bold, and would have ultimately hurt us had we pursued it. Specifically, it would have hurt our CS division. Not only would the humanities team be divided and a bit dysfunctional, grabbing documents related to this amendment and that or this clause and that clause, but the CS division would also lack a strong, focused set of data with which to improve and refine the algorithms used in this project. After discussing this with Alex, Aidan decided that, in addition to limiting the scope of the research to legal documents only, the group should come together and decide which part of the Constitution we should focus on. In the end, we voted for the second amendment, as it is highly controversial and does not have a lot of text within it (it is only about a line, compared with the first or fourth amendments' several lines or the fourteenth's multiple sections). We knew that this would not only be an interesting topic to research, but it would also make Alex's job, and ultimately Evan's when he joined our team, a lot easier. <br />

From here, our research began in full force. The humanities division focused on the modern era first. Since we were not quite sure how to define a given era, we decided to take a given year and add and subtract 25 years. Since we cannot exactly go into the future from the present, the modern era is only 25 years long and runs from the mid-90s to today. We also researched two other key periods that had many landmark cases relating to the second amendment. We did not give them any specific name; they run from 1875 to 1925 and from 1940 to 1990. We wanted to make sure not only that each time period contained its fair share of court cases for us to retrieve valuable information from, but also that the key terms did not change their meanings much within these periods. While there was a large jump in military technology in our earliest era during the Great War, most of the court cases precede that, so we concluded it would not be an issue. By the start of our middle era, machine guns had already been invented and were mainstream (not in the home, but many knew what they were). With the exception of nuclear armaments, stealth technology, napalm, and other weaponry that it is generally agreed individual Americans do not intrinsically have a right to without proper training and certification (such as that held by military personnel), there was not much development after this point. Thus, we decided that these were good time periods to study, and we gathered many court cases and their opinions and transcribed them into .txt files for the CS division to begin testing on.

# Algorithm Development
## Overview / Background for Math Used

Our team's goal with this project is to develop a process for measuring and visualizing the evolving sense of controversial constitutional terminology. The bulk of this work hinges on Word2Vec, an NLP technique (available through the gensim library) that embeds words as vectors. In vectorizing words, we open up avenues for the analysis of linguistic relationships through geometric means.

In essence the idea is, given a set of historical documents, we want to generate the "optimal" embedding (which we will discuss more later), and then try to extract a geometric context for controversial words, such as "arms". 

Word embeddings need to capture an incredible amount of complexity and variation, so the word vectors we produce have dimension on the order of tens. In these high-dimensional spaces, the Euclidean norm starts to lose its ability to "fairly" represent distance, so we instead use cosine similarity for all distance calculations (cosine distance is simply 1 minus this value).

$$ cosine\_similarity(u,v) = \frac{u \cdot v}{|u|~|v|} $$
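
The formula above can be computed directly with NumPy. The following is a minimal sketch; the function name `cosine_similarity` and the example vectors are our own illustration and are not taken from the project code.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: (u . v) / (|u| |v|)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return float(u.dot(v) / (np.linalg.norm(u) * np.linalg.norm(v)))

a = np.array([1.0, 2.0, 3.0])
print(cosine_similarity(a, 10 * a))               # same direction, different length
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Because the value depends only on the angle between the vectors, rescaling a vector does not change it, which is why it behaves more "fairly" than Euclidean distance when vector lengths vary.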

Another side effect of high dimensionality is the obvious difficulty of producing an intelligible visualization. For this reason, in the Neighborhood Creation/Dimensionality Reduction step of our project, we implemented Singular Value Decomposition (SVD). SVD is a process for decomposing a rank-r matrix into a sum of r rank-1 matrices:

$$ A = \sigma_1 \vec{u_1} \vec{v_1}^T + \sigma_2 \vec{u_2} \vec{v_2}^T + \dots + \sigma_r \vec{u_r} \vec{v_r}^T$$

It is relevant here because, much like diagonalization, it allows one to obtain a potentially more useful basis for a set of vectors.

Suppose our word embedding process has generated a set of 2000 word vectors of dimension 40 (these are realistic values for this project). These vectors "live" in the standard basis of $R^{40}$. Let's see if we can simplify that basis into something more amenable to visualization while losing as little information as possible. For our networks, we populate $A$ with our word vectors as rows, and then obtain an approximation for $A$ by summing the three $\sigma_i \vec{u_i} \vec{v_i}^T$ terms with the highest corresponding singular values ($\sigma_1,\sigma_2,\sigma_3$). Another way of saying this is (supposing we order our singular values in descending order):

$$A \approx \sigma_1 \vec{u_1} \vec{v_1}^T + \sigma_2 \vec{u_2} \vec{v_2}^T + \sigma_3 \vec{u_3} \vec{v_3}^T$$

This approximation allows us to create a new, 3-dimensional basis for our word vectors:

$$B_{reduced} = \{ \vec{v_1},\vec{v_2},\vec{v_3}\} $$

The basis vectors for $B_{reduced}$ still live in $R^{40}$. However, by projecting each of our word vectors onto this basis, we can achieve an approximation for our word vectors in $R^3$. This lets us visualize the structure of a 40-dimensional space with a graph in 3 dimensions.
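
The projection described above can be sketched with NumPy as follows. The matrix here is filled with random numbers as a stand-in for real word vectors, and the variable names are our own; it only illustrates the shapes involved and the relationship between the projection and the rank-3 approximation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for 2000 word vectors in R^40 (random values, not real embeddings).
A = rng.normal(size=(2000, 40))

# Compact SVD; np.linalg.svd returns V already transposed,
# so the rows of Vt are the right singular vectors v_i.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

B_reduced = Vt[:3]         # shape (3, 40): v_1, v_2, v_3
coords = A @ B_reduced.T   # shape (2000, 3): each word vector in the reduced basis

# Rank-3 approximation of A built from the three largest singular values.
A_approx = (U[:, :3] * s[:3]) @ Vt[:3]

print(coords.shape)
print(np.allclose(coords @ B_reduced, A_approx))
```

Mapping the 3-dimensional coordinates back through $B_{reduced}$ reproduces exactly the rank-3 approximation of $A$, which is the sense in which the reduced coordinates keep as much of the original structure as three dimensions allow.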

# Structure


  * For each era:
    * Parse the relevant set of documents
    * For each "reasonable" set of constructor parameters
      * Initialize a Word2Vec model with parameters
      * Calculate the WordSim353 similarity Spearman correlation for the model
    * Choose the model which maximizes the WordSim353 Spearman correlation
    * Analyze the "neighborhood" of chosen words through:
      * Creation of a graph based on Word2Vec's word similarity metric

# Demonstration

# Step 2
## Parsing

In this step, we take in a set of text documents and tokenize them by sentences and then by words. The result is a list of lists of words, which we will pass to the Word2Vec constructor shortly.

In [6]:
from gensim.models import Word2Vec
from word_embeddings_benchmarks_master.web.datasets.similarity import fetch_WS353
from cluster import Clustering
import logging
from collections import OrderedDict
import numpy as np
import sys
import scipy
from six import text_type
from six import PY2
from six import iteritems
from six import string_types
from word_embeddings_benchmarks_master.web.utils import _open
from word_embeddings_benchmarks_master.web.vocabulary import *
from six.moves import cPickle as pickle
from six.moves import range
from functools import partial
from word_embeddings_benchmarks_master.web.utils import standardize_string, to_utf8
from word_embeddings_benchmarks_master.web.embedding import Embedding
from sklearn.metrics import pairwise_distances
logger = logging.getLogger(__name__)

def evaluate_similarity(w, X, y):
    """
    Calculate Spearman correlation between cosine similarity of the model
    and human rated similarity of word pairs

    Parameters
    ----------
    w : Embedding or dict
      Embedding or dict instance.

    X: array, shape: (n_samples, 2)
      Word pairs

    y: vector, shape: (n_samples,)
      Human ratings

    Returns
    -------
    cor: float
      Spearman correlation
    """
    if isinstance(w, dict):
        w = Embedding.from_dict(w)

    missing_words = 0
    words = w.vocabulary.word_id
    for query in X:
        for query_word in query:
            if query_word not in words:
                missing_words += 1
    if missing_words > 0:
        logger.warning("Missing {} words. Will replace them with mean vector".format(missing_words))


    mean_vector = np.mean(w.vectors, axis=0, keepdims=True)
    # np.vstack needs a sequence (not a generator) of arrays in current NumPy versions
    A = np.vstack([w.get(word, mean_vector) for word in X[:, 0]])
    B = np.vstack([w.get(word, mean_vector) for word in X[:, 1]])
    scores = np.array([v1.dot(v2.T)/(np.linalg.norm(v1)*np.linalg.norm(v2)) for v1, v2 in zip(A, B)])
    return scipy.stats.spearmanr(scores, y).correlation

def eval_sim(model):
    d = {word:model.wv[word] for word in model.wv.vocab}
    data = fetch_WS353(which="similarity")
    return evaluate_similarity(d, data.X, data.y)



In [7]:
import gensim
from gensim.models import Word2Vec as Word2Vec
import sys, re, os
import nltk
from nltk.tokenize import word_tokenize
import word_embeddings_benchmarks_master
import tqdm
from word_embeddings_benchmarks_master.web import * 
SENTENCE_TOKENIZER = nltk.data.load('./nltk_data/tokenizers/punkt/english.pickle') 
QUOTES = re.compile("\u201c|\u201d")

def read_dir(path):
    """
    @param path: (type = str) path to dir that contains files
    @return: (type=str) concatenated contents of every file in the dir
    """
    assert os.path.exists(path)
    file_names = os.listdir(path)
    return read(file_names, path)

def read(file_names, path):
    """
    Reads in a text file as a string, returning the stringified version
    @param file_names: (type=list<str>) files to be read
    @param path: (type = str) path to dir that contains files
    @return: (type=str)
    """
    data = ""
    for file_name in file_names:
        with open(path + "/" + file_name, "r", encoding = "utf-8") as f:
            data += re.sub(QUOTES, "\"", f.read())
            data += "\n"
    return data

def parse(text):
    """
    Parses a stringified file into sentences, a list of lists of words
    @param text: (type=str) text to parse
    @return: (type=list<list<str>>) parsed text 
    """
    sentences = SENTENCE_TOKENIZER.tokenize(text)
    sentences = [word_tokenize(sentence) for sentence in sentences]
    return sentences



data = read_dir("Modern_Era/2A")
sentences = parse(data)
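
To make the "list of lists of words" shape concrete, here is a dependency-free sketch. It uses a naive regular-expression splitter as a stand-in for the NLTK tokenizers used above, so `simple_parse` is a hypothetical illustration and not the project's parser.

```python
import re

def simple_parse(text):
    """Split text into sentences, then words, returning a list of lists of words."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Naive word split: runs of word characters, or single punctuation marks.
    return [re.findall(r"\w+|[^\w\s]", sentence) for sentence in sentences if sentence]

sample = "The right of the people shall not be infringed. A militia is necessary!"
for words in simple_parse(sample):
    print(words)
```

A splitter this simple would also break after the "v." in a case name such as "District of Columbia v. Heller", which is one reason to prefer a trained sentence tokenizer for legal text.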

# Step 3
## Embedding/Optimization

We now take the parsed text from the previous step, and pass it to our embedding initializer. We chose 3 model parameters to optimize over:
* size: the dimension of the produced word vectors
* window: the 'radius' around a word that Word2Vec will inspect
* min_count: the minimum frequency for a word to be included in the model

We chose acceptable ranges for each of those parameters and then evaluated every Word2Vec embedding for this particular text within those parameter ranges. We use WordSim353's similarity metric, which measures the correlation of cosine similarity of word vectors with human reported similarity of words. The model that we output is the model with the highest WordSim353 similarity correlation (as similarity is central to our processing of the embedding).
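
The search itself is an exhaustive grid search. The sketch below shows the pattern in isolation, with toy stand-ins for model construction and scoring; `grid_search`, `toy_build`, and `toy_score` are hypothetical names, and in the project the build step would be the Word2Vec constructor and the score would be the WordSim353 Spearman correlation.

```python
from itertools import product

def grid_search(build, score, sizes, windows, min_counts):
    """Return (best_score, best_params, best_model) over every parameter combination."""
    best = (float("-inf"), None, None)
    for size, window, min_count in product(sizes, windows, min_counts):
        model = build(size, window, min_count)
        current = score(model)
        if current > best[0]:
            best = (current, (size, window, min_count), model)
    return best

# Toy stand-ins: the "model" is just its parameter tuple,
# and the score is highest at size=32, window=9, min_count=2.
def toy_build(size, window, min_count):
    return (size, window, min_count)

def toy_score(model):
    size, window, min_count = model
    return -abs(size - 32) - abs(window - 9) - abs(min_count - 2)

best_score, best_params, _ = grid_search(
    toy_build, toy_score, range(20, 61), range(5, 11), range(2, 5))
print(best_params, best_score)
```

With the ranges used in `init_model` (41 sizes, 6 windows, 3 minimum counts), this evaluates 41 × 6 × 3 = 738 candidates, which is why the real search is slow: each candidate is a freshly trained model.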

In [15]:
def init_model(sentences):
    """
    Initializes optimal Word2Vec model for text from sentences
    @param sentences: (type=list<list<str>>) a list of lists of words
    @return: (type=Word2Vec) model with the highest WordSim353 Spearman correlation
    """
    # Start below any attainable correlation so that a model is always selected,
    # even if every candidate correlates negatively with the human ratings.
    best_spearman = float("-inf")
    best_model = None
    best_params = (0, 0, 0)
    for dim in range(20, 61):
        for window in range(5, 11):
            for min_count in range(2, 5):
                # Note: gensim >= 4.0 renames the `size` argument to `vector_size`.
                model = Word2Vec(sentences, min_count=min_count, window=window, size=dim)
                similarity_spearman = eval_sim(model)
                if similarity_spearman > best_spearman:
                    best_model = model
                    best_params = (min_count, window, dim)
                    best_spearman = similarity_spearman
    print("min_count: " + str(best_params[0]),
        "window: " + str(best_params[1]),
        "dim: " + str(best_params[2]))
    print("correlation: " + str(best_spearman))
    return best_model

model = init_model(sentences)
#model.save("second_amendment_modern_demo.bin")
model.save("second_amendment.bin")

Missing 298 words. Will replace them with mean vector
Missing 328 words. Will replace them with mean vector
Missing 339 words. Will replace them with mean vector


KeyboardInterrupt: 

# IMPORTANT:

The code above does, in fact, function properly. However, it outputs an enormous amount of text when it is run (because of the missing-word warning triggered within our eval_sim method), so I terminated it prematurely to avoid taking up too much space. Feel free to run it (you will need to if you wish to go through everything or if the plots below are not loaded properly); the output is simply a repetition of warnings like the three lines above many, many times. Also, be aware that it will take some time, as it trains a new model on all of these documents for every parameter combination in order to optimize the embedding. The output, in the end, is:

min_count: 2 window: 9 dim: 32 <br />
correlation: 0.331644710395

# Step 4
## Neighborhood Creation/Dimensionality Reduction

We now use Word2Vec's built-in similarity measure to extract the 10 most similar words to our "main_word". Together with the main word, this set of 11 words constitutes a "neighborhood" (by default, neighbors of two characters or fewer are then discarded). We perform SVD on this neighborhood of word vectors and pass the reduced-dimension word vectors along to the visualization stage.
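
To illustrate what "most similar" means here, the following sketch ranks words by cosine similarity using plain NumPy. The 3-dimensional vectors are made up purely for illustration (they are not taken from our model), and `most_similar` is a simplified stand-in written for this example rather than gensim's implementation.

```python
import numpy as np

def most_similar(word, vectors, topn=10):
    """Return the topn (word, cosine similarity) pairs closest to `word`."""
    words = [w for w in vectors if w != word]
    mat = np.array([vectors[w] for w in words], dtype=float)
    target = np.asarray(vectors[word], dtype=float)
    sims = mat @ target / (np.linalg.norm(mat, axis=1) * np.linalg.norm(target))
    order = np.argsort(-sims)[:topn]
    return [(words[i], float(sims[i])) for i in order]

# Made-up vectors, for illustration only.
toy_vectors = {
    "arms":    [1.0, 0.2, 0.0],
    "bear":    [0.9, 0.3, 0.1],
    "keep":    [0.8, 0.1, 0.3],
    "state":   [0.0, 1.0, 0.2],
    "militia": [0.1, 0.9, 0.4],
}
neighbors = most_similar("arms", toy_vectors, topn=2)
neighborhood = ["arms"] + [w for w, _ in neighbors]
print(neighborhood)
```

The neighborhood is the main word followed by its nearest neighbors, in the same order the code below uses when it builds the matrix for SVD.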

In [18]:
import numpy as np
class Neighborhood:
    def __init__(self, word, model, neighbors=10):
        """
        @param word: (type=str) word at the center of the neighborhood
        @param model: (type=Word2Vec Model) trained model to query
        @param neighbors: (type=int) number of neighboring words to retrieve
        @field word: (type=str)
        @field similarity_neighbors: (type=list<tuple<str, float>>) (neighboring_word, edge_weight) pairs from the model's similarity measure
        """
        self.word = word
        self.similarity_neighbors = model.wv.most_similar(positive=[word], topn=neighbors)

def get_neighboring_words(word, model, n=10, verbose=False):
    # Use a distinct name so the Neighborhood object does not shadow the count n.
    neighborhood = Neighborhood(word, model, neighbors=n)
    if verbose:
        print(neighborhood.word)
        print("Similarity Neighbors")
    words = []
    for neighbor in neighborhood.similarity_neighbors:
        if verbose:
            print(neighbor)
        words.append(neighbor[0])
    return words, neighborhood.similarity_neighbors

def get_svd_from_words(model, words, verbose=False):
    vecs = [model.wv[word] for word in words]
    mat = np.stack(vecs, axis=0)
    if verbose:
        print('shape of word embeddings matrix:', mat.shape) #shape is (number of words, 32) for the model chosen above
    # np.linalg.svd returns V already transposed, so the rows of V are the right singular vectors.
    U, s, V = np.linalg.svd(mat)
    if verbose:
        print('singular values:', s)   # Take a quick look at svd_test.py (run it) if you want to convince yourself of how svd works for m by n matrices
               # We basically want the first three row vectors in V; these are the right singular vectors that explain most of the variation in the rows (i.e. word embeddings) of the original matrix.
    return U, s, V

def get_coords_from_svd_projection(V, model, words, verbose=False):
    V_cut = V[:3,:]
    if verbose:
        print('matrix after removing less important singular vectors:')
        print(V_cut)
        print('squared row magnitudes:')
        for i in range(3):     # this yields 1.0 every time, i.e. to compute projection coordinates we can ignore the a.a on the denominator (V_cut has unit-norm rows)
            print(V_cut[i,:].dot(V_cut[i,:]))


    #for each word, associate it with its projection coordinates in the V_cut basis.
    # This associates each words with a ordered triple of points, which allows us to graph in 3d,
    # or 2d (you can just use the first two coordinates if you want). One idea for the future would
    #be to label the axes of the graph with the word that the corresponding basis vector of our graph is closest to.
    return V_cut, {w: V_cut.dot(model.wv[w]) for w in words}

"""
Main utility method for any users of this file. Bundles up three sub-processes to allow you 
to get a three dimensional space on which to plot the similar words to the given one based on some model

The current method computes the svd of the similar words, along with the original word, but throws out words of length
2 or smaller by default.
"""
def get_points_from_word_and_model(word, model_path, verbose=False, bigger_than=2):
    model = Word2Vec.load(model_path)
    
    #gets the n "most similar" words to the initial word in this model. 
    #Also returns a list of (word, similarity score) tuples, which is not used in this method
    words,_ = get_neighboring_words(word,model, n=10, verbose=verbose)

    # an intermediate processing step here that removes small words and adds in the main word we're considering?
    cond = (lambda w: len(w) > bigger_than) if bigger_than > 0 else (lambda w: True)
    words = [word] + [w for w in words if cond(w)]

    # computes the svd of the matrix containing the embeddings of the 10 most similar words as rows. 
    # U and s are not used, but are left in for readability. s contains the singular values, set verbose=True to print them.
    U, s, V = get_svd_from_words(model, words, verbose=verbose)
    
    basis, coords = get_coords_from_svd_projection(V, model, words, verbose=verbose)

    return basis, coords, model

word1 = "arms"
model_path = "second_amendment.bin"
basis1, coords1, _1 = get_points_from_word_and_model(word1, model_path)
word2 = "militia"
basis2, coords2, _2 = get_points_from_word_and_model(word2, model_path)

# Step 5
## Neighborhood Visualization
In this stage, we plot the neighborhood in our new 3-dimensional basis and create a labeled visualization of it.
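
The network we plot is star-shaped: every neighbor is joined to the main word by one line segment. Plotly draws all of the segments as a single line trace, with `None` entries marking the gaps between them. The sketch below shows how those per-axis lists can be built without relying on global variables; `build_edge_lists` and the toy coordinates are hypothetical and only illustrate the data layout.

```python
def build_edge_lists(main_word, coords):
    """Build per-axis node and edge coordinate lists for a star-shaped 3-D network.

    coords maps each word to an (x, y, z) triple. Every other word is joined to
    main_word by one line segment; None separates consecutive segments.
    """
    names = list(coords)
    main = coords[main_word]
    g = {}
    for i, axis in enumerate("xyz"):
        # Node positions along this axis, in the same order as names.
        g[axis + "_n"] = [coords[w][i] for w in names]
        edges = []
        for w in names:
            if w != main_word:
                edges += [main[i], coords[w][i], None]
        g[axis + "_e"] = edges
    return names, g

# Toy coordinates, for illustration only.
toy_coords = {"arms": (0.0, 0.0, 0.0), "bear": (1.0, 2.0, 3.0), "keep": (4.0, 5.0, 6.0)}
names, g = build_edge_lists("arms", toy_coords)
print(g["x_e"])
```

The `*_n` lists feed the marker trace and the `*_e` lists feed the line trace, mirroring the keys that `plot_network` reads below.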

In [21]:
import plotly
import plotly.plotly as py
from plotly.graph_objs import *
from nltk.cluster.util import cosine_distance
#working example
"""
from plotly.graph_objs import Scatter, Layout

#when using jupyter notebooks, uncomment the following line and change function call below to plotly.offline.iplot
#plotly.offline.init_notebook_mode(connected=True)

plotly.offline.plot({
    "data": [Scatter(x=[1, 2, 3, 4], y=[4, 3, 2, 1])],
    "layout": Layout(title="hello world")
})
"""
plotly.offline.init_notebook_mode(connected=True)

print(basis1.shape[1])
print(basis2.shape[1])
print(model.wv)
#coords should be a dictionary from a word to a 3d position
def generate_network(main_word, coords):

    #next step: generate edges.
    #   In general, we can modify or optionize this however we want to get whatever edge structures we find interesting. 
    #   --We might, for example, create more Neighborhoods (one for each "similar" word) to see if we get any connections within the 1-level-away words.
    #   --We could also try projecting some words from these new neighborhoods (maybe the top three from each?) onto our basis (just dot product) and getting those edges as well.
    e_dict = {}
    for i in range(3):
        e_dict[coord_map[i] + '_e'] = sum([[main_coords[i], coords[w][i], None] for w in id_to_name if w != main_word], [])
        e_dict[coord_map[i] + '_n'] = [coords[w][i] for w in id_to_name]

    return e_dict

def plot_network(g_dict, axis_titles):
    trace1=Scatter3d(
        x=g_dict['x_e'], y=g_dict['y_e'], z=g_dict['z_e'], mode='lines', line=Line(
                    color='rgb(125,125,125)', width=1), hoverinfo='none')

    trace2=Scatter3d(x=g_dict['x_n'], y=g_dict['y_n'], z=g_dict['z_n'], mode='markers', marker=Marker(
        symbol='dot', size=6, color='rgb(175,175,175)', line = Line(
            color='rgb(50,50,50)', width=0.5)), text=id_to_name, hoverinfo='text')

    #print(axis_titles)
    axes = [dict(showbackground=False, showline=False, zeroline=False, showgrid=False, showticklabels=False, title=' ') for i in range(3)]
 
    title = 'word relationships with respect to ' + main_word
    layout=Layout(title=title,width=1000, height=1000, showlegend=False, scene=Scene(
        xaxis=XAxis(axes[0]), yaxis=YAxis(axes[1]), zaxis=ZAxis(axes[2])), margin=Margin(t=100), hovermode='closest')
    
    data = Data([trace1, trace2])
    fig=Figure(data=data, layout=layout)
    plotly.offline.iplot(fig, filename='arms_network.html')
    
def get_closest_basis_words(basis, model):
    maps = {}
    for i in range(basis.shape[0]):
        curr_bas = basis[i,:]
        closest = ('', 10)
        for key in model.wv.vocab:
#            if len(key) < 3:
#                continue
            word = key
            vec = model.wv[key]
            dist = abs(cosine_distance(vec, curr_bas))
            if dist < closest[1]:
                closest = (word, dist)
        maps[i] = closest
    return maps
                


#basis isn't going to be used yet, but we might label the axes in the future
#generate node-id mappings
main_word = word1
name_to_id = {}
id_to_name = []
temp_id = 0
for w in coords1:
    name_to_id[w] = temp_id
    temp_id +=1
    id_to_name.append(w)
main_coords = coords1[main_word] 
coord_map= ['x','y','z']

#generate network
g_dict1 = generate_network(word1, coords1)

#plot network
# basis1 and _1 are the basis and model returned for word1 in the previous step
plot_network(g_dict1, get_closest_basis_words(basis1, _1))

#basis isn't going to be used yet, but we might label the axes in the future
#generate node-id mappings
main_word = word2
name_to_id = {}
id_to_name = []
temp_id = 0
for w in coords2:
    name_to_id[w] = temp_id
    temp_id +=1
    id_to_name.append(w)
main_coords = coords2[main_word] 
coord_map= ['x','y','z']
g_dict2 = generate_network(word2, coords2)

#plot network
# basis2 and _2 are the basis and model returned for word2 in the previous step
plot_network(g_dict2, get_closest_basis_words(basis2, _2))

32
32
<gensim.models.keyedvectors.KeyedVectors object at 0x000002E4E88DBD30>


# Step 6
## Plot Analysis

Above, we can see the finished product of the program. It maps, in 3-D XYZ space, the relationship that our selected word, "arms" in our first case, has with its most similar words across all of the files in the selected section, the Modern Era. You can see each word by simply hovering over it, and the plot is easy to manipulate and rotate for further analysis. Since this may not be as simple if you are only looking at this as an image, we will list the words below. The point at the top center is arms. Otherwise, from left to right, top to bottom (then back to the top again):

1. Defence
2. Suitable
3. Fire
4. References
5. Infringed
6. Carry
7. People
8. Right
9. Keep (Closer of the two on the far right)
10. Bear (Farther of the two on the far right)

One of the things that is clear is that we may need to remove the words that are around it within the second amendment from contention, so as to not pollute the word vector. Clearly, the word "arms" has no relation to the word "infringed" and a tenuous one to "bear", yet since they are in the 2nd amendment and are close by, they are listed. However, words like "defence," "fire" as in firearms, and "right" are all valuable here. "Right" more so in the sense that right has a higher correlation than a word like "privilege", whereas the other two provide insight as to what the word arms means today. We can infer it has a high relation to "defence" of the individual as well as their family and their property, such as their home. 

For the second case, our word was "militia". Militia is the bottom right-hand point. Otherwise, in a semicircular motion, going left and up, then right and up, then right and down:

1. Citizen
2. Secure
3. Used (Going inward)
4. Purpose
5. Merely (Back outward)
6. They
7. Only
8. State

This visual produced only 8 neighbors, not 10. Word2Vec always returns the 10 most similar words, but our code discards any word of two characters or fewer by default (`bigger_than=2`), so two of militia's nearest neighbors were presumably short tokens that were filtered out. It is also worth noting that militia is not discussed nearly as much as arms in these cases, as the judges do not believe it to be as central to the issue, and in many cases the "pro-gun" side focuses on an individual's right to a firearm, not the militia. This also shows that our program needs some refinement. Again, words like "state" appear near it in the text of the amendment and should probably be ignored. Of course, the extra words, like "they" and "merely", are for the most part inapplicable, so that does raise an issue. On the other hand, the word "citizen" is very important because it could reveal that the "pro-gun" advocates are correct in saying that these militias are not necessarily state organized but rather can be composed entirely of the citizen body. While state is certainly present, it is much further away from militia in the plot, signaling a much weaker relationship.

# Step 7
## Planning for the Future (Conclusion)

We know that our work is far from over, but we also know that we have come an incredible distance. From here, the humanities division will continue to research these eras and find more legal documents. We will expand from court cases into scholarly articles seeking to interpret the Constitution's meaning, as well as amicus briefs related to the major court cases we have already looked into, as these provide even more insight into the issues. Additionally, I hope we can expand and begin to look into the fourth amendment in the future, and perhaps several other debated parts of the Constitution. As for the CS division, with the visualization task done (unless we decide to utilize the basis aspect again), the most important task ahead is refining the program to remove the words from the word vectors that, by and large, have almost no real connection to our selected words. In the past, we removed generic words and articles, and I think that should be expanded to include other generic words like "they" as well as most of the words used in the second amendment itself, excluding some particularly important ones, like arms and security. Not only are these important, they may well come up in other contexts surrounding a given word, whereas militia may not.

While we unfortunately were unable to finish the project in its entirety by this semester's end, I am confident it can be concluded rapidly and early on into the Spring 2018 semester. Thank you!


One last note:

<br />I, Aidan Duffy, Textualism's Project Manager, would like to personally thank my team members for their commitment and drive when it came to this project, especially after many of our initial team members decided to leave for a plethora of reasons. Specifically, I would like to thank Alex Cegarra for completing almost all of the foundational work for our algorithms by himself, as well as for bringing Evan Lohn, his roommate and fellow CS division team member, to our team, as he helped out tremendously with refining our program as well as setting up our visualization. It has been an honor being this team's Project Manager, and I hope that I can continue in this position, that you all continue on with PCS next semester, and that we can finish this together!

# References
gensim API: https://radimrehurek.com/gensim/<br />
Natural Language Toolkit: http://www.nltk.org/<br />
Paper/Article on how Legal Robot implements word vectors: https://www.legalrobot.com/blog/2016/09/14/Word-Vectors/<br />
Paper on predicting law making using word vectors: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0176999 <br />

Primary Source for Supreme Court opinions: https://www.justia.com/<br />
Professor David Bamman, both at his office hours and his lectures (INFO 159): http://people.ischool.berkeley.edu/~dbamman/ <br />
Professor Marti Hearst's research and presentations on NLP: http://people.ischool.berkeley.edu/~hearst/<br />
Sinnott-Armstrong Paper on "Word Meaning in Legal Interpretation": https://www.dropbox.com/s/yvn82seamomrsbr/Sinnott-Armstrong.pdf?dl=0<br />
Stanford's (much, much worse :)) NLP that also uses gensim: https://nlp.stanford.edu/projects/histwords/<br />

### List of Cases & Links
**List of SCOTUS Cases:**<br />
*Caetano v. Massachusetts, 2016, 1 Opinion: https://supreme.justia.com/cases/federal/us/577/14-10078/<br />
District of Columbia v. Heller, 2008, Majority and 1 Dissenting: https://supreme.justia.com/cases/federal/us/554/570/<br />
Lewis v. United States, 1966, Majority, 1 Concurring, 1 Dissent, Syllabus, and Footnotes*****:* https://supreme.justia.com/cases/federal/us/385/206/case.html<br />
McDonald v. Chicago, 2010, Majority, 2 Concurring, and 2 Dissenting: https://supreme.justia.com/cases/federal/us/561/742/<br />
Miller v. Texas, 1894, Majority, and the Syllabus: https://supreme.justia.com/cases/federal/us/153/535/case.html<br />
Presser v. Illinois, 1886, Majority, and the Syllabus: https://supreme.justia.com/cases/federal/us/116/252/case.html<br />
Robertson v. Baldwin, 1897, Majority, 1 Dissent, and the Syllabus: https://supreme.justia.com/cases/federal/us/165/275/case.html<br />
United States v. Cruikshank, 1875, Majority, 1 Dissent, and the Syllabus:
https://supreme.justia.com/cases/federal/us/92/542/case.html<br />
United States v. Miller, 1939, Majority, Syllabus, and Footnotes*****:* https://supreme.justia.com/cases/federal/us/307/174/case.html<br />*

**Note:** The rest only have the one majority opinion. <br />
**Circuit Court Cases:**<br />
*United States v. Emerson, 2001: https://aclu.procon.org/sourcefiles/US-v-Emerson.pdf<br />*
**State Supreme Court Cases:**<br />
*City of Salina v. Blaksley, 1905: http://www.guncite.com/court/state/83p619.html <br />
People v. Aguilar, 2013, Overturning smaller court: http://www.illinoiscourts.gov/Opinions/SupremeCourt/2013/112116.pdf<br />*
**State Court Cases:**<br />
*People v. Aguilar, 2011: http://caselaw.findlaw.com/il-court-of-appeals/1557712.html <br />*
<br />
****IMPORTANT:** Those cases with a "footnotes" section had such an extensive notes section that including them in the original document would make it much too long. They warranted their own document since there were so many notes.