# Scraping the Federalist Papers from Yale's Website
#### I couldn't find the Federalist Papers collected in one place,
#### so I scraped them from Yale's website and put them in a pandas DataFrame.
#### This is a brief demonstration of web scraping with BeautifulSoup.
#### I had planned an exploratory data analysis (EDA) of the vocabulary of the three authors (Hamilton, Madison, Jay),
#### but I moved on to other projects.

In [1]:
import requests
import pandas as pd
import string
import re 
import nltk
import itertools
from nltk.corpus import stopwords
from bs4 import BeautifulSoup
from string import punctuation, digits

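# local helper module (not shown here); return_paper, return_author,
# return_title and clean_paper, used below, presumably come from it
from federalist_methods import *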

In [2]:
%%time 
url = 'https://avalon.law.yale.edu/18th_century/'

# Each paper has its own URL, and the numbering on Yale's server is inconsistent:
# papers 1-9 are zero-padded (fed01.asp ... fed09.asp), papers 10-85 are not.
# Two list comprehensions build the two groups of URLs.
one_through_nine = [url + 'fed0{0}.asp'.format(i) for i in range(1, 10)]
ten_through_eightyfive = [url + 'fed{0}.asp'.format(i) for i in range(10, 86)]  # range stops at 85
urls = one_through_nine + ten_through_eightyfive

page_requests = [requests.get(paper_url) for paper_url in urls]  # avoid reusing the name of the base url
soups = [BeautifulSoup(page.text, 'html.parser') for page in page_requests]

CPU times: user 1.72 s, sys: 125 ms, total: 1.84 s
Wall time: 11.7 s
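
The two URL lists can also come from a single zero-padded format string. The block below is a minimal sketch rather than the approach used in the cell above; `build_urls` is a hypothetical helper name, and it assumes the `fedNN.asp` pattern holds for all 85 papers.

```python
BASE_URL = 'https://avalon.law.yale.edu/18th_century/'

def build_urls(base=BASE_URL, first=1, last=85):
    """Return one URL per paper (hypothetical helper).

    The {:02d} format pads 1-9 to 01-09 and leaves 10 and above unchanged,
    so a single comprehension covers both naming styles.
    """
    return ['{0}fed{1:02d}.asp'.format(base, n) for n in range(first, last + 1)]

urls = build_urls()
print(len(urls), urls[0], urls[-1])
```

The resulting list can be passed to `requests.get` exactly as before.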


In [5]:
# each helper takes the position of a paper in soups
paper_total = [return_paper(i) for i in range(len(soups))]
author_total = [return_author(i) for i in range(len(soups))]
title_total = [return_title(i) for i in range(len(soups))]
cleaned_paper_total = [clean_paper(i) for i in range(len(soups))]

paper_s = pd.Series(paper_total)
author_s = pd.Series(author_total)
title_s = pd.Series(title_total)

In [6]:
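# strip line breaks and surrounding whitespace from each author string
cleaned_author = [re.sub('\r\n', '', item.strip()) for item in author_total]
# where the author string is the byline 'For the Independent Journal.', label the paper JAY
# (the period is escaped so it matches only a literal '.')
cleaned2x_author = [re.sub(r'For the Independent Journal\.', 'JAY', item.strip()) for item in cleaned_author]
# take the first element of each title, then drop the site-wide prefix
flat_title_list = [title[0] for title in title_total]
cleaned_flat_title_list = [re.sub('The Federalist Papers : ', '', item) for item in flat_title_list]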
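
When a literal phrase that ends in a period is used as a regex pattern, the `.` acts as a wildcard unless it is escaped. The sketch below shows the difference and wraps the author cleanup in one function. The raw strings are made-up stand-ins for `author_total`, and `clean_author` is a hypothetical name, not something from `federalist_methods`.

```python
import re

# Hypothetical raw author strings, standing in for author_total.
raw_authors = ['\r\nHAMILTON\r\n', 'For the Independent Journal.', ' MADISON ']

BYLINE = 'For the Independent Journal.'

def clean_author(raw, fallback='JAY'):
    """Strip whitespace and line breaks, then swap the byline for a fallback name."""
    text = raw.strip().replace('\r\n', '')
    # re.escape makes the trailing '.' match only a literal period.
    return re.sub(re.escape(BYLINE), fallback, text)

cleaned = [clean_author(a) for a in raw_authors]

# An unescaped '.' matches any character, so this pattern also consumes '!'.
loose = re.sub('Journal.', 'X', 'Journal!')
# The escaped pattern requires a real period and leaves the string alone.
strict = re.sub(re.escape('Journal.'), 'X', 'Journal!')
```

For fixed strings like these, `str.replace` would work just as well and sidesteps regex escaping entirely.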

In [7]:
federalist_df = pd.DataFrame(cleaned_paper_total, index=[cleaned2x_author, cleaned_flat_title_list])
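
Passing a list of two lists as `index` gives the DataFrame a two-level `MultiIndex`, and rows of unequal length are padded with missing values. The sketch below uses toy data (hypothetical stand-ins for `cleaned_paper_total` and the two index lists) to show how rows can then be selected by author or by title.

```python
import pandas as pd

# Hypothetical stand-ins for cleaned_paper_total and the two index lists.
rows = [['a', 'b', 'c'], ['d'], ['e', 'f']]
authors = ['HAMILTON', 'JAY', 'JAY']
titles = ['No. 1', 'No. 2', 'No. 3']

# Two index lists produce a two-level MultiIndex (author, title).
toy_df = pd.DataFrame(rows, index=[authors, titles])
toy_df.index.names = ['author', 'title']  # naming the levels is optional

# Select every row whose first index level is 'JAY'.
jay_rows = toy_df.loc['JAY']

# Select on the second level by name.
no_2 = toy_df.xs('No. 2', level='title')
```

Shorter rows are filled with missing values, which is why the full table has many empty cells in its right-hand columns.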

In [8]:
federalist_df

,,0,1,2,3,4,5,6,7,8,9,...,158,159,160,161,162,163,164,165,166,167
HAMILTON,No. 1,to the people of the state of new york,it is not however my design to dwell upon obse...,i am well aware that it would be disingenuous ...,candor will oblige us to admit that even such ...,so numerous indeed and so powerful are the cau...,this circumstance if duly attended to would fu...,and a further reason for caution in this respe...,ambition avarice personal animosity party oppo...,were there not even these inducements to moder...,for in politics as in religion it is equally a...,...,,,,,,,,,,
JAY,No. 2,to the people of the state of new york,when the people of america reflect that they a...,nothing is more certain than the indispensable...,it is well worthy of consideration therefore w...,it has until lately been a received and uncont...,but politicians now appear who insist that thi...,however extraordinary this new doctrine may ap...,whatever may be the arguments or inducements w...,it has often given me pleasure to observe that...,providence has in a particular manner blessed ...,...,,,,,,,,,,
JAY,No. 3,to the people of the state of new york,it is not a new observation that the people of...,that consideration naturally tends to create g...,the more attentively i consider and investigat...,it is of high importance to the peace of ameri...,because the prospect of present loss or advant...,the case of the treaty of peace with britain a...,because even if the governing party in a state...,but the national government not being affected...,as to those just causes of war which proceed f...,...,,,,,,,,,,
JAY,No. 4,to the people of the state of new york,it is too true however disgraceful it may be t...,these and a variety of other motives which aff...,but independent of these inducements to war wh...,with france and with britain we are rivals in ...,with them and with most other european nations...,in the trade to china and india we interfere w...,the extension of our own commerce in our own v...,spain thinks it convenient to shut the mississ...,from these and such like considerations which ...,...,,,,,,,,,,
JAY,No. 5,it was remarked in the preceding paper that we...,this subject is copious and cannot easily be e...,the history of great britain is the one with w...,we may profit by their experience without payi...,although it seems obvious to common sense that...,notwithstanding their true interest with respe...,the most sanguine advocates for three or four ...,independent of those local circumstances which...,for it cannot be presumed that the same degree...,whenever and from whatever causes it might hap...,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
HAMILTON,No. 81,to the people of the state of new york,that there ought to be one court of supreme an...,the reasons for it have been assigned in anoth...,the only question that seems to have been rais...,the same contradiction is observable in regard...,the very men who object to the senate as a cou...,the arguments or rather suggestions upon which...,the power of construing the laws according to ...,this is as unprecedented as it is dangerous,in britain the judical power in the last resor...,...,,,,,,,,,,
HAMILTON,No. 82,to the people of the state of new york,the erection of a new government whatever care...,t is time only that can mature and perfect so ...,such questions accordingly have arisen upon th...,the principal of these respect the situation o...,is this to be exclusive or are those courts to...,if the latter in what relation will they stand...,these are inquiries which we meet with in the ...,the only thing in the proposed constitution wh...,this might either be construed to signify that...,...,,,,,,,,,,
HAMILTON,No. 83,to the people of the state of new york,the objection to the plan of the convention wh...,the disingenuous form in which this objection ...,the mere silence of the constitution in regard...,to argue with respect to the latter would howe...,with regard to civil causes subtleties almost ...,every man of discernment must at once perceive...,but as the inventors of this fallacy have atte...,the maxims on which they rely are of this natu...,hence say they as the constitution has establi...,...,,,,,,,,,,
HAMILTON,No. 84,to the people of the state of new york,in the course of the foregoing review of the c...,there however remain a few which either did no...,these shall now be discussed but as the subjec...,the most considerable of the remaining objecti...,among other answers given to this it has been ...,i add that new york is of the number,and yet the opposers of the new system in this...,to justify their zeal in this matter they alle...,to the first i answer that the constitution pr...,...,,,,,,,,,,


In [8]:
# join the strings inside each element of a paper into one string, paper by paper
joined_papers = [[' '.join(item) for item in paper] for paper in paper_total]

In [9]:
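# a flatter DataFrame: one row per element of each paper;
# after reset_index, the 'index' column holds the paper's position (0-based)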
df = pd.DataFrame(pd.Series(paper_total).explode())
df.reset_index(inplace=True)
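
To make the effect of `explode` concrete, here is a small sketch on toy data. The nested lists are hypothetical stand-ins for `paper_total`, and the column names `paper_index`, `paragraph`, and `text` are my own labels. It assumes each paper is a list whose elements are themselves lists of strings, as the output suggests.

```python
import pandas as pd

# Toy stand-in for paper_total: two "papers", each a list of elements,
# each element a list of strings (hypothetical data).
toy_papers = [
    [['first sentence.', 'second sentence.'], ['third sentence.']],
    [['only paragraph.']],
]

# explode turns each element of a paper into its own row and repeats
# the paper's position in the index; reset_index makes it a column.
flat = pd.Series(toy_papers).explode().reset_index()
flat.columns = ['paper_index', 'paragraph']

# Join the strings of each row into a single string.
flat['text'] = flat['paragraph'].apply(' '.join)
```

Only the outer list is exploded, so each row still holds a list, which matches the bracketed values in the output.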

In [12]:
df

,index,0
0,0,[to the people of the state of new york:]
1,0,"[it is not, however, my design to dwell upon o..."
2,0,"[and yet, however just these sentiments will b..."
3,0,"[i propose, in a series of papers, to discuss ..."
4,0,[in the progress of this discussion i shall en...
...,...,...
1115,84,"[this is not all., every constitution for the ..."
1116,84,"[but every amendment to the constitution, if o..."
1117,84,[in opposition to the probability of subsequen...
1118,84,"[if the foregoing argument is a fallacy, certa..."
