I started this project to explore early economic writings and to understand the logic that dominated them. One of the goals of this project is to compare the ideas of the founders of modern economics as a discipline.

What excites me is the chance to use Python, web scraping, and NLP techniques to carry out the project properly.

Here's a list of what I would like to cover in the project:
- Making graphics (e.g. a word cloud)
- Using deep learning NLP techniques
- Comparing early writers to contemporary writers

In [2]:
# Loading the packages
import requests
from bs4 import BeautifulSoup

# Querying the texts

All the texts I am going to use come from the [Project Gutenberg website](https://www.gutenberg.org/wiki/Main_Page). Project Gutenberg was founded by Michael Hart in 1971 with the goal of providing free electronic books. It contains over 60,000 ebooks that are available in many formats, including plain text, HTML, and ebook formats.

For this project, I am interested in analysing early economic writings, so I am going to query the following books:
- [An Inquiry into the Nature and Causes of the Wealth of Nations by Adam Smith](https://www.gutenberg.org/ebooks/3300)
- [On The Principles of Political Economy, and Taxation by David Ricardo](https://www.gutenberg.org/ebooks/33310)
- [The Condition of the Working-Class in England in 1844 by Frederick Engels](https://www.gutenberg.org/ebooks/17306)
- [Principles of Political Economy by John Stuart Mill](https://www.gutenberg.org/ebooks/30107)
- [The General Theory of Employment, Interest, and Money by John Maynard Keynes](http://gutenberg.net.au/ebooks03/0300071h/printall.html), taken from Project Gutenberg Australia

The following books weren't available on the Project Gutenberg site, so I found them on the Online Library of Liberty website:
- [Alfred Marshall, Principles of Economics (8th ed.) [1890]](https://oll.libertyfund.org/titles/marshall-principles-of-economics-8th-ed)
- [Karl Marx, Capital: A Critique of Political Economy, Volume I, Book One: The Process of Production of Capital](https://oll.libertyfund.org/titles/marx-capital-a-critique-of-political-economy-volume-i-the-process-of-capitalist-production)
- [Ludwig von Mises, Human Action: A Treatise on Economics, vol. 1 (LF ed.) [1996]](https://oll.libertyfund.org/titles/mises-human-action-a-treatise-on-economics-vol-1-lf-ed)

Thankfully, all these books are hosted online, so everyone can read them for free. They are also in HTML format, which means we can extract the text from inside the HTML tags. For each book, I am going to create a Python class that holds its title, author, and year, plus a method that fetches its text, so we can use them later in the project.
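The classes that follow all carry the same three attributes and a default URL. As a side note, here is a minimal sketch of how that shared metadata could be held in one container, which would make chronological comparisons between authors easier later on. The `Book` dataclass and the `books` list are hypothetical names used only for illustration; the titles, authors, years, and URLs are the ones used in this notebook.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    """Metadata for one source text (hypothetical container, for illustration only)."""
    title: str
    author: str
    year: int
    url: str


# Three of the books queried in this notebook, deliberately listed out of order.
books = [
    Book(
        "Capital: A Critique of Political Economy. Volume I: The Process of Capitalist Production",
        "Karl Marx",
        1867,
        "https://oll.libertyfund.org/titles/marx-capital-a-critique-of-political-economy-volume-i-the-process-of-capitalist-production",
    ),
    Book(
        "An Inquiry into the Nature and Causes of the Wealth of Nations",
        "Adam Smith",
        1776,
        "https://www.gutenberg.org/files/3300/3300-h/3300-h.htm",
    ),
    Book(
        "On The Principles of Political Economy, and Taxation",
        "David Ricardo",
        1817,
        "https://www.gutenberg.org/files/33310/33310-h/33310-h.htm",
    ),
]

# Sort by publication year so earlier writers come first.
chronological = sorted(books, key=lambda book: book.year)
for book in chronological:
    print(book.year, book.author)
```

Keeping the metadata in one place like this is only one possible design; the per-book classes used in the rest of the notebook work just as well for fetching the texts.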

In [3]:
class InquiryNatureCauseWealth:
    def __init__(self):
        self.title = "An Inquiry into the Nature and Causes of the Wealth of Nations"
        self.author = "Adam Smith"
        self.year = 1776
    def get_text(self, url = "https://www.gutenberg.org/files/3300/3300-h/3300-h.htm"):
        get_url = requests.get(url).text
        soup = BeautifulSoup(get_url, "html.parser")
        raw_text = [paragraph.text for paragraph in soup.find_all("p")]
        return raw_text

In [4]:
wealth_of_nation = InquiryNatureCauseWealth()
print(wealth_of_nation.get_text()[1:5])

['\r\n      According, therefore, as this produce, or what is purchased with it, bears\r\n      a greater or smaller proportion to the number of those who are to consume\r\n      it, the nation will be better or worse supplied with all the necessaries\r\n      and conveniencies for which it has occasion.\r\n    ', '\r\n      But this proportion must in every nation be regulated by two different\r\n      circumstances: first, by the skill, dexterity, and judgment with which its\r\n      labour is generally applied; and, secondly, by the proportion between the\r\n      number of those who are employed in useful labour, and that of those who\r\n      are not so employed. Whatever be the soil, climate, or extent of territory\r\n      of any particular nation, the abundance or scantiness of its annual supply\r\n      must, in that particular situation, depend upon those two circumstances.\r\n    ', '\r\n      The abundance or scantiness of this supply, too, seems to depend more upon\r\n    
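The paragraphs printed above still contain the line breaks and indentation of the HTML file (`\r\n` followed by runs of spaces). Here is a minimal sketch of one way to collapse that whitespace before any further processing. The helper name `normalize_whitespace` is hypothetical, and the sample string is shortened from the output above.

```python
def normalize_whitespace(paragraph):
    """Collapse every run of whitespace (spaces, tabs, \\r, \\n) into a single space."""
    # str.split() with no argument splits on any whitespace and drops empty pieces,
    # so leading and trailing whitespace disappears as well.
    return " ".join(paragraph.split())


# Shortened sample taken from the scraped output above.
sample = (
    "\r\n      According, therefore, as this produce, or what is purchased with it, bears"
    "\r\n      a greater or smaller proportion\r\n    "
)

cleaned = normalize_whitespace(sample)
print(cleaned)
```

Applied to a whole book, this could be written as `[normalize_whitespace(p) for p in wealth_of_nation.get_text()]`.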

In [46]:
class PrinciplePoliticalEconomyTaxation:
    def __init__(self):
        self.title = "On The Principles of Political Economy, and Taxation"
        self.author = "David Ricardo"
        self.year = 1817
    def get_text(self, url = "https://www.gutenberg.org/files/33310/33310-h/33310-h.htm"):
        get_url = requests.get(url).text
        soup = BeautifulSoup(get_url, "html.parser")
        raw_text = [paragraph.text for paragraph in soup.find_all("p")]
        return raw_text

In [48]:
political_economy = PrinciplePoliticalEconomyTaxation()
print(political_economy.get_text()[1:20])

In [54]:
class ConditionWorkingClassEngland:
    def __init__(self):
        self.title = "The Condition of the Working-Class in England in 1844"
        self.author = "Frederick Engels"
        self.year = 1845
    def get_text(self, url = "https://www.gutenberg.org/files/17306/17306-h/17306-h.htm"):
        get_url = requests.get(url).text
        soup = BeautifulSoup(get_url, "html.parser")
        raw_text = [paragraph.text for paragraph in soup.find_all("p")]
        return raw_text


In [57]:
condition_working = ConditionWorkingClassEngland()
print(condition_working.get_text()[1:5])

['Transcribed from the January 1943 George Allen & Unwin reprint\nof the March 1892 edition by David Price, email ccx074@coventry.ac.uk', 'by\nFREDERICK ENGELS', 'Translated by Florence Kelley Wischnewetzky', 'London']


In [58]:
class GeneralTheroyEmployment:
    def __init__(self):
        self.title = "The General Theory of Employment, Interest, and Money"
        self.author = "John Maynard Keynes"
        self.year = 1936
    def get_text(self, url = "http://gutenberg.net.au/ebooks03/0300071h/printall.html"):
        get_url = requests.get(url).text
        soup = BeautifulSoup(get_url, "html.parser")
        raw_text = [paragraph.text for paragraph in soup.find_all("p")]
        return raw_text

In [59]:
general_theory = GeneralTheroyEmployment()
print(general_theory.get_text()[1:5])

["This book is chiefly addressed to my fellow economists. I hope\r\nthat it will be intelligible to others. But its main purpose is\r\nto deal with difficult questions of theory, and only in the\r\nsecond place with the applications of this theory to practice.\r\nFor if orthodox economics is at fault, the error is to be found\r\nnot in the superstructure, which has been erected with great care\r\nfor logical consistency, but in a lack of clearness and of\r\ngenerality in the pre misses. Thus I cannot achieve my object of\r\npersuading economists to re-examine critically certain of their\r\nbasic assumptions except by a highly abstract argument and also\r\nby much controversy. I wish there could have been less of the\r\nlatter. But I have thought it important, not only to explain my\r\nown point of view, but also to show in what respects it departs\r\nfrom the prevailing theory. Those, who are strongly wedded to\r\nwhat I shall call 'the classical theory', will fluctuate, I\r\nexpect, b

In [60]:
class PrinciplesOfEconomics:
    def __init__(self):
        self.title = "Principles of Economics"
        self.author = "Alfred Marshall"
        self.year = 1890
    def get_text(self, url = "https://oll.libertyfund.org/titles/marshall-principles-of-economics-8th-ed"):
        get_url = requests.get(url).text
        soup = BeautifulSoup(get_url, "html.parser")
        raw_text = [paragraph.text for paragraph in soup.find_all("p")]
        return raw_text

In [61]:
principles_economics = PrinciplesOfEconomics()
print(principles_economics.get_text()[1:5])

['Full site\nTitle names\nAuthor names\nEssays\nGroups ', 'Advanced Search', 'Alfred Marshall, Principles of Economics (London: Macmillan and Co. 8th ed. 1920).\n https://oll.libertyfund.org/titles/1676', 'This is the 8th edition of what is regarded to be the first “modern” economics textbook, leading in various editions from the 19th into the 20th century. The final 8th edition was Marshall’s most-used and most-cited.']


In [62]:
class CritiquePoliticalEconomy:
    def __init__(self):
        self.title = "Capital: A Critique of Political Economy. Volume I: The Process of Capitalist Production"
        self.author = "Karl Marx"
        self.year = 1867
    def get_text(self, url = "https://oll.libertyfund.org/titles/marx-capital-a-critique-of-political-economy-volume-i-the-process-of-capitalist-production"):
        get_url = requests.get(url).text
        soup = BeautifulSoup(get_url, "html.parser")
        raw_text = [paragraph.text for paragraph in soup.find_all("p")]
        return raw_text

In [63]:
critique_political_economy = CritiquePoliticalEconomy()
print(critique_political_economy.get_text()[1:5])

['Full site\nTitle names\nAuthor names\nEssays\nGroups ', 'Advanced Search', 'Karl Marx, Capital: A Critique of Political Economy. Volume I: The Process of Capitalist Production, by Karl Marx. Trans. from the 3rd German edition, by Samuel Moore and Edward Aveling, ed. Frederick Engels. Revised and amplified according to the 4th German ed. by Ernest Untermann (Chicago: Charles H. Kerr and Co., 1909).\n https://oll.libertyfund.org/titles/965', 'Vol. I of the major work of criticism of the capitalist system by one of the leading theorists of 19th century socialism. Only vol. 1 appeared in Marx’s lifetime; the other two vols. were published postumously by Engels. Marx prided himself on having discovered the “laws” which governed the operation of the capitalist system, laws which would inevitably lead to its collapse. A German language version of Das Kapital is also available in HTML and facsimile PDF.']


In [64]:
class HumanAction:
    def __init__(self):
        self.title = "Human Action: A Treatise on Economics, vol. 1 (LF ed.)"
        self.author = "Ludwig von Mises"
        self.year = 1996
    def get_text(self, url = "https://oll.libertyfund.org/titles/mises-human-action-a-treatise-on-economics-vol-1-lf-ed"):
        get_url = requests.get(url).text
        soup = BeautifulSoup(get_url, "html.parser")
        raw_text = [paragraph.text for paragraph in soup.find_all("p")]
        return raw_text

In [65]:
human_action = HumanAction()
human_action.get_text()[1:5]

['Full site\nTitle names\nAuthor names\nEssays\nGroups ',
 'Advanced Search',
 'Purchase now from Liberty Fund',
 'Ludwig von Mises, Human Action: A Treatise on Economics, in 4 vols., ed. Bettina Bien Greaves (Indianapolis: Liberty Fund, 2007). Vol. 1.\n https://oll.libertyfund.org/titles/1893']
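The outputs for the three Online Library of Liberty pages all start with site navigation text ("Full site...", "Advanced Search", "Purchase now from Liberty Fund") rather than text from the book. Here is a minimal sketch of one way to drop those entries, assuming the navigation paragraphs always start with the strings seen in the outputs above. The names `NAVIGATION_PREFIXES` and `drop_navigation` are hypothetical, and the prefix list is a heuristic that may need extending for other pages.

```python
# Prefixes observed in the scraped outputs above; an assumption, not an exhaustive list.
NAVIGATION_PREFIXES = (
    "Full site",
    "Advanced Search",
    "Purchase now from Liberty Fund",
)


def drop_navigation(paragraphs):
    """Return the paragraphs that do not start with a known navigation prefix."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return [p for p in paragraphs if not p.strip().startswith(NAVIGATION_PREFIXES)]


# Sample copied from the Human Action output above.
sample = [
    "Full site\nTitle names\nAuthor names\nEssays\nGroups ",
    "Advanced Search",
    "Purchase now from Liberty Fund",
    "Ludwig von Mises, Human Action: A Treatise on Economics, in 4 vols., "
    "ed. Bettina Bien Greaves (Indianapolis: Liberty Fund, 2007). Vol. 1.\n"
    " https://oll.libertyfund.org/titles/1893",
]

body = drop_navigation(sample)
print(len(body))
```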