# Automatic information extraction with NER and Reaganomics

In [95]:
import spacy
nlp = spacy.load('en_core_web_sm')

In [96]:
from spacy import displacy

In [97]:
with open('reaganomics.txt') as f:
    docr = nlp(f.read())

In [7]:
displacy.render(docr,style='ent',jupyter=True)

In [98]:
moneyents = [ent for ent in docr.ents if ent.label_ == "MONEY"]

In [99]:
len(moneyents)  # number of MONEY entities in docr

27

In [221]:
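def showMoneyents(moneyents):
    # Print each MONEY entity with its label, token offsets and 5 tokens of context on each side
    for moneyent in moneyents:
        context = docr[max(moneyent.start-5, 0):moneyent.end+5].text  # clamp so the start never goes negative
        print(f'{moneyent.text:{30}} {moneyent.label_} {moneyent.start:{4}} {moneyent.end:{4}} {context:{50}}')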
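
The context column is a fixed-width token window around each entity. A minimal sketch of that windowing follows, using a plain Python list in place of the spaCy `Doc` (the helper `context_window`, the sample tokens and the offsets are made up for illustration). It shows what an unclamped `start - 5` does when the entity sits near the beginning of the text, assuming the slice follows Python's usual negative-index convention.

```python
def context_window(tokens, start, end, width=5):
    """Return tokens[start-width : end+width], clamped to the sequence bounds."""
    left = max(start - width, 0)           # avoid a negative start index
    right = min(end + width, len(tokens))  # explicit upper clamp for clarity
    return tokens[left:right]

# Hypothetical token list; the "entity" covers "$ 150 billion" at offsets 4..7
tokens = "Taxes were cut by $ 150 billion over a five-year period .".split()

naive = tokens[4 - 5:7 + 5]             # start becomes -1 and wraps to the last token
clamped = context_window(tokens, 4, 7)  # start is clamped to 0

print(naive)
print(clamped)
```

With the unclamped slice the window collapses to the final token, while the clamped version returns the whole sample sentence.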

In [222]:
showMoneyents(moneyents)

$150 billion                   MONEY  905  908 paid by business corporations by $150 billion over a five-year
$267.1 billion                 MONEY 1276 1279 constant 2000 dollars) from $267.1 billion in 1980 (4.9%
$393.1 billion                 MONEY 1294 1297 of public expenditure) to $393.1 billion in 1988 (5.8%
$997 billion to $2.85          MONEY 1530 1536 raising the national debt from $997 billion to $2.85 trillion.[33] This led to
$712 billion                   MONEY 1918 1921 the public debt rose from $712 billion in 1980 to $2.052
$2.052 trillion                MONEY 1924 1927 712 billion in 1980 to $2.052 trillion in 1988, a roughly
less than $10,000              MONEY 2257 2261 percentage of total households making less than $10,000 a year (in real
over $75,000                   MONEY 2286 2289 the percentage of households making over $75,000 went from 20.2% to
over $2 trillion               MONEY 2664 2668 Reagan’s presidency, an over $2 trillion increase. The compound annu

In [214]:
# Build a context window (8 tokens either side) around each MONEY entity,
# then collect the lemmas of the non-stop-word tokens in each window
moneyentsConn = [docr[max(ent.start-8, 0):ent.end+8].as_doc() for ent in docr.ents if ent.label_=="MONEY"]
contextline = []
for doc in moneyentsConn:
    for token in doc:
        if not token.is_stop:
            contextline.append(token.lemma_)
    contextline.append('\t')

In [215]:
# Joining the list into one string, then splitting on the '\t' delimiter to get one context string per entity
contextline = ' '.join(map(str, contextline)).split('\t')
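
The join-then-split step relies on the `'\t'` marker appended after each window. The sketch below replays it on hypothetical lemma lists (plain strings, not spaCy tokens) and compares it with joining each window on its own, which is one possible alternative rather than the notebook's method.

```python
# Hypothetical per-entity lemma lists standing in for the filtered spaCy tokens
windows = [
    ["trim", "taxis", "pay", "$", "150", "billion"],
    ["rise", "$", "267.1", "billion", "1980"],
]

# Sentinel approach: flatten with a '\t' marker after each window, then join and split
flat = []
for lemmas in windows:
    flat.extend(lemmas)
    flat.append('\t')
via_sentinel = ' '.join(flat).split('\t')

# Direct approach: one joined string per window
direct = [' '.join(lemmas) for lemmas in windows]

print(via_sentinel)
print(direct)
```

The sentinel version leaves a space on either side of each marker and a trailing empty string after the last one, so indexing by entity position works but the strings carry extra whitespace. The direct version avoids both.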

In [233]:
# Modifying showMoneyents function to include new contextlines
def showMoneyents(moneyents):
    for i, moneyent in enumerate(moneyents):
        print(f'{moneyent.text:{30}} {moneyent.label_:{5}} {moneyent.start:{4}} {moneyent.end:{5}} {contextline[i]:{50}}')

In [234]:
showMoneyents(moneyents)

$150 billion                   MONEY  905   908 trim taxis pay business corporation $ 150 billion - year period . 
$267.1 billion                 MONEY 1276  1279  rise ( constant 2000 dollar ) $ 267.1 billion 1980 ( 4.9 % gdp 
$393.1 billion                 MONEY 1294  1297  22.7 % public expenditure ) $ 393.1 billion 1988 ( 5.8 % gdp 
$997 billion to $2.85          MONEY 1530  1536  budget deficit , raise national debt $ 997 billion $ 2.85 trillion.[33 ] lead U.S. move 
$712 billion                   MONEY 1918  1921  dollar term , public debt rise $ 712 billion 1980 $ 2.052 trillion 1988 
$2.052 trillion                MONEY 1924  1927  rise $ 712 billion 1980 $ 2.052 trillion 1988 , roughly - fold 
less than $10,000              MONEY 2257  2261  household , percentage total household make $ 10,000 year ( real 2007 dollar ) 
over $75,000                   MONEY 2286  2289  1988 percentage household make $ 75,000 go 20.2 % 25.7 % 
over $2 trillion               MONEY 2664  2668  - R

In [225]:
# Look up the sentence containing each MONEY entity; Span.sent keeps
# moneysents aligned one-to-one with moneyents, even when an entity
# starts or ends exactly at a sentence boundary
moneysents = [moneyent.sent for moneyent in moneyents]
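
Pairing an entity with its sentence is an interval-containment check on token offsets. The sketch below uses made-up half-open `(start, end)` tuples instead of spaCy spans (the helper `find_sentence` is hypothetical) to show how strict comparisons miss an entity that begins or ends exactly at a sentence boundary. A miss would also shift every later sentence out of step with its entity. spaCy spans additionally expose a `.sent` attribute that returns the containing sentence directly.

```python
def find_sentence(sentences, ent_start, ent_end):
    """Return the index of the first sentence span containing the entity, or None."""
    for i, (s_start, s_end) in enumerate(sentences):
        if s_start <= ent_start and ent_end <= s_end:  # inclusive bounds
            return i
    return None

# Made-up half-open token offsets for three sentences and three entities
sentences = [(0, 12), (12, 30), (30, 41)]
entities = [(5, 8), (12, 15), (38, 41)]  # middle, sentence-initial, sentence-final

inclusive = [find_sentence(sentences, s, e) for s, e in entities]

# Same search with strict comparisons
strict = [
    next((i for i, (a, b) in enumerate(sentences) if a < s and b > e), None)
    for s, e in entities
]

print(inclusive)
print(strict)
```

Only the entity in the middle of a sentence is found by the strict version; the inclusive check matches all three.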

In [228]:
print(*moneysents,sep='\n\n')

This act slashed estate taxes and trimmed taxes paid by business corporations by $150 billion over a five-year period.

Reagan significantly increased public expenditures, primarily the Department of Defense, which rose (in constant 2000 dollars) from $267.1 billion in 1980 (4.9% of GDP and 22.7% of public expenditure) to $393.1 billion in 1988 (5.8% of GDP and 27.3% of public expenditure); most of those years military spending was about 6% of GDP, exceeding this number in 4 different years.

Reagan significantly increased public expenditures, primarily the Department of Defense, which rose (in constant 2000 dollars) from $267.1 billion in 1980 (4.9% of GDP and 22.7% of public expenditure) to $393.1 billion in 1988 (5.8% of GDP and 27.3% of public expenditure); most of those years military spending was about 6% of GDP, exceeding this number in 4 different years.

As a short-run strategy to reduce inflation and lower nominal interest rates, the U.S. borrowed both domestically and abroad

In [253]:
# Modifying showMoneyents function to include related sentences
def showMoneyents(moneyents):
    for i, moneyent in enumerate(moneyents):
        print(f'{moneyent.text:{30}} {moneyent.label_:{5}} {moneyent.start:{4}} {moneyent.end:{5}} \n{moneysents[i]} \n')

In [254]:
showMoneyents(moneyents)

$150 billion                   MONEY  905   908 
This act slashed estate taxes and trimmed taxes paid by business corporations by $150 billion over a five-year period. 

$267.1 billion                 MONEY 1276  1279 
Reagan significantly increased public expenditures, primarily the Department of Defense, which rose (in constant 2000 dollars) from $267.1 billion in 1980 (4.9% of GDP and 22.7% of public expenditure) to $393.1 billion in 1988 (5.8% of GDP and 27.3% of public expenditure); most of those years military spending was about 6% of GDP, exceeding this number in 4 different years. 

$393.1 billion                 MONEY 1294  1297 
Reagan significantly increased public expenditures, primarily the Department of Defense, which rose (in constant 2000 dollars) from $267.1 billion in 1980 (4.9% of GDP and 22.7% of public expenditure) to $393.1 billion in 1988 (5.8% of GDP and 27.3% of public expenditure); most of those years military spending was about 6% of GDP, exceeding this numbe