# Numerical Fact Checking System. University of Sheffield

Pre-requisites:
 * Gradle
 * Java JDK 8
 * Python 3
  * numpy
  * jnius
  * fuzzywuzzy
  * sklearn
  * urllib3
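
A quick way to confirm the Python pre-requisites are installed is to check that each module can be located. This is a minimal sketch, not part of the project: the helper name `find_missing_modules` is made up for illustration, and the module list simply mirrors the import names listed above.

```python
import importlib.util

# Import names taken from the pre-requisites list above. The import name can
# differ from the package name on PyPI (for example, `sklearn` is installed as
# scikit-learn and `jnius` as pyjnius).
REQUIRED_MODULES = ["numpy", "jnius", "fuzzywuzzy", "sklearn", "urllib3"]


def find_missing_modules(modules):
    """Return the module names that cannot be located by this interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing Python modules:", ", ".join(missing))
else:
    print("All Python pre-requisites found")
```

The check only locates the modules and does not import them, so it will not start a JVM through `jnius`.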
 

## Configuration
This defines the collection of tables used to populate the knowledge base.

In [1]:
world = "herox"

## Common setup

Import required dependencies and download/install Stanford CoreNLP

In [2]:
import sys
import os
import re

# Set the path manually to include the sources location
if 'src/' not in sys.path:
    sys.path.append('src/')


If the following step fails, run `gradlew writeClasspath` in a terminal in this folder, then try again.

In [3]:
# Load the Java classpath for Stanford CoreNLP using Gradle. This also installs CoreNLP if it is missing.
from subprocess import run, PIPE

if 'CLASSPATH' not in os.environ:
    if not os.path.exists('build/classpath.txt'):
        print("Generating classpath")
        r = run(["./gradlew", "writeClasspath"], stdout=PIPE, stderr=PIPE, universal_newlines=True)
        print(r.stdout)
        print(r.stderr)

    print("Loading classpath")
    # Close the file promptly and drop any trailing newline from the classpath
    with open('build/classpath.txt', 'r') as classpath_file:
        os.environ['CLASSPATH'] = classpath_file.read().strip()
    print("Done")

Loading classpath
Done
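
If later Java calls fail even though the classpath loaded, it can help to check that every classpath entry exists on disk. The sketch below assumes the entries in `CLASSPATH` are separated by the platform path separator (`os.pathsep`), which is how the JVM reads the variable. The function name `missing_classpath_entries` is hypothetical and not part of this project.

```python
import os


def missing_classpath_entries(classpath, sep=os.pathsep):
    """Return the classpath entries that do not exist on disk.

    Assumes entries are separated by the platform path separator.
    Blank entries (for example from a trailing newline) are ignored.
    """
    entries = [entry.strip() for entry in classpath.split(sep) if entry.strip()]
    return [entry for entry in entries if not os.path.exists(entry)]


classpath = os.environ.get("CLASSPATH", "")
missing = missing_classpath_entries(classpath)
print(f"{len(missing)} classpath entries not found on disk")
```

Wildcard entries such as `lib/*` would be reported as missing by this check, because it tests literal paths only.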


### Feature Generation

For each downloaded web page, parse the page and identify matches between the values in our tables and the data given in the page. This only needs to be run once; the system remembers whether it has already been run.
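
The log messages printed by this step ("Query already executed", "Missing file. Remove from dictionary and search again", "New Query") suggest a dictionary of past queries that is checked against files on disk. The following is a minimal sketch of that pattern under stated assumptions. It is not the project's implementation: the function `lookup_cached_query`, the `cache` dictionary, and the example path are all hypothetical.

```python
import os


def lookup_cached_query(query, cache):
    """Return the cached results path for `query`, or None if a search is needed.

    `cache` is a hypothetical dict mapping query strings to result file paths.
    A stale entry (the file no longer exists) is dropped so the query runs again.
    """
    path = cache.get(query)
    if path is None:
        return None  # comparable to "New Query" in the log
    if not os.path.exists(path):
        # comparable to "Missing file. Remove from dictionary and search again"
        del cache[query]
        return None
    return path  # comparable to "Query already executed", results still on disk


# Hypothetical cache entry; the path is made up for illustration.
cache = {'"United States" Number of abortions': "data/results_6.json"}
print(lookup_cached_query('"United States" Number of abortions', cache))
```

Checking the file as well as the dictionary means a deleted download is fetched again instead of being silently skipped.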

In [4]:
from run.ds_generate_positive_features_for_query import precompute_features
precompute_features(world)

herox/1.csv	430,514,418.88	”Exxon Mobil" Market Value

Done 1 out of 16
Search for ”Exxon Mobil" Market Value
Query already executed
Missing file. Remove from dictionary and search again
Search for ”Exxon Mobil" Market Value
New Query
Done URL 1 out of 10
 
Looking in document for values similar to 430,514,418.88
https://www.forbes.com/companies/exxon-mobil/
https://www.forbes.com/companies/exxon-mobil/
No meaningful text in this document
Done URL 2 out of 10
 
Looking in document for values similar to 430,514,418.88
https://www.statista.com/statistics/264121/market-value-of-the-corporation-exxon-mobil-since-2001/
https://www.statista.com/statistics/264121/market-value-of-the-corporation-exxon-mobil-since-2001/
No meaningful text in this document
Done URL 3 out of 10
 
Looking in document for values similar to 430,514,418.88
https://en.wikipedia.org/wiki/ExxonMobil
https://en.wikipedia.org/wiki/ExxonMobil
199 candidate matches
["It is the largest direct descendant of John D. Rockefelle

Annotated
Target 430514418.88		Actual 5		Class		0
Target 430514418.88		Actual 10		Class		0
Target 430514418.88		Actual 2		Class		0
Target 430514418.88		Actual 10		Class		0
Target 430514418.88		Actual 2013		Class		0
Target 430514418.88		Actual 5070000.0		Class		0
Target 430514418.88		Actual 5070000.0		Class		0
Target 430514418.88		Actual 4190000.0000000005		Class		0
Target 430514418.88		Actual 4190000.0000000005		Class		0
Target 430514418.88		Actual 2015		Class		0
Target 430514418.88		Actual 10		Class		0
Target 430514418.88		Actual 2011		Class		0
Target 430514418.88		Actual 1		Class		0
Target 430514418.88		Actual 2011		Class		0
Target 430514418.88		Actual 11		Class		0
Target 430514418.88		Actual 2017		Class		0
Target 430514418.88		Actual 11		Class		0
Target 430514418.88		Actual 2017		Class		0
Target 430514418.88		Actual 1973		Class		0
Target 430514418.88		Actual 1973		Class		0
Target 430514418.88		Actual 40610000000.0		Class		0
Target 430514418.88		Actual 40610000000.0		Class		0
Target 

No meaningful text in this document
Done URL 5 out of 10
 
Looking in document for values similar to 430,514,418.88
https://en.wikipedia.org/wiki/List_of_public_corporations_by_market_capitalization
https://en.wikipedia.org/wiki/List_of_public_corporations_by_market_capitalization
29 candidate matches
['This list is based on the Financial Times Global 500 rankings .', 'Only companies with free float at least 15 % are included , value of unlisted stock classes is excluded .', 'This list is up to date as of March 31 , 2018 -LSB- update -RSB- .', 'This list is up to date as of December 31 , 2017 -LSB- update -RSB- .', 'This list is up to date as of December 31 , 2016 -LSB- update -RSB- .', 'This Financial Times Global 500 -- based list is up to date as of December 31 , 2015 -LSB- update -RSB- .', 'This Financial Times Global 500 -- based list is up to date as of December 31 , 2014 -LSB- update -RSB- .', 'This Financial Times Global 500 -- based list is up to date as of December 31 , 2013 

Annotated
Target 90000		Actual 2		Class		0
Target 90000		Actual 72		Class		0
Target 90000		Actual 282		Class		0
Target 90000		Actual 2014		Class		0
Target 90000		Actual 26000		Class		0
Target 90000		Actual 2015		Class		0
Target 90000		Actual 26000		Class		0
Target 90000		Actual 2015		Class		0
Target 90000		Actual 26000		Class		0
Target 90000		Actual 2015		Class		0
Target 90000		Actual 17		Class		0
Target 90000		Actual 35369		Class		0
Target 90000		Actual 2015		Class		0
Target 90000		Actual 18		Class		0
Target 90000		Actual 1156		Class		0
Target 90000		Actual 11		Class		0
Target 90000		Actual 11		Class		0
Target 90000		Actual 2016		Class		0
Target 90000		Actual 1156		Class		0
Target 90000		Actual 11		Class		0
Target 90000		Actual 11		Class		0
Target 90000		Actual 2016		Class		0
Target 90000		Actual 95000		Class		1
Target 90000		Actual 4		Class		0
Target 90000		Actual 2014		Class		0
Target 90000		Actual 95000		Class		1
Target 90000		Actual 4		Class		0
Target 90000		Actual 2014		Class		0
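
The `Target ... Actual ... Class ...` lines above can be post-processed to pull out the candidates labelled as matches. This sketch is only an illustration: it assumes the line layout seen in this log (fields separated by tabs or spaces), and `parse_annotation_lines` is a made-up helper, not a function from this project.

```python
import re

# Matches lines such as "Target 90000    Actual 95000    Class    1".
# Whitespace is matched loosely because the log separates fields with tabs.
LINE_RE = re.compile(r"^Target\s+(\S+)\s+Actual\s+(\S+)\s+Class\s+(\d+)$")


def parse_annotation_lines(lines):
    """Yield (target, actual, label) tuples from well-formed log lines.

    Lines that do not match, including truncated ones, are skipped.
    """
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        try:
            yield float(match.group(1)), float(match.group(2)), int(match.group(3))
        except ValueError:
            continue  # non-numeric value, skip the line


sample = [
    "Target 90000\t\tActual 26000\t\tClass\t\t0",
    "Target 90000\t\tActual 95000\t\tClass\t\t1",
    "Target ",  # truncated line, ignored
]
positives = [(t, a) for t, a, label in parse_annotation_lines(sample) if label == 1]
print(positives)  # [(90000.0, 95000.0)]
```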


1 candidate matches
["In 2017 , over 2500 separated children were referred to the Refugee Council 's Children 's Section ."]
About to parse
Annotated
Target 90000		Actual 2500		Class		0
Target 90000		Actual 2017		Class		0
Target 90000		Actual 2500		Class		0
Target 90000		Actual 2017		Class		0
[{'class': 0, 'header_match_intersection': 0, 'complete_bow': ['separated'], 'dep_path_bow': [], 'type': 'number', 'value': 2500, 'entity': 'Unaccompanied children', 'entity_utterance': 'children'}, {'class': 0, 'header_match_intersection': 0, 'complete_bow': [',', 'over', '2500', 'separated'], 'dep_path_bow': ['2017'], 'type': 'date', 'value': 2017, 'entity': 'Unaccompanied children', 'entity_utterance': 'children'}, {'class': 0, 'header_match_intersection': 0, 'complete_bow': ['separated', 'children', 'were', 'referred', 'to', 'the', 'Refugee', 'Council', "'s"], 'dep_path_bow': [], 'type': 'number', 'value': 2500, 'entity': 'Unaccompanied children', 'entity_utterance': 'Children'}, {'class': 0, 

6 candidate matches
['The training , which will be made available to 1,000 foster carers and support workers , is backed by updated statutory guidance , a review of local authority funding and a drive to improve inter-agency advice and information sharing .', 'The UK has a proud history of protecting those in need and this strategy is just one way we are ensuring unaccompanied asylum seeking children with a right to be in the UK are supported .', 'Last year almost 3,000 unaccompanied children claimed asylum in the UK and they all require ongoing care and protection .', "The government 's new training for foster carers and support workers will be backed by new funding worth # 200,000 , between 2017 and 2019 .", 'An additional # 60,000 investment will provide additional resources for social workers supporting children .', 'The strategy sets out plans to make best practice guides available to social workers , a review of first encounter standards for the police , and comprehensive informa

Annotated
Target 1987		Actual 18		Class		0
Target 1987		Actual 1988		Class		0
Target 1987		Actual 2007		Class		0
Target 1987		Actual 2007		Class		0
Target 1987		Actual 1		Class		0
Target 1987		Actual 1993		Class		0
Target 1987		Actual 2011		Class		0
Target 1987		Actual 2012		Class		0
Target 1987		Actual 76		Class		0
Target 1987		Actual 132		Class		0
Target 1987		Actual 43		Class		0
Target 1987		Actual 2006		Class		0
Target 1987		Actual 2		Class		0
Target 1987		Actual 2		Class		0
Target 1987		Actual 2006		Class		0
Target 1987		Actual 1967		Class		0
Target 1987		Actual 17		Class		0
Target 1987		Actual 2008		Class		0
Target 1987		Actual 15		Class		0
Target 1987		Actual 2		Class		0
Target 1987		Actual 1997		Class		0
Target 1987		Actual 2		Class		0
Target 1987		Actual 1997		Class		0
Target 1987		Actual 2016		Class		0
Target 1987		Actual 102		Class		0
Target 1987		Actual 1992		Class		0
Target 1987		Actual 1996		Class		0
Target 1987		Actual 1992		Class		0
Target 1987		Actual 1996		Class		0
Ta

15 candidate matches
["-LSB- The article below originally appeared in the French daily L'Humanité on December 14 , 2001 , translated to English by Global Outlook in 2002 , and published by Global Research in March 2004 .", "Notice the striking similarity with the creation over 50 years ago by the British MI5 , the CIA and the same Mossad of what is now known as the `` global terror '' -LRB- based on the real Arab terrorist groups founded earlier by the German Nazis -RRB- .", "According to the Israeli weekly Koteret Rashit -LRB- October 1987 -RRB- , `` The Islamic associations as well as the university had been supported and encouraged by the Israeli military authority '' in charge of the -LRB- civilian -RRB- administration of the West Bank and Gaza .", "And in 1978 , they created an `` Islamic University '' in Gaza .", 'At the end of 1992 , there were six hundred mosques in Gaza .', 'In 1984 , Ahmed Yassin was arrested and condemned to twelve years in prison , after the discovery of a 

Annotated
Target 1987		Actual 2		Class		0
Target 1987		Actual 25		Class		0
Target 1987		Actual 20		Class		0
Target 1987		Actual 2010		Class		0
Target 1987		Actual 30		Class		0
Target 1987		Actual 1994		Class		0
Target 1987		Actual 42.9		Class		0
Target 1987		Actual 74		Class		0
Target 1987		Actual 132		Class		0
Target 1987		Actual 17		Class		0
Target 1987		Actual 2008		Class		0
Target 1987		Actual 2006		Class		0
Target 1987		Actual 1967		Class		0
Target 1987		Actual 2010		Class		0
Target 1987		Actual 84		Class		0
Target 1987		Actual 863		Class		0
Target 1987		Actual 75		Class		0
Target 1987		Actual 2006		Class		0
Target 1987		Actual 2004		Class		0
Target 1987		Actual 190		Class		0
Target 1987		Actual 2		Class		0
Target 1987		Actual 1993		Class		0
Target 1987		Actual 16		Class		0
Target 1987		Actual 2009		Class		0
Target 1987		Actual 2011		Class		0
Target 1987		Actual 2017		Class		0
Target 1987		Actual 20		Class		0
Target 1987		Actual 2006		Class		0
Target 1987		Actual 20		Class		0
Targ

35 candidate matches
["Hamas is a Palestinian militant movement that also serves as one of the territories ' two major political parties .", "A nationalist-Islamist spinoff of Egypt 's Muslim Brotherhood , Hamas was founded in 1987 , during the first intifada , and later emerged at the forefront of armed resistance to Israel .", 'Hamas candidates won Palestinian elections in 2006 , but their government was dismissed in 2007 , resulting in the political bifurcation of the West Bank and Gaza .', 'Beginning in the late 1960s , Yassin preached and performed charitable work in the West Bank and Gaza Strip , both of which were occupied by Israeli forces following the 1967 Six Day War .', "Yassin established Hamas as the Brotherhood 's local political arm in December 1987 , following the outbreak of the first intifada , a Palestinian uprising against Israeli control of the West Bank , Gaza , and East Jerusalem .", 'Hamas first employed suicide bombing , a tactic with which it would later beco

Done URL 6 out of 10
 
Looking in document for values similar to 1987
https://www.bbc.co.uk/news/world-middle-east-13331522
https://www.bbc.co.uk/news/world-middle-east-13331522
28 candidate matches
["Its name is an Arabic acronym for the Islamic Resistance Movement , originating as it did in 1987 after the beginning of the first intifada , or Palestinian uprising , against Israel 's occupation of the West Bank and Gaza Strip .", 'But since 2005 , it has also engaged in the Palestinian political process , becoming the first Islamist group in the Arab world to win election through the ballot box -LRB- before reinforcing its power in Gaza by ousting its Fatah rivals -RRB- .', 'In May 2017 , the group published a new policy document for the first time since its founding .', 'In 2006 , Hamas won a stunning victory in the Palestinian Legislative Council -LRB- PLC -RRB- elections , but tensions with the rival Fatah faction of Palestinian Authority -LRB- PA -RRB- President Mahmoud Abbas heigh

No meaningful text in this document
Done URL 8 out of 10
 
Looking in document for values similar to 1987
https://www.aljazeera.com/indepth/features/2017/10/hamas-fatah-goal-approaches-171012064342008.html
https://www.aljazeera.com/indepth/features/2017/10/hamas-fatah-goal-approaches-171012064342008.html
19 candidate matches
['Hamas and Fatah are the two most dominant parties in the Palestinian political scene .', 'On Thursday , the two movements announced they had reached a deal to end a decade-long rift that brought them to an armed conflict in 2007 .', "Hamas has been the de facto ruler in the Gaza Strip since 2007 , after defeating President Mahmoud Abbas ' long dominant Fatah party in parliamentary elections .", 'While the two groups work towards the same goal of building a Palestinian state on the territories that Israel occupied in 1967 , consisting of East Jerusalem , the Gaza Strip and the West Bank , there are some stark differences .', 'The secular movement was founded in Ku

165 candidate matches
["Créé en 1987 par Sheikh Ahmed Yassin , Abdel Aziz al-Rantissi et Mohammed Taha , tous trois issus des Frères musulmans , sa charte affirme que `` la terre de Palestine est une terre islamique '' .", "il prône la destruction de l'État d'Israël et l'instauration d'un État islamique palestinien sur tout le territoire de l'ancienne Palestine mandataire , avant de demander `` l'établissement d'un État palestinien entièrement souverain et indépendant dans les frontières du 4 juin 1967 , avec Jérusalem pour capitale '' .", 'Entre avril 1993 et 2005 , le Hamas a organisé des attentats suicides visant essentiellement des civils .', "Le dernier attentat-suicide contre Israël revendiqué par le Hamas remonte ainsi à janvier 2005 ; il a déclaré en avril 2006 renoncer à ce type d'actions , préférant alors tirer des roquettes de type Qassam et des missiles Grad sur des villes israéliennes , dont Sdérot , Ashdod , Ashkelon et Beer Sheva .", "Le 29 janvier 2015 la justice égypti

Annotated
Target 1987		Actual 2012		Class		0
Target 1987		Actual 13		Class		0
Target 1987		Actual 1994		Class		0
Target 1987		Actual 1994		Class		0
Target 1987		Actual 1995		Class		0
Target 1987		Actual 23		Class		0
Target 1987		Actual 34		Class		0
Target 1987		Actual 2013		Class		0
Target 1987		Actual 2011		Class		0
Target 1987		Actual 16		Class		0
Target 1987		Actual 6		Class		0
Target 1987		Actual 2008		Class		0
Target 1987		Actual 13		Class		0
Target 1987		Actual 2006		Class		0
Target 1987		Actual 1967		Class		0
Target 1987		Actual 1996		Class		0
Target 1987		Actual 1996		Class		0
Target 1987		Actual 1996		Class		0
Target 1987		Actual 1996		Class		0
Target 1987		Actual 1996		Class		0
Target 1987		Actual 1996		Class		0
Target 1987		Actual 1990		Class		0
Target 1987		Actual 1990		Class		0
Target 1987		Actual 1990		Class		0
Target 1987		Actual 2012		Class		0
Target 1987		Actual 4		Class		0
Target 1987		Actual 7		Class		0
Target 1987		Actual 2008		Class		0
Target 1987		Actual 2008		Cla

11 candidate matches
["06/18/02 `` UPI '' -- -- In the wake of a suicide bomb attack Tuesday on a crowded Jerusalem city bus that killed 19 people and wounded at least 70 more , the Islamic Resistance Movement , Hamas , took credit for the blast .", 'Israeli officials called it the deadliest attack in Jerusalem in six years .', 'According to documents United Press International obtained from the Israel-based Institute for Counter Terrorism , Hamas evolved from cells of the Muslim Brotherhood , founded in Egypt in 1928 .', "Islamic movements in Israel and Palestine were `` weak and dormant '' until after the 1967 Six Day War in which Israel scored a stunning victory over its Arab enemies .", 'After 1967 , a great part of the success of the Hamas/Muslim Brotherhood was due to their activities among the refugees of the Gaza Strip .', "`` Social influence grew into political influence , '' first in the Gaza Strip , then on the West Bank , said an administration official who spoke on condit

1 candidate matches
['Data from the early 20th century are somewhat less precise than more recent data because there were fewer stations collecting measurements at the time , especially in the Southern Hemisphere .']
About to parse
Annotated
[]
Done URL 8 out of 10
 
Looking in document for values similar to 16.6
https://en.wikipedia.org/wiki/Climate_of_the_United_States
https://en.wikipedia.org/wiki/Climate_of_the_United_States
48 candidate matches
['West of the 100th meridian , much of the US is semi-arid to desert in the far southwestern US , and Mediterranean along the California coast .', 'East of the 100th meridian , the climate is humid continental in the northern areas east through New England , to humid subtropical in the Gulf and South Atlantic regions .', 'In the summer , storms are much more localized , with short-duration thunderstorms common in many areas east of the 100th meridian .', 'In the warm season , storm systems affecting a large area are less frequent , and weat

Annotated
Target 16.6		Actual 13		Class		0
Target 16.6		Actual 330		Class		0
Target 16.6		Actual 1		Class		0
Target 16.6		Actual 25		Class		0
Target 16.6		Actual 83		Class		0
Target 16.6		Actual 1927		Class		0
Target 16.6		Actual 1993		Class		0
Target 16.6		Actual 1982		Class		0
Target 16.6		Actual 1		Class		0
[{'class': 0, 'header_match_intersection': 1, 'complete_bow': ['is', '13:1', ',', 'meaning'], 'dep_path_bow': [], 'type': 'number', 'value': 13, 'entity': 'United States', 'entity_utterance': 'United States'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': ['is', '13:1', ',', 'meaning', '13', 'inches', '-LRB-'], 'dep_path_bow': [], 'type': 'number', 'value': 330, 'entity': 'United States', 'entity_utterance': 'United States'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': ['is', '13:1', ',', 'meaning', '13', 'inches', '-LRB-', '330', 'mm', '-RRB-', 'of', 'snow', 'melts', 'down', 'to'], 'dep_path_bow': [], 'type': 'number', 'value': 1, 'entity': 'Unite

No meaningful text in this document
Done URL 6 out of 10
 
Looking in document for values similar to 78
https://www.usatoday.com/story/news/nation/2014/10/08/us-life-expectancy-hits-record-high/16874039/
https://www.usatoday.com/story/news/nation/2014/10/08/us-life-expectancy-hits-record-high/16874039/
18 candidate matches
['Life expectancy in the USA rose in 2012 to 78.8 years -- a record high .', "That was an increase of 0.1 year from 2011 when it was 78.7 years , according to a new report on mortality in the USA from the Centers for Disease Control and Prevention 's National Center for Health Statistics .", "Life expectancy for females is 81.2 years ; for males , it 's 76.4 years .", 'That difference of 4.8 years is the same as in 2011 .', "Those life expectancy estimates are for people born in 2012 and represent `` the average number of years that a group of infants would live if the group was to experience throughout life the age-specific death rates present in the year of birth ,

Annotated
[]
Done URL 10 out of 10
 
Looking in document for values similar to 78
https://countryeconomy.com/demography/life-expectancy/usa
https://countryeconomy.com/demography/life-expectancy/usa
No meaningful text in this document
herox/6.csv	699202	"United States" Number of abortions

Done 6 out of 16
Search for "United States" Number of abortions
Query already executed
Missing file. Remove from dictionary and search again
Search for "United States" Number of abortions
New Query
Done URL 1 out of 10
 
Looking in document for values similar to 699202
https://en.wikipedia.org/wiki/Abortion_in_the_United_States
https://en.wikipedia.org/wiki/Abortion_in_the_United_States
147 candidate matches
['Various anti-abortion laws have been in force in each state since at least 1900 .', 'Before the U.S. Supreme Court decision Roe v. Wade legalized abortion nationwide in 1973 , it was already legal in several states , but the decision imposed a uniform framework for state legislation on the subje

Annotated
Target 699202		Actual 2006		Class		0
Target 699202		Actual 3		Class		0
Target 699202		Actual 1971		Class		0
Target 699202		Actual 1983		Class		0
Target 699202		Actual 1983		Class		0
Target 699202		Actual 17		Class		0
Target 699202		Actual 2008		Class		0
Target 699202		Actual 27		Class		0
Target 699202		Actual 1		Class		0
Target 699202		Actual 1987		Class		0
Target 699202		Actual 1988		Class		0
Target 699202		Actual 2003		Class		0
Target 699202		Actual 1973		Class		0
Target 699202		Actual 21		Class		0
Target 699202		Actual 2003		Class		0
Target 699202		Actual 1973		Class		0
Target 699202		Actual 1973		Class		0
Target 699202		Actual 1973		Class		0
Target 699202		Actual 1024		Class		0
Target 699202		Actual 1024		Class		0
Target 699202		Actual 1803		Class		0
Target 699202		Actual 84		Class		0
Target 699202		Actual 88		Class		0
Target 699202		Actual 2002		Class		0
Target 699202		Actual 34		Class		0
Target 699202		Actual 32		Class		0
Target 699202		Actual 2008		Class		0
[{'class': 

No meaningful text in this document
Done URL 3 out of 10
 
Looking in document for values similar to 699202
https://en.wikipedia.org/wiki/Abortion_statistics_in_the_United_States
https://en.wikipedia.org/wiki/Abortion_statistics_in_the_United_States
15 candidate matches
['Abortions are conducted in all 50 states , but abortions are more common in some states than they are in others .', 'The Guttmacher Institute did a study that shows which 25 states have the most abortions .', 'States vary widely in terms of overall population , so the total abortions are presented in terms of abortions per 1000 women between the ages of 15 and 45 .', 'This list includes totals from 25 states in 2008 and includes states that did not send data to the Center for Disease Control that year .', 'For example , California and Florida had a combined total of 308,550 abortions in 2008 and did not send a report to the CDC that year .', 'Delaware has the most abortions per 1000 women -LRB- 15-44 -RRB- at 40 .', '

Annotated
Target 699202		Actual 100000		Class		0
Target 699202		Actual 12		Class		0
Target 699202		Actual 2		Class		0
Target 699202		Actual 2010		Class		0
Target 699202		Actual 2010		Class		0
Target 699202		Actual 28		Class		0
Target 699202		Actual 24		Class		0
Target 699202		Actual 2010		Class		0
Target 699202		Actual 38		Class		0
Target 699202		Actual 14		Class		0
Target 699202		Actual 2010		Class		0
Target 699202		Actual 100		Class		0
Target 699202		Actual 2010		Class		0
Target 699202		Actual 45		Class		0
Target 699202		Actual 7		Class		0
Target 699202		Actual 2010		Class		0
Target 699202		Actual 45		Class		0
Target 699202		Actual 7		Class		0
Target 699202		Actual 2010		Class		0
Target 699202		Actual 1		Class		0
Target 699202		Actual 89		Class		0
Target 699202		Actual 90		Class		0
Target 699202		Actual 2010		Class		0
Target 699202		Actual 46		Class		0
Target 699202		Actual 6		Class		0
Target 699202		Actual 2010		Class		0
Target 699202		Actual 46		Class		0
Target 699202		Actual 6		Cl

49 candidate matches
['1 .', 'Finer LB and Zolna MR , Declines in unintended pregnancy in the United States , 2008 -- 2011 , New England Journal of Medicine , 2016 , 374 -LRB- 9 -RRB- :843 -- 852 , doi :10.1056 / NEJMsa1506575 .', '2 .', 'Jones RK and Jerman J , Abortion incidence and service availability in the United States , 2014 , Perspectives on Sexual and Reproductive Health , 2017 , 49 -LRB- 1 -RRB- :17 -- 27 , doi :10.1363 / psrh .12015 .', '3 .', "Findings from the 2014 U.S. Abortion Patient Survey , Journal of Women 's Health , 2017 , doi :10.1089 / jwh .2017.6410 .", '4 .', 'Jones RK and Jerman J , Abortion incidence and service availability in the United States , 2011 , Perspectives on Sexual and Reproductive Health , 2014 , 46 -LRB- 1 -RRB- :3 -- 14 , doi :10.1363 / 46e0414 .', '5 .', 'Jones RK and Jerman J , Population group abortion rates and lifetime incidence of abortion : United States , 2008 -- 2014 , American Journal of Public Health , 2017 , doi :10.2105 / AJPH .20

17 candidate matches
['-LRB- December 2005 -RRB- Roe v. Wade -- the landmark 1973 U.S. Supreme Court case establishing that most U.S. laws against abortion violate a constitutional right to privacy -- will come under more scrutiny in coming months as a newly reconfigured Supreme Court hears arguments in an abortion rights case -LRB- Ayotte v. Planned Parenthood of Northern New England -RRB- .', 'As a result , the number of maternal complications or deaths caused by unsafe abortions -- those abortions performed by unskilled providers or in unsanitary settings -- also could rise .1 Women have abortions regardless of whether the procedure is legal in the country in which they reside .', "Evidence shows that laws that restrict abortion do n't guarantee low induced abortion rates : Nearly one-half of all abortions worldwide are performed in countries that allow abortions only in very limited circumstances .2 While abortion rates are high in Eastern European countries such as Russia and Roma

Missing file. Remove from dictionary and search again
Search for "United States" Abortion Rate per 1,000 births
New Query
Done URL 1 out of 10
 
Looking in document for values similar to 210
https://en.wikipedia.org/wiki/Abortion_statistics_in_the_United_States
15 candidate matches
['Abortions are conducted in all 50 states , but abortions are more common in some states than they are in others .', 'The Guttmacher Institute did a study that shows which 25 states have the most abortions .', 'States vary widely in terms of overall population , so the total abortions are presented in terms of abortions per 1000 women between the ages of 15 and 45 .', 'This list includes totals from 25 states in 2008 and includes states that did not send data to the Center for Disease Control that year .', 'For example , California and Florida had a combined total of 308,550 abortions in 2008 and did not send a report to the CDC that year .', 'Delaware has the most abortions per 1000 women -LRB- 15-44 -RRB-

Annotated
Target 210		Actual 2013		Class		0
[{'class': 0, 'header_match_intersection': 1, 'complete_bow': ['continued', 'to', 'decline', 'and', 'reached', 'historic', 'lows', 'in'], 'dep_path_bow': [], 'type': 'date', 'value': 2013, 'entity': 'United States', 'entity_utterance': 'United States'}]
Done URL 5 out of 10
 
Looking in document for values similar to 210
http://www.johnstonsarchive.net/policy/abortion/graphusabrate.html
http://www.johnstonsarchive.net/policy/abortion/graphusabrate.html
3 candidate matches
['Note : data is scaled relative to 1980 -LRB- 1980 value = 100 -RRB- .', 'Plotted data includes : © 2002-2011 , 2014 by Wm. Robert Johnston .', 'Last modified 28 November 2014 .']
About to parse
Annotated
[]
Done URL 6 out of 10
 
Looking in document for values similar to 210
https://www.guttmacher.org/sites/default/files/report_pdf/us-adolescent-pregnancy-trends-2013.pdf
https://www.guttmacher.org/sites/default/files/report_pdf/us-adolescent-pregnancy-trends-2013.pdf
No me

Annotated
Target 91		Actual 3600000.0		Class		0
Target 91		Actual 3600000.0		Class		0
Target 91		Actual 2001		Class		0
Target 91		Actual 1		Class		0
Target 91		Actual 1		Class		0
Target 91		Actual 144		Class		0
Target 91		Actual 27		Class		0
Target 91		Actual 50		Class		0
Target 91		Actual 72		Class		0
Target 91		Actual 200		Class		0
Target 91		Actual 1000000.0		Class		0
Target 91		Actual 1000000.0		Class		0
Target 91		Actual 1200		Class		0
Target 91		Actual 1		Class		0
Target 91		Actual 3		Class		0
Target 91		Actual 10		Class		0
Target 91		Actual 8		Class		0
Target 91		Actual 2		Class		0
Target 91		Actual 1		Class		0
Target 91		Actual 20		Class		0
Target 91		Actual 1		Class		0
Target 91		Actual 4495		Class		0
Target 91		Actual 2		Class		0
Target 91		Actual 11000		Class		0
Target 91		Actual 2005		Class		0
Target 91		Actual 4		Class		0
Target 91		Actual 2009		Class		0
Target 91		Actual 1990		Class		0
Target 91		Actual 200		Class		0
Target 91		Actual 29		Class		0
Target 91		Actual 1852		

No meaningful text in this document
Done URL 5 out of 10
 
Looking in document for values similar to 91
https://nces.ed.gov/fastfacts/display.asp?id=516
https://nces.ed.gov/fastfacts/display.asp?id=516
10 candidate matches
['The percentages of 3-year-olds , 4-year-olds , and 5-year-olds enrolled in preprimary programs fluctuated between 2000 and 2015 .', 'In 2015 , some 38 percent of 3-year-olds , 67 percent of 4-year-olds , and 87 percent of 5-year-olds were enrolled in preprimary programs , which were not measurably different from the percentages enrolled in 2000 -LRB- 39 percent , 65 percent , and 88 percent , respectively -RRB- .', 'In 2015 , the percentage of children enrolled in preprimary programs remained higher for 5-year-olds than for 4-year-olds , and higher for 4-year-olds than for 3-year-olds .', 'Among 3 - to 5-year-olds who were enrolled in preschool programs in 2015 , some 51 percent attended full-day programs .', 'The percentage of 3 - to 5-year-old preschool students 

Annotated
Target 91		Actual 1996		Class		0
Target 91		Actual 1996		Class		0
Target 91		Actual 2010		Class		0
Target 91		Actual 1000000000000.0		Class		0
Target 91		Actual 1000000000000.0		Class		0
Target 91		Actual 2012		Class		0
Target 91		Actual 1000		Class		0
Target 91		Actual 160000		Class		0
Target 91		Actual 20		Class		0
Target 91		Actual 2010		Class		0
Target 91		Actual 1		Class		0
[{'class': 0, 'header_match_intersection': 1, 'complete_bow': [',', 'the'], 'dep_path_bow': ['passed', '1996'], 'type': 'date', 'value': 1996, 'entity': 'United States Teenagers', 'entity_utterance': 'United States'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': [',', 'the', 'United', 'States', 'passed', 'a', 'law', 'banning'], 'dep_path_bow': ['law'], 'type': 'date', 'value': 1996, 'entity': 'United States Teenagers', 'entity_utterance': 'states'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': ['study', 'conducted', 'at', 'the', 'University', 'of', 'Nevada', ',', 'Las',

Annotated
Target 37591		Actual 35000000000.0		Class		0
Target 37591		Actual 35000000000.0		Class		0
Target 37591		Actual 1		Class		0
Target 37591		Actual 2675		Class		0
[{'class': 0, 'header_match_intersection': 0, 'complete_bow': ['to', 'the'], 'dep_path_bow': [], 'type': 'number', 'value': 35000000000.0, 'entity': 'United States Teenagers', 'entity_utterance': 'United States'}, {'class': 0, 'header_match_intersection': 0, 'complete_bow': ['to', 'the'], 'dep_path_bow': [], 'type': 'number', 'value': 35000000000.0, 'entity': 'United States Teenagers', 'entity_utterance': 'United States'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': ['refers', 'to', 'the', 'process', 'of', 'applying', 'for', 'entrance', 'to', 'institutions', 'of', 'higher', 'education', 'for', 'undergraduate', 'study', 'at'], 'dep_path_bow': [], 'type': 'number', 'value': 1, 'entity': 'United States Teenagers', 'entity_utterance': 'United States'}, {'class': 0, 'header_match_intersection': 1, 'complete_

Annotated
Target 37591		Actual 2		Class		0
Target 37591		Actual 3		Class		0
Target 37591		Actual 2001		Class		0
Target 37591		Actual 2000		Class		0
Target 37591		Actual 2		Class		0
Target 37591		Actual 3		Class		0
Target 37591		Actual 2001		Class		0
Target 37591		Actual 2000		Class		0
Target 37591		Actual 1999		Class		0
Target 37591		Actual 1997		Class		0
Target 37591		Actual 1997		Class		0
Target 37591		Actual 20		Class		0
Target 37591		Actual 63		Class		0
Target 37591		Actual 2001		Class		0
[{'class': 0, 'header_match_intersection': 1, 'complete_bow': [','], 'dep_path_bow': [], 'type': 'number', 'value': 2, 'entity': 'United States Teenagers', 'entity_utterance': 'United States'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': [',', 'two', 'out', 'of', 'every'], 'dep_path_bow': [], 'type': 'number', 'value': 3, 'entity': 'United States Teenagers', 'entity_utterance': 'United States'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': ['-RRB-', 'reported', 'th

Annotated
[]
Done URL 2 out of 10
 
Looking in document for values similar to 3282570
https://en.wikipedia.org/wiki/Colony_collapse_disorder
https://en.wikipedia.org/wiki/Colony_collapse_disorder
164 candidate matches
['While such disappearances have occurred throughout the history of apiculture , and were known by various names -LRB- disappearing disease , spring dwindle , May disease , autumn collapse , and fall dwindle disease -RRB- , the syndrome was renamed colony collapse disorder in late 2006 in conjunction with a drastic rise in the number of disappearances of western honey bee -LRB- Apis mellifera -RRB- colonies in North America .', 'European beekeepers observed similar phenomena in Belgium , France , the Netherlands , Greece , Italy , Portugal , and Spain , Switzerland and Germany , albeit to a lesser degree , and the Northern Ireland Assembly received reports of a decline greater than 50 % .', 'According to the Agriculture and Consumer Protection Department of the Food and A

Annotated
Target 3282570		Actual 4		Class		0
Target 3282570		Actual 2015		Class		0
Target 3282570		Actual 1284		Class		0
Target 3282570		Actual 7		Class		0
Target 3282570		Actual 60		Class		0
Target 3282570		Actual 1		Class		0
Target 3282570		Actual 2013		Class		0
Target 3282570		Actual 2013		Class		0
Target 3282570		Actual 2692		Class		0
Target 3282570		Actual 5		Class		0
Target 3282570		Actual 2006		Class		0
Target 3282570		Actual 1908		Class		0
[{'class': 0, 'header_match_intersection': 0, 'complete_bow': ['March', '2015', 'as', 'the', 'Saving'], 'dep_path_bow': [], 'type': 'date', 'value': 4, 'entity': 'America', 'entity_utterance': 'America'}, {'class': 0, 'header_match_intersection': 0, 'complete_bow': ['as', 'the', 'Saving'], 'dep_path_bow': [], 'type': 'date', 'value': 2015, 'entity': 'America', 'entity_utterance': 'America'}, {'class': 0, 'header_match_intersection': 0, 'complete_bow': ["'s", 'Pollinators', 'Act', '-LRB-', 'H.R.'], 'dep_path_bow': [], 'type': 'date', 'value': 

Annotated
[]
Done URL 9 out of 10
 
Looking in document for values similar to 3282570
http://usda.mannlib.cornell.edu/MannUsda/viewDocumentInfo.do?documentID=1191
http://usda.mannlib.cornell.edu/MannUsda/viewDocumentInfo.do?documentID=1191
1 candidate matches
['Description : This file contains the annual report of the number of colonies producing honey , yield per colony , honey production , average price , price by color class and value ; honey stocks by state and U.S. Publication Coverage : Sep 24 , 1976 to Mar 14 , 2018 Select a decade or year to expand/collapse .']
About to parse
Annotated
[]
Done URL 10 out of 10
 
Looking in document for values similar to 3282570
https://fas.org/sgp/crs/misc/R43191.pdf
https://fas.org/sgp/crs/misc/R43191.pdf
herox/9.csv	550000000	"United States" Financial Intermediary Funds 2016

Done 11 out of 16
Search for "United States" Financial Intermediary Funds 2016
Query already executed
Missing file. Remove from dictionary and search again
Search for "U

404 candidate matches
['2017 : 42.0 % 7 82016 : 39.0 % 2015 : 41.0 % 2014 : 31.0 % 92013 : 37 % 102012 : 33.1 % 9 10\xa011\xa0122010 : 31.1 % 2008 : 34.0 % 2006 : 33.1 % 2004 : 34.7 % 2002 : 33.5 % 2000 : 32.4 % 1998 : 34.8 % 1996 : 40.1 % 1994 : 40.6 % 1993 : 42.0 % 1991 : 39.6 % 1990 : 42.2 % 1989 : 46.0 % 1988 : 39.8 % 1987 : 46.0 % 1985 : 44.2 % 1984 : 44.9 % 1982 : 45.3 % 1980 : 47.3 % 1977 : 50.4 % 1976 : 46.5 % 1974 : 46.1 % 1973 : 47.0 % 2012 : 20.5 % 132010 : 19.4 % 2008 : 25.2 % 2006 : 20.4 % 2004 : 21.0 % 2002 : 22.2 % 2000 : 21.6 % 1998 : 22.5 % 1996 : 25.8 % 1994 : 27.3 % 1993 : 25.7 % 1991 : 27.8 % 1990 : 27.8 % 1989 : 29.6 % 1988 : 27.3 % 1987 : 30.0 % 1985 : 31.4 % 1984 : 29.6 % 1982 : 32.5 % 1980 : 31.2 % 1977 : 33.5 % 1976 : 30.5 % 1974 : 28.4 % 1973 : 30.8 % 2012 : 21.4 % 142010 : 19.7 % 2008 : 25.4 % 2006 : 20.3 % 2004 : 20.0 % 2002 : 23.2 % 2000 : 20.0 % 1998 : 22.1 % 1996 : 27.2 % 1994 : 27.1 % 1993 : 29.7 % 1991 : 29.0 % 1990 : 28.6 % 1989 : 30.6 % 1988 : 26.4 % 

Annotated
Target 11078		Actual 3		Class		0
Target 11078		Actual 20		Class		0
Target 11078		Actual 35		Class		0
Target 11078		Actual 2017		Class		0
Target 11078		Actual 1981		Class		0
Target 11078		Actual 2010		Class		0
Target 11078		Actual 2012		Class		0
Target 11078		Actual 2015		Class		0
Target 11078		Actual 18		Class		0
Target 11078		Actual 1968		Class		0
Target 11078		Actual 18		Class		0
Target 11078		Actual 1968		Class		0
Target 11078		Actual 2012		Class		0
Target 11078		Actual 35		Class		0
Target 11078		Actual 2012		Class		0
Target 11078		Actual 1993		Class		0
Target 11078		Actual 2011		Class		0
Target 11078		Actual 18		Class		0
Target 11078		Actual 1990		Class		0
Target 11078		Actual 18		Class		0
Target 11078		Actual 1990		Class		0
Target 11078		Actual 207		Class		0
Target 11078		Actual 1		Class		0
Target 11078		Actual 100000		Class		0
Target 11078		Actual 2003		Class		0
Target 11078		Actual 18		Class		0
Target 11078		Actual 1968		Class		0
Target 11078		Actual 18		Class		0
Targe

No meaningful text in this document
Done URL 6 out of 10
 
Looking in document for values similar to 11078
https://en.wikipedia.org/wiki/Gun_violence_in_the_United_States
https://en.wikipedia.org/wiki/Gun_violence_in_the_United_States
231 candidate matches
["In 2013 , there were 73,505 nonfatal firearm injuries -LRB- 23.2 injuries per 100,000 U.S. citizens -RRB- , and 33,636 deaths due to `` injury by firearms '' -LRB- 10.6 deaths per 100,000 U.S. citizens -RRB- .", "These deaths consisted of 11,208 homicides , 21,175 suicides , 505 deaths due to accidental or negligent discharge of a firearm , and 281 deaths due to firearms use with `` undetermined intent '' .", 'In 2012 , there were 8,855 total firearm-related homicides in the US , with 6,371 of those attributed to handguns .', 'In 2012 , 64 % of all gun-related deaths in the U.S. were suicides .', 'In 2010 , there were 19,392 firearm-related suicides , and 11,078 firearm-related homicides in the U.S. .', 'In 2010 , 358 murders were 

Annotated
Target 11078		Actual 1		Class		0
Target 11078		Actual 2015		Class		0
Target 11078		Actual 2007		Class		0
Target 11078		Actual 25		Class		0
Target 11078		Actual 60		Class		0
Target 11078		Actual 25		Class		0
Target 11078		Actual 60		Class		0
Target 11078		Actual 1		Class		0
Target 11078		Actual 2011		Class		0
Target 11078		Actual 2014		Class		0
Target 11078		Actual 355		Class		0
Target 11078		Actual 2015		Class		0
Target 11078		Actual 1791		Class		0
Target 11078		Actual 2001		Class		0
Target 11078		Actual 2001		Class		0
Target 11078		Actual 0.86		Class		0
Target 11078		Actual 100000		Class		0
Target 11078		Actual 0.26		Class		0
Target 11078		Actual 1996		Class		0
Target 11078		Actual 19392		Class		0
Target 11078		Actual 2013		Class		0
Target 11078		Actual 2010		Class		0
Target 11078		Actual 44000000.0		Class		0
Target 11078		Actual 44000000.0		Class		0
Target 11078		Actual 1997		Class		0
Target 11078		Actual 2016		Class		0
Target 11078		Actual 4		Class		0
Target 11078		Actual 

No meaningful text in this document
Done URL 8 out of 10
 
Looking in document for values similar to 11078
http://www.gunpolicy.org/firearms/compareyears/194/number_of_gun_homicides
http://www.gunpolicy.org/firearms/compareyears/194/number_of_gun_homicides
Done URL 9 out of 10
 
Looking in document for values similar to 11078
https://www.theguardian.com/news/datablog/2012/jul/22/gun-homicides-ownership-world-list
https://www.theguardian.com/news/datablog/2012/jul/22/gun-homicides-ownership-world-list
5 candidate matches
["The Small Arms Survey is also useful - although it is from 2007 , it collates civilian gun ownership rates for 178 countries around the world , and has ` normalised ' the data to include a rate per 100,000 population .", "With less than 5 % of the world 's population , the United States is home to roughly 35 -- 50 per cent of the world 's civilian-owned guns , heavily skewing the global geography of firearms and any relative comparison So , given those caveats , we ca

Annotated
Target 167000		Actual 1		Class		0
Target 167000		Actual 1		Class		0
[{'class': 0, 'header_match_intersection': 1, 'complete_bow': ['of', 'several', 'different', 'types', 'of', 'school', 'in', 'the', 'history', 'of', 'education', 'in', 'the'], 'dep_path_bow': ['one'], 'type': 'number', 'value': 1, 'entity': 'United Kingdom', 'entity_utterance': 'United Kingdom'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': ['that', 'follows', 'the', 'academic', 'and', 'cultural', 'traditions', 'established', 'in', 'the'], 'dep_path_bow': [], 'type': 'number', 'value': 1, 'entity': 'United Kingdom', 'entity_utterance': 'United Kingdom'}]
Done URL 2 out of 10
 
Looking in document for values similar to 167000
https://en.wikipedia.org/wiki/Manchester_Free_School
https://en.wikipedia.org/wiki/Manchester_Free_School
88 candidate matches
['The Manchester Grammar School -LRB- MGS -RRB- is the largest independent day school for boys in the United Kingdom -LRB- ages 7 -- 18 -RRB- and i

Annotated
Target 167000		Actual 7		Class		0
Target 167000		Actual 18		Class		0
[{'class': 0, 'header_match_intersection': 1, 'complete_bow': ['-LRB-', 'ages'], 'dep_path_bow': [], 'type': 'number', 'value': 7, 'entity': 'United Kingdom', 'entity_utterance': 'United Kingdom'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': ['-LRB-', 'ages', '7', '--'], 'dep_path_bow': [], 'type': 'number', 'value': 18, 'entity': 'United Kingdom', 'entity_utterance': 'United Kingdom'}]
Done URL 3 out of 10
 
Looking in document for values similar to 167000
https://www.facebook.com/pages/Alcester-Grammar-School/106141472749489
https://www.facebook.com/pages/Alcester-Grammar-School/106141472749489
No meaningful text in this document
Done URL 4 out of 10
 
Looking in document for values similar to 167000
https://www.theguardian.com/education/2016/aug/22/third-britain-medallists-rio-olympics-private-schools-sutton-trust
https://www.theguardian.com/education/2016/aug/22/third-britain-medallists-

No meaningful text in this document
Done URL 8 out of 10
 
Looking in document for values similar to 167000
https://www.justlanded.com/english/United-Kingdom/UK-Guide/Education/Introduction
https://www.justlanded.com/english/United-Kingdom/UK-Guide/Education/Introduction
42 candidate matches
['Full-time education is compulsory in the UK for all children between the ages of 5 -LRB- 4 in Northern Ireland -RRB- and 16 , including the children of foreign nationals permanently or temporarily resident in the UK for a year or longer .', 'No fees are payable in state schools , which are attended by over 90 per cent of pupils .', 'The rest attend one of the 3,200 private fee-paying schools , which include American , international and foreign schools .', 'A large majority of pupils stay on at school after the age of 16 or go on to higher education , but a study in 2012 showed a drop of almost 32,000 students staying in education post-16 .', 'Currently , in 2013 , young people in Year 11 -LRB- En

Annotated
[]
Done URL 2 out of 10
 
Looking in document for values similar to 91.5
https://data.oecd.org/healthcare/child-vaccination-rates.htm
https://data.oecd.org/healthcare/child-vaccination-rates.htm
1 candidate matches
['It is measured as a percentage of children at around age 1 .']
About to parse
Annotated
[]
Done URL 3 out of 10
 
Looking in document for values similar to 91.5
https://www.cdc.gov/measles/stats-surv.html
https://www.cdc.gov/measles/stats-surv.html
Done URL 4 out of 10
 
Looking in document for values similar to 91.5
https://en.wikipedia.org/wiki/Measles
https://en.wikipedia.org/wiki/Measles
97 candidate matches
['Symptoms usually develop 10 -- 12 days after exposure to an infected person and last 7 -- 10 days .', 'Initial symptoms typically include fever , often greater than 40 ° C -LRB- 104.0 ° F -RRB- , cough , runny nose , and inflamed eyes .', "Small white spots known as Koplik 's spots may form inside the mouth two or three days after the start of symptoms 

Annotated
[]
Done URL 5 out of 10
 
Looking in document for values similar to 91.5
https://www.usatoday.com/story/news/2015/02/04/schoolvaccinationrates/22840549/
https://www.usatoday.com/story/news/2015/02/04/schoolvaccinationrates/22840549/
17 candidate matches
['Nearly one in seven public and private schools have measles vaccination rates below 90 % -- a rate considered inadequate to provide immunity , according to a USA TODAY analysis of immunization data in 13 states .', 'Hundreds of thousands of students attend schools -- ranging from small , private academies in New York City to large public elementary schools outside Boston to Native American reservation schools in Idaho -- where vaccination rates have dropped precipitously low , sometimes under 50 % .', 'In the 32 public elementary schools in Boise , Idaho , for example , vaccination rates for measles in 2013-14 ranged from 84.5 % at William Howard Taft Elementary to 100 % at Adams Elementary , just 4 miles away .', 'More trou

Annotated
Target 91.5		Actual 2012		Class		0
Target 91.5		Actual 1		Class		0
Target 91.5		Actual 2		Class		0
Target 91.5		Actual 70		Class		0
Target 91.5		Actual 1989		Class		0
Target 91.5		Actual 1991		Class		0
[{'class': 0, 'header_match_intersection': 1, 'complete_bow': ['youth', 'vaccination', 'rate', 'for', 'each'], 'dep_path_bow': [], 'type': 'date', 'value': 2012, 'entity': 'USA', 'entity_utterance': 'USA'}, {'class': 0, 'header_match_intersection': 0, 'complete_bow': ['data', 'helps'], 'dep_path_bow': [], 'type': 'number', 'value': 1, 'entity': 'USA', 'entity_utterance': 'USA'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': ['contraction', 'counts', 'and', 'youth', 'vaccination', 'rates', 'by', 'year', 'is', 'shown', 'above', '--', 'the'], 'dep_path_bow': [], 'type': 'number', 'value': 2, 'entity': 'USA', 'entity_utterance': 'USA'}, {'class': 0, 'header_match_intersection': 1, 'complete_bow': ['outbreak', 'of', 'the', 'disease', 'happened', 'between', 'the', 'yea

Annotated
[]
Done URL 8 out of 10
 
Looking in document for values similar to 91.5
https://newsatjama.jama.com/2013/09/16/vaccination-rates-for-us-children-remain-generally-high-but-measles-outbreaks-underscore-shortfalls-in-some-regions/
https://newsatjama.jama.com/2013/09/16/vaccination-rates-for-us-children-remain-generally-high-but-measles-outbreaks-underscore-shortfalls-in-some-regions/
15 candidate matches
About to parse
Annotated
[]
Done URL 9 out of 10
 
Looking in document for values similar to 91.5
http://www.dw.com/en/measles-infection-rate-triples-in-germany/a-43449784
http://www.dw.com/en/measles-infection-rate-triples-in-germany/a-43449784
8 candidate matches
['According to immunization quotas presented by the Robert Koch Institute -LRB- RKI -RRB- , a federal government agency and research institute responsible for disease control and prevention , all German states achieved a 95-percent vaccination rate for the first measles vaccination .', 'But vaccination rates averaged

Annotated
[]
Done URL 10 out of 10
 
Looking in document for values similar to 2.5
https://www.cia.gov/library/publications/the-world-factbook/fields/2226.html
https://www.cia.gov/library/publications/the-world-factbook/fields/2226.html
No meaningful text in this document
herox/15.csv	15.1	"USA" Daily Smokers 2014
Done 16 out of 16
Search for "USA" Daily Smokers 2014
Query already executed
Missing file. Remove from dictionary and search again
Search for "USA" Daily Smokers 2014
New Query
Done URL 1 out of 10
 
Looking in document for values similar to 15.1
https://en.wikipedia.org/wiki/List_of_countries_by_cigarette_consumption_per_capita
https://en.wikipedia.org/wiki/List_of_countries_by_cigarette_consumption_per_capita
7 candidate matches
['Cigarettes are smoked by over 1 billion people , which is nearly 20 % of the world population in 2014 .', 'About 800 million of these smokers are men .', 'More than 80 % of all smokers now live in countries with low or middle incomes , and 60 % in

Annotated
[]
Done URL 4 out of 10
 
Looking in document for values similar to 15.1
http://ec.europa.eu/eurostat/statistics-explained/index.php/Tobacco_consumption_statistics
http://ec.europa.eu/eurostat/statistics-explained/index.php/Tobacco_consumption_statistics
29 candidate matches
['This article is one of a set of statistical articles concerning health determinants in the EU which forms part of an online publication on health statistics .', 'The data in this article are from the European health interview survey -LRB- EHIS -RRB- which was conducted between 2013 and 2015 and which covered persons aged 15 and over .', 'Men were more likely than women to be daily smokers Among the 27 EU Member States for which data are available , the proportion of daily smokers ranged from 8.7 % in Sweden to 27.0 nbsp ; % in Greece and 27.3 % in Bulgaria -LRB- see Table 1 -RRB- .', 'Among men , the proportion of daily smokers ranged from 7.5 % in Sweden to 37.3 % in Cyprus , while among women , the pr

Annotated
[]
Done URL 8 out of 10
 
Looking in document for values similar to 15.1
https://www.livescience.com/48923-usa-smoking-declines-to-lowest.html
https://www.livescience.com/48923-usa-smoking-declines-to-lowest.html
13 candidate matches
['The percentage of U.S. adults who smoke cigarettes was 17.8 percent in 2013 , a drop from 20.9 percent in 2005 , and the lowest rate of smoking since researchers began tracking this figure in 1965 , according to the report from the U.S. Centers for Disease Control and Prevention -LRB- CDC -RRB- .', 'The report also found the number of cigarette smokers was 42.1 million in 2013 , a drop from 45.1 million in 2005 , even though the U.S. population is increasing .', '-LSB- 10 Scientific Quit-Smoking Tips -RSB- The decline in smoking rates varied among different populations and regions .', 'For the first time , the researchers had data to break out the smoking rate of people who are lesbian , gay or bisexual .', 'The smoking rate in this group was 2

## Fact Checking

### Training
Load the modules for fact checking, generate the features, and train the classifier on the training data.

In [4]:
from classifier.Classifier import Classifier
from classifier.LogisticRegressionClassifier import LogisticRegressionClassifier
from classifier.features.generate_features import FeatureGenerator, num, is_num
from distant_supervision.utterance_detection import f_threshold_match
from factchecking.question import Question
from tabular.filtering import load_collection

In [5]:
fg = FeatureGenerator()
Xs,ys = fg.generate_training(world)

Done: 0.0
Search for ”Exxon Mobil" Market Value
Query already executed
Done: 6.25
Search for "Unaccompanied children" claimed asylum
Query already executed
Done: 12.5
Search for "Hamas" Founded
Query already executed
Done: 18.75
Search for "United States" Average Temperature
Query already executed
Done: 25.0
Search for "United States" Life expectancy
Query already executed
Done: 31.25
Search for "United States" Number of abortions
Query already executed
Done: 37.5
Search for "United States" Abortion Rate per 1,000 births
Query already executed
Done: 43.75
Search for "United States Teenagers" Percentage Enrolled in education
Query already executed
Done: 50.0
Search for "United States Teenagers" Enrolled in education
Query already executed
Done: 56.25
Search for "America" bee colonies 2011
Query already executed
Done: 62.5
Search for "United States" Financial Intermediary Funds 2016
Query already executed
Done: 68.75
Search for "United States" Homocides by firearm
Query already executed


In [6]:
from sklearn.linear_model import LogisticRegression

class LogisticRegressionClassifier(Classifier):
    def train(self, Xs, ys):
        print("Training classifier 3")
        # L1-regularised logistic regression; the liblinear solver supports the L1 penalty
        self.lr = LogisticRegression(penalty='l1', C=0.78, solver='liblinear')
        self.lr.fit(Xs, ys)
        print("Trained")

    def predict(self, q_features):
        # Return both the predicted labels and the per-class probabilities
        return self.lr.predict(q_features), self.lr.predict_proba(q_features)


classifier = LogisticRegressionClassifier()
classifier.train(Xs,ys)

Training classifier 3
Trained
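
As a standalone illustration of what `predict` returns, the sketch below trains the same kind of L1-regularised model on toy data and unpacks the `(labels, probabilities)` pair. The feature values and labels are invented for illustration and are not taken from the knowledge base. `solver='liblinear'` is an added assumption so that the L1 penalty is accepted on recent scikit-learn releases.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data, invented for illustration only
X_train = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 1], [2, 2]], dtype=float)
y_train = np.array([0, 0, 0, 1, 1, 1])

# Same hyperparameters as the notebook's classifier, plus an explicit solver
model = LogisticRegression(penalty='l1', C=0.78, solver='liblinear')
model.fit(X_train, y_train)

X_test = np.array([[0, 0], [2, 2]], dtype=float)

# predict gives one label per row; predict_proba gives one probability per class per row
labels = model.predict(X_test)
probabilities = model.predict_proba(X_test)

for i in range(len(X_test)):
    print(labels[i], probabilities[i])
```

Indexing the pair as `result[0][i]` therefore yields the predicted label of the i-th candidate, and `result[1][i]` its class probabilities.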


### Runtime

Load the source data

In [7]:
tables = load_collection("herox")
print(tables.files)

LOADED:
[{'answer': '0', 'utterance': 'No Utterance', 'table': 'herox/1.tsv', 'id': 'hx-0'}, {'answer': '0', 'utterance': 'No Utterance', 'table': 'herox/2.tsv', 'id': 'hx-1'}, {'answer': '0', 'utterance': 'No Utterance', 'table': 'herox/3.tsv', 'id': 'hx-2'}, {'answer': '0', 'utterance': 'No Utterance', 'table': 'herox/4.tsv', 'id': 'hx-3'}, {'answer': '0', 'utterance': 'No Utterance', 'table': 'herox/5.tsv', 'id': 'hx-4'}, {'answer': '0', 'utterance': 'No Utterance', 'table': 'herox/8.tsv', 'id': 'hx-5'}, {'answer': '0', 'utterance': 'No Utterance', 'table': 'herox/9.tsv', 'id': 'hx-6'}, {'answer': '0', 'utterance': 'No Utterance', 'table': 'herox/10.tsv', 'id': 'hx-7'}, {'answer': '0', 'utterance': 'N', 'table': 'herox/11.tsv', 'id': 'hx-8'}, {'answer': '0', 'utterance': 'N', 'table': 'herox/12.tsv', 'id': 'hx-9'}, {'answer': '0', 'utterance': 'N', 'table': 'herox/13.tsv', 'id': 'hx-10'}, {'answer': '0', 'utterance': 'N', 'table': 'herox/14.tsv', 'id': 'hx-11'}]
register table herox

Define the fact-checking function

In [8]:
def fact_check(q):
    question = Question(text=q, type="NUM")
    question.parse()
    tuples, q_features = fg.generate_test(tables, question)
    q_match = False

    if len(tuples) > 0:
        q_predicted = classifier.predict(q_features)
        # Dates mentioned in the claim, as strings, for comparison with the table dates
        question_dates = {str(d) for d in question.dates}

        for i, candidate in enumerate(tuples):
            fields = candidate[1]

            # Skip table entries whose dates share nothing with the dates in the claim
            if 'date' in fields and question_dates:
                if not set(fields['date']).intersection(question_dates):
                    continue

            if not is_num(fields['value']):
                continue

            # q_predicted[0] holds the predicted class labels
            if q_predicted[0][i] == 1:
                print(str(candidate) + "\t\t" + "Possible Match")
                value = num(fields['value'])
                if value is None:
                    continue

                # Numbers in the claim match if they are within 5% of the table value
                for number in question.numbers:
                    if f_threshold_match(number, value, 0.05):
                        print(str(candidate) + "\t\t" + "Threshold Match to 5%")
                        q_match = True

                # Dates in the claim must match the table value exactly
                for date in question.dates:
                    if date == value:
                        print(str(candidate) + "\t\t" + "Exact Match")
                        q_match = True
        print(question.text)
        print(q_match)

    else:
        print(question.text)
        print("No supporting information can be found in the knowledge base")
    print("\n\n")
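
The two checks that `fact_check` applies to each candidate can be illustrated in isolation. The sketch below is a stand-in under stated assumptions, not the project's code: `dates_overlap` and `within_relative_threshold` are hypothetical helpers, and the real `f_threshold_match` in `distant_supervision.utterance_detection` may define the tolerance differently (here it is taken relative to the table value).

```python
def dates_overlap(table_dates, question_dates):
    """Return True when the claim mentions no dates, or shares one with the table entry.

    Hypothetical helper mirroring the date filter in fact_check.
    """
    wanted = {str(d) for d in question_dates}
    if not wanted:
        return True
    return bool(set(table_dates) & wanted)


def within_relative_threshold(claimed, actual, threshold=0.05):
    """Return True when `claimed` is within `threshold` (a fraction) of `actual`.

    Assumption: the tolerance is measured relative to the table value; the
    project's f_threshold_match may use a different definition.
    """
    if actual == 0:
        return claimed == 0
    return abs(claimed - actual) <= threshold * abs(actual)


# Values taken from table entries that appear in this notebook's output
print(dates_overlap(['1901', 'January'], []))      # claim without dates
print(dates_overlap(['2014'], [2014]))             # claim mentioning 2014
print(within_relative_threshold(-5, -5.2027664))   # temperature claim
print(within_relative_threshold(23000, 4400))      # unrelated quantity
```

Under this definition a claim of -5 against a table value of -5.2027664 differs by about 0.2, which is inside the 5% band of roughly 0.26, whereas 23,000 against 4,400 is far outside it.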

### Fact checking

In [9]:
fact_check("23,000 sheep Germany 2014")

('herox/2.tsv', {'relation': 'Asylum applicants considered to be unaccompanied minors', 'entity': 'Germany', 'value': '4400', 'date': ['2014']})		Possible Match
('herox/2.tsv', {'relation': 'Asylum applicants considered to be unaccompanied minors', 'entity': 'Germany', 'value': '4400', 'date': ['2014']})		Possible Match
('herox/2.tsv', {'relation': 'Asylum applicants considered to be unaccompanied minors', 'entity': 'Germany', 'value': '4400', 'date': ['2014']})		Possible Match
('herox/2.tsv', {'relation': 'Asylum applicants considered to be unaccompanied minors', 'entity': 'Germany', 'value': '4400', 'date': ['2014']})		Possible Match
('herox/5.tsv', {'relation': 'DEU', 'entity': 'Germany', 'value': '80.84390244', 'date': ['2014']})		Possible Match
('herox/5.tsv', {'relation': 'DEU', 'entity': 'Germany', 'value': '80.84390244', 'date': ['2014']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Germany', 'value': '80.84390244', 'date': ['2014']})		Po

In [10]:
fact_check("-5 degree celsius was the temperature in USA")

('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-5.2027664', 'date': ['1901', 'January'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-5.2027664', 'date': ['1901', 'January'], 'entity': 'USA'})		Threshold Match to 5%
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-5.2027664', 'date': ['1901', 'January'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-5.2027664', 'date': ['1901', 'January'], 'entity': 'USA'})		Threshold Match to 5%
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-5.2027664', 'date': ['1901', 'January'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-5.2027664', 'date': ['1901', 'January'], 'entity': 'USA'})		Threshold Match to 5%
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-5.2027664', 'date': ['1901', 'January'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature

('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-0.42599672', 'date': ['1927', 'March'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-0.42599672', 'date': ['1927', 'March'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-0.42599672', 'date': ['1927', 'March'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '-0.42599672', 'date': ['1927', 'March'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '4.801546', 'date': ['1927', 'April'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '4.801546', 'date': ['1927', 'April'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '4.801546', 'date': ['1927', 'April'], 'entity': 'USA'})		Possible Match
('herox/4.tsv', {'relation': 'Temperature (C)', 'value': '4.801546', 'date': [

('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'United Kingdom', 'value': '79.24878049', 'date': ['2006']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'United Kingdom', 'value': '79.24878049', 'date': ['2006']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'United Kingdom', 'value': '79.24878049', 'date': ['2006']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'United Kingdom', 'value': '79.24878049', 'date': ['2006']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Other small states', 'value': '62.47576521', 'date': ['2006']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Other small states', 'value': '62.47576521', 'date': ['2006']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Other small states', 'value': '62.47576521', 'date': ['2006'

('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Latin America & the Caribbean (IDA & IBRD countries)', 'value': '73.24000644', 'date': ['2007']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Latin America & the Caribbean (IDA & IBRD countries)', 'value': '73.24000644', 'date': ['2007']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Latin America & the Caribbean (IDA & IBRD countries)', 'value': '73.24000644', 'date': ['2007']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Latin America & Caribbean (excluding high income)', 'value': '73.37919597', 'date': ['2008']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Latin America & Caribbean (excluding high income)', 'value': '73.37919597', 'date': ['2008']})		Possible Match
('herox/5.tsv', {'relation': 'Life expectancy at birth', 'entity': 'Latin America & Caribbean (e