# NED Benchmark Tutorial

In this tutorial, we demonstrate how to use a pretrained Bootleg NED model to run inference on RSS500 and KORE50, two standard sentence-level NED benchmarks.


### Requirements 

To run this tutorial, you'll need to download the following: 

- Pretrained Bootleg model and config [here](https://bootleg-emb.s3.amazonaws.com/models/2020_10_22/bootleg_wiki.tar.gz)
- RSS500 data [here](https://bootleg-emb.s3.amazonaws.com/data/rss500.tar.gz)
- KORE50 data [here](https://bootleg-emb.s3.amazonaws.com/data/kore50.tar.gz)
- Entity data [here](https://bootleg-emb.s3.amazonaws.com/data/wiki_entity_data.tar.gz)
- Embedding data [here](https://bootleg-emb.s3.amazonaws.com/data/emb_data.tar.gz)
- Pretrained BERT model [here](https://bootleg-emb.s3.amazonaws.com/pretrained_bert_models.tar.gz)

For convenience, you can run the commands below (from the root directory of the repo) to download all the above files and unpack them to `models`, `data`, and `pretrained_bert_models` directories. It will take several minutes to download all the files. 

    bash download_model.sh 
    bash download_data.sh 
    bash download_bert.sh
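
Before moving on, it can help to confirm that the three directories named above were actually created. The snippet below is a minimal sketch, not part of Bootleg: `missing_dirs` is a hypothetical helper, and it assumes you run it from the root directory of the repo.

```python
from pathlib import Path

# Directory names come from the download step above.
EXPECTED_DIRS = ["models", "data", "pretrained_bert_models"]


def missing_dirs(root_dir, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under root_dir."""
    root = Path(root_dir)
    return [name for name in expected if not (root / name).is_dir()]


# "." assumes the current working directory is the repo root.
print("Missing directories:", missing_dirs("."))
```

An empty list means all three directories are present; otherwise re-run the corresponding download script.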
    

## 1. Prepare the Config File

First, run the necessary import statements. You will need to have installed Bootleg as a package to run these (see the installation instructions in the README).

In [1]:
import sys
import logging
from importlib import reload
reload(logging)
logging.basicConfig(stream=sys.stdout, format='%(asctime)s %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

from bootleg import run
from bootleg.utils.parser_utils import get_full_config

By default, inference runs on the CPU. If you have a GPU with at least 12GB of memory available, set the flag below to `False` to run inference on the GPU instead.

In [2]:
use_cpu = True
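
If you would rather not set the flag by hand, the sketch below shows one possible way to derive it. It is an assumption-laden illustration rather than something Bootleg provides: `pick_use_cpu` is a hypothetical helper, it looks only at the first CUDA device, and it compares the device's total memory (not the currently free memory) against the 12GB figure.

```python
def pick_use_cpu(min_gpu_mem_gb=12):
    """Return True if inference should fall back to the CPU.

    Hypothetical helper: checks only CUDA device 0 and its *total* memory.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch we cannot query a GPU, so stay on the CPU.
        return True
    if not torch.cuda.is_available():
        return True
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    # Decimal gigabytes, matching how GPU memory sizes are usually advertised.
    return total_bytes < min_gpu_mem_gb * 10**9


use_cpu = pick_use_cpu()
print("use_cpu =", use_cpu)
```

Other processes may already be using part of the GPU's memory, so treat the result as a hint and override it manually if needed.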

Load the model config so we can set additional parameters and load the saved model during evaluation.  

In [3]:
root_dir = "/path/to/bootleg"  # FILL IN FULL PATH TO ROOT REPO DIRECTORY HERE
config_path = f'{root_dir}/models/bootleg_wiki/bootleg_config.json'
config_args = get_full_config(config_path)

Update the config parameters to point to the downloaded model checkpoint and data. 

In [4]:
# set the model checkpoint path 
config_args.run_config.init_checkpoint = f'{root_dir}/models/bootleg_wiki/bootleg_model.pt'

# set the path for the entity db and candidate map
config_args.data_config.entity_dir = f'{root_dir}/data/wiki_entity_data'
config_args.data_config.alias_cand_map = 'alias2qids_rss500.json'

# set the data path and RSS500 test file 
config_args.data_config.data_dir = f'{root_dir}/data/rss500'
config_args.data_config.test_dataset.file = 'test_rss500.jsonl'

# set the embedding paths 
config_args.data_config.emb_dir =  f'{root_dir}/data/emb_data'
config_args.data_config.word_embedding.cache_dir =  f'{root_dir}/pretrained_bert_models'

# set the save directory 
config_args.run_config.save_dir = f'{root_dir}/results'

# set whether to run inference on the CPU
config_args.run_config.cpu = use_cpu

## 2. Run Inference for RSS500

Once the config is set up, run model evaluation! In the results table printed at the end of the output, you should see that 429/520 mentions (`men`) are correct (`crct`).

In [5]:
run.model_eval(args=config_args, mode="eval", logger=logger, is_writer=True)

2020-10-21 17:19:19,476 PyTorch version 1.5.0 available.




2020-10-21 17:19:23,012 TensorFlow version 2.2.0 available.
2020-10-21 17:19:23,775 Loading entity_symbols...
2020-10-21 17:19:29,472 Loaded entity_symbols with 5310039 entities.
2020-10-21 17:19:29,574 Loading slices...
2020-10-21 17:19:29,607 Finished loading slices.
2020-10-21 17:19:47,648 Loading dataset...
2020-10-21 17:19:47,679 Finished loading dataset.
2020-10-21 17:19:54,652 Sampled 353 indices from dataset (dev/test) for evaluation.
2020-10-21 17:19:55,147 Loading embeddings...
2020-10-21 17:20:20,700 Finished loading embeddings.
2020-10-21 17:20:20,844 Loading model from /dfs/scratch0/lorr1/bootleg/bootleg-internal/new_tutorial_data/models/bootleg_wiki/bootleg_model.pt...
2020-10-21 17:20:28,923 Successfully loaded model from /dfs/scratch0/lorr1/bootleg/bootleg-internal/new_tutorial_data/models/bootleg_wiki/bootleg_model.pt starting from checkpoint epoch 1 and step 0.
2020-10-21 17:20:29,016 ************************RUNNING EVAL test_rss500.jsonl************************
2020-

Running eval: 100%|██████████| 23/23 [00:22<00:00,  1.04it/s]

2020-10-21 17:20:51,319 
+------------+------------+-------+--------+-------------+------------+-------+-----------+----------+
| head       | slice      |   men |   crct |   crct_top5 |   crct_pop |    f1 |   f1_top5 |   f1_pop |
| final_loss | final_loss |   520 |    429 |         474 |        276 | 0.825 |     0.912 |    0.531 |
+------------+------------+-------+--------+-------------+------------+-------+-----------+----------+





The `final_loss` head corresponds to the final prediction head. We only have a single data subset, or slice, for the benchmark, which is the overall slice that includes all mentions (we call this slice `final_loss`). 

The `f1_pop` metric is the score of a simple baseline that predicts the most popular candidate for each mention without using any contextual information. We see that Bootleg improves by over 29 F1 points on this baseline (0.825 vs. 0.531).

The F1 score reported here is the micro-averaged F1 score over the entities and assumes 100% candidate recall (that is, every mention has a candidate list). In the benchmarks, however, some mentions have no corresponding candidate list. For benchmarks we therefore need to recompute the F1 taking the candidate recall into account; the number above is equivalent to the benchmark precision.
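
As a rough illustration of that recomputation, the sketch below treats the reported score as precision (correct mentions over mentions that have a candidate list) and computes recall over all benchmark mentions. The counts 429 and 520 are taken from the table above; the total of 530 is a made-up placeholder, not the real size of the benchmark, and `benchmark_f1` is a hypothetical helper rather than a Bootleg function.

```python
def benchmark_f1(num_correct, num_with_candidates, num_total_mentions):
    """Return (precision, recall, f1), with recall over all benchmark mentions.

    Assumes both denominators are positive.
    """
    precision = num_correct / num_with_candidates
    recall = num_correct / num_total_mentions
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 429 and 520 come from the results table; 530 is a placeholder total
# that you should replace with the true number of benchmark mentions.
precision, recall, f1 = benchmark_f1(429, 520, 530)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

When every mention has a candidate list, the two denominators coincide and the recomputed F1 equals the reported one.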

## 3. Analyze the Errors

To understand what examples Bootleg gets wrong, we also support a `dump_preds` mode. Rather than computing aggregate quality metrics, this mode writes a jsonlines file with the predicted candidates for each mention and their associated probabilities. 

Running this works much as before, except that we switch the mode to `dump_preds`. 

In [6]:
pred_file, _ = run.model_eval(args=config_args, mode="dump_preds", logger=logger, is_writer=True)

2020-10-21 17:20:51,382 Loading entity_symbols...
2020-10-21 17:20:57,033 Loaded entity_symbols with 5310039 entities.
2020-10-21 17:20:57,038 Loading slices...
2020-10-21 17:20:57,043 Finished loading slices.
2020-10-21 17:21:15,805 Loading dataset...
2020-10-21 17:21:15,818 Finished loading dataset.
2020-10-21 17:21:22,605 Loading embeddings...
2020-10-21 17:22:00,844 Finished loading embeddings.
2020-10-21 17:22:01,200 Loading model from /dfs/scratch0/lorr1/bootleg/bootleg-internal/new_tutorial_data/models/bootleg_wiki/bootleg_model.pt...
2020-10-21 17:22:13,147 Successfully loaded model from /dfs/scratch0/lorr1/bootleg/bootleg-internal/new_tutorial_data/models/bootleg_wiki/bootleg_model.pt starting from checkpoint epoch 1 and step 0.
2020-10-21 17:22:13,259 ************************DUMPING PREDICTIONS FOR test_rss500.jsonl************************
2020-10-21 17:22:13,361 368 samples, 23 batches, 353 len dataset
2020-10-21 17:26:29,812 Writing predictions to /dfs/scratch0/lorr1/bootle
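
Because the predictions are written as jsonlines (one JSON object per line), the file can also be inspected directly with the standard library. The sketch below is a generic reader; `read_jsonl` is a hypothetical helper, and it makes no assumption about which fields Bootleg writes to each record.

```python
import json


def read_jsonl(path):
    """Load a jsonlines file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Example usage with the path returned by the dump_preds call above:
# records = read_jsonl(pred_file)
# print(len(records), "records; fields of the first:", sorted(records[0]))
```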

We provide utility functions to load in the predicted labels as well as the original file and generate a merged [pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html) with both predicted and gold labels. 

In [7]:
from utils import score_predictions, load_title_map, load_cand_map
import pandas as pd
pd.options.display.max_colwidth = 500

title_map = load_title_map(f'{root_dir}/data/wiki_entity_data/entity_mappings')
pred_df = score_predictions(orig_file=f'{root_dir}/data/rss500/test_rss500.jsonl', 
                 pred_file=pred_file,
                 title_map=title_map)

100%|██████████| 353/353 [00:00<00:00, 7404.76it/s]


Let's take a look at the DataFrame! The schema returned is:

`sentence`: the sentence,

`sent_idx`: the unique index of the sentences,

`aliases`: all mentions in the sentence,

`span`: the span of the predicted mention,

`slices`: the slices this example is in (see the training tutorials for definitions),

`alias`: the mention to predict,

`alias_idx`: the index of the mention within the sentence (0 for the first mention, 1 for the second, ...),

`is_gold_label`: whether the mention is a gold label from the data (`True`) or a weak label (`False`),

`gold_qid`: the gold entity,

`pred_qid`: the predicted entity,

`gold_title`: the gold title,

`pred_title`: the predicted title,

`all_gold_qids`: all gold entities in the sentence,

`all_pred_qids`: all predicted entities,

`gold_label_aliases`: all gold mentions in the sentence,

`all_is_gold_labels`: which mentions are gold or weak labels,

`all_spans`: the spans of all mentions

In [8]:
pred_df.sample(10)

Unnamed: 0,sentence,sent_idx,aliases,span,slices,alias,alias_idx,is_gold_label,gold_qid,pred_qid,gold_title,pred_title,all_gold_qids,all_pred_qids,gold_label_aliases,all_is_gold_labels,all_spans
427,"When Los Angeles Lakers player Kobe Bryant faced sexual assault charges in Vail in 2003 , it took a media challenge to unseal an affidavit in which police laid out their case for an arrest .",289,"[los angeles lakers_289, kobe bryant_289]","[1, 4]",[],los angeles lakers_289,0,True,Q121783,Q121783,Los Angeles Lakers,Los Angeles Lakers,"[Q121783, Q25369]","[Q121783, Q25369]","[los angeles lakers_289, kobe bryant_289]","[True, True]","[[1, 4], [5, 7]]"
194,"`` We are monitoring the situation , `` European Commission spokeswoman Mina Andreeva told media .",136,[european commission_136],"[8, 10]",[],european commission_136,0,True,Q8880,Q8880,European Commission,European Commission,[Q8880],[Q8880],[european commission_136],[True],"[[8, 10]]"
300,"An Oct. 5 , 2011 , photo shows Apple founder Steve Jobs ' home in Palo Alto , Calif. .",204,"[apple_204, steve jobs_204]","[8, 9]",[],apple_204,0,True,Q312,Q312,Apple Inc.,Apple Inc.,"[Q312, Q19837]","[Q312, Q19837]","[apple_204, steve jobs_204]","[True, True]","[[8, 9], [10, 12]]"
363,"The B-52 took to the skies Tuesday , but no other information about the test flight was available , John Haire , a spokesman for Edwards Air Force Base in California , said in an email .",244,[edwards air force base_244],"[25, 29]",[],edwards air force base_244,0,True,Q217563,Q217563,Edwards Air Force Base,Edwards Air Force Base,[Q217563],[Q217563],[edwards air force base_244],[True],"[[25, 29]]"
386,"Klein comes to Portland by way of Norcross , Ga. , where she has been the human resource manager for the Deutz Corp. for the last four years .",260,[portland_260],"[3, 4]",[],portland_260,0,True,Q49201,Q6106,"Portland, Maine","Portland, Oregon",[Q49201],[Q6106],[portland_260],[True],"[[3, 4]]"
222,"In Seattle , Washington , Felix Hernandez pitched Seattle 's first perfect game and the 23rd in majors history as the Mariners edged Tampa Bay .",152,"[seattle_152, felix hernandez_152]","[1, 2]",[],seattle_152,0,True,Q466586,Q5083,Seattle Mariners,Seattle,"[Q466586, Q1196594]","[Q5083, Q1196594]","[seattle_152, felix hernandez_152]","[True, True]","[[1, 2], [5, 7]]"
196,"Maloaufa'atasi Faumuina Aug. 1 , 2012 Maloaufa'atasi Faumuina , 51 , of Ewa Beach died in Aiea .",138,"[ewa beach_138, aiea_138]","[12, 14]",[],ewa beach_138,0,True,Q2138861,Q2138861,"ʻEwa Beach, Hawaii","ʻEwa Beach, Hawaii","[Q2138861, Q423505]","[Q2138861, Q423505]","[ewa beach_138, aiea_138]","[True, True]","[[12, 14], [16, 17]]"
469,"Associated Press writers Frank Bajak in Lima , Peru , Jill Lawless , David Stringer and Raissa Ioussouf in London , and Louise Nordstrom and Karl Ritter in Stockholm all contributed to this report .",319,[associated press_319],"[0, 2]",[],associated press_319,0,True,Q40469,Q40469,Associated Press,Associated Press,[Q40469],[Q40469],[associated press_319],[True],"[[0, 2]]"
24,"That 's good , because CBS Sports analyst Boomer Esiason does n't think he has a prayer of earning a starting quarterback job in the NFL .",18,"[cbs sports_18, boomer esiason_18]","[5, 7]",[],cbs sports_18,0,True,Q2931052,Q2931052,CBS Sports,CBS Sports,"[Q2931052, Q725373]","[Q2931052, Q725373]","[cbs sports_18, boomer esiason_18]","[True, True]","[[5, 7], [8, 10]]"
462,"`` It 's an exceptional situation that has to do with Maxim 's sporting abilities , `` said Max Laulie , a spokesman for Chile 's prison police .",313,[chile_313],"[24, 25]",[],chile_313,0,True,Q3100431,Q298,Chilean Gendarmerie,Chile,[Q3100431],[Q298],[chile_313],[True],"[[24, 25]]"


You can also add the possible candidates with their associated scores by passing the `cands_map` argument to `score_predictions`.

In [10]:
# Load candidate mappings to pass as input
cands_map = load_cand_map(f'{root_dir}/data/wiki_entity_data/entity_mappings', 'alias2qids_rss500.json')
pred_df = score_predictions(orig_file=f'{root_dir}/data/rss500/test_rss500.jsonl',
                 pred_file=pred_file,
                 title_map=title_map,
                 cands_map=cands_map)


100%|██████████| 353/353 [00:00<00:00, 5514.59it/s]


In [11]:
pred_df.sample(10)

Unnamed: 0,sentence,sent_idx,aliases,span,slices,alias,alias_idx,is_gold_label,gold_qid,pred_qid,gold_title,pred_title,all_gold_qids,all_pred_qids,gold_label_aliases,all_is_gold_labels,all_spans,cands
161,"But Steven Smith , 29 , a customer at a Dunkin ' Donuts in downtown Boston , said he would rather pay with a credit card .",114,"[dunkin donuts_114, boston_114]","[15, 16]",[],boston_114,1,True,Q100,Q100,Boston,Boston,"[Q847743, Q100]","[Q847743, Q100]","[dunkin donuts_114, boston_114]","[True, True]","[[10, 13], [15, 16]]","[(Boston, 1.5685531252529472e-05), (Boston Red Sox, 2.532582584535703e-05), (New England Patriots, 3.9017930248519406e-05), (Boston University, 2.729496918618679e-05), (Boston Bruins, 1.7604202184884343e-06), (2018 Boston Red Sox season, 2.3378809146379353e-06), (Boston (band), 2.6560886908555403e-05), (Boston accent, 4.223111318424344e-05), (Boston Garden, 2.684879655134864e-05), (David Boston, 8.435853487753775e-06), (Walter Boston, 0.0006575779407285154), (Sports in Boston, 0.020751932635..."
491,"US and British regulators have already fined Barclays , based in Britain , US $ 453 million for submitting false information between 2005 and 2009 to keep the interest rate , known as LIBOR , low .",334,"[barclays_334, britain_334]","[7, 8]",[],barclays_334,0,True,Q245343,Q245343,Barclays,Barclays,"[Q245343, Q23666]","[Q245343, Q145]","[barclays_334, britain_334]","[True, True]","[[7, 8], [11, 12]]","[(Barclays, 0.9999998807907104), (Eliza Henderson Boardman Otis, 8.943177931541868e-08)]"
375,"Kayla , who lives in Marblehead , is the first United States athlete to ever win a gold medal at the Olympics in judo .",253,"[kayla_253, marblehead_253]","[0, 1]",[],kayla_253,0,True,Q2358331,Q2358331,Kayla Harrison,Kayla Harrison,"[Q2358331, Q27416]","[Q2358331, Q27416]","[kayla_253, marblehead_253]","[True, True]","[[0, 1], [5, 6]]","[(List of Desperate Housewives characters, 0.6721351742744446), (Kayla Brady, 0.00020057997608091682), (Kayla Ewell, 0.000505270843859762), (Kayla Day, 0.0014676775317639112), (List of captive killer whales, 7.170605385908857e-05), (Kayla Harrison, 0.048275385051965714), (Kayla Iacovino, 0.009471606463193893), (Kayla Williams (author), 0.0005187817150726914), (Kayla Bashore Smedley, 0.16545122861862183), (Kayla Mueller, 0.00010972433665301651), (Kayla Banwarth, 0.0026879259385168552), (Kayla..."
455,"`` It was an open match , `` Marseille coach Elie Baup said .",308,"[marseille_308, elie baup_308]","[8, 9]",[],marseille_308,0,True,Q132885,Q132885,Olympique de Marseille,Olympique de Marseille,"[Q132885, Q128932]","[Q132885, Q128932]","[marseille_308, elie baup_308]","[True, True]","[[8, 9], [10, 12]]","[(Marseille, 3.8954611227381974e-05), (Olympique de Marseille, 6.262424108172127e-07), (Opéra de Marseille, 1.476374836784089e-05), (Marseille Provence Airport, 1.2835255525089906e-08), (Open 13, 7.683397029722983e-07), (CN Marseille, 8.204135610867525e-07), (Marseille tramway, 2.679290673768264e-07), (Marseille (band), 3.7334348235162906e-07), (Hervé Marseille, 2.2851475023344392e-07), (Kencia Marseille, 1.041611881191784e-06), (Wagner Marseille, 1.059897236643792e-08), (Hans-Joachim Marsei..."
380,"U of L College Republicans host first annual Legacy Dinner Students , elected officials and candidates gather to discuss politics By : Katherine Smith Larry Cox , state director for Senator Mitch McConnell , and Darrell Brock , chairman of the Republican Party of Kentucky , both attended the first annual Legacy Dinner Republican Party of Kentucky chairman Darrell Brock addressed the crowd at the first annual College Republican Legacy Dinner On Monday , April 10th , the University of Louisvil...",256,"[darrell brock_256, republican party of kentucky_256]","[41, 45]",[],republican party of kentucky_256,1,True,Q7314656,Q7314656,Republican Party of Kentucky,Republican Party of Kentucky,"[Q5224589, Q7314656]","[Q5224589, Q7314656]","[darrell brock_256, republican party of kentucky_256]","[True, True]","[[35, 37], [41, 45]]","[(Republican Party of Kentucky, 1.0)]"
509,Bologna president Claudio Sabatini told The Associated Press on Friday that he had reached a tentative deal with Bryant 's agent Rob Pelinka for a 10-game contract worth more than $ 3 million US .,351,[bologna_351],"[0, 1]",[],bologna_351,0,True,Q1891,Q1891,Bologna,Bologna,[Q1891],[Q1891],[bologna_351],[True],"[[0, 1]]","[(Bologna, 0.004951123613864183), (University of Bologna, 0.00012221888755448163), (Bologna F.C. 1909, 0.001400307985022664), (Bologna Process, 9.045022306963801e-05), (Guido Reni, 0.025340747088193893), (Bologna sausage, 0.002795316744595766), (Battle of Bologna, 0.07389577478170395), (Joseph Bologna, 0.008049306459724903), (Giuseppe Bologna, 0.0007861307240091264), (Bologna Outdoor, 0.00037268863525241613), (Jack Bologna, 0.00028855132404714823), (ATP Bologna, 2.5569166609784588e-05), (Enr..."
178,"Pietersen is by far England 's most naturally gifted strokemaker , but his international career lies in tatters after a dramatic week that started with allegations that he sent text messages to South African players containing derogatory comments about England captain Andrew Strauss and coach Andy Flower .",126,"[england_126, andrew strauss_126]","[4, 5]",[],england_126,0,True,Q1321565,Q1321565,England cricket team,England cricket team,"[Q1321565, Q507772]","[Q1321565, Q507772]","[england_126, andrew strauss_126]","[True, True]","[[4, 5], [41, 43]]","[(United Kingdom, 0.5751771926879883), (England, 4.878557433585229e-07), (Great Britain, 0.0001647264143684879), (Premier League, 9.717416560306447e-08), (Church of England, 8.398117756769352e-08), (England cricket team, 1.3774587159787188e-07), (History of cricket in South Africa from 1945–46 to 1970, 0.0005827209679409862), (England national rugby union team, 1.7482111047684157e-07), (Edward II of England, 5.86058422413771e-06), (Edward I of England, 6.850871159258531e-07), (John England (..."
318,Hospital spokesman Tan Ching-ting -LRB- 譚慶鼎 -RRB- said the hospital would respond after it had received and reviewed the report .,216,[hospital_216],"[0, 1]",[],hospital_216,0,True,Q1418766,Q16917,National Taiwan University Hospital,Hospital,[Q1418766],[Q16917],[hospital_216],[True],"[[0, 1]]","[(Knights Hospitaller, 0.00264916243031621), (Johns Hopkins Hospital, 0.00044850129052065313), (Hospital, 0.0068001775071024895), (King's College Hospital, 0.09544882923364639), (Sydney Hospital, 0.001592479646205902), (Pennsylvania Hospital, 0.022517558187246323), (The Hospital, 0.01738847605884075), (Hospital ship, 1.6475885786348954e-05), (Military hospital, 0.002538090804591775), (Hospital District, 1.6291141946567222e-05), (Alcatraz Hospital, 0.010642416775226593), (Ralph Hospital, 0.00..."
257,"Democrats and Republicans were at least outwardly thrilled by GOP presidential candidate Mitt Romney 's choice announced Saturday of Ryan as running mate , with Democrats seeing him as a hardliner who will put off moderates , and Republicans viewing him as a charismatic budget-cutter .",173,"[gop_173, mitt romney_173]","[9, 10]",[],gop_173,0,True,Q29468,Q29468,Republican Party (United States),Republican Party (United States),"[Q29468, Q4496]","[Q29468, Q4496]","[gop_173, mitt romney_173]","[True, True]","[[9, 10], [12, 14]]","[(Republican Party (United States), 3.3815301776485285e-06), (New York Republican State Committee, 1.6862987877175328e-06), (Upper Silesian Industrial Region, 5.432253047388258e-08), (OPN1MW, 8.823018760040213e-08), (Goa Opinion Poll, 3.2471698432345875e-06), (The Gop, 6.698076617794868e-07), (Group of pictures, 8.602642083133105e-07), (Gorakhpur Airport, 6.730406312271953e-07), (Governor of Okinawa Prefecture, 2.3164595859270776e-06), (Glaslyn Osprey Project, 9.819804063226911e-07), (OPN1MW..."
498,"Brian Hemphill , president of West Virginia State University , and Beverly Jo Harris , president of Bridgemont Community and Technical College , also made announcements during the meeting .",341,[bridgemont community_341],"[17, 19]",[],bridgemont community_341,0,True,Q6360819,Q1336920,BridgeValley Community and Technical College,Community college,[Q6360819],[Q1336920],[bridgemont community_341],[True],"[[17, 19]]","[(Unincorporated area, 0.2521010935306549), (Basque Country (autonomous community), 0.03559165820479393), (European Economic Community, 0.06355460733175278), (Community (Wales), 0.009082355536520481), (Barrios of Puerto Rico, 0.08821579813957214), (Community college, 0.02357189916074276), (Community radio, 0.039231594651937485), (Community, 0.030429603531956673), (Online community, 0.008991235867142677), (Community service, 0.00408124877139926), (Community settlement, 0.0010817223228514194),..."


We can write functions over the DataFrame to help with error analysis. For instance, to get all incorrect examples, we keep only the rows where the gold and predicted QIDs differ. 

In [12]:
pred_df[pred_df['gold_qid'] != pred_df['pred_qid']].sample(10)

Unnamed: 0,sentence,sent_idx,aliases,span,slices,alias,alias_idx,is_gold_label,gold_qid,pred_qid,gold_title,pred_title,all_gold_qids,all_pred_qids,gold_label_aliases,all_is_gold_labels,all_spans,cands
367,Jones is a former NASA astronaut and he would like to see more surface space expeditions go beyond the pure scientific focus .,247,"[jones_247, nasa_247]","[0, 1]",[],jones_247,0,True,Q1338348,Q6685569,Thomas David Jones,Lou Jones (photographer),"[Q1338348, Q23548]","[Q6685569, Q23548]","[jones_247, nasa_247]","[True, True]","[[0, 1], [4, 5]]","[(David Bowie, 0.06754890829324722), (Quincy Jones, 0.23945654928684235), (Nas, 4.920115770801203e-06), (Tom Jones (singer), 1.8179074686486274e-05), (George Jones, 7.5694770202971995e-06), (Thomas David Jones, 6.457734116338543e-07), (Robert Thomas Jones (engineer), 0.6451838612556458), (Edith Jones Woodward, 2.2745459204998042e-07), (Erick Jones, 2.4679440684849396e-05), (Doug Jones (politician), 3.364757276358432e-06), (Leslie Ann Jones, 0.047687314450740814), (Lou Jones (photographer), 2..."
492,"US and British regulators have already fined Barclays , based in Britain , US $ 453 million for submitting false information between 2005 and 2009 to keep the interest rate , known as LIBOR , low .",334,"[barclays_334, britain_334]","[11, 12]",[],britain_334,1,True,Q23666,Q145,Great Britain,United Kingdom,"[Q245343, Q23666]","[Q245343, Q145]","[barclays_334, britain_334]","[True, True]","[[7, 8], [11, 12]]","[(United Kingdom, 0.0009880485013127327), (England, 7.412957074848237e-07), (British Army, 4.996095412934665e-06), (Great Britain, 6.308083811745746e-06), (British Empire, 2.0691617464763112e-05), (United Kingdom of Great Britain and Ireland, 4.2945397581206635e-05), (Battle of Britain, 6.342497727018781e-08), (Prehistoric Britain, 3.280010059825145e-05), (Sub-Roman Britain, 0.00016561544907744974), (Britain Yearly Meeting, 3.361865594797564e-07), (Britain (place name), 6.024898198120354e-07..."
298,"Heard was taken to Athens Regional and is in critical condition , Wright said on Friday .",202,[athens regional_202],"[4, 6]",[],athens regional_202,0,True,Q14943167,Q203263,Piedmont Athens Regional,"Athens, Georgia",[Q14943167],[Q203263],[athens regional_202],[True],"[[4, 6]]","[(Greece, 0.06483016163110733), (Athens, 0.3854326605796814), (Regions of France, 0.0032996931113302708), (2004 Summer Olympics, 0.0012258823262527585), (Yugoslavia, 0.002348853973671794), (Athens, Ohio, 0.00023093065829016268), (Athens, Georgia, 0.010352338664233685), (South Vietnamese Regional Force, 0.0052626910619437695), (2015 Spanish regional elections, 0.00024867753381840885), (2011 Spanish regional elections, 0.03639393672347069), (1987 Spanish regional elections, 0.00064695393666625..."
476,"Apo Ville is lead by Former BIMP-EAGA Mountain Bike champion Hilario Ladra leads Apo Ville , while Ernesto Sagarino Jr. and Tuts Oledan banner Cycle Line .",324,[bimp eaga mountain bike_324],"[6, 9]",[],bimp eaga mountain bike_324,0,True,Q4835688,Q223705,BIMP-EAGA,Mountain bike,[Q4835688],[Q223705],[bimp eaga mountain bike_324],[True],"[[6, 9]]","[(Mountain biking, 0.16030064225196838), (Mountain bike, 0.39319589734077454), (Mountain bike racing, 0.12068700045347214), (BIMP-EAGA, 0.3258163630962372)]"
112,"The crash happened before 7:30 a.m. and affected field offices statewide until the systems were restored around noon , said DMV spokeswoman Jan Mendoza .",79,[dmv_79],"[20, 21]",[],dmv_79,0,True,Q539809,Q5020431,Department of Motor Vehicles,California Department of Motor Vehicles,[Q539809],[Q5020431],[dmv_79],[True],"[[20, 21]]","[(Washington metropolitan area, 1.7656215277384035e-05), (Deserted medieval village, 6.527765071950853e-05), (Department of Motor Vehicles, 0.13297291100025177), (German Mathematical Society, 0.0005758104380220175), (California Department of Motor Vehicles, 0.0036069792695343494), (DMV, 0.0002981254365295172), (DMV (song), 0.002292986959218979), (Demographics of Metro Vancouver, 0.059231553226709366), (German Metal Workers' Union, 0.007560214027762413), (Deutscher Motorsport Verband, 0.00871..."
362,"Germany 's Julia Goerges won in straight sets over Israel 's Shahar Peer 6-3 , 6-3 and Sweden 's Johanna Larsson beat Australia 's Casey Dellacqua 6-1 , 6-3 .",243,"[johanna larsson_243, australia_243]","[22, 23]",[],australia_243,1,True,Q229090,Q408,Casey Dellacqua,Australia,"[Q233328, Q229090]","[Q233328, Q408]","[johanna larsson_243, australia_243]","[True, True]","[[19, 21], [22, 23]]","[(Australia, 2.1821203972649528e-06), (Australian Football League, 2.4391276838287013e-06), (Victoria (Australia), 0.005510048475116491), (ARIA Charts, 4.273729246051516e-06), (Indigenous Australians, 8.468826854368672e-05), (History of the Jews in Australia, 2.752414729911834e-06), (History of Australia, 0.00020518254314083606), (Australia women's national soccer team, 0.0004091866430826485), (Australia at the 1968 Summer Paralympics, 9.441858128411695e-05), (Australia national soccer team,..."
113,DMV spokeswoman Jan Mendoza said some people might be rescheduled as staff works around the computer problem .,80,[dmv_80],"[0, 1]",[],dmv_80,0,True,Q539809,Q5256096,Department of Motor Vehicles,Demographics of Metro Vancouver,[Q539809],[Q5256096],[dmv_80],[True],"[[0, 1]]","[(Washington metropolitan area, 1.65029960044194e-05), (Deserted medieval village, 8.900241664377972e-05), (Department of Motor Vehicles, 0.6214582324028015), (German Mathematical Society, 0.004805940203368664), (California Department of Motor Vehicles, 0.01858581230044365), (DMV, 0.002977850381284952), (DMV (song), 0.013169702142477036), (Demographics of Metro Vancouver, 0.0021449122577905655), (German Metal Workers' Union, 0.020457305014133453), (Deutscher Motorsport Verband, 0.00688176322..."
413,Word had just come that Liddell died in February .,279,"[liddell_279, february_279]","[5, 6]",[],liddell_279,0,True,Q317422,Q234185,Eric Liddell,Alice Liddell,"[Q317422, Q109]","[Q234185, Q119524]","[liddell_279, february_279]","[True, True]","[[5, 6], [8, 9]]","[(Chuck Liddell, 0.020998913794755936), (B. H. Liddell Hart, 0.00990214291960001), (Alice Liddell, 0.011983256787061691), (Samuel Liddell MacGregor Mathers, 0.05489728972315788), (Eric Liddell, 0.12741108238697052), (Alan Liddell, 0.12432460486888885), (Moses J. Liddell, 0.004756293259561062), (Howard Liddell (architect), 0.005191093776375055), (Henry Liddell, 0.00783606432378292), (Billy Liddell, 0.060541559010744095), (Guy Liddell, 0.012239385396242142), (Elizabeth Liddell, 0.0823413431644..."
331,And check out In-Game editor Todd Kenreck 's review above of `` LEGO Star Wars III : The Clone Wars 3DS . ``,224,[in game_224],"[3, 4]",[],in game_224,0,True,Q444835,Q16231990,Virtual world,Michael Todd (video game developer),[Q444835],[Q16231990],[in game_224],[True],"[[3, 4]]","[(Game show, 0.7880452871322632), (Game theory, 0.021614639088511467), (Video game, 5.0140402890974656e-05), (The Game (rapper), 0.0015548624796792865), (PC game, 0.006809609942138195), (Michael Todd (video game developer), 4.14163605455542e-06), (Game, 0.00014519596879836172), (Kris Aquino, 5.624170626106206e-06), (The Game (1997 film), 0.021411282941699028), (Effort (video game player), 0.0101732537150383), (Philip Game, 0.00023917501675896347), (Marion Game, 0.001507671200670302), (1963 N..."
490,Media resources Online Newsroom : denverartmuseum.org\/press Exhibition Website : vangoghdenver.com Facebook : facebook.com\/denverartmuseum Twitter : twitter.com\/denverartmuseum Visitor Information : visitdenver.com The Denver Art Museum is located on 13th Avenue between Broadway and Bannock Streets in downtown Denver .,333,[denver art museum_333],"[21, 24]",[],denver art museum_333,0,True,Q16554,Q1189960,Denver,Denver Art Museum,[Q16554],[Q1189960],[denver art museum_333],[True],"[[21, 24]]","[(Denver Art Museum, 1.0)]"
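
Building on the same filter, one could also summarize the errors instead of sampling them. The sketch below counts the most frequent (gold title, predicted title) pairs; `top_confusions` is a hypothetical helper, and the toy DataFrame only mimics four of the columns described above, with values copied from rows shown in this tutorial.

```python
import pandas as pd


def top_confusions(df, n=5):
    """Count the most frequent (gold_title, pred_title) pairs among errors."""
    errors = df[df["gold_qid"] != df["pred_qid"]]
    return (
        errors.groupby(["gold_title", "pred_title"])
        .size()
        .sort_values(ascending=False)
        .head(n)
        .reset_index(name="count")
    )


# Toy stand-in for pred_df (an assumption, for illustration only).
toy = pd.DataFrame(
    {
        "gold_qid": ["Q539809", "Q539809", "Q8880", "Q49201"],
        "pred_qid": ["Q5020431", "Q5256096", "Q8880", "Q6106"],
        "gold_title": [
            "Department of Motor Vehicles",
            "Department of Motor Vehicles",
            "European Commission",
            "Portland, Maine",
        ],
        "pred_title": [
            "California Department of Motor Vehicles",
            "Demographics of Metro Vancouver",
            "European Commission",
            "Portland, Oregon",
        ],
    }
)

print("accuracy:", (toy["gold_qid"] == toy["pred_qid"]).mean())
print(top_confusions(toy))
```

On the real data, replace `toy` with `pred_df`.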


## 4. Repeat for KORE50

Use the following config to run eval for KORE50.

In [None]:
# set the model checkpoint path 
config_args.run_config.init_checkpoint = f'{root_dir}/models/bootleg_wiki/bootleg_model.pt'

# set the path for the entity db and candidate map
config_args.data_config.entity_dir = f'{root_dir}/data/wiki_entity_data'
config_args.data_config.alias_cand_map = 'alias2qids_kore50.json'

# set the data path and KORE50 test file 
config_args.data_config.data_dir = f'{root_dir}/data/kore50'
config_args.data_config.test_dataset.file = 'test_kore50.jsonl'

# set the embedding paths 
config_args.data_config.emb_dir =  f'{root_dir}/data/emb_data'
config_args.data_config.word_embedding.cache_dir =  f'{root_dir}/pretrained_bert_models'

# set the save directory 
config_args.run_config.save_dir = f'{root_dir}/results'

# set whether to run inference on the CPU
config_args.run_config.cpu = use_cpu

run.model_eval(args=config_args, mode="eval", logger=logger, is_writer=True)

2020-10-21 17:34:14,792 Loading entity_symbols...
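
The RSS500 and KORE50 configs in this tutorial differ only in three fields: the candidate map, the data directory, and the test file. A small helper can make switching benchmarks less error-prone. This is a sketch under assumptions: `point_config_at_benchmark` is a hypothetical function, and the `SimpleNamespace` object is only a stand-in for the real config so the example runs on its own.

```python
from types import SimpleNamespace


def point_config_at_benchmark(config_args, root_dir, name):
    """Set the benchmark-specific fields; name is 'rss500' or 'kore50'.

    File names follow the pattern used in this tutorial.
    """
    data_config = config_args.data_config
    data_config.alias_cand_map = f"alias2qids_{name}.json"
    data_config.data_dir = f"{root_dir}/data/{name}"
    data_config.test_dataset.file = f"test_{name}.jsonl"
    return config_args


# Stand-in config object (an assumption); in the notebook, pass the
# config_args returned by get_full_config instead.
demo_args = SimpleNamespace(
    data_config=SimpleNamespace(test_dataset=SimpleNamespace())
)
point_config_at_benchmark(demo_args, "/path/to/bootleg", "kore50")
print(demo_args.data_config.data_dir)
print(demo_args.data_config.test_dataset.file)
```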
