<span style="color:red; font-family:Helvetica Neue, Helvetica, Arial, sans-serif; font-size:2em;">An Exception was encountered at 'In [12]'.</span>

# Downloads Publication Information for PANGO Lineages from the CORD-19 Data Set
**[Work in progress]**

This notebook text-mines [PANGO lineage](https://cov-lineages.org/) mentions in the titles and abstracts of publications and preprints from the CORD-19 data set. Note that the text-mined results may contain false positives.

Data sources: [PANGO Lineage Designations](https://github.com/cov-lineages/pango-designation), 
[CORD-19](https://allenai.org/data/cord-19)

References:

Rambaut A, et al., A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology (2020) Nature Microbiology [doi:10.1038/s41564-020-0770-5](https://doi.org/10.1038/s41564-020-0770-5).

Lucy Lu Wang, et al., CORD-19: The COVID-19 Open Research Dataset (2020) [arXiv:2004.10706v4](https://arxiv.org/abs/2004.10706).

Author: Peter Rose (pwrose@ucsd.edu)

In [1]:
import os
import pandas as pd
import io
import dateutil
import re
from pathlib import Path
import nltk
import json, requests
from urllib.request import urlopen
from xml.etree.ElementTree import parse
import urllib
import time
import numpy as np

In [2]:
# PANGO lineage names with two, three, or four dot-separated levels (e.g., B.1, B.1.1, B.1.1.7)
pattern1 = re.compile(r' [A-Z]{1,2}[.]\d+ ', re.IGNORECASE)
pattern2 = re.compile(r' [A-Z]{1,2}[.]\d+[.]\d+ ', re.IGNORECASE)
pattern3 = re.compile(r' [A-Z]{1,2}[.]\d+[.]\d+[.]\d+ ', re.IGNORECASE)

# add WHO variant labels
who_lineage = [' Alpha ', ' Beta ', ' Gamma ', ' Epsilon ', ' Zeta ', ' Eta ', ' Theta ',
               ' Iota ', ' Kappa ', ' Lambda ', ' Mu ']
pattern4 = re.compile("|".join(who_lineage), re.IGNORECASE)
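
As a quick, hedged illustration of what these patterns capture, the sketch below applies raw-string equivalents of `pattern1` and `pattern2` to a made-up sentence. The sample text and the `demo_` names are assumptions for illustration only; they are not part of the notebook's data.

```python
import re

# Raw-string equivalents of the two- and three-level patterns above.
demo_pattern1 = re.compile(r' [A-Z]{1,2}[.]\d+ ', re.IGNORECASE)
demo_pattern2 = re.compile(r' [A-Z]{1,2}[.]\d+[.]\d+ ', re.IGNORECASE)

# Made-up sentence, padded with spaces because each pattern
# requires a space on both sides of the lineage name.
demo_text = ' Cases of B.1.351 and P.1 rose while AY.4 spread (B.1.617.2). '

demo_hits = [m.strip()
             for m in demo_pattern1.findall(demo_text) + demo_pattern2.findall(demo_text)]
print(demo_hits)
```

The mention in parentheses is not captured here because it is not surrounded by spaces, which is why punctuation such as `(`, `)`, `/`, and `,` has to be replaced by spaces before matching full-text sentences.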

In [3]:
gg = pd.read_csv('lineages')

In [4]:
lineages = gg.iloc[:,0].to_list()

In [5]:
def get_lineages(row):
    text = ' ' + row.title + ' ' + row.abstract + ' '
    lin = pattern1.findall(text) + pattern2.findall(text) + pattern3.findall(text)
    u_lin = set()
    
    
    for l in lin:
        l = l.strip()
        # check if lineage is valid (e.g., not a withdrawn lineage or false positive)
        if l in lineages:
            u_lin.add(l)
            
    return ";".join(u_lin)
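
The validation step can be tried out on a single record. This is a minimal sketch under stated assumptions: `DemoRow`, `demo_valid`, `demo_pattern` (one combined pattern standing in for `pattern1` to `pattern3`), and the sample title and abstract are all hypothetical. Unlike `get_lineages`, it sorts the result so the output order is deterministic.

```python
import re
from collections import namedtuple

# Hypothetical combined pattern: a 1-2 letter prefix and 1-3 numeric levels.
demo_pattern = re.compile(r' [A-Z]{1,2}[.]\d+(?:[.]\d+){0,2} ', re.IGNORECASE)

# Hypothetical stand-ins for a publication row and the designated-lineage list.
DemoRow = namedtuple('DemoRow', ['title', 'abstract'])
demo_valid = {'B.1.1.7', 'B.1.351', 'P.1'}


def demo_get_lineages(row, valid):
    """Return the valid lineage names mentioned in the title or abstract."""
    text = ' ' + row.title + ' ' + row.abstract + ' '
    found = {m.strip() for m in demo_pattern.findall(text)}
    # 'No.5' matches the pattern but is dropped because it is not a designated lineage.
    return ';'.join(sorted(found & valid))


demo_row = DemoRow(title='Spread of B.1.1.7 in region No.5',
                   abstract='We compare B.1.1.7 with P.1 and B.1.351 in travelers.')
demo_result = demo_get_lineages(demo_row, demo_valid)
print(demo_result)
```

Holding the valid names in a set rather than a list makes each membership check constant time, which may matter when many records are scanned.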

In [6]:
# download articles in XML and return body paragraph
def download_article(article_id):
    url = f'https://www.ebi.ac.uk/europepmc/webservices/rest/{article_id}/fullTextXML'
    xmldoc = parse(urlopen(url))
    
    # get full text
    root = xmldoc.getroot()
    text = root.findall('.//p')

    # put body paragraphs together
    ptext = ""
    for p in text:
        ptext += ''.join([x for x in p.itertext()]) + '.\n' + '\n'
    return ptext
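
The paragraph extraction can be checked offline, without calling the web service. The sketch below parses a small made-up XML string (an assumption, not a real Europe PMC response) with the same `.//p` query and `itertext()` call.

```python
from xml.etree.ElementTree import fromstring

# Made-up document; real full-text XML is far larger and more deeply nested.
demo_xml = """<article><body>
<sec><p>The <italic>B.1.1.7</italic> lineage spread quickly</p></sec>
<sec><p>Cases declined later</p></sec>
</body></article>"""

demo_root = fromstring(demo_xml)

# itertext() also yields the text inside nested tags such as <italic>.
demo_paragraphs = [''.join(p.itertext()) for p in demo_root.findall('.//p')]

# Join paragraphs the same way download_article does: a period and a blank line after each.
demo_ptext = ''.join(par + '.\n\n' for par in demo_paragraphs)
print(demo_ptext)
```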

In [7]:
# get lineage for full texts
def get_full_lineage(ptext):
    # tokenize texts into sentences
    p_sentence = nltk.tokenize.sent_tokenize(ptext)
    
    # record lineages
    pair = []
    for s in p_sentence:
        s1 = re.sub('[()/,]', ' ', s) # replace special chars with spaces
        lin = set(pattern1.findall(s1) + pattern2.findall(s1) + pattern3.findall(s1) + pattern4.findall(s1))

        for l in lin:
            l = l.strip()
            # PANGO names are upper case (e.g., AY.4); WHO labels are capitalized (e.g., Alpha)
            l = l.upper() if '.' in l else l.capitalize()
            # keep only valid lineages
            if l in lineages:
                pair.append([l, s])
    return pair
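
The sentence-level pairing can be sketched without NLTK. Everything named `demo_` below is hypothetical: a naive regex sentence splitter stands in for `sent_tokenize`, and one combined pattern stands in for `pattern1` to `pattern4`. Upper-casing dotted names and capitalizing WHO labels is also an assumption of this sketch; note that `str.capitalize()` lower-cases everything after the first character, so it would turn `ay.4` into `Ay.4`.

```python
import re

# Hypothetical combined pattern: dotted PANGO names plus two WHO labels.
demo_lineage_re = re.compile(
    r' [A-Z]{1,2}[.]\d+(?:[.]\d+){0,2} | Alpha | Beta ', re.IGNORECASE)


def demo_normalize(mention):
    """Upper-case dotted PANGO names; capitalize WHO labels."""
    mention = mention.strip()
    return mention.upper() if '.' in mention else mention.capitalize()


def demo_pairs(text, valid):
    """Return [lineage, sentence] pairs for every valid mention."""
    pairs = []
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    for sentence in re.split(r'(?<=[.!?])\s+', text.strip()):
        # Replace punctuation with spaces and pad, since the pattern needs surrounding spaces.
        cleaned = ' ' + re.sub(r'[()/,]', ' ', sentence) + ' '
        for mention in sorted(set(demo_lineage_re.findall(cleaned))):
            name = demo_normalize(mention)
            if name in valid:
                pairs.append([name, sentence])
    return pairs


demo_valid = {'AY.4', 'B.1.1.7', 'Alpha'}
demo_text = 'The alpha variant (B.1.1.7) was dominant. Later ay.4 took over in many regions.'
demo_found = demo_pairs(demo_text, demo_valid)
for name, sentence in demo_found:
    print(name, '|', sentence)
```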

In [8]:
def pub_mentions_lin(article_id, real_id):
    body_text = download_article(article_id) # get body text
    record = get_full_lineage(body_text) # extract lineages in text
    for x in record:
        x.append(real_id) # attach article id to each lineage record
    df = pd.DataFrame(record)
    if record:
        df.columns = ['lineage', 'string', 'ID']
        df = df[['ID','lineage','string']]
    return df

In [9]:
def run_pipeline(N, pub):
    results = []
    for i in range(N):
        article = pub.iloc[i]
        article_id = article.pmcId.split(":")[1]
        real_id = article.id
        print(f'start article {i}')
        if i%100 == 0:
            print(f'{i}/{N}')
            
        try:
            results.append(pub_mentions_lin(article_id, real_id))
        except urllib.error.HTTPError as exc:
            time.sleep(5) # wait 5 seconds, then skip this article (the request is not retried)
            continue
    return pd.concat(results)
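
If failed downloads should be retried rather than skipped, one option is a small retry helper. This is a sketch under stated assumptions, not part of the notebook: `demo_fetch_with_retry`, its parameters, and the `demo_flaky` stand-in for a download function are all hypothetical. The `sleep` argument is injectable so the example runs instantly.

```python
import time
import urllib.error


def demo_fetch_with_retry(fetch, article_id, retries=3, wait_seconds=5, sleep=time.sleep):
    """Call fetch(article_id); on HTTPError wait and try again, up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            return fetch(article_id)
        except urllib.error.HTTPError:
            if attempt == retries:
                return None  # give up; the caller can skip this article
            sleep(wait_seconds)


# Hypothetical fetch function that fails twice, then succeeds.
demo_calls = []


def demo_flaky(article_id):
    demo_calls.append(article_id)
    if len(demo_calls) < 3:
        raise urllib.error.HTTPError('http://example.invalid', 503, 'busy', None, None)
    return 'body text'


demo_body = demo_fetch_with_retry(demo_flaky, 'PMC0000000', sleep=lambda seconds: None)
print(demo_body, len(demo_calls))
```

In the pipeline, a function like `download_article` could be passed as `fetch`; returning `None` after the last attempt lets the loop skip an article that stays unavailable.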

In [10]:
pub = pd.read_csv("Publication_1120.csv")
N = pub.size # caution: DataFrame.size counts cells (rows x columns); len(pub) is the number of articles


In [11]:
N

17360

In [12]:
ans = run_pipeline(N, pub)
ans.columns = ['from', 'to', 'evidence'] # flat list; a nested list would create a MultiIndex
ans.to_csv('Publication-MENTIONS-Lineage.csv',index=False)

start article 0
0/17360


start article 1
[... identical "start article N" lines for articles 2 through 1449 omitted; a counter line of the form "N/17360" follows every 100th article ...]
start article 1450


start article 1451


start article 1452


start article 1453


start article 1454


start article 1455


start article 1456


start article 1457


start article 1458


start article 1459


start article 1460


start article 1461


start article 1462


start article 1463


start article 1464


start article 1465


start article 1466


start article 1467


start article 1468


start article 1469


start article 1470


start article 1471


start article 1472


start article 1473


start article 1474


start article 1475


start article 1476


start article 1477


start article 1478


start article 1479


start article 1480


start article 1481


start article 1482


start article 1483


start article 1484


start article 1485


start article 1486


start article 1487


start article 1488


start article 1489


start article 1490


start article 1491


start article 1492


start article 1493


start article 1494


start article 1495


start article 1496


start article 1497


start article 1498


start article 1499


start article 1500
1500/17360


start article 1501


start article 1502


start article 1503


start article 1504


start article 1505


start article 1506


start article 1507


start article 1508


start article 1509


start article 1510


start article 1511


start article 1512


start article 1513


start article 1514


start article 1515


start article 1516


start article 1517


start article 1518


start article 1519


start article 1520


start article 1521


start article 1522


start article 1523


start article 1524


start article 1525


start article 1526


start article 1527


start article 1528


start article 1529


start article 1530


start article 1531


start article 1532


start article 1533


start article 1534


start article 1535


start article 1536


start article 1537


start article 1538


start article 1539


start article 1540


start article 1541


start article 1542


start article 1543


start article 1544


start article 1545


start article 1546


start article 1547


start article 1548


start article 1549


start article 1550


start article 1551


start article 1552


start article 1553


start article 1554


start article 1555


start article 1556


start article 1557


start article 1558


start article 1559


start article 1560


start article 1561


start article 1562


start article 1563


start article 1564


start article 1565


start article 1566


start article 1567


start article 1568


start article 1569


start article 1570


start article 1571


start article 1572


start article 1573


start article 1574


start article 1575


start article 1576


start article 1577


start article 1578


start article 1579


start article 1580


start article 1581


start article 1582


start article 1583


start article 1584


start article 1585


start article 1586


start article 1587


start article 1588


start article 1589


start article 1590


start article 1591


start article 1592


start article 1593


start article 1594


start article 1595


start article 1596


start article 1597


start article 1598


start article 1599


start article 1600
1600/17360


start article 1601


start article 1602


start article 1603


start article 1604


start article 1605


start article 1606


start article 1607


start article 1608


start article 1609


start article 1610


start article 1611


start article 1612


start article 1613


start article 1614


start article 1615


start article 1616


start article 1617


start article 1618


start article 1619


start article 1620


start article 1621


start article 1622


start article 1623


start article 1624


start article 1625


start article 1626


start article 1627


start article 1628


start article 1629


start article 1630


start article 1631


start article 1632


start article 1633


start article 1634


start article 1635


start article 1636


start article 1637


start article 1638


start article 1639


start article 1640


start article 1641


start article 1642


start article 1643


start article 1644


start article 1645


start article 1646


start article 1647


start article 1648


start article 1649


start article 1650


start article 1651


start article 1652


start article 1653


start article 1654


start article 1655


start article 1656


start article 1657


start article 1658


start article 1659


start article 1660


start article 1661


start article 1662


start article 1663


start article 1664


start article 1665


start article 1666


start article 1667


start article 1668


start article 1669


start article 1670


start article 1671


start article 1672


start article 1673


start article 1674


start article 1675


start article 1676


start article 1677


start article 1678


start article 1679


start article 1680


start article 1681


start article 1682


start article 1683


start article 1684


start article 1685


start article 1686


start article 1687


start article 1688


start article 1689


start article 1690


start article 1691


start article 1692


start article 1693


start article 1694


start article 1695


start article 1696


start article 1697


start article 1698


start article 1699


start article 1700
1700/17360


start article 1701


start article 1702


start article 1703


start article 1704


start article 1705


start article 1706


start article 1707


start article 1708


start article 1709


start article 1710


start article 1711


start article 1712


start article 1713


start article 1714


start article 1715


start article 1716


start article 1717


start article 1718


start article 1719


start article 1720


start article 1721


start article 1722


start article 1723


start article 1724


start article 1725


start article 1726


start article 1727


start article 1728


start article 1729


start article 1730


start article 1731


start article 1732


start article 1733


start article 1734


start article 1735


IndexError: single positional indexer is out-of-bounds

In [None]:
"""results = []
for i in range(144, N):
    article = pub.iloc[i]
    article_id = article.pmcId.split(":")[1]
    real_id = article.id
    print(f'start article {i}')
    if i%100 == 0:
        print(f'{i}/{N}')

    try:
        results.append(pub_mentions_lin(article_id, real_id))
    except urllib.error.HTTPError as exc:
        time.sleep(5) # wait 5 seconds, then skip this article (it is not retried)
        continue"""
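
The run above stopped with an `IndexError` right after `start article 1735`, while the progress counter reported a total of 17360. This suggests the loop bound was larger than the number of rows available to `iloc`. In addition, the `continue` after `time.sleep(5)` moves on to the next article instead of repeating the failed request. The following is a minimal sketch of one way to guard against both, not the notebook's actual code: `run_with_retry` and its parameters are hypothetical names, and `fetch` stands in for the per-article work (for example, reading `pub.iloc[i]` and calling `pub_mentions_lin`).

```python
import time
import urllib.error


def run_with_retry(n_requested, n_available, fetch, start=0, retries=3, delay=5.0):
    """Call fetch(i) for every valid index, retrying on HTTP errors.

    Hypothetical helper: `fetch` is an assumed callable that does the
    per-article work and returns its result.
    """
    results, failed = [], []
    # Never iterate past the rows that actually exist.
    stop = min(n_requested, n_available)
    for i in range(start, stop):
        for _attempt in range(retries):
            try:
                results.append(fetch(i))
                break
            except urllib.error.HTTPError:
                # Back off, then try the same index again.
                time.sleep(delay)
        else:
            # Every attempt failed; record the index and move on.
            failed.append(i)
    return results, failed
```

Under these assumptions, a call might look like `run_with_retry(N, len(pub), fetch_article, start=144)`, where `fetch_article` is a hypothetical wrapper around the loop body. Returning the failed indices separately makes it possible to rerun only those articles later.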

In [None]:
"""results_144_to_321[:3]"""

In [None]:
#results_144_to_321 = results

In [None]:
# partial results 

#results_0_to_144 = results

## Fulltext Regex
This part is removed when generating the knowledge graph data.