# Data Acquisition Part 2 - Web Scraping

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Imports-and-Defaults" data-toc-modified-id="Imports-and-Defaults-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Imports and Defaults</a></span></li><li><span><a href="#Loading-Search-Results" data-toc-modified-id="Loading-Search-Results-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Loading Search Results</a></span></li><li><span><a href="#Data-Prep" data-toc-modified-id="Data-Prep-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Data Prep</a></span><ul class="toc-item"><li><span><a href="#Resetting-very-short-abstracts" data-toc-modified-id="Resetting-very-short-abstracts-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Resetting very short abstracts</a></span></li></ul></li><li><span><a href="#Building-Dataframe-of-Unique-Records" data-toc-modified-id="Building-Dataframe-of-Unique-Records-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Building Dataframe of Unique Records</a></span><ul class="toc-item"><li><span><a href="#Records-that-have-DOI-and-abstract" data-toc-modified-id="Records-that-have-DOI-and-abstract-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>Records that have DOI and abstract</a></span><ul class="toc-item"><li><span><a href="#Add-unique-doi-abstract-combos-from-each-data-source" data-toc-modified-id="Add-unique-doi-abstract-combos-from-each-data-source-4.1.1"><span class="toc-item-num">4.1.1&nbsp;&nbsp;</span>Add unique doi-abstract combos from each data source</a></span></li><li><span><a href="#Extra-prep" data-toc-modified-id="Extra-prep-4.1.2"><span class="toc-item-num">4.1.2&nbsp;&nbsp;</span>Extra prep</a></span></li></ul></li><li><span><a href="#Records-that-have-DOI-but-no-abstract" data-toc-modified-id="Records-that-have-DOI-but-no-abstract-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>Records that have DOI but no abstract</a></span></li><li><span><a href="#Saving-New-Dataframes" data-toc-modified-id="Saving-New-Dataframes-4.3"><span class="toc-item-num">4.3&nbsp;&nbsp;</span>Saving New Dataframes</a></span></li></ul></li><li><span><a href="#Scraping" data-toc-modified-id="Scraping-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Scraping</a></span><ul class="toc-item"><li><span><a href="#Formatting-Vagaries" data-toc-modified-id="Formatting-Vagaries-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>Formatting Vagaries</a></span></li><li><span><a href="#Scraping-Function" data-toc-modified-id="Scraping-Function-5.2"><span class="toc-item-num">5.2&nbsp;&nbsp;</span>Scraping Function</a></span></li><li><span><a href="#Sample-for-testing" data-toc-modified-id="Sample-for-testing-5.3"><span class="toc-item-num">5.3&nbsp;&nbsp;</span>Sample for testing</a></span></li></ul></li><li><span><a href="#Sandbox" data-toc-modified-id="Sandbox-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Sandbox</a></span><ul class="toc-item"><li><span><a href="#Testing" data-toc-modified-id="Testing-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>Testing</a></span></li><li><span><a href="#Testing-for-cureus" data-toc-modified-id="Testing-for-cureus-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>Testing for cureus</a></span></li></ul></li></ul></div>

## Imports and Defaults

In [1]:
from IPython.core.display import display, HTML, Markdown as md
display(HTML("""<style>.container { width:75% !important; } p, ul {max-width: 40em;} .rendered_html table { margin-left: 0; } .output_subarea.output_png { display: flex; justify-content: center;}</style>"""))


In [2]:
# Basics
import numpy as np 
import pandas as pd 

#String cleaning and processing
import re
import string

import os
import sys

In [3]:
# Visualisation
import seaborn as sns
import matplotlib.pyplot as plt
from pylab import rcParams

# import matplotlib.patches as mpatches
# from matplotlib.lines import Line2D
# import plotly.express as px

%matplotlib inline
rcParams['figure.figsize'] = 15, 10
rcParams['axes.titlesize'] = 20
rcParams['axes.labelsize'] = 'large'
rcParams['xtick.labelsize'] = 10
rcParams['ytick.labelsize'] = 10
rcParams['lines.linewidth'] = 2
rcParams['font.size'] = 18

#Visualising Plutchik Model of Emotio

In [4]:
from langdetect import detect

## Loading Search Results

In [5]:
gscholar = pd.read_csv('gscholarFiltered.csv', index_col=0)
scopus = pd.read_csv('scopusFiltered.csv', index_col=0)
pubmed = pd.read_csv('pubmedFiltered.csv', index_col=0)
crossref = pd.read_csv('crossrefFiltered.csv', index_col=0)

In [6]:
gscholar.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10770 entries, 0 to 11967
Data columns (total 12 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   publication     9337 non-null   object 
 1   title           10770 non-null  object 
 2   authors         10770 non-null  object 
 3   doi             2347 non-null   object 
 4   year            10144 non-null  float64
 5   cites           10770 non-null  int64  
 6   type            4082 non-null   object 
 7   abstract        10640 non-null  object 
 8   article_url     10760 non-null  object 
 9   fulltext_url    6669 non-null   object 
 10  abstractLength  10640 non-null  float64
 11  gscholar        10770 non-null  int64  
dtypes: float64(2), int64(2), object(8)
memory usage: 1.1+ MB


In [7]:
scopus.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1797 entries, 4 to 2399
Data columns (total 12 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   publication     1797 non-null   object 
 1   title           1797 non-null   object 
 2   authors         1796 non-null   object 
 3   doi             1793 non-null   object 
 4   year            1797 non-null   int64  
 5   cites           1797 non-null   int64  
 6   type            1796 non-null   object 
 7   abstract        0 non-null      float64
 8   article_url     359 non-null    object 
 9   fulltext_url    0 non-null      float64
 10  abstractLength  0 non-null      float64
 11  scopus          1797 non-null   int64  
dtypes: float64(3), int64(3), object(6)
memory usage: 182.5+ KB


In [8]:
pubmed.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3300 entries, 0 to 4496
Data columns (total 12 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   publication     3300 non-null   object 
 1   title           3299 non-null   object 
 2   authors         3287 non-null   object 
 3   doi             3159 non-null   object 
 4   year            3297 non-null   float64
 5   cites           3300 non-null   int64  
 6   type            3300 non-null   object 
 7   abstract        3162 non-null   object 
 8   article_url     0 non-null      float64
 9   fulltext_url    0 non-null      float64
 10  abstractLength  3162 non-null   float64
 11  pubmed          3300 non-null   int64  
dtypes: float64(4), int64(2), object(6)
memory usage: 335.2+ KB


In [9]:
crossref.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3422 entries, 4 to 3999
Data columns (total 12 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   publication     3422 non-null   object 
 1   title           3422 non-null   object 
 2   authors         3313 non-null   object 
 3   doi             3422 non-null   object 
 4   year            3422 non-null   int64  
 5   cites           3422 non-null   int64  
 6   type            3422 non-null   object 
 7   abstract        591 non-null    object 
 8   article_url     3422 non-null   object 
 9   fulltext_url    2968 non-null   object 
 10  abstractLength  591 non-null    float64
 11  crossref        3422 non-null   int64  
dtypes: float64(1), int64(3), object(8)
memory usage: 347.5+ KB


In [10]:
scopus.sample(10)

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,scopus
1599,PLoS ONE,Molecular detection of Borrelia burgdorferi se...,"[{'name': 'M. Lager', 'affiliation': 'Linköpin...",10.1371/journal.pone.0185434,2017,10,Article,,,,,1
701,BMC Neurology,Cerebrospinal fluid CXCL13 in Lyme neuroborrel...,"[{'name': 'D. Bremell', 'affiliation': 'Götebo...",10.1186/1471-2377-13-2,2013,39,Article,,,,,1
1332,PLoS ONE,In vivo imaging demonstrates that borrelia bur...,"[{'name': 'J.T. Skare', 'affiliation': 'Texas ...",10.1371/journal.pone.0162501,2016,18,Article,,,,,1
816,Journal of Medical Entomology,"The relationship between deer density, tick ab...","[{'name': 'H. Kilpatrick', 'affiliation': 'Wil...",10.1603/ME13232,2014,79,Article,,,,,1
1306,Infection and Immunity,"HtrA, a temperature- and stationary phase-acti...","[{'name': 'M. Ye', 'affiliation': 'Indiana Uni...",10.1128/IAI.00360-16,2016,22,Article,,,,,1
417,Annual Review of Genetics,Genetics of Borrelia burgdorferi,"[{'name': 'D. Brisson', 'affiliation': 'Univer...",10.1146/annurev-genet-011112-112140,2012,108,Article,,,,,1
1034,Infection and Immunity,Cyclic di-GMP modulates gene expression in lym...,"[{'name': 'M. Caimano', 'affiliation': 'UConn ...",10.1128/IAI.00315-15,2015,48,Article,,,,,1
1694,Journal of medical entomology,Ixodes scapularis (Acari: Ixodidae) Reservoir ...,"[{'name': 'M. Linske', 'affiliation': 'Connect...",10.1093/jme/tjx237,2018,14,Article,,,,,1
2259,Frontiers in Neurology,Detecting Borrelia Spirochetes: A Case Study W...,"[{'name': 'S.K.G. Gadila', 'affiliation': 'Tul...",10.3389/fneur.2021.628045,2021,1,Article,,,,,1
714,"Infection, Genetics and Evolution",Spatial spread and demographic expansion of Ly...,"[{'name': 'S.A. Vollmer', 'affiliation': 'Univ...",10.1016/j.meegid.2012.11.014,2013,36,Article,,https://api.elsevier.com/content/article/eid/1...,,,1


In [11]:
pubmed[pubmed.publication=='Ticks and tick-borne diseases']

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,pubmed
0,Ticks and tick-borne diseases,The range of Ixodes ricinus and the risk of co...,"[{'name': 'Thomas G T Jaenson', 'affiliation':...",10.1016/j.ttbdis.2010.10.006,2011.0,0,Journal Article,"In Sweden, the geographical distribution of Ly...",,,1370.0,1
1,Ticks and tick-borne diseases,Low-density microarrays for the detection of B...,"[{'name': 'Julie A Houck', 'affiliation': 'Dep...",10.1016/j.ttbdis.2010.10.002,2011.0,0,Journal Article,Lyme disease is the most common tick-borne dis...,,,932.0,1
2,Ticks and tick-borne diseases,Infectivity of Borrelia burgdorferi sensu lato...,"[{'name': 'N D van Burgel', 'affiliation': 'De...",10.1016/j.ttbdis.2010.10.003,2011.0,0,Journal Article,"B. burgdorferi, B. afzelii, and B. bavariensis...",,,1428.0,1
4,Ticks and tick-borne diseases,Borrelia species in Ixodes affinis and Ixodes ...,"[{'name': 'Ricardo G Maggi', 'affiliation': 'I...",10.1016/j.ttbdis.2010.08.003,2010.0,0,Journal Article,Ixodes affinis and I. scapularis are tick spec...,,,1576.0,1
5,Ticks and tick-borne diseases,Are birds reservoir hosts for Borrelia afzelii?,"[{'name': 'Jan Franke', 'affiliation': 'Instit...",10.1016/j.ttbdis.2010.03.001,2010.0,0,Journal Article,It is known that birds are competent reservoir...,,,783.0,1
...,...,...,...,...,...,...,...,...,...,...,...,...
4448,Ticks and tick-borne diseases,"Experiences with tick exposure, Lyme disease, ...","[{'name': 'C C Nawrocki', 'affiliation': 'Oak ...",10.1016/j.ttbdis.2020.101605,2021.0,0,Journal Article,Consistent and effective use of personal preve...,,,1572.0,1
4458,Ticks and tick-borne diseases,Classification of patients referred under susp...,"[{'name': 'Rosa M M Gynthersen', 'affiliation'...",10.1016/j.ttbdis.2020.101591,2021.0,0,Journal Article,To provide better care for patients suspected ...,,,2736.0,1
4463,Ticks and tick-borne diseases,Spatial variability in prevalence and genospec...,"[{'name': 'Robert E Rollins', 'affiliation': '...",10.1016/j.ttbdis.2020.101589,2021.0,0,Journal Article,Lyme borreliosis (LB) is the most common arthr...,,,1543.0,1
4465,Ticks and tick-borne diseases,Borrelia miyamotoi strain LB-2001 retains plas...,"[{'name': 'Robert D Gilmore', 'affiliation': '...",10.1016/j.ttbdis.2020.101587,2021.0,0,Journal Article,Borrelia miyamotoi is a tick-borne spirochete ...,,,1118.0,1


## Data Prep

In [12]:
# Datasets already contain a dummy variable column flagging where they were obtained from. E.g pubmed['pubmed']=1
# This adds matching columns in preparation for concatenation
pubmed[['gscholar','crossref','scopus']]=0
gscholar[['pubmed','crossref','scopus']]=0
crossref[['pubmed','gscholar','scopus']]=0
scopus[['pubmed','gscholar','crossref']]=0

In [13]:
pubmed.drop_duplicates(inplace=True)
gscholar.drop_duplicates(inplace=True)
scopus.drop_duplicates(inplace=True)
crossref.drop_duplicates(inplace=True)

In [14]:
pubmed.reset_index(drop=True,inplace=True)
gscholar.reset_index(drop=True,inplace=True)
scopus.reset_index(drop=True,inplace=True)
crossref.reset_index(drop=True,inplace=True)

In [15]:
crossref.publication.value_counts()[:20]

Clinical Infectious Diseases                                       82
Ticks and Tick-borne Diseases                                      75
PLOS ONE                                                           46
Médecine et Maladies Infectieuses                                  43
Open Forum Infectious Diseases                                     41
Journal of Clinical Microbiology                                   38
Vector-Borne and Zoonotic Diseases                                 37
Emerging Infectious Diseases                                       34
Parasites & Vectors                                                33
Infection and Immunity                                             32
The American Journal of Medicine                                   32
Journal of Medical Entomology                                      32
PLoS ONE                                                           29
Diagnostic Microbiology and Infectious Disease                     27
Revue Francophone de

In [16]:
pubmed.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2992 entries, 0 to 2991
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   publication     2992 non-null   object 
 1   title           2991 non-null   object 
 2   authors         2979 non-null   object 
 3   doi             2851 non-null   object 
 4   year            2989 non-null   float64
 5   cites           2992 non-null   int64  
 6   type            2992 non-null   object 
 7   abstract        2861 non-null   object 
 8   article_url     0 non-null      float64
 9   fulltext_url    0 non-null      float64
 10  abstractLength  2861 non-null   float64
 11  pubmed          2992 non-null   int64  
 12  gscholar        2992 non-null   int64  
 13  crossref        2992 non-null   int64  
 14  scopus          2992 non-null   int64  
dtypes: float64(4), int64(5), object(6)
memory usage: 350.8+ KB


In [17]:
scopus.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1797 entries, 0 to 1796
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   publication     1797 non-null   object 
 1   title           1797 non-null   object 
 2   authors         1796 non-null   object 
 3   doi             1793 non-null   object 
 4   year            1797 non-null   int64  
 5   cites           1797 non-null   int64  
 6   type            1796 non-null   object 
 7   abstract        0 non-null      float64
 8   article_url     359 non-null    object 
 9   fulltext_url    0 non-null      float64
 10  abstractLength  0 non-null      float64
 11  scopus          1797 non-null   int64  
 12  pubmed          1797 non-null   int64  
 13  gscholar        1797 non-null   int64  
 14  crossref        1797 non-null   int64  
dtypes: float64(3), int64(6), object(6)
memory usage: 210.7+ KB


In [18]:
##Scopus contains no abstracts so remove any records without DOIs
scopus = scopus[(scopus.doi.notna())]

In [19]:
scopus.type.unique()

array(['Article', nan], dtype=object)

In [20]:
scopus[scopus.type.isna()]

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,scopus,pubmed,gscholar,crossref
1133,Pediatric Infectious Disease Journal,Cerebrospinal Fluid B-lymphocyte Chemoattracta...,"[{'name': 'B. Barstad', 'affiliation': 'Stavan...",10.1097/INF.0000000000001669,2017,17,,,,,,1,0,0,0


In [21]:
scopus[~scopus['doi'].isin(pubmed.doi)]

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,scopus,pubmed,gscholar,crossref
4,PLoS ONE,Vitamin C: Intravenous use by complementary an...,"[{'name': 'S. Padayatty', 'affiliation': 'Nati...",10.1371/journal.pone.0011414,2010,179,Article,,,,,1,0,0,0
7,American Journal of Medicine,Subjective Symptoms after Treatment of Early L...,"[{'name': 'D. Cerar', 'affiliation': 'Univerzi...",10.1016/j.amjmed.2009.05.011,2010,128,Article,,https://api.elsevier.com/content/article/eid/1...,,,1,0,0,0
8,Global Ecology and Biogeography,Field and climate-based model for predicting t...,"[{'name': 'M.A. Diuk-Wasser', 'affiliation': '...",10.1111/j.1466-8238.2010.00526.x,2010,123,Article,,,,,1,0,0,0
15,Experimental and Applied Acarology,Tick burden on European roe deer (Capreolus ca...,"[{'name': 'T. Vor', 'affiliation': 'Georg-Augu...",10.1007/s10493-010-9337-0,2010,75,Article,,,,,1,0,0,0
29,Journal of Medical Entomology,Survival of Ixodes ricinus (Acari: Ixodidae) u...,"[{'name': 'C. Herrmann', 'affiliation': 'Unive...",10.1603/ME10111,2010,60,Article,,,,,1,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1772,Neuro-Ophthalmology,Lyme Neuroborreliosis Presenting as Multiple C...,"[{'name': 'A. Sriram', 'affiliation': 'Albert ...",10.1080/01658107.2021.1951769,2021,0,Article,,,,,1,0,0,0
1775,Dermatologia Revista Mexicana,Lyme's disease,"[{'name': 'Z. Quijada-Ucelo', 'affiliation': '...",10.24245/dermatolrevmex.v65id.5442,2021,0,Article,,,,,1,0,0,0
1778,Journal of the American Veterinary Medical Ass...,Comparisons of hematologic results for juvenil...,"[{'name': 'K.S. KuKanich', 'affiliation': 'Kan...",10.2460/JAVMA.259.3.275,2021,0,Article,,,,,1,0,0,0
1782,Medycyna Pracy,The state of mental health in people with a di...,"[{'name': 'K. Staszewska', 'affiliation': 'Ins...",10.13075/mp.5893.01049,2021,0,Article,,,,,1,0,0,0


### Resetting very short abstracts

With the exception of one very short pubmed abstract, there are some abstracts that appear truncated. An arbitrary threshold of 300 characters is set with scopus and crossref abstracts short than this reset to null. Hopefully these will be properly populated by the scraping process.

In [22]:
for idx in crossref.index:
    if crossref.loc[idx].abstractLength<300:
        crossref.at[idx,'abstract'] = np.NaN
        crossref.at[idx,'abstractLength'] = np.NaN

In [23]:
for idx in scopus.index:
    if scopus.loc[idx].abstractLength<300:
        scopus.at[idx,'abstract'] = np.NaN
        scopus.at[idx,'abstractLength'] = np.NaN

## Building Dataframe of Unique Records

### Records that have DOI and abstract

In [24]:
len(pubmed[pubmed.doi.notna() & pubmed.abstract.notna()])

2720

Using Pubmed results as the starting point due to their relative completeness of data, we take the 2720 results with both a DOI and abstract and add any abstracts from the Scopus or Crossref results that also have a unique DOI. Using the DOI as a unique indentifier prevents adding duplicates to the new collated dataset as we build it. At this stage Google Scholar is put aside due to the inconsistency of results and incomplete data in some of the fields.

In [25]:
## Splitting Datasets by abstract and doi availability
crossrefToAdd = crossref[crossref.doi.notna() & crossref.abstract.notna()]
crossrefRemaining = crossref[~(crossref.doi.notna() & crossref.abstract.notna())]
gscholarToAdd = gscholar[gscholar.doi.notna() & gscholar.abstract.notna()]
gscholarRemaining = gscholar[~(gscholar.doi.notna() & gscholar.abstract.notna())]
scopusToAdd = scopus[scopus.doi.notna() & scopus.abstract.notna()]
scopusRemaining = scopus[~(scopus.doi.notna() & scopus.abstract.notna())]

#### Add unique doi-abstract combos from each data source

In [26]:
collated = pubmed[pubmed.doi.notna() & pubmed.abstract.notna()]

In [27]:
collated = pd.concat([collated, crossrefToAdd[~crossrefToAdd['doi'].isin(pubmed.doi)]])

In [28]:
collated = pd.concat([collated, scopusToAdd[~scopusToAdd['doi'].isin(collated.doi)]])

In [29]:
len(collated)

2988

#### Extra prep

In [30]:
collated.reset_index(drop = True, inplace=True)

In [31]:
collated[collated.abstract.str.lower().str.islower()==False] #Shows abstracts with no alpha numeric characters. Just punctuation

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,pubmed,gscholar,crossref,scopus


In [32]:
collated = collated[collated.abstract.str.lower().str.islower()] # Drop any abstracts with no alphanumeric charcters

In [33]:
collated['publication'] = collated.publication.str.lower() #standardise publication names

#add language of abstracts, titles and journal names
collated['publicationLanguage'] = collated.publication.apply(detect)
collated['titleLanguage'] = collated.title.apply(detect)
collated['abstractLanguage'] = collated.abstract.apply(detect)

In [34]:
len(collated[collated.publicationLanguage!='en'])

962

In [36]:
collated[collated.abstractLanguage!='en']

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,pubmed,gscholar,crossref,scopus,publicationLanguage,titleLanguage,abstractLanguage
2720,orvosi hetilap,Acute atrioventricular block in chronic Lyme d...,"['Vince Wagner', 'Endre Zima', 'László Gellér'...",10.1556/oh.2010.28965,2010.0,1,journal-article,A Lyme-kór az egyik leggyakoribb antropozoonos...,http://dx.doi.org/10.1556/oh.2010.28965,https://akjournals.com/view/journals/650/151/3...,1235.0,0,0,1,0,fi,en,hu
2729,acta médica costarricense,Enfermedad de Lyme (Borreliosis de Lyme) en Co...,['Ricardo Boza Cordero'],10.51481/amc.v53i1.702,2011.0,0,journal-article,La enfermedad de Lyme o borreliosis de Lyme es...,http://dx.doi.org/10.51481/amc.v53i1.702,http://actamedica.medicos.cr/index.php/Acta_Me...,1068.0,0,0,1,0,pt,es,es
2732,arthritis und rheuma,Lyme-Arthritis bei Kindern,"['H.-I. Huppertz', 'F. Dressler']",10.1055/s-0037-1618052,2011.0,1,journal-article,Zusammenfassung: Die Lyme-Arthritis ist durch ...,http://dx.doi.org/10.1055/s-0037-1618052,http://www.thieme-connect.de/products/ejournal...,348.0,0,0,1,0,en,de,de
2785,tidsskrift for islamforskning,Hizb’allahs råderum i transnationale shia isla...,['Rune Friberg Lyme'],10.7146/tifo.v5i1.25000,2016.0,0,journal-article,Lige siden Hizb’allah første gang fandt vej ti...,http://dx.doi.org/10.7146/tifo.v5i1.25000,https://tifoislam.dk/article/download/25000/21920,937.0,0,0,1,0,da,da,da
2789,kinder- und jugendmedizin,Kutane Lyme-Borreliose bei Kindern und Jugendl...,"['H.-I. Huppertz', 'H. Hofmann']",10.1055/s-0037-1616313,2016.0,0,journal-article,"Zusammenfassung: Die Lyme-Borreliose, übertrag...",http://dx.doi.org/10.1055/s-0037-1616313,http://www.thieme-connect.de/products/ejournal...,619.0,0,0,1,0,da,de,de
2825,annales academiae medicae silesiensis,Estimation of hearing impairment occurrence in...,"['Barbara Oczko-Grzesik', 'Grażyna Lisowska', ...",10.18794/aams/78399,2018.0,0,journal-article,Wstęp: Borelioza z Lyme (Lyme borreliosis – LB...,http://dx.doi.org/10.18794/aams/78399,https://annales.sum.edu.pl/pdf-78399-28569,1295.0,0,0,1,0,es,en,pl
2833,journal of the portuguese society of dermatolo...,Doença de Lyme: Epidemiologia e Manifestações ...,"['Pedro Miguel Garrido', 'João Borges-Costa']",10.29021/spdv.76.2.907,2018.0,0,journal-article,A doença de Lyme é a patologia transmitida pel...,http://dx.doi.org/10.29021/spdv.76.2.907,https://revista.spdv.com.pt/index.php/spdv/art...,1378.0,0,0,1,0,en,pt,pt
2909,aktuelle rheumatologie,Was führt zur Antibiotika-refraktären Lyme-Art...,,10.1055/a-1021-6468,2020.0,0,journal-article,"Nur wenige Faktoren sind bekannt, die bisher d...",http://dx.doi.org/10.1055/a-1021-6468,http://www.thieme-connect.de/products/ejournal...,416.0,0,0,1,0,et,de,de
2918,iatreia,Panuveítis asociada a la enfermedad de Lyme en...,"['Miguel Cuevas-Peláez', 'Alexandra Correa-Gar...",10.17533/udea.iatreia.46,2020.0,1,journal-article,La enfermedad de Lyme es una zoonosis transmit...,http://dx.doi.org/10.17533/udea.iatreia.46,https://revistas.udea.edu.co/index.php/iatreia...,1159.0,0,0,1,0,ro,es,es


In [37]:
#Drop non-english abstracts
collated = collated[collated.abstractLanguage=='en']

In [38]:
collated.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2979 entries, 0 to 2987
Data columns (total 18 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   publication          2979 non-null   object 
 1   title                2979 non-null   object 
 2   authors              2976 non-null   object 
 3   doi                  2979 non-null   object 
 4   year                 2979 non-null   float64
 5   cites                2979 non-null   int64  
 6   type                 2979 non-null   object 
 7   abstract             2979 non-null   object 
 8   article_url          259 non-null    object 
 9   fulltext_url         226 non-null    object 
 10  abstractLength       2979 non-null   float64
 11  pubmed               2979 non-null   int64  
 12  gscholar             2979 non-null   int64  
 13  crossref             2979 non-null   int64  
 14  scopus               2979 non-null   int64  
 15  publicationLanguage  2979 non-null   o

In [39]:
scopusRemaining.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1793 entries, 0 to 1796
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   publication     1793 non-null   object 
 1   title           1793 non-null   object 
 2   authors         1792 non-null   object 
 3   doi             1793 non-null   object 
 4   year            1793 non-null   int64  
 5   cites           1793 non-null   int64  
 6   type            1792 non-null   object 
 7   abstract        0 non-null      float64
 8   article_url     355 non-null    object 
 9   fulltext_url    0 non-null      float64
 10  abstractLength  0 non-null      float64
 11  scopus          1793 non-null   int64  
 12  pubmed          1793 non-null   int64  
 13  gscholar        1793 non-null   int64  
 14  crossref        1793 non-null   int64  
dtypes: float64(3), int64(6), object(6)
memory usage: 224.1+ KB


In [40]:
crossrefRemaining.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2985 entries, 0 to 3412
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   publication     2985 non-null   object 
 1   title           2985 non-null   object 
 2   authors         2878 non-null   object 
 3   doi             2985 non-null   object 
 4   year            2985 non-null   int64  
 5   cites           2985 non-null   int64  
 6   type            2985 non-null   object 
 7   abstract        0 non-null      object 
 8   article_url     2985 non-null   object 
 9   fulltext_url    2568 non-null   object 
 10  abstractLength  0 non-null      float64
 11  crossref        2985 non-null   int64  
 12  pubmed          2985 non-null   int64  
 13  gscholar        2985 non-null   int64  
 14  scopus          2985 non-null   int64  
dtypes: float64(1), int64(6), object(8)
memory usage: 373.1+ KB


### Records that have DOI but no abstract

Next we build a dataframe of records that have unique DOIs but no abstract. This will be the list of abstracts we will attempt to scrape and add to the dataframe of existing abstracts.

In [98]:
scraped = pubmed[pubmed.doi.notna() & pubmed.abstract.isna()]

In [99]:
pubmedRemaining = pubmed[pubmed.doi.isna()] #Leftover Pubmed Articles without a DOI 

In [100]:
crossrefToAdd = crossrefRemaining[~(crossrefRemaining['doi'].isin(collated.doi) | crossrefRemaining['doi'].isin(scraped.doi))]
scraped = pd.concat([scraped, crossrefToAdd])

In [101]:
scopusToAdd = scopusRemaining[~(scopusRemaining['doi'].isin(collated.doi) | scopusRemaining['doi'].isin(scraped.doi))]
scraped = pd.concat([scraped, scopusToAdd])

In [102]:
len(scraped)

2746

In [103]:
scraped.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2746 entries, 11 to 1782
Data columns (total 15 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   publication     2746 non-null   object 
 1   title           2745 non-null   object 
 2   authors         2638 non-null   object 
 3   doi             2746 non-null   object 
 4   year            2746 non-null   float64
 5   cites           2746 non-null   int64  
 6   type            2746 non-null   object 
 7   abstract        0 non-null      object 
 8   article_url     2196 non-null   object 
 9   fulltext_url    1762 non-null   object 
 10  abstractLength  0 non-null      float64
 11  pubmed          2746 non-null   int64  
 12  gscholar        2746 non-null   int64  
 13  crossref        2746 non-null   int64  
 14  scopus          2746 non-null   int64  
dtypes: float64(2), int64(5), object(8)
memory usage: 343.2+ KB


In [104]:
# Drop record without a title
scraped=scraped[scraped.title.notna()]

In [105]:
scraped[scraped.title.str.lower().str.islower()==False] #Shows titles with no alpha numeric characters. Just punctuation

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,pubmed,gscholar,crossref,scopus
640,Critical Care Medicine,1226,"['N. Gurukripa Kowlgi', 'Arushi Khurana', 'Rah...",10.1097/01.ccm.0000440458.63357.89,2013.0,0,journal-article,,http://dx.doi.org/10.1097/01.ccm.0000440458.63...,https://journals.lww.com/10.1097/01.ccm.000044...,,0,0,1,0


In [106]:
scraped = scraped[scraped.title.str.lower().str.islower()] # Drop any titles with no alphanumeric charcters

In [107]:
scraped.reset_index(drop=True, inplace=True)

In [108]:
scraped['publication'] = scraped.publication.str.lower()

In [109]:
#add language of abstracts, titles and journal names
scraped['publicationLanguage'] = scraped.publication.apply(detect)
scraped['titleLanguage'] = scraped.title.apply(detect)
scraped['abstractLanguage'] = np.NaN

In [110]:
scraped[(scraped.titleLanguage!='en') & (scraped.publicationLanguage!='en')]

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,pubmed,gscholar,crossref,scopus,publicationLanguage,titleLanguage,abstractLanguage
24,praxis,[CME. Larva migrans].,"[{'name': 'Lea Landolt', 'affiliation': 'Klini...",10.1024/1661-8157/a001859,2014.0,0,Journal Article,,,,,1,0,0,0,sk,ca,
33,jaapa,Lyme disease.,"[{'name': 'Jami S Smith', 'affiliation': ""Jami...",10.1097/01.JAA.0000446993.79681.08,2014.0,0,Journal Article,,,,,1,0,0,0,fi,fr,
38,tidsskrift for den norske laegeforening,[Cellular Borrelia tests].,"['Yngvar Tveten', 'Sølvi Noraas', 'Audun Aase']",10.4045/tidsskr.13.1053,2014.0,0,Journal Article,,,,,1,0,0,0,da,ca,
44,scientific american,Lingering Lyme.,['Melinda Wenner Moyer'],10.1038/scientificamerican0915-16,2015.0,0,Journal Article,,,,,1,0,0,0,it,da,
50,praxis,[Diagnosis of Lyme borreliosis].,"[{'name': 'Silvana K Rampini', 'affiliation': ...",10.1024/1661-8157/a001970,2015.0,0,Journal Article,,,,,1,0,0,0,sk,lt,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2222,scientific reports,Investigating BB0405 as a novel Borrelia afzel...,"['M. J. Klouwens', 'J. J. Trentelman', 'J. I. ...",10.1038/s41598-021-84130-y,2021.0,0,journal-article,,http://dx.doi.org/10.1038/s41598-021-84130-y,http://www.nature.com/articles/s41598-021-8413...,,0,0,1,0,fr,it,
2249,actas dermo-sifiliograficas,Lichen sclerosus,"[{'name': 'V. Monsálvez', 'affiliation': 'Hosp...",10.1016/j.ad.2009.07.004,2010.0,20,Article,,https://api.elsevier.com/content/article/eid/1...,,,0,0,0,1,es,de,
2504,clinical and vaccine immunology,Immunoglobulin M for acute infection: True or ...,"[{'name': 'M. Landry', 'affiliation': 'Yale Sc...",10.1128/CVI.00211-16,2016.0,43,Article,,,,,0,0,0,1,it,ro,
2557,isme journal,Ixodes scapularis does not harbor a stable mid...,"[{'name': 'B.D. Ross', 'affiliation': 'Univers...",10.1038/s41396-018-0161-6,2018.0,41,Article,,,,,0,0,0,1,fr,ca,


In [111]:
scraped = scraped[(scraped.titleLanguage=='en') | (scraped.publicationLanguage=='en')]

In [112]:
scraped.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2471 entries, 0 to 2743
Data columns (total 18 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   publication          2471 non-null   object 
 1   title                2471 non-null   object 
 2   authors              2378 non-null   object 
 3   doi                  2471 non-null   object 
 4   year                 2471 non-null   float64
 5   cites                2471 non-null   int64  
 6   type                 2471 non-null   object 
 7   abstract             0 non-null      object 
 8   article_url          1930 non-null   object 
 9   fulltext_url         1532 non-null   object 
 10  abstractLength       0 non-null      float64
 11  pubmed               2471 non-null   int64  
 12  gscholar             2471 non-null   int64  
 13  crossref             2471 non-null   int64  
 14  scopus               2471 non-null   int64  
 15  publicationLanguage  2471 non-null   o

In [132]:
scraped[scraped.authors.isna()]

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,pubmed,gscholar,crossref,scopus,publicationLanguage,titleLanguage,abstractLanguage
13,amyotrophic lateral sclerosis,"ALS untangled No. 17: ""when ALS is lyme"".",,10.3109/17482968.2012.717796,2012.0,0,Journal Article,,,,,1,0,0,0,en,en,
38,"continuum (minneapolis, minn.)",Appendix D: Summary of Evidence-based Guidelin...,,10.1212/CON.0000000000000263,2015.0,0,Journal Article,,,,,1,0,0,0,et,en,
39,"continuum (minneapolis, minn.)",Appendix C: Practice Parameter: Diagnosis of P...,,10.1212/CON.0000000000000262,2015.0,0,Journal Article,,,,,1,0,0,0,et,en,
62,jama,Treatment of Lyme Disease.,,10.1001/jama.2016.6888,2016.0,0,Journal Article,,,,,1,0,0,0,et,en,
77,jama,Deceptive Lyme Disease Diagnosis Linked With S...,,10.1001/jama.2017.8897,2017.0,0,Journal Article,,,,,1,0,0,0,et,en,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1744,reactions weekly,Immunosuppressants,,10.1007/s40278-020-80373-8,2020.0,0,journal-article,,http://dx.doi.org/10.1007/s40278-020-80373-8,https://link.springer.com/content/pdf/10.1007/...,,0,0,1,0,en,fr,
1751,journal of allergy & infectious diseases,Ticks positive for Lyme disease causing bacter...,,10.46439/allergy.1.006,2020.0,0,journal-article,,http://dx.doi.org/10.46439/allergy.1.006,,,0,0,1,0,en,en,
1785,archives of medical case reports,Healthy Fetal Outcomes Using A Novel Treatment...,,10.33696/casereports.2.006,2020.0,0,journal-article,,http://dx.doi.org/10.33696/casereports.2.006,,,0,0,1,0,en,en,
1854,nursing made incredibly easy!,Living with Lyme disease,,10.1097/01.nme.0000755896.12134.c7,2021.0,0,journal-article,,http://dx.doi.org/10.1097/01.nme.0000755896.12...,https://journals.lww.com/10.1097/01.NME.000075...,,0,0,1,0,en,en,


In [113]:
scraped.reset_index(drop=True, inplace=True)

In [133]:
scraped=scraped[scraped.authors.notna()]

In [145]:
scraped.reset_index(drop=True, inplace=True)

In [134]:
print('There are', scraped.publication.nunique(), 'different publications in the scraping dataset')

There are 953 different publications in the scraping dataset


In [135]:
round(scraped.publication.value_counts()[:476].sum() / len(scraped)*100, 2)

79.94

In [138]:
round(scraped.publication.value_counts()[:100].sum() / len(scraped)*100, 2)

47.1

In [137]:
scraped.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 2378 entries, 0 to 2470
Data columns (total 18 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   publication          2378 non-null   object 
 1   title                2378 non-null   object 
 2   authors              2378 non-null   object 
 3   doi                  2378 non-null   object 
 4   year                 2378 non-null   float64
 5   cites                2378 non-null   int64  
 6   type                 2378 non-null   object 
 7   abstract             0 non-null      object 
 8   article_url          1846 non-null   object 
 9   fulltext_url         1480 non-null   object 
 10  abstractLength       0 non-null      float64
 11  pubmed               2378 non-null   int64  
 12  gscholar             2378 non-null   int64  
 13  crossref             2378 non-null   int64  
 14  scopus               2378 non-null   int64  
 15  publicationLanguage  2378 non-null   o

There are 953 different journals in the dataset of 2378 articles. If we can successfully scrape abstracts for each of the top 100 (top 10%) sources we'd collect around half of the potential abstracts. Success across the top 50% of sources would collect 80% of abstracts.

### Saving New Dataframes

In [146]:
collated.to_csv('collated.csv')
scraped.to_csv('scraped.csv')

## Scraping

In [119]:
import requests                 # How Python gets the webpages
from bs4 import BeautifulSoup   # Creates structured, searchable object
import urllib                   # useful for cleaning/processing URLs

# import pprint as pp
# from time import sleep
# from datetime import datetime

In [140]:
scraped.publication.value_counts()[:20]

ticks and tick-borne diseases                     58
clinical infectious diseases                      53
plos one                                          44
infection and immunity                            43
journal of clinical microbiology                  35
parasites and vectors                             32
the american journal of medicine                  29
emerging infectious diseases                      28
journal of medical entomology                     26
clinical and vaccine immunology                   25
the lancet infectious diseases                    21
vector-borne and zoonotic diseases                21
clinical microbiology and infection               20
diagnostic microbiology and infectious disease    18
journal of bacteriology                           18
pediatric infectious disease journal              18
applied and environmental microbiology            17
option/bio                                        16
open forum infectious diseases                

### Formatting Vagaries

In each of the main sources of articles, the parent webpage (e.g. Science Direct, plos.org) tags Abstract text in different ways. 

For example:
* Science Direct (Ticks and Tick-borne Diseases): ``<h2 class="section-title u-h3 u-margin-l-top u-margin-xs-bottom">Abstract</h2><div id="abst0005">``
* PLoS ONE: ``<h2>Abstract\</h2>\<div class="abstract-content">``

* Vector-Borne and Zoonotic Diseases: ``<h2>Abstract</h2></div><div class="abstractSection abstractInFull">``

* Parasites & Vectors: ``<h2 class="c-article-section__title js-section-title js-c-reading-companion-sections-item" id="Abs1">Abstract</h2><div class="c-article-section__content" id="Abs1-content">``

In addition, some sites, notably Science Direct/Elsevier employ a redirect not picked up by the *Requests* package in Python because they use JavaScript to redirect from the page that the get request lands on. Additional handling is required in these cases to extract the actual URL, visit it and extract the Abstract text. 

### Scraping Function

In [439]:
def scrapeAbstracts(doi):
    headers = {'User-Agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (K HTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
              }
    url = 'http://dx.doi.org/' + doi
#     print(url)
    try:
        response = requests.get(url) 
    except requests.exceptions.RequestException as e:
                return 'Error'
        
    if(response.status_code!=200):
        return str(response.status_code)+ ' response'
    
    page = response.content
    abstractHeadingStart = page.find(b'>Abstract<')
    if abstractHeadingStart==-1: 
        redirect = page.find(b'redirect')
        if redirect ==-1:
            return 'No abstract found'
        else:
            URLstart = page.find(b'http', redirect)
            URLend = page.find(b'"', URLstart)
            URLencoded = page[URLstart:URLend]
            URLdecoded = URLencoded.decode('UTF-8')
            redirectURL = urllib.parse.unquote(URLdecoded)
            redirectURLshort = redirectURL[:redirectURL.find('?')]
#             print(redirectURLshort)
            try:
                response = requests.get(redirectURL, headers=headers)
            except requests.exceptions.RequestException as e:
                return 'Error after redirect'
            if(response.status_code!=200):
                return 'No response'
            page = response.content
            abstractHeadingStart = page.find(b'>Abstract<')
            if abstractHeadingStart==-1: 
                return 'No abstract found'
    lastMentionofAbstract = page.rfind(b'>Abstract<')
    if abstractHeadingStart != lastMentionofAbstract:
#         print('test')
        if page.find(b'>Abstract</h')== lastMentionofAbstract:
            abstractHeadingStart = lastMentionofAbstract
    abstractHeadingend = page.find(b'<div', abstractHeadingStart+len('>Abstract<'))
#         print(page[abstractHeadingStart:abstractHeadingend])
    try:
        divTagEnd = page.find(b'>', abstractHeadingend)
        divTag = page[abstractHeadingend:divTagEnd]
        divTagType = divTag[divTag.find(b' ')+1:divTag.find(b'=')].decode()
        divTagAtrr = divTag[divTag.find(b'"')+1:]
        divTagAtrr = divTagAtrr[:divTagAtrr.find(b'"')]

        scraping = BeautifulSoup(page, "html") 
        text = scraping.find("div", attrs={divTagType: divTagAtrr})
        for subTag in text.contents[:-1]:
            if subTag.name is not None and subTag.name.startswith("h"):
                subTag.string = subTag.string + '.'
    except AttributeError as error:
        headingTag = page[page.rfind(b'<',0, abstractHeadingStart):abstractHeadingStart]
        headingTag = headingTag[:3]

        abstractStart = page.find(b'>', abstractHeadingStart+len('>Abstract<'))+1
        abstractEnd = page.find(headingtag,abstractStart)

        scraping = BeautifulSoup(page[abstractStart:abstractEnd], "html") 
        text = scraping.get_text(strip=True)
        return text
    except TypeError as error:
        return 'TypeError'
#         print(divTag)
#         print(divTagType)
#         print(divTagAtrr)
    return text.get_text(separator = ' ', strip=True)

In [182]:
scraped

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,pubmed,gscholar,crossref,scopus,publicationLanguage,titleLanguage,abstractLanguage
0,the veterinary record,Prevalence of Borrelia infection in ticks from...,"[{'name': 'D Couper', 'affiliation': 'RSPCA We...",10.1136/vr.c5285,2010.0,0,Journal Article,,,,,1,0,0,0,en,en,
1,nihon saikingaku zasshi. japanese journal of b...,[Molecular mechanism of the borrelial proteins...,"[{'name': 'Masahito Fukunaga', 'affiliation': ...",10.3412/jsb.65.343,2010.0,0,Journal Article,,,,,1,0,0,0,hr,en,
2,mmw fortschritte der medizin,[Clinical aspects of neuroborreliosis].,"[{'name': 'Hans-Walter Pfister', 'affiliation'...",10.1007/BF03366785,2010.0,0,Journal Article,,,,,1,0,0,0,de,en,
3,the nurse practitioner,Lyme disease: a diagnostic dilemma.,"[{'name': 'Virginia Savely', 'affiliation': 'T...",10.1097/01.NPR.0000383661.45156.09,2010.0,0,Journal Article,,,,,1,0,0,0,en,it,
4,schweizer archiv fur tierheilkunde,"[""Lyme disease"" as a possible cause for lamene...","[{'name': 'Esther Peterhans', 'affiliation': '...",10.1024/0036-7281/a000067,2010.0,0,Journal Article,,,,,1,0,0,0,de,en,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2373,scientific reports,Environmental determinants of the occurrence a...,"[{'name': 'Z. Zając', 'affiliation': 'Medical ...",10.1038/s41598-021-95079-3,2021.0,0,Article,,,,,0,0,0,1,fr,en,
2374,sage open medical case reports,Penile cancer after a tick bite: A possible as...,"[{'name': 'O. Ivanovski', 'affiliation': 'SS C...",10.1177/2050313X211036779,2021.0,0,Article,,,,,0,0,0,1,en,en,
2375,dermatologia revista mexicana,Lyme's disease,"[{'name': 'Z. Quijada-Ucelo', 'affiliation': '...",10.24245/dermatolrevmex.v65id.5442,2021.0,0,Article,,,,,0,0,0,1,it,en,
2376,journal of the american veterinary medical ass...,Comparisons of hematologic results for juvenil...,"[{'name': 'K.S. KuKanich', 'affiliation': 'Kan...",10.2460/JAVMA.259.3.275,2021.0,0,Article,,,,,0,0,0,1,en,en,


In [184]:
scraped = scraped.iloc[scraped.groupby('publication').publication.transform('size').mul(-1).argsort(kind='mergesort')]

In [185]:
scraped.describe()

Unnamed: 0,year,cites,abstractLength,pubmed,gscholar,crossref,scopus,abstractLanguage
count,2378.0,2378.0,0.0,2378.0,2378.0,2378.0,2378.0,0.0
mean,2015.754415,13.060976,,0.04878,0.0,0.735071,0.216148,
std,3.274343,30.33368,,0.215454,0.0,0.441388,0.411703,
min,2010.0,0.0,,0.0,0.0,0.0,0.0,
25%,2013.0,0.0,,0.0,0.0,0.0,0.0,
50%,2016.0,3.0,,0.0,0.0,1.0,0.0,
75%,2019.0,16.0,,0.0,0.0,1.0,0.0,
max,2021.0,746.0,,1.0,0.0,1.0,1.0,


In [156]:
scrapeAbstracts(scraped.loc[962]['doi'])

'No response'

### Sample for testing

In [445]:
sample = scraped.sample(250)

In [447]:
sample['abstract']= sample.doi.apply(scrapeAbstracts)

In [448]:
sample['abstractLength']=sample.abstract.apply(len)

In [449]:
sample.describe()

Unnamed: 0,year,cites,abstractLength,pubmed,gscholar,crossref,scopus,abstractLanguage
count,250.0,250.0,250.0,250.0,250.0,250.0,250.0,0.0
mean,2015.732,15.264,827.168,0.064,0.0,0.74,0.196,
std,3.271542,40.672587,4148.252983,0.245244,0.0,0.439514,0.397765,
min,2010.0,0.0,0.0,0.0,0.0,0.0,0.0,
25%,2013.0,0.0,12.0,0.0,0.0,0.0,0.0,
50%,2016.0,3.0,17.0,0.0,0.0,1.0,0.0,
75%,2018.0,15.0,736.25,0.0,0.0,1.0,0.0,
max,2021.0,366.0,45743.0,1.0,0.0,1.0,1.0,


In [454]:
sample[sample.abstractLength == sample.abstractLength.min()]

Unnamed: 0,publication,title,authors,doi,year,cites,type,abstract,article_url,fulltext_url,abstractLength,pubmed,gscholar,crossref,scopus,publicationLanguage,titleLanguage,abstractLanguage
839,journal of investigative dermatology,Geographical and Temporal Correlations in the ...,"['Vladimir Ratushny', 'Gideon P. Smith']",10.1038/jid.2015.93,2015.0,1,journal-article,,http://dx.doi.org/10.1038/jid.2015.93,https://api.elsevier.com/content/article/PII:S...,0,0,0,1,0,en,en,
1560,frontiers in veterinary science,Social-behavioral/ecological risk assessment f...,"['Catherine Bouchard', 'Cécile Aenishaenslin',...",10.3389/conf.fvets.2019.05.00046,2019.0,0,journal-article,,http://dx.doi.org/10.3389/conf.fvets.2019.05.0...,,0,0,0,1,0,en,en,
755,pediatric emergency care,Lyme Arthritis,['Sherwin S. Chan'],10.1097/pec.0000000000000576,2015.0,0,journal-article,,http://dx.doi.org/10.1097/pec.0000000000000576,https://journals.lww.com/10.1097/PEC.000000000...,0,0,0,1,0,en,en,
2359,animals,Data on before and after the traceability syst...,"[{'name': 'C. Chirollo', 'affiliation': 'Unive...",10.3390/ani11030913,2021.0,1,Article,,,,0,0,0,0,1,et,en,
994,european journal of paediatric neurology,Acute isolated partial oculomotor nerve palsy ...,"['Anne Drenckhahn', 'Birgit Spors', 'Ellen Kni...",10.1016/j.ejpn.2016.05.022,2016.0,1,journal-article,,http://dx.doi.org/10.1016/j.ejpn.2016.05.022,https://api.elsevier.com/content/article/PII:S...,0,0,0,1,0,en,en,
475,the journal for nurse practitioners,Lyme Disease: From Early Localized Disease to ...,"['Chloe Nichols', 'Brenda Windemuth']",10.1016/j.nurpra.2013.04.017,2013.0,0,journal-article,,http://dx.doi.org/10.1016/j.nurpra.2013.04.017,https://api.elsevier.com/content/article/PII:S...,0,0,0,1,0,en,en,
1742,journal of obstetrics and gynaecology canada,Opinion du comité No 399 : Prise en charge des...,"['Graeme N. Smith', 'Kieran M. Moore', 'Todd F...",10.1016/j.jogc.2020.02.110,2020.0,0,journal-article,,http://dx.doi.org/10.1016/j.jogc.2020.02.110,https://api.elsevier.com/content/article/PII:S...,0,0,0,1,0,en,fr,
601,the american journal of medicine,Treatment Trials for Post-Lyme Disease Symptom...,"['Mark S. Klempner', 'Phillip J. Baker', 'Euge...",10.1016/j.amjmed.2013.02.014,2013.0,79,journal-article,,http://dx.doi.org/10.1016/j.amjmed.2013.02.014,https://api.elsevier.com/content/article/PII:S...,0,0,0,1,0,en,en,
520,the american journal of medicine,Alternative Considerations for “Common Misconc...,['Alfred Miller'],10.1016/j.amjmed.2013.01.034,2013.0,1,journal-article,,http://dx.doi.org/10.1016/j.amjmed.2013.01.034,https://api.elsevier.com/content/article/PII:S...,0,0,0,1,0,en,en,
1507,journal of the neurological sciences,Optic neuropathy: unusual mode of revelation o...,"['M. Mati', 'A. Bouarfa']",10.1016/j.jns.2019.10.1703,2019.0,0,journal-article,,http://dx.doi.org/10.1016/j.jns.2019.10.1703,https://api.elsevier.com/content/article/PII:S...,0,0,0,1,0,en,en,


In [453]:
sample.loc[1277].abstract

"Suggest a Research Topic > .disabled-supplementary-btn {\r\n        cursor: not-allowed;\r\n        pointer-events: none;\r\n        opacity: .65;\r\n        filter: alpha(opacity=65);\r\n        -webkit-box-shadow: none;\r\n        box-shadow: none;\r\n    } Download Article Download PDF ReadCube EPUB XML (NLM) Supplementary Material Export citation EndNote Reference Manager Simple TEXT file BibTex total views View Article Impact Suggest a Research Topic > SHARE ON Open Supplemental Data MINI REVIEW article Front. Cell. Infect. Microbiol., 29 May 2018\r\n                             | https://doi.org/10.3389/fcimb.2018.00176 Ixodes Immune Responses Against Lyme Disease Pathogens Chrysoula Kitsou and Utpal Pal * Department of Veterinary Medicine and Virginia-Maryland Regional College of Veterinary Medicine, University of Maryland, College Park, MD, United States Although Ixodes scapularis and other related tick species are considered prolific vectors for a number of important human di

In [None]:
sample.

## Sandbox

### Testing

In [439]:
def scrapeAbstracts(doi):
    headers = {'User-Agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (K HTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
              }
    url = 'http://dx.doi.org/' + doi
#     print(url)
    try:
        response = requests.get(url) 
    except requests.exceptions.RequestException as e:
                return 'Error'
        
    if(response.status_code!=200):
        return str(response.status_code)+ ' response'
    
    page = response.content
    abstractHeadingStart = page.find(b'>Abstract<')
    if abstractHeadingStart==-1: 
        redirect = page.find(b'redirect')
        if redirect ==-1:
            return 'No abstract found'
        else:
            URLstart = page.find(b'http', redirect)
            URLend = page.find(b'"', URLstart)
            URLencoded = page[URLstart:URLend]
            URLdecoded = URLencoded.decode('UTF-8')
            redirectURL = urllib.parse.unquote(URLdecoded)
            redirectURLshort = redirectURL[:redirectURL.find('?')]
#             print(redirectURLshort)
            try:
                response = requests.get(redirectURL, headers=headers)
            except requests.exceptions.RequestException as e:
                return 'Error after redirect'
            if(response.status_code!=200):
                return 'No response'
            page = response.content
            abstractHeadingStart = page.find(b'>Abstract<')
            if abstractHeadingStart==-1: 
                return 'No abstract found'
    lastMentionofAbstract = page.rfind(b'>Abstract<')
    if abstractHeadingStart != lastMentionofAbstract:
#         print('test')
        if page.find(b'>Abstract</h')== lastMentionofAbstract:
            abstractHeadingStart = lastMentionofAbstract
    abstractHeadingend = page.find(b'<div', abstractHeadingStart+len('>Abstract<'))
#         print(page[abstractHeadingStart:abstractHeadingend])
    try:
        divTagEnd = page.find(b'>', abstractHeadingend)
        divTag = page[abstractHeadingend:divTagEnd]
        divTagType = divTag[divTag.find(b' ')+1:divTag.find(b'=')].decode()
        divTagAtrr = divTag[divTag.find(b'"')+1:]
        divTagAtrr = divTagAtrr[:divTagAtrr.find(b'"')]

        scraping = BeautifulSoup(page, "html") 
        text = scraping.find("div", attrs={divTagType: divTagAtrr})
        for subTag in text.contents[:-1]:
            if subTag.name is not None and subTag.name.startswith("h"):
                subTag.string = subTag.string + '.'
    except AttributeError as error:
        headingTag = page[page.rfind(b'<',0, abstractHeadingStart):abstractHeadingStart]
        headingTag = headingTag[:3]

        abstractStart = page.find(b'>', abstractHeadingStart+len('>Abstract<'))+1
        abstractEnd = page.find(headingtag,abstractStart)

        scraping = BeautifulSoup(page[abstractStart:abstractEnd], "html") 
        text = scraping.get_text(strip=True)
        return text
    except TypeError as error:
        return 'TypeError'
#         print(divTag)
#         print(divTagType)
#         print(divTagAtrr)
    return text.get_text(separator = ' ', strip=True)

In [437]:
journal = scraped.publication.value_counts().keys()[10]
journal

'the lancet infectious diseases'

In [226]:
scrapeAbstracts(scraped[scraped.publication==journal].iloc[0].doi)

http://dx.doi.org/10.1016/s1473-3099(11)70034-2
http://www.thelancet.com/retrieve/pii/S147330991170034


'This Journal Full Site'

In [None]:
Redirect
'infection and immunity'
'journal of clinical microbiology'
'clinical and vaccine immunology'



Access
'the american journal of medicine'

Grabs wrong section
'emerging infectious diseases'
'the lancet infectious diseases'


In [443]:
for i in range (scraped.publication.nunique()):
    journal = scraped.publication.value_counts().keys()[i]
    print(i, journal)
    scrapeAbstracts(scraped[scraped.publication==journal].iloc[0].doi)

0 ticks and tick-borne diseases
1 clinical infectious diseases
2 plos one
3 infection and immunity
4 journal of clinical microbiology
5 parasites and vectors
6 the american journal of medicine
7 emerging infectious diseases
8 journal of medical entomology
9 clinical and vaccine immunology
10 the lancet infectious diseases
11 vector-borne and zoonotic diseases
12 clinical microbiology and infection
13 pediatric infectious disease journal
14 journal of bacteriology
15 diagnostic microbiology and infectious disease
16 applied and environmental microbiology
17 option/bio
18 open forum infectious diseases
19 scientific reports
20 cureus
21 journal of the american college of cardiology
22 jama
23 bmj
24 canadian medical association journal
25 health problems of civilization
26 international journal of general medicine
27 molecular microbiology
28 journal of the neurological sciences
29 plos pathogens
30 european journal of clinical microbiology & infectious diseases
31 médecine et maladies i

253 klinische pädiatrie
254 comparative immunology, microbiology and infectious diseases
255 proceedings of the royal society b: biological sciences
256 the quarterly review of biology
257 clinical case reports
258 practice nursing
259 journal of microbiology and infectious diseases
260 proceedings of the geologists' association
261 the pediatric infectious disease journal
262 british journal of biomedical science
263 european journal of dermatology
264 journal of infection and chemotherapy
265 sociology of health & illness
266 jbjs case connector
267 veterinary microbiology
268 clinical neurology and neurosurgery
269 current research in complementary & alternative medicine
270 ecology
271 aap grand rounds
272 cmaj
273 neuro-ophthalmology
274 journal of immunology
275 neurology, psychiatry and brain research
276 microbiology spectrum
277 insect biochemistry and molecular biology
278 internal medicine
279 skin & allergy news
280 chest
281 arthritis & rheumatology (hoboken, n.j.)
282 tid

496 immunochemistry & immunopathology
497 international journal of occupational medicine and environmental health
498 japanese journal of infectious diseases
499 american journal of psychiatry
500 the journal of pediatric research
501 canadian mathematical bulletin
502 international journal of otolaryngology and head &amp; neck surgery
503 transboundary and emerging diseases
504 biotechnology & biotechnological equipment
505 cell host & microbe
506 brain, behavior, & immunity - health
507 revista do instituto de medicina tropical de sao paulo
508 journal of the british global and travel health association
509 clinical medical reviews and case reports
510 lviv clinical bulletin
511 equine health
512 pm & r
513 rheumatology science and practice
514 transfusion medicine
515 current sports medicine reports
516 environmental evidence
517 journal of evolutionary biology
518 acs chemical biology
519 american naturalist
520 developmental and comparative immunology
521 clinical medicine, journa

720 muscle and nerve
721 orthopedics and rheumatology open access journal
722 the korean journal of clinical laboratory science
723 pacing and clinical electrophysiology
724 online journal of public health informatics
725 biochimica et biophysica acta (bba) - biomembranes
726 zaporozhye medical journal
727 journal of minimally invasive gynecology
728 arthroplasty today
729 journal of clinical psychopharmacology
730 medical research journal
731 journal of pharmacy and pharmacology
732 applied sciences (switzerland)
733 oalib
734 pons - medicinski casopis
735 emerging microbes and infections
736 obstetrics and gynecology
737 advances in ophthalmology and optometry
738 the british journal of dermatology
739 ophthalmic surgery, lasers and imaging retina
740 proceedings of the institution of civil engineers - civil engineering
741 journal of the chinese medical association
742 advances in entomology
743 journal of apicultural research
744 bioconjugate chemistry
745 neurology india
746 austr

In [440]:
scrapeAbstracts(scraped[scraped.publication=='pediatric neurology briefs'].iloc[1].doi)

'TypeError'

In [326]:
# scraped[scraped.publication==journal]

In [444]:
# scrapeAbstractsv2(scraped[scraped.publication=='pediatric neurology briefs'].iloc[0].doi )

In [365]:
journal = scraped.publication.value_counts().keys()[2]
journal

'plos one'

In [366]:
doi = scraped[scraped.publication==journal].iloc[0].doi
doi

'10.1371/journal.pone.0017414'

In [315]:
# doi =scraped[scraped.publication=='clinical infectious diseases'].iloc[0].doi 
# doi

In [367]:
url = 'http://dx.doi.org/' + doi
response = requests.get(url, headers=headers) 
response

<Response [200]>

In [341]:
# url = 'https://www.sciencedirect.com/science/article/pii/S1877959X17305320'
# response = requests.get(url, headers=headers) 
# response

In [372]:
page = response.content
page

b'\n<!DOCTYPE html>\n<html xmlns="http://www.w3.org/1999/xhtml"\n      xmlns:dc="http://purl.org/dc/terms/"\n      xmlns:doi="http://dx.doi.org/"\n      lang="en" xml:lang="en"\n      itemscope itemtype="http://schema.org/Article"\n      class="no-js">\n\n\n\n<head prefix="og: http://ogp.me/ns#">\n  <title>Nod2 Suppresses Borrelia burgdorferi Mediated Murine Lyme Arthritis and Carditis through the Induction of Tolerance</title>\n\n\n\n\n  <link rel="stylesheet" type="text/css"  href="/resource/css/screen.css"/>\n\n  <!-- allows for  extra head tags -->\n\n\n<!-- hello -->\n<link rel="stylesheet" type="text/css"\n      href="https://fonts.googleapis.com/css?family=Open+Sans:400,400i,600">\n\n<link media="print" rel="stylesheet" type="text/css"  href="/resource/css/print.css"/>\n    <script type="text/javascript">\n        var siteUrlPrefix = "/plosone/";\n    </script>\n  <script src="/resource/js/vendor/modernizr-v2.7.1.js" type="text/javascript"></script>\n  <script src="/resource/js/

In [373]:
abstractHeadingStart = page.find(b'>Abstract<')
abstractHeadingStart

74655

In [357]:
headers = {'User-Agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (K HTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
              }
URLstart = page.find(b'http', page.find(b'redirect'))
URLend = page.find(b'"', URLstart)
URLencoded = page[URLstart:URLend]
URLdecoded = URLencoded.decode('UTF-8')
redirectURL = urllib.parse.unquote(URLdecoded)
redirectURLshort = redirectURL[:redirectURL.find('?')]
print(redirectURLshort)
response = requests.get(redirectURL, headers=headers)
# if(response.status_code!=200):
#     return 'No response'
page = response.content
abstractHeadingStart = page.find(b'>Abstract<')    
abstractHeadingStart

https://ars.els-cdn.co


-1

In [358]:
page

b'<!doctype html><html lang="en"><head><title>HTTP Status 404 \xe2\x80\x93 Not Found</title><style type="text/css">h1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} h2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} h3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} body {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} b {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} p {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;} a {color:black;} a.name {color:black;} .line {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP Status 404 \xe2\x80\x93 Not Found</h1><hr class="line" /><p><b>Type</b> Status Report</p><p><b>Description</b> The origin server did not find a current representation for the target resource or is not willing to disclose that one exists

In [377]:
page.rfind(b'<', 0, abstractHeadingStart)

74652

In [379]:
page[74652:74655+1]

b'<h2>'

In [380]:
page.find(b'<h2>',74655 )

77734

In [381]:
page[74652:77734]

b'<h2>Abstract</h2><div class="abstract-content"><a id="article1.front1.article-meta1.abstract1.p1" name="article1.front1.article-meta1.abstract1.p1" class="link-target"></a><p>The internalization of <em>Borrelia burgdorferi</em>, the causative agent of Lyme disease, by phagocytes is essential for an effective activation of the immune response to this pathogen. The intracellular, cytosolic receptor Nod2 has been shown to play varying roles in either enhancing or attenuating inflammation in response to different infectious agents. We examined the role of Nod2 in responses to <em>B. burgdorferi</em>. <em>In vitro</em> stimulation of Nod2 deficient bone marrow derived macrophages (BMDM) resulted in decreased induction of multiple cytokines, interferons and interferon regulated genes compared with wild-type cells. However, <em>B. burgdorferi</em> infection of Nod2 deficient mice resulted in increased rather than decreased arthritis and carditis compared to control mice. We explored multipl

In [383]:
BeautifulSoup(page[74652:77734], "html")

<html><body><h2>Abstract</h2><div class="abstract-content"><a class="link-target" id="article1.front1.article-meta1.abstract1.p1" name="article1.front1.article-meta1.abstract1.p1"></a><p>The internalization of <em>Borrelia burgdorferi</em>, the causative agent of Lyme disease, by phagocytes is essential for an effective activation of the immune response to this pathogen. The intracellular, cytosolic receptor Nod2 has been shown to play varying roles in either enhancing or attenuating inflammation in response to different infectious agents. We examined the role of Nod2 in responses to <em>B. burgdorferi</em>. <em>In vitro</em> stimulation of Nod2 deficient bone marrow derived macrophages (BMDM) resulted in decreased induction of multiple cytokines, interferons and interferon regulated genes compared with wild-type cells. However, <em>B. burgdorferi</em> infection of Nod2 deficient mice resulted in increased rather than decreased arthritis and carditis compared to control mice. We explor

In [345]:
page.find(b'>Abstract</h')

63093

In [348]:
abstractHeadingStart=63093

In [374]:
page[abstractHeadingStart:]

b'>Abstract</h2><div class="abstract-content"><a id="article1.front1.article-meta1.abstract1.p1" name="article1.front1.article-meta1.abstract1.p1" class="link-target"></a><p>The internalization of <em>Borrelia burgdorferi</em>, the causative agent of Lyme disease, by phagocytes is essential for an effective activation of the immune response to this pathogen. The intracellular, cytosolic receptor Nod2 has been shown to play varying roles in either enhancing or attenuating inflammation in response to different infectious agents. We examined the role of Nod2 in responses to <em>B. burgdorferi</em>. <em>In vitro</em> stimulation of Nod2 deficient bone marrow derived macrophages (BMDM) resulted in decreased induction of multiple cytokines, interferons and interferon regulated genes compared with wild-type cells. However, <em>B. burgdorferi</em> infection of Nod2 deficient mice resulted in increased rather than decreased arthritis and carditis compared to control mice. We explored multiple p

In [371]:
page[abstractHeadingStart:]

b'>Abstract</h2><div class="abstract-content"><a id="article1.front1.article-meta1.abstract1.p1" name="article1.front1.article-meta1.abstract1.p1" class="link-target"></a><p>The internalization of <em>Borrelia burgdorferi</em>, the causative agent of Lyme disease, by phagocytes is essential for an effective activation of the immune response to this pathogen. The intracellular, cytosolic receptor Nod2 has been shown to play varying roles in either enhancing or attenuating inflammation in response to different infectious agents. We examined the role of Nod2 in responses to <em>B. burgdorferi</em>. <em>In vitro</em> stimulation of Nod2 deficient bone marrow derived macrophages (BMDM) resulted in decreased induction of multiple cytokines, interferons and interferon regulated genes compared with wild-type cells. However, <em>B. burgdorferi</em> infection of Nod2 deficient mice resulted in increased rather than decreased arthritis and carditis compared to control mice. We explored multiple p

In [258]:
page[63093:]

b'>Abstract</h3>\n<p>Lyme disease is a tick-borne illness caused primarily by the spirochete <em>Borrelia burgdorferi</em>. The disease is most prevalent in forested areas endemic for Ixodes tick, which transmits the spirochete. Here, we describe a case of Lyme meningoencephalitis masquerading as normal pressure hydrocephalus (NPH) which initially presented with urinary incontinence, gait instability, and neurological decline. Due to its non-specific symptoms and low incidence, Lyme meningoencephalitis causing NPH like syndrome poses a diagnostic conundrum for clinicians. Awareness of this disease entity is key for prompt diagnosis and treatment.</p>\n</div>\n<div class=\'article-content-section\'>\n<h3 class=\'reg\' id=\'introduction\'>Introduction</h3>\n<div class=\'article-content-body\'>\n<p>Lyme disease is a tick-borne illness caused primarily by the spirochete <em>Borrelia burgdorferi</em>. The disease is most prevalent in forested areas endemic for Ixodes tick, which transmits t

In [349]:
abstractHeadingend = page.find(b'<div', abstractHeadingStart+len('>Abstract<'))
abstractHeadingend

63756

In [350]:
page[abstractHeadingStart:abstractHeadingend]

b'>Abstract</h3>\n<p>Lyme disease is a tick-borne illness caused primarily by the spirochete <em>Borrelia burgdorferi</em>. The disease is most prevalent in forested areas endemic for Ixodes tick, which transmits the spirochete. Here, we describe a case of Lyme meningoencephalitis masquerading as normal pressure hydrocephalus (NPH) which initially presented with urinary incontinence, gait instability, and neurological decline. Due to its non-specific symptoms and low incidence, Lyme meningoencephalitis causing NPH like syndrome poses a diagnostic conundrum for clinicians. Awareness of this disease entity is key for prompt diagnosis and treatment.</p>\n</div>\n'

In [241]:
divTagEnd = page.find(b'>', abstractHeadingend)
divTag = page[abstractHeadingend:divTagEnd]
divTagType = divTag[divTag.find(b' ')+1:divTag.find(b'=')].decode()
divTagAtrr = divTag[divTag.find(b'"')+1:]
divTagAtrr = divTagAtrr[:divTagAtrr.find(b'"')]

In [242]:
scraping = BeautifulSoup(page, "html") 
scraping

<!DOCTYPE html>
<html prefix="og: http://ogp.me/ns#">
<head>
<meta charset="utf-8"/>
<meta content="zjkqdnefb408s8r553whzqbqpwzt4h" name="facebook-domain-verification"/>
<script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-5L5TP48');</script>
<script async="" defer="" id="hs-script-loader" src="//js.hs-scripts.com/9367854.js" type="text/javascript"></script>
<script>
  window.dataLayer = window.dataLayer || [];
</script>
<style>
  .async-hide { opacity: 0 !important}
</style>
<script src="https://servedbydoceree.doceree.com/script/render-header.js"></script>
<script>
var hcpContext;

function docereeLogIn(userObj) {
    if (!hcpContext) {
        hcpContext = userObj;
        if (typeof setDocereeContext 

In [243]:
text = scraping.find("div", attrs={divTagType: divTagAtrr})
text

### Testing for cureus

In [384]:
doi =scraped[scraped.publication=='cureus'].iloc[0].doi 
doi

'10.7759/cureus.2417'

In [415]:
scrapeAbstractsv2(scraped[scraped.publication=='cureus'].iloc[0].doi )

http://dx.doi.org/10.7759/cureus.2417
test


'>Abstract\nLyme disease is a tick-borne illness caused primarily by the spirochete Borrelia burgdorferi. The disease is most prevalent in forested areas endemic for Ixodes tick, which transmits the spirochete. Here, we describe a case of Lyme meningoencephalitis masquerading as normal pressure hydrocephalus (NPH) which initially presented with urinary incontinence, gait instability, and neurological decline. Due to its non-specific symptoms and low incidence, Lyme meningoencephalitis causing NPH like syndrome poses a diagnostic conundrum for clinicians. Awareness of this disease entity is key for prompt diagnosis and treatment.\n\n'

In [385]:
url = 'http://dx.doi.org/' + doi
response = requests.get(url, headers=headers) 
response

<Response [200]>

In [386]:
page = response.content
page

b'<!DOCTYPE html>\n<html prefix=\'og: http://ogp.me/ns#\'>\n<head>\n<meta charset=\'utf-8\'>\n<meta content=\'zjkqdnefb408s8r553whzqbqpwzt4h\' name=\'facebook-domain-verification\'>\n\n<script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({\'gtm.start\':\nnew Date().getTime(),event:\'gtm.js\'});var f=d.getElementsByTagName(s)[0],\nj=d.createElement(s),dl=l!=\'dataLayer\'?\'&l=\'+l:\'\';j.async=true;j.src=\n\'https://www.googletagmanager.com/gtm.js?id=\'+i+dl;f.parentNode.insertBefore(j,f);\n})(window,document,\'script\',\'dataLayer\',\'GTM-5L5TP48\');</script>\n\n\n<script type="text/javascript" id="hs-script-loader" async defer src="//js.hs-scripts.com/9367854.js"></script>\n\n<script>\n  window.dataLayer = window.dataLayer || [];\n</script>\n<style>\n  .async-hide { opacity: 0 !important}\n</style>\n<script src=\'https://servedbydoceree.doceree.com/script/render-header.js\'></script>\n<script>\nvar hcpContext;\n\nfunction docereeLogIn(userObj) {\n    if (!hcpContext) {\n        hcpCon

In [391]:
abstractHeadingStart = page.find(b'>Abstract</h')
abstractHeadingStart

63093

In [392]:
page[abstractHeadingStart:]

b'>Abstract</h3>\n<p>Lyme disease is a tick-borne illness caused primarily by the spirochete <em>Borrelia burgdorferi</em>. The disease is most prevalent in forested areas endemic for Ixodes tick, which transmits the spirochete. Here, we describe a case of Lyme meningoencephalitis masquerading as normal pressure hydrocephalus (NPH) which initially presented with urinary incontinence, gait instability, and neurological decline. Due to its non-specific symptoms and low incidence, Lyme meningoencephalitis causing NPH like syndrome poses a diagnostic conundrum for clinicians. Awareness of this disease entity is key for prompt diagnosis and treatment.</p>\n</div>\n<div class=\'article-content-section\'>\n<h3 class=\'reg\' id=\'introduction\'>Introduction</h3>\n<div class=\'article-content-body\'>\n<p>Lyme disease is a tick-borne illness caused primarily by the spirochete <em>Borrelia burgdorferi</em>. The disease is most prevalent in forested areas endemic for Ixodes tick, which transmits t

In [395]:
headingtag = page[page.rfind(b'<', 0, abstractHeadingStart):abstractHeadingStart+1]
headingtag = headingtag[:3]
headingtag

b'<h3'

In [379]:
page[74652:74655+1]

b'<h2>'

In [377]:
page.rfind(b'<', 0, abstractHeadingStart)

74652

In [379]:
page[74652:74655+1]

b'<h2>'

In [380]:
page.find(b'<h2>',74655 )

77734

In [402]:
abstractHeadingend = page.find(b'>', abstractHeadingStart+len('>Abstract<'))+1
abstractHeadingend

63107

In [403]:
abstractEnd = page.find(headingtag,abstractHeadingend )

In [404]:
page[abstractHeadingend:abstractEnd]

b"\n<p>Lyme disease is a tick-borne illness caused primarily by the spirochete <em>Borrelia burgdorferi</em>. The disease is most prevalent in forested areas endemic for Ixodes tick, which transmits the spirochete. Here, we describe a case of Lyme meningoencephalitis masquerading as normal pressure hydrocephalus (NPH) which initially presented with urinary incontinence, gait instability, and neurological decline. Due to its non-specific symptoms and low incidence, Lyme meningoencephalitis causing NPH like syndrome poses a diagnostic conundrum for clinicians. Awareness of this disease entity is key for prompt diagnosis and treatment.</p>\n</div>\n<div class='article-content-section'>\n"

In [430]:
BeautifulSoup(page[abstractHeadingend:abstractEnd], "html").get_text(strip=True)

'Lyme disease is a tick-borne illness caused primarily by the spirocheteBorrelia burgdorferi. The disease is most prevalent in forested areas endemic for Ixodes tick, which transmits the spirochete. Here, we describe a case of Lyme meningoencephalitis masquerading as normal pressure hydrocephalus (NPH) which initially presented with urinary incontinence, gait instability, and neurological decline. Due to its non-specific symptoms and low incidence, Lyme meningoencephalitis causing NPH like syndrome poses a diagnostic conundrum for clinicians. Awareness of this disease entity is key for prompt diagnosis and treatment.'

In [418]:
def scrapeAbstractsv2(doi):
    headers = {'User-Agent': "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_1) AppleWebKit/537.36 (K HTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36"
              }
    url = 'http://dx.doi.org/' + doi
    print(url)
    try:
        response = requests.get(url) 
    except requests.exceptions.RequestException as e:
                return 'Error'
        
    if(response.status_code!=200):
        return str(response.status_code)+ ' response'
    
    page = response.content
    abstractHeadingStart = page.find(b'>Abstract<')
    if abstractHeadingStart==-1: 
        redirect = page.find(b'redirect')
        if redirect ==-1:
            return 'No abstract found'
        else:
            URLstart = page.find(b'http', redirect)
            URLend = page.find(b'"', URLstart)
            URLencoded = page[URLstart:URLend]
            URLdecoded = URLencoded.decode('UTF-8')
            redirectURL = urllib.parse.unquote(URLdecoded)
            redirectURLshort = redirectURL[:redirectURL.find('?')]
            print(redirectURLshort)
            try:
                response = requests.get(redirectURL, headers=headers)
            except requests.exceptions.RequestException as e:
                return 'Error after redirect'
            if(response.status_code!=200):
                return 'No response'
            page = response.content
            abstractHeadingStart = page.find(b'>Abstract<')
            if abstractHeadingStart==-1: 
                return 'No abstract found'
    lastMentionofAbstract = page.rfind(b'>Abstract<')
    if abstractHeadingStart != lastMentionofAbstract:
#         print('test')
        if page.find(b'>Abstract</h')== lastMentionofAbstract:
            abstractHeadingStart = lastMentionofAbstract
    headingTag = page[page.rfind(b'<',0, abstractHeadingStart):abstractHeadingStart]
    headingTag = headingTag[:3]
    
    abstractStart = page.find(b'>', abstractHeadingStart+len('>Abstract<'))+1
    abstractEnd = page.find(headingtag,abstractStart)

    scraping = BeautifulSoup(page[abstractStart:abstractEnd], "html") 
    text = scraping.get_text()
    return text

In [417]:
scrapeAbstractsv2(scraped[scraped.publication=='cureus'].iloc[0].doi )

http://dx.doi.org/10.7759/cureus.2417
test


'Lyme disease is a tick-borne illness caused primarily by the spirochete Borrelia burgdorferi. The disease is most prevalent in forested areas endemic for Ixodes tick, which transmits the spirochete. Here, we describe a case of Lyme meningoencephalitis masquerading as normal pressure hydrocephalus (NPH) which initially presented with urinary incontinence, gait instability, and neurological decline. Due to its non-specific symptoms and low incidence, Lyme meningoencephalitis causing NPH like syndrome poses a diagnostic conundrum for clinicians. Awareness of this disease entity is key for prompt diagnosis and treatment.\n\n'

In [419]:
for i in range (scraped.publication.nunique()):
    journal = scraped.publication.value_counts().keys()[i]
    print(journal)
    print(scrapeAbstractsv2(scraped[scraped.publication==journal].iloc[0].doi))

ticks and tick-borne diseases
http://dx.doi.org/10.1016/j.ttbdis.2018.02.011
https://www.sciencedirect.com/science/article/pii/S1877959X17305320

clinical infectious diseases
http://dx.doi.org/10.1093/cid/ciaa854
403 response
plos one
http://dx.doi.org/10.1371/journal.pone.0017414
The internalization of Borrelia burgdorferi, the causative agent of Lyme disease, by phagocytes is essential for an effective activation of the immune response to this pathogen. The intracellular, cytosolic receptor Nod2 has been shown to play varying roles in either enhancing or attenuating inflammation in response to different infectious agents. We examined the role of Nod2 in responses to B. burgdorferi. In vitro stimulation of Nod2 deficient bone marrow derived macrophages (BMDM) resulted in decreased induction of multiple cytokines, interferons and interferon regulated genes compared with wild-type cells. However, B. burgdorferi infection of Nod2 deficient mice resulted in increased rather than decreased

The lone star tick, Amblyomma americanum, is a vector of Ehrlichia chaffeensis and E. ewingii, causal agents of human ehrlichiosis, and has demonstrated marked geographic expansion in recent years. A. americanum ticks often outnumber the vector of Lyme disease, Ixodes scapularis, where both ticks are sympatric, yet cases of Lyme disease far exceed ehrlichiosis cases. We quantified the risk for ehrlichiosis relative to Lyme disease by using relative tick encounter frequencies and infection rates for these 2 species in Monmouth County, New Jersey, USA. Our calculations predict >1 ehrlichiosis case for every 2 Lyme disease cases, >2 orders of magnitude higher than current case rates (e.g., 2 ehrlichiosis versus 439 Lyme disease cases in 2014). This result implies ehrlichiosis is grossly underreported (or misreported) or that many infections are asymptomatic. We recommend expansion of tickborne disease education in the Northeast United States to include human health risks posed by A. ameri


clinical microbiology and infection
http://dx.doi.org/10.1016/j.cmi.2018.11.011
https://clinicalmicrobiologyandinfection.com/retrieve/pii/S1198743X1830738
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Log inESCMID Member LoginNon-Member LoginRegisterSubscribeClaim


Commentary|
                                                    Volume 25, ISSUE 1, P2-3, January 01, 2019PDF [186 KB]PDF [186 KB]SaveAdd To Online LibraryPowered ByMendeleyAdd To My Reading ListExport CitationCreate Citation Alert
ShareShare onEmailTwitterFacebookLinked InSina Weibo
moreReprintsRequestTopBorrelial serology does not contribute to the diagnostic work-up of patients with nonspecific symptomsM. MarkowiczM. MarkowiczCorrespondenceCorresponding author. M. Markowicz, Institute for Hygiene and Applied Immunology, Medical University of Vienna, Kinderspitalgasse 15, 1090 Vienna, Austria.
                            Contact
                        Af



To determine how often Slovenian children with acute peripheral facial palsy are infected with Borrelia burgdorferi sensu lato, 52 patients with peripheral facial palsy were included in this prospective clinical study. According to case definitions, the diagnosis of Lyme borreliosis was established in 56% of those patients. The diagnosis was confirmed in 41%, probable in 28%, and possible in 31% of patients.

Slovenia is a highly endemic region for Lyme borreliosis (LB).1 LB is a tick-borne multisystem infectious disease caused by Borrelia burgdorferi sensu lato.2 Acute peripheral facial palsy (PFP) is a neurologic manifestation of the early disseminated stage of LB.3 In Europe, PFP is detected more often in children than in adults and B. burgdorferi sensu lato is the leading cause of PFP in children.4 Using strict laboratory criteria for diagnosis of LB, borrelial infection is confirmed in 19.3% of Slovenian adult patients with PFP.5
This study was performed to determine how often S

http://captcha.com/java-captcha-info.htm
No abstract found
diagnostic microbiology and infectious disease
http://dx.doi.org/10.1016/j.diagmicrobio.2011.10.003
https://www.sciencedirect.com/science/article/pii/S0732889311004159
Lyme disease transmission to humans by Ixodes ticks is thought to require at least 36–48 h of tick attachment. We describe 3 cases in which transmission of Borrelia burgdorferi, the spirochetal agent of Lyme disease, appears to have occurred in less than 24 h based on the degree of tick engorgement, clinical signs of acute infection, and immunologic evidence of acute Lyme disease. Health care providers and individuals exposed to ticks should be aware that transmission of Lyme disease may occur more rapidly than animal models suggest. A diagnosis of Lyme disease should not be ruled out based on a short tick attachment time in a subject with clinical evidence of B. burgdorferi infection.Previous article in issueNext article in issueKeywordsLyme diseaseBorrelia burg

http://captcha.com/java-captcha-info.htm
No abstract found
option/bio
http://dx.doi.org/10.1016/s0992-5945(11)70873-x
https://www.sciencedirect.com/science/article/pii/S099259451170873X
No abstract found
open forum infectious diseases
http://dx.doi.org/10.1093/ofid/ofx152
403 response
scientific reports
http://dx.doi.org/10.1038/s41598-017-05231-1
Vector-borne pathogens establish systemic infections in host tissues to maximize transmission to arthropod vectors. Co-feeding transmission occurs when the pathogen is transferred between infected and naive vectors that feed in close spatiotemporal proximity on a host that has not yet developed a systemic infection. Borrelia afzelii is a tick-borne spirochete bacterium that causes Lyme borreliosis (LB) and is capable of co-feeding transmission. Whether ticks that acquire LB pathogens via co-feeding are actually infectious to vertebrate hosts has never been tested. We created nymphs that had been experimentally infected as larvae with B. afzel

No abstract found
international journal of general medicine
http://dx.doi.org/10.2147/IJGM.S145134
No abstract found
molecular microbiology
http://dx.doi.org/10.1111/mmi.12390
403 response
journal of the neurological sciences
http://dx.doi.org/10.1016/j.jns.2010.05.007
http://jns-journal.com/retrieve/pii/S0022510X1000208
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Submit ArticleLog inRegisterLog inSubmit ArticleLog inSubscribeClaim


Editorial|
                                                    Volume 295, ISSUE 1-2, P8-9, August 15, 2010PurchaseSubscribeSaveAdd To Online LibraryPowered ByMendeleyAdd To My Reading ListExport CitationCreate Citation Alert
ShareShare onEmailTwitterFacebookLinked InSina Weibo
moreReprintsRequestTopDo we need to broaden the spectrum of Lyme neuroborreliosis?Andrew R. PachnerAndrew R. Pachner
                            Contact
                        AffiliationsDepartment of Neurology a

https://doi.org/10.1126/science.704373

annals of internal medicine
http://dx.doi.org/10.7326/0003-4819-157-3-201208070-01002
https://www.acponline.org
No abstract found
journal of neurology
http://dx.doi.org/10.1007/s00415-015-7891-4
The prognosis and impact of residual symptoms on quality of life in patients with Lyme neuroborreliosis (LNB) is subject to debate. The aim of this study was to assess quality of life, fatigue, depression, cognitive impairment and verbal learning in patients with definite LNB and healthy controls in a case–control study. We retrospectively identified all patients diagnosed with definite LNB between 2003 and 2014 in our tertiary care center. Healthy controls were recruited from the same area. Patients and healthy controls were assessed for quality of life [Short Form (36) with subscores for physical and mental components (PCS, MCS)], fatigue (fatigue severity scale), depression (Beck depression inventory), verbal memory and learning and cognitive impairmen

https://www.sciencedirect.com/science/article/pii/S0022073616300310
The most common manifestation of Lyme carditis is a varying degree of atrioventricular (AV) conduction block. This case describes a 45-year-old male with third-degree AV block due to Lyme carditis. Treatment with intravenous antibiotics resulted in complete normalization of AV conduction, thereby averting permanent pacemaker implantation.Previous article in issueNext article in issueKeywordsLyme diseaseConduction abnormalityRecommended articlesCiting articles (0)View full text© 2016 Elsevier Inc. All rights reserved.Recommended articlesNo articles found.Citing articlesArticle MetricsView article metricsAbout ScienceDirectRemote accessShopping cartAdvertiseContact and supportTerms and conditionsPrivacy policyWe use cookies to help provide and enhance our service and tailor content and ads. By continuing you agree to the use of cookies.Copyright © 2021 Elsevier B.V. or its licensors or contributors. ScienceDirect ® is a 

The article presents current views on the peculiarities of etiology, epidemiology, pathogenesis, clinical manifestations, diagnosis, treatment and prevention of Lyme borreliosis. Unresolved issues of monitoring, diagnosis and antibiotic therapy of the disease are elucidated.


						References
					

Malyiy VP, Kratenko IS, authors; System tick borreliosis (Lyme disease) : uch. Posobie. H.: Folio; 2006. 127 p. Russian. 
Stanek G, Wormser GP, Gray J. Lyme borreliosis. Lancet. 2012; 379: 461473. 
Afzelius A: Verhandlungen der dermatologischen Gesellschaft zu Stockholm, December, 1909. Arch Dermatol Syphil (Berlin). 1910; 101:405-406 
Steere AC, Malawista SE, Snydman DR, Shope RE, Andiman WA, Ross MR. et al. Lyme arthritis: an epidemic of oligoarticular arthritis in children and adults in three connecticut communities. Arthritis Rheum. 1977;20(1):7–17. 
Burgdorfer, W. Discovery of the Lyme disease spirochete and its relation to tick vectors. Yale J. Biol. Med. 1984; 57:518-520. 
Centers f

http://www.valueinhealthjournal.com/retrieve/pii/S1098301513020160
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Submit ArticleLog inRegisterLog inSubmit ArticleLog inSubscribeClaim

Login to your accountEmail/UsernamePasswordShowForgot password?Remember meDon’t have an account?Create a Free Account
If you don't remember your password, you can reset it by entering your email address and clicking the Reset Password button. You will then receive an email that contains a secure link for resetting your passwordEmail*If the address matches a valid account an email will be sent to __email__ with instructions for resetting your passwordCancel

Infection – Clinical Outcomes Studies|
                                                    Volume 16, ISSUE 7, PA340-A341, November 01, 2013PDF [78 KB]PDF [78 KB]SaveAdd To Online LibraryPowered ByMendeleyAdd To My Reading ListExport CitationCreate Citation Alert
ShareShare onEmailTwitte

https://www.nature.com/articles/nrrheum.2010.7
No abstract found
qjm
http://dx.doi.org/10.1093/qjmed/hcs227
403 response
wiener klinische wochenschrift
http://dx.doi.org/10.1007/s00508-014-0622-5
https://link.springer.com/article/10.1007/s00508-014-0622-
No abstract found
journal of the american academy of dermatology
http://dx.doi.org/10.1016/j.jaad.2016.07.044
https://jaad.org/retrieve/pii/S019096221630595
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Log inAAD Member LoginNon-Member LoginRegisterSubscribeClaim


Short communication|
                                                    Volume 76, ISSUE 2, SUPPLEMENT 1, S64-S65, February 01, 2017PurchaseSubscribeSaveAdd To Online LibraryPowered ByMendeleyAdd To My Reading ListExport CitationCreate Citation Alert
ShareShare onEmailTwitterFacebookLinked InSina Weibo
moreReprintsRequestTopThe anal groove sign: The use of dermatoscopy for identification of Ixodes ticksDeird

https://www.sciencedirect.com/science/article/pii/S0264410X20309919
No abstract found
brain, behavior, and immunity
http://dx.doi.org/10.1016/j.bbi.2010.04.009
https://www.sciencedirect.com/science/article/pii/S088915911000098X
No abstract found
international journal of environmental research and public health
http://dx.doi.org/10.3390/ijerph15051048


case reports
http://dx.doi.org/10.1136/bcr.02.2011.3833
No abstract found
travel medicine and infectious disease
http://dx.doi.org/10.1016/j.tmaid.2017.01.005
https://www.sciencedirect.com/science/article/pii/S1477893917300054
No abstract found
journal of the pediatric infectious diseases society
http://dx.doi.org/10.1093/jpids/piy083
403 response
journal of antimicrobial chemotherapy
http://dx.doi.org/10.1093/jac/dkq214
403 response
american journal of case reports
http://dx.doi.org/10.12659/ajcr.899745
No abstract found
infection
http://dx.doi.org/10.1007/s15010-010-0062-8

bmc neurology
http://dx.doi.org/10.1186/1471-2377-10-117

anna

No abstract found
infection and drug resistance
http://dx.doi.org/10.2147/idr.s15653
No abstract found
heartrhythm case reports
http://dx.doi.org/10.1016/j.hrcr.2018.09.001
https://heartrhythmcasereports.com/retrieve/pii/S221402711830254
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Log inRegisterLog in

Login to your accountEmail/UsernamePasswordShowForgot password?Remember meDon’t have an account?Create a Free Account
If you don't remember your password, you can reset it by entering your email address and clicking the Reset Password button. You will then receive an email that contains a secure link for resetting your passwordEmail*If the address matches a valid account an email will be sent to __email__ with instructions for resetting your passwordCancel

Case report|
                                                    Volume 4, ISSUE 12, P584-588, December 01, 2018PDF [1 MB]PDF [1 MB]FiguresFigure ViewerDownload Figu

No abstract found
idcases
http://dx.doi.org/10.1016/j.idcr.2018.04.004
https://www.sciencedirect.com/science/article/pii/S2214250918300659
We present the case of a 10-year old patient from southeastern Ontario with severe bilateral facial palsy. MRI was performed that showed extensive symmetric enhancement of cervical cranial nerve roots and multiple cranial nerves (III, V, VI, VII, VIII, X and XII). Lumbar puncture was performed that revealed pleocytosis and elevated proteins in the cerebrospinal fluid. Serology confirmed the diagnosis of neuroborreliosis. The patient was treated with a 4-week course of IV ceftriaxone, following which he returned to baseline.Previous article in issueNext article in issueKeywordsLyme diseaseNeuroborreliosisFacial nerve palsyCranial neuropathiesRecommended articlesCiting articles (0)☆Consent for this publication was obtained from the patient and his parents.© 2018 The Authors. Published by Elsevier Ltd.Recommended articlesNo articles found.Citing articl

https://jidonline.org/retrieve/pii/S0022202X1537320
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Log inRegisterLog inSubscribeClaim


Letter to the Editor|
                                                    Volume 135, ISSUE 7, P1903-1905, July 01, 2015PDF [160 KB]PDF [160 KB]FiguresFigure ViewerDownload Figures (PPT)SaveAdd To Online LibraryPowered ByMendeleyAdd To My Reading ListExport CitationCreate Citation Alert
ShareShare onEmailTwitterFacebookLinked InSina Weibo
moreReprintsRequestTopGeographical and Temporal Correlations in the Incidence of Lyme Disease, RMSF, Ehrlichiosis, and Coccidioidomycosis with Search DataVladimir RatushnyVladimir Ratushny
                            Contact
                        AffiliationsDepartment of Dermatology, Massachusetts General Hospital, Boston, Massachusetts, USASearch for articles by this authorGideon P. SmithGideon P. SmithAffiliationsDepartment of Dermatology, Massachu

No abstract found
mayo clinic proceedings
http://dx.doi.org/10.4065/mcp.2008.0728
https://www.mayoclinicproceedings.org/retrieve/pii/S002561961160440
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Log inRegisterLog inSubscribeClaim

Login to your accountEmail/UsernamePasswordShowForgot password?Remember meDon’t have an account?Create a Free Account
If you don't remember your password, you can reset it by entering your email address and clicking the Reset Password button. You will then receive an email that contains a secure link for resetting your passwordEmail*If the address matches a valid account an email will be sent to __email__ with instructions for resetting your passwordCancel

e-RESIDENTS' CLINIC|
                                                    Volume 85, ISSUE 4, e13-e16, April 01, 2010PDF [85 KB]PDF [85 KB]SaveAdd To Online LibraryPowered ByMendeleyAdd To My Reading ListExport CitationCreate Citation Alert


biomedical journal of scientific & technical research
http://dx.doi.org/10.26717/bjstr.2018.11.002121


journal of alzheimer's disease
http://dx.doi.org/10.3233/jad-140552

ophthalmology
http://dx.doi.org/10.1016/j.ophtha.2013.12.036
https://aaojournal.org/retrieve/pii/S016164201301257
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Log inRegisterLog inSubscribeAccess Subscription
Login to your accountEmail/UsernamePasswordShowForgot password?Remember meDon’t have an account?Create a Free Account
If you don't remember your password, you can reset it by entering your email address and clicking the Reset Password button. You will then receive an email that contains a secure link for resetting your passwordEmail*If the address matches a valid account an email will be sent to __email__ with instructions for resetting your passwordCancel


Report|
                                                    Volume 121, ISSUE 6, P1311-

https://annemergmed.com/retrieve/pii/S019606441731504
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Log inACEP Member LoginNon-Member LoginRegisterSubscribeClaim


Neurology/expert clinical management|
                                                    Volume 71, ISSUE 5, P618-624, May 01, 2018PDF [937 KB]PDF [937 KB]FiguresFigure ViewerDownload Figures (PPT)SaveAdd To Online LibraryPowered ByMendeleyAdd To My Reading ListExport CitationCreate Citation Alert
ShareShare onEmailTwitterFacebookLinked InSina Weibo
moreReprintsRequestTopManaging Peripheral Facial PalsyAris Garro, MD, MPH Aris GarroCorrespondenceCorresponding Author.
                            Contact
                        AffiliationsDivision of Pediatric Emergency Medicine, Department of Emergency Medicine, Warren Alpert Medical School of Brown University and Rhode Island Hospital, Providence, RISearch for articles by this authorLise E. Nigrovic, MD, MP

https://npjournal.org/retrieve/pii/S155541551300260
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
AANP Member Login | ACNP Member Login |
Submit ArticleLog inRegisterLog inAANP Member Login | ACNP Member Login |
Submit ArticleLog inSubscribeClaim
Login to your accountEmail/UsernamePasswordShowForgot password?Remember meDon’t have an account?Create a Free Account
If you don't remember your password, you can reset it by entering your email address and clicking the Reset Password button. You will then receive an email that contains a secure link for resetting your passwordEmail*If the address matches a valid account an email will be sent to __email__ with instructions for resetting your passwordCancel


Feature Article|
                                                    Volume 9, ISSUE 6, P362-367, June 01, 2013PurchaseSubscribeSaveAdd To Online LibraryPowered ByMendeleyAdd To My Reading ListExport CitationCreate Citation

https://www.sciencedirect.com/science/article/pii/S000989811830038X
A main focus of human health studies is the early detection of infectious diseases to enable more rapid treatment and prevent disease transmission. Diagnosis of Lyme borreliosis has been always challenging because of the lack of specific, but simple assay formats. Two-tiered testing has been recommended by US Centers for Disease Control and Prevention to provide more specific results for diagnosis of Lyme disease. However, such a technique is time consuming and is not well suited for early stage detection. Therefore, many tests were proposed as alternatives to overcome these drawbacks. Simple assays, which are mainly performed in one-tier manner, could be conducted with better performance than the two-tiered testing. Proposed assays utilize both newly identified antigens and new platforms to improve detection performance. These assays can be classified into those based on employing a single antigen and assays based on 

No abstract found
american journal of gastroenterology
http://dx.doi.org/10.14309/00000434-201610001-02553
https://journals.lww.com/ajg/pages/printerfriendly.aspx
No response
chemical & engineering news archive
http://dx.doi.org/10.1021/cen-09134-notw6
The Lyme disease epidemic in the U.S. is worse—much worse—than doctors and public health officials have feared. Last year, the Centers for Disease Control & Prevention reported some 30,000 cases of Lyme disease. That number has exploded to an estimated 300,000 cases of Lyme annually, the agency reported last week. It’s not that the disease suddenly spread, but that it’s been undercounted in the past. Scientists have long suspected that Lyme disease, the number one vector-borne illness in the U.S., is significantly underreported. “We know that routine surveillance only gives us part of the picture and that the true number of illnesses is much greater,” says Paul S. Mead, chief of epidemiology and surveillance activity for CDC’s Lyme disea

403 response
the journal of pediatrics
http://dx.doi.org/10.1016/j.jpeds.2010.10.024
https://jpeds.com/retrieve/pii/S002234761000915
Article Title, Abstract, KeywordsAdvanced SearchSave searchPlease enter a term before submitting your search. Ok
Log inRegisterLog inSubscribeClaim


The Editors' Perspective|
                                                    Volume 157, ISSUE 6, PA1-A2, December 01, 2010PDF [174 KB]PDF [174 KB]SaveAdd To Online LibraryPowered ByMendeleyAdd To My Reading ListExport CitationCreate Citation Alert
ShareShare onEmailTwitterFacebookLinked InSina Weibo
moreReprintsRequestTopChronic Lyme diseaseSarah S. Long, MD Sarah S. LongSearch for articles by this author DOI:https://doi.org/10.1016/j.jpeds.2010.10.024Chronic Lyme diseasePrevious ArticleInduction of sputum has value in the care of children with cystic fibrosisNext ArticleNew treatment approaches for Kawasaki disease
















In this issue of The Journal, Johnson and Feder tested the hypothesis propo

Researchers at Alfred I duPont Hospital for Children, Wilmington, DE, and other Centers in Lyme endemic areas determined the frequency and type of all treatment complications at return visits within 30 days of an initial Lyme meningitis diagnosis.
                    
                
                
                
                    

Keywords: 

Antibiotic Therapy,
Peripherally Inserted Central Catheter,
Lyme Meningitis Diagnosis




How to Cite: 

                            
                                Millichap, J.G., 2012. Treatment Complications of Lyme Meningitis.  Pediatric Neurology Briefs,  26(11), pp.83–83. DOI: http://doi.org/10.15844/pedneurbriefs-26-11-4




5
Downloads




 
                        Published on
                        01 Nov 2012
                    

 Peer Reviewed




 CC BY 4.0
                                    





Researchers at Alfred I duPont Hospital for Children, Wilmington, DE, and other Centers in Lyme endemic areas determined the 

403 response
infectious diseases
http://dx.doi.org/10.3109/00365548.2014.961544


KeyboardInterrupt: 