<h1>Introduction to the Crossref REST API</h1>
<h3>by Levi Dolan, Indiana University</h3> 

About: This notebook demonstrates the basics of querying the Crossref REST API with the Python library 
<a href="https://github.com/fabiobatalha/crossrefapi" target="_blank">crossrefapi</a>, maintained by Fabio Batalha.

About Crossref: Containing over 106 million records and expanding at an average rate of 11% a year, Crossref’s metadata has become one of the major sources of scholarly data for publishers, authors, librarians, funders, and researchers. The metadata set consists of 13 content types, including not only traditional types, such as journals and conference papers, but also data sets, reports, preprints, peer reviews, and grants. The metadata is not limited to basic publication metadata, but can also include abstracts and links to full text, funding and license information, citation links, and the information about corrections, updates, retractions, etc. This scale and breadth make Crossref a valuable source for research in scientometrics, including measuring the growth and impact of science and understanding new trends in scholarly communications. The metadata is available through a number of APIs, including REST API and OAI-PMH.

Ginny Hendricks, Dominika Tkaczyk, Jennifer Lin, Patricia Feeney; Crossref: The sustainable source of community-owned scholarly metadata. Quantitative Science Studies 2020; 1 (1): 414–427. doi: https://doi.org/10.1162/qss_a_00022


In [1]:
# Introduce Works: look up which registration agency issued a DOI

from crossref.restful import Works

works = Works()

works.agency('10.1590/0102-311x00133115')


{'DOI': '10.1590/0102-311x00133115',
 'agency': {'id': 'crossref', 'label': 'Crossref'}}
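DOIs are often quoted as resolver URLs (as in the citation above) or with a `doi:` prefix, while the call above passes the bare DOI. The following is a minimal sketch of a normalizing helper; `bare_doi` is a hypothetical name, not part of the library, and whether the library also accepts the URL form is not checked here.

```python
def bare_doi(value):
    """Return a DOI string without a resolver URL or 'doi:' prefix."""
    text = value.strip()
    lowered = text.lower()
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if lowered.startswith(prefix):
            # Drop the prefix and any whitespace that followed it
            return text[len(prefix):].strip()
    return text


print(bare_doi("https://doi.org/10.1162/qss_a_00022"))
```

The result can then be passed to a call such as `works.agency(...)`.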

In [2]:
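# Polite requests: identify the application and give a contact address

from crossref.restful import Works, Etiquette

# Arguments: application name, version, project URL, contact email
my_etiquette = Etiquette('IU RLML instruction test', '0.1alpha', 'https://github.com/firbolg/crossref_py', 'dolanl@iu.edu')

# str() gives the identification string; nothing is displayed here because it is not the cell's last expression
str(my_etiquette)

works = Works(etiquette=my_etiquette)

# Print the DOIs of five randomly sampled records
for i in works.sample(5).select('DOI'):
    print(i)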

{'DOI': '10.17545/eoftalmo/2018.0012'}
{'DOI': '10.1007/978-3-030-67396-3_6'}
{'DOI': '10.1017/9781316822883.004'}
{'DOI': '10.1109/lawp.2014.2327092'}
{'DOI': '10.1093/oxfordjournals.jhered.a102253'}
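A polite client identifies itself so that Crossref can reach the operator if a script misbehaves. The sketch below builds the kind of `User-Agent` value this implies, assuming the convention of an application name and version followed by a project URL and a `mailto:` address. `polite_user_agent` is a hypothetical helper, and the exact string the library sends may differ.

```python
def polite_user_agent(app_name, app_version, app_url, contact_email):
    """Build an identification string with a mailto: contact (assumed format)."""
    return f"{app_name}/{app_version} ({app_url}; mailto:{contact_email})"


# Same values as the Etiquette object in the cell above
headers = {
    "User-Agent": polite_user_agent(
        "IU RLML instruction test",
        "0.1alpha",
        "https://github.com/firbolg/crossref_py",
        "dolanl@iu.edu",
    )
}
print(headers["User-Agent"])
```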


In [3]:
# Example: find a known article by combining field queries
# Singh, H., et al. "Cutaneous Manifestations of COVID-19: A Systematic Review."
# Advances in Wound Care, vol. 10, no. 2, 2021, pp. 51-80. SCOPUS, www.scopus.com, doi:10.1089/wound.2020.1309.

w1 = works.query(bibliographic='covid', author='singh', publisher_name='liebert')

# Each title is returned as a list of strings
for item in w1:
    print(item['title'])

['The New Normal for Bioplastics Amid the COVID-19 Pandemic']
['Cutaneous Manifestations of COVID-19: A Systematic Review']
['Monthly Trends in Access to Care and Mental Health Services by Household Income Level During the COVID-19 Pandemic, United States, April: December 2020']
['Dysfunctional State of T Cells or Exhaustion During Chronic Viral Infections and COVID-19: A Review']
['Re: “Postinfectious Immunity After COVID-19 and Vaccination Against SARS-CoV-2” by Krsak <i>et al</i>.']
['Is Low Alveolar Type II Cell<i>SOD3</i>in the Lungs of Elderly Linked to the Observed Severity of COVID-19?']
['Effect of COVID-19–Related Lockdown on Intimate Partner Violence in India: An Online Survey-Based Study']
['Telecritical Care Clinical and Operational Strategies in Response to COVID-19']
['Global Open Health Data Cooperatives Cloud in an Era of COVID-19 and Planetary Health']
['News Article Portrayal of Virtual Care for Health Care Delivery in the First 7 Months of the COVID-19 Pandemic']
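The titles above come back as one-element lists, and some carry inline markup such as `<i>`. The sketch below flattens a title and strips the tags so the target article can be picked out by a substring match. `clean_title` is a hypothetical helper, and the sample records are copied from the output above rather than fetched. Removing tags outright can join words where the source had no space around the markup.

```python
import re

TAG = re.compile(r"<[^>]+>")


def clean_title(record):
    """Join the title list into one string and remove inline markup tags."""
    parts = record.get("title") or []
    text = TAG.sub("", " ".join(parts))
    # Collapse any runs of whitespace left behind
    return " ".join(text.split())


sample = [
    {"title": ["The New Normal for Bioplastics Amid the COVID-19 Pandemic"]},
    {"title": ["Cutaneous Manifestations of COVID-19: A Systematic Review"]},
    {"title": ["Re: “Postinfectious Immunity After COVID-19 and Vaccination Against SARS-CoV-2” by Krsak <i>et al</i>."]},
]

matches = [clean_title(r) for r in sample if "cutaneous" in clean_title(r).lower()]
print(matches)
```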


In [4]:
# Example
# Check a journal's record to see how many DOIs it has issued, a rough indicator of its size and specialty
# Look up the ISSN for Advances in Wound Care at https://portal.issn.org/
# https://portal.issn.org/resource/ISSN/2162-1918

from crossref.restful import Journals

journals = Journals()

journals.journal('2162-1918')

{'last-status-check-time': 1655265945295,
 'counts': {'current-dois': 180, 'backfile-dois': 460, 'total-dois': 640},
 'breakdowns': {'dois-by-issued-year': [[2014, 77],
   [2021, 69],
   [2015, 69],
   [2019, 66],
   [2013, 65],
   [2020, 57],
   [2022, 54],
   [2016, 51],
   [2012, 46],
   [2018, 44],
   [2017, 42]]},
 'publisher': 'Mary Ann Liebert',
 'coverage': {'affiliations-current': 0.9944444444444444,
  'similarity-checking-current': 1.0,
  'descriptions-current': 0.0,
  'ror-ids-current': 0.0,
  'funders-backfile': 0.0,
  'licenses-backfile': 0.2826086956521739,
  'funders-current': 0.0,
  'affiliations-backfile': 0.9978260869565218,
  'resource-links-backfile': 0.291304347826087,
  'orcids-backfile': 0.0,
  'update-policies-current': 0.0,
  'open-references-backfile': 1.0,
  'ror-ids-backfile': 0.0,
  'orcids-current': 0.3388888888888889,
  'similarity-checking-backfile': 1.0,
  'references-backfile': 0.9869565217391304,
  'descriptions-backfile': 0.0,
  'award-numbers-backfi
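The yearly breakdown above is ordered by count rather than by year. A minimal sketch of reordering it chronologically and checking it against the totals follows; the dictionary is copied from the output shown above (only the `counts` and `breakdowns` keys), not fetched live.

```python
journal = {
    "counts": {"current-dois": 180, "backfile-dois": 460, "total-dois": 640},
    "breakdowns": {
        "dois-by-issued-year": [
            [2014, 77], [2021, 69], [2015, 69], [2019, 66], [2013, 65],
            [2020, 57], [2022, 54], [2016, 51], [2012, 46], [2018, 44],
            [2017, 42],
        ]
    },
}

# Sort the [year, count] pairs by year for easier reading
by_year = sorted(journal["breakdowns"]["dois-by-issued-year"])
for year, n in by_year:
    print(year, n)

# The per-year counts should add up to the reported total
yearly_total = sum(n for _, n in by_year)
print(yearly_total == journal["counts"]["total-dois"])
```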

In [5]:
#Example
# Count all records whose metadata matches a given keyword

works.query('covid').count()

441749

In [6]:
# Limit to records published online from 2022 onward (the current year when this notebook was run)

works.query('covid').filter(from_online_pub_date='2022').count()

53865
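It can help to see what a query-plus-filter chain corresponds to as a plain REST request. The sketch below assembles such a URL, assuming the public endpoint `https://api.crossref.org/works` and the `query`, `filter`, and `rows` parameters described in Crossref's REST API documentation. `works_url` is a hypothetical helper, and the request the library actually sends may be built differently.

```python
from urllib.parse import urlencode

BASE = "https://api.crossref.org/works"  # assumed public endpoint


def works_url(query=None, filters=None, rows=None):
    """Assemble a /works URL from a free-text query and name:value filters."""
    params = {}
    if query:
        params["query"] = query
    if filters:
        # Python-style names use underscores; the REST filter names use hyphens
        params["filter"] = ",".join(
            f"{name.replace('_', '-')}:{value}" for name, value in filters.items()
        )
    if rows is not None:
        params["rows"] = rows
    return f"{BASE}?{urlencode(params)}"


# rows=0 asks for the summary only, which is enough for a count
url = works_url("covid", {"from_online_pub_date": "2022"}, rows=0)
print(url)
```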

In [7]:
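# Count records with "IUSM" in an affiliation whose metadata was deposited in 2022 or later
# Note: the deposit date is when the metadata was last deposited with Crossref, not the publication date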

works.query(affiliation="IUSM").filter(from_deposit_date='2022').count()

12

In [8]:
# Print the full metadata for these records
# (older publications can appear because the filter is on deposit date, not publication date)

iusm_articles = works.query(affiliation="IUSM").filter(from_deposit_date='2022')

for item in iusm_articles:
    print(item)

{'indexed': {'date-parts': [[2022, 6, 11]], 'date-time': '2022-06-11T09:50:01Z', 'timestamp': 1654941001451}, 'reference-count': 51, 'publisher': 'American Association for Cancer Research (AACR)', 'issue': '8', 'content-domain': {'domain': ['aacrjournals.org'], 'crossmark-restriction': True}, 'published-print': {'date-parts': [[2011, 4, 15]]}, 'abstract': '<jats:title>Abstract</jats:title>\n               <jats:p>Purpose: Preclinical in vivo studies can help guide the selection of agents and regimens for clinical testing. However, one of the challenges in screening anticancer therapies is the assessment of off-target human toxicity. There is a need for in vivo models that can simulate efficacy and toxicities of promising therapeutic regimens. For example, hematopoietic cells of human origin are particularly sensitive to a variety of chemotherapeutic regimens, but in vivo models to assess potential toxicities have not been developed. In this study, a xenograft model containing humanized
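Full records are long, so a short summary per record is often easier to scan. The sketch below pulls a few fields out of a record dictionary; `summarize` is a hypothetical helper, and the sample contains only keys visible in the truncated output above, so its DOI and title are absent.

```python
def summarize(record):
    """Return a few headline fields from a works record, tolerating missing keys."""
    date_parts = record.get("published-print", {}).get("date-parts") or [[None]]
    year = date_parts[0][0] if date_parts[0] else None
    return {
        "DOI": record.get("DOI"),
        "title": " ".join(record.get("title") or []),
        "publisher": record.get("publisher"),
        "print_year": year,
    }


# Keys and values copied from the truncated record printed above
sample = {
    "reference-count": 51,
    "publisher": "American Association for Cancer Research (AACR)",
    "issue": "8",
    "published-print": {"date-parts": [[2011, 4, 15]]},
}

print(summarize(sample))
```

In the loop above, `print(summarize(item))` could replace `print(item)`.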

In [25]:
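# We can count records matching a full affiliation name, filtered by online publication date.
# The count is large because a multi-word query appears to match on its individual terms rather than the exact phrase: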

works.query(affiliation='Indiana University School of Medicine').filter(from_online_pub_date='2022').count()

340596

In [24]:
# Dates can also be given to the month (YYYY-MM) or day (YYYY-MM-DD); here the deposit date is limited to January 2022:

works.query(affiliation='Riley Hospital for Children').filter(from_deposit_date='2022-01',until_deposit_date='2022-01').count()

36024
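When repeating the month-limited filter for several months, the paired `from_` and `until_` arguments can be generated rather than typed. A minimal sketch follows; `month_filter` is a hypothetical helper that only builds the keyword arguments used above, and it does not contact the API.

```python
def month_filter(year, month, field="deposit"):
    """Build from_/until_ keyword arguments that limit a date filter to one month."""
    stamp = f"{year:04d}-{month:02d}"
    return {f"from_{field}_date": stamp, f"until_{field}_date": stamp}


january = month_filter(2022, 1)
print(january)

# Intended use with the library (not executed here):
# works.query(affiliation='Riley Hospital for Children').filter(**january).count()
```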

In [31]:
# Deposit date and online publication date are separate fields, so the same affiliation query gives a different count under each filter:

works.query(affiliation='Riley Hospital for Children').filter(from_online_pub_date='2022').count()



61958