# ADS Independent citation counter

In this notebook, I query ADS for independent citations to all of my peer-reviewed publications. For each refereed paper, I ask ADS for the citing papers that share no author with it. Authors are matched on last name and first initial (as ADS does), and ORCID iDs serve as a cross-check whenever they are available.

In [1]:
! date

Sun Jul 11 00:39:11 EDT 2021


In [2]:
import requests
from collections import Counter

#### Enter your API key below. 

To obtain an API key, you need to be logged in to your ADS account. Sign up for one with your email if you have never logged in. Go to **Account** > **Settings** > **API token**, copy your unique 40-character API key, and assign it to the `token` variable.

In [3]:
## Configuration
token="yOuR-kEy-HeRe"
headers={'Authorization': 'Bearer ' + token}

check_orcid = True
verbose = False  # Set it to True if you want lengthy outputs
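
If you would rather not paste the key into the notebook, one option is to read it from an environment variable. This is only a sketch: the variable name `ADS_API_TOKEN` and the helpers `load_token` and `make_headers` are my own illustrative choices, not something ADS prescribes.

```python
import os

def load_token(env_var="ADS_API_TOKEN", placeholder="yOuR-kEy-HeRe"):
    """Return the ADS token from the environment, or the placeholder if unset."""
    # env_var is a hypothetical name; use whichever variable you export in your shell
    return os.environ.get(env_var, placeholder)

def make_headers(token):
    """Build the Authorization header in the same form as the configuration cell."""
    return {'Authorization': 'Bearer ' + token}

token = load_token()
headers = make_headers(token)
```

With this in place, the key stays out of the saved notebook, and the placeholder string makes it obvious when the variable has not been set.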

In [4]:
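## Define a convenience function to standardize the names
def standardize_names(name):
    """Return 'Lastname,+I' for an ADS-style 'Lastname, Firstname' string."""
    try:
        # Use the argument itself, not a variable from the enclosing scope
        lastname, firstname = name.split(',')
        initial = firstname.strip()[0]
        return f"{lastname},+{initial}"
    except Exception:
        # This usually happens for collaboration papers
        print(f"Cannot split '{name}' into first and last names!")
        return name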
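
To make the expected behaviour concrete, here is a small standalone sketch of the same idea. The function name `to_ads_author` is hypothetical, the personal names are made up for illustration, and unlike the cell above it returns unsplittable names silently instead of printing a message.

```python
def to_ads_author(name):
    """Convert 'Lastname, Firstname' to 'Lastname,+F'; return the input unchanged otherwise."""
    parts = name.split(',')
    if len(parts) != 2 or not parts[1].strip():
        # e.g. collaboration names, which carry no comma
        return name
    lastname, firstname = parts
    return f"{lastname.strip()},+{firstname.strip()[0]}"

# 'Doe' names are invented examples; 'Euclid Collaboration' has no comma to split on
examples = ["Doe, Jane", "Doe, J. Q.", "Euclid Collaboration"]
for example in examples:
    print(example, "->", to_ads_author(example))
```

Both "Doe" entries collapse to the same key, `Doe,+J`, which is exactly why matching on last name and initial can merge distinct people and why the ORCID cross-check is worth having.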

First, query ADS for all refereed papers authored by `kannawad*`.

In [5]:
%%time
author='kannawad%2A' #kannawad*

base_url = "https://api.adsabs.harvard.edu/v1/search/query?"
fq = "property%3Arefereed+property%3Aarticle"
fl = "author,title,bibcode,citation_count,orcid_pub,orcid_user,orcid_other"
rows = 500

# the query parameters can be included as part of the URL
url = f"{base_url}q=author%3A{author}&fq={fq}&fl={fl}&rows={rows}"
print(f"URL = {url}\n")
r = requests.get(url, headers=headers)
# the requests package returns an object; to get just the JSON API response, you have to specify this
js = r.json()
publications = js['response']['docs']

URL = https://api.adsabs.harvard.edu/v1/search/query?q=author%3Akannawad%2A&fq=property%3Arefereed+property%3Aarticle&fl=author,title,bibcode,citation_count,orcid_pub,orcid_user,orcid_other&rows=500

CPU times: user 27.4 ms, sys: 8.99 ms, total: 36.4 ms
Wall time: 1.39 s
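
The URL above is percent-encoded by hand. As an alternative sketch, the standard library's `urllib.parse.urlencode` can produce the encoding from readable strings. The `params` dictionary and the `build_search_url` helper are illustrative names that do not appear elsewhere in this notebook, and no request is sent here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.adsabs.harvard.edu/v1/search/query?"

def build_search_url(params, base_url=BASE_URL):
    """Percent-encode the query parameters and append them to the base URL."""
    return base_url + urlencode(params)

params = {
    "q": "author:kannawad*",
    "fq": "property:refereed property:article",
    "fl": "author,title,bibcode,citation_count,orcid_pub,orcid_user,orcid_other",
    "rows": 500,
}
search_url = build_search_url(params)
print(search_url)
```

The only visible difference from the hand-written URL is that the commas in `fl` come out as `%2C`, which decodes to the same field list. Passing the same dictionary as `requests.get(base_url, params=params, headers=headers)` should achieve a similar result without building the string yourself.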


In [6]:
## Print the complete list of queried papers
print("Complete list of queried papers")
print("\t   Bibcode       #Cites \t Title\n")
citation_counts = 0
for paper in publications:
    print(paper['bibcode'], '\t', paper['citation_count'], '\t', paper['title'][0])
    citation_counts += paper['citation_count']
print("\nTotal number of citations = ", citation_counts)

Complete list of queried papers
	   Bibcode       #Cites 	 Title

2020A&A...633A..69H 	 186 	 KiDS+VIKING-450: Cosmic shear tomography with optical and infrared data
2014ApJS..212....5M 	 118 	 The Third Gravitational Lensing Accuracy Testing (GREAT3) Challenge Handbook
2021A&A...646A.140H 	 118 	 KiDS-1000 Cosmology: Multi-probe weak gravitational lensing and spectroscopic galaxy clustering constraints
2020A&A...638L...1J 	 90 	 KiDS+VIKING-450 and DES-Y1 combined: Cosmology with cosmic shear
2021A&A...645A.104A 	 97 	 KiDS-1000 cosmology: Cosmic shear constraints and comparison between two point statistics
2020A&A...633L..10T 	 54 	 Cosmology from large-scale structure. Constraining ΛCDM with BOSS
2020A&A...634A.127A 	 57 	 KiDS+VIKING-450 and DES-Y1 combined: Mitigating baryon feedback uncertainty with COSEBIs
2019A&A...624A..92K 	 40 	 Towards emulating cosmic shear data: revisiting the calibration of the shear measurements for the Kilo-Degree Survey
2018MNRAS.481.1337H 	 27 	 Cosm

In [7]:
## Ensure that the ORCID iDs are consistent and create a single orcid key per author
for paper in publications:
    nauth = len(paper['author'])
    # Any of the three ORCID fields may be missing; fall back to '-' placeholders
    orcid_pub = paper.get('orcid_pub', ['-']*nauth)
    orcid_user = paper.get('orcid_user', ['-']*nauth)
    orcid_other = paper.get('orcid_other', ['-']*nauth)
    paper['orcid'] = []
    for aid in range(nauth):
        orcid = [orc for orc in (orcid_pub[aid], orcid_user[aid], orcid_other[aid]) if orc != '-']
        if len(orcid):
            # Every source that reports an ORCID iD for this author must agree
            assert all(orc == orcid[0] for orc in orcid), f"Conflicting ORCID iDs for {paper['author'][aid]}: {orcid}"
            paper['orcid'].append(orcid[0])
        else:
            paper['orcid'].append('-')
        if verbose:
            print(paper['author'][aid], orcid_pub[aid], orcid_user[aid], orcid_other[aid])
            print(paper['author'][aid], paper['orcid'][aid])
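
The merging logic in this cell can be tried out in isolation. Below is a minimal sketch, assuming the three ORCID fields are equal-length lists with `'-'` marking a missing iD. The helper `merge_orcids` is a hypothetical name and the iDs are made-up placeholders, not real ORCID records.

```python
def merge_orcids(orcid_pub, orcid_user, orcid_other, missing='-'):
    """Merge three per-author ORCID lists into one and report disagreements."""
    merged, conflicts = [], []
    for idx, candidates in enumerate(zip(orcid_pub, orcid_user, orcid_other)):
        known = {orc for orc in candidates if orc != missing}
        if len(known) > 1:
            # More than one distinct iD was reported for the same author slot
            conflicts.append(idx)
        # Keep the first non-missing value, in pub, user, other order
        merged.append(next((orc for orc in candidates if orc != missing), missing))
    return merged, conflicts

# Placeholder iDs for three authors; the third author has conflicting entries
pub   = ['0000-0000-0000-0001', '-', '-']
user  = ['-', '-', '0000-0000-0000-0003']
other = ['0000-0000-0000-0001', '-', '0000-0000-0000-0004']
merged, conflicts = merge_orcids(pub, user, other)
print(merged)
print(conflicts)
```

Returning the conflicting indices, rather than stopping at the first one, makes it easier to inspect all problematic authors in one pass.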


This would have been a good opportunity to use a mini database, with a Publications table and an Authors table. For my purpose, however, that seemed like overkill, so I use basic Python structures such as `dict` and `list`.

In [8]:
## Obtain the complete list of co-authors (including self), along with their unique ORCID iDs.
coauthors = {}
for paper in publications:
    for auth, orc in zip(paper['author'], paper['orcid']):
        std_name = standardize_names(auth)
        coauthors[std_name] = orc if orc!='-' else coauthors.get(std_name, None)
if verbose:
    print("\nName of the co-authors \t\t ORCID iD \n ")
    for coauth in sorted(coauthors.keys()):
        print(coauth, '\t\t', coauthors[coauth])

Cannot split 'Euclid Collaboration' into first and last names!
Cannot split 'Euclid Collaboration' into first and last names!


Unfortunately, many of my co-authors have not provided their ORCID iD. We will have to find common authors by name. ORCID iD will only serve as further validation where applicable.

In [9]:
print("The following authors have the same last name:")
cntr = Counter([name.split(',')[0] for name in coauthors.keys()])
for coauth in coauthors:
    if cntr[coauth.split(',')[0]] > 1:
        print(coauth, '\t', coauthors[coauth])

The following authors have the same last name:
Choi,+A 	 None
Taylor,+A 	 None
Taylor,+E 	 0000-0002-5522-9107
Choi,+S 	 None


Loop over the publications above. For each paper, query ADS for the citing papers that exclude all of its authors, and store the resulting count on the paper's record as `independent_citation_count`.

In [10]:
%%time
fq = "property%3Aarticle"
fl = "author,title,bibcode"
rows = 500
for eid, entity in enumerate(js['response']['docs']):
    independent_authors, independent_orcid = [], []
    bibcode = entity['bibcode'].replace('&','%26')
    condition = ''
    for auth in entity['author']:
        name = standardize_names(auth)
        if name is None: continue
        condition+= f"+-author:%22{name}%22"
    url = f"{base_url}q=citations(bibcode%3A{bibcode}){condition}&fq={fq}&fl={fl}&rows={rows}"
    r = requests.get(url, headers=headers)
    jstmp = r.json()
    # Standardize each author of each citing paper ('author' is a list of names)
    independent_authors += [standardize_names(a) for paper in jstmp['response']['docs'] for a in paper.get('author', [])]
    nauth = len(entity['author'])
    if check_orcid:
        # Get all coauthor orcid ids
        coauthor_orcid = entity.get('orcid_pub',['-'])+entity.get('orcid_user',['-'])+entity.get('orcid_other',['-'])
        for cite in jstmp['response']['docs']:
            independent_orcid += cite.get('orcid_user',['-'])+cite.get('orcid_pub',['-'])+cite.get('orcid_other',['-'])

        independent_orcid = set(independent_orcid)
        coauthor_orcid = set(coauthor_orcid)
        assert(independent_orcid.intersection(coauthor_orcid).issubset({'-'}))

    js['response']['docs'][eid]['independent_citation_count'] = jstmp['response']['numFound']

Cannot split 'Euclid Collaboration' into first and last names!
Cannot split 'Euclid Collaboration' into first and last names!
CPU times: user 665 ms, sys: 52.8 ms, total: 718 ms
Wall time: 8.58 s
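
The query string built inside the loop is easier to check when it is factored out. Here is a sketch under a few assumptions: the helper name `independent_citation_query` is my own, the author keys are already standardized, and the two author names are invented for illustration. It only builds the string and does not contact ADS.

```python
from urllib.parse import quote

def independent_citation_query(bibcode, authors):
    """Build a `q` string for citations to `bibcode` that exclude every listed author."""
    # Percent-encode the bibcode so that '&' (as in 'A&A') does not split the URL
    query = f"citations(bibcode%3A{quote(bibcode, safe='')})"
    for author in authors:
        # Each author is assumed to be already standardized, e.g. 'Doe,+J'
        query += f"+-author:%22{author}%22"
    return query

q = independent_citation_query("2020A&A...633A..69H", ["Doe,+J", "Roe,+R"])
print(q)
```

Each `-author:"..."` clause removes any citing paper that lists that author, so what remains are citations from papers with no overlap in the author list.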


In [11]:
## Print the complete list of queried papers, with independent citations counts for each
print("Complete list of queried papers")
print("\t   Bibcode       #Cites #Independent cites \t Title\n")
citation_counts = 0
for paper in publications:
    print(paper['bibcode'], '\t', paper['citation_count'], paper['independent_citation_count'], '\t', paper['title'][0])
    citation_counts += paper['citation_count']
print("\nTotal number of citations = ", citation_counts)
print("\nTotal number of independent citations = ", sum([paper['independent_citation_count'] for paper in publications]))

Complete list of queried papers
	   Bibcode       #Cites #Independent cites 	 Title

2020A&A...633A..69H 	 186 116 	 KiDS+VIKING-450: Cosmic shear tomography with optical and infrared data
2014ApJS..212....5M 	 118 63 	 The Third Gravitational Lensing Accuracy Testing (GREAT3) Challenge Handbook
2021A&A...646A.140H 	 118 72 	 KiDS-1000 Cosmology: Multi-probe weak gravitational lensing and spectroscopic galaxy clustering constraints
2020A&A...638L...1J 	 90 69 	 KiDS+VIKING-450 and DES-Y1 combined: Cosmology with cosmic shear
2021A&A...645A.104A 	 97 71 	 KiDS-1000 cosmology: Cosmic shear constraints and comparison between two point statistics
2020A&A...633L..10T 	 54 35 	 Cosmology from large-scale structure. Constraining ΛCDM with BOSS
2020A&A...634A.127A 	 57 32 	 KiDS+VIKING-450 and DES-Y1 combined: Mitigating baryon feedback uncertainty with COSEBIs
2019A&A...624A..92K 	 40 14 	 Towards emulating cosmic shear data: revisiting the calibration of the shear measurements for the Kilo-D

Voila! We have calculated the independent citation count for each of my refereed publications, along with the total number of independent citations.