# Maintaining a Bibliography
This notebook demonstrates how to use the API to maintain a library or bibliography. The examples use Python with the `requests` library, though the same work could be done using the unofficial Python [ADS library](https://ads.readthedocs.io/en/latest/) or `curl` commands on the command line.

## Using the API
In all examples below, `token` should be replaced with [your own API token](https://ui.adsabs.harvard.edu/user/settings/token). If you haven't worked with our API before, it's recommended that you read the [README](https://github.com/adsabs/adsabs-dev-api/blob/master/README.md) before beginning. We also assume some familiarity with our other API Jupyter notebooks, especially the ones on [searching](https://github.com/adsabs/adsabs-dev-api/blob/master/Search_API.ipynb), [interacting with libraries](https://github.com/adsabs/adsabs-dev-api/blob/master/Libraries_API.ipynb), and [using the API with Python](https://github.com/adsabs/adsabs-dev-api/blob/master/Converting_curl_to_python.ipynb).

In [1]:
# import the requests package and set your token in a variable for later use
import requests

token="your-token-here"

## Introduction
In this notebook, we assume that you're maintaining a library of all papers that match a given search. We recommend you start by refining your search using our website first. This method provides instant feedback and allows you to iterate until you're happy with your search terms. This also provides a shortcut to constructing your API query.

Let's look at an example. Say you're in charge of maintaining a bibliography tracking all papers on or using the MIRI instrument, onboard JWST. After some searching on our website, you arrive at the following query:

`abs:MIRI (full:JWST OR bibstem:JWST) collection:astronomy`

This finds astronomy papers that mention MIRI in the abstract, title, or keywords. It also requires that JWST appear either in the full text (which includes the title, abstract, body, acknowledgements, and keywords) or in the bibstem (these are JWST proposals), to avoid contaminating the bibliography with non-JWST MIRI papers. The URL that corresponds to this search is:

`https://ui.adsabs.harvard.edu/search/q=abs%3AMIRI%20(full%3AJWST%20OR%20bibstem%3AJWST)%20collection%3Aastronomy&sort=date%20desc%2C%20bibcode%20desc&p_=0`

## Searching
We can convert this to a search using the API. First, take the API base URL:

`https://api.adsabs.harvard.edu/v1/search/query?`

and append the search parameter string (the part that starts with `q=`) from the website URL:

`q=abs%3AMIRI%20(full%3AJWST%20OR%20bibstem%3AJWST)%20collection%3Aastronomy`

You'll also want to append either the sort parameter string (starting with `sort=`) from the website URL or a sort string that better matches your needs (see the [Jupyter notebook on searching](https://github.com/adsabs/adsabs-dev-api/blob/master/Search_API.ipynb) for more ideas):

`sort=date%20desc%2C%20bibcode%20desc`

You'll also need to specify which fields you want returned. Let's return the bibcode and title of each result:

`fl=bibcode,title`

Putting these together and inserting a `&` between the search, sort, and fields strings, we have this API query URL:

`https://api.adsabs.harvard.edu/v1/search/query?q=abs%3AMIRI%20(full%3AJWST%20OR%20bibstem%3AJWST)%20collection%3Aastronomy&sort=date%20desc%2C%20bibcode%20desc&fl=bibcode,title`
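
As a rough sketch, the same assembly can be done in code rather than by copy and paste, using the standard library's `urlencode` to build the parameter string from the unencoded pieces. The helper name `build_search_url` is made up for illustration and is not part of the API. Passing `quote_via=quote` encodes spaces as `%20`, as in the website URL; the parentheses are also percent-encoded here, which decodes to the same query.

```python
from urllib.parse import urlencode, quote

API_SEARCH_URL = "https://api.adsabs.harvard.edu/v1/search/query"

def build_search_url(query, sort="date desc, bibcode desc", fl="bibcode,title"):
    """Assemble a search URL from unencoded parts (hypothetical helper)."""
    params = {"q": query, "sort": sort, "fl": fl}
    # urlencode joins the parameters with '&'; quote_via=quote gives %20 instead of '+'
    return API_SEARCH_URL + "?" + urlencode(params, quote_via=quote)

url = build_search_url("abs:MIRI (full:JWST OR bibstem:JWST) collection:astronomy")
print(url)
```

Building the URL this way keeps the readable query string in one place, which helps when the search terms change over time.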

Two other things you'll need to know:
* this API endpoint uses the GET HTTP method, so you'll use the `get` method on the request. (Note that you can find all API endpoints and their HTTP methods in our [full API documentation](https://ui.adsabs.harvard.edu/help/api/api-docs.html).)
* virtually all API requests require you to pass your API token in the header, as shown below 

Let's run this:

In [2]:
results = requests.get("https://api.adsabs.harvard.edu/v1/search/query?" \
                       "q=abs%3AMIRI%20(full%3AJWST%20OR%20bibstem%3AJWST)%20collection%3Aastronomy" \
                       "&sort=date%20desc%2C%20bibcode%20desc" \
                       "&fl=bibcode,title", \
                       headers={'Authorization': 'Bearer ' + token})
# format the response in a nicely readable format
results.json()

{'responseHeader': {'status': 0,
  'QTime': 9,
  'params': {'q': 'abs:MIRI (full:JWST OR bibstem:JWST) collection:astronomy',
   'fl': 'bibcode,title',
   'start': '0',
   'internal_logging_params': 'X-Amzn-Trace-Id=Root=1-6...',
   'sort': 'date desc,bibcode desc',
   'rows': '10',
   'wt': 'json'}},
 'response': {'numFound': 577,
  'start': 0,
  'docs': [{'bibcode': '2021MNRAS.506.1209M',
    'title': ['Observing the host galaxies of high-redshift quasars with JWST: predictions from the BLUETIDES simulation']},
   {'bibcode': '2021arXiv210803161L',
    'title': ['Empirically Determining Substellar Cloud Compositions in the era of JWST']},
   {'bibcode': '2021MNRAS.505.3562L',
    'title': ['Differentiating modern and prebiotic Earth scenarios for TRAPPIST-1e: high-resolution transmission spectra and predictions for JWST']},
   {'bibcode': '2021arXiv210703696T',
    'title': ['Which molecule traces what: chemical diagnostics of protostellar sources']},
   {'bibcode': '2021AAS...238115

Two things to note:
1. Only 10 results are returned at once, by default. If the total number of results is small enough, we can use the `rows` parameter to return them all in a single request; if there are a lot, we can page through the results instead.
2. The response from the API is a complicated JSON (dictionary-like) structure. To do anything with the list of results, we'll need to extract a list of bibcodes out of this structure.

For point 1, in this case, we *could* return all results at once. The total number of results found is 577 (see the `numFound` key in the response) and the maximum number of results we can return at once is 2000. But for this example, let's page through the results, 100 at a time. While we're at it, we'll solve point 2 by using a list comprehension to extract the bibcodes into a list.

In [3]:
rows = 100 # fetch 100 records at a time
start = 0  # start with the first result
bibcodes = [] # we'll store the bibcodes of all of our results here

# this is the pagination - the while loop will automatically stop once we've fetched all docs
docs = True
while docs:
    # note that this URL is the same as above, except we've added parameters for start and rows
    results = requests.get("https://api.adsabs.harvard.edu/v1/search/query?" \
                           "q=abs%3AMIRI%20(full%3AJWST%20OR%20bibstem%3AJWST)%20collection%3Aastronomy" \
                           "&sort=date%20desc%2C%20bibcode%20desc" \
                           "&fl=bibcode,title" \
                           "&rows={rows}" \
                           "&start={start}".format(rows=rows,start=start), \
                           headers={'Authorization': 'Bearer ' + token})
    try:
        docs = results.json()['response']['docs']
    except KeyError:
        print('No docs found')
        break
    # pull the bibcodes out of the results into a list
    tmp = [d['bibcode'] for d in docs]
    bibcodes = bibcodes + tmp
    start += rows # increment the start value to move to the next page of results

print(len(bibcodes))

577
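
The loop above stops when a page comes back empty, which costs one extra request at the end. If you read `numFound` from the first response, the `start` offsets can instead be computed up front. This is only a sketch of the arithmetic; `page_starts` is a hypothetical helper name, and no request is made here.

```python
def page_starts(num_found, rows):
    """Return the `start` offset of each page needed to cover num_found results."""
    # range() steps through the offsets 0, rows, 2*rows, ... below num_found
    return list(range(0, num_found, rows))

print(page_starts(577, 100))  # [0, 100, 200, 300, 400, 500]
```

With 577 results and 100 rows per page, six requests are enough; the last page simply holds the remaining 77 records.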


Ok, we've successfully searched for all of the papers we're interested in and created a list of their bibcodes. Let's move on to creating and maintaining the library.

## Create the library
Let's assume that you're starting this project from scratch and haven't yet created the library. Let's do that now. The URL to create a library is:

`https://api.adsabs.harvard.edu/v1/biblib/libraries`

Creating a library requires the POST HTTP method; you'll need to pass in the list of bibcodes to add to the library as a payload in the format of a serialized dictionary. You can also specify the library name and description in the payload; if you omit them, default values are used (and they can be changed later). The library will be private by default; you can choose to make it public at this step. (More details can be found in our [libraries Jupyter notebook](https://github.com/adsabs/adsabs-dev-api/blob/master/Libraries_API.ipynb) or our [full API documentation](https://ui.adsabs.harvard.edu/help/api/api-docs.html).)

Let's create a library:

In [4]:
# create a dictionary with the payload values
payload = {'name': 'JWST/MIRI',
           'description': 'JWST/MIRI papers',
           'public': False,
           'bibcode': bibcodes}

# the json library offers an easy way to convert between JSON or dictionaries and their serialized strings
import json
serialized_payload = json.dumps(payload)

library_response = requests.post('https://api.adsabs.harvard.edu/v1/biblib/libraries', \
                                 headers={'Authorization': 'Bearer ' + token},
                                 data=serialized_payload)

library_response.json()

{'name': 'JWST/MIRI',
 'id': 'y0DzlGstQe-6aWJMyVXtdg',
 'description': 'JWST/MIRI papers',
 'bibcode': ['2021MNRAS.506.1209M',
  '2021arXiv210803161L',
  '2021MNRAS.505.3562L',
  '2021arXiv210703696T',
  '2021AAS...23811505P',
  '2021A&A...650A.192T',
  '2021arXiv210412788S',
  '2021A&A...648A..92P',
  '2021jwst.prop.2722M',
  '2021jwst.prop.2708B',
  '2021jwst.prop.2667W',
  '2021jwst.prop.2666F',
  '2021jwst.prop.2662Z',
  '2021jwst.prop.2654I',
  '2021jwst.prop.2581C',
  '2021jwst.prop.2562M',
  '2021jwst.prop.2547V',
  '2021jwst.prop.2538H',
  '2021jwst.prop.2537R',
  '2021jwst.prop.2526A',
  '2021jwst.prop.2524N',
  '2021jwst.prop.2521S',
  '2021jwst.prop.2516H',
  '2021jwst.prop.2511L',
  '2021jwst.prop.2491S',
  '2021jwst.prop.2459D',
  '2021jwst.prop.2446K',
  '2021jwst.prop.2441A',
  '2021jwst.prop.2439M',
  '2021jwst.prop.2424J',
  '2021jwst.prop.2391R',
  '2021jwst.prop.2368T',
  '2021jwst.prop.2348T',
  '2021jwst.prop.2347D',
  '2021jwst.prop.2337C',
  '2021jwst.prop.2331P'

Success! Library created. 
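
If the bibcode list is assembled from more than one search, the same paper can show up twice. Whether the API tolerates duplicates in a creation payload isn't covered in this notebook, so removing them locally first is a cautious assumption rather than a requirement. A minimal sketch, with `unique_in_order` as a hypothetical helper name:

```python
def unique_in_order(bibcodes):
    """Drop repeated bibcodes while keeping first-seen order."""
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(bibcodes))

merged = ["2021MNRAS.506.1209M", "2021arXiv210803161L", "2021MNRAS.506.1209M"]
print(unique_in_order(merged))
# ['2021MNRAS.506.1209M', '2021arXiv210803161L']
```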

## Adding papers to the library
But maintaining a library/bibliography requires occasional updates and changes. Let's say you realized there are a handful of MIRI papers in ADS's physics collection, mostly on materials science, and you want to add these to your library. The search terms you've settled on by using the ADS website are:

`abs:MIRI (full:JWST OR bibstem:JWST) collection:physics -collection:astronomy`

And the corresponding URL:

`https://ui.adsabs.harvard.edu/search/q=abs%3AMIRI%20(full%3AJWST%20OR%20bibstem%3AJWST)%20collection%3Aphysics%20-collection%3Aastronomy&sort=date%20desc%2C%20bibcode%20desc&p_=0`

We'll copy the query string into the same API search query we had before. This time, we'll set the `rows` parameter to something greater than the total number of results, since the number of results is small. We'll also extract the list of bibcodes from the results the same way we did earlier.

In [5]:
results = requests.get("https://api.adsabs.harvard.edu/v1/search/query?" \
                       "q=abs%3AMIRI%20(full%3AJWST%20OR%20bibstem%3AJWST)%20collection%3Aphysics%20-collection%3Aastronomy" \
                       "&sort=date%20desc%2C%20bibcode%20desc" \
                       "&fl=bibcode,title" \
                       "&rows=50", \
                       headers={'Authorization': 'Bearer ' + token})

results.json()

{'responseHeader': {'status': 0,
  'QTime': 19,
  'params': {'q': 'abs:MIRI (full:JWST OR bibstem:JWST) collection:physics -collection:astronomy',
   'fl': 'bibcode,title',
   'start': '0',
   'internal_logging_params': 'X-Amzn-Trace-Id=Root=1-6...',
   'sort': 'date desc,bibcode desc',
   'rows': '50',
   'wt': 'json'}},
 'response': {'numFound': 15,
  'start': 0,
  'docs': [{'bibcode': '2020JQSRT.25107011S',
    'title': ['Pseudoline parameters to represent n-butane (n-C<SUB>4</SUB>H<SUB>10</SUB>) cross-sections measured in the 7-15 μm region for the Titan atmosphere']},
   {'bibcode': '2020MS&E..755a2018P',
    'title': ['Modifications to the MIRI cryocooler design to provide significant lift in the 2K to 4K range']},
   {'bibcode': '2018AdOT....7..353T',
    'title': ['The European optical contribution to the James Webb Space Telescope']},
   {'bibcode': '2018SPIE10748E..0HS',
    'title': ['The James Webb Space Telescope: contamination control and materials']},
   {'bibcode': '201

In [6]:
docs = results.json()['response']['docs']
bibcodes = [d['bibcode'] for d in docs]
bibcodes

['2020JQSRT.25107011S',
 '2020MS&E..755a2018P',
 '2018AdOT....7..353T',
 '2018SPIE10748E..0HS',
 '2017MS&E..278a2006M',
 '2016Cryo...74..166T',
 '2012SPIE.8533E..0TL',
 '2010AIPC.1218.1015M',
 '2008AIPC..985..522G',
 '2007SPIE.6692E..0KM',
 '2007SPIE.6676E..0JK',
 '2007SPIE.6665E..0YH',
 '2006Cryo...46..216S',
 '2005SPIE.5883....1L',
 '2005SPIE.5877..244K']

Now we'll add these new documents to the existing library. First, we need to get the ID of the library we just created:

In [7]:
id = library_response.json()['id']
id

'y0DzlGstQe-6aWJMyVXtdg'

This library ID will be inserted into the URL we'll use to edit the library. The base URL to add papers to the library, also using the POST method, is:

`https://api.adsabs.harvard.edu/v1/biblib/documents/{id}`

And the request:

In [8]:
payload = {'bibcode': bibcodes,
           'action': 'add'}

serialized_payload = json.dumps(payload)

library_edit_response = requests.post('https://api.adsabs.harvard.edu/v1/biblib/documents/{}'.format(id), \
                                      headers={'Authorization': 'Bearer ' + token},
                                      data=serialized_payload)

library_edit_response.json()

{'number_added': 15}
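
Since the same payload shape is reused each time the library is edited, it can be convenient to wrap it in a small function. The sketch below only builds the serialized payload and sends nothing. `documents_payload` is a hypothetical helper, and the `remove` action is an assumption here: only `add` is demonstrated in this notebook, so check the full API documentation before relying on it.

```python
import json

# Assumption: "remove" is accepted alongside "add"; only "add" is shown in this notebook.
ALLOWED_ACTIONS = ("add", "remove")

def documents_payload(bibcodes, action="add"):
    """Serialize the payload for the documents endpoint (hypothetical helper)."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError("action must be one of {}".format(ALLOWED_ACTIONS))
    return json.dumps({"bibcode": list(bibcodes), "action": action})

serialized = documents_payload(["2020JQSRT.25107011S", "2020MS&E..755a2018P"])
print(serialized)
```

Validating the action locally catches typos before a request is ever made.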

## Adding recent papers to the library
Once we've created the library, we'll need to maintain it by adding new papers. We could search for and add all papers matching our search terms to the library, as above, regardless of whether they've already been added or not; the system is smart enough to not try to re-add papers already in the library. However, there's an easier way! Say you normally update the library monthly and you last updated it at the end of the previous month. You can search for papers that have been added to our database since the first of the month using the `entdate` field:

`abs:MIRI (full:JWST OR bibstem:JWST) entdate:[2021-08-01 TO *]`

This searches for all papers that match your previous search terms and that were added to ADS on or after August 1, 2021 (this notebook was prepared in late August 2021). Let's see what that looks like. This time, instead of copying the query string from the website URL as we did before, let's encode the search terms ourselves:

In [9]:
from urllib.parse import urlencode
query = 'abs:MIRI (full:JWST OR bibstem:JWST) entdate:[2021-08-01 TO *]'
encoded_query = urlencode({'q': query})
encoded_query

'q=abs%3AMIRI+%28full%3AJWST+OR+bibstem%3AJWST%29+entdate%3A%5B2021-08-01+TO+%2A%5D'
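
The date in the `entdate` range above was typed by hand. For a recurring monthly update, it could be derived from a date instead. This is a sketch under the assumption that "since the first of the month" is the cutoff you want; `first_of_month` and `entdate_query` are hypothetical helper names.

```python
from datetime import date

def first_of_month(day):
    """Return the first day of the month containing `day`."""
    return day.replace(day=1)

def entdate_query(base_query, since):
    """Append an open-ended entdate range starting at `since` to a query string."""
    return "{} entdate:[{} TO *]".format(base_query, since.isoformat())

# A fixed date is used here so the result matches the query shown above;
# date.today() could be passed instead for a live update.
query = entdate_query("abs:MIRI (full:JWST OR bibstem:JWST)",
                      first_of_month(date(2021, 8, 25)))
print(query)  # abs:MIRI (full:JWST OR bibstem:JWST) entdate:[2021-08-01 TO *]
```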

In [10]:
results = requests.get("https://api.adsabs.harvard.edu/v1/search/query?" \
                       "{}" \
                       "&sort=date%20desc%2C%20bibcode%20desc" \
                       "&fl=bibcode,title" \
                       "&rows=50".format(encoded_query), \
                       headers={'Authorization': 'Bearer ' + token})

results.json()

{'responseHeader': {'status': 0,
  'QTime': 20,
  'params': {'q': 'abs:MIRI (full:JWST OR bibstem:JWST) entdate:[2021-08-01 TO *]',
   'fl': 'bibcode,title',
   'start': '0',
   'internal_logging_params': 'X-Amzn-Trace-Id=Root=1-6...',
   'sort': 'date desc,bibcode desc',
   'rows': '50',
   'wt': 'json'}},
 'response': {'numFound': 1,
  'start': 0,
  'docs': [{'bibcode': '2021arXiv210803161L',
    'title': ['Empirically Determining Substellar Cloud Compositions in the era of JWST']}]}}

In [11]:
docs = results.json()['response']['docs']
bibcodes = [d['bibcode'] for d in docs]
bibcodes

['2021arXiv210803161L']

In [12]:
payload = {'bibcode': bibcodes,
           'action': 'add'}

serialized_payload = json.dumps(payload)

library_edit_response = requests.post('https://api.adsabs.harvard.edu/v1/biblib/documents/{}'.format(id), \
                                      headers={'Authorization': 'Bearer ' + token},
                                      data=serialized_payload)

library_edit_response, library_edit_response.json()

(<Response [200]>, {'number_added': 0})

In this case, 0 new bibcodes were added to the library because that bibcode was already added previously in this notebook. However, `Response [200]` indicates (via the `200` status code) that the request was successful and this bibcode would have been added if it weren't already in the library.
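
If you'd like to know ahead of time which search results are actually new, you can compare them locally against the bibcodes already in the library (for example, the `bibcode` list returned when the library was created). The sketch below uses a short hand-written list in place of real library contents, and `split_new_and_existing` is a hypothetical helper name.

```python
def split_new_and_existing(candidates, library_bibcodes):
    """Split candidate bibcodes into those not yet in the library and those already there."""
    in_library = set(library_bibcodes)  # set membership tests are fast
    new = [b for b in candidates if b not in in_library]
    existing = [b for b in candidates if b in in_library]
    return new, existing

# stand-in for the library's current contents
library_bibcodes = ["2021MNRAS.506.1209M", "2021arXiv210803161L"]
new, existing = split_new_and_existing(["2021arXiv210803161L"], library_bibcodes)
print(new, existing)  # [] ['2021arXiv210803161L']
```

An empty `new` list here corresponds to the `number_added` value of 0 seen above.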

## Using library operations
Now let's say you've decided to separate the JWST proposals (bibstem JWST) from the other papers in the library and move them to their own library. There are a few ways to approach this, but here we'll use the library set operations.

First, let's create a library of just the JWST proposals. The search terms are:

`abs:MIRI bibstem:JWST`

Again, let's encode the search terms ourselves, but this time we'll encode all of the query parameters, not just the search string:

In [13]:
query = 'abs:MIRI bibstem:JWST'
sort = 'date desc, bibcode desc'
fl = 'bibcode,title'
rows = '250'
encoded_query = urlencode({'q': query, 'sort': sort, 'fl': fl, 'rows': rows})
encoded_query

'q=abs%3AMIRI+bibstem%3AJWST&sort=date+desc%2C+bibcode+desc&fl=bibcode%2Ctitle&rows=250'

Let's search for these papers, extract the bibcodes, and create a new library with them.

In [14]:
results = requests.get("https://api.adsabs.harvard.edu/v1/search/query?{}".format(encoded_query), \
                       headers={'Authorization': 'Bearer ' + token})

try:
    docs = results.json()['response']['docs']
except KeyError:
    print('No docs found')
    docs = []
    
bibcodes = [d['bibcode'] for d in docs]

payload = {'name': 'JWST/MIRI proposals',
           'description': 'JWST/MIRI proposals',
           'public': False,
           'bibcode': bibcodes}

serialized_payload = json.dumps(payload)

library_proposal_response = requests.post('https://api.adsabs.harvard.edu/v1/biblib/libraries', \
                                          headers={'Authorization': 'Bearer ' + token},
                                          data=serialized_payload)

We'll need the ID of the new library in a second so let's get that now:

In [15]:
id_proposals = library_proposal_response.json()['id']

Ok, now we can separate these two sets of papers into their own libraries. We'll use the `difference` set operation, which compares the contents of the first library we created to those of the second, extracts the papers that appear only in the first library, and places them in a new library. The original library remains unchanged. The URL we'll use is:

`https://api.adsabs.harvard.edu/v1/biblib/libraries/operations/{id}`

where `id` refers to the ID of the original library. This is a POST request, and we'll pass the library ID to subtract (here, the proposals library) in as a payload, along with the name and description of the new library.
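
To make the semantics of `difference` concrete, here is a local illustration using plain Python lists. It mirrors the behavior described above, not the service's actual implementation, and the lists are small hand-picked examples rather than real library contents.

```python
def difference(first, others):
    """Return the items of `first` that appear in none of the `others` lists."""
    excluded = set()
    for other in others:
        excluded.update(other)
    # a new list is built, so `first` itself is left untouched
    return [b for b in first if b not in excluded]

main_library = ["2021jwst.prop.2722M", "2020ApJ...904..154P", "2010AIPC.1218.1015M"]
proposals_library = ["2021jwst.prop.2722M", "2021jwst.prop.2708B"]

papers_only = difference(main_library, [proposals_library])
print(papers_only)  # ['2020ApJ...904..154P', '2010AIPC.1218.1015M']
```

Note that the operation is not symmetric: items that appear only in the second list are ignored.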

In [16]:
payload = {
    "action": "difference", 
    "libraries": [id_proposals], 
    "name": "JWST/MIRI papers", 
    "description": "JWST/MIRI papers only (no proposals)", 
    "public": False
}

serialized_payload = json.dumps(payload)

library_papers_only_response = requests.post('https://api.adsabs.harvard.edu/v1/biblib/libraries/operations/{}'.format(id), \
                                             headers={'Authorization': 'Bearer ' + token},
                                             data=serialized_payload)

In [17]:
library_papers_only_response.json()

{'name': 'JWST/MIRI papers',
 'id': 'sW_dbkYbRWCCKuGIkYftGg',
 'description': 'JWST/MIRI papers only (no proposals)',
 'bibcode': ['2018AAS...23230201L',
  '2010AIPC.1218.1015M',
  '2019AAS...23340201G',
  '2018AAS...23230203M',
  '2011AAS...21743306H',
  '2009AAS...21342612M',
  '2018A&A...609A..30S',
  '2018cosp...42E2133M',
  '2016SPIE.9904E..41B',
  '2007spts.conf...33G',
  '2014ebi..confP4.66L',
  '2014AAS...22431105B',
  '2006SPIE.6273E..1YL',
  '2015AAS...22533812L',
  '2012ASPC..461..169B',
  '2018SPIE10698E..3UK',
  '2004SPIE.5499..240H',
  '2018SPIE10698E..3LA',
  '2018AAS...23142704B',
  '2007lyot.confE..50M',
  '2011AAS...21740204S',
  '2016AAS...22710802S',
  '2017AAS...23020903M',
  '2018AAS...23110503K',
  '2012SPIE.8449E..09E',
  '2017apra.prop...94B',
  '2005SPIE.5877..244K',
  '2020ApJ...904..154P',
  '2020AAS...23537304B',
  '2004SPIE.5495...56H',
  '2010A&A...515A..95S',
  '2010cosp...38.2508W',
  '2017AAS...23020302G',
  '2010SPIE.7731E..3NR',
  '2007lyot.confQ..49

## Managing library collaborators
Finally, if you're managing the bibliography for a larger project, it's likely that another person will be working with you. Let's see how to add a collaborator.

The URL to update collaborator permissions is: 

`https://api.adsabs.harvard.edu/v1/biblib/permissions/{id}`

This is a POST request, and we'll pass the collaborator's information in the payload. In this case, we want our collaborator to have `write` access, so they can read and write to the library. (More info on available permissions can be found in the [libraries Jupyter notebook](https://github.com/adsabs/adsabs-dev-api/blob/master/Libraries_API.ipynb) or in the [full API documentation](https://ui.adsabs.harvard.edu/help/api/api-docs.html).)

In [18]:
lib_id = library_papers_only_response.json()['id']

# we've redacted the user's email address here, but you'll need to supply the full email address of the user
payload = {"email":"user2@gmail.com", 
           "permission": {"write": True}}
serialized_payload = json.dumps(payload)

response = requests.post('https://api.adsabs.harvard.edu/v1/biblib/permissions/{}'.format(lib_id),
                         headers={'Authorization': 'Bearer ' + token},
                         data=serialized_payload
                        )

In [19]:
response

<Response [200]>

There's no data returned by this request, but we can see the status code is `200`, which indicates that the request was processed successfully. We can verify this by checking the full list of collaborators. This uses the same URL as above, but with the GET method:

In [20]:
response = requests.get('https://api.adsabs.harvard.edu/v1/biblib/permissions/{}'.format(lib_id),
                        headers={'Authorization': 'Bearer ' + token})
response.json()

[{'user1@cfa.harvard.edu': ['owner']},
 {'user2@gmail.com': ['write']}]

There are two users attached to this library: we're the owner, since we created the library, and the other user now has read/write access.
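
The permissions response is a list of single-entry dictionaries, which is awkward to query directly. As a small sketch, it can be flattened into one dictionary keyed by email address. The sample data below copies the shape of the output shown above, and `flatten_permissions` is a hypothetical helper name.

```python
def flatten_permissions(entries):
    """Merge a list of {email: [permissions]} dictionaries into a single dictionary."""
    merged = {}
    for entry in entries:
        for email, permissions in entry.items():
            # setdefault handles an email that appears in more than one entry
            merged.setdefault(email, []).extend(permissions)
    return merged

sample = [{"user1@cfa.harvard.edu": ["owner"]},
          {"user2@gmail.com": ["write"]}]
by_user = flatten_permissions(sample)
print("write" in by_user.get("user2@gmail.com", []))  # True
```

Using `.get()` with an empty list as the default means a lookup for a user without any permissions returns `False` rather than raising an error.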

And that's it! This should be enough to get you started with maintaining your own library or bibliography in ADS via our API. If you get stuck, check out the rest of our [API documentation](https://ui.adsabs.harvard.edu/help/api/) or [email us](mailto:adshelp@cfa.harvard.edu). 