# Poking the DataCite API with Python

This notebook can be used for minting and deleting DataCite DOIs with the DataCite Member API. To execute a code cell, select it and click the play (black triangle) button in the toolbar. You can also use `Ctrl + Enter`.

Depending on the platform that you're running this notebook on, execution of code cells - especially those that make a request of the DataCite API - may take some time. Be patient, and in bulk operations you might like to get up and do some stretches or get a coffee.

## Initial setup

Create a copy of `datacite-api-config.example.json` named `datacite-api-config.json` and modify it to contain your DataCite Member API username and password. For example:
```json
{
    "username" : "myusername",
    "password" : "mypassword"
}
```

By default, this notebook runs against the DataCite test API. This is controlled by the `use_test_api` flag in the setup code cell below.

Then execute the following code cell to perform initial setup. You will need to execute this code cell once at the beginning of every session.

In [1]:
import csv
import datetime
import json
import os
import requests
import urllib.parse

if not os.path.isfile('doilog.csv'):
    with open('doilog.csv', 'w') as csvfile:
        logwriter = csv.writer(csvfile)
        logwriter.writerow([ 'timestamp', 'username', 'filename', 'doi', 'action' ])

with open('datacite-api-config.json') as f:
    data = json.load(f)
    username = data["username"]
    password = data["password"]

use_test_api = True

if use_test_api:
    api_endpoint = 'https://api.test.datacite.org/dois'
else:
    api_endpoint = 'https://api.datacite.org/dois'

# Bulk

## Mint bulk Draft DOIs

The following code cell will go through all files in the `bulk-mint` directory and attempt to mint a DOI for each one. The DataCite API will reject anything that is not a valid JSON file containing DataCite metadata.

Put multiple metadata.json files in the `bulk-mint` directory, and then execute the following code cell.

In [51]:
path = 'bulk-mint'
url = api_endpoint
headers = {
    'Content-Type': 'application/vnd.api+json',
}
print('Bulk minting DOIs for files in ' + path)
dois = []

with os.scandir(path) as it:
    for entry in it:
        # Skip hidden entries and anything that is not a regular file
        if not entry.name.startswith('.') and entry.is_file():
            filename = path + '/' + entry.name
            print('Attempting to mint DOI for ' + filename)
            # Read the metadata as bytes so that it is sent to the API unchanged
            with open(filename, 'rb') as metadata_file:
                data = metadata_file.read()
            response = requests.request('POST', url, auth=(username, password), data=data, headers=headers)
            # 201 Created means that the DOI was minted
            if response.status_code == 201:
                doi = json.loads(response.text)['data']['id']
                timestamp = datetime.datetime.now().replace(microsecond=0).astimezone().isoformat()
                # timestamp = response.headers['Date']
                print(doi + ' minted')
                dois.append(doi)
                with open('doilog.csv', 'a') as csvfile:
                    logwriter = csv.writer(csvfile)
                    logwriter.writerow([timestamp, username, filename, doi, 'created'])
            else:
                print('DOI not minted')
print('Done')

Bulk minting DOIs for files in bulk-mint
Attempting to mint DOI for bulk-mint/.ipynb_checkpoints
DOI not minted
Done
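
The metadata files read by the minting cell are JSON documents in the shape the DataCite API expects. As a rough sketch, the helpers below build a minimal request for a Draft DOI and write it into a directory. The function names `build_draft_payload` and `write_payload` are hypothetical, and the sketch assumes that the API generates a suffix when only a `prefix` attribute is supplied; check this against the DataCite documentation before relying on it.

```python
import json
import os


def build_draft_payload(prefix):
    """Return a minimal JSON:API document requesting a Draft DOI.

    Assumption: when only a prefix is given, the API picks the suffix.
    """
    return {
        'data': {
            'type': 'dois',
            'attributes': {
                'prefix': prefix,
            },
        }
    }


def write_payload(payload, directory, name):
    """Write the payload to directory/name as JSON and return the file path."""
    os.makedirs(directory, exist_ok=True)
    filename = os.path.join(directory, name)
    with open(filename, 'w') as f:
        json.dump(payload, f, indent=4)
    return filename
```

For example, `write_payload(build_draft_payload('10.80335'), 'bulk-mint', 'draft-1.json')` would create one file that the minting cell could pick up.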


## Delete bulk Draft DOIs

The following code cell will attempt to delete all of the DOIs in the `dois` list, which is automatically filled by the bulk minting process.

If you want to specify your own list of DOIs to delete, put them in the `dois = []` cell that follows the delete cell, e.g. `dois = ['10.80335/sr34-9h64', '10.80335/5375-5t54', '10.80335/2r04-6k46', '10.80335/drgg-dp97', '10.80335/7yky-cd07']`. Execute that cell first, and then execute the delete cell.

In [None]:
print('Bulk deleting DOIs in list')
for doi in dois:
    # Percent-encode the DOI (including its slash) for use in the URL
    url = api_endpoint + '/' + urllib.parse.quote_plus(doi)
    print('Attempting to delete ' + doi)
    # A DELETE request has no body, so no Content-Type header is needed
    response = requests.request('DELETE', url, auth=(username, password))
    # 204 No Content means that the DOI was deleted
    if response.status_code == 204:
        timestamp = datetime.datetime.now().replace(microsecond=0).astimezone().isoformat()
        # timestamp = response.headers['Date']
        print(doi + ' deleted')
        with open('doilog.csv', 'a') as csvfile:
            logwriter = csv.writer(csvfile)
            logwriter.writerow([timestamp, username, '', doi, 'deleted'])
    else:
        print(doi + ' not deleted')
        print(response.headers)
print('Done')

In [None]:
dois = []

In [None]:
dois
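
If the `dois` list has been lost, for example after a kernel restart, it can be rebuilt from `doilog.csv`. The sketch below is one possible approach, assuming the log has the header row written by the setup cell. The function name `dois_from_log` is hypothetical. It only tracks `created` and `deleted` rows, so it does not know whether a DOI has since been registered or published.

```python
import csv


def dois_from_log(logfile='doilog.csv'):
    """Return DOIs logged as 'created' that have no later 'deleted' row."""
    live = []
    with open(logfile, newline='') as csvfile:
        # DictReader takes the column names from the header row
        for row in csv.DictReader(csvfile):
            if row['action'] == 'created' and row['doi'] not in live:
                live.append(row['doi'])
            elif row['action'] == 'deleted' and row['doi'] in live:
                live.remove(row['doi'])
    return live
```

Calling `dois = dois_from_log()` before the delete cell would then target every DOI that the log still considers to exist.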

## Register, publish, or hide bulk Draft DOIs

The following code cell will attempt to register, publish, or hide all of the DOIs in the `dois` list, which is automatically filled by the bulk minting process.

Once in Registered or Findable state, a DOI can't be set back to Draft state. This also means that once in Registered or Findable state, a DOI *cannot be deleted*. This is serious, mum.

The only option for removing a Findable DOI from the public record is to hide it.

In [None]:
action = 'publish' # Set this to register, publish, or hide

# Build the JSON body that asks the API to apply the chosen event
payload = json.dumps({'data': {'attributes': {'event': action}}})
headers = {
    'Content-Type': 'application/vnd.api+json',
}
print('Bulk ' + action + ' DOIs in list')
for doi in dois:
    url = api_endpoint + '/' + urllib.parse.quote_plus(doi)
    print('Attempting to ' + action + ' ' + doi)
    response = requests.request('PUT', url, auth=(username, password), data=payload, headers=headers)
    # 200 OK means that the state change was accepted
    if response.status_code == 200:
        timestamp = datetime.datetime.now().replace(microsecond=0).astimezone().isoformat()
        # timestamp = response.headers['Date']
        print(doi + ' ' + action + ' successful')
        with open('doilog.csv', 'a') as csvfile:
            logwriter = csv.writer(csvfile)
            logwriter.writerow([timestamp, username, '', doi, action])
    else:
        print(doi + ' ' + action + ' unsuccessful')
        print(response.headers)
print('Done')
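
The state rules described in this section can be sketched as a small lookup table. This is an illustration only, not something the API exposes. The lower-case state names and the helpers `next_state` and `can_delete` are hypothetical, and the sketch assumes that hiding a Findable DOI returns it to the Registered state.

```python
# Map (current state, action) to the resulting state.
# Nothing ever maps back to 'draft'.
TRANSITIONS = {
    ('draft', 'register'): 'registered',
    ('draft', 'publish'): 'findable',
    ('registered', 'publish'): 'findable',
    ('findable', 'hide'): 'registered',  # assumption: hide leads to Registered
}


def next_state(state, action):
    """Return the state after applying action, or None if it is not allowed."""
    return TRANSITIONS.get((state, action))


def can_delete(state):
    """Only Draft DOIs can be deleted."""
    return state == 'draft'
```

A check such as `can_delete('findable')` returning `False` mirrors the warning above: once a DOI has left Draft state, hiding is the only way to take it off the public record.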

## Bulk download DOI metadata

The following code cell will request the metadata of each DOI in the `dois` list. It prints the headers of the last response only, and it does not write anything to the `downloaded` directory.

In [29]:
dois = ['10.80335/1337']

path = 'downloaded'
url = api_endpoint
headers = {
    'Content-Type': 'application/vnd.api+json',
}
print('Bulk downloading metadata for DOIs in array')
for doi in dois:
    url = api_endpoint + '/' + doi
    response = requests.request("GET", url, auth=(username, password), headers = headers)
print('Done')
print(response.headers)

Bulk downloading metadata for DOIs in array
Done
{'Date': 'Fri, 27 Aug 2021 06:19:02 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Status': '200 OK', 'Cache-Control': 'max-age=0, private, must-revalidate', 'Vary': 'Accept-Encoding', 'Content-Encoding': 'gzip', 'ETag': 'W/"f1cbca2bf80a8f53d18a9c7289bbe10c"', 'X-Runtime': '0.146920', 'X-Credential-Username': 'ardcx.ardc', 'X-Request-Id': '43473872-388d-4810-8164-6c2f3dacf889', 'X-Powered-By': 'Phusion Passenger(R) 6.0.10', 'Server': 'nginx/1.14.0 + Phusion Passenger(R) 6.0.10', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Methods': 'GET, POST, PUT, PATCH, DELETE, OPTIONS', 'Access-Control-Allow-Headers': 'Accept,Access-Control-Allow-Origin,Access-Control-Expose-Headers,Access-Control-Allow-Methods,Access-Control-Allow-Headers,Authorization,Cache-Control,Content-Type,DNT,If-Modified-Since,Keep-Alive,Origin,User-Agent,X-Mx-ReqToken,X-Requested-With
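
The download cell sets `path = 'downloaded'` but never writes to it. If the intent is to keep a copy of each record, a sketch along the following lines could be called inside the loop with `response.text`. The function name `save_metadata` and the choice of a percent-encoded DOI as the file name are assumptions made for illustration.

```python
import json
import os
import urllib.parse


def save_metadata(doi, body_text, out_dir='downloaded'):
    """Save one response body as pretty-printed JSON and return the file path."""
    os.makedirs(out_dir, exist_ok=True)
    # Percent-encode the DOI so that its slash cannot create a subdirectory
    filename = os.path.join(out_dir, urllib.parse.quote_plus(doi) + '.json')
    # Parsing first means that a non-JSON body raises an error instead of being saved
    metadata = json.loads(body_text)
    with open(filename, 'w') as f:
        json.dump(metadata, f, indent=4)
    return filename
```

Inside the loop this might look like `save_metadata(doi, response.text, path)`, run only when the request succeeded.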

## Bulk update DOIs

The following code cell will go through all files in the `bulk-mint` directory and attempt to update the metadata of the DOI recorded in each file's `data.id` field.

In [50]:
path = 'bulk-mint'
action = 'update'
url = api_endpoint
headers = {
    'Content-Type': 'application/vnd.api+json',
}
print('Bulk updating metadata for files in ' + path)
dois = []
with os.scandir(path) as it:
    for entry in it:
        # Skip hidden entries and anything that is not a regular file
        if not entry.name.startswith('.') and entry.is_file():
            print('Attempting to upload metadata for ' + entry.name)
            with open(path + '/' + entry.name) as metadata_file:
                metadata = json.load(metadata_file)
            # The DOI to update is taken from the metadata file itself
            doi = metadata['data']['id']
            print('DOI on record is ' + doi)
            url = api_endpoint + '/' + doi
            response = requests.request('PUT', url, auth=(username, password), data=json.dumps(metadata), headers=headers)
            # 200 OK means that the update was accepted
            if response.status_code == 200:
                timestamp = datetime.datetime.now().replace(microsecond=0).astimezone().isoformat()
                # timestamp = response.headers['Date']
                print(doi + ' ' + action + ' successful')
                with open('doilog.csv', 'a') as csvfile:
                    logwriter = csv.writer(csvfile)
                    logwriter.writerow([timestamp, username, '', doi, action])
            else:
                print(doi + ' ' + action + ' unsuccessful')
                print(response.headers)
print('Done')

Bulk updating metadata for files in bulk-mint
Attempting to upload metadata for data-retention.example.json
DOI on record is 10.80335/1337
10.80335/1337 update successful
Done
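
The update cell assumes that every file has a DOI at `data.id` and will stop with a `KeyError` on the first file that does not. A more forgiving lookup might be sketched as follows. The function name `doi_from_metadata` is hypothetical.

```python
def doi_from_metadata(metadata):
    """Return the DOI stored at metadata['data']['id'], or None if it is absent."""
    data = metadata.get('data') if isinstance(metadata, dict) else None
    if not isinstance(data, dict):
        return None
    doi = data.get('id')
    # Treat an empty or non-string id as missing
    return doi if isinstance(doi, str) and doi else None
```

The loop could then skip a file, with a printed message, whenever this returns `None`.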


## Troubleshooting

If you are trying to work out why your DOI was not minted or deleted, execute the following code cell. It prints the headers and body of the most recent response from the DataCite API.

In [52]:
print(response.headers)
print('\n')
print(response.text)
json.loads(response.text)

{'Date': 'Fri, 27 Aug 2021 07:05:50 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Status': '422 Unprocessable Entity', 'Cache-Control': 'no-cache', 'Vary': 'Accept-Encoding', 'Content-Encoding': 'gzip', 'X-Runtime': '0.086378', 'X-Credential-Username': 'ardcx.ardc', 'X-Request-Id': '56e53a74-7899-4e10-b12e-f687ec385b7a', 'X-Powered-By': 'Phusion Passenger(R) 6.0.10', 'Server': 'nginx/1.14.0 + Phusion Passenger(R) 6.0.10', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Methods': 'GET, POST, PUT, PATCH, DELETE, OPTIONS', 'Access-Control-Allow-Headers': 'Accept,Access-Control-Allow-Origin,Access-Control-Expose-Headers,Access-Control-Allow-Methods,Access-Control-Allow-Headers,Authorization,Cache-Control,Content-Type,DNT,If-Modified-Since,Keep-Alive,Origin,User-Agent,X-Mx-ReqToken,X-Requested-With', 'Access-Control-Expose-Headers': 'Authorization'}


{"errors":[{"source":"doi","uid":"10.80335/1337","t

{'errors': [{'source': 'doi',
   'uid': '10.80335/1337',
   'title': 'This DOI has already been taken'}]}
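
An error body like the one above can be reduced to one line per error. The sketch below assumes the `errors` list has the shape shown in this output, with `source`, `uid`, and `title` keys that may each be missing. The function name `summarise_errors` is hypothetical.

```python
def summarise_errors(body):
    """Return one readable line for each entry in body['errors']."""
    lines = []
    for error in body.get('errors', []):
        source = error.get('source', 'unknown source')
        title = error.get('title', 'no title given')
        uid = error.get('uid')
        # Include the uid (here, the DOI) only when the API supplied one
        label = source + ' (' + uid + ')' if uid else source
        lines.append(label + ': ' + title)
    return lines
```

For instance, `summarise_errors(json.loads(response.text))` could be printed in the `else` branch of any of the bulk cells.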