# Jupyter notebook to explore the DataCite REST API
This notebook demonstrates some of the functionality of the DataCite REST API using Python. It is based on the DataCite REST API support docs: https://support.datacite.org/docs/api

## Import packages

In [None]:
import base64
import getpass
import json

import requests

## Specify repository URL and credentials

In [None]:
# define the endpoint
# testing: https://api.test.datacite.org/
# production: https://api.datacite.org/
endpoint = "https://api.test.datacite.org/"

# define the repository ID and password for authentication
repository_id = "ABCD.XYZ"
repository_pw = getpass.getpass("Repository password: ")

In [None]:
# Authentication uses a base64-encoded string constructed from the repository ID and password
encoded = base64.b64encode(bytes(repository_id + ':' + repository_pw, 'utf-8'))
auth_str = "Basic " + encoded.decode()
del repository_pw
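
To make the header format concrete, the sketch below builds the same kind of `Basic` string from placeholder credentials and decodes it again. The function name `basic_auth_header` and the password are hypothetical, chosen only for illustration. Note that base64 is an encoding, not encryption, so the header is only as safe as the connection it travels over.

```python
import base64


def basic_auth_header(repository_id: str, password: str) -> str:
    """Return an HTTP Basic Authorization header value (hypothetical helper)."""
    token = base64.b64encode(f"{repository_id}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


# Placeholder credentials, for illustration only.
header = basic_auth_header("ABCD.XYZ", "not-a-real-password")

# Decoding the token recovers "<id>:<password>" in plain text.
scheme, token = header.split(" ", 1)
decoded = base64.b64decode(token).decode("utf-8")
print(scheme, decoded)
```

For ASCII credentials, passing `auth=(repository_id, repository_pw)` to `requests` should produce an equivalent header without building the string by hand.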

## Return a DOI

In [None]:
# the URL is constructed from the endpoint and the DOI ID (prefix and suffix)
# (named doi_id to avoid shadowing Python's built-in id())
doi_id = "10.82790/5pyp-jj80"
url = endpoint + "dois/" + doi_id
print(url)

headers = {"accept": "application/vnd.api+json", "authorization": auth_str}

response = requests.get(url, headers=headers) # JSON-formatted response

print(json.dumps(response.json(), indent=4))

In [None]:
# this example shows how a "real" DOI is retrieved from the production server (no authentication required)
url = "https://api.datacite.org/dois/10.14454/FXWS-0523"

headers = {"accept": "application/vnd.api+json"}

response = requests.get(url, headers=headers) # JSON-formatted response

print(json.dumps(response.json(), indent=4))
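
The full JSON response is long, so it can help to pull out a few fields. The following is a minimal sketch that works on a hand-written sample dictionary rather than a live response. The helper name `summarize_doi_record` is hypothetical, and the sample only mimics the `data` / `attributes` nesting used elsewhere in this notebook.

```python
def summarize_doi_record(record: dict) -> dict:
    """Extract a few attributes from a single-DOI response body (hypothetical helper)."""
    attributes = record.get("data", {}).get("attributes", {})
    titles = attributes.get("titles") or []
    types = attributes.get("types") or {}
    return {
        "doi": attributes.get("doi"),
        "title": titles[0].get("title") if titles else None,
        "resource_type": types.get("resourceTypeGeneral"),
    }


# Hand-written sample, not a real API response.
sample_record = {
    "data": {
        "type": "dois",
        "attributes": {
            "doi": "10.82790/example",
            "titles": [{"title": "Test entry created from API."}],
            "types": {"resourceTypeGeneral": "PhysicalObject"},
        },
    }
}

summary = summarize_doi_record(sample_record)
print(summary)
```

With a live response, the call would be `summarize_doi_record(response.json())`.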

## Return list of DOIs
The API can also return a list of DOIs filtered by query parameters, for example all DOIs under a given prefix. For other parameters, see https://support.datacite.org/docs/api-queries

In [None]:
search_param = "prefix"
search_val = "10.82790"

url = endpoint + "dois?" + search_param + "=" + search_val    # url = "https://api.test.datacite.org/dois?prefix=10.82790"

headers = {
    "accept": "application/vnd.api+json",
    "authorization": auth_str
}

response = requests.get(url, headers=headers)

print("Found " + str(len(response.json()['data'])) + " DOIs matching " + search_param + " " + search_val)
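
String concatenation works for a single parameter, but several parameters or special characters are easier to handle with URL encoding. Below is a minimal sketch using the standard library. The helper `build_dois_url` is hypothetical, and `page[size]` is assumed here as a pagination parameter; check the query documentation linked above before relying on it.

```python
from urllib.parse import urlencode


def build_dois_url(endpoint: str, params: dict) -> str:
    """Join the endpoint, the `dois` resource and URL-encoded query parameters."""
    return endpoint.rstrip("/") + "/dois?" + urlencode(params)


# Same query as in the cell above.
prefix_url = build_dois_url("https://api.test.datacite.org/", {"prefix": "10.82790"})

# Assumed pagination parameter; the brackets are percent-encoded by urlencode.
paged_url = build_dois_url(
    "https://api.test.datacite.org/",
    {"prefix": "10.82790", "page[size]": 50},
)
print(prefix_url)
print(paged_url)
```

Alternatively, `requests.get(url, params={...})` performs the same encoding when given a dictionary.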

Check the resource type and client type. For IGSN repositories, this should be `PhysicalObject` and `igsnCatalog`, respectively.

In [None]:
json_data = response.json()['data']
for doi in json_data:
    print('DOI: ' + doi['attributes']['doi'])
    # look up the client (repository) that owns this DOI to get its type
    client_id = doi['relationships']['client']['data']['id']
    client_type = requests.get(endpoint + 'clients/' + client_id).json()['data']['attributes']['clientType']
    print(f'Repo type: {client_type}')
    # read the resource type from the current DOI, not from the first entry in the list
    resource_type = doi['attributes']['types'].get('resourceTypeGeneral')
    print(f'Resource type: {resource_type}')
    if client_type == 'igsnCatalog' and resource_type == 'PhysicalObject':
        print('This DOI is a valid IGSN\n')
    else:
        print('This DOI is not a valid IGSN\n')
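
The loop above requests the client record once per DOI, even when many DOIs belong to the same repository. The sketch below shows one way to cache the lookup. It uses hand-written sample records and a stub in place of the HTTP call, so `check_igsn_records`, `stub_client_type` and the sample IDs are all hypothetical.

```python
def check_igsn_records(records, get_client_type):
    """Map each DOI to True/False, looking up each client type only once."""
    client_types = {}  # cache: client ID -> client type
    results = {}
    for record in records:
        client_id = record["relationships"]["client"]["data"]["id"]
        if client_id not in client_types:
            client_types[client_id] = get_client_type(client_id)
        resource_type = record["attributes"]["types"].get("resourceTypeGeneral")
        results[record["attributes"]["doi"]] = (
            client_types[client_id] == "igsnCatalog"
            and resource_type == "PhysicalObject"
        )
    return results


lookups = []  # records how often the stub is called


def stub_client_type(client_id):
    """Stand-in for the request to the clients endpoint."""
    lookups.append(client_id)
    return "igsnCatalog"


# Hand-written sample records sharing one client.
sample_records = [
    {
        "attributes": {"doi": "10.82790/aaaa", "types": {"resourceTypeGeneral": "PhysicalObject"}},
        "relationships": {"client": {"data": {"id": "abcd.xyz"}}},
    },
    {
        "attributes": {"doi": "10.82790/bbbb", "types": {"resourceTypeGeneral": "Dataset"}},
        "relationships": {"client": {"data": {"id": "abcd.xyz"}}},
    },
]

igsn_results = check_igsn_records(sample_records, stub_client_type)
print(igsn_results)
```

In the notebook, the stub would be replaced by a function that performs the `requests.get` call on the clients endpoint.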

## Creating DOIs
To create a DOI, the metadata is sent as a JSON payload. As an alternative to listing the metadata attributes directly in JSON, the metadata can also be supplied in another format.

The following metadata formats can be used to register DOIs:
- DataCite XML
- RIS
- BibTeX
- Schema.org JSON-LD
- Citeproc JSON
- Codemeta
- Crossref XML

To use one of these formats:
1. base64-encode the metadata
2. include the encoded string in the "xml" attribute of the JSON payload
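
The two steps above can be sketched as follows. The helper `build_encoded_payload` is hypothetical, and the BibTeX string is a placeholder that is not guaranteed to pass DataCite's validation. Including `prefix` next to `xml` is also an assumption carried over from the JSON examples in this notebook.

```python
import base64


def build_encoded_payload(metadata: str, prefix: str) -> dict:
    """Wrap base64-encoded metadata in a JSON payload (hypothetical helper)."""
    encoded = base64.b64encode(metadata.encode("utf-8")).decode("ascii")
    return {
        "data": {
            "type": "dois",
            "attributes": {
                "prefix": prefix,
                "xml": encoded,  # step 2: encoded metadata goes in "xml"
            },
        }
    }


# Placeholder BibTeX record, for illustration only.
bibtex = "@misc{example, title = {Test entry}, year = {2024}}"
encoded_payload = build_encoded_payload(bibtex, "10.82790")
print(encoded_payload["data"]["attributes"]["xml"])
```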

Constructing a simple JSON payload for a draft DOI (not published):

In [None]:
payload = {
  "data": {
    "type": "dois",
    "attributes": {
      "prefix": "10.82790",
      "creators": [
        {
          "name": "Digital Botanical Gardens Initiative"
        }
      ],
      "titles": [
        {
          "title": "Next test entry created from API."
        }
      ],
      "publisher": "DBGI",
      "publicationYear": 2024,
      "types": {
        "resourceTypeGeneral": "PhysicalObject" # for IGSN repositories, this should always be PhysicalObject
      },
      "url": "https://portal.earthmetabolome.org/"
    }
  }
}

In [None]:
def create_doi_with_json_payload(endpoint, auth_str, payload):
    url = endpoint + "dois/"

    headers = {
        "content-type": "application/json",
        "authorization": auth_str
    }

    return requests.post(url, json=payload, headers=headers)

In [None]:
response = create_doi_with_json_payload(endpoint, auth_str, payload)

print(json.dumps(response.json(), indent=4))

To publish a DOI directly (without draft state), add `"event": "publish"` to the attributes:

In [None]:
payload = {
  "data": {
      "type": "dois",
      "attributes": {
          "prefix": "10.82790",
          "event": "publish",
          "creators": [
              {
                  "name": "Digital Botanical Gardens Initiative"
              }
          ],
          "titles": [
              {
                  "title": "Test entry created from API."
              }
          ],
          "publisher": "DBGI",
          "publicationYear": 2024,
          "types": {
              "resourceTypeGeneral": "PhysicalObject" # for IGSN repositories, this should always be PhysicalObject
          },
          "url": "https://portal.earthmetabolome.org/"
    }
  }
}

In [None]:
response = create_doi_with_json_payload(endpoint, auth_str, payload)

print(json.dumps(response.json(), indent=4))
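
The draft and publish payloads differ only in the `event` attribute, so one can be derived from the other instead of being written out twice. A minimal sketch, assuming a hypothetical helper `with_event` and a shortened sample payload:

```python
import copy


def with_event(payload: dict, event: str) -> dict:
    """Return a copy of the payload with the given event set; the input is untouched."""
    updated = copy.deepcopy(payload)
    updated["data"]["attributes"]["event"] = event
    return updated


# Shortened sample payload, for illustration only.
draft_payload = {
    "data": {
        "type": "dois",
        "attributes": {
            "prefix": "10.82790",
            "titles": [{"title": "Test entry created from API."}],
        },
    }
}

publish_payload = with_event(draft_payload, "publish")
print(publish_payload["data"]["attributes"]["event"])
```

The deep copy keeps the draft payload reusable, since nested dictionaries would otherwise be shared between the two.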

## Delete a DOI in draft state
DOIs in draft state can be deleted. Published DOIs cannot be deleted; they can only be updated (see below).

In [None]:
# the URL is constructed from the endpoint and the DOI ID (prefix and suffix)
doi_id = "10.82790/n5xn-jm02"  # named doi_id to avoid shadowing Python's built-in id()
url = endpoint + "dois/" + doi_id

headers = {"authorization": auth_str}

response = requests.delete(url, headers=headers)

assert response.status_code == 204, response.text  # successful deletion returns status code 204
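
Since only drafts can be deleted, it may be worth checking a record's state before sending the DELETE request. The sketch below assumes that the DOI record carries a `state` attribute with the value `"draft"` for drafts; this should be confirmed against an actual response. The helper `is_deletable` and the sample records are hypothetical.

```python
def is_deletable(record: dict) -> bool:
    """Return True if the record looks like a draft (assumes a `state` attribute)."""
    state = record.get("data", {}).get("attributes", {}).get("state")
    return state == "draft"


# Hand-written samples, not real API responses.
draft_record = {"data": {"attributes": {"doi": "10.82790/aaaa", "state": "draft"}}}
published_record = {"data": {"attributes": {"doi": "10.82790/bbbb", "state": "findable"}}}

print(is_deletable(draft_record), is_deletable(published_record))
```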

## Update a DOI
PUT requests to the /dois endpoint will **update** a DOI record if it already exists and create a **new record** if the DOI name is not already taken. When updating a record via the API, only the attributes included in the payload will be affected.

### Example: update the URL of the record and add a license

In [None]:
payload = {
  "data": {
    "type": "dois",
    "attributes": {
        "url": "https://portal.earthmetabolome.org",
        "rightsList": [
            {
                "rights": "Creative Commons Zero v1.0 Universal",
                "rightsUri": "https://creativecommons.org/publicdomain/zero/1.0/legalcode",
                "schemeUri": "https://spdx.org/licenses/",
                "rightsIdentifier": "cc0-1.0",
                "rightsIdentifierScheme": "SPDX"
            }
        ]
    }
  }
}

In [None]:
def update_doi_with_json_payload(endpoint, doi_id, auth_str, payload):
    url = endpoint + "dois/" + doi_id

    headers = {
        "content-type": "application/json",
        "authorization": auth_str
    }

    return requests.put(url, json=payload, headers=headers)

In [None]:
doi_id = "10.82790/7em2-m082"
response = update_doi_with_json_payload(endpoint, doi_id, auth_str, payload)

print(json.dumps(response.json(), indent=4))
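
To confirm that an update touched only the intended attributes, the attributes before and after the request can be compared locally. This is a minimal sketch on hand-written dictionaries. The helper `changed_attributes` is hypothetical and says nothing about how the server applies the update.

```python
def changed_attributes(before: dict, after: dict) -> dict:
    """Return {attribute: (old, new)} for every attribute whose value differs."""
    keys = set(before) | set(after)
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(keys)
        if before.get(key) != after.get(key)
    }


# Hand-written samples standing in for the attributes before and after a PUT.
attributes_before = {
    "url": "https://portal.earthmetabolome.org/",
    "publisher": "DBGI",
    "rightsList": [],
}
attributes_after = {
    "url": "https://portal.earthmetabolome.org",
    "publisher": "DBGI",
    "rightsList": [{"rightsIdentifier": "cc0-1.0"}],
}

diff = changed_attributes(attributes_before, attributes_after)
print(sorted(diff))
```

With live data, `attributes_before` and `attributes_after` would come from `response.json()['data']['attributes']` of a GET request issued before and after the update.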

## Metadata provenance
The changes made to a DOI are tracked and can be retrieved via the `/activities` endpoint. The example below shows the initial metadata version submitted at creation (`"action": "create"`) and subsequent changed versions (`"action": "update"`).

In [None]:
# DOI whose history is retrieved; swap in "10.82790/7em2-m082" to inspect the record updated above
doi_id = "10.82790/7aqr-3518"

url = endpoint + "dois/" + doi_id + "/activities"

# url = "https://api.datacite.org/dois/10.5438/jwvf-8a66/activities"

headers = {"accept": "application/vnd.api+json"}

response = requests.get(url, headers=headers)

print(json.dumps(response.json(), indent=4))
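
The activities response can be long, so a quick tally of the actions gives a useful overview. The sketch below counts `action` values in a hand-written sample. The helper `count_actions` is hypothetical, and the sample assumes the same `data` / `attributes` nesting as the DOI responses.

```python
from collections import Counter


def count_actions(activities: dict) -> Counter:
    """Count how often each action occurs in an activities response body."""
    return Counter(
        item.get("attributes", {}).get("action")
        for item in activities.get("data", [])
    )


# Hand-written sample, not a real API response.
sample_activities = {
    "data": [
        {"type": "activities", "attributes": {"action": "create"}},
        {"type": "activities", "attributes": {"action": "update"}},
        {"type": "activities", "attributes": {"action": "update"}},
    ]
}

action_counts = count_actions(sample_activities)
print(action_counts)
```

With a live response, the call would be `count_actions(response.json())`.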