# USGS DOI Tool Python Module Examples
The USGS Digital Object Identifier (DOI) Tool mints unique identifiers for USGS products. Please be mindful during development and testing so that large numbers of DOIs are not accidentally created or set to 'published'. Once published, a DOI must persist through time (it cannot be deleted). The USGS DOI Tool staging environment can help as you develop a workflow: any DOI created there is assigned a 10.5072 prefix and will not be sent to DataCite for minting.
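
As a guard during development, a small check on the DOI prefix can confirm that an identifier came from the staging environment before further calls are made. This is a minimal sketch: `is_staging_doi` is a hypothetical helper (not part of usgs_datatools), and it relies only on the 10.5072 prefix described above.

```python
STAGING_PREFIX = '10.5072'  # prefix assigned to DOIs created in staging (per the text above)

def is_staging_doi(doi_string):
    """Return True if the DOI string carries the staging prefix."""
    bare = doi_string.strip()
    # Accept both 'doi:10.5072/XXXX' and '10.5072/XXXX'.
    if bare.lower().startswith('doi:'):
        bare = bare[4:]
    prefix = bare.split('/', 1)[0]
    return prefix == STAGING_PREFIX

print(is_staging_doi('doi:10.5072/S9UST85C'))  # staging DOI used later in this notebook
print(is_staging_doi('doi:10.5066/F7W0944J'))  # production DOI used later in this notebook
```

A check like this could run before any update call in a test script, so that a production DOI is never modified by accident.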

This module requires the usgs_datatools package which can be installed using:

"pip install git+https://github.com/bserna-usgs/usgs_datatools.git"

In [14]:
import os
import json
import requests
import getpass
from usgs_datatools import doi

### Configuration
The tool expects a valid USGS Active Directory account formatted as: username@usgs.gov. 

In [15]:
username = 'dignizio@usgs.gov'
password = getpass.getpass('USGS AD Password: ')
print('*Complete*')

USGS AD Password: ········
*Complete*


# Example 1 - Reading Existing DOIs

### Establish a DOI Tool Session (production)
A DOI Session object is instantiated against a particular DOI Tool environment ('production' or 'staging'). To read an existing, well-formed DOI, this example specifies the 'production' environment.

In [16]:
# Choose the environment: env='production' or env='staging'.
# Note: the user must be on the USGS network or VPN to use the staging environment.
DoiSession = doi.DoiSession(env='production')

### Authentication

In [17]:
DoiSession.doi_authenticate(username, password)
print ("Successfully authenticated.")

Successfully authenticated.


### Fetch DOI Attributes

In [18]:
## Note the expected format for the DOI string param passed to the get_doi function is 'doi:10.5066/XXXXXXXX'
## Example published DOIs that are well-formatted and illustrate the structure of the DOI model:
#'doi:10.5066/F73F4NVM'
#'doi:10.5066/F76972V8'
#'doi:10.5066/F7W0944J'
    
myDoi = DoiSession.get_doi('doi:10.5066/F7W0944J')
myDoi

{'doi': 'doi:10.5066/F7W0944J',
 'title': 'North American Breeding Bird Survey Dataset 1966 - 2016, version 2016.0',
 'pubDate': '2017',
 'url': 'ftp://ftpext.usgs.gov/pub/er/md/laurel/BBS/Archivefiles/Version2016v0/',
 'resourceType': 'Dataset',
 'date': '1966/2016',
 'dateType': 'Collected',
 'description': 'The 1966-2016 North American Breeding Bird Survey dataset contains avian point count data for more than 700 North American bird taxa (primarily species, but also some races and unidentified species groupings).  These data are collected annually during the breeding season, primarily June and May, along thousands of randomly established roadside survey routes in the United States and Canada. Routes are about 24.5 miles (39.2 km) long with counting locations placed at regular intervals, for a total of 50 stops. At each stop, a person highly skilled in avian identification conducts a 3-minute point count, recording every bird seen within a quarter-mile (400-m) radius and every bird h
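
The `get_doi` call expects a string of the form 'doi:10.5066/XXXXXXXX'. A format check before the call can catch a missing 'doi:' prefix early. This is only a sketch: `looks_like_doi` is a hypothetical helper, and the pattern is an assumption inferred from the example DOIs in this notebook, so the DOI Tool itself may accept or reject other forms.

```python
import re

# Assumed pattern, inferred from the example DOIs in this notebook:
# a 'doi:' prefix, a '10.NNNN' registrant prefix, a slash, and an alphanumeric suffix.
DOI_PATTERN = re.compile(r'doi:10\.\d{4,}/[A-Za-z0-9]+')

def looks_like_doi(doi_string):
    """Return True if the whole string matches the assumed DOI format."""
    return DOI_PATTERN.fullmatch(doi_string) is not None

print(looks_like_doi('doi:10.5066/F7W0944J'))  # well-formed
print(looks_like_doi('10.5066/F7W0944J'))      # missing the 'doi:' prefix
```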

### Inspect the 'description' element

In [19]:
myDoi['description']

'The 1966-2016 North American Breeding Bird Survey dataset contains avian point count data for more than 700 North American bird taxa (primarily species, but also some races and unidentified species groupings).  These data are collected annually during the breeding season, primarily June and May, along thousands of randomly established roadside survey routes in the United States and Canada. Routes are about 24.5 miles (39.2 km) long with counting locations placed at regular intervals, for a total of 50 stops. At each stop, a person highly skilled in avian identification conducts a 3-minute point count, recording every bird seen within a quarter-mile (400-m) radius and every bird heard.  Surveys begin 30 minutes before local sunrise and take approximately 5 hours to complete.  A route is sampled once per year, with the total number of routes sampled per year growing over time; about 600 routes were sampled in 1966, while in recent decades approximately 3000 routes have been sampled annu

### Inspect the 'authors' element
Note that the 'authors' element is a list (array) of dictionaries (the 'related identifiers' element is similarly structured).

In [20]:
myDoi['authors']

[{'authorName': 'Hudson, Marie-Anne R.',
  'orcId': '',
  'nameType': 'Personal',
  'position': 0},
 {'authorName': 'Lutmerding, Michael',
  'orcId': '',
  'nameType': 'Personal',
  'position': 1},
 {'authorName': 'Campbell, Kate',
  'orcId': '',
  'nameType': 'Personal',
  'position': 2},
 {'authorName': 'Pardieck, Keith L.',
  'orcId': '',
  'nameType': 'Personal',
  'position': 3},
 {'authorName': 'Ziolkowski Jr., David',
  'orcId': '0000-0002-2500-4417',
  'nameType': 'Personal',
  'position': 4}]
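
Because 'authors' is a list of dictionaries, ordinary list and dictionary operations apply. The sketch below orders authors by 'position' and collects the non-empty ORCIDs. The helper names are hypothetical, and the sample is a subset of the output above, deliberately out of order.

```python
def author_names(authors):
    """Return author names ordered by their 'position' value."""
    # int() tolerates positions given as strings ('0') or integers (0).
    ordered = sorted(authors, key=lambda a: int(a['position']))
    return [a['authorName'] for a in ordered]

def authors_with_orcid(authors):
    """Return {authorName: orcId} for authors that have a non-empty ORCID."""
    return {a['authorName']: a['orcId'] for a in authors if a.get('orcId')}

sample = [
    {'authorName': 'Pardieck, Keith L.', 'orcId': '',
     'nameType': 'Personal', 'position': 3},
    {'authorName': 'Hudson, Marie-Anne R.', 'orcId': '',
     'nameType': 'Personal', 'position': 0},
    {'authorName': 'Ziolkowski Jr., David', 'orcId': '0000-0002-2500-4417',
     'nameType': 'Personal', 'position': 4},
]

print(author_names(sample))
print(authors_with_orcid(sample))
```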

# Example 2 - Creating and Modifying DOIs

### Establish a DOI Tool Session (staging)
A DOI Session object is instantiated against a particular DOI Tool environment ('production' or 'staging'). To create and modify a DOI, this example specifies the 'staging' environment. Staging DOIs are ephemeral and are purged roughly every two weeks, so the modification example begins by creating a new DOI to work with; this keeps the notebook usable over time.

Note that the DOI Tool session times out after a very short period of inactivity. If later calls return errors, make sure the session is still active (sign in again to re-establish the session).
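
One way to cope with the short time-out is to re-authenticate and retry when a call fails. The sketch below is an assumption-heavy illustration: `call_with_reauth` is a hypothetical helper, and because the exception raised on an expired session is not documented here, it catches `Exception` broadly. Stand-in functions replace the live session so the example runs offline.

```python
def call_with_reauth(action, reauthenticate, retries=1):
    """Run action(); if it raises, call reauthenticate() and try again."""
    for attempt in range(retries + 1):
        try:
            return action()
        except Exception:  # assumption: the time-out exception type is unknown
            if attempt == retries:
                raise  # out of retries: surface the original error
            reauthenticate()

# Stand-in functions so the sketch runs without a live session.
state = {'authenticated': False, 'logins': 0}

def fake_fetch():
    if not state['authenticated']:
        raise RuntimeError('session expired')
    return {'doi': 'doi:10.5072/S9UST85C'}

def fake_login():
    state['authenticated'] = True
    state['logins'] += 1

result = call_with_reauth(fake_fetch, fake_login)
print(result['doi'], state['logins'])
```

With a live session, the two callables might be `lambda: DoiSession.get_doi(newDoi)` and `lambda: DoiSession.doi_authenticate(username, password)`. Retrying is safest for read calls; retrying a create call could produce a duplicate DOI if the first attempt actually succeeded.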

In [21]:
# Choose the environment: env='production' or env='staging'.
# Note: the user must be on the USGS network or VPN to use the staging environment.
DoiSession = doi.DoiSession(env='staging')

### Authentication

In [22]:
DoiSession.doi_authenticate(username, password)
print ("Successfully authenticated.")

Successfully authenticated.


### Create New DOI
The same application rules still apply here. Keep in mind that a DOI cannot have the same resource URL as another DOI. See the references at the end of this notebook for additional details on valid fields in the DOI model. Logged-in users can find the full schema mapping at https://www1.usgs.gov/csas/doi/web_services.html#/Read_DOI/read1.

For initial creation of a new DOI (in production) it is recommended to set the status to 'reserved'.  This gives the user an opportunity to inspect and finalize the DOI before it is minted.

Keys are case-sensitive. For example, in the output recorded in this notebook, authors submitted with the key 'orcid' instead of 'orcId' come back with 'orcId': None, which suggests that the miscased key was ignored.

'dataSourceName' is validated against a specific list. Contact the DOI Tool team for details.

In [23]:
# New DOI record. It is named doiRecord (not doi) so it does not shadow the imported doi module.
# Note: keys are case-sensitive. The 'orcid' keys below come back as 'orcId': None in the
# recorded output; spell the key 'orcId', as in the first author.
doiRecord = {'title': 'A Test DOI Created Through the Python Wrapper for the USGS DOI Tool',
      'status': 'reserved',
      'url': 'https://data.usgs.gov/datacatalog/doi-messages/temporary.html',
      'description': 'This text block stores an informative textual description of the resource.',
      'authors': [{'authorName': 'Ignizio, Drew A.',
                   'orcId': '0000-0001-8054-5139',
                   'position': '0',
                   'nameType': 'Personal'},
                  {'authorName': 'Talbert, Colin B.',
                   'orcid': '0000-0002-9505-1876',
                   'position': '1',
                   'nameType': 'Personal'},
                  {'authorName': 'Zolly, Lisa',
                   'orcid': '0000-0003-3595-7809',
                   'position': '2',
                   'nameType': 'Personal'},
                  {'authorName': 'Unicorn Enterprises Team',
                   'position': '3',
                   'orcid': '',
                   'nameType': 'Organizational'}],
      'dataSourceName': 'Science Analytics and Synthesis',
      'users': ['dignizio@usgs.gov', 'talbertc@usgs.gov', 'bserna@usgs.gov', 'lisa_zolly@usgs.gov']}

newDoi = DoiSession.doi_create(doiRecord)  # The response is the new DOI's ID.
newDoi

'doi:10.5072/S9UST85C'
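
Because keys are case-sensitive, a pre-flight check on the record can flag keys that differ from the expected spelling only by case. This is a sketch under stated assumptions: `miscased_author_keys` is a hypothetical helper, and the expected author keys are taken from the `get_doi` output shown in this notebook rather than from the official schema.

```python
# Author keys as they appear in the get_doi output in this notebook (assumed complete).
EXPECTED_AUTHOR_KEYS = {'authorName', 'orcId', 'position', 'nameType'}

def miscased_author_keys(record):
    """Return (author index, key as given, expected key) for keys that differ only by case."""
    by_lower = {key.lower(): key for key in EXPECTED_AUTHOR_KEYS}
    problems = []
    for index, author in enumerate(record.get('authors', [])):
        for key in author:
            expected = by_lower.get(key.lower())
            if expected is not None and expected != key:
                problems.append((index, key, expected))
    return problems

record = {'title': 'Example', 'authors': [
    {'authorName': 'Ignizio, Drew A.', 'orcId': '0000-0001-8054-5139',
     'position': '0', 'nameType': 'Personal'},
    {'authorName': 'Talbert, Colin B.', 'orcid': '0000-0002-9505-1876',
     'position': '1', 'nameType': 'Personal'},
]}

print(miscased_author_keys(record))
```

An empty list means no miscased author keys were found; it does not validate the values themselves.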

### Fetch DOI Attributes (pull back the values for the newly created DOI as JSON)

In [26]:
newTestDoi = DoiSession.get_doi(newDoi)
newTestDoi

{'doi': 'doi:10.5072/S9UST85C',
 'title': 'A Test DOI Created Through the Python Wrapper for the USGS DOI Tool',
 'pubDate': None,
 'url': 'https://data.usgs.gov/datacatalog/doi-messages/temporary.html',
 'resourceType': None,
 'date': None,
 'dateType': None,
 'description': 'This text block stores an informative textual description of the resource.',
 'subject': None,
 'username': 'dignizio@usgs.gov',
 'status': 'reserved',
 'noDataReleaseAvailableReason': None,
 'noPublicationIdAvailable': False,
 'dataSourceId': 59507,
 'dataSourceName': 'Science Analytics and Synthesis',
 'linkCheckingStatus': None,
 'formatTypes': [],
 'authors': [{'authorName': 'Ignizio, Drew A.',
   'orcId': '0000-0001-8054-5139',
   'nameType': 'Personal',
   'position': 0},
  {'authorName': 'Talbert, Colin B.',
   'orcId': None,
   'nameType': 'Personal',
   'position': 1},
  {'authorName': 'Zolly, Lisa',
   'orcId': None,
   'nameType': 'Personal',
   'position': 2},
  {'authorName': 'Unicorn Enterprises Tea

### Update Existing DOI
A quick note: the "get_doi" function does not always return the correct status. If you are editing a reserved (on hold) DOI in the 'production' environment and want it to remain reserved (not public), always explicitly reset the status to "reserved" before updating. Otherwise, an 'update' call may inadvertently publish the DOI, which cannot be undone.

Here, we illustrate how to update 'title' and 'authors' while ensuring that the 'status' is not accidentally set to public.
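
The "always reset the status" rule can be wrapped in a small function so it is not forgotten. A minimal sketch follows: `with_reserved_status` is a hypothetical helper, and the sample record uses illustrative values (including a missing status) rather than real DOI Tool output.

```python
import copy

def with_reserved_status(record):
    """Return a deep copy of the record with 'status' forced to 'reserved'."""
    safe = copy.deepcopy(record)  # leave the fetched record untouched
    safe['status'] = 'reserved'
    return safe

# Illustrative record, standing in for the result of get_doi.
fetched = {'doi': 'doi:10.5072/S9UST85C', 'title': 'Old title', 'status': None}

to_send = with_reserved_status(fetched)
to_send['title'] = 'New title'

print(to_send['status'], fetched['status'], fetched['title'])
```

Working on a copy keeps the fetched record available for comparison after the update.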

### Step 1. Update the DOI locally

In [30]:
# Working with content of a DOI locally (after 'get_doi'), first change DOI attributes in the local JSON object.
newTestDoi['title'] = 'A Modified DOI Made with the Python Wrapper for the USGS DOI Tool'
newTestDoi['authors'] = [
    {'authorName': 'Duck, Donald D.', 
     'orcId': '0000-0001-8054-5139', 
     'position': '0', 
     'nameType': 'Personal'}, 
    {'authorName': 'Mouse, Mickey M.', 
     'orcid': '0000-0002-9505-1876', 
     'position': '1', 
     'nameType': 'Personal'}, 
    {'authorName': 'Jasmine, Princess', 
     'orcid': '0000-0003-3595-7809', 
     'position': '2', 
     'nameType': 'Personal'}, 
    {'authorName': 'Cyberdyne Industries', 
     'position': '3', 
     'orcid': '', 
     'nameType': 'Organizational'}]
newTestDoi['status'] = 'reserved' # Keeps the DOI reserved; change or remove this line only when you intend to publish.
newTestDoi # Display the local JSON to verify changes were made as expected.

{'doi': 'doi:10.5072/S9UST85C',
 'title': 'A Modified DOI Made with the Python Wrapper for the USGS DOI Tool',
 'pubDate': None,
 'url': 'https://data.usgs.gov/datacatalog/doi-messages/temporary.html',
 'resourceType': None,
 'date': None,
 'dateType': None,
 'description': 'This text block stores an informative textual description of the resource.',
 'subject': None,
 'username': 'dignizio@usgs.gov',
 'status': 'reserved',
 'noDataReleaseAvailableReason': None,
 'noPublicationIdAvailable': False,
 'dataSourceId': 59507,
 'dataSourceName': 'Science Analytics and Synthesis',
 'linkCheckingStatus': None,
 'formatTypes': [],
 'authors': [{'authorName': 'Duck, Donald D.',
   'orcId': '0000-0001-8054-5139',
   'position': '0',
   'nameType': 'Personal'},
  {'authorName': 'Mouse, Mickey M.',
   'orcid': '0000-0002-9505-1876',
   'position': '1',
   'nameType': 'Personal'},
  {'authorName': 'Jasmine, Princess',
   'orcid': '0000-0003-3595-7809',
   'position': '2',
   'nameType': 'Personal'},

### Step 2. Make the call to update the DOI in the DOI tool

In [28]:
DoiSession.doi_update(newTestDoi)

{'doi': 'doi:10.5072/S9UST85C',
 'title': 'A Modified DOI Made with the Python Wrapper for the USGS DOI Tool',
 'pubDate': None,
 'url': 'https://data.usgs.gov/datacatalog/doi-messages/temporary.html',
 'resourceType': None,
 'date': None,
 'dateType': None,
 'description': 'This text block stores an informative textual description of the resource.',
 'subject': None,
 'username': 'dignizio@usgs.gov',
 'status': 'reserved',
 'noDataReleaseAvailableReason': None,
 'noPublicationIdAvailable': False,
 'dataSourceId': 59507,
 'dataSourceName': 'Science Analytics and Synthesis',
 'linkCheckingStatus': None,
 'formatTypes': [],
 'authors': [{'authorName': 'Duck, TESTER.',
   'orcId': '0000-0001-8054-5139',
   'nameType': 'Personal',
   'position': 0},
  {'authorName': 'Mouse, Mickey M.',
   'orcId': None,
   'nameType': 'Personal',
   'position': 1},
  {'authorName': 'Jasmine, Princess',
   'orcId': None,
   'nameType': 'Personal',
   'position': 2},
  {'authorName': 'Cyberdyne Industries',


### Verify That the Updates Were Applied
The 'doi_update' call should return the updated DOI content. To confirm, this step re-fetches the DOI and inspects the values.

In [29]:
updateTest = DoiSession.get_doi(newDoi) # newDoi is the explicit DOI id string we got when we created the test DOI.
updateTest

{'doi': 'doi:10.5072/S9UST85C',
 'title': 'A Modified DOI Made with the Python Wrapper for the USGS DOI Tool',
 'pubDate': None,
 'url': 'https://data.usgs.gov/datacatalog/doi-messages/temporary.html',
 'resourceType': None,
 'date': None,
 'dateType': None,
 'description': 'This text block stores an informative textual description of the resource.',
 'subject': None,
 'username': 'dignizio@usgs.gov',
 'status': 'reserved',
 'noDataReleaseAvailableReason': None,
 'noPublicationIdAvailable': False,
 'dataSourceId': 59507,
 'dataSourceName': 'Science Analytics and Synthesis',
 'linkCheckingStatus': None,
 'formatTypes': [],
 'authors': [{'authorName': 'Duck, TESTER.',
   'orcId': '0000-0001-8054-5139',
   'nameType': 'Personal',
   'position': 0},
  {'authorName': 'Mouse, Mickey M.',
   'orcId': None,
   'nameType': 'Personal',
   'position': 1},
  {'authorName': 'Jasmine, Princess',
   'orcId': None,
   'nameType': 'Personal',
   'position': 2},
  {'authorName': 'Cyberdyne Industries',
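
Instead of reading the full output by eye, the record that was sent can be compared with the record that was fetched. The sketch below is one possible approach: `compare_records` is a hypothetical helper, the compared fields are an arbitrary choice, and the sample records are abbreviated from the outputs in this notebook (where the first author name differs between the local and fetched records).

```python
def compare_records(sent, fetched, fields=('title', 'status', 'url')):
    """Return {field: (sent value, fetched value)} for every difference found."""
    diffs = {}
    for field in fields:
        if sent.get(field) != fetched.get(field):
            diffs[field] = (sent.get(field), fetched.get(field))
    # Compare authors by name only; the service may normalize other author keys.
    sent_names = [a['authorName'] for a in sent.get('authors', [])]
    fetched_names = [a['authorName'] for a in fetched.get('authors', [])]
    if sent_names != fetched_names:
        diffs['authors'] = (sent_names, fetched_names)
    return diffs

sent = {'title': 'A Modified DOI', 'status': 'reserved',
        'authors': [{'authorName': 'Duck, Donald D.'},
                    {'authorName': 'Mouse, Mickey M.'}]}
fetched = {'title': 'A Modified DOI', 'status': 'reserved',
           'authors': [{'authorName': 'Duck, TESTER.'},
                       {'authorName': 'Mouse, Mickey M.'}]}

print(compare_records(sent, fetched))
```

An empty dictionary means the compared fields match.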


### References

[DOI Tool Staging](https://www1-staging.snafu.cr.usgs.gov/csas/doi/)

[DOI Tool Production](https://www1.usgs.gov/csas/doi/)

Logged-in AD users should also review the documentation here: https://www1.usgs.gov/csas/doi/web_services.html

### Other Notes
USGS REST Endpoint (https://www1.usgs.gov/csas/doi/web_services.html#/)

The endpoint is case-sensitive: queries fail unless the DOI is given in capital letters, as in the example below.

Format for requesting a published DOI from the production endpoint (note the colon after 'doi' and the absence of single quotes):

https://www1.usgs.gov/csas/dmapi/doi/doi:10.5066/P9VRV6US
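
The request URL can be assembled from a DOI string while applying the capitalization rule. This is a sketch only: `doi_request_url` is a hypothetical helper, and it assumes that upper-casing everything after the 'doi:' prefix is what the endpoint requires.

```python
BASE_URL = 'https://www1.usgs.gov/csas/dmapi/doi/'  # production endpoint shown above

def doi_request_url(doi_string):
    """Build the request URL, upper-casing the part after the 'doi:' prefix."""
    bare = doi_string.strip()
    if bare.lower().startswith('doi:'):
        bare = bare[4:]
    # Assumption: only the identifier is upper-cased; the 'doi:' prefix stays lowercase.
    return BASE_URL + 'doi:' + bare.upper()

print(doi_request_url('doi:10.5066/p9vrv6us'))
```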