# Minting NGEE Tropics DOIs 
This notebook has the steps for minting an NGEE Tropics data set DOI.

### Setup
**a.** <br>
Set the following environment variables before you start this Jupyter notebook:

* `NGT_USERNAME`
* `NGT_PASSWORD`
* `ELINK_USER`
* `ELINK_PASSWORD`

If you need to set these variables, stop Jupyter and do the following.

Create an environment file named `mint_doi_ngt.sh` and put the following 
text in it.

    export NGT_USERNAME=<your username>
    export NGT_PASSWORD=<your password>
    export ELINK_USER=<your user>
    export ELINK_PASSWORD=<your password>
    
Next, before you start Jupyter again, run the following.

    source mint_doi_ngt.sh

**b.**
<br> Verify that you have access to ngt-dev.lbl.gov. <br>
Verify that you are in the NGT Administrator group.
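The variables listed in step **a** can be checked from Python before running the rest of the notebook. This is a minimal sketch; the helper name `missing_env_vars` is hypothetical and is not used elsewhere in this notebook.

```python
import os

# Variable names taken from the setup list in step a
REQUIRED_VARS = ["NGT_USERNAME", "NGT_PASSWORD", "ELINK_USER", "ELINK_PASSWORD"]


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables: {}".format(", ".join(missing)))
else:
    print("All required environment variables are set.")
```

If anything is reported missing, stop Jupyter, source `mint_doi_ngt.sh`, and start Jupyter again.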

In [None]:
def print_response(response):
    """
    Pretty print the HTTP Response
    """
    # Use the function argument rather than a global variable
    print("Status Code: {}".format(response.status_code))
    print(response.text)

In [None]:
from ipywidgets import widgets, interact
from IPython.display import display

# Set up the inputs
host_text = widgets.Text("https://ngt-dev.lbl.gov", description="Host:")
ngt_id_text = widgets.Text("NGT0050", description="NGT ID:")

# Display Widgets
display(host_text)
display(ngt_id_text)
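A typo in the NGT ID only shows up later, when the dataset lookup fails. The sketch below checks the ID up front. It assumes an ID is `NGT` followed by exactly four digits, which is inferred from the default value `NGT0050` and is not a documented rule; `is_valid_ngt_id` is a hypothetical helper.

```python
import re

# Assumption: an NGT ID is "NGT" followed by exactly four digits (e.g. NGT0050)
NGT_ID_PATTERN = re.compile(r"^NGT\d{4}$")


def is_valid_ngt_id(ngt_id):
    """Return True when ngt_id looks like NGTXXXX with four digits (hypothetical helper)."""
    return bool(NGT_ID_PATTERN.match(ngt_id.strip()))


print(is_valid_ngt_id("NGT0050"))
print(is_valid_ngt_id("ngt50"))
```

In the notebook, the check could be applied to `ngt_id_text.value` before searching for the dataset.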

## OAUTH Token
### Step 1: Register an application
To obtain a valid access_token, we must first register an application. Django OAuth Toolkit (DOT) provides a set of customizable views for creating, reading, updating, and deleting application instances. Point your browser at:

http://{ngt-host}/o/applications/ <br>

Click on the link to create a new application and fill the form with the following data:

- *Name:* just a name of your choice
- *Client Type:* confidential
- *Authorization Grant Type:* Resource owner password-based

Save your app!



Enter your client id and client secret:

In [None]:
# Set up the inputs

client_id_text = widgets.Text("<your client_id>", description="Client ID:") 
client_secret_text = widgets.Text("<your client secret>", description="Client Secret:") 
access_token_text = widgets.Text("", description="Token:")

# Display Widgets
print(f"Click on the link to create a new application: {host_text.value}/o/applications")
display(client_id_text)
display(client_secret_text)
display(access_token_text)

### Step 2: Get your token and use your API
At this point we’re ready to request an access_token.

In [None]:
import requests
import os

# Was an access token provided?
if not access_token_text.value:
    
    # No access token was provided,
    # so request one
    token_url=f"{host_text.value}/o/token/"
    params = {'grant_type':'password',
             'username': os.environ["NGT_USERNAME"], 
             'password': os.environ["NGT_PASSWORD"]}
    print(f"Token URL: {token_url}")
    r = requests.post(token_url, params, 
                      auth=(client_id_text.value, client_secret_text.value))

    if r.status_code == 200:
        print_response(r)
        print("**************************************")
        print("SAVE this information for the future!!")
        print("**************************************")
        access_token=r.json()
        token = access_token['access_token']

    else:
        print_response(r)
else:
    # An access token was provided
    token = access_token_text.value
    print(f"Found Token: {token}")

## Step 3: Convert NGT JSON to OSTI XML

Import the required Python modules.

In [None]:
import sys
import os
import re
import requests
from datetime import datetime
import xml.etree.ElementTree as ET

In [None]:
def set_value(record, name, value):
    """
    Sets the element value for the record

    :param record: the parent XML element (the OSTI record)
    :param name: name of the field
    :param value: value of the field to set
    :return: None
    """

    ET.SubElement(record, name).text=value
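As a quick illustration of what `set_value` produces, the sketch below builds a small `records` tree and pretty-prints it. The field values are hypothetical and only serve the example; the function body repeats the one defined above so the block runs on its own.

```python
import xml.etree.ElementTree as ET
from xml.dom.minidom import parseString


def set_value(record, name, value):
    """Append a child element called `name` with text `value` to `record`."""
    ET.SubElement(record, name).text = value


# Hypothetical values, for illustration only
records = ET.Element("records")
record = ET.SubElement(records, "record")
set_value(record, "title", "Example dataset title")
set_value(record, "language", "English")

# Serialize to a string, then pretty-print with minidom
xml_text = ET.tostring(records, encoding="unicode")
print(parseString(xml_text).toprettyxml())
```

Each call adds one child element to the record, so the order of the calls determines the order of the elements in the XML.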

In [None]:
def creators(session, record, authors_json):
    """
    Generate the creators block

    :param session: The http session
    :type session: requests.Session
    :param record: The OSTI xml record
    :param authors_json: The NGT Archive dataset authors JSON
    :return: None
    """
    creators_block = ET.SubElement(record,'creatorsblock')
    for author in authors_json:

        r = session.get(author)
        if r.status_code == 200:
            person_json = r.json()
            creator_detail = ET.SubElement(creators_block,'creators_detail')
            set_value(creator_detail,'first_name',person_json["first_name"])
            set_value(creator_detail,'last_name',person_json["last_name"])
            set_value(creator_detail,'private_email',person_json["email"])
            set_value(creator_detail,'affiliation_name',person_json["institution_affiliation"])
        else:
            print("HTTP {}: {}".format(r.status_code, r.content.decode('utf-8')), file=sys.stderr)

In [None]:
def find_dataset_by_ngt_id(session, ngt_id):
    """
    Search for the dataset by NGT ID
    
    :param session: The http session
    :type session: requests.Session
    :param ngt_id: The NGT Archive id (NGTXXXX)
    :type ngt_id: str
    """
    # Build the URL with string formatting; os.path.join is for file paths
    r = session.get(f"{host_text.value.rstrip('/')}/api/v1/datasets/")
    if r.status_code == 200:
        datasets = r.json()
        for d in datasets:
            if d["data_set_id"] == ngt_id:
                return d
        raise Exception("No dataset with NGT ID {} found!!".format(ngt_id))
    else:
        raise Exception("HTTP {}: {}".format(r.status_code,r.content.decode('utf-8')))

In [None]:
def get_session(token):
    """
    Create a session that sends the access token
    with every request to NGEE Tropics
    
    :param token: Your access token
    
    :return: requests.Session
    """
    s = requests.Session()
    s.headers.update({"Authorization": f"Bearer {token}"})
    return s

## Login to the NGT Archive and start a session
Start a session that authenticates to the NGT Archive with your access token. The `session` object can be used throughout this notebook to interact with the **NGEE Tropics Archive Service**.

In [None]:
session = get_session(token)

## Get the NGEE Tropics Dataset
Find the dataset with the given `ngt_id`.

In [None]:
# find the dataset with the ngt_id 
dataset_json = find_dataset_by_ngt_id(session, ngt_id_text.value)

print("***********************************")
print("   Got the Dataset {}!!".format(dataset_json["data_set_id"]))
print("***********************************")

Create the OSTI record from the NGT Archive dataset.

In [None]:
# Basic NGT to OSTI Mapping
MAPPING = [('title','name'),
   ('product_nos','data_set_id'),
   ('contract_nos','doe_funding_contract_numbers'),
   ('non-doe_contract_nos','doe_funding_contract_numbers'),
   ('originating_research_org','originating_institution'),
   ('description','description'),
   ('sponsor_org','funding_organizations'),
   ('related_resource','reference')]

# Create OSTI XML
records = ET.Element('records')
record = ET.SubElement(records,'record')

for k, v in MAPPING:
    set_value(record,k,dataset_json[v])

# Leave blank for a new record; otherwise set it to the OSTI ID of the existing record.
set_value(record,'osti_id','1605211')

# Dataset Type: refers to the main content of the dataset.
# Only one value is allowed, given as a two-letter code.
set_value(record,'dataset_type', 'SM') # Specialized Mix

# Auto-fill
set_value(record,'site_url','{}/dois/{}'.format(host_text.value,dataset_json["data_set_id"]))
set_value(record,'publication_date',datetime.now().strftime("%Y"))
set_value(record, 'contact_name', 'NGEE Tropics Archive Team, Support Organization')
set_value(record, 'contact_email', 'ngee-tropics-archive@googlegroups.com')
set_value(record, 'contact_org', 'Lawrence Berkeley National Lab')
set_value(record, 'site_code','NGEE-TRPC')
set_value(record, 'doi_infix','ngt')
set_value(record, 'subject_categories_code','54 ENVIRONMENTAL SCIENCES')
set_value(record, 'language','English')
set_value(record, 'country',"US")

# Generate the Authors section
creators(session,record, dataset_json["authors"])

# Store OSTI XML 
from xml.dom.minidom import parseString
dom3 = parseString(ET.tostring(records))
OSTI_XML=dom3.toprettyxml()

## Step 4: Modify the OSTI XML
Separate the DOE contract numbers from the NON-DOE contract numbers.

`contract_nos` - Use the format of the contract “as is,” but leave off any preceding “DE”. If multiple DOE contract and/or grant numbers apply, separate with a semi-colon followed by a space.

`non-doe_contract_nos` - Enter contract or award numbers that are not assigned by DOE (an NSF award number, for example). Multiple entries are allowed. They must be separated by a semi-colon followed by a space.
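The two formatting rules above can be sketched in Python. This is only an illustration: it assumes that DOE numbers are written with a leading `DE-` and that everything else is a non-DOE number, which may not hold for your records. The helper `split_contract_numbers` and the sample numbers are hypothetical.

```python
import re

# Assumption: DOE contract numbers carry a leading "DE-"; anything else is non-DOE
DOE_PREFIX = re.compile(r"^DE-", re.IGNORECASE)


def split_contract_numbers(numbers):
    """Return (contract_nos, non_doe_contract_nos) as "; "-separated strings."""
    doe, non_doe = [], []
    for number in numbers:
        number = number.strip()
        if not number:
            continue
        if DOE_PREFIX.match(number):
            # Leave off the preceding "DE" and keep the rest as is
            doe.append(DOE_PREFIX.sub("", number))
        else:
            non_doe.append(number)
    return "; ".join(doe), "; ".join(non_doe)


# Sample values for illustration only
contract_nos, non_doe_contract_nos = split_contract_numbers(
    ["DE-AC02-05CH11231", "DE-SC0012704", "NSF-1234567"]
)
print(contract_nos)
print(non_doe_contract_nos)
```

The resulting strings can be pasted into the `contract_nos` and `non-doe_contract_nos` elements while editing the XML.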

## Edit the OSTI XML in the Text Area below

In [None]:
from ipywidgets import widgets, interact
from IPython.display import display

ta = widgets.Textarea(OSTI_XML)
#ta.layout.width = '95%'
#ta.layout.height = '300px'
display(ta)

### Confirm OSTI XML
Review the edited XML. If it is incorrect, make corrections in the text area above.

In [None]:
#print(ta.value)
print(ta.value.encode('utf-8')) #For funky strings

## Step 5: Mint NGEE Tropics DOI
Execute the cell below to mint a new DOI with OSTI.


In [None]:
elink_url="https://www.osti.gov/elink/2416api"

headers = {'Content-Type': 'application/xml'} 
r = requests.post(elink_url, 
                  data=ta.value.encode('utf-8'), 
                  headers=headers, 
                  auth=(os.environ["ELINK_USER"],os.environ["ELINK_PASSWORD"]))

## Print the Results

In [None]:
DOI=None


if r.status_code == 200:
    dom3 = parseString(r.content)
    print(dom3.toprettyxml())
    root = ET.fromstring(r.content)
    doi = root.find("./record/doi")
    status = doi.get("status")
    
    DOI = doi.text
    print("DOI:\t\t{}".format(doi.text))
    print("DOI STATUS:\t{}".format(status))
    
    
else:
    print_response(r)
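The DOI extraction can be tried offline against a sample response. The XML below is hypothetical: it is shaped only to match the `./record/doi` lookup used above, and the `PENDING` status is an assumed value, not a documented one. The helper `extract_doi` is also hypothetical and adds a guard for a response with no `doi` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body, shaped to match the "./record/doi" lookup
SAMPLE_RESPONSE = b"""<?xml version="1.0" encoding="UTF-8"?>
<records>
  <record>
    <doi status="PENDING">10.15486/ngt/1434046</doi>
  </record>
</records>"""


def extract_doi(xml_bytes):
    """Return (doi, status), or (None, None) when no <doi> element exists."""
    root = ET.fromstring(xml_bytes)
    doi = root.find("./record/doi")
    if doi is None:
        return None, None
    return doi.text, doi.get("status")


print(extract_doi(SAMPLE_RESPONSE))
```

Checking for `None` avoids an `AttributeError` when the service returns a record without a `doi` element, for example when the submission was rejected.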

## Step 6: Update the NGT Record with the DOI
Get the new DOI minted from the above procedure and review below.

In [None]:
# DOI = "10.15486/ngt/1434046"
DOI_URL = "http://dx.doi.org/{}".format(DOI)
print(DOI_URL)

NGT datasets that have been approved may not be edited. To edit an approved dataset, it must first be unapproved, then updated, and then reapproved. **Only execute this on "APPROVED" datasets.**

In [None]:
import json

update_data = dict()
update_data.update(dataset_json)
update_data["doi"]=DOI_URL

# Unapprove first to update DOI
r = session.get("{}unapprove".format(dataset_json["url"]))
print_response(r)

if r.status_code == 200:
    # The session already sends the bearer token, so no cookies
    # or CSRF token are passed here
    r = session.put(dataset_json["url"], data=json.dumps(update_data),
                    headers={'Content-Type': 'application/json',
                             'Referer': host_text.value})
    if r.status_code == 200:
        print("\nSuccessfully Updated DOI\n")
    else:
        print_response(r)
    
    # Reapprove dataset
    r = session.get("{}approve".format(dataset_json["url"]))
    print_response(r)
else:
    print_response(r)

## Confirm the update

In [None]:
r = session.get(dataset_json["url"])
print(r.json()["doi"])