##### Example of using the xDD (eXtract Dark Data) API, previously called GeoDeepDive, to search the snippets route. The search looks up a term of interest in the database and returns basic citation information plus highlights that give the context of each mention of the term.

##### Contact: Daniel Wieferich (dwieferich@usgs.gov)

In [109]:
#Function handling request of information from xDD API

import requests

def xdd_api(route, params):
    """Create list of docs mentioning a term of interest

    Parameters : see https://geodeepdive.org/api for more detail
    ----------
    route : str, one of the available API routes for xDD
    params : str of key=value parameter pairs separated by &

    Returns
    -------
    all_data : list of records collected across all result pages
    hits : total number of matching articles reported by the API
    """
    url = 'https://geodeepdive.org/api'
    search = url + '/' + route + '?' + str(params)
    all_data = []
    hits = 0  # defined up front so the return never fails on an unset name
    print(search)
    # Follow 'next_page' links until the API returns an empty string
    while search != '':
        r = requests.get(search)
        if r.status_code != 200:
            raise Exception('xDD API returning: {}'.format(r.status_code))
        json_r = r.json()
        if 'success' not in json_r:
            raise Exception('xDD API response has no "success" key')
        data = json_r['success']['data']
        search = json_r['success']['next_page']
        hits = json_r['success']['hits']
        all_data.extend(data)

    return all_data, hits
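
The pagination loop in `xdd_api` relies on each response carrying `data`, `hits`, and `next_page` under a `success` key, with an empty `next_page` on the last page. The sketch below replays that loop offline. The `collect_pages` name, the injected `fetch` callable, and the two fake pages are all assumptions made for illustration; they are not part of the xDD API.

```python
def collect_pages(first_url, fetch):
    """Follow 'next_page' links and gather every record.

    fetch is any callable that maps a URL to a decoded JSON dict. It is a
    hypothetical stand-in for requests.get(url).json() so the loop can be
    exercised without network access.
    """
    records, hits, url = [], 0, first_url
    while url:
        body = fetch(url)['success']
        records.extend(body['data'])
        hits = body['hits']
        url = body['next_page']  # assumed empty on the last page
    return records, hits


# Made-up two-page response in the shape the loop expects
fake_pages = {
    'page1': {'success': {'data': [{'title': 'A'}, {'title': 'B'}],
                          'hits': 3, 'next_page': 'page2'}},
    'page2': {'success': {'data': [{'title': 'C'}],
                          'hits': 3, 'next_page': ''}},
}

records, hits = collect_pages('page1', fake_pages.__getitem__)
print(len(records), hits)
```

Passing the fetcher in as an argument keeps the loop logic separate from the HTTP call, which makes it easy to check against canned responses.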
        


##### Search for mentions of YETI (a machine name in USGS Advanced Research Computing) to see if there are any in the xDD database.

In [110]:
#Specify API route and parameters needed for search

route = 'snippets'

#Set a term or loop through terms
#term = 'USGS Advanced Research Computing'
#term = 'USGS YETI'
term = 'YETI'

#List of available parameters and examples : 'https://geodeepdive.org/api/snippets'
params = 'term=' + term + '&clean&full_results'

#Search xDD: results = detailed results per article, hits = total number of articles with mentions
results, hits = xdd_api(route, params)


https://geodeepdive.org/api/snippets?term=YETI&clean&full_results
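
The query string above is assembled by plain concatenation, which works for a single word such as YETI. For multi-word terms like the commented-out alternatives, it can help to percent-encode the term explicitly. The following is a minimal sketch using the standard library; `build_params` is a hypothetical helper, and it assumes that `clean` and `full_results` are valueless flags, as they are used in this notebook.

```python
from urllib.parse import quote


def build_params(term, flags=()):
    """Hypothetical helper: build a query string for xdd_api().

    term is percent-encoded; flags are appended as bare names because the
    notebook passes 'clean' and 'full_results' without values.
    """
    parts = ['term=' + quote(term, safe='')]
    parts.extend(flags)
    return '&'.join(parts)


single = build_params('YETI', ['clean', 'full_results'])
multi = build_params('USGS YETI', ['clean', 'full_results'])
print(single)
print(multi)
```

The first call reproduces the string built by hand in the cell above, so the result can be passed to `xdd_api(route, params)` unchanged.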


##### Example of information returned per article

Fields are defined here: https://geodeepdive.org/api/snippets

In [111]:
results[200]

{'URL': 'http://doi.wiley.com/10.1002/ajpa.1330870413',
 '_gddid': '574e9029cf58f19baeeb2da3',
 'authors': 'Daegling, David J.',
 'coverDate': 'April 1992',
 'doi': '10.1002/ajpa.1330870413',
 'highlight': ['this became the material basis for the contemporary “yeti”myths that haunt the human subconscious. The'],
 'publisher': 'Wiley',
 'pubname': 'American Journal of Physical Anthropology',
 'title': 'Other origins: The search for the giant ape in human prehistory. By Russell Ciochon, John Olsen, and Jamie James. New York: Bantam. xi + 262 pp. $22.95 (cloth)'}
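
Each record is a flat dictionary, so the citation fields can be pulled out directly. As a rough sketch, the hypothetical `format_citation` helper below turns one record into a single line. The citation layout is an arbitrary choice for illustration, and `.get()` is used on the assumption that some records may lack a field.

```python
def format_citation(record):
    """Hypothetical helper: one-line citation from a snippets record."""
    return '{authors} ({date}). {title}. {pubname}. doi:{doi}'.format(
        authors=record.get('authors', 'Unknown author'),
        date=record.get('coverDate', 'n.d.'),
        title=record.get('title', 'Untitled'),
        pubname=record.get('pubname', ''),
        doi=record.get('doi', ''),
    )


# Subset of the record shown above, with the title shortened for brevity
sample = {
    'authors': 'Daegling, David J.',
    'coverDate': 'April 1992',
    'doi': '10.1002/ajpa.1330870413',
    'pubname': 'American Journal of Physical Anthropology',
    'title': 'Other origins: The search for the giant ape in human prehistory',
}

citation = format_citation(sample)
print(citation)
```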

In [112]:
print ('Number of articles with mentions: ' + str(hits) + '\n')

Number of articles with mentions: 747



##### In this case the term YETI returns many mentions that are not relevant. To refine the results, keep only articles whose highlights also mention USGS.

In [113]:
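#Collect titles of articles whose highlights also mention USGS
titles = []
for article in results:
    for context in article['highlight']:
        if 'U.S. Geological Survey' in context or 'USGS' in context:
            titles.append(article['title'])
#Remove duplicates (an article can match in more than one highlight)
unique_titles = list(set(titles))
for title in unique_titles:
    print (title + '\n')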


Overcoming Equifinality: Leveraging Long Time Series for Stream Metabolism Estimation

Predicting monarch butterfly (Danaus plexippus) movement and egg-laying with a spatially-explicit agent-based model: The role of monarch perceptual range and spatial memory

PhreeqcRM: A reaction module for transport simulators based on the geochemical model PHREEQC

Determination of Arsenic and Mercury in Various Environmental Matrices by Chemical Neutron Activation Analysis
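
The same filtering step can be wrapped in a reusable function that also keeps the matching highlights, which makes it easier to judge whether a hit is relevant. The sketch below is one possible variation, not the notebook's method: `filter_by_keywords` is a hypothetical name, the matching is case-insensitive (the cell above is case-sensitive), and the two sample records are made up for illustration.

```python
def filter_by_keywords(results, keywords):
    """Return {title: [matching highlights]} for records whose highlights
    contain any of the keywords (case-insensitive substring match)."""
    lowered = [k.lower() for k in keywords]
    matches = {}
    for article in results:
        for context in article.get('highlight', []):
            if any(k in context.lower() for k in lowered):
                matches.setdefault(article['title'], []).append(context)
    return matches


# Made-up records in the shape returned by the snippets route
sample_results = [
    {'title': 'Paper one',
     'highlight': ['run on the USGS Yeti cluster', 'the yeti myth']},
    {'title': 'Paper two',
     'highlight': ['yeti footprints in the snow']},
]

matches = filter_by_keywords(sample_results, ['USGS', 'U.S. Geological Survey'])
for title, contexts in matches.items():
    print(title, len(contexts))
```

Keying the result by title removes duplicates in the same way as the `set` call above, while preserving the order in which articles were first matched.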

