# Table of Contents
- [1 Import packages](#Import-packages)
- [2 Set up environment variables](#Set-up-environment-variables)
    - [2.1 Example item](#Example-item)
    - [2.2 Saving results](#Saving-results)

# Import packages

In [1]:
%pylab inline

Populating the interactive namespace from numpy and matplotlib


In [2]:
import requests, math, json

# Set up environment variables

In [3]:
with open('ocBaseURL.txt', 'r') as fp:
    baseURL = fp.read().strip()  # drop surrounding whitespace, including any trailing newline

In [4]:
ocApiUrl = baseURL.split('\n')[0]

In [5]:
with open('api-key.private', 'r') as fp:
    apiKey = fp.read().strip()  # a trailing newline would otherwise end up in the query string

In [6]:
collection = '24'
perPage = 100
offset = 0

In [7]:
# Query the API for the collection item count
collectionUrl = ocApiUrl + '/collections/' + collection + '?api_key=' + apiKey
apiResponse = requests.get(collectionUrl).json()
itemCount = int(apiResponse['data']['items'])
print('This collection contains {} items.'.format(itemCount))

This collection contains 11653 items.


In [8]:
# Figure out how many pages there are
pages = int(math.ceil(itemCount / float(perPage)))
print('That means there are a total of {} pages to scan'.format(pages))

That means there are a total of 117 pages to scan
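
The page count above also fixes the offset each request needs. As a minimal sketch (the helper name `page_offsets` is hypothetical and not part of the original notebook), the offsets can be computed up front from the item count and page size:

```python
import math

def page_offsets(item_count, per_page):
    """Return the starting offset of every page needed to cover item_count items."""
    pages = math.ceil(item_count / per_page)  # true division, then round up
    return [page * per_page for page in range(pages)]

offsets = page_offsets(11653, 100)
print(len(offsets), offsets[0], offsets[-1])  # 117 0 11600
```

The last page starts at offset 11600 and holds the remaining 53 items.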


In [9]:
# Loop through collection item pages to get all items
itemIDs = []
for page in range(pages):
    # Derive the offset from the page number so re-running this cell starts from zero
    offset = page * perPage
    collectionItemsUrl = ocApiUrl + '/collections/' + collection
    collectionItemsUrl += '/items?limit=' + str(perPage) + '&offset=' + str(offset) + '&api_key=' + apiKey
    # Get a list of up to perPage items
    apiResponse = requests.get(collectionItemsUrl).json()
    collectionItems = apiResponse['data']
    for collectionItem in collectionItems:
        itemIDs.append(collectionItem['_id'])
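
The paging loop above can also be wrapped in a function that stops when the API runs out of items, rather than relying on a precomputed page count. This is a sketch under assumptions: the function name `fetch_item_ids`, the injectable `get` argument, and the 30-second timeout are illustrative choices, and it assumes the endpoint returns a short or empty `data` list on the final page.

```python
def fetch_item_ids(api_url, collection, api_key, per_page=100, get=None):
    """Collect the '_id' of every item in a collection, one page at a time."""
    if get is None:
        import requests  # imported lazily so a stand-in `get` can be used in tests
        get = requests.get
    url = '{}/collections/{}/items'.format(api_url, collection)
    item_ids = []
    offset = 0
    while True:
        # Let the HTTP client build and encode the query string
        params = {'limit': per_page, 'offset': offset, 'api_key': api_key}
        response = get(url, params=params, timeout=30)
        response.raise_for_status()  # fail loudly on HTTP errors
        page = response.json()['data']
        item_ids.extend(entry['_id'] for entry in page)
        if len(page) < per_page:  # a short (or empty) page means we are done
            break
        offset += per_page
    return item_ids

# Usage with the variables defined earlier in this notebook:
# itemIDs = fetch_item_ids(ocApiUrl, collection, apiKey, per_page=perPage)
```

Passing `params` instead of concatenating strings keeps the API key and the numeric values correctly URL-encoded.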

In [13]:
# print a sample of what we're getting...
collectionItems[0]

{'_id': '1.0165874',
 '_index': 'dsp.24-2017-05-28',
 'creator': 'McNulty, Brian',
 'dateAvailable': '2014-02-11T00:00:00Z',
 'title': 'Geology, alteration, lithogeochemistry and hydrothermal fluid characterization of the Neoproterozoic Niblack polymetallic volcanic-hosted massive sulfide camp, southeast Alaska, USA',
 'type': 'Text',
 'ubc.date.sort': '2014-12-31 AD',
 'ubc.internal.item.last.ingested': '2017-05-28 6:44:36',
 'ubc.internal.repo.handle': '2429/46005'}

In [19]:
# Store all the items so we can print them out later
items = []
for itemID in itemIDs:
    itemUrl = ocApiUrl + '/collections/' + collection + '/items/' + itemID
    apiResponse = requests.get(itemUrl).json()
    item = apiResponse['data']
    items.append(item)

## Example item

In [76]:
egItem = items[0]
print(list(egItem.keys()))

['DegreeGrantor', 'Language', 'Type', 'DateAvailable', 'IsShownAt', 'Provider', 'AggregatedSourceRepository', 'Creator', 'Campus', 'Program', 'Description', 'FullText', 'URI', 'Degree', 'Affiliation', 'DateIssued', 'SortDate', 'Title', 'RightsURI', 'DigitalResourceOriginalRecord', 'Genre', 'Rights', 'GraduationDate', 'ScholarlyLevel', 'Publisher']


In [67]:
egTitle = egItem['Title'][0]['value']
print(egTitle)

Sediment transport and bed material adjustments in the vicinity of Wilsey Dam : salmon spawning habitat implications


In [70]:
egProgram = egItem['Program'][0]['value']
print(egProgram)

egDate = egItem['SortDate'][0]['value']
print(egDate)

Environmental Sciences
2017-12-31 AD
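
Each field in the item record is accessed as `egItem[field][0]['value']`, which raises an error if a field is missing or empty. A small helper can make that lookup tolerant. This is a sketch only: `first_value` is a hypothetical name, and it assumes every field holds a list of dictionaries with a `'value'` key, as the examples above suggest.

```python
def first_value(item, field, default=None):
    """Return item[field][0]['value'], or default if the field is missing or empty."""
    entries = item.get(field) or []
    if not entries:
        return default
    return entries[0].get('value', default)

sample = {'Program': [{'value': 'Environmental Sciences'}], 'Genre': []}
print(first_value(sample, 'Program'))                  # Environmental Sciences
print(first_value(sample, 'Genre', default='(none)'))  # (none)
print(first_value(sample, 'Title'))                    # None
```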


In [65]:
egAbstract = egItem['Description'][0]['value']
print(egAbstract)

Substrate requirements are an important component of the multifaceted spawning needs of salmon, and this research effort was directed at developing a greater understanding of sediment transport dynamics and bed material response in the Middle Shuswap River in consequence of the emplacement and subsequent management of Wilsey Dam. Downstream of Wilsey Dam the river provides spawning habitat for coho (Oncorhynchus kisutch), chinook (O. tshawytscha), pink (O. gorbuscha) and sockeye (O. nerka) salmon. This thesis suggests that sand dredged from deposits filling the upstream reservoir basin of the dam could be redeposited downstream when coupled with specific flow releases (≥100 cubic metres per second). This is seen as a viable option for sediment management on the Middle Shuswap River aimed at restoring sediment transport processes and preserving spawning habitat. Maintaining sediment transport processes after dam emplacement is an important consideration for ecological processes in river

In [72]:
thesisFT = egItem['FullText'][0]['value']
# The first match is presumably the table-of-contents entry; the second is the section heading itself
ackTOC = thesisFT.lower().index('acknowledgements')
acknowledgementsIdx = thesisFT[(ackTOC+1):].lower().index('acknowledgements')
ackStart = ackTOC + 1 + acknowledgementsIdx
# Heuristic: treat the first '1' after the heading as the end of the section
ackEnd = thesisFT[ackStart:].index('1')
egAcknow = thesisFT[ackStart:(ackStart + ackEnd)]
print(egAcknow)

Acknowledgements   I am foremost and forever thankful to my supervisor Dr. Bernard Bauer, whose kindness, generosity and feedback not only made my research progress possible, but who also has provided me with life lessons that will be undoubtedly woven into the fabric of my future endeavors.   I am very thankful for the involvement of Dr. Mark Lorang who spent over 20 hours on the road to participate in data collection, provided invaluable feedback on my writing and who always saw my potential. I thank Dr. John Richardson for his involvement, encouraging me and teaching me much about writing.   Academic mentors who generously provided specific insights key to particular steps in this research include Dr. Marwan Hassan (UBC), Dr. Sylvia Esterby (UBC), Dr. Theodore Fuller (SFU), Dr. Carl Schwarz (SFU), and Dr. David Graham (Loughborough University).  The Fish and Wildlife Compensation Program Coastal helped fund this project on behalf of its program partners: BC Hydro, the Province of B.
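
The extraction above raises a `ValueError` when a thesis has no acknowledgements heading or no `'1'` after it. The same heuristic can be written as a function that degrades gracefully. This is a sketch, not the notebook's original method: `extract_section` is a hypothetical helper, and using the second occurrence of the heading and the first `'1'` as the end marker are the same assumptions as in the cell above.

```python
import re

def extract_section(full_text, heading='acknowledgements', end_marker='1'):
    """Return the text from the section heading up to the first end_marker.

    Uses the second case-insensitive occurrence of the heading (the first is
    assumed to be a table-of-contents entry), or the first if there is only one.
    Returns None when the heading does not occur at all.
    """
    starts = [m.start() for m in re.finditer(re.escape(heading), full_text, flags=re.IGNORECASE)]
    if not starts:
        return None
    start = starts[1] if len(starts) > 1 else starts[0]
    end = full_text.find(end_marker, start)
    if end == -1:  # no end marker: keep everything to the end of the text
        end = len(full_text)
    return full_text[start:end]

text = 'Contents Acknowledgements Introduction ' + 'Acknowledgements Thank you. ' + '1 Introduction'
print(repr(extract_section(text)))  # 'Acknowledgements Thank you. '
```

Because the end marker is a bare digit, any number inside the acknowledgements text cuts the section short, so the result is best treated as approximate.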

## Saving results

In [77]:
with open('./ubctheses.json', 'w+', encoding='utf-8') as fp:
    json.dump(items, fp, indent=4)
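
Since the harvest takes thousands of requests, it can be worth confirming that the saved file reloads to the same data. The sketch below does a round trip on a small sample in a temporary directory. The helper names `save_items` and `load_items` are hypothetical, and `ensure_ascii=False` is an optional choice, not what the cell above does, that keeps characters such as `≥` readable in the file.

```python
import json
import os
import tempfile

def save_items(items, path):
    """Write items to path as indented UTF-8 JSON."""
    with open(path, 'w', encoding='utf-8') as fp:
        json.dump(items, fp, indent=4, ensure_ascii=False)

def load_items(path):
    """Read items back from a JSON file written by save_items."""
    with open(path, 'r', encoding='utf-8') as fp:
        return json.load(fp)

sample = [{'Title': [{'value': 'Flow releases ≥100 cubic metres per second'}]}]
path = os.path.join(tempfile.mkdtemp(), 'sample.json')
save_items(sample, path)
print(load_items(path) == sample)  # True
```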