## DOI parser

This requires the habanero Crossref API client library to be installed:

https://github.com/sckott/habanero

`pip install habanero`

https://www.crossref.org/blog/python-and-ruby-libraries-for-accessing-the-crossref-api/

In [41]:
from habanero import Crossref
cr = Crossref()

In [42]:
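def findDOI(doi):
    """Return the title list that Crossref holds for the given DOI."""
    x = cr.works(ids=doi)  # look up a single work by its DOI
    # 'title' is a list of strings in the Crossref record.
    return x['message']['title']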

`x` is a dictionary holding the full Crossref record for the DOI, so the function is easy to adapt to return other fields. Note that `x['message']['title']` is a list of strings, not a single string.
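
As a minimal sketch of that adaptation, the helper below pulls a few more fields out of the `message` dictionary. The field names (`title`, `container-title`, `author`, `issued`) follow the Crossref works schema; the function name `summarise_work` and the sample record are illustrative assumptions, and all values in the sample are placeholders rather than real metadata.

```python
def summarise_work(message):
    """Return a small summary dict from a Crossref 'message' record.

    Hypothetical helper; every field is treated as optional because
    Crossref records are not guaranteed to contain all of them.
    """
    titles = message.get('title') or []
    journals = message.get('container-title') or []
    # Join given and family names, skipping whichever part is missing.
    authors = [
        ' '.join(part for part in (a.get('given'), a.get('family')) if part)
        for a in message.get('author', [])
    ]
    date_parts = message.get('issued', {}).get('date-parts') or [[None]]
    return {
        'doi': message.get('DOI'),
        'title': titles[0] if titles else None,
        'journal': journals[0] if journals else None,
        'authors': authors,
        'year': date_parts[0][0] if date_parts[0] else None,
    }


# Stand-in record shaped like x['message']; all values are placeholders.
sample_message = {
    'DOI': '10.1234/example',
    'title': ['An example title'],
    'container-title': ['An Example Journal'],
    'author': [{'given': 'Ada', 'family': 'Example'}],
    'issued': {'date-parts': [[2020, 5, 19]]},
}

summary = summarise_work(sample_message)
print(summary['title'], summary['year'])
```

With habanero this would presumably be called as `summarise_work(cr.works(ids=doi)['message'])`.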

## Adding support for Google Forms parsing

Following Google's Sheets API quickstart for Python:

https://developers.google.com/sheets/api/quickstart/python#step_1_turn_on_the

First install the client libraries:

`pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`

In [43]:
from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
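
The spreadsheet ID needed in the next cell is the part of the sheet's URL between `/d/` and `/edit`. Rather than copying it by hand, it could be extracted with a regular expression. This is only a sketch: `spreadsheet_id_from_url` is a hypothetical helper, and the URL shown uses a made-up ID.

```python
import re

# Sheets URLs have the form https://docs.google.com/spreadsheets/d/<ID>/edit...
_SHEET_ID_PATTERN = re.compile(r'/spreadsheets/d/([a-zA-Z0-9_-]+)')


def spreadsheet_id_from_url(url):
    """Return the spreadsheet ID embedded in a Google Sheets URL.

    Hypothetical helper; raises ValueError if no ID can be found.
    """
    match = _SHEET_ID_PATTERN.search(url)
    if match is None:
        raise ValueError('No spreadsheet ID found in %r' % url)
    return match.group(1)


# Made-up URL with a placeholder ID, for illustration only.
example_url = 'https://docs.google.com/spreadsheets/d/abc123_-XYZ/edit#gid=0'
print(spreadsheet_id_from_url(example_url))
```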

In [46]:
# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/spreadsheets.readonly']

# The ID and range of the spreadsheet.
# To find the ID, open the sheet from Drive: it is the part of the URL between '/d/' and '/edit'.
SPREADSHEET_ID = '11eN2SVJae-mfhk8Tm9x2VP8EAENRMwsrNstC0HdQYqk'
RANGE_NAME = 'Form Responses 1!A1:B2'

def main():
    """Reads a spreadsheet from the user's Google Drive.
    Prints values from the spreadsheet.
    Assumes the spreadsheet has a header row and that the 2nd column contains the DOI.
    """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'formParser/credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    service = build('sheets', 'v4', credentials=creds)

    # Call the Sheets API
    sheet = service.spreadsheets()
    result = sheet.values().get(spreadsheetId=SPREADSHEET_ID,
                                range=RANGE_NAME).execute()
    values = result.get('values', [])

    if not values:
        print('No data found.')
    else:
        print('Date, Doi:, Title')
        # Skip the header row, then print columns A and B (indices 0 and 1)
        # followed by the title of the paper.
        for row in values[1:]:
            title = findDOI(row[1])
            print('%s, %s, %s' % (row[0], row[1], title))

if __name__ == '__main__':
    main()

Date, Doi:, Title
7/25/2020 18:46:04, 10.1063/5.0007045, ['CP2K: An electronic structure and molecular dynamics software package - Quickstep: Efficient and accurate electronic structure calculations']
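
The row handling in `main()` can be exercised without credentials or network access by separating it from the API calls. The sketch below is one way to do that, under the same assumptions as the docstring (header row first, DOI in the second column). `format_rows` and its `lookup` parameter are hypothetical names, a stub stands in for `findDOI`, and the header labels and the second data row are made up. It also skips rows with fewer than two cells, since the Sheets API omits trailing empty cells from a row.

```python
def format_rows(values, lookup):
    """Return one 'date, doi, title' line per data row.

    values: rows as returned under 'values' by the Sheets API
    lookup: callable mapping a DOI string to a title (e.g. findDOI)
    """
    lines = []
    for row in values[1:]:  # values[0] is the header row
        if len(row) < 2 or not row[1].strip():
            continue  # no DOI in this row
        doi = row[1].strip()
        lines.append('%s, %s, %s' % (row[0], doi, lookup(doi)))
    return lines


def stub_lookup(doi):
    # Stand-in for findDOI so the example runs offline.
    return 'title for ' + doi


rows = [
    ['Timestamp', 'DOI'],                          # assumed header labels
    ['7/25/2020 18:46:04', '10.1063/5.0007045'],
    ['7/26/2020 09:00:00'],                        # placeholder row, DOI cell empty
]
for line in format_rows(rows, stub_lookup):
    print(line)
```

Passing `findDOI` as `lookup` would give the live behaviour, while the stub keeps the formatting logic testable on its own.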
