### Retina hub - a user-driven database of papers in the retina field

#### This notebook contains code to populate the retina paper database.

- An online form is presented to users.
- They fill out information about the paper they want to add (URL, DOI and keywords).
- The code pulls the form responses (stored in a spreadsheet) into a pandas DataFrame.
- Using the Crossref API (via habanero), we use DOIs to retrieve metadata (authors, publication year, journal, etc.).
- Using the Zotero API (via pyzotero), we put the retrieved metadata into a public Zotero library, together with the keywords added by users.
- Using another Python library, we use each Zotero entry to create a post on a WordPress website.



In [1]:
#import all necessary libraries
import pandas as pd
from pyzotero import zotero
import requests
import doi

In [2]:
#import data from datasheet
with open("zotero_sheet.txt") as fid:
    sheetId = fid.readline().strip()  # drop the trailing newline so the URL stays valid

url1 = "https://docs.google.com/spreadsheets/d/" + sheetId + "/export?format=csv"

#pull data from the sheet into a pandas dataframe
allEntries = pd.read_csv(url1, header=0)
doiKey = 'Publication Identifier (DOI, ISBN, PMID, arXiv ID). If you do not know any of these for the entry, please use crossref search engine https://www.crossref.org/guestquery - use the subfield "search on article title")'

#allEntries
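
Form answers are free text, so the identifier column may contain stray whitespace or a DOI pasted as a full URL. The following is a minimal sketch of a cleaning step, not part of the original pipeline. The helper name `clean_identifier` is hypothetical, and whether the translation server actually needs this normalisation has not been verified here.

```python
import re

# Matches an optional scheme followed by doi.org/ or dx.doi.org/ at the start.
DOI_URL_PREFIX = re.compile(r"^(?:https?://)?(?:dx\.)?doi\.org/", re.IGNORECASE)

def clean_identifier(raw):
    """Return a trimmed identifier string, or None if the answer is empty.

    Hypothetical helper: strips whitespace, a leading doi.org URL and a
    leading "doi:" label. Other identifiers (ISBN, PMID, arXiv ID) are
    passed through unchanged apart from trimming.
    """
    if raw is None:
        return None
    text = str(raw).strip()
    # pandas represents unanswered questions as NaN, which prints as "nan"
    if not text or text.lower() == "nan":
        return None
    text = DOI_URL_PREFIX.sub("", text)
    if text.lower().startswith("doi:"):
        text = text[4:].strip()
    return text or None
```

It could be applied to each value of `allEntries[doiKey]` before the value is sent to the translation server.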

#### Retrieve metadata with translators

The Zotero API does not have a direct way of getting article metadata.  
However, Zotero provides a translation server (https://github.com/zotero/translation-server),  
so the solution is to use a translator to grab the article/book metadata and use it to create a Zotero entry.

Once the entry is created, it can be exposed on the web/repo.

Let's try this below. The steps are:
- install Docker (outside of the Python pipeline)
- run the Docker container described in the GitHub repository linked above
- use `requests` from within Python to get the metadata
- create Zotero entries with the retrieved metadata
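
The metadata-retrieval step can be sketched as follows. This is a minimal illustration that assumes the translation server is already running locally on port 1969, as in the cells below. To keep the sketch free of extra dependencies it uses `urllib` from the standard library rather than `requests`; the function names `build_search_request` and `fetch_metadata` are hypothetical.

```python
import json
import urllib.error
import urllib.request

SEARCH_URL = "http://127.0.0.1:1969/search"  # local translation-server endpoint

def build_search_request(identifier, url=SEARCH_URL):
    """Build a POST request that carries the bare identifier as plain text."""
    return urllib.request.Request(
        url,
        data=identifier.strip().encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )

def fetch_metadata(identifier, opener=urllib.request.urlopen):
    """Return the first metadata record for an identifier, or None on failure.

    `opener` is injectable so the function can be exercised without a
    running server.
    """
    try:
        with opener(build_search_request(identifier), timeout=30) as resp:
            records = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, ValueError):
        # server unreachable, HTTP error status, or a body that is not JSON
        return None
    return records[0] if records else None
```

Returning `None` for every failure keeps the calling loop simple: an entry is either usable or reported back for manual checking.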


In [7]:
#play around with zotero api and see what is in the zotero library
with open("zotero_key.txt") as fid:
    apiKey = fid.readline().strip()  # drop the trailing newline from the key
    
libID = "4584648"
libType = "group"

zot = zotero.Zotero(libID, libType, apiKey)


#prepare curl call
headers = {'content-type': 'text/plain','Accept-Charset': 'UTF-8'}
searchUrl = 'http://127.0.0.1:1969/search'
filterTags = ['Type',
              'Species (select all that apply)',
              'cell types (select all that apply)',
              'Main areas (please select all that apply)',
              'Other keywords (separated by commas)']

#run through dataframe and get DOIs
for item in allEntries.index:
    data = allEntries.loc[item, doiKey]
    if pd.isna(data):
        continue  # no identifier supplied for this entry
    r = requests.post(url=searchUrl, headers=headers,
                      data=str(data).strip().encode("utf-8"))
    if r.status_code == 200 and r.json():
        metaData = r.json()[0]
        template = zot.item_template(metaData["itemType"])
        #copy every field the template knows about from the retrieved metadata
        for key in template.keys():
            if key in metaData.keys():
                template[key] = metaData[key]
        #collect the user-supplied keywords, skipping unanswered questions
        tagsTemp = [str(allEntries.loc[item, tag]) for tag in filterTags
                    if pd.notna(allEntries.loc[item, tag])]
        #store the keywords in the "extra" field, after any existing content
        template["extra"] = ';'.join(filter(None, [template.get("extra", "")] + tagsTemp))
        zot.create_items([template])
    else:
        print("no metadata found for the following entry (invalid identifier?)")
        print(allEntries.loc[item])

#template
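
The field-copying part of the loop above can be pulled out into a small function that is easy to test without a Zotero connection. This is only a sketch: `fill_template` is a hypothetical name, and plain dictionaries stand in for the template returned by `zot.item_template` and for the translator output.

```python
def fill_template(template, metadata, extra_notes=()):
    """Return a copy of `template` filled with matching `metadata` fields.

    Only keys that already exist in the template are copied, so fields
    the item type does not support are ignored. `extra_notes` are joined
    with ";" and appended to the "extra" field.
    """
    item = dict(template)  # shallow copy, the caller's template is untouched
    for key in template:
        if key in metadata:
            item[key] = metadata[key]
    notes = [str(n) for n in extra_notes if n is not None and str(n) != "nan"]
    existing = item.get("extra") or ""
    parts = ([existing] if existing else []) + notes
    item["extra"] = ";".join(parts)
    return item
```

For example, `fill_template(template, metaData, tagsTemp)` would produce the dictionary that is then passed to `zot.create_items([...])`.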


In [35]:
tagsTemp=["fish","bipolar cell"]
zot.add_tags(template, ','.join(tagsTemp))

TypeError: unhashable type: 'dict'

In [34]:
','.join(tagsTemp)

'fish,bipolar cell'
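
The `add_tags` call above fails on a template that has not yet been created in the library. One alternative, sketched here, is to fill the template's `tags` field before calling `create_items`. Zotero item JSON represents tags as a list of objects with a `tag` key. The helper name `build_tags` is hypothetical, and the assumption that multi-select form answers arrive as comma-separated strings has not been checked against the actual sheet.

```python
import math

def build_tags(values):
    """Convert form answers into Zotero's tag format: [{"tag": "..."}, ...].

    Each answer may hold several comma-separated keywords. Empty answers
    (None or NaN) are skipped and duplicates are dropped case-insensitively,
    keeping the first spelling seen.
    """
    tags, seen = [], set()
    for value in values:
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        for part in str(value).split(","):
            name = part.strip()
            if name and name.lower() not in seen:
                seen.add(name.lower())
                tags.append({"tag": name})
    return tags
```

With this in place, `template["tags"] = build_tags(tagsTemp)` could be set just before `zot.create_items([template])`.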

In [6]:
zot.items()

[]

In [56]:
#quick test of the translation server with a single identifier
data = '7064538'

headers = {'content-type': 'text/plain','Accept-Charset': 'UTF-8'}
r = requests.post(url='http://127.0.0.1:1969/search',headers=headers,data=data)

Timestamp
Publication URL
Publication Identifier (DOI, ISBN, PMID, arXiv ID). If you do not know any of these for the entry, please use crossref search engine https://www.crossref.org/guestquery - use the subfield "search on article title")
Type
Species (select all that apply)
cell types (select all that apply)
Main areas (please select all that apply)
Other keywords (separated by commas)
Contact email (optional)


In [102]:
template = zot.item_template(metaData[0]["itemType"])