# The Unconquerables of Open Access

## Parsing Sherpa/Romeo data

Project for the EAHIL 2023 conference: https://eahil2023.org/  
Authors: **Floriane Muller & Pablo Iriarte**, University of Geneva  
Last update: 13.04.2023

This notebook parses the Sherpa/Romeo JSON records retrieved through the API and flattens them into a table that can be used in the research project.

### Sources

**Sherpa/Romeo API**
  
  * Example of an API call:
  https://v2.sherpa.ac.uk/cgi/retrieve_by_id?item-type=publication&api-key=EEE6F146-678E-11EB-9C3A-202F3DE2659A&format=Json&identifier=17601
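
The example call above can be assembled from its query parameters. The following is a minimal sketch that only builds the URL and makes no request; `build_sherpa_url` and the `YOUR-API-KEY` placeholder are hypothetical names introduced for illustration, and the parameter names are copied from the example URL.

```python
from urllib.parse import urlencode

# Endpoint copied from the example call above.
SHERPA_ENDPOINT = "https://v2.sherpa.ac.uk/cgi/retrieve_by_id"


def build_sherpa_url(identifier, api_key):
    """Return the retrieve_by_id URL for one Sherpa publication ID (hypothetical helper)."""
    params = {
        "item-type": "publication",
        "api-key": api_key,
        "format": "Json",
        "identifier": identifier,
    }
    return SHERPA_ENDPOINT + "?" + urlencode(params)


# Placeholder key: replace it with your own before sending a request.
print(build_sherpa_url(17601, "YOUR-API-KEY"))
```

Keeping the key in a variable rather than in the URL string makes it easier to leave it out of a shared notebook.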

In [1]:
import pandas as pd
import csv
import json
import numpy as np
import os
# display all columns
pd.set_option('display.max_columns', None)

## Import MEDLINE journals

In [2]:
df = pd.read_csv('data/sources/nlm/lsi2023_medline_issns.tsv', encoding='utf-8', header=0, sep='\t')
df

Unnamed: 0,NlmUniqueID,Title,MedlineTA,Country,Place,Publisher,PublicationFirstYear,PublicationEndYear,Frequency,ISSN-Electronic,ISSN-Print,ISSN-Linking,Language,TitleContinuationYN,IndexingStartDate,CurrentlyIndexedYN,IndexOnlineYN,IndexingSubset,IndexingSelectedURL,ReportedMedlineYN,ISSN
0,9015384,20 century British history,20 Century Br Hist,England,"Eynsham, Oxford",Oxford University Press,1990,,"4 no. a year,",1477-4674,0955-2359,0955-2359,eng,N,1990.0,Y,N,QIS,,Y,1477-4674
1,101714112,A&A practice,A A Pract,United States,"[Philadelphia, PA]","Wolters Kluwer Health, Inc.",2018,,Biweekly,2575-3126,,2575-3126,eng,Y,2018.0,Y,Y,IM,https://ovidsp.ovid.com/ovidweb.cgi?T=JS&MODE=...,Y,2575-3126
2,101269322,AACN advanced critical care,AACN Adv Crit Care,United States,"Aliso Viejo, CA",American Association of Critical-Care Nurses (...,2006,,Quarterly,1559-7776,1559-7768,1559-7768,eng,Y,2006.0,Y,Y,N,https://aacnjournals.org/aacnacconline,Y,1559-7776
3,0431420,AANA journal,AANA J,United States,"Park Ridge, Ill.",American Association of Nurse Anesthetists,1974,,Bimonthly,2162-5239,0094-6354,0094-6354,eng,N,1974.0,Y,Y,N,https://www.aana.com/publications/aana-journal,Y,2162-5239
4,101223209,The AAPS journal,AAPS J,United States,"Arlington, Va., USA",American Association of Pharmaceutical Scientists,2004,,Four no. a year,1550-7416,,1550-7416,eng,Y,2004.0,Y,Y,IM,https://link.springer.com/journal/12248,Y,1550-7416
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5274,8702287,Zoological science,Zoolog Sci,Japan,"Tokyo, Japan",Zoological Society of Japan,1984,,"Monthly,",,0289-0003,0289-0003,eng,N,2002.0,Y,Y,IM,http://www.bioone.org/loi/jzoo,Y,0289-0003
5275,9435608,"Zoology (Jena, Germany)",Zoology (Jena),Germany,"Jena, Germany",Urban & Fischer,1994,,"Six no. a year,",1873-2720,0944-2006,0944-2006,eng,N,2005.0,Y,Y,IM,https://www.sciencedirect.com/journal/zoology,Y,1873-2720
5276,101300786,Zoonoses and public health,Zoonoses Public Health,Germany,"Berlin, Germany",Blackwell Verlag,2007,,Ten no. a year,1863-2378,1863-1959,1863-1959,eng,Y,2007.0,Y,Y,IM,http://onlinelibrary.wiley.com/journal/10.1111...,Y,1863-2378
5277,101179386,Zootaxa,Zootaxa,New Zealand,"Auckland, N.Z.",Magnolia Press,2001,,Irregular,1175-5334,1175-5326,1175-5326,eng,N,2013.0,Y,Y,IM,http://www.mapress.com/j/zt/,Y,1175-5334
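
In the output above, `NlmUniqueID` values such as `0431420` keep their leading zero, so the column appears to have been read as text. If a file held only numeric identifiers, pandas would infer an integer column and drop that zero. The sketch below shows the difference on a two-row stand-in for the NLM file (the real file has many more columns); passing `dtype=str` is an assumption about what is desirable here, not something the notebook does.

```python
import io
import pandas as pd

# Two-row stand-in for the NLM file, tab-separated like the original.
sample_tsv = (
    "NlmUniqueID\tISSN-Print\tISSN\n"
    "0431420\t0094-6354\t2162-5239\n"
    "101223209\t\t1550-7416\n"
)

# Default type inference: the identifier column becomes integers.
as_inferred = pd.read_csv(io.StringIO(sample_tsv), sep="\t")
# Forcing text keeps identifiers exactly as written in the file.
as_text = pd.read_csv(io.StringIO(sample_tsv), sep="\t", dtype=str)

print(as_inferred["NlmUniqueID"].tolist())
print(as_text["NlmUniqueID"].tolist())
```

Empty cells, such as the missing print ISSN in the second row, are still read as `NaN` in both cases.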


## Parsing Sherpa/Romeo data

In [3]:
# create the DataFrame that will hold the parsed policies
col_names = ['NlmUniqueID',
             'title_sherpa',
             'issne_sherpa',
             'issnp_sherpa',
             'url',
             'publisher_id',
             'publisher_country',
             'publisher_type',
             'publisher_url',
             'publisher_name',
             'sherpa_id',
             'sherpa_uri',
             'open_access_prohibited',
             'additional_oa_fee',
             'article_version',
             'license',
             'embargo',
             'prerequisites',
             'prerequisite_funders',
             'prerequisite_funders_name',
             'prerequisite_funders_fundref',
             'prerequisite_funders_ror',
             'prerequisite_funders_country',
             'prerequisite_funders_url',
             'prerequisite_funders_sherpa_id',
             'prerequisite_subjects',
             'location',
             'locations_ir',
             'locations_not_ir',
             'named_repository',
             'named_academic_social_network',
             'copyright_owner',
             'publisher_deposit',
             'archiving',
             'conditions',
             'public_notes',
             'sherpa_created',
             'sherpa_last_modified'             
            ] 
sherpa_policies = pd.DataFrame(columns = col_names) 
sherpa_policies

Unnamed: 0,NlmUniqueID,title_sherpa,issne_sherpa,issnp_sherpa,url,publisher_id,publisher_country,publisher_type,publisher_url,publisher_name,sherpa_id,sherpa_uri,open_access_prohibited,additional_oa_fee,article_version,license,embargo,prerequisites,prerequisite_funders,prerequisite_funders_name,prerequisite_funders_fundref,prerequisite_funders_ror,prerequisite_funders_country,prerequisite_funders_url,prerequisite_funders_sherpa_id,prerequisite_subjects,location,locations_ir,locations_not_ir,named_repository,named_academic_social_network,copyright_owner,publisher_deposit,archiving,conditions,public_notes,sherpa_created,sherpa_last_modified
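
The parsing loop further down fills this empty frame one row at a time with `DataFrame.append`, a method that was removed in pandas 2.0. One possible alternative, sketched here with a shortened column list and made-up values that are purely illustrative, is to collect the rows in a plain list and build the frame once at the end.

```python
import pandas as pd

# Shortened column list for the example; the notebook uses many more columns.
col_names = ["NlmUniqueID", "sherpa_id", "article_version", "embargo"]

rows = []  # plain Python list of dicts, one dict per policy row

# Illustrative values only, not taken from the Sherpa data.
rows.append({"NlmUniqueID": "0431420", "sherpa_id": 1,
             "article_version": "accepted", "embargo": 12})
rows.append({"NlmUniqueID": "0431420", "sherpa_id": 1,
             "article_version": "published"})  # no embargo key

# Build the frame once; keys missing from a row become NaN.
policies = pd.DataFrame(rows, columns=col_names)
print(policies.shape)
```

Building the frame once also avoids copying the whole table on every added row, which matters when the loop runs over several thousand journals.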


In [4]:
# create the DataFrame for the log
col_names = ['NlmUniqueID',
             'issn',
             'sherpa_id',
             'result'
            ] 
sherpalog = pd.DataFrame(columns = col_names) 
sherpalog

Unnamed: 0,NlmUniqueID,issn,sherpa_id,result


In [5]:
# keep ISSNs
df_issns = df[['NlmUniqueID', 'ISSN']]
df_issns

Unnamed: 0,NlmUniqueID,ISSN
0,9015384,1477-4674
1,101714112,2575-3126
2,101269322,1559-7776
3,0431420,2162-5239
4,101223209,1550-7416
...,...,...
5274,8702287,0289-0003
5275,9435608,1873-2720
5276,101300786,1863-2378
5277,101179386,1175-5334


In [6]:
df_issns.loc[df_issns['ISSN'] != ''].shape[0]

5279
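
The filter above compares against the empty string only. In pandas, a missing value (`NaN`) is also different from `''`, so rows without an ISSN would still be counted. A small sketch on toy data shows the two counts side by side; the variable names are introduced for illustration.

```python
import numpy as np
import pandas as pd

# Toy column: one real ISSN, one empty string, one missing value.
issns = pd.DataFrame({"ISSN": ["1477-4674", "", np.nan]})

# NaN != "" evaluates to True, so the missing value is counted here.
not_empty_string = int((issns["ISSN"] != "").sum())

# Requiring notna() as well keeps only the rows that hold an ISSN.
really_filled = int((issns["ISSN"].notna() & (issns["ISSN"] != "")).sum())

print(not_empty_string, really_filled)
```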

In [7]:
df.loc[df['ISSN-Print'].isna()]

Unnamed: 0,NlmUniqueID,Title,MedlineTA,Country,Place,Publisher,PublicationFirstYear,PublicationEndYear,Frequency,ISSN-Electronic,ISSN-Print,ISSN-Linking,Language,TitleContinuationYN,IndexingStartDate,CurrentlyIndexedYN,IndexOnlineYN,IndexingSubset,IndexingSelectedURL,ReportedMedlineYN,ISSN
1,101714112,A&A practice,A A Pract,United States,"[Philadelphia, PA]","Wolters Kluwer Health, Inc.",2018,,Biweekly,2575-3126,,2575-3126,eng,Y,2018.0,Y,Y,IM,https://ovidsp.ovid.com/ovidweb.cgi?T=JS&MODE=...,Y,2575-3126
4,101223209,The AAPS journal,AAPS J,United States,"Arlington, Va., USA",American Association of Pharmaceutical Scientists,2004,,Four no. a year,1550-7416,,1550-7416,eng,Y,2004.0,Y,Y,IM,https://link.springer.com/journal/12248,Y,1550-7416
5,100960111,AAPS PharmSciTech,AAPS PharmSciTech,United States,New York,Springer,2000,,Quarterly,1530-9932,,1530-9932,eng,N,2000.0,Y,Y,IM,https://link.springer.com/journal/12249,Y,1530-9932
15,101729147,ACS applied bio materials,ACS Appl Bio Mater,United States,"Washington, DC",ACS Publications,2018,,Monthly,2576-6422,,2576-6422,eng,N,2021.0,Y,Y,IM,https://pubs.acs.org/journal/aabmcb,Y,2576-6422
17,101654670,ACS biomaterials science & engineering,ACS Biomater Sci Eng,United States,"Washington, DC",American Chemical Society,2015,,Monthly,2373-9878,,2373-9878,eng,N,2020.0,Y,Y,IM,https://pubs.acs.org/journal/abseba,Y,2373-9878
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5150,101231645,Virology journal,Virol J,England,[London],BioMed Central,2004,,,1743-422X,,1743-422X,eng,N,2004.0,Y,Y,IM,http://www.virologyj.com/,Y,1743-422X
5157,101509722,Viruses,Viruses,Switzerland,"Basel, Switzerland",MDPI,2009,,Quarterly,1999-4915,,1999-4915,eng,N,2011.0,Y,Y,IM,http://www.mdpi.com/journal/viruses,Y,1999-4915
5188,9918227353306676,WIREs mechanisms of disease,WIREs Mech Dis,United States,"[Hoboken, NJ]","John Wiley & Sons, Inc.",2021,,Bimonthly,2692-9368,,2692-9368,eng,Y,2021.0,Y,Y,IM,https://wires.onlinelibrary.wiley.com/journal/...,Y,2692-9368
5197,101266603,World journal of emergency surgery : WJES,World J Emerg Surg,England,London,BioMed Central,2006,,,1749-7922,,1749-7922,eng,N,2017.0,Y,Y,IM,http://www.wjes.org/,Y,1749-7922


In [8]:
df_pissns = df[['NlmUniqueID', 'ISSN-Print']].loc[df['ISSN-Print'].notna()]
df_pissns = df_pissns.rename(columns={'ISSN-Print' : 'ISSN'})
df_pissns

Unnamed: 0,NlmUniqueID,ISSN
0,9015384,0955-2359
2,101269322,1559-7768
3,0431420,0094-6354
6,101674571,2366-004X
7,9418450,1069-6563
...,...,...
5274,8702287,0289-0003
5275,9435608,0944-2006
5276,101300786,1863-1959
5277,101179386,1175-5326


In [9]:
# concatenate 'ISSN' and 'ISSN-Print' (DataFrame.append was removed in pandas 2.0)
df_issns = pd.concat([df_issns, df_pissns], ignore_index=True)
df_issns

Unnamed: 0,NlmUniqueID,ISSN
0,9015384,1477-4674
1,101714112,2575-3126
2,101269322,1559-7776
3,0431420,2162-5239
4,101223209,1550-7416
...,...,...
10106,8702287,0289-0003
10107,9435608,0944-2006
10108,101300786,1863-1959
10109,101179386,1175-5326
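
Two rules applied by the parsing loop in the next cell are easier to read in isolation. Embargoes are normalised to whole months (years times 12, weeks divided by 4, days divided by 30, with any non-zero shorter period counted as one month), and a policy is flagged as allowing archiving when at least one of its locations is in `repositories_archiving`. The helpers below are a sketch of those two rules as standalone functions; their names are hypothetical and they are not part of the notebook.

```python
# Location types copied from the repositories_archiving list in the next cell.
REPOSITORIES_ARCHIVING = {
    "any_repository", "institutional_repository", "institutional_website",
    "non_commercial_institutional_repository", "non_commercial_repository",
    "any_website", "non_commercial_website",
}


def embargo_in_months(embargo):
    """Convert a Sherpa 'embargo' dict to whole months (hypothetical helper)."""
    amount = embargo.get("amount", 0)
    units = embargo.get("units")
    if units is None or units == "months":
        return amount
    if units == "years":
        return amount * 12
    if units in ("weeks", "days"):
        per_month = 4 if units == "weeks" else 30
        # A non-zero embargo shorter than one month counts as one month.
        return max(1, amount // per_month) if amount > 0 else 0
    return 0  # unknown unit: the loop keeps its default of 0


def allows_archiving(locations):
    """Return True if at least one location is in the archiving list (hypothetical helper)."""
    if not isinstance(locations, list):
        locations = [locations]  # the loop also wraps a single value in a list
    return any(loc in REPOSITORIES_ARCHIVING for loc in locations)


print(embargo_in_months({"amount": 2, "units": "years"}))
print(allows_archiving(["authors_homepage", "institutional_repository"]))
```

Small functions like these can be checked on a few hand-written inputs before the full loop is run over all files.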


In [10]:
# location types that trigger archiving = True:
# all types: 'academic_social_network', 'any_repository', 'any_website', 'authors_homepage',
# 'funder_designated_location', 'institutional_repository', 'institutional_website', 'named_academic_social_network',
# 'named_repository', 'non_commercial_institutional_repository', 'non_commercial_repository',
# 'non_commercial_social_network', 'non_commercial_subject_repository', 'non_commercial_website',
# 'preprint_repository', 'subject_repository', 'this_journal'
repositories_archiving = ['any_repository',
                          'institutional_repository',
                          'institutional_website',
                          'non_commercial_institutional_repository',
                          'non_commercial_repository',
                          'any_website',
                          'non_commercial_website']

# extract the policy fields from each JSON file
iteri = 0
print(iteri)
for index, row in df_issns.iterrows():
    iteri = iteri + 1
    journal_id = row['NlmUniqueID']
    journal_issn = row['ISSN']
    sherpa_log = ''
    # loop over the JSON files
    # print(row['format'])
    # progress indicator: print the counter every 100 rows
    if iteri % 100 == 0:
        print(iteri)
    # check that the file exists
    if os.path.exists('data/sources/sherpa/data_2023/' + journal_issn + '.json'):
        # file exists
        sherpa_log = 'JSON file OK - '
        with open('data/sources/sherpa/data_2023/' + journal_issn + '.json', 'r', encoding='utf-8') as f:
            data = json.load(f)
            # initialise the variables to extract
            sherpa_id = np.nan
            title_sherpa = np.nan
            issne_sherpa = np.nan
            issnp_sherpa = np.nan
            issns = np.nan
            myissn = np.nan
            mytype = np.nan
            url = np.nan
            publisher_id = np.nan
            publisher_country = np.nan
            publisher_type = np.nan
            publisher_url = np.nan
            publisher_name = np.nan
            sherpa_uri = np.nan
            open_access_prohibited = np.nan
            location = np.nan
            locations_ir = ''
            locations_not_ir = ''
            additional_oa_fee = np.nan
            article_versions = np.nan
            article_version = np.nan
            licenses = []
            embargo = 0
            prerequisites = np.nan
            prerequisites_phrases = ''
            prerequisite_funders = np.nan
            prerequisite_funders_name = np.nan
            prerequisite_funders_fundref = np.nan
            prerequisite_funders_ror = np.nan
            prerequisite_funders_country = np.nan
            prerequisite_funders_url = np.nan
            prerequisite_funders_sherpa_id = np.nan
            prerequisite_subjects = np.nan
            named_repository = np.nan
            named_academic_social_network = np.nan
            copyright_owner = np.nan
            publisher_deposit = np.nan
            archiving = np.nan
            conditions = np.nan
            public_notes = np.nan
            sherpa_created = np.nan
            sherpa_last_modified = np.nan
            if (len(data['items']) > 0):
                if ('url' in data['items'][0]):
                    url = data['items'][0]['url']
                if ('title' in data['items'][0]['title'][0]):
                    title_sherpa = data['items'][0]['title'][0]['title']
                if ('issns' in data['items'][0]):
                    issns = data['items'][0]['issns']
                    # iterate only when the record has ISSNs (issns is NaN otherwise)
                    for i in issns:
                        if ('issn' in i):
                            myissn = i['issn']
                        if ('type' in i):
                            mytype = i['type']
                            if mytype == 'print':
                                issnp_sherpa = myissn
                            if mytype == 'electronic':
                                issne_sherpa = myissn
                publisher_id = data['items'][0]['publishers'][0]['publisher']['id']
                sherpa_created = data['items'][0]['system_metadata']['date_created']
                sherpa_last_modified = data['items'][0]['system_metadata']['date_modified']
                if ('country' in data['items'][0]['publishers'][0]['publisher']):
                    publisher_country = data['items'][0]['publishers'][0]['publisher']['country']
                if ('relationship_type' in data['items'][0]['publishers'][0]):
                    publisher_type = data['items'][0]['publishers'][0]['relationship_type']
                if ('url' in data['items'][0]['publishers'][0]['publisher']):
                    publisher_url = data['items'][0]['publishers'][0]['publisher']['url']
                if ('name' in data['items'][0]['publishers'][0]['publisher']['name'][0]):
                    publisher_name = data['items'][0]['publishers'][0]['publisher']['name'][0]['name']
                if ('id' in data['items'][0]):
                    sherpa_id = data['items'][0]['id']
                    # check whether this id has already been processed
                    if sherpa_id in sherpa_policies['sherpa_id'].values :
                        # print(journal_issn + ' - sherpa ID ' + str(sherpa_id) + ' already processed -> SKIP')
                        sherpa_log = sherpa_log + 'sherpa_id already done - '
                    else :
                        poilicies = data['items'][0]['publisher_policy']
                        for poilicy in poilicies:
                            # initialise the variables to extract
                            sherpa_uri = np.nan
                            open_access_prohibited = np.nan
                            if ('uri' in poilicy):
                                sherpa_uri = poilicy['uri']
                            if ('open_access_prohibited' in poilicy):
                                open_access_prohibited = poilicy['open_access_prohibited']
                            if ('permitted_oa' in poilicy):
                                poas = poilicy['permitted_oa']
                                for poa in poas:
                                    additional_oa_fee = np.nan
                                    article_versions = np.nan
                                    article_version = np.nan
                                    licenses = []
                                    embargo = 0
                                    prerequisites = np.nan
                                    prerequisites_phrases = ''
                                    prerequisite_funders = np.nan
                                    prerequisite_funders_name = np.nan
                                    prerequisite_funders_fundref = np.nan
                                    prerequisite_funders_ror = np.nan
                                    prerequisite_funders_country = np.nan
                                    prerequisite_funders_url = np.nan
                                    prerequisite_funders_sherpa_id = np.nan
                                    prerequisite_subjects = np.nan
                                    named_repository = np.nan
                                    named_academic_social_network = np.nan
                                    locations_ir = ''
                                    locations_not_ir = ''
                                    copyright_owner = np.nan
                                    conditions = np.nan
                                    public_notes = np.nan
                                    if ('additional_oa_fee' in poa):
                                        additional_oa_fee = poa['additional_oa_fee']
                                    if ('location' in poa):
                                        archiving = 0
                                        location = ''
                                        mylocations = poa['location']['location']
                                        mylocations_text = poa['location']['location_phrases']
                                        if (type(mylocations) is not list):
                                            mylocations = [mylocations]
                                        location = ' ; '.join(mylocations)
                                        for locationi in mylocations:
                                            if locationi in repositories_archiving :
                                                archiving = archiving + 1
                                                for locationi_text in mylocations_text:
                                                    if locationi_text['value'] == locationi :
                                                        if locations_ir == '':
                                                            locations_ir = locations_ir + locationi_text['phrase']
                                                        else :
                                                            if locationi_text['phrase'] not in locations_ir :
                                                                locations_ir = locations_ir + ' ; ' + locationi_text['phrase']
                                            else :
                                                for locationi_text in mylocations_text:
                                                    if locationi_text['value'] == locationi :
                                                        if locations_not_ir == '':
                                                            locations_not_ir = locations_not_ir + locationi_text['phrase']
                                                        else :
                                                            if locationi_text['phrase'] not in locations_not_ir :
                                                                locations_not_ir = locations_not_ir + ' ; ' + locationi_text['phrase']
                                        # print (archiving)
                                        if archiving > 0:
                                            archiving = True
                                        else : 
                                            archiving = False
                                        if ('named_repository' in poa['location']):
                                            if (type(poa['location']['named_repository']) is list):
                                                named_repository = ' ; '.join(poa['location']['named_repository'])
                                            else : 
                                                named_repository = poa['location']['named_repository']
                                            locations_not_ir = locations_not_ir.replace('Named Repository', named_repository)
                                            locations_ir = locations_ir.replace('Named Repository', named_repository)
                                        if ('named_academic_social_network' in poa['location']):
                                            if (type(poa['location']['named_academic_social_network']) is list):
                                                named_academic_social_network = ' ; '.join(poa['location']['named_academic_social_network'])
                                            else : 
                                                named_academic_social_network = poa['location']['named_academic_social_network']
                                            locations_not_ir = locations_not_ir.replace('Named Academic Social Network', named_academic_social_network)
                                            locations_ir = locations_ir.replace('Named Academic Social Network', named_academic_social_network)
                                    if ('embargo' in poa):
                                        # print(poa['embargo'])
                                        embargo_amount = 0
                                        if ('amount' in poa['embargo']):
                                            embargo_amount = poa['embargo']['amount']
                                        if ('units' in poa['embargo']):
                                            if (poa['embargo']['units'] == 'months') :
                                                embargo = embargo_amount
                                            elif (poa['embargo']['units'] == 'years') :
                                                embargo = embargo_amount*12
                                            elif (poa['embargo']['units'] == 'weeks') :
                                                if (embargo_amount == 0):
                                                    embargo = 0
                                                if (embargo_amount > 0):
                                                    embargo = int(embargo_amount/4)
                                                    if (embargo == 0):
                                                        embargo = 1
                                            elif (poa['embargo']['units'] == 'days') :
                                                if (embargo_amount == 0):
                                                    embargo = 0
                                                if (embargo_amount > 0):
                                                    embargo = int(embargo_amount/30)
                                                    if (embargo == 0):
                                                        embargo = 1
                                        else :
                                            embargo = embargo_amount
                                    if ('prerequisites' in poa):
                                        if 'prerequisites' in poa['prerequisites'] :
                                            if (type(poa['prerequisites']['prerequisites']) is list):
                                                prerequisites = ' ; '.join(poa['prerequisites']['prerequisites'])
                                            else:
                                                prerequisites = poa['prerequisites']['prerequisites']
                                        if 'prerequisites_phrases' in poa['prerequisites'] :
                                            if (type(poa['prerequisites']['prerequisites_phrases']) is list):
                                                for prerequisites_phrasesi in poa['prerequisites']['prerequisites_phrases']:
                                                    if 'phrase' in prerequisites_phrasesi:
                                                        prerequisites_phrases = prerequisites_phrases + prerequisites_phrasesi['phrase'] + ' ; '
                                            else:
                                                prerequisites_phrases = poa['prerequisites']['prerequisites_phrases']['phrase']
                                        if ('prerequisite_funders' in poa['prerequisites']):
                                            prerequisite_funders = True
                                            # prerequisite_funders = poa['prerequisites']['prerequisite_funders']
                                            # if (type(poa['prerequisites']['prerequisite_funders']) is list):
                                            #     prerequisite_funders = ' ; '.join(poa['prerequisites']['prerequisite_funders'])
                                            # else:
                                            #     prerequisite_funders = poa['prerequisites']['prerequisite_funders']
                                        if ('prerequisite_subjects' in poa['prerequisites']):
                                            prerequisite_subjects = True
                                            # prerequisite_subjects = poa['prerequisites']['prerequisite_subjects']
                                            # if (type(poa['prerequisite_subjects']) is list):
                                            #     prerequisite_subjects = ' ; '.join(poa['prerequisite_subjects'])
                                            # else:
                                            #     prerequisite_subjects = poa['prerequisite_subjects']
                                    if ('copyright_owner' in poa):
                                        copyright_owner = poa['copyright_owner']
                                    if ('publisher_deposit' in poa):
                                        publisher_deposit = ''
                                        if (type(poa['publisher_deposit']) is list):
                                            for deposit in poa['publisher_deposit']:
                                                if 'type' in deposit['repository_metadata']:
                                                    publisher_deposit = publisher_deposit + deposit['repository_metadata']['type']
                                                    if 'name' in deposit['repository_metadata']:
                                                        publisher_deposit = publisher_deposit + ' (' + deposit['repository_metadata']['name'][0]['name'] + ')'
                                                else :
                                                    if 'name' in deposit['repository_metadata']:
                                                        publisher_deposit = publisher_deposit + deposit['repository_metadata']['name'][0]['name']
                                                publisher_deposit = publisher_deposit + ' ; '
                                        else :
                                            deposit = poa['publisher_deposit']
                                            if 'type' in deposit['repository_metadata']:
                                                publisher_deposit = publisher_deposit + deposit['repository_metadata']['type']
                                                if 'name' in deposit['repository_metadata']:
                                                    publisher_deposit = publisher_deposit + ' (' + deposit['repository_metadata']['name'][0]['name'] + ')'
                                            else :
                                                if 'name' in deposit['repository_metadata']:
                                                    publisher_deposit = publisher_deposit + deposit['repository_metadata']['name'][0]['name']
                                            publisher_deposit = publisher_deposit + ' ; '
                                        # print (publisher_deposit)
                                    if ('conditions' in poa):
                                        if (type(poa['conditions']) is list):
                                            conditions = ' ; '.join(poa['conditions'])
                                        else:
                                            conditions = poa['conditions']
                                    if ('public_notes' in poa):
                                        if (type(poa['public_notes']) is list):
                                            public_notes = ' ; '.join(poa['public_notes'])
                                        else:
                                            public_notes = poa['public_notes']
                                    if ('license' in poa):
                                        licenses = poa['license']
                                        if (type(licenses) is not list):
                                            licenses = [licenses]
                                    else :
                                        licenses = ['']
                                    # with article version
                                    if ('article_version' in poa):
                                        article_versions = poa['article_version']
                                        for article_version in article_versions:
                                            for license in licenses:
                                                if ('license' in license):
                                                    mylicense = license['license']
                                                else :
                                                    mylicense = ''
                                                # with prerequisites
                                                if ('prerequisites' in poa) :
                                                    # with prerequisite_funders
                                                    if ('prerequisite_funders' in poa['prerequisites']):
                                                        for prerequisite_fundersi in poa['prerequisites']['prerequisite_funders'] :
                                                            prerequisite_funders_name = prerequisite_fundersi['funder_metadata']['name'][0]['name']
                                                            if 'acronym' in prerequisite_fundersi['funder_metadata']['name'][0]:
                                                                prerequisite_funders_name = prerequisite_funders_name + ' (' + prerequisite_fundersi['funder_metadata']['name'][0]['acronym'] + ')'
                                                            if 'identifiers' in prerequisite_fundersi['funder_metadata'] :
                                                                for fund_identifier in prerequisite_fundersi['funder_metadata']['identifiers'] :
                                                                    if fund_identifier['type'] == 'fundref':
                                                                        prerequisite_funders_fundref = fund_identifier['identifier']
                                                                    if fund_identifier['type'] == 'ror':
                                                                        prerequisite_funders_ror = fund_identifier['identifier']
                                                            if 'country' in prerequisite_fundersi['funder_metadata']:
                                                                prerequisite_funders_country = prerequisite_fundersi['funder_metadata']['country']
                                                            if 'url' in prerequisite_fundersi['funder_metadata']:
                                                                prerequisite_funders_url = prerequisite_fundersi['funder_metadata']['url'][0]['url']
                                                            prerequisite_funders_sherpa_id = prerequisite_fundersi['funder_metadata']['id']
                                                            sherpa_policies = sherpa_policies.append({'NlmUniqueID' : journal_id,
                                                                                              'title_sherpa' : title_sherpa,
                                                                                              'issnp_sherpa' : issnp_sherpa,
                                                                                              'issne_sherpa' : issne_sherpa,
                                                                                              'url' : url,
                                                                                              'publisher_id' : publisher_id,
                                                                                              'publisher_country' : publisher_country,
                                                                                              'publisher_type' : publisher_type,
                                                                                              'publisher_url' : publisher_url,
                                                                                              'publisher_name' : publisher_name,
                                                                                              'sherpa_id' : sherpa_id,
                                                                                              'sherpa_uri' : sherpa_uri,
                                                                                              'open_access_prohibited' : open_access_prohibited,
                                                                                              'additional_oa_fee' : additional_oa_fee,
                                                                                              'article_version' : article_version,
                                                                                              'license' : mylicense,
                                                                                              'embargo' : embargo,
                                                                                              'prerequisites' : prerequisites,
                                                                                              'prerequisites_phrases' : prerequisites_phrases,
                                                                                              'prerequisite_funders' : prerequisite_funders,
                                                                                              'prerequisite_funders_name' : prerequisite_funders_name,
                                                                                              'prerequisite_funders_fundref' : prerequisite_funders_fundref,
                                                                                              'prerequisite_funders_ror' : prerequisite_funders_ror,
                                                                                              'prerequisite_funders_country' : prerequisite_funders_country,
                                                                                              'prerequisite_funders_url' : prerequisite_funders_url,
                                                                                              'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
                                                                                              'prerequisite_subjects' : prerequisite_subjects,
                                                                                              'location' : location,
                                                                                              'locations_ir' : locations_ir,
                                                                                              'locations_not_ir' : locations_not_ir,
                                                                                              'named_repository' : named_repository,
                                                                                              'named_academic_social_network' : named_academic_social_network,
                                                                                              'copyright_owner' : copyright_owner,
                                                                                              'publisher_deposit' : publisher_deposit,
                                                                                              'archiving' : archiving,
                                                                                              'conditions' : conditions,
                                                                                              'public_notes' : public_notes,
                                                                                              'sherpa_created' : sherpa_created,
                                                                                              'sherpa_last_modified' : sherpa_last_modified
                                                                                              }, ignore_index=True)
                                                    # without prerequisite_funders
                                                    # (note: DataFrame.append was removed in pandas 2.0, so this cell needs pandas < 2.0)
                                                    else :
                                                        sherpa_policies = sherpa_policies.append({'NlmUniqueID' : journal_id,
                                                                                          'title_sherpa' : title_sherpa,
                                                                                          'issnp_sherpa' : issnp_sherpa,
                                                                                          'issne_sherpa' : issne_sherpa,
                                                                                          'url' : url,
                                                                                          'publisher_id' : publisher_id,
                                                                                          'publisher_country' : publisher_country,
                                                                                          'publisher_type' : publisher_type,
                                                                                          'publisher_url' : publisher_url,
                                                                                          'publisher_name' : publisher_name,
                                                                                          'sherpa_id' : sherpa_id,
                                                                                          'sherpa_uri' : sherpa_uri,
                                                                                          'open_access_prohibited' : open_access_prohibited,
                                                                                          'additional_oa_fee' : additional_oa_fee,
                                                                                          'article_version' : article_version,
                                                                                          'license' : mylicense,
                                                                                          'embargo' : embargo,
                                                                                          'prerequisites' : prerequisites,
                                                                                          'prerequisites_phrases' : prerequisites_phrases,
                                                                                          'prerequisite_funders' : prerequisite_funders,
                                                                                          'prerequisite_funders_name' : prerequisite_funders_name,
                                                                                          'prerequisite_funders_fundref' : prerequisite_funders_fundref,
                                                                                          'prerequisite_funders_ror' : prerequisite_funders_ror,
                                                                                          'prerequisite_funders_country' : prerequisite_funders_country,
                                                                                          'prerequisite_funders_url' : prerequisite_funders_url,
                                                                                          'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
                                                                                          'prerequisite_subjects' : prerequisite_subjects,
                                                                                          'location' : location,
                                                                                          'locations_ir' : locations_ir,
                                                                                          'locations_not_ir' : locations_not_ir,
                                                                                          'named_repository' : named_repository,
                                                                                          'named_academic_social_network' : named_academic_social_network,
                                                                                          'copyright_owner' : copyright_owner,
                                                                                          'publisher_deposit' : publisher_deposit,
                                                                                          'archiving' : archiving,
                                                                                          'conditions' : conditions,
                                                                                          'public_notes' : public_notes,
                                                                                          'sherpa_created' : sherpa_created,
                                                                                          'sherpa_last_modified' : sherpa_last_modified
                                                                                          }, ignore_index=True)
                                                # without prerequisites
                                                else :
                                                    sherpa_policies = sherpa_policies.append({'NlmUniqueID' : journal_id,
                                                                                          'title_sherpa' : title_sherpa,
                                                                                          'issnp_sherpa' : issnp_sherpa,
                                                                                          'issne_sherpa' : issne_sherpa,
                                                                                          'url' : url,
                                                                                          'publisher_id' : publisher_id,
                                                                                          'publisher_country' : publisher_country,
                                                                                          'publisher_type' : publisher_type,
                                                                                          'publisher_url' : publisher_url,
                                                                                          'publisher_name' : publisher_name,
                                                                                          'sherpa_id' : sherpa_id,
                                                                                          'sherpa_uri' : sherpa_uri,
                                                                                          'open_access_prohibited' : open_access_prohibited,
                                                                                          'additional_oa_fee' : additional_oa_fee,
                                                                                          'article_version' : article_version,
                                                                                          'license' : mylicense,
                                                                                          'embargo' : embargo,
                                                                                          'prerequisites' : prerequisites,
                                                                                          'prerequisites_phrases' : prerequisites_phrases,
                                                                                          'prerequisite_funders' : prerequisite_funders,
                                                                                          'prerequisite_funders_name' : prerequisite_funders_name,
                                                                                          'prerequisite_funders_fundref' : prerequisite_funders_fundref,
                                                                                          'prerequisite_funders_ror' : prerequisite_funders_ror,
                                                                                          'prerequisite_funders_country' : prerequisite_funders_country,
                                                                                          'prerequisite_funders_url' : prerequisite_funders_url,
                                                                                          'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
                                                                                          'prerequisite_subjects' : prerequisite_subjects,
                                                                                          'location' : location,
                                                                                          'locations_ir' : locations_ir,
                                                                                          'locations_not_ir' : locations_not_ir,
                                                                                          'named_repository' : named_repository,
                                                                                          'named_academic_social_network' : named_academic_social_network,
                                                                                          'copyright_owner' : copyright_owner,
                                                                                          'publisher_deposit' : publisher_deposit,
                                                                                          'archiving' : archiving,
                                                                                          'conditions' : conditions,
                                                                                          'public_notes' : public_notes,
                                                                                          'sherpa_created' : sherpa_created,
                                                                                          'sherpa_last_modified' : sherpa_last_modified
                                                                                          }, ignore_index=True)

                                    # without article_version: normalise licenses to a list,
                                    # then write one row per license
                                    else :
                                        if (type(licenses) is not list):
                                            licenses = [licenses]
                                        for license in licenses:
                                            if ('license' in license):
                                                mylicense = license['license']
                                            else :
                                                mylicense = ''
                                            # with prerequisites
                                            if ('prerequisites' in poa) :
                                                # with prerequisite_funders: one row per funder
                                                if ('prerequisite_funders' in poa['prerequisites']):
                                                    for prerequisite_fundersi in poa['prerequisites']['prerequisite_funders'] :
                                                        prerequisite_funders_name = prerequisite_fundersi['funder_metadata']['name'][0]['name']
                                                        if 'acronym' in prerequisite_fundersi['funder_metadata']['name'][0]:
                                                            prerequisite_funders_name = prerequisite_funders_name + ' (' + prerequisite_fundersi['funder_metadata']['name'][0]['acronym'] + ')'
                                                        if 'identifiers' in prerequisite_fundersi['funder_metadata'] :
                                                            for fund_identifier in prerequisite_fundersi['funder_metadata']['identifiers'] :
                                                                if fund_identifier['type'] == 'fundref':
                                                                    prerequisite_funders_fundref = fund_identifier['identifier']
                                                                if fund_identifier['type'] == 'ror':
                                                                    prerequisite_funders_ror = fund_identifier['identifier']
                                                        if 'country' in prerequisite_fundersi['funder_metadata']:
                                                            prerequisite_funders_country = prerequisite_fundersi['funder_metadata']['country']
                                                        if 'url' in prerequisite_fundersi['funder_metadata']:
                                                            prerequisite_funders_url = prerequisite_fundersi['funder_metadata']['url'][0]['url']
                                                        prerequisite_funders_sherpa_id = prerequisite_fundersi['funder_metadata']['id']
                                                        sherpa_policies = sherpa_policies.append({'NlmUniqueID' : journal_id,
                                                                                          'title_sherpa' : title_sherpa,
                                                                                          'issnp_sherpa' : issnp_sherpa,
                                                                                          'issne_sherpa' : issne_sherpa,
                                                                                          'url' : url,
                                                                                          'publisher_id' : publisher_id,
                                                                                          'publisher_country' : publisher_country,
                                                                                          'publisher_type' : publisher_type,
                                                                                          'publisher_url' : publisher_url,
                                                                                          'publisher_name' : publisher_name,
                                                                                          'sherpa_id' : sherpa_id,
                                                                                          'sherpa_uri' : sherpa_uri,
                                                                                          'open_access_prohibited' : open_access_prohibited,
                                                                                          'additional_oa_fee' : additional_oa_fee,
                                                                                          'article_version' : article_version,
                                                                                          'license' : mylicense,
                                                                                          'embargo' : embargo,
                                                                                          'prerequisites' : prerequisites,
                                                                                          'prerequisites_phrases' : prerequisites_phrases,
                                                                                          'prerequisite_funders' : prerequisite_funders,
                                                                                          'prerequisite_funders_name' : prerequisite_funders_name,
                                                                                          'prerequisite_funders_fundref' : prerequisite_funders_fundref,
                                                                                          'prerequisite_funders_ror' : prerequisite_funders_ror,
                                                                                          'prerequisite_funders_country' : prerequisite_funders_country,
                                                                                          'prerequisite_funders_url' : prerequisite_funders_url,
                                                                                          'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
                                                                                          'prerequisite_subjects' : prerequisite_subjects,
                                                                                          'location' : location,
                                                                                          'locations_ir' : locations_ir,
                                                                                          'locations_not_ir' : locations_not_ir,
                                                                                          'named_repository' : named_repository,
                                                                                          'named_academic_social_network' : named_academic_social_network,
                                                                                          'copyright_owner' : copyright_owner,
                                                                                          'publisher_deposit' : publisher_deposit,
                                                                                          'archiving' : archiving,
                                                                                          'conditions' : conditions,
                                                                                          'public_notes' : public_notes,
                                                                                          'sherpa_created' : sherpa_created,
                                                                                          'sherpa_last_modified' : sherpa_last_modified
                                                                                          }, ignore_index=True)
                                                # without prerequisite_funders
                                                else :
                                                    sherpa_policies = sherpa_policies.append({'NlmUniqueID' : journal_id,
                                                                                          'title_sherpa' : title_sherpa,
                                                                                          'issnp_sherpa' : issnp_sherpa,
                                                                                          'issne_sherpa' : issne_sherpa,
                                                                                          'url' : url,
                                                                                          'publisher_id' : publisher_id,
                                                                                          'publisher_country' : publisher_country,
                                                                                          'publisher_type' : publisher_type,
                                                                                          'publisher_url' : publisher_url,
                                                                                          'publisher_name' : publisher_name,
                                                                                          'sherpa_id' : sherpa_id,
                                                                                          'sherpa_uri' : sherpa_uri,
                                                                                          'open_access_prohibited' : open_access_prohibited,
                                                                                          'additional_oa_fee' : additional_oa_fee,
                                                                                          'article_version' : article_version,
                                                                                          'license' : mylicense,
                                                                                          'embargo' : embargo,
                                                                                          'prerequisites' : prerequisites,
                                                                                          'prerequisites_phrases' : prerequisites_phrases,
                                                                                          'prerequisite_funders' : prerequisite_funders,
                                                                                          'prerequisite_funders_name' : prerequisite_funders_name,
                                                                                          'prerequisite_funders_fundref' : prerequisite_funders_fundref,
                                                                                          'prerequisite_funders_ror' : prerequisite_funders_ror,
                                                                                          'prerequisite_funders_country' : prerequisite_funders_country,
                                                                                          'prerequisite_funders_url' : prerequisite_funders_url,
                                                                                          'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
                                                                                          'prerequisite_subjects' : prerequisite_subjects,
                                                                                          'location' : location,
                                                                                          'locations_ir' : locations_ir,
                                                                                          'locations_not_ir' : locations_not_ir,
                                                                                          'named_repository' : named_repository,
                                                                                          'named_academic_social_network' : named_academic_social_network,
                                                                                          'copyright_owner' : copyright_owner,
                                                                                          'publisher_deposit' : publisher_deposit,
                                                                                          'archiving' : archiving,
                                                                                          'conditions' : conditions,
                                                                                          'public_notes' : public_notes,
                                                                                          'sherpa_created' : sherpa_created,
                                                                                          'sherpa_last_modified' : sherpa_last_modified
                                                                                          }, ignore_index=True)
                                            # without prerequisites
                                            else :
                                                sherpa_policies = sherpa_policies.append({'NlmUniqueID' : journal_id,
                                                                                          'title_sherpa' : title_sherpa,
                                                                                          'issnp_sherpa' : issnp_sherpa,
                                                                                          'issne_sherpa' : issne_sherpa,
                                                                                          'url' : url,
                                                                                          'publisher_id' : publisher_id,
                                                                                          'publisher_country' : publisher_country,
                                                                                          'publisher_type' : publisher_type,
                                                                                          'publisher_url' : publisher_url,
                                                                                          'publisher_name' : publisher_name,
                                                                                          'sherpa_id' : sherpa_id,
                                                                                          'sherpa_uri' : sherpa_uri,
                                                                                          'open_access_prohibited' : open_access_prohibited,
                                                                                          'additional_oa_fee' : additional_oa_fee,
                                                                                          'article_version' : article_version,
                                                                                          'license' : mylicense,
                                                                                          'embargo' : embargo,
                                                                                          'prerequisites' : prerequisites,
                                                                                          'prerequisites_phrases' : prerequisites_phrases,
                                                                                          'prerequisite_funders' : prerequisite_funders,
                                                                                          'prerequisite_funders_name' : prerequisite_funders_name,
                                                                                          'prerequisite_funders_fundref' : prerequisite_funders_fundref,
                                                                                          'prerequisite_funders_ror' : prerequisite_funders_ror,
                                                                                          'prerequisite_funders_country' : prerequisite_funders_country,
                                                                                          'prerequisite_funders_url' : prerequisite_funders_url,
                                                                                          'prerequisite_funders_sherpa_id' : prerequisite_funders_sherpa_id,
                                                                                          'prerequisite_subjects' : prerequisite_subjects,
                                                                                          'location' : location,
                                                                                          'locations_ir' : locations_ir,
                                                                                          'locations_not_ir' : locations_not_ir,
                                                                                          'named_repository' : named_repository,
                                                                                          'named_academic_social_network' : named_academic_social_network,
                                                                                          'copyright_owner' : copyright_owner,
                                                                                          'publisher_deposit' : publisher_deposit,
                                                                                          'archiving' : archiving,
                                                                                          'conditions' : conditions,
                                                                                          'public_notes' : public_notes,
                                                                                          'sherpa_created' : sherpa_created,
                                                                                          'sherpa_last_modified' : sherpa_last_modified
                                                                                          }, ignore_index=True)
                            # without permitted_oa
                            else :
                                # print(journal_issn + ' - sherpa ID ' + str(sherpa_id) + ' permitted_oa MISSING -> SKIP')
                                sherpa_log = sherpa_log + 'permitted_oa MISSING - '
                else :
                    # print(journal_issn + ' - sherpa ID MISSING')
                    sherpa_log = sherpa_log + 'sherpa ID MISSING - '
            else :
                # print(journal_issn + ' - sherpa without items')
                sherpa_log = sherpa_log + 'sherpa without items - '
    else :
        sherpa_log = sherpa_log + 'JSON file MISSING - '
    sherpalog = sherpalog.append({'journal' : journal_id, 'issn' : journal_issn, 'sherpa_id' : sherpa_id, 'result' : sherpa_log}, ignore_index=True)

0
100
200
...
10000
10100
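
The loop in the cell above grows `sherpa_policies` and `sherpalog` one row at a time with `DataFrame.append`, which was deprecated in pandas 1.4 and removed in pandas 2.0. A minimal sketch of an equivalent pattern for newer pandas versions is to collect one dict per row in a plain list and build the DataFrame once after the loop. The sample records and the names `journals`, `log_rows` and `sherpalog_sketch` are hypothetical and are not part of the original notebook.

```python
import pandas as pd

# Hypothetical sample records standing in for the per-journal results of the loop
journals = [
    {'journal': '9015384', 'issn': '1477-4674', 'sherpa_id': 1406, 'result': ''},
    {'journal': '0000000', 'issn': '0000-0000', 'sherpa_id': None, 'result': 'sherpa ID MISSING - '},
]

log_rows = []  # plain Python list instead of a DataFrame that grows row by row
for record in journals:
    # one dict per journal, with the same keys as the sherpalog columns
    log_rows.append({'journal': record['journal'],
                     'issn': record['issn'],
                     'sherpa_id': record['sherpa_id'],
                     'result': record['result']})

# build the DataFrame once, after the loop
sherpalog_sketch = pd.DataFrame(log_rows, columns=['journal', 'issn', 'sherpa_id', 'result'])
print(sherpalog_sketch)
```

Building the frame once also avoids copying the whole table at every iteration, which may matter with around 10,000 journals and close to 50,000 policy rows.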


In [11]:
sherpa_policies

Unnamed: 0,NlmUniqueID,title_sherpa,issne_sherpa,issnp_sherpa,url,publisher_id,publisher_country,publisher_type,publisher_url,publisher_name,sherpa_id,sherpa_uri,open_access_prohibited,additional_oa_fee,article_version,license,embargo,prerequisites,prerequisite_funders,prerequisite_funders_name,prerequisite_funders_fundref,prerequisite_funders_ror,prerequisite_funders_country,prerequisite_funders_url,prerequisite_funders_sherpa_id,prerequisite_subjects,location,locations_ir,locations_not_ir,named_repository,named_academic_social_network,copyright_owner,publisher_deposit,archiving,conditions,public_notes,sherpa_created,sherpa_last_modified,prerequisites_phrases
0,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/1112,no,no,submitted,,0,,,,,,,,,,any_repository ; any_website ; authors_homepag...,Any Repository ; Any Website ; Institutional R...,Author's Homepage ; Non-Commercial Subject Rep...,,,,,True,Prior to acceptance ; Must be accompanied by a...,,2010-07-15 16:04:39,2022-07-26 10:25:23,
1,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/1112,no,no,accepted,,24,,,,,,,,,,institutional_repository ; non_commercial_subj...,Institutional Repository,Non-Commercial Subject Repository,,,,,True,Published source must be acknowledged ; Must l...,,2010-07-15 16:04:39,2022-07-26 10:25:23,
2,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/1112,no,no,accepted,,0,,,,,,,,,,authors_homepage,,Author's Homepage,,,,,False,Published source must be acknowledged ; Must l...,,2010-07-15 16:04:39,2022-07-26 10:25:23,
3,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/3302,no,yes,published,cc_by_nc,0,,,,,,,,,,any_website ; named_repository ; non_commercia...,Any Website ; Non-Commercial Institutional Rep...,PubMed Central ; Non-Commercial Subject Reposi...,PubMed Central,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-07-15 16:04:39,2022-07-26 10:25:23,
4,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/3302,no,yes,published,cc_by_nc_nd,0,,,,,,,,,,any_website ; named_repository ; non_commercia...,Any Website ; Non-Commercial Institutional Rep...,PubMed Central ; Non-Commercial Subject Reposi...,PubMed Central,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-07-15 16:04:39,2022-07-26 10:25:23,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
49562,9435608,Zoology,,0944-2006,http://www.elsevier.com/wps/product/cws_home/7...,30,us,commercial_publisher,http://www.elsevier.com/,Elsevier,15919,https://v2.sherpa.ac.uk/id/publisher_policy/3323,no,yes,published,cc_by,0,,True,Medical Research Council (MRC),http://dx.doi.org/10.13039/501100000265,https://ror.org/03x94j517,gb,http://www.mrc.ac.uk/index.htm,705,,any_repository ; institutional_repository ; na...,Any Repository ; Institutional Repository,PubMed Central ; Research for Development Repo...,PubMed Central ; Research for Development Repo...,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-09-14 14:06:06,2022-07-26 14:26:43,
49563,9435608,Zoology,,0944-2006,http://www.elsevier.com/wps/product/cws_home/7...,30,us,commercial_publisher,http://www.elsevier.com/,Elsevier,15919,https://v2.sherpa.ac.uk/id/publisher_policy/3323,no,yes,published,cc_by,0,,True,Motor Neuron Disease Association (MND Associat...,http://dx.doi.org/10.13039/501100000406,https://ror.org/02gq0fg61,gb,http://www.mndassociation.org/,562,,any_repository ; institutional_repository ; na...,Any Repository ; Institutional Repository,PubMed Central ; Research for Development Repo...,PubMed Central ; Research for Development Repo...,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-09-14 14:06:06,2022-07-26 14:26:43,
49564,9435608,Zoology,,0944-2006,http://www.elsevier.com/wps/product/cws_home/7...,30,us,commercial_publisher,http://www.elsevier.com/,Elsevier,15919,https://v2.sherpa.ac.uk/id/publisher_policy/3323,no,yes,published,cc_by,0,,True,Parkinson's UK,http://dx.doi.org/10.13039/501100000304,https://ror.org/02417p338,gb,http://www.parkinsons.org.uk/,411,,any_repository ; institutional_repository ; na...,Any Repository ; Institutional Repository,PubMed Central ; Research for Development Repo...,PubMed Central ; Research for Development Repo...,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-09-14 14:06:06,2022-07-26 14:26:43,
49565,9435608,Zoology,,0944-2006,http://www.elsevier.com/wps/product/cws_home/7...,30,us,commercial_publisher,http://www.elsevier.com/,Elsevier,15919,https://v2.sherpa.ac.uk/id/publisher_policy/3323,no,yes,published,cc_by,0,,True,Telethon Foundation,http://dx.doi.org/10.13039/501100002426,https://ror.org/04xraxn18,it,https://www.telethon.it/en/,325,,any_repository ; institutional_repository ; na...,Any Repository ; Institutional Repository,PubMed Central ; Research for Development Repo...,PubMed Central ; Research for Development Repo...,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-09-14 14:06:06,2022-07-26 14:26:43,


In [12]:
# reset the index so it can be used to build the id
sherpa_policies.reset_index(drop=True, inplace=True)
# add the id as index + 1
sherpa_policies['id'] = sherpa_policies.index + 1
sherpa_policies

Unnamed: 0,NlmUniqueID,title_sherpa,issne_sherpa,issnp_sherpa,url,publisher_id,publisher_country,publisher_type,publisher_url,publisher_name,sherpa_id,sherpa_uri,open_access_prohibited,additional_oa_fee,article_version,license,embargo,prerequisites,prerequisite_funders,prerequisite_funders_name,prerequisite_funders_fundref,prerequisite_funders_ror,prerequisite_funders_country,prerequisite_funders_url,prerequisite_funders_sherpa_id,prerequisite_subjects,location,locations_ir,locations_not_ir,named_repository,named_academic_social_network,copyright_owner,publisher_deposit,archiving,conditions,public_notes,sherpa_created,sherpa_last_modified,prerequisites_phrases,id
0,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/1112,no,no,submitted,,0,,,,,,,,,,any_repository ; any_website ; authors_homepag...,Any Repository ; Any Website ; Institutional R...,Author's Homepage ; Non-Commercial Subject Rep...,,,,,True,Prior to acceptance ; Must be accompanied by a...,,2010-07-15 16:04:39,2022-07-26 10:25:23,,1
1,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/1112,no,no,accepted,,24,,,,,,,,,,institutional_repository ; non_commercial_subj...,Institutional Repository,Non-Commercial Subject Repository,,,,,True,Published source must be acknowledged ; Must l...,,2010-07-15 16:04:39,2022-07-26 10:25:23,,2
2,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/1112,no,no,accepted,,0,,,,,,,,,,authors_homepage,,Author's Homepage,,,,,False,Published source must be acknowledged ; Must l...,,2010-07-15 16:04:39,2022-07-26 10:25:23,,3
3,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/3302,no,yes,published,cc_by_nc,0,,,,,,,,,,any_website ; named_repository ; non_commercia...,Any Website ; Non-Commercial Institutional Rep...,PubMed Central ; Non-Commercial Subject Reposi...,PubMed Central,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-07-15 16:04:39,2022-07-26 10:25:23,,4
4,9015384,Twentieth Century British History,1477-4674,0955-2359,https://academic.oup.com/tcbh,55,gb,university_publisher,https://academic.oup.com/journals/,Oxford University Press,1406,https://v2.sherpa.ac.uk/id/publisher_policy/3302,no,yes,published,cc_by_nc_nd,0,,,,,,,,,,any_website ; named_repository ; non_commercia...,Any Website ; Non-Commercial Institutional Rep...,PubMed Central ; Non-Commercial Subject Reposi...,PubMed Central,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-07-15 16:04:39,2022-07-26 10:25:23,,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
49562,9435608,Zoology,,0944-2006,http://www.elsevier.com/wps/product/cws_home/7...,30,us,commercial_publisher,http://www.elsevier.com/,Elsevier,15919,https://v2.sherpa.ac.uk/id/publisher_policy/3323,no,yes,published,cc_by,0,,True,Medical Research Council (MRC),http://dx.doi.org/10.13039/501100000265,https://ror.org/03x94j517,gb,http://www.mrc.ac.uk/index.htm,705,,any_repository ; institutional_repository ; na...,Any Repository ; Institutional Repository,PubMed Central ; Research for Development Repo...,PubMed Central ; Research for Development Repo...,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-09-14 14:06:06,2022-07-26 14:26:43,,49563
49563,9435608,Zoology,,0944-2006,http://www.elsevier.com/wps/product/cws_home/7...,30,us,commercial_publisher,http://www.elsevier.com/,Elsevier,15919,https://v2.sherpa.ac.uk/id/publisher_policy/3323,no,yes,published,cc_by,0,,True,Motor Neuron Disease Association (MND Associat...,http://dx.doi.org/10.13039/501100000406,https://ror.org/02gq0fg61,gb,http://www.mndassociation.org/,562,,any_repository ; institutional_repository ; na...,Any Repository ; Institutional Repository,PubMed Central ; Research for Development Repo...,PubMed Central ; Research for Development Repo...,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-09-14 14:06:06,2022-07-26 14:26:43,,49564
49564,9435608,Zoology,,0944-2006,http://www.elsevier.com/wps/product/cws_home/7...,30,us,commercial_publisher,http://www.elsevier.com/,Elsevier,15919,https://v2.sherpa.ac.uk/id/publisher_policy/3323,no,yes,published,cc_by,0,,True,Parkinson's UK,http://dx.doi.org/10.13039/501100000304,https://ror.org/02417p338,gb,http://www.parkinsons.org.uk/,411,,any_repository ; institutional_repository ; na...,Any Repository ; Institutional Repository,PubMed Central ; Research for Development Repo...,PubMed Central ; Research for Development Repo...,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-09-14 14:06:06,2022-07-26 14:26:43,,49565
49565,9435608,Zoology,,0944-2006,http://www.elsevier.com/wps/product/cws_home/7...,30,us,commercial_publisher,http://www.elsevier.com/,Elsevier,15919,https://v2.sherpa.ac.uk/id/publisher_policy/3323,no,yes,published,cc_by,0,,True,Telethon Foundation,http://dx.doi.org/10.13039/501100002426,https://ror.org/04xraxn18,it,https://www.telethon.it/en/,325,,any_repository ; institutional_repository ; na...,Any Repository ; Institutional Repository,PubMed Central ; Research for Development Repo...,PubMed Central ; Research for Development Repo...,,,disciplinary (PubMed Central) ;,True,Published source must be acknowledged with cit...,,2010-09-14 14:06:06,2022-07-26 14:26:43,,49566
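
As a small illustration of why the index is reset before the `id` column is derived, the sketch below uses a hypothetical three-row frame whose index has gaps (for example after filtering). The frame and its values are assumptions made for the example only.

```python
import pandas as pd

# Hypothetical frame whose index has gaps, e.g. after filtering rows
policies = pd.DataFrame({'article_version': ['submitted', 'accepted', 'published']},
                        index=[0, 5, 9])

# reset to a contiguous 0..n-1 index, discarding the old one
policies = policies.reset_index(drop=True)
# 1-based identifier derived from the new index
policies['id'] = policies.index + 1

print(policies['id'].tolist())
```

Without the reset, the identifiers in this example would be 1, 6 and 10 instead of a contiguous sequence.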


## Export

In [13]:
# export to TSV (tab-separated values)
sherpa_policies.to_csv('data/temp/2023/sherpa_policies.tsv', sep='\t', encoding='utf-8', index=False)
sherpalog.to_csv('data/temp/2023/sherpa_log.tsv', sep='\t', encoding='utf-8', index=False)
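
If the exported TSV is reloaded in a later step, identifier columns may need an explicit string type: some `NlmUniqueID` values in the MEDLINE list start with a zero (for example `0431420`), and pandas' default type inference would read them as integers. The sketch below shows the effect on a hypothetical two-row extract written to an in-memory buffer instead of `data/temp/2023/`. The embargo values are made up for the example.

```python
import io
import pandas as pd

# Hypothetical two-row extract; NlmUniqueID values can start with a zero
policies = pd.DataFrame({'NlmUniqueID': ['0431420', '9015384'],
                         'embargo': [0, 24]})

# write to an in-memory buffer with the same separator as the export above
buffer = io.StringIO()
policies.to_csv(buffer, sep='\t', index=False)

# default type inference turns '0431420' into the integer 431420
buffer.seek(0)
inferred = pd.read_csv(buffer, sep='\t')

# forcing the string type for the identifier keeps the leading zero
buffer.seek(0)
preserved = pd.read_csv(buffer, sep='\t', dtype={'NlmUniqueID': str})

print(inferred['NlmUniqueID'].tolist())
print(preserved['NlmUniqueID'].tolist())
```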

In [14]:
# export to Excel
sherpa_policies.to_excel('data/temp/2023/sherpa_policies.xlsx', index=False)
sherpalog.to_excel('data/temp/2023/sherpa_log.xlsx', index=False)

  "65,530 URLS per worksheet." % force_unicode(url))
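
The warning text above appears to be the tail of a message from the XlsxWriter engine, which converts URL-like strings to hyperlinks and, according to the message, stops at Excel's limit of 65,530 URLs per worksheet. With close to 50,000 rows and several URL columns (`url`, `publisher_url`, `sherpa_uri`, the funder URLs), `sherpa_policies` can plausibly exceed that limit. The sketch below estimates the number of URL-like cells in a frame before exporting. The helper `count_url_cells`, the list of prefixes and the sample frame are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

EXCEL_URL_LIMIT = 65530  # limit quoted in the warning text

def count_url_cells(frame):
    """Count string cells that start with a URL-like prefix (assumed list of prefixes)."""
    prefixes = ('http://', 'https://', 'ftp://', 'mailto:')
    total = 0
    for column in frame.columns:
        values = frame[column].dropna()
        # True only for string cells starting with one of the prefixes
        flags = values.map(lambda v: isinstance(v, str) and v.startswith(prefixes))
        total += int(flags.sum())
    return total

# Hypothetical miniature of sherpa_policies with two URL columns
sample = pd.DataFrame({
    'url': ['https://academic.oup.com/tcbh', None],
    'publisher_url': ['https://academic.oup.com/journals/', 'http://www.elsevier.com/'],
    'publisher_name': ['Oxford University Press', 'Elsevier'],
})

n_urls = count_url_cells(sample)
print(n_urls, 'URL cells;', 'over limit' if n_urls > EXCEL_URL_LIMIT else 'within limit')
```

If the count is over the limit, one option that may apply is to disable the conversion when writing with XlsxWriter, for instance `pd.ExcelWriter(path, engine='xlsxwriter', engine_kwargs={'options': {'strings_to_urls': False}})` in recent pandas versions, so that URLs are stored as plain text. The TSV export is not affected by this limit.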
