# SCOPUS UM Pharmacy Paper Search

Identify papers published by affiliates of the UM Pharmacy School for a specified time period. Generate a .ris file with citations for all journal articles published in the given time period.

**Important:** This code requires an Elsevier API key, which you can request at dev.elsevier.com. Also be sure to install the *pybliometrics* library if you have not used it before.

To use this code:
1. Run the code cell below
2. When prompted, enter the publication year, then choose an annual or quarterly search (and the quarter, if applicable)

**If you have not used the pybliometrics library before,** the first time you execute this code it will prompt you for your Elsevier API key. Enter the key, then press Enter again to skip the InstToken prompt. Then re-run the cell.

The .ris output will be generated in the same place that you have this notebook saved to (i.e. if the notebook is on your desktop, the .ris will also appear on your desktop). For a quarterly search the file will be named "UM_Pharmacy_Publications_{start_month}\_to_{end_month}_{year}.ris"; for an annual search it will be named "UM_Pharmacy_Publications_{year}.ris".
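As a rough illustration of the naming scheme, the sketch below builds the output file name for both search types. The helper name `ris_filename` is hypothetical and does not appear in the notebook; the sketch also assumes the working directory is the folder containing the notebook, which is the usual Jupyter behavior.

```python
from pathlib import Path


def ris_filename(pub_year, months=None):
    """Return the .ris file name for an annual or quarterly search.

    `months` is a sequence of month names for a quarterly search (first and
    last are used), or None for an annual search. Hypothetical helper.
    """
    if months is None:
        return f"UM_Pharmacy_Publications_{pub_year}.ris"
    return f"UM_Pharmacy_Publications_{months[0]}_to_{months[-1]}_{pub_year}.ris"


# Relative file names resolve against the current working directory,
# which for a notebook is normally the folder the notebook is saved in.
annual_path = Path.cwd() / ris_filename("2024")
quarterly_path = Path.cwd() / ris_filename("2024", ("January", "February", "March"))
print(annual_path.name)
print(quarterly_path.name)
```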

*n.b.* By default, pybliometrics only generates RIS citations for journal articles. I have added some additional code to this script which will generate RIS entries for books and book chapters. Other non-journal, non-book publications are not written to the .ris file; they will be listed in the code output with their title, authors, type, and DOI (if available). The code output will tell you how many publications this applies to for each search. Also, in a quarterly search, a publication that appears to have a placeholder date in Scopus (usually 1 January, i.e. {year}-01-01) will be flagged for manual checking. This flag is skipped for quarter 1 (Jan-Mar), where a January date may be genuine.
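To make the two mechanisms in this note concrete, here is a minimal sketch of a hand-built RIS entry for a book chapter and of the placeholder-date check. The function names and the example record are made up for illustration; in the notebook the records come from a Scopus search, and the attribute names below mirror the ones the code cell reads.

```python
from types import SimpleNamespace


def chapter_to_ris(record):
    """Build a minimal RIS entry for a book chapter (illustrative helper)."""
    lines = [
        "TY  - CHAP",
        f"TI  - {record.title}",
        f"T2  - {record.publicationName}",
        f"DA  - {record.coverDate}",
        f"PY  - {record.coverDate[:4]}",
    ]
    # Author names arrive as one semicolon-separated string;
    # RIS expects one AU line per author.
    for name in (record.author_names or "").split(";"):
        if name.strip():
            lines.append(f"AU  - {name.strip()}")
    if record.doi:
        lines.append(f"DO  - {record.doi}")
    lines.append("ER  - ")
    return "\n".join(lines) + "\n\n"


def has_placeholder_date(record, pub_year, quarter):
    """Flag a 1 January cover date, except in Q1 where it may be genuine."""
    return quarter != 1 and record.coverDate == f"{pub_year}-01-01"


# Made-up record standing in for a single search result.
example = SimpleNamespace(
    title="An Example Chapter",
    publicationName="An Example Book",
    coverDate="2024-01-01",
    author_names="Doe, Jane;Roe, Richard",
    doi=None,
)
entry = chapter_to_ris(example)
print(entry)
print(has_placeholder_date(example, "2024", quarter=2))
```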

## Run this cell:

In [None]:
from datetime import datetime
from pybliometrics.scopus import ScopusSearch, AbstractRetrieval

pub_year = input("Enter the publication year in the format YYYY:")
daterange = input("Is this search annual (A) or quarterly (Q)?")

if daterange.lower() in ["annual", "a"]:
    range = "annual"
    
    search_query = f'(((AF-ID (60010491) AND (pharmacy OR "biomolecular sciences" OR "natural products research" OR pharmaceutics)) OR AF-ID (60020462) OR AF-ID (60030187)) \
    AND PUBYEAR = {pub_year})'
    
elif daterange.lower() in ["quarterly", "q"]:
    range = "quarter"
    quarter = input("Which quarter is it?\nQ1 = Jan, Feb, Mar;\nQ2 = Apr, May, June;\nQ3 = July, Aug, Sept;\nQ4 = Oct, Nov, Dec\n")
    
    if quarter.lower() in ["1", "q1"]:
        pub_month1 = "January"
        pub_month2 = "February"
        pub_month3 = "March"
    elif quarter.lower() in ["2", "q2"]:
        pub_month1 = "April"
        pub_month2 = "May"
        pub_month3 = "June"
    elif quarter.lower() in ["3", "q3"]:
        pub_month1 = "July"
        pub_month2 = "August"
        pub_month3 = "September"
    elif quarter.lower() in ["4", "q4"]:
        pub_month1 = "October"
        pub_month2 = "November"
        pub_month3 = "December"
    else:
        raise ValueError("Input not recognized, please try again.")
        
    search_query = f'(((AF-ID (60010491) AND (pharmacy OR "biomolecular sciences" OR "natural products research" OR pharmaceutics)) OR AF-ID (60020462) OR AF-ID (60030187)) \
    AND PUBDATETXT ( "{pub_month1} {pub_year}" OR "{pub_month2} {pub_year}" OR "{pub_month3} {pub_year}" ) \
    AND PUBYEAR = {pub_year})'
    
else:
    raise ValueError("Input not recognized, please try again.")

start = datetime.now().replace(microsecond = 0)  # start timing the run

s = ScopusSearch(search_query)
print(f'\nNumber of search results: {len(s.results)}')

if range == "annual":
    filename = f'UM_Pharmacy_Publications_{pub_year}.ris'
elif range == "quarter":
    filename = f'UM_Pharmacy_Publications_{pub_month1}_to_{pub_month3}_{pub_year}.ris'

outfile = open(filename, 'w', encoding = 'utf-8')

count = 0
exception_count = 0
exceptions_list = []

for i in s.results:
    if i.aggregationType == "Journal":
        try:  # retrieve the abstract record by DOI and convert it to RIS format
            citation = AbstractRetrieval(identifier = i.doi, id_type = "doi").get_ris()
            outfile.write(citation)
            count = count + 1
        except Exception:  # e.g. missing DOI; fall back to building the RIS entry by hand
            exception_count = exception_count + 1
            ris = f"TY  - JOUR\nTI  - {i.title}\nJO  - {i.publicationName}"\
              f"\nVL  - {i.volume}\nDA  - {i.coverDate}\n"\
              f"PY  - {i.coverDate[0:4]}\nSP  - {i.pageRange}\n"
            # Authors
            author_list = i.author_names.split(";")
            for au in author_list:
                ris += f'AU  - {au}\n'
            # DOI
            if i.doi:
                ris += f'DO  - {i.doi}\nUR  - https://doi.org/{i.doi}\n'
            # Issue
            if i.issueIdentifier:
                ris += f'IS  - {i.issueIdentifier}\n'
            ris += 'ER  - \n\n'
            
            outfile.write(ris)
            
            exceptions_list.append(f'Title: {i.title}\n\
Authors: {author_list}\n\
Type: {i.aggregationType}\n\
Subtype: {i.subtypeDescription}\n\
Publication Name: {i.publicationName}\n\
DOI:{i.doi}\n')

print(f"Successfully generated references for {count} journal articles.","\n")

if exception_count > 0:
    print(f"{exception_count} journal articles generated an error in the code, usually as the result of a missing DOI.\n\
They have been added to the RIS file via another mechanism; you may want to check them manually:\n")
    
    for item in exceptions_list:
        print(item)

book = 0
chapter = 0

for i in s.results:  # build RIS entries by hand for books and book chapters
    if i.aggregationType != "Journal":
        if i.subtypeDescription == "Book":
            # Basic information
            ris = f"TY  - BOOK\nTI  - {i.title}"\
                  f"\nDA  - {i.coverDate}\n"\
                  f"PY  - {i.coverDate[0:4]}\nSP  - {i.pageRange}\n"
            # Authors
            author_list = i.author_names.split(";")
            for au in author_list:
                ris += f'AU  - {au}\n'
            # DOI
            if i.doi:
                ris += f'DO  - {i.doi}\nUR  - https://doi.org/{i.doi}\n'
            ris += 'ER  - \n\n'
            
            outfile.write(ris)
            book += 1
        
        elif i.subtypeDescription == 'Book Chapter':
            # Basic information
            ris = f"TY  - CHAP\nTI  - {i.title}"\
                  f"\nT2  - {i.publicationName}\nDA  - {i.coverDate}\n"\
                  f"PY  - {i.coverDate[0:4]}\nSP  - {i.pageRange}\n"
            # Authors
            author_list = i.author_names.split(";")
            for au in author_list:
                ris += f'AU  - {au}\n'
            # DOI
            if i.doi:
                ris += f'DO  - {i.doi}\nUR  - https://doi.org/{i.doi}\n'
            ris += 'ER  - \n\n'
            
            outfile.write(ris)
            chapter += 1

outfile.close()
if book > 0 or chapter > 0:  
    print(f"Done generating {book} book references and {chapter} book chapter references.\n\n\
A total of {count + exception_count + book + chapter} items have been added to the RIS file.")

if (len(s.results) - count - exception_count - book - chapter) != 0:
    print(f"{len(s.results) - count - exception_count - book - chapter} other types of publications were found for this date range:\n")

other = 0
for i in s.results:  # list publications which are not in the .ris file
    if i.aggregationType != "Journal" and i.subtypeDescription not in ("Book", "Book Chapter"):
        author_list = i.author_names.split(";")  # use this record's authors, not the previous loop's
        print(f'Title: {i.title}\nAuthors: {author_list}\nType: {i.aggregationType}\nSubtype: {i.subtypeDescription}\nDOI:{i.doi}\n')
        other += 1

for i in s.results:
    if range == "quarter" and i.coverDate == f'{pub_year}-01-01' and quarter.lower() not in ["1", "q1"]:
        author_list = i.author_names.split(";")
        print(f"\nThis publication may have no publication date in Scopus, please check manually:\n\nTitle: {i.title}\nAuthors: {author_list}\nType: {i.aggregationType}\nDOI: {i.doi}\n\n")

end = datetime.now().replace(microsecond = 0)
print(f'\nTime to complete the search: {end-start}')
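
The quarter selection in the cell above could also be expressed as a lookup table rather than a chain of `elif` branches. The following is only a sketch of that alternative, not the code the notebook runs; `QUARTER_MONTHS`, `months_for_quarter`, and `pubdatetxt_clause` are hypothetical names introduced here.

```python
QUARTER_MONTHS = {
    1: ("January", "February", "March"),
    2: ("April", "May", "June"),
    3: ("July", "August", "September"),
    4: ("October", "November", "December"),
}


def months_for_quarter(answer):
    """Map user input such as '2', 'q2', or 'Q2' to three month names."""
    text = answer.strip().lower()
    if text.startswith("q"):
        text = text[1:]
    if text not in {"1", "2", "3", "4"}:
        raise ValueError("Input not recognized, please try again.")
    return QUARTER_MONTHS[int(text)]


def pubdatetxt_clause(months, pub_year):
    """Build the PUBDATETXT part of the query for the given months."""
    quoted = " OR ".join(f'"{month} {pub_year}"' for month in months)
    return f"PUBDATETXT({quoted})"


months = months_for_quarter("Q1")
print(pubdatetxt_clause(months, "2024"))
```

Keeping the months in one table means the file name and the date clause are derived from the same data, so they cannot drift apart.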

- Scopus does not recognize these words: glycoscience

### Version 1 - Original Query

Returned 61 results:

PUBDATETXT ("January 2024" OR "February 2024" OR "March 2024") AND (PUBYEAR = 2024) AND (pharmacy OR "biomolecular sciences" OR "natural products research" OR pharmaceutics) AND (AF-ID(60010491) OR AF-ID(60020462) OR AF-ID(60030187))

### Version 2 - Original + New Keywords:

Returned 96 results:

PUBDATETXT ("January 2024" OR "February 2024" OR "March 2024") AND (PUBYEAR = 2024) AND (pharmacy OR "biomolecular sciences" OR "natural products research" OR pharmaceutics OR "drug discovery" OR "drug development" OR synthesis OR "drug design" OR chemistry OR "structure activity" OR "toxicity" OR "zebra fish" OR pharmacology OR neuroscience OR cardiovascular OR computer-aided AND design OR "computational chemistry" OR aging OR "Alzheimer" OR "formulation development" OR "product development" OR "lipid based systems" OR nanotechnology OR extrusion OR polymer OR "3D printing" OR "device design" OR cannabis OR extraction OR "design of experiments" OR "medication utilization" OR "toxicity" OR "marine biology" OR pain) AND (AF-ID(60010491) OR AF-ID(60020462) OR AF-ID(60030187))

### Version 3 - Restructured Query with Original Keywords

Returned 61 results:

PUBDATETXT("January 2024" OR "February 2024" OR "March 2024") AND PUBYEAR = 2024 AND (AF-ID("University of Mississippi School of Pharmacy" 60020462) OR AF-ID("University of Mississippi Research Institute Pharmaceutical Science" 60030187) OR (AF-ID("University of Mississippi" 60010491) AND (pharmacy OR "biomolecular sciences" OR "natural products research" OR pharmaceutics)))

### Version 4 - Restructured Query with New Keywords

Returned 111 results:

PUBDATETXT ("January 2024" OR "February 2024" OR "March 2024") AND (PUBYEAR = 2024) AND ((AF-ID(60010491) AND (pharmacy OR "biomolecular sciences" OR "natural products research" OR pharmaceutics OR "drug discovery" OR "drug development" OR synthesis OR "drug design" OR chemistry OR "structure activity" OR "toxicity" OR "zebra fish" OR pharmacology OR neuroscience OR cardiovascular OR computer-aided AND design OR "computational chemistry" OR aging OR "Alzheimer" OR "formulation development" OR "product development" OR "lipid based systems" OR nanotechnology OR extrusion OR polymer OR "3D printing" OR "device design" OR cannabis OR extraction OR "design of experiments" OR "medication utilization" OR "toxicity" OR "marine biology" OR pain)) OR AF-ID(60020462) OR AF-ID(60030187))

### Version 5 - Restructured Query with Original Keywords, using Title-Abs-Key:

Returned 20 results:

PUBDATETXT ("January 2024" OR "February 2024" OR "March 2024") AND (PUBYEAR = 2024) AND ((AF-ID(60010491) AND TITLE-ABS-KEY(pharmacy OR "biomolecular sciences" OR "natural products research" OR pharmaceutics)) OR AF-ID(60020462) OR AF-ID(60030187))

### Version 6 - Restructured, with all keywords, using Title-Abs-Key:

Returned 50 results:

PUBDATETXT ("January 2024" OR "February 2024" OR "March 2024") AND (PUBYEAR = 2024) AND ((AF-ID(60010491) AND TITLE-ABS-KEY (pharmacy OR "biomolecular sciences" OR "natural products research" OR pharmaceutics OR "drug discovery" OR "drug development" OR synthesis OR "drug design" OR chemistry OR "structure activity" OR "toxicity" OR "zebra fish" OR pharmacology OR neuroscience OR cardiovascular OR computer-aided AND design OR "computational chemistry" OR aging OR "Alzheimer" OR "formulation development" OR "product development" OR "lipid based systems" OR nanotechnology OR extrusion OR polymer OR "3D printing" OR "device design" OR cannabis OR extraction OR "design of experiments" OR "medication utilization" OR "toxicity" OR "marine biology" OR pain)) OR AF-ID(60020462) OR AF-ID(60030187))
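
The restructured variants above differ only in the keyword list and in whether the keywords are wrapped in TITLE-ABS-KEY. As a hedged sketch of how such variants could be generated consistently (this is not how the queries above were produced, and `build_query` is a hypothetical helper), the structure can be assembled from its parts:

```python
ORIGINAL_KEYWORDS = [
    "pharmacy",
    "biomolecular sciences",
    "natural products research",
    "pharmaceutics",
]
Q1_MONTHS = ("January", "February", "March")


def quote_phrase(keyword):
    """Wrap multi-word keywords in double quotes so they match as phrases."""
    return f'"{keyword}"' if " " in keyword else keyword


def build_query(keywords, months, pub_year, use_title_abs_key=False):
    """Assemble a restructured query: keywords apply only to AF-ID 60010491."""
    date_part = " OR ".join(f'"{month} {pub_year}"' for month in months)
    keyword_part = " OR ".join(quote_phrase(k) for k in keywords)
    if use_title_abs_key:
        keyword_clause = f"TITLE-ABS-KEY({keyword_part})"
    else:
        keyword_clause = f"({keyword_part})"
    return (
        f"PUBDATETXT({date_part}) AND PUBYEAR = {pub_year} AND "
        f"((AF-ID(60010491) AND {keyword_clause}) "
        f"OR AF-ID(60020462) OR AF-ID(60030187))"
    )


all_fields_query = build_query(ORIGINAL_KEYWORDS, Q1_MONTHS, 2024)
title_abs_key_query = build_query(ORIGINAL_KEYWORDS, Q1_MONTHS, 2024, use_title_abs_key=True)
print(all_fields_query)
print(title_abs_key_query)
```

Building the string from parts keeps the parentheses balanced across variants, which is easy to get wrong when editing long queries by hand.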