# Data collection

### Useful links:
- Scopus [[Source](https://www.scopus.com/)]
- Scopus API (Application Programming Interface) Documentation [[Source 1](https://dev.elsevier.com/technical_documentation.html); [Source 2](https://dev.elsevier.com/api_docs.html)]
- The "pybliometrics" package in the Python programming language [[Source 1](https://github.com/pybliometrics-dev/pybliometrics); [Source 2](https://pybliometrics.readthedocs.io/en/stable/index.html)]
- Scimago Journal Rank (SJR) [[Source](https://www.scimagojr.com/)]

### Packages, classes and functions used:
**pandas package**
- .apply()
- .astype()
- .concat()
- .DataFrame()
- .drop_duplicates()
- .index
- .iterrows()
- .merge()
- .read_csv()
- .read_excel()
- .to_excel()
- .at[]

**pybliometrics package**
- .get_results_size()
- .scopus.AbstractRetrieval()
- .scopus.CitationOverview()
- .scopus.ScopusSearch()

**requests package**
- .get()

**os package**
- .makedirs()
- .path.join()

In [1]:
## 0 ## Installing and importing modules/libraries
#!pip install pandas # To work with dataframes
#!pip install pybliometrics # To work with api.elsevier.com

import pandas
from pybliometrics.scopus import ScopusSearch
#from pybliometrics.scopus import AbstractRetrieval
#from pybliometrics.scopus import CitationOverview
import requests
import os

# Specifying the API key for successful package importing
# "a6b49bc00cce366026d4cfd9396ac572" - At the moment (19/01/24), the quota limit has been spent
# "c4b35f1579a33db64d94f97c723a60d8"

#help()

---
## Scopus

In [2]:
## 1 ## Request formation with all the necessary parameters

# Specifying the API key
api_key = "c4b35f1579a33db64d94f97c723a60d8"

# Specifying the query
query = '( TITLE-ABS-KEY ( "environment* practice*" OR "ecolog* practice*" OR "eco-practice*" OR "environment* behav*" OR "ecolog* behav*" OR "eco-behav*" ) AND PUBYEAR > 2012 AND PUBYEAR < 2024 ) AND ( sociology ) AND DOCTYPE("ar") AND SUBJAREA("SOCI") AND LANGUAGE("English")'

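The query string above differs slightly from the one used in the web version of Scopus: the class being used accepts every field code except “LIMIT-TO()”, so the LIMIT-TO filters are rewritten as DOCTYPE(), SUBJAREA() and LANGUAGE() conditions.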

**The formula of the search query from the web version of Scopus:** ( TITLE-ABS-KEY ( "environment* practice*" OR "ecolog* practice*" OR "eco-practice*" OR "environment* behav*" OR "ecolog* behav*" OR "eco-behav*" ) AND PUBYEAR > 2012 AND PUBYEAR < 2024 ) AND ( sociology ) AND ( LIMIT-TO ( DOCTYPE , "ar" ) ) AND ( LIMIT-TO ( SUBJAREA , "SOCI" ) ) AND ( LIMIT-TO ( LANGUAGE , "English" ) )
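The rewrite from the web form to the form passed to `ScopusSearch` looks mechanical: each `( LIMIT-TO ( FIELD , "value" ) )` clause becomes `FIELD("value")`. The sketch below illustrates that conversion with a regular expression. The helper name `limit_to_to_field_codes` is hypothetical and is not part of pybliometrics; it only covers the simple single-value clauses used in this query.

```python
import re

# Hypothetical helper (not part of pybliometrics): rewrites the web-interface
# clause ( LIMIT-TO ( FIELD , "value" ) ) as the field-code form FIELD("value").
LIMIT_TO_PATTERN = re.compile(r'\(\s*LIMIT-TO\s*\(\s*([A-Z]+)\s*,\s*"([^"]+)"\s*\)\s*\)')

def limit_to_to_field_codes(web_query):
    return LIMIT_TO_PATTERN.sub(r'\1("\2")', web_query)

web_query = (
    '( TITLE-ABS-KEY ( "environment* practice*" OR "ecolog* practice*" OR "eco-practice*" '
    'OR "environment* behav*" OR "ecolog* behav*" OR "eco-behav*" ) '
    'AND PUBYEAR > 2012 AND PUBYEAR < 2024 ) AND ( sociology ) '
    'AND ( LIMIT-TO ( DOCTYPE , "ar" ) ) AND ( LIMIT-TO ( SUBJAREA , "SOCI" ) ) '
    'AND ( LIMIT-TO ( LANGUAGE , "English" ) )'
)

api_query = limit_to_to_field_codes(web_query)
print(api_query)
```

Applied to the web query, this should reproduce the string stored in `query` in step 1.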

**Explanation for the query parameters used (for more information, see [here](https://www.scopus.com/search/form.uri?display=advanced)):**
1. Field codes:
    - TITLE-ABS-KEY - A combined field that searches abstracts, keywords, and document titles.
    - PUBYEAR - A numeric field indicating the year of publication.
    - DOCTYPE - Limits the search to a document type: article (ar), review (re), book chapter (ch), etc.
    - LANGUAGE - The language in which the original document was written.
    - SUBJAREA - A search field which returns documents related to a specific field of science.
2. Operators:
    - AND - Use AND when you want your results to include all terms and the terms may be far apart.
    - OR - Use OR when your results must include one or more of the terms (such as synonyms, alternate spellings, or abbreviations). Documents that contain any of the words will be found.
3. Wildcards:
    - Asterisk (*) - Replaces multiple characters anywhere in a word. The asterisk stands for 0 or more characters, so it can be used to match any number of characters or to indicate a character that may or may not be present.
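To make the wildcard rule concrete, here is a rough sketch that imitates the "0 or more characters" behaviour of the asterisk with a regular expression. It is only an approximation for illustration: the helper `wildcard_to_regex` is hypothetical, and the real Scopus matching is done server-side and may differ (for example in how it treats punctuation or word variants).

```python
import re

def wildcard_to_regex(term):
    # Escape the whole term, then turn each escaped asterisk into
    # "zero or more word characters" (an assumption made for this sketch).
    pattern = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(pattern, flags=re.IGNORECASE)

pattern = wildcard_to_regex("environment* behav*")

for phrase in ["environmental behavior", "environment behaviour", "environmental policy"]:
    print(phrase, "->", bool(pattern.fullmatch(phrase)))
```

The first two phrases match the search term "environment* behav*", while the third does not, because its second word does not start with "behav".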

In [3]:
## 2 ## Executing the request and saving all the data it returns in the "response" object
response = ScopusSearch(api_key = api_key
                        , query = query
                        , view = "STANDARD"
                        , verbose = True
                        , subscriber = False)

In [4]:
## 3 ## Determining the number of publications found on request
response.get_results_size()

678

In [5]:
## 4 ## Creating a dataframe in which all information about the collected publications will be saved
all_publications = pandas.DataFrame(response.results)
all_publications

Unnamed: 0,eid,doi,pii,pubmed_id,title,subtype,subtypeDescription,creator,afid,affilname,...,pageRange,description,authkeywords,citedby_count,openaccess,freetoread,freetoreadLabel,fund_acr,fund_no,fund_sponsor
0,2-s2.0-85180566359,10.3390/bs13120966,,,Pro-Environmental Behavior and Climate Change ...,ar,Article,Leite Â.,,Universidade Católica Portuguesa,...,,,,0,1,repositoryam,Green,,,
1,2-s2.0-85178364143,10.1093/jcr/ucad016,,,Cyclical Time Is Greener: The Impact of Tempor...,ar,Article,Xu L.,,Wuhan University,...,722-741,,,1,0,,,,,
2,2-s2.0-85168088015,10.1007/s13412-023-00850-9,,,Using the social identity model of pro-environ...,ar,Article,Johnson N.,,Purdue University,...,587-601,,,0,0,,,,,
3,2-s2.0-85158156830,10.1057/s41599-023-01682-2,,,The role of peers in promoting energy conserva...,ar,Article,Lin B.,,Xiamen University,...,,,,0,1,publisherfullgold,Gold,,,
4,2-s2.0-85153196714,10.1002/bse.3428,,,The impact of a proactive environmental strate...,ar,Article,Galbreath J.,,The Faculty of Business and Law,...,5420-5434,,,3,1,publisherhybridgold,Hybrid Gold,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
673,2-s2.0-84884678671,10.1177/0162243913495924,,,Unheeded Science: Taking Precaution out of Tox...,ar,Article,Hoffman K.,,Universidad de Puerto Rico,...,829-850,,,6,0,,,,,
674,2-s2.0-84884519014,10.1108/JEA-04-2012-0049,,,The relationship between transformational lead...,ar,Article,Keung E.K.,,,...,836-854,,,41,0,,,,,
675,2-s2.0-84879053585,10.1080/13504622.2012.695013,,,Use of self-determination theory to support ba...,ar,Article,Karaarslan G.,,Aǧrı İbrahim Çeçen Üniversitesi;Middle East Te...,...,342-369,,,10,0,,,,,
676,2-s2.0-84874506315,10.1177/0162243912470726,,,Justice as Measure of Nongovernmental Organiza...,ar,Article,Allen B.,,Virginia Polytechnic Institute and State Unive...,...,224-249,,,14,0,,,,,


In [6]:
## 5 ## Just in case, saving the database in its original form in an Excel file format called "All_publications_Scopus.xlsx"
all_publications.index = range(1, len(all_publications) + 1)
#all_publications.to_excel("All_publications_Scopus.xlsx")

In [7]:
## 6 ## Additional collection of abstracts, keywords and all authors for the found publications
#for index, row in all_publications.iterrows():
    #scopus_id = row["eid"]
    #try:
        #publication_info = AbstractRetrieval(scopus_id, view = "FULL")
        #all_publications.at[index, "Abstract"] = publication_info.abstract
        #all_publications.at[index, "Keywords"] = publication_info.keywords
        #all_publications.at[index, "Author(s)"] = publication_info.authors
    #except Exception as e:
        #print(f"Error retrieving information for Scopus ID {scopus_id}: {str(e)}")

#all_publications

The code in chunk 6 does not work for us, because it requires full authorization, which we, as students of the University of Bologna, do not have. However, the university has a subscription to the Scopus database, so we exported the remaining data (abstracts, keywords and all authors) manually as a ready-made CSV file. Next, we load this file and merge it with the database assembled above.

In [8]:
## 7 ## Loading the file with the additional collected data (abstracts, keywords and all authors)
additional_data = pandas.read_csv("Abstracts_Keywords_Scopus.csv")
additional_data

Unnamed: 0,Authors,Author full names,Author(s) ID,Title,Link,Affiliations,Authors with affiliations,Abstract,Author Keywords,ISSN,ISBN,CODEN,EID
0,Baratta R.; Brunetti F.; Ugolini M.M.,"Baratta, Rossella (57203115382); Brunetti, Fed...",57203115382; 7003422877; 34877845200,‘Feel at home’ on vacation: exploring homeynes...,https://www.scopus.com/inward/record.uri?eid=2...,"Department of Management, University of Bergam...","Baratta R., Department of Management, Universi...",Tourism research has increasingly highlighted ...,authenticity; destination loyalty; familiarity...,13683500,,,2-s2.0-85175026443
1,Ye X.; Ren X.; Shang Y.; Liu J.; Feng H.; Zhan...,"Ye, Xi (57734537300); Ren, Xuan (58626159100);...",57734537300; 58626159100; 58625766300; 5862603...,The role of urban green spaces in supporting a...,https://www.scopus.com/inward/record.uri?eid=2...,"Faculty of Humanities and Arts, Macau Universi...","Ye X., Faculty of Humanities and Arts, Macau U...",Purpose: Urban green spaces support people to ...,Active and healthy ageing; Environmental behav...,19387806,,,2-s2.0-85172797976
2,Syed-Abdullah S.I.S.,"Syed-Abdullah, Sharifah Intan Sharina (5721024...",57210246058,Why travel far to learn? A study of environmen...,https://www.scopus.com/inward/record.uri?eid=2...,"Faculty of Educational Studies, Universiti Put...","Syed-Abdullah S.I.S., Faculty of Educational S...",Residential outdoor environmental education (R...,education for sustainable development; Outdoor...,14729679,,,2-s2.0-85147259686
3,Villani C.; Talamini G.,"Villani, Caterina (57212250029); Talamini, Gia...",57212250029; 57212255364,Making Vulnerability Invisible: The Impact of ...,https://www.scopus.com/inward/record.uri?eid=2...,"University College Dublin, Dublin, Ireland; Ci...","Villani C., University College Dublin, Dublin,...",Despite the growing body of work on how COVID-...,behavior mapping; COVID-19; environment-behavi...,0739456X,,,2-s2.0-85176372611
4,Sun Y.; Lu X.; Cui J.; Du K.; Xie S.,"Sun, Yuyu (57219130936); Lu, Xiaoxu (572191287...",57219130936; 57219128761; 55466886600; 3677424...,"Effects of vicarious experiences of nature, en...",https://www.scopus.com/inward/record.uri?eid=2...,"College of Teacher Education, Faculty of Educa...","Sun Y., College of Teacher Education, Faculty ...",This study explores the relationship between v...,adolescents; environmental attitudes; environm...,13504622,,,2-s2.0-85153501031
...,...,...,...,...,...,...,...,...,...,...,...,...,...
673,Hoffman K.,"Hoffman, Karen (55865763600)",55865763600,Unheeded Science: Taking Precaution out of Tox...,https://www.scopus.com/inward/record.uri?eid=2...,"Department of Sociology and Anthropology, Univ...","Hoffman K., Department of Sociology and Anthro...","In the early 1970s, the idea of precaution-of ...",engagement; environmental practices; governanc...,01622439,,,2-s2.0-84884678671
674,Stanley A.,"Stanley, Anna (57223686797)",57223686797,"Natures of risk: Capital, rule, and production...",https://www.scopus.com/inward/record.uri?eid=2...,"Department of Geography, National University o...","Stanley A., Department of Geography, National ...",This purpose of this paper is to propose start...,Accumulation; Difference; Governmentality; Pro...,00167185,,,2-s2.0-84873985599
675,Kola-Olusanya A.,"Kola-Olusanya, Anthony (55151431800)",55151431800,Embedding environmental sustainability compete...,https://www.scopus.com/inward/record.uri?eid=2...,"Department of Geography, College of Management...","Kola-Olusanya A., Department of Geography, Col...",This article explores the dynamics of environm...,Environmental learning; Human resource develop...,20392117,,,2-s2.0-84892535778
676,Jin M.,"Jin, Myung (55611618800)",55611618800,Does Social Capital Promote Pro-Environmental ...,https://www.scopus.com/inward/record.uri?eid=2...,L. Douglas Wilder School of Government and Pub...,"Jin M., L. Douglas Wilder School of Government...",Nations around the globe are increasingly faci...,collaborative governance; environmental behavi...,15324265,,,2-s2.0-84876252167


In [9]:
## 8 ## Merging the two dataframes on the EID of each publication
all_publications_new = pandas.merge(all_publications
                                    , additional_data
                                    , left_on = "eid"
                                    , right_on = "EID"
                                    , how = "left")

all_publications_new

Unnamed: 0,eid,doi,pii,pubmed_id,title,subtype,subtypeDescription,creator,afid,affilname,...,Title,Link,Affiliations,Authors with affiliations,Abstract,Author Keywords,ISSN,ISBN,CODEN,EID
0,2-s2.0-85180566359,10.3390/bs13120966,,,Pro-Environmental Behavior and Climate Change ...,ar,Article,Leite Â.,,Universidade Católica Portuguesa,...,Pro-Environmental Behavior and Climate Change ...,https://www.scopus.com/inward/record.uri?eid=2...,Centre for Philosophical and Humanistic Studie...,"Leite Â., Centre for Philosophical and Humanis...",The main objective of this paper is to assess ...,climate change anxiety; climate change despair...,2076328X,,,2-s2.0-85180566359
1,2-s2.0-85178364143,10.1093/jcr/ucad016,,,Cyclical Time Is Greener: The Impact of Tempor...,ar,Article,Xu L.,,Wuhan University,...,Cyclical Time Is Greener: The Impact of Tempor...,https://www.scopus.com/inward/record.uri?eid=2...,"Research Center for Organizational Marketing, ...","Xu L., Research Center for Organizational Mark...",The natural environment is deteriorating. Howe...,cyclical time; green values; linear time; pro-...,00935301,,,2-s2.0-85178364143
2,2-s2.0-85168088015,10.1007/s13412-023-00850-9,,,Using the social identity model of pro-environ...,ar,Article,Johnson N.,,Purdue University,...,Using the social identity model of pro-environ...,https://www.scopus.com/inward/record.uri?eid=2...,"Communication and Cognition Lab, Purdue Univer...","Johnson N., Communication and Cognition Lab, P...",Solar panels promise to provide clean energy w...,Efficacy; Norms; Social identity; Solar panels...,21906483,,,2-s2.0-85168088015
3,2-s2.0-85158156830,10.1057/s41599-023-01682-2,,,The role of peers in promoting energy conserva...,ar,Article,Lin B.,,Xiamen University,...,The role of peers in promoting energy conserva...,https://www.scopus.com/inward/record.uri?eid=2...,"School of Management, China Institute for Stud...","Lin B., School of Management, China Institute ...",Guiding individuals to adopt pro-environmental...,,26629992,,,2-s2.0-85158156830
4,2-s2.0-85153196714,10.1002/bse.3428,,,The impact of a proactive environmental strate...,ar,Article,Galbreath J.,,The Faculty of Business and Law,...,The impact of a proactive environmental strate...,https://www.scopus.com/inward/record.uri?eid=2...,"School of Management and Marketing, Faculty of...","Galbreath J., School of Management and Marketi...",Demonstration of environmental sustainability ...,environmental; information; resource-based vie...,09644733,,,2-s2.0-85153196714
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
673,2-s2.0-84884678671,10.1177/0162243913495924,,,Unheeded Science: Taking Precaution out of Tox...,ar,Article,Hoffman K.,,Universidad de Puerto Rico,...,Unheeded Science: Taking Precaution out of Tox...,https://www.scopus.com/inward/record.uri?eid=2...,"Department of Sociology and Anthropology, Univ...","Hoffman K., Department of Sociology and Anthro...","In the early 1970s, the idea of precaution-of ...",engagement; environmental practices; governanc...,01622439,,,2-s2.0-84884678671
674,2-s2.0-84884519014,10.1108/JEA-04-2012-0049,,,The relationship between transformational lead...,ar,Article,Keung E.K.,,,...,The relationship between transformational lead...,https://www.scopus.com/inward/record.uri?eid=2...,"School of Education, Liberty University, Lynch...","Keung E.K.; Rockinson-Szapkiw A.J., School of ...",Purpose: The purpose of this study is to exami...,Cultural intelligence; Intercultural schools; ...,09578234,,,2-s2.0-84884519014
675,2-s2.0-84879053585,10.1080/13504622.2012.695013,,,Use of self-determination theory to support ba...,ar,Article,Karaarslan G.,,Aǧrı İbrahim Çeçen Üniversitesi;Middle East Te...,...,Use of self-determination theory to support ba...,https://www.scopus.com/inward/record.uri?eid=2...,"Department of Elementary Education, Aǧri Ibrah...","Karaarslan G., Department of Elementary Educat...","In this paper, we examine how the basic psycho...",basic psychological needs; environmental educa...,13504622,,,2-s2.0-84879053585
676,2-s2.0-84874506315,10.1177/0162243912470726,,,Justice as Measure of Nongovernmental Organiza...,ar,Article,Allen B.,,Virginia Polytechnic Institute and State Unive...,...,Justice as Measure of Nongovernmental Organiza...,https://www.scopus.com/inward/record.uri?eid=2...,"Virginia Tech, Washington, DC, United States","Allen B.L., Virginia Tech, Washington, DC, Uni...",Through exploring multiple contemporary concep...,engagement; environmental practices; ethics; i...,01622439,,,2-s2.0-84874506315
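A left merge such as the one in step 8 keeps every row of the search results, so a publication missing from the manual export simply ends up with empty abstract and keyword columns. The sketch below, on invented toy data, shows one way to count such rows using the `indicator` and `validate` parameters of `pandas.merge`; the EIDs and values are made up for illustration only.

```python
import pandas

# Toy stand-ins for all_publications and additional_data (invented values)
left = pandas.DataFrame({"eid": ["e1", "e2", "e3"], "doi": ["d1", "d2", "d3"]})
right = pandas.DataFrame({"EID": ["e1", "e3"], "Abstract": ["first", "third"]})

merged = pandas.merge(left
                      , right
                      , left_on = "eid"
                      , right_on = "EID"
                      , how = "left"
                      , indicator = True         # Adds a "_merge" column: "both" or "left_only"
                      , validate = "one_to_one") # Raises an error if an EID is repeated on either side

# Publications that received no additional data
unmatched = merged.loc[merged["_merge"] == "left_only", "eid"].tolist()
print(unmatched)
```

If the list is empty for the real data, every publication found by the search also appears in the exported file.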


In [10]:
## 9 ## Selecting necessary characteristics of publications
characteristics = ["eid"
                   , "doi"
                   , "subtypeDescription"
                   , "coverDate"
                   , "publicationName"
                   , "aggregationType"
                   , "volume"
                   , "issueIdentifier"
                   , "pageRange"
                   , "citedby_count"
                   , "openaccess"
                   , "freetoreadLabel"
                   , "Authors"
                   , "Author full names"
                   , "Author(s) ID"
                   , "Affiliations"
                   , "Authors with affiliations"
                   , "Title"
                   , "Abstract"
                   , "Author Keywords"
                   , "ISSN"
                   , "Link"]

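# Keeping only the selected columns, in the order in which they appear in the merged dataframe
selected_columns = [column for column in all_publications_new.columns if column in characteristics]
all_publications_fin = all_publications_new[selected_columns].copy()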

all_publications_fin.index = range(1, len(all_publications_fin) + 1)
all_publications_fin

Unnamed: 0,eid,doi,subtypeDescription,coverDate,publicationName,aggregationType,volume,issueIdentifier,pageRange,citedby_count,...,Authors,Author full names,Author(s) ID,Title,Link,Affiliations,Authors with affiliations,Abstract,Author Keywords,ISSN
1,2-s2.0-85180566359,10.3390/bs13120966,Article,2023-12-01,Behavioral Sciences,Journal,13,12,,0,...,Leite Â.; Lopes D.; Pereira L.,"Leite, Ângela (56653128300); Lopes, Diana (587...",56653128300; 58781680500; 58782552100,Pro-Environmental Behavior and Climate Change ...,https://www.scopus.com/inward/record.uri?eid=2...,Centre for Philosophical and Humanistic Studie...,"Leite Â., Centre for Philosophical and Humanis...",The main objective of this paper is to assess ...,climate change anxiety; climate change despair...,2076328X
2,2-s2.0-85178364143,10.1093/jcr/ucad016,Article,2023-12-01,Journal of Consumer Research,Journal,50,4,722-741,1,...,Xu L.; Zhao S.; Cotte J.; Cui N.,"Xu, Lan (57199908320); Zhao, Shuangshuang (580...",57199908320; 58020580500; 7005607623; 56235068600,Cyclical Time Is Greener: The Impact of Tempor...,https://www.scopus.com/inward/record.uri?eid=2...,"Research Center for Organizational Marketing, ...","Xu L., Research Center for Organizational Mark...",The natural environment is deteriorating. Howe...,cyclical time; green values; linear time; pro-...,00935301
3,2-s2.0-85168088015,10.1007/s13412-023-00850-9,Article,2023-12-01,Journal of Environmental Studies and Sciences,Journal,13,4,587-601,0,...,Johnson N.; Reimer T.,"Johnson, Nathanael (57471631500); Reimer, Tors...",57471631500; 8513401900,Using the social identity model of pro-environ...,https://www.scopus.com/inward/record.uri?eid=2...,"Communication and Cognition Lab, Purdue Univer...","Johnson N., Communication and Cognition Lab, P...",Solar panels promise to provide clean energy w...,Efficacy; Norms; Social identity; Solar panels...,21906483
4,2-s2.0-85158156830,10.1057/s41599-023-01682-2,Article,2023-12-01,Humanities and Social Sciences Communications,Journal,10,1,,0,...,Lin B.; Jia H.,"Lin, Boqiang (35098935000); Jia, Huanyu (57475...",35098935000; 57475702400,The role of peers in promoting energy conserva...,https://www.scopus.com/inward/record.uri?eid=2...,"School of Management, China Institute for Stud...","Lin B., School of Management, China Institute ...",Guiding individuals to adopt pro-environmental...,,26629992
5,2-s2.0-85153196714,10.1002/bse.3428,Article,2023-12-01,Business Strategy and the Environment,Journal,32,8,5420-5434,3,...,Galbreath J.; Chang C.-Y.; Tisch D.,"Galbreath, Jeremy (6602228947); Chang, Chia-Ya...",6602228947; 57223016224; 57202281632,The impact of a proactive environmental strate...,https://www.scopus.com/inward/record.uri?eid=2...,"School of Management and Marketing, Faculty of...","Galbreath J., School of Management and Marketi...",Demonstration of environmental sustainability ...,environmental; information; resource-based vie...,09644733
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
674,2-s2.0-84884678671,10.1177/0162243913495924,Article,2013-01-01,Science Technology and Human Values,Journal,38,6,829-850,6,...,Hoffman K.,"Hoffman, Karen (55865763600)",55865763600,Unheeded Science: Taking Precaution out of Tox...,https://www.scopus.com/inward/record.uri?eid=2...,"Department of Sociology and Anthropology, Univ...","Hoffman K., Department of Sociology and Anthro...","In the early 1970s, the idea of precaution-of ...",engagement; environmental practices; governanc...,01622439
675,2-s2.0-84884519014,10.1108/JEA-04-2012-0049,Article,2013-01-01,Journal of Educational Administration,Journal,51,6,836-854,41,...,Keung E.K.; Rockinson-Szapkiw A.J.,"Keung, Emerson K. (55862195200); Rockinson-Sza...",55862195200; 35194705700,The relationship between transformational lead...,https://www.scopus.com/inward/record.uri?eid=2...,"School of Education, Liberty University, Lynch...","Keung E.K.; Rockinson-Szapkiw A.J., School of ...",Purpose: The purpose of this study is to exami...,Cultural intelligence; Intercultural schools; ...,09578234
676,2-s2.0-84879053585,10.1080/13504622.2012.695013,Article,2013-01-01,Environmental Education Research,Journal,19,3,342-369,10,...,Karaarslan G.; Ertepinar H.; Sungur S.,"Karaarslan, Güliz (55766199000); Ertepinar, Ha...",55766199000; 6603281464; 55127517200,Use of self-determination theory to support ba...,https://www.scopus.com/inward/record.uri?eid=2...,"Department of Elementary Education, Aǧri Ibrah...","Karaarslan G., Department of Elementary Educat...","In this paper, we examine how the basic psycho...",basic psychological needs; environmental educa...,13504622
677,2-s2.0-84874506315,10.1177/0162243912470726,Article,2013-01-01,Science Technology and Human Values,Journal,38,2,224-249,14,...,Allen B.L.,"Allen, Barbara L. (36779333000)",36779333000,Justice as Measure of Nongovernmental Organiza...,https://www.scopus.com/inward/record.uri?eid=2...,"Virginia Tech, Washington, DC, United States","Allen B.L., Virginia Tech, Washington, DC, Uni...",Through exploring multiple contemporary concep...,engagement; environmental practices; ethics; i...,01622439


In [11]:
## 10 ## Creating new variables for analysis based on existing ones
all_publications_fin["year"] = all_publications_fin["coverDate"].apply(lambda yyyy: yyyy[:4])
all_publications_fin["year"] = all_publications_fin["year"].astype("int")
all_publications_fin[["year", "coverDate"]]

Unnamed: 0,year,coverDate
1,2023,2023-12-01
2,2023,2023-12-01
3,2023,2023-12-01
4,2023,2023-12-01
5,2023,2023-12-01
...,...,...
674,2013,2013-01-01
675,2013,2013-01-01
676,2013,2013-01-01
677,2013,2013-01-01
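Step 10 takes the year by slicing the first four characters of "coverDate", which works as long as every date is a string in the YYYY-MM-DD form. As an alternative sketch (not the approach used in this notebook), the dates can be parsed first, so that a malformed value raises an error instead of silently producing a wrong year. The two dates below are taken from the output above.

```python
import pandas

cover_dates = pandas.Series(["2023-12-01", "2013-01-01"], name = "coverDate")

# Parsing with an explicit format fails loudly on unexpected values
years = pandas.to_datetime(cover_dates, format = "%Y-%m-%d").dt.year
print(years.tolist())
```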


In [12]:
## 11 ## Just in case, saving the final Scopus database in an Excel file format called "All_publications_fin_Scopus.xlsx"
#all_publications_fin.to_excel("All_publications_fin_Scopus.xlsx")

---
## Scimago Journal Rank (SJR)

In [15]:
## 12 ## Downloading data on the ratings of scientific journals for each year in CSV file format
folder_path = "./Scimago 2013-2022 CSV"
os.makedirs(folder_path, exist_ok = True) # Does not fail if the folder already exists

for year in range(2013, 2023):
    url = f"https://www.scimagojr.com/journalrank.php?year={year}&out=xls"
    # A separate name is used so that the Scopus "response" object is not overwritten
    scimago_response = requests.get(url)

    if scimago_response.status_code == 200:
        file_name = os.path.join(folder_path, f"Journal_Rankings_{year}_Scimago.csv")
        with open(file_name, "wb") as file:
            file.write(scimago_response.content)
        print(f"File saved at: {file_name}")
    else:
        print(f"Failed to retrieve the content. Status code: {scimago_response.status_code}")

File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2013_Scimago.csv
File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2014_Scimago.csv
File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2015_Scimago.csv
File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2016_Scimago.csv
File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2017_Scimago.csv
File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2018_Scimago.csv
File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2019_Scimago.csv
File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2020_Scimago.csv
File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2021_Scimago.csv
File saved at: ./Scimago 2013-2022 CSV/Journal_Rankings_2022_Scimago.csv


At the moment, the Scimago website has journal rankings only up to 2022, so data was collected for 2013-2022 rather than for 2013-2023.

Errors occurred when reading the downloaded files, so we manually converted the CSV files to Excel format and saved them in a new folder, "Scimago 2013-2022 XLSX".
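The exact read errors are not recorded here. One possible cause, stated as an assumption rather than a diagnosis, is that the downloaded files use a semicolon as the delimiter and a comma as the decimal mark, which `pandas.read_csv` does not expect by default. The sketch below shows how such a file could be read directly; the sample text and its values are invented for illustration.

```python
import io
import pandas

# Invented sample that mimics a semicolon-delimited file with decimal commas
sample = (
    "Rank;Sourceid;Title;Type;SJR\n"
    "1;111;Journal A;journal;12,345\n"
    "2;222;Journal B;journal;9,876\n"
)

rankings = pandas.read_csv(io.StringIO(sample)
                           , sep = ";"       # Assumed delimiter
                           , decimal = ","   # Assumed decimal mark
                           , index_col = "Rank")
print(rankings)
```

If this assumption holds for the real files, passing the file path instead of `io.StringIO(sample)` would avoid the manual conversion to Excel.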

In [16]:
## 13 ## Loading data on journal rankings for each year
rankings_years = {}

for year in range(2013, 2023):
    file_path = f"./Scimago 2013-2022 XLSX/Journal_Rankings_{year}_Scimago.xlsx"
    rankings_years[year] = pandas.read_excel(file_path, index_col = 0)
    print(f"File for {year} opened successfully")

for year in rankings_years:
    print(f"\n--------------\nData for {year}:\n{rankings_years[year]}\n")

File for 2013 opened successfully
File for 2014 opened successfully
File for 2015 opened successfully
File for 2016 opened successfully
File for 2017 opened successfully
File for 2018 opened successfully
File for 2019 opened successfully
File for 2020 opened successfully
File for 2021 opened successfully
File for 2022 opened successfully

--------------
Data for 2013:
          Sourceid                                   Title     Type  \
Rank                                                                  
1            28773      Ca-A Cancer Journal for Clinicians  journal   
2            29719               Reviews of Modern Physics  journal   
3            20651             Annual Review of Immunology  journal   
4            18434                                    Cell  journal   
5            16801           Annual Review of Biochemistry  journal   
...            ...                                     ...      ...   
32588   5800212867  Zeitschrift fur Fremdsprachenforschung  j

---
## Final database

In [17]:
## 14 ## Merging data on publications and journals with journal ratings
yearly_data = []

for year in range(2013, 2023):
    rankings_data = rankings_years[year]
    one_year_data = pandas.merge(all_publications_fin[all_publications_fin["year"] == year]
                                 , rankings_data
                                 , left_on = "publicationName"
                                 , right_on = "Title"
                                 , how = "left")

    yearly_data.append(one_year_data)

# Publications from 2023 have no Scimago data yet, so they are added without journal ratings
yearly_data.append(all_publications_fin[all_publications_fin["year"] == 2023])
all_data_fin = pandas.concat(yearly_data)
all_data_fin

Unnamed: 0,eid,doi,subtypeDescription,coverDate,publicationName,aggregationType,volume,issueIdentifier,pageRange,citedby_count,...,Total Docs. (2014),Total Docs. (2015),Total Docs. (2016),Total Docs. (2017),Total Docs. (2018),Total Docs. (2019),Total Docs. (2020),Total Docs. (2021),Total Docs. (2022),Title
0,2-s2.0-84891116423,,Article,2013-12-31,International Journal of Intangible Heritage,Journal,8,,19-36,24,...,,,,,,,,,,
1,2-s2.0-84890442458,10.1007/s10745-013-9614-8,Article,2013-12-01,Human Ecology,Journal,41,6,905-914,128,...,,,,,,,,,,
2,2-s2.0-84887148964,10.1007/s10668-013-9446-0,Article,2013-12-01,"Environment, Development and Sustainability",Journal,15,6,1477-1494,84,...,,,,,,,,,,
3,2-s2.0-84897839499,10.1080/1533015X.2013.876250,Article,2013-10-01,Applied Environmental Education and Communication,Journal,12,4,224-234,18,...,,,,,,,,,,
4,2-s2.0-84884132626,10.1177/1075547012470821,Article,2013-10-01,Science Communication,Journal,35,5,572-602,5,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
109,2-s2.0-85118639900,10.1080/10509208.2021.1996310,Article,2023-01-01,Quarterly Review of Film and Video,Journal,40,2,187-214,0,...,,,,,,,,,,The Role of Forest and Environmental Conservat...
110,2-s2.0-85114873429,10.1080/02635143.2021.1978421,Article,2023-01-01,Research in Science and Technological Education,Journal,41,3,961-982,1,...,,,,,,,,,,Teaching and environmentalism: a deduction fro...
111,2-s2.0-85108819834,10.1080/14729679.2021.1935284,Article,2023-01-01,Journal of Adventure Education and Outdoor Lea...,Journal,23,1,25-37,0,...,,,,,,,,,,Autoethnographic stories for self and environm...
112,2-s2.0-85106234545,10.1080/02508281.2021.1920755,Article,2023-01-01,Tourism Recreation Research,Journal,48,3,399-418,5,...,,,,,,,,,,Environmental attitudes and behaviour of birdw...
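Two details of the merge in step 14 are worth illustrating. First, both dataframes contain a column called "Title" (the article title in the publications, the journal title in the rankings), so pandas adds the suffixes `_x` and `_y` to tell them apart. Second, matching on the journal name is exact, so differences in case or surrounding spaces prevent a match. The sketch below, on invented toy data, renames the journal title column and merges on a normalized key; the names `journal_title`, `journal_key` and `normalize` are assumptions made for this example.

```python
import pandas

# Toy stand-ins for all_publications_fin and one year of rankings (invented values)
publications = pandas.DataFrame({"eid": ["e1", "e2"]
                                 , "publicationName": ["Human Ecology", "science communication "]
                                 , "Title": ["Article one", "Article two"]})
rankings = pandas.DataFrame({"Title": ["Human Ecology", "Science Communication"]
                             , "SJR": [1.0, 2.0]})

def normalize(names):
    # Removes surrounding spaces and ignores case when comparing journal names
    return names.str.strip().str.casefold()

publications["journal_key"] = normalize(publications["publicationName"])
rankings = rankings.rename(columns = {"Title": "journal_title"})
rankings["journal_key"] = normalize(rankings["journal_title"])

merged = publications.merge(rankings, on = "journal_key", how = "left")
print(merged[["eid", "Title", "journal_title", "SJR"]])
```

With the rename, "Title" keeps its meaning (the article title) in every year, including years that are added without rankings.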


In [18]:
## 15 ## Just in case, deleting duplicate publications (rows with the same EID), keeping the first occurrence
all_data_fin = all_data_fin.drop_duplicates("eid")
all_data_fin

Unnamed: 0,eid,doi,subtypeDescription,coverDate,publicationName,aggregationType,volume,issueIdentifier,pageRange,citedby_count,...,Total Docs. (2014),Total Docs. (2015),Total Docs. (2016),Total Docs. (2017),Total Docs. (2018),Total Docs. (2019),Total Docs. (2020),Total Docs. (2021),Total Docs. (2022),Title
0,2-s2.0-84891116423,,Article,2013-12-31,International Journal of Intangible Heritage,Journal,8,,19-36,24,...,,,,,,,,,,
1,2-s2.0-84890442458,10.1007/s10745-013-9614-8,Article,2013-12-01,Human Ecology,Journal,41,6,905-914,128,...,,,,,,,,,,
2,2-s2.0-84887148964,10.1007/s10668-013-9446-0,Article,2013-12-01,"Environment, Development and Sustainability",Journal,15,6,1477-1494,84,...,,,,,,,,,,
3,2-s2.0-84897839499,10.1080/1533015X.2013.876250,Article,2013-10-01,Applied Environmental Education and Communication,Journal,12,4,224-234,18,...,,,,,,,,,,
4,2-s2.0-84884132626,10.1177/1075547012470821,Article,2013-10-01,Science Communication,Journal,35,5,572-602,5,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
109,2-s2.0-85118639900,10.1080/10509208.2021.1996310,Article,2023-01-01,Quarterly Review of Film and Video,Journal,40,2,187-214,0,...,,,,,,,,,,The Role of Forest and Environmental Conservat...
110,2-s2.0-85114873429,10.1080/02635143.2021.1978421,Article,2023-01-01,Research in Science and Technological Education,Journal,41,3,961-982,1,...,,,,,,,,,,Teaching and environmentalism: a deduction fro...
111,2-s2.0-85108819834,10.1080/14729679.2021.1935284,Article,2023-01-01,Journal of Adventure Education and Outdoor Lea...,Journal,23,1,25-37,0,...,,,,,,,,,,Autoethnographic stories for self and environm...
112,2-s2.0-85106234545,10.1080/02508281.2021.1920755,Article,2023-01-01,Tourism Recreation Research,Journal,48,3,399-418,5,...,,,,,,,,,,Environmental attitudes and behaviour of birdw...
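Step 15 removes duplicates without reporting how many there were. Duplicated EIDs can appear after a left merge when one journal name matches more than one row in the rankings. A small sketch on invented toy data shows how to count them before dropping; note that `drop_duplicates` keeps the first matching row, which may or may not be the intended journal.

```python
import pandas

# Toy merged data: publication "e1" matched two ranking rows with the same journal name
merged = pandas.DataFrame({"eid": ["e1", "e1", "e2"]
                           , "publicationName": ["Journal A", "Journal A", "Journal B"]
                           , "Sourceid": [111, 999, 222]})

n_duplicates = int(merged.duplicated("eid").sum())
print(f"Duplicated rows: {n_duplicates}")

deduplicated = merged.drop_duplicates("eid", keep = "first")
print(deduplicated)
```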


In [19]:
## 16 ## Saving the final database in an Excel file format called "All_data_fin_Scopus+Scimago.xlsx"
all_data_fin.index = range(1, len(all_data_fin) + 1)
#all_data_fin.to_excel("All_data_fin_Scopus+Scimago.xlsx")