# Exploratory Data Analysis with Python
<div style="
    border: 5px solid purple;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

In [1]:
import pandas as pd

<div style="
    border: 3px solid purple;
    border-radius: 8px;
    padding: 12px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
     Your job is to make sense of any given dataset and write a preliminary report.
    <ul>
      <li>What is the structure of the data?</li>
      <li>How clean is the dataset?</li>
      <li>Does it look real, or was it machine generated?</li>
      <li>Is it worth analysing further?</li>
      <li>Are there interesting insights that can already be pulled out?</li>
    </ul>
</div>

## The basics - Understanding a dataframe
<div style="
    border: 4px solid orange;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

<div style="
    border: 3px solid orange;
    border-radius: 8px;
    padding: 12px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
The pandas documentation describes a DataFrame as "Two-dimensional, size-mutable, potentially heterogeneous tabular data. Data structure also contains labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure."
Source: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html
</div>

### Building a dataframe from a dictionary
<div style="
    border: 2px solid orange;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

In [None]:
mydict = {
    "names": ["Gustavo", "Henrik", "Wanja", "Carlo", "Jannik"],
    "scores": [39, 34, 40, 49, 10],
    "fav_food": ["tacos", "pasta", "cake", "döner", "ice cream"]
}

In [None]:
#building the dataframe with the pandas DataFrame constructor
df = pd.DataFrame(mydict)
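
As a quick check of what the constructor produces, the sketch below rebuilds the same dictionary and inspects the result. Each key becomes a column label and each list becomes that column's values, so all lists need the same length. The name `df_small` is only used here for illustration, so that it does not overwrite `df`.

```python
import pandas as pd

mydict = {
    "names": ["Gustavo", "Henrik", "Wanja", "Carlo", "Jannik"],
    "scores": [39, 34, 40, 49, 10],
    "fav_food": ["tacos", "pasta", "cake", "döner", "ice cream"],
}

# df_small is an illustrative name, not part of the original notebook
df_small = pd.DataFrame(mydict)

print(df_small.shape)             # (number of rows, number of columns)
print(list(df_small.columns))     # column labels come from the dictionary keys
print(df_small["scores"].mean())  # a single column behaves like a Series
```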

### Importing data
<div style="
    border: 2px solid orange;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

In [None]:
#from a csv file
df = pd.read_csv("datasets/socialmedia_engagement.csv")

In [None]:
#from an excel file --- need to install openpyxl dependency
df = pd.read_excel("datasets/happiness_2015-2019.xlsx")
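
The file paths above only work inside this tutorial's folder. To try `pd.read_csv` without those files, the sketch below reads CSV text from memory instead. The column names and values are made up for illustration; `sep` and `na_values` are regular `read_csv` parameters for setting the delimiter and declaring extra missing-value markers.

```python
import io
import pandas as pd

# Made-up CSV content, using ";" as the delimiter and "n.a." as a missing marker
csv_text = """post_id;likes;platform
1;120;instagram
2;n.a.;twitter
3;45;instagram
"""

# io.StringIO lets read_csv treat a string as if it were a file
toy_df = pd.read_csv(io.StringIO(csv_text), sep=";", na_values=["n.a."])

print(toy_df)
print(toy_df.dtypes)  # "likes" is numeric because "n.a." was parsed as missing
```

Without `na_values`, the `likes` column would be read as text, because `"n.a."` is not among the markers pandas treats as missing by default.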

In [None]:
#from github
username = "datagus"
repository = "statstutorial2025"
directory = "week5/airbnb_europe.csv"
github_url = f"https://raw.githubusercontent.com/{username}/{repository}/main/{directory}"
df = pd.read_csv(github_url)

In [80]:
#from a google spreadsheet
gsheet_id = "1wEGvOk504_wnFlv1D9Dw8IFIAaDMtwau"
url = f"https://docs.google.com/spreadsheets/d/{gsheet_id}/export?format=xlsx"
excel = pd.ExcelFile(url)
df = excel.parse("master table")

### Inspecting the structure of a dataframe
<div style="
    border: 2px solid orange;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

In [6]:
#how many rows and columns, returned as (rows, columns)
df.shape

(892, 47)

In [8]:
#retrieving them separately
rows_num = df.shape[0]
cols_num = df.shape[1]

print(f"This dataset contains {rows_num} rows and {cols_num} columns")

This dataset contains 892 rows and 47 columns


In [9]:
#another way of getting the number of rows, using len()
len(df)

892

In [10]:
#a more detailed overview: column names, non-null counts and data types
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 892 entries, 0 to 891
Data columns (total 47 columns):
 #   Column                                                                        Non-Null Count  Dtype  
---  ------                                                                        --------------  -----  
 0   Downloaded (1/0)                                                              764 non-null    object 
 1   Authors                                                                       892 non-null    object 
 2   Author full names                                                             880 non-null    object 
 3   Author(s) ID                                                                  879 non-null    object 
 4   Title                                                                         891 non-null    object 
 5   Year                                                                          892 non-null    int64  
 6   Source title                      

In [12]:
#checking the first 5 rows
pd.set_option('display.max_columns', None) # to show all columns
#pd.reset_option('display.max_columns')
df.head(5)

Unnamed: 0,Downloaded (1/0),Authors,Author full names,Author(s) ID,Title,Year,Source title,Volume,Issue,Art. No.,Page start,Page end,Page count,Cited by,DOI,Link,Author Keywords,Index Keywords,Abstract,Document Type,Publication Stage,Open Access,Source,EID,SDG,AI (yes/no),Sustainability (yes/no),"Type of AI \ninductive, text snippet","Algorithm(s) used\ninductive, text snippet",Method (1) vs. study object (2),AI as buzzword? (0/1),"Core topic (only one)\nshort text snippet, inductive",Role of AI\ndeductive,Means (1) vs. end (2),sustainability definition,Sus_lvl,empirical/conceptual/review,"spatial scale (individual, local, regional, national, supranational, global)",snapshot in time vs. longitudinal study,"temporal scale (past, present, future)",qualitative/quantitative/mixed methods,location of the study (country),Dataset used (inductive),country of the institution of the first author,Is there an own subsection for policy recommendations? 1/0,Notes,Coding complete
0,1,Thiebes S.; Lins S.; Sunyaev A.,"Thiebes, Scott (56319399400); Lins, Sebastian ...",56319399400; 56318996100; 24779131200,Trustworthy artificial intelligence,2021,Electronic Markets,31.0,2,,447.0,464.0,17.0,167,10.1007/s12525-020-00441-4,https://www.scopus.com/inward/record.uri?eid=2...,artificial intelligence; deep learning; emotio...,,Artificial intelligence (AI) brings forth many...,Article,Final,,Scopus,2-s2.0-85148853990,16,1,1,overall,unspecified,2,0,Societal impact of AI,,2,,weak sustainability,conceptual,,,,,,,Germany,0.0,,
1,1,Ho J.-H.; Lee G.-G.; Lu M.-T.,"Ho, Juin-Hao (57218510909); Lee, Gwo-Guang (74...",57218510909; 7404852393; 55801461400,Exploring the implementation of a legal AI bot...,2020,Sustainability (Switzerland),12.0,15,5991,,,,8,10.3390/su12155991,https://www.scopus.com/inward/record.uri?eid=2...,business; digital; education; entrepreneur; In...,,This study explores the implementation of lega...,Article,Final,,Scopus,2-s2.0-85168710066,16,1,1,overall,unspecified,2,0,legal AI bot,,1,,weak sustainability,empirical,national,snaphot,present,quantitative,Taiwan,survey,Taiwan,0.0,,
2,1,Bartmann M.,"Bartmann, Marius (56512092600)",56512092600,The Ethics of AI-Powered Climate Nudging—How M...,2022,Sustainability (Switzerland),14.0,9,5153,,,,6,10.3390/su14095153,https://www.scopus.com/inward/record.uri?eid=2...,Environmental effects; Green transition; Miner...,Asia; Economic and social effects; Economics; ...,The number of areas in which artificial intell...,Article,Final,,Scopus,2-s2.0-85174445734,16,1,1,overall,unspecified,2,0,ethics of AI-based climate nudging,,1,,weak sustainability,conceptual,,,,,,,Germany,0.0,,
3,1,Halsband A.,"Halsband, Aurélie (57562370000)",57562370000,Sustainable AI and Intergenerational Justice,2022,Sustainability (Switzerland),14.0,7,3922,,,,6,10.3390/su14073922,https://www.scopus.com/inward/record.uri?eid=2...,SDGs; ecological sustainability; intergenerati...,,"Recently, attention has been drawn to the sust...",Article,Final,,Scopus,2-s2.0-85139602753,16,1,1,overall,unspecified,2,0,intergenerational justice,,1,intergenerational justice,strong sustainability,conceptual,,,,,,,Germany,0.0,,
4,1,Raman R.; Kumar Nair V.; Nedungadi P.; Ray I.;...,"Raman, Raghu (36618183700); Kumar Nair, Vinith...",36618183700; 57647914700; 36069838600; 5883060...,"Darkweb research: Past, present, and future tr...",2023,Heliyon,9.0,11,e22269,,,,1,10.1016/j.heliyon.2023.e22269,https://www.scopus.com/inward/record.uri?eid=2...,Artificial Intelligence; Climate Change; Energ...,India; artificial intelligence; climate change...,"The Darkweb, part of the deep web, can be acce...",Article,Final,,Scopus,2-s2.0-85185331385,16,1,1,nlp,unspecified,2,0,dark web,,1,SDG 16,strong sustainability,review,,,,mixed methods,,,India,1.0,,


In [13]:
#checking the last 5 rows
df.tail(5)

Unnamed: 0,Downloaded (1/0),Authors,Author full names,Author(s) ID,Title,Year,Source title,Volume,Issue,Art. No.,Page start,Page end,Page count,Cited by,DOI,Link,Author Keywords,Index Keywords,Abstract,Document Type,Publication Stage,Open Access,Source,EID,SDG,AI (yes/no),Sustainability (yes/no),"Type of AI \ninductive, text snippet","Algorithm(s) used\ninductive, text snippet",Method (1) vs. study object (2),AI as buzzword? (0/1),"Core topic (only one)\nshort text snippet, inductive",Role of AI\ndeductive,Means (1) vs. end (2),sustainability definition,Sus_lvl,empirical/conceptual/review,"spatial scale (individual, local, regional, national, supranational, global)",snapshot in time vs. longitudinal study,"temporal scale (past, present, future)",qualitative/quantitative/mixed methods,location of the study (country),Dataset used (inductive),country of the institution of the first author,Is there an own subsection for policy recommendations? 1/0,Notes,Coding complete
887,1,Zhang Y.; Ji Y.; Qian H.,"Zhang, Yang (57471054500); Ji, Yuanhui (572047...",57471054500; 57204776640; 55186013100,Progress in thermodynamic simulation and syste...,2021,Green Chemical Engineering,2.0,3,,266.0,283.0,17.0,26,10.1016/j.gce.2021.06.003,https://www.scopus.com/inward/record.uri?eid=2...,nil,nil,"Due to the shortage of fossil energy, biomass ...",Review,Final,All Open Access; Gold Open Access,Scopus,2-s2.0-85126637055,9,1,1,AI,,2,1,Sustainable utilization of biomass resources,,1,,weak,Review,,,,,,,China,0.0,,
888,1,Jang J.; Kyun S.,"Jang, Jiyoung; Kyun, Suna",,An Innovative Career Management Platform Empow...,2022,"Journal of Logistics, Informatics and Service ...",9.0,1,,274.0,290.0,16.0,7,10.33168/LISS.2022.0117,https://www.scopus.com/inward/record.uri?eid=2...,artificial intelligence; big data; blockchain;...,n.a.,With the advent of the fourth industrial revol...,Article,Final,,Scopus,2-s2.0-85128453304,5,1,1,artificial intelligence; big data;\nblockchain...,n.a.,2,0,customized career management platform for fema...,,1,"buzzword; ""sustainable career management of ta...",weak,conceptual,n.a.,n.a.,n.a.,n.a.,n.a.,n.a.,South Korea,0.0,,
889,1,Kabukye J.K.; Namugga J.; Mpamani C.J.; Katumb...,"Kabukye, Johnblack K.; Namugga, Jane; Mpamani,...",57205140187; 57201368167; 57403715300; 3623907...,Implementing Smartphone-Based Telemedicine for...,2023,Journal of Medical Internet Research,25.0,1,e45132,,,,0,10.2196/45132,https://www.scopus.com/inward/record.uri?eid=2...,cervical cancer; cervicography; digital health...,Artificial Intelligence; Early Detection of Ca...,"Background: In Uganda, cervical cancer (CaCx) ...",Article,Final,All Open Access; Gold Open Access; Green Open ...,Scopus,2-s2.0-85174748358,5,1,1,AI; supervised learning,n.a.,1,0,smartphone-based store-and-forward telemedicin...,Forecasting,1,buzzword; longevity,weak,empirical,regional,snapshot in time,present,qualitative,Uganda,n.a.,Sweden; Uganda,0.0,,
890,1,Deng M.; Liu Y.; Chen L.,"Deng, Meizhen; Liu, Yimeng; Chen, Ling",58606568400; 58605588800; 57700546300,AI-driven innovation in ethnic clothing design...,2023,Electronic Research Archive,31.0,9,,5793.0,5814.0,21.0,0,10.3934/era.2023295,https://www.scopus.com/inward/record.uri?eid=2...,Artificial Intelligence; cultural preservation...,n.a.,This study delves into the innovative applicat...,Article,Final,All Open Access; Gold Open Access; Green Open ...,Scopus,2-s2.0-85171645006,5,1,1,AI; unsupervised learning; ML; Natural Langua...,Multimodal Unsupervised Image-to-Image Transla...,2,0,application of Artificial Intelligence (AI) an...,,1,"buzzword; ""sustainable development of ethnic f...",weak,empirical,local,snapshot in time,present,mixed methods,Biasha,n.a.,China,0.0,,
891,1,Zhu X.; Yao Q.; Dai W.; Ji L.; Yao Y.; Pang B....,"Zhu, Xingce; Yao, Qiang; Dai, Wei; Ji, Lu; Yao...",58221931100; 55588525000; 56366749800; 5684438...,Cervical cancer screening aided by artificial ...,2023,Bulletin of the World Health Organization,101.0,6,,381.0,390.0,9.0,0,10.2471/BLT.22.289061,https://www.scopus.com/inward/record.uri?eid=2...,,,Objective To implement and evaluate a large-sc...,Article,Final,All Open Access; Bronze Open Access; Green Ope...,Scopus,2-s2.0-85160969682,5,1,1,Ai,n.a.,1,0,online cervical cancer screening programme usi...,Forecasting,1,buzzword,weak,empirical,regional,snapshot in time,present,quantitative,Hubei Province China,,China,0.0,,


In [16]:
#checking a random sample of 9 rows
df.sample(9)

Unnamed: 0,Downloaded (1/0),Authors,Author full names,Author(s) ID,Title,Year,Source title,Volume,Issue,Art. No.,Page start,Page end,Page count,Cited by,DOI,Link,Author Keywords,Index Keywords,Abstract,Document Type,Publication Stage,Open Access,Source,EID,SDG,AI (yes/no),Sustainability (yes/no),"Type of AI \ninductive, text snippet","Algorithm(s) used\ninductive, text snippet",Method (1) vs. study object (2),AI as buzzword? (0/1),"Core topic (only one)\nshort text snippet, inductive",Role of AI\ndeductive,Means (1) vs. end (2),sustainability definition,Sus_lvl,empirical/conceptual/review,"spatial scale (individual, local, regional, national, supranational, global)",snapshot in time vs. longitudinal study,"temporal scale (past, present, future)",qualitative/quantitative/mixed methods,location of the study (country),Dataset used (inductive),country of the institution of the first author,Is there an own subsection for policy recommendations? 1/0,Notes,Coding complete
701,,Fan D.; Su X.; Weng B.; Wang T.; Yang F.,"Fan, Dongliang (57190302955); Su, Xiaoyun (572...",57190302955; 57204881851; 57751948400; 5775091...,Research Progress on Remote Sensing Classifica...,2021,AgriEngineering,3.0,4.0,,971.0,989.0,18.0,5,10.3390/agriengineering3040061,https://www.scopus.com/inward/record.uri?eid=2...,nil,nil,Crop planting area and spatial distribution in...,Review,Final,All Open Access; Gold Open Access,Scopus,2-s2.0-85132262086,2,1,1,Machine Learning,Neural Network,1,0,Farm management,Data mining and Remote sensing,2,,Weak,Empirical,Global,Longitudinal,Present,,China,Internet,China,0.0,,
655,1.0,Donisi L.; Cesarelli G.; Pisani N.; Ponsiglion...,"Donisi, Leandro (57212085515); Cesarelli, Gius...",57212085515; 57212086815; 58033391800; 5634897...,Wearable Sensors and Artificial Intelligence f...,2022,Diagnostics,12.0,12.0,3048.0,,,,8,10.3390/diagnostics12123048,https://www.scopus.com/inward/record.uri?eid=2...,,,Physical ergonomics has established itself as ...,Review,Final,All Open Access; Gold Open Access; Green Open ...,Scopus,2-s2.0-85144871876,3,1,1,artificial intelligence; machine learning; dee...,ensemble classifiers; Support Vector Machines;...,2,0,physical ergonomics,,1,,weak sustainability,review,,,,qualitative,,,Italy,0.0,,
823,1.0,MANN S.; HUBERT M.,"MANN, SUPREET (57208395499); HUBERT, MARTIN (2...",57208395499; 26026927500,AI4D: Artificial Intelligence for Development,2020,International Journal of Communication,14.0,,,4385.0,4405.0,20.0,3,,https://www.scopus.com/inward/record.uri?eid=2...,nil,nil,We derive a conceptual bridge between technica...,Article,Final,,Scopus,2-s2.0-85099547866,18,1,1,"AI, deep learning",,2,0,Role of AI for International Development,,1,,weak,Conceptual,,,,,,,USA,0.0,,
505,1.0,Nti E.K.; Cobbina S.J.; Attafuah E.E.; Opoku E...,"Nti, Emmanuel Kwame (24338860400); Cobbina, Sa...",24338860400; 26654761400; 57501195900; 5749367...,Environmental sustainability technologies in b...,2022,Sustainable Futures,4.0,,100068.0,,,,18,10.1016/j.sftr.2022.100068,https://www.scopus.com/inward/record.uri?eid=2...,-,-,Artificial Intelligence (AI) has become an imp...,Review,Final,All Open Access; Gold Open Access,Scopus,2-s2.0-85126586294,7,1,1,overall,unspecified,1,0,AI used in sustainable technologies,all,1,AI for environmental sustainability,strong,review,n.a.,n.a.,n.a.,n.a.,n.a.,n.a.,Ghana,0.0,,
45,1.0,Kucuker D.M.; Baskent E.Z.,"Kucuker, Derya Mumcu (37112952500); Baskent, E...",37112952500; 6701704098,Impact of forest management intensity on mushr...,2017,Forest Ecology and Management,389.0,,,240.0,248.0,8.0,11,10.1016/j.foreco.2016.12.035,https://www.scopus.com/inward/record.uri?eid=2...,,,Well-researched and sound integration of non-w...,Article,Final,,Scopus,2-s2.0-85008975730,15,1,1,supervised learning,decision support system,1,0,Impact of forest management intensity on mushr...,Fast approximate simulation,1,values,medium,empirical,local,longitudinal study,future,quantitative,Turkey,"field inventory, forest stand parameters, mark...",Turkey,0.0,,1.0
564,1.0,Thangavel K.; Spiller D.; Sabatini R.; Marzocc...,"Thangavel, Kathiravan (57222007577); Spiller, ...",57222007577; 57225928954; 56962744800; 3560734...,Near Real-Time Wildfire Management Using Distr...,2023,IEEE Geoscience and Remote Sensing Letters,20.0,,5500705.0,,,,13,10.1109/LGRS.2022.3229173,https://www.scopus.com/inward/record.uri?eid=2...,Climate action (SDG-13) is an integral part of...,1-D convolutional neural network (CNN); climat...,Australia; Climate change; Data handling; Defo...,Article,Final,All Open Access; Hybrid Gold Open Access,Scopus,2-s2.0-85144780070,13,Yes,Yes,artificial intelligence,Convolutional Neural Networks (CNN),2,0,space based detection of fire by using AI with...,Data mining and remote sensing,2,SDGs,medium,empirical,,snapshot in time,present,quantitative,,PRISMA data,Australia,0.0,,
497,1.0,Do T.-T.-H.; Schnitzer H.; Le T.-H.,"Do, Thi-Thu-Huyen (56166847900); Schnitzer, Ha...",56166847900; 7006833141; 57191503667,A decision support framework considering susta...,2014,Journal of Cleaner Production,78.0,,,112.0,120.0,8.0,22,10.1016/j.jclepro.2014.04.044,https://www.scopus.com/inward/record.uri?eid=2...,-,-,A specific combination of rule-based technique...,Article,Final,,Scopus,2-s2.0-84904259753,7,1,1,Decision Tree,fuzzy analytic hierarchy process,1,0,decision support to select food processes like...,System optimization,1,,weak,empirical,local,snapshot in time,present,quantitative,Austria,database of food products and their thermal pr...,Austria,0.0,,
92,1.0,Allam Z.; Dhunny Z.A.,"Allam, Zaheer (57205544315); Dhunny, Zaynah A....",57205544315; 57205545550,"On big data, artificial intelligence and smart...",2019,Cities,89.0,,,80.0,91.0,11.0,555,10.1016/j.cities.2019.01.032,https://www.scopus.com/inward/record.uri?eid=2...,Cities are increasingly turning towards specia...,Artificial intelligence; Big data; Internet of...,artificial intelligence; conceptual framework;...,Article,Final,,Scopus,2-s2.0-85060456132,11,1,1,AI;machine learning,artifical neural networks;fuzzy networks,2,0,"smart city; big data, IoT and blockchain",System optimization,1,"potentiality of building more sustainable, saf...",strong,review,N.A.,N.A.,N.A.,N.A.,N.A.,N.A.,Australia,0.0,,
107,1.0,Zahmatkesh H.; Al-Turjman F.,"Zahmatkesh, Hadi (56541718800); Al-Turjman, Fa...",56541718800; 20336944100,Fog computing for sustainable smart cities in ...,2020,Sustainable Cities and Society,59.0,,102139.0,,,,118,10.1016/j.scs.2020.102139,https://www.scopus.com/inward/record.uri?eid=2...,"In recent decade, the number of devices involv...",Caching; Fog computing; IoT; Machine learning;...,Antennas; Artificial intelligence; Energy effi...,Article,Final,,Scopus,2-s2.0-85084182624,11,1,1,AI; ML,N.A.,2,1,"smart city; big data, IoT and blockchain",,1,Sustainability is referred to the use of renew...,strong,conceptual,N.A.,N.A.,N.A.,N.A.,N.A.,N.A.,Norway,0.0,deleted in SDG7,


In [17]:
#checking the column names
df.columns

Index(['Downloaded (1/0)', 'Authors', 'Author full names', 'Author(s) ID',
       'Title', 'Year', 'Source title', 'Volume', 'Issue', 'Art. No.',
       'Page start', 'Page end', 'Page count', 'Cited by', 'DOI', 'Link',
       'Author Keywords', 'Index Keywords', 'Abstract', 'Document Type',
       'Publication Stage', 'Open Access', 'Source', 'EID', 'SDG',
       'AI (yes/no)', 'Sustainability (yes/no)',
       'Type of AI \ninductive, text snippet',
       'Algorithm(s) used\ninductive, text snippet',
       'Method (1) vs. study object (2)', 'AI as buzzword? (0/1)',
       'Core topic (only one)\nshort text snippet, inductive',
       'Role of AI\ndeductive ', 'Means (1) vs. end (2)',
       'sustainability definition', 'Sus_lvl', 'empirical/conceptual/review',
       'spatial scale (individual, local, regional, national, supranational, global)',
       'snapshot in time vs. longitudinal study',
       'temporal scale (past, present, future)',
       'qualitative/quantitative/mixed 

In [23]:
#getting the index
df.index

RangeIndex(start=0, stop=892, step=1)

In [21]:
#getting some descriptive statistics for the numeric columns
df.describe()

Unnamed: 0,Year,Volume,Page count,Cited by,SDG,Is there an own subsection for policy recommendations? 1/0,Coding complete
count,892.0,860.0,365.0,892.0,892.0,850.0,41.0
mean,2020.880045,92.473256,15.583562,40.318386,8.165919,0.145882,1.0
std,2.754888,229.278177,8.631132,73.286039,4.722542,0.353196,0.0
min,1998.0,1.0,3.0,0.0,1.0,0.0,1.0
25%,2020.0,12.0,10.0,6.0,4.0,0.0,1.0
50%,2022.0,21.0,14.0,19.0,7.0,0.0,1.0
75%,2023.0,72.0,19.0,41.0,12.0,0.0,1.0
max,2024.0,2023.0,68.0,798.0,18.0,1.0,1.0
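
The result of `describe()` is itself a dataframe, so individual statistics can be read out with `.loc`. The sketch below uses a small made-up citation column (the name `cited_by` and its values are illustrative) to compare mean and median, a quick way to spot a skewed column such as `Cited by` above, where the mean is roughly twice the median.

```python
import pandas as pd

# Made-up values with one very large entry, to mimic a skewed column
toy = pd.DataFrame({"cited_by": [0, 6, 19, 41, 798]})

summary = toy.describe()

mean_citations = summary.loc["mean", "cited_by"]
median_citations = summary.loc["50%", "cited_by"]  # the 50% row is the median

print(f"mean: {mean_citations}, median: {median_citations}")
# A mean far above the median hints at a few very large values
print(mean_citations > median_citations)
```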


In [22]:
#getting some descriptive statistics for categories or object data types
df.describe(include="object")

Unnamed: 0,Downloaded (1/0),Authors,Author full names,Author(s) ID,Title,Source title,Issue,Art. No.,Page start,Page end,DOI,Link,Author Keywords,Index Keywords,Abstract,Document Type,Publication Stage,Open Access,Source,EID,AI (yes/no),Sustainability (yes/no),"Type of AI \ninductive, text snippet","Algorithm(s) used\ninductive, text snippet",Method (1) vs. study object (2),AI as buzzword? (0/1),"Core topic (only one)\nshort text snippet, inductive",Role of AI\ndeductive,Means (1) vs. end (2),sustainability definition,Sus_lvl,empirical/conceptual/review,"spatial scale (individual, local, regional, national, supranational, global)",snapshot in time vs. longitudinal study,"temporal scale (past, present, future)",qualitative/quantitative/mixed methods,location of the study (country),Dataset used (inductive),country of the institution of the first author,Notes
count,764,892,880,879,891,892,522,527,366,366,886,804,635,582,861,881,881,560,881,892,891,891,851,509,845,811,855,576,846,489,841,856,562,627,624,631,510,566,855,97
unique,3,868,857,853,875,444,67,509,306,327,870,789,353,298,846,2,2,8,1,873,5,7,286,361,6,3,735,48,7,223,12,18,26,16,22,20,119,312,107,41
top,1,Mhlanga D.,"Mhlanga, David (57218104204)",57218104204,A review of the Artificial Intelligence (AI) b...,Sustainability (Switzerland),1,14,1,171,10.1016/j.compag.2023.107836,https://www.scopus.com/inward/record.uri?eid=2...,nil,nil,[No abstract available],Article,Final,All Open Access; Gold Open Access,Scopus,2-s2.0-85148853990,1,1,AI,n.a.,2,0,Smart farming,System optimization,1,SDGs,weak,empirical,local,snapshot,present,quantitative,n.a.,n.a.,China,"Duplicate, deleted in SDG9"
freq,758,4,4,4,2,105,74,4,17,4,2,2,200,200,4,677,857,204,881,2,722,713,281,36,538,707,17,150,663,75,455,332,131,307,365,328,86,82,94,15


### Quality of the dataframe
<div style="
    border: 2px solid orange;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

In [24]:
#checking duplicates in the dataframe (counts rows that are identical in every column)
print(int(df.duplicated().sum()))

0
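
A count of 0 only means that no two rows are identical in every column. The same paper can still appear twice with different coding, which the repeated DOI in the `describe(include="object")` output hints at. A minimal sketch of checking duplicates on a single key column, using made-up rows and illustrative column names:

```python
import pandas as pd

# Made-up rows: the same DOI is listed under two different SDGs
papers = pd.DataFrame({
    "title": ["Paper A", "Paper B", "Paper A", "Paper C"],
    "doi": ["10.1/a", "10.1/b", "10.1/a", "10.1/c"],
    "sdg": [7, 9, 9, 4],
})

full_dupes = int(papers.duplicated().sum())               # rows identical in all columns
doi_dupes = int(papers.duplicated(subset=["doi"]).sum())  # repeated DOIs only

print(full_dupes, doi_dupes)

# keep=False marks every member of a duplicate group, which helps when inspecting them
print(papers[papers.duplicated(subset=["doi"], keep=False)])

# Keep the first occurrence of each DOI
deduped = papers.drop_duplicates(subset=["doi"], keep="first")
print(len(deduped))
```

Whether such rows should be dropped depends on the dataset; here a paper listed under two SDGs may be intentional.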


In [25]:
#how many missing values per column
df.isna().sum()

Downloaded (1/0)                                                                128
Authors                                                                           0
Author full names                                                                12
Author(s) ID                                                                     13
Title                                                                             1
Year                                                                              0
Source title                                                                      0
Volume                                                                           32
Issue                                                                           370
Art. No.                                                                        365
Page start                                                                      526
Page end                                                                    
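
Raw counts are easier to judge as a share of all rows. Since `isna()` returns booleans, taking the mean gives the fraction of missing values per column. The sketch below uses a made-up dataframe; the summary column names `n_missing` and `pct_missing` are illustrative.

```python
import numpy as np
import pandas as pd

# Made-up data with different amounts of missing values per column
toy = pd.DataFrame({
    "Title": ["a", "b", "c", "d"],
    "Issue": ["1", None, None, "4"],
    "Page start": [np.nan, np.nan, np.nan, 10.0],
})

missing = pd.DataFrame({
    "n_missing": toy.isna().sum(),
    "pct_missing": (toy.isna().mean() * 100).round(1),
}).sort_values("pct_missing", ascending=False)

print(missing)  # columns with the largest share of missing values come first
```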

In [28]:
#checking the datatypes
df.dtypes

Downloaded (1/0)                                                                 object
Authors                                                                          object
Author full names                                                                object
Author(s) ID                                                                     object
Title                                                                            object
Year                                                                              int64
Source title                                                                     object
Volume                                                                          float64
Issue                                                                            object
Art. No.                                                                         object
Page start                                                                       object
Page end                        
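
Columns that look numeric but are stored as `object`, such as `Page start` above, usually contain at least one non-numeric entry. We do not know which entries cause this in the real data; the sketch below uses made-up values to show one way of finding them with `pd.to_numeric(..., errors="coerce")`, which turns anything it cannot parse into `NaN`.

```python
import pandas as pd

# Made-up values: mostly numbers stored as text, one stray label, one missing entry
page_start = pd.Series(["447", "266", "e22269", None, "5793"],
                       name="Page start", dtype="object")

as_number = pd.to_numeric(page_start, errors="coerce")
print(as_number.dtype)

# Entries that were present but could not be converted
bad_entries = page_start[as_number.isna() & page_start.notna()]
print(bad_entries)
```

Inspecting `bad_entries` before overwriting the column avoids silently turning real information into missing values.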

In [31]:
#which unique values are in the SDG column
df["SDG"].unique()

array([16, 14, 13, 15,  8, 10,  5, 11,  6, 12,  4,  7,  1,  3,  2, 18,  9])

In [32]:
#how many unique values are in the SDG column
df["SDG"].nunique()

17

In [33]:
#getting a frequency table of SDGs
df["SDG"].value_counts()

SDG
4     100
3      96
6      91
2      91
11     86
13     79
7      75
12     71
9      62
18     48
15     30
14     18
16     13
8      11
5       8
1       8
10      5
Name: count, dtype: int64
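
Two variations of `value_counts()` are often useful at this stage: `normalize=True` returns shares instead of counts, and `sort_index()` orders the table by label instead of by frequency. A minimal sketch on a made-up series:

```python
import pandas as pd

# Made-up SDG codes
sdg = pd.Series([4, 4, 3, 4, 3, 10, 4, 2], name="SDG")

counts = sdg.value_counts()                # sorted by frequency, most common first
shares = sdg.value_counts(normalize=True)  # relative frequencies that sum to 1
by_label = counts.sort_index()             # sorted by SDG number instead

print(counts)
print(shares)
print(by_label)
```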

In [40]:
#saving the table into a variable
study_locations = df["location of the study (country)"].value_counts()

In [41]:
#converting the Series into a DataFrame
slocations_df = pd.DataFrame(study_locations)
slocations_df

Unnamed: 0_level_0,count
location of the study (country),Unnamed: 1_level_1
n.a.,86
N.A.,53
China,40
India,28
Iran,23
...,...
Denmark,1
Northern region of Colombia,1
state of Zacatecas ​in Mexico,1
"office building, Beijing, China;ISO New England",1


In [42]:
#resetting the index
slocations_df = slocations_df.reset_index(drop=False)
slocations_df

Unnamed: 0,location of the study (country),count
0,n.a.,86
1,N.A.,53
2,China,40
3,India,28
4,Iran,23
...,...,...
114,Denmark,1
115,Northern region of Colombia,1
116,state of Zacatecas ​in Mexico,1
117,"office building, Beijing, China;ISO New England",1
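
The table shows that missing locations are written in several ways (`n.a.`, `N.A.`), and other columns above also use `nil` and `-`. pandas does not treat these as missing, so they are counted as regular categories. The sketch below shows one possible way to harmonise them, under the assumption that all of these markers really mean "no value". The sample values, including the trailing space in `"China "`, are made up.

```python
import pandas as pd

# Made-up sample mixing real values with placeholder markers
locations = pd.Series(
    ["n.a.", "N.A.", "China", "nil", "India", "-", "China ", None],
    dtype="object",
)

# Assumption: these markers all stand for a missing value
placeholders = ["n.a.", "nil", "-"]

cleaned = locations.str.strip()                        # remove stray whitespace
is_placeholder = cleaned.str.lower().isin(placeholders)
cleaned = cleaned.mask(is_placeholder)                 # placeholders become NaN

print(cleaned.value_counts(dropna=False))
```

After this step `isna()` reports the placeholders as missing, and `"China"` and `"China "` are counted together.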


## Dataframe Operations
<div style="
    border: 4px solid green;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

### Modifying the index
<div style="
    border: 2px solid green;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

In [44]:
#making a copy of your dataset. Recommended whenever you modify the data, so the original df stays intact
copy_df = df.copy()

In [45]:
#putting a custom index, for example one that starts at 1 instead of 0
#the new index must have exactly as many entries as the dataframe has rows
copy_df.index = range(1, len(copy_df)+1)
copy_df.head(5)

Unnamed: 0,Downloaded (1/0),Authors,Author full names,Author(s) ID,Title,Year,Source title,Volume,Issue,Art. No.,Page start,Page end,Page count,Cited by,DOI,Link,Author Keywords,Index Keywords,Abstract,Document Type,Publication Stage,Open Access,Source,EID,SDG,AI (yes/no),Sustainability (yes/no),"Type of AI \ninductive, text snippet","Algorithm(s) used\ninductive, text snippet",Method (1) vs. study object (2),AI as buzzword? (0/1),"Core topic (only one)\nshort text snippet, inductive",Role of AI\ndeductive,Means (1) vs. end (2),sustainability definition,Sus_lvl,empirical/conceptual/review,"spatial scale (individual, local, regional, national, supranational, global)",snapshot in time vs. longitudinal study,"temporal scale (past, present, future)",qualitative/quantitative/mixed methods,location of the study (country),Dataset used (inductive),country of the institution of the first author,Is there an own subsection for policy recommendations? 1/0,Notes,Coding complete
1,1,Thiebes S.; Lins S.; Sunyaev A.,"Thiebes, Scott (56319399400); Lins, Sebastian ...",56319399400; 56318996100; 24779131200,Trustworthy artificial intelligence,2021,Electronic Markets,31.0,2,,447.0,464.0,17.0,167,10.1007/s12525-020-00441-4,https://www.scopus.com/inward/record.uri?eid=2...,artificial intelligence; deep learning; emotio...,,Artificial intelligence (AI) brings forth many...,Article,Final,,Scopus,2-s2.0-85148853990,16,1,1,overall,unspecified,2,0,Societal impact of AI,,2,,weak sustainability,conceptual,,,,,,,Germany,0.0,,
2,1,Ho J.-H.; Lee G.-G.; Lu M.-T.,"Ho, Juin-Hao (57218510909); Lee, Gwo-Guang (74...",57218510909; 7404852393; 55801461400,Exploring the implementation of a legal AI bot...,2020,Sustainability (Switzerland),12.0,15,5991,,,,8,10.3390/su12155991,https://www.scopus.com/inward/record.uri?eid=2...,business; digital; education; entrepreneur; In...,,This study explores the implementation of lega...,Article,Final,,Scopus,2-s2.0-85168710066,16,1,1,overall,unspecified,2,0,legal AI bot,,1,,weak sustainability,empirical,national,snaphot,present,quantitative,Taiwan,survey,Taiwan,0.0,,
3,1,Bartmann M.,"Bartmann, Marius (56512092600)",56512092600,The Ethics of AI-Powered Climate Nudging—How M...,2022,Sustainability (Switzerland),14.0,9,5153,,,,6,10.3390/su14095153,https://www.scopus.com/inward/record.uri?eid=2...,Environmental effects; Green transition; Miner...,Asia; Economic and social effects; Economics; ...,The number of areas in which artificial intell...,Article,Final,,Scopus,2-s2.0-85174445734,16,1,1,overall,unspecified,2,0,ethics of AI-based climate nudging,,1,,weak sustainability,conceptual,,,,,,,Germany,0.0,,
4,1,Halsband A.,"Halsband, Aurélie (57562370000)",57562370000,Sustainable AI and Intergenerational Justice,2022,Sustainability (Switzerland),14.0,7,3922,,,,6,10.3390/su14073922,https://www.scopus.com/inward/record.uri?eid=2...,SDGs; ecological sustainability; intergenerati...,,"Recently, attention has been drawn to the sust...",Article,Final,,Scopus,2-s2.0-85139602753,16,1,1,overall,unspecified,2,0,intergenerational justice,,1,intergenerational justice,strong sustainability,conceptual,,,,,,,Germany,0.0,,
5,1,Raman R.; Kumar Nair V.; Nedungadi P.; Ray I.;...,"Raman, Raghu (36618183700); Kumar Nair, Vinith...",36618183700; 57647914700; 36069838600; 5883060...,"Darkweb research: Past, present, and future tr...",2023,Heliyon,9.0,11,e22269,,,,1,10.1016/j.heliyon.2023.e22269,https://www.scopus.com/inward/record.uri?eid=2...,Artificial Intelligence; Climate Change; Energ...,India; artificial intelligence; climate change...,"The Darkweb, part of the deep web, can be acce...",Article,Final,,Scopus,2-s2.0-85185331385,16,1,1,nlp,unspecified,2,0,dark web,,1,SDG 16,strong sustainability,review,,,,mixed methods,,,India,1.0,,


In [46]:
# if you want to reset the index
copy_df = copy_df.reset_index(drop=True) #if you use drop=False the index will be a new column in your dataframe
copy_df.head()

Unnamed: 0,Downloaded (1/0),Authors,Author full names,Author(s) ID,Title,Year,Source title,Volume,Issue,Art. No.,Page start,Page end,Page count,Cited by,DOI,Link,Author Keywords,Index Keywords,Abstract,Document Type,Publication Stage,Open Access,Source,EID,SDG,AI (yes/no),Sustainability (yes/no),"Type of AI \ninductive, text snippet","Algorithm(s) used\ninductive, text snippet",Method (1) vs. study object (2),AI as buzzword? (0/1),"Core topic (only one)\nshort text snippet, inductive",Role of AI\ndeductive,Means (1) vs. end (2),sustainability definition,Sus_lvl,empirical/conceptual/review,"spatial scale (individual, local, regional, national, supranational, global)",snapshot in time vs. longitudinal study,"temporal scale (past, present, future)",qualitative/quantitative/mixed methods,location of the study (country),Dataset used (inductive),country of the institution of the first author,Is there an own subsection for policy recommendations? 1/0,Notes,Coding complete
0,1,Thiebes S.; Lins S.; Sunyaev A.,"Thiebes, Scott (56319399400); Lins, Sebastian ...",56319399400; 56318996100; 24779131200,Trustworthy artificial intelligence,2021,Electronic Markets,31.0,2,,447.0,464.0,17.0,167,10.1007/s12525-020-00441-4,https://www.scopus.com/inward/record.uri?eid=2...,artificial intelligence; deep learning; emotio...,,Artificial intelligence (AI) brings forth many...,Article,Final,,Scopus,2-s2.0-85148853990,16,1,1,overall,unspecified,2,0,Societal impact of AI,,2,,weak sustainability,conceptual,,,,,,,Germany,0.0,,
1,1,Ho J.-H.; Lee G.-G.; Lu M.-T.,"Ho, Juin-Hao (57218510909); Lee, Gwo-Guang (74...",57218510909; 7404852393; 55801461400,Exploring the implementation of a legal AI bot...,2020,Sustainability (Switzerland),12.0,15,5991,,,,8,10.3390/su12155991,https://www.scopus.com/inward/record.uri?eid=2...,business; digital; education; entrepreneur; In...,,This study explores the implementation of lega...,Article,Final,,Scopus,2-s2.0-85168710066,16,1,1,overall,unspecified,2,0,legal AI bot,,1,,weak sustainability,empirical,national,snaphot,present,quantitative,Taiwan,survey,Taiwan,0.0,,
2,1,Bartmann M.,"Bartmann, Marius (56512092600)",56512092600,The Ethics of AI-Powered Climate Nudging—How M...,2022,Sustainability (Switzerland),14.0,9,5153,,,,6,10.3390/su14095153,https://www.scopus.com/inward/record.uri?eid=2...,Environmental effects; Green transition; Miner...,Asia; Economic and social effects; Economics; ...,The number of areas in which artificial intell...,Article,Final,,Scopus,2-s2.0-85174445734,16,1,1,overall,unspecified,2,0,ethics of AI-based climate nudging,,1,,weak sustainability,conceptual,,,,,,,Germany,0.0,,
3,1,Halsband A.,"Halsband, Aurélie (57562370000)",57562370000,Sustainable AI and Intergenerational Justice,2022,Sustainability (Switzerland),14.0,7,3922,,,,6,10.3390/su14073922,https://www.scopus.com/inward/record.uri?eid=2...,SDGs; ecological sustainability; intergenerati...,,"Recently, attention has been drawn to the sust...",Article,Final,,Scopus,2-s2.0-85139602753,16,1,1,overall,unspecified,2,0,intergenerational justice,,1,intergenerational justice,strong sustainability,conceptual,,,,,,,Germany,0.0,,
4,1,Raman R.; Kumar Nair V.; Nedungadi P.; Ray I.;...,"Raman, Raghu (36618183700); Kumar Nair, Vinith...",36618183700; 57647914700; 36069838600; 5883060...,"Darkweb research: Past, present, and future tr...",2023,Heliyon,9.0,11,e22269,,,,1,10.1016/j.heliyon.2023.e22269,https://www.scopus.com/inward/record.uri?eid=2...,Artificial Intelligence; Climate Change; Energ...,India; artificial intelligence; climate change...,"The Darkweb, part of the deep web, can be acce...",Article,Final,,Scopus,2-s2.0-85185331385,16,1,1,nlp,unspecified,2,0,dark web,,1,SDG 16,strong sustainability,review,,,,mixed methods,,,India,1.0,,
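
To see the difference between `drop=True` and `drop=False` side by side, here is a minimal sketch on a small made-up table (the `toy` frame and its values are hypothetical, not part of the dataset above):

```python
import pandas as pd

# Hypothetical toy data, only for illustration
toy = pd.DataFrame({"title": ["a", "b", "c", "d"],
                    "year": [2020, 2021, 2022, 2023]})

# Filtering leaves gaps in the index: only labels 1 and 3 survive
subset = toy[toy["year"].isin([2021, 2023])]

# drop=True throws the old labels away and renumbers from 0
renumbered = subset.reset_index(drop=True)

# drop=False (the default) keeps the old labels in a new column called "index"
kept = subset.reset_index(drop=False)

print(renumbered)
print(kept)
```

Keeping the old index can be handy if you later need to trace a row back to its position in the original file.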


### Dropping rows and columns and renaming them
<div style="
    border: 2px solid green;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

In [48]:
# let's delete a single column, for example the first one
df = df.drop(columns="Downloaded (1/0)")

In [51]:
# to delete a range of columns by position (here positions 6 to 11)
df = df.drop(df.columns[6:12], axis=1)

In [56]:
# to drop several columns by name
columns_to_drop = ["Author Keywords", "Coding complete"]
df = df.drop(columns=columns_to_drop)
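
The three ways of dropping columns can be compared on a small made-up frame (the `toy` frame and the column name `not_a_column` are hypothetical). One thing to watch out for: `drop` returns a new dataframe, and positions shift after every drop, so re-running a cell that drops by position removes different columns the second time.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
toy = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "d": [7, 8]})

# Drop by name
by_name = toy.drop(columns=["a", "d"])

# Drop by position: columns at positions 1 and 2 ("b" and "c")
by_position = toy.drop(columns=toy.columns[1:3])

# errors="ignore" skips labels that do not exist instead of raising a KeyError
safe = toy.drop(columns=["a", "not_a_column"], errors="ignore")

print(by_name.columns.tolist())
print(by_position.columns.tolist())
print(safe.columns.tolist())
```

The original `toy` is unchanged in all three cases because the result was assigned to a new name.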

In [77]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 865 entries, 10 to 891
Data columns (total 38 columns):
 #   Column                                                                        Non-Null Count  Dtype  
---  ------                                                                        --------------  -----  
 0   Authors                                                                       865 non-null    object 
 1   Author full names                                                             854 non-null    object 
 2   Author(s) ID                                                                  853 non-null    object 
 3   Title                                                                         864 non-null    object 
 4   Year                                                                          865 non-null    int64  
 5   Source title                                                                  865 non-null    object 
 6   Cited by                              

In [60]:
#dropping rows based on the index
df = df.drop(df.index[0:10], axis=0)
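
Dropping rows works the same way as dropping columns. A short sketch on made-up data (the `toy` frame is hypothetical) shows dropping by position, the equivalent `iloc` slice, and dropping by index label:

```python
import pandas as pd

# Hypothetical toy data, only for illustration
toy = pd.DataFrame({"x": [10, 20, 30, 40, 50]})

# Drop the first two rows by looking up their labels through their position
trimmed = toy.drop(toy.index[0:2])

# The same result, written as "keep everything from position 2 onwards"
sliced = toy.iloc[2:]

# Drop specific rows by their index label
without_ends = toy.drop(index=[0, 4])

print(trimmed)
print(without_ends)
```

Note that the remaining rows keep their old labels, which is why `df.info()` can report an index that does not start at 0.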

In [73]:
#checking duplicated rows based on a column, for example EID
df.duplicated(subset="EID").sum()

np.int64(44)

In [84]:
# which rows are duplicated? keep=False marks every copy, not only the repeats
df.loc[df.duplicated(subset="EID", keep=False)]

Unnamed: 0,Downloaded (1/0),Authors,Author full names,Author(s) ID,Title,Year,Source title,Volume,Issue,Art. No.,Page start,Page end,Page count,Cited by,DOI,Link,Author Keywords,Index Keywords,Abstract,Document Type,Publication Stage,Open Access,Source,EID,SDG,AI (yes/no),Sustainability (yes/no),"Type of AI \ninductive, text snippet","Algorithm(s) used\ninductive, text snippet",Method (1) vs. study object (2),AI as buzzword? (0/1),"Core topic (only one)\nshort text snippet, inductive",Role of AI\ndeductive,Means (1) vs. end (2),sustainability definition,Sus_lvl,empirical/conceptual/review,"spatial scale (individual, local, regional, national, supranational, global)",snapshot in time vs. longitudinal study,"temporal scale (past, present, future)",qualitative/quantitative/mixed methods,location of the study (country),Dataset used (inductive),country of the institution of the first author,Is there an own subsection for policy recommendations? 1/0,Notes,Coding complete


In [83]:
# dropping duplicates based on a specific column
df = df.drop_duplicates(subset="EID")
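
The `keep` argument decides which copies count as duplicates. A minimal sketch on made-up data (the `toy` frame and its EIDs are hypothetical):

```python
import pandas as pd

# Hypothetical toy data: "e1" and "e2" each appear twice
toy = pd.DataFrame({
    "EID": ["e1", "e2", "e1", "e3", "e2"],
    "Title": ["A", "B", "A again", "C", "B again"],
})

# Default keep="first": only the later copies are flagged
n_repeats = toy.duplicated(subset="EID").sum()

# keep=False: every row that has a twin is flagged, useful for inspecting them
all_copies = toy[toy.duplicated(subset="EID", keep=False)]

# drop_duplicates keeps the first occurrence of each EID by default
deduped = toy.drop_duplicates(subset="EID", keep="first")

print(n_repeats)
print(all_copies)
print(deduped)
```

It is usually worth looking at `all_copies` before dropping anything, because the copies may differ in other columns.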

In [87]:
#checking missing values in the abstract column
df["Abstract"].isna()

0      False
1      False
2      False
3      False
4      False
       ...  
886    False
888    False
889    False
890    False
891    False
Name: Abstract, Length: 873, dtype: bool

In [89]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 842 entries, 0 to 841
Data columns (total 47 columns):
 #   Column                                                                        Non-Null Count  Dtype  
---  ------                                                                        --------------  -----  
 0   Downloaded (1/0)                                                              720 non-null    object 
 1   Authors                                                                       842 non-null    object 
 2   Author full names                                                             831 non-null    object 
 3   Author(s) ID                                                                  830 non-null    object 
 4   Title                                                                         841 non-null    object 
 5   Year                                                                          842 non-null    int64  
 6   Source title                      

In [88]:
# dropping all rows with a missing value in the Abstract column
df = df.dropna(axis=0, subset="Abstract")
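
Before dropping rows it helps to count how many values are missing. The following sketch uses made-up data (the `toy` frame is hypothetical) to contrast `dropna` with and without `subset`:

```python
import pandas as pd

# Hypothetical toy data, only for illustration
toy = pd.DataFrame({
    "Title": ["A", "B", "C"],
    "Abstract": ["some text", None, "more text"],
    "Notes": [None, None, "checked"],
})

# Number of missing abstracts
n_missing = toy["Abstract"].isna().sum()

# Missing values per column, a quick overview of how clean the data is
missing_per_column = toy.isna().sum()

# Only rows without an abstract are removed
cleaned = toy.dropna(subset=["Abstract"])

# Without subset, a missing value in ANY column removes the row
strict = toy.dropna()

print(missing_per_column)
print(cleaned)
print(strict)
```

On a dataset with many sparsely filled columns, calling `dropna()` without `subset` can remove almost every row.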

In [90]:
#renaming columns
df = df.rename(columns={"Is there an own subsection for policy recommendations? 1/0": "policy_recommendations"}) #yes, dictionaries!!!

In [95]:
df.columns

Index(['Downloaded (1/0)', 'Authors', 'Author full names', 'Author(s) ID',
       'Title', 'Year', 'Source title', 'Volume', 'Issue', 'Art. No.',
       'Page start', 'Page end', 'Page count', 'Cited by', 'DOI', 'Link',
       'Author Keywords', 'Index Keywords', 'Abstract', 'Document Type',
       'Publication Stage', 'Open Access', 'Source', 'EID', 'SDG',
       'AI (yes/no)', 'Sustainability (yes/no)',
       'Type of AI \ninductive, text snippet',
       'Algorithm(s) used\ninductive, text snippet',
       'Method (1) vs. study object (2)', 'AI as buzzword? (0/1)',
       'Core topic (only one)\nshort text snippet, inductive',
       'Role of AI\ndeductive ', 'Means (1) vs. end (2)',
       'sustainability definition', 'Sus_lvl', 'empirical/conceptual/review',
       'spatial scale (individual, local, regional, national, supranational, global)',
       'snapshot in time vs. longitudinal study',
       'temporal scale (past, present, future)',
       'qualitative/quantitative/mixed 

In [92]:
# you can also rename a column based on its position
columns_to_rename = {df.columns[-4]: "country_instituton"}
df = df.rename(columns=columns_to_rename)

In [94]:
#you can rename several columns at once
columns_to_rename = {df.columns[-6]: "study_location", df.columns[-5]: "dataset_used"}
df = df.rename(columns=columns_to_rename)
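
Some of the column names listed by `df.columns` contain line breaks (`\n`) and trailing spaces, which makes them hard to type. One possible way to tidy them all at once before renaming individual columns is sketched below; the `toy` frame and its single row of values are made up, only the column names are taken from the output above.

```python
import pandas as pd

# Hypothetical one-row frame that reuses three of the messy column names
toy = pd.DataFrame(
    [["nlp", "forecasting", 0]],
    columns=[
        "Type of AI \ninductive, text snippet",
        "Role of AI\ndeductive ",
        "Is there an own subsection for policy recommendations? 1/0",
    ],
)

# Collapse any run of whitespace (including line breaks) into one space,
# then remove leading and trailing spaces
toy.columns = toy.columns.str.replace(r"\s+", " ", regex=True).str.strip()

# Rename a single column with a dictionary, as before
toy = toy.rename(
    columns={"Is there an own subsection for policy recommendations? 1/0": "policy_recommendations"}
)

print(toy.columns.tolist())
```

Keys in the dictionary that do not match any column are silently ignored by `rename`, so it is worth checking `df.columns` afterwards.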

## Index and Slicing
<div style="
    border: 4px solid blue;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

In [None]:
#selecting a column
df["organization"]

In [None]:
# use double square brackets to get the result as a DataFrame instead of a Series
df[["organization"]]

In [None]:
#selecting rows by index position
df.iloc[2]

In [None]:
# or use double square brackets
df.iloc[[2]]

In [None]:
#selecting the first 20 rows with all columns
df.iloc[:20,:]

In [None]:
#selecting the first 10 rows with the first five columns
df.iloc[:10, :5]

In [None]:
#select the last 10 rows with the columns 3 to 8
df.iloc[-10:,3:9]

In [None]:
# selecting rows by index label and columns by name (with .loc both ends of the slice are included)
df.loc[5:25, ["EID", "SDG"]]
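
The difference between `iloc` (position) and `loc` (label) is easiest to see when the index does not start at 0. A minimal sketch on made-up data (the `toy` frame and its index labels are hypothetical):

```python
import pandas as pd

# Hypothetical toy data with an index that starts at 10
toy = pd.DataFrame(
    {"EID": ["a", "b", "c", "d", "e", "f"], "SDG": [16, 16, 4, 4, 3, 3]},
    index=[10, 11, 12, 13, 14, 15],
)

# iloc counts positions and excludes the end: positions 0 and 1
by_position = toy.iloc[0:2]

# loc uses labels and includes the end: labels 10, 11 and 12
by_label = toy.loc[10:12, ["EID", "SDG"]]

# Single brackets give a Series, double brackets give a DataFrame
as_series = toy["EID"]
as_frame = toy[["EID"]]

print(by_position)
print(by_label)
print(type(as_series).__name__, type(as_frame).__name__)
```

After dropping rows, positions and labels no longer match, so it matters which of the two you use.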

In [108]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 842 entries, 0 to 841
Data columns (total 47 columns):
 #   Column                                                                        Non-Null Count  Dtype  
---  ------                                                                        --------------  -----  
 0   Downloaded (1/0)                                                              720 non-null    object 
 1   Authors                                                                       842 non-null    object 
 2   Author full names                                                             831 non-null    object 
 3   Author(s) ID                                                                  830 non-null    object 
 4   Title                                                                         841 non-null    object 
 5   Year                                                                          842 non-null    int64  
 6   Source title                      

In [100]:
# conditional selection, for example all articles with fewer than 40 citations
df.loc[df["Cited by"]<40,]

617

In [107]:
# converting the SDG column to string (object dtype) first
df["SDG"] = df["SDG"].astype(str)

In [110]:
# selecting only articles from SDG 10 and 5
df.loc[df["SDG"].isin(["5","10"]),]

Unnamed: 0,Downloaded (1/0),Authors,Author full names,Author(s) ID,Title,Year,Source title,Volume,Issue,Art. No.,Page start,Page end,Page count,Cited by,DOI,Link,Author Keywords,Index Keywords,Abstract,Document Type,Publication Stage,Open Access,Source,EID,SDG,AI (yes/no),Sustainability (yes/no),"Type of AI \ninductive, text snippet","Algorithm(s) used\ninductive, text snippet",Method (1) vs. study object (2),AI as buzzword? (0/1),"Core topic (only one)\nshort text snippet, inductive",Role of AI\ndeductive,Means (1) vs. end (2),sustainability definition,Sus_lvl,empirical/conceptual/review,"spatial scale (individual, local, regional, national, supranational, global)",snapshot in time vs. longitudinal study,"temporal scale (past, present, future)",qualitative/quantitative/mixed methods,study_location,dataset_used,country_instituton,policy_recommendations,Notes,Coding complete
327,1,Abalkheel A.,"Abalkheel, Albatool (57296805400)",57300000000,AMALGAMATING BLOOM'S TAXONOMY AND ARTIFICIAL I...,2021,International Journal of English Language and ...,11.0,1,,16,30,14.0,12,10.18488/5019.v11i1.4409,,Artificial intelligence; Bloom's taxonomy; COV...,,"During the coronavirus pandemic, remote learni...",Article,Final,All Open Access; Bronze Open Access,Scopus,2-s2.0-85124629478,4,yes,yes,natural language processing,,2,0,potential of AI and learning taxonomy on EFL f...,system optimization?,1,"SDG 4, lifelong automated learning, accessibility",weak,review,,,,,,,Saudi Arabia,0.0,,
328,1,Aguilar-Esteva V.; Acosta-Banda A.; Carreño Ag...,"Aguilar-Esteva, Verónica (57222181545); Acosta...",57222181545; 57222189876; 56258612100; 5625923...,Sustainable Social Development through the Use...,2023,Sustainability (Switzerland),15.0,8,6498,,,,4,10.3390/su15086498,,artificial intelligence; big data; data scienc...,artificial intelligence; education; learning; ...,"In this paper, we aimed to investigate how sus...",Review,Final,All Open Access; Gold Open Access; Green Open ...,Scopus,2-s2.0-85156127798,4,yes,yes,"stays general, examples",,2,0,Higher Education Innovations,system optimization?,1,"""Therefore, sustainable social development is ...",weak,review,,,,,,,Mexico,0.0,,
329,1,Alhazmi S.; Khan S.; Syed M.H.,"Alhazmi, Samah (57209568391); Khan, Shahnawaz ...",57209568391; 57203386251; 36011645900,"Learning-Related Sentiment Detection, Classifi...",2023,Intelligent Automation and Soft Computing,36.0,3,,3487,3499,12.0,1,10.32604/iasc.2023.036297,,AI modeling; deep learning; optimization; sent...,,Quality education is one of the primary object...,Article,Final,All Open Access; Hybrid Gold Open Access,Scopus,2-s2.0-85150761407,4,yes,yes,semi-supervised learning; transfer learning,XLNet neural network model; Adam optimizer,1,0,Using AI to analyze learner sentiment for impr...,data mining and remote sensing,1,,weak,empirical,,snapshot,present,quantitative,,own dataset (1188 data points),Saudi Arabia,0.0,,
330,1,Ally M.; Perris K.,"Ally, Mohamed (7003506405); Perris, Kirk (6506...",7003506405; 6506383188,Artificial Intelligence in the Fourth Industri...,2022,Canadian Journal of Learning and Technology,48.0,4,,,,,0,10.21432/cjlt28287,,4IR; Artificial intelligence; Fourth Industria...,,There has been increasing interest in the use ...,Article,Final,All Open Access; Gold Open Access; Green Open ...,Scopus,2-s2.0-85143217815,4,yes,yes,"stays general, ML",,2,0,role of AI (4IR) in Education and Sustainable ...,system optimization?,1,"""Ultimately the intent is for all sectors of s...",strong,review,,,,,,,Canada,0.0,,
331,1,Alshahrani A.,"Alshahrani, Ali (57190291747)",57200000000,The impact of ChatGPT on blended learning: Cur...,2023,International Journal of Data and Network Science,7.0,4,,2029,2040,11.0,12,10.5267/j.ijdns.2023.6.010,,Artificial Intelligence; Blended Learning; Cha...,,Designing sustainable and scalable educational...,Article,Final,All Open Access; Gold Open Access,Scopus,2-s2.0-85168959579,4,yes,yes,"natural language processing, ML (ChatGPT)",,2,0,potential of AI-powered tools to improve the s...,increase effectiveness of blended learning,1,"""Sustainable development encompasses three dim...",strong,review,,,,,,,Saudi Arabia,1.0,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
641,1,Salem K.S.; Clayson K.; Salas M.; Haque N.; Ra...,,,A critical review of existing and emerging tec...,2023,Matter,,,,,,,3,10.1016/j.matt.2023.08.003,,,,Solid waste generation and its accumulation is...,,,,,2-s2.0-85172726264,3,1,1,deep learning; AI; machine learning; neural ne...,,1,0,solid waste management,,1,,weak sustainability,conceptual,,,,,,,USA,0.0,,
642,1,Suha S.A.; Sanam T.F.,"Suha, Sayma Alam (57202155723); Sanam, Tahsina...",57202155723; 54974102000,Exploring dominant factors for ensuring the su...,2023,International Journal of Information Managemen...,3.0,1,100170,,,,4,10.1016/j.jjimei.2023.100170,https://www.scopus.com/inward/record.uri?eid=2...,,,Healthcare decision-making is a complicated as...,Article,Final,All Open Access; Gold Open Access,Scopus,2-s2.0-85150763307,3,1,1,AI,,2,0,Healthcare decision making,,1,three pillars; Brundtland,,empirical,national,snapshot,present,quantitative,Bangladesh,survey,Bangladesh,0.0,,
643,1,Manoj Kumar M.V.; Sastry N.K.B.; Moonesar I.A....,"Manoj Kumar, M.V. (57218772075); Sastry, Nanda...",57218772075; 37052785000; 57200596537; 5545360...,Predicting Universal Healthcare Through Health...,2022,Frontiers in Artificial Intelligence,5.0,,887225,,,,4,10.3389/frai.2022.887225,https://www.scopus.com/inward/record.uri?eid=2...,,,The majority of the world's population is stil...,Article,Final,All Open Access; Gold Open Access; Green Open ...,Scopus,2-s2.0-85130230486,3,1,1,Machine Learning,ML Random Forest Tree method,1,0,Universal Health Coverage,Fast approximate simulation,1,SDGs,strong sustainability,empirical,national,Longitudinal,present,quantitative,Brazil; Russia; India; China; South Africa; UK...,WHO data; World Bank data,India,0.0,,
644,1,Yao K.-C.; Hsueh H.-W.; Huang M.-H.; Wu T.-C.,"Yao, Kai-Chao (23399061700); Hsueh, Hsiu-Wen (...",23399061700; 57206857155; 56301257600; 5720407...,The Role of GARCH Effect on the Prediction of ...,2022,Sustainability (Switzerland),14.0,8,4459,,,,4,10.3390/su14084459,https://www.scopus.com/inward/record.uri?eid=2...,,,Air pollution prediction is an important issue...,Article,Final,All Open Access; Gold Open Access,Scopus,2-s2.0-85128475035,3,1,1,Machine Learning,GA-SVM (genetic algorithm support vector machi...,1,0,Air Pollution,Forecasting,1,,weak sustainability,empirical,Regional,Longitudinal,present,quantitative,Taiwan,hard sensors,Taiwan,0.0,,
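
Conditions can also be combined. The sketch below uses made-up data (the `toy` frame and its values are hypothetical) and shows `&` (and) and `|` (or); each condition needs its own pair of parentheses.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Cited by": [167, 8, 6, 45],
    "SDG": ["16", "5", "10", "4"],
})

# One condition: more than 40 citations
highly_cited = toy.loc[toy["Cited by"] > 40]

# Membership test: SDG is 5 or 10 (strings, because the column holds strings)
in_sdgs = toy.loc[toy["SDG"].isin(["5", "10"])]

# AND: in SDG 5 or 10 and at least 8 citations
both = toy.loc[(toy["SDG"].isin(["5", "10"])) & (toy["Cited by"] >= 8)]

# OR: more than 100 citations or in SDG 5 or 10
either = toy.loc[(toy["Cited by"] > 100) | (toy["SDG"].isin(["5", "10"]))]

# len() gives the number of matching rows
print(len(highly_cited), len(in_sdgs), len(both), len(either))
```

Wrapping a selection in `len()` is a quick way to count how many rows match without printing the whole table.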


## Plotting
<div style="
    border: 4px solid red;
    border-radius: 8px;
    padding: 0px;
    margin: 10px 0;
    background-color: inherit;
    color: inherit;
">
</div>

In [None]:
#!conda install -y packagename
import seaborn as sns
import matplotlib.pyplot as plt

In [None]:
df_counts
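
The bar plot in this section expects a `df_counts` table with a `Platform` and a `count` column, which is not built here. One possible way to build such a table from raw rows is sketched below; the `posts` frame and its values are made up, and how the real `df_counts` was created is an assumption.

```python
import pandas as pd

# Hypothetical raw data: one row per observation
posts = pd.DataFrame(
    {"Platform": ["Web", "Mobile", "Web", "Desktop", "Web", "Mobile"]}
)

# Count the rows per platform and turn the result into a two-column dataframe
df_counts = (
    posts.groupby("Platform")
    .size()
    .reset_index(name="count")
    .sort_values("count", ascending=False)
)

print(df_counts)
```

Sorting by `count` before plotting puts the tallest bar first, which usually makes a bar plot easier to read.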

In [None]:
# Create bar plot
plt.bar("Platform", "count", data=df_counts)

# Add labels
plt.title("Basic Bar Plot")
plt.xlabel("Platform")
plt.ylabel("count")

# Show the plot
plt.show()