<a href="https://colab.research.google.com/github/mlproyecto/mappingreview/blob/main/mapping_review_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Install extra libraries
!pip install pandas matplotlib tabulate




In [2]:
import pandas as pd

# Raw URL of the CSV on GitHub
RAW_URL = "https://raw.githubusercontent.com/mlproyecto/mappingreview/main/data/CorpusMappingReview.csv"

# The file is semicolon-delimited and read as Latin-1
df = pd.read_csv(RAW_URL, sep=';', encoding='latin-1', engine='python')

# Fix the years (strip the spurious trailing zero, e.g. 20230 -> 2023)
def fix_year(v):
    try:
        s = str(int(v))
    except (TypeError, ValueError):
        s = str(v)
    if len(s) == 5 and s.endswith('0'):
        s = s[:-1]
    return int(s)

df['year'] = df['year'].apply(fix_year)

# Exclude conference reviews, if any remain
mask = df['document_type'].str.contains('conference review', case=False, na=False)
df = df[~mask].reset_index(drop=True)

print(f"Loaded {len(df)} final studies.")
df.head()


Loaded 69 final studies.


,title,author,journal,year,source,pages,volume,abstract,document_type,doi,url,selection_criteria,status,Keywords,domain_suggested
0,A Spatiotemporal Analysis of Teacher Practices...,"Karumbaiah, Shamya and Borchers, Conrad and Sh...",Lecture Notes in Computer Science (including s...,2023,Scopus,450  462,13916 LNAI,Research indicates that teachers play an activ...,Conference paper,10.1007/978-3-031-36272-9_37,https://www.scopus.com/inward/record.uri?eid=2...,The topic is related to the application of Art...,Accepted,Human-AI Partnership; Multimodality; Spatial a...,Analítica/EDM
1,Counter-factual Analysis of On-Line Math Tutor...,"Alhossaini, Maher and Aloqeely, Mohammed",IEEE,2021,Web Of Science,1063--1068,,The importance of understanding on-line tutori...,Article,10.1109/ICMLA52953.2021.00174,,The topic is related to the application of Art...,Accepted,on-line tutoringcausalitycounter-factualslow-i...,Estadística/Probabilidad
2,Examining computational thinking processes in ...,"Jiang, Shiyan and Qian, Yingxiao and Tang, Hen...",EDUCATION AND INFORMATION TECHNOLOGIES,2023,Web Of Science,4309--4333,28,As artificial intelligence (AI) technologies a...,Article,10.1007/s10639-022-11355-3,,The topic is related to the application of Art...,Accepted,AI educationData modelingComputational thinkin...,Geometría
3,An In-Depth Methodology to Predict At-Risk Lea...,"Ben Soussia, Amal and Roussanaly, Azim and Boy...",Lecture Notes in Computer Science,2021,Web Of Science,193--206,12884,"Nowadays, the concept of education for all is ...",Book,10.1007/978-3-030-86436-1_15,,The topic is related to the application of Art...,Accepted,At-risk learnersEarly predictionMethodologyLea...,Evaluación/Predicción
4,Development and Application of an Intelligent ...,"Wang, Guangming and Chen, Xia and Zhang, Dongl...",SUSTAINABILITY,2022,Web Of Science,12265,14,To improve the quality of mathematics learning...,Article,10.3390/su141912265,,The topic is related to the application of Art...,Accepted,mathematics learning strategies; intelligent a...,Evaluación/Predicción
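The `fix_year` helper above raises an error if a year is missing or not numeric. The following is a minimal sketch of a more defensive variant, assuming that such values should become `None` instead of stopping the run. The sample values are made up for illustration and do not come from the corpus.

```python
import math

def fix_year(value):
    """Return a 4-digit year as int, or None when the value cannot be parsed."""
    # Treat missing values (None or NaN) as unknown years.
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    try:
        text = str(int(float(value)))
    except (TypeError, ValueError, OverflowError):
        text = str(value).strip()
    # Strip the spurious trailing zero, e.g. "20210" -> "2021".
    if len(text) == 5 and text.endswith("0"):
        text = text[:-1]
    return int(text) if text.isdigit() else None

# Hypothetical inputs covering the cases handled above.
samples = [2023, "20210", 20200.0, "2019", float("nan"), "n/a"]
cleaned = [fix_year(v) for v in samples]
print(cleaned)
```

With pandas, the result could then be stored in a nullable integer column, for example `df['year'].apply(fix_year).astype('Int64')`.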


**MQ1: Publications per year**

In [7]:
pub_por_anyo = df['year'].value_counts().sort_index()
display(pub_por_anyo.to_frame("Studies"))


year,Studies
2015,4
2017,3
2018,3
2019,5
2020,11
2021,9
2022,15
2023,19
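The table has no row for 2016, because `value_counts()` only lists years that occur in the data. The sketch below shows one way to make such gaps explicit by reindexing over the full year range. It rebuilds the series from the counts printed above, so it runs without the CSV; the names `per_year` and `cumulative` are illustrative.

```python
import pandas as pd

# Counts copied from the MQ1 output above.
pub_por_anyo = pd.Series(
    {2015: 4, 2017: 3, 2018: 3, 2019: 5, 2020: 11, 2021: 9, 2022: 15, 2023: 19},
    name="Studies",
)

# Reindex over every year in the covered range; absent years get a count of 0.
first, last = int(pub_por_anyo.index.min()), int(pub_por_anyo.index.max())
per_year = pub_por_anyo.reindex(range(first, last + 1), fill_value=0)

# Running total, useful for a cumulative-growth view of the corpus.
cumulative = per_year.cumsum()
print(per_year.to_frame().assign(Cumulative=cumulative))
```

Filling the gap matters mostly for line or bar charts, where a missing year would otherwise be silently skipped on the axis.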


**MQ2: Most active authors**

In [4]:
import re
from collections import Counter

# Count how often each author name appears across the corpus.
# Author fields separate names with ';' or ' and '.
cnt = Counter()
for text in df['author'].dropna():
    for a in re.split(r';\s*| and ', text):
        a = a.strip()
        if a:  # skip empty fragments left by trailing separators
            cnt[a] += 1

top10 = cnt.most_common(10)
print(pd.DataFrame(top10, columns=["Author","Studies"]))


            Author  Studies
0    Jiang, Shiyan        3
1  Wang, Guangming        3
2    Kang, Yueyuan        3
3    Rummel, Nikol        2
4        Chao, Jie        2
5  Finzer, William        2
6        Chen, Xia        2
7    Zhang, Dongli        2
8       Wang, Fang        2
9       Su, Mingyu        2
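The counting step depends on how author strings are split. The sketch below isolates that step in a hypothetical helper, `split_authors`, which also collapses repeated whitespace and counts each author at most once per study. The sample records are made-up strings that reuse names from the corpus; they are not actual rows.

```python
import re
from collections import Counter

def split_authors(text):
    """Split a 'Last, First and Last, First' (or ';'-separated) string into names."""
    parts = re.split(r"\s*;\s*|\s+and\s+", text)
    # Collapse internal whitespace and drop empty fragments.
    return [" ".join(p.split()) for p in parts if p.strip()]

# Illustrative author fields, one per study.
records = [
    "Jiang, Shiyan and Qian, Yingxiao",
    "Wang, Guangming and  Chen, Xia;",
    "Jiang, Shiyan",
]

cnt = Counter()
for text in records:
    cnt.update(set(split_authors(text)))  # count each author once per study

print(cnt.most_common(3))
```

Splitting on `\s+and\s+` rather than a bare `and` keeps surnames such as "Anderson" intact, since the separator must be surrounded by whitespace.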


**MQ3: Document type**

In [6]:
display(df['document_type'].value_counts().to_frame("Studies"))


document_type,Studies
Conference paper,33
Article,32
Book chapter,3
Book,1


**MQ4: Publication venues**

In [8]:
display(df['journal'].value_counts().head(10).to_frame("Studies"))


journal,Studies
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics),5
IEEE,4
ACM International Conference Proceeding Series,3
CEUR Workshop Proceedings,3
"International Conference on Computer Supported Education, CSEDU - Proceedings",2
Artificial Intelligence in Education and Teaching Assessment,2
IEEE Transactions on Learning Technologies,2
Sustainability (Switzerland),2
COMPUTATIONAL MECHANICS,1
SUSTAINABILITY,1
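The same venue can appear under several spellings, for example `Sustainability (Switzerland)` and `SUSTAINABILITY` in the table above, which splits its count across rows. The sketch below shows one possible normalization, assuming that case differences and a trailing parenthetical qualifier do not distinguish venues. That assumption is a heuristic and should be checked against the full list before it is applied. `normalize_venue` is a hypothetical helper, and the sample list reuses a few names from the table.

```python
import re
from collections import Counter

def normalize_venue(name):
    """Drop a trailing parenthetical qualifier, collapse whitespace, and case-fold."""
    name = re.sub(r"\s*\([^)]*\)\s*$", "", name)
    return " ".join(name.split()).casefold()

# A few venue strings as they appear in the MQ4 table.
venues = [
    "Sustainability (Switzerland)",
    "Sustainability (Switzerland)",
    "SUSTAINABILITY",
    "IEEE Transactions on Learning Technologies",
    "IEEE",
]

counts = Counter(normalize_venue(v) for v in venues)
print(counts.most_common())
```

On the DataFrame, the same idea would be `df['journal'].dropna().map(normalize_venue).value_counts()`.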


**MQ5: Source databases**

In [9]:
display(df['source'].value_counts().to_frame("Studies"))


source,Studies
Scopus,53
Web Of Science,16


**MQ6: Keywords**

In [10]:
import re
from collections import Counter

# Count keywords case-insensitively; keyword fields are split on ';' or ','
kw = Counter()
for text in df['Keywords'].dropna():
    for k in re.split(r';|,', text):
        k = k.strip().lower()
        if k:  # skip empty fragments
            kw[k] += 1

topkw = kw.most_common(10)
df_kw = pd.DataFrame(topkw, columns=["Keyword","Studies"])
# Share of the corpus, as a whole-number percentage
df_kw["%"] = (df_kw["Studies"]/len(df)*100).round().astype(int).astype(str)+"%"
display(df_kw)


,Keyword,Studies,%
0,machine learning,17,25%
1,artificial intelligence,7,10%
2,education,5,7%
3,mathematics,3,4%
4,educational data mining,3,4%
5,intelligent tutoring systems,3,4%
6,mathematics learning strategies,2,3%
7,intelligent assessment and strategy implementa...,2,3%
8,high school students,2,3%
9,intelligent diagnosis,2,3%
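The percentage column divides keyword occurrences by the number of studies, so it reads as a share of studies only if each keyword is counted at most once per study. The sketch below shows that variant with a hypothetical helper, `keyword_set`. The keyword fields are made-up examples, not rows from the corpus.

```python
import re
from collections import Counter

def keyword_set(text):
    """Return the distinct, lower-cased keywords of one study."""
    return {k.strip().lower() for k in re.split(r"[;,]", text) if k.strip()}

# Illustrative keyword fields, one per study.
keyword_fields = [
    "Machine learning; Education; machine learning",
    "Artificial intelligence, Machine Learning",
    "Education;",
]

kw = Counter()
for text in keyword_fields:
    kw.update(keyword_set(text))  # each keyword counts once per study

# Share of studies mentioning each keyword, as a whole-number percentage.
share = {k: round(100 * n / len(keyword_fields)) for k, n in kw.items()}
print(share)
```

Splitting on `;` and `,` cannot separate keywords that arrive concatenated without any delimiter, as some fields in the preview table do; those would need to be repaired at the source.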


**MQ7: Thematic domains**

In [11]:
display(df['domain_suggested'].value_counts().to_frame("Studies"))


domain_suggested,Studies
Evaluación/Predicción,15
Pensamiento Computacional,14
Didactic Activities,9
Estadística/Probabilidad,8
Álgebra,7
Tutoría Inteligente,6
Analítica/EDM,3
Geometría,2
Cálculo,2
Teaching problem solving,2
