
This repository contains the data used in MTMTE2.

popular-23-24-mtmte1.csv is the "Popular Searches" dataset with 4847 rows, limited to match the date scope of the original study (Jan - Jun '23 + Nov '23 - Sep '24)
popular-23-24-top-50-mtmte1.csv is the top 50 queries from the limited Popular Searches dataset
zero-23-24-mtmte1.csv is the complete "Zero Results" dataset with 41042 rows, limited to match the date scope of the original study (Aug '23 - Sep '24)

Notes

The search string cleaned column is a cleaned version of the search string column, added to both popular-23-24.csv and zero-23-24.csv in OpenRefine. The text was cleaned and normalized by removing special characters, collapsing and trimming whitespace, and converting to lowercase, using the following GREL expression:

value.replace(/[^a-zA-Z0-9\sÀ-ÖØ-öø-ÿ]/, "")
     .replace(/\s+/, " ")
     .toLowerCase()
     .trim()
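
For readers who want to reproduce the cleaning step outside OpenRefine, for example in the accompanying notebooks, the following is a rough Python sketch of the same three steps. It is not taken from the repository: the function name `clean_search_string` is hypothetical, and the `re.ASCII` flag is an assumption made so that `\s` matches only ASCII whitespace, as it does by default in the Java regular expressions GREL relies on.

```python
import re

# Anything that is not an ASCII letter, digit, whitespace character, or a
# Latin-1 accented letter (À-Ö, Ø-ö, ø-ÿ) is treated as a special character.
# re.ASCII restricts \s to ASCII whitespace (assumption: mirrors Java's default).
SPECIAL_CHARS = re.compile(r"[^a-zA-Z0-9\sÀ-ÖØ-öø-ÿ]", re.ASCII)
WHITESPACE_RUN = re.compile(r"\s+", re.ASCII)


def clean_search_string(value: str) -> str:
    """Approximate the GREL cleaning expression for a single search string."""
    without_special = SPECIAL_CHARS.sub("", value)        # drop special characters
    collapsed = WHITESPACE_RUN.sub(" ", without_special)  # collapse whitespace runs
    return collapsed.lower().strip()                      # lowercase, then trim


if __name__ == "__main__":
    print(clean_search_string("  Café  Müller's  Books! "))
```

Applied to a whole column, this could be called once per value, for instance with a list comprehension over the rows read from the CSV.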
