This repository lists common public datasets used in the educational data mining community.

ckyeungac/edm-public-dataset

edm-public-dataset

This repository lists public datasets used in the educational data mining community.


Affect-related

Knowledge Tracing

  • ASSISTments 2009 -- This dataset has been widely applied in student profile modeling, e.g., student performance prediction.
  • ASSISTments 2012 -- This dataset is similar to ASSISTments 2009, but is augmented with the predicted affective states of a student, including frustration, confusion, engagement and boredom.
  • ASSISTments 2015 -- This is the simplest dataset: it contains only the correctness label of each attempt.
  • OLI Engineering Statics - Fall 2011 -- This dataset is from a college-level engineering statics course with 189,297 trials, 333 students and 1,223 exercise tags.
  • Synthetic Data from DKT -- This dataset simulates 2000 virtual students answering 50 exercises using the item response theory model. (paper)
  • Junyi Academy Math Practicing Log -- Junyi Academy is an e-learning platform, like Khan Academy, where students can practice exercises on various subjects such as Mathematics, Biology, and Computer Science. Like ASSISTments, the dataset contains attempt, hint taken, time spent, and skill tag information for each exercise.
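
To make the "Synthetic Data from DKT" entry more concrete, here is a minimal sketch of how binary responses can be simulated with item response theory. It is not the generator used for that dataset: it assumes the one-parameter (Rasch) model, and the function name `simulate_irt_responses`, the standard-normal abilities and difficulties, and the seed are all hypothetical choices for illustration. Only the sizes (2000 students, 50 exercises) come from the description above.

```python
import math
import random


def simulate_irt_responses(n_students=2000, n_exercises=50, seed=0):
    """Simulate binary responses under a one-parameter (Rasch) IRT model.

    Assumption: abilities and difficulties are drawn from a standard
    normal distribution; the real synthetic dataset may differ.
    """
    rng = random.Random(seed)
    abilities = [rng.gauss(0.0, 1.0) for _ in range(n_students)]
    difficulties = [rng.gauss(0.0, 1.0) for _ in range(n_exercises)]

    responses = []
    for theta in abilities:
        row = []
        for b in difficulties:
            # Rasch model: P(correct) = sigmoid(ability - difficulty)
            p_correct = 1.0 / (1.0 + math.exp(-(theta - b)))
            row.append(1 if rng.random() < p_correct else 0)
        responses.append(row)
    return abilities, difficulties, responses


# One row per virtual student, one column per exercise (1 = correct).
abilities, difficulties, responses = simulate_irt_responses()
```

Each row of `responses` can be treated as one student's answer sequence, which is the kind of input knowledge tracing models consume.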

Quitting the Game

  • Physics Playground Log Data -- Physics Playground log data capture comprehensive information on student actions and game screen changes as a time series with millisecond precision. (paper)

Others

DataShop

  • DataShop -- The world's largest repository of learning interaction data.

freeCodeCamp

  • Gitter History -- An open dataset of all chat activity in freeCodeCamp's Gitter.im chatrooms.
  • Gitter Analytics -- Analyses by some freeCodeCamp students of the activity on Gitter (e.g., the number of help requests during learning).
  • New Coder Survey -- The open dataset from freeCodeCamp's surveys.

College Scorecard

  • College Scorecard data -- The data that appear on the College Scorecard, as well as supporting data on student completion, debt and repayment, earnings, and more. The files include data from 1996 through 2017 for all undergraduate degree-granting institutions of higher education.
