Official implementation of the student model used in "Deep Reinforcement Learning to Simulate, Train, and Evaluate Instructional Sequencing Policies", published at the EDM Workshop on RL4ED (RL for Education) 2021.
Fits probabilistic student models such as hotDINA_skill and hotDINA_full using PyMC3 and PyStan. Note: PyStan is faster for time-series models such as BKT and hotDINA. This repo:
- Discusses how to set up the data for RoboTutor-Analysis.
- Uses Michael Yudelson's tool hmmscalable to fit BKT params to RoboTutor's transactions data. Fits and retrieves village-wise params.
- Automatically fits the hotDINA_full and the hotDINA_skill models on a server (like PSC Bridges) and retrieves the parameters for you using PyStan. Link to the hotDINA student model paper
- To run this repo successfully, you will need to have the RoboTutor-Analysis repo cloned. The logged RoboTutor data should be stored in `RoboTutor-Analysis/Data/`.
  - First, clone the RoboTutor-Analysis project: `git clone https://github.com/jithendaraa/RoboTutor-Analysis`
  - Then, clone this project: `git clone https://github.com/jithendaraa/hotDINA`
- Setting up transactions data and retrieving the BKT parameters: Because the logged data is large, the `Data` directory in `RoboTutor-Analysis` contains only the scripts to obtain the fitted BKT params, not the parameters themselves.
  - Follow this guide to get village-specific transactions data (and the fitted BKT parameters for these 29 villages). At the end of this process, `RoboTutor-Analysis/Data` must contain 29 folders named `village_n` (n from 114 to 142), 1 hmmscalable folder, a `script.ipynb` and a `script.py`.
- Setting up activity tables, CTA and other data in `RoboTutor-Analysis/Data`
  - Navigate to the XPrize Home 2 folder and download the pristine form of the activity_table you need. Save this as `Activity_table_KCSubtest_sl2.xlsx` in the `Data` directory of `RoboTutor-Analysis`. Note: this activity table should have the last 4 columns as KC columns.
  - Download `Code Drop 2 Matrices.xlsx` and `CTA.xlsx` into the `Data` directory.
- Look at the screenshot in the 3rd point here. This is what your `Data` directory should look like after a successful setup.
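
The expected layout described in the setup steps can be checked with a short script. This is only a sketch, not part of the repo: the helper name `missing_entries` and the relative path `RoboTutor-Analysis/Data` are assumptions for illustration, and the list of expected entries is taken directly from the steps above.

```python
from pathlib import Path

# Expected layout, taken from the setup steps above.
VILLAGES = range(114, 143)  # village_114 ... village_142 (29 folders)
REQUIRED_FILES = [
    "script.ipynb",
    "script.py",
    "Activity_table_KCSubtest_sl2.xlsx",
    "Code Drop 2 Matrices.xlsx",
    "CTA.xlsx",
]
REQUIRED_DIRS = ["hmmscalable"] + [f"village_{n}" for n in VILLAGES]


def missing_entries(data_dir):
    """Return the names of expected files/folders that are absent from data_dir.

    Hypothetical helper, not provided by the repo.
    """
    data_dir = Path(data_dir)
    missing = [d for d in REQUIRED_DIRS if not (data_dir / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (data_dir / f).is_file()]
    return missing


if __name__ == "__main__":
    # Assumed location of the clone relative to the working directory.
    problems = missing_entries("RoboTutor-Analysis/Data")
    print("Setup looks complete." if not problems else f"Missing: {problems}")
```

An empty result means every folder and file named in the setup steps is present; it does not validate the contents of those files.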
The previous section covered fitting BKT parameters using hmmscalable. This section focuses on using the scripts in `/scripts` to extract data from `RoboTutor-Analysis/Data`.
- This example uses PSC Bridges as the server but could be extended to any other supercomputing resource.
- In this project, create a `passwords.py` with the following contents:

      PASSWORD = {
          "PSC": "your_psc_password",
          "XSEDE": "your_xsede_password"
      }
      PORT_NUM = xxxx        # the port number you want to use for ssh
      SERVER = "xxxx.edu"    # as in: ssh username@SERVER -p PORT_NUM
      USERNAME = "xxxx"      # as in: ssh USERNAME@SERVER -p PORT_NUM

- `python get_data_for_village_n.py -v 130 -o 1200`: Extracts transactions data of a single village (130) for the first 1200 attempts. Use `-o all` to get all attempts of a student in a particular village. Output: `pickles/data/data(village_num)_(num_obs).pickle`
- `cd scripts && python get_data_for_villages.py -v 114-120 -o 1200`: Extracts transactions data for villages 114 to 120 for the first 1200 attempts. Use `-o all` to get all attempts of a student in a particular village. Output: `pickles/data/data(village_num)_(num_obs).pickle`.
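
After running the extraction scripts, the output pickles can be located and loaded along these lines. This is a sketch under assumptions: it reads the output pattern as `data<village_num>_<num_obs>.pickle` (for example `data130_1200.pickle`), which the README does not spell out, and the helper names below are hypothetical rather than part of the repo. Nothing is assumed about the structure of the pickled object.

```python
import pickle
from pathlib import Path


def parse_village_range(spec):
    """Turn '114-120' into [114, ..., 120]; a single number like '130' also works."""
    start, _, end = spec.partition("-")
    return list(range(int(start), int(end or start) + 1))


def expected_pickle_path(village, num_obs, root="pickles/data"):
    """Assumed naming: data<village_num>_<num_obs>.pickle under pickles/data/."""
    return Path(root) / f"data{village}_{num_obs}.pickle"


def load_village_data(village, num_obs, root="pickles/data"):
    """Load one extracted pickle; the object's structure is not assumed here."""
    with open(expected_pickle_path(village, num_obs, root), "rb") as fh:
        return pickle.load(fh)


if __name__ == "__main__":
    # Report which of the expected outputs for villages 114-120 exist.
    for v in parse_village_range("114-120"):
        path = expected_pickle_path(v, 1200)
        print(path, "found" if path.exists() else "not found")
```

If the actual file names differ from the assumed pattern, only `expected_pickle_path` needs to change.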