This code takes raw input and generates aggregated, tagged data to feed into a feature selection algorithm, which is used to build detectors of student affect and behavior.
The main code is RMGenerateFeatures.java.
The code requires a single command-line argument: the folder containing the student logs. The folder must contain:
- baker_ses_classes.csv and vw_classes.csv: The raw log files.
- syncedDataAll.txt: The observation logs.
- PofJ.txt, PofG.txt, and PofS.txt: The parameters, determined by linear regression, for the calculation of contextual BKT parameters.
- generatecontextual.py and contextualfeatures.py: Copy these scripts to the log file directory.
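Because a run is long, it can help to confirm that the folder is complete before starting. The following is a minimal sketch of such a check, not part of RMGenerateFeatures.java: the class name LogDirCheck and the method missingFiles are hypothetical, and the only things taken from this README are the eight file names.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of this project): reports which of the
// required files are absent from the log folder.
class LogDirCheck {
    // File names copied from the list of required files in this README.
    static final List<String> REQUIRED = Arrays.asList(
            "baker_ses_classes.csv", "vw_classes.csv",
            "syncedDataAll.txt",
            "PofJ.txt", "PofG.txt", "PofS.txt",
            "generatecontextual.py", "contextualfeatures.py");

    // Returns the required file names that are not regular files in dir.
    static List<String> missingFiles(Path dir) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!Files.isRegularFile(dir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java LogDirCheck <log-folder>");
            System.exit(2);
        }
        List<String> missing = missingFiles(Paths.get(args[0]));
        if (missing.isEmpty()) {
            System.out.println("All required files are present.");
        } else {
            System.err.println("Missing files: " + missing);
            System.exit(1);
        }
    }
}
```

The check only looks at file names; it does not validate the contents of any file.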
The result is written to a folder called clips. It contains the data, aggregated into clips, both for all observations together and divided by label.
Note: Intermediate files are retained, and can be quite large. Upon completion of the process, the directory is likely to contain around 10 GB of files.
Note: The step of correcting the observation sync, performed by the method RMGenerateFeatures.setObs(), takes quite a long time. It's strongly recommended that this step be executed once, that the resulting file, syncedDataAll_new.txt, be renamed to syncedDataAll.txt, and that the call to setObs() then be commented out in the code.
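The rename can be done by hand. As an illustration, here is a minimal sketch that does it with java.nio.file. The class SyncFilePromoter, the method promote, and the backup name syncedDataAll_orig.txt are assumptions made for this example and are not part of the project. Keeping a backup of the uncorrected file is an added precaution, not something the code requires.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper (not part of this project): replaces syncedDataAll.txt
// with the corrected file produced by setObs().
class SyncFilePromoter {
    // Returns true if the corrected file was found and renamed,
    // false if there was nothing to do.
    static boolean promote(Path dir) throws IOException {
        Path corrected = dir.resolve("syncedDataAll_new.txt");
        Path target = dir.resolve("syncedDataAll.txt");
        if (!Files.isRegularFile(corrected)) {
            return false;
        }
        if (Files.exists(target)) {
            // Assumed backup name; keeps the uncorrected observation log.
            Files.move(target, dir.resolve("syncedDataAll_orig.txt"),
                    StandardCopyOption.REPLACE_EXISTING);
        }
        Files.move(corrected, target);
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SyncFilePromoter <log-folder>");
            System.exit(2);
        }
        boolean done = promote(Paths.get(args[0]));
        System.out.println(done
                ? "Renamed syncedDataAll_new.txt to syncedDataAll.txt."
                : "No syncedDataAll_new.txt found; nothing changed.");
    }
}
```

This handles only the rename. The call to setObs() still has to be commented out by hand in RMGenerateFeatures.java.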