SAINT Transformer model

This is my solution for the Riiid knowledge tracing competition on Kaggle. Ensembled with an LGBM model, it placed 39th.

To run the code:

Download the dataset to data/raw/, either with the Kaggle API or manually. Only questions.csv and train.csv are used.
Configure the model hyperparameters and the general settings in config.yaml
Run: python src/data/validation_split.py to create the validation data
Run: python src/data/preprocess.py to preprocess the data and put the training and validation sets into the right format
Run: python src/models/train.py to train the model.
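
The three Run steps above can be chained in a small driver script. This is a minimal sketch, not part of the repository: build_commands and run_pipeline are hypothetical helpers, the train.py location under src/models/ is an assumption, and the script expects to be started from the repository root after the dataset has been downloaded.

```python
import subprocess
import sys

# Scripts named in the steps above, in the order they must run.
# Assumption: train.py lives under src/models/.
STEPS = [
    "src/data/validation_split.py",  # create the validation data
    "src/data/preprocess.py",        # build the training/validation inputs
    "src/models/train.py",           # train the model
]


def build_commands(python=sys.executable):
    """Return one argv list per pipeline step."""
    return [[python, script] for script in STEPS]


def run_pipeline(dry_run=False):
    """Run each step in order, stopping at the first failure."""
    for cmd in build_commands():
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so a failed preprocessing step never leads into training.
            subprocess.run(cmd, check=True)
```

Calling run_pipeline(dry_run=True) only prints the commands, which is a cheap way to check the paths before committing to the long preprocessing step.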

Feel free to check the source; I tried to make it as readable as possible. After training finishes, you can try the inference code with the Kaggle time series API emulator in the notebooks directory.
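
For readers unfamiliar with the time series API, the idea is that test rows arrive in batches and each batch must be predicted before the next one is released. The sketch below illustrates that contract only; it is not the emulator from the notebooks directory. The class name TimeSeriesEmulator, its methods, and the group_num and row_id field names are assumptions chosen for illustration.

```python
from itertools import groupby


class TimeSeriesEmulator:
    """Serve rows batch by batch; a new batch requires a prior predict() call."""

    def __init__(self, rows, group_key="group_num"):
        self._rows = rows              # list of dicts, sorted by group_key
        self._group_key = group_key
        self._awaiting_prediction = False
        self.predictions = {}          # row_id -> predicted probability

    def iter_test(self):
        """Yield one list of rows per group, in order."""
        for _, batch in groupby(self._rows, key=lambda r: r[self._group_key]):
            if self._awaiting_prediction:
                raise RuntimeError("call predict() before requesting the next batch")
            self._awaiting_prediction = True
            yield list(batch)

    def predict(self, batch_predictions):
        """Record predictions for the current batch and unlock the next one."""
        self.predictions.update(batch_predictions)
        self._awaiting_prediction = False


# Toy data: two batches, three rows in total.
rows = [
    {"group_num": 0, "row_id": 0, "user_id": 1, "content_id": 10},
    {"group_num": 0, "row_id": 1, "user_id": 2, "content_id": 11},
    {"group_num": 1, "row_id": 2, "user_id": 1, "content_id": 12},
]
env = TimeSeriesEmulator(rows)
for batch in env.iter_test():
    # Placeholder model: predict 0.5 for every row in the batch.
    env.predict({row["row_id"]: 0.5 for row in batch})
```

In a real inference loop the placeholder would be replaced by the trained model, which also has to update each user's history between batches.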

Note that running all the transformations on the raw data requires at least 32GB of RAM, and training takes about 15 minutes per epoch on the entire dataset. If all the data is used, training runs over 70M+ sequences.
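
One way a dataset ends up with tens of millions of training sequences is by emitting one window per interaction rather than one per user. Whether src/data/preprocess.py does exactly this is an assumption; the sketch below only illustrates the idea. The function name build_sequences, the tuple layout, and the default max_len of 100 are hypothetical.

```python
from collections import defaultdict


def build_sequences(interactions, max_len=100):
    """Yield (user_id, window) pairs, one window per interaction.

    `interactions` is an iterable of (user_id, content_id, answered_correctly)
    tuples in chronological order. Each window holds the current interaction
    plus up to `max_len - 1` earlier ones from the same user.
    """
    history = defaultdict(list)
    for user_id, content_id, correct in interactions:
        seq = history[user_id]
        seq.append((content_id, correct))
        if len(seq) > max_len:
            del seq[0]  # drop the oldest interaction to bound memory per user
        yield user_id, list(seq)  # copy, so later appends do not alter it


# Toy data: user 1 answers three questions, user 2 answers one.
data = [(1, 10, 1), (1, 11, 0), (2, 10, 1), (1, 12, 1)]
windows = list(build_sequences(data, max_len=2))
```

With this scheme the number of sequences equals the number of interactions, which is why memory and epoch time scale with the row count of train.csv rather than with the number of users.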

I have created an interactive notebook on Kaggle with a more thorough explanation, in which you can train the model and test inference on the dataset: https://www.kaggle.com/abdessalemboukil/saint-training-inference-guide-39th-solution

SAINT+ paper: https://arxiv.org/pdf/2010.12042.pdf
