Material for Sigrist (2023) - "A Comparison of Machine Learning Methods for Data with High-Cardinality Categorical Variables"

This repository contains material for reproducing the results of Sigrist (2023) - "A Comparison of Machine Learning Methods for Data with High-Cardinality Categorical Variables".

  • ETL: code for preparing the data, with instructions on where to download the original data
  • data: pre-processed data sets for modeling, included where the license of the original source permits it
  • run_experiments.R: code for running the experiments
  • results: raw results
  • tune_pars: chosen tuning parameters
  • cv_folds: sample splits used for cross-validation

[Figure: average relative difference]

See Sigrist (2023) for more information.