Code for "DIFER: Differentiable Automated Feature Engineering", accepted at the 1st Conference on Automated Machine Learning.
This code is implemented in PyTorch, and we have tested it under the environment settings in requirements.txt.
- data: 23/25 medium-sized datasets that can be pushed to git, along with their meta information.
- NFS_sklearn_c: the open-source implementation of "Neural Feature Search: A Neural Architecture for Automated Feature Engineering".
- autolearn: the core code for DIFER is in autolearn/feat_selection/nfo, which contains the feature optimizer in controller.py, the feature space in search_space.py, the end-to-end training process in iter_train.py, and the three forms of a feature (i.e., the original form, the parse tree, and the traversal string) in feat_tree.py.
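To make the three feature forms mentioned above more concrete, here is a minimal, self-contained sketch of how one constructed feature could be held as an expression string, a parse tree, and a traversal string. It is an illustration only and is not the code in feat_tree.py: the Node class, the OPS table, the operator names (add, mul, log), and the choice of a post-order traversal are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical operator table: name -> (arity, function).
# This is NOT DIFER's actual operator set; it only serves the example.
OPS = {
    "add": (2, lambda a, b: a + b),
    "mul": (2, lambda a, b: a * b),
    "log": (1, lambda a: math.log(abs(a) + 1.0)),
}


@dataclass(frozen=True)
class Node:
    """One node of a feature parse tree: an operator or a raw feature name."""
    token: str
    children: Tuple["Node", ...] = ()

    def to_expression(self) -> str:
        """Original (human-readable) form, e.g. 'mul(add(f1, f2), log(f3))'."""
        if not self.children:
            return self.token
        args = ", ".join(child.to_expression() for child in self.children)
        return f"{self.token}({args})"

    def to_postfix(self) -> List[str]:
        """Traversal string as a token list (post-order is assumed here)."""
        tokens: List[str] = []
        for child in self.children:
            tokens.extend(child.to_postfix())
        tokens.append(self.token)
        return tokens

    def evaluate(self, row: Dict[str, float]) -> float:
        """Compute the feature value for one sample given raw feature values."""
        if not self.children:
            return row[self.token]
        _, fn = OPS[self.token]
        return fn(*(child.evaluate(row) for child in self.children))


def from_postfix(tokens: List[str]) -> Node:
    """Rebuild the parse tree from a post-order token list using a stack."""
    stack: List[Node] = []
    for tok in tokens:
        if tok in OPS:
            arity = OPS[tok][0]
            if len(stack) < arity:
                raise ValueError(f"not enough operands for {tok!r}")
            args = tuple(stack[-arity:])
            del stack[-arity:]
            stack.append(Node(tok, args))
        else:
            stack.append(Node(tok))  # anything else is treated as a raw feature
    if len(stack) != 1:
        raise ValueError("token list does not describe a single feature")
    return stack[0]


# Example feature: (f1 + f2) * log(|f3| + 1)
tree = Node("mul", (Node("add", (Node("f1"), Node("f2"))), Node("log", (Node("f3"),))))
expression = tree.to_expression()
postfix = tree.to_postfix()
rebuilt = from_postfix(postfix)
value = tree.evaluate({"f1": 1.0, "f2": 2.0, "f3": 0.0})
```

In this sketch the tree and the token list convert into each other without loss, which is the property that lets a sequence model work on the string form while the tree form is used to compute feature values.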
We provide script files for convenience in conducting experiments.
- run_iter.sh: after specifying the dataset and the CUDA device, you can run DIFER to automate feature engineering for Random Forest.
- run_rq3.sh: the script for RQ3 in the paper.
- run_rq4_*.sh: the scripts for the different machine learning algorithms used in RQ4 in the paper.