gbfs is a repository for Graph-Based Feature Selection methods in machine learning. It contains two methods, GB-AFS and GB-BC-FS, each of which takes a graph-based approach to feature selection.
- GB-AFS (Graph-Based Automatic Feature Selection): A method that automates the process of feature selection for multi-class classification tasks, ensuring the minimal yet most effective set of features is utilized for model training.
- GB-BC-FS (Graph-Based Budget-Constrained Feature Selection): Currently in development, this method seeks to enhance feature selection by integrating budget constraints, ensuring the cost of each feature is considered.
gbfs has been tested with Python 3.10.
pip
$ pip install gbfs
Clone from GitHub
$ git clone https://github.com/davidlevinwork/gbfs.git && cd gbfs
$ poetry install
$ poetry shell
To begin working with GB-AFS, the first step is to initialize the GB-AFS object:
from gbfs import GBAFS
gbafs = GBAFS(
    dataset_path="path/to/your/dataset.csv",
    separability_metric="your_separability_metric",
    dim_reducer_model="your_dimensionality_reduction_method",
    label_column="class",
)
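The `dataset_path` and `label_column` arguments suggest a CSV file in which one column holds the class label. The sketch below writes a small synthetic file of that shape using only the standard library. The exact layout gbfs expects (header row, numeric feature columns, label values) is an assumption here, and `write_toy_dataset`, the `f0`..`f3` column names, and the file name are hypothetical, so check the documentation before relying on it.

```python
import csv
import os
import random
import tempfile


def write_toy_dataset(path, n_rows=30, n_features=4, label_column="class", seed=0):
    """Write a synthetic CSV: numeric feature columns followed by one label column.

    The layout is an assumption about what gbfs reads, not a documented format.
    """
    rng = random.Random(seed)
    header = [f"f{i}" for i in range(n_features)] + [label_column]
    with open(path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(header)
        for r in range(n_rows):
            label = r % 3  # three classes, since GB-AFS targets multi-class tasks
            # Centre each feature on the label value so the classes differ.
            features = [round(rng.gauss(label, 1.0), 4) for _ in range(n_features)]
            writer.writerow(features + [label])
    return header


toy_path = os.path.join(tempfile.gettempdir(), "toy_dataset.csv")
toy_header = write_toy_dataset(toy_path)
print(toy_header)
```

The resulting path could then be passed as `dataset_path`, with `label_column="class"` matching the last column.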
After initializing the GB-AFS object, you can move forward with the process of selecting features:
selected_features = gbafs.select_features()
print("Selected Feature Indices:", selected_features)
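Once you have the selected indices, a common next step is to reduce the dataset to those columns. The following is a minimal sketch, assuming the indices are zero-based positions among the feature columns (that is, excluding the label column). The README does not state this, so verify it against the documentation. `keep_selected` and the sample data are hypothetical.

```python
def keep_selected(header, rows, selected, label_column="class"):
    """Return the header and rows restricted to the selected features plus the label.

    Assumes `selected` holds zero-based positions among the non-label columns.
    """
    feature_names = [name for name in header if name != label_column]
    kept = [feature_names[i] for i in selected]
    # Map the kept feature names back to positions in the full header.
    positions = [header.index(name) for name in kept] + [header.index(label_column)]
    reduced_header = [header[p] for p in positions]
    reduced_rows = [[row[p] for p in positions] for row in rows]
    return reduced_header, reduced_rows


header = ["f0", "f1", "f2", "class"]
rows = [[0.1, 2.0, 5.5, "a"], [0.3, 1.8, 4.9, "b"]]
reduced_header, reduced_rows = keep_selected(header, rows, selected=[0, 2])
print(reduced_header)  # ['f0', 'f2', 'class']
```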
GB-AFS also incorporates a technique for visualizing the chosen features within the feature space, offering insights into their distribution and how distinct they are:
gbafs.plot_feature_space()
To begin working with GB-BC-FS, the first step is to initialize the GB-BC-FS object:
from gbfs import GBBCFS
gbbcfs = GBBCFS(
    dataset_path="path/to/your/dataset.csv",
    separability_metric="your_separability_metric",
    dim_reducer_model="your_dimensionality_reduction_method",
    label_column="class",
    budget=20,
    alpha=0.5,
    epochs=100,
)
After initializing the GB-BC-FS object, you can move forward with the process of selecting features:
selected_features = gbbcfs.select_features()
print("Selected Feature Indices:", selected_features)
GB-BC-FS also incorporates a technique for visualizing the chosen features within the feature space, offering insights into their distribution and how distinct they are:
gbbcfs.plot_feature_space()
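To make the idea of a budget constraint concrete, here is a generic greedy illustration. It is not the GB-BC-FS algorithm, and it does not use `alpha` or `epochs`. It only shows what it means to choose features whose total cost stays within a budget. The per-feature scores and costs are made-up values, and `greedy_under_budget` is a hypothetical helper.

```python
def greedy_under_budget(scores, costs, budget):
    """Pick features by descending score-per-cost while the total cost fits the budget.

    Generic illustration only; assumes every cost is strictly positive.
    """
    order = sorted(
        range(len(scores)),
        key=lambda i: scores[i] / costs[i],
        reverse=True,
    )
    chosen, spent = [], 0.0
    for i in order:
        # Skip any feature that would push the total cost over the budget.
        if spent + costs[i] <= budget:
            chosen.append(i)
            spent += costs[i]
    return sorted(chosen), spent


scores = [0.9, 0.6, 0.8, 0.3]   # hypothetical usefulness of each feature
costs = [12.0, 5.0, 9.0, 2.0]   # hypothetical acquisition cost of each feature
chosen, spent = greedy_under_budget(scores, costs, budget=20)
print(chosen, spent)  # [1, 2, 3] 16.0
```

In this example the highest-scoring feature (index 0) is left out because adding its cost of 12 would exceed the budget of 20. That is the kind of trade-off a budget-aware selector has to weigh.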
For more information on available commands and usage, refer to the documentation.
Contributions to gbfs are welcome! If you encounter any issues or have suggestions for improvements, please open an issue.
If you use this code in your research, please cite:
@article{levin2024gb,
title={GB-AFS: graph-based automatic feature selection for multi-class classification via Mean Simplified Silhouette},
author={Levin, David and Singer, Gonen},
journal={Journal of Big Data},
volume={11},
number={1},
pages={79},
year={2024},
publisher={Springer}
}