"Toward Talent Scientist: Sharing and Learning Together" --- Jingwei Too
- This toolbox offers several advanced wrapper feature selection methods
- The `Demo_ISSA` file provides an example of how to apply ISSA to a benchmark dataset
- The source code of these methods is written based on the pseudocode in the original papers
The main function `jfs` is used to perform feature selection. You can switch the algorithm by changing the `issa` in `from AFS.issa import jfs` to another abbreviation (see the table below).
- If you wish to use the improved salp swarm algorithm (ISSA), then write:

```python
from AFS.issa import jfs
```

- If you want to use the time varying binary salp swarm algorithm (TVBSSA), then write:

```python
from AFS.tvbssa import jfs
```
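Since every method exposes the same `jfs` entry point under `AFS.<abbreviation>`, the algorithm can also be chosen by name at runtime. This is a minimal sketch, not part of the toolbox, assuming only the module layout shown above:

```python
import importlib

def load_jfs(abbreviation):
    # e.g. 'issa' resolves to AFS.issa and returns its jfs function
    module = importlib.import_module('AFS.' + abbreviation)
    return module.jfs

jfs = load_jfs('tvbssa')   # equivalent to: from AFS.tvbssa import jfs
```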
Input:
- `feat`: feature vector matrix (instances x features)
- `label`: label matrix (instances x 1)
- `opts`: parameter settings
  - `N`: number of solutions / population size (for all methods)
  - `T`: maximum number of iterations (for all methods)
  - `k`: k-value in k-nearest neighbor

Output:
- `Acc`: accuracy of the validation model
- `fmdl`: feature selection model (contains several results)
  - `sf`: index of selected features
  - `nf`: number of selected features
  - `c`: convergence curve
```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from AFS.issa import jfs   # change this to switch algorithm
import matplotlib.pyplot as plt

# load data
data  = pd.read_csv('ionosphere.csv')
data  = data.values
feat  = np.asarray(data[:, 0:-1])   # feature vector
label = np.asarray(data[:, -1])     # label vector

# split data into train & validation sets (70 -- 30)
xtrain, xtest, ytrain, ytest = train_test_split(feat, label, test_size=0.3, stratify=label)
fold = {'xt': xtrain, 'yt': ytrain, 'xv': xtest, 'yv': ytest}

# parameters
k    = 5     # k-value in KNN
N    = 10    # number of salps
T    = 100   # maximum number of iterations
opts = {'k': k, 'fold': fold, 'N': N, 'T': T}

# perform feature selection
fmdl = jfs(feat, label, opts)
sf   = fmdl['sf']

# model with selected features
num_train = np.size(xtrain, 0)
num_valid = np.size(xtest, 0)
x_train   = xtrain[:, sf]
y_train   = ytrain.reshape(num_train)   # flatten labels to 1-D
x_valid   = xtest[:, sf]
y_valid   = ytest.reshape(num_valid)    # flatten labels to 1-D
mdl       = KNeighborsClassifier(n_neighbors=k)
mdl.fit(x_train, y_train)

# accuracy
y_pred = mdl.predict(x_valid)
Acc    = np.sum(y_valid == y_pred) / num_valid
print("Accuracy:", 100 * Acc)

# number of selected features
num_feat = fmdl['nf']
print("Feature Size:", num_feat)

# plot convergence curve
curve   = fmdl['c']
curve   = curve.reshape(np.size(curve, 1))   # fmdl['c'] has shape (1, T)
x       = np.arange(0, opts['T'], 1.0) + 1.0
fig, ax = plt.subplots()
ax.plot(x, curve, 'o-')
ax.set_xlabel('Number of Iterations')
ax.set_ylabel('Fitness')
ax.set_title('ISSA')
ax.grid()
plt.show()
```
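The indices in `sf` can also be mapped back to column names for reporting. This is a small follow-on sketch that reuses `sf` from the demo and assumes `ionosphere.csv` has a header row (as the `pd.read_csv` call above already implies):

```python
import pandas as pd

# read only the header row; drop the last column (the label)
header = pd.read_csv('ionosphere.csv', nrows=0).columns[:-1]
selected_names = [header[i] for i in sf]
print("Selected features:", selected_names)
```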
- Python 3
- NumPy
- Pandas
- Scikit-learn
- Matplotlib
 
- Note that the methods are adapted so that they can be used for feature selection tasks
- The extra parameters are the parameter(s) other than the population size and the maximum number of iterations
- Click on the name of a method to view its extra parameter(s)
- Use `opts` to set the specific parameter(s); see the sketch after the table below
- If you do not set the extra parameter(s), the algorithm will use its default settings
 
| No. | Abbreviation | Name | Year | Extra Parameters |
|---|---|---|---|---|
| 08 | tmgwo | Two-phase Mutation Grey Wolf Optimizer | 2020 | Yes |
| 07 | tvbssa | Time Varying Binary Salp Swarm Algorithm | 2020 | No |
| 06 | issa | Improved Salp Swarm Algorithm | 2020 | Yes |
| 05 | essa | Enhanced Salp Swarm Algorithm | 2019 | No |
| 04 | mgfpa | Modified Global Flower Pollination Algorithm | 2018 | Yes |
| 03 | obwoa | Opposition Based Whale Optimization Algorithm | 2018 | Yes |
| 02 | isca | Improved Sine Cosine Algorithm | 2017 | Yes |
| 01 | bbpso | Bare Bones Particle Swarm Optimization | 2003 | No |