Wrapper FS based on Cuckoo Search algorithm #335

Mohammed-Ryiad-Eiadeh · 2023-04-21T20:41:24Z

Description

Adds a wrapper feature selection (FS) approach based on the binary Cuckoo Search algorithm to Tribuo.

Motivation

Evolutionary computation has recently shown good results across a wide range of technologies, especially in engineering problems. Moreover, many published research papers use EO algorithms for FS and report good results. Adding a wrapper FS model to Tribuo would therefore be valuable.


Paper reference

Rodrigues, D. et al. BCS: A Binary Cuckoo Search algorithm for feature selection. Proc. IEEE Int. Symp. Circuits Syst., 465–468 (2013). doi:10.1109/ISCAS.2013.6571881

oracle-contributor-agreement · 2023-04-21T20:41:28Z

Thank you for your pull request and welcome to our community! To contribute, please sign the Oracle Contributor Agreement (OCA).
The following contributors of this PR have not signed the OCA:

To sign the OCA, please create an Oracle account and sign the OCA in Oracle's Contributor Agreement Application.

When signing the OCA, please provide your GitHub username. After signing the OCA and getting an OCA approval from Oracle, this PR will be automatically updated.

If you are an Oracle employee, please make sure that you are a member of the main Oracle GitHub organization, and your membership in this organization is public.

Craigacp · 2023-04-24T15:57:43Z

Thanks for the contribution. Once you've sorted out the OCA I can go through and review this, but I'm not allowed to review it until the OCA check passes.

Mohammed-Ryiad-Eiadeh · 2023-04-24T23:59:19Z

I signed the OCA form 4 days ago. And now I am waiting for the confirmation.

Craigacp · 2023-04-25T01:00:08Z

> I signed the OCA form 4 days ago. And now I am waiting for the confirmation.

Ok, thanks. Once it's gone through the system we can get going with this. It shouldn't take too long.

Mohammed-Ryiad-Eiadeh · 2023-04-25T01:57:19Z

Thank you.

Craigacp · 2023-04-28T15:34:02Z

It looks like there were some issues with the OCA you submitted, could you check the status of it and fix the issues it describes?

Mohammed-Ryiad-Eiadeh · 2023-04-29T04:11:33Z

The problem was that I forgot to include my GitHub username on the first OCA form. A few moments ago I signed the OCA again and provided my GitHub username.

Mohammed-Ryiad-Eiadeh · 2023-05-05T18:40:21Z

Dear Adam,

Is there any way to get a list or an array of the feature values from the 'selectedFeatureDataset', for example the array of values for a given feature index?

I need this to compute functions such as the Pearson correlation coefficient between two features.

Craigacp · 2023-05-05T19:20:22Z

Not easily from the selected feature dataset, as it's stored row-wise. The current information-theoretic feature selection code converts the data into a columnar storage format before running the feature selection procedure.

Depending on what you want to do, it might be more efficient to do that, for example if you need to compute the correlations between all features. Otherwise you pay a log(d) lookup cost for each feature value, which makes a single Pearson's correlation coefficient cost 2n*log(d) just to get the data into a columnar format. Alternatively, you can make an n*d matrix and write all the feature values into it, which is much more efficient if you're computing all the correlations and not worth it if it's only for a few values.
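A minimal sketch of the columnar approach described above, written against plain Java collections rather than Tribuo's classes. The row representation (a map from feature index to value) and the names `ColumnarCorrelation`, `toColumns` and `pearson` are hypothetical stand-ins, not Tribuo API; in Tribuo the feature indices would come from the dataset's feature map.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a row-wise sparse dataset: each row maps a feature index to its value.
// This is not Tribuo's API; it only illustrates the row-to-column conversion.
final class ColumnarCorrelation {

    // Copies n sparse rows into a dense matrix with one array per feature,
    // so columns[j] holds every value of feature j.
    static double[][] toColumns(List<Map<Integer, Double>> rows, int numFeatures) {
        double[][] columns = new double[numFeatures][rows.size()];
        for (int i = 0; i < rows.size(); i++) {
            for (Map.Entry<Integer, Double> entry : rows.get(i).entrySet()) {
                // Features missing from a row keep the default value 0.0.
                columns[entry.getKey()][i] = entry.getValue();
            }
        }
        return columns;
    }

    // Pearson correlation coefficient between two equal-length feature columns.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        // Evaluates to NaN when either column is constant.
        return cov / Math.sqrt(varX * varY);
    }
}
```

Once the columns are built, each pairwise correlation is a linear scan over two arrays with no per-value lookup.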

Mohammed-Ryiad-Eiadeh · 2023-05-05T20:09:30Z

I want to compute the correlation between all pairs of features in the selected feature dataset. How can I construct the d*n matrix using Tribuo?

Craigacp

Thanks for the PR, sorry it took a while to review. The implementation looks interesting, but there's a bit of work required to get it to play nicely with Tribuo's provenance and configuration systems. Also could you add a basic unit test which runs a feature selection operation, similar to the other tests in the FeatureSelection subproject?

Craigacp · 2023-05-19T13:43:19Z

...src/main/java/org/tribuo/classification/fs/FS_Wrapper_Approaches/Discreeting/Binarizing.java

@@ -0,0 +1,27 @@
+package FS_Wrapper_Approaches.Discreeting;


The base package name needs updating to org.tribuo.classification.fs.wrapper and then the files should be moved into the right directory. At the moment this won't compile because the directory and package names don't line up.

Also all the files need the copyright and license header.

Craigacp · 2023-05-19T13:43:54Z

...src/main/java/org/tribuo/classification/fs/FS_Wrapper_Approaches/Discreeting/Binarizing.java

@@ -0,0 +1,27 @@
+package FS_Wrapper_Approaches.Discreeting;
+
+import static org.apache.commons.math3.special.Erf.erf;


Tribuo no longer depends on Apache Commons Math 3, and this import isn't used.

Craigacp · 2023-05-19T13:46:09Z

...in/java/org/tribuo/classification/fs/FS_Wrapper_Approaches/Discreeting/TransferFunction.java

+ * Enumeration that contains the types of transfer functions in which they are used to define the type of transfer function
+ */
+public enum TransferFunction {
+        V1, V2, V3, V4, S1, S2, S3, S4


The implementation of discreteValue could be folded in here, making the enum contain a DoubleUnaryOperator which performs the necessary operation, and then it could have a discretise method which applies the function to the output. Then the Binarizing interface can be removed.
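A rough sketch of what that could look like, assuming each constant wraps a `java.util.function.DoubleUnaryOperator`. Only two constants are shown, and their formulas (a logistic sigmoid and an absolute tanh) are common S-shaped and V-shaped choices assumed for illustration rather than the PR's definitions. The thresholding rule in `discretise` is likewise an assumption.

```java
import java.util.function.DoubleUnaryOperator;

// Sketch only: the two formulas are assumed for illustration, not copied from the PR.
enum TransferFunction {
    S2(x -> 1.0 / (1.0 + Math.exp(-x))),
    V2(x -> Math.abs(Math.tanh(x)));

    private final DoubleUnaryOperator function;

    TransferFunction(DoubleUnaryOperator function) {
        this.function = function;
    }

    // Applies the transfer function to a continuous value.
    double apply(double value) {
        return function.applyAsDouble(value);
    }

    // Maps a continuous value to 0 or 1 by comparing the transfer output
    // with a uniform draw in [0, 1) supplied by the caller.
    int discretise(double value, double uniformDraw) {
        return uniformDraw < apply(value) ? 1 : 0;
    }
}
```

Passing the uniform draw in as an argument keeps the enum free of any RNG state, so the caller decides which random source is used.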

Craigacp · 2023-05-19T13:47:18Z

...main/java/org/tribuo/classification/fs/FS_Wrapper_Approaches/Evaluation/FitnessFunction.java

+import org.tribuo.classification.Label;
+import org.tribuo.classification.evaluation.LabelEvaluation;
+import org.tribuo.classification.evaluation.LabelEvaluator;
+import org.tribuo.common.nearest.KNNClassifierOptions;


We try to avoid surfacing the options classes in the library code as they are not under the same compatibility guarantees as the library code. It's probably better to import KNNTrainer directly.

Craigacp · 2023-05-19T13:50:00Z

...main/java/org/tribuo/classification/fs/FS_Wrapper_Approaches/Evaluation/FitnessFunction.java

+/**
+ * This interface includes the evaluation function of each solution
+ */
+public interface FitnessFunction {


This interface only contains static members. For the time being those methods can be moved onto CuckooSearchOptimizer and be private, then if we need to use them from other places they can be easily moved and made more accessible.

Craigacp · 2023-05-19T14:03:48Z

...ava/org/tribuo/classification/fs/FS_Wrapper_Approaches/Optimizers/CuckooSearchOptimizer.java

+    private final double delta;
+    private final int populationSize;
+    private int [][] setOfSolutions;
+    private final int maxIteration;


All the final variables here should not be final and need to be tagged @Config (with appropriate description fields) so they are automatically captured by the provenance system.
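As an illustration of the shape being asked for, here is a sketch with non-final, annotated fields. So that it compiles on its own, it declares a local stand-in `@Config` annotation mirroring the `description` and `mandatory` attributes of OLCUT's `com.oracle.labs.mlrg.olcut.config.Config`; real code would import OLCUT's annotation instead. The class name, default values and description strings are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Local stand-in for OLCUT's @Config annotation so this sketch has no dependencies.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Config {
    String description() default "";
    boolean mandatory() default false;
}

final class CuckooSearchOptimizerSketch {
    // Fields are non-final so a configuration system can set them reflectively.
    @Config(description = "Number of candidate solutions in the population.")
    private int populationSize = 50;

    @Config(description = "Maximum number of search iterations.")
    private int maxIteration = 30;

    @Config(description = "The delta parameter of the search (meaning assumed here).")
    private double delta = 1.5;

    // No-argument constructor for reflective configuration.
    CuckooSearchOptimizerSketch() { }

    // Programmatic constructor for direct use.
    CuckooSearchOptimizerSketch(int populationSize, int maxIteration, double delta) {
        this.populationSize = populationSize;
        this.maxIteration = maxIteration;
        this.delta = delta;
    }

    int getPopulationSize() {
        return populationSize;
    }

    int getMaxIteration() {
        return maxIteration;
    }

    double getDelta() {
        return delta;
    }
}
```

Working state such as `setOfSolutions` would stay untagged, since it is not a configuration parameter.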

Craigacp · 2023-05-19T14:04:22Z

...ava/org/tribuo/classification/fs/FS_Wrapper_Approaches/Optimizers/CuckooSearchOptimizer.java

+     * @param totalNumberOfFeatures The number of features in the given dataset
+     * @return The population of subsets of selected features
+     */
+    private int[][] GeneratePopulation(int totalNumberOfFeatures) {


As mentioned elsewhere this method should accept a SplittableRandom rather than make a fresh RNG each time.
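A sketch of the suggested signature, assuming each solution is a binary mask over the features. The extra `populationSize` parameter, the class name and the uniform 0/1 initialisation are assumptions made for the example, not the PR's code.

```java
import java.util.SplittableRandom;

final class PopulationSketch {

    // Draws populationSize random binary masks using the caller's RNG;
    // a 1 at position j means feature j is selected.
    static int[][] generatePopulation(int populationSize, int totalNumberOfFeatures, SplittableRandom rng) {
        int[][] population = new int[populationSize][totalNumberOfFeatures];
        for (int[] solution : population) {
            for (int j = 0; j < solution.length; j++) {
                solution[j] = rng.nextInt(2);
            }
        }
        return population;
    }
}
```

Because the RNG is supplied by the caller, two runs seeded identically produce the same initial population, which keeps the selection reproducible.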

Craigacp · 2023-05-19T14:04:55Z

...ava/org/tribuo/classification/fs/FS_Wrapper_Approaches/Optimizers/CuckooSearchOptimizer.java

+    /**
+     * This record is used to hold subset of features with its corresponding fitness score
+     */
+    record FeatureSet_FScore_Container(int[] subSet, double score) { }


CuckooSearchFeatureSet?

Mohammed-Ryiad-Eiadeh · 2023-05-28

I used this record to store each solution with its corresponding fitness value, so that the solutions can be sorted by fitness score using a Comparator.

Craigacp · 2023-05-30

Ah, sorry, I was suggesting this record should be renamed CuckooSearchFeatureSet as it's specific to this approach and it's currently got a very general name.

Mohammed-Ryiad-Eiadeh · 2023-05-30

Ah, I am sorry, I did not understand your note correctly. You are right, I should change the name, and I am currently working on the FJP issue. Thanks, Dr. Adam, for your suggestions. They are all appropriate, and I know that I still need more time to keep learning Java programming (my best language).
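For reference, a small sketch of the renamed record used for ranking solutions with a Comparator. It assumes a higher score means a better solution, and the `RankingSketch` helper is hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Holds a subset of features together with its fitness score.
record CuckooSearchFeatureSet(int[] subSet, double score) { }

final class RankingSketch {

    // Returns a new list ordered from the highest score to the lowest,
    // leaving the input list untouched.
    static List<CuckooSearchFeatureSet> rank(List<CuckooSearchFeatureSet> solutions) {
        List<CuckooSearchFeatureSet> sorted = new ArrayList<>(solutions);
        sorted.sort(Comparator.comparingDouble(CuckooSearchFeatureSet::score).reversed());
        return sorted;
    }
}
```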

Craigacp · 2023-05-19T14:05:53Z

...main/java/org/tribuo/classification/fs/FS_Wrapper_Approaches/Evaluation/FitnessFunction.java

+        KNNClassifierOptions classifier = new KNNClassifierOptions();
+        CrossValidation<Label, LabelEvaluation> crossValidation = new CrossValidation<>(classifier.getTrainer(), selectedFeatureDataset, new LabelEvaluator(), 10);
+        double avgAccuracy = 0d;
+        for (Pair<LabelEvaluation, Model<Label>> ACC : crossValidation.evaluate())


We always use curly braces even for single line for loops and if statements.

Craigacp · 2023-05-19T14:06:22Z

...ava/org/tribuo/classification/fs/FS_Wrapper_Approaches/Optimizers/CuckooSearchOptimizer.java

+import FS_Wrapper_Approaches.Discreeting.Binarizing;
+import FS_Wrapper_Approaches.Discreeting.TransferFunction;
+import com.oracle.labs.mlrg.olcut.util.Pair;
+import org.tribuo.*;


All imports should be explicit, no star imports for packages.

Mohammed-Ryiad-Eiadeh · 2023-05-19T14:16:11Z

Thank you for your suggestions. All of them are great and I will apply them, but I am currently working on my PhD proposal and have meetings with my advisor this month, so I will make these changes next month.

…ification/fs/wrapper directory

Add files via upload

6066f3a

oracle-contributor-agreement bot added the OCA Required At least one contributor does not have an approved Oracle Contributor Agreement. label Apr 21, 2023

Mohammed-Ryiad-Eiadeh added 5 commits April 22, 2023 20:06

Update TransferFunction.java

e729e18

Update CuckooSearchOptimizer.java

e33157e

Update Binarizing.java

580fb6a

Update CuckooSearchOptimizer.java

02ddebb

Update CuckooSearchOptimizer.java

1079ee7

Update CuckooSearchOptimizer.java

1a323c9

Mohammed-Ryiad-Eiadeh added 5 commits April 27, 2023 04:28

Update CuckooSearchOptimizer.java

8228ffe

Update TransferFunction.java

4f7a25f

Update Binarizing.java

44a33bd

Add files via upload

5d58e5d

Update CuckooSearchOptimizer.java

39c340b

Mohammed-Ryiad-Eiadeh added 5 commits April 29, 2023 07:55

Update CuckooSearchOptimizer.java

2f85c7b

Update FitnessFunction.java

6392dd1

Update CuckooSearchOptimizer.java

e214c10

Update FitnessFunction.java

f0a1300

Update FitnessFunction.java

2c9c47d

oracle-contributor-agreement bot added OCA Verified All contributors have signed the Oracle Contributor Agreement. and removed OCA Required At least one contributor does not have an approved Oracle Contributor Agreement. labels May 1, 2023

Update FitnessFunction.java

40da8a8

Mohammed-Ryiad-Eiadeh added 3 commits May 8, 2023 01:04

Update FitnessFunction.java

b381c2a

Update FitnessFunction.java

e05f30d

Update CuckooSearchOptimizer.java

5aa563d

Craigacp requested changes May 19, 2023


Mohammed-Ryiad-Eiadeh added 18 commits May 29, 2023 23:01

Add files via upload

c5d5f62

Delete Classification/FeatureSelection/src/main/java/org/tribuo/class…

99f83e6

…ification/fs/wrapper directory

Delete Binarizing.java

63e4aa0

Update TransferFunction.java

8c9d3e2

Delete FitnessFunction.java

6d2f940

Update CuckooSearchOptimizer.java

2df7c7e

Update CuckooSearchOptimizer.java

50be1fb

Update CuckooSearchOptimizer.java

96ecc6b

Update pom.xml

21a9c3f

Update CuckooSearchOptimizer.java

fe512ae

Update CuckooSearchOptimizer.java

d4a11db

Update CuckooSearchOptimizer.java

f45bde5

Update CuckooSearchOptimizer.java

7670a28

Update CuckooSearchOptimizer.java

0993bc0

Update CuckooSearchOptimizer.java

9038ef4

Update CuckooSearchOptimizer.java

76dc715

Update CuckooSearchOptimizer.java

28509c0

Update CuckooSearchOptimizer.java

2910d99
