Question about food and best dragon update in DA algorithm. #10

Closed

Migue8gl opened this issue Nov 18, 2023 · 10 comments

@Migue8gl

I am working to minimise a function according to typical criteria in feature selection (a mix between accuracy and reduction rate).

I find that the DA algorithm is minimising all features to 0. I've been investigating, specifically in the 'update_food' function, when I get to the if (dimensions == dragonflies.shape[1] - 2): for k in range(0, dragonflies.shape[1]-2): food_position[0,k] = np.clip(food_position[0,k] - dragonflies[i,k], min_values[k], max_values[k]) I'm seeing that the dragonfly and the food being subtracted have the same values. This makes the food all 0's, in the following iterations when multiplying 0 by Levy and adding it to the food[0,k] it is still 0. I don't know what makes the subtracted dragonfly exactly the same, what I do know is that it is marked as the best solution and doesn't come out of it again.
It happens with many of the datasets I have tried.
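
For illustration, a minimal sketch of the behaviour I think I am seeing (the array shapes, bounds and values are made up by me, following the snippet above; the two extra fitness columns are omitted):

import numpy as np

min_values = np.zeros(4)
max_values = np.ones(4)

food_position = np.array([[0.3, 0.7, 0.2, 0.9]])
best_dragonfly = food_position[0].copy()  # the best dragonfly holds the same values as the food

for k in range(food_position.shape[1]):
    food_position[0, k] = np.clip(food_position[0, k] - best_dragonfly[k], min_values[k], max_values[k])

print(food_position)  # [[0. 0. 0. 0.]] -> multiplying by the Levy term keeps it at 0 from then on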

@Valdecy (Owner) commented Nov 18, 2023

Ok, I will have a look. Could you also provide the function you are trying to minimize and a sample of the dataset?

@Migue8gl (Author)

I'm using the function below. I have made some changes inside the algorithms to record fitness values so I can plot them, but the logic behind every algorithm is untouched.

The fitness function I am using:

import numpy as np

def fitness(weights, data, alpha=0.5, classifier='knn', n_neighbors=5):
    # Count the zero-weighted features, then zero-out weights below the 0.1 threshold
    reduction_count = np.sum(weights == 0)
    weights[weights < 0.1] = 0.0
    classification_rate = compute_accuracy(weights, data=data, classifier=classifier, n_neighbors=n_neighbors)
    reduction_rate = reduction_count / len(weights)

    # Calculate the error as a percentage
    classification_error = 1 - classification_rate['TrainError']
    reduction_error = 1 - reduction_rate

    # Compute fitness as a combination of classification and reduction errors
    fitness_train = alpha * classification_error + (1 - alpha) * reduction_error
    classification_error = 1 - classification_rate['ValError']
    fitness_val = alpha * classification_error + (1 - alpha) * reduction_error

    return {'TrainFitness': fitness_train, 'ValFitness': fitness_val}
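
(By the way, the fitness values I mentioned recording could also be captured without touching the algorithms, for example with a small wrapper around the fitness; just a sketch, assuming the optimizer accepts any callable:)

fitness_history = []

def tracked_fitness(weights, data, **kwargs):
    # Same computation as fitness(), but stores the training fitness of every call
    result = fitness(weights, data, **kwargs)
    fitness_history.append(result['TrainFitness'])
    return result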

Accuracy function:

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

def compute_accuracy(weights, data, classifier='knn', n_neighbors=5):
    sample = data['data']
    labels = data['labels']

    # Apply the feature weights, then split into train/test sets
    sample_weighted = np.multiply(sample, weights)
    x_train, x_test, y_train, y_test = train_test_split(sample_weighted, labels, test_size=0.2, random_state=42)

    if (classifier == 'knn'):
        classifier = KNeighborsClassifier(n_neighbors=n_neighbors, weights='distance')
    elif (classifier == 'svc'):
        classifier = SVC(kernel='rbf')
    else:
        print('No valid classifier, using KNN by default')
        classifier = KNeighborsClassifier(n_neighbors=n_neighbors, weights='distance')

    # Train the classifier
    classifier.fit(x_train, y_train)
    # Accuracy on the training split (returned under the 'TrainError' key)
    y_pred = classifier.predict(x_train)
    e_in = accuracy_score(y_train, y_pred)

    # Accuracy on the held-out split (returned under the 'ValError' key)
    y_pred = classifier.predict(x_test)
    e_out = accuracy_score(y_test, y_pred)

    return {'TrainError': e_in, 'ValError': e_out}

Sample of Ionosphere:

1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1,0.03760,0.85243,-0.17755,0.59755,-0.44945,0.60536,-0.38223,0.84356,-0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,-0.47357,0.56811,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300,g
1,0,1,-0.18829,0.93035,-0.36156,-0.10868,-0.93597,1,-0.04549,0.50874,-0.67743,0.34432,-0.69707,-0.51685,-0.97515,0.05499,-0.62237,0.33109,-1,-0.13151,-0.45300,-0.18056,-0.35734,-0.20332,-0.26569,-0.20468,-0.18401,-0.19040,-0.11593,-0.16626,-0.06288,-0.13738,-0.02447,b
1,0,1,-0.03365,1,0.00485,1,-0.12062,0.88965,0.01198,0.73082,0.05346,0.85443,0.00827,0.54591,0.00299,0.83775,-0.13644,0.75535,-0.08540,0.70887,-0.27502,0.43385,-0.12062,0.57528,-0.40220,0.58984,-0.22145,0.43100,-0.17365,0.60436,-0.24180,0.56045,-0.38238,g
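
In case it helps to reproduce, this is roughly how I load that kind of file into the dict expected by compute_accuracy (a sketch; the file name is a placeholder and the 'g'/'b' label encoding is my choice):

import numpy as np
import pandas as pd

df = pd.read_csv('ionosphere.csv', header=None)  # placeholder file name
data = {
    'data':   df.iloc[:, :-1].to_numpy(dtype=float),
    'labels': (df.iloc[:, -1] == 'g').astype(int).to_numpy()  # 'g' -> 1, 'b' -> 0
}

weights = np.ones(data['data'].shape[1])  # one weight per feature
print(fitness(weights, data))             # needs the full dataset, not just the 3 rows above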

@Migue8gl (Author)

In other algorithms like GOA or GWO it works just fine, but in DA something related to how the food is generated is minimising every feature to zero.

@Valdecy (Owner) commented Nov 20, 2023

Please try using version 1.5.1 of pyMetaheuristic. I believe the issue has been resolved in this update. Should you encounter any further problems, do not hesitate to inform me.

Valdecy closed this as completed Nov 20, 2023
@Migue8gl (Author)

> Please try using version 1.5.1 of pyMetaheuristic. I believe the issue has been resolved in this update. Should you encounter any further problems, do not hesitate to inform me.

I have tested the new version and although the values are no longer minimised to zero, the food does not update and remains stuck at the same values iteration after iteration.

I have uploaded the changes I introduced in the algorithms, along with everything needed to run the datasets, to my TFG-Wrapped-Based-Metaheuristics-Feature-Selection repository. I sincerely believe I have not altered any of the algorithm's logic; in fact, the rest of the algorithms I modified work very well. I should mention that it is an initial version for a university project and is not yet properly parameterised or modularised.

@Migue8gl (Author) commented Nov 20, 2023

I would also like to know where the food and predator update operators come from. I have read the original paper and it only mentions F_i = F_p - P, where F_i is the food attraction, F_p is the food position vector and P is the position of the individual. Thanks for your attention, and sorry for the inconvenience if anything I asked was badly formulated or incorrect.
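
For reference, the step update as I read it in the original paper (Mirjalili, 2015), sketched in NumPy; the bookkeeping of the food X_plus (best position) and enemy X_minus (worst position) is my own interpretation:

import numpy as np

def step_vector(X, dX, Xs, Vs, X_plus, X_minus, s, a, c, f, e, w):
    # X: current position, dX: previous step, Xs/Vs: neighbour positions/steps
    S = -np.sum(X - Xs, axis=0)   # separation
    A = np.mean(Vs, axis=0)       # alignment
    C = np.mean(Xs, axis=0) - X   # cohesion
    F = X_plus - X                # attraction towards the food source
    E = X_minus + X               # distraction away from the enemy
    return s * S + a * A + c * C + f * F + e * E + w * dX

# new position: X + step_vector(...)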

@Valdecy (Owner) commented Nov 20, 2023

Kindly review the updates in pyMetaheuristic version 1.5.3. In this version, the 'food position' has been designated as the reference point for the best solution found, while the 'predator position' signifies the location of the enemy. These adjustments were made specifically to accelerate the algorithm's convergence, and the fundamental logic of the algorithm remains unchanged. That said, the Dragonfly Algorithm (DA) inherently requires a considerable number of iterations to reach a satisfactory solution.
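
In other words, the idea is roughly the following (a simplified sketch for a minimisation problem, not the exact library code):

import numpy as np

def update_food_and_enemy(population, fitness_values, food, food_fitness, enemy, enemy_fitness):
    # The food follows the best solution found so far, the enemy (predator) the worst one
    best_idx, worst_idx = np.argmin(fitness_values), np.argmax(fitness_values)
    if fitness_values[best_idx] < food_fitness:
        food, food_fitness = population[best_idx].copy(), fitness_values[best_idx]
    if fitness_values[worst_idx] > enemy_fitness:
        enemy, enemy_fitness = population[worst_idx].copy(), fitness_values[worst_idx]
    return food, food_fitness, enemy, enemy_fitness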

@Valdecy (Owner) commented Nov 21, 2023

I have refactored the DA code in version 1.5.4. Please check which one works better for you, 1.5.3 or 1.5.4.

@Migue8gl (Author)

> I have refactored the DA code in version 1.5.4. Please check which one works better for you, 1.5.3 or 1.5.4.

I will have a look in the next few hours, thanks!

@Migue8gl (Author)

> I have refactored the DA code in version 1.5.4. Please check which one works better for you, 1.5.3 or 1.5.4.

It's working way better than before; it needed more iterations, as you said, but it works. Also, the code is cleaner (in my opinion) than before. Thanks for your time.
