Access intermediate feature subset select #1043

Closed
arilwan opened this issue May 18, 2023 · 4 comments


arilwan commented May 18, 2023

Sequential feature selection is a very time-consuming preprocessing task.

Wouldn't it be nice to have some way to access the intermediate features selected while the algorithm keeps running? For example, when using SFFS to select the best subset from, say, 100 features, it would be nice to retrieve, at round N, the feature subset selected at the end of that round.

Owner

rasbt commented May 19, 2023

Thanks for the suggestions! Actually, the good news is that this is already possible via Example 11 here: https://rasbt.github.io/mlxtend/user_guide/feature_selection/SequentialFeatureSelector/#example-11-interrupting-long-runs-for-intermediate-results

But please feel free to reopen this in case it doesn't work or doesn't fully solve the problem.

rasbt closed this as completed May 19, 2023

Author

arilwan commented Jun 14, 2023

@rasbt

Very sorry to reopen this issue. I understand from the example you mentioned that intermediate results are accessible upon interrupting the process.

What I hope to do is retrieve those attributes (number of features selected and metric score), saved to a variable or written to a file, after every feature is added in an SFFS run, without interrupting it.

For example, I started running the selection process below on 2 June 2023.

[2023-06-14 08:01:25] Features: 149/240 -- score: 0.8947770129386831[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  25 tasks      | elapsed: 21.0min
[Parallel(n_jobs=-1)]: Done  91 out of  91 | elapsed: 60.4min finished
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  25 tasks      | elapsed: 20.8min
[Parallel(n_jobs=-1)]: Done 149 out of 149 | elapsed: 100.3min finished
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  25 tasks      | elapsed: 21.4min
[Parallel(n_jobs=-1)]: Done 148 out of 148 | elapsed: 89.4min finished

[2023-06-14 12:11:29] Features: 149/240 -- score: 0.8952890526770254[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  25 tasks      | elapsed: 18.2min
[Parallel(n_jobs=-1)]: Done  91 out of  91 | elapsed: 53.0min finished
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=-1)]: Done  25 tasks      | elapsed: 18.5min

It has been running for 2 weeks now, and there is still a long way to go, maybe another 2 weeks.

Suppose those attributes were accessible and, say, saved to a file. I could then do some analysis of the results after Features: 50/240, Features: 100/240, Features: 148/240, etc., without actually interrupting the running process.

Isn't there any way to write those to a file?

Author

arilwan commented Jun 14, 2023

@rasbt
Can you please guide me on which section of the code I should change to continuously write the attribute values to a txt file that is updated after every feature is added?

Owner

rasbt commented Jun 15, 2023

For future reference, linking the discussion here: #1051
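
This thread does not record how the question was resolved. As one possible workaround that needs no changes to the library, the verbose progress lines quoted above could be captured to a file and parsed while the run continues. The following is only a minimal sketch: it assumes the console output is being redirected to a text file, and `PROGRESS_RE` and `parse_progress` are hypothetical names introduced here, not part of mlxtend. It recovers the timestamp, the number of features, and the score from each progress line, but not the selected feature indices, since those do not appear in the log.

```python
import re

# Matches progress lines of the form shown in the log above, e.g.
# "[2023-06-14 08:01:25] Features: 149/240 -- score: 0.8947770129386831"
# The score may be followed directly by other output on the same line.
PROGRESS_RE = re.compile(
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"Features: (?P<n_selected>\d+)/(?P<n_total>\d+) -- "
    r"score: (?P<score>-?\d+(?:\.\d+)?)"
)


def parse_progress(log_text):
    """Return one dict per progress line found in the captured log text."""
    rows = []
    for match in PROGRESS_RE.finditer(log_text):
        rows.append(
            {
                "timestamp": match.group("timestamp"),
                "n_selected": int(match.group("n_selected")),
                "n_total": int(match.group("n_total")),
                "score": float(match.group("score")),
            }
        )
    return rows
```

Calling `parse_progress` on the current contents of the captured log file (read with `open(path).read()`, where the path is whatever file the output was redirected to) can be repeated at any time without touching the running selection process.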
