
Fixed #971 turned off joblib when n_jobs == 1 #985

Merged
merged 2 commits into rasbt:master on Nov 12, 2022

Conversation

NimaSarajpoor
Contributor

This PR fixes issue #971

Performance Code

import time

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from mlxtend.feature_selection import ExhaustiveFeatureSelector as EFS

seed = 0
np.random.seed(seed)
X = np.random.rand(10000, 10)  # 10k samples, with 10 features
y = np.random.choice([0, 1], size=10000)

lst = []
for i in range(5):
    tic = time.time()
    efs = EFS(RandomForestClassifier()).fit(X, y)  # EFS: ExhaustiveFeatureSelector
    toc = time.time()
    lst.append(toc - tic)

np.mean(lst)  # average wall-clock time over the 5 runs

Computing Time

  • branch main: 103 sec
  • this branch: 93 sec
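
For context, a minimal sketch of the idea behind the change (illustrative names only, not the actual mlxtend internals): when n_jobs == 1, the candidate feature subsets are scored in a plain Python loop instead of going through joblib, which avoids the dispatch/serialization overhead and makes each result available as soon as it is computed.

from joblib import Parallel, delayed

def _evaluate_candidates(candidates, score_fn, n_jobs=1):
    # Illustrative helper: score each candidate feature subset.
    if n_jobs == 1:
        # Plain loop: no joblib dispatch overhead, results arrive one by one.
        return [score_fn(c) for c in candidates]
    # Parallel path: hand the work to joblib as before.
    return Parallel(n_jobs=n_jobs)(delayed(score_fn)(c) for c in candidates)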

@codecov

codecov bot commented Nov 8, 2022

Codecov Report

Base: 77.43% // Head: 77.43% // No change to project coverage 👍

Coverage data is based on head (7599ebf) compared to base (423d217).
Patch coverage: 100.00% of modified lines in pull request are covered.

❗ Current head 7599ebf differs from pull request most recent head e912885. Consider uploading reports for the commit e912885 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #985   +/-   ##
=======================================
  Coverage   77.43%   77.43%           
=======================================
  Files         198      198           
  Lines       11165    11165           
  Branches     1406     1406           
=======================================
  Hits         8646     8646           
  Misses       2305     2305           
  Partials      214      214           
Impacted Files         Coverage Δ
mlxtend/__init__.py    100.00% <100.00%> (ø)



@NimaSarajpoor
Contributor Author

I will fix this in the coming days.

@NimaSarajpoor
Contributor Author

@rasbt
I think it is ready. If there is something I missed, please let me know.

@rasbt rasbt left a comment
Owner

Thanks a lot, that looks great! Neat & clean solution!

@rasbt
Owner

rasbt commented Nov 12, 2022

Was just testing the code and it definitely improved the startup time. When I am trying an example like

import numpy as np
from sklearn.linear_model import LogisticRegression
from mlxtend.feature_selection import ExhaustiveFeatureSelector as EFS

seed = 0
np.random.seed(seed)
X = np.random.rand(10000, 10)  # 10k samples, with 10 features
y = np.random.choice([0, 1], size=10000)

model = LogisticRegression()

efs1 = EFS(model, 
           min_features=1,
           max_features=10,
           scoring='accuracy',
           print_progress=True,
           n_jobs=1,
           cv=5)

efs1 = efs1.fit(X, y)

print('Best accuracy score: %.2f' % efs1.best_score_)
print('Best subset (indices):', efs1.best_idx_)
print('Best subset (corresponding names):', efs1.best_feature_names_)

it still seems to be a bit stuck, though. I.e., it does not show any output for about 2-3 minutes and then iterates through the ~1k possibilities in about 1 second.

I wonder if that's an issue with the verbose display functionality though 🤔

EDIT: No worries, it was a computer issue. It works perfectly now. Actually, it solves the problem. Before, a user could not see the progress printed to the command line until all combinations were evaluated. Now, you get the feedback immediately if n_jobs==1.
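
To illustrate the progress point (a sketch with toy stand-ins, not the actual selector code): a plain loop can report after every iteration, whereas Parallel(...) only returns once all dispatched tasks have finished, so any feedback derived from its results shows up in a single burst at the end.

import time
from joblib import Parallel, delayed

def score(subset):
    time.sleep(0.5)  # stand-in for fitting/cross-validating one feature subset
    return len(subset)

subsets = [(0,), (0, 1), (0, 1, 2)]

# n_jobs == 1 path: one progress line roughly every 0.5 s
for s in subsets:
    print("done:", s, score(s))

# joblib path: nothing is printed until Parallel(...) has returned,
# so all three lines appear at once
results = Parallel(n_jobs=2)(delayed(score)(s) for s in subsets)
for s, res in zip(subsets, results):
    print("done:", s, res)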

@rasbt rasbt merged commit 55359c7 into rasbt:master Nov 12, 2022
@NimaSarajpoor
Contributor Author

EDIT: No worries, it was a computer issue. It works perfectly now. Actually, it solves the problem. Before, a user could not see the progress printed to the command line until all combinations were evaluated. Now, you get the feedback immediately if n_jobs==1.

Thanks for the info :)
