OPTIM remove useless overhead caused by nested parallelism in mean_shift (#12159)
ogrisel committed Sep 26, 2018
1 parent b915ca6 commit fe05e79
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion sklearn/cluster/mean_shift_.py
@@ -193,7 +193,11 @@ def mean_shift(X, bandwidth=None, seeds=None, bin_seeding=False,
             seeds = X
     n_samples, n_features = X.shape
     center_intensity_dict = {}
-    nbrs = NearestNeighbors(radius=bandwidth, n_jobs=n_jobs).fit(X)
+
+    # We use n_jobs=1 because this will be used in nested calls under
+    # parallel calls to _mean_shift_single_seed so there is no need for
+    # for further parallelism.
+    nbrs = NearestNeighbors(radius=bandwidth, n_jobs=1).fit(X)

     # execute iterations on all seeds in parallel
     all_res = Parallel(n_jobs=n_jobs)(
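
The pattern the code comment describes can be sketched roughly as follows. This is a simplified illustration, not the scikit-learn implementation: `shift_one_seed` and `run` are hypothetical names, and the flat-kernel update is an assumed stand-in for `_mean_shift_single_seed`. The point it tries to show is that the fitted `NearestNeighbors` object is queried inside each worker, so its own `n_jobs` setting would apply to every `radius_neighbors` call made under the outer `Parallel`.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.neighbors import NearestNeighbors


def shift_one_seed(seed, X, nbrs, max_iter=50, tol=1e-3):
    """Hypothetical stand-in for _mean_shift_single_seed (flat kernel)."""
    center = np.asarray(seed, dtype=float)
    for _ in range(max_iter):
        # The neighbor query happens here, inside the worker. This is
        # where the estimator's n_jobs setting would come into play.
        idx = nbrs.radius_neighbors([center], return_distance=False)[0]
        if len(idx) == 0:
            break
        new_center = X[idx].mean(axis=0)
        converged = np.linalg.norm(new_center - center) < tol
        center = new_center
        if converged:
            break
    return center


def run(X, bandwidth, n_jobs=None):
    """Hypothetical driver: fit once, then shift every seed in parallel."""
    # Inner estimator stays sequential; the outer Parallel call is the
    # only level that fans out work.
    nbrs = NearestNeighbors(radius=bandwidth, n_jobs=1).fit(X)
    return Parallel(n_jobs=n_jobs)(
        delayed(shift_one_seed)(seed, X, nbrs) for seed in X
    )
```

Under these assumptions, fitting happens once before the parallel section, but the queries do not, which is why the inner `n_jobs` is pinned to 1 while the outer `Parallel` keeps the user-supplied value.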

2 comments on commit fe05e79

@martinosorb
Are you sure this would call nested parallelism (if I understand what you mean)?
nbrs is an input to _mean_shift_single_seed, so it's computed before the parallel calls to the latter, isn't it?

@amueller
Member

@martinosorb Not sure what you mean but you're better off commenting in the issue than on a commit.