Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH Move threadpoolctl outside of iteration loop in KMeans #17235

Merged

Conversation

jeremiedbb
Copy link
Member

Related to #17208

Profiling shows that the threadpoolctl context manager has a significant overhead when running KMeans on very small datasets. This PR moves the call to threadpoolctl outside of the iteration loop.

prof1

The timings on my laptop from the snippet in #17208 are now:
0.22: 0.0138
0.23: 0.0562
master: 0.0431
this pr: 0.0214

Although it's a nice improvement, we don't fully recover the perf of 0.22. See explanation and discussion in #17208.

@jeremiedbb
Copy link
Member Author

ping @adrinjalali

@jeremiedbb
Copy link
Member Author

The uncovered lines are for the verbose mode.

@NicolasHug
Copy link
Member

Can you help me understand what made you change your mind about https://github.com/scikit-learn/scikit-learn/pull/16499/files#r382599621 @jeremiedbb ?

@NicolasHug
Copy link
Member

Although it's a nice improvement, we don't fully recover the perf of 0.22. See explanation and discussion in #17208.

Just trying to fully understand where we're at: is this sentence still up to date considering that the overhead of _deprecate_positional_args has been ruled out?

@jeremiedbb
Copy link
Member Author

Can you help me understand what made you change your mind about https://github.com/scikit-learn/scikit-learn/pull/16499/files#r382599621 @jeremiedbb ?

Haha I was sure you'd remember :D
I did not benchmark very small datasets at that time and the timings reported in #17208 indicate that the overhead is significant at that scale.

Reading my comment again, I recall what was the issue.

for i in range(max_iter):
    elkan_iter(...)
    ...
    euclidean_distances(...)  # BLAS

If we put the context inside the loop we have the overhead and if we put it outside of the loop, the inner euclidean_distances is limited. I don't see how we can solve this :/
Let me put back the wip tag...

@jeremiedbb jeremiedbb changed the title [MRG] Move threadpoolctl outside of iteration loop in KMeans [WIP] Move threadpoolctl outside of iteration loop in KMeans May 15, 2020
@jeremiedbb
Copy link
Member Author

Just trying to fully understand where we're at: is this sentence still up to date considering that the overhead of _deprecate_positional_args has been ruled out?

We still don't exactly recover the timings from 0.22 but the explanation is wrong, we still need to figure out why

@jeremiedbb
Copy link
Member Author

actually, there's no need for the threadpoolctl context for elkan since elkan does not call BLAS in nested parallel regions. I just removed it. The timings are now:

elkan lloyd
0.22 0.0138 0.0211
master 0.0477 0.0428
this pr 0.0134 0.0181

@jeremiedbb jeremiedbb changed the title [WIP] Move threadpoolctl outside of iteration loop in KMeans [MRG] Move threadpoolctl outside of iteration loop in KMeans May 15, 2020
Copy link
Member

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just removed it

These are the best fixes

LGTM as long as the answer to my question below is yes ;)

thanks @jeremiedbb

print("Iteration {0}, inertia {1}" .format(i, inertia))
# Threadpoolctl context to limit the number of threads in second level of
# nested parallelism (i.e. BLAS) to avoid oversubsciption.
with threadpool_limits(limits=1, user_api="blas"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so you confirm that we have no BLAS call that are protected by this CM, apart from the ones in lloyd_iter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I just double checked

@NicolasHug
Copy link
Member

also will need a whatsnew entry

@jeremiedbb
Copy link
Member Author

@NicolasHug I merged the what's new entry with the entry for #17210 if you don't mind. They are 2 attempts at fixing the same thing.

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As another datapoint:

0.22: 0.0229
0.23: 0.16056
master: 0.1285
this pr: 0.0249

LGTM Thank you @jeremiedbb !

@thomasjpfan thomasjpfan changed the title [MRG] Move threadpoolctl outside of iteration loop in KMeans ENH Move threadpoolctl outside of iteration loop in KMeans May 16, 2020
@thomasjpfan thomasjpfan merged commit 0db5f36 into scikit-learn:master May 16, 2020
adrinjalali pushed a commit to adrinjalali/scikit-learn that referenced this pull request May 18, 2020
viclafargue pushed a commit to viclafargue/scikit-learn that referenced this pull request Jun 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants