Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] Converting all Estimator Constructors to Keyword Arguments #3636

Merged

Conversation

mdemoret-nv
Copy link
Contributor

Closes #3373

This mimics sklearn's positional argument deprecation message (and is named differently to prevent mixups) but is automatically added to any class that inherits from cuml.Base (similar to other CumlArray decorators that are automatically added to Base class). When the descriptor is automatically added, it assumes all arguments should be keyword-only (to catch accidentally forgetting to specify keyword only arguments). This can easily be overridden by manually adding the decorator if some arguments need to be positional and some keyword. For example, the OneVsRestClassifier estimator takes a positional estimator to wrap so the __init__ function looks like:

@_deprecate_pos_args(version="0.20")
def __init__(self,
             estimator,
             *args,
             handle=None,
             verbose=False,
             output_type=None):

Notes:

  1. This only updates __init__ functions but could be applied to any of the public functions in the API.
  2. No tests were added because validation is performed on module load and will generate an assert error if developers have used it incorrectly

@mdemoret-nv mdemoret-nv added 2 - In Progress Currenty a work in progress Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Mar 19, 2021
@mdemoret-nv mdemoret-nv added this to PR-WIP in v0.19 Release via automation Mar 19, 2021
@mdemoret-nv mdemoret-nv marked this pull request as ready for review March 26, 2021 21:22
@mdemoret-nv mdemoret-nv requested a review from a team as a code owner March 26, 2021 21:22
@mdemoret-nv mdemoret-nv added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currenty a work in progress labels Mar 26, 2021
@JohnZed JohnZed self-assigned this Mar 29, 2021
Copy link
Contributor

@JohnZed JohnZed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very cool! I had only minor questions and comments, can wrap up quickly. If this is ready to go, can you remove the WIP label?

@@ -609,3 +613,70 @@ def sparse_scipy_to_cp(sp, dtype):
v = cp.asarray(values, dtype=dtype)

return cupyx.scipy.sparse.coo_matrix((v, (r, c)), sp.shape)


class _deprecate_pos_args:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you benchmark a class initialization with and without this change? (just %timeit x = Foo() in Ipython) I think it'll be very small but just want to make sure there is not a meaningful cost to the tricks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are the numbers. Before this PR:

>>> timeit.repeat('cuml.DBSCAN()', 'import cuml', number=100000)
[1.6759078549985134, 0.41760462100137374, 0.4198731740034418, 0.4204189170013706, 0.4161926310007402]

After this PR:

>>> timeit.repeat('cuml.DBSCAN()', 'import cuml', number=100000)
[2.557721577999473, 0.7288761550007621, 0.7315405129993451, 0.7273198049988423, 0.7297004659994855]

I was surprised that the impact would be so big (75% longer) but at the same time this is only adding 3 usec per instantiation which is incredibly small. I'm not sure what we could change in the function to make it run faster.

python/cuml/cluster/kmeans.pyx Show resolved Hide resolved
python/cuml/common/type_utils.py Outdated Show resolved Hide resolved
python/cuml/internals/base_helpers.py Outdated Show resolved Hide resolved
python/cuml/internals/base_helpers.py Outdated Show resolved Hide resolved
@mdemoret-nv mdemoret-nv changed the title [WIP] Converting all Estimator Constructors to Keyword Arguments [REVIEW] Converting all Estimator Constructors to Keyword Arguments Mar 30, 2021
Copy link
Member

@dantegd dantegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just had a quick comment

python/cuml/internals/base_helpers.py Outdated Show resolved Hide resolved
v0.19 Release automation moved this from PR-WIP to PR-Needs review Mar 30, 2021
@codecov-io
Copy link

Codecov Report

Merging #3636 (8c110a0) into branch-0.19 (fd9ec89) will increase coverage by 1.67%.
The diff coverage is 86.22%.

Impacted file tree graph

@@               Coverage Diff               @@
##           branch-0.19    #3636      +/-   ##
===============================================
+ Coverage        80.70%   82.38%   +1.67%     
===============================================
  Files              227      227              
  Lines            17615    17652      +37     
===============================================
+ Hits             14217    14542     +325     
+ Misses            3398     3110     -288     
Flag Coverage Δ
dask 46.45% <0.59%> (+1.46%) ⬆️
non-dask 74.40% <86.22%> (+1.49%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...cuml/_thirdparty/sklearn/preprocessing/__init__.py 100.00% <ø> (ø)
...l/_thirdparty/sklearn/preprocessing/_imputation.py 64.54% <ø> (+1.74%) ⬆️
python/cuml/_thirdparty/sklearn/utils/extmath.py 56.89% <ø> (ø)
...cuml/_thirdparty/sklearn/utils/skl_dependencies.py 80.00% <ø> (+26.07%) ⬆️
...ython/cuml/_thirdparty/sklearn/utils/validation.py 18.41% <ø> (-4.04%) ⬇️
python/cuml/cluster/__init__.py 100.00% <ø> (ø)
python/cuml/cluster/agglomerative.pyx 96.47% <ø> (ø)
python/cuml/cluster/dbscan.pyx 98.23% <ø> (-1.77%) ⬇️
python/cuml/cluster/dbscan_mg.pyx 100.00% <ø> (ø)
python/cuml/cluster/kmeans.pyx 92.03% <ø> (+0.07%) ⬆️
... and 169 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 668aea1...8c110a0. Read the comment docs.

@mdemoret-nv
Copy link
Contributor Author

@dantegd and @JohnZed I've incorporated your feedback into this PR and merged upstream. However, with the changes to the BaseEstimator class in preprocessing, the number of updates I had to make was pretty large. Probably warrants a quick re-review to make sure I didnt miss anything.

To address both of your comments about the local import, I've moved the _deprecate_pos_args class to cuml.internal. So this local import isnt needed anymore.

Copy link
Contributor

@JohnZed JohnZed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I like the new namings and appreciate the move to internals. I'll wait a bit before merging in case @dantegd wants to take a look at the changes

python/cuml/_thirdparty/sklearn/utils/validation.py Outdated Show resolved Hide resolved
v0.19 Release automation moved this from PR-Needs review to PR-Reviewer approved Mar 31, 2021
@JohnZed
Copy link
Contributor

JohnZed commented Apr 1, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit fb088d9 into rapidsai:branch-0.19 Apr 1, 2021
v0.19 Release automation moved this from PR-Reviewer approved to Done Apr 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team Cython / Python Cython or Python issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
No open projects
v0.19 Release
  
Done
Development

Successfully merging this pull request may close these issues.

[FEA] Deprecate positional arguments for initializers as skl does
4 participants