-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-10509][PYSPARK] Reduce excessive param boiler plate code #10216
[SPARK-10509][PYSPARK] Reduce excessive param boiler plate code #10216
Conversation
Test build #47408 has finished for PR 10216 at commit
|
cc @JoshRosen & @yanboliang who have probably had to deal with a fair amount of this boilerplate code in the past. |
ping @yanboliang if you have a chance to look at this would be appreciated. |
Test build #48492 has finished for PR 10216 at commit
|
re-ping @yanboliang if you have any thoughts on this approach. |
cc @jkbradley |
self.threshold = Param(self, "threshold", | ||
"Threshold in binary classification prediction, in range [0, 1]." + | ||
" If threshold and thresholds are both set, they must match.") | ||
self.threshold = LogisticRegression.threshold._copy_new_parent(self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LogisticRegression.threshold
is a placeholder to make the param appear in the generated doc. After this change, it has more uses, should we also change the annotation of that dummy param?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh good point - I'll remove the comments about them just being dummy params.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope we can eliminate the need for these calls in concrete classes' init methods.
@holdenk Sorry for late response, please see my inline comments. |
Test build #49031 has finished for PR 10216 at commit
|
Test build #49032 has finished for PR 10216 at commit
|
Test build #49176 has finished for PR 10216 at commit
|
Test build #49533 has finished for PR 10216 at commit
|
ping @jkbradley if you've got the time to look at this that would be awesome. |
…s and update a new param to also use the new copy syntax instead of duplicated text
Test build #50013 has finished for PR 10216 at commit
|
@holdenk Sorry for the delay. This looks great, but I was wondering if it could be improved even further: Would it be possible to add a method in the |
@jkbradley sounds interesting - would probably want to change the code generated params as well then to match but yah I'll take a crack at that this week. |
@holdenk I'm not quite sure what you mean, but I'll comment in a few places to make sure I make my idea clear. |
@@ -72,7 +72,6 @@ class LogisticRegression(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredicti | |||
.. versionadded:: 1.3.0 | |||
""" | |||
|
|||
# a placeholder to make it appear in the generated doc | |||
threshold = Param(Params._dummy(), "threshold", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can hopefully remain the same.
@jkbradley I mean there is also the shared params code gen, so we would want to update the generated params as well (right now the change only affects the manual params). |
Ohh, right! Thanks for remembering. Should I make another pass? |
Test build #50122 has finished for PR 10216 at commit
|
$dummyPlaceHolders | ||
|
||
def __init__(self): | ||
super(DecisionTreeParams, self).__init__() | ||
$realParams''' | ||
super(DecisionTreeParams, self).__init__()''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keep newline at end
Just a few small comments now |
…t and setters in tree params codegen, regen shared params
@jkbradley thanks addressed those small issues :) |
LGTM pending tests. |
I hate copy'paste code, so glad to be able to kill some (and not make more of making Python wrappers) :) |
Test build #50134 has finished for PR 10216 at commit
|
Merging with master Btw, I verified that the doc generates in the same way. |
The current python ml params require cut-and-pasting the param setup and description between the class &
__init__
methods. Remove this possible case of errors & simplify use of custom params by adding a_copy_new_parent
method to param so as to avoid cut and pasting (and cut and pasting at different indentation levels urgh).