[SPARK-13032] [ML] [PySpark] PySpark support model export/import and take LinearRegression as example #10469
Test build #48300 has finished for PR 10469 at commit
@yanboliang I'll take a look at this now. Sorry for the delay!
True
>>> abs(model.intercept - model2.intercept) < 0.001
True
>>> model_path = path + "/model"
Use directory "/lr_model"?
I just added some comments quickly, but let me know if my suggestions are workable. I did not test the suggestions myself.
Test build #49987 has finished for PR 10469 at commit
@yanboliang Thanks for the updates; I'll try to make final comments soon. I left one response in one of the threads above.
Test build #50099 has finished for PR 10469 at commit
@@ -159,15 +151,16 @@ class JavaModel(Model, JavaTransformer):

     __metaclass__ = ABCMeta

-    def __init__(self, java_model):
+    def __init__(self, java_model=None):
Unify the construction of Model and Estimator. Model can now be instantiated without arguments, which is what load relies on.
Can you add a note in the doc to explain this?
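For readers following this thread, here is a minimal sketch of why a default of None on the constructor helps load. It is not the code in this PR: ToyModel, its _backend attribute, and the dict-based state are all hypothetical stand-ins for the Java-backed wrapper.

```python
class ToyModel(object):
    """Hypothetical stand-in for a wrapper model; not PySpark's JavaModel."""

    def __init__(self, backend=None):
        # Defaulting to None lets load() build an empty instance first
        # and attach the restored state afterwards.
        self._backend = backend

    @classmethod
    def load(cls, state):
        instance = cls()            # no-argument construction
        instance._backend = state   # attach the restored state
        return instance

    @property
    def intercept(self):
        if self._backend is None:
            raise ValueError("model has no backing state yet")
        return self._backend["intercept"]


fitted = ToyModel({"intercept": 0.5})
restored = ToyModel.load({"intercept": 0.5})
assert abs(fitted.intercept - restored.intercept) < 0.001
```

The guard in intercept is one possible way to document the "constructed but not yet loaded" state that the note in the doc would describe.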
Thanks for the updates! Done with a pass. They are mostly minor comments, except for making MLReadable, MLWritable more general and not specific to Java wrappers.
@jkbradley Thanks for your comments! I have made
Test build #50182 has finished for PR 10469 at commit
HI Regards.
@Wenpei This isn't the right forum for posting comments like that (since hardly anyone will see your comment). I'd recommend identifying missing unit tests and making JIRAs for them. We do have tests.py files under pyspark/ml and pyspark/mllib, so please do check those first before making JIRAs.
@yanboliang I hope you don't mind, but I took the liberty of experimenting a bit myself and sending this PR: [https://github.com/yanboliang/pull/4] Please let me know what you think! Btw, thanks for the generalization-related updates. I guess we'd have to go further (providing MLWriter, MLReader abstract classes) if we wanted to allow Python developers to implement persistence from Python, but we can address that in the future (if anyone requests it). Your changes should help us work towards that though.
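To make the "go further" idea concrete, here is a rough sketch of what abstract MLWriter/MLReader classes with pure-Python persistence could look like. Only the four class names come from this thread; the method signatures, the JSON file layout, and the JsonWriter, JsonReader, and ToyRegressionModel classes are assumptions made for illustration, not what this PR implements.

```python
import json
import os
import tempfile
from abc import ABCMeta, abstractmethod


class MLWriter(metaclass=ABCMeta):
    """Assumed shape: knows how to persist one instance to a path."""

    @abstractmethod
    def save(self, path):
        raise NotImplementedError


class MLReader(metaclass=ABCMeta):
    """Assumed shape: knows how to rebuild an instance from a path."""

    @abstractmethod
    def load(self, path):
        raise NotImplementedError


class MLWritable(object):
    """Mixin: save(path) delegates to whatever writer write() returns."""

    def write(self):
        raise NotImplementedError

    def save(self, path):
        self.write().save(path)


class MLReadable(object):
    """Mixin: load(path) delegates to whatever reader read() returns."""

    @classmethod
    def read(cls):
        raise NotImplementedError

    @classmethod
    def load(cls, path):
        return cls.read().load(path)


class JsonWriter(MLWriter):
    # Hypothetical pure-Python writer; no Java wrapper involved.
    def __init__(self, instance):
        self._instance = instance

    def save(self, path):
        os.makedirs(path)  # raises if the directory already exists
        with open(os.path.join(path, "params.json"), "w") as f:
            json.dump(self._instance.params, f)


class JsonReader(MLReader):
    # Hypothetical pure-Python reader matching JsonWriter's layout.
    def __init__(self, cls):
        self._cls = cls

    def load(self, path):
        with open(os.path.join(path, "params.json")) as f:
            return self._cls(json.load(f))


class ToyRegressionModel(MLWritable, MLReadable):
    def __init__(self, params=None):
        self.params = params or {}

    def write(self):
        return JsonWriter(self)

    @classmethod
    def read(cls):
        return JsonReader(cls)


path = os.path.join(tempfile.mkdtemp(), "lr_model")
model = ToyRegressionModel({"intercept": 0.25, "coefficients": [1.0, -2.0]})
model.save(path)
model2 = ToyRegressionModel.load(path)
assert abs(model.params["intercept"] - model2.params["intercept"]) < 0.001
```

Keeping the mixins free of any Java-specific logic is the point of the sketch: a Java-backed writer and a pure-Python writer could both sit behind the same save/load entry points.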
Test build #50264 has finished for PR 10469 at commit
…d fixed current issues
f07ffcb to 7334be9
Test build #50267 has finished for PR 10469 at commit
@jkbradley Your PR looks good and has been merged, thanks!
LGTM
MLWriter/MLWritable/MLReader/MLReadable for PySpark. LinearRegression to support save/load as example. After this is merged, the work for other transformers/estimators will be easy; then we can list and distribute the tasks to the community. cc @mengxr @jkbradley