SPARK-10759 [MLlib] Add python example to model selection via cross-validation #11202
It's better to reuse current cross validator with
Refer to #11126 and JIRA https://issues.apache.org/jira/browse/SPARK-11337
yinxusen, I have a PR here (#11240) that fixes just this example using {% include_example %}, though accepting it would make this example inconsistent with the rest of the examples in this doc. I'm also going to create a branch that converts all of the examples to {% include_example %}, so that they are consistent with one another and benefit from the automatic testing and DRYer code. Would you prefer that I submit that PR here, or create a new JIRA covering the conversion of all of them?
yinxusen, I went ahead and changed all of the examples in this doc to use {% include_example %} and updated the Python cross-validation example. Let me know if it's acceptable to make all of the changes within this JIRA, or if you'd prefer they be separated. I also noticed that a JIRA already exists (https://issues.apache.org/jira/browse/SPARK-13012) for converting the original examples to use {% include_example %}. Its PR makes the same changes that are here, except that it does not include the change to the Python cross-validation example that this JIRA is for.
ok to test
Test build #51791 has finished for PR 11202 at commit
@JeremyNixon Could you merge the PR with master?
@yinxusen Will do. But I assume you mean changing the commits to make this consistent with https://issues.apache.org/jira/browse/SPARK-13012, which was merged yesterday?
Yes. Can we close the issue, since you have #11240?
@yinxusen Two things. First, I finished an implementation of train-validation-split so that we could finish the examples for this doc; it is here: master...JeremyNixon:train_val_split_python. I'm not sure whether it belongs in this PR or should be done separately. Second, #11240 requires one small update, which I can push now.
@JeremyNixon Please open a PR for the train-validation-split in PySpark under JIRA https://issues.apache.org/jira/browse/SPARK-12877. You can add the train-validation-split example along with that JIRA. I think we can merge #11240 first.
@JeremyNixon Is this PR still alive?
Adds a Python example in the same form as the Scala and Java examples. It rendered successfully on my local server and ran through pyspark; the results match those from the Scala example.
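For readers unfamiliar with the technique the example demonstrates, here is a minimal, Spark-free sketch of model selection via k-fold cross-validation over a parameter grid. It is not the code from this PR and does not use the MLlib API. The helper names (`k_fold_indices`, `fit_ridge_1d`, `cross_validate`), the toy one-dimensional ridge model, and the data are all assumptions made for illustration.

```python
import itertools
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle row indices 0..n-1 and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_ridge_1d(xs, ys, reg):
    """Toy model (hypothetical): 1-D ridge regression without intercept,
    w = sum(x*y) / (sum(x*x) + reg)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + reg)


def mse(w, xs, ys):
    """Mean squared error of the prediction w * x against y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validate(xs, ys, param_grid, k=3, seed=0):
    """Evaluate every parameter combination with k-fold cross-validation
    and return (best_params, best_error, all_results)."""
    folds = k_fold_indices(len(xs), k, seed)
    names = sorted(param_grid)
    results = []
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        fold_errors = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(len(xs)) if i not in held]
            # Fit on k-1 folds, score on the held-out fold.
            w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], **params)
            fold_errors.append(mse(w, [xs[i] for i in held_out], [ys[i] for i in held_out]))
        results.append((sum(fold_errors) / k, params))
    best_error, best_params = min(results, key=lambda r: r[0])
    return best_params, best_error, results


# Toy data: y = 2 * x exactly, so the unregularized fit should win.
xs = [float(x) for x in range(1, 13)]
ys = [2.0 * x for x in xs]
best_params, best_error, results = cross_validate(xs, ys, {"reg": [0.0, 10.0, 100.0]}, k=3)
print(best_params, best_error)
```

The structure is a grid of candidate parameters, a fit on k-1 folds, a score on the held-out fold, and an average across folds to pick the winner. A Spark cross-validation example follows the same general pattern, with an estimator, a parameter grid, and an evaluator taking the place of the toy pieces here.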