SPARK-10759 [MLlib] Add python example to model selection via cross-validation #11202

JeremyNixon · 2016-02-14T19:31:22Z

Adds a Python example in the same form as the Scala and Java examples. It rendered successfully on my local docs server and ran through pyspark; the results match those from the Scala example.

sethah · 2016-02-17T04:34:29Z

One question is whether we should be using {% include_example %} when adding new examples to the documentation. We could separate it out into different PRs, but then we are duplicating code (this example already exists here).

@yinxusen could you advise?

yinxusen · 2016-02-17T05:54:04Z

It's better to reuse the existing cross-validator example with {% include_example %} rather than writing it directly in the markdown file.
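For readers unfamiliar with {% include_example %}: the tag pulls code out of a file in the examples directory instead of duplicating it in the markdown. A minimal Python sketch of that idea follows, assuming the `$example on$` / `$example off$` marker comments used in Spark's example sources. The function name `extract_example` and the sample text are hypothetical, and this is an illustration only, not the actual Jekyll plugin (which is written in Ruby).

```python
# Marker comments assumed to delimit the part of an example file shown in the docs.
ON_MARKER = "$example on$"
OFF_MARKER = "$example off$"


def extract_example(source: str) -> str:
    """Return only the lines between on/off markers, dropping the marker lines."""
    kept = []
    inside = False
    for line in source.splitlines():
        if ON_MARKER in line:
            inside = True
            continue
        if OFF_MARKER in line:
            inside = False
            continue
        if inside:
            kept.append(line)
    return "\n".join(kept)


# Hypothetical example file: setup code outside the markers is hidden from the guide.
sample = """\
from __future__ import print_function
# $example on$
from pyspark.ml.tuning import CrossValidator
# $example off$
print("setup that the guide does not show")
# $example on$
cv = CrossValidator()
# $example off$
"""

snippet = extract_example(sample)
print(snippet)
```

Because the snippet shown in the guide comes from a real file, the same code can be run as an example program, which is the "automatic testing" benefit mentioned later in this thread.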

yinxusen · 2016-02-17T05:58:05Z

Refer to #11126 and JIRA https://issues.apache.org/jira/browse/SPARK-11337

JeremyNixon · 2016-02-17T15:39:06Z

yinxusen, I have a PR here (#11240) with the solution to just this example that takes advantage of {% include_example %}, though accepting it would make this example inconsistent with the rest of the examples in this doc.

I'm also going to go ahead and create a branch with all of the examples converted to {% include_example %}, to make them consistent with one another and to take advantage of the automatic testing and DRYer code. Would you prefer that I submit that PR here, or create a new JIRA stating that all are fixed?

JeremyNixon · 2016-02-17T16:56:53Z

yinxusen, I went ahead and changed all of the examples in this doc to use {% include_example %}, and also updated the Python cross-validation example. Let me know if it's acceptable to make all of the changes within this JIRA, or if you'd prefer they be separated.

I also notice that a JIRA already exists (https://issues.apache.org/jira/browse/SPARK-13012) for converting the original examples to use {% include_example %}. Its PR makes the same changes as this one, except that it does not include the Python cross-validation example that this JIRA is supposed to be for.

mengxr · 2016-02-23T20:52:01Z

ok to test

SparkQA · 2016-02-23T21:23:04Z

Test build #51791 has finished for PR 11202 at commit 14208ad.

This patch fails Scala style tests.
This patch does not merge cleanly.
This patch adds no public classes.

yinxusen · 2016-02-23T22:14:10Z

@JeremyNixon Could you merge this PR with master?

JeremyNixon · 2016-02-23T22:16:37Z

@yinxusen Will do. But I assume you mean change the commits to make this consistent with https://issues.apache.org/jira/browse/SPARK-13012 which was merged in yesterday?

yinxusen · 2016-02-23T23:07:00Z

Yes. Can we close the issue, since you have #11240?

JeremyNixon · 2016-02-23T23:11:19Z

@yinxusen Two things:

I finished an implementation of train-validation-split so that we could finish the examples for this doc here: master...JeremyNixon:train_val_split_python. I'm not sure whether it belongs in this PR or should be done separately.

#11240 requires one small update, which I can push now.

yinxusen · 2016-02-23T23:16:24Z

@JeremyNixon Please submit a PR for the train-validation-split in PySpark under JIRA https://issues.apache.org/jira/browse/SPARK-12877. You can add the train-validation-split along with that JIRA. I think we can merge #11240 first.

MLnick · 2016-04-15T20:07:57Z

@JeremyNixon is this PR still alive?

JeremyNixon · 2016-04-15T21:39:21Z

@MLnick This PR is dead; the work was completed by a combination of 230bbea and 02b1fef. Thanks for the heads up.

JeremyNixon force-pushed the add_py_ex_ml-guide branch from 5c5817c to c27712d Compare February 17, 2016 17:49

JeremyNixon added 5 commits February 17, 2016 10:43

add python example to model selection via cross-validation

19edc78

update all examples in ml-guide with include-example, as well as add …

76f3e63

…python cross validation example

update python cross validation with include_example

4ac974b

fix typo in the work select inside ml-guide

18fbe6a

change examples incorrectly using classification data for regression

14208ad

JeremyNixon force-pushed the add_py_ex_ml-guide branch from 42ccc13 to 14208ad Compare February 17, 2016 18:46

JeremyNixon closed this Apr 15, 2016
