[SPARK-8051] [MLLIB] make StringIndexerModel silent if input column does not exist #6595

mengxr · 2015-06-02T21:29:18Z

This is just a workaround to a bigger problem. Some pipeline stages may not be effective during prediction, and they should not complain about missing required columns, e.g. StringIndexerModel. @jkbradley
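The workaround described here can be illustrated with a minimal sketch. This is a toy model in plain Python, not the Spark API and not the code in this patch: `ToyStringIndexerModel`, the dict-based `Table` type, and the column names are all hypothetical stand-ins. It only shows the intended behavior, where the model returns its input unchanged when the input column is missing.

```python
from typing import Dict, List

# Toy stand-in for a DataFrame: column name -> list of values (assumption).
Table = Dict[str, List]


class ToyStringIndexerModel:
    """Hypothetical stand-in for StringIndexerModel; not the Spark API."""

    def __init__(self, input_col: str, output_col: str, labels: List[str]):
        self.input_col = input_col
        self.output_col = output_col
        # Map each label seen during fitting to a numeric index.
        self.label_to_index = {label: float(i) for i, label in enumerate(labels)}

    def transform(self, table: Table) -> Table:
        # Workaround behavior: if the input column is absent (for example,
        # no labels at prediction time), return the table unchanged.
        if self.input_col not in table:
            return table
        indexed = [self.label_to_index[v] for v in table[self.input_col]]
        return {**table, self.output_col: indexed}


model = ToyStringIndexerModel("label", "labelIndex", ["no", "yes"])

# Training-style data has the label column, so it is indexed.
with_labels = model.transform({"text": ["a", "b"], "label": ["yes", "no"]})

# Prediction-style data has no label column, so the model stays silent.
without_labels = model.transform({"text": ["c"]})
```

In this sketch the unlabeled table passes through untouched, which is what lets a fitted pipeline make predictions on data that has no labels.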

SparkQA · 2015-06-02T21:34:48Z

Test build #34023 has finished for PR 6595 at commit e112394.

This patch fails Scala style tests.
This patch merges cleanly.
This patch adds no public classes.

jkbradley · 2015-06-02T23:04:54Z

As long as this is aimed at 1.4.1 (not 1.4.0), should we design a better fix? My suggestion would be to have PipelineStage include a Param specifying whether to use it during transform(). Stages could be used in transform() by default, but certain Transformers could override the default to skip during transform(). PipelineModel could read the Param and handle each stage accordingly.

If that's too big a change for 1.4.1, then this temp fix seems tolerable.

Note: We should document the behavior in the docs.
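A rough sketch of the Param-based design suggested above, again as a plain-Python toy rather than Spark code. The names `ToyStage`, `ToyPipelineModel`, and `use_in_transform` are hypothetical; no such Param is confirmed to exist in Spark. It covers only the transform() side of the idea: each stage carries a flag, and the pipeline model skips stages whose flag is off.

```python
from typing import Callable, Dict, List

# Toy stand-in for a DataFrame: column name -> list of values (assumption).
Table = Dict[str, List]


class ToyStage:
    """Hypothetical pipeline stage with a flag read by the pipeline model."""

    def __init__(self, name: str, fn: Callable[[Table], Table],
                 use_in_transform: bool = True):
        self.name = name
        self.fn = fn
        # Stages are used in transform() by default; some may opt out.
        self.use_in_transform = use_in_transform


class ToyPipelineModel:
    """Hypothetical pipeline model that honors each stage's flag."""

    def __init__(self, stages: List[ToyStage]):
        self.stages = stages

    def transform(self, table: Table) -> Table:
        for stage in self.stages:
            if not stage.use_in_transform:
                continue  # skip stages that opted out of transform()
            table = stage.fn(table)
        return table


def index_label(table: Table) -> Table:
    # Raises KeyError if the "label" column is missing.
    return {**table, "labelIndex": [float(v == "yes") for v in table["label"]]}


def add_length(table: Table) -> Table:
    return {**table, "length": [len(t) for t in table["text"]]}


pipeline = ToyPipelineModel([
    ToyStage("indexLabel", index_label, use_in_transform=False),
    ToyStage("length", add_length),
])

# No "label" column here; the label stage is skipped rather than failing.
result = pipeline.transform({"text": ["abc", "de"]})
```

Compared with making a single model silent, this puts the decision in the pipeline rather than in each transformer, at the cost of a larger API change.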

SparkQA · 2015-06-02T23:50:44Z

Test build #34025 has finished for PR 6595 at commit 8ee7c7e.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

mengxr · 2015-06-03T06:12:13Z

I checked the transformers we have. Perhaps this is the only one that would operate on target labels, and it is blocking users from making predictions without labels. So it would be nice to merge this fix into branch-1.4, before 1.4.1 is out.

SparkQA · 2015-06-03T06:31:36Z

Test build #34069 has finished for PR 6595 at commit b6a36b9.

This patch fails MiMa tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- class ElementwiseProduct(val scalingVec: Vector) extends VectorTransformer

jkbradley · 2015-06-03T19:42:32Z

LGTM pending tests

jkbradley · 2015-06-03T19:42:35Z

test this please

SparkQA · 2015-06-03T21:35:40Z

Test build #34106 has finished for PR 6595 at commit b6a36b9.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
- class ElementwiseProduct(val scalingVec: Vector) extends VectorTransformer

jkbradley · 2015-06-03T22:16:03Z

Merging with master and branch-1.4

…oes not exist

This is just a workaround to a bigger problem. Some pipeline stages may not be effective during prediction, and they should not complain about missing required columns, e.g. `StringIndexerModel`. jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes #6595 from mengxr/SPARK-8051 and squashes the following commits:

b6a36b9 [Xiangrui Meng] add doc
f143fd4 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-8051
8ee7c7e [Xiangrui Meng] use SparkFunSuite
e112394 [Xiangrui Meng] make StringIndexerModel silent if input column does not exist

(cherry picked from commit 26c9d7a)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>

…oes not exist

This is just a workaround to a bigger problem. Some pipeline stages may not be effective during prediction, and they should not complain about missing required columns, e.g. `StringIndexerModel`. jkbradley

Author: Xiangrui Meng <meng@databricks.com>

Closes apache#6595 from mengxr/SPARK-8051 and squashes the following commits:

b6a36b9 [Xiangrui Meng] add doc
f143fd4 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-8051
8ee7c7e [Xiangrui Meng] use SparkFunSuite
e112394 [Xiangrui Meng] make StringIndexerModel silent if input column does not exist

make StringIndexerModel silent if input column does not exist

e112394

use SparkFunSuite

8ee7c7e

mengxr added 2 commits June 2, 2015 23:00

Merge remote-tracking branch 'apache/master' into SPARK-8051

f143fd4

add doc

b6a36b9

asfgit closed this in 26c9d7a Jun 3, 2015
