[SPARK-7388][SPARK-7383] wrapper for VectorAssembler in Python #5930

Closed
wants to merge 6 commits into from

@brkyvz brkyvz commented May 6, 2015

The wrapper required implementing `ArrayParam`, because `Array[T]` is hard to obtain from Python. `ArrayParam` has an extra internal function, `wCast`, that obtains `Array[T]` from `Seq[T]`.

SparkQA commented May 6, 2015

Test build #31938 has started for PR 5930 at commit 7f7ea2a.

SparkQA commented May 6, 2015

Test build #31938 has finished for PR 5930 at commit 7f7ea2a.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ArrayParam[T : ClassTag](parent: Params, name: String, doc: String, isValid: Array[T] => Boolean)
    • class VectorAssembler(JavaTransformer, HasInputCols, HasOutputCol):
    • class HasInputCols(Params):

@AmplabJenkins

Merged build finished. Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31938/

SparkQA commented May 6, 2015

Test build #31943 has started for PR 5930 at commit 39ecb07.

SparkQA commented May 6, 2015

Test build #31944 has started for PR 5930 at commit 99c2ebf.

SparkQA commented May 6, 2015

Test build #31943 has finished for PR 5930 at commit 39ecb07.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ArrayParam[T : ClassTag](
    • class VectorAssembler(JavaTransformer, HasInputCols, HasOutputCol):
    • class HasInputCols(Params):

@AmplabJenkins

Merged build finished. Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31943/

SparkQA commented May 6, 2015

Test build #31944 has finished for PR 5930 at commit 99c2ebf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31944/

@@ -218,6 +219,22 @@ class BooleanParam(parent: Params, name: String, doc: String) // No need for isV
  override def w(value: Boolean): ParamPair[Boolean] = super.w(value)
}

/** Specialized version of [[Param[Array[T]]]] for Java. */
class ArrayParam[T : ClassTag](

Having ClassTag is not Java friendly. Array[T] will be translated into Object in Java to handle both primitive arrays and object arrays. How about adding StringArrayParam instead of ArrayParam[T] in this PR?
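
To illustrate the concern, here is a minimal sketch. It is not the PR's code: `GenericArrayHolder` and `ClassTagDemo` are hypothetical names standing in for `ArrayParam[T]`. A context bound such as `T : ClassTag` compiles to an extra implicit constructor argument, which the Scala compiler fills in automatically but which a Java caller would have to pass by hand.

```scala
import scala.reflect.ClassTag

// Hypothetical stand-in for ArrayParam[T]; not Spark code.
// `T : ClassTag` is shorthand for a second, implicit parameter list:
//   class GenericArrayHolder[T](val size: Int)(implicit tag: ClassTag[T])
class GenericArrayHolder[T : ClassTag](val size: Int) {
  // The ClassTag tells the runtime which array type to allocate
  // (for example int[] versus String[]).
  val values: Array[T] = new Array[T](size)
}

object ClassTagDemo {
  // Scala callers: the compiler supplies the ClassTag implicitly.
  val ints = new GenericArrayHolder[Int](3)
  val strings = new GenericArrayHolder[String](3)

  // Roughly what a Java caller has to spell out: the tag passed explicitly.
  val explicit =
    new GenericArrayHolder[String](3)(ClassTag[String](classOf[String]))
}
```

A non-generic class such as the suggested `StringArrayParam` avoids the hidden argument entirely, because the element type is fixed at `String`.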

SparkQA commented May 6, 2015

Test build #31995 has started for PR 5930 at commit c81072d.

@@ -22,6 +22,7 @@ import java.util.NoSuchElementException

import scala.annotation.varargs
import scala.collection.mutable
import scala.reflect.ClassTag

Remove this line.

SparkQA commented May 6, 2015

Test build #31995 has finished for PR 5930 at commit c81072d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class StringArrayParam(parent: Params, name: String, doc: String, isValid: Array[String] => Boolean)
    • class VectorAssembler(JavaTransformer, HasInputCols, HasOutputCol):
    • class HasInputCols(Params):

@AmplabJenkins

Merged build finished. Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31995/

mengxr commented May 7, 2015

@brkyvz We can overload w and let py4j decide which w to call via reflection. I sent you a PR at brkyvz#3.

overload StringArrayParam.w
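
The overloading idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `Param` and `ParamPair` here are simplified stand-ins for Spark's classes (the real ones also take a parent and a validator), and the sketch assumes the Python list arrives on the JVM side as a `java.util.List`.

```scala
// Simplified stand-ins for Spark's Param / ParamPair; not the real classes.
case class ParamPair[T](param: Param[T], value: T)

class Param[T](val name: String, val doc: String) {
  /** Pairs this param with a value. */
  def w(value: T): ParamPair[T] = ParamPair(this, value)
}

/** Non-generic array param: no ClassTag needed, so Java can construct it. */
class StringArrayParam(name: String, doc: String)
    extends Param[Array[String]](name, doc) {

  // Native signature, used by Scala and Java callers holding a String[].
  override def w(value: Array[String]): ParamPair[Array[String]] =
    super.w(value)

  // Overload for callers that hold a java.util.List instead of an array
  // (assumed here to be the form a Python list takes after conversion).
  def w(value: java.util.List[String]): ParamPair[Array[String]] =
    w(value.toArray(new Array[String](value.size)))
}
```

With both signatures present, a call such as `inputCols.w(java.util.Arrays.asList("a", "b"))` resolves to the list overload and a call with an `Array[String]` resolves to the original, so the conversion no longer has to happen on the Python side.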

SparkQA commented May 7, 2015

Test build #32118 has started for PR 5930 at commit 73e745f.

SparkQA commented May 7, 2015

Test build #32118 has finished for PR 5930 at commit 73e745f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class StringArrayParam(parent: Params, name: String, doc: String, isValid: Array[String] => Boolean)
    • class VectorAssembler(JavaTransformer, HasInputCols, HasOutputCol):
    • class HasInputCols(Params):

@AmplabJenkins

Merged build finished. Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32118/

asfgit pushed a commit that referenced this pull request May 7, 2015
The wrapper required the implementation of the `ArrayParam`, because `Array[T]` is hard to obtain from Python. `ArrayParam` has an extra function called `wCast` which is an internal function to obtain `Array[T]` from `Seq[T]`

Author: Burak Yavuz <brkyvz@gmail.com>
Author: Xiangrui Meng <meng@databricks.com>

Closes #5930 from brkyvz/ml-feat and squashes the following commits:

73e745f [Burak Yavuz] Merge pull request #3 from mengxr/SPARK-7388
c221db9 [Xiangrui Meng] overload StringArrayParam.w
c81072d [Burak Yavuz] addressed comments
99c2ebf [Burak Yavuz] add to python_shared_params
39ecb07 [Burak Yavuz] fix scalastyle
7f7ea2a [Burak Yavuz] [SPARK-7388][SPARK-7383] wrapper for VectorAssembler in Python

(cherry picked from commit 9e2ffb1)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
@asfgit asfgit closed this in 9e2ffb1 May 7, 2015

mengxr commented May 7, 2015

LGTM. Merged into master and branch-1.4. Thanks! I left SPARK-7388 open.
