Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-10049][SPARKR] Support collecting data of ArraryType in DataFrame. #8458

Closed
wants to merge 10 commits into from

Conversation

sun-rui
Copy link
Contributor

@sun-rui sun-rui commented Aug 26, 2015

this PR :

  1. Enhance reflection in RBackend. Automatically matching a Java array to Scala Seq when finding methods. Util functions like seq(), listToSeq() in R side can be removed, as they will conflict with the Serde logic that transferrs a Scala seq to R side.
  2. Enhance the SerDe to support transferring a Scala seq to R side. Data of ArrayType in DataFrame
    after collection is observed to be of Scala Seq type.
  3. Support ArrayType in createDataFrame().

@SparkQA
Copy link

SparkQA commented Aug 26, 2015

Test build #41620 timed out for PR 8458 at commit 1f7ab95 after a configured wait of 175m.

@SparkQA
Copy link

SparkQA commented Aug 26, 2015

Test build #41621 has finished for PR 8458 at commit e3af422.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

cc @davies

@SparkQA
Copy link

SparkQA commented Aug 27, 2015

Test build #41678 has finished for PR 8458 at commit 9b5bd05.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • implicit class RegexContext(sc: StringContext)

@SparkQA
Copy link

SparkQA commented Aug 27, 2015

Test build #41683 has finished for PR 8458 at commit 02c64eb.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@sun-rui sun-rui changed the title [SPARK-10049][SPARKR][WIP] Support collecting data of ArraryType in DataFrame. [SPARK-10049][SPARKR] Support collecting data of ArraryType in DataFrame. Aug 28, 2015
@sun-rui
Copy link
Contributor Author

sun-rui commented Aug 28, 2015

@davies , @shivaram , Could you help to review it?

@SparkQA
Copy link

SparkQA commented Aug 28, 2015

Test build #41723 has finished for PR 8458 at commit 2bc97ad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

@sun-rui I'll take a look at this tomorrow

@@ -49,7 +49,7 @@ infer_type <- function(x) {
stopifnot(length(x) > 0)
names <- names(x)
if (is.null(names)) {
list(type = "array", elementType = infer_type(x[[1]]), containsNull = TRUE)
paste0("array<", infer_type(x[[1]]), ">")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify this is to support vectors of the form c(1, 2, 3) etc. ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for list. Next changed one is for vector.

@SparkQA
Copy link

SparkQA commented Sep 5, 2015

Test build #42033 has finished for PR 8458 at commit a1f4fcb.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 6, 2015

Test build #42065 has finished for PR 8458 at commit 1e223e0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val parameterTypes = parameterTypesOfMethods(index)

if (parameterTypes.length == numArgs) {
val convertedArgs = new Array[Object](numArgs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason we shouldn't modify args in place and make a copy ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arg is checked and converted one by one when matching a method. It is possible that after some args were converted, an arg fails, then all args have to be reverted to the original ones for matching next method.

It could be possible for now that checking all args for one method, and only after the checking passes, then do conversion, thus we can modify args in place.

I will refine the code.

@shivaram
Copy link
Contributor

shivaram commented Sep 8, 2015

@sun-rui Sorry for the delay - Code looks pretty good to me and I just one minor inline comment.

@davies Would be great if you could also take a look

@SparkQA
Copy link

SparkQA commented Sep 9, 2015

Test build #42188 has finished for PR 8458 at commit adde91f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Copy link
Contributor

davies commented Sep 9, 2015

@shivaram @sun-rui I will review this today, sorry for the late.

args: Array[Object]): Option[Int] = {
val numArgs = args.length

for (index <- 0 to parameterTypesOfMethods.length - 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 until xxx

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@davies
Copy link
Contributor

davies commented Sep 9, 2015

@sun-rui The changes looks good to me overall, just two minor comments, thanks!

@SparkQA
Copy link

SparkQA commented Sep 10, 2015

Test build #42228 has finished for PR 8458 at commit a7aa017.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 10, 2015

Test build #42252 has finished for PR 8458 at commit afad2c6.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented Sep 10, 2015

Test build #42272 has finished for PR 8458 at commit afad2c6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

Thanks @sun-rui -- LGTM. Merging this

@asfgit asfgit closed this in 45e3be5 Sep 10, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants