Skip to content

Conversation

@robert3005
Copy link

What changes were proposed in this pull request?

Don't convert toRdd when doing toJSON

How was this patch tested?

Existing unit tests

@robert3005 robert3005 changed the title make toJSON not go through rdd form but operate on dataset always [SPARK-17209] make toJSON not go through rdd form but operate on dataset always Aug 11, 2016
@robert3005 robert3005 changed the title [SPARK-17209] make toJSON not go through rdd form but operate on dataset always [SPARK-17029] make toJSON not go through rdd form but operate on dataset always Aug 11, 2016
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add classdoc to explain what boundGen (and other ctor parameters) does.

@robert3005 robert3005 force-pushed the robertk/correct-tojson branch 4 times, most recently from def3cf0 to a2e8ebf Compare August 17, 2016 00:32
@robert3005
Copy link
Author

@rxin anything else? I added docs to the best of my understanding let me know if you meant something else.

@robert3005 robert3005 force-pushed the robertk/correct-tojson branch from a2e8ebf to 6f286c1 Compare September 3, 2016 02:43
@robert3005 robert3005 force-pushed the robertk/correct-tojson branch from 6f286c1 to 48342c2 Compare September 13, 2016 23:01
@ash211
Copy link
Contributor

ash211 commented Sep 20, 2016

This seems pretty reasonable, assuming test coverage already exists on that toJSON method.

Jenkins, this is ok to test.

@robert3005
Copy link
Author

@rxin any chance you or someone else can take a look?

@robert3005 robert3005 force-pushed the robertk/correct-tojson branch from 48342c2 to 0043190 Compare March 17, 2017 19:43
@ash211
Copy link
Contributor

ash211 commented Mar 17, 2017

Jenkins, this is ok to test.

@SparkQA
Copy link

SparkQA commented Mar 17, 2017

Test build #74755 has finished for PR 14615 at commit 0043190.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@ash211
Copy link
Contributor

ash211 commented Mar 17, 2017

@robert3005 looks like this has unit test failures on org.apache.spark.sql.hive.orc.OrcSourceSuite.SPARK-19459/SPARK-18220: read char/varchar column written by Hive -- is that a flake?

@ash211
Copy link
Contributor

ash211 commented Mar 17, 2017

Jenkins, this is ok to test.

@robert3005
Copy link
Author

It indeed does look like a flake.

@SparkQA
Copy link

SparkQA commented Mar 18, 2017

Test build #74759 has finished for PR 14615 at commit 0043190.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 19, 2017

Test build #74828 has finished for PR 14615 at commit 6250699.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@robert3005
Copy link
Author

ping? @rxin

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert this change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of importing it, we can explicitly pass the encoder.

mapPartitions {
// func
} (sparkSession.implicits.newStringEncoder)

@robert3005 robert3005 force-pushed the robertk/correct-tojson branch from 6250699 to 274be08 Compare May 10, 2017 20:24
@robert3005
Copy link
Author

thanks @gatorsmile, updated

@SparkQA
Copy link

SparkQA commented May 10, 2017

Test build #76757 has finished for PR 14615 at commit 274be08.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

}
import sparkSession.implicits.newStringEncoder
sparkSession.createDataset(rdd)
} (sparkSession.implicits.newStringEncoder)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Encoders.STRING

case _ => false
}

if (containsRDD.isDefined) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: assert(containsRDD.isEmpty, "xxx")

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented May 11, 2017

Test build #76766 has finished for PR 14615 at commit 723ef1e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 65accb8 May 11, 2017
lycplus pushed a commit to lycplus/spark that referenced this pull request May 24, 2017
…set always

## What changes were proposed in this pull request?

Don't convert toRdd when doing toJSON
## How was this patch tested?

Existing unit tests

Author: Robert Kruszewski <robertk@palantir.com>

Closes apache#14615 from robert3005/robertk/correct-tojson.
@robert3005 robert3005 deleted the robertk/correct-tojson branch December 6, 2019 12:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants