[SPARK-1994][SQL] Weird data corruption bug when running Spark SQL on data in HDFS #1004
Conversation
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15522/
test this please
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
@rxin I added https://issues.apache.org/jira/browse/SPARK-2068 to track other places where we need to fix this, but we should probably just merge this one right away.
How much does the closure size increase by?
Is there an easy way to measure that? Either way it was wrong before and I don't think making it possible to plan
I'm going to merge this. You can test this easily by looking at the log. Spark tells you the size of each task closure and how long it takes to serialize it in the info log.
Merged in master & branch-1.0. |
One reason we had to add @transient lazy val is the lack of an init method that runs on each partition for operators. I think there are benefits to adding one: it makes object initialization clear and explicit, and then you can probably avoid this problem.
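The @transient lazy val pitfall described above can be reproduced with plain Java serialization: the lazy val's cached value is dropped when the object is shipped, so each deserialized copy re-runs the initializer independently. A minimal sketch (the `Operator` class here is hypothetical, not actual Spark code):

```scala
import java.io._
import java.util.UUID

// Hypothetical operator: the @transient lazy val's cached value is dropped
// during serialization, so every deserialized copy re-runs the initializer.
class Operator extends Serializable {
  @transient lazy val exprId: String = UUID.randomUUID().toString
}

object TransientLazyDemo {
  // Serialize and deserialize an object with plain Java serialization,
  // standing in for shipping a task closure to a worker.
  def roundTrip[T <: Serializable](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    in.readObject().asInstanceOf[T]
  }

  def main(args: Array[String]): Unit = {
    val op = new Operator
    val worker1 = roundTrip(op)
    val worker2 = roundTrip(op)
    // Each copy generates its own id, so an assumption that the id is
    // globally unique breaks once the operator runs on multiple workers.
    println(worker1.exprId == worker2.exprId) // false (with overwhelming probability)
  }
}
```

An explicit per-partition init method, as suggested above, would make this re-initialization step visible instead of hiding it inside lazy val semantics.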
… data in HDFS

Basically there is a race condition (possibly a Scala bug?) when these values are recomputed on all of the slaves that results in an incorrect projection being generated (possibly because the GUID uniqueness contract is broken?).

In general we should probably enforce that all expression planning occurs on the driver, as is now occurring here.

Author: Michael Armbrust <michael@databricks.com>

Closes #1004 from marmbrus/fixAggBug and squashes the following commits:

e0c116c [Michael Armbrust] Compute aggregate expression during planning instead of lazily on workers.

(cherry picked from commit a6c72ab)
Signed-off-by: Reynold Xin <rxin@apache.org>
Basically there is a race condition (possibly a Scala bug?) when these values are recomputed on all of the slaves that results in an incorrect projection being generated (possibly because the GUID uniqueness contract is broken?).
In general we should probably enforce that all expression planning occurs on the driver, as is now occurring here.
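The shape of this fix can be sketched in isolation: replace a lazy val that every worker re-derives with a plain val that is materialized once on the driver during planning, so workers receive the finished value inside the task closure. The names below (`AggPlanBefore`, `AggPlanAfter`) are illustrative, not the actual Spark classes:

```scala
import java.io._
import scala.util.Random

// Illustrative "before": a @transient lazy val that every worker re-derives
// after deserialization -- any nondeterminism in the derivation diverges.
case class AggPlanBefore(rawExprs: Seq[String]) extends Serializable {
  @transient lazy val boundExprs: Seq[String] =
    rawExprs.map(e => s"$e#${Random.nextInt(1000000)}") // fresh ids per evaluation
}

// Illustrative "after": a plain val computed once, on the driver, during
// planning; the materialized result is serialized into the task closure.
case class AggPlanAfter(rawExprs: Seq[String]) extends Serializable {
  val boundExprs: Seq[String] =
    rawExprs.map(e => s"$e#${Random.nextInt(1000000)}")
}

object PlanDemo {
  // Plain Java serialization round trip, standing in for shipping a task.
  def roundTrip[T <: Serializable](obj: T): T = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(obj)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    in.readObject().asInstanceOf[T]
  }
}
```

With the "after" shape, every round-tripped copy agrees with the driver's `boundExprs`; with the "before" shape, each copy derives its own, which is exactly the inconsistent-projection symptom described above.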