[SPARK-22489][DOC][FOLLOWUP] Update broadcast behavior changes in migration section #19858

wangyum · 2017-12-01T09:07:45Z

What changes were proposed in this pull request?

Update broadcast behavior changes in migration section.

How was this patch tested?

N/A

SparkQA · 2017-12-01T09:24:57Z

Test build #84374 has finished for PR 19858 at commit 4fedff1.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

wangyum · 2017-12-01T19:40:14Z

cc @gatorsmile

gatorsmile · 2017-12-01T21:34:28Z

docs/sql-programming-guide.md

@@ -1776,6 +1776,8 @@ options.
    Note that, for <b>DecimalType(38,0)*</b>, the table above intentionally does not cover all other combinations of scales and precisions because currently we only infer decimal type like `BigInteger`/`BigInt`. For example, 1.1 is inferred as double type.
  - In PySpark, now we need Pandas 0.19.2 or upper if you want to use Pandas related functionalities, such as `toPandas`, `createDataFrame` from Pandas DataFrame, etc.
  - In PySpark, the behavior of timestamp values for Pandas related functionalities was changed to respect session timezone. If you want to use the old behavior, you need to set a configuration `spark.sql.execution.pandas.respectSessionTimeZone` to `False`. See [SPARK-22395](https://issues.apache.org/jira/browse/SPARK-22395) for details.
+
+ - Since Spark 2.3, broadcast behaviour changed to broadcast the join side with an explicit broadcast hint first. See [SPARK-22489](https://issues.apache.org/jira/browse/SPARK-22489) for details.


Since Spark 2.3, when either broadcast hash join or broadcast nested loop join is applicable, we prefer to broadcasting the table that is explicitly specified in a broadcast hint. For details, see the section [Broadcast Hint](#broadcast-hint-for-sql-queries) and [SPARK-22489](https://issues.apache.org/jira/browse/SPARK-22489) for details.
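To make the proposed wording concrete, here is a toy model of the selection rule in plain Python. It is only a sketch, not Spark's planner code: `JoinSide` and `choose_broadcast_side` are hypothetical names, the model assumes a broadcast join is already applicable, and the fallback to the smaller side when both or neither side carries a hint is an assumption for illustration rather than something stated in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinSide:
    """Hypothetical stand-in for one side of a join (not a Spark class)."""
    name: str
    size_bytes: int
    has_broadcast_hint: bool = False


def choose_broadcast_side(left: JoinSide, right: JoinSide) -> JoinSide:
    """Pick the side to broadcast, assuming a broadcast join is applicable."""
    # Exactly one side is hinted: prefer the explicitly hinted side,
    # even if it is the larger one.
    if left.has_broadcast_hint != right.has_broadcast_hint:
        return left if left.has_broadcast_hint else right
    # Both or neither hinted: fall back to the smaller side.
    # This tie-break is an assumption made for this sketch.
    return left if left.size_bytes <= right.size_bytes else right


orders = JoinSide("orders", size_bytes=10_000_000, has_broadcast_hint=True)
countries = JoinSide("countries", size_bytes=1_000)
print(choose_broadcast_side(orders, countries).name)  # prints "orders"
```

In this model the hinted `orders` side wins although `countries` is far smaller, which is the behavior change the migration note is meant to flag.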

SparkQA · 2017-12-02T00:44:35Z

Test build #84383 has finished for PR 19858 at commit 76148f5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2017-12-02T07:38:41Z

docs/sql-programming-guide.md

@@ -1776,6 +1776,8 @@ options.
    Note that, for <b>DecimalType(38,0)*</b>, the table above intentionally does not cover all other combinations of scales and precisions because currently we only infer decimal type like `BigInteger`/`BigInt`. For example, 1.1 is inferred as double type.
  - In PySpark, now we need Pandas 0.19.2 or upper if you want to use Pandas related functionalities, such as `toPandas`, `createDataFrame` from Pandas DataFrame, etc.
  - In PySpark, the behavior of timestamp values for Pandas related functionalities was changed to respect session timezone. If you want to use the old behavior, you need to set a configuration `spark.sql.execution.pandas.respectSessionTimeZone` to `False`. See [SPARK-22395](https://issues.apache.org/jira/browse/SPARK-22395) for details.
+
+ - Since Spark 2.3, when either broadcast hash join or broadcast nested loop join is applicable, we prefer to broadcasting the table that is explicitly specified in a broadcast hint. For details, see the section [Broadcast Hint](#broadcast-hint-for-sql-queries) and [SPARK-22489](https://issues.apache.org/jira/browse/SPARK-22489) for details.


Sorry, there is a duplicate "for details" in that sentence.

SparkQA · 2017-12-03T01:14:40Z

Test build #84395 has finished for PR 19858 at commit 069f8b6.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

gatorsmile · 2017-12-04T07:51:32Z

LGTM

gatorsmile · 2017-12-04T07:52:50Z

Thanks! Merged to master.

migration

4fedff1

gatorsmile reviewed Dec 1, 2017

Fix review comments

76148f5

gatorsmile reviewed Dec 2, 2017

Remove duplicate

069f8b6

asfgit closed this in 4131ad0 Dec 4, 2017

wangyum deleted the SPARK-22489-migration branch December 4, 2017 08:31

GulajavaMinistudio mentioned this pull request Dec 4, 2017

Update upstream GulajavaMinistudio/spark#231

Merged
