
[SPARK-29188][PYTHON][FOLLOW-UP] Explicitly disable Arrow execution for the test of toPandas empty types #27247

Closed
HyukjinKwon wants to merge 1 commit into master from SPARK-29188-followup

Conversation

HyukjinKwon (Member)

What changes were proposed in this pull request?

This PR proposes to explicitly disable Arrow execution for the test of toPandas empty types. When spark.sql.execution.arrow.pyspark.enabled is enabled by default, this test alone fails as below:

======================================================================
ERROR [0.205s]: test_to_pandas_from_empty_dataframe (pyspark.sql.tests.test_dataframe.DataFrameTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/.../pyspark/sql/tests/test_dataframe.py", line 568, in test_to_pandas_from_empty_dataframe
    self.assertTrue(np.all(dtypes_when_empty_df == dtypes_when_nonempty_df))
AssertionError: False is not true
----------------------------------------------------------------------

It is best to explicitly disable Arrow execution for this test, since it only passes when Arrow is disabled.
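
As a rough standalone sketch (not the PR's test code; the local session setup and the column expression below are assumptions for illustration), the dependence on the conf can be seen by toggling it around a toPandas() call on an empty DataFrame:

# Minimal sketch, not the PR's test: toggle the Arrow conf around toPandas()
# on an empty DataFrame and compare the resulting dtypes.
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[1]").getOrCreate()
empty_df = spark.range(1).selectExpr("CAST(id AS string) AS s", "id").filter("false")

spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false")
print(empty_df.toPandas().dtypes)  # non-Arrow path; dtypes fixed by SPARK-29188

spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
print(empty_df.toPandas().dtypes)  # may differ here, hence pinning the conf in the test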

Why are the changes needed?

To make the test independent of the configuration's default value.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manually tested, and Jenkins should also test it.

# SPARK-29188 test that toPandas() on an empty dataframe has the correct dtypes
with self.sql_conf({"spark.sql.execution.arrow.pyspark.enabled": False}):
    dtypes_when_empty_df = self.spark.sql(sql).filter("False").toPandas().dtypes
    self.assertTrue(np.all(dtypes_when_empty_df == dtypes_when_nonempty_df))
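
For context, self.sql_conf here is the PySpark test utilities' context manager that temporarily applies the given SQL confs and restores the previous values on exit, so only this test is pinned to the non-Arrow path while the rest of the suite keeps the session default.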
Member:
Oh, do we need to re-open SPARK-29188 then because we know that toPandas() will fail when spark.sql.execution.arrow.pyspark.enabled=True?

Member Author (HyukjinKwon):
hmmm .. I will just open another one just to make the management simpler.

Member:
+1.

@SparkQA commented Jan 17, 2020

Test build #116888 has finished for PR 27247 at commit 507b625.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun (Member) left a comment:

+1, LGTM. Merged to master. Thank you, @HyukjinKwon.

@HyukjinKwon (Member Author):

Thank you, @dongjoon-hyun!

@HyukjinKwon deleted the SPARK-29188-followup branch on March 3, 2020 01:16