[SPARK-34354][SQL] Fix failure when apply CostBasedJoinReorder on self-join #31470

Ngone51 · 2021-02-04T09:09:35Z

What changes were proposed in this pull request?

This PR introduces a new analysis rule, DeduplicateRelations, which first deduplicates any duplicate relations in a plan and then deduplicates conflicting attributes (reusing the dedupRight logic of ResolveReferences).
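To illustrate the idea of deduplicating relations, here is a minimal sketch on a toy plan model. The types `Attr`, `Plan`, `Relation`, `Join` and the object `DedupSketch` are hypothetical stand-ins, not Spark's classes, and the sketch does not rewrite attribute references in parent nodes, which a real rule would also need to do.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark's classes.
final case class Attr(name: String, id: Long)

sealed trait Plan
final case class Relation(table: String, output: Seq[Attr]) extends Plan
final case class Join(left: Plan, right: Plan) extends Plan

object DedupSketch {
  // All attribute ids produced by the relations in the plan.
  private def ids(plan: Plan): Seq[Long] = plan match {
    case Relation(_, out) => out.map(_.id)
    case Join(l, r)       => ids(l) ++ ids(r)
  }

  // Walks the plan left to right. `seen` holds ids already produced by earlier
  // relations; `next` is the next unused id. Returns the rewritten plan plus
  // the updated state.
  private def rewrite(plan: Plan, seen: Set[Long], next: Long): (Plan, Set[Long], Long) =
    plan match {
      case r @ Relation(_, out) if out.exists(a => seen.contains(a.id)) =>
        // Duplicate relation: give every output attribute a fresh id.
        val fresh = out.zipWithIndex.map { case (a, i) => a.copy(id = next + i) }
        (r.copy(output = fresh), seen ++ fresh.map(_.id), next + out.size)
      case r @ Relation(_, out) =>
        // First occurrence: keep it and remember its ids.
        (r, seen ++ out.map(_.id), next)
      case Join(l, r) =>
        val (l2, seen1, next1) = rewrite(l, seen, next)
        val (r2, seen2, next2) = rewrite(r, seen1, next1)
        (Join(l2, r2), seen2, next2)
    }

  // Entry point: fresh ids start just above the largest id already in use.
  def deduplicate(plan: Plan): Plan = {
    val all = ids(plan)
    val start = if (all.isEmpty) 0L else all.max + 1
    rewrite(plan, Set.empty, start)._1
  }
}
```

In this sketch, `DedupSketch.deduplicate(Join(t1, t1))` keeps the left `t1` unchanged and returns a right-hand copy whose attributes carry new ids, so the two sides of the self-join no longer share attributes.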

Why are the changes needed?

CostBasedJoinReorder can fail when applied to a self-join, e.g.,

// test in JoinReorderSuite
test("join reorder with self-join") {
  val plan = t2.join(t1, Inner, Some(nameToAttr("t1.k-1-2") === nameToAttr("t2.k-1-5")))
    .select(nameToAttr("t1.v-1-10"))
    .join(t2, Inner, Some(nameToAttr("t1.v-1-10") === nameToAttr("t2.k-1-5")))

  // this can fail
  Optimize.execute(plan.analyze)
}
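As a hedged illustration of the kind of conflict a self-join creates: when the same relation instance appears on both sides of a join, both sides expose identical attribute ids. The toy check below flags that situation. The names `AttrRef`, `Side` and `ConflictSketch` are hypothetical, not Spark's API, and the description above does not state whether this is the exact trigger of the CostBasedJoinReorder failure.

```scala
// Toy model only: hypothetical names, not Spark's API.
final case class AttrRef(name: String, exprId: Long)
final case class Side(label: String, output: Seq[AttrRef])

object ConflictSketch {
  // Ids that are produced by both sides of a join.
  // A non-empty result means the two sides cannot be told apart by id alone.
  def conflicting(left: Side, right: Side): Set[Long] =
    left.output.map(_.exprId).toSet.intersect(right.output.map(_.exprId).toSet)
}
```

With distinct relations the result is empty; joining a relation with itself returns all of its ids, which is the case the new rule is meant to eliminate during analysis.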

Besides, the new rule DeduplicateRelations enables some optimizations, e.g., LeftSemiAnti pushdown and redundant project removal, as reflected in the updated unit tests.

Does this PR introduce any user-facing change?

How was this patch tested?

Added and updated unit tests.

Ngone51 · 2021-02-04T09:09:51Z

cc @cloud-fan

SparkQA · 2021-02-04T10:08:05Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39451/

SparkQA · 2021-02-04T10:26:27Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39451/

SparkQA · 2021-02-04T11:17:38Z

Test build #134867 has finished for PR 31470 at commit c09ab12.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2021-02-04T14:54:34Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39462/

SparkQA · 2021-02-04T15:22:40Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39462/

SparkQA · 2021-02-04T18:51:59Z

Test build #134879 has finished for PR 31470 at commit 77170f6.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

Ngone51 · 2021-02-08T10:01:18Z

sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala

@@ -728,8 +728,6 @@ class PlannerSuite extends SharedSparkSession with AdaptiveSparkPlanHelper {
      case r: Range => r
    }
    assert(ranges.length == 2)
-    // Ensure the two Range instances are equal according to their equal method
-    assert(ranges.head == ranges.last)


This is no longer valid because DeduplicateRelations deduplicates them.
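A small sketch of why the removed assertion stops holding, using a hypothetical `RangeNode` case class rather than Spark's `Range`: once one of two otherwise identical nodes receives a fresh output id, plain equality fails, while a comparison that ignores the id still succeeds.

```scala
// Toy model only: RangeNode and EqualitySketch are hypothetical names.
final case class RangeNode(start: Long, end: Long, outputId: Long)

object EqualitySketch {
  // Compares two nodes while ignoring their output ids.
  def sameShape(a: RangeNode, b: RangeNode): Boolean =
    a.copy(outputId = 0L) == b.copy(outputId = 0L)
}
```

Here a node and its copy with a different `outputId` are unequal under `==` but equal under `sameShape`, which mirrors the situation after deduplication.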

Ngone51 · 2021-02-08T10:03:12Z

@cloud-fan @maropu @viirya Could you take a look? Thanks!

SparkQA · 2021-02-08T11:05:11Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39605/

SparkQA · 2021-02-08T11:32:16Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/39605/

SparkQA · 2021-02-08T14:50:23Z

Test build #135022 has finished for PR 31470 at commit f1b4d37.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

maropu · 2021-03-09T01:32:12Z

@Ngone51 Does this PR consist of the two parts below?

Bugfix: fixes a bug when applying join reorder with self-join
Improvement: refactors the logic to deduplicate relations for further optimizations

At least, we need to backport the former part into the previous branches, I think.

Ngone51 · 2021-03-09T15:41:52Z

> At least, we need to backport the former part into the previous branches, I think.

But both parts are implemented by a single rule, DeduplicateRelations, so I'm afraid we cannot split it and backport only part 1.

SparkQA · 2021-03-09T16:26:08Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40488/

SparkQA · 2021-03-09T17:00:56Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40489/

SparkQA · 2021-03-09T17:01:45Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40488/

SparkQA · 2021-03-09T17:05:16Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40489/

SparkQA · 2021-03-09T17:49:02Z

Test build #135905 has finished for PR 31470 at commit 73ade4e.

This patch fails Spark unit tests.
This patch does not merge cleanly.
This patch adds no public classes.

SparkQA · 2021-03-09T18:26:40Z

Test build #135906 has finished for PR 31470 at commit 85e73c6.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2021-03-11T15:53:31Z

Test build #135971 has finished for PR 31470 at commit 17b3866.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

maropu · 2021-03-25T14:15:30Z

Looks fine otherwise

Conflicts:
    sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q32/explain.txt
    sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q41.sf100/explain.txt
    sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q41/explain.txt
    sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q92/explain.txt

SparkQA · 2021-03-26T11:21:19Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41141/

SparkQA · 2021-03-26T11:32:50Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41141/

SparkQA · 2021-03-26T15:11:31Z

Test build #136557 has finished for PR 31470 at commit c6f9714.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2021-03-29T12:37:28Z

Test build #136652 has started for PR 31470 at commit f0c7ce4.

maropu · 2021-03-31T01:31:35Z

retest this please

SparkQA · 2021-03-31T02:29:07Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41323/

SparkQA · 2021-03-31T02:59:49Z

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41323/

SparkQA · 2021-03-31T06:24:39Z

Test build #136741 has finished for PR 31470 at commit f0c7ce4.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2021-03-31T06:28:33Z

thanks, merging to master!

HyukjinKwon · 2021-04-01T03:46:39Z

It seems it broke PySpark tests ... https://github.com/apache/spark/runs/2240195852

Ngone51 · 2021-04-01T03:49:36Z

Is the failure from the latest run? I ran the same failing test locally yesterday and it passed, so I retriggered the GA.

HyukjinKwon · 2021-04-01T03:50:08Z

I think the last trigger didn't pass in this PR:

HyukjinKwon · 2021-04-01T03:50:55Z

For a bit more context: ever since Jenkins was upgraded, pandas and related packages have not been installed there, so the pandas-related tests are not run in Jenkins. They only run on GA for now.

HyukjinKwon · 2021-04-01T03:53:26Z

I think we should install pandas / pyarrow on the Jenkins machines:

Skipped tests in pyspark.sql.tests.test_arrow with pypy3:
    test_createDataFrame_column_name_encoding (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_does_not_modify_input (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_empty_partition (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_fallback_disabled (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_fallback_enabled (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_respect_session_timezone (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_toggle (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_array_type (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_float_index (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_incorrect_schema (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_int_col_names (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_map_type (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_names (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_schema (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_single_data_type (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDateFrame_with_category_type (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_filtered_frame (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_no_partition_frame (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_no_partition_toPandas (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_null_conversion (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_pandas_round_trip (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_pandas_self_destruct (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_propagates_spark_exception (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_schema_conversion_roundtrip (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_timestamp_dst (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_timestamp_nat (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_arrow_toggle (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_batch_order (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_fallback_disabled (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_fallback_enabled (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_respect_session_timezone (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_with_array_type (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_with_map_type (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_with_map_type_nulls (pyspark.sql.tests.test_arrow.ArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_column_name_encoding (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_does_not_modify_input (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_empty_partition (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_fallback_disabled (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_fallback_enabled (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_respect_session_timezone (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_toggle (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_array_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_float_index (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_incorrect_schema (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_int_col_names (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_map_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_names (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_schema (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDataFrame_with_single_data_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_createDateFrame_with_category_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_filtered_frame (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_no_partition_frame (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_no_partition_toPandas (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_null_conversion (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_pandas_round_trip (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_pandas_self_destruct (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_propagates_spark_exception (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_schema_conversion_roundtrip (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_timestamp_dst (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_timestamp_nat (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_arrow_toggle (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_batch_order (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_fallback_disabled (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_fallback_enabled (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_respect_session_timezone (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_with_array_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_with_map_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_toPandas_with_map_type_nulls (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_exception_by_max_results (pyspark.sql.tests.test_arrow.MaxResultArrowTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_dataframe with pypy3:
    test_create_dataframe_from_pandas_with_dst (pyspark.sql.tests.test_dataframe.DataFrameTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_create_dataframe_from_pandas_with_timestamp (pyspark.sql.tests.test_dataframe.DataFrameTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_to_pandas (pyspark.sql.tests.test_dataframe.DataFrameTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_to_pandas_avoid_astype (pyspark.sql.tests.test_dataframe.DataFrameTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_to_pandas_from_empty_dataframe (pyspark.sql.tests.test_dataframe.DataFrameTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_to_pandas_from_mixed_dataframe (pyspark.sql.tests.test_dataframe.DataFrameTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_to_pandas_from_null_dataframe (pyspark.sql.tests.test_dataframe.DataFrameTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_to_pandas_on_cross_join (pyspark.sql.tests.test_dataframe.DataFrameTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_to_pandas_with_duplicated_column_names (pyspark.sql.tests.test_dataframe.DataFrameTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_query_execution_listener_on_collect_with_arrow (pyspark.sql.tests.test_dataframe.QueryExecutionListenerTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_pandas_cogrouped_map with pypy3:
    test_case_insensitive_grouping_column (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_complex_group_by (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_different_schemas (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_empty_group_by (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_left_group_empty (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_mixed_scalar_udfs_followed_by_cogrouby_apply (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_right_group_empty (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_self_join (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_simple (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_with_key_complex (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_with_key_left (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_with_key_left_group_empty (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_with_key_right (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_with_key_right_group_empty (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_wrong_args (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_wrong_return_type (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_pandas_grouped_map with pypy3:
    test_array_type_correct (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_case_insensitive_grouping_column (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_coerce (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_column_order (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_complex_groupby (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_datatype_string (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_decorator (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_empty_groupby (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_grouped_over_window (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_grouped_over_window_with_key (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_grouped_with_empty_partition (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_mixed_scalar_udfs_followed_by_groupby_apply (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_positional_assignment_conf (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_register_grouped_map_udf (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_self_join_with_pandas (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_supported_types (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_timestamp_dst (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_udf_with_key (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_unsupported_types (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_wrong_args (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_wrong_return_type (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_pandas_map with pypy3:
    test_chain_map_partitions_in_pandas (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_different_output_length (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_empty_iterator (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_empty_rows (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_map_partitions_in_pandas (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_multiple_columns (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_self_join (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_pandas_udf with pypy3:
    test_pandas_udf_arrow_overflow (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_pandas_udf_basic (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_pandas_udf_decorator (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_pandas_udf_detect_unsafe_type_conversion (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_stopiteration_in_udf (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_udf_wrong_arg (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_pandas_udf_grouped_agg with pypy3:
    test_alias (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_array_type (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_basic (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_complex_expressions (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_complex_groupby (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_grouped_with_empty_partition (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_grouped_without_group_by_clause (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_invalid_args (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_manual (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_mixed_sql (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_mixed_udfs (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_multiple_udfs (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_no_predicate_pushdown_through (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_register_vectorized_udf_basic (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_retain_group_columns (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_unsupported_types (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_pandas_udf_scalar with pypy3:
    test_datasource_with_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_mixed_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_mixed_udf_and_sql (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_nondeterministic_vectorized_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_nondeterministic_vectorized_udf_in_aggregate (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_pandas_udf_nested_arrays (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_pandas_udf_tokenize (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_register_nondeterministic_vectorized_udf_basic (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_register_vectorized_udf_basic (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_scalar_iter_udf_close (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_scalar_iter_udf_close_early (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_scalar_iter_udf_init (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_timestamp_dst (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_type_annotation (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_udf_category_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_array_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_basic (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_chained (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_chained_struct_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_check_config (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_complex (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_datatype_string (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_dates (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_decorator (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_empty_partition (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_exception (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_invalid_length (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_map_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_nested_struct (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_array (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_binary (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_boolean (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_byte (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_decimal (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_double (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_float (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_int (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_long (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_short (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_null_string (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_return_scalar (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_return_timestamp_tz (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_string_in_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_struct_complex (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_struct_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_struct_with_empty_partition (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_timestamps (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_timestamps_respect_session_timezone (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_unsupported_types (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_varargs (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_vectorized_udf_wrong_return_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_pandas_udf_typehints with pypy3:
    test_group_agg_udf_type_hint (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_ignore_type_hint_in_cogroup_apply_in_pandas (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_ignore_type_hint_in_group_apply_in_pandas (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_ignore_type_hint_in_map_in_pandas (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_scalar_iter_udf_type_hint (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_scalar_udf_type_hint (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_type_annotation_group_agg (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_type_annotation_negative (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_type_annotation_scalar (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_type_annotation_scalar_iter (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_pandas_udf_window with pypy3:
    test_array_type (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_bounded_mixed (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_bounded_simple (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_growing_window (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_invalid_args (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_mixed_sql (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_mixed_sql_and_udf (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_mixed_udf (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_multiple_udfs (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_replace_existing (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_shrinking_window (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_simple (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_sliding_window (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'
    test_without_partitionBy (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... skipped 'Pandas >= 0.23.2 must be installed; however, it was not found.'

Skipped tests in pyspark.sql.tests.test_arrow with python3.6:
      test_createDataFrame_column_name_encoding (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_does_not_modify_input (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_empty_partition (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_fallback_disabled (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_fallback_enabled (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_respect_session_timezone (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_toggle (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_array_type (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_float_index (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_incorrect_schema (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_int_col_names (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_map_type (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_names (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_schema (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_single_data_type (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDateFrame_with_category_type (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_filtered_frame (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_no_partition_frame (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_no_partition_toPandas (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_null_conversion (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_pandas_round_trip (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_pandas_self_destruct (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_propagates_spark_exception (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_schema_conversion_roundtrip (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_timestamp_dst (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_timestamp_nat (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_toPandas_arrow_toggle (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_toPandas_batch_order (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_toPandas_fallback_disabled (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_toPandas_fallback_enabled (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_toPandas_respect_session_timezone (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_toPandas_with_array_type (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_toPandas_with_map_type (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_toPandas_with_map_type_nulls (pyspark.sql.tests.test_arrow.ArrowTests) ... SKIP (0.000s)
      test_createDataFrame_column_name_encoding (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_does_not_modify_input (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_empty_partition (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_fallback_disabled (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_fallback_enabled (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_respect_session_timezone (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_toggle (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_array_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_float_index (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_incorrect_schema (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_int_col_names (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_map_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_names (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_schema (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDataFrame_with_single_data_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_createDateFrame_with_category_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_filtered_frame (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_no_partition_frame (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_no_partition_toPandas (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_null_conversion (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_pandas_round_trip (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_pandas_self_destruct (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_propagates_spark_exception (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_schema_conversion_roundtrip (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_timestamp_dst (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_timestamp_nat (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_toPandas_arrow_toggle (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_toPandas_batch_order (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_toPandas_fallback_disabled (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_toPandas_fallback_enabled (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_toPandas_respect_session_timezone (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_toPandas_with_array_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_toPandas_with_map_type (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_toPandas_with_map_type_nulls (pyspark.sql.tests.test_arrow.EncryptionArrowTests) ... SKIP (0.000s)
      test_exception_by_max_results (pyspark.sql.tests.test_arrow.MaxResultArrowTests) ... SKIP (0.000s)

Skipped tests in pyspark.sql.tests.test_dataframe with python3.6:
      test_create_dataframe_required_pandas_not_found (pyspark.sql.tests.test_dataframe.DataFrameTests) ... SKIP (0.000s)
      test_to_pandas_required_pandas_not_found (pyspark.sql.tests.test_dataframe.DataFrameTests) ... SKIP (0.000s)
      test_query_execution_listener_on_collect_with_arrow (pyspark.sql.tests.test_dataframe.QueryExecutionListenerTests) ... SKIP (0.000s)

Skipped tests in pyspark.sql.tests.test_pandas_cogrouped_map with python3.6:
      test_case_insensitive_grouping_column (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_complex_group_by (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_different_schemas (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_empty_group_by (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_left_group_empty (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_mixed_scalar_udfs_followed_by_cogrouby_apply (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_right_group_empty (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_self_join (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_simple (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_with_key_complex (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_with_key_left (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_with_key_left_group_empty (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_with_key_right (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_with_key_right_group_empty (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_wrong_args (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)
      test_wrong_return_type (pyspark.sql.tests.test_pandas_cogrouped_map.CogroupedMapInPandasTests) ... SKIP (0.000s)

Skipped tests in pyspark.sql.tests.test_pandas_grouped_map with python3.6:
      test_array_type_correct (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_case_insensitive_grouping_column (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_coerce (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_column_order (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_complex_groupby (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_datatype_string (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_decorator (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_empty_groupby (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_grouped_over_window (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_grouped_over_window_with_key (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_grouped_with_empty_partition (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_mixed_scalar_udfs_followed_by_groupby_apply (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_positional_assignment_conf (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_register_grouped_map_udf (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_self_join_with_pandas (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_supported_types (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_timestamp_dst (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_udf_with_key (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_unsupported_types (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_wrong_args (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)
      test_wrong_return_type (pyspark.sql.tests.test_pandas_grouped_map.GroupedMapInPandasTests) ... SKIP (0.000s)

Skipped tests in pyspark.sql.tests.test_pandas_map with python3.6:
      test_chain_map_partitions_in_pandas (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... SKIP (0.000s)
      test_different_output_length (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... SKIP (0.000s)
      test_empty_iterator (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... SKIP (0.000s)
      test_empty_rows (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... SKIP (0.000s)
      test_map_partitions_in_pandas (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... SKIP (0.000s)
      test_multiple_columns (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... SKIP (0.000s)
      test_self_join (pyspark.sql.tests.test_pandas_map.MapInPandasTests) ... SKIP (0.000s)

Skipped tests in pyspark.sql.tests.test_pandas_udf with python3.6:
      test_pandas_udf_arrow_overflow (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... SKIP (0.000s)
      test_pandas_udf_basic (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... SKIP (0.000s)
      test_pandas_udf_decorator (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... SKIP (0.000s)
      test_pandas_udf_detect_unsafe_type_conversion (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... SKIP (0.000s)
      test_stopiteration_in_udf (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... SKIP (0.000s)
      test_udf_wrong_arg (pyspark.sql.tests.test_pandas_udf.PandasUDFTests) ... SKIP (0.000s)

Skipped tests in pyspark.sql.tests.test_pandas_udf_grouped_agg with python3.6:
      test_alias (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_array_type (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_basic (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_complex_expressions (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_complex_groupby (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_grouped_with_empty_partition (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_grouped_without_group_by_clause (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_invalid_args (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_manual (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_mixed_sql (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_mixed_udfs (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_multiple_udfs (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_no_predicate_pushdown_through (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_register_vectorized_udf_basic (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_retain_group_columns (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)
      test_unsupported_types (pyspark.sql.tests.test_pandas_udf_grouped_agg.GroupedAggPandasUDFTests) ... SKIP (0.000s)

Skipped tests in pyspark.sql.tests.test_pandas_udf_scalar with python3.6:
      test_datasource_with_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_mixed_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_mixed_udf_and_sql (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_nondeterministic_vectorized_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_nondeterministic_vectorized_udf_in_aggregate (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_pandas_udf_nested_arrays (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_pandas_udf_tokenize (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_register_nondeterministic_vectorized_udf_basic (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_register_vectorized_udf_basic (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_scalar_iter_udf_close (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_scalar_iter_udf_close_early (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_scalar_iter_udf_init (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_timestamp_dst (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_type_annotation (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_udf_category_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_array_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_basic (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_chained (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_chained_struct_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_check_config (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_complex (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_datatype_string (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_dates (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_decorator (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_empty_partition (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_exception (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_invalid_length (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_map_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_nested_struct (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_array (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_binary (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_boolean (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_byte (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_decimal (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_double (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_float (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_int (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_long (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_short (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_null_string (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_return_scalar (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_return_timestamp_tz (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_string_in_udf (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_struct_complex (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_struct_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_struct_with_empty_partition (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_timestamps (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_timestamps_respect_session_timezone (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_unsupported_types (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_varargs (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)
      test_vectorized_udf_wrong_return_type (pyspark.sql.tests.test_pandas_udf_scalar.ScalarPandasUDFTests) ... SKIP (0.000s)

Skipped tests in pyspark.sql.tests.test_pandas_udf_typehints with python3.6:
      test_group_agg_udf_type_hint (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)
      test_ignore_type_hint_in_cogroup_apply_in_pandas (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)
      test_ignore_type_hint_in_group_apply_in_pandas (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)
      test_ignore_type_hint_in_map_in_pandas (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)
      test_scalar_iter_udf_type_hint (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)
      test_scalar_udf_type_hint (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)
      test_type_annotation_group_agg (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)
      test_type_annotation_negative (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)
      test_type_annotation_scalar (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)
      test_type_annotation_scalar_iter (pyspark.sql.tests.test_pandas_udf_typehints.PandasUDFTypeHintsTests) ... SKIP (0.000s)

Skipped tests in pyspark.sql.tests.test_pandas_udf_window with python3.6:
      test_array_type (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_bounded_mixed (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_bounded_simple (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_growing_window (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_invalid_args (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_mixed_sql (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_mixed_sql_and_udf (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_mixed_udf (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_multiple_udfs (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_replace_existing (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_shrinking_window (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_simple (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_sliding_window (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)
      test_without_partitionBy (pyspark.sql.tests.test_pandas_udf_window.WindowPandasUDFTests) ... SKIP (0.000s)

Let me file a JIRA and assign it to @shaneknapp.

HyukjinKwon · 2021-04-01T03:55:11Z

SPARK-34930

Ngone51 · 2021-04-01T14:34:35Z

Resubmitted #32027

github-actions bot added the SQL label Feb 4, 2021

Ngone51 commented Feb 8, 2021

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala

cloud-fan reviewed Mar 1, 2021

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala

cloud-fan reviewed Mar 1, 2021

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisTest.scala

maropu reviewed Mar 9, 2021

Ngone51 force-pushed the join-reorder branch from 73ade4e to 85e73c6 Compare March 9, 2021 15:42

Ngone51 force-pushed the join-reorder branch 2 times, most recently from d212b95 to ec525d8 Compare March 19, 2021 07:46

cloud-fan reviewed Mar 22, 2021

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala (outdated)

Ngone51 added 7 commits March 26, 2021 15:00

fix hasConflictingAttrs

36a74ad

compare exprId

8bbacf6

skip DeduplicateRelations if no conflicting attributes

a04c4e7

remove uncessary pattern match

b27d4d3

revert skip if no conflicting attrs

6243b43

gen golden file

c6f9714

cloud-fan reviewed Mar 29, 2021

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DeduplicateRelations.scala (outdated)

cloud-fan approved these changes Mar 29, 2021

Ngone51 added 2 commits March 29, 2021 20:32

match SubqueryExpression

5d85df3

fix scala2.13 error

f0c7ce4

cloud-fan closed this in f05b940 Mar 31, 2021
