Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-21272] SortMergeJoin LeftAnti does not update numOutputRows #18494

Closed
wants to merge 2 commits into from

Conversation

juliuszsompolski
Copy link
Contributor

What changes were proposed in this pull request?

Updating numOutputRows metric was missing from one return path of LeftAnti SortMergeJoin.

How was this patch tested?

Non-zero output rows manually seen in metrics.

@kiszk
Copy link
Member

kiszk commented Jul 1, 2017

Would it be possible to add a test suite into SQLMetricsSuite?

@rxin
Copy link
Contributor

rxin commented Jul 2, 2017

cc @hvanhovell

@viirya
Copy link
Member

viirya commented Jul 2, 2017

Yeah, it's better to add a test to SQLMetricsSuite for this.

@hvanhovell
Copy link
Contributor

ok to test

@SparkQA
Copy link

SparkQA commented Jul 2, 2017

Test build #79047 has finished for PR 18494 at commit 580dc46.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

add to whitelist

@SparkQA
Copy link

SparkQA commented Jul 3, 2017

Test build #79069 has finished for PR 18494 at commit 580dc46.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member

gatorsmile commented Jul 3, 2017

Unfortunately, the comments in the test cases of SQLMetricsSuite are out-of-dated. Anyway, could you add a test case to SQLMetricsSuite for verifying the logics. Below is an example.

  test("SortMergeJoin(left-anti) metrics when right table is empty") {
    val testDataForJoin = testData2.filter("a < 0") // Empty but not optimized to EmptyRDD
    withTempView("testDataForJoin") {
      testDataForJoin.createOrReplaceTempView("testDataForJoin")
      val df = spark.sql(
        "SELECT * FROM testData2 ANTI JOIN testDataForJoin ON testData2.a = testDataForJoin.a")
      testSparkPlanMetrics(df, 1, Map(
        0L -> ("SortMergeJoin", Map("number of output rows" -> 6L)))
      )
    }
  }

@SparkQA
Copy link

SparkQA commented Jul 10, 2017

Test build #79467 has finished for PR 18494 at commit 93f7b88.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@gatorsmile gatorsmile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@gatorsmile
Copy link
Member

gatorsmile commented Jul 10, 2017

Thanks! Merging to master/2.2.

@asfgit asfgit closed this in 18b3b00 Jul 10, 2017
asfgit pushed a commit that referenced this pull request Jul 10, 2017
## What changes were proposed in this pull request?

Updating numOutputRows metric was missing from one return path of LeftAnti SortMergeJoin.

## How was this patch tested?

Non-zero output rows manually seen in metrics.

Author: Juliusz Sompolski <julek@databricks.com>

Closes #18494 from juliuszsompolski/SPARK-21272.
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
## What changes were proposed in this pull request?

Updating numOutputRows metric was missing from one return path of LeftAnti SortMergeJoin.

## How was this patch tested?

Non-zero output rows manually seen in metrics.

Author: Juliusz Sompolski <julek@databricks.com>

Closes apache#18494 from juliuszsompolski/SPARK-21272.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants