Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-13031] [SQL] cleanup codegen and improve test coverage #10977

Closed
wants to merge 5 commits into from

Conversation

davies
Copy link
Contributor

@davies davies commented Jan 29, 2016

  1. enable whole stage codegen during tests even there is only one operator supports that.
  2. split doProduce() into two APIs: upstream() and doProduce()
  3. generate prefix for fresh names of each operator
  4. pass UnsafeRow to parent directly (avoid getters and create UnsafeRow again)
  5. fix bugs and tests.

This PR re-open #10944 and fix the bug.

@rxin
Copy link
Contributor

rxin commented Jan 29, 2016

What's the bug?

@davies
Copy link
Contributor Author

davies commented Jan 29, 2016

When there is no aggregate functions, it did not generate the output using resultExpression, which have only literals (I was mislead by the comment in AggregateIterator).

@rxin
Copy link
Contributor

rxin commented Jan 29, 2016

Thanks - can you add a test case that would catch this? In the long run, we should beef up our own test coverage and don't want to rely on HiveCompatibilitySuite.

@davies
Copy link
Contributor Author

davies commented Jan 29, 2016

The way we managed HiveCompatibilitySuite is actually better than our unit tests (sql query and golden results in text format). Even if we don't want to be compatible with Hive, it's still good to have those tests (don't call them HiveCompatibilitySuite), and also managed in similar way.

@rxin
Copy link
Contributor

rxin commented Jan 29, 2016

Sure it's a good idea to use that golden file infrastructure. Given we don't have that yet, can you just add a test case?

@rxin
Copy link
Contributor

rxin commented Jan 29, 2016

The issue here is that we want test cases that are targeted for specific problems, and the Hive ones are not (they are just a giant blackbox we took to bootstrap coverage, and not to mention that a targeted test helps you catch problems with aggregations earlier without rerunning the entire Hive suite).

@SparkQA
Copy link

SparkQA commented Jan 29, 2016

Test build #50351 has finished for PR 10977 at commit 951e2cd.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Copy link
Contributor Author

davies commented Jan 29, 2016

@rxin Added.

@rxin
Copy link
Contributor

rxin commented Jan 29, 2016

LGTM

@SparkQA
Copy link

SparkQA commented Jan 29, 2016

Test build #50359 has finished for PR 10977 at commit ffa8e6b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 29, 2016

Test build #2473 has finished for PR 10977 at commit ffa8e6b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Copy link
Contributor

rxin commented Jan 29, 2016

I've merged this.

@asfgit asfgit closed this in 55561e7 Jan 29, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants