Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-31467][SQL][TEST] Refactor the sql tests to prevent TableAlreadyExistsException #28239

Closed
wants to merge 9 commits into from

Conversation

turboFei
Copy link
Member

@turboFei turboFei commented Apr 17, 2020

What changes were proposed in this pull request?

If we add UT in hive/SQLQuerySuite or other sql test suites and use table named test.
We may meet TableAlreadyExistsException.

org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or view 'test' already exists in database 'default'

The reason is that, there is some tests that does not clean up the tables/views.
In this PR, I add withTempViews for these tests.

Why are the changes needed?

To fix the TableAlreadyExistsException issue when adding an UT, which uses table named test or others, in some sql test suites, such as hive/SQLQuerySuite.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existed UT.

@turboFei turboFei changed the title SPARK-31467: Fix test issue with table named in hive/SQLQuerySuite SPARK-31467: Fix issue when adding UT uses table named test in hive/SQLQuerySuite Apr 17, 2020
@turboFei
Copy link
Member Author

Just add withTempView for the tests that do not clean up used tables/views.
cc @HyukjinKwon

@turboFei turboFei changed the title SPARK-31467: Fix issue when adding UT uses table named test in hive/SQLQuerySuite SPARK-31467: Refactor the tests in hive/SQLQuerySuite to prevent TableAlreadyExistsException Apr 17, 2020
@maropu
Copy link
Member

maropu commented Apr 18, 2020

ok to test

@maropu maropu changed the title SPARK-31467: Refactor the tests in hive/SQLQuerySuite to prevent TableAlreadyExistsException [SPARK-31467][SQL][TEST] Refactor the tests in hive/SQLQuerySuite to prevent TableAlreadyExistsException Apr 18, 2020
@maropu
Copy link
Member

maropu commented Apr 18, 2020

I think (trivial though), the change looks ok. Did you check them only in SQLQuerySuite? I mean, have you checked if the logs of the other test suites have no TableAlreadyExistsException?

@SparkQA
Copy link

SparkQA commented Apr 19, 2020

Test build #121460 has finished for PR 28239 at commit cdde32f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@turboFei
Copy link
Member Author

turboFei commented Apr 19, 2020

I think the change looks reasonable. Did you check them only in SQLQuerySuite? I mean, have you checked if the logs of the other test suites have no TableAlreadyExistsException?

Hi, I found this issue when adding UT into hive/SQLQuerySuite.
There is no TableAlreadyExistsException in the logs before add UT.

I just check the sql/SQLQuerySuite and refactor it too.

@SparkQA
Copy link

SparkQA commented Apr 19, 2020

Test build #121467 has finished for PR 28239 at commit 565631b.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@turboFei turboFei changed the title [SPARK-31467][SQL][TEST] Refactor the tests in hive/SQLQuerySuite to prevent TableAlreadyExistsException [SPARK-31467][SQL][TEST] Refactor the sql tests to prevent TableAlreadyExistsException Apr 19, 2020
@turboFei
Copy link
Member Author

Will try to refactor other suites.

@turboFei
Copy link
Member Author

I have used grep command to find all tests, which does clean up their used views, and refactor them.

cc @maropu

@SparkQA
Copy link

SparkQA commented Apr 19, 2020

Test build #121466 has finished for PR 28239 at commit d699f58.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 19, 2020

Test build #121468 has finished for PR 28239 at commit babffc7.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 19, 2020

Test build #121470 has finished for PR 28239 at commit bb52763.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member

maropu commented Apr 19, 2020

retest this please

@maropu
Copy link
Member

maropu commented Apr 19, 2020

I think createOrReplaceTempView doesn't throw TableAlreadyExistsException without withTempView. Does the title describe a different issue?

@turboFei
Copy link
Member Author

Oh, I mean that if we createOrReplaceTempView and does not clean up it after this test.
If another test create a table with same name, TableAlreadyExistsException would be thrown.

@maropu
Copy link
Member

maropu commented Apr 19, 2020

Ur, I see. cc: @dongjoon-hyun

|AS (c STRING, d STRING)
|ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
|WITH SERDEPROPERTIES('field.delim' = '|')
withTempView("data") {
Copy link
Member

@maropu maropu Apr 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data -> test?

"""
|select complexArrayOfStruct[0].field1[1].inner2[0],
|complexArrayOfStruct[1].field2[0][1]
|from jsonTable
""".stripMargin),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: wrong indents.

@turboFei
Copy link
Member Author

turboFei commented Apr 19, 2020

thanks, will fix all indents related with """ """.

@SparkQA
Copy link

SparkQA commented Apr 19, 2020

Test build #121473 has finished for PR 28239 at commit bb52763.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@turboFei
Copy link
Member Author

Have refactor the indents and double check it.
cc @maropu @dongjoon-hyun

@SparkQA
Copy link

SparkQA commented Apr 19, 2020

Test build #121477 has finished for PR 28239 at commit 453c5a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member

maropu commented Apr 20, 2020

btw, since I think we tend to forget the cleanup, any other solution to automatically clean up temp views with low overheads after each test, if this issue's annoying?

@turboFei
Copy link
Member Author

turboFei commented Apr 20, 2020

btw, since I think we tend to forget the cleanup, any other solution to automatically clean up temp views with low overheads after each test, if this issue's annoying?

It is difficult to identify which view should be cleaned up.
For example:

override def beforeAll(): Unit = {
super.beforeAll()
sql(
"""
|CREATE TEMPORARY VIEW oneToTen
|USING org.apache.spark.sql.sources.SimpleScanSource
|OPTIONS (
| From '1',
| To '10',
| option_with_underscores 'someval',
| option.with.dots 'someval'
|)
""".stripMargin)

This temp view is created in beforeAll, we should not clean up it after each test.

@maropu
Copy link
Member

maropu commented Apr 20, 2020

Ah, I see.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine to me.

@maropu
Copy link
Member

maropu commented May 5, 2020

retest this please

Copy link
Member

@maropu maropu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, too.

@SparkQA
Copy link

SparkQA commented May 5, 2020

Test build #122295 has finished for PR 28239 at commit 453c5a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu maropu closed this in 8d1f7d2 May 5, 2020
maropu pushed a commit that referenced this pull request May 5, 2020
…dyExistsException

### What changes were proposed in this pull request?
If we add UT in hive/SQLQuerySuite or other sql test suites and use table named `test`.
We may meet TableAlreadyExistsException.

```
org.apache.spark.sql.catalyst.analysis.TableAlreadyExistsException: Table or view 'test' already exists in database 'default'
```

The reason is that, there is some tests that does not clean up the tables/views.
In this PR, I add `withTempViews` for these tests.

### Why are the changes needed?
To fix the TableAlreadyExistsException issue when adding an UT, which uses table named `test` or others, in some sql test suites, such as hive/SQLQuerySuite.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?

Existed UT.

Closes #28239 from turboFei/SPARK-31467.

Authored-by: turbofei <fwang12@ebay.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
(cherry picked from commit 8d1f7d2)
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
@maropu
Copy link
Member

maropu commented May 5, 2020

Thanks! Merged to master/3.0.

@turboFei turboFei deleted the SPARK-31467 branch May 5, 2020 08:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants