[SPARK-5941] [SQL] Unit Test loads the table src twice for leftsemijoin.q #4506
Test build #27207 has finished for PR 4506 at commit
We have eagerly resolved the table since Spark 1.0 when Spark SQL was added. Why do you think this is problematic?
Deferring table resolution should be harmless, and since most of the DF API yields unresolved logical plans (can I say that?), I think we'd better keep it the same. Ideally, eagerly resolving the table should produce the result in the unit test, but it seems not to; probably something is wrong somewhere. I will keep investigating.
Force-pushed from 3ee839f to 40ccd81.
Test build #27505 has finished for PR 4506 at commit
Force-pushed from 40ccd81 to 4c58fbc.
Test build #27625 has finished for PR 4506 at commit
/cc @marmbrus @rxin @yhuai I noticed that we have added the |
Test build #27672 has finished for PR 4506 at commit
@marmbrus can you take a look? Not 100% sure what's happening.
If I remember correctly, the issue is that some test tables are loaded twice. @chenghao-intel Can you change the title?
Sorry for the confusion. I've updated the title and description.
|ROW FORMAT SERDE '${classOf[RegexSerDe].getCanonicalName}'
|WITH SERDEPROPERTIES ("input.regex" = "([^ ]*)\t([^ ]*)")
""".stripMargin.cmd,
s"LOAD DATA LOCAL INPATH '${getHiveFile("data/files/sales.txt")}' INTO TABLE sales".cmd),
Since sales is not preloaded in Hive's unit tests (https://github.com/apache/hive/blob/trunk/data/scripts/q_test_init.sql), it seems fine to remove it.
Test build #29735 has finished for PR 4506 at commit
Closing this since it only impacts a single Hive compatibility test.
Force-pushed from b463f8a to dd0b3f6.
Force-pushed from dd0b3f6 to 0be05f7.
@marmbrus I've reopened this PR, just in case people run into this bug while unit testing.
Test build #30132 has finished for PR 4506 at commit
Test build #30133 has finished for PR 4506 at commit
Test build #30131 has finished for PR 4506 at commit
Thanks! Merged to master.
In leftsemijoin.q, there is already a data loading command for the table sales, but TestHive also creates and loads the table sales, which causes duplicate records to be inserted into sales.
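To illustrate why loading twice duplicates records: in Hive, LOAD DATA ... INTO TABLE without OVERWRITE appends to the table rather than replacing its contents. The following is a minimal, self-contained sketch of that effect in plain Scala. It does not use the Spark or Hive APIs; the object `DuplicateLoadSketch`, its methods, and the row values are hypothetical stand-ins for illustration only.

```scala
import scala.collection.mutable

// Toy stand-in for a warehouse: table name -> rows. Not the Spark or Hive API.
object DuplicateLoadSketch {
  private val tables = mutable.Map.empty[String, Vector[String]]

  // Mimics CREATE TABLE IF NOT EXISTS: an existing table is left untouched.
  def createTable(name: String): Unit =
    if (!tables.contains(name)) tables(name) = Vector.empty

  // Mimics LOAD DATA ... INTO TABLE without OVERWRITE: rows are appended.
  def loadInto(name: String, rows: Seq[String]): Unit =
    tables(name) = tables(name) ++ rows

  def rowCount(name: String): Int = tables(name).size

  def main(args: Array[String]): Unit = {
    val salesFile = Seq("row-1", "row-2") // hypothetical file contents

    // First load: the test harness registers and loads the table.
    createTable("sales")
    loadInto("sales", salesFile)

    // Second load: the query file issues its own load for the same table.
    loadInto("sales", salesFile)

    // The table now holds every row of the file twice.
    println(rowCount("sales"))
  }
}
```

Under these assumptions, removing either of the two loads leaves each row in the table exactly once, which is the effect of dropping the sales entry from the test harness.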