[SPARK-5118][SQL] Fix: create table test stored as parquet as select .. #3921

Closed
guowei2 wants to merge 4 commits

Conversation

guowei2
Contributor

@guowei2 guowei2 commented Jan 7, 2015

No description provided.

@AmplabJenkins

Can one of the admins verify this patch?

@marmbrus
Contributor

marmbrus commented Jan 7, 2015

Can you add a test case? Perhaps in HiveQuerySuite

@guowei2
Contributor Author

guowei2 commented Jan 8, 2015

@marmbrus test case done

| SELECT key, value
| FROM src
| ORDER BY key, value""".stripMargin).collect

Contributor


I guess you want to test STORED AS PARQUET?

@guowei2
Contributor Author

guowei2 commented Jan 9, 2015

I think I should remove the test case, because STORED AS PARQUET can only pass with hive-0.13.

@marmbrus
Contributor

marmbrus commented Jan 9, 2015

Just check the version in the hive shim in the test case.
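
A minimal sketch of what gating a test on the shim version could look like. The `FakeHiveShim` object and its `version` field are hypothetical stand-ins used only for illustration, not the project's actual shim API, and the version threshold follows the hive-0.13 remark in this thread.

```scala
// Hypothetical stand-in for the Hive shim; the real project exposes its own version value.
object FakeHiveShim {
  val version: String = "0.13.1"
}

object ShimGatedTest {
  // Per the discussion, STORED AS PARQUET is only expected to pass from Hive 0.13 onward.
  def supportsStoredAsParquet(version: String): Boolean = {
    // Compare only the major and minor components, e.g. "0.13.1" -> (0, 13).
    val parts = version.split("\\.").take(2).map(_.toInt)
    val (major, minor) = (parts(0), parts(1))
    major > 0 || minor >= 13
  }

  // Runs `body` only when the given version supports the feature.
  // Returns true if the body ran, false if it was skipped.
  def runIfSupported(version: String)(body: => Unit): Boolean = {
    if (supportsStoredAsParquet(version)) {
      body
      true
    } else {
      false
    }
  }
}
```

A test body would then be wrapped as `ShimGatedTest.runIfSupported(FakeHiveShim.version) { ... }`, so the STORED AS PARQUET assertions are skipped on older Hive versions instead of failing.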

@yhuai
Contributor

yhuai commented Jan 9, 2015

Like this one.

@guowei2
Contributor Author

guowei2 commented Jan 9, 2015

Checked the version in the hive shim in the test case. Done.
@yhuai thanks a lot

@marmbrus
Contributor

ok to test

@marmbrus
Contributor

Also can you please fix the merge conflict?

@SparkQA

SparkQA commented Jan 10, 2015

Test build #25364 has started for PR 3921 at commit 833f438.

  • This patch does not merge cleanly.

@SparkQA

SparkQA commented Jan 11, 2015

Test build #25364 has finished for PR 3921 at commit 833f438.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25364/

@guowei2
Contributor Author

guowei2 commented Jan 12, 2015

Fixed the merge conflict.

@SparkQA

SparkQA commented Jan 12, 2015

Test build #25388 has started for PR 3921 at commit 5405b84.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Jan 12, 2015

Test build #25388 has finished for PR 3921 at commit 5405b84.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25388/

@yhuai
Contributor

yhuai commented Jan 21, 2015

@guowei2 Can you rebase it? Also, can we add a test to make sure we actually generate the correct Parquet file? Just reading back the data of the created table cannot verify the file format. Thank you.

checkAnswer(
sql("SELECT key, value FROM ctas5 ORDER BY key, value"),
sql("SELECT key, value FROM src ORDER BY key, value").collect().toSeq)
sql("set spark.sql.hive.convertMetastoreParquet = true")
Contributor


We usually first get the original value of the setting and set it back at the end of the test.
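
A minimal sketch of the save-and-restore pattern described in this comment. `FakeConf` is a hypothetical in-memory stand-in for a session configuration and `withSetting` is an illustrative helper name. Neither is taken from the project's API.

```scala
import scala.collection.mutable

// Hypothetical in-memory stand-in for a session's SQL configuration.
class FakeConf {
  private val settings = mutable.Map[String, String]()

  def get(key: String): Option[String] = settings.get(key)

  def set(key: String, value: String): Unit = {
    settings(key) = value
  }

  def unset(key: String): Unit = {
    settings.remove(key)
    ()
  }
}

object ConfTestUtil {
  // Sets `key` to `value` while `body` runs, then restores the previous state
  // (the original value, or "unset" if there was none), even if `body` throws.
  def withSetting[T](conf: FakeConf, key: String, value: String)(body: => T): T = {
    val original = conf.get(key)
    conf.set(key, value)
    try {
      body
    } finally {
      original match {
        case Some(v) => conf.set(key, v)
        case None    => conf.unset(key)
      }
    }
  }
}
```

Restoring in a `finally` block means a failing assertion inside the test cannot leak the changed setting into later tests.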

@SparkQA

SparkQA commented Jan 21, 2015

Test build #25879 has started for PR 3921 at commit 9da56f8.

  • This patch merges cleanly.

@guowei2
Contributor Author

guowei2 commented Jan 21, 2015

@yhuai How should I check that the Parquet file is correct?

  1. Create a parquet table, and then insert data into it.
  2. Run create table stored as parquet as select ..

Then checkAnswer one table against the other. Is that OK?

@SparkQA

SparkQA commented Jan 21, 2015

Test build #25879 has finished for PR 3921 at commit 9da56f8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25879/

@yhuai
Contributor

yhuai commented Jan 21, 2015

I think you can use describe formatted <table name> to get the information of a table. You should be able to find InputFormat and OutputFormat in the output. We can just double check the output of describe and then make sure the data is stored in the correct format.
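
A rough sketch of how the describe output could be checked, assuming it has already been collected as a sequence of text lines. The `DescribeCheck` helper and the sample line layout are illustrative assumptions. The actual layout of describe output may differ between Hive versions.

```scala
object DescribeCheck {
  // Finds the first line starting with "<label>:" and returns the text after the colon.
  def field(lines: Seq[String], label: String): Option[String] =
    lines.map(_.trim).collectFirst {
      case l if l.startsWith(label + ":") => l.stripPrefix(label + ":").trim
    }

  // True when both the input and output format class names mention Parquet.
  def looksLikeParquet(lines: Seq[String]): Boolean = {
    def isParquet(label: String): Boolean =
      field(lines, label).exists(_.toLowerCase.contains("parquet"))
    isParquet("InputFormat") && isParquet("OutputFormat")
  }

  // Illustrative sample only; real describe output contains many more rows.
  val sampleParquetLines: Seq[String] = Seq(
    "InputFormat:  org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
    "OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"
  )
}
```

In a test, the lines would come from collecting the result of the describe statement and converting each row to a string before passing them to `looksLikeParquet`.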

@SparkQA

SparkQA commented Jan 22, 2015

Test build #25948 has started for PR 3921 at commit b1ba3be.

  • This patch merges cleanly.

@guowei2
Contributor Author

guowei2 commented Jan 22, 2015

@yhuai I use DESC EXTENDED to check the stored file. Is that OK?

@SparkQA

SparkQA commented Jan 22, 2015

Test build #25948 has finished for PR 3921 at commit b1ba3be.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25948/

asfgit pushed a commit that referenced this pull request Feb 4, 2015
Author: guowei2 <guowei2@asiainfo.com>

Closes #3921 from guowei2/SPARK-5118 and squashes the following commits:

b1ba3be [guowei2] add table file check in test case
9da56f8 [guowei2] test case only run in Shim13
112a0b6 [guowei2] add test case
187c7d8 [guowei2] Fix: create table test stored as parquet as select ..

(cherry picked from commit e0490e2)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@marmbrus
Contributor

marmbrus commented Feb 4, 2015

Thanks for fixing this! Merged to master and branch 1.3.

@asfgit asfgit closed this in e0490e2 Feb 4, 2015