Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-11778] [SQL]:parse table name before it is passed to lookupRelation #9773

Closed
wants to merge 4 commits into from
Closed

Conversation

huaxingao
Copy link
Contributor

Fix a bug in DataFrameReader.table (table with schema name such as "db_name.table" doesn't work)
Use SqlParser.parseTableIdentifier to parse the table name before lookupRelation.

@huaxingao
Copy link
Contributor Author

hiveContext.table("db_name.table") works but
hiveContext.read.table("db_name.table")
throws an org.apache.spark.sql.catalyst.analysis.NoSuchTableException

In hiveContext.table("db_name.table"), it goes through SqlParser.parseTableIdentifier(tableName)
and the table name "db_name.table" got resolved to 'db_name'.'table', and later, when trying to get the the qualified table name, the database name is resolved to db_name, and table name is table, and it can get the qualified table name OK.

In hiveContext.read.table("db_name.table"), it doesn't go through SQLParser to parse the table name, so the table name "db_name.table" remain as is. Later, when trying to get the the qualified table name, the database name resolved as default, and table name is "db_name.table", it can't get the qualified table name correctly.

@marmbrus
Copy link
Contributor

Test cases please for any bug fix. Look at HiveDataFrameAnalyticsSuite as an example (since i think we have to put this in the hive package for it to work).

@marmbrus
Copy link
Contributor

ok to test

@SparkQA
Copy link

SparkQA commented Nov 17, 2015

Test build #46106 has finished for PR 9773 at commit 02300e6.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@huaxingao
Copy link
Contributor Author

I will add a test case.

@SparkQA
Copy link

SparkQA commented Nov 18, 2015

Test build #46242 has finished for PR 9773 at commit 1342374.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Copy link
Contributor

If we can get a test case soon we can still include this in Spark 1.6.

Also please make the title: [SPARK-11778] [SQL] Parse table name before it is passed to lookupRelation and add something to the PR description. Together, these will become the commit message.

@huaxingao huaxingao changed the title SPARK-11778:parse table name before it is passed to lookupRelation [SPARK-11778] [SQL]:parse table name before it is passed to lookupRelation Nov 19, 2015
@huaxingao
Copy link
Contributor Author

test case added. Could you please take a look? Thanks a lot!!

@SparkQA
Copy link

SparkQA commented Nov 19, 2015

Test build #46279 has finished for PR 9773 at commit 158d1a7.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Nov 19, 2015

Test build #46285 has finished for PR 9773 at commit b12a475.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// There was a bug in DataFrameFrameReader.table and it has problem for table with schema name,
// Before fix, it throw Exceptionorg.apache.spark.sql.catalyst.analysis.NoSuchTableException
test("table name with schema") {
hiveContext.read.table("usrdb.test")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I think I was unclear. I don't think we should put this in HiveDataFrameAnalyticsSuite since it has nothing to do with analytics. I was just suggesting to use this as a model. Lets just make a separate generic HiveDataFrameSuite since this is pretty core functionality.

While we are at it. I'm also not a huge fan of using beforeAll and afterAll for test setup for a single test since it means the state is spread out across the file (sorry for the bad example). I would just do it all in the test block.

For the description in the comments, its customary to link to the JIRA.

@marmbrus
Copy link
Contributor

I'm going to go ahead and merge this since I want it in 1.6. It would be great if you could address comments in a follow up PR. Thanks!

asfgit pushed a commit that referenced this pull request Nov 19, 2015
…tion

Fix a bug in DataFrameReader.table (table with schema name such as "db_name.table" doesn't work)
Use SqlParser.parseTableIdentifier to parse the table name before lookupRelation.

Author: Huaxin Gao <huaxing@oc0558782468.ibm.com>

Closes #9773 from huaxingao/spark-11778.

(cherry picked from commit 4700074)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@asfgit asfgit closed this in 4700074 Nov 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants