[SPARK-28697][SQL] Invalidate Database/Table names starting with underscore #25448
Conversation
@dongjoon-hyun @cloud-fan @HyukjinKwon please review and let me know your opinion on the fix

ok to test

Test build #109093 has finished for PR 25448 at commit

retest this please

Test build #109097 has finished for PR 25448 at commit
  protected var currentDb: String = formatDatabaseName(DEFAULT_DATABASE)

- private val validNameFormat = "([\\w_]+)".r
+ private val validNameFormat = "([a-zA-Z0-9]+[\\w]*)".r
Can we use 0-9 as the first character of a table name?
Yes, same as before. This just restricts names from starting with an underscore.
So this change restricts only a leading _?
And it abides by the Hive standard.
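To make the restriction discussed in this thread concrete, here is a minimal sketch comparing the two patterns from the diff. The object and method names (`NameFormatSketch`, `matchesOld`, `matchesNew`) are hypothetical and not part of Spark; only the two regex literals come from the patch, and the whole-string matching is an assumption about how the pattern is applied.

```scala
object NameFormatSketch {
  // Pattern before the patch: one or more word characters ([a-zA-Z0-9_]),
  // so a leading underscore is accepted.
  private val oldFormat = "([\\w_]+)".r

  // Pattern proposed in the patch: the first character must be a letter or
  // a digit; the remaining characters may be any word character.
  private val newFormat = "([a-zA-Z0-9]+[\\w]*)".r

  // Hypothetical helpers: a name counts as valid only if the whole string matches.
  def matchesOld(name: String): Boolean = oldFormat.pattern.matcher(name).matches()
  def matchesNew(name: String): Boolean = newFormat.pattern.matcher(name).matches()

  def main(args: Array[String]): Unit = {
    for (name <- Seq("_sampleTable", "1table", "sample_table", "sample-table")) {
      println(s"$name old=${matchesOld(name)} new=${matchesNew(name)}")
    }
  }
}
```

Under these assumptions, `_sampleTable` passes the old pattern and fails the new one, while `1table` and `sample_table` pass both, so a leading underscore is the only case the change rejects.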
Test build #109104 has finished for PR 25448 at commit

@ajithme, as you can see in the UT failures, this is exactly reverting our previous efforts. Please don't change the UT, and try to narrow this patch down to fix only your issues (Hive tables?).
cc @gatorsmile

Do we actually need to keep the failed unit test case?

Wait, does table name starting with

@cloud-fan @dongjoon-hyun @HyukjinKwon It's defined as follows: Hive seems to allow a digit as the first character as well. I'm not sure why Spark allowed "_" as a starting character to begin with. Is it to match some other system?

Just FYI: pgSQL accepts an underscore at the start of table names but does not accept a digit;
When Hive support is enabled, Spark doesn't support

I think Spark allowed "_" as the starting character for table/database names to fall in line with PostgreSQL, but this does deviate from Hive's behaviour.

pvk2727 left a comment:
The changes seem fine. Do we need to add positive test cases, such as tests for valid names, to verify the valid format "([a-zA-Z0-9]+[\w]*)".r?

According to the discussions in this PR, seems there are many other databases allowing the identifiers to start with
@cloud-fan @HyukjinKwon

Can one of the admins verify this patch?

We're closing this PR because it hasn't been updated in a while. If you'd like to revive this PR, please reopen it!

What changes were proposed in this pull request?
I think we should disallow identifiers that start with _ for CREATE DATABASE and CREATE TABLE.
We can partly see the effect in SPARK-28697: when a table name starts with _ (like _sampleTable), the FileFormat assumes it is a hidden folder and does not list it, which causes unusual behavior.
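The hidden-folder effect described above can be illustrated with a small sketch. This is a simplified, hypothetical filter (`looksHidden` and `listVisible` are illustrative names, not Spark's actual listing code), assuming only that entries whose names start with `_` or `.` are skipped during file listing.

```scala
object HiddenPathSketch {
  // Assumption for illustration: names starting with "_" or "." are hidden.
  def looksHidden(name: String): Boolean =
    name.startsWith("_") || name.startsWith(".")

  // Keep only the entries that a listing with this filter would return.
  def listVisible(names: Seq[String]): Seq[String] =
    names.filterNot(looksHidden)

  def main(args: Array[String]): Unit = {
    // The directory for a table named _sampleTable is dropped from the result.
    println(listVisible(Seq("sampleTable", "_sampleTable", ".tmp")))
  }
}
```

With a filter of this kind, a table directory named `_sampleTable` would be dropped from the listing, which would be consistent with the behavior reported in SPARK-28697.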
How was this patch tested?
The patch avoids creating tables and databases with names starting with an underscore. Added a test case for this.