Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-26215][SQL] Define reserved/non-reserved keywords based on the ANSI SQL standard #23259

Closed
wants to merge 18 commits into from
2 changes: 2 additions & 0 deletions docs/_data/menu-sql.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,8 @@
url: sql-migration-guide-upgrade.html
- text: Compatibility with Apache Hive
url: sql-migration-guide-hive-compatibility.html
- text: SQL Reserved/Non-Reserved Keywords
url: sql-reserved-and-non-reserved-keywords.html
- text: Reference
url: sql-reference.html
subitems:
Expand Down
573 changes: 573 additions & 0 deletions docs/sql-reserved-and-non-reserved-keywords.md

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,11 @@ grammar SqlBase;
return true;
}
}

/**
* When true, ANSI SQL parsing mode is enabled.
*/
public boolean ansi = false;
}

singleStatement
Expand Down Expand Up @@ -720,14 +725,15 @@ qualifiedName

identifier
: strictIdentifier
| ANTI | FULL | INNER | LEFT | SEMI | RIGHT | NATURAL | JOIN | CROSS | ON
| UNION | INTERSECT | EXCEPT | SETMINUS
| {ansi}? ansiReserved
| {!ansi}? reserved
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe defaultReserved

;

strictIdentifier
: IDENTIFIER #unquotedIdentifier
| quotedIdentifier #quotedIdentifierAlternative
| nonReserved #unquotedIdentifier
: IDENTIFIER #unquotedIdentifier
| quotedIdentifier #quotedIdentifierAlternative
| {ansi}? ansiNonReserved #unquotedIdentifier
| {!ansi}? nonReserved #unquotedIdentifier
;

quotedIdentifier
Expand All @@ -744,40 +750,69 @@ number
| MINUS? BIGDECIMAL_LITERAL #bigDecimalLiteral
;

// NOTE: You must follow a rule below when you add a new ANTLR taken in this file:
// - All the ANTLR tokens = UNION(`ansiReserved`, `ansiNonReserved`) = UNION(`reserved`, `nonReserved`)
//
// Let's say you add a new token `NEWTOKEN` and this is not reserved regardless of a `spark.sql.parser.ansi.enabled`
// value. In this case, you must add a token `NEWTOKEN` in both `ansiNonReserved` and `nonReserved`.

// A list of the reserved keywords below in Spark SQL. These keywords cannot be used for identifiers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// The list of the reserved keywords when `spark.sql.parser.ansi.enabled` is true. Currently, we only reserve
// the ANSI keywords that almost all the ANSI SQL standards (SQL-92, SQL-99, SQL-2003, SQL-2008, SQL-2011,
// and SQL-2016) and PostgreSQL reserve.

// when `spark.sql.parser.ansi.enabled` = true. Currently, we only reserve the ANSI keywords
// that almost all the ANSI SQL standards (SQL-92, SQL-99, SQL-2003, SQL-2008, SQL-2011, and SQL-2016)
// and PostgreSQL reserve.
ansiReserved
: ALL | AND | ANTI | ANY | AS | AUTHORIZATION | BOTH | CASE | CAST | CHECK | COLLATE | COLUMN | CONSTRAINT | CREATE
| CROSS | CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP | CURRENT_USER | DISTINCT | ELSE | END | EXCEPT | FALSE
| FETCH | FOR | FOREIGN | FROM | FULL | GRANT | GROUP | HAVING | IN | INNER | INTERSECT | INTO | JOIN | IS
| LEADING | LEFT | NATURAL | NOT | NULL | ON | ONLY | OR | ORDER | OUTER | OVERLAPS | PRIMARY | REFERENCES | RIGHT
| SELECT | SEMI | SESSION_USER | SETMINUS | SOME | TABLE | THEN | TO | TRAILING | UNION | UNIQUE | USER | USING
| WHEN | WHERE | WITH
;

// When `spark.sql.parser.ansi.enabled` = true, the `ansiNonReserved` keywords can be used for identifiers.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so ansiReserved + ansiNonReserved are all the keywords in Spark? Can we add a note here? If in the future someone adds a new keyword, he must put it either in ansiReserved or in ansiNonReserved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so ansiReserved + ansiNonReserved are all the keywords in Spark?

Yeah, yes.

Can we add a note here? If in the future someone adds a new keyword, he must put it either in ansiReserved or in ansiNonReserved.

ok.

// Otherwise (`spark.sql.parser.ansi.enabled` = false), we follow the existing Spark SQL behaviour until v3.0:
Copy link
Contributor

@cloud-fan cloud-fan Feb 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list of the non-reserved keywords when `spark.sql.parser.ansi.enabled` is true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

// the `nonReserved` keywords can be used instead.
ansiNonReserved
: ADD | AFTER | ALTER | ANALYZE | ARCHIVE | ARRAY | ASC | AT | BETWEEN | BUCKET | BUCKETS | BY | CACHE | CASCADE
| CHANGE | CLEAR | CLUSTER | CLUSTERED | CODEGEN | COLLECTION | COLUMNS | COMMENT | COMMIT | COMPACT | COMPACTIONS
| COMPUTE | CONCATENATE | COST | CUBE | CURRENT | DATA | DATABASE | DATABASES | DBPROPERTIES | DEFINED | DELETE
| DELIMITED | DESC | DESCRIBE | DFS | DIRECTORIES | DIRECTORY | DISTRIBUTE | DIV | DROP | ESCAPED | EXCHANGE
| EXISTS | EXPLAIN | EXPORT | EXTENDED | EXTERNAL | EXTRACT | FIELDS | FILEFORMAT | FIRST | FOLLOWING | FORMAT
| FORMATTED | FUNCTION | FUNCTIONS | GLOBAL | GROUPING | IF | IGNORE | IMPORT | INDEX | INDEXES | INPATH
| INPUTFORMAT | INSERT | INTERVAL | ITEMS | KEYS | LAST | LATERAL | LAZY | LIKE | LIMIT | LINES | LIST | LOAD
| LOCAL | LOCATION | LOCK | LOCKS | LOGICAL | MACRO | MAP | MSCK | NO | NULLS | OF | OPTION | OPTIONS | OUT
| OUTPUTFORMAT | OVER | OVERWRITE | PARTITION | PARTITIONED | PARTITIONS | PERCENT | PERCENTLIT | PIVOT | PRECEDING
| PRINCIPALS | PURGE | RANGE | RECORDREADER | RECORDWRITER | RECOVER | REDUCE | REFRESH | RENAME | REPAIR | REPLACE
| RESET | RESTRICT | REVOKE | RLIKE | ROLE | ROLES | ROLLBACK | ROLLUP | ROW | ROWS | SCHEMA | SEPARATED | SERDE
| SERDEPROPERTIES | SET | SETS | SHOW | SKEWED | SORT | SORTED | START | STATISTICS | STORED | STRATIFY | STRUCT
| TABLES | TABLESAMPLE | TBLPROPERTIES | TEMPORARY | TERMINATED | TOUCH | TRANSACTION | TRANSACTIONS | TRANSFORM
| TRUE | TRUNCATE | UNARCHIVE | UNBOUNDED | UNCACHE | UNLOCK | UNSET | USE | VALUES | VIEW | WINDOW
;

reserved
: ANTI | CROSS | EXCEPT | FULL | INNER | INTERSECT | JOIN | LEFT | NATURAL | ON | RIGHT | SEMI | SETMINUS | UNION
| USING
;

nonReserved
: SHOW | TABLES | COLUMNS | COLUMN | PARTITIONS | FUNCTIONS | DATABASES
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I compared this section. All the original nonReserved keywords are still in the nonReserved

| ADD
| OVER | PARTITION | RANGE | ROWS | PRECEDING | FOLLOWING | CURRENT | ROW | LAST | FIRST | AFTER
| MAP | ARRAY | STRUCT
| PIVOT | LATERAL | WINDOW | REDUCE | TRANSFORM | SERDE | SERDEPROPERTIES | RECORDREADER
| DELIMITED | FIELDS | TERMINATED | COLLECTION | ITEMS | KEYS | ESCAPED | LINES | SEPARATED
| EXTENDED | REFRESH | CLEAR | CACHE | UNCACHE | LAZY | GLOBAL | TEMPORARY | OPTIONS
| GROUPING | CUBE | ROLLUP
| EXPLAIN | FORMAT | LOGICAL | FORMATTED | CODEGEN | COST
| TABLESAMPLE | USE | TO | BUCKET | PERCENTLIT | OUT | OF
| SET | RESET
| VIEW | REPLACE
| IF
| POSITION
| EXTRACT
| NO | DATA
| START | TRANSACTION | COMMIT | ROLLBACK | IGNORE
| SORT | CLUSTER | DISTRIBUTE | UNSET | TBLPROPERTIES | SKEWED | STORED | DIRECTORIES | LOCATION
| EXCHANGE | ARCHIVE | UNARCHIVE | FILEFORMAT | TOUCH | COMPACT | CONCATENATE | CHANGE
| CASCADE | RESTRICT | BUCKETS | CLUSTERED | SORTED | PURGE | INPUTFORMAT | OUTPUTFORMAT
| DBPROPERTIES | DFS | TRUNCATE | COMPUTE | LIST
| STATISTICS | ANALYZE | PARTITIONED | EXTERNAL | DEFINED | RECORDWRITER
| REVOKE | GRANT | LOCK | UNLOCK | MSCK | REPAIR | RECOVER | EXPORT | IMPORT | LOAD | VALUES | COMMENT | ROLE
| ROLES | COMPACTIONS | PRINCIPALS | TRANSACTIONS | INDEX | INDEXES | LOCKS | OPTION | LOCAL | INPATH
| ASC | DESC | LIMIT | RENAME | SETS
| AT | NULLS | OVERWRITE | ALL | ANY | ALTER | AS | BETWEEN | BY | CREATE | DELETE
| DESCRIBE | DROP | EXISTS | FALSE | FOR | GROUP | IN | INSERT | INTO | IS |LIKE
| NULL | ORDER | OUTER | TABLE | TRUE | WITH | RLIKE
| AND | CASE | CAST | DISTINCT | DIV | ELSE | END | FUNCTION | INTERVAL | MACRO | OR | STRATIFY | THEN
| UNBOUNDED | WHEN
| DATABASE | SELECT | FROM | WHERE | HAVING | TO | TABLE | WITH | NOT
| DIRECTORY
| BOTH | LEADING | TRAILING
: ADD | AFTER | ALL | ALTER | ANALYZE | AND | ANY | ARCHIVE | ARRAY | AS | ASC | AT | AUTHORIZATION | BETWEEN
| BOTH | BUCKET | BUCKETS | BY | CACHE | CASCADE | CASE | CAST | CHANGE | CHECK | CLEAR | CLUSTER | CLUSTERED
| CODEGEN | COLLATE | COLLECTION | COLUMN | COLUMNS | COMMENT | COMMIT | COMPACT | COMPACTIONS | COMPUTE
| CONCATENATE | CONSTRAINT | COST | CREATE | CUBE | CURRENT | CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP
| CURRENT_USER | DATA | DATABASE | DATABASES | DBPROPERTIES | DEFINED | DELETE | DELIMITED | DESC | DESCRIBE | DFS
| DIRECTORIES | DIRECTORY | DISTINCT | DISTRIBUTE | DIV | DROP | ELSE | END | ESCAPED | EXCHANGE | EXISTS | EXPLAIN
| EXPORT | EXTENDED | EXTERNAL | EXTRACT | FALSE | FETCH | FIELDS | FILEFORMAT | FIRST | FOLLOWING | FOR | FOREIGN
| FORMAT | FORMATTED | FROM | FUNCTION | FUNCTIONS | GLOBAL | GRANT | GROUP | GROUPING | HAVING | IF | IGNORE
| IMPORT | IN | INDEX | INDEXES | INPATH | INPUTFORMAT | INSERT | INTERVAL | INTO | IS | ITEMS | KEYS | LAST
| LATERAL | LAZY | LEADING | LIKE | LIMIT | LINES | LIST | LOAD | LOCAL | LOCATION | LOCK | LOCKS | LOGICAL | MACRO
| MAP | MSCK | NO | NOT | NULL | NULLS | OF | ONLY | OPTION | OPTIONS | OR | ORDER | OUT | OUTER | OUTPUTFORMAT
| OVER | OVERLAPS | OVERWRITE | PARTITION | PARTITIONED | PARTITIONS | PERCENTLIT | PIVOT | POSITION | PRECEDING
| PRIMARY | PRINCIPALS | PURGE | RANGE | RECORDREADER | RECORDWRITER | RECOVER | REDUCE | REFERENCES | REFRESH
| RENAME | REPAIR | REPLACE | RESET | RESTRICT | REVOKE | RLIKE | ROLE | ROLES | ROLLBACK | ROLLUP | ROW | ROWS
| SELECT | SEPARATED | SERDE | SERDEPROPERTIES | SESSION_USER | SET | SETS | SHOW | SKEWED | SOME | SORT | SORTED
| START | STATISTICS | STORED | STRATIFY | STRUCT | TABLE | TABLES | TABLESAMPLE | TBLPROPERTIES | TEMPORARY
| TERMINATED | THEN | TO | TOUCH | TRAILING | TRANSACTION | TRANSACTIONS | TRANSFORM | TRUE | TRUNCATE | UNARCHIVE
| UNBOUNDED | UNCACHE | UNLOCK | UNIQUE | UNSET | USE | USER | VALUES | VIEW | WHEN | WHERE | WINDOW | WITH
;

This comment was marked as resolved.

This comment was marked as resolved.

This comment was marked as resolved.


SELECT: 'SELECT';
Expand Down Expand Up @@ -1014,6 +1049,24 @@ OPTION: 'OPTION';
ANTI: 'ANTI';
LOCAL: 'LOCAL';
INPATH: 'INPATH';
AUTHORIZATION: 'AUTHORIZATION';
CHECK: 'CHECK';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When ansi is off, we need to ensure all these newly added keywords are still not reserved. Thus, CHECK should be added to nonReserved , right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the newly added keywords. Below is the list that are not included:

 CHECK: 'CHECK';
 COLLATE: 'COLLATE';
 OVERLAPS: 'OVERLAPS';
 ONLY: 'ONLY';
 PRIMARY: 'PRIMARY';
 REFERENCES: 'REFERENCES';
 UNIQUE: 'UNIQUE';
 VARYING: 'VARYING';

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right, I'll update

COLLATE: 'COLLATE';
CONSTRAINT: 'CONSTRAINT';
CURRENT_DATE: 'CURRENT_DATE';
CURRENT_TIME: 'CURRENT_TIME';
CURRENT_TIMESTAMP: 'CURRENT_TIMESTAMP';
CURRENT_USER: 'CURRENT_USER';
FETCH: 'FETCH';
FOREIGN: 'FOREIGN';
ONLY: 'ONLY';
OVERLAPS: 'OVERLAPS';
PRIMARY: 'PRIMARY';
REFERENCES: 'REFERENCES';
SESSION_USER: 'SESSION_USER';
SOME: 'SOME';
UNIQUE: 'UNIQUE';
USER: 'USER';

STRING
: '\'' ( ~('\''|'\\') | ('\\' .) )* '\''
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -271,7 +271,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
override def visitPartitionSpec(
ctx: PartitionSpecContext): Map[String, Option[String]] = withOrigin(ctx) {
val parts = ctx.partitionVal.asScala.map { pVal =>
val name = pVal.identifier.getText
val name = visitIdentifier(pVal.identifier)
val value = Option(pVal.constant).map(visitStringConstant)
name -> value
}
Expand Down Expand Up @@ -570,7 +570,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
// Collect all window specifications defined in the WINDOW clause.
val baseWindowMap = ctx.namedWindow.asScala.map {
wCtx =>
(wCtx.identifier.getText, typedVisit[WindowSpec](wCtx.windowSpec))
(visitIdentifier(wCtx.identifier), typedVisit[WindowSpec](wCtx.windowSpec))
}.toMap

// Handle cases like
Expand Down Expand Up @@ -645,11 +645,11 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
.flatMap(_.namedExpression.asScala)
.map(typedVisit[Expression])
val pivotColumn = if (ctx.pivotColumn.identifiers.size == 1) {
UnresolvedAttribute.quoted(ctx.pivotColumn.identifier.getText)
UnresolvedAttribute.quoted(visitIdentifier(ctx.pivotColumn.identifier))
} else {
CreateStruct(
ctx.pivotColumn.identifiers.asScala.map(
identifier => UnresolvedAttribute.quoted(identifier.getText)))
identifier => UnresolvedAttribute.quoted(visitIdentifier(identifier))))
}
val pivotValues = ctx.pivotValues.asScala.map(visitPivotValue)
Pivot(None, pivotColumn, pivotValues, aggregates, query)
Expand All @@ -661,7 +661,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
override def visitPivotValue(ctx: PivotValueContext): Expression = withOrigin(ctx) {
val e = expression(ctx.expression)
if (ctx.identifier != null) {
Alias(e, ctx.identifier.getText)()
Alias(e, visitIdentifier(ctx.identifier))()
} else {
e
}
Expand Down Expand Up @@ -832,7 +832,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
}

val tvf = UnresolvedTableValuedFunction(
func.identifier.getText, func.expression.asScala.map(expression), aliases)
visitIdentifier(func.identifier), func.expression.asScala.map(expression), aliases)
tvf.optionalMap(func.tableAlias.strictIdentifier)(aliasPlan)
}

Expand Down Expand Up @@ -931,7 +931,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
* Create a Sequence of Strings for an identifier list.
*/
override def visitIdentifierSeq(ctx: IdentifierSeqContext): Seq[String] = withOrigin(ctx) {
ctx.identifier.asScala.map(_.getText)
ctx.identifier.asScala.map(visitIdentifier)
}

/* ********************************************************************************************
Expand All @@ -942,7 +942,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
*/
override def visitTableIdentifier(
ctx: TableIdentifierContext): TableIdentifier = withOrigin(ctx) {
TableIdentifier(ctx.table.getText, Option(ctx.db).map(_.getText))
TableIdentifier(visitIdentifier(ctx.table), Option(ctx.db).map(_.getText))
}

/**
Expand Down Expand Up @@ -974,7 +974,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
* Both un-targeted (global) and targeted aliases are supported.
*/
override def visitStar(ctx: StarContext): Expression = withOrigin(ctx) {
UnresolvedStar(Option(ctx.qualifiedName()).map(_.identifier.asScala.map(_.getText)))
UnresolvedStar(Option(ctx.qualifiedName()).map(_.identifier.asScala.map(visitIdentifier)))
}

/**
Expand All @@ -984,7 +984,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
override def visitNamedExpression(ctx: NamedExpressionContext): Expression = withOrigin(ctx) {
val e = expression(ctx.expression)
if (ctx.identifier != null) {
Alias(e, ctx.identifier.getText)()
Alias(e, visitIdentifier(ctx.identifier))()
} else if (ctx.identifierList != null) {
MultiAlias(e, visitIdentifierList(ctx.identifierList))
} else {
Expand Down Expand Up @@ -1326,7 +1326,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
* Create a function database (optional) and name pair.
*/
protected def visitFunctionName(ctx: QualifiedNameContext): FunctionIdentifier = {
ctx.identifier().asScala.map(_.getText) match {
ctx.identifier().asScala.map(visitIdentifier) match {
case Seq(db, fn) => FunctionIdentifier(fn, Option(db))
case Seq(fn) => FunctionIdentifier(fn, None)
case other => throw new ParseException(s"Unsupported function name '${ctx.getText}'", ctx)
Expand All @@ -1350,7 +1350,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
* Create a reference to a window frame, i.e. [[WindowSpecReference]].
*/
override def visitWindowRef(ctx: WindowRefContext): WindowSpecReference = withOrigin(ctx) {
WindowSpecReference(ctx.identifier.getText)
WindowSpecReference(visitIdentifier(ctx.identifier))
}

/**
Expand Down Expand Up @@ -1404,6 +1404,15 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
}
}

override def visitIdentifier(ctx: IdentifierContext): String = withOrigin(ctx) {
val keyword = ctx.getText
if (ctx.ansiReserved() != null) {
throw new ParseException(
s"'$keyword' is reserved and you cannot use this keyword as an identifier.", ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm, do we need to do it in this PR? I think this PR just changes the list of non-reserved/reserved keywords according to the config.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah..., ok. But, we have no behaivour change between ansi=true and ansi=false now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I reverted the unrelated stuffs.

}
keyword
}

/**
* Create a [[CreateStruct]] expression.
*/
Expand Down Expand Up @@ -1551,7 +1560,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
*/
override def visitTypeConstructor(ctx: TypeConstructorContext): Literal = withOrigin(ctx) {
val value = string(ctx.STRING)
val valueType = ctx.identifier.getText.toUpperCase(Locale.ROOT)
val valueType = visitIdentifier(ctx.identifier).toUpperCase(Locale.ROOT)
def toLiteral[T](f: UTF8String => Option[T], t: DataType): Literal = {
f(UTF8String.fromString(value)).map(Literal(_, t)).getOrElse {
throw new ParseException(s"Cannot parse the $valueType value: $value", ctx)
Expand Down Expand Up @@ -1755,7 +1764,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
* Resolve/create a primitive type.
*/
override def visitPrimitiveDataType(ctx: PrimitiveDataTypeContext): DataType = withOrigin(ctx) {
val dataType = ctx.identifier.getText.toLowerCase(Locale.ROOT)
val dataType = visitIdentifier(ctx.identifier).toLowerCase(Locale.ROOT)
(dataType, ctx.INTEGER_VALUE().asScala.toList) match {
case ("boolean", Nil) => BooleanType
case ("tinyint" | "byte", Nil) => ByteType
Expand Down Expand Up @@ -1827,7 +1836,7 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
}

StructField(
identifier.getText,
visitIdentifier(identifier),
cleanedDataType,
nullable = true,
builder.build())
Expand All @@ -1853,7 +1862,8 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
*/
override def visitComplexColType(ctx: ComplexColTypeContext): StructField = withOrigin(ctx) {
import ctx._
val structField = StructField(identifier.getText, typedVisit(dataType), nullable = true)
val name = visitIdentifier(identifier)
val structField = StructField(name, typedVisit(dataType), nullable = true)
if (STRING == null) structField else structField.withComment(string(STRING))
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -85,13 +85,15 @@ abstract class AbstractSqlParser extends ParserInterface with Logging {
lexer.removeErrorListeners()
lexer.addErrorListener(ParseErrorListener)
lexer.legacy_setops_precedence_enbled = SQLConf.get.setOpsPrecedenceEnforced
lexer.ansi = SQLConf.get.ansiParserEnabled

val tokenStream = new CommonTokenStream(lexer)
val parser = new SqlBaseParser(tokenStream)
parser.addParseListener(PostProcessor)
parser.removeErrorListeners()
parser.addErrorListener(ParseErrorListener)
parser.legacy_setops_precedence_enbled = SQLConf.get.setOpsPrecedenceEnforced
parser.ansi = SQLConf.get.ansiParserEnabled

try {
try {
Expand Down Expand Up @@ -131,7 +133,7 @@ class CatalystSqlParser(conf: SQLConf) extends AbstractSqlParser {

/** For test-only. */
object CatalystSqlParser extends AbstractSqlParser {
val astBuilder = new AstBuilder(new SQLConf())
val astBuilder = new AstBuilder(SQLConf.get)
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -314,6 +314,12 @@ object SQLConf {
.booleanConf
.createWithDefault(true)

val ANSI_SQL_PARSER =
buildConf("spark.sql.parser.ansi.enabled")
.doc("When true, tries to conform to ANSI SQL syntax.")
.booleanConf
.createWithDefault(false)

val ESCAPED_STRING_LITERALS = buildConf("spark.sql.parser.escapedStringLiterals")
.internal()
.doc("When true, string literals (including regex patterns) remain escaped in our SQL " +
Expand Down Expand Up @@ -1832,6 +1838,8 @@ class SQLConf extends Serializable with Logging {

def constraintPropagationEnabled: Boolean = getConf(CONSTRAINT_PROPAGATION_ENABLED)

def ansiParserEnabled: Boolean = getConf(ANSI_SQL_PARSER)

def escapedStringLiterals: Boolean = getConf(ESCAPED_STRING_LITERALS)

def fileCompressionFactor: Double = getConf(FILE_COMRESSION_FACTOR)
Expand Down
Loading