From 7b4b29e8d955b43daa9ad28667e4fadbb9fce49a Mon Sep 17 00:00:00 2001
From: Kent Yao
Date: Thu, 12 Mar 2020 09:24:49 -0700
Subject: [PATCH] [SPARK-31131][SQL] Remove the unnecessary config spark.sql.legacy.timeParser.enabled

### What changes were proposed in this pull request?

spark.sql.legacy.timeParser.enabled should be removed from SQLConf and the migration guide; spark.sql.legacy.timeParserPolicy is the right one.

### Why are the changes needed?

fix doc

### Does this PR introduce any user-facing change?

no

### How was this patch tested?

Pass the Jenkins

Closes #27889 from yaooqinn/SPARK-31131.

Authored-by: Kent Yao
Signed-off-by: Dongjoon Hyun
---
 docs/sql-migration-guide.md                            |  2 +-
 .../scala/org/apache/spark/sql/internal/SQLConf.scala  | 11 +----------
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 49b79b6132e98..468dcdb96a8d1 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -70,7 +70,7 @@ license: |
 - Since Spark 3.0, Proleptic Gregorian calendar is used in parsing, formatting, and converting dates and timestamps as well as in extracting sub-components like years, days and etc. Spark 3.0 uses Java 8 API classes from the java.time packages that based on ISO chronology (https://docs.oracle.com/javase/8/docs/api/java/time/chrono/IsoChronology.html). In Spark version 2.4 and earlier, those operations are performed by using the hybrid calendar (Julian + Gregorian, see https://docs.oracle.com/javase/7/docs/api/java/util/GregorianCalendar.html). The changes impact on the results for dates before October 15, 1582 (Gregorian) and affect on the following Spark 3.0 API:
 
-  - Parsing/formatting of timestamp/date strings. This effects on CSV/JSON datasources and on the `unix_timestamp`, `date_format`, `to_unix_timestamp`, `from_unixtime`, `to_date`, `to_timestamp` functions when patterns specified by users is used for parsing and formatting. Since Spark 3.0, the conversions are based on `java.time.format.DateTimeFormatter`, see https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. New implementation performs strict checking of its input. For example, the `2015-07-22 10:00:00` timestamp cannot be parse if pattern is `yyyy-MM-dd` because the parser does not consume whole input. Another example is the `31/01/2015 00:00` input cannot be parsed by the `dd/MM/yyyy hh:mm` pattern because `hh` supposes hours in the range `1-12`. In Spark version 2.4 and earlier, `java.text.SimpleDateFormat` is used for timestamp/date string conversions, and the supported patterns are described in https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. The old behavior can be restored by setting `spark.sql.legacy.timeParser.enabled` to `true`.
+  - Parsing/formatting of timestamp/date strings. This effects on CSV/JSON datasources and on the `unix_timestamp`, `date_format`, `to_unix_timestamp`, `from_unixtime`, `to_date`, `to_timestamp` functions when patterns specified by users is used for parsing and formatting. Since Spark 3.0, we define our own pattern strings in `sql-ref-datetime-pattern.md`, which is implemented via `java.time.format.DateTimeFormatter` under the hood. New implementation performs strict checking of its input. For example, the `2015-07-22 10:00:00` timestamp cannot be parse if pattern is `yyyy-MM-dd` because the parser does not consume whole input. Another example is the `31/01/2015 00:00` input cannot be parsed by the `dd/MM/yyyy hh:mm` pattern because `hh` supposes hours in the range `1-12`. In Spark version 2.4 and earlier, `java.text.SimpleDateFormat` is used for timestamp/date string conversions, and the supported patterns are described in https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html. The old behavior can be restored by setting `spark.sql.legacy.timeParserPolicy` to `LEGACY`.
 
   - The `weekofyear`, `weekday`, `dayofweek`, `date_trunc`, `from_utc_timestamp`, `to_utc_timestamp`, and `unix_timestamp` functions use java.time API for calculation week number of year, day number of week as well for conversion from/to TimestampType values in UTC time zone.
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index b5f20466b1ec8..4926183969a80 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -2402,7 +2402,7 @@ object SQLConf {
       "When set to CORRECTED, classes from java.time.* packages are used for the same purpose. " +
       "The default value is EXCEPTION, RuntimeException is thrown when we will get different " +
       "results.")
-    .version("3.1.0")
+    .version("3.0.0")
     .stringConf
     .transform(_.toUpperCase(Locale.ROOT))
     .checkValues(LegacyBehaviorPolicy.values.map(_.toString))
@@ -2482,15 +2482,6 @@ object SQLConf {
     .checkValue(_ > 0, "The value of spark.sql.addPartitionInBatch.size must be positive")
     .createWithDefault(100)
 
-  val LEGACY_TIME_PARSER_ENABLED = buildConf("spark.sql.legacy.timeParser.enabled")
-    .internal()
-    .doc("When set to true, java.text.SimpleDateFormat is used for formatting and parsing " +
-      "dates/timestamps in a locale-sensitive manner. When set to false, classes from " +
-      "java.time.* packages are used for the same purpose.")
-    .version("3.0.0")
-    .booleanConf
-    .createWithDefault(false)
-
   val LEGACY_ALLOW_HASH_ON_MAPTYPE = buildConf("spark.sql.legacy.allowHashOnMapType")
     .doc("When set to true, hash expressions can be applied on elements of MapType. Otherwise, " +
       "an analysis exception will be thrown.")
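
As a brief illustration of the user-facing behavior the migration-guide hunk above describes, here is a minimal sketch of switching `spark.sql.legacy.timeParserPolicy` at runtime. The local `SparkSession` setup, object name, and column name are hypothetical and not part of this patch; only `to_timestamp` and the `spark.sql.legacy.timeParserPolicy` config come from the patched docs.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.to_timestamp

object TimeParserPolicyDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("timeParserPolicy-demo")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("2015-07-22 10:00:00").toDF("ts")

    // With the new Proleptic Gregorian parser (CORRECTED), the pattern must consume
    // the whole input, so a date-only pattern fails on this timestamp string and
    // to_timestamp yields null (under the default EXCEPTION policy it would raise
    // an upgrade error instead, since the old and new parsers disagree).
    spark.conf.set("spark.sql.legacy.timeParserPolicy", "CORRECTED")
    df.select(to_timestamp($"ts", "yyyy-MM-dd").as("parsed")).show(false)

    // LEGACY restores the Spark 2.4 behavior based on java.text.SimpleDateFormat,
    // which parses the leading date and ignores the trailing time-of-day part.
    spark.conf.set("spark.sql.legacy.timeParserPolicy", "LEGACY")
    df.select(to_timestamp($"ts", "yyyy-MM-dd").as("parsed")).show(false)

    spark.stop()
  }
}
```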