[SPARK-56876][SQL] Add TimestampNTZNanosType and TimestampLTZNanosType by MaxGekk · Pull Request #55952 · apache/spark

MaxGekk · 2026-05-18T08:50:14Z

What changes were proposed in this pull request?

In this PR, I propose to extend the Spark SQL type system and add new classes to the Scala/Java APIs:

TimestampNTZNanosType(p) represents the SQL data type TIMESTAMP_NTZ(p)
TimestampLTZNanosType(p) represents the SQL data type TIMESTAMP_LTZ(p)

They are public API entry points only, and have no SQL/DDL/datasource integration in this PR.

The classes align with the SQL standard’s direction for optional feature F555, “Enhanced seconds precision”: datetime types can carry fractional seconds with precision p in the SECOND field beyond the traditional six decimal places (microseconds). Here p is restricted to 7, 8, and 9, i.e. the nanosecond-capable band (up to nine fractional digits, nanoseconds in the second field).
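As a small illustration of what the precision band means, the sketch below maps a precision p to the smallest representable step in nanoseconds. `FractionalPrecision` and `resolutionNanos` are hypothetical names used only for this example; they are not part of Spark or of this PR.

```scala
object FractionalPrecision {
  // Hypothetical helper, not Spark API: the smallest step, in nanoseconds,
  // that p fractional-second digits can express (10^(9 - p)).
  def resolutionNanos(p: Int): Long = {
    // The PR restricts p to the nanosecond-capable band 7..9.
    require(p >= 7 && p <= 9, s"precision $p is outside [7, 9]")
    (0 until (9 - p)).foldLeft(1L)((acc, _) => acc * 10L)
  }
}
```

Under this reading, p = 7 corresponds to steps of 100 ns, p = 8 to 10 ns, and p = 9 to 1 ns.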

The logical layout documented on the classes matches this precision story: epoch microseconds plus nanoseconds within that microsecond, with a default estimated width of 10 bytes for planning (8 + 2).
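The layout described above can be sketched as a pair of values. This is only an illustration of the 8 + 2 byte split; `NanosLayout`, `MicrosAndNanos`, and both conversion functions are hypothetical names, and taking the input as a single `Long` of epoch nanoseconds is an assumption made for brevity (it covers a narrower range than epoch microseconds do).

```scala
object NanosLayout {
  // Hypothetical holder mirroring the documented logical layout:
  // 8 bytes of epoch microseconds plus 2 bytes of nanoseconds within that microsecond.
  final case class MicrosAndNanos(epochMicros: Long, nanosWithinMicro: Short)

  // Floor-based division keeps nanosWithinMicro in [0, 999] for pre-epoch instants too.
  def fromEpochNanos(epochNanos: Long): MicrosAndNanos =
    MicrosAndNanos(
      Math.floorDiv(epochNanos, 1000L),
      Math.floorMod(epochNanos, 1000L).toShort)

  // Inverse conversion; throws ArithmeticException if the result overflows a Long.
  def toEpochNanos(v: MicrosAndNanos): Long =
    Math.addExact(Math.multiplyExact(v.epochMicros, 1000L), v.nanosWithinMicro.toLong)
}
```

For example, 1500 ns after the epoch splits into 1 microsecond plus 500 ns, and 1 ns before the epoch splits into -1 microseconds plus 999 ns.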

Parameterless timestamp_ntz / timestamp_ltz are unchanged and remain the existing microsecond-oriented types.

Why are the changes needed?

New timestamp types are useful for Spark SQL users because they make it possible to:

Represent timestamp without time zone and timestamp with local time zone with fractional-second precision 7–9, in line with SQL optional feature F555 (Enhanced seconds precision).
Describe schemas from other systems that already use nanosecond-capable timestamps, without overloading microsecond timestamp_ntz / timestamp_ltz types.
Migrate SQL and metadata that distinguish NTZ and LTZ at sub-microsecond precision toward Spark in small, reviewable steps.
Prepare later work to read and write such columns from datasources and JDBC, and to apply optimizations that depend on precise timestamp types.

Does this PR introduce any user-facing change?

Public API adds two new types in org.apache.spark.sql.types; they cannot yet be used in DataFrames, schemas read from datasources, or SQL DDL.

How was this patch tested?

By extending DataTypeSuite (round-trip and precision bounds for the new types, including invalid precisions).

$ build/sbt "test:testOnly *DataTypeSuite"

Plus SparkThrowableSuite / error-json validation if error-conditions.json is updated.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

Anchor both types to their parameterless counterparts (TimestampType and TimestampNTZType) and state plainly that no time zone is stored, replacing the ambiguous "time zone affects interpretation only" phrase that could read as if the type carried a zone tag. Co-authored-by: Isaac

stevomitric · 2026-05-20T09:45:00Z

+ * @since 4.2.0
+ */
+@Unstable
+case class TimestampLTZNanosType(precision: Int) extends DatetimeType {


The current timestamp type doesn't include "LTZ" in the name. Why not go with TimestampNanosType here?

First of all, because the SPIP https://docs.google.com/document/d/1DeW15QueI4PdRyPm6C6jsTZFmIjbXX2j4h-Ja5W_fsg/edit?usp=sharing defines the class under this name. You might ask why I named it this way in the SPIP; there are a few reasons:

Pairs with TimestampNTZNanosType. Spark already has two SQL timestamp families: with local time zone (TimestampType / TIMESTAMP_LTZ) and without (TimestampNTZType / TIMESTAMP_NTZ). The nanosecond-capable types follow the same split. On its own, TimestampNanosType reads as “the” nano timestamp type and does not signal which semantics apply.

Matches SQL and typeName. The class backs timestamp_ltz(p). TimestampLTZNanosType lines up with TimestampNTZNanosType and with the SPIP/SQL names; TimestampNanosType would mirror neither timestamp_ntz nor the explicit TIMESTAMP_LTZ(n) surface.

Consistency with how Spark names the NTZ side. TimestampType omits “LTZ” for history (timestamp defaulted to session-local semantics), but TimestampNTZType is explicit because the second variant exists. For new APIs where both variants are first-class, being explicit on both sides avoids the ambiguity that already bites people (TimestampType vs “timestamp with TZ” in docs).

Safer for pattern matches and downstream code. Much of the codebase branches TimestampType vs TimestampNTZType. TimestampLTZNanosType + TimestampNTZNanosType extend that model predictably; TimestampNanosType would be assumed LTZ-by-analogy-to-TimestampType, which is easy to get wrong in reviews and refactors.
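To make the pattern-matching argument concrete, here is a minimal sketch with stand-in types. Everything in `TimestampFamilies` is hypothetical and declared only for this example; these are not Spark's classes, and the real types are not a sealed hierarchy like this one.

```scala
object TimestampFamilies {
  // Stand-ins for illustration only.
  sealed trait TsType
  case object MicrosLTZ extends TsType                      // plays the role of TimestampType
  case object MicrosNTZ extends TsType                      // plays the role of TimestampNTZType
  final case class NanosLTZ(precision: Int) extends TsType  // plays the role of TimestampLTZNanosType
  final case class NanosNTZ(precision: Int) extends TsType  // plays the role of TimestampNTZNanosType

  // With explicit LTZ/NTZ names on both sides, each branch reads unambiguously.
  def usesSessionTimeZone(t: TsType): Boolean = t match {
    case MicrosLTZ | NanosLTZ(_) => true
    case MicrosNTZ | NanosNTZ(_) => false
  }
}
```

A type named without the LTZ marker would have to be placed in the first branch by convention alone, which is the kind of mistake the explicit naming is meant to prevent.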

stevomitric · 2026-05-20T09:46:27Z

      cause = null)
  }
+
+  def unsupportedTimestampNtzPrecisionError(precision: String): Throwable = {


Why is precision a string here?

So that any garbage supplied by a user can be reported as-is when the type is parsed from JSON. The regex captures p as text. For values like "9" * 20, p.toInt throws NumberFormatException. I catch that and raise UNSUPPORTED_TIMESTAMP_*_PRECISION with the original digit string in the error (see DataTypeSuite); I do not want a bare NumberFormatException or a misleading message.
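A rough, self-contained sketch of that flow follows. It is not the code from this PR: `PrecisionParsing`, `parse`, the regex, and the `Either`-based result are assumptions made for illustration, and the strings in `Left` only borrow the error names used in this discussion.

```scala
import java.util.Locale
import scala.util.Try
import scala.util.matching.Regex

object PrecisionParsing {
  // Hypothetical pattern; the actual regex in nameToType may differ.
  private val TimestampWithPrecision: Regex = """timestamp_(ltz|ntz)\((\d+)\)""".r

  // Right carries (kind, precision); Left carries an error name and the offending text.
  def parse(name: String): Either[String, (String, Int)] = name match {
    case TimestampWithPrecision(kind, digits) =>
      // toInt overflows for inputs such as "9" * 20, so keep the digits as a string.
      Try(digits.toInt).toOption match {
        case Some(p) if p >= 7 && p <= 9 => Right((kind, p))
        case _ =>
          Left(s"UNSUPPORTED_TIMESTAMP_${kind.toUpperCase(Locale.ROOT)}_PRECISION: $digits")
      }
    // Negative, empty, non-numeric, and uppercase forms never match the regex.
    case _ => Left(s"INVALID_JSON_DATA_TYPE: $name")
  }
}
```

Because the error is built from the captured text rather than from a parsed number, an overflowing input is reported with exactly the digits the user supplied.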

MaxGekk · 2026-05-20T13:06:25Z

@dongjoon-hyun @cloud-fan @felixcheung @peter-toth @mridulm @sunchao This is the initial PR for the SPIP SPARK-56822 "Timestamps with nanosecond precision". It contains the minimum changes needed to unblock parallel work on the new types. Please review it.

peter-toth · 2026-05-20T14:19:43Z

  },
+  "UNSUPPORTED_TIMESTAMP_LTZ_PRECISION" : {
+    "message" : [
+      "The seconds precision <precision> of TIMESTAMP_LTZ is out of the supported range [7, 9]."


Would it make sense to mention parameterless TIMESTAMP_LTZ as a viable option for precision < 7?

peter-toth

LGTM, just a nit.

…ISION Replace UNSUPPORTED_TIMESTAMP_{LTZ,NTZ}_PRECISION (sqlState 0A001 was "feature not supported") with a single INVALID_TIMESTAMP_PRECISION parameterized on <type>, sqlState 22023 ("invalid parameter value"). Message now points users at parameterless TIMESTAMP_LTZ / TIMESTAMP_NTZ for precision <= 6, addressing peter-toth's review comment. Co-authored-by: Isaac

MaxGekk · 2026-05-21T07:05:19Z

Merging to master. Thank you, @stevomitric and @peter-toth, for the review.

MaxGekk added 10 commits May 18, 2026 10:44

Add TimestampNTZNanosType and TimestampLTZNanosType

59e49ed

Fix coding style

86e157d

Improve error messages

06ffd74

Handle precision overflow in nanos timestamp JSON parsing

e87f6ae

Convert NumberFormatException from overflowing precision strings into UNSUPPORTED_TIMESTAMP_{LTZ,NTZ}_PRECISION with the original digit string preserved. Co-authored-by: Isaac

Merge remote-tracking branch 'origin/master' into nanos-add-types

39584c5

Drop redundant nanos timestamp entries from otherTypes map

14106e7

The regex in nameToType already handles every valid precision for timestamp_ltz(n) / timestamp_ntz(n) and emits a precision-specific error for invalid ones, so the parallel enumeration was dead lookup. Co-authored-by: Isaac

Cover malformed JSON forms and DRY the SPARK-56876 parser test

4730b9b

Drive both timestamp_ltz and timestamp_ntz through a single loop and add coverage for malformed precision forms (negative, empty, non-numeric, uppercase) that fall through to INVALID_JSON_DATA_TYPE. Co-authored-by: Isaac

Use Locale.ROOT in DataTypeSuite to satisfy scalastyle

89616ff

Co-authored-by: Isaac

Apply scalafmt to TimestampNTZNanosType scaladoc

63f2bc3

Co-authored-by: Isaac

MaxGekk changed the title ~~[WIP][SPARK-56876][SQL] Add TimestampNTZNanosType and TimestampLTZNanosType~~ [SPARK-56876][SQL] Add TimestampNTZNanosType and TimestampLTZNanosType May 20, 2026

stevomitric reviewed May 20, 2026

peter-toth reviewed May 20, 2026

stevomitric approved these changes May 20, 2026

peter-toth approved these changes May 20, 2026

MaxGekk closed this in 1e59b7b May 21, 2026

MaxGekk mentioned this pull request May 22, 2026

[WIP][SPARK-56981][SQL] Add physical representation and UnsafeRow support for nanosecond timestamps #56059

Open

MaxGekk wants to merge 11 commits into apache:master from MaxGekk:nanos-add-types

3 participants
