Add PostgreSQL-compatible EXTRACT units #100274

Merged
alexey-milovidov merged 4 commits into master from extract-postgres-units on Mar 21, 2026

@alexey-milovidov
Member

Add PostgreSQL-compatible units to the EXTRACT operator: EPOCH, DOW, DOY, ISODOW, ISOYEAR, WEEK, CENTURY, DECADE, MILLENNIUM.

These are parsed as extract-only units (not interval kinds) and mapped to existing ClickHouse functions:

  • EPOCH → toUnixTimestamp
  • DOW → toDayOfWeek(expr, 2) (0 = Sunday, 6 = Saturday)
  • DOY → toDayOfYear
  • ISODOW → toDayOfWeek(expr) (1 = Monday, 7 = Sunday)
  • ISOYEAR → toISOYear
  • WEEK → toISOWeek (previously threw an error)
  • CENTURY → intDiv(toYear(expr) - 1, 100) + 1
  • DECADE → intDiv(toYear(expr), 10)
  • MILLENNIUM → intDiv(toYear(expr) - 1, 1000) + 1
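
The mapping above can be read as a lookup from unit keyword to a function-call template. The following is a minimal Python sketch of that idea, not the C++ parser code from this PR: the names `EXTRACT_UNIT_TEMPLATES` and `rewrite_extract` are hypothetical, and the case-insensitive match is an assumption based on the case-insensitive keyword tests mentioned later in this conversation.

```python
# Hypothetical lookup table mirroring the unit -> function mapping listed above.
# "{e}" stands for the argument expression; this is not the C++ parser code.
EXTRACT_UNIT_TEMPLATES = {
    "EPOCH": "toUnixTimestamp({e})",
    "DOW": "toDayOfWeek({e}, 2)",
    "DOY": "toDayOfYear({e})",
    "ISODOW": "toDayOfWeek({e})",
    "ISOYEAR": "toISOYear({e})",
    "WEEK": "toISOWeek({e})",
    "CENTURY": "intDiv(toYear({e}) - 1, 100) + 1",
    "DECADE": "intDiv(toYear({e}), 10)",
    "MILLENNIUM": "intDiv(toYear({e}) - 1, 1000) + 1",
}


def rewrite_extract(unit: str, expr: str) -> str:
    """Return the function-call text that EXTRACT(unit FROM expr) maps to."""
    key = unit.upper()  # assumption: unit keywords are matched case-insensitively
    if key not in EXTRACT_UNIT_TEMPLATES:
        raise ValueError(f"unsupported EXTRACT unit: {unit}")
    return EXTRACT_UNIT_TEMPLATES[key].format(e=expr)


print(rewrite_extract("dow", "ts"))
print(rewrite_extract("CENTURY", "ts"))
```

A table keeps the extract-only units separate from interval kinds, which matches the description's note that these units are not parsed as interval kinds.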

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Add PostgreSQL-compatible units to the EXTRACT operator: EPOCH, DOW, DOY, ISODOW, ISOYEAR, WEEK, CENTURY, DECADE, MILLENNIUM. Also fix EXTRACT(WEEK FROM date) which previously threw an error.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

The EXTRACT operator now supports additional PostgreSQL-compatible units:

  • EXTRACT(EPOCH FROM expr): seconds since 1970-01-01 00:00:00 UTC
  • EXTRACT(DOW FROM expr): day of week (0 = Sunday, 6 = Saturday)
  • EXTRACT(DOY FROM expr): day of year (1–366)
  • EXTRACT(ISODOW FROM expr): ISO day of week (1 = Monday, 7 = Sunday)
  • EXTRACT(ISOYEAR FROM expr): ISO 8601 week-numbering year
  • EXTRACT(WEEK FROM expr): ISO 8601 week number (1–53)
  • EXTRACT(CENTURY FROM expr): century (years 2001–2100 are century 21)
  • EXTRACT(DECADE FROM expr): decade (year divided by 10, rounded down; 2024 gives 202)
  • EXTRACT(MILLENNIUM FROM expr): millennium (years 2001–3000 are millennium 3)
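
To make the documented values concrete, here is a rough Python model of what each unit is described as returning. It is only a sketch built on Python's `datetime` and `calendar` modules: the function name `extract_unit` is hypothetical, and the model assumes the ClickHouse functions behave exactly as the list above states.

```python
import calendar
from datetime import datetime, timezone


def extract_unit(unit: str, dt: datetime) -> int:
    """Model the documented value of each unit for a timezone-aware datetime."""
    iso_year, iso_week, iso_dow = dt.isocalendar()
    values = {
        # Whole seconds since 1970-01-01 00:00:00 UTC.
        "EPOCH": calendar.timegm(dt.utctimetuple()),
        # ISO weekday is 1..7 with Monday = 1, so Sunday (7) wraps to 0.
        "DOW": iso_dow % 7,
        "DOY": dt.timetuple().tm_yday,
        "ISODOW": iso_dow,
        "ISOYEAR": iso_year,
        "WEEK": iso_week,
        # Same integer arithmetic as the intDiv expressions in the PR description.
        "CENTURY": (dt.year - 1) // 100 + 1,
        "DECADE": dt.year // 10,
        "MILLENNIUM": (dt.year - 1) // 1000 + 1,
    }
    return values[unit.upper()]


monday = datetime(2024, 1, 15, 12, 30, 45, tzinfo=timezone.utc)
for unit in ("EPOCH", "DOW", "DOY", "ISODOW", "WEEK", "CENTURY", "DECADE", "MILLENNIUM"):
    print(unit, extract_unit(unit, monday))
```

The `- 1` in the century and millennium formulas is what places year 2000 in century 20 and millennium 2, while year 2001 starts century 21 and millennium 3.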

Example:

SELECT EXTRACT(EPOCH FROM now());
SELECT EXTRACT(DOW FROM toDate('2024-01-15')); -- 1 (Monday)
SELECT EXTRACT(CENTURY FROM toDate('2024-01-01')); -- 21
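
The ISOYEAR and WEEK units follow ISO 8601 week numbering, so near a year boundary they can disagree with the calendar year. The sketch below uses Python's `date.isocalendar()` to show two such dates; the helper name `iso_year_and_week` is hypothetical, and it is an assumption that `toISOYear` and `toISOWeek` agree with Python on these dates.

```python
from datetime import date


def iso_year_and_week(d: date) -> tuple:
    """Return (ISO week-numbering year, ISO week number) for a calendar date."""
    iso_year, iso_week, _iso_weekday = d.isocalendar()
    return (iso_year, iso_week)


# Two dates whose ISO year differs from their calendar year.
for d in (date(2024, 12, 30), date(2021, 1, 1)):
    iso_year, iso_week = iso_year_and_week(d)
    print(d.isoformat(), "calendar year:", d.year, "ISO year:", iso_year, "ISO week:", iso_week)
```

2024-12-30 is a Monday whose week contains the first Thursday of 2025, so it falls in ISO year 2025, week 1. 2021-01-01 is a Friday that still belongs to week 53 of ISO year 2020.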

…YEAR, WEEK, CENTURY, DECADE, MILLENNIUM

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Contributor

clickhouse-gh Bot commented Mar 21, 2026

Workflow [PR], commit [6661f10]

AI Review

Summary

This PR adds PostgreSQL-style EXTRACT units and maps them to existing functions, including enabling EXTRACT(WEEK FROM ...) via toISOWeek. The main correctness gap is EPOCH: mapping to toUnixTimestamp does not preserve PostgreSQL-compatible semantics for DateTime64 (fractional seconds are lost and the return type/range semantics differ). Overall verdict: request changes until EPOCH behavior is aligned (or clearly scoped as intentionally different).

Findings

❌ Blockers

  • [src/Parsers/ExpressionListParsers.cpp:1640] EXTRACT(EPOCH FROM ...) is implemented as toUnixTimestamp(expr), which is not PostgreSQL-compatible for DateTime64 because subsecond precision is truncated and unsigned-second conversion semantics differ from PostgreSQL double precision epoch extraction.
    • Suggested fix: route EPOCH through a path that preserves fractional seconds for DateTime64 (e.g., toUnixTimestamp64* + scale to Float64, while keeping integer behavior for non-fractional types if desired), and ensure behavior for pre-epoch timestamps is explicitly defined and tested.
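
The difference the review points at can be illustrated outside ClickHouse. The Python sketch below contrasts a fractional epoch value with whole-second conversions for a timestamp with milliseconds and for a pre-epoch timestamp. The function names are hypothetical, and the sketch makes no claim about what `toUnixTimestamp` returns for pre-epoch input; it only shows why truncation and flooring need to be distinguished and tested.

```python
import math
from datetime import datetime, timezone


def epoch_fractional(dt: datetime) -> float:
    """Seconds since the Unix epoch, keeping the subsecond part."""
    return dt.timestamp()


def epoch_truncated(dt: datetime) -> int:
    """Whole seconds, dropping the fraction (rounds toward zero)."""
    return int(dt.timestamp())


def epoch_floored(dt: datetime) -> int:
    """Whole seconds, rounding toward negative infinity."""
    return math.floor(dt.timestamp())


with_millis = datetime(2024, 1, 15, 12, 30, 45, 123000, tzinfo=timezone.utc)
pre_epoch = datetime(1969, 12, 31, 23, 59, 59, 500000, tzinfo=timezone.utc)

for label, dt in (("with_millis", with_millis), ("pre_epoch", pre_epoch)):
    print(label, epoch_fractional(dt), epoch_truncated(dt), epoch_floored(dt))
```

For the first timestamp the two integer conversions agree and only the 0.123 s fraction is lost. For the pre-epoch timestamp the fractional value is -0.5, truncation gives 0 and flooring gives -1, which is the kind of ambiguity the suggested tests would pin down.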

ClickHouse Rules

Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time

Performance & Safety

  • EPOCH currently takes a lossy conversion path for DateTime64; this is a semantic correctness issue rather than a runtime performance issue, but it can silently drop precision in user queries.

Final Verdict

  • Status: ⚠️ Request changes
  • Minimum required actions:
    • Fix EXTRACT(EPOCH FROM DateTime64) to preserve PostgreSQL-compatible epoch semantics (fractional seconds and well-defined behavior across full supported timestamp range), or explicitly narrow/document non-compatibility and adjust naming/claims accordingly.

clickhouse-gh Bot added the pr-feature (Pull request with new product feature) label on Mar 21, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread src/Parsers/ExpressionListParsers.cpp
-- Test PostgreSQL-compatible EXTRACT units

-- EPOCH: seconds since 1970-01-01
SELECT EXTRACT(EPOCH FROM toDateTime('2024-01-15 12:30:45', 'UTC'));
Contributor

Please add a coverage case for EXTRACT(EPOCH FROM toDateTime64(...)).

Right now the new test checks only Date/DateTime, so it does not lock the intended behavior for fractional seconds. This is especially important because the implementation currently routes through toUnixTimestamp.

Add `DateTime64` test case for `EXTRACT(EPOCH FROM ...)` to document that
subsecond precision is truncated (returns integer seconds via
`toUnixTimestamp`). Add case-insensitive keyword tests and `DateTime64`
tests for DOW, DOY, ISOYEAR units. Document the truncation behavior in
the EPOCH docs entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread docs/en/sql-reference/operators/index.md
Update docs to mention `Date32` and `DateTime64` as supported types.
Add tests for all new PostgreSQL-compatible EXTRACT units (EPOCH, DOW,
DOY, ISODOW, ISOYEAR, WEEK, CENTURY, DECADE, MILLENNIUM) with `Date32`
and `DateTime64` arguments.

https://github.com/ClickHouse/ClickHouse/pull/100274/changes#diff-3b481535bcd78eeebb8fb9282ee4ca40a5e1a69f499146820b39fe5389e903ac

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Member Author

@alexey-milovidov alexey-milovidov left a comment

The code is good.

@clickhouse-gh
Contributor

clickhouse-gh Bot commented Mar 21, 2026

LLVM Coverage Report

| Metric | Baseline | Current | Δ |
|---|---|---|---|
| Lines | 83.80% | 83.80% | +0.00% |
| Functions | 24.60% | 24.60% | +0.00% |
| Branches | 76.50% | 76.40% | -0.10% |

PR changed-lines coverage: 97.54% (119/122, 1 noise line excluded)

alexey-milovidov self-assigned this on Mar 21, 2026
alexey-milovidov merged commit be3a15f into master on Mar 21, 2026
151 of 152 checks passed
alexey-milovidov deleted the extract-postgres-units branch on March 21, 2026 21:04
robot-ch-test-poll3 added the pr-synced-to-cloud (The PR is synced to the cloud repo) label on Mar 21, 2026