From d8163fe49664c46631c1e317a82b6e2e6fa17d92 Mon Sep 17 00:00:00 2001
From: Phillip LeBlanc <phillip@leblanc.tech>
Date: Tue, 18 Feb 2025 14:29:24 +0900
Subject: [PATCH 1/2] Add docs on `time_partition_column`

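For reviewers, a rough illustration of why the extra column helps. This is
a sketch of the general idea only, not Spice's implementation:
`build_refresh_filter` is a hypothetical helper, the SQL shape is an
assumption, and the column names are taken from the example added to
data-refresh.md.

```python
from datetime import datetime, timezone


def build_refresh_filter(
    time_column: str,
    partition_column: str,
    newest_seen: datetime,
) -> str:
    """Return a SQL predicate for an append refresh (hypothetical helper).

    The timestamp bound selects only new rows. The derived date bound lets
    the source skip every partition older than the newest accelerated row.
    """
    ts_bound = newest_seen.isoformat()
    # Truncate the timestamp to its calendar day to match a `date` partition.
    day_bound = newest_seen.date().isoformat()
    # `>=` on the partition: the newest row's own day may still receive rows.
    return (
        f"{time_column} > '{ts_bound}' "
        f"AND {partition_column} >= '{day_bound}'"
    )


predicate = build_refresh_filter(
    "created_at",
    "created_at_day",
    datetime(2025, 2, 18, 5, 29, 24, tzinfo=timezone.utc),
)
print(predicate)
```

Without the second condition, a source partitioned by `created_at_day` may
have to open every partition to evaluate the `created_at` bound.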
---
 .../features/data-acceleration/data-refresh.md     | 14 ++++++++++++++
 website/docs/reference/spicepod/datasets.md        | 11 +++++++++++
 2 files changed, 25 insertions(+)

diff --git a/website/docs/features/data-acceleration/data-refresh.md b/website/docs/features/data-acceleration/data-refresh.md
index db44c2066..1ad92aefc 100644
--- a/website/docs/features/data-acceleration/data-refresh.md
+++ b/website/docs/features/data-acceleration/data-refresh.md
@@ -58,6 +58,20 @@ datasets:
 
 If late arriving data or clock-skew needs to be accounted for, an optional overlap can also be specified. See [`acceleration.refresh_append_overlap`](/docs/reference/spicepod/datasets#accelerationrefresh_append_overlap).
 
+Datasets that are physically partitioned by a coarser time column (e.g. day, month, or year) can set the `time_partition_column` parameter in addition to the `time_column` parameter. The partition column is then used for efficient partition pruning.
+
+Example:
+
+```yaml
+datasets:
+  - from: databricks:my_dataset
+    name: accelerated_dataset
+    time_column: created_at
+    time_format: iso8601
+    time_partition_column: created_at_day
+    time_partition_format: date
+```
+
 ### Changes (CDC)
 
 Datasets configured with acceleration `refresh_mode: changes` requires a [Change Data Capture (CDC)](/docs/features/cdc/index.md) supported data connector. Initial CDC support in Spice is supported by the [Debezium data connector](/docs/components/data-connectors/debezium.md).
diff --git a/website/docs/reference/spicepod/datasets.md b/website/docs/reference/spicepod/datasets.md
index f129f4624..900dd2d86 100644
--- a/website/docs/reference/spicepod/datasets.md
+++ b/website/docs/reference/spicepod/datasets.md
@@ -150,6 +150,7 @@ Optional. The format of the `time_column`. The following values are supported:
 - `unix_seconds` - Unix timestamp in seconds. E.g. `1718756687`.
 - `unix_millis` - Unix timestamp in milliseconds. E.g. `1718756687000`.
 - `ISO8601` - [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) format.
+- `date` - Date in YYYY-MM-DD format. E.g. `2024-01-01`.
 
 Spice emits a warning if the `time_column` from the data source is incompatible with the `time_format` config.
 
@@ -159,6 +160,16 @@ Spice emits a warning if the `time_column` from the data source is incompatible
 
 :::
 
+## `time_partition_column`
+
+Optional. The name of the column that represents the time-based partitioning of the dataset. Requires `time_column` to be set.
+
+This parameter is used when the dataset is partitioned by a less-granular time-column (e.g. day, month, year), but the data source has a more granular time-column available (e.g. timestamp). This can ensure that queries for a specific time range are optimized by the data source to use the appropriate partitions.
+
+## `time_partition_format`
+
+Optional. The format of the `time_partition_column`. The same format options as `time_format` are supported.
+
 ## `unsupported_type_action`
 
 Optional. Specifies the action to take when a data type that is not supported by the data connector is encountered.

From b349cdd86ec8aa4adad652ef542d6ff6e75d2530 Mon Sep 17 00:00:00 2001
From: Luke Kim <80174+lukekim@users.noreply.github.com>
Date: Tue, 18 Feb 2025 14:38:44 +0900
Subject: [PATCH 2/2] Tweak docs

---
 website/docs/reference/spicepod/datasets.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/website/docs/reference/spicepod/datasets.md b/website/docs/reference/spicepod/datasets.md
index 900dd2d86..10d3cece5 100644
--- a/website/docs/reference/spicepod/datasets.md
+++ b/website/docs/reference/spicepod/datasets.md
@@ -162,13 +162,11 @@ Spice emits a warning if the `time_column` from the data source is incompatible
 
 ## `time_partition_column`
 
-Optional. The name of the column that represents the time-based partitioning of the dataset. Requires `time_column` to be set.
-
-This parameter is used when the dataset is partitioned by a less-granular time-column (e.g. day, month, year), but the data source has a more granular time-column available (e.g. timestamp). This can ensure that queries for a specific time range are optimized by the data source to use the appropriate partitions.
+(Optional) Specify the column that represents the physical partitioning of the dataset when using append-based acceleration. When the defined `time_column` is a fine-grained timestamp and the dataset is physically partitioned at a coarser granularity (for example, by date), setting `time_partition_column` to the partition column (e.g. `date_col`) improves partition pruning, excludes irrelevant partitions during refreshes, and reduces the amount of data scanned.
 
 ## `time_partition_format`
 
-Optional. The format of the `time_partition_column`. The same format options as `time_format` are supported.
+(Optional) Define the format of the `time_partition_column`. For instance, if the physical partitions follow a date format (`YYYY-MM-DD`), set this value to `date`. The same values as `time_format` are supported.
 
 ## `unsupported_type_action`