In this section, we’re going to explore how time travel works in Apache Iceberg. We’ll create a table, insert some data, and then move between the different snapshots that Iceberg generates along the way.

In [4]:
spark.stop()

In [5]:
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("IcebergSparkWithMinIO") \
    .config("spark.sql.catalog.iceberg_catalog", "org.apache.iceberg.spark.SparkCatalog") \
    .config("spark.sql.catalog.iceberg_catalog.type", "rest") \
    .config("spark.sql.catalog.iceberg_catalog.uri", "http://iceberg-rest:8181") \
    .config("spark.sql.catalog.iceberg_catalog.warehouse", "s3://warehouse/") \
    .config("spark.sql.catalog.iceberg_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO") \
    .config("spark.sql.catalog.iceberg_catalog.s3.endpoint", "http://minio:9000") \
    .config("spark.sql.catalog.iceberg_catalog.s3.access-key-id", "admin") \
    .config("spark.sql.catalog.iceberg_catalog.s3.secret-access-key", "password") \
    .config("spark.sql.catalog.iceberg_catalog.s3.path-style-access", "true") \
    .getOrCreate()

25/12/06 11:29:48 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
25/12/06 11:29:48 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
25/12/06 11:29:48 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.


## CREATE AND INSERT DATA IN THE TABLE

In [11]:
spark.sql("""
    CREATE OR REPLACE TABLE iceberg_catalog.default.test (
        id BIGINT,
        name STRING
    ) USING iceberg
""")

DataFrame[]

In [12]:
spark.sql("INSERT INTO iceberg_catalog.default.test VALUES (1, 'Alice'), (2, 'Bob')")

DataFrame[]

In [14]:
df = spark.sql("SELECT * FROM iceberg_catalog.default.test")
df.show()

+---+-----+
| id| name|
+---+-----+
|  1|Alice|
|  2|  Bob|
+---+-----+



In [15]:
spark.sql("SELECT * FROM iceberg_catalog.default.test.snapshots").show()

+--------------------+-------------------+---------+---------+--------------------+--------------------+
|        committed_at|        snapshot_id|parent_id|operation|       manifest_list|             summary|
+--------------------+-------------------+---------+---------+--------------------+--------------------+
|2025-12-06 11:34:...|1867844730161823211|     NULL|   append|s3://warehouse/de...|{spark.app.id -> ...|
+--------------------+-------------------+---------+---------+--------------------+--------------------+



The parent_id of snapshot 1867844730161823211 is NULL because it represents the very first snapshot of the table.
Next, we’re going to update the table so a new snapshot can be created.

## UPDATING A RECORD

In [17]:
spark.sql("UPDATE iceberg_catalog.default.test SET name = 'David' WHERE id = 1")
df = spark.sql("SELECT * FROM iceberg_catalog.default.test")
df.show()

+---+-----+
| id| name|
+---+-----+
|  1|David|
|  2|  Bob|
+---+-----+



In [18]:
spark.sql("SELECT * FROM iceberg_catalog.default.test.snapshots").show()

+--------------------+-------------------+-------------------+---------+--------------------+--------------------+
|        committed_at|        snapshot_id|          parent_id|operation|       manifest_list|             summary|
+--------------------+-------------------+-------------------+---------+--------------------+--------------------+
|2025-12-06 11:34:...|1867844730161823211|               NULL|   append|s3://warehouse/de...|{spark.app.id -> ...|
|2025-12-06 11:37:...| 510849903785427612|1867844730161823211|overwrite|s3://warehouse/de...|{spark.app.id -> ...|
+--------------------+-------------------+-------------------+---------+--------------------+--------------------+



We can now see that the table has two snapshots, and the parent_id of the latest snapshot points back to the first one.

## INSERT MORE DATA

In [19]:
spark.sql("INSERT INTO iceberg_catalog.default.test VALUES (3, 'Suraj'), (4, 'Mihir')")

DataFrame[]

In [23]:
df = spark.sql("SELECT * FROM iceberg_catalog.default.test")
df.show()

+---+-----+
| id| name|
+---+-----+
|  1|David|
|  2|  Bob|
|  3|Suraj|
|  4|Mihir|
+---+-----+



In [24]:
spark.sql("SELECT * FROM iceberg_catalog.default.test.snapshots").show()

+--------------------+-------------------+-------------------+---------+--------------------+--------------------+
|        committed_at|        snapshot_id|          parent_id|operation|       manifest_list|             summary|
+--------------------+-------------------+-------------------+---------+--------------------+--------------------+
|2025-12-06 11:34:...|1867844730161823211|               NULL|   append|s3://warehouse/de...|{spark.app.id -> ...|
|2025-12-06 11:37:...| 510849903785427612|1867844730161823211|overwrite|s3://warehouse/de...|{spark.app.id -> ...|
|2025-12-06 11:40:...|4397443594468568582| 510849903785427612|   append|s3://warehouse/de...|{spark.app.id -> ...|
+--------------------+-------------------+-------------------+---------+--------------------+--------------------+



We now have three snapshots, and the current one is 4397443594468568582.
Next, let’s travel back in time to the second snapshot, 510849903785427612.
According to the Iceberg documentation, the query to perform a rollback is:

CALL catalog_name.system.rollback_to_snapshot('{namespace}.{table}', {snapshot_id});

## Time Travel: Using the rollback_to_snapshot Method

In [30]:

spark.sql("""
CALL iceberg_catalog.system.rollback_to_snapshot('default.test', 510849903785427612)
""").show()

+--------------------+-------------------+
|previous_snapshot_id|current_snapshot_id|
+--------------------+-------------------+
| 4397443594468568582| 510849903785427612|
+--------------------+-------------------+



In [32]:
df = spark.sql("SELECT * FROM iceberg_catalog.default.test")
df.show()

+---+-----+
| id| name|
+---+-----+
|  1|David|
|  2|  Bob|
+---+-----+



We can see that we have successfully rolled back to the point in time when Alice’s name was updated to David.
Keep in mind that rollback_to_snapshot only moves backward: the target must be an ancestor of the current snapshot, so once you roll back, you cannot use it to move forward again.
Let’s try to advance to the snapshot we just left, 4397443594468568582; as expected, this results in an error.

In [33]:
spark.sql("""
CALL iceberg_catalog.system.rollback_to_snapshot('default.test', 4397443594468568582)
""").show()

Py4JJavaError: An error occurred while calling o108.sql.
: org.apache.iceberg.exceptions.ValidationException: Cannot roll back to snapshot, not an ancestor of the current state: 4397443594468568582
	at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:49)
	at org.apache.iceberg.SetSnapshotOperation.rollbackTo(SetSnapshotOperation.java:84)
	at org.apache.iceberg.SnapshotManager.rollbackTo(SnapshotManager.java:67)
	at org.apache.iceberg.spark.procedures.RollbackToSnapshotProcedure.lambda$call$0(RollbackToSnapshotProcedure.java:88)
	at org.apache.iceberg.spark.procedures.BaseProcedure.execute(BaseProcedure.java:107)
	at org.apache.iceberg.spark.procedures.BaseProcedure.modifyIcebergTable(BaseProcedure.java:88)
	at org.apache.iceberg.spark.procedures.RollbackToSnapshotProcedure.call(RollbackToSnapshotProcedure.java:83)
	at org.apache.spark.sql.execution.datasources.v2.CallExec.run(CallExec.scala:34)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:107)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437)
	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:220)
	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:638)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:629)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:659)
	at jdk.internal.reflect.GeneratedMethodAccessor54.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
	at java.base/java.lang.Thread.run(Thread.java:840)


In [34]:
spark.sql("SELECT * FROM iceberg_catalog.default.test.snapshots").show()

+--------------------+-------------------+-------------------+---------+--------------------+--------------------+
|        committed_at|        snapshot_id|          parent_id|operation|       manifest_list|             summary|
+--------------------+-------------------+-------------------+---------+--------------------+--------------------+
|2025-12-06 11:34:...|1867844730161823211|               NULL|   append|s3://warehouse/de...|{spark.app.id -> ...|
|2025-12-06 11:37:...| 510849903785427612|1867844730161823211|overwrite|s3://warehouse/de...|{spark.app.id -> ...|
|2025-12-06 11:40:...|4397443594468568582| 510849903785427612|   append|s3://warehouse/de...|{spark.app.id -> ...|
+--------------------+-------------------+-------------------+---------+--------------------+--------------------+
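
The failed call removed nothing: all three snapshots are still listed, and only the table's current pointer has moved. The error message says the target was "not an ancestor of the current state". A minimal sketch of that ancestry check in plain Python, using the snapshot IDs from the table above, might look like the following. The names `ancestors` and `can_roll_back` are illustrative and are not Iceberg APIs.

```python
# parent_id for each snapshot_id, copied from the snapshots table above.
PARENT_OF = {
    1867844730161823211: None,                 # first snapshot, no parent
    510849903785427612: 1867844730161823211,   # the UPDATE
    4397443594468568582: 510849903785427612,   # the second INSERT
}

def ancestors(snapshot_id, parent_of):
    """Return snapshot_id followed by its parents, newest first."""
    chain = []
    current = snapshot_id
    while current is not None:
        chain.append(current)
        current = parent_of.get(current)
    return chain

def can_roll_back(current_id, target_id, parent_of):
    """Hypothetical helper: True if target_id is in the current lineage."""
    return target_id in ancestors(current_id, parent_of)

# Before the rollback: current is the third snapshot, target is the second.
print(can_roll_back(4397443594468568582, 510849903785427612, PARENT_OF))
# After the rollback: current is the second snapshot, target is the third.
print(can_roll_back(510849903785427612, 4397443594468568582, PARENT_OF))
```

The first check passes and the second fails, which mirrors the successful rollback in In [30] and the ValidationException in In [33].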



## Time Travel: Using the rollback_to_timestamp Method

You can also travel through time using a timestamp. You don’t need to provide the exact commit time of a snapshot: Iceberg rolls back to the snapshot that was current at the specified time.

In [38]:
spark.sql("""
CALL iceberg_catalog.system.rollback_to_timestamp('default.test', TIMESTAMP '2025-12-06 11:34:50.00')
""").show()

+--------------------+-------------------+
|previous_snapshot_id|current_snapshot_id|
+--------------------+-------------------+
|  510849903785427612|1867844730161823211|
+--------------------+-------------------+



We can see that we are back at the first snapshot, the one created by our initial insert.
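
One way to picture the timestamp lookup is as a walk over the current snapshot's ancestors that stops at the newest one committed before the given time. The sketch below models this in plain Python, with commit times taken from the first three rows of the history output later in this section. `latest_ancestor_before` is a hypothetical helper, not an Iceberg API, and the real procedure may differ in detail.

```python
from datetime import datetime

# snapshot_id -> (parent_id, commit time), as reported for this table.
SNAPSHOTS = {
    1867844730161823211: (None, datetime(2025, 12, 6, 11, 34, 28, 143000)),
    510849903785427612: (1867844730161823211, datetime(2025, 12, 6, 11, 37, 9, 367000)),
    4397443594468568582: (510849903785427612, datetime(2025, 12, 6, 11, 40, 53, 185000)),
}

def latest_ancestor_before(current_id, ts, snapshots):
    """Walk up the parent chain and return the first snapshot older than ts."""
    sid = current_id
    while sid is not None:
        parent_id, committed_at = snapshots[sid]
        if committed_at < ts:
            # Ancestors only get older as we walk up, so the first hit is the newest.
            return sid
        sid = parent_id
    return None  # no ancestor is old enough

# The table pointed at the second snapshot when rollback_to_timestamp was called.
target = latest_ancestor_before(
    510849903785427612, datetime(2025, 12, 6, 11, 34, 50), SNAPSHOTS
)
print(target)
```

With the timestamp 11:34:50, only the first snapshot (committed at 11:34:28) is old enough, which matches the `current_snapshot_id` returned by the procedure.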

In [39]:
df = spark.sql("SELECT * FROM iceberg_catalog.default.test")
df.show()

+---+-----+
| id| name|
+---+-----+
|  1|Alice|
|  2|  Bob|
+---+-----+



Finally, we can move backward or forward in time using the following method:

CALL catalog_name.system.set_current_snapshot('{namespace}.{table}', {snapshot_id});

Let’s use it to return to the most recent snapshot, 4397443594468568582.

## Time Travel: Using the set_current_snapshot Method 

In [40]:
spark.sql("""
CALL iceberg_catalog.system.set_current_snapshot('default.test', 4397443594468568582)
""").show()

+--------------------+-------------------+
|previous_snapshot_id|current_snapshot_id|
+--------------------+-------------------+
| 1867844730161823211|4397443594468568582|
+--------------------+-------------------+



In [41]:
df = spark.sql("SELECT * FROM iceberg_catalog.default.test")
df.show()

+---+-----+
| id| name|
+---+-----+
|  1|David|
|  3|Suraj|
|  2|  Bob|
|  4|Mihir|
+---+-----+
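
The three procedures above all change which snapshot the table points to. If the goal is only to read an older version, Iceberg's Spark integration also documents `VERSION AS OF` and `TIMESTAMP AS OF` clauses on `SELECT` (available from Spark 3.3), which leave the current snapshot untouched. The sketch below only builds the SQL strings. The helper names are illustrative, and running the queries assumes the same `spark` session and catalog used in this notebook.

```python
TABLE = "iceberg_catalog.default.test"

def version_as_of(table, snapshot_id):
    """Build a read-only query against one snapshot ID."""
    return f"SELECT * FROM {table} VERSION AS OF {int(snapshot_id)}"

def timestamp_as_of(table, ts):
    """Build a read-only query against the snapshot that was current at ts."""
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{ts}'"

q1 = version_as_of(TABLE, 1867844730161823211)
q2 = timestamp_as_of(TABLE, "2025-12-06 11:38:00")
print(q1)
print(q2)

# With a live session, each string can be passed to spark.sql(...).show().
```

Because these are plain reads, a later `SELECT * FROM iceberg_catalog.default.test` without either clause still returns the current snapshot.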



In [42]:
spark.sql("""
SELECT * FROM iceberg_catalog.default.test.history
""").show(truncate=False)

+-----------------------+-------------------+-------------------+-------------------+
|made_current_at        |snapshot_id        |parent_id          |is_current_ancestor|
+-----------------------+-------------------+-------------------+-------------------+
|2025-12-06 11:34:28.143|1867844730161823211|NULL               |true               |
|2025-12-06 11:37:09.367|510849903785427612 |1867844730161823211|true               |
|2025-12-06 11:40:53.185|4397443594468568582|510849903785427612 |true               |
|2025-12-06 11:51:37.332|510849903785427612 |1867844730161823211|true               |
|2025-12-06 11:58:58.265|1867844730161823211|NULL               |true               |
|2025-12-06 12:03:30.521|4397443594468568582|510849903785427612 |true               |
+-----------------------+-------------------+-------------------+-------------------+
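
Unlike `snapshots`, which lists each snapshot once, the `history` output above has one row per change of the current snapshot, so every rollback and `set_current_snapshot` call appears as a new row. A small plain-Python sketch over those rows (IDs and times copied from the output; the variable names are illustrative) recovers the current snapshot and how often it has been made current:

```python
from collections import Counter

# (made_current_at, snapshot_id) rows copied from the history output above.
HISTORY = [
    ("2025-12-06 11:34:28.143", 1867844730161823211),
    ("2025-12-06 11:37:09.367", 510849903785427612),
    ("2025-12-06 11:40:53.185", 4397443594468568582),
    ("2025-12-06 11:51:37.332", 510849903785427612),
    ("2025-12-06 11:58:58.265", 1867844730161823211),
    ("2025-12-06 12:03:30.521", 4397443594468568582),
]

# Timestamps in this fixed format sort correctly as plain strings,
# so the maximum row is the most recent change of the current snapshot.
current_snapshot = max(HISTORY)[1]

# Count how many times each snapshot was made current.
times_current = Counter(snapshot_id for _, snapshot_id in HISTORY)

print(current_snapshot)
print(times_current[current_snapshot])
```

The last row wins, so the current snapshot is the one restored by `set_current_snapshot`, and each of the three snapshots appears twice: once when it was committed and once when a procedure made it current again.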

