# Delta Lake Lab
## Unit 6: Time Travel

In the previous unit, we:
1. Learned how to change the schema of tables that already contain data, and reviewed the impact on the files in the data lake and on the transaction log

In this unit, we will:
1. Study Delta Lake's time travel support

### 1. Imports

In [None]:
import pandas as pd

from pyspark.sql.functions import month, date_format
from pyspark.sql.types import IntegerType
from pyspark.sql import SparkSession

from delta.tables import *

import warnings
warnings.filterwarnings('ignore')

### 2. Create a Spark session powered by Cloud Dataproc

In [None]:
spark = SparkSession.builder.appName('Loan Analysis').getOrCreate()
spark

### 3. Declare variables

In [None]:
project_id_output = !gcloud config list --format "value(core.project)" 2>/dev/null
PROJECT_ID = project_id_output[0]
print("PROJECT_ID: ", PROJECT_ID)

In [None]:
project_name_output = !gcloud projects describe $PROJECT_ID | grep name | cut -d':' -f2 | xargs
PROJECT_NAME = project_name_output[0]
print("PROJECT_NAME: ", PROJECT_NAME)

In [None]:
project_number_output = !gcloud projects describe $PROJECT_ID | grep projectNumber | cut -d':' -f2 | xargs
PROJECT_NUMBER = project_number_output[0]
print("PROJECT_NUMBER: ", PROJECT_NUMBER)

In [None]:
ACCOUNT_NAME = "YOUR_ACCOUNT_NAME"

In [None]:
DATA_LAKE_ROOT_PATH = f"gs://dll-data-bucket-{PROJECT_NUMBER}-{ACCOUNT_NAME}"
DELTA_LAKE_DIR_ROOT = f"{DATA_LAKE_ROOT_PATH}/delta-consumable"
print(DELTA_LAKE_DIR_ROOT)

In [None]:
!gsutil ls -r $DELTA_LAKE_DIR_ROOT

### 4. History

In [None]:
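# List every commit recorded for the table, keeping only the columns of interest
history_df = spark.sql(f"DESCRIBE HISTORY {ACCOUNT_NAME}_loan_db.loans_by_state_delta")
history_df.select("version", "timestamp", "operation", "operationParameters").show(truncate=False)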
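
When a table has many commits, it can help to look at only the most recent ones. `DESCRIBE HISTORY` returns commits newest first and accepts a `LIMIT` clause. The helper below is a minimal sketch that only builds the statement text. The function name `history_query` and the account name `jdoe` are made up for illustration and are not part of this lab.

```python
def history_query(table_name, limit=None):
    """Build a DESCRIBE HISTORY statement, optionally limited to the newest `limit` commits."""
    if limit is not None and (not isinstance(limit, int) or limit < 1):
        raise ValueError("limit must be a positive integer")
    query = f"DESCRIBE HISTORY {table_name}"
    if limit is not None:
        query += f" LIMIT {limit}"
    return query


# Hypothetical account name, used only so the sketch runs on its own
example_table = "jdoe_loan_db.loans_by_state_delta"
print(history_query(example_table))
print(history_query(example_table, limit=3))
```

In this notebook the resulting string would be passed to `spark.sql(...)`, with the table name built from `ACCOUNT_NAME` as in the cell above.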

### 5. Let's look at a few versions

In [None]:
spark.sql("SELECT * FROM "+ ACCOUNT_NAME +"_loan_db.loans_by_state_delta VERSION AS OF 1 where addr_state='IA'").show()
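
The query above pins the table to a version number. Delta Lake SQL also accepts `TIMESTAMP AS OF` for reading the table as of a point in time. The sketch below builds either form of the statement. The helper name `time_travel_query`, the account name `jdoe`, and the example timestamp are illustrative assumptions. A real timestamp has to fall within the range shown by `DESCRIBE HISTORY`.

```python
def time_travel_query(table_name, version=None, timestamp=None, where=None):
    """Build a SELECT that reads a Delta table at a given version or timestamp."""
    # Exactly one of the two time travel selectors must be supplied
    if (version is None) == (timestamp is None):
        raise ValueError("pass exactly one of version or timestamp")
    if version is not None:
        clause = f"VERSION AS OF {int(version)}"
    else:
        clause = f"TIMESTAMP AS OF '{timestamp}'"
    query = f"SELECT * FROM {table_name} {clause}"
    if where:
        query += f" WHERE {where}"
    return query


# Hypothetical account name and timestamp, used only for illustration
example_table = "jdoe_loan_db.loans_by_state_delta"
print(time_travel_query(example_table, version=1, where="addr_state = 'IA'"))
print(time_travel_query(example_table, timestamp="2023-01-01 00:00:00"))
```

The helper concatenates strings, which is fine for a lab notebook but should not be fed untrusted input.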

In [None]:
spark.sql("SELECT * FROM "+ ACCOUNT_NAME +"_loan_db.loans_by_state_delta VERSION AS OF 10 where addr_state='IA'").show()

In [None]:
spark.sql("SELECT * FROM "+ ACCOUNT_NAME +"_loan_db.loans_by_state_delta VERSION AS OF 5 where addr_state='IA'").show()
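
Each query above reads the table as it stood at one commit. Conceptually, Delta Lake answers such a query by replaying the transaction log up to the requested version and keeping the data files that were added and not yet removed. The toy model below is a simplified sketch of that idea in plain Python. The file names and the three-commit log are invented, and the real log also carries metadata, statistics, and checkpoints that are left out here.

```python
def files_as_of(commits, version):
    """Return the set of data files that make up the table at `version`.

    `commits` is a list where index N holds the actions of commit N,
    each action being an ("add" | "remove", file_path) pair.
    """
    if not 0 <= version < len(commits):
        raise ValueError(f"version must be between 0 and {len(commits) - 1}")
    live = set()
    # Replay commits 0..version in order, applying each action
    for actions in commits[: version + 1]:
        for action, path in actions:
            if action == "add":
                live.add(path)
            elif action == "remove":
                live.discard(path)
    return live


# Invented log: an initial write, an append, then a rewrite of the first file
toy_log = [
    [("add", "part-0000.parquet")],
    [("add", "part-0001.parquet")],
    [("remove", "part-0000.parquet"), ("add", "part-0002.parquet")],
]

for v in range(len(toy_log)):
    print(v, sorted(files_as_of(toy_log, v)))
```

A removed file is only dropped from the current snapshot. It stays in storage until it is vacuumed, which is why older versions remain readable for a while.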

### THIS CONCLUDES THIS UNIT. PROCEED TO THE NEXT NOTEBOOK