**from_unixtime**

- Converts **Unix time**, the number of **seconds** since the Unix epoch (1970-01-01 00:00:00 UTC), to a **string** representation of the **timestamp**.

- The result is a **human-readable** timestamp, rendered in the Spark session time zone.


|unix_time (seconds) |   timestamp (UTC)    |
|--------------------|----------------------|
|1648974310|2022-04-03 08:25:10|


**Syntax**

     from_unixtime(timestamp: ColumnOrName, format: str = 'yyyy-MM-dd HH:mm:ss') 

**timestamp:** column of **Unix time** values, in **seconds**.

**format:** optional output pattern; the default is **yyyy-MM-dd HH:mm:ss**.

      # default: yyyy-MM-dd HH:mm:ss
      from_unixtime(col("timestamp_1")).alias("timestamp_1") 

      # custom format
      from_unixtime(col("timestamp_2"),"MM-dd-yyyy HH:mm:ss").alias("timestamp_2")
      from_unixtime(col("timestamp_3"),"MM-dd-yyyy").alias("timestamp_3")

**Returns:** a **string** column; when no format is given, values use the **default** pattern **yyyy-MM-dd HH:mm:ss**.
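To sanity-check a single value outside Spark, the same conversion can be sketched in plain Python. This is only a cross-check, not how Spark implements it: `format_unix_seconds` is a hypothetical helper, it assumes a UTC time zone, and it takes `strftime` patterns rather than Spark datetime patterns.

```python
from datetime import datetime, timezone


def format_unix_seconds(seconds: int, pattern: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Render epoch seconds as a UTC string (strftime pattern, not a Spark pattern)."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(pattern)


# Default layout, comparable to from_unixtime(col) with no format argument
default_text = format_unix_seconds(1709109264)

# Custom layouts, comparable to 'MM-dd-yyyy HH:mm:ss' and 'MM-dd-yyyy'
custom_text = format_unix_seconds(1709109264, "%m-%d-%Y %H:%M:%S")
date_only = format_unix_seconds(1709109264, "%m-%d-%Y")

print(default_text)  # 2024-02-28 08:34:24
print(custom_text)   # 02-28-2024 08:34:24
print(date_only)     # 02-28-2024
```

If the Spark session time zone is not UTC, the strings Spark returns will be shifted accordingly, so this check only lines up for a UTC session.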

In [0]:
%fs ls /FileStore/tables/

path,name,size,modificationTime
dbfs:/FileStore/tables/Emp_Hash-1.csv,Emp_Hash-1.csv,3312,1733110041000
dbfs:/FileStore/tables/Emp_Hash-2.csv,Emp_Hash-2.csv,6365,1733125960000
dbfs:/FileStore/tables/Emp_Hash-3.csv,Emp_Hash-3.csv,6385,1733126482000
dbfs:/FileStore/tables/Emp_Hash.csv,Emp_Hash.csv,3310,1733108841000
dbfs:/FileStore/tables/Flatten Nested Array.json,Flatten Nested Array.json,3756,1718618620000
dbfs:/FileStore/tables/Generate_Random_Data/,Generate_Random_Data/,0,0
dbfs:/FileStore/tables/InterviewQuestions/,InterviewQuestions/,0,0
dbfs:/FileStore/tables/MarketPrice.csv,MarketPrice.csv,19528,1719656208000
dbfs:/FileStore/tables/MultiLineJSON.json/,MultiLineJSON.json/,0,0
dbfs:/FileStore/tables/MultiLineJSON01.json/,MultiLineJSON01.json/,0,0
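
The modificationTime values in this listing have 13 digits, which suggests epoch **milliseconds** rather than seconds. Since from_unixtime expects seconds, such values would need dividing by 1000 first. A plain-Python sketch of that scaling, assuming UTC and using the Emp_Hash-1.csv value from the listing:

```python
from datetime import datetime, timezone

# modificationTime of Emp_Hash-1.csv, assumed to be epoch milliseconds
modification_time_ms = 1733110041000

# Integer-divide by 1000 to get epoch seconds, the unit from_unixtime expects
modification_time_s = modification_time_ms // 1000

modified_at = datetime.fromtimestamp(
    modification_time_s, tz=timezone.utc
).strftime("%Y-%m-%d %H:%M:%S")

print(modified_at)  # 2024-12-02 03:27:21
```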


In [0]:
import pyspark.sql.functions as f
from pyspark.sql.functions import col, expr, current_timestamp, to_timestamp, from_unixtime
from pyspark.sql.types import LongType

In [0]:
df = spark.read.csv("dbfs:/FileStore/tables/from_unixtime-1.csv", header=True, inferSchema=True)
display(df.limit(10))

Commodity_Index,Effective_Date,Start_Date,End_Date,Income,Delta_Value,Target_Id,Input_Timestamp_UTC,Update_Timestamp_UTC
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1068,1709109264,1709109264
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1071,1710234895,1710234895
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1068,1709109264,1709109264
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1071,1707813327,1707813327


In [0]:
# Cast the epoch columns to LongType so they match the Kafka schema
df_cast = df.withColumn('Input_Timestamp_UTC', f.col('Input_Timestamp_UTC').cast(LongType()))\
            .withColumn('Update_Timestamp_UTC', f.col('Update_Timestamp_UTC').cast(LongType()))

display(df_cast.limit(10))

Commodity_Index,Effective_Date,Start_Date,End_Date,Income,Delta_Value,Target_Id,Input_Timestamp_UTC,Update_Timestamp_UTC
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1068,1709109264,1709109264
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1071,1710234895,1710234895
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1068,1709109264,1709109264
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1071,1707813327,1707813327
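
Besides matching the Kafka schema, a 64-bit type is a safer home for epoch seconds. With inferSchema=True these columns are probably inferred as 32-bit integers (an assumption, since the schema is not printed here), and a signed 32-bit value runs out in January 2038. A plain-Python sketch of that limit, assuming UTC:

```python
from datetime import datetime, timezone

# Largest value a signed 32-bit integer can hold
INT32_MAX = 2**31 - 1

# The last moment representable as 32-bit epoch seconds
last_int32_moment = datetime.fromtimestamp(
    INT32_MAX, tz=timezone.utc
).strftime("%Y-%m-%d %H:%M:%S")

# The sample values in this dataset still fit, so the cast changes no data
sample_fits_in_int32 = 1709109264 <= INT32_MAX

print(last_int32_moment)     # 2038-01-19 03:14:07
print(sample_fits_in_int32)  # True
```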


In [0]:
df_cust = df_cast.select("Input_Timestamp_UTC", "Update_Timestamp_UTC",
                    from_unixtime(col("Input_Timestamp_UTC")).alias('default_input_timestamp_utc'),
                    from_unixtime(col("Update_Timestamp_UTC")).alias('default_last_update_timestamp_utc'),
                    from_unixtime(col("Input_Timestamp_UTC"), 'MM-dd-yyyy HH:mm:ss').alias('custom_input_timestamp_utc'),
                    from_unixtime(col("Update_Timestamp_UTC"), 'MM-dd-yyyy').alias('custom_last_update_timestamp_utc'))
display(df_cust.limit(10))

Input_Timestamp_UTC,Update_Timestamp_UTC,default_input_timestamp_utc,default_last_update_timestamp_utc,custom_input_timestamp_utc,custom_last_update_timestamp_utc
1709109264,1709109264,2024-02-28 08:34:24,2024-02-28 08:34:24,02-28-2024 08:34:24,02-28-2024
1710234895,1710234895,2024-03-12 09:14:55,2024-03-12 09:14:55,03-12-2024 09:14:55,03-12-2024
1709109264,1709109264,2024-02-28 08:34:24,2024-02-28 08:34:24,02-28-2024 08:34:24,02-28-2024
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024


In [0]:
df_all = df.select(current_timestamp().alias("created_timestamp"),
                   expr("current_user()").alias("created_by"),
                   from_unixtime(col("Input_Timestamp_UTC")).alias('input_wo_timestamp_utc'),
                   from_unixtime(col("Update_Timestamp_UTC")).alias('last_update_wo_timestamp_utc'),
                   to_timestamp(from_unixtime(col("Input_Timestamp_UTC")),'yyyy-MM-dd HH:mm:ss').alias('input_timestamp_utc'),
                   to_timestamp(from_unixtime(col("Update_Timestamp_UTC")),'yyyy-MM-dd HH:mm:ss').alias('last_update_timestamp_utc'))
display(df_all.limit(10))

created_timestamp,created_by,input_wo_timestamp_utc,last_update_wo_timestamp_utc,input_timestamp_utc,last_update_timestamp_utc
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-02-28 08:34:24,2024-02-28 08:34:24,2024-02-28T08:34:24Z,2024-02-28T08:34:24Z
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-03-12 09:14:55,2024-03-12 09:14:55,2024-03-12T09:14:55Z,2024-03-12T09:14:55Z
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-02-28 08:34:24,2024-02-28 08:34:24,2024-02-28T08:34:24Z,2024-02-28T08:34:24Z
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13 08:35:27,2024-02-13T08:35:27Z,2024-02-13T08:35:27Z
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13 08:35:27,2024-02-13T08:35:27Z,2024-02-13T08:35:27Z
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13 08:35:27,2024-02-13T08:35:27Z,2024-02-13T08:35:27Z
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13 08:35:27,2024-02-13T08:35:27Z,2024-02-13T08:35:27Z
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13 08:35:27,2024-02-13T08:35:27Z,2024-02-13T08:35:27Z
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13 08:35:27,2024-02-13T08:35:27Z,2024-02-13T08:35:27Z
2025-08-07T08:29:20.023Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13 08:35:27,2024-02-13T08:35:27Z,2024-02-13T08:35:27Z


- **current_user()** is a Spark SQL function that may **not be exposed** in **pyspark.sql.functions**, depending on the PySpark version.
- In that case, wrap the SQL call in **expr()**, as in **expr("current_user()")** above. This **SQL expression functionality** lets you call SQL functions that are **not directly** exposed in the **PySpark API**.

#### **to_timestamp**

- Converts a **string** column to the **Timestamp** type.

- With no pattern, strings in a default form such as **yyyy-MM-dd HH:mm:ss.SSS** are parsed; pass a pattern for any other layout.

- **Syntax**

       to_timestamp(column_name, pattern)

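**from_unixtime** alone returns **string** columns in the **default** format **yyyy-MM-dd HH:mm:ss**:

      from_unixtime(col("Input_Timestamp_UTC"))     # input_wo_timestamp_utc
      from_unixtime(col("Update_Timestamp_UTC"))    # last_update_wo_timestamp_utc

Wrapping the call in **to_timestamp** returns **Timestamp** columns, which **display** renders in ISO 8601 form (for example **2024-02-28T08:34:24Z**):

      to_timestamp(from_unixtime(col("Input_Timestamp_UTC")),'yyyy-MM-dd HH:mm:ss')     # input_timestamp_utc
      to_timestamp(from_unixtime(col("Update_Timestamp_UTC")),'yyyy-MM-dd HH:mm:ss')    # last_update_timestamp_utc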
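
The difference between the string and the typed timestamp can be illustrated in plain Python. This is a rough analogy only, assuming UTC: `strftime` stands in for from_unixtime, `strptime` stands in for to_timestamp, and the ISO rendering is Python's, not the notebook's.

```python
from datetime import datetime, timezone

PATTERN = "%Y-%m-%d %H:%M:%S"  # strftime equivalent of yyyy-MM-dd HH:mm:ss

# Step 1: epoch seconds -> formatted string (the role from_unixtime plays)
as_text = datetime.fromtimestamp(1709109264, tz=timezone.utc).strftime(PATTERN)

# Step 2: string -> typed timestamp (the role to_timestamp plays)
as_timestamp = datetime.strptime(as_text, PATTERN).replace(tzinfo=timezone.utc)

print(type(as_text).__name__)       # str
print(type(as_timestamp).__name__)  # datetime
print(as_timestamp.isoformat())     # 2024-02-28T08:34:24+00:00

# The typed value still maps back to the original epoch seconds
round_trip_seconds = int(as_timestamp.timestamp())
print(round_trip_seconds)           # 1709109264
```

The typed value supports date arithmetic and comparisons directly, which is the usual reason to convert the string back into a timestamp.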