**from_unixtime**

- Converts **Unix time in seconds** (seconds elapsed since the Unix epoch, 1970-01-01 00:00:00 UTC) to a **string** representation of the **timestamp**.
- The result is a **human-readable** date-time string rendered in the Spark **session time zone**. It is a string, not a timestamp-typed value.


|unix_time (seconds) |   timestamp (UTC)    |
|--------------------|----------------------|
|1648974310|2022-04-03 08:25:10|


- **from_unixtime** expects **timestamp in seconds, not milliseconds**.
- If your **timestamps are in milliseconds, divide by 1000** first:

      df = df.withColumn("unix_time_sec", (df["unix_time_ms"] / 1000).cast("long"))
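The effect of the division can be checked without a Spark session. The following is a minimal plain-Python sketch, not Spark code; the millisecond value is a hypothetical input chosen for illustration, and `//` matches the `cast("long")` truncation only for non-negative values.

```python
from datetime import datetime, timezone

# Hypothetical millisecond value: 2021-10-01 07:20:00.123 UTC
unix_time_ms = 1633072800123

# Integer division drops the millisecond part, matching the
# (col / 1000).cast("long") step for non-negative inputs.
unix_time_sec = unix_time_ms // 1000

# Interpret the seconds as an instant in UTC and render it as text.
as_text = datetime.fromtimestamp(unix_time_sec, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
print(unix_time_sec, as_text)
```

Skipping the division would treat the millisecond value as seconds, which points tens of thousands of years into the future instead of to 2021.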

#### Syntax

     from_unixtime(timestamp: ColumnOrName, format: str = 'yyyy-MM-dd HH:mm:ss') 

**timestamp:** column of **unix time** values.

**format:**

      # default: yyyy-MM-dd HH:mm:ss
      from_unixtime(col("timestamp_1")).alias("timestamp_1") 

      # custom format
      from_unixtime(col("timestamp_2"),"MM-dd-yyyy HH:mm:ss").alias("timestamp_2")
      from_unixtime(col("timestamp_3"),"MM-dd-yyyy").alias("timestamp_3")

**Returns:** a **string** column in the requested format (**default: yyyy-MM-dd HH:mm:ss**).
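To make the role of the `format` argument concrete, here is a small plain-Python sketch that mimics the three patterns shown above. The `SPARK_TO_STRFTIME` table and `format_epoch` helper are hypothetical names introduced for illustration, cover only these patterns, and assume a UTC session time zone.

```python
from datetime import datetime, timezone

# Hypothetical lookup from the Spark patterns used in this notebook to
# their closest strftime equivalents; it is not a general translator.
SPARK_TO_STRFTIME = {
    "yyyy-MM-dd HH:mm:ss": "%Y-%m-%d %H:%M:%S",
    "MM-dd-yyyy HH:mm:ss": "%m-%d-%Y %H:%M:%S",
    "MM-dd-yyyy": "%m-%d-%Y",
}

def format_epoch(unix_time_sec: int, spark_pattern: str = "yyyy-MM-dd HH:mm:ss") -> str:
    """Render epoch seconds as a UTC string using one of the patterns above."""
    moment = datetime.fromtimestamp(unix_time_sec, tz=timezone.utc)
    return moment.strftime(SPARK_TO_STRFTIME[spark_pattern])

for pattern in SPARK_TO_STRFTIME:
    print(pattern, "->", format_epoch(1709109264, pattern))
```

In every case the return value is a plain string, which mirrors the fact that `from_unixtime` produces a string column regardless of the pattern.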

**Supported input data types**
- LONG (called BIGINT in Spark SQL)
- INT
- DOUBLE

In [0]:
import pyspark.sql.functions as f
# Explicit imports for every function used in this notebook (expr, not exp)
from pyspark.sql.functions import (
    col, expr, current_timestamp, to_timestamp,
    from_unixtime, unix_timestamp, substring,
)
from pyspark.sql.types import LongType

##### 1) Basic Usage

- Convert **Unix timestamp** to **default timestamp format (yyyy-MM-dd HH:mm:ss)**.

In [0]:
from pyspark.sql.functions import from_unixtime

data = [(1633072800,), (1622476800,), (1609459200,), (1766476800,), (1998859200,)]
df = spark.createDataFrame(data, ["unix_time"])

df_with_timestamp = df.withColumn("timestamp", from_unixtime("unix_time"))
display(df_with_timestamp)

unix_time,timestamp
1633072800,2021-10-01 07:20:00
1622476800,2021-05-31 16:00:00
1609459200,2021-01-01 00:00:00
1766476800,2025-12-23 08:00:00
1998859200,2033-05-04 22:40:00


##### 2) Custom Format

In [0]:
df_with_format = df\
    .withColumn("formatted01", from_unixtime("unix_time", "yyyy/MM/dd HH:mm")) \
    .withColumn("formatted02", from_unixtime("unix_time", "yyyy/MM/dd HH:mm:ss")) \
    .withColumn("formatted03", from_unixtime("unix_time", "yyyy-MM-dd HH:mm:ss")) \
    .withColumn("formatted04", from_unixtime("unix_time", "yyyy-MM-dd"))
display(df_with_format)

unix_time,formatted01,formatted02,formatted03,formatted04
1633072800,2021/10/01 07:20,2021/10/01 07:20:00,2021-10-01 07:20:00,2021-10-01
1622476800,2021/05/31 16:00,2021/05/31 16:00:00,2021-05-31 16:00:00,2021-05-31
1609459200,2021/01/01 00:00,2021/01/01 00:00:00,2021-01-01 00:00:00,2021-01-01
1766476800,2025/12/23 08:00,2025/12/23 08:00:00,2025-12-23 08:00:00,2025-12-23
1998859200,2033/05/04 22:40,2033/05/04 22:40:00,2033-05-04 22:40:00,2033-05-04


##### 3) Convert a time string (including milliseconds) to a Unix timestamp (double)
- The **unix_timestamp()** function returns whole seconds and **drops milliseconds**, so the fractional part has to be added back from the original string.

In [0]:
df_cust_frmt = spark.createDataFrame([('22-Jul-2018 04:21:18.792 UTC',),
                                      ('23-Jul-2018 04:21:25.888 UTC',),
                                      ('24-Jul-2018 07:24:28.992 UTC',),
                                      ('25-Jul-2019 22:29:55.555 UTC',),
                                      ('26-Jul-2021 12:45:35.666 UTC',)], ['TIME'])

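# unix_timestamp() yields whole seconds. The last 7 characters of TIME are "SSS UTC",
# so substring(TIME, -7, 3) extracts the milliseconds, which are added back as a fraction.
df_cust_frmt = df_cust_frmt\
  .withColumn("date_time", unix_timestamp(col("TIME"), 'dd-MMM-yyyy HH:mm:ss.SSS z')) \
  .withColumn("unix_timestamp", unix_timestamp(col("TIME"), 'dd-MMM-yyyy HH:mm:ss.SSS z') + substring(col("TIME"), -7, 3) / 1000)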

display(df_cust_frmt)

TIME,date_time,unix_timestamp
22-Jul-2018 04:21:18.792 UTC,1532233278,1532233278.792
23-Jul-2018 04:21:25.888 UTC,1532319685,1532319685.888
24-Jul-2018 07:24:28.992 UTC,1532417068,1532417068.992
25-Jul-2019 22:29:55.555 UTC,1564093795,1564093795.555
26-Jul-2021 12:45:35.666 UTC,1627303535,1627303535.666
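
The values above can be cross-checked outside Spark. This is a plain-Python sketch under two assumptions: the strings always end in the literal zone name `UTC`, and month abbreviations are English (the default C locale). `parse_to_epoch` is a hypothetical helper name.

```python
from datetime import datetime, timezone

def parse_to_epoch(text: str) -> float:
    """Parse 'dd-MMM-yyyy HH:mm:ss.SSS UTC' into epoch seconds with a fraction."""
    stamp, zone = text.rsplit(" ", 1)
    if zone != "UTC":
        raise ValueError(f"unsupported zone: {zone}")
    # %f keeps the fractional seconds that unix_timestamp() would drop.
    parsed = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc).timestamp()

epoch = parse_to_epoch("22-Jul-2018 04:21:18.792 UTC")
print(int(epoch), epoch)
```

The integer part corresponds to the `date_time` column and the full value to the `unix_timestamp` column in the first output row.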


##### 4) Set the session time zone

In [0]:
from pyspark.sql import functions as F
df = spark.createDataFrame([(1428476400,), (1528598500,)], ['unix_time'])
df.select('*', F.from_unixtime('unix_time').alias("intToString")).display()

unix_time,intToString
1428476400,2015-04-08 07:00:00
1528598500,2018-06-10 02:41:40


In [0]:
spark.conf.set("spark.sql.session.timeZone", "America/Los_Angeles")

In [0]:
from pyspark.sql import functions as F
df = spark.createDataFrame([(1428476400,), (1528598500,)], ['unix_time'])
df.select('*', F.from_unixtime('unix_time').alias("intToString")).display()

unix_time,intToString
1428476400,2015-04-08 00:00:00
1528598500,2018-06-09 19:41:40


In [0]:
spark.conf.unset("spark.sql.session.timeZone")

In [0]:
spark.conf.set("spark.sql.session.timeZone", "America/New_York")

In [0]:
from pyspark.sql import functions as F
df = spark.createDataFrame([(1428476400,), (1528598500,)], ['unix_time'])
df.select('*', F.from_unixtime('unix_time').alias("intToString")).display()

unix_time,intToString
1428476400,2015-04-08 03:00:00
1528598500,2018-06-09 22:41:40


In [0]:
spark.conf.unset("spark.sql.session.timeZone")
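
The three outputs above describe the same instants; only the rendering changes with the session time zone. A plain-Python sketch of that idea follows. It uses fixed UTC offsets as a simplifying assumption: both sample instants fall in daylight saving time, so Los Angeles is UTC-7 and New York is UTC-4. The `render` helper is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
PDT = timezone(timedelta(hours=-7), "PDT")  # assumed offset for America/Los_Angeles on these dates
EDT = timezone(timedelta(hours=-4), "EDT")  # assumed offset for America/New_York on these dates

def render(unix_time_sec: int, tz: timezone) -> str:
    """Render one epoch value as wall-clock time in the given zone."""
    return datetime.fromtimestamp(unix_time_sec, tz=tz).strftime("%Y-%m-%d %H:%M:%S")

for zone in (UTC, PDT, EDT):
    print(zone, render(1428476400, zone), render(1528598500, zone))
```

Unlike these fixed offsets, a region ID such as `America/Los_Angeles` in `spark.sql.session.timeZone` applies the daylight saving rules for each date automatically.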

##### 5) Basic usage and custom format

In [0]:
df = spark.read.csv("/Volumes/workspace/default/@azureadb/from_unixtime.csv", header=True, inferSchema=True)
display(df.limit(10))

Commodity_Index,Effective_Date,Start_Date,End_Date,Income,Delta_Value,Target_Id,Input_Timestamp_UTC,Update_Timestamp_UTC
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1068,1709109264,1709109264
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1071,1710234895,1710234895
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1068,1709109264,1709109264
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1071,1707813327,1707813327


In [0]:
# Cast the epoch columns to LongType to match the data types of the Kafka schema
df_cast = df.withColumn('Input_Timestamp_UTC', f.col('Input_Timestamp_UTC').cast(LongType()))\
            .withColumn('Update_Timestamp_UTC', f.col('Update_Timestamp_UTC').cast(LongType()))

display(df_cast.limit(10))

Commodity_Index,Effective_Date,Start_Date,End_Date,Income,Delta_Value,Target_Id,Input_Timestamp_UTC,Update_Timestamp_UTC
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1068,1709109264,1709109264
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1071,1710234895,1710234895
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1068,1709109264,1709109264
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1071,1707813327,1707813327
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1068,1707813327,1707813327
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1071,1707813327,1707813327


#### from_unixtime:

     from_unixtime(col("Input_Timestamp_UTC"))
     from_unixtime(col("Update_Timestamp_UTC"))

**default:** yyyy-MM-dd HH:mm:ss


In [0]:
df_cust = df_cast.select("Input_Timestamp_UTC", "Update_Timestamp_UTC", 
                         from_unixtime(col("Input_Timestamp_UTC")).alias('default_str_input_timestamp_utc'),
                         from_unixtime(col("Update_Timestamp_UTC")).alias('default_str_last_update_timestamp_utc'),
                         from_unixtime(col("Input_Timestamp_UTC"), 'MM-dd-yyyy HH:mm:ss').alias('custom_input_timestamp_utc'),
                         from_unixtime(col("Update_Timestamp_UTC"), 'MM-dd-yyyy').alias('custom_last_update_timestamp_utc'),
                         from_unixtime(col("Input_Timestamp_UTC")).cast("timestamp").alias('default_ts_input_timestamp_utc'),
                         from_unixtime(col("Update_Timestamp_UTC")).cast("timestamp").alias('default_ts_update_timestamp_utc')
                        )
display(df_cust.limit(10))

Input_Timestamp_UTC,Update_Timestamp_UTC,default_str_input_timestamp_utc,default_str_last_update_timestamp_utc,custom_input_timestamp_utc,custom_last_update_timestamp_utc,default_ts_input_timestamp_utc,default_ts_update_timestamp_utc
1709109264,1709109264,2024-02-28 08:34:24,2024-02-28 08:34:24,02-28-2024 08:34:24,02-28-2024,2024-02-28T08:34:24.000Z,2024-02-28T08:34:24.000Z
1710234895,1710234895,2024-03-12 09:14:55,2024-03-12 09:14:55,03-12-2024 09:14:55,03-12-2024,2024-03-12T09:14:55.000Z,2024-03-12T09:14:55.000Z
1709109264,1709109264,2024-02-28 08:34:24,2024-02-28 08:34:24,02-28-2024 08:34:24,02-28-2024,2024-02-28T08:34:24.000Z,2024-02-28T08:34:24.000Z
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024,2024-02-13T08:35:27.000Z,2024-02-13T08:35:27.000Z
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024,2024-02-13T08:35:27.000Z,2024-02-13T08:35:27.000Z
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024,2024-02-13T08:35:27.000Z,2024-02-13T08:35:27.000Z
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024,2024-02-13T08:35:27.000Z,2024-02-13T08:35:27.000Z
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024,2024-02-13T08:35:27.000Z,2024-02-13T08:35:27.000Z
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024,2024-02-13T08:35:27.000Z,2024-02-13T08:35:27.000Z
1707813327,1707813327,2024-02-13 08:35:27,2024-02-13 08:35:27,02-13-2024 08:35:27,02-13-2024,2024-02-13T08:35:27.000Z,2024-02-13T08:35:27.000Z


#### to_timestamp

- Converts a **string** column to the **Timestamp** type.
- When no pattern is given, strings such as **yyyy-MM-dd HH:mm:ss** or **yyyy-MM-dd HH:mm:ss.SSS** are parsed using Spark's default casting rules.

**Syntax**

       to_timestamp(column_name, pattern)
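
Why convert the string back to a timestamp type at all? The plain-Python sketch below illustrates the difference by analogy: a parsed value supports date arithmetic, while the string form does not. `to_ts` is a hypothetical stand-in for the idea behind `to_timestamp`, not the Spark function itself.

```python
from datetime import datetime, timedelta

def to_ts(text: str, pattern: str = "%Y-%m-%d %H:%M:%S") -> datetime:
    """Parse a 'yyyy-MM-dd HH:mm:ss' string into a typed datetime value."""
    return datetime.strptime(text, pattern)

later = to_ts("2024-02-28 08:34:24")
earlier = to_ts("2024-02-13 08:35:27")

# Typed values can be subtracted; the two strings could only be compared as text.
gap = later - earlier
print(gap, gap.total_seconds())
```

The gap in seconds equals the difference between the two underlying epoch values from the dataset (1709109264 and 1707813327).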

In [0]:
df_all = df_cast.select(current_timestamp().alias("created_timestamp"),
                        expr("current_user()").alias("created_by"),
                        from_unixtime(col("Input_Timestamp_UTC")).alias('default_str_input_timestamp_utc'),
                        to_timestamp(from_unixtime(col("Input_Timestamp_UTC"))).alias('default_input_timestamp_utc'))
display(df_all.limit(10))

created_timestamp,created_by,default_str_input_timestamp_utc,default_input_timestamp_utc
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-02-28 08:34:24,2024-02-28T08:34:24.000Z
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-03-12 09:14:55,2024-03-12T09:14:55.000Z
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-02-28 08:34:24,2024-02-28T08:34:24.000Z
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13T08:35:27.000Z
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13T08:35:27.000Z
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13T08:35:27.000Z
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13T08:35:27.000Z
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13T08:35:27.000Z
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13T08:35:27.000Z
2025-08-12T02:33:22.948Z,enugantisuresh12@gmail.com,2024-02-13 08:35:27,2024-02-13T08:35:27.000Z


- **current_user()** may **not be directly** available as a Python function, depending on the PySpark version; older releases do not expose it in `pyspark.sql.functions`.
- In that case, use the **SQL expression functionality** of PySpark, `expr("current_user()")`, to call SQL functions that are **not directly** exposed in the **PySpark API**.