**timestamp_millis()**

- This function is useful when you have **timestamps** represented as **long integers** in **milliseconds** and you need to convert them into a **human-readable timestamp format** for further processing or analysis.

- For example, **timestamp_millis(1230219000123)** converts the **millisecond** value **1230219000123** to the **TIMESTAMP 2008-12-25 07:30:00.123**. The displayed wall-clock time depends on the **session time zone**: this output corresponds to a UTC-8 session, and the same instant is **2008-12-25 15:30:00.123** in UTC.

**Syntax**

      timestamp_millis(expr)
     
**expr:** An integral numeric expression specifying **milliseconds**.

**Returns:** TIMESTAMP.

      > SELECT timestamp_millis(1230219000123);
        2008-12-25 07:30:00.123
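
The arithmetic behind this example can be checked outside Spark. The following is a minimal plain-Python sketch, not Spark's implementation; the helper name `millis_to_datetime` is hypothetical, and the fixed UTC-8 offset is an assumption chosen to reproduce the output shown above.

```python
from datetime import datetime, timedelta, timezone


def millis_to_datetime(millis: int, tz: timezone = timezone.utc) -> datetime:
    """Convert epoch milliseconds to an aware datetime using integer arithmetic."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # timedelta stores whole microseconds, so no float rounding is involved.
    return (epoch + timedelta(milliseconds=millis)).astimezone(tz)


utc_value = millis_to_datetime(1230219000123)
# Assumed session time zone for the documented output: a fixed UTC-8 offset.
pst_value = millis_to_datetime(1230219000123, timezone(timedelta(hours=-8)))

print(utc_value.isoformat(sep=" ", timespec="milliseconds"))  # 2008-12-25 15:30:00.123+00:00
print(pst_value.isoformat(sep=" ", timespec="milliseconds"))  # 2008-12-25 07:30:00.123-08:00
```

Both values describe the same instant; only the rendering differs with the time zone.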

In [0]:
%fs ls /FileStore/tables/

path,name,size,modificationTime
dbfs:/FileStore/tables/Flatten Nested Array.json,Flatten Nested Array.json,3756,1718618620000
dbfs:/FileStore/tables/MarketPrice-1.csv,MarketPrice-1.csv,19528,1719656512000
dbfs:/FileStore/tables/MarketPrice.csv,MarketPrice.csv,19528,1719656208000
dbfs:/FileStore/tables/MultiLineJSON.json/,MultiLineJSON.json/,0,0
dbfs:/FileStore/tables/MultiLineJSON1.json/,MultiLineJSON1.json/,0,0
dbfs:/FileStore/tables/MultiLineJSON2.json/,MultiLineJSON2.json/,0,0
dbfs:/FileStore/tables/Question7.csv,Question7.csv,154,1725816645000
dbfs:/FileStore/tables/RunningData_Rev02.csv,RunningData_Rev02.csv,1222,1719810609000
dbfs:/FileStore/tables/RunningData_Rev03.csv,RunningData_Rev03.csv,1216,1719810946000
dbfs:/FileStore/tables/SalesData_Rev02.csv,SalesData_Rev02.csv,472,1719810784000


In [0]:
import pyspark.sql.functions as f
from pyspark.sql.functions import *
from pyspark.sql.types import LongType

In [0]:
df = spark.read.csv("dbfs:/FileStore/tables/timestamp_millis-3.csv", header=True, inferSchema=True)
display(df.limit(10))

Commodity_Index,Effective_Date,Start_Date,End_Date,Income,Delta_Value,Target_Id,Input_Timestamp_UTC,Update_Timestamp_UTC
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1068,1724256609000,1724256609000
DISCOUNT,6-Feb-23,14-Jan-23,6-Feb-23,1500,10,1071,1724256609000,1724256609000
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1068,1724256609000,1724256609000
DISCOUNT,8-Jan-24,7-Oct-23,8-Jan-24,1500,10,1071,1724256609000,1724256609000
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1068,1724256609000,1724256609000
DISCOUNT,6-Mar-23,7-Feb-23,6-Mar-23,1500,10,1071,1724256609000,1724256609000
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1068,1724256609000,1724256609000
DISCOUNT,6-Jan-25,9-Jan-24,6-Jan-25,1500,10,1071,1724256609000,1724256609000
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1068,1724256609000,1724256609000
DISCOUNT,6-Apr-23,7-Mar-23,6-Apr-23,1500,10,1071,1724256609000,1724256609000


In [0]:
df_err = df.select(current_timestamp().alias("created_timestamp"),
               current_user().alias("created_by"),
               timestamp_millis(col("Input_Timestamp_UTC")).alias('input_timestamp_utc'),
               timestamp_millis(col("Update_Timestamp_UTC")).alias('last_update_timestamp_utc'))
display(df_err.limit(10))

created_timestamp,created_by,input_timestamp_utc,last_update_timestamp_utc
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-14T18:11:32.806Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z


- This cell runs without error in this notebook, but the same call can fail in another environment where **timestamp_millis** cannot be found. The likely cause is a **difference in Spark versions or configurations between the two environments**: the Spark version in **Databricks Community Edition** may not expose **timestamp_millis** as a Python function.

- **timestamp_millis and unix_millis** have been **SQL functions** since **Spark 3.1**, but older PySpark releases do **not expose them in pyspark.sql.functions**. As far as I know, the Python wrappers were added in **PySpark 3.5.0**, so on earlier versions they are **reachable only through SQL**.
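
Because availability depends on the installed version, one option is to probe for the function at call time and fall back to the SQL form. This is only a sketch: the helper name `millis_to_timestamp` is hypothetical, and `functions` is assumed to be the `pyspark.sql.functions` module passed in by the caller.

```python
def millis_to_timestamp(functions, column_name):
    """Build a timestamp column expression from an epoch-millisecond column.

    `functions` is assumed to be the pyspark.sql.functions module. It is passed
    in so the check runs against whichever PySpark version is installed.
    """
    native = getattr(functions, "timestamp_millis", None)
    if native is not None:
        # The Python wrapper exists: use the DataFrame API directly.
        return native(functions.col(column_name))
    # Otherwise let the SQL parser resolve the SQL function of the same name.
    # Note: a column name with spaces would need backticks in this string.
    return functions.expr(f"timestamp_millis({column_name})")
```

In a notebook it could be used as `df.select(millis_to_timestamp(f, "Input_Timestamp_UTC"))`, where `f` is the `pyspark.sql.functions` alias imported earlier.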

#### **Solution 01**

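- The first snippet uses **from_unixtime(col / 1000)** to **convert the milliseconds to seconds**. **from_unixtime** returns a **string** formatted as **yyyy-MM-dd HH:mm:ss**, not a timestamp, so the second snippet wraps it in **to_timestamp** to get a **TIMESTAMP** column.

- The error occurs when **timestamp_millis** is not available as a function in **pyspark.sql.functions**, which is the case on older PySpark releases. To convert **milliseconds to a timestamp** type there, use the **from_unixtime** function combined with **division by 1000** (since **from_unixtime expects seconds, not milliseconds**) and then **convert to a timestamp** if needed. Because the default format stops at whole seconds, any **millisecond part is dropped** on this route.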
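
The effect of formatting at whole-second precision can be illustrated in plain Python. This is a sketch of the arithmetic only, not of Spark's internals; both helper names are hypothetical, and the second sample value is made up to show a non-zero millisecond part (the values in this dataset all end in `000`).

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_whole_seconds(millis: int) -> str:
    """Divide by 1000, keep whole seconds, format like 'yyyy-MM-dd HH:mm:ss'."""
    return (EPOCH + timedelta(seconds=millis // 1000)).strftime("%Y-%m-%d %H:%M:%S")


def format_with_millis(millis: int) -> str:
    """Format the same instant but keep the millisecond part."""
    moment = EPOCH + timedelta(milliseconds=millis)
    return moment.strftime("%Y-%m-%d %H:%M:%S") + f".{millis % 1000:03d}"


sample_from_data = 1724256609000    # value shown in Input_Timestamp_UTC
hypothetical_value = 1724256609456  # made-up value with a non-zero millisecond part

print(format_whole_seconds(sample_from_data))    # 2024-08-21 16:10:09
print(format_whole_seconds(hypothetical_value))  # 2024-08-21 16:10:09
print(format_with_millis(hypothetical_value))    # 2024-08-21 16:10:09.456
```

For this dataset the two routes agree, since every value is a whole number of seconds.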

In [0]:
df1 = df.select(
    current_timestamp().alias("created_timestamp"),
    expr("current_user()").alias("created_by"),
    from_unixtime(col("Input_Timestamp_UTC") / 1000).alias('input_timestamp_utc'),
    from_unixtime(col("Update_Timestamp_UTC") / 1000).alias('last_update_timestamp_utc')
)

display(df1.limit(10))

created_timestamp,created_by,input_timestamp_utc,last_update_timestamp_utc
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09
2024-09-14T17:57:47.288+0000,enugantisuresh12@gmail.com,2024-08-21 16:10:09,2024-08-21 16:10:09


In [0]:
df2 = df.select(
    current_timestamp().alias("created_timestamp"),
    expr("current_user()").alias("created_by"),
    to_timestamp(from_unixtime(col("Input_Timestamp_UTC") / 1000), 'yyyy-MM-dd HH:mm:ss').alias('input_timestamp_utc'),
    to_timestamp(from_unixtime(col("Update_Timestamp_UTC") / 1000), 'yyyy-MM-dd HH:mm:ss').alias('update_timestamp_utc')
)

display(df2.limit(10))

created_timestamp,created_by,input_timestamp_utc,update_timestamp_utc
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-14T18:00:44.081+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000


In [0]:
%sql
select timestamp_millis(1724256609000)

timestamp_millis(1724256609000)
2024-08-21T16:10:09.000+0000


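#### **Solution 02**

- Wrap the call in **expr()** so that **timestamp_millis** is resolved as a **SQL function** by the SQL parser instead of being looked up in **pyspark.sql.functions**. This keeps the result as a **TIMESTAMP** without the division by 1000.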

In [0]:
df_exp = df.select(
    current_timestamp().alias("created_timestamp"),
    expr("current_user()").alias("created_by"),
    expr("timestamp_millis(Input_Timestamp_UTC)").alias('input_timestamp_utc'),
    expr("timestamp_millis(Update_Timestamp_UTC)").alias('last_update_timestamp_utc')
)
display(df_exp.limit(10))

created_timestamp,created_by,input_timestamp_utc,last_update_timestamp_utc
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000
2024-09-10T08:24:48.595+0000,enugantisuresh12@gmail.com,2024-08-21T16:10:09.000+0000,2024-08-21T16:10:09.000+0000


#### **Solution 03**

- If the runtime's PySpark exposes **timestamp_millis** in **pyspark.sql.functions**, the original DataFrame call works unchanged and no workaround is needed. Whether it does depends on the **Spark version and configuration** of the environment, which is why the same code can fail on **Databricks Community Edition** and succeed elsewhere.

- The cell below repeats the original call and succeeds in this notebook.

In [0]:
df_confg = df.select(current_timestamp().alias("created_timestamp"),
               current_user().alias("created_by"),
               timestamp_millis(col("Input_Timestamp_UTC")).alias('input_timestamp_utc'),
               timestamp_millis(col("Update_Timestamp_UTC")).alias('last_update_timestamp_utc'))
display(df_confg.limit(10))

created_timestamp,created_by,input_timestamp_utc,last_update_timestamp_utc
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
2024-09-10T08:11:41.459Z,enugantisuresh12@gmail.com,2024-08-21T16:10:09Z,2024-08-21T16:10:09Z
