##PySpark to_date() – Convert Timestamp to Date


---

**PySpark provides the to_date() function to convert a timestamp to a date (DateType). It does this by truncating the time part of the Timestamp column. In this tutorial, I will show you PySpark examples of how to convert a timestamp to a date using both the DataFrame API and SQL.**

---


**to_date() – converts a Timestamp to a Date.**


##Syntax: to_date(timestamp_column)
##Syntax: to_date(timestamp_column,format)


---


**A PySpark timestamp (TimestampType) holds values in the format yyyy-MM-dd HH:mm:ss.SSSS, and a date (DateType) uses the format yyyy-MM-dd. Use the to_date() function on a DataFrame column to truncate the time from a timestamp, that is, to convert the timestamp to a date.**

In [0]:
df = spark.createDataFrame(
        data=[ ("1","2023-01-18 12:01:19.000")],
        schema=["id","input_timestamp"])

df.printSchema()
df.show(truncate=False)

root
 |-- id: string (nullable = true)
 |-- input_timestamp: string (nullable = true)

+---+-----------------------+
|id |input_timestamp        |
+---+-----------------------+
|1  |2023-01-18 12:01:19.000|
+---+-----------------------+



##Using to_date() – Convert Timestamp String to Date


**In this example, we will use the to_date() function to convert a TimestampType (or string) column to a DateType column. The input should be a timestamp column or a string in the default timestamp format; the function returns just the date as a DateType column.**

In [0]:
from pyspark.sql.functions import col, current_timestamp, lit, to_date, to_timestamp

In [0]:
#Timestamp String to DateType
df.withColumn("date_type", to_date("input_timestamp"))\
.show(truncate=False)

#Timestamp Type to DateType (the result is the date on which the cell runs)
df.withColumn("date_type", to_date(current_timestamp()))\
.show(truncate=False)

+---+-----------------------+----------+
|id |input_timestamp        |date_type |
+---+-----------------------+----------+
|1  |2023-01-18 12:01:19.000|2023-01-18|
+---+-----------------------+----------+

+---+-----------------------+----------+
|id |input_timestamp        |date_type |
+---+-----------------------+----------+
|1  |2023-01-18 12:01:19.000|2023-01-18|
+---+-----------------------+----------+



In [0]:
#Custom Timestamp format to DateType
df.select(to_date(lit('01-18-2023 12:01:19.000'), 'MM-dd-yyyy HH:mm:ss.SSS'))\
.show(truncate=False)

+---------------------------------------------------------+
|to_date(01-18-2023 12:01:19.000, MM-dd-yyyy HH:mm:ss.SSS)|
+---------------------------------------------------------+
|2023-01-18                                               |
+---------------------------------------------------------+



##Convert TimestampType (timestamp) to DateType (date)

**This example first parses the string column into a TimestampType column with to_timestamp(), then converts that TimestampType column to DateType with to_date().**

In [0]:
#Timestamp Type to DateType

df.withColumn('ts', to_timestamp(col("input_timestamp")))\
.withColumn("datetype", to_date(col("ts")))\
.show(truncate=False)

+---+-----------------------+-------------------+----------+
|id |input_timestamp        |ts                 |datetype  |
+---+-----------------------+-------------------+----------+
|1  |2023-01-18 12:01:19.000|2023-01-18 12:01:19|2023-01-18|
+---+-----------------------+-------------------+----------+



##Using Column cast() Function


**Here is another way to convert a TimestampType column (or a timestamp string) to DateType, using the Column cast() function.**

In [0]:
#Using Cast to convert Timestamp String to DateType

df.withColumn('date_type', col('input_timestamp').cast('date'))\
.show(truncate=False)

+---+-----------------------+----------+
|id |input_timestamp        |date_type |
+---+-----------------------+----------+
|1  |2023-01-18 12:01:19.000|2023-01-18|
+---+-----------------------+----------+



In [0]:
#Using Cast to convert Timestamp to DateType

df.withColumn('date_type', to_timestamp('input_timestamp').cast('date'))\
.show(truncate=False)

+---+-----------------------+----------+
|id |input_timestamp        |date_type |
+---+-----------------------+----------+
|1  |2023-01-18 12:01:19.000|2023-01-18|
+---+-----------------------+----------+



##PySpark SQL – Convert Timestamp to Date

**The following are similar examples using PySpark SQL. If you come from a SQL background, these come in handy.**

In [0]:
# SQL TimestampType to DateType
spark.sql(" select to_date(CURRENT_TIMESTAMP) as date_type").show(truncate=False)

+----------+
|date_type |
+----------+
|2023-01-18|
+----------+



In [0]:
#SQL CAST TimestampType to DateType
spark.sql(" select date(to_timestamp('2019-06-24 12:01:19.000')) as date_type ").show(truncate=False)

+----------+
|date_type |
+----------+
|2019-06-24|
+----------+



In [0]:
# SQL CAST timestamp string to DateType

spark.sql(" SELECT date('2019-06-24 12:01:19.000') as date_type ").show(truncate=False)

+----------+
|date_type |
+----------+
|2019-06-24|
+----------+



In [0]:
# SQL Timestamp String (default format) to DateType
spark.sql(" SELECT to_date('2019-06-24 12:01:19.000') as date_type ").show(truncate=False)

+----------+
|date_type |
+----------+
|2019-06-24|
+----------+



In [0]:
#SQL custom timestamp format to DateType

spark.sql("SELECT to_date('06-24-2019 12:01:19.000','MM-dd-yyyy HH:mm:ss.SSS') as date_type ").show(truncate=False)

+----------+
|date_type |
+----------+
|2019-06-24|
+----------+

