# Spark Date Function Examples
The examples below cover the most commonly used date functions.

## ```current_date()``` and ```date_format()```
This Scala example gets the current date with ```current_date()``` and converts a date to a specific format with ```date_format()```. It parses the input date and converts it from the 'yyyy-MM-dd' format to the 'MM-dd-yyyy' format.

In [1]:
import org.apache.spark.sql.functions._

Seq(("2019-01-23"))
  .toDF("Input")
  .select( 
    current_date().as("current_date"), 
    col("Input"), 
    date_format(col("Input"), "MM-dd-yyyy").as("format") 
  ).show()

SparkContext available as 'sc' (version = 3.3.0, master = local[*], app id = local-1656925484014)
SparkSession available as 'spark'


+------------+----------+----------+
|current_date|     Input|    format|
+------------+----------+----------+
|  2022-07-04|2019-01-23|01-23-2019|
+------------+----------+----------+
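
Pattern letters are case-sensitive: `MM` is the month, while `mm` is the minute. As a minimal sketch of that distinction, the same input can be rendered with a few other patterns. The column names `day_first`, `compact`, and `time_part` are illustrative, and the `SparkSession` setup is an assumption for running outside a notebook that already provides `spark`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Assumption: a local session; in a notebook that already defines `spark`, this line can be skipped.
val spark = SparkSession.builder().master("local[*]").appName("date-format-sketch").getOrCreate()
import spark.implicits._

val formatted = Seq("2019-01-23")
  .toDF("Input")
  .select(
    col("Input"),
    date_format(col("Input"), "dd/MM/yyyy").as("day_first"), // MM = month of year
    date_format(col("Input"), "yyyyMMdd").as("compact"),
    date_format(col("Input"), "HH:mm").as("time_part")       // mm = minute; a bare date has no time part
  )

formatted.show()
```

Since the input carries no time of day, the `time_part` column falls back to midnight.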





## ```to_date()```
The following Scala example uses ```to_date()``` to convert a string in the 'MM/dd/yyyy' format into a DateType value, which is displayed as 'yyyy-MM-dd'.

In [2]:
import org.apache.spark.sql.functions._

Seq(("04/13/2019"))
   .toDF("Input")
  .select( col("Input"), 
           to_date(col("Input"), "MM/dd/yyyy").as("to_date") 
   ).show()

+----------+----------+
|     Input|   to_date|
+----------+----------+
|04/13/2019|2019-04-13|
+----------+----------+
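
The table alone does not show that the column type changed. A small sketch, assuming a local `SparkSession` and the illustrative name `parsed`, prints the schema to confirm that `Input` stays a string while `to_date` becomes a date column:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Assumption: a local session; in a notebook that already defines `spark`, this line can be skipped.
val spark = SparkSession.builder().master("local[*]").appName("to-date-sketch").getOrCreate()
import spark.implicits._

val parsed = Seq("04/13/2019")
  .toDF("Input")
  .withColumn("to_date", to_date(col("Input"), "MM/dd/yyyy"))

// Input is reported as string, to_date as date.
parsed.printSchema()
```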





## ```datediff()```
The following Scala example uses ```datediff()``` to return the number of days between two dates.

In [3]:
import org.apache.spark.sql.functions._

Seq(("2019-01-23"),("2019-06-24"),("2019-09-20"))
   .toDF("input")
   .select( col("input"), current_date(), 
       datediff(current_date(),col("input")).as("diff") 
    ).show()

+----------+--------------+----+
|     input|current_date()|diff|
+----------+--------------+----+
|2019-01-23|    2022-07-04|1258|
|2019-06-24|    2022-07-04|1106|
|2019-09-20|    2022-07-04|1018|
+----------+--------------+----+
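
Because `current_date()` changes every day, the `diff` values above only reproduce on 2022-07-04. A sketch that pins the end date to that day with a literal makes the result repeatable. The literal date and the name `fixedDiff` are choices made for this illustration, and the `SparkSession` setup assumes a local run.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Assumption: a local session; in a notebook that already defines `spark`, this line can be skipped.
val spark = SparkSession.builder().master("local[*]").appName("datediff-sketch").getOrCreate()
import spark.implicits._

val fixedDiff = Seq("2019-01-23", "2019-06-24", "2019-09-20")
  .toDF("input")
  .select(
    col("input"),
    // datediff(end, start): days from `input` up to the fixed end date
    datediff(to_date(lit("2022-07-04")), col("input")).as("diff")
  )

fixedDiff.show()
```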





## ```months_between()```
The following Scala example uses ```months_between()``` to return the number of months between two dates.

In [4]:
import org.apache.spark.sql.functions._

Seq(("2019-01-23"),("2019-06-24"),("2019-09-20"))
   .toDF("date")
  .select( col("date"), current_date(), 
       datediff(current_date(),col("date")).as("datediff"), 
       months_between(current_date(),col("date")).as("months_between")
   ).show()

+----------+--------------+--------+--------------+
|      date|current_date()|datediff|months_between|
+----------+--------------+--------+--------------+
|2019-01-23|    2022-07-04|    1258|   41.38709677|
|2019-06-24|    2022-07-04|    1106|   36.35483871|
|2019-09-20|    2022-07-04|    1018|   33.48387097|
+----------+--------------+--------+--------------+
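
The fractional values follow from how `months_between()` counts: whole calendar months plus the day-of-month difference divided by 31, rounded to 8 decimal places. The plain-Scala sketch below reproduces that arithmetic for date-only inputs. It is a simplified illustration rather than Spark's implementation, and `monthsBetweenSketch` is a hypothetical helper name.

```scala
import java.time.LocalDate

// Simplified sketch of the 31-day-month rule for plain dates (no time-of-day part).
def monthsBetweenSketch(end: LocalDate, start: LocalDate): Double = {
  val wholeMonths =
    (end.getYear - start.getYear) * 12 + (end.getMonthValue - start.getMonthValue)
  val bothLastDay =
    end.getDayOfMonth == end.lengthOfMonth && start.getDayOfMonth == start.lengthOfMonth
  if (end.getDayOfMonth == start.getDayOfMonth || bothLastDay) {
    // Same day of month, or both month ends: a whole number of months.
    wholeMonths.toDouble
  } else {
    val raw = wholeMonths + (end.getDayOfMonth - start.getDayOfMonth) / 31.0
    math.round(raw * 1e8) / 1e8 // round to 8 decimal places
  }
}

// 42 whole months + (4 - 23) / 31 = 41.38709677, matching the first row above.
println(monthsBetweenSketch(LocalDate.parse("2022-07-04"), LocalDate.parse("2019-01-23")))
```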





## ```trunc()```
The following Scala example uses ```trunc()``` to truncate a date to a specified unit.

In [5]:
import org.apache.spark.sql.functions._

Seq(("2019-01-23"),("2019-06-24"),("2019-09-20"))
    .toDF("input")
    .select( col("input"), 
          trunc(col("input"),"Month").as("Month_Trunc"), 
          trunc(col("input"),"Year").as("Year_Trunc") 
     ).show()

+----------+-----------+----------+
|     input|Month_Trunc|Year_Trunc|
+----------+-----------+----------+
|2019-01-23| 2019-01-01|2019-01-01|
|2019-06-24| 2019-06-01|2019-01-01|
|2019-09-20| 2019-09-01|2019-01-01|
+----------+-----------+----------+
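
Besides month and year, `trunc()` in Spark 3.x also accepts `"quarter"` and `"week"` as units. The sketch below is a hedged illustration: the column names `quarter_trunc` and `week_trunc` are made up for this example, and the `SparkSession` setup assumes a local run. Week truncation goes back to the Monday of the week.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Assumption: a local session; in a notebook that already defines `spark`, this line can be skipped.
val spark = SparkSession.builder().master("local[*]").appName("trunc-sketch").getOrCreate()
import spark.implicits._

val truncated = Seq("2019-01-23", "2019-06-24", "2019-09-20")
  .toDF("input")
  .select(
    col("input"),
    trunc(col("input"), "quarter").as("quarter_trunc"), // first day of the quarter
    trunc(col("input"), "week").as("week_trunc")        // Monday of the week
  )

truncated.show()
```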





## ```add_months()```, ```date_add()```, ```date_sub()```
This example adds and subtracts months and days from a given input date.

In [6]:
import org.apache.spark.sql.functions._

Seq(("2019-01-23"),("2019-06-24"),("2019-09-20")).toDF("input")
  .select( col("input"), 
      add_months(col("input"),3).as("add_months"), 
      add_months(col("input"),-3).as("sub_months"), 
      date_add(col("input"),4).as("date_add"), 
      date_sub(col("input"),4).as("date_sub") 
   ).show()

+----------+----------+----------+----------+----------+
|     input|add_months|sub_months|  date_add|  date_sub|
+----------+----------+----------+----------+----------+
|2019-01-23|2019-04-23|2018-10-23|2019-01-27|2019-01-19|
|2019-06-24|2019-09-24|2019-03-24|2019-06-28|2019-06-20|
|2019-09-20|2019-12-20|2019-06-20|2019-09-24|2019-09-16|
+----------+----------+----------+----------+----------+
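
None of the inputs above falls on a month end, so the table does not show what `add_months()` does when the target month is shorter. The sketch below uses two illustrative month-end dates that are not part of the original data, with an assumed local `SparkSession`. The day is moved back to the last day of the resulting month.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Assumption: a local session; in a notebook that already defines `spark`, this line can be skipped.
val spark = SparkSession.builder().master("local[*]").appName("add-months-sketch").getOrCreate()
import spark.implicits._

val monthEnd = Seq("2019-01-31", "2019-03-31")
  .toDF("input")
  .select(
    col("input"),
    // February 2019 has 28 days and April has 30, so the day of month is adjusted.
    add_months(col("input"), 1).as("plus_one_month")
  )

monthEnd.show()
```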





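## ```year()```, ```month()```, ```dayofweek()```, ```dayofmonth()```, ```dayofyear()```, ```next_day()```, ```weekofyear()```
The following Scala example extracts individual parts of a date and uses ```next_day()``` to find the first Sunday after each date.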

In [7]:
import org.apache.spark.sql.functions._

Seq(("2019-01-23"),("2019-06-24"),("2019-09-20"))
  .toDF("input")
  .select( col("input"), year(col("input")).as("year"), 
       month(col("input")).as("month"), 
       dayofweek(col("input")).as("dayofweek"), 
       dayofmonth(col("input")).as("dayofmonth"), 
       dayofyear(col("input")).as("dayofyear"), 
       next_day(col("input"),"Sunday").as("next_day"), 
       weekofyear(col("input")).as("weekofyear") 
   ).show()

+----------+----+-----+---------+----------+---------+----------+----------+
|     input|year|month|dayofweek|dayofmonth|dayofyear|  next_day|weekofyear|
+----------+----+-----+---------+----------+---------+----------+----------+
|2019-01-23|2019|    1|        4|        23|       23|2019-01-27|         4|
|2019-06-24|2019|    6|        2|        24|      175|2019-06-30|        26|
|2019-09-20|2019|    9|        6|        20|      263|2019-09-22|        38|
+----------+----+-----+---------+----------+---------+----------+----------+
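
The `dayofweek` column counts from 1 (Sunday) to 7 (Saturday), which differs from `java.time`, where Monday is 1 and Sunday is 7. The plain-Scala sketch below converts between the two conventions for the same three dates. `sparkStyleDayOfWeek` is a hypothetical helper written for this illustration.

```scala
import java.time.LocalDate

// java.time numbers days Monday = 1 ... Sunday = 7;
// shift to the Sunday = 1 ... Saturday = 7 convention used by dayofweek().
def sparkStyleDayOfWeek(date: LocalDate): Int =
  date.getDayOfWeek.getValue % 7 + 1

val inputs = Seq("2019-01-23", "2019-06-24", "2019-09-20").map(s => LocalDate.parse(s))

inputs.foreach { d =>
  println(s"$d ${d.getDayOfWeek} -> ${sparkStyleDayOfWeek(d)}")
}
```

The three dates are a Wednesday, a Monday, and a Friday, which correspond to the values 4, 2, and 6 in the table.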





# Spark Timestamp Function Examples
The examples below cover the most commonly used Timestamp functions.

## ```current_timestamp()```
Returns the current timestamp. ```show()``` displays it in Spark's default format **yyyy-MM-dd HH:mm:ss**, followed by fractional seconds when they are non-zero.

In [8]:
import org.apache.spark.sql.functions._

val df = Seq(1).toDF("seq")
val curDate = df.withColumn("current_date", current_date())
  .withColumn("current_timestamp", current_timestamp())
curDate.show(false)

+---+------------+-----------------------+
|seq|current_date|current_timestamp      |
+---+------------+-----------------------+
|1  |2022-07-04  |2022-07-04 11:15:37.499|
+---+------------+-----------------------+



df: org.apache.spark.sql.DataFrame = [seq: int]
curDate: org.apache.spark.sql.DataFrame = [seq: int, current_date: date ... 1 more field]


## ```to_timestamp()```
Converts a timestamp string in the given format to a Timestamp value.

In [9]:
import org.apache.spark.sql.functions._

val dfDate = Seq(("07-01-2019 12 01 19 406"),
    ("06-24-2019 12 01 19 406"),
    ("11-16-2019 16 44 55 406"),
    ("11-16-2019 16 50 59 406")).toDF("input_timestamp")

  dfDate.withColumn("datetype_timestamp",
          to_timestamp(col("input_timestamp"),"MM-dd-yyyy HH mm ss SSS"))
    .show(false)


+-----------------------+-----------------------+
|input_timestamp        |datetype_timestamp     |
+-----------------------+-----------------------+
|07-01-2019 12 01 19 406|2019-07-01 12:01:19.406|
|06-24-2019 12 01 19 406|2019-06-24 12:01:19.406|
|11-16-2019 16 44 55 406|2019-11-16 16:44:55.406|
|11-16-2019 16 50 59 406|2019-11-16 16:50:59.406|
+-----------------------+-----------------------+



dfDate: org.apache.spark.sql.DataFrame = [input_timestamp: string]


## ```hour()```, ```minute()``` and ```second()```
The following Scala example extracts the hour, minute, and second from a timestamp string.

In [10]:
import org.apache.spark.sql.functions._

val df = Seq(("2019-07-01 12:01:19.000"),
    ("2019-06-24 12:01:19.000"),
    ("2019-11-16 16:44:55.406"),
    ("2019-11-16 16:50:59.406")).toDF("input_timestamp")

  df.withColumn("hour", hour(col("input_timestamp")))
    .withColumn("minute", minute(col("input_timestamp")))
    .withColumn("second", second(col("input_timestamp")))
    .show(false)

+-----------------------+----+------+------+
|input_timestamp        |hour|minute|second|
+-----------------------+----+------+------+
|2019-07-01 12:01:19.000|12  |1     |19    |
|2019-06-24 12:01:19.000|12  |1     |19    |
|2019-11-16 16:44:55.406|16  |44    |55    |
|2019-11-16 16:50:59.406|16  |50    |59    |
+-----------------------+----+------+------+



df: org.apache.spark.sql.DataFrame = [input_timestamp: string]
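
As the schema line above shows, `input_timestamp` is still a string column, so the example relies on Spark casting it implicitly. A sketch of a more explicit variant converts the column with `to_timestamp()` first. The names `withTs` and `ts` are illustrative, and the `SparkSession` setup assumes a local run.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Assumption: a local session; in a notebook that already defines `spark`, this line can be skipped.
val spark = SparkSession.builder().master("local[*]").appName("hour-minute-second-sketch").getOrCreate()
import spark.implicits._

val withTs = Seq("2019-11-16 16:44:55.406")
  .toDF("input_timestamp")
  .withColumn("ts", to_timestamp(col("input_timestamp"))) // explicit timestamp column
  .withColumn("hour", hour(col("ts")))
  .withColumn("minute", minute(col("ts")))
  .withColumn("second", second(col("ts")))

withTs.printSchema()
withTs.show(false)
```

Keeping a typed `ts` column makes the intent visible in the schema and avoids repeating the implicit cast for each extracted field.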


# Conclusion
This notebook has consolidated the full list of Spark date and Timestamp functions, with a description and an example for some of the most commonly used ones.