![JohnSnowLabs](https://sparknlp.org/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/annotation/text/english/MultiDateMatcherMultiLanguage_en.ipynb)


# **Matching Dates**

In [None]:
# Only run this cell when you are using Spark NLP on Google Colab
!wget http://setup.johnsnowlabs.com/colab.sh -O - | bash

In [None]:
import sparknlp
from sparknlp.annotator import *
from sparknlp.base import *
from pyspark.sql.types import StringType

# Start a Spark session with Spark NLP loaded, then report the library version
spark = sparknlp.start()
sparknlp.version()

'4.3.1'

## Matching formatted English dates

In [None]:
df = spark.createDataFrame(
  ["We met on the 13/5/2018 and then on the 18/5/2020."],
  StringType()).toDF("text")
df.show()

+--------------------+
|                text|
+--------------------+
|We met on the 13/...|
+--------------------+



In [None]:
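# Wrap the raw "text" column in a document annotation
document_assembler = DocumentAssembler() \
            .setInputCol("text") \
            .setOutputCol("document")

# Find every date in the document and normalize it to month/day/year
date_matcher = MultiDateMatcher() \
            .setInputCols(["document"]) \
            .setOutputCol("date") \
            .setOutputFormat("MM/dd/yyyy") \
            .setSourceLanguage("en")

# Neither stage needs fitting, so transform the DataFrame directly
assembled = document_assembler.transform(df)
date_matcher.transform(assembled).select("date").show(10, False)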

+--------------------------------------------------------------------------------------------------+
|date                                                                                              |
+--------------------------------------------------------------------------------------------------+
|[[date, 14, 22, 05/13/2018, [sentence -> 0], []], [date, 40, 48, 05/18/2020, [sentence -> 0], []]]|
+--------------------------------------------------------------------------------------------------+
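To make the annotation fields easier to read, the sketch below reproduces the two matches in plain Python. It is not how `MultiDateMatcher` works internally: the regular expression and the helper name `find_slash_dates` are assumptions made purely for illustration. It shows that the two integers in each annotation are inclusive character offsets into the text, and that the result is the day/month/year input rewritten as `MM/dd/yyyy`.

```python
import re
from datetime import datetime

TEXT = "We met on the 13/5/2018 and then on the 18/5/2020."

# Hypothetical pattern for day/month/year dates; not Spark NLP's own rule set.
SLASH_DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def find_slash_dates(text):
    """Return (begin, end, formatted) tuples with an inclusive end offset."""
    results = []
    for match in SLASH_DATE.finditer(text):
        parsed = datetime.strptime(match.group(0), "%d/%m/%Y")
        # match.end() is exclusive, so subtract 1 to point at the last character.
        results.append((match.start(), match.end() - 1, parsed.strftime("%m/%d/%Y")))
    return results


matches = find_slash_dates(TEXT)
print(matches)
```

The printed tuples line up with the begin, end, and result fields of the two annotations in the output above.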



## Matching unformatted English dates

In [None]:
df = spark.createDataFrame(
  ["I see you next Friday after the next Thursday."],
  StringType()).toDF("text")
df.show()

+--------------------+
|                text|
+--------------------+
|I see you next Fr...|
+--------------------+



In [None]:
# Same two stages as in the formatted-date example, redefined so this cell runs on its own
document_assembler = DocumentAssembler() \
            .setInputCol("text") \
            .setOutputCol("document")

# Relative expressions such as "next Friday" are also normalized to month/day/year
date_matcher = MultiDateMatcher() \
            .setInputCols(["document"]) \
            .setOutputCol("date") \
            .setOutputFormat("MM/dd/yyyy") \
            .setSourceLanguage("en")

assembled = document_assembler.transform(df)
date_matcher.transform(assembled).select("date").show(10, False)

+--------------------------------------------------------------------------------------------------+
|date                                                                                              |
+--------------------------------------------------------------------------------------------------+
|[[date, 10, 17, 02/24/2023, [sentence -> 0], []], [date, 32, 39, 02/23/2023, [sentence -> 0], []]]|
+--------------------------------------------------------------------------------------------------+
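A relative expression such as "next Friday" only becomes a calendar date once a reference date is fixed, so the dates printed above reflect when the notebook was run. The sketch below illustrates that idea in plain Python. The function `next_weekday`, the rule "first matching weekday strictly after the reference date", and the reference date 2023-02-20 are all assumptions chosen for illustration. The notebook does not record its run date, and `MultiDateMatcher` may resolve such phrases differently.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def next_weekday(reference, weekday_name):
    """Return the first date strictly after `reference` that falls on `weekday_name`."""
    target = WEEKDAYS.index(weekday_name.lower())
    # date.weekday() counts Monday as 0; the "- 1 ... + 1" keeps the result in 1..7 days.
    days_ahead = (target - reference.weekday() - 1) % 7 + 1
    return reference + timedelta(days=days_ahead)


# Hypothetical reference date (a Monday); the real run date is not known.
reference = date(2023, 2, 20)
friday = next_weekday(reference, "Friday").strftime("%m/%d/%Y")
thursday = next_weekday(reference, "Thursday").strftime("%m/%d/%Y")
print(friday, thursday)
```

Changing `reference` shifts both results, which is the behavior to expect when rerunning the cell above on a different day.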

