![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

# **WindowedSentenceModel**

This notebook covers the parameters and usage of `WindowedSentenceModel`. The annotator merges each sentence with the sentences that precede and follow it, so that every output carries its surrounding context.


**📖 Learning Objectives:**

1. Understand how the annotator helps with context-rich analyses that require a deeper understanding of the language being used.

2. Become comfortable using the different parameters of the annotator.


**🔗 Helpful Links:**

- Documentation : [WindowedSentenceModel](https://nlp.johnsnowlabs.com/docs/en/licensed_annotators#windowedsentencemodel)

- Python Docs : [WindowedSentenceModel](https://nlp.johnsnowlabs.com/licensed/api/python/reference/autosummary/sparknlp_jsl/annotator/windowed/windowed_sentence/index.html)

- Scala Docs : [WindowedSentenceModel](https://nlp.johnsnowlabs.com/licensed/api/com/johnsnowlabs/nlp/annotators/windowed/WindowedSentenceModel.html)

- For extended examples of usage, see the [Spark NLP Workshop repository](https://github.com/JohnSnowLabs/spark-nlp-workshop/tree/master/tutorials/Certification_Trainings/Healthcare).

## **🎬 Colab Setup**

In [None]:
!pip install -q johnsnowlabs

In [None]:
from google.colab import files
print('Please Upload your John Snow Labs License using the button below')
license_keys = files.upload()

In [None]:
from johnsnowlabs import nlp

nlp.install()

In [None]:
from johnsnowlabs import nlp, medical
import pyspark.sql.functions as F
import pandas as pd

spark = nlp.start()

## **🖨️ Input/Output Annotation Types**

- Input: `DOCUMENT`

- Output: `DOCUMENT`

## **🔎 Parameters**


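- `WindowSize` (int) : Sets the size of the sliding window. Judging by the outputs in this notebook, this is the number of neighboring sentences merged on each side of the current sentence.

- `GlueString` (string) : Sets the string used to join the neighboring sentences together. When it is not set, the outputs in this notebook show the sentences joined with a single space.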
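
The interplay of the two parameters can be sketched in plain Python. This is only an illustration of the behavior visible in this notebook's outputs, not the annotator's actual implementation; the helper name `windowed_sentences` and its arguments are hypothetical.

```python
from typing import List


def windowed_sentences(sentences: List[str], window_size: int, glue: str = " ") -> List[str]:
    """Join each sentence with up to `window_size` neighbors on each side.

    Hypothetical helper that mimics the outputs shown in this notebook;
    it is not part of the johnsnowlabs API.
    """
    windows = []
    for i in range(len(sentences)):
        start = max(0, i - window_size)                  # clip at the first sentence
        end = min(len(sentences), i + window_size + 1)   # clip at the last sentence
        windows.append(glue.join(sentences[start:end]))
    return windows


sentences = [
    "The patient was admitted on Monday.",
    "She has a right-sided pleural effusion for thoracentesis.",
    "Her Coumadin was placed on hold.",
    "A repeat echocardiogram was checked.",
]

# One merged string per input sentence, joined with "_"
for row in windowed_sentences(sentences, window_size=1, glue="_"):
    print(row)
```

Windows at the start and end of the text are shorter because there are fewer neighbors to merge, which is why the first row of each output table in the sections below contains fewer sentences than the rows in the middle.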

### `setWindowSize()`

In [None]:
from johnsnowlabs import medical, nlp

# Wrap the raw text column in DOCUMENT annotations
documentAssembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Split each document into sentences
sentenceDetector = nlp.SentenceDetector()\
    .setInputCols("document")\
    .setOutputCol("sentence")

# Merge each sentence with its neighbors, using a window size of 1
windowedSentence1 = medical.WindowedSentenceModel()\
    .setWindowSize(1)\
    .setInputCols("sentence")\
    .setOutputCol("window_1")

# Same input, using a window size of 2 for comparison
windowedSentence2 = medical.WindowedSentenceModel()\
    .setWindowSize(2)\
    .setInputCols("sentence")\
    .setOutputCol("window_2")

pipeline = nlp.Pipeline(stages=[
    documentAssembler,
    sentenceDetector,
    windowedSentence1,
    windowedSentence2
])


sample_text = """The patient was admitted on Monday.
She has a right-sided pleural effusion for thoracentesis.
Her Coumadin was placed on hold.
A repeat echocardiogram was checked.
She was started on prophylaxis for DVT.
Her CT scan from March 2006 prior to her pericardectomy.
It already shows bilateral pleural effusions."""

data = spark.createDataFrame([[sample_text]]).toDF("text")

result = pipeline.fit(data).transform(data)

In [None]:
result.select(F.explode('window_1')).select('col.result').show(truncate=False)

+---------------------------------------------------------------------------------------------------------------------------------------------+
|result                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------+
|The patient was admitted on Monday. She has a right-sided pleural effusion for thoracentesis.                                                |
|The patient was admitted on Monday. She has a right-sided pleural effusion for thoracentesis. Her Coumadin was placed on hold.               |
|She has a right-sided pleural effusion for thoracentesis. Her Coumadin was placed on hold. A repeat echocardiogram was checked.              |
|Her Coumadin was placed on hold. A repeat echocardiogram was checked. She was started on prophylaxis for DVT.                          

In [None]:
result.select(F.explode('window_2')).select('col.result').show(truncate=False)

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|result                                                                                                                                                                                                                          |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|The patient was admitted on Monday. She has a right-sided pleural effusion for thoracentesis. Her Coumadin was placed on hold.                                                                                                  |
|The patient was admitted on Monday. She has a right-sided pleural effusion for thoracentesi

### `setGlueString()`

In [None]:
# Wrap the raw text column in DOCUMENT annotations
documentAssembler = nlp.DocumentAssembler()\
    .setInputCol("text")\
    .setOutputCol("document")

# Split each document into sentences
sentenceDetector = nlp.SentenceDetector()\
    .setInputCols("document")\
    .setOutputCol("sentence")

# Window size of 1, with "_" as the glue string between sentences
windowedSentence1 = medical.WindowedSentenceModel()\
    .setWindowSize(1)\
    .setInputCols("sentence")\
    .setOutputCol("window_1")\
    .setGlueString("_")

# Window size of 2, with no glue string set, for comparison
windowedSentence2 = medical.WindowedSentenceModel()\
    .setWindowSize(2)\
    .setInputCols("sentence")\
    .setOutputCol("window_2")

pipeline = nlp.Pipeline(stages=[
    documentAssembler,
    sentenceDetector,
    windowedSentence1,
    windowedSentence2
])


sample_text = """The patient was admitted on Monday.
She has a right-sided pleural effusion for thoracentesis.
Her Coumadin was placed on hold.
A repeat echocardiogram was checked.
She was started on prophylaxis for DVT.
Her CT scan from March 2006 prior to her pericardectomy.
It already shows bilateral pleural effusions."""

data = spark.createDataFrame([[sample_text]]).toDF("text")

result = pipeline.fit(data).transform(data)


In [None]:
result.select(F.explode('window_1')).select('col.result').show(truncate=False)

+---------------------------------------------------------------------------------------------------------------------------------------------+
|result                                                                                                                                       |
+---------------------------------------------------------------------------------------------------------------------------------------------+
|The patient was admitted on Monday._She has a right-sided pleural effusion for thoracentesis.                                                |
|The patient was admitted on Monday._She has a right-sided pleural effusion for thoracentesis._Her Coumadin was placed on hold.               |
|She has a right-sided pleural effusion for thoracentesis._Her Coumadin was placed on hold._A repeat echocardiogram was checked.              |
|Her Coumadin was placed on hold._A repeat echocardiogram was checked._She was started on prophylaxis for DVT.                          

In [None]:
result.select(F.explode('window_2')).select('col.result').show(truncate=False)

+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|result                                                                                                                                                                                                                          |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|The patient was admitted on Monday. She has a right-sided pleural effusion for thoracentesis. Her Coumadin was placed on hold.                                                                                                  |
|The patient was admitted on Monday. She has a right-sided pleural effusion for thoracentesi
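
One possible reason to choose a distinctive glue string is that the merged text can later be split back into its sentences. This is an assumption made for illustration, not something the annotator documentation states. A minimal plain-Python sketch, where the separator `" || "` is a hypothetical choice:

```python
# Three sentences as they might appear inside one window
window = [
    "The patient was admitted on Monday.",
    "She has a right-sided pleural effusion for thoracentesis.",
    "Her Coumadin was placed on hold.",
]

default_glued = " ".join(window)     # single-space glue, as in the window_2 output
custom_glued = " || ".join(window)   # hypothetical distinctive glue string

# Splitting on the distinctive glue recovers the original sentences.
recovered = custom_glued.split(" || ")
print(recovered == window)

# Splitting on a single space yields individual words,
# so the sentence boundaries are no longer recoverable this way.
print(len(default_glued.split(" ")), "pieces vs.", len(window), "sentences")
```

A plain space keeps the merged text natural for downstream models that expect running prose, so the better choice depends on what consumes the output.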