![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/jupyter/prediction/english/graph_extraction_helper_display.ipynb)

In [None]:
!wget http://setup.johnsnowlabs.com/colab.sh -O - | bash

--2023-01-02 20:04:15--  http://setup.johnsnowlabs.com/colab.sh
Resolving setup.johnsnowlabs.com (setup.johnsnowlabs.com)... 51.158.130.125
Connecting to setup.johnsnowlabs.com (setup.johnsnowlabs.com)|51.158.130.125|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://setup.johnsnowlabs.com/colab.sh [following]
--2023-01-02 20:04:15--  https://setup.johnsnowlabs.com/colab.sh
Connecting to setup.johnsnowlabs.com (setup.johnsnowlabs.com)|51.158.130.125|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp/master/scripts/colab_setup.sh [following]
--2023-01-02 20:04:15--  https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp/master/scripts/colab_setup.sh
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:44

To better understand which relationships the Graph Extraction annotator can extract, we recommend using the spark-nlp-display library to visualize the dependency parser tree and the tokens labeled by NER. This notebook shows how to use it.

In [None]:
!pip install spark-nlp-display

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting spark-nlp-display
  Downloading spark_nlp_display-4.2-py3-none-any.whl (95 kB)
Collecting svgwrite==1.4
  Downloading svgwrite-1.4-py3-none-any.whl (66 kB)
Collecting jedi>=0.10
  Downloading jedi-0.18.2-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: jedi, svgwrite, spark-nlp-display
Successfully installed jedi-0.18.2 spark-nlp-display-4.2 svgwrite-1.4


In [None]:
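import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline

# Start (or reuse) a Spark session with Spark NLP available;
# `spark` is used below to build the DataFrame the pipeline is fitted on.
spark = sparknlp.start()

print("Spark NLP version", sparknlp.version())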

Spark NLP version 4.2.6


In [None]:
text = 'Peter was born in Mexico and very successful man.'

In [None]:
# Stages for typed dependency parsing: document -> tokens -> POS tags -> dependencies -> relation labels
document_assembler = DocumentAssembler().setInputCol("text").setOutputCol("document")
tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
pos_tagger = PerceptronModel.pretrained().setInputCols(["document", "token"]).setOutputCol("pos")
dep_parser = DependencyParserModel.pretrained().setInputCols(["document", "pos", "token"]).setOutputCol("dependency")
typed_dep_parser = TypedDependencyParserModel.pretrained().setInputCols(["token", "pos", "dependency"]).setOutputCol("dependency_type")

dep_parser_pipeline = Pipeline(stages=[document_assembler, tokenizer, pos_tagger, dep_parser, typed_dep_parser])

# Every stage is pretrained, so fitting on an empty DataFrame only assembles the PipelineModel.
empty_df = spark.createDataFrame([['']]).toDF("text")
pipeline_model = dep_parser_pipeline.fit(empty_df)
# LightPipeline annotates plain Python strings without building a DataFrame.
light_model = LightPipeline(pipeline_model)

pos_anc download started this may take some time.
Approximate size to download 3.9 MB
[OK!]
dependency_conllu download started this may take some time.
Approximate size to download 16.7 MB
[OK!]
dependency_typed_conllu download started this may take some time.
Approximate size to download 2.4 MB
[OK!]


In [None]:
from sparknlp_display import DependencyParserVisualizer

output = light_model.fullAnnotate(text)[0]
dependency_vis = DependencyParserVisualizer()
dependency_vis.display(output, 'pos', 'dependency', 'dependency_type')
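
The same annotations that feed the visualizer can also be read as plain rows, which helps when checking which relation labels are available. The following is a minimal sketch, assuming each annotation exposes a `result` attribute, that the `token`, `pos`, `dependency`, and `dependency_type` lists are aligned one-to-one, and that the `dependency` result holds the head token. The `Ann` stand-in, the `dependency_rows` helper, and the sample values are hypothetical placeholders (not real parser output) so the cell runs without a Spark session.

```python
from collections import namedtuple

# Hypothetical stand-in for a Spark NLP annotation; only `result` is used here.
Ann = namedtuple("Ann", ["result"])


def dependency_rows(output, pos_col="pos", dep_col="dependency", type_col="dependency_type"):
    """Return (token, pos, head, relation) tuples from a fullAnnotate-style dict."""
    columns = (output["token"], output[pos_col], output[dep_col], output[type_col])
    # zip(*columns) walks the four aligned lists token by token.
    return [tuple(ann.result for ann in row) for row in zip(*columns)]


# Placeholder values for illustration only (not actual parser output).
sample_output = {
    "token": [Ann("Peter"), Ann("was"), Ann("born")],
    "pos": [Ann("NNP"), Ann("VBD"), Ann("VBN")],
    "dependency": [Ann("born"), Ann("born"), Ann("ROOT")],
    "dependency_type": [Ann("nsubjpass"), Ann("auxpass"), Ann("root")],
}

for row in dependency_rows(sample_output):
    print(row)
```

With a live session, passing the `output` dictionary from `light_model.fullAnnotate(text)[0]` instead of `sample_output` should yield one row per token, provided the assumptions above hold for your Spark NLP version.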

In [None]:
from sparknlp.pretrained import PretrainedPipeline
from sparknlp_display import NerVisualizer


ner_pipeline = PretrainedPipeline('recognize_entities_dl', lang='en')
ner_output = ner_pipeline.fullAnnotate(text)[0]

visualiser = NerVisualizer()
visualiser.display(ner_output, label_col='entities', document_col='document')

recognize_entities_dl download started this may take some time.
Approx size to download 160.1 MB
[OK!]
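
To see the entity labels as data rather than as highlighted text, the chunks in the `entities` column can be listed directly. This is a minimal sketch, assuming each chunk annotation carries `result`, `begin`, `end`, and a `metadata` dictionary with an `entity` key. The `Chunk` stand-in, the `list_entities` helper, and the sample labels are hypothetical placeholders, not output from `recognize_entities_dl`.

```python
from collections import namedtuple

# Hypothetical stand-in for a chunk annotation produced by an NER pipeline.
Chunk = namedtuple("Chunk", ["result", "begin", "end", "metadata"])


def list_entities(ner_output, label_col="entities"):
    """Return (chunk_text, entity_label, begin, end) tuples for each chunk."""
    return [
        (chunk.result, chunk.metadata.get("entity"), chunk.begin, chunk.end)
        for chunk in ner_output[label_col]
    ]


# Placeholder chunks for illustration only; offsets use an inclusive end index.
sample_ner_output = {
    "entities": [
        Chunk("Peter", 0, 4, {"entity": "PER"}),
        Chunk("Mexico", 18, 23, {"entity": "LOC"}),
    ]
}

for entity in list_entities(sample_ner_output):
    print(entity)
```

The labels found this way are the ones a Graph Extraction relationship would be anchored on, so listing them first is a cheap sanity check before configuring the annotator.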


The sentence below produces a deeper dependency tree.

In [None]:
text = 'Peter was born in Mexico and very successful in Queens.'

In [None]:
output = light_model.fullAnnotate(text)[0]
dependency_vis = DependencyParserVisualizer()
dependency_vis.display(output, 'pos', 'dependency', 'dependency_type')
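
How deep a tree is can be measured from the head index of each token. The sketch below is a hypothetical helper, `tree_depth`, that assumes head indices are 1-based token positions with `0` marking the root; in Spark NLP these can presumably be read from `metadata['head']` of each `dependency` annotation, which is an assumption worth verifying against your version. The two sample lists are made-up shapes, not the parses of the sentences in this notebook.

```python
def tree_depth(heads):
    """Return the number of tokens on the longest token-to-root path.

    heads[i] is the 1-based index of the head of token i + 1; 0 marks the root.
    """
    max_depth = 0
    for start in range(1, len(heads) + 1):
        depth, node = 0, start
        # Follow head links upward until the root marker is reached.
        while node != 0:
            depth += 1
            if depth > len(heads):
                raise ValueError("cycle detected in head indices")
            node = heads[node - 1]
        max_depth = max(max_depth, depth)
    return max_depth


# Hypothetical shapes: a flat tree (both other tokens attach to the root token)
# and a chain (each token attaches to the next one).
flat_heads = [2, 0, 2]
chain_heads = [2, 3, 0]

print(tree_depth(flat_heads), tree_depth(chain_heads))
```

Running it on the head indices of both example sentences would give a numeric comparison to go with the rendered trees.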

In [None]:
# Reuse the `ner_pipeline` loaded earlier; only the input text has changed.
ner_output = ner_pipeline.fullAnnotate(text)[0]

visualiser = NerVisualizer()
visualiser.display(ner_output, label_col='entities', document_col='document')
