![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/open-source-nlp/04.1.NerDL_Graph.ipynb)

# Graph Generation for NerDL Model

---



In [None]:
!pip install -q pyspark==3.3.0  spark-nlp==5.0.0
!pip install -q tensorflow==2.11.0
!pip install -q tensorflow_addons

In [None]:
import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *

spark = sparknlp.start()

print("Spark NLP version: ", sparknlp.version())
print("Apache Spark version: ", spark.version)

spark

Spark NLP version:  5.0.0
Apache Spark version:  3.3.0


# TF Graph Builder

The `TFNerDLGraphBuilder` annotator creates the TensorFlow graph as a stage of the model training pipeline. It inspects the data and builds a matching graph, provided a suitable version of TensorFlow (<= 2.7) is available. The graph is stored in the configured folder and then loaded by `NerDLApproach`.

**NOTE:** This annotator is available in `sparknlp` `v4.1.0` and later.

**ATTENTION:** **Experiment with the parameters of this annotator; they can affect the performance of the model you train.**
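
To give a feel for what "inspecting the data" can involve, here is a minimal sketch in plain Python that counts the distinct labels and distinct characters in CoNLL-2003 style lines. The helper `inspect_conll_lines` and the sample sentence are hypothetical and exist only for illustration; this is not the annotator's actual implementation, and the exact quantities it derives may differ.

```python
from typing import Iterable, Tuple


def inspect_conll_lines(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (number of distinct labels, number of distinct characters)."""
    labels = set()
    chars = set()
    for raw in lines:
        line = raw.strip()
        # Skip blank sentence separators and document markers.
        if not line or line.startswith("-DOCSTART-"):
            continue
        fields = line.split()
        # Assumed CoNLL-2003 layout: token in the first column, NER tag in the last.
        token, label = fields[0], fields[-1]
        labels.add(label)
        chars.update(token)
    return len(labels), len(chars)


# Hypothetical sample data, not taken from any real dataset.
sample = [
    "-DOCSTART- -X- -X- O",
    "",
    "John NNP B-NP B-PER",
    "lives VBZ B-VP O",
    "in IN B-PP O",
    "Paris NNP B-NP B-LOC",
]

n_labels, n_chars = inspect_conll_lines(sample)
print("distinct labels:", n_labels)
print("distinct characters:", n_chars)
```

Counts like these are a useful sanity check when a graph does not fit the data: a dataset with more labels or characters than expected calls for a larger graph.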


In [None]:
!mkdir -p ner_logs
!mkdir -p ner_graphs

graph_folder = "/content/ner_graphs"

In [None]:
graph_builder = TFNerDLGraphBuilder()\
                      .setInputCols(["sentence", "token", "embeddings"]) \
                      .setLabelColumn("label")\
                      .setGraphFile("auto")\
                      .setGraphFolder(graph_folder)\
                      .setHiddenUnitsNumber(20)

*Train the model with `NerDLApproach` and let it use the graph generated by the builder.*

You can find an example in [NERDL Training Notebook](https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/Certification_Trainings/Public/4.NERDL_Training.ipynb).

```python
from pyspark.ml import Pipeline

# You can use any word embeddings you want (GloVe, ELMo, BERT, custom, etc.)
glove_embeddings = WordEmbeddingsModel.pretrained('glove_100d')\
              .setInputCols(["document", "token"])\
              .setOutputCol("embeddings")

nerTagger = NerDLApproach()\
              .setInputCols(["sentence", "token", "embeddings"])\
              .setLabelColumn("label")\
              .setOutputCol("ner")\
              .setMaxEpochs(3)\
              .setLr(0.003)\
              .setBatchSize(32)\
              .setRandomSeed(0)\
              .setVerbose(1)\
              .setValidationSplit(0.2)\
              .setEvaluationLogExtended(True) \
              .setEnableOutputLogs(True)\
              .setIncludeConfidence(True)\
              .setGraphFolder(graph_folder)\
              .setOutputLogsPath('ner_logs') # if not set, logs will be written to ~/annotator_logs
          
ner_pipeline = Pipeline(stages=[glove_embeddings,
                                graph_builder,
                                nerTagger])
```


# Custom Graph

In [None]:
!wget https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/tutorials/Certification_Trainings/Public/utils/graph_utils/nerdl/nerdl-graph/create_graph.py
!wget https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/tutorials/Certification_Trainings/Public/utils/graph_utils/nerdl/nerdl-graph/dataset_encoder.py
!wget https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/tutorials/Certification_Trainings/Public/utils/graph_utils/nerdl/nerdl-graph/ner_model.py
!wget https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/tutorials/Certification_Trainings/Public/utils/graph_utils/nerdl/nerdl-graph/ner_model_saver.py
!wget https://raw.githubusercontent.com/JohnSnowLabs/spark-nlp-workshop/master/tutorials/Certification_Trainings/Public/utils/graph_utils/nerdl/nerdl-graph/sentence_grouper.py

In [None]:
import create_graph

ntags = 19            # number of labels
embeddings_dim = 100  # embedding dimension (e.g. 100 for glove_100d)
nchars = 100          # number of distinct characters

create_graph.create_graph(ntags, embeddings_dim, nchars)

# Then place the generated graph file (.pb) in a folder and pass that folder to NerDLApproach with .setGraphFolder('folder')
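
Before pointing `NerDLApproach` at a folder, it can help to check which graph files are actually there. The sketch below assumes the files follow a `blstm_<ntags>_<embeddings_dim>_<lstm_size>_<nchars>.pb` naming scheme, and that a graph is usable when its embedding dimension matches exactly and its tag and character counts are at least as large as required. Both the naming scheme and the matching rule are assumptions to verify against your Spark NLP version; `list_graphs` and `compatible_graphs` are hypothetical helpers, not part of the library.

```python
import re
import tempfile
from pathlib import Path
from typing import List, NamedTuple

# Assumed naming scheme: blstm_<ntags>_<embeddings_dim>_<lstm_size>_<nchars>.pb
GRAPH_NAME = re.compile(r"^blstm_(\d+)_(\d+)_(\d+)_(\d+)\.pb$")


class GraphInfo(NamedTuple):
    file: str
    ntags: int
    embeddings_dim: int
    lstm_size: int
    nchars: int


def list_graphs(folder: str) -> List[GraphInfo]:
    """Parse the dimensions encoded in each graph file name found in `folder`."""
    graphs = []
    for path in sorted(Path(folder).glob("*.pb")):
        match = GRAPH_NAME.match(path.name)
        if match is None:
            continue  # file does not follow the assumed naming scheme
        ntags, embeddings_dim, lstm_size, nchars = map(int, match.groups())
        graphs.append(GraphInfo(path.name, ntags, embeddings_dim, lstm_size, nchars))
    return graphs


def compatible_graphs(folder: str, ntags: int, embeddings_dim: int, nchars: int) -> List[GraphInfo]:
    """Keep graphs that satisfy the assumed matching rule described above."""
    return [
        g for g in list_graphs(folder)
        if g.ntags >= ntags and g.embeddings_dim == embeddings_dim and g.nchars >= nchars
    ]


# Demo on empty placeholder files; the names are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("blstm_19_100_128_100.pb", "blstm_10_300_128_100.pb", "notes.txt"):
        (Path(tmp) / name).touch()
    found = list_graphs(tmp)
    usable = compatible_graphs(tmp, ntags=19, embeddings_dim=100, nchars=100)

print([g.file for g in found])
print([g.file for g in usable])
```

In a notebook you would call `list_graphs(graph_folder)` on the real folder instead of the temporary one used in the demo.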