# Visualization with displaCy

At this point, you already know about tools for analysing the structure and semantics of historical texts. It would be helpful if we could also visualise the results to make sense of them at a glance. That's where displaCy comes in: a tool in spaCy that makes language analysis visually intuitive. It helps us 'see' the role each word plays in a sentence, making it easier to understand the building blocks of language.

In this tutorial, you'll learn how to use displaCy to visualize dependency parses (including part-of-speech tags) and named entities.

### Before you get started ...

Please run the code below. It installs the necessary libraries and the spaCy language model. This might take some time ;)

In [None]:
# install the libraries listed in the requirements.txt file
%pip install -r ../.devcontainer/python-3.12/requirements.txt --upgrade-strategy only-if-needed

# install spacy model en_core_web_sm
!python -m spacy download en_core_web_sm 

**Step 1: Importing spaCy and displaCy**

First, we need to import spaCy and displaCy.

Then we load a pre-trained spaCy language model for English. The string "en_core_web_sm" is the name of the model, in this case the small English model (sm stands for small).

The returned pipeline object is assigned to a variable named `nlp`, which we can call on a text to process it.

Run the code in the cell below:

In [4]:
# import spaCy and displaCy libraries
import spacy
from spacy import displacy

# Load English language model
nlp = spacy.load("en_core_web_sm")

**Step 2: Applying spaCy's processing pipeline to the input**

Before displaCy can visualise our input, we need to convert it into a format it can handle. Luckily, spaCy makes this easy: when you call `nlp(text)` on a given input text, spaCy processes the text and creates a `doc` object that contains the analysed information. You can then access various attributes and methods of the `doc` object to obtain information about tokens, entities, part-of-speech tags, and more.

We will use the following sentence as input:

"I've been 2 times to New York in 2011, but did not have the constitution for it. It DIDN 'T appeal to me. I preferred Los Angeles."

Follow the instructions below to prepare the input text for displaCy, then run your code:

In [5]:
# TODO: Save the text into a variable called text (avoid the name input, which would shadow Python's built-in input() function)
text = "I've been 2 times to New York in 2011, but did not have the constitution for it. It DIDN 'T appeal to me. I preferred Los Angeles."

# TODO: Pass the text variable as an input parameter to the nlp pipeline and save the result in a variable called doc
doc = nlp(text)

**Step 3: Visualising dependencies**

Visualizing dependencies in a sentence provides a clear and intuitive way to analyze its building blocks, allowing you to see the relationships between words and understand how they work together grammatically.

You can use displaCy's `render()` function to display your results as a dependency parse tree. The function offers several options to customise the look of the tree, but all it really needs to work is the `doc` input and the `style` parameter. Try it out below:

In [6]:
# optional: set to True here because the code runs in a Jupyter notebook
jupyter_mode = True

# optional: set the distance between tokens in the visualization
render_options = {'distance': 140}

# TODO: Set the visualization style to "dep" to show the dependencies in the sentence
visualization_style = "dep"

# Render and display the visualization
displacy.render(doc, jupyter=jupyter_mode, options=render_options, style=visualization_style)

**Step 4: Visualising named entities**

Finding named entities in a text is crucial for quickly identifying and extracting key information such as names of people, locations, dates, and organizations.

DisplaCy's entity visualizer highlights named entities and their labels.

You use the entity visualiser in exactly the same way as the dependency visualiser above. There is only one small difference: the `style` parameter of the `render()` function is set to the value `"ent"`. With this information, you can visualise the named entities present in our text input:

In [7]:
# TODO: apply the displacy render() function to the doc object with the jupyter parameter set to True and the style parameter set to 'ent'
displacy.render(doc, jupyter=True, style="ent")
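
The labels shown next to the highlighted entities are abbreviations such as `GPE` or `DATE`. If a label is unclear, `spacy.explain()` looks it up in spaCy's built-in glossary. The following is a small sketch. The list of labels is hand-picked for illustration and is not read from the pipeline.

```python
import spacy

# A few entity labels, hand-picked for illustration (not read from the pipeline)
labels = ["GPE", "DATE", "PERSON", "ORG"]

# spacy.explain() returns a short description from spaCy's built-in glossary
descriptions = {label: spacy.explain(label) for label in labels}

for label, description in descriptions.items():
    print(f"{label}: {description}")
```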

**Optional Challenge: Fine-tuning your visualisation results**

If you are only interested in certain entity types, you can pass them to the `render()` function via its `options` parameter. To do so, create a dictionary called `options` whose `'ents'` key holds a list of the entity labels of your choice (e.g. `'GPE'` and `'DATE'`). All that is left to do below is to call displaCy's `render()` function with the `options` parameter and the same `style` value for entity visualisation that you used above:


In [8]:
# Pass entities in list
options = {'ents': ['GPE', 'DATE']}

# TODO: call render() function for visualising entities with jupyter parameter set to True and options parameter set to options
displacy.render(doc, style='ent', jupyter=True, options=options)

Maybe you also want to choose the colors for your visualisation. You can tell displaCy which color to use for which entity label by assigning each label a hexadecimal color code (e.g. `'#FF5733'` and `'#33FF57'`). Below, you only have to call displaCy's `render()` function with the `options` parameter and the `style` parameter for entity visualisation you already used above. Feel free to experiment a bit more with the colors:

In [9]:
# Define a list with the entity labels you want to visualise
entities = ["GPE", "DATE"]

# Define a dictionary that maps each entity label (string) to a hexadecimal color code (string)
custom_colors = {"GPE": "#FF5733", "DATE": "#33FF57"}

# Set rendering options with custom colors
options = {"ents": entities, "colors": custom_colors}

# Visualize named entities passing the doc object as well as the style, jupyter and options parameters
displacy.render(doc, style="ent", jupyter=True, options=options)