# Using Engagement Analyzer




First, install the model package. The following command downloads the wheel from Hugging Face and installs it with pip. The file is about 2.8 GB, so the download can take a while.

In [1]:
!pip install https://huggingface.co/egumasa/en_engagement_LSTM/resolve/main/en_engagement_LSTM-any-py3-none-any.whl

Collecting en-engagement-LSTM==any
  Downloading https://huggingface.co/egumasa/en_engagement_LSTM/resolve/main/en_engagement_LSTM-any-py3-none-any.whl (2784.5 MB)


# Import the package

You can load the package with the `spacy.load()` function.
It is customary to store a loaded spaCy model in a variable named `nlp`, but you can choose any variable name you like.

In [3]:
# Using spacy.load().
import spacy
nlp = spacy.load("en_engagement_LSTM")
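
If the install step was skipped or failed, `spacy.load()` will raise an error. As a minimal sketch, the check below looks for the installed package before loading it. The helper name `is_installed` is made up for this illustration, and the sketch assumes the wheel installs an importable module named `en_engagement_LSTM`, as packaged spaCy pipelines normally do.

```python
import importlib.util

# Assumption: the importable module name matches the wheel name used above.
MODEL_NAME = "en_engagement_LSTM"


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be imported."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("spacy") and is_installed(MODEL_NAME):
    import spacy

    nlp = spacy.load(MODEL_NAME)
    print("Loaded pipeline components:", nlp.pipe_names)
else:
    print(f"{MODEL_NAME} (or spaCy) is not installed; run the pip command above first.")
```

`nlp.pipe_names` lists the components in the loaded pipeline, which is a quick way to confirm that the load worked.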




# First Engagement Analysis 

First things first, let's analyze the following example from the tutorial.

"My guess is that you would probably not believe in this approach yet."

We will create a variable named text, which will hold the example sentence to analyze.

In [34]:
# Put the text into a variable called text
text = "My guess is that you would probably not believe in this approach yet."

# Pass the text to the function nlp(), which will create a spacy Doc object with automated annotation.
doc = nlp(text)

## Let's examine the analysis results

In spaCy, span annotations are stored in `doc.spans[KEY]`.
Here the key is `"sc"`, the spaCy default; you can change it when you train the model.

In [35]:
print("Text\t\tLabel\t\tStart Token\tEnd Token")
for span in doc.spans['sc']:  # iterate over the Span objects
    print(span.text,    # raw text of the span
          span.label_,  # predicted label
          span.start,   # index of the first token in the span
          span.end,     # index of the token after the last one (exclusive)
          sep="\t")     # separate the fields with tabs

Text		Label		Start Token	End Token
My guess is	ENTERTAIN	0	3
would probably	ENTERTAIN	5	7
not believe	DENY	7	9


The output should look like the following (see also Table 3 of the paper).

```
Text		Label		Start Token	End Token
My guess is	ENTERTAIN	0	3
would probably	ENTERTAIN	5	7
not believe	DENY	7	9
```

* `span.text` : the raw text of the span

* `span.label_` : the predicted label

* `span.start` : index of the span's first token (a token index, not a character offset)

* `span.end` : index of the token just after the span's last token (exclusive)
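
Note that in spaCy `span.start` and `span.end` count tokens, while character offsets live in `span.start_char` and `span.end_char`. The sketch below collects both into one row per span. The helper names `span_to_row` and `make_stand_in` are made up for this illustration, and the stand-in objects only mimic the attributes of real spaCy spans so that the example runs without the 2.8 GB model. With the model loaded, you would pass `doc.spans["sc"]` instead.

```python
from types import SimpleNamespace

text = "My guess is that you would probably not believe in this approach yet."


def span_to_row(span):
    """Collect token and character offsets of a span-like object into a dict."""
    return {
        "text": span.text,
        "label": span.label_,
        "start_token": span.start,      # token index (inclusive)
        "end_token": span.end,          # token index (exclusive)
        "start_char": span.start_char,  # character offset (inclusive)
        "end_char": span.end_char,      # character offset (exclusive)
    }


def make_stand_in(phrase, label, start, end):
    """Hypothetical stand-in for a spaCy Span; character offsets come from the text."""
    start_char = text.index(phrase)
    return SimpleNamespace(text=phrase, label_=label, start=start, end=end,
                           start_char=start_char, end_char=start_char + len(phrase))


# Token indices and labels are copied from the output shown above.
stand_ins = [
    make_stand_in("My guess is", "ENTERTAIN", 0, 3),
    make_stand_in("would probably", "ENTERTAIN", 5, 7),
    make_stand_in("not believe", "DENY", 7, 9),
]

rows = [span_to_row(span) for span in stand_ins]
for row in rows:
    print(row)
```

Slicing the original string with the character offsets, as in `text[row["start_char"]:row["end_char"]]`, returns the span text, which is handy when you need to align spans with other character-based annotation.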

## Combining span prediction with existing spaCy pipeline

When you train a spaCy model, you can reuse the components of an existing spaCy pipeline without retraining them. This lets you analyze spans alongside the other annotations the pipeline provides, such as part-of-speech tags and dependency labels.

Because span categorization does not produce token-by-token annotation, you can instead look up the syntactic head of each span (`span.root`) and read its token-level attributes.

In [36]:
print("Span\t\tEngagement\tSpanHead\tPOS\tDependency")
for span in doc.spans['sc']:  # iterate over the Span objects
    print(span.text,       # raw text of the span
          span.label_,     # predicted label
          span.root,       # syntactic head token of the span
          span.root.pos_,  # part-of-speech tag of the head
          span.root.dep_,  # dependency relation of the head
          sep="\t")        # separate the fields with tabs

Span		Engagement	SpanHead	POS	Dependency
My guess is	ENTERTAIN	is	AUX	ROOT
would probably	ENTERTAIN	would	AUX	aux
not believe	DENY	believe	VERB	ccomp


The output should look like this (columns aligned for readability):

Span			Engagement	SpanHead	POS		Dependency
My guess is		ENTERTAIN	is			AUX		ROOT
would probably	ENTERTAIN	would		AUX		aux
not believe		DENY		believe		VERB	ccomp

## Using Text C from Chang & Schleppegrell (2011)

In [38]:
example2 = """As with the teaching of L2 writers and the teaching of digital writing separately, a cross-disciplinary research project would probably be considered the province of the specialists. In other words, scholars in digital writing may say this type of cross-disciplinary research is outside of their expertise, an argument echoed by L2 writing specialists. Though we do not advocate that researchers develop projects about issues in which they have little grounding , we do believe that researchers should view this disciplinary division as an opportunity rather than an obstacle. As we have already begun to see , the writing classroom of the new millennium is characterized by digitally mediated communication and is populated by students from around the world. Both writing instructors and writing researchers face situations that specializations have not prepared them for. As multimodalities and multiliteracies become the reality of the writing classroom, claims of disciplinary ignorance are becoming increasingly irresponsible. Yet, to engage in these types of inquiry (such as answering the questions above), teacher–scholars have little theoretical and methodological precedent for studying issues of these disciplines."""

In [39]:
doc2 = nlp(example2) # uses doc2 as variable name intentionally 

In [40]:
print("Engagement\tText")
for span in doc2.spans['sc']: #iterate span objects
	print(f"{span.label_}:\t{span.text}") 

Engagement	Text
ENTERTAIN:	would probably
ENTERTAIN:	may
ATTRIBUTION:	an argument
ATTRIBUTION:	an argument echoed
SOURCES:	L2 writing specialists
COUNTER:	Though we do not advocate that researchers develop projects about issues in which they have little grounding
DENY:	do not advocate
PROCLAIM:	do believe
ENTERTAIN:	should
COUNTER:	rather than an obstacle
PROCLAIM:	As we have already begun to see
MONOGLOSS:	is characterized
MONOGLOSS:	is populated
MONOGLOSS:	face
DENY:	have not prepared
MONOGLOSS:	are becoming
COUNTER:	Yet
ENDOPHORIC:	above
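
For a longer text, a frequency table of the labels is often easier to read than the raw list. The sketch below tallies labels with `collections.Counter`. The helper name `count_labels` is made up for this illustration, and the stand-in objects simply replay the labels printed above so that the example runs without the model. With the model loaded, you would call `count_labels(doc2.spans["sc"])`.

```python
from collections import Counter
from types import SimpleNamespace


def count_labels(spans):
    """Tally how often each label occurs in an iterable of span-like objects."""
    return Counter(span.label_ for span in spans)


# Labels copied, in order, from the output printed above.
printed_labels = [
    "ENTERTAIN", "ENTERTAIN", "ATTRIBUTION", "ATTRIBUTION", "SOURCES",
    "COUNTER", "DENY", "PROCLAIM", "ENTERTAIN", "COUNTER", "PROCLAIM",
    "MONOGLOSS", "MONOGLOSS", "MONOGLOSS", "DENY", "MONOGLOSS",
    "COUNTER", "ENDOPHORIC",
]

# Hypothetical stand-ins that expose only the `label_` attribute of a spaCy Span.
demo_spans = [SimpleNamespace(label_=label) for label in printed_labels]

label_counts = count_labels(demo_spans)
print("Engagement\tCount")
for label, count in label_counts.most_common():
    print(f"{label}\t{count}")
```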


# Visualizing the span annotation with displaCy



displaCy can render the predicted spans directly in a Jupyter notebook. See the documentation here: https://spacy.io/usage/visualizers#jupyter

In [42]:
#import displacy explicitly from spacy
from spacy import displacy 

displacy.render(doc, style="span", jupyter=True)

In [43]:
displacy.render(doc2, style="span", jupyter=True)
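
The `span` style also accepts an `options` dictionary, including `spans_key` (which span group to draw) and `colors` (a label-to-color mapping). The sketch below shows one way to set these for the engagement labels. The color values are arbitrary choices for illustration, and the wrapper name `render_engagement` is made up. spaCy is imported inside the function so that the options can be defined and inspected even where spaCy is not installed.

```python
# Arbitrary illustrative colors; labels not listed here fall back to displaCy's default color.
ENGAGEMENT_COLORS = {
    "ENTERTAIN": "#a6e22d",
    "DENY": "#ff8197",
    "COUNTER": "#ffeb80",
    "PROCLAIM": "#7aecec",
    "MONOGLOSS": "#c887fb",
}

SPAN_OPTIONS = {
    "spans_key": "sc",            # same key as doc.spans['sc'] above
    "colors": ENGAGEMENT_COLORS,  # label -> color mapping
}


def render_engagement(doc, jupyter=True):
    """Render the spans of a spaCy Doc with the custom colors above."""
    from spacy import displacy  # lazy import: only needed when rendering

    return displacy.render(doc, style="span", options=SPAN_OPTIONS, jupyter=jupyter)
```

In the notebook, `render_engagement(doc2)` would then replace the plain `displacy.render(...)` call.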