# Quality metrics for NLP

Chatbot quality can be measured at several levels. In this section we start with the most backend-oriented measures, those related strictly to the machine learning models. In the second section we show how to measure quality based on the chatbot's output by checking its grammar and spelling. The last part of this notebook is dedicated to sentiment analysis, which is crucial in many cases.

## Grammar and spelling

Several tools can check spelling and grammar. We don't want our chatbot to reply with bad grammar or spelling errors. In Python we can use SpellChecker to check spelling, pytypo to correct typos, and language-check to check the grammar of a given sentence. We should check grammar and spelling as often as possible.

### Spell checking

Spell checking is one of the basic tools for checking the output of our chatbot. It is not useful in every case: it mainly helps with generative chatbots, whose replies are produced by a model rather than written in advance.

In [1]:
from spellchecker import SpellChecker

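# Create a checker with the default English dictionary.
spell = SpellChecker()

words = ['sample', 'words', 'heri', 'here']

for word in words:
    # Most likely correction for the word.
    print(spell.correction(word))
    # All candidate corrections for the word.
    print(spell.candidates(word))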

sample
{'sample'}
words
{'words'}
heri
{'heri'}
here
{'here'}
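
To turn spell checking into a single number per reply, one option is the share of words the checker does not recognize. The sketch below is only an illustration: `misspelled_ratio` and the `KNOWN` word set are hypothetical names, and the stand-in lookup replaces the real dictionary so the example runs without pyspellchecker installed. With the library available, `SpellChecker().unknown` could be passed as `find_unknown` instead.

```python
import re


def misspelled_ratio(reply, find_unknown):
    """Return (ratio, unknown_words) for a chatbot reply.

    find_unknown is any callable that takes a list of words and returns
    the ones it does not recognize, e.g. SpellChecker().unknown.
    """
    # Keep only runs of letters and apostrophes; drop punctuation and digits.
    words = re.findall(r"[a-z']+", reply.lower())
    if not words:
        return 0.0, set()
    unknown = set(find_unknown(words))
    misspelled = [w for w in words if w in unknown]
    return len(misspelled) / len(words), unknown


# Hypothetical stand-in dictionary (assumption, not the real word list).
KNOWN = {"this", "training", "is", "great"}

ratio, unknown = misspelled_ratio(
    "This traiining is great!",
    lambda ws: {w for w in ws if w not in KNOWN},
)
print(ratio, unknown)
```

A ratio like this can be tracked over many replies, which is more informative than inspecting single words by hand.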


### Fixing typos

We can also easily fix some simple typos with pytypo.

In [2]:
import pytypo

pytypo.correct_sentence('this traiining is great!!!')

'this traiining is great!'

### Grammar check

A more complex tool for checking grammar is LanguageTool, which supports more than 25 languages. It is an application written in Java, but Python wrappers such as language_check are available.

In [3]:
import language_check

tool = language_check.LanguageTool('en-US')

tool.check("the are trainings")

[Match({'fromy': 0, 'fromx': 0, 'toy': 0, 'tox': 3, 'ruleId': 'UPPERCASE_SENTENCE_START', 'msg': 'This sentence does not start with an uppercase letter', 'replacements': ['The'], 'context': 'the are trainings', 'contextoffset': 0, 'offset': 0, 'errorlength': 3, 'category': 'Capitalization', 'locqualityissuetype': 'typographical'}),
 Match({'fromy': 0, 'fromx': 8, 'toy': 0, 'tox': 17, 'ruleId': 'MORFOLOGIK_RULE_EN_US', 'msg': 'Possible spelling mistake found', 'replacements': ['training', 'training s'], 'context': 'the are trainings', 'contextoffset': 8, 'offset': 8, 'errorlength': 9, 'category': 'Possible Typo', 'locqualityissuetype': 'misspelling'})]
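
The raw list of matches is hard to compare across replies, so it can help to count issues per category and to skip rules we do not care about. The following is a minimal sketch: `summarize_matches` is a hypothetical helper, and it assumes each match exposes `ruleId` and `category` attributes with the same names as the fields printed above. Stand-in objects are used here so the example runs without Java or language_check installed.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_matches(matches, ignored_rules=()):
    """Count grammar issues per category, skipping ignored rule IDs."""
    counts = Counter()
    for match in matches:
        if match.ruleId in ignored_rules:
            continue
        counts[match.category] += 1
    return counts


# Stand-ins mirroring the two matches printed above (assumed attribute names).
matches = [
    SimpleNamespace(ruleId="UPPERCASE_SENTENCE_START", category="Capitalization"),
    SimpleNamespace(ruleId="MORFOLOGIK_RULE_EN_US", category="Possible Typo"),
]

# Ignore capitalization, e.g. for chatbots that reply in lowercase on purpose.
summary = summarize_matches(matches, ignored_rules={"UPPERCASE_SENTENCE_START"})
print(dict(summary))
```

In a notebook, the result of `tool.check(...)` could be passed in place of the stand-in list.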

## Sentiment analysis

If we want to deploy our chatbot to production, it is very important to measure the sentiment of both the customers' messages and the chatbot's replies. We don't want to send our customers a message with a negative sentiment. Two of the most popular libraries for sentiment analysis are CoreNLP and TextBlob. Their default models are built on general-purpose data, so they often do not give the expected results for our use case. This is why we frequently need to build our own model. Before we build a new one, we try TextBlob to get the main idea of sentiment analysis.

In [2]:
example = "The weather is good outside."

We just get the sentiment for the example text:

In [3]:
from textblob import TextBlob

text = TextBlob(example)
text.sentiment

Sentiment(polarity=0.35, subjectivity=0.32500000000000007)

A negative polarity means a negative sentiment, and a positive polarity means a positive sentiment; the polarity lies between -1 and 1. The subjectivity tells us whether the sentence is objective or subjective; it lies between 0 (objective) and 1 (subjective).
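
A polarity score can be mapped to a label and used as a simple gate before a reply is sent. This is only a sketch: `sentiment_label` and `is_safe_to_send` are hypothetical helpers, and the threshold of 0.1 is an arbitrary assumption that would need tuning on real conversations.

```python
def sentiment_label(polarity, threshold=0.1):
    """Map a polarity in [-1, 1] to 'negative', 'neutral' or 'positive'."""
    if polarity <= -threshold:
        return "negative"
    if polarity >= threshold:
        return "positive"
    return "neutral"


def is_safe_to_send(polarity, threshold=0.1):
    """Block only replies whose polarity is clearly negative."""
    return sentiment_label(polarity, threshold) != "negative"


# Polarity taken from the TextBlob output shown above.
label = sentiment_label(0.35)
print(label, is_safe_to_send(0.35))
```

With TextBlob, the input would be `TextBlob(reply).sentiment.polarity` for each candidate reply.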