# Counting Nouns - Plural and Singular Nouns

In this recipe, we will do two things:

• Determine whether a noun is plural or singular

• Turn plural nouns into singular nouns and vice versa

You might need these two things in a variety of tasks:

a. in making your chatbot speak in grammatically correct sentences,

b. in coming up with text classification features, and so on.

In [13]:
# !pip install inflect

In [14]:
# !python -m spacy download en_core_web_md

In [15]:
# !pip install textacy

In [16]:
# !pip install spacy-experimental # modern alternative to neuralcoref

IMPORT LIBRARIES

In [17]:
import nltk
from nltk.stem import WordNetLemmatizer
import inflect

READ IN THE TEXT FILE

In [18]:
# Use a context manager so the file is closed after reading
with open(r"/content/001_Study_in_Scarlet.txt", "r", encoding="utf-8") as file:
  Study_in_Scarlet = file.read()

REMOVE NEW LINES FOR BETTER READABILITY

In [19]:
Study_in_Scarlet = Study_in_Scarlet.replace("\n", " ")

DO PART OF SPEECH TAGGING

In [36]:
import spacy
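# Load the model once. en_core_web_sm must be installed; the en_core_web_md
# model downloaded above can be loaded here instead.
nlp = spacy.load("en_core_web_sm")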


In [39]:
doc = nlp(Study_in_Scarlet)
words_with_pos = [(token.text, token.tag_) for token in doc]

DEFINE THE GET_NOUNS FUNCTION, WHICH WILL FILTER OUT THE NOUNS FROM THE WORDS

In [40]:
def get_nouns(words_with_pos):
  # Keep only (word, tag) pairs tagged as singular (NN) or plural (NNS) nouns
  noun_set = {"NN", "NNS"}
  nouns = [word for word in words_with_pos if word[1] in noun_set]
  return nouns
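
As a quick check of what this filter keeps, the sketch below applies the same tag test to a small hand-written list of (word, tag) pairs. The sample pairs and the name `filter_common_nouns` are made up for illustration. Note that the proper-noun tags (NNP, NNPS) are not in the set, so proper nouns are dropped.

```python
# Illustrative sketch: the sample data and the helper name are assumptions,
# not part of the recipe's notebook.
NOUN_TAGS = {"NN", "NNS"}  # Penn Treebank tags for common nouns

def filter_common_nouns(tagged_words):
  """Keep only (word, tag) pairs whose tag marks a common noun."""
  return [pair for pair in tagged_words if pair[1] in NOUN_TAGS]

sample = [
  ("Holmes", "NNP"),    # proper noun: filtered out
  ("examined", "VBD"),
  ("the", "DT"),
  ("rooms", "NNS"),
  ("and", "CC"),
  ("the", "DT"),
  ("carpet", "NN"),
]

print(filter_common_nouns(sample))
# [('rooms', 'NNS'), ('carpet', 'NN')]
```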

RUN THE PRECEDING FUNCTION ON THE LIST OF POS-TAGGED WORDS AND PRINT IT:

In [41]:
nouns = get_nouns(words_with_pos)
print(nouns)



To determine whether a noun is singular or plural, we have two options. The
first option is to use the part-of-speech tags assigned in the tagging step,
where NN indicates a singular noun and NNS indicates a plural noun. The
following function takes a (word, tag) pair and returns True if the noun is
plural:

In [42]:
def is_plural_nltk(noun_info):
  # noun_info is a (word, tag) pair; NNS is the plural noun tag
  pos = noun_info[1]
  return pos == "NNS"
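
Because the tag already encodes number, counting singular and plural nouns takes only a few lines. The following is a minimal sketch on hand-written (word, tag) pairs; the sample data and the `count_by_number` name are assumptions for illustration.

```python
from collections import Counter

def count_by_number(tagged_nouns):
  """Count singular (NN) and plural (NNS) nouns in (word, tag) pairs."""
  counts = Counter()
  for _, tag in tagged_nouns:
    if tag == "NNS":
      counts["plural"] += 1
    elif tag == "NN":
      counts["singular"] += 1
  return counts

# Made-up sample data for illustration
sample_nouns = [("room", "NN"), ("rooms", "NNS"), ("carpet", "NN")]
counts = count_by_number(sample_nouns)
print(counts["singular"], counts["plural"])
# 2 1
```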

The other option is to use the WordNetLemmatizer class in the nltk.stem
package. The following function returns True if the noun is plural:

In [44]:
def is_plural_wn(noun):
  wnl = WordNetLemmatizer()
  lemma = wnl.lemmatize(noun, "n")
  # Compare by value (!=); "is not" tests object identity, which is unreliable for strings
  return noun != lemma
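
The comparison logic can be tried without downloading WordNet by passing in any lemmatizing function. In the sketch below, `toy_lemmatize` is a hypothetical stand-in backed by a three-entry dictionary, not the WordNet lemmatizer, and `is_plural_by_lemma` is likewise an illustrative name. It only shows that the verdict depends entirely on the lemmatizer knowing the word.

```python
# Hypothetical stand-in for a real lemmatizer: a tiny lookup table.
TOY_LEMMAS = {"rooms": "room", "women": "woman", "books": "book"}

def toy_lemmatize(noun):
  """Return the lemma if the toy table knows the word, else the word itself."""
  return TOY_LEMMAS.get(noun, noun)

def is_plural_by_lemma(noun, lemmatize):
  """Treat a noun as plural when its lemma differs from it (compared by value)."""
  return noun != lemmatize(noun)

for word in ["rooms", "room", "women", "geese"]:
  print(word, is_plural_by_lemma(word, toy_lemmatize))
# rooms True
# room False
# women True
# geese False
```

The last line is a miss: "geese" is unknown to the toy table, so it is returned unchanged and wrongly treated as singular.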

The following function will change a singular noun into plural:

In [45]:
def get_plural(singular_noun):
  p = inflect.engine()
  return p.plural(singular_noun)

The following function will change a plural noun into singular:

In [46]:
def get_singular(plural_noun):
  p = inflect.engine()
  # singular_noun returns False when it finds no singular form
  singular = p.singular_noun(plural_noun)
  if (singular):
    return singular
  else:
    return plural_noun

We can now use the two preceding functions to return a list of nouns changed into
plural or singular, depending on the original noun. The following code uses the
is_plural_wn function to determine if the noun is plural. You can also use the
is_plural_nltk function:

In [47]:
def plurals_wn(words_with_pos):
  other_nouns = []
  for noun_info in words_with_pos:
    word = noun_info[0]
    plural = is_plural_wn(word)
    if (plural):
      # Plural noun: convert to singular
      singular = get_singular(word)
      other_nouns.append(singular)
    else:
      # Singular noun: convert to plural
      plural_form = get_plural(word)
      other_nouns.append(plural_form)
  return other_nouns

Use the preceding function to return a list of changed nouns:

In [49]:
import nltk
nltk.download("wordnet")
other_nouns_wn = plurals_wn(nouns)

[nltk_data] Downloading package wordnet to /root/nltk_data...


In [50]:
print(other_nouns_wn)



How it works…
Number detection works in one of two ways. One is by reading the part-of-speech tag
assigned by the tagger. If the tag is NN, then the noun is singular, and if it is NNS, then it's
plural. The other way is to use the WordNet lemmatizer and to compare the lemma and
the original word. The noun is singular if the lemma and the original input noun are the
same, and plural otherwise.

To find the singular form of a plural noun and the plural form of a singular noun, we
can use the inflect package. Its plural and singular_noun methods return the
corresponding forms.

In step 1, we import the necessary modules and functions. You can find the pos_tag_nltk
function in this book's GitHub repository, in the Chapter01 module, in the
pos_tagging.py file. It uses the code we wrote for Chapter 1, Learning NLP Basics. In step
2, we read the file's contents into a string. In step 3, we remove newlines from the text;
this is an optional step. In step 4, we use the pos_tag_nltk function defined in the code
from the previous chapter to tag parts of speech for the words.

In step 5, we create the get_nouns function, which keeps only the words that are singular
or plural nouns. In this function, we use a list comprehension and keep only words that
have the NN or NNS tags.

In step 6, we run the preceding function on the word list and print the result. As you will
notice, the tagger labels several words incorrectly as nouns, such as cold and precise. These
errors will propagate into the next steps, and it is something to keep in mind when
working with NLP tasks.

In steps 7 and 8, we define two functions to determine whether a noun is singular or
plural. In step 7, we define the is_plural_nltk function, which uses the POS tag
to determine if the noun is plural. In step 8, we define the is_plural_wn function,
which compares the noun with its lemma, as determined by the
NLTK lemmatizer. If those two forms are the same, the noun is singular, and if they
are different, the noun is plural. Both functions can return incorrect results that will
propagate downstream.

In step 9, we define the get_plural function, which will return the plural form of the
noun by using the inflect package. In step 10, we define the get_singular function,
which uses the same package to get the singular form of the noun. If there is no output
from inflect, the function returns the input.

In step 11, we define the plurals_wn function, which takes in a list of words with the
parts of speech that we got in step 6 and changes plural nouns into singular and singular
nouns into plural.

In step 12, we run the plurals_wn function on the nouns list. Most of the words are
changed correctly; for example, women and emotion. We also see two kinds of error
propagation, where either the part of speech or the number of the noun was determined
incorrectly. For example, the word akins appears here because akin was incorrectly labeled
as a noun. On the other hand, the word men was incorrectly determined to be singular
and resulted in the wrong output; that is, mens.
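
One inexpensive way to surface such errors is to compare the two detectors and review the nouns on which they disagree. The sketch below is an assumption rather than part of the recipe: `find_disagreements` is a hypothetical helper, and `stub_lemma_check` is a hand-written stand-in that mimics the behavior described above, where men is treated as singular.

```python
def find_disagreements(tagged_nouns, is_plural_by_lemma):
  """Return the words whose tag-based and lemma-based verdicts differ."""
  flagged = []
  for word, tag in tagged_nouns:
    by_tag = (tag == "NNS")
    by_lemma = is_plural_by_lemma(word)
    if by_tag != by_lemma:
      flagged.append(word)
  return flagged

# Stand-in for a lemma-based detector; it knows only one plural,
# so "men" is (wrongly) reported as singular.
def stub_lemma_check(word):
  return word in {"women"}

sample = [("women", "NNS"), ("men", "NNS"), ("emotion", "NN")]
print(find_disagreements(sample, stub_lemma_check))
# ['men']
```

A disagreement does not say which detector is wrong, only that the word deserves a second look before its converted form is used.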