## Part 4. Adding nouns and adjectives to the lexicon

Below you can see a more extensive English lexicon that contains both verbs and nouns.

Notice the new sublexicon "root", which is where every word starts. Also note that the part-of-speech tags (+V, +N) have been separated out into their own sublexicons. In general, words are now built from longer chains (or layers) of sublexicons. In particular, nouns pass through separate sublexicons for number and case. Make sure that you understand how all this works:

In [None]:
# Configure for morphology
import sys
sys.path.append("../../../morf-synt-2025/src")
from morpholexicon import *

# Actual morphology starts here

def root(state):
    """ Root lexicon: this is where all words start """
    entry_a("", verbstems, state) # continue to verbs
    entry_a("", nounstems, state) # continue to nouns

# VERBS
    
def verbstems(state):
    """ Sublexicon for English verb stems """
    entry_a("climb", verblabel, state)
    entry_a("lift", verblabel, state)
    entry_a("open", verblabel, state)
    entry_a("talk", verblabel, state)
    entry_a("walk", verblabel, state)

    entry_t("bake", "bak", verblabel_e, state)
    entry_t("invite", "invit", verblabel_e, state)
    entry_t("like", "lik", verblabel_e, state)
    entry_t("mute", "mut", verblabel_e, state)
    entry_t("suppose", "suppos", verblabel_e, state)
        
def verblabel(state):
    """ Add +V to all verbs in their lexical form;
        this is for stems without any alternation
    """
    entry_t("+V", "", verbendings, state) # continue to endings
    
def verbendings(state):
    """ Verb endings; this is for stems without any alternation """
    entry_t("+Inf", "", None, state)      # +V is no longer part of these labels
    entry_t("+Pres3Sg", "s", None, state)
    entry_t("+Prog", "ing", None, state)
    entry_t("+Past", "ed", None, state)

def verblabel_e(state):
    """ Add +V to all verbs in their lexical form;
        this is for stems that end in -e
    """
    entry_t("+V", "", verbendings_e, state)

def verbendings_e(state):
    """ Verb endings; this is for stems that end in -e """
    entry_t("+Inf", "e", None, state)
    entry_t("+Pres3Sg", "es", None, state)
    entry_t("+Prog", "ing", None, state)
    entry_t("+Past", "ed", None, state)

# NOUNS

def nounstems(state):
    """ Sublexicon for English noun stems """
    entry_a("cat", nounlabel, state)
    entry_a("cow", nounlabel, state)
    entry_a("dog", nounlabel, state)
    entry_a("horse", nounlabel, state)
    entry_a("snake", nounlabel, state)

def nounlabel(state):
    """ Add +N to all nouns in their lexical form """
    entry_t("+N", "", nounnumber, state) # continue to number: Sg or Pl
    
def nounnumber(state):
    """ Number for nouns """
    entry_t("+Sg", "", nounposs_with_s, state)     # continue to possible 's 
    entry_t("+Pl", "s", nounposs_without_s, state) # continue to possible '

def nounposs_with_s(state):
    """ Case for nouns in singular """
    entry_t("+Nom", "", None, state)   # "nominative" form without possessive ending
    entry_t("+Gen", "'s", None, state) # "genitive" form with 's
    
def nounposs_without_s(state):
    """ Case for nouns in plural """
    entry_t("+Nom", "", None, state)   # "nominative" form without possessive ending
    entry_t("+Gen", "'", None, state)  # "genitive" form with only ' (because the -s is already there)
    
# The main program starts here

# First load the lexicon and tell Python that "root" is the starting point
load_lexicon(root, None)

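# Generate all possible word forms of the noun "dog"
generate("dog+N+Sg+Nom")  # expected surface form: dog
generate("dog+N+Sg+Gen")  # expected surface form: dog's
generate("dog+N+Pl+Nom")  # expected surface form: dogs
generate("dog+N+Pl+Gen")  # expected surface form: dogs'

# Analyze a couple of surface forms
analyze("snakes")  # expected analysis: snake+N+Pl+Nom
analyze("baked")   # expected analysis: bake+V+Past
analyze("cow's")   # expected analysis: cow+N+Sg+Gen
analyze("cats'")   # expected analysis: cat+N+Pl+Gen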

# Generate all surface forms recognized by this lexicon
generate_all()

# End of program
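
If the chain of sublexicons is still unclear, the sketch below may help. It rebuilds a small part of the noun lexicon as a plain Python dictionary and walks it recursively, collecting one lexical/surface pair per complete path. It deliberately does not use `morpholexicon`: the names `LEXICON`, `walk`, `toy_generate`, and `toy_analyze` are invented for this illustration, and the course library may well work differently internally.

```python
# Toy stand-in for the chained sublexicons (NOT the morpholexicon API).
# Each sublexicon maps to a list of (lexical, surface, continuation) entries;
# a continuation of None marks the end of a word.
LEXICON = {
    "root": [("", "", "nounstems")],
    "nounstems": [("cat", "cat", "nounlabel"), ("dog", "dog", "nounlabel")],
    "nounlabel": [("+N", "", "nounnumber")],
    "nounnumber": [("+Sg", "", "nounposs_with_s"),
                   ("+Pl", "s", "nounposs_without_s")],
    "nounposs_with_s": [("+Nom", "", None), ("+Gen", "'s", None)],
    "nounposs_without_s": [("+Nom", "", None), ("+Gen", "'", None)],
}


def walk(sublexicon="root", lexical="", surface=""):
    """Yield every (lexical form, surface form) pair reachable from a sublexicon."""
    for lex, surf, continuation in LEXICON[sublexicon]:
        if continuation is None:
            # End of the chain: one complete word form
            yield lexical + lex, surface + surf
        else:
            # Keep building the word in the next sublexicon
            yield from walk(continuation, lexical + lex, surface + surf)


PAIRS = list(walk())


def toy_generate(lexical_form):
    """Return all surface forms for a lexical form."""
    return [surf for lex, surf in PAIRS if lex == lexical_form]


def toy_analyze(surface_form):
    """Return all lexical forms (analyses) for a surface form."""
    return [lex for lex, surf in PAIRS if surf == surface_form]


print(toy_generate("dog+N+Pl+Gen"))  # ["dogs'"]
print(toy_analyze("cat's"))          # ['cat+N+Sg+Gen']
```

Each path from "root" to an entry with no continuation contributes exactly one word form, so two stems, two numbers, and two cases give eight pairs in this toy lexicon.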

You now need to develop this lexicon further, as follows:
* Incorporate some of the verb paradigms you added in Part 3.
* Add some new noun paradigms, such as nouns ending in -y (beauty, doggy, lobby) or nouns ending in a sibilant (crush, kiss, match, miss, fix).
* As you may notice, many of these nouns are also verbs (crush, kiss, match, miss, fix). Add them to the lexicon as verbs as well. Then analyze some of the word forms (such as kiss, kisses, kissed). Do you get more than one analysis for some of the surface forms? You should.
* Add a new part of speech: adjectives. English adjectives have inflection for *comparison*: positive, comparative, and superlative (such as small, smaller, smallest). Can you create appropriate lexicons for these endings?
* You can also create _compound words_. This concerns the noun stems only. You would then get word forms like dogcat, snakehouse, or cowhorsedogs. Notice that you should not spell out the compound words explicitly, but let the lexicon produce all possible combinations of stems for you. (Compound words like these are not that typical for English, but if you were to work on Finnish or some other languages, this would be very useful.)
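
To illustrate why a stem listed as both a noun and a verb should give several analyses, here is a small self-contained sketch. It models the lexicon as a plain dictionary rather than with `morpholexicon`, and the sublexicon names (`verblabel_sib`, `nounnumber_sib`, and so on) as well as the helper `toy_analyze` are hypothetical choices made only for this example. Treat it as one possible way to think about the sibilant paradigm, not as the expected solution.

```python
# Toy stand-in (NOT the morpholexicon API): sublexicon name ->
# list of (lexical, surface, continuation); None ends the word.
LEXICON = {
    "root": [("", "", "verbstems"), ("", "", "nounstems")],
    # "kiss" is listed twice: once as a verb stem, once as a noun stem
    "verbstems": [("kiss", "kiss", "verblabel_sib")],
    "verblabel_sib": [("+V", "", "verbendings_sib")],
    # Stems ending in a sibilant take -es instead of -s
    "verbendings_sib": [("+Inf", "", None), ("+Pres3Sg", "es", None),
                        ("+Prog", "ing", None), ("+Past", "ed", None)],
    "nounstems": [("kiss", "kiss", "nounlabel_sib")],
    "nounlabel_sib": [("+N", "", "nounnumber_sib")],
    "nounnumber_sib": [("+Sg", "", "nounposs_with_s"),
                       ("+Pl", "es", "nounposs_without_s")],
    "nounposs_with_s": [("+Nom", "", None), ("+Gen", "'s", None)],
    "nounposs_without_s": [("+Nom", "", None), ("+Gen", "'", None)],
}


def walk(sublexicon="root", lexical="", surface=""):
    """Yield every (lexical form, surface form) pair reachable from a sublexicon."""
    for lex, surf, continuation in LEXICON[sublexicon]:
        if continuation is None:
            yield lexical + lex, surface + surf
        else:
            yield from walk(continuation, lexical + lex, surface + surf)


def toy_analyze(surface_form):
    """Return every lexical form whose surface form matches the input."""
    return [lex for lex, surf in walk() if surf == surface_form]


print(toy_analyze("kisses"))  # ['kiss+V+Pres3Sg', 'kiss+N+Pl+Nom']
print(toy_analyze("kiss"))    # ['kiss+V+Inf', 'kiss+N+Sg+Nom']
print(toy_analyze("kissed"))  # ['kiss+V+Past']
```

The surface forms kiss and kisses are reachable along both the verb chain and the noun chain, so each gets two analyses, while kissed is reachable only through the verb endings.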

When you are done here, you can move on to working on your home assignment. Good luck!