# Grammatical Features
 In this chapter, we will investigate the role of features in building rule-based grammars. In contrast to feature extractors, which record features that have been automatically detected, we are now going to declare the features of words and phrases. We start off with a very simple example, using dictionaries to store features and their values.

In [21]:
kim = {'CAT': 'NP','ORTH':'Kim','REF':'k'}
chase = {'CAT': 'V','ORTH':'chased','REL':'chase'}

The objects kim and chase share two features, CAT (grammatical category) and ORTH (orthography, i.e., spelling). In addition, each has a more semantically oriented feature: kim['REF'] is intended to give the referent of kim, while chase['REL'] gives the relation expressed by chase. In the context of rule-based grammars, such pairings of features and values are known as **feature structures**.

Feature structures contain various kinds of information about grammatical entities. The information need not be exhaustive, and we might want to add further properties. For example, in the case of a verb, it is often useful to know what "semantic role" is played by the arguments of the verb. In the case of chase, the subject plays the role of "agent", while the object has the role of "patient".

In [5]:
chase['AGT'] = 'sbj'
chase['PAT'] = 'obj'

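To process the sentence Kim chased Lee, we bind the verb's agent role to the subject and its patient role to the object by linking to the REF feature of the relevant NP. The example makes the simple-minded assumption that the NPs immediately to the left and right of the verb are the subject and object respectively.

In [22]:
sent = 'Kim chased Lee'
tokens = sent.split()
lee = {'CAT': 'NP', 'ORTH': 'Lee', 'REF': 'l'}

def lex2fs(word):
    # Return the feature structure whose ORTH value matches the word.
    for fs in [kim, lee, chase]:
        if fs['ORTH'] == word:
            return fs

subj, verb, obj = lex2fs(tokens[0]), lex2fs(tokens[1]), lex2fs(tokens[2])
verb['AGT'] = subj['REF']  # agent: referent of the subject
verb['PAT'] = obj['REF']   # patient: referent of the object
for k in ['ORTH', 'REL', 'AGT', 'PAT']:
    print('%-5s => %s' % (k, verb[k]))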

ORTH  => chased
REL   => chase
AGT   => k
PAT   => l


The same approach could be adopted for a different verb, say surprise, though in this case the subject would play the role of "source" (SRC) and the object the role of "experiencer" (EXP).

In [23]:
surprise = {'CAT':'V','ORTH':'surprised','REL':'surprise',
           'SRC':'sbj','EXP':'obj'}
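
The binding step used for chase can be written once and reused for surprise. The following is a minimal sketch, not part of the original example: the helper name bind_roles is hypothetical, and it assumes that role features hold the placeholder strings 'sbj' and 'obj' as in the entries above.

```python
# Hypothetical helper (not from the original text): replace the 'sbj' and
# 'obj' placeholders in a verb's feature structure with the REF values
# of its subject and object.
def bind_roles(verb, subj, obj):
    referents = {'sbj': subj['REF'], 'obj': obj['REF']}
    bound = dict(verb)  # copy, so the lexical entry itself stays unchanged
    for feature, value in verb.items():
        if value in referents:
            bound[feature] = referents[value]
    return bound

kim = {'CAT': 'NP', 'ORTH': 'Kim', 'REF': 'k'}
lee = {'CAT': 'NP', 'ORTH': 'Lee', 'REF': 'l'}
surprise = {'CAT': 'V', 'ORTH': 'surprised', 'REL': 'surprise',
            'SRC': 'sbj', 'EXP': 'obj'}

bound = bind_roles(surprise, kim, lee)
for k in ['ORTH', 'REL', 'SRC', 'EXP']:
    print('%-5s => %s' % (k, bound[k]))
```

Returning a copy keeps the lexical entry reusable for the next sentence, whereas the cell above overwrites the roles of chase in place.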

## Syntactic Agreement

In English, morphological properties of the verb co-vary with syntactic properties of the subject noun phrase. This co-variance is called **agreement**. Present tense verbs in English typically have two inflected forms: one for third person singular, and another for every other combination of person and number, as shown in the table below.

*Agreement Paradigm for English Regular Verbs*

| |singular|plural|
|-|------|------|
|1st person|I run|we run|
|2nd person|you run|you run|
|3rd person|he/she/it runs|they run|
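
The paradigm reduces to a single test on person and number. A minimal sketch, assuming a regular verb whose third person singular form simply adds -s; the function name present_form is ours and not part of NLTK.

```python
def present_form(stem, person, number):
    """Return the present tense form of a regular verb such as 'run'."""
    # Only third person singular is marked; every other cell of the
    # paradigm shares the bare stem.
    if person == 3 and number == 'sg':
        return stem + 's'
    return stem

for number in ('sg', 'pl'):
    print(number, [present_form('run', person, number) for person in (1, 2, 3)])
```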

Consider a simple context-free grammar that covers only singular forms:

```
S   ->   NP VP
NP  ->   Det N
VP  ->   V

Det  ->  'this'
N    ->  'dog'
V    ->  'runs'
```

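To cover plural forms as well, while still enforcing agreement, every production has to be duplicated: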
```
S -> NP_SG VP_SG
S -> NP_PL VP_PL
NP_SG -> Det_SG N_SG
NP_PL -> Det_PL N_PL
VP_SG -> V_SG
VP_PL -> V_PL

Det_SG -> 'this'
Det_PL -> 'these'
N_SG -> 'dog'
N_PL -> 'dogs'
V_SG -> 'runs'
V_PL -> 'run'
```
As a result, the grammar generates only sentences in which a singular subject NP combines with a singular VP, or a plural subject NP with a plural VP. The price is that every production now appears twice.
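
The duplication can be made explicit with a few lines of Python. This is a minimal sketch and not how NLTK handles agreement; the function name expand and the tuple encoding of productions are assumptions made for illustration.

```python
def expand(productions, values, start='S'):
    """Copy each production once per feature value, suffixing category names."""
    expanded = []
    for value in values:
        for lhs, rhs in productions:
            # The start symbol keeps its bare name, as in the grammar above.
            new_lhs = lhs if lhs == start else f'{lhs}_{value}'
            expanded.append((new_lhs, [f'{sym}_{value}' for sym in rhs]))
    return expanded

# The three syntactic productions of the singular-only grammar.
base = [('S', ['NP', 'VP']), ('NP', ['Det', 'N']), ('VP', ['V'])]
doubled = expand(base, ['SG', 'PL'])
for lhs, rhs in doubled:
    print(lhs, '->', ' '.join(rhs))
print(len(base), 'productions become', len(doubled))
```

Each additional distinction multiplies the grammar again: passing six person and number combinations instead of two values would turn the same three productions into eighteen.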

## Using Attributes and Constraints
```
S -> NP[NUM=?n] VP[NUM=?n]
NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
VP[NUM=?n] -> V[NUM=?n]
```

We have introduced some new notation which says that the category N has a (grammatical) **feature** called NUM (short for 'number'). We are using ?n as a variable over values of NUM; it can be instantiated either to sg or pl, within a given production.
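
The effect of a shared variable such as ?n can be sketched in plain Python. This is an illustration of the idea only, not NLTK's implementation; the function names bind and match_rhs and the dictionary encoding of categories are assumptions.

```python
def bind(pattern, features, bindings):
    """Match one annotated category against concrete features.

    Returns the updated variable bindings, or None on a clash.
    """
    updated = dict(bindings)
    for name, value in pattern.items():
        if name not in features:
            continue  # feature left unspecified: nothing to check
        actual = features[name]
        if value.startswith('?'):
            # setdefault binds the variable on first use and otherwise
            # returns the value it was bound to earlier.
            if updated.setdefault(value, actual) != actual:
                return None
        elif value != actual:
            return None
    return updated

def match_rhs(patterns, children):
    """Check a production's right-hand side against its children, left to right."""
    bindings = {}
    for pattern, features in zip(patterns, children):
        bindings = bind(pattern, features, bindings)
        if bindings is None:
            return None
    return bindings

rhs = [{'NUM': '?n'}, {'NUM': '?n'}]  # NP[NUM=?n] VP[NUM=?n]
print(match_rhs(rhs, [{'NUM': 'sg'}, {'NUM': 'sg'}]))  # {'?n': 'sg'}
print(match_rhs(rhs, [{'NUM': 'sg'}, {'NUM': 'pl'}]))  # None
```

Because the same variable appears on both categories, the first child fixes its value and the second child must match it.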

In [27]:
import nltk
nltk.data.show_cfg('grammars/book_grammars/feat0.fcfg')

% start S
# ###################
# Grammar Productions
# ###################
# S expansion productions
S -> NP[NUM=?n] VP[NUM=?n]
# NP expansion productions
NP[NUM=?n] -> N[NUM=?n] 
NP[NUM=?n] -> PropN[NUM=?n] 
NP[NUM=?n] -> Det[NUM=?n] N[NUM=?n]
NP[NUM=pl] -> N[NUM=pl] 
# VP expansion productions
VP[TENSE=?t, NUM=?n] -> IV[TENSE=?t, NUM=?n]
VP[TENSE=?t, NUM=?n] -> TV[TENSE=?t, NUM=?n] NP
# ###################
# Lexical Productions
# ###################
Det[NUM=sg] -> 'this' | 'every'
Det[NUM=pl] -> 'these' | 'all'
Det -> 'the' | 'some' | 'several'
PropN[NUM=sg]-> 'Kim' | 'Jody'
N[NUM=sg] -> 'dog' | 'girl' | 'car' | 'child'
N[NUM=pl] -> 'dogs' | 'girls' | 'cars' | 'children' 
IV[TENSE=pres,  NUM=sg] -> 'disappears' | 'walks'
TV[TENSE=pres, NUM=sg] -> 'sees' | 'likes'
IV[TENSE=pres,  NUM=pl] -> 'disappear' | 'walk'
TV[TENSE=pres, NUM=pl] -> 'see' | 'like'
IV[TENSE=past] -> 'disappeared' | 'walked'
TV[TENSE=past] -> 'saw' | 'liked'


Notice that a syntactic category can have more than one feature; for example, IV[TENSE=pres, NUM=pl]. In general, we can add as many features as we like.

In [28]:
from nltk import load_parser
tokens = 'Kim likes children'.split()
cp = load_parser('grammars/book_grammars/feat0.fcfg', trace=2)
for tree in cp.parse(tokens):
    print(tree)

|.Kim .like.chil.|
Leaf Init Rule:
|[----]    .    .| [0:1] 'Kim'
|.    [----]    .| [1:2] 'likes'
|.    .    [----]| [2:3] 'children'
Feature Bottom Up Predict Combine Rule:
|[----]    .    .| [0:1] PropN[NUM='sg'] -> 'Kim' *
Feature Bottom Up Predict Combine Rule:
|[----]    .    .| [0:1] NP[NUM='sg'] -> PropN[NUM='sg'] *
Feature Bottom Up Predict Combine Rule:
|[---->    .    .| [0:1] S[] -> NP[NUM=?n] * VP[NUM=?n] {?n: 'sg'}
Feature Bottom Up Predict Combine Rule:
|.    [----]    .| [1:2] TV[NUM='sg', TENSE='pres'] -> 'likes' *
Feature Bottom Up Predict Combine Rule:
|.    [---->    .| [1:2] VP[NUM=?n, TENSE=?t] -> TV[NUM=?n, TENSE=?t] * NP[] {?n: 'sg', ?t: 'pres'}
Feature Bottom Up Predict Combine Rule:
|.    .    [----]| [2:3] N[NUM='pl'] -> 'children' *
Feature Bottom Up Predict Combine Rule:
|.    .    [----]| [2:3] NP[NUM='pl'] -> N[NUM='pl'] *
Feature Bottom Up Predict Combine Rule:
|.    .    [---->| [2:3] S[] -> NP[NUM=?n] * VP[NUM=?n] {?n: 'pl'}
Feature Single Edge Fundame

## Terminology
So far, we have only seen feature values like sg and pl. These simple values are usually called **atomic**: they can't be decomposed into subparts. A special case of atomic values is **boolean** values, which just specify whether a property is true or false. For example, we might want to distinguish **auxiliary** verbs such as can, may, will and do with the boolean feature AUX. The production V[TENSE=pres, AUX=+] -> 'can' then means that can receives the value pres for TENSE and + (true) for AUX. A widely adopted convention abbreviates boolean features: instead of AUX=+ or AUX=-, we write +AUX and -AUX respectively. These are just abbreviations, however, and the parser interprets them as though + and - are like any other atomic value.
```
V[TENSE=pres, +AUX] -> 'can'
V[TENSE=pres, +AUX] -> 'may'

V[TENSE=pres, -AUX] -> 'walks'
V[TENSE=pres, -AUX] -> 'likes'
```
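
The abbreviation convention amounts to a small rewriting step. The sketch below uses hypothetical helper names and stores + and - as ordinary strings, which illustrates the convention but is not a description of NLTK's internal representation.

```python
def expand_feature(spec):
    """Turn one feature specification into a (name, value) pair."""
    spec = spec.strip()
    if spec[0] in '+-':
        # Abbreviated boolean feature: +AUX stands for AUX=+.
        return spec[1:], spec[0]
    name, value = spec.split('=', 1)
    return name.strip(), value.strip()

def parse_annotation(text):
    """Parse the text between square brackets into a dictionary."""
    return dict(expand_feature(part) for part in text.split(','))

print(parse_annotation('TENSE=pres, +AUX'))
print(parse_annotation('TENSE=pres, AUX=+'))
```

Both spellings yield the same dictionary, so the abbreviated and the full form describe the same category.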

We have spoken of attaching "feature annotations" to syntactic categories. A more radical approach represents the whole category — that is, the non-terminal symbol plus the annotation — as a bundle of features. For example, N[NUM=sg] contains part of speech information which can be represented as POS=N. An alternative notation for this category therefore is [POS=N, NUM=sg].

In addition to atomic-valued features, features may take values that are themselves feature structures. For example, we can group together agreement features (e.g., person, number and gender) as a distinguished part of a category, serving as the value of AGR. In this case, we say that AGR has a **complex** value. The structure is depicted below in a format known as an **attribute value matrix** (AVM).
```
[POS = N           ]
[                  ]
[AGR = [PER = 3   ]]
[      [NUM = pl  ]]
[      [GND = fem ]]
```
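
In the spirit of the dictionaries used at the start of the chapter, a complex value can be modelled as a dictionary nested inside another. This is a minimal sketch with plain Python dictionaries standing in for feature structures; the helper get_path is hypothetical.

```python
# The AVM above, with AGR holding a feature structure of its own.
noun = {'POS': 'N', 'AGR': {'PER': 3, 'NUM': 'pl', 'GND': 'fem'}}

def get_path(fs, path):
    """Follow a sequence of feature names down through nested structures."""
    value = fs
    for name in path:
        value = value[name]
    return value

print(get_path(noun, ['POS']))         # atomic value
print(get_path(noun, ['AGR']))         # complex value
print(get_path(noun, ['AGR', 'NUM']))  # atomic value inside AGR
```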

Once we have the possibility of using features like AGR, we can refactor a grammar like the one above so that agreement features are bundled together. A tiny grammar illustrating this idea is shown below.
```
S                    -> NP[AGR=?n] VP[AGR=?n]
NP[AGR=?n]           -> PropN[AGR=?n]
VP[TENSE=?t, AGR=?n] -> Cop[TENSE=?t, AGR=?n] Adj

Cop[TENSE=pres,  AGR=[NUM=sg, PER=3]] -> 'is'
PropN[AGR=[NUM=sg, PER=3]]            -> 'Kim'
Adj                                   -> 'happy'
```
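
Because ?n now ranges over whole AGR bundles, subject and copula agree only when every bundled feature matches. A minimal sketch with plain dictionaries: the entry for 'are' is a hypothetical addition for contrast and is not in the grammar above, and plain equality is a simplification of what a parser does.

```python
# AGR values from the lexical productions above, written as dictionaries.
PROPN = {'Kim': {'NUM': 'sg', 'PER': 3}}
COP = {
    'is': {'NUM': 'sg', 'PER': 3},
    'are': {'NUM': 'pl', 'PER': 3},  # hypothetical entry, for contrast only
}

def agrees(subject, copula):
    """True if both words carry the same AGR bundle,
    as S -> NP[AGR=?n] VP[AGR=?n] requires."""
    return PROPN[subject] == COP[copula]

print(agrees('Kim', 'is'))
print(agrees('Kim', 'are'))
```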

# Processing Feature Structures
Feature structures in NLTK are declared with the FeatStruct() constructor. Atomic feature values can be strings or integers.

In [30]:
fs1 = nltk.FeatStruct(TENSE='past',NUM='sg')
print(fs1)

[ NUM   = 'sg'   ]
[ TENSE = 'past' ]


A feature structure is actually just a kind of dictionary, and so we access its values by indexing in the usual way. We can use our familiar syntax to assign values to features:

In [31]:
fs1 = nltk.FeatStruct(PER=3,NUM='pl',GND='fem')
print(fs1['GND'])

fem


In [32]:
fs1['CASE']='acc'
fs1

[CASE='acc', GND='fem', NUM='pl', PER=3]
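
Since a feature structure built this way behaves like a dictionary, the usual dictionary operations should also apply. A short sketch, assuming NLTK is installed; it restates the cells above and adds only standard dictionary calls.

```python
import nltk

fs = nltk.FeatStruct(PER=3, NUM='pl', GND='fem')
fs['CASE'] = 'acc'            # assignment, as with a plain dict

print(sorted(fs.keys()))      # feature names
print('CASE' in fs, len(fs))  # membership test and number of features
for name, value in sorted(fs.items()):
    print(name, value)
```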