<a href="https://colab.research.google.com/github/Neilus03/NLP-2023/blob/main/Python_for_NLP.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python for NLP: An introduction to the `nltk` library

This notebook guides you through a series of exercises that use the `nltk` library and some of its functions for tasks such as:


*   Parsing
*   Tokenization
*   Tree generation
*   ...

Hope you enjoy it!








## **Set up**

In [None]:
pip install nltk

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
import nltk
nltk.download('popular')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading collection 'popular'
[nltk_data]    | 
[nltk_data]    | Downloading package cmudict to /root/nltk_data...
[nltk_data]    |   Unzipping corpora/cmudict.zip.
[nltk_data]    | Downloading package gazetteers to /root/nltk_data...
[nltk_data]    |   Unzipping corpora/gazetteers.zip.
[nltk_data]    | Downloading package genesis to /root/nltk_data...
[nltk_data]    |   Unzipping corpora/genesis.zip.
[nltk_data]    | Downloading package gutenberg to /root/nltk_data...
[nltk_data]    |   Unzipping corpora/gutenberg.zip.
[nltk_data]    | Downloading package inaugural to /root/nltk_data...
[nltk_data]    |   Unzipping corpora/inaugural.zip.
[nltk_data]    | Downloading package movie_reviews to
[nltk_data]    |     /root/nltk_data...
[nltk_data]    |   Unzipping corpora/movie_reviews.zip.
[nltk_data]    | Downloading package names to /root/nltk_data...
[nltk_data]    |   Unzipping corpora/names.zip.
[nltk_data]    | Downloading package shakespeare to /root/nltk_data...
[nlt

True

In [None]:
nltk.download('maxent_treebank_pos_tagger')
nltk.download('treebank')

from nltk.draw.tree import draw_trees


[nltk_data] Downloading package maxent_treebank_pos_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/maxent_treebank_pos_tagger.zip.
[nltk_data] Downloading package treebank to /root/nltk_data...
[nltk_data]   Package treebank is already up-to-date!


In [None]:
sentence = "At eight o'clock on Thursday morning Arthur didn't feel very good."
tokens = nltk.word_tokenize(sentence) # Tokenize
print(tokens)
tagged = nltk.pos_tag(tokens) # PoS tagging
print("PoS:",tagged)
from nltk.tokenize import RegexpTokenizer
s = "Good muffins cost $3.88\nin New York. Please buy me two of them.\n\nThanks."
tokenizer = RegexpTokenizer(r'\w+|\$[\d\.]+|\S+')
output = tokenizer.tokenize(s)
print(output)
from nltk.corpus import treebank
t = treebank.parsed_sents('wsj_0001.mrg')[0]
#t.draw()


['At', 'eight', "o'clock", 'on', 'Thursday', 'morning', 'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']
PoS: [('At', 'IN'), ('eight', 'CD'), ("o'clock", 'NN'), ('on', 'IN'), ('Thursday', 'NNP'), ('morning', 'NN'), ('Arthur', 'NNP'), ('did', 'VBD'), ("n't", 'RB'), ('feel', 'VB'), ('very', 'RB'), ('good', 'JJ'), ('.', '.')]
['Good', 'muffins', 'cost', '$3.88', 'in', 'New', 'York', '.', 'Please', 'buy', 'me', 'two', 'of', 'them', '.', 'Thanks', '.']
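
The `RegexpTokenizer` pattern above is an alternation whose branches are tried in order at each position. As a rough sketch, and assuming that for a simple pattern without capturing groups the tokenizer behaves like `re.findall`, the same tokens can be reproduced with the standard library alone:

```python
import re

s = "Good muffins cost $3.88\nin New York. Please buy me two of them.\n\nThanks."

# The alternatives are tried left to right at every position:
#   \w+        a run of word characters ("Good", "muffins", ...)
#   \$[\d\.]+  a dollar sign followed by digits and dots ("$3.88")
#   \S+        any other run of non-whitespace characters (here, the final ".")
pattern = r"\w+|\$[\d\.]+|\S+"

tokens = re.findall(pattern, s)
print(tokens)
```

Because `\w+` comes first, "York." is split into "York" and ".", while "$3.88" stays in one piece thanks to the second branch.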


## **1. Exercise**



#### Using the NLTK functions shown above, tokenize these sentences and compute their PoS tags. Print the result:

*   `The Jamaica Observer reported that Usain Bolt broke the 100m record`
*   `While hunting in Africa, I shot an elephant in my pajamas. How an elephant got into my pajamas I'll never know.`


In [None]:
# First sentence
sentence1 = "The Jamaica Observer reported that Usain Bolt broke the 100m record."
tokens1 = nltk.word_tokenize(sentence1)
PoS1 = nltk.pos_tag(tokens1)

print(PoS1)

# Second sentence
sentence2 = "While hunting in Africa, I shot an elephant in my pajamas. How an elephant got into my pajamas I'll never know."
tokens2 = nltk.word_tokenize(sentence2)
PoS2 = nltk.pos_tag(tokens2)

print(PoS2)


[('The', 'DT'), ('Jamaica', 'NNP'), ('Observer', 'NNP'), ('reported', 'VBD'), ('that', 'DT'), ('Usain', 'NNP'), ('Bolt', 'NNP'), ('broke', 'VBD'), ('the', 'DT'), ('100m', 'CD'), ('record', 'NN'), ('.', '.')]
[('While', 'IN'), ('hunting', 'VBG'), ('in', 'IN'), ('Africa', 'NNP'), (',', ','), ('I', 'PRP'), ('shot', 'VBP'), ('an', 'DT'), ('elephant', 'NN'), ('in', 'IN'), ('my', 'PRP$'), ('pajamas', 'NN'), ('.', '.'), ('How', 'WRB'), ('an', 'DT'), ('elephant', 'JJ'), ('got', 'VBD'), ('into', 'IN'), ('my', 'PRP$'), ('pajamas', 'NN'), ('I', 'PRP'), ("'ll", 'MD'), ('never', 'RB'), ('know', 'VB'), ('.', '.')]


## **2. Exercise:**

Given the following code:


```
import nltk
from nltk import CFG
# Defining a grammar
groucho_grammar = CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> Det N | Det N PP | 'I' | 'You'
VP -> V NP | VP PP
Det -> 'an' | 'my'
N -> 'elephant' | 'pajamas'
V -> 'shot'
P -> 'in'
""")
# Printing the grammar
print("START grammar:",groucho_grammar.start())
print("PRODUCTIONS grammar:",groucho_grammar.productions())
text = "I shot an elephant in my pajamas"
text_tokens = nltk.word_tokenize(text)
# Parsing the text
parser = nltk.parse.chart.ChartParser(groucho_grammar,trace=2)
trees = parser.parse(text_tokens)
for t in trees:
    print(t)
```
Modify the code to parse this list of texts (no need to parse Groucho's sentence anymore):

```
mytexts = ["John saw a man with my telescope", 
           "Alex kissed the dog", 
           "the man with the telescope ate a sandwich in the park"]
```
Follow these steps:
1. Create a CFG that can parse the sentences in mytexts.

2. Make a loop for each sentence in mytexts:  
    *   Parse the sentence.
    *   Print the parse trees.
    
3. If the output consists of more than one parse tree, explain (maximum two sentences)
why there is an ambiguity.



In [None]:
from nltk import CFG

# Defining a grammar
my_grammar = CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> Det N | Det N PP | Det N PP PP | 'John' | 'Alex'
VP -> V NP | VP PP | VP AdvP
Det -> 'a' | 'my' | 'the'
N -> 'man' | 'telescope' | 'dog' | 'sandwich' | 'park'
V -> 'saw' | 'kissed' | 'ate'
P -> 'with' | 'in'
AdvP -> Adv PP
Adv -> 'with'
""")

# List of texts to be parsed
mytexts = ["John saw a man with my telescope", 
           "Alex kissed the dog", 
           "the man with the telescope ate a sandwich in the park"]

# Printing the grammar
print("START grammar:",my_grammar.start())
print("PRODUCTIONS grammar:",my_grammar.productions())
print ("\n\n\n")

for sentence in mytexts:
    #printing the tree
    print("Tree of the sentence :", sentence)
    parser = nltk.parse.chart.ChartParser(my_grammar,trace=2)
    text_tokens = nltk.word_tokenize(sentence)
    trees = parser.parse(text_tokens)
    for t in trees:
        print(t)
    print("\n\n\n\n\n")


START grammar: S
PRODUCTIONS grammar: [S -> NP VP, PP -> P NP, NP -> Det N, NP -> Det N PP, NP -> Det N PP PP, NP -> 'John', NP -> 'Alex', VP -> V NP, VP -> VP PP, VP -> VP AdvP, Det -> 'a', Det -> 'my', Det -> 'the', N -> 'man', N -> 'telescope', N -> 'dog', N -> 'sandwich', N -> 'park', V -> 'saw', V -> 'kissed', V -> 'ate', P -> 'with', P -> 'in', AdvP -> Adv PP, Adv -> 'with']




Tree of the sentence : John saw a man with my telescope
|. John. saw .  a  . man . with.  my .teles.|
Leaf Init Rule:
|[-----]     .     .     .     .     .     .| [0:1] 'John'
|.     [-----]     .     .     .     .     .| [1:2] 'saw'
|.     .     [-----]     .     .     .     .| [2:3] 'a'
|.     .     .     [-----]     .     .     .| [3:4] 'man'
|.     .     .     .     [-----]     .     .| [4:5] 'with'
|.     .     .     .     .     [-----]     .| [5:6] 'my'
|.     .     .     .     .     .     [-----]| [6:7] 'telescope'
Bottom Up Predict Combine Rule:
|[-----]     .     .     .     .     .     .| [0:1]

### **Explanation of ambiguities:**



#### **John saw a man with my telescope**

#### In the phrase **John saw a man with my telescope** there are two possible trees:


1. 


```
(S
  (NP John)
  (VP
    (VP (V saw) (NP (Det a) (N man)))
    (PP (P with) (NP (Det my) (N telescope)))))
```

Here the PP "with my telescope" attaches to the VP: John used my telescope to see the man.

2. 


```
(S
  (NP John)
  (VP
    (V saw)
    (NP (Det a) (N man) (PP (P with) (NP (Det my) (N telescope))))))
```

Here the PP attaches to the NP "a man" instead: the man John saw was the one with my telescope.

#### **the man with the telescope ate a sandwich in the park**

#### In the phrase **the man with the telescope ate a sandwich in the park** there are two possible trees:


1. 


```
(S
  (NP (Det the) (N man) (PP (P with) (NP (Det the) (N telescope))))
  (VP
    (VP (V ate) (NP (Det a) (N sandwich)))
    (PP (P in) (NP (Det the) (N park)))))
```
The first tree attaches the PP "in the park" to the VP: the eating took place in the park.

2. 


```
(S
  (NP (Det the) (N man) (PP (P with) (NP (Det the) (N telescope))))
  (VP
    (V ate)
    (NP (Det a) (N sandwich) (PP (P in) (NP (Det the) (N park))))))

```

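The second tree attaches the PP "in the park" to the NP "a sandwich": the sandwich is the one in the park.


In both trees the PP "with the telescope" modifies the NP "the man", and the two readings are close enough that the overall meaning is practically the same.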
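
A quick way to see which sentences are ambiguous is to count the trees the chart parser returns for each one. The sketch below repeats the grammar so that it runs on its own, and it uses `str.split` instead of `nltk.word_tokenize` (an assumption that is safe here because the sentences contain no punctuation):

```python
from nltk import CFG
from nltk.parse.chart import ChartParser

grammar = CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> Det N | Det N PP | Det N PP PP | 'John' | 'Alex'
VP -> V NP | VP PP | VP AdvP
Det -> 'a' | 'my' | 'the'
N -> 'man' | 'telescope' | 'dog' | 'sandwich' | 'park'
V -> 'saw' | 'kissed' | 'ate'
P -> 'with' | 'in'
AdvP -> Adv PP
Adv -> 'with'
""")

mytexts = ["John saw a man with my telescope",
           "Alex kissed the dog",
           "the man with the telescope ate a sandwich in the park"]

parser = ChartParser(grammar)  # no trace, so only the counts are printed
counts = {}
for sentence in mytexts:
    trees = list(parser.parse(sentence.split()))
    counts[sentence] = len(trees)
    print(len(trees), "parse tree(s):", sentence)
```

A count above one means the sentence is structurally ambiguous under this grammar.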

## **3. Exercise**

#### Write an example of a sentence that is syntactically correct in English and uses the lexicon of the previous grammar, but cannot be parsed by the previously defined grammar.

1. **"John with my telescope saw a man"**:

    The sentence is grammatical in English and uses only words from the grammar's lexicon. However, the grammar cannot parse it, because no production covers a proper noun followed by a PP.

    The rule for a sentence (S) requires a noun phrase (NP) followed by a verb phrase (VP). The only NP rules that accept a PP are `NP -> Det N PP` and `NP -> Det N PP PP`, which need a determiner and a common noun, whereas 'John' is a complete NP on its own.

    Once "John" is taken as the subject NP, the rest of the sentence, "with my telescope saw a man", would have to be a VP. Every VP in the grammar ultimately starts with a verb (V), and "with" is not one, so the parse fails.

Let's check whether the parser can handle it:

In [None]:
from nltk import CFG

# Defining a grammar
my_grammar = CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> Det N | Det N PP | Det N PP PP | 'John' | 'Alex'
VP -> V NP | VP PP | VP AdvP
Det -> 'a' | 'my' | 'the'
N -> 'man' | 'telescope' | 'dog' | 'sandwich' | 'park'
V -> 'saw' | 'kissed' | 'ate'
P -> 'with' | 'in'
AdvP -> Adv PP
Adv -> 'with'
""")

# List of texts to be parsed
mytexts = ["John with my telescope saw a man"]

# Printing the grammar
print("START grammar:",my_grammar.start())
print("PRODUCTIONS grammar:",my_grammar.productions())
print ("\n\n\n")

for sentence in mytexts:
    #printing the tree
    print("Tree of the sentence :", sentence)
    parser = nltk.parse.chart.ChartParser(my_grammar,trace=2)
    text_tokens = nltk.word_tokenize(sentence)
    trees = parser.parse(text_tokens)
    for t in trees:
        print(t)
    print("\n\n\n\n\n")


START grammar: S
PRODUCTIONS grammar: [S -> NP VP, PP -> P NP, NP -> Det N, NP -> Det N PP, NP -> Det N PP PP, NP -> 'John', NP -> 'Alex', VP -> V NP, VP -> VP PP, VP -> VP AdvP, Det -> 'a', Det -> 'my', Det -> 'the', N -> 'man', N -> 'telescope', N -> 'dog', N -> 'sandwich', N -> 'park', V -> 'saw', V -> 'kissed', V -> 'ate', P -> 'with', P -> 'in', AdvP -> Adv PP, Adv -> 'with']




Tree of the sentence : John with my telescope saw a man
|. John. with.  my .teles. saw .  a  . man .|
Leaf Init Rule:
|[-----]     .     .     .     .     .     .| [0:1] 'John'
|.     [-----]     .     .     .     .     .| [1:2] 'with'
|.     .     [-----]     .     .     .     .| [2:3] 'my'
|.     .     .     [-----]     .     .     .| [3:4] 'telescope'
|.     .     .     .     [-----]     .     .| [4:5] 'saw'
|.     .     .     .     .     [-----]     .| [5:6] 'a'
|.     .     .     .     .     .     [-----]| [6:7] 'man'
Bottom Up Predict Combine Rule:
|[-----]     .     .     .     .     .     .| [0:1]

As you can see, the parser finds no tree, because the grammar has no production that covers this word order.
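
A parse can fail either because a word is missing from the lexicon or because the word order is not covered. To tell the two apart, one can check lexical coverage before parsing. This is a minimal sketch that repeats the grammar so it runs on its own and uses `str.split` in place of `nltk.word_tokenize`:

```python
from nltk import CFG
from nltk.parse.chart import ChartParser

grammar = CFG.fromstring("""
S -> NP VP
PP -> P NP
NP -> Det N | Det N PP | Det N PP PP | 'John' | 'Alex'
VP -> V NP | VP PP | VP AdvP
Det -> 'a' | 'my' | 'the'
N -> 'man' | 'telescope' | 'dog' | 'sandwich' | 'park'
V -> 'saw' | 'kissed' | 'ate'
P -> 'with' | 'in'
AdvP -> Adv PP
Adv -> 'with'
""")

tokens = "John with my telescope saw a man".split()

# check_coverage raises ValueError if any token is missing from the lexicon
try:
    grammar.check_coverage(tokens)
    lexicon_ok = True
except ValueError:
    lexicon_ok = False

trees = list(ChartParser(grammar).parse(tokens))
print("All words in lexicon:", lexicon_ok)
print("Number of parse trees:", len(trees))
```

Every word is covered and yet no tree is produced, which confirms that the failure comes from the grammar rules rather than from the vocabulary.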

## **4. Exercise**

#### Propose how to augment the previous parser to deal with sentences that may be incorrect, for example, containing spelling errors or mistakes arising from automatic speech recognition or handwritten text recognition (maximum 3 sentences).

To augment the previous parser to deal with incorrect sentences, we can consider the following approaches:



*   **Error detection:** We can implement a module that detects errors in the input sentence before it is passed to the parser. This module can use spelling checkers or other machine learning models to identify possible errors and suggest corrections. If an error is detected, the user can be prompted to confirm or correct the error before proceeding. 

*   **Robust parsing:** We can improve the parser's robustness by using techniques such as probabilistic parsing or error-tolerant parsing. These techniques allow the parser to handle uncertain or incomplete input by assigning probabilities to different parse trees or by generating multiple possible parse trees. The most likely or the most appropriate parse tree can then be selected based on context or user feedback.

*   **Data augmentation:** If the parser is learned from data (for example, a PCFG induced from a treebank), we can add training examples that contain errors. This helps the parser handle common errors and variations in the input. Techniques such as noise injection or random perturbations can simulate different error types and improve the parser's robustness.
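
As an illustration of the error-detection idea, the sketch below maps each out-of-lexicon token to its closest lexicon entry before the sentence reaches the parser. It relies on `difflib.get_close_matches` from the standard library. The helper name `correct_tokens`, the cutoff of 0.6 and the noisy sentence are assumptions made for this example, and a real system would need a stronger error model for speech or handwriting errors.

```python
import difflib

# Terminal symbols of the grammar used in the previous exercises
LEXICON = ["John", "Alex", "a", "my", "the", "man", "telescope", "dog",
           "sandwich", "park", "saw", "kissed", "ate", "with", "in"]

def correct_tokens(tokens, lexicon=LEXICON, cutoff=0.6):
    """Replace each out-of-lexicon token with its closest lexicon entry, if any."""
    corrected = []
    for tok in tokens:
        if tok in lexicon:
            corrected.append(tok)  # already known: keep as is
            continue
        match = difflib.get_close_matches(tok, lexicon, n=1, cutoff=cutoff)
        # Fall back to the original token when nothing is similar enough
        corrected.append(match[0] if match else tok)
    return corrected

noisy = "Jhon saw a man with my telescpoe".split()
fixed = correct_tokens(noisy)
print(fixed)
```

The corrected token list can then be passed to the chart parser exactly as in the previous exercises.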










## **5. Exercise**

Given the following code:


```
import nltk
from nltk import PCFG
pcfg1 = PCFG.fromstring("""
S -> NP VP [1.0]
NP -> Det N [0.5] | NP PP [0.25] | 'John' [0.1] | 'I' [0.15]
Det -> 'the' [0.8] | 'a' [0.2]
N -> 'man' [0.5] | 'telescope' [0.5]
VP -> VP PP [0.1] | V NP [0.7] | V [0.2]
V -> 'ate' [0.35] | 'saw' [0.65]
PP -> P NP [1.0]
P -> 'with' [0.61] | 'in' [0.39]
""")
print(pcfg1)
text = "I saw the man with a telescope"
text_tokens = nltk.word_tokenize(text)
viterbi_parser = nltk.ViterbiParser(pcfg1,trace=3)
trees = viterbi_parser.parse(text_tokens)
for tree in trees:
    print(tree)
    tree.draw()
```

Execute this code, modifying the parameter trace (values from 1 to 3). Explain the
differences. Print the parsing probability of the sentence "I saw the man with a
telescope". Print the tree.

In [None]:
#@title Just a simple function for better visualization, just run it
def print_spaces(trace_val):
    print("\n------------------------------------------\n")
    print("With trace = "+ str(trace_val)+":")
    print()

In [None]:
import nltk
from nltk import PCFG

pcfg1 = PCFG.fromstring("""
S -> NP VP [1.0]
NP -> Det N [0.5] | NP PP [0.25] | 'John' [0.1] | 'I' [0.15]
Det -> 'the' [0.8] | 'a' [0.2]
N -> 'man' [0.5] | 'telescope' [0.5]
VP -> VP PP [0.1] | V NP [0.7] | V [0.2]
V -> 'ate' [0.35] | 'saw' [0.65]
PP -> P NP [1.0]
P -> 'with' [0.61] | 'in' [0.39]
""")

print(pcfg1)

text = "I saw the man with a telescope"
text_tokens = nltk.word_tokenize(text)

# Change the trace value and compare the output

# trace = 1

print_spaces(trace_val = 1)

viterbi_parser = nltk.ViterbiParser(pcfg1, trace=1)
trees = viterbi_parser.parse(text_tokens)
for tree in trees:
    print("\nParsing probability:", tree.prob())
    #tree.draw()

print_spaces(trace_val = 2)

# trace = 2
viterbi_parser = nltk.ViterbiParser(pcfg1, trace=2)
trees = viterbi_parser.parse(text_tokens)
for tree in trees:
    print("\nParsing probability:", tree.prob())
    print(tree)
    #tree.draw()

print_spaces(trace_val = 3)

# trace = 3
viterbi_parser = nltk.ViterbiParser(pcfg1, trace=3)
trees = viterbi_parser.parse(text_tokens)
for tree in trees:
    print("\nParsing probability:", tree.prob())
    print(tree)
    #tree.draw()


Grammar with 17 productions (start state = S)
    S -> NP VP [1.0]
    NP -> Det N [0.5]
    NP -> NP PP [0.25]
    NP -> 'John' [0.1]
    NP -> 'I' [0.15]
    Det -> 'the' [0.8]
    Det -> 'a' [0.2]
    N -> 'man' [0.5]
    N -> 'telescope' [0.5]
    VP -> VP PP [0.1]
    VP -> V NP [0.7]
    VP -> V [0.2]
    V -> 'ate' [0.35]
    V -> 'saw' [0.65]
    PP -> P NP [1.0]
    P -> 'with' [0.61]
    P -> 'in' [0.39]

------------------------------------------

With trace = 1:

Inserting tokens into the most likely constituents table...
Finding the most likely constituents spanning 1 text elements...
Finding the most likely constituents spanning 2 text elements...
Finding the most likely constituents spanning 3 text elements...
Finding the most likely constituents spanning 4 text elements...
Finding the most likely constituents spanning 5 text elements...
Finding the most likely constituents spanning 6 text elements...
Finding the most likely constituents spanning 7 text elements...

Pars

As the output above shows, the `trace` parameter of `nltk.ViterbiParser` controls how much information is printed during parsing. Higher values never print less.



*   When `trace` is set to `1`, the parser prints only progress messages: one line when the tokens are inserted and one line per span length.
*   When `trace` is set to `2`, it additionally lists each candidate constituent, with its production and probability, as it is inserted into or discarded from the table.
*   When `trace` is set to `3`, the output is at least as detailed as with `2`; depending on the NLTK version, it may be identical.
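
The parsing probability of a tree is the product of the probabilities of all productions used in it. As a sanity check on the value reported by the parser, the sketch below multiplies the `pcfg1` probabilities by hand for the two possible trees of "I saw the man with a telescope". The variable names are illustrative only.

```python
import math

# Probabilities of the productions shared by both trees (taken from pcfg1)
shared = [
    1.0,            # S -> NP VP
    0.15,           # NP -> 'I'
    0.7,            # VP -> V NP
    0.65,           # V -> 'saw'
    0.5, 0.8, 0.5,  # NP -> Det N, Det -> 'the', N -> 'man'
    1.0,            # PP -> P NP
    0.61,           # P -> 'with'
    0.5, 0.2, 0.5,  # NP -> Det N, Det -> 'a', N -> 'telescope'
]

# The two trees differ in exactly one production
p_np_attach = math.prod(shared + [0.25])  # NP -> NP PP: PP attached to "the man"
p_vp_attach = math.prod(shared + [0.1])   # VP -> VP PP: PP attached to "saw"

print(f"PP attached to 'the man': {p_np_attach:.6g}")
print(f"PP attached to 'saw':     {p_vp_attach:.6g}")
```

Under `pcfg1` the NP attachment comes out 2.5 times more probable than the VP attachment, which is exactly the ratio 0.25 / 0.1 of the two productions that differ.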

## **6. Exercise**

#### Since the sentence has two possible parse trees, modify the grammar probabilities to force the other parsing tree to be more probable. Print the new probability and the tree.

The two parse trees of this sentence share every production except one: the tree that attaches "with a telescope" to "the man" uses `NP -> NP PP`, while the tree that attaches it to "saw" uses `VP -> VP PP`. Which tree wins therefore depends only on which of these two productions is more probable; the remaining probabilities scale both trees by the same factor.

In the grammar below I raised `VP -> V NP` to 0.9 and adjusted several other probabilities. Note that `NP -> NP PP` [0.25] is still larger than `VP -> VP PP` [0.05], so the parser keeps attaching the PP to "the man", as the printed tree shows; selecting the other tree requires making `VP -> VP PP` the more probable of the two.

Previous grammar probabilities:

```
S -> NP VP [1.0]
NP -> Det N [0.5] | NP PP [0.25] | 'John' [0.1] | 'I' [0.15]
Det -> 'the' [0.8] | 'a' [0.2]
N -> 'man' [0.5] | 'telescope' [0.5]
VP -> VP PP [0.1] | V NP [0.7] | V [0.2]
V -> 'ate' [0.35] | 'saw' [0.65]
PP -> P NP [1.0]
P -> 'with' [0.61] | 'in' [0.39]

```


New grammar probabilities to enhance the likelihood of the other tree:

```
S -> NP VP [1.0]
NP -> Det N [0.4] | NP PP [0.25] | 'John' [0.05] | 'I' [0.3]
Det -> 'the' [0.8] | 'a' [0.2]
N -> 'man' [0.5] | 'telescope' [0.5]
VP -> VP PP [0.05] | V NP [0.9] | V [0.05]
V -> 'ate' [0.2] | 'saw' [0.8]
PP -> P NP [1.0]
P -> 'with' [0.70] | 'in' [0.30]
```




In [None]:
import nltk
from nltk import PCFG

pcfg2 = PCFG.fromstring("""
S -> NP VP [1.0]
NP -> Det N [0.4] | NP PP [0.25] | 'John' [0.05] | 'I' [0.3]
Det -> 'the' [0.8] | 'a' [0.2]
N -> 'man' [0.5] | 'telescope' [0.5]
VP -> VP PP [0.05] | V NP [0.9] | V [0.05]
V -> 'ate' [0.2] | 'saw' [0.8]
PP -> P NP [1.0]
P -> 'with' [0.70] | 'in' [0.30]
""")

print(pcfg2)

text = "I saw the man with a telescope"
text_tokens = nltk.word_tokenize(text)

viterbi_parser = nltk.ViterbiParser(pcfg2,trace=0)

# Calculate the probability of the most probable parse tree
prob = None
for tree in viterbi_parser.parse(text_tokens):
    prob = tree.prob()
    break
print("Parsing probability:", prob)

# print parsing tree
trees = viterbi_parser.parse(text_tokens)
for tree in trees:
    print(tree)
    #tree.draw()

Grammar with 17 productions (start state = S)
    S -> NP VP [1.0]
    NP -> Det N [0.4]
    NP -> NP PP [0.25]
    NP -> 'John' [0.05]
    NP -> 'I' [0.3]
    Det -> 'the' [0.8]
    Det -> 'a' [0.2]
    N -> 'man' [0.5]
    N -> 'telescope' [0.5]
    VP -> VP PP [0.05]
    VP -> V NP [0.9]
    VP -> V [0.05]
    V -> 'ate' [0.2]
    V -> 'saw' [0.8]
    PP -> P NP [1.0]
    P -> 'with' [0.7]
    P -> 'in' [0.3]
Parsing probability: 0.00024192000000000007
(S
  (NP I)
  (VP
    (V saw)
    (NP
      (NP (Det the) (N man))
      (PP (P with) (NP (Det a) (N telescope)))))) (p=0.00024192)
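
The tree printed above still attaches the PP to "the man". The two candidate trees differ in a single production (`NP -> NP PP` versus `VP -> VP PP`), so a grammar that gives `VP -> VP PP` the larger probability should flip the attachment. The sketch below uses a hypothetical grammar `pcfg3`, whose numbers are illustrative assumptions rather than part of the original exercise, and `str.split` instead of `nltk.word_tokenize`:

```python
from nltk import PCFG, ViterbiParser

# Hypothetical grammar: VP -> VP PP (0.25) now outweighs NP -> NP PP (0.05)
pcfg3 = PCFG.fromstring("""
S -> NP VP [1.0]
NP -> Det N [0.6] | NP PP [0.05] | 'John' [0.05] | 'I' [0.3]
Det -> 'the' [0.8] | 'a' [0.2]
N -> 'man' [0.5] | 'telescope' [0.5]
VP -> VP PP [0.25] | V NP [0.7] | V [0.05]
V -> 'ate' [0.2] | 'saw' [0.8]
PP -> P NP [1.0]
P -> 'with' [0.7] | 'in' [0.3]
""")

tokens = "I saw the man with a telescope".split()

# The Viterbi parser yields only the single most probable tree
best = next(ViterbiParser(pcfg3).parse(tokens))

print(best)
print("Parsing probability:", best.prob())
```

With these numbers the top-level VP should expand as `VP -> VP PP`, meaning that "with a telescope" modifies the seeing rather than the man.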


## **7. Exercise**

#### Execute this code, modifying the parameter trace (values from 1 to 2). Print the output tree and the probability.

Given the following code for learning and using a grammar:
```
import nltk
from nltk.corpus import treebank

productions=[]
S=nltk.Nonterminal('S')

for f in treebank.fileids():
    for tree in treebank.parsed_sents(f):
        productions+=tree.productions()

grammar=nltk.induce_pcfg(S,productions)

for p in grammar.productions()[1:25]:
    print(p)

myparser = nltk.ViterbiParser(grammar,1)
text = "the boy jumps over the board"
mytokens = nltk.word_tokenize(text)
myparsing, = myparser.parse(mytokens)

print(myparsing)
```
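
`nltk.induce_pcfg` estimates the probability of each production by relative frequency: the number of times the production occurs divided by the number of occurrences of all productions with the same left-hand side. The toy sketch below makes this visible without loading the treebank. The three trees and the names `toy_trees`, `toy_grammar` and `vp_probs` are made up for illustration.

```python
from nltk import Nonterminal, Tree, induce_pcfg

# Three hand-written trees standing in for a (tiny) treebank
toy_trees = [
    Tree.fromstring("(S (NP John) (VP (V sleeps)))"),
    Tree.fromstring("(S (NP John) (VP (V saw) (NP Alex)))"),
    Tree.fromstring("(S (NP Alex) (VP (V sleeps)))"),
]

# Collect the productions of every tree, as in the exercise code
productions = []
for tree in toy_trees:
    productions += tree.productions()

toy_grammar = induce_pcfg(Nonterminal("S"), productions)
for p in toy_grammar.productions():
    print(p)

# Probabilities of the VP rules, keyed by their right-hand side
vp_probs = {p.rhs(): p.prob()
            for p in toy_grammar.productions(lhs=Nonterminal("VP"))}
```

Here `VP -> V` occurs twice and `VP -> V NP` once, so their estimated probabilities are 2/3 and 1/3.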

In [44]:
import nltk
from nltk.corpus import treebank

productions=[]
S=nltk.Nonterminal('S')

for f in treebank.fileids():
    for tree in treebank.parsed_sents(f):
        productions+=tree.productions()

grammar=nltk.induce_pcfg(S,productions)

for p in grammar.productions()[1:25]:
    print(p)

myparser = nltk.ViterbiParser(grammar,2)
text = "the boy jumps over the board"
mytokens = nltk.word_tokenize(text)
myparsing, = myparser.parse(mytokens)

print(myparsing)

Streaming output truncated to the last 5000 lines.
(S
  (NP-SBJ (DT the) (NN boy))
  (NP-PRD
    (NP (NNS jumps))
    (PP (IN over) (NP (DT the) (NN board))))) (p=1.25001e-20)


With `trace` set to 2 or higher, the Viterbi parser's output contains a large number of discards. For every span of the sentence and every nonterminal, the parser keeps only the most probable constituent, and a grammar induced from the treebank offers many competing productions for each span.

Each discard in the trace is a candidate constituent that was considered and then rejected because its probability was lower than the best one found so far for the same span and label.
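
The bookkeeping behind those trace lines can be pictured as a table that stores, for each span and label, the best probability seen so far. The following is a simplified sketch of that idea, not NLTK's actual implementation; the function name `offer` and the probabilities are made up for illustration.

```python
# (start, end, label) -> best probability found so far for that constituent
best = {}

def offer(span_label, prob):
    """Keep the candidate only if it beats the current best for the same span and label."""
    if span_label not in best or best[span_label] < prob:
        best[span_label] = prob
        return "Insert"
    return "Discard"

# Hypothetical candidates competing for the same span and label
log = [
    offer((1, 7, "VP"), 0.0003),  # first candidate: stored
    offer((1, 7, "VP"), 0.0014),  # more probable: replaces the stored one
    offer((1, 7, "VP"), 0.0002),  # less probable: discarded
]
print(log)
print(best)
```

With a large induced grammar, most candidates lose this comparison, which is why the trace is dominated by discards.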