# Probabilistic Context-Free Grammar (PCFG)
- A PCFG extends a standard Context-Free Grammar (CFG) by assigning a probability to each production rule. These probabilities are typically estimated from annotated corpora (e.g., treebanks) and let the parser select the most likely parse for an ambiguous sentence.

### Key Components of a PCFG
1. Non-terminal Symbols (e.g., S, NP, VP).

2. Terminal Symbols (e.g., "cat", "run").

3. Production Rules with probabilities. Each rule has the form:

    A → α [p]

    - A: Non-terminal.
    - α: Sequence of terminals and/or non-terminals.
    - p: Probability of the rule (0 ≤ p ≤ 1).
    - Constraint: For all rules with the same left-hand side (LHS) A, the probabilities must sum to 1.

4. Start Symbol (typically S).
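
The sum-to-1 constraint is easy to check mechanically. The following is a minimal sketch, assuming rules are stored in a plain dictionary keyed by `(lhs, rhs)`; the toy grammar and the helper name `unnormalized_lhs` are illustrative only and not part of any library.

```python
from collections import defaultdict
import math

# Hypothetical toy grammar: (lhs, rhs) -> rule probability
rules = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("NP", "PP")): 0.4,
    ("NP", ("astronomers",)): 0.6,
}

def unnormalized_lhs(rules, tol=1e-9):
    """Return {lhs: total} for every LHS whose rule probabilities do not sum to 1."""
    totals = defaultdict(float)
    for (lhs, rhs), p in rules.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range for {lhs} -> {rhs}: {p}")
        totals[lhs] += p
    return {
        lhs: total
        for lhs, total in totals.items()
        if not math.isclose(total, 1.0, abs_tol=tol)
    }

# An empty result means every LHS is properly normalized
print(unnormalized_lhs(rules))
```

A non-empty result names the offending non-terminals together with their actual totals, which is usually enough to spot a missing or mistyped rule.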

---

### Example PCFG

### How PCFGs Work
1. **Learning Rule Probabilities**

Probabilities are estimated from annotated corpora (e.g., the Penn Treebank) using **Maximum Likelihood Estimation (MLE)**:

P(A → α) = Count(A → α) / Count(A)

- **Count(A → α)**: Number of times the rule A → α appears in the corpus.
- **Count(A)**: Total number of times A appears as the LHS of any rule.

**Example**:
If the rule VP → V NP occurs 700 times and all VP rules together occur 1000 times:

P(VP → V NP) = 700 / 1000 = 0.7
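
The relative-frequency estimate can be sketched in a few lines of Python. This is an illustration under assumptions: the function name `estimate_rule_probs` is hypothetical, and the count of 300 for `VP -> VP PP` is invented so that the VP rules total 1000 as in the example.

```python
from collections import Counter

def estimate_rule_probs(rule_counts):
    """MLE: P(A -> alpha) = Count(A -> alpha) / Count(A)."""
    lhs_totals = Counter()
    for (lhs, _rhs), count in rule_counts.items():
        lhs_totals[lhs] += count
    return {
        rule: count / lhs_totals[rule[0]]
        for rule, count in rule_counts.items()
    }

# Hypothetical treebank counts; only the 700 and the total of 1000
# come from the example, the 300 is an assumed remainder
rule_counts = {
    ("VP", ("V", "NP")): 700,
    ("VP", ("VP", "PP")): 300,
}

probs = estimate_rule_probs(rule_counts)
print(probs[("VP", ("V", "NP"))])  # 0.7
```

Because each count is divided by the total for its own LHS, the estimated probabilities for every non-terminal sum to 1 by construction.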

2. **Parsing with PCFGs**:

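The goal is to find the most probable parse tree for a sentence. This is typically done using:
- **CKY Algorithm (Cocke-Kasami-Younger)**: A dynamic programming chart algorithm for grammars in Chomsky Normal Form (CNF), extended to PCFGs by storing a probability with each chart entry.
- **Viterbi-style maximization**: For each span of the sentence and each non-terminal, only the highest-probability subtree is kept.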

**CKY Algorithm Steps**:
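1. **Initialization**: Fill the diagonal cells (one per word) with the probabilities of the lexical rules A → word.
2. **Iteration**: Combine adjacent subtrees bottom-up; for a rule A → B C, multiply the rule probability by the probabilities of the B and C subtrees and keep the maximum for each non-terminal in each cell.
3. **Backtracking**: Reconstruct the highest-probability tree from the backpointers stored in the table.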
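
The three steps can be sketched in plain Python. This is a minimal illustration, not NLTK's implementation: it assumes a grammar already in CNF (here the same "astronomers" rules used in the NLTK cell of this notebook), and the names `BINARY`, `LEXICAL`, and `cky_viterbi` are hypothetical.

```python
from collections import defaultdict

# (B, C) -> [(A, prob)] for binary rules A -> B C
BINARY = {
    ("NP", "VP"): [("S", 1.0)],
    ("P", "NP"): [("PP", 1.0)],
    ("V", "NP"): [("VP", 0.7)],
    ("VP", "PP"): [("VP", 0.3)],
    ("NP", "PP"): [("NP", 0.4)],
}
# word -> [(A, prob)] for lexical rules A -> word
LEXICAL = {
    "astronomers": [("NP", 0.1)],
    "ears": [("NP", 0.18)],
    "saw": [("NP", 0.04), ("V", 1.0)],
    "stars": [("NP", 0.18)],
    "telescopes": [("NP", 0.1)],
    "with": [("P", 1.0)],
}

def cky_viterbi(words, start="S"):
    """Return (best_tree, probability) for words, or (None, 0.0) if no parse."""
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][A] = max prob of A over words[i:j]
    back = {}                 # back[(i, j, A)] = word, or split point (k, B, C)

    # 1. Initialization: lexical rules fill the length-1 spans
    for i, word in enumerate(words):
        for a, p in LEXICAL.get(word, []):
            best[(i, i + 1)][a] = p
            back[(i, i + 1, a)] = word

    # 2. Iteration: build longer spans from pairs of shorter ones
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for b, pb in best[(i, k)].items():
                    for c, pc in best[(k, j)].items():
                        for a, pr in BINARY.get((b, c), []):
                            p = pr * pb * pc
                            if p > best[(i, j)].get(a, 0.0):
                                best[(i, j)][a] = p
                                back[(i, j, a)] = (k, b, c)

    # 3. Backtracking: follow backpointers from the start symbol
    def build(i, j, a):
        entry = back[(i, j, a)]
        if isinstance(entry, str):
            return (a, entry)
        k, b, c = entry
        return (a, build(i, k, b), build(k, j, c))

    if start not in best[(0, n)]:
        return None, 0.0
    return build(0, n, start), best[(0, n)][start]

tree, prob = cky_viterbi("astronomers saw stars with ears".split())
print(tree)
print(prob)  # approximately 0.0009072
```

The table is keyed by half-open spans `(i, j)` rather than stored as a 2D array, which keeps the sketch short; a production parser would also work in log space to avoid underflow on long sentences.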


### Applications of PCFGs
1. **Syntactic Parsing**: Resolving ambiguity (e.g., prepositional phrase attachment).
2. **Grammar Induction**: Learning grammars from unannotated text.
3. **Machine Translation**: Aligning syntactic structures across languages.
4. **Information Extraction**: Identifying key phrases (e.g., named entities).

In [1]:
import nltk

In [2]:
# Define the PCFG; the probabilities for each left-hand side must sum to 1
pcfg_grammar = nltk.PCFG.fromstring("""
    S -> NP VP [1.0] 
    PP -> P NP [1.0]
    VP -> V NP [0.7] | VP PP [0.3] 
    NP -> NP PP [0.4] 
    P -> 'with' [1.0]
    V -> 'saw' [1.0]
    NP -> 'astronomers' [0.1] | 'ears' [0.18] | 'saw' [0.04] | 'stars' [0.18] | 'telescopes' [0.1]
    """)

In [3]:
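# Avoid naming the variable `str`, which would shadow the built-in type
sentence = "astronomers saw stars with ears"

In [4]:
from nltk.parse import pchart

# Bottom-up probabilistic chart parser
parser = pchart.InsideChartParser(pcfg_grammar)

# Print all possible trees, showing the probability of each parse
for tree in parser.parse(sentence.split()):
    print(tree)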

(S
  (NP astronomers)
  (VP (V saw) (NP (NP stars) (PP (P with) (NP ears))))) (p=0.0009072)
(S
  (NP astronomers)
  (VP (VP (V saw) (NP stars)) (PP (P with) (NP ears)))) (p=0.0006804)
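
The probability of each parse is the product of the probabilities of all rules used in the tree. The sketch below recomputes the two values printed above by hand; the nested-tuple tree encoding and the names `RULE_PROB` and `tree_prob` are assumptions made for this illustration and are not NLTK API.

```python
# Rule probabilities copied from the grammar above: (lhs, rhs) -> prob
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("PP", ("P", "NP")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("NP", "PP")): 0.4,
    ("P", ("with",)): 1.0,
    ("V", ("saw",)): 1.0,
    ("NP", ("astronomers",)): 0.1,
    ("NP", ("ears",)): 0.18,
    ("NP", ("stars",)): 0.18,
}

def tree_prob(tree):
    """Multiply the probabilities of every rule in a (label, child, ...) tree."""
    label, *children = tree
    # A child is either a word (str) or a subtree whose first item is its label
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)
    return p

# PP attached to the noun phrase "stars"
np_attach = ("S", ("NP", "astronomers"),
             ("VP", ("V", "saw"),
              ("NP", ("NP", "stars"), ("PP", ("P", "with"), ("NP", "ears")))))
# PP attached to the verb phrase "saw stars"
vp_attach = ("S", ("NP", "astronomers"),
             ("VP", ("VP", ("V", "saw"), ("NP", "stars")),
              ("PP", ("P", "with"), ("NP", "ears"))))

print(tree_prob(np_attach))  # approximately 0.0009072
print(tree_prob(vp_attach))  # approximately 0.0006804
```

The two trees share every rule except one: the first uses NP → NP PP (0.4) and the second uses VP → VP PP (0.3), which is why the noun-phrase attachment comes out as the more probable parse under this grammar.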
