#### Parsing Prefixes with Earley

Here is Earley's algorithm from the course notes:
```
s₀ := {(S → • π, 0)}; for i = 1 to n do sᵢ := {}
for i = 0 to n do
    v := {}
    while v ≠ sᵢ do
        e :∈ sᵢ - v; v := v ∪ {e}
        case e of
            (A → σ • a ω, j) and a = xᵢ₊₁:        -- match (M)
                sᵢ₊₁ := sᵢ₊₁ ∪ {(A → σ a • ω, j)} 
            (A → σ • B ω, j):                     -- predict (P)
                for B → μ ∈ P do
                    sᵢ := sᵢ ∪ {(B → • μ, i)} 
            (A → σ •, j):                         -- complete (C)
                for (B → μ • A ξ, k) ∈ sⱼ do
                    sᵢ := sᵢ ∪ {(B → μ A • ξ, k)}
accept := (S → π •, 0) ∈ sₙ
```

and its implementation in Python:

In [1]:
def parse(g: "grammar", x: "input"):
    global s
    n = len(x); x = '^' + x + '$'; S, π = g[0][0], g[0][2:]
    s = [{(S, '', π, 0)}] + [set() for _ in range(n)]#; print('   s[ 0 ]:', S, '→ •', π, ', 0')
    for i in range(n + 1):
        v = set() # visited items
        while v != s[i]:
            e = (s[i] - v).pop(); v.add(e) # pick an arbitrary unvisited item
            A, σ, τ, j = e
            if len(τ) > 0 and τ[0] == x[i + 1]: # match, a == τ[0]
                f = (A, σ + τ[0], τ[1:], j)
                s[i + 1].add(f)#; print('M  s[', i + 1, ']:', f[0], '→', f[1], '•', f[2], ',', f[3])
            elif len(τ) > 0: # predict, B == τ[0]
                for f in ((r[0], '', r[2:], i) for r in g if r[0] == τ[0]):
                    s[i].add(f)#; print('P  s[', i, ']:', f[0], '→', f[1], '•', f[2], ',', f[3])
            else: # complete, len(τ) == 0
                for f in ((B, μ + ν[0], ν[1:], k) for (B, μ, ν, k) in s[j] if len(ν) > 0 and ν[0] == A):
                    s[i].add(f)#; print('C  s[', i, ']:', f[0], '→', f[1], '•', f[2], ',', f[3])
    return (S, π, '', 0) in s[n]
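
The implementation above encodes an item (A → σ • τ, j) as the tuple `(A, σ, τ, j)` and a rule such as `"E→a+E"` as a string whose first character is the left-hand side and whose tail `r[2:]` is the right-hand side. To make that encoding easier to read when inspecting `s`, here is a minimal sketch of a formatting helper. The name `show_item` is hypothetical and not part of the course notes.

```python
def show_item(item):
    """Format an item tuple (A, σ, τ, j) in the dotted notation of the pseudocode."""
    A, σ, τ, j = item
    # Skip empty parts so the dot sits directly next to the arrow or the comma.
    parts = [p for p in (A, '→', σ, '•', τ) if p]
    return ' '.join(parts) + ', ' + str(j)

rule = "E→a+E"
lhs, rhs = rule[0], rule[2:]          # left-hand side 'E', right-hand side 'a+E'
initial = (lhs, '', rhs, 0)           # dot before the whole right-hand side
advanced = (lhs, 'a', '+E', 0)        # dot moved past the terminal 'a'
completed = (lhs, 'a+E', '', 0)       # dot at the end: candidate for "complete"

for item in (initial, advanced, completed):
    print(show_item(item))
```

An item is complete exactly when its third component is the empty string, which is the condition the `else` branch of `parse` relies on.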

Parsing an input with a grammar returns a Boolean result, e.g.:

In [2]:
G0 = ("S→E", "E→a", "E→a+E")

In [3]:
assert parse(G0, "a+a")
assert parse(G0, "a+a+a")
assert not parse(G0, "a+b")
assert not parse(G0, "b+a")
assert not parse(G0, "a+a+")

For the last input, the prefixes `a` and `a+a` are derivable from the start symbol `S` of `G0`, but the whole input is not. Now modify the Python implementation to return the set of all positions up to which the input is derivable from the start symbol. Copy the body of the Python function `parse` into the cell below and modify it:

In [60]:
def prefixparse(g: "grammar", x: "input"):
    
    def parse(x):
        global s
        n = len(x); x = '^' + x + '$'; S, π = g[0][0], g[0][2:]
        s = [{(S, '', π, 0)}] + [set() for _ in range(n)]#; print('   s[ 0 ]:', S, '→ •', π, ', 0')
        for i in range(n + 1):
            v = set() # visited items
            while v != s[i]:
                e = (s[i] - v).pop(); v.add(e) # pick an arbitrary unvisited item
                A, σ, τ, j = e
                if len(τ) > 0 and τ[0] == x[i + 1]: # match, a == τ[0]
                    f = (A, σ + τ[0], τ[1:], j)
                    s[i + 1].add(f)#; print('M  s[', i + 1, ']:', f[0], '→', f[1], '•', f[2], ',', f[3])
                elif len(τ) > 0: # predict, B == τ[0]
                    for f in ((r[0], '', r[2:], i) for r in g if r[0] == τ[0]):
                        s[i].add(f)#; print('P  s[', i, ']:', f[0], '→', f[1], '•', f[2], ',', f[3])
                else: # complete, len(τ) == 0
                    for f in ((B, μ + ν[0], ν[1:], k) for (B, μ, ν, k) in s[j] if len(ν) > 0 and ν[0] == A):
                        s[i].add(f)#; print('C  s[', i, ']:', f[0], '→', f[1], '•', f[2], ',', f[3])
        return (S, π, '', 0) in s[n]
        
    return {i + 1 for i in range(len(x)) if parse(x[:i + 1])}
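
The version above runs the parser once per prefix. A possible alternative, sketched here as an assumption rather than as the intended solution, is to build the item sets once and report every position `i ≥ 1` whose set `s[i]` contains the completed start item. This relies on `s[i]` depending only on the first `i` input characters, which should hold for grammars like `G0` and `G1` whose nonterminal letters do not occur in the input. The name `prefixparse_single_pass` is hypothetical, and the `list(s[j])` copy is a defensive addition not present in the original.

```python
def prefixparse_single_pass(g, x):
    """Return all positions i >= 1 such that x[:i] is derivable from the start symbol."""
    n = len(x); x = '^' + x + '$'; S, π = g[0][0], g[0][2:]
    s = [{(S, '', π, 0)}] + [set() for _ in range(n)]
    for i in range(n + 1):
        v = set()  # visited items
        while v != s[i]:
            e = (s[i] - v).pop(); v.add(e)  # pick an arbitrary unvisited item
            A, σ, τ, j = e
            if len(τ) > 0 and τ[0] == x[i + 1]:  # match
                s[i + 1].add((A, σ + τ[0], τ[1:], j))
            elif len(τ) > 0:  # predict
                for r in g:
                    if r[0] == τ[0]:
                        s[i].add((r[0], '', r[2:], i))
            else:  # complete; copy s[j] so that adding to s[i] is safe even if j == i
                for (B, μ, ν, k) in list(s[j]):
                    if len(ν) > 0 and ν[0] == A:
                        s[i].add((B, μ + ν[0], ν[1:], k))
    # Instead of testing only s[n], test every item set after the first.
    return {i for i, items in enumerate(s) if i > 0 and (S, π, '', 0) in items}
```

The only change to the parsing loop is the final line: the acceptance test of `parse` is applied to each `s[i]` rather than to `s[n]` alone, so the input is scanned once.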

Here are some test cases.

In [61]:
assert prefixparse(G0, "a+a") == {1, 3}
assert prefixparse(G0, "a+a+a") == {1, 3, 5}
assert prefixparse(G0, "a+b") == {1}
assert prefixparse(G0, "b+a") == set()
assert prefixparse(G0, "a+a+") == {1, 3}

In [62]:
G1 = ("S→E", "E→T", "E→E+T", "E→E-T", "T→F", "T→T×F", "F→a", "F→-E", "F→(E)")

In [63]:
prefixparse(G1, "(-a)")

{4}

In [64]:
assert prefixparse(G1, "") == set()
assert prefixparse(G1, "a×a+a×a") == {1, 3, 5, 7}
assert prefixparse(G1, "--a") == {3}
assert prefixparse(G1, "-(-a)))") == {5}
assert prefixparse(G1, "((-(-a)") == set()
assert prefixparse(G1, "-a×a") == {2, 4}

*Hint:* Some Python constructs may be of help. The function `enumerate` iterates over the elements of a list together with their indices, e.g.:

In [65]:
assert [(i, e) for i, e in enumerate([3, 4, 5])] == [(0, 3), (1, 4), (2, 5)]

When iterating over a collection of tuples, each tuple can be unpacked in the loop header, as in:

In [66]:
assert sum(a == b for a, b in [(1, 2), (3, 3), (1, 1)]) == 2
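
The two hints can be combined. As a small illustration, the sketch below walks over a list of item sets with `enumerate` and unpacks each item tuple in the comprehension. The list `sets` is hand-written, hypothetical data in the `(A, σ, τ, j)` layout used by `parse`, not output of the parser.

```python
# Hypothetical item sets for illustration only.
sets = [
    {('S', '', 'E', 0)},
    {('E', 'a', '', 0), ('S', 'E', '', 0)},
    {('E', 'a+', 'E', 0)},
]

# Collect every index whose set holds a completed item for 'S' that started at 0.
complete_at = {i for i, items in enumerate(sets)
               for (A, σ, τ, j) in items
               if A == 'S' and τ == '' and j == 0}
assert complete_at == {1}
```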