## Problem Set 5: PDA

_csc427, semester 212
<br>
university of miami
<br>
date: 10 march 2021
<br>
update: 10 march 2021_

---

### Student name: Temuulen Ganbold csc427 ps5

---


### Non-determinism and PDAs

The class MachineModelPDA implements a PDA. It uses the Schrödinger approach of multiple-worlds computation. In this case, the state that is traced includes three elements: the location in the input string, the state, and the complete stack. I call this the _extended state_.

The simulation consists of iterating over a "bag" of such in-progress computations, replacing each one, as it is reviewed, with its possible continuations.
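
As a rough illustration of the extended state and the bag, here is a minimal sketch that is separate from MachineModelPDA. The toy transition table and the helper name `continuations` are assumptions made for this example only; the encoding (':' for the empty string, stack top at index 0) follows the conventions used in this notebook.

```python
# Hypothetical toy machine: in state 'p' read an 'a' and push 'x',
# or guess (without reading input) that it is time to move to 'q'.
transitions = {
    ('p', 'a', ':'): [('p', 'x')],
    ('p', ':', ':'): [('q', ':')],
}

def continuations(ext_state, word):
    """Return the set of extended states reachable in one move."""
    state, i, stack = ext_state
    out = set()
    # a move either reads nothing (':') or reads the next input letter
    reads = [(':', i)]
    if i < len(word):
        reads.append((word[i], i + 1))
    for letter, new_i in reads:
        # a move either ignores the stack (':') or pops its top symbol
        pops = [(':', stack)]
        if stack:
            pops.append((stack[0], stack[1:]))
        for symbol, rest in pops:
            for new_state, push in transitions.get((state, letter, symbol), []):
                new_stack = rest if push == ':' else (push,) + rest
                out.add((new_state, new_i, new_stack))
    return out

# The bag starts with one extended state: (start state, location 0, empty stack).
bag = {('p', 0, ())}
trace = [bag]
for _ in range(3):
    # replace every in-progress computation by its continuations
    bag = set().union(*(continuations(e, 'aa') for e in bag))
    trace.append(bag)

for step, states in enumerate(trace):
    print(step, sorted(states))
```

Each pass replaces the whole bag, so a computation with no continuation simply disappears from it.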

It is possible that such computations do not halt, yet do not strictly loop. For instance, the "loop" of pushing the same symbol onto the stack generates a new extended state at every step, so it is not a loop in the strict sense.
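
A small sketch of that situation, under the assumption of a one-state toy machine whose only move is to push 'x' without reading input (the function name `run_with_limit` is made up for this example). Because the stack grows at every step, no extended state is ever repeated, so a repeated-state check never fires and only the step limit stops the run.

```python
def run_with_limit(limit):
    """Follow the single push-forever computation for at most `limit` steps.

    Returns 'loop' if an extended state repeats, or None if the step
    limit is reached without a repeat (membership left undetermined).
    """
    ext_state = ('p', 0, ())          # (state, location, stack)
    seen = {ext_state}
    for _ in range(limit):
        state, i, stack = ext_state
        ext_state = (state, i, ('x',) + stack)   # the only available move
        if ext_state in seen:
            return 'loop'
        seen.add(ext_state)
    return None

print(run_with_limit(50))
```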

It is clear to our minds that such a computation path can be terminated. It is a limitation of the machine model that it, itself, cannot do this pruning, and this is the nature of the _intrinsic non-determinism_ of Context Free Languages.

For instance, Context Free Languages are __not__ closed under complementation. If they were, then perhaps a companion computation on the complement language would overrule these loops, announcing definitively that the string is not in the original language, and all hunts should end.

However, for full credit, your programs must halt!

It is possible to write a conceptually correct program that accepts but does not halt on reject. For a string not in the language, the language definition, being non-deterministic, permits the machine to run forever looking for the accepting computation that it will never find.

Obviously, we do not write programs in this mindset. Spell checkers do not run forever looking for some entry in some dictionary to justify some obvious misspelling. Although I understand Scrabble games often end in such disputes.
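
One way to spot a machine that may never give up on a rejected string, offered only as a sketch: look for cycles made of transitions that read no input and pop nothing. Such a cycle can be repeated whatever the stack holds, so the bag of extended states never empties. The function name `no_pop_epsilon_cycles` is hypothetical, and the check is a heuristic: a machine that passes it can still fail to halt, for example through a cycle that pops and re-pushes.

```python
def no_pop_epsilon_cycles(description):
    """List the states that lie on a cycle of (state, ':', ':') transitions."""
    # edges: state -> states reachable by one move that reads and pops nothing
    edges = {}
    for (state, letter, symbol), targets in description['transitions'].items():
        if letter == ':' and symbol == ':':
            edges.setdefault(state, set()).update(t[0] for t in targets)

    looping = []
    for start in edges:
        seen, frontier = set(), list(edges[start])
        while frontier:
            s = frontier.pop()
            if s in seen:
                continue
            seen.add(s)
            frontier.extend(edges.get(s, ()))
        if start in seen:          # start can get back to itself
            looping.append(start)
    return sorted(looping)

# Hypothetical descriptions, reduced to the one field the check reads.
pushes_forever = {'transitions': {('S', ':', ':'): [('S', 'x')]}}
harmless = {'transitions': {('S', ':', ':'): [('E', ':')],
                            ('E', '0', ':'): [('S', ':')]}}

print(no_pop_epsilon_cycles(pushes_forever))
print(no_pop_epsilon_cycles(harmless))
```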



## PDA Description Format


__PDA Description:__

    A PDA description is a dictionary with,
    
        'states': a list of states.
        'alphabet': a list of word letters
        'symbols': a list of stack symbols
        'transitions': a dictionary of
                tuple(state,alphabet_e,symbols_e):list(tuple((state,symbols_e)))
        'start': a state (the start state)
        'accept': a list of states (the accepting states)
  
    A state is a string
    A word letter is a string of length 1, but not the string ':'
    A stack symbol is a string but not the string ':'
    alphabet_e is alphabet adjoining ':'
    symbols_e is the symbols adjoining ':'
    The string ':' is used in transitions to denote the empty string.
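
To make the format concrete, here is a sketch of a description for the language of strings 0^n 1^n with n >= 0, using a standard construction that marks the stack bottom with '$'. The names `pda_0n1n` and `well_formed` are made up for this example; `well_formed` only mirrors, in simplified form, the kind of checks the constructor below performs.

```python
pda_0n1n = {
    'states': ['q1', 'q2', 'q3', 'q4'],
    'alphabet': ['0', '1'],
    'symbols': ['$', '0'],
    'transitions': {
        ('q1', ':', ':'): [('q2', '$')],   # push the bottom marker, read nothing
        ('q2', '0', ':'): [('q2', '0')],   # read a 0, push a 0
        ('q2', '1', '0'): [('q3', ':')],   # read a 1, pop a 0
        ('q3', '1', '0'): [('q3', ':')],
        ('q3', ':', '$'): [('q4', ':')],   # pop the marker, read nothing
    },
    'start': 'q1',
    'accept': ['q1', 'q4'],
}

def well_formed(d):
    """Simplified structural check of a description dictionary."""
    for (state, letter, symbol), targets in d['transitions'].items():
        if state not in d['states']:
            return False
        if letter != ':' and letter not in d['alphabet']:
            return False
        if symbol != ':' and symbol not in d['symbols']:
            return False
        for new_state, push in targets:
            if new_state not in d['states']:
                return False
            if push != ':' and push not in d['symbols']:
                return False
    return d['start'] in d['states'] and all(s in d['states'] for s in d['accept'])

print(well_formed(pda_0n1n))
```

A dictionary of this shape is what MachineModelPDA takes as its `machine_description` argument.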


In [1]:
"""
The verbose switch:
    Set this true or false, to run code verbosely
"""

verbose = False

In [2]:
class MachineModelPDA:
    """
    A machine description is a dictionary with,
        'states': a list of states.
        'alphabet': a list of word letters
        'symbols': a list of stack symbols
        'transitions': a dictionary of
                tuple(state,alphabet_e,symbols_e):list(tuple((state,symbols_e)))
        'start': a state (the start state)
        'accept': a list of states (the accepting states)
  
    a state is a string
    a word letter is a string of length 1, but not the string ':'
    a stack symbol is a string but not the string ':'
    alphabet_e is alphabet adjoining ':'
    symbols_e is the symbols adjoining ':'
    
    an extended state (ext_state) is the tuple (state,location,stack contents)
    the stack contents is a tuple of symbols, with the top of the stack at index 0,
    
    the current state is a list of extended states, each extended state the result of 
    computing up to but not including the index in the word given by the integer "location" .
    
    """
    
    def __init__(self,machine_description):
        
        def check_transitions():
            ## check that it is a dict((state,alphabet_e,symbols_e):list((state,symbols_e)))
            assert type(self.transitions)==type({})
            for k in self.transitions:
                if k[0] not in self.states:
                    print(f'state {k[0]} not a state')
                    assert False
                if not( k[1]==':' or k[1] in self.alphabet ):
                    print(f'letter {k[1]} not in alphabet')
                    assert False
                if not( k[2]==':' or k[2] in self.symbols ):
                    print(f'symbol {k[2]} not a symbol')
                    assert False
                vs = self.transitions[k]
                assert type(vs)==type([])
                for v in vs:
                    if v[0] not in self.states:
                        print(f'state {v[0]} not a state')
                        assert False
                    if not( v[1]==':' or v[1] in self.symbols):
                        print(f'symbol {v[1]} not a symbol')
                        assert False
        
        def check_alphabet():
            ## the alphabet symbols must be single characters,
            ## and should not be the : character which has the special
            ## meaning of a possible non-input consuming transition.
            for a in self.alphabet:
                assert len(a)==1 and a != ':'
                
        def check_start():
            assert self.start_state in self.states
            
        def check_accept_states():
            assert type(self.accept_states)==type([])
            for s in self.accept_states:
                assert s in self.states
        
        self.states = machine_description['states']
        self.alphabet = machine_description['alphabet']
        self.symbols = machine_description['symbols']
        self.transitions = machine_description['transitions']
        self.start_state = machine_description['start'] 
        self.accept_states = machine_description['accept']
        
        check_transitions()
        check_alphabet()
        check_start()
        check_accept_states()
        
        # for the non-deterministic trace we need to have the complete state, 
        # including stack contents and current input location
        self.current_states = [(self.start_state,0,())]
        
        # hint list. this works for verification
        self.merlin = []


    def do_transition(self,word,current_state):
        
        def trans_aux(letter,new_i,state,symbol,stack):
            new_states = []
            key = (state,letter,symbol)
            if key in self.transitions:
                for new_state,new_symbol in self.transitions[key]:
                    if new_symbol != ':':
                        nstack = (new_symbol,)+stack
                    else:
                        nstack = stack
                    new_states.append((new_state,new_i,nstack))
            return new_states
            
        state, i, stack = current_state
        new_states = []
        
        # transition of input letter
        if i<len(word):
            if len(stack)>0:
                new_states.extend(trans_aux(word[i],i+1,state,stack[0],stack[1:]))
            new_states.extend(trans_aux(word[i],i+1,state,':',stack))

        # epsilon transitions
        if len(stack)>0:
            new_states.extend(trans_aux(':',i,state,stack[0],stack[1:]))
        new_states.extend(trans_aux(':',i,state,':',stack))
        
        return set(new_states)  # glean duplicates of possible states
        
    def approximate_compute(self,word,limit):
        
        ## returns True if word is in language, False if it is not,
        ## and None if the computation was terminated and did not
        ## determine membership.
        
        def word_accepted():
            for ext_state in self.current_states:
                state, i, stack = ext_state
                if i==len(word) and state in self.accept_states:
                    return True
            return False
        
        self.current_states = set([(self.start_state,0,())])
        for steps in range(limit):
            
            if verbose: print(self.current_states)
            if word_accepted():
                return True
            if len(self.current_states)==0:
                return False

            new_states = set()
            for ext_state in self.current_states:
                new_states |= self.do_transition(word,ext_state)
            self.current_states = new_states
            if (len(self.current_states)>limit*limit):
                return None
            
        return None   # computation abandoned
        
        
    def describe(self,name=""):
        print("Machine Description:",name)
        print("\tstates:",len(self.states))
        print("\t\t",self.states)
        print("\tsymbols:",len(self.symbols))
        print("\t\t",self.symbols)
        print("\ttransitions:",len(self.transitions))
        for t,v in self.transitions.items():
            print(f"\t\t{t}  ->  {v}")
        print("\taccept states:",len(self.accept_states))
        print("\t\t",self.accept_states)
        print()


def test_machine(pda_description,test_cases,name="",limit=100):
    
    print('running test:',name)
    pda = MachineModelPDA(pda_description)
    if verbose: pda.describe(name)
    for (t,r) in (test_cases):
        if pda.approximate_compute(t,limit) != r:
            print(r,'\t|'+t+'|','\tWRONG, ABORT')
            return False
        print(r,'\t|'+t+'|','\tOK')
    print("** passes test")
    return True
  

## Exercise A

From Sipser Exercises 2.5. Give a description of a PDA accepting a string over $\{\,0,1\,\}$ when,

1. it has at least 3 ones;
1. it begins and ends with the same symbol;
1. it is of odd length;
1. it is of odd length and the middle symbol is a 0;
1. it is a palindrome;
1. never: the language is the empty set.
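
The six conditions above can also be written as plain Python predicates, which gives an independent way to generate expected answers for every short string. This is only a sketch: the names `reference_A`, `words`, and `expected_cases` are made up here, and the predicates are my reading of the list above.

```python
from itertools import product

# One predicate per item in the list above, in the same order.
reference_A = {
    1: lambda s: s.count('1') >= 3,
    2: lambda s: len(s) >= 1 and s[0] == s[-1],
    3: lambda s: len(s) % 2 == 1,
    4: lambda s: len(s) % 2 == 1 and s[len(s) // 2] == '0',
    5: lambda s: s == s[::-1],
    6: lambda s: False,
}

def words(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield ''.join(letters)

def expected_cases(predicate, alphabet='01', max_len=4):
    """Build (word, expected) pairs for all short words."""
    return [(w, predicate(w)) for w in words(alphabet, max_len)]

cases = expected_cases(reference_A[4])
print(len(cases), cases[:5])
```

Each list has the `(word, expected)` shape that `test_machine` takes, so a call such as `test_machine(pda_2_5_1, expected_cases(reference_A[1]))` would exercise a machine on every word up to the chosen length.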


In [3]:
pda_2_5_1 = {
    'states':['S','A','B','F'],
    'alphabet':['0','1'],
    'symbols':[],
    'transitions':{
        ('S','0',':'):[('S',':')],('S','1',':'):[('A',':')],
        ('A','0',':'):[('A',':')],('A','1',':'):[('B',':')],
        ('B','0',':'):[('B',':')],('B','1',':'):[('F',':')],
        ('F','0',':'):[('F',':')],('F','1',':'):[('F',':')],
    },
    'start':'S',
    'accept':['F']
}

test_2_5_1 = [
    ('11',False),
    ('1000000',False),
    ('10001',False),
    ('111',True),
    ('10000111',True),
    ('1111',True)
]

pda_2_5_2 = {
    'states':['S','F1','F2','N1','N2'],
    'alphabet':['0','1'],
    'symbols':[],
    'transitions':{
        ('S','0',':'):[('F1',':')],('S','1',':'):[('F2',':')],
        ('F1','0',':'):[('F1',':')],('F1','1',':'):[('N1',':')],
        ('N1','0',':'):[('F1',':')],('N1','1',':'):[('N1',':')],
        ('F2','0',':'):[('N2',':')],('F2','1',':'):[('F2',':')],
        ('N2','0',':'):[('N2',':')],('N2','1',':'):[('F2',':')],
        
    },
    'start':'S',
    'accept':['F1','F2']  
}

test_2_5_2 = [
    ('',False),
    ('1',True),
    ('0',True),
    ('01',False),
    ('1011111',True),
]

pda_2_5_3 = {
    'states':['S','E','F'],
    'alphabet':['0','1'],
    'symbols':[],
    'transitions':{
        ('S',':',':'):[('E',':')],
        ('S','0',':'):[('F',':')],('S','1',':'):[('F',':')],
        ('F','0',':'):[('E',':')],('F','1',':'):[('E',':')],
        ('E','0',':'):[('F',':')],('E','1',':'):[('F',':')],
    },
    'start':'S',
    'accept':['F']    
}

test_2_5_3 = [
    ('',False),
    ('10',False),
    ('1001111',True),
]

pda_2_5_4 = {
    'states':['S','A','B','F','S1','F1','A1','B1'],
    'alphabet':['0','1'],
    'symbols':['$','E'],
    'transitions':{
        ('S1',':',':'):[('S','$')],
        ('S','0',':'):[('A',':')],('S','1',':'):[('A',':')],
        ('A','0',':'):[('A1','E'),('B1','E')],('A','1',':'):[('A1','E')],
        ('A1','0','E'):[('A',':'),('B',':')],('A1','1','E'):[('A',':')],
        ('B1','0','E'):[('B',':'),('F',':')],('B1','1','E'):[('B',':'),('F',':')],
        
        ('F',':','$'):[('F1',':')],
        
    },
    'start':'S1',
    'accept':['F1']    
}



test_2_5_4 = [
    ('',False),
    ('101',True),
    ('111',False),
    ('001',True),
    ('0010101',True),
    ('0000',False),
    ('0010101',True),
    ('000011',False),
    
]

# tricky because even or odd
pda_2_5_5 = {
    'states':['S','S1','F','F1','A','A1','B','B1'],
    'alphabet':['0','1'],
    'symbols':['$','1','0'],
    'transitions':{
        ('S',':',':'):[('S1','$')],
        ('S1','0',':'):[('A','0'),('F',':')],('S1','1',':'):[('A','1'),('F',':')],
        ('A','0',':'):[('A','0'),('A1',':')],('A','0','0'):[('A1',':')],
        ('A','1',':'):[('A','1'),('A1',':')],('A','1','1'):[('A1',':')],
        ('A1','0','0'):[('A1',':')],('A1','1','1'):[('A1',':')],
        
      
        ('A1',':','$'):[('F1',':')],
        ('F',':','$'):[('F1',':')],
    },
    'start':'S',
    'accept':['F1','S']    
}

test_2_5_5 = [
    ('',True),
    ('1',True),
    ('0',True),
    ('11',True),
    ('00',True),
    ('10',False),
    ('01',False),
    ('101',True),
    ('010',True),
    ('111',True),
    ('0000',True),
    ('10101',True),
    ('101101',True),
    ('1110000',False),
    ('0010001',False),
    ('1000100',False),
    ('100000000',False),
    ('10001000001',False),
]

# not to be confused with the language {':'}
pda_2_5_6 = {
    'states':['S','E','F','F1'],
    'alphabet':['0','1'],
    'symbols':['$'],
    'transitions':{
        ('S',':',':'):[('E','$')],
        ('E','0',':'):[('F',':')],('E','1',':'):[('F',':')],
        ('F','0',':'):[('F',':')],('F','1',':'):[('F',':')],
    },
    'start':'S',
    'accept':['F1']    
}

test_2_5_6 = [
    ('',False),
    ('1',False),
    ('00',False),
    ('010',False),
]

### Basic Tests

In [4]:

def basic_test(pdas,tests):
    i = 0
    correct = 0
    for mac, test in zip(pdas,tests):
        if test_machine(mac,test,name='exercise '+str(i+1),limit=100):
            correct += 1
        i += 1
    print(f'\n{correct} out of {i} correct')


In [5]:
exer_A = [
    pda_2_5_1,
    pda_2_5_2,
    pda_2_5_3,
    pda_2_5_4,
    pda_2_5_5,
    pda_2_5_6
]

test_A = [
    test_2_5_1,
    test_2_5_2,
    test_2_5_3,
    test_2_5_4,
    test_2_5_5,
    test_2_5_6,
]

basic_test(exer_A,test_A)

running test: exercise 1
False 	|11| 	OK
False 	|1000000| 	OK
False 	|10001| 	OK
True 	|111| 	OK
True 	|10000111| 	OK
True 	|1111| 	OK
** passes test
running test: exercise 2
False 	|| 	OK
True 	|1| 	OK
True 	|0| 	OK
False 	|01| 	OK
True 	|1011111| 	OK
** passes test
running test: exercise 3
False 	|| 	OK
False 	|10| 	OK
True 	|1001111| 	OK
** passes test
running test: exercise 4
False 	|| 	OK
True 	|101| 	OK
False 	|111| 	OK
True 	|001| 	OK
True 	|0010101| 	OK
False 	|0000| 	OK
True 	|0010101| 	OK
False 	|000011| 	OK
** passes test
running test: exercise 5
True 	|| 	OK
True 	|1| 	OK
True 	|0| 	OK
True 	|11| 	OK
True 	|00| 	OK
False 	|10| 	OK
False 	|01| 	OK
True 	|101| 	OK
True 	|010| 	OK
True 	|111| 	OK
True 	|0000| 	OK
True 	|10101| 	OK
True 	|101101| 	OK
False 	|1110000| 	OK
False 	|0010001| 	OK
False 	|1000100| 	OK
False 	|100000000| 	OK
False 	|10001000001| 	OK
** passes test
running test: exercise 6
False 	|| 	OK
False 	|1| 	OK
False 	|00| 	OK
False 	|010| 	OK
** passes test

6 out of 6 correct

## Exercise B

From Sipser Exercises 2.7. Give a description of a PDA accepting each of the following languages.

1. The language,

$$
\{\, s\in\{\,a,b\,\}^*\,|\, \mbox{$s$ has more $a$'s than $b$'s }\,\}
$$

2. The complement of the language,

$$
\{\,a^nb^n\,|\, n\ge 0\,\}
$$

3. The language,

$$
\{\,w\#x\,|\, w,x \in \{\,a,b\,\}^*, w^R \mbox{ is a substring of } x\,\}
$$

That is, strings of the form w#x, with the string reverse(w) a substring of x, and '#' only where shown.

4. The language,

$$
\{\,x_1\#x_2\#\dots\#x_k \,|\, k\ge 1, x_i \in\{\,a,b \,\}^*, \exists \,i, j \mbox{ s.t. } x_i = (x_j)^R\,\}
$$
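
For checking test cases by hand, the four definitions can be sketched as Python predicates. The function names are made up for this example. For the fourth language the sketch assumes that i and j may be equal, since the formula does not require them to differ; under that reading a single palindromic block is enough.

```python
def more_as_than_bs(s):
    return s.count('a') > s.count('b')

def not_anbn(s):
    # complement of { a^n b^n | n >= 0 } over the alphabet {a, b}
    n = len(s) // 2
    return s != 'a' * n + 'b' * n

def reverse_is_substring(s):
    # exactly one '#', and reverse(w) occurs somewhere inside x
    if s.count('#') != 1:
        return False
    w, x = s.split('#')
    return w[::-1] in x

def some_block_is_reverse_of_another(s):
    # blocks x_1 .. x_k are separated by '#'; i == j is allowed (assumption)
    blocks = s.split('#')
    return any(xi == xj[::-1] for xi in blocks for xj in blocks)

for word in ['ab#aba', 'ab#ab']:
    print(word, reverse_is_substring(word))
for word in ['abb#bba', 'ba#ba', 'ab#aa#ab']:
    print(word, some_block_is_reverse_of_another(word))
```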

   
   
   

In [6]:
pda_2_7_1 = {
    'states':['S','S1','A','F','B','F1'],
    'alphabet':['a','b'],
    'symbols':['$','a','b'],
    'transitions':{
        ('S1',':',':'):[('S','$')],
        ('S','a',':'):[('A','a')],('S','b',':'):[('B','b')],
        ('A','a',':'):[('A','a')],('A','b','a'):[('B',':')],
        ('A','a','b'):[('A',':')],('B','b','a'):[('B',':')],
        ('B','a','b'):[('A',':')],('B','b',':'):[('B','b')],
        ('B','a',':'):[('A','a')],('A','b',':'):[('B','b')],
        
        
        ('B',':','a'):[('F',':')],('A',':','a'):[('F',':')],
        ('F',':','a'):[('F',':')],('F',':','$'):[('F1',':')],
        
        
    },
    'start':'S1',
    'accept':['F1']    
}

pda_2_7_2 = {
    'states':['S','S1','A','B','F','F1','C','C1','A1','B1','SS','AA','BB','FF','FF1'],
    'alphabet':['a','b'],
    'symbols':['$','q','w'],
    'transitions':{
        #a^n b^m n!=m
        ('S1',':',':'):[('S','$'),('SS','$')],
        ('S','a',':'):[('A','q')],('S','b',':'):[('B','w')],
        ('A','a',':'):[('A','q')],('A','b','q'):[('B',':')],
        ('B','b','q'):[('B',':')],('B','b',':'):[('B','w')],
        
        #(aUb)*ba(aUb)*
        
        ('SS','a',':'):[('AA',':')],('SS','b',':'):[('BB',':'),('C',':')],
        ('AA','a',':'):[('AA',':')],('AA','b',':'):[('C',':')],
        ('BB','b',':'):[('BB',':'),('C',':')],
        ('C','a',':'):[('C1',':'),('FF',':')],
        ('C1','a',':'):[('A1',':')],('C1','b',':'):[('B1',':')],
        ('A1','a',':'):[('A1',':')],('B1','b',':'):[('B1',':')],
        ('A1',':',':'):[('FF',':')],('B1',':',':'):[('FF',':')],
        ('FF',':','$'):[('FF1',':')],
        
        #a^n b^m n!=m
        ('A',':','q'):[('F',':')],
        ('B',':','q'):[('F',':')],('B',':','w'):[('F',':')],
        ('F',':','q'):[('F',':')],('F',':','w'):[('F',':')],
        ('F',':','$'):[('F1',':')],
    },
    'start':'S1',
    'accept':['F1','FF1']
}

pda_2_7_3 = {
    'states':['S1','S','A','B','C','F'],
    'alphabet':['a','b','#'],
    'symbols':['$','q','w'],
    'transitions':{
        ('S1',':',':'):[('S','$')],
        ('S','a',':'):[('S','q')],('S','b',':'):[('S','w')],
        ('S','#',':'):[('A',':')],
        ('A','a','q'):[('A',':')],('A','b','w'):[('A',':')],
        ('A','a',':'):[('A',':')],('A','b',':'):[('A',':')],
        ('A',':','$'):[('F',':')],
    },
    'start':'S1',
    'accept':['F']
}

pda_2_7_4 = {
    'states':['0','1','2','3','4','5','6','7'],
    'alphabet':['a','b','#'],
    'symbols':['$','a','b'],
    'transitions':{
        ('0',':',':'):[('1','$')],
        ('1',':',':'):[('2',':'),('3',':')],('2','#',':'):[('1',':')],
        ('2','a',':'):[('2',':')],('2','b',':'):[('2',':')],
        ('3','a',':'):[('3','a')],('3','b',':'):[('3','b')],
        ('3','#',':'):[('4',':')],('3',':',':'):[('6',':')],
        ('4',':',':'):[('3',':'),('6',':')],('4','#',':'):[('5',':')],
        ('6','a','a'):[('6',':')],('6','b','b'):[('6',':')],
        ('6','#',':'):[('4',':')],('6',':','$'):[('7',':')],
        ('5',':',':'):[('4',':')],
        ('5','a',':'):[('5',':')],('5','b',':'):[('5',':')],
        ('5',':','$'):[('7',':')],
    },
    'start':'0',
    'accept':['7']
}

test_2_7_1 =[
    ('',False),
    ('bbaa',False),
    ('abaabbbb',False),
    ('abababab',False),
    ('abababb',False),
    ('aaaaa',True),
    ('abbbaaaa',True),
    ('abababa',True),
    ('bbbabaaaaba',True),
    ('bbbabaaaa',True),
    ('bababaa',True),
    ('baba',False),
    ('bbbba',False),
    ('a',True),
    ('b',False),
    ('ab',False),
    ('aba',True)
]

test_2_7_2 =[
    #a^n b^m n!=m
    ('ab',False),
    ('aabbb',True),
    ('abbb',True),
    ('aaaab',True),
    ('aa',True),
    ('bb',True),
    #(aUb)*ba(aUb)*
    ('ba',True),
    ('abab',True),
    ('aaababbbb',True),
    ('babbb',True),
    ('aaaaba',True),
    ('bbbbabaaa',False),
]

test_2_7_3 =[
    ('#',True),
    ('a#a',True),
    ('ab#ab',False),
    ('ab#aba',True),
    ('a#a#a',False)
]

test_2_7_4 =[
    
    ('abb#bba',True),
    ('ba#baa#aab#ba',True),
    ('ba#ba',False)
]


In [7]:
exer_B = [
    pda_2_7_1,
    pda_2_7_2,
    pda_2_7_3,
    pda_2_7_4,
]

test_B = [
    test_2_7_1,
    test_2_7_2,
    test_2_7_3,
    test_2_7_4,
]

basic_test(exer_B,test_B)

running test: exercise 1
False 	|| 	OK
False 	|bbaa| 	OK
False 	|abaabbbb| 	OK
False 	|abababab| 	OK
False 	|abababb| 	OK
True 	|aaaaa| 	OK
True 	|abbbaaaa| 	OK
True 	|abababa| 	OK
True 	|bbbabaaaaba| 	OK
True 	|bbbabaaaa| 	OK
True 	|bababaa| 	OK
False 	|baba| 	OK
False 	|bbbba| 	OK
True 	|a| 	OK
False 	|b| 	OK
False 	|ab| 	OK
True 	|aba| 	OK
** passes test
running test: exercise 2
False 	|ab| 	OK
True 	|aabbb| 	OK
True 	|abbb| 	OK
True 	|aaaab| 	OK
True 	|aa| 	OK
True 	|bb| 	OK
True 	|ba| 	OK
True 	|abab| 	OK
True 	|aaababbbb| 	OK
True 	|babbb| 	OK
True 	|aaaaba| 	OK
False 	|bbbbabaaa| 	OK
** passes test
running test: exercise 3
True 	|#| 	OK
True 	|a#a| 	OK
False 	|ab#ab| 	OK
True 	|ab#aba| 	OK
False 	|a#a#a| 	OK
** passes test
running test: exercise 4
True 	|ab#aa#ab| 	WRONG, ABORT

3 out of 4 correct
