***

## Universal Turing Machines

>CSC427: Theory of Automata and Complexity. 
<br>
University of Miami
<br>
Spring 2020.
<br>
Burton Rosenberg.
<br>
<br>
Created: 1 April 2020
<br>last update: 7 April 2020

***


### Overview

A two-tape Turing Machine has two tapes, each with its own reading head. Each transition now depends on the triple of the current state, the symbol under the head on the first tape, and the symbol under the head on the second tape.

The effect of a transition is to write a possibly new symbol under the head of the first tape and possibly move that head one cell left or right, to do the same independently on the second tape, and to enter a possibly new state.
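As a minimal sketch of the transition rule just described, the following keeps the transition table in a dictionary keyed by the triple (state, symbol 1, symbol 2). The names `step`, `delta`, and `BLANK` are illustrative assumptions, not part of the notebook's classes, and the sketch assumes that a left move at the left end of a tape leaves the head in place.

```python
# Illustrative sketch only: `step`, `delta`, and `BLANK` are hypothetical names.
BLANK = ' '

def step(delta, state, tapes, heads):
    """Apply one two-tape transition in place.

    Returns the new state, or None when no rule matches the current triple.
    """
    key = (state, tapes[0][heads[0]], tapes[1][heads[1]])
    if key not in delta:
        return None
    new_state, write_1, action_1, write_2, action_2 = delta[key]
    for i, (write, action) in enumerate(((write_1, action_1), (write_2, action_2))):
        tapes[i][heads[i]] = write
        if action == 'l' and heads[i] > 0:
            heads[i] -= 1                 # assumed: no move at the left end
        elif action == 'r':
            heads[i] += 1
            if heads[i] == len(tapes[i]):
                tapes[i].append(BLANK)    # unwritten tape is blank
    return new_state

# Copy the a's from tape 1 onto tape 2, then stop.
delta = {
    ('copy', 'a', BLANK): ('copy', 'a', 'r', 'a', 'r'),
    ('copy', BLANK, BLANK): ('done', BLANK, 'n', BLANK, 'n'),
}
tapes = [list('aa'), [BLANK]]
heads = [0, 0]
state = 'copy'
while state == 'copy':
    state = step(delta, state, tapes, heads)
print(state, repr(''.join(tapes[1])))
```

Each tape is a growable list so that a right move past the written portion simply appends a blank cell.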

Beware: there is nothing universal about the two-tape Turing Machine so defined. It becomes universal when given a program that uses the first tape as a compute tape and the second tape as a program tape. This is called a Harvard Architecture machine, where program and data are stored separately. It is named after the Harvard Mark I, a computer installed at Harvard in 1944. It contrasts with a Princeton Architecture machine (also called a von Neumann architecture), which has a single store for both program and data.

So there will be two kinds of compilers. First, there is a description of a two-tape Turing Machine; a parser turns that description into a device that calculates according to it. Consider this to be the microcode inside a CPU. Second, there is a compiler that takes a description of a one-tape Turing Machine and creates a long string that is given as one of the two inputs to the universal Turing Machine.


### Two Tape Turing Machine Description Syntax

The TM is described by a multiline string, with the format:

- **Comments:**
 - If the first character of the line after whitespace is #, the entire line is a comment.
- **Stanzas:**
 - Stanzas begin with a tag-name in column 1, a colon, and an argument.
 - Stanzas continue with a non-empty line beginning with whitespace.
- **Start:** 
 - Begins with tag "start". 
 - Argument is the start state.
 - There must be no continuation lines. 
- **Accept:**
 - Begins with tag "accept".
 - Argument is an accept state.
 - Each continuation line, if any, is an accept state.
- **Reject:**
 - Begins with tag "reject".
 - Argument is a reject state.
 - Each continuation line, if any, is a reject state.
- **State:**
 - Begin with the tag "state"
 - Argument is the state name that applies for the stanza
 - Each continuation line is a transition
 - Syntax of a continuation line:
   - *read_symbol_1 read_symbol_2 write_symbol_1 action_1 write_symbol_2 action_2 state*.
 - Wildcard matching using the '%' symbol. 
   - if there is no direct match with the two tape symbols, wildcards will be tried.
   - first single wildcards will be tried. a match on tape symbols "a b" will look for
     rules "% b" and "a %".
     - if a rule "% b" is found, then the wildcard is set to a, and the rule is applied.
     - if a rule "a %" is found, then the wildcard is set to b, and the rule is applied.
     - if both are found, exactly one of them is applied. 
   - if no single wildcard is found, and the tape symbols are equal, a double wildcard will be tried.
     - if the rule "% %" is found, then the wild card is set to the common value of the two tape symbols
   - if a match occurs and a write symbol is '%', the write symbol is set to the wildcard
 - Action must be one of l, r, or n
 - Unwritten tape is blank.
 - Use _ for the blank symbol in cases where the syntax requires non-white space.
 - The symbols ":" and ";" are intended as special marks. One can be used to mark the left end of the tape. 
 - Left on the left end of the tape leaves the head position unchanged.
- **To do**
 - The verbose variant actions L, R, N. 
 - What the verbose actions print out will depend on whether it is a simulated or an actual action.
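The wildcard lookup order described in the State rules above can be sketched as a small standalone function. The name `resolve` and the rule-table layout are assumptions made for illustration. When both single-wildcard rules exist, this sketch prefers the "% b" form, which is one way to satisfy the requirement that exactly one of them is applied.

```python
# Illustrative sketch only: `resolve` is a hypothetical helper, and the rule
# table maps (state, read_1, read_2) -> (new_state, write_1, action_1, write_2, action_2).
def resolve(rules, state, sym_1, sym_2):
    """Return the matching rule with '%' write symbols filled in, or None."""
    candidates = [
        ((state, sym_1, sym_2), None),   # direct match, no wildcard value
        ((state, '%', sym_2), sym_1),    # "% b": wildcard takes the tape 1 symbol
        ((state, sym_1, '%'), sym_2),    # "a %": wildcard takes the tape 2 symbol
    ]
    if sym_1 == sym_2:
        candidates.append(((state, '%', '%'), sym_1))  # double wildcard
    for key, wild in candidates:
        if key in rules:
            new_state, write_1, action_1, write_2, action_2 = rules[key]
            if wild is not None:
                write_1 = wild if write_1 == '%' else write_1
                write_2 = wild if write_2 == '%' else write_2
            return (new_state, write_1, action_1, write_2, action_2)
    return None

rules = {
    ('find_2', 'c', 'c'): ('a', 'c', 'n', 'c', 'n'),
    ('find_2', 'c', '%'): ('find_2', 'c', 'n', '%', 'r'),
    ('same', '%', '%'): ('same', '%', 'r', '%', 'r'),
}
print(resolve(rules, 'find_2', 'c', 'b'))
print(resolve(rules, 'same', 'x', 'x'))
```

Listing the candidate keys in priority order keeps the precedence (direct, single wildcard, double wildcard) visible in one place.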

### Simulation and Program Tape

The two-tape machine uses one tape as a read-only program tape, on which a "compiled binary" is written. The simulation consults that compiled program while working read-write on the other tape, the work tape.

The work tape is divided into two areas. (Note to 2020 students - other approaches used a three-area work tape. I have combined the first two areas. I hope this makes things less complicated.)

The first area contains a number in binary followed by a sequence of characters. That string alternately represents the input and the output of the delta function (state transitions). When the string is the input, a match is searched for on the read-only tape, and when one is found, what follows the match on the read-only tape is copied over the input.

The second area of the tape contains a replica of what would be on the work tape of the simulated TM, though not exactly: each cell of the original becomes two cells on the simulator. The first of the two cells is either blank or contains a 1. There is exactly one 1 in this region of the work tape, and it marks where the head rests on the simulated tape.
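To make the two-area layout concrete, here is a hedged sketch that builds and takes apart a work tape. It assumes the layout `::` + fixed-width binary state + two scratch cells + `:` + (mark, symbol) pairs, with a 9-bit state field; the function names `encode_work_tape` and `decode_work_tape` are hypothetical and exist only for this illustration.

```python
# Illustrative sketch only: both helpers and the 9-bit default are assumptions.
def encode_work_tape(symbols, state=0, bits=9):
    """Lay out a work tape with the head mark on the first simulated cell."""
    staging = '::' + format(state, '0{}b'.format(bits)) + '  ' + ':'
    cells = ''.join(('1' if i == 0 else ' ') + s for i, s in enumerate(symbols))
    return staging + cells

def decode_work_tape(work_tape, bits=9):
    """Split a work tape into (state number, scratch cells, simulated tape, head index)."""
    if work_tape[:2] != '::' or work_tape[4 + bits] != ':':
        raise ValueError('unexpected work tape layout')
    state = int(work_tape[2:2 + bits], 2)
    scratch = work_tape[2 + bits:4 + bits]
    cells = work_tape[5 + bits:]
    marks, symbols = cells[0::2], cells[1::2]   # first cell of each pair is the mark
    return state, scratch, symbols, marks.index('1')

tape = encode_work_tape('abc')
print(repr(tape))
print(decode_work_tape(tape))
```

Slicing with a stride of two separates the head marks from the simulated symbols, which mirrors the "two cells per original cell" rule.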



In [1]:
import string
import sys
import os
import argparse
import re
import math

#
# tm-universal.py
#
# author: bjr
# date: 1 apr 2020
# last update: 1 apr 2020
#
# copyright: Creative Commons. See http://www.cs.miami.edu/home/burt
#


#
# BETA VERSION 
#

class TuringMachineTwoTape:

    def __init__(self,verbose="none"):
        self.start_state = ""
        self.accept_states = set()
        self.reject_states = set()
        self.transitions = {}
        self.current_state = ""
        self.step_counter = 0
        self.all_actions = ["r","l","n"]
        self.verbose_levels = {"none":0, "verbose":1, "debug":2}
        self.tape_1 = [' ']
        self.tape_2 = [' ']
        self.position_1 = 0
        self.position_2 = 0
        self.verbose = self.verbose_levels[verbose]

    def set_start_state(self,state):
        self.start_state = state

    def set_tapes(self,tape_string_1,tape_string_2):
        # change '_' to ' '
        self.tape_1 = [' ' if symbol=='_' else symbol  for symbol in tape_string_1]
        self.tape_2 = [' ' if symbol=='_' else symbol  for symbol in tape_string_2]
        
    def str_tapes(self):
        if self.tape_1[self.position_1]==' ': self.tape_1[self.position_1] = '_'
        else:
            self.tape_1[self.position_1] = self.tape_1[self.position_1].upper()
        if self.tape_2[self.position_2] == ' ': self.tape_2[self.position_2]  = '_'
        else:
            self.tape_2[self.position_2] = self.tape_2[self.position_2].upper()
        return ( ''.join(self.tape_1), ''.join(self.tape_2))
        
    def set_verbose(self,verbose):
        self.verbose = 0
        if verbose in self.verbose_levels:
            self.verbose = self.verbose_levels[verbose]

    def add_accept_state(self,state):
        self.accept_states.add(state)

    def add_reject_state(self,state):
        self.reject_states.add(state)
    
    def get_current_state(self):
        return self.current_state

    def add_transition(self,state_from,read_symbol_1,read_symbol_2,
                       write_symbol_1,action_1,write_symbol_2,action_2,state_to):
        
        if self.verbose >= self.verbose_levels['debug']:
            print("adding transition:", 
                  state_from, read_symbol_1, read_symbol_2, write_symbol_1, action_1,
                  write_symbol_2, action_2, state_to )

        (read_symbol_1,read_symbol_2,write_symbol_1,write_symbol_2) = (' ' if sym=='_' 
            else sym for sym in (read_symbol_1,read_symbol_2,write_symbol_1,write_symbol_2))

        if action_1 not in self.all_actions or action_2 not in self.all_actions:
            # return something instead, nobody likes a chatty program
            return "WARNING: unrecognized action, skipping."
        x = (state_from, read_symbol_1, read_symbol_2)
        if x in self.transitions:
            return "WARNING: multiple outgoing states not allowed for DFA's, skipping."
        self.transitions[x] = (state_to,write_symbol_1,action_1,write_symbol_2,action_2)
        return None

    def restart(self,tape_string_1,tape_string_2):
        self.current_state = self.start_state
        self.step_counter = 1
        self.position_1 = 0
        self.position_2 = 0
        if len(tape_string_1)==0 : tape_string_1 = ' '
        if len(tape_string_2)==0 : tape_string_2 = ' '
        self.set_tapes(tape_string_1,tape_string_2)

    def transition_wildcard(self,current_state,read_1,read_2):
                            
        def replace_wildcards(write_1,write_2,wild_card):
            return ( wild_card if write_1=='%' else write_1,wild_card if write_2=='%' else write_2)

        # direct match 
        x = (current_state,read_1,read_2)
        if x in self.transitions:
            (new_state, write_1, action_1, write_2, action_2) = self.transitions[x]
            return (new_state, write_1, action_1, write_2, action_2)
            
        # single wildcard
        wild_card = read_1
        x = (current_state,'%',read_2)
        if x in self.transitions:
            (new_state, write_1, action_1, write_2, action_2) = self.transitions[x]
            ( write_1, write_2 ) = replace_wildcards( write_1, write_2, wild_card)
            return (new_state, write_1, action_1, write_2, action_2)

        # single wildcard
        wild_card = self.tape_2[self.position_2]
        x = (current_state,self.tape_1[self.position_1],'%')
        if x in self.transitions:
            (new_state, write_1, action_1, write_2, action_2) = self.transitions[x]
            ( write_1, write_2 ) = replace_wildcards( write_1, write_2, wild_card)
            return (new_state, write_1, action_1, write_2, action_2)

        # double wildcard
        if self.tape_1[self.position_1]==self.tape_2[self.position_2]:
            wild_card = self.tape_2[self.position_2]
            x = (current_state,'%','%')
            if x in self.transitions:
                (new_state, write_1, action_1, write_2, action_2) = self.transitions[x]
                ( write_1, write_2 ) = replace_wildcards( write_1, write_2, wild_card)
                return (new_state, write_1, action_1, write_2, action_2)

        # no match  
        if self.verbose>=self.verbose_levels['debug']:
            print('current state:', current_state, 
                  'current symbol 1: |', self.tape_1[self.position_1],'| current position 1: ', self.position_1,
                  'current symbol 2: |', self.tape_2[self.position_2],'| current position 2: ', self.position_2
                 )
        return (current_state,self.tape_1[self.position_1],self.tape_2[self.position_2])

    def step_transition(self):

        res = self.transition_wildcard(self.current_state, self.tape_1[self.position_1], self.tape_2[self.position_2])
        if len(res)== 3:
            return res
        
        (new_state, write_1, action_1, write_2, action_2) = res
        self.current_state = new_state
        self.tape_1[self.position_1] = write_1
        self.tape_2[self.position_2] = write_2
        
        if action_1 == 'l' and self.position_1>0: self.position_1 -= 1
        if action_1 == 'r':
            self.position_1 += 1
            if self.position_1==len(self.tape_1): self.tape_1[self.position_1:] = ' '
        if action_1 == 'n': pass
        
        if action_2 == 'l' and self.position_2>0: self.position_2 -= 1
        if action_2 == 'r':
            self.position_2 += 1
            if self.position_2==len(self.tape_2): self.tape_2[self.position_2:] = ' '
        if action_2 == 'n': pass

        if self.verbose >= self.verbose_levels['debug']:
            print("\t", self.step_counter, "\t", new_state, write_1, action_1, write_2, action_2)
        self.step_counter += 1
        return None

    def compute_tm(self,tape_string_1,tape_string_2,step_limit=0):
        self.restart(tape_string_1,tape_string_2)
        step = 0
            
        stop_states = self.accept_states.union(self.reject_states)
        while self.current_state not in stop_states:
            res = self.step_transition()
            if res: return ("no transition",res)
            step += 1
            if step > step_limit:
                (t1,t2) = self.str_tapes()
                return ("step limit",step,t1,t2)
            if self.verbose >= self.verbose_levels['debug']:
                print(step, self.current_state, self.position_1, self.tape_1, self.position_2, self.tape_2 )

        cause = "reject"
        if self.current_state in self.accept_states: cause = "accept"
        (t1,t2) = self.str_tapes()
        return (cause,t1,t2)


    def print_tape(self):
        t, p = self.tape_1, self.position_1
        s1 = ''.join(t[:p] + ['_'] + [t[p]] + ['_'] + t[p+1:])
        t, p = self.tape_2, self.position_2
        s2 = ''.join(t[:p] + ['_'] + [t[p]] + ['_'] + t[p+1:])
        print("step:",self.step_counter, "state:", self.current_state,"\t",s1,s2)
    
    def print_tm(self):
        print("\nstart state:\n\t",self.start_state)
        print("accept states:\n\t",self.accept_states)
        print("reject states:\n\t",self.reject_states)
        print("transitions:")
        for t in self.transitions:
            print("\t",t,"->",self.transitions[t])
        # print("tape:\n\t",self.tape)
        
### end class TuringMachineTwoTape


class MachineParserTwoTape:

    @staticmethod
    def turing(tm_obj, fa_string):
        """
        Code to parse a Turing Machine description into the Turing Machine object.
        """
        
        fa_array = fa_string.splitlines()
        line_no = 0 
        current_state = ""
        in_state_read = False
        in_accept_read = False
        in_reject_read = False

        for line in fa_array:
            while True:

                # comment lines are fully ignored
                if re.search('^\s*#',line):
                    break

                if re.search('^\s+',line):

                    if in_state_read:
                        m = re.search('\s+(\w|:|;|%)\s+(\w|:|;|%)\s+(\w|:|;|%)\s+(\w)\s+(\w|:|;|%)\s+(\w)\s+(\w+)',line)
                        if m:
                            res = tm_obj.add_transition(current_state,
                                    m.group(1),m.group(2),m.group(3),m.group(4),m.group(5),m.group(6),m.group(7))
                            if res: 
                                print(res)
                            break

                    if in_accept_read:
                        m = re.search('\s+(\w+)',line)
                        if m:
                            tm_obj.add_accept_state(m.group(1))
                            break

                    if in_reject_read:
                        m = re.search('\s+(\w+)',line)
                        if m:
                            tm_obj.add_reject_state(m.group(1))
                            break

                in_state_read = False
                in_accept_read = False
                in_reject_read = False

                # blank lines do end multiline input
                if re.search('^\s*$',line):
                    break ;

                m = re.search('^start:\s*(\w+)',line)
                if m:
                    tm_obj.set_start_state(m.group(1))
                    break

                m = re.search('^accept:\s*(\w+)',line)
                if m:
                    tm_obj.add_accept_state(m.group(1))
                    in_accept_read = True
                    break

                m = re.search('^reject:\s*(\w+)',line)
                if m:
                    tm_obj.add_reject_state(m.group(1))
                    in_reject_read = True
                    break

                m = re.search('^state:\s*(\w+)',line)
                if m:
                    in_state_read = True
                    current_state = m.group(1)
                    break

                print(line_no,"warning: unparsable line, dropping: ", line)
                break

            line_no += 1
        return

### end class MachineParserTwoTape


class TuringMachineCompiler:

    @staticmethod
    def compiler(fa_string,n_states=512):
        """
        Code to compile a Turing Machine description into an object-code string for the program tape.
        """
        
        fa_array = fa_string.splitlines()
        line_no = 0 
        current_state = ""
        in_state_read = False
        in_accept_read = False
        in_reject_read = False

        state_numbering = {}
        next_avail_number = 3
        obj_code = ';;'
        bit_field = int(math.log2(n_states))
        bin_format = '{'+':0{}b'.format(bit_field)+'}'
        delta_format = bin_format +' {} {}'+bin_format+'{}{}{}{};'

        for line in fa_array:
            while True:

                # comment lines are fully ignored
                if re.search('^\s*#',line):
                    break

                if re.search('^\s+',line):

                    if in_state_read:
                        m = re.search('\s+(\w|:|;|%)\s+(\w|:|;|%)\s+(\w|:|;|%)\s+(\w)\s+(\w|:|;|%)\s+(\w)\s+(\w+)',line)
                        if m:
                            (rs1,rs2,ws1,a1,ws2,a2,st) = (m.group(i) for i in range(1,8))
                            if st not in state_numbering:
                                state_numbering[st] = next_avail_number
                                next_avail_number += 1
                            (rs1,rs2,ws1,ws2) =(' ' if sym=='_' else sym for sym in (rs1,rs2,ws1,ws2))
                            obj_code += delta_format.format(state_numbering[current_state],
                                                rs1,rs1,state_numbering[st],ws1,a1,ws2,a2)
                            break

                    if in_accept_read:
                        m = re.search('\s+(\w+)',line)
                        if m:
                            state_numbering[m.group(1)] = 1 # an accept state
                            break

                    if in_reject_read:
                        m = re.search('\s+(\w+)',line)
                        if m:
                            state_numbering[m.group(1)] = 2 # a reject state
                            break

                in_state_read = False
                in_accept_read = False
                in_reject_read = False

                # blank lines do end multiline input
                if re.search('^\s*$',line):
                    break ;

                m = re.search('^start:\s*(\w+)',line)
                if m:
                    state_numbering[m.group(1)] = 0 # a start state
                    break

                m = re.search('^accept:\s*(\w+)',line)
                if m:
                    state_numbering[m.group(1)] = 1 # an accept state
                    in_accept_read = True
                    break

                m = re.search('^reject:\s*(\w+)',line)
                if m:
                    state_numbering[m.group(1)] = 2 # a reject state
                    in_reject_read = True
                    break

                m = re.search('^state:\s*(\w+)',line)
                if m:
                    in_state_read = True
                    current_state = m.group(1)
                    if current_state not in state_numbering:
                        state_numbering[current_state] = next_avail_number
                        next_avail_number += 1
                    break

                print(line_no,"warning: unparsable line, dropping: ", line)
                break

            line_no += 1
        return obj_code

### end class TuringMachineCompiler


In [2]:
tm_1="""# copy from tape 1 to tape 2 until the first blank
start: s
accept: a
reject: r

state: s
    : ; : r ; r loop # skip over end marker
    
state: loop
    a _ a r a r loop
    b _ b r b r loop
    _ _ _ r _ r a

"""

tm_2="""# check if tape 1 and 2 are equal (up to the first common blank)
start: s
accept: a
reject: r

state: s
    : ; : r ; r loop # skip over end marker
    
state: loop
    a a a r a r loop
    b b b r b r loop
    _ _ _ r _ r a
    a b a r b r r
    b a b r a r r
"""


def make_work_tape(tape,n_states=512):
    bits = int(math.log2(n_states))
    work_tape = '::'+'0'*bits+' '*2+':'
    head = '1'
    for sym in tape:
        if sym=='_':
            sym = ' '
        work_tape += head + sym
        head = ' '
    return work_tape

def build_and_run(tm_description, tape1, tape2, verbose='none'):
    tm = TuringMachineTwoTape(verbose)
    MachineParserTwoTape.turing(tm,tm_description)
    if verbose!='none':
        tm.print_tm()
    return tm.compute_tm(tape1,tape2,step_limit=10*(len(tape1)+len(tape2)+5)**2)


r = build_and_run(tm_1,":abbaaab",";", verbose="verbose")
print(r)
r = build_and_run(tm_2,":abba",";abba", verbose="verbose")
print(r)
r = build_and_run(tm_2,":abaa",";abba", verbose="verbose")
print(r)

print (make_work_tape('abcabc'))        




start state:
	 s
accept states:
	 {'a'}
reject states:
	 {'r'}
transitions:
	 ('s', ':', ';') -> ('loop', ':', 'r', ';', 'r')
	 ('loop', 'a', ' ') -> ('loop', 'a', 'r', 'a', 'r')
	 ('loop', 'b', ' ') -> ('loop', 'b', 'r', 'b', 'r')
	 ('loop', ' ', ' ') -> ('a', ' ', 'r', ' ', 'r')
('accept', ':abbaaab _', ';abbaaab _')

start state:
	 s
accept states:
	 {'a'}
reject states:
	 {'r'}
transitions:
	 ('s', ':', ';') -> ('loop', ':', 'r', ';', 'r')
	 ('loop', 'a', 'a') -> ('loop', 'a', 'r', 'a', 'r')
	 ('loop', 'b', 'b') -> ('loop', 'b', 'r', 'b', 'r')
	 ('loop', ' ', ' ') -> ('a', ' ', 'r', ' ', 'r')
	 ('loop', 'a', 'b') -> ('r', 'a', 'r', 'b', 'r')
	 ('loop', 'b', 'a') -> ('r', 'b', 'r', 'a', 'r')
('accept', ':abba _', ';abba _')

start state:
	 s
accept states:
	 {'a'}
reject states:
	 {'r'}
transitions:
	 ('s', ':', ';') -> ('loop', ':', 'r', ';', 'r')
	 ('loop', 'a', 'a') -> ('loop', 'a', 'r', 'a', 'r')
	 ('loop', 'b', 'b') -> ('loop', 'b', 'r', 'b', 'r')
	 ('loop', ' ', ' ') -> ('a',

In [3]:
tm_wild_1="""# test wildcards
start: s
accept: a
reject: r

# look for a c on the top and bottom tapes, starting from the
# left end

state: s
    : ; : r ; r find_1
    
state: find_1
    c ; c n ; n find_2
    % ; % r ; n find_1
    
state: find_2
    c c c n c n a
    c % c n % r find_2

"""

tape_1 = "::aabbaabcaa"
tape_2 = ";;aabbcabc"
build_and_run(tm_wild_1,tape_1,tape_2)


('accept', '::aabbaabCaa', ';;aabbCabc')

In [4]:
TuringMachineCompiler.compiler(tm_1,256)

';;00000000 : :00000011:r;r;00000011 a a00000011arar;00000011 b b00000011brbr;00000011    00000001 r r;'

In [5]:
TuringMachineCompiler.compiler(tm_2,64)

';;000000 : :000011:r;r;000011 a a000011arar;000011 b b000011brbr;000011    000001 r r;000011 a a000010arbr;000011 b b000010brar;'
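Reading the two outputs above, the object code appears to be a `;;` header followed by fixed-width records of the form: state bits, space, read symbol, space, read symbol, next-state bits, then four characters (write, action, write, action) and a closing `;`. The sketch below decodes a string under that assumption. The helper name `decode_object_code` is hypothetical, and the record layout is inferred from the compiler's format string rather than documented. Records are sliced by width instead of split on `;`, because `;` can also occur as a tape symbol.

```python
# Illustrative sketch only: `decode_object_code` is a hypothetical helper and
# the record layout is inferred from the compiler output shown above.
def decode_object_code(obj_code, bits):
    """Split compiled object code into per-transition tuples."""
    if not obj_code.startswith(';;'):
        raise ValueError('missing ;; header')
    body = obj_code[2:]
    width = 2 * bits + 9          # two state fields plus nine single characters
    if len(body) % width != 0:
        raise ValueError('object code is not a whole number of records')
    records = []
    for i in range(0, len(body), width):
        rec = body[i:i + width]
        state_from = int(rec[:bits], 2)
        read_1, read_2 = rec[bits + 1], rec[bits + 3]
        state_to = int(rec[bits + 4:2 * bits + 4], 2)
        write_1, action_1, write_2, action_2 = rec[2 * bits + 4:2 * bits + 8]
        records.append((state_from, read_1, read_2, state_to,
                        write_1, action_1, write_2, action_2))
    return records

# The output of cell In [4], written one record per line.
sample = (';;'
          '00000000 : :00000011:r;r;'
          '00000011 a a00000011arar;'
          '00000011 b b00000011brbr;'
          '00000011    00000001 r r;')
for record in decode_object_code(sample, 8):
    print(record)
```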

In [6]:
test_tm="""# no transition is go
start: s
reject: r
accept: a

state: s
    _ _ _ n _ n s

"""

tape_1 = "aaa"
tape_2 = "bbb"
build_and_run(test_tm,tape_1,tape_2)

('no transition', ('s', 'a', 'b'))

In [8]:
test_gather="""# gather program for the universal TM
# heads are over left edge of tape
 
# program tape is of the form ;;.* (we only care about the first two symbols)

# work tape is of the form ::(0|1)+__:((_|1)s)* where s ranges over the symbol set
#   what does the work tape mean?
#   - the two :: mark the left end of the tape, and the single : marks the end of the staging area
#     there are no other :'s on the tape
#   - the binary string is the number of a state. it will be formatted fixed width,
#     right justified and 0 padded.
#   - the two blank spaces before the : will be used for the state transition simulation
#   - the (_|1) is the head marker (maybe i could reuse ; for this ...). there should be
#     exactly one 1 on the tape

start: s
reject: r
accept: a

state: s
    : ; : r ; r findone

state: findone
    % ; % r ; n findone
    1 ; 1 r ; n getsymbol
    
state: getsymbol
    % ; % l % l findstaging  # store the symbol
    
state: findstaging
    % ; % l ; n findstaging
    : ; : l ; r writesymbol
    
state: writesymbol
    _ % % l ; l clean_1
    
state: clean_1
    % ; _ l ; n clean_2
    
state: clean_2
    % ; % l ; n clean_2
    : ; : l ; n a
    

"""

tape = make_work_tape("abcde")
build_and_run(test_gather,tape,';;')

('accept', '::000000000 a:1a b c d e', ';;')
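As a cross-check on the result above, the effect of the gather program can be written directly on the work-tape string: find the head mark, copy the symbol beside it into the staging cell just left of the single `:`, and blank the cell before that. This is a plain Python sketch of the intended effect, not of the Turing Machine itself, and the function name `gather` is an assumption.

```python
# Illustrative sketch only: `gather` is a hypothetical reference function that
# mimics the effect of the test_gather machine on the work tape string.
def gather(work_tape):
    """Copy the symbol under the simulated head into the staging area."""
    cells = list(work_tape)
    staging_end = work_tape.index(':', 2)          # the single ':' after the '::' prefix
    # Head marks sit in the first cell of each (mark, symbol) pair.
    for mark in range(staging_end + 1, len(cells), 2):
        if cells[mark] == '1':
            break
    else:
        raise ValueError('no head mark found')
    cells[staging_end - 1] = cells[mark + 1]       # store the symbol under the head
    cells[staging_end - 2] = ' '                   # clear the other scratch cell
    return ''.join(cells)

print(repr(gather('::000000000  :1a b c d e')))
```

Scanning only the mark positions avoids mistaking a simulated tape symbol `1` for the head mark.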