# RE operations

You may watch a portion of the following YouTube video (fast-forwarding as needed) to obtain
an overview of RE-related operations.

We will discuss:

* REs for some simple languages

* RE to NFA conversion

* NFA to RE conversion


In [None]:
# This YouTube video walks through many RE operations
from IPython.display import YouTubeVideo
YouTubeVideo('eXjIYsalFEQ')


In [None]:
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
import sys

# -- Detect if in Own Install or in Colab
try:
    import google.colab
    OWN_INSTALL = False
except ImportError:
    OWN_INSTALL = True
    
if OWN_INSTALL:
    
  #---- Leave these definitions ON if running on laptop
  #---- Else turn OFF by putting them between ''' ... '''

  sys.path[0:0] = ['../../../../..',  '../../../../../3rdparty',  
                   '../../../..',  '../../../../3rdparty',  
                   '../../..',     '../../../3rdparty', 
                   '../..',        '../../3rdparty',
                   '..',           '../3rdparty' ]

else: # In colab
  ! if [ ! -d Jove ]; then git clone https://github.com/anon-Jove/Jove Jove; fi
  sys.path.append('./Jove')
  sys.path.append('./Jove/jove')

# -- common imports --
from jove.DotBashers import *
from jove.Def_DFA import *
from jove.Def_NFA import *
from jove.Def_RE2NFA import *
from jove.Def_NFA2RE import *
from jove.Def_md2mc import *
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# We will develop REs for some example languages

In [None]:
# RE for the language of odd 1's (over the alphabet {0,1})

RE_Odd1s  = "(0* 1 0* (1 0* 1 0*)*)"
NFA_Odd1s = re2nfa(RE_Odd1s)
DO_Odd1s  = dotObj_dfa(min_dfa(nfa2dfa(NFA_Odd1s)))
DO_Odd1s

In [None]:
# RE for the language of exactly three 0's 

RE_Ex3z = "1* 0 1* 0 1* 0 1*"
NFA_Ex3z = re2nfa(RE_Ex3z)
DO_Ex3z  = dotObj_dfa(min_dfa(nfa2dfa(NFA_Ex3z)))
DO_Ex3z

In [None]:
# RE for odd 1's or exactly three 0's

RE_O13z  = "0* 1 0* (1 0* 1 0*)* + 1* 0 1* 0 1* 0 1* "
NFA_O13z = re2nfa(RE_O13z)
MD_O13z  = min_dfa(nfa2dfa(NFA_O13z))
DO_O13z  = dotObj_dfa(MD_O13z)
DO_O13z
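
The three REs above can also be sanity-checked without Jove. The sketch below transcribes them into Python's `re` syntax (an assumption: spaces are dropped and `+` becomes `|`, with `|` binding loosest) and compares each against a direct counting predicate on every bit-string up to length 10. The helper names `bitstrings` and `disagreements` are illustrative only.

```python
import re
from itertools import product

# Python-syntax transcriptions (assumed equivalent): spaces dropped, '+' becomes '|'.
ODD_1S = r"0*10*(10*10*)*"
EX_3Z = r"1*01*01*01*"
CASES = {
    "odd 1's": (ODD_1S, lambda s: s.count("1") % 2 == 1),
    "exactly three 0's": (EX_3Z, lambda s: s.count("0") == 3),
    "union": (f"{ODD_1S}|{EX_3Z}",
              lambda s: s.count("1") % 2 == 1 or s.count("0") == 3),
}

def bitstrings(max_len):
    """Yield every string over {0,1} of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def disagreements(pattern, predicate, max_len=10):
    """Strings on which the regex and the counting predicate differ."""
    rx = re.compile(pattern)
    return [s for s in bitstrings(max_len)
            if bool(rx.fullmatch(s)) != predicate(s)]

results = {name: disagreements(p, pred) for name, (p, pred) in CASES.items()}
for name, bad in results.items():
    print(name, "->", len(bad), "disagreements")
```

An empty list for each case means the RE and the predicate agree on all strings tested; it is evidence, not a proof, of equivalence.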

# Examples of RE to NFA conversion

In [None]:
dotObj_dfa(min_dfa(nfa2dfa(re2nfa("''"))))

In [None]:
# DFA for the language of "aa" generated via re2nfa, nfa2dfa and min_dfa

D1 = min_dfa(nfa2dfa(re2nfa("aa")))
dotObj_dfa(D1)

In [None]:
# DFA for the language of "bb" generated via re2nfa, nfa2dfa and min_dfa

D2 = min_dfa(nfa2dfa(re2nfa("bb")))
dotObj_dfa(D2)

In [None]:
D1

In [None]:
D2

In [None]:
# The union of the DFA of the aforesaid languages, i.e. {aa,bb}

D1or2 = min_dfa(union_dfa(D1,D2))
D1or2p = pruneUnreach(D1or2)
dotObj_dfa(D1or2)

In [None]:
# The intersection of {aa} and {bb}

D1and2 = min_dfa(intersect_dfa(D1,D2))
D1and2p = pruneUnreach(D1and2)
dotObj_dfa(D1and2)

In [None]:
dotObj_dfa_w_bh(D1and2p, FuseEdges=True)
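
Union and intersection of DFAs are commonly built with the product construction: run both machines in lockstep and choose the accepting pairs with "or" or "and". The following is a minimal sketch of that idea in plain Python. The dictionary layout and the helpers `make_dfa`, `product_dfa`, and `accepts` are assumptions made for illustration and are not claimed to match how `union_dfa` or `intersect_dfa` are implemented in Jove.

```python
def make_dfa(word, alphabet):
    """Total DFA accepting exactly `word`; state i = matched prefix length, 'dead' = trap."""
    states = list(range(len(word) + 1)) + ["dead"]
    delta = {}
    for q in states:
        for c in alphabet:
            if q != "dead" and q < len(word) and word[q] == c:
                delta[(q, c)] = q + 1
            else:
                delta[(q, c)] = "dead"
    return {"Q": set(states), "Sigma": set(alphabet), "Delta": delta,
            "q0": 0, "F": {len(word)}}

def product_dfa(d1, d2, accept):
    """Reachable product of two DFAs over the same alphabet.

    `accept(in_F1, in_F2)` decides which state pairs are final.
    """
    sigma = d1["Sigma"]
    start = (d1["q0"], d2["q0"])
    delta, seen, todo = {}, {start}, [start]
    while todo:
        p, q = todo.pop()
        for c in sigma:
            nxt = (d1["Delta"][(p, c)], d2["Delta"][(q, c)])
            delta[((p, q), c)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    final = {(p, q) for (p, q) in seen if accept(p in d1["F"], q in d2["F"])}
    return {"Q": seen, "Sigma": sigma, "Delta": delta, "q0": start, "F": final}

def accepts(d, s):
    """Run the DFA on s and report whether it ends in a final state."""
    q = d["q0"]
    for c in s:
        q = d["Delta"][(q, c)]
    return q in d["F"]

A = make_dfa("aa", "ab")
B = make_dfa("bb", "ab")
union_ab = product_dfa(A, B, lambda x, y: x or y)
inter_ab = product_dfa(A, B, lambda x, y: x and y)
print(accepts(union_ab, "aa"), accepts(union_ab, "bb"), accepts(union_ab, "ab"))
print(len(inter_ab["F"]))
```

The intersection has no reachable accepting pair because no string is both `aa` and `bb`, which is why the corresponding DFA above is best viewed with its black-hole state drawn.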

In [None]:
# Another example of the power of the conversions and checking facilities for DFA equivalence


d1=nfa2dfa(re2nfa("abcde"))
d2=nfa2dfa(re2nfa("abced"))
langeq_dfa(d1,d2,True)

## Counterexamples

When two DFAs are not equivalent, one can obtain a counterexample that helps us debug. The counterexample
is a sequence of pairs of DFA states.

```
Here the path is

("{'St1'}", "{'St1'}"),                  -- this pair of states exists in the DFAs below
("{'St2', 'St3'}", "{'St2', 'St3'}"),    -- this pair also exists
("{'St4', 'St5'}", "{'St4', 'St5'}"),    -- so does this
("{'St6', 'St7'}", "{'St6', 'St7'}"),    -- so does this  
('BH', "{'St8', 'St9'}"),                -- oops, one machine goes into the BH, the other not
('BH', "{'St10'}")                       -- this too; one machine goes to St10, the other is BH

```
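
One way to produce such a path of state pairs is a breadth-first search over pairs, stopping at the first pair in which exactly one component is accepting. The sketch below illustrates this under stated assumptions: `word_dfa`, `step`, and `find_counterexample` are hypothetical helpers, missing transitions fall into a state named `BH`, and nothing here is claimed to mirror the internals of `langeq_dfa`.

```python
from collections import deque

BH = "BH"  # black-hole (trap) state used for missing transitions

def word_dfa(word):
    """Partial DFA for exactly `word`; state i means i symbols matched."""
    delta = {(i, c): i + 1 for i, c in enumerate(word)}
    return {"Delta": delta, "q0": 0, "F": {len(word)}}

def step(d, q, c):
    """One transition, falling into BH when no move is defined."""
    return BH if q == BH else d["Delta"].get((q, c), BH)

def find_counterexample(d1, d2, sigma):
    """BFS over state pairs; return (word, path of pairs), or None if equivalent."""
    start = (d1["q0"], d2["q0"])
    parent = {start: None}  # pair -> (previous pair, symbol read)
    queue = deque([start])
    while queue:
        pair = queue.popleft()
        p, q = pair
        if (p in d1["F"]) != (q in d2["F"]):
            word, path = [], []
            while pair is not None:
                path.append(pair)
                prev = parent[pair]
                if prev is None:
                    pair = None
                else:
                    word.append(prev[1])
                    pair = prev[0]
            return "".join(reversed(word)), list(reversed(path))
        for c in sigma:
            nxt = (step(d1, p, c), step(d2, q, c))
            if nxt not in parent:
                parent[nxt] = (pair, c)
                queue.append(nxt)
    return None

cex_word, cex_path = find_counterexample(word_dfa("abcde"), word_dfa("abced"), "abcde")
print(cex_word)
for pair in cex_path:
    print(pair)
```

As in the path shown above, the pairs agree for the common prefix and then one component drops into `BH` while the other continues to an accepting state.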

In [None]:
dotObj_dfa(d1)

In [None]:
dotObj_dfa(d2)

In [None]:
# Two other RE-based languages that are not equivalent
# We can read the counterexample the same way

d1a=nfa2dfa(re2nfa("aa*+bc"))
d2a=nfa2dfa(re2nfa("a(a*+bc)"))
langeq_dfa(d1a,d2a,True)

In [None]:
dotObj_dfa(d1a)

In [None]:
dotObj_dfa(d2a)

In [None]:
d1b=nfa2dfa(re2nfa("aaa*+aa*bc+bcaa*+bcbc"))
d2b=nfa2dfa(re2nfa("(aa*+bc)(aa*+bc)"))
langeq_dfa(d1b,d2b,True)

In [None]:
dotObj_dfa(d1b)

In [None]:
dotObj_dfa(d2b)

In [None]:
iso_dfa(d1b,d2b)

In [None]:
d1c=min_dfa(d1b)

In [None]:
d2c=min_dfa(d2b)

In [None]:
iso_dfa(d1c,d2c)

In [None]:
dotObj_dfa(d1c)

In [None]:
dotObj_dfa(d2c)

In [None]:
d1d=nfa2dfa(re2nfa("aaa*+aa*bc+bcaaa*+bcbc"))
d2d=nfa2dfa(re2nfa("(aa*+bc)(aa*+bc)"))
langeq_dfa(d1d,d2d,True)

In [None]:
d1d=nfa2dfa(re2nfa("a a a*+a a* b c+ b c a a a*+b c b c"))
d2d=nfa2dfa(re2nfa("(a a*+b c)(a a*+b c)"))
langeq_dfa(d1d,d2d,True)

In [None]:
dotObj_dfa(d1d)

In [None]:
dotObj_dfa(d2d)

In [None]:
d1d=nfa2dfa(re2nfa("james*+bond*"))
dotObj_dfa(d1d)

In [None]:
d1d=nfa2dfa(re2nfa("ja mes*+bo nd*"))
dotObj_dfa(d1d)

In [None]:
d1d=nfa2dfa(re2nfa("''"))
dotObj_dfa(d1d)

In [None]:
test = md2mc(src="File", fname="endsin0101.nfa")
dotObj_nfa(test)

In [None]:
# NFA for 0101 within a Hamming distance of 2
nfamd1 = md2mc(src="File", fname="nfa0101h2.nfa")
dotObj_nfa(nfamd1)

In [None]:
dfamd1=nfa2dfa(nfamd1)
dotObj_dfa(dfamd1)

In [None]:
m1=min_dfa(dfamd1)

In [None]:
m2=min_dfa_brz(dfamd1)

In [None]:
dotObj_dfa(m1)

In [None]:
dotObj_dfa(m2)

In [None]:
iso_dfa(m1,m2)

# Designing DFA that accept within a Hamming Distance

Given a regular language, say (0+1)* 0101 (0+1)* (i.e., all bit-strings containing an occurrence of 0101), let us come up with:

1. An RE that represents strings within a Hamming distance of 2 from strings in this language

2. An NFA that represents strings within a Hamming distance of 2 from strings in this language


In [None]:
h2_0101_re = ("(0+1)* ( (0+1)(0+1)01 +" + 
                      " (0+1)1(0+1)1 +" + 
                      " (0+1)10(0+1) +" + 
                      " 0(0+1)(0+1)1 +" +
                      " 0(0+1)0(0+1) +" +
                      " 01(0+1)(0+1) )" +
              "(0+1)*")

In [None]:
h2_0101_re
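
Each of the six alternatives in the RE wildcards two of the four positions of 0101, one alternative per choice of two positions. A brute-force check of this reading, independent of Jove, is sketched below. It assumes the RE can be transcribed into Python `re` syntax with `[01]` for `(0+1)` and `|` for `+`; `hamming` and `near_0101` are illustrative helpers.

```python
import re
from itertools import product

# The six alternatives of h2_0101_re in Python syntax.
H2_0101 = re.compile(
    r"[01]*(?:[01][01]01|[01]1[01]1|[01]10[01]"
    r"|0[01][01]1|0[01]0[01]|01[01][01])[01]*"
)

def hamming(u, v):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(u, v))

def near_0101(s, budget=2):
    """True if some length-4 window of s is within `budget` flips of 0101."""
    return any(hamming(s[i:i + 4], "0101") <= budget for i in range(len(s) - 3))

bad = [s
       for n in range(10)
       for s in map("".join, product("01", repeat=n))
       if bool(H2_0101.fullmatch(s)) != near_0101(s)]
print(bad)
```

A string is within Hamming distance 2 of some string containing 0101 exactly when one of its own length-4 windows is within distance 2 of 0101, which is the property `near_0101` tests.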

In [None]:
minD_h2_0101_re = min_dfa(nfa2dfa(re2nfa(h2_0101_re)))

In [None]:
DO_minD_h2_0101_re = dotObj_dfa(minD_h2_0101_re)

In [None]:
DO_minD_h2_0101_re

In [None]:
DO_minD_h2_0101_re.source

In [None]:
h2_0101_nfa_md = '''
NFA
!!--------------------------------------------
!! We are supposed to process (0+1)*0101(0+1)*
!! with up to two "dings" allowed
!!
!! Approach: Silently error-correct, but remember
!! each "ding" in a new state name.
!! After two dings, do not error-correct anymore
!!--------------------------------------------

!!-- pattern for (0+1)* is the usual
!!-- no error-correction needed here :-)
I : 0 | 1 -> I

!!-- Now comes the opportunity to exit I via 0101
!!-- The state names are A,B,C,D with ding-count
!!-- Thus A0 is A with 0 dings
!!-- C2 is C with 2 dings; etc

!!-- Ding-less traversal -- how lucky!
I  : 0 -> A0
A0 : 1 -> B0
B0 : 0 -> C0
C0 : 1 -> F
!!-- Phew, finally at F
F  : 0 | 1 -> F

!!-- First ding in any of these cases
I  : 1 -> A1
A0 : 0 -> B1
B0 : 1 -> C1
C0 : 0 -> F  !!-- ding-recording un-nec.; just goto F

!!-- Second ding in any of these cases
A1 : 0 -> B2
B1 : 1 -> C2
C1 : 0 -> F  !!-- ding-recording un-nec.; just goto F

!!-- No more dings allowed!
B2 : 0 -> C2
C2 : 1 -> F

!!-- Allow one-dingers to finish fine
A1 : 1 -> B1
B1 : 0 -> C1
C1 : 1 -> F

'''
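
The "ding-counting" design can be checked by simulating the NFA directly. The sketch below copies the transitions from the markdown description into a plain table and tracks the set of currently reachable states, then compares the result with a brute-force window test on all bit-strings up to length 9. It assumes `I` is the only initial state and `F` the only final state; the helper names are illustrative and do not come from Jove.

```python
from itertools import product

# One "source symbol target" triple per line, transcribed from the description above.
EDGES = """
I 0 I
I 1 I
I 0 A0
A0 1 B0
B0 0 C0
C0 1 F
F 0 F
F 1 F
I 1 A1
A0 0 B1
B0 1 C1
C0 0 F
A1 0 B2
B1 1 C2
C1 0 F
B2 0 C2
C2 1 F
A1 1 B1
B1 0 C1
C1 1 F
"""

DELTA = {}
for line in EDGES.split("\n"):
    if line.strip():
        src, sym, dst = line.split()
        DELTA.setdefault((src, sym), set()).add(dst)

def nfa_accepts(s, start="I", final="F"):
    """Subset simulation: keep the set of states reachable after each symbol."""
    current = {start}
    for c in s:
        current = set().union(*(DELTA.get((q, c), set()) for q in current))
    return final in current

def near_0101(s, budget=2):
    """True if some length-4 window of s differs from 0101 in at most `budget` places."""
    return any(sum(a != b for a, b in zip(s[i:i + 4], "0101")) <= budget
               for i in range(len(s) - 3))

nfa_mismatches = [s
                  for n in range(10)
                  for s in map("".join, product("01", repeat=n))
                  if nfa_accepts(s) != near_0101(s)]
print(nfa_mismatches)
```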

In [None]:
h2_0101_nfa = md2mc(h2_0101_nfa_md)

In [None]:
DO_h2_0101_nfa = dotObj_nfa(h2_0101_nfa)
DO_h2_0101_nfa

In [None]:
minD_h2_0101_nfa = min_dfa(nfa2dfa(h2_0101_nfa))
DO_minD_h2_0101_nfa = dotObj_dfa(minD_h2_0101_nfa)
DO_minD_h2_0101_nfa

In [None]:
iso_dfa(minD_h2_0101_re, minD_h2_0101_nfa)

# We will now illustrate NFA to RE conversion

The workhorse function is del_gnfa_states
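
For orientation, the textbook state-elimination idea behind NFA to RE conversion can be sketched in a few lines: add a fresh start and end state, then remove the original states one at a time, replacing each path `p -> r -> q` through the removed state `r` by the label `R(p,r) R(r,r)* R(r,q)`. The code below is only an illustrative sketch, not Jove's `del_gnfa_states`. It emits Python `re` syntax, assumes the state names `START` and `END` are unused, and assumes each input edge label is a single symbol or an already grouped expression.

```python
import re
from itertools import product

def union(a, b):
    """a|b, where None stands for 'no edge'."""
    if a is None:
        return b
    if b is None:
        return a
    return f"(?:{a}|{b})"

def concat(a, b):
    """ab; the result is 'no edge' if either part is missing."""
    return None if a is None or b is None else a + b

def star(a):
    """a*; a missing or epsilon self-loop contributes only epsilon."""
    return "" if not a else f"(?:{a})*"

def eliminate(edges, states, start, finals):
    """edges: {(p, q): regex}. Returns one regex for the whole automaton."""
    g = dict(edges)
    g[("START", start)] = ""          # epsilon edge from the fresh start state
    for f in finals:
        g[(f, "END")] = ""            # epsilon edge into the fresh end state
    remaining = list(states)
    while remaining:
        r = remaining.pop()           # state to delete in this round
        loop = star(g.get((r, r)))
        others = remaining + ["START", "END"]
        for p in others:
            for q in others:
                via = concat(concat(g.get((p, r)), loop), g.get((r, q)))
                new = union(g.get((p, q)), via)
                if new is not None:
                    g[(p, q)] = new
        g = {k: v for k, v in g.items() if r not in k}
    return g.get(("START", "END"))

# Two-state machine for "odd number of 1's" over {0,1}.
odd_edges = {("Even", "Even"): "0", ("Even", "Odd"): "1",
             ("Odd", "Odd"): "0", ("Odd", "Even"): "1"}
odd_re = eliminate(odd_edges, ["Even", "Odd"], "Even", ["Odd"])
print(odd_re)
```

The order in which states are deleted changes the shape and size of the resulting RE but not its language, which is one reason the intermediate GNFAs are worth inspecting step by step.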


In [None]:
help(del_gnfa_states)

In [None]:
gnfamd1=mk_gnfa(nfamd1)
dotObj_gnfa(gnfamd1)

In [None]:
(Gfinal, dotObj_List, final_re_str) = del_gnfa_states(gnfamd1)

In [None]:
final_re_str

In [None]:
dotObj_List[0]

In [None]:
dotObj_List[1]

In [None]:
dotObj_List[2]

In [None]:
dotObj_List[3]

In [None]:
dotObj_List[4]

In [None]:
len(dotObj_List)

In [None]:
dotObj_List[11]

In [None]:
final_re_str

# Let us go full circle!

The final RE string obtained above is fed back into the conversion pipeline (re2nfa, nfa2dfa, min_dfa).


In [None]:
fullcircle=min_dfa(nfa2dfa(re2nfa(final_re_str)))

In [None]:
dotObj_dfa(fullcircle)

In [None]:
# This is the two-bit Hamming distance machine with respect to the intended pattern 0101

h2_from_re = min_dfa(nfa2dfa(re2nfa("(0+1)(0+1)01 + (0+1)1(0+1)1 + (0+1)10(0+1) + 0(0+1)(0+1)1 + 0(0+1)0(0+1) + 01(0+1)(0+1)")))

In [None]:
dotObj_dfa(h2_from_re)

In [None]:
iso_dfa(fullcircle,h2_from_re)

In [None]:
aplusb_aplusb = dotObj_nfa(re2nfa("(a+b)(a+b)"), True)

In [None]:
aplusb_aplusb

In [None]:
DOodd1s_or_30s = dotObj_nfa(re2nfa("0* 1 0* (1 0* 1 0*)* + 1* 0 1* 0 1* 0 1* "), True)

In [None]:
DOodd1s_or_30s

In [None]:
DOodd1s_or_30s = dotObj_nfa(re2nfa("0* 1 0* (1 0* 1 0*)* + 1* 0 1* 0 1* 0 1* "), False)

In [None]:
DOodd1s_or_30s

In [None]:
DOodd1s_or_30s_mind = dotObj_dfa(min_dfa(nfa2dfa(re2nfa("0* 1 0* (1 0* 1 0*)* + 1* 0 1* 0 1* 0 1* "))))
DOodd1s_or_30s_mind

In [None]:
nfaEx = md2mc('''NFA
I : '' -> B
I : a  -> A
!!A : b  -> I
A : q  -> F
A : r  -> B
B : s  -> B
B : p  -> F
F : t  -> A
''')
DO_nfaEx = dotObj_nfa(nfaEx)
DO_nfaEx

In [None]:
GNFA_nfaEx = mk_gnfa(nfaEx)

In [None]:
help(del_gnfa_states)

In [None]:
(Gfinal, do_list, final_re) = del_gnfa_states(GNFA_nfaEx)

In [None]:
final_re

In [None]:
do_list[0]

In [None]:
do_list[1]

In [None]:
do_list[2]

In [None]:
do_list[2].source

In [None]:
do_list[3]

In [None]:
do_list[3].source

In [None]:
do_list[4]

In [None]:
do_list[4].source

In [None]:
re_mindfa = min_dfa(nfa2dfa(re2nfa(final_re)))

In [None]:
dir_mindfa = min_dfa(nfa2dfa(nfaEx))

In [None]:
iso_dfa(re_mindfa,dir_mindfa)

In [None]:
dotObj_dfa(dir_mindfa)

In [None]:
dotObj_dfa(re_mindfa)

In [None]:
dot_san_str('""')

In [None]:
nfaExp = md2mc('''NFA
I : a -> A1
I : n -> B1
A1 : b -> AB1
B1 : o -> AB1
AB1 : c -> A2
AB1 : p -> B2
A2 : d -> AB2
B2 : q -> AB2
AB2 : e -> A3
AB2 : r -> B3
A3 : f -> AB3
B3 : s -> AB3
AB3 : g -> A4
AB3 : t -> B4
A4 : h -> FAB4
B4 : u -> FAB4
''')
DO_nfaExp = dotObj_nfa(nfaExp)
DO_nfaExp

In [None]:
gnfaExp = mk_gnfa(nfaExp)
DO_gnfaExp = dotObj_gnfa(gnfaExp)
DO_gnfaExp

In [None]:
(Gfinal, dotObj_List, final_re_str) = del_gnfa_states(gnfaExp)

In [None]:
final_re_str

In [None]:
nfaExer = md2mc('''NFA
I1 : a -> X
I2 : b -> X
I3 : c -> X
X  : p | q -> X
X  : m -> F1
X  : n -> F2
''')
DO_nfaExer = dotObj_nfa(nfaExer)
DO_nfaExer
gnfaExer = mk_gnfa(nfaExer)
DO_gnfaExer = dotObj_gnfa(gnfaExer)
DO_gnfaExer
(G, DO, RE) = del_gnfa_states(gnfaExer)

In [None]:
RE

In [None]:
DO_nfaExer

In [None]:
DO[0]

In [None]:
DO[1]

In [None]:
DO[2]

In [None]:
DO[3]

In [None]:
DO[4]

In [None]:
DO[5]

In [None]:
DO[6]

In [None]:
nfaExer = md2mc('''NFA
I1 : a -> X
I2 : b -> X
I3 : c -> X
X  : p | q -> X
X  : m -> F1
X  : n -> F2
''')
DO_nfaExer = dotObj_nfa(nfaExer)
DO_nfaExer
gnfaExer = mk_gnfa(nfaExer)
DO_gnfaExer = dotObj_gnfa(gnfaExer)
DO_gnfaExer
(G, DO, RE) = del_gnfa_states(gnfaExer, DelList=["X", "I1", "I2","I3","F1","F2"])

In [None]:
DO_gnfaExer

In [None]:
len(DO)

In [None]:
RE

In [None]:
DO[0]

In [None]:
DO[1]

In [None]:
DO[2]

In [None]:
DO[3]

In [None]:
DO[4]

In [None]:
DO[5]

In [None]:
DO[6]

In [None]:
RE

In [None]:
sylv_11_13 = min_dfa(nfa2dfa(re2nfa("(11111111111+1111111111111)*")))

In [None]:
dotObj_dfa(sylv_11_13)

In [None]:
sylv_11_13

In [None]:
sylv_3_5 = min_dfa(nfa2dfa(re2nfa("(111+11111)*")))

In [None]:
len(sylv_3_5["Q"]) - 2

In [None]:
3*5-3-5
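
The expression `3*5-3-5` is Sylvester's closed form `p*q - p - q` for the largest non-negative integer that cannot be written as `i*p + j*q` with `i, j >= 0`, valid when `p` and `q` are coprime. Over a unary alphabet, the strings of `(1^p + 1^q)*` are exactly those whose lengths are such sums. The sketch below computes the largest unreachable length by dynamic programming and compares it with the closed form; `largest_unreachable` is an illustrative helper, not part of Jove.

```python
def largest_unreachable(p, q):
    """Largest n >= 0 not of the form i*p + j*q with i, j >= 0.

    Assumes gcd(p, q) == 1 and p, q >= 2, so that such an n exists.
    """
    bound = p * q  # every n >= (p-1)*(q-1) is reachable, so p*q is a safe limit
    reachable = [False] * (bound + 1)
    reachable[0] = True
    for n in range(1, bound + 1):
        reachable[n] = ((n >= p and reachable[n - p]) or
                        (n >= q and reachable[n - q]))
    return max(n for n in range(bound + 1) if not reachable[n])

for p, q in [(3, 5), (11, 13), (3, 13)]:
    print(p, q, largest_unreachable(p, q), p * q - p - q)
```

For non-coprime pairs such as 3 and 6 there is no largest unreachable length, since no length outside the multiples of 3 is ever reachable.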

In [None]:
dotObj_dfa(sylv_3_5)

In [None]:
dotObj_nfa(re2nfa("(111+11111)*"))

In [None]:
nfa_3_5 = re2nfa("(111+11111)*")

In [None]:
nfa_3_5

In [None]:
dotObj_nfa(nfa_3_5)

In [None]:
Gnfa_3_5 = mk_gnfa(nfa_3_5)

In [None]:
Gnfa_3_5

In [None]:
dotObj_gnfa(Gnfa_3_5)

In [None]:
(Gfinal, dotObj_List, final_re_str) = del_gnfa_states(Gnfa_3_5)

In [None]:
len(dotObj_List)

In [None]:
dotObj_List[0]

In [None]:
dotObj_List[1]

In [None]:
dotObj_List[2]

In [None]:
dotObj_List[3]

In [None]:
dotObj_List[4]

In [None]:
dotObj_List[5]

In [None]:
dotObj_List[6]

In [None]:
dotObj_List[7]

In [None]:
dotObj_List[8]

In [None]:
dotObj_List[9]

In [None]:
dotObj_List[10]

In [None]:
dotObj_List[11]

In [None]:
dotObj_List[12]

In [None]:
dotObj_List[13]

In [None]:
dotObj_List[14]

In [None]:
dotObj_List[15]

In [None]:
dotObj_List[16]

In [None]:
dotObj_List[17]

In [None]:
len(dotObj_List)

In [None]:
final_re_str

In [None]:
dotObj_gnfa(mk_gnfa(re2nfa("(111+11111)*")))

In [None]:
minD_renfare = min_dfa(nfa2dfa(re2nfa(final_re_str)))

In [None]:
DOminD_renfare = dotObj_dfa(minD_renfare)

In [None]:
DOminD_renfare

In [None]:
dotObj_dfa(min_dfa(nfa2dfa(nfa_3_5)))

In [None]:
dotObj_dfa(min_dfa(nfa2dfa(re2nfa("(111+11111)*"))))

In [None]:
sylv_3_5 = min_dfa(nfa2dfa(re2nfa("(111+11111)*")))

In [None]:
len(sylv_3_5["Q"]) - 1 - 1

In [None]:
DO_sylv_3_5 = dotObj_dfa(sylv_3_5)

In [None]:
non_sylv_3_6 = min_dfa(nfa2dfa(re2nfa("(111+111111)*")))

In [None]:
DO_non_sylv_3_6 = dotObj_dfa(non_sylv_3_6)

In [None]:
DO_non_sylv_3_6

In [None]:
non_sylv_prefix_and_3_6 = min_dfa(nfa2dfa(re2nfa("111(111+111111)*")))

In [None]:
DO_non = dotObj_dfa(non_sylv_prefix_and_3_6)

In [None]:
DO_non

In [None]:
stamp_3_5_7 = min_dfa(nfa2dfa(re2nfa("(111+11111+1111111)*")))

In [None]:
DOstamp_3_5_7 = dotObj_dfa(stamp_3_5_7)

In [None]:
DOstamp_3_5_7

In [None]:
len(min_dfa(nfa2dfa(re2nfa("(111+1111111111111)*")))["Q"]) - 2

In [None]:
dfaBESame = md2mc('''
DFA !! Begins and ends with same; epsilon allowed
IF  : 0 -> F0
IF  : 1 -> F1
!!
F0  : 0 -> F0
F0  : 1 -> S01
S01 : 1 -> S01
S01 : 0 -> F0
!!
F1  : 1 -> F1
F1  : 0 -> S10
S10 : 0 -> S10
S10 : 1 -> F1
''')
DOdfaBESame = dotObj_dfa(dfaBESame)
DOdfaBESame

In [None]:
nfaBESame = apply_h_dfa(dfaBESame, lambda x: '0')
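
Applying a symbol-to-symbol homomorphism to a DFA can be pictured as relabeling every edge `c` with `h(c)`. When two edges leaving the same state receive the same new label, the result is nondeterministic, which is presumably why the outcome here is treated as an NFA. The following is a hedged sketch of that relabeling on the machine above. `apply_hom` and `nfa_accepts` are hypothetical helpers and are not claimed to reproduce `apply_h_dfa`; the final states are assumed to be `IF`, `F0`, and `F1`.

```python
def apply_hom(dfa_delta, h):
    """Relabel each edge symbol c with h(c); collisions make the result an NFA."""
    nfa_delta = {}
    for (src, c), dst in dfa_delta.items():
        nfa_delta.setdefault((src, h(c)), set()).add(dst)
    return nfa_delta

def nfa_accepts(delta, start, finals, s):
    """Subset simulation of an NFA without epsilon moves."""
    current = {start}
    for c in s:
        current = set().union(*(delta.get((q, c), set()) for q in current))
    return bool(current & finals)

# Transitions of the "begins and ends with the same symbol" DFA defined above.
DFA_DELTA = {
    ("IF", "0"): "F0", ("IF", "1"): "F1",
    ("F0", "0"): "F0", ("F0", "1"): "S01",
    ("S01", "1"): "S01", ("S01", "0"): "F0",
    ("F1", "1"): "F1", ("F1", "0"): "S10",
    ("S10", "0"): "S10", ("S10", "1"): "F1",
}
FINALS = {"IF", "F0", "F1"}

image = apply_hom(DFA_DELTA, lambda c: "0")
print(sorted(image[("IF", "0")]))
print(all(nfa_accepts(image, "IF", FINALS, "0" * n) for n in range(8)))
```

Because the original language contains a string of every length, its image under the map sending both symbols to `0` is every string of 0's, so the determinized and minimized result should be very small.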

In [None]:
nfaBESame

In [None]:
DONFABESame = dotObj_nfa(nfaBESame)

In [None]:
DONFABESame

In [None]:
dotObj_dfa(nfa2dfa(nfaBESame))

In [None]:
dotObj_dfa(min_dfa(nfa2dfa(nfaBESame)))

In [None]:
blimp = md2mc('''
DFA 
I1 : a -> F2
I1 : b -> F3
F2 : a -> S8
F2 : b -> S5
F3 : a -> S7
F3 : b -> S4
S4 : a | b -> F6
S5 : a | b -> F6
F6 : a | b -> F6
S7 : a | b -> F6
S8 : a -> F6
S8 : b -> F9
F9 : a -> F9
F9 : b -> F6
''')

In [None]:
blimpnfa = apply_h_dfa(blimp, lambda x: 'a')

In [None]:
dotObj_nfa(blimpnfa)

In [None]:
dotObj_dfa(nfa2dfa(blimpnfa))

In [None]:
dotObj_dfa(min_dfa(nfa2dfa(blimpnfa)))

In [None]:
testdfa = md2mc('''DFA
I : 0 | 1 -> I
I : 2 -> F
''')

In [None]:
dotObj_dfa_w_bh(testdfa)

In [None]:
help(dotObj_dfa_w_bh)

In [None]:
dotObj_dfa(testdfa, FuseEdges=True)

In [None]:
dotObj_dfa(min_dfa(nfa2dfa(re2nfa("(111+11111)*"))))

In [None]:
dotObj_dfa(min_dfa(nfa2dfa(re2nfa("(111+1111111111111)*"))))

In [None]:
ResetStNum()

In [None]:
dotObj_dfa(min_dfa(nfa2dfa(re2nfa("(111+1111111111111)*"))))

In [None]:
ResetStNum()

In [None]:
dotObj_dfa(min_dfa(nfa2dfa(re2nfa("(111+1111111111111)*"))))