In [None]:
import { display } from "tslab";
import { readFileSync } from "fs";

const css = readFileSync("../style.css", "utf8");
display.html(`<style>${css}</style>`);

# Converting a Non-Deterministic <span style="font-variant:small-caps;">Fsm</span> into a Deterministic <span style="font-variant:small-caps;">Fsm</span>

In this notebook we show how a non-deterministic <span style="font-variant:small-caps;">Fsm</span>
$$ F = \langle Q, \Sigma, \delta, q_0, A \rangle $$
can be transformed into a deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ such that both <span style="font-variant:small-caps;">Fsm</span>s accept the
same language, that is, we have
$$ L(F) = L\bigl(\texttt{det}(F)\bigr). $$
The idea behind this transformation is that the <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ has to 
compute the set of all states that the <span style="font-variant:small-caps;">Fsm</span> $F$ could be in. 
Hence the states of the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ are 
**sets** of states of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$.  Each such set contains all those states that the non-deterministic <span style="font-variant:small-caps;">Fsm</span> 
$F$ could have reached.  Furthermore, a set $M$ of states of the <span style="font-variant:small-caps;">Fsm</span> $F$ is an accepting state of the <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ if the set $M$ contains an accepting state of the <span style="font-variant:small-caps;">Fsm</span> $F$.

## Declaring the Necessary Types

To implement the mathematical concept of "sets of sets" efficiently in TypeScript, we utilize the `RecursiveSet` library. This library allows sets to contain other sets as elements and supports using sets as keys in maps based on their structural content, which is essential for constructing the states of our deterministic machine.

In [None]:
import { RecursiveSet } from "recursive-set";

`State` is the abstract type of the states of an <span style="font-variant:small-caps;">Fsm</span>.

In [None]:
type State = string | number;

In [None]:
type Char = string;

We represent the non-deterministic transition relation $\delta$ by the following type:

In [None]:
type TransRel = Map<string, RecursiveSet<State>>;

The type of the deterministic transition relation leverages `RecursiveSet`. The states of the DFA are themselves sets of states from the NFA (elements of the power set $2^Q$).

In [None]:
type TransRelDet = Map<string, RecursiveSet<State>>;

A non-deterministic finite state machine (NFA) is defined by the following type:

In [None]:
type NFA = {
  Q: RecursiveSet<State>;
  Sigma: RecursiveSet<Char>;
  delta: TransRel;
  q0: State;
  A: RecursiveSet<State>;
};

The deterministic finite state machine (DFA) produced in this notebook has the following type.

Note that `Q` is a set of sets (`RecursiveSet<DFAState>`), reflecting that each DFA state corresponds to a subset of NFA states.

In [None]:
type DFAState = RecursiveSet<State>;

In [None]:
type DFA = {
  Q: RecursiveSet<DFAState>;
  Sigma: RecursiveSet<Char>;
  delta: TransRelDet;
  q0: DFAState;
  A: RecursiveSet<DFAState>;
};

<hr style="height:5px;background-color:blue">

In order to present the construction of $\texttt{det}(F)$ we first have to define several auxiliary functions.

First, we define `key`, a helper function to generate unique identifiers for transition lookups given a state (or a set of states) and a character.

In [None]:
function key(q: State | RecursiveSet<State>, c: Char): string {
  return `${q.toString()},${c}`;
}
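
The reason for looking up transitions through string identifiers can be seen with JavaScript's built-in collections, which compare object keys by reference rather than by content. The following sketch uses only the built-in `Map` and `Set`; the helper `canonicalKey` is a hypothetical name introduced for illustration and is not part of the `recursive-set` library.

```typescript
// A built-in Map compares object keys by reference, so two sets with
// the same elements are treated as two different keys.
const byReference = new Map<Set<number>, string>();
byReference.set(new Set([1, 2]), "target");
const missed = byReference.get(new Set([1, 2])); // a different object: lookup fails

// Hypothetical helper: build a canonical string from a set of states and a
// character. Sorting makes the key independent of insertion order.
function canonicalKey(states: Set<number>, c: string): string {
  const sorted = Array.from(states).sort((a, b) => a - b);
  return `{${sorted.join(",")}},${c}`;
}

const byContent = new Map<string, string>();
byContent.set(canonicalKey(new Set([2, 1]), "a"), "target");
const found = byContent.get(canonicalKey(new Set([1, 2]), "a")); // same content: lookup succeeds

console.log(missed, found);
```

A string key only works if equal sets always produce the same string, which is why the sketch sorts the elements before joining them.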

Next is the function `bigUnion`. Given a set `M` that contains sets of elements, the expression `bigUnion(M)` returns the union of all these inner sets:
$$ \texttt{bigUnion}(M) = \bigcup M = \bigl\{ x \bigm| \exists A \in M: x \in A \bigr\}. $$

In [None]:
function bigUnion(sets: RecursiveSet<DFAState>): DFAState {
  const result = new RecursiveSet<State>();
  
  for (const subset of sets) {
    // subset is typed as RecursiveSet<State> automatically
    for (const x of subset) {
      result.add(x); 
    }
  }
  return result;
}

In [None]:
const input = new RecursiveSet<RecursiveSet<number>>(
  new RecursiveSet(1, 2, 3),
  new RecursiveSet(2, 3, 4),
  new RecursiveSet(3, 4, 5)
);

const result = bigUnion(input);

console.log("Result:", result.toString());
// Expected: {1, 2, 3, 4, 5}

The function `epsClosure` takes two arguments:
- `s` is a state, 
- `delta` is the transition function of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$.

The function computes the set of all those states that can be reached from the state
`s` via $\varepsilon$-transitions.
Formally, the set $\texttt{epsClosure}(s)$ is defined inductively:
- $s \in \texttt{epsClosure}(s)$.
- $p \in \texttt{epsClosure}(s) \wedge r \in \delta(p, \varepsilon) \;\rightarrow\; r \in \texttt{epsClosure}(s)$.
 
  If the state $p$ is an element of the $\varepsilon$-closure of the state $s$ 
  and there is an $\varepsilon$-transition from $p$ to some state $r$, then $r$ 
  is also an element of the $\varepsilon$-closure of $s$.
  
The implementation of `epsClosure` uses a *fixed-point algorithm*:
We start with the set `{s}` and iteratively add all states reachable via $\varepsilon$-transitions until no new states can be added (i.e., the set of reachable states stabilizes).

In [None]:
function epsClosure(s: State, delta: TransRel): RecursiveSet<State> {
  let result = new RecursiveSet<State>(s);

  while (true) {
    const setsToUnite = new RecursiveSet<RecursiveSet<State>>();
    
    for (const q of result) {
        const targets = delta.get(key(q, 'ε'));
        if (targets) {
            setsToUnite.add(targets);
        }
    }
    
    const newStates = bigUnion(setsToUnite);
      
    if (newStates.isSubset(result)) {
        return result;
    }
      
    result = result.union(newStates);
  }
}
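
To see the closure computation on concrete data, here is a minimal sketch that uses only the built-in `Map` and `Set` with numeric states. It is a worklist variant of the same fixed-point idea and does not use the `recursive-set` API; the names `EpsEdges` and `closureOf` are hypothetical.

```typescript
// Hypothetical representation: for each state, the states reachable
// by exactly ONE ε-transition.
type EpsEdges = Map<number, number[]>;

function closureOf(s: number, eps: EpsEdges): Set<number> {
  const reached = new Set<number>([s]); // base case: s is in its own closure
  const pending: number[] = [s];        // states whose ε-successors are still unexplored
  while (pending.length > 0) {
    const p = pending.pop()!;
    for (const r of eps.get(p) ?? []) {
      if (!reached.has(r)) {            // inductive case: p in closure, r in δ(p, ε)
        reached.add(r);
        pending.push(r);
      }
    }
  }
  return reached;
}

// ε-transitions 0 → 1, 1 → 2, 2 → 0 form a cycle; state 3 has none.
const epsEdges: EpsEdges = new Map<number, number[]>([
  [0, [1]],
  [1, [2]],
  [2, [0]],
]);
const closure0 = Array.from(closureOf(0, epsEdges)).sort((a, b) => a - b);
const closure3 = Array.from(closureOf(3, epsEdges));
console.log(closure0, closure3);
```

The cycle among the states 0, 1 and 2 shows why the membership check is needed: without it the loop would never terminate.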

In order to transform a non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$ into a deterministic 
<span style="font-variant:small-caps;">Fsm</span>
$\texttt{det}(F)$ we have to extend the function $\delta:Q \times \bigl(\Sigma \cup \{\varepsilon\}\bigr) \rightarrow 2^Q$ into the function
$$\widehat{\delta}: Q \times \Sigma \rightarrow 2^Q. $$
The idea is that given a state $q$ and a character $c$,  the value of $\widehat{\delta}(q,c)$ is the set of all states that the
<span style="font-variant:small-caps;">Fsm</span> $F$ could reach when it reads the character $c$ in state $q$ and then performs an arbitrary number of $\varepsilon$-transitions.  Formally, the definition of $\widehat{\delta}$ is as follows:
$$ \widehat{\delta}(q_1, c) := \bigcup \bigl\{ \texttt{epsClosure}(q_2) \bigm| q_2 \in \delta(q_1, c) \bigr \}. $$
This formula is to be read as follows:
- For every state $q_2 \in Q$ that can be reached from the state $q_1$ by reading the character $c$ we
  compute $\texttt{epsClosure}(q_2)$.
- Then we take the union of all these sets $\texttt{epsClosure}(q_2)$.

The function $\widehat{\delta}$ is implemented as the function `deltaHat`, which takes three arguments:
- `s` is a state,
- `c` is a character,
- `delta` is the transition function of a non-deterministic 
  <span style="font-variant:small-caps;">Fsm</span>.

This function computes the set of all those states that can be reached 
from `s` when we first have a transition from state `s` to some state `p` 
on reading the character `c` followed by any number of $\varepsilon$-transitions
starting in `p`.

In [None]:
function deltaHat(s: State, c: Char, delta: TransRel): RecursiveSet<State> {
  const directTargets = delta.get(key(s, c));
  
  if (!directTargets) {
      return new RecursiveSet<State>();
  }

  const closures = new RecursiveSet<RecursiveSet<State>>();
  
  for (const q of directTargets) {
      closures.add(epsClosure(q, delta));
  }

  return bigUnion(closures);
}

The function  $\widehat{\delta}$ maps a state into a set of states.  Since the <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ uses sets of states of the <span style="font-variant:small-caps;">Fsm</span> $F$ as its states we need a function that maps sets of states of the <span style="font-variant:small-caps;">Fsm</span> $F$ into sets of states.  Hence we generalize 
the function $\widehat{\delta}$ to the function
$$ \Delta: 2^Q \times \Sigma \rightarrow 2^Q $$
such that for a set $M$ of states and a character $c$ the expression $\Delta(M, c)$
computes the set of all those states that the <span style="font-variant:small-caps;">Fsm</span> $F$ could be in if it is in a state from the set $M$, then
reads the character $c$, and finally makes some $\varepsilon$-transitions.
The formal definition is as follows: 
$$ \Delta(M,c) := \bigcup \bigl\{ \widehat{\delta}(q,c) \bigm| q \in M \bigr\}. $$
This formula is easy to understand:  For every state  $q \in M$ we compute the set of states that the
<span style="font-variant:small-caps;">Fsm</span> could be in after reading the character $c$ and doing some 
$\varepsilon$-transitions.  Then we take the union of these sets.
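
As a small worked example, consider a hypothetical automaton (chosen only for illustration) with $Q = \{0, 1, 2\}$, $\Sigma = \{a\}$, and the following transitions, all other values of $\delta$ being empty:
$$ \delta(0, a) = \{0, 1\}, \quad \delta(1, \varepsilon) = \{2\}, \quad \delta(2, a) = \{0\}. $$
Then $\texttt{epsClosure}(0) = \{0\}$, $\texttt{epsClosure}(1) = \{1, 2\}$, and $\texttt{epsClosure}(2) = \{2\}$.  Therefore we have
$$ \widehat{\delta}(0, a) = \texttt{epsClosure}(0) \cup \texttt{epsClosure}(1) = \{0, 1, 2\}, \quad \widehat{\delta}(1, a) = \{\}, \quad \widehat{\delta}(2, a) = \texttt{epsClosure}(0) = \{0\}. $$
For the set $M = \{1, 2\}$ this yields
$$ \Delta(M, a) = \widehat{\delta}(1, a) \cup \widehat{\delta}(2, a) = \{\} \cup \{0\} = \{0\}. $$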

The function `capitalDelta` implements $\Delta$.

In [None]:
function capitalDelta(
  M: RecursiveSet<State>,
  c: Char,
  delta: TransRel
): RecursiveSet<State> {
  const partials = new RecursiveSet<RecursiveSet<State>>();
  
  for (const q of M) {
    partials.add(deltaHat(q, c, delta));
  }
  return bigUnion(partials);
}

The function `allStates` takes three arguments:
- `Q0` is $\texttt{epsClosure}(q_0)$, where $q_0$ is the start state of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$.  This set is the start state of the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$,
- `delta` is the transition function $\delta$ of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$, and
- `Sigma` is the alphabet $\Sigma$ of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$.

The function `allStates` computes the set of all states of the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$
that can be reached from the start state.

In [None]:
function allStates(
  Q0: RecursiveSet<State>, 
  delta: TransRel,
  Sigma: RecursiveSet<Char>
): RecursiveSet<DFAState> {
  
  const states = new RecursiveSet<DFAState>();
  const queue: DFAState[] = [Q0];
  
  states.add(Q0);

  while (queue.length > 0) {
    const M = queue.shift()!;

    for (const c of Sigma) {
      const N = capitalDelta(M, c, delta); 

      if (!states.has(N)) {
        states.add(N);
        queue.push(N);
      }
    }
  }
  return states;
}

The function `allStatesFixedPoint` is an alternative implementation that computes the same set of states.  Instead of a work queue it uses a fixed-point iteration: starting with the set containing only `Q0`, it repeatedly adds all successors $\Delta(M, c)$ of the states found so far until no new states appear.

In [None]:
function allStatesFixedPoint(
  Q0: RecursiveSet<State>, 
  delta: TransRel,
  Sigma: RecursiveSet<Char>
): RecursiveSet<DFAState> {
  let result = new RecursiveSet<DFAState>();
  result.add(Q0);

  while (true) {
    const newStates = new RecursiveSet<DFAState>();
    
    for (const M of result) {
       for (const c of Sigma) {
           newStates.add(capitalDelta(M, c, delta));
       }
    }

    if (newStates.isSubset(result)) {
        return result;
    }
    
    result = result.union(newStates);
  }
}
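
The exploration of reachable states can be illustrated in isolation with built-in types only. In the sketch below, subsets of NFA states are represented by canonical strings such as `"{0,1}"`, and the transition function $\Delta$ is replaced by a lookup in a hand-written toy table. The names `Step`, `reachable`, `table`, and `toyStep` are hypothetical and the `recursive-set` API is not used.

```typescript
// Hypothetical stand-in for Δ: maps a subset (as canonical string) and a
// character to the successor subset (as canonical string).
type Step = (subset: string, c: string) => string;

function reachable(start: string, alphabet: string[], step: Step): Set<string> {
  const seen = new Set<string>([start]);
  const queue: string[] = [start];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const c of alphabet) {
      const next = step(current, c);
      if (!seen.has(next)) { // only unseen subsets are explored further
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return seen;
}

// Toy table for a three-state NFA with states 0, 1, 2 over the alphabet {a, b}.
const table = new Map<string, string>([
  ["{0},a", "{0,1}"],
  ["{0},b", "{0}"],
  ["{0,1},a", "{0,1}"],
  ["{0,1},b", "{0,2}"],
  ["{0,2},a", "{0,1}"],
  ["{0,2},b", "{0}"],
]);
// Missing entries are treated as the empty set.
const toyStep: Step = (subset, c) => table.get(`${subset},${c}`) ?? "{}";

const dfaStates = Array.from(reachable("{0}", ["a", "b"], toyStep)).sort();
console.log(dfaStates);
```

In this toy example only three of the $2^3 = 8$ possible subsets are reachable from the start state, which is why the construction explores states starting from `Q0` instead of enumerating the whole power set.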

Now we are ready to formally define how the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$
is computed from the non-deterministic <span style="font-variant:small-caps;">Fsm</span>
$F = \bigl\langle Q, \Sigma, \delta, q_0, A \bigr\rangle$.
We define: 
$$ \texttt{det}(F) := \bigl\langle \texttt{allStates}(\texttt{epsClosure}(q_0)), \Sigma, \Delta, \texttt{epsClosure}(q_0), \widehat{A} \bigr\rangle $$
where the components of this tuple are given as follows:
- The set of states of $\texttt{det}(F)$ is the set of all states that can be reached from the set $\texttt{epsClosure}(q_0)$.
- The input alphabet $\Sigma$ does not change when going from $F$ to $\texttt{det}(F)$.
  After all, the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ has to recognize the same language as the non-deterministic
  <span style="font-variant:small-caps;">Fsm</span> $F$.
- The function $\Delta$, which has been defined previously, specifies how the set of states changes when a
  character is read.
- The start state $\texttt{epsClosure}(q_0)$ of the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ is the set of all states
  that can be reached from the start state $q_0$ of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$
  via $\varepsilon$-transitions.
- The set of accepting states $\widehat{A}$ is the set of those subsets of $Q$ that contain an accepting
  state of the <span style="font-variant:small-caps;">Fsm</span> $F$:
  $$\widehat{A} := \bigl\{ M \in 2^Q \mid M \cap A \not= \{\} \bigr\}. $$

In [None]:
function nfa2dfa(nfa: NFA): DFA {
  const { Sigma, delta, q0, A } = nfa;

  const newStart: DFAState = epsClosure(q0, delta);  
  
  const newStates: RecursiveSet<DFAState> = allStates(newStart, delta, Sigma);
  
  const newDelta: TransRelDet = new Map();
  
  for (const M of newStates) {
    for (const c of Sigma) {
        const N = capitalDelta(M, c, delta);
        newDelta.set(key(M, c), N);
    }
  }

  const newFinal = new RecursiveSet<DFAState>();
  
  for (const M of newStates) {
    const intersection = M.intersection(A);
    
    if (!intersection.isEmpty()) {
      newFinal.add(M);
    }
  }

  return {
    Q: newStates,
    Sigma,
    delta: newDelta,
    q0: newStart,
    A: newFinal
  };
}

To test this function, use the notebook `02-Test-NFA-2-DFA.ipynb`.