In [1]:
import { display } from "tslab";
import { readFileSync } from "fs";

const css = readFileSync("../style.css", "utf8");
display.html(`<style>${css}</style>`);

# Converting a Non-Deterministic <span style="font-variant:small-caps;">Fsm</span> into a Deterministic <span style="font-variant:small-caps;">Fsm</span>

In this notebook we show how a non-deterministic <span style="font-variant:small-caps;">Fsm</span>
$$ F = \langle Q, \Sigma, \delta, q_0, A \rangle $$
can be transformed into a deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ such that both <span style="font-variant:small-caps;">Fsm</span>s accept the
same language, that is, we have
$$ L(F) = L\bigl(\texttt{det}(F)\bigr). $$
The idea behind this transformation is that the <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ has to 
compute the set of all states that the <span style="font-variant:small-caps;">Fsm</span> $F$ could be in. 
Hence the states of the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ are 
**sets** of states of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$.  A set of these states contains all those states that the non-deterministic <span style="font-variant:small-caps;">Fsm</span> 
$F$ could have reached.  Furthermore, a set $M$ of states of the <span style="font-variant:small-caps;">Fsm</span> $F$ is an accepting state of the <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ if the set $M$ contains an accepting state of the <span style="font-variant:small-caps;">Fsm</span> $F$.

## Declaring the Necessary Types

`State` is the abstract type of the states of an <span style="font-variant:small-caps;">Fsm</span>.

In [2]:
type State = string | number;

In [3]:
type Char = string;

We represent the non-deterministic transition relation $\delta$ by the following type:

In [4]:
type TransRel = Map<string, Set<State>>;
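
As a small illustration, the following sketch encodes a fragment of a made-up machine in this representation. It assumes, in line with the helper `key` defined further down in this notebook, that the entry for state `q` and character `c` is stored under the string `"q,c"` and that $\varepsilon$-transitions use the character `'ε'`. The name `exampleDelta` and the transitions themselves are hypothetical.

```typescript
type State = string | number;
type TransRel = Map<string, Set<State>>;

// Hypothetical fragment: 0 --a--> {0, 1} and 0 --ε--> {2}.
// The "state,character" key format is an assumption taken from the
// helper `key` that appears later in the notebook.
const exampleDelta: TransRel = new Map<string, Set<State>>([
  ["0,a", new Set<State>([0, 1])],
  ["0,ε", new Set<State>([2])],
]);

// A missing key means "no transition", which we read as the empty set.
const targets = exampleDelta.get("0,b") ?? new Set<State>();
console.log(exampleDelta.get("0,a"), targets.size);
```

Storing only the non-empty entries keeps the map small; every lookup then has to handle the `undefined` case.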

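The deterministic transition relation has the same TypeScript type, but it is interpreted differently, since the states of the deterministic <span style="font-variant:small-caps;">Fsm</span> are sets of states of the non-deterministic <span style="font-variant:small-caps;">Fsm</span>: a key encodes a set of states (via `setKey`, defined below) together with a character, and the value is the successor set.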

In [5]:
type TransRelDet = Map<string, Set<State>>;

A non-deterministic finite state machine has the following type:

In [6]:
type NFA = {
  Q: Set<State>;
  Sigma: Set<Char>;
  delta: TransRel;
  q0: State;
  A: Set<State>;
};

The deterministic finite state machines produced in this notebook have the following type.

In [7]:
type DFA = {
  Q: Set<Set<State>>; // set of DFA states (each one is a set of NFA states)
  Sigma: Set<Char>;
  delta: TransRelDet; // "setKey(M),c" -> N
  q0: Set<State>;
  A: Set<Set<State>>;
};

<hr style="height:5px;background-color:blue">

In order to present the construction of $\texttt{det}(F)$ we first have to define several auxiliary functions.
We start with the function `bigUnion`.  Given a set `M` whose elements are sets, the expression `bigUnion(M)`
returns the union of all sets in `M`, i.e. we have
$$ \texttt{bigUnion}(M) = \bigcup M = \bigl\{ x \bigm| \exists A \in M: x \in A \bigr\}. $$
The result is returned as a new `Set`.

In [8]:
function bigUnion<T>(sets: Set<Set<T>>): Set<T> {
  const result = new Set<T>();
  for (const subset of sets) {
    for (const x of subset) result.add(x);
  }
  return result;
}

In [9]:
const input = new Set([
  new Set([1, 2, 3]),
  new Set([2, 3, 4]),
  new Set([3, 4, 5]),
]);

const result = bigUnion(input);

console.log("Result:", result);
// Expected: Set {1, 2, 3, 4, 5}

// For a more readable view, print the elements as a numerically sorted array:
console.log("Result as array:", Array.from(result).sort((a, b) => a - b));
// Expected: [1, 2, 3, 4, 5]

Result: Set(5) { 1, 2, 3, 4, 5 }
Result as array: [ 1, 2, 3, 4, 5 ]


The following helper functions turn states and sets of states into string keys for the transition maps:

In [11]:
function key(q: State, c: Char): string {
  return `${q},${c}`;
}

function toSortedArray(s: Set<State>): (string | number)[] {
  // States can be numbers or strings -> compare them as strings to get a consistent order
  return Array.from(s).sort((a, b) => String(a).localeCompare(String(b)));
}

function setKey(s: Set<State>): string {
  return JSON.stringify(toSortedArray(s));
}
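
The reason for `setKey` is that JavaScript compares `Set` objects by reference, not by content. The sketch below illustrates this; it repeats the two helpers from the cell above so that it runs on its own, and the names `m1`, `m2` and `outer` are only for illustration.

```typescript
type State = string | number;

// Repeated from the cell above so that the snippet is self-contained.
function toSortedArray(s: Set<State>): State[] {
  return Array.from(s).sort((a, b) => String(a).localeCompare(String(b)));
}
function setKey(s: Set<State>): string {
  return JSON.stringify(toSortedArray(s));
}

const m1 = new Set<State>([2, 1]);
const m2 = new Set<State>([1, 2]);

// Both objects have the same elements, but a Set of Sets keeps both,
// because membership is decided by object identity.
const outer = new Set<Set<State>>([m1, m2]);
console.log(outer.size);                // 2

// The string keys coincide, so they can be used to detect equal sets.
console.log(setKey(m1), setKey(m2));    // [1,2] [1,2]
```

This is why the functions below keep track of already visited sets by their string key rather than by the `Set` object itself.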


The function `epsClosure` takes two arguments:
- `s` is a state, 
- `delta` is the transition function of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$.

The function computes the set of all those states that can be reached from the state
`s` via $\varepsilon$-transitions.
Formally, the set $\texttt{epsClosure}(s)$ is defined inductively:
- $s \in \texttt{epsClosure}(s)$.
- $p \in \texttt{epsClosure}(s) \wedge r \in \delta(p, \varepsilon) \;\rightarrow\; r \in \texttt{epsClosure}(s)$.
 
  If the state $p$ is an element of the $\varepsilon$-closure of the state $s$ 
  and there is an $\varepsilon$-transition from $p$ to some state $r$, then $r$ 
  is also an element of the $\varepsilon$-closure of $s$.
  
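The implementation of `epsClosure` uses a *fixed-point algorithm*: starting with the set $\{s\}$, it repeatedly adds every state that is reachable by a single $\varepsilon$-transition from a state already in the set, and it stops as soon as the set no longer grows.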

In [12]:
function epsClosure(s: State, delta: TransRel): Set<State> {
  let result = new Set<State>([s]);
  while (true) {
    const newStates = new Set<State>();
    for (const q of result) {
      const targets = delta.get(key(q, 'ε'));
      if (targets) for (const t of targets) newStates.add(t);
    }
    const combined = new Set([...result, ...newStates]);
    if (combined.size === result.size) return result;
    result = combined;
  }
}
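
For comparison, the same closure can also be computed with a worklist that visits every state only once. The sketch below is an alternative formulation, not the implementation used in this notebook; the name `epsClosureWorklist` and the example transitions are hypothetical, and the key format `"q,ε"` is assumed to match the helper `key`.

```typescript
type State = string | number;
type TransRel = Map<string, Set<State>>;

// Hypothetical worklist variant of epsClosure.
function epsClosureWorklist(s: State, delta: TransRel): Set<State> {
  const closure = new Set<State>([s]);
  const todo: State[] = [s];
  while (todo.length > 0) {
    const q = todo.pop()!;
    // Follow every ε-transition leaving q; unseen targets are queued.
    for (const r of delta.get(`${q},ε`) ?? []) {
      if (!closure.has(r)) {
        closure.add(r);
        todo.push(r);
      }
    }
  }
  return closure;
}

// Made-up example: 0 --ε--> 1, 1 --ε--> {2, 0} (a cycle), 2 --a--> 3.
const epsExample: TransRel = new Map<string, Set<State>>([
  ["0,ε", new Set<State>([1])],
  ["1,ε", new Set<State>([2, 0])],
  ["2,a", new Set<State>([3])],
]);

const closure = epsClosureWorklist(0, epsExample);
console.log(Array.from(closure).sort()); // [ 0, 1, 2 ]
```

State 3 is not part of the result because it can only be reached by reading `a`. The cycle between 0 and 1 does not cause an infinite loop, since states that are already in the closure are not queued again.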

In order to transform a non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$ into a deterministic 
<span style="font-variant:small-caps;">Fsm</span>
$\texttt{det}(F)$ we have to extend the function $\delta:Q \times (\Sigma \cup \{\varepsilon\}) \rightarrow 2^Q$ into the function
$$\widehat{\delta}: Q \times \Sigma \rightarrow 2^Q. $$
The idea is that given a state $q$ and a character $c$,  the value of $\widehat{\delta}(q,c)$ is the set of all states that the
<span style="font-variant:small-caps;">Fsm</span> $F$ could reach when it reads the character $c$ in state $q$ and then performs an arbitrary number of $\varepsilon$-transitions.  Formally, the definition of $\widehat{\delta}$ is as follows:
$$ \widehat{\delta}(q_1, c) := \bigcup \bigl\{ \texttt{epsClosure}(q_2) \bigm| q_2 \in \delta(q_1, c) \bigr \}. $$
This formula is to be read as follows:
- For every state $q_2 \in Q$ that can be reached from the state $q_1$ by reading the character $c$ we
  compute $\texttt{epsClosure}(q_2)$.
- Then we take the union of all these sets $\texttt{epsClosure}(q_2)$.
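
As a worked example, assume a hypothetical machine with $\delta(0, a) = \{1, 3\}$ and $\delta(1, \varepsilon) = \{2\}$ that has no other $\varepsilon$-transitions. Then $\texttt{epsClosure}(1) = \{1, 2\}$ and $\texttt{epsClosure}(3) = \{3\}$, so the definition yields
$$ \widehat{\delta}(0, a) = \texttt{epsClosure}(1) \cup \texttt{epsClosure}(3) = \{1, 2\} \cup \{3\} = \{1, 2, 3\}. $$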

The function $\widehat{\delta}$ is implemented as the function `deltaHat`, which takes three arguments:
- `s` is a state,
- `c` is a character,
- `delta` is the transition function of a non-deterministic
  <span style="font-variant:small-caps;">Fsm</span>.

This function computes the set of all those states that can be reached 
from `s` when we first have a transition from state `s` to some state `p` 
on reading the character `c` followed by any number of $\varepsilon$-transitions
starting in `p`.

In [13]:
function deltaHat(s: State, c: Char, delta: TransRel): Set<State> {
  const reachable = new Set<State>();
  const targets = delta.get(key(s, c));
  if (targets) {
    for (const q of targets) {
      const clos = epsClosure(q, delta);
      for (const r of clos) reachable.add(r);
    }
  }
  return reachable;
}

The function  $\widehat{\delta}$ maps a state into a set of states.  Since the <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ uses sets of states of the <span style="font-variant:small-caps;">Fsm</span> $F$ as its states we need a function that maps sets of states of the <span style="font-variant:small-caps;">Fsm</span> $F$ into sets of states.  Hence we generalize 
the function $\widehat{\delta}$ to the function
$$ \Delta: 2^Q \times \Sigma \rightarrow 2^Q $$
such that for a set $M$ of states and a character $c$ the expression $\Delta(M, c)$
computes the set of all those states that the <span style="font-variant:small-caps;">Fsm</span> $F$ could be in if it is in a state from the set $M$, then
reads the character $c$, and finally makes some $\varepsilon$-transitions.
The formal definition is as follows: 
$$ \Delta(M,c) := \bigcup \bigl\{ \widehat{\delta}(q,c) \bigm| q \in M \bigr\}. $$
This formula is easy to understand:  For every state  $q \in M$ we compute the set of states that the
<span style="font-variant:small-caps;">Fsm</span> could be in after reading the character $c$ and doing some 
$\varepsilon$-transitions.  Then we take the union of these sets.

In [14]:
function capitalDelta(
  M: Set<State>,
  c: Char,
  delta: TransRel
): Set<State> {
  const partials = new Set<Set<State>>();
  for (const q of M) partials.add(deltaHat(q, c, delta));
  return bigUnion(partials);
}
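
Since $\Delta(M, c)$ is the union of the $\varepsilon$-closures of all states reachable from $M$ by reading $c$, it can also be computed in a single pass: first collect the targets, then close that set under $\varepsilon$-transitions. The following sketch shows this equivalent formulation on a made-up example. The name `moveAndClose` and the transitions are hypothetical, and the key format is assumed to match the helper `key`.

```typescript
type State = string | number;
type TransRel = Map<string, Set<State>>;

// Hypothetical one-pass computation of Δ(M, c).
function moveAndClose(M: Set<State>, c: string, delta: TransRel): Set<State> {
  const result = new Set<State>();
  const todo: State[] = [];
  // Step 1: all states reachable from some q in M by reading c.
  for (const q of M) {
    for (const r of delta.get(`${q},${c}`) ?? []) {
      if (!result.has(r)) {
        result.add(r);
        todo.push(r);
      }
    }
  }
  // Step 2: close the collected set under ε-transitions.
  while (todo.length > 0) {
    const p = todo.pop()!;
    for (const r of delta.get(`${p},ε`) ?? []) {
      if (!result.has(r)) {
        result.add(r);
        todo.push(r);
      }
    }
  }
  return result;
}

// Made-up example: 0 --a--> 1, 1 --ε--> 2, 3 --a--> 3.
const moveExample: TransRel = new Map<string, Set<State>>([
  ["0,a", new Set<State>([1])],
  ["1,ε", new Set<State>([2])],
  ["3,a", new Set<State>([3])],
]);

const successors = moveAndClose(new Set<State>([0, 3]), "a", moveExample);
console.log(Array.from(successors).sort()); // [ 1, 2, 3 ]
```

Here $\widehat{\delta}(0, a) = \{1, 2\}$ and $\widehat{\delta}(3, a) = \{3\}$, so the result agrees with the union prescribed by the definition of $\Delta$.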

The function `allStates` takes three arguments:
- `Q0` is $\texttt{epsClosure}(q_0)$, where $q_0$ is the start state of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$; this set is the start state of the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$,
- `delta` is the transition function of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$, and
- `Sigma` is the alphabet of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$.

The function `allStates` computes the set of all states of the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$
that can be reached from the start state.

In [15]:
function allStates(
  Q0: Set<State>,
  delta: TransRel,
  Sigma: Set<Char>
): Set<Set<State>> {
  const seen = new Map<string, Set<State>>();
  const queue: Set<State>[] = [];

  const startKey = setKey(Q0);
  seen.set(startKey, Q0);
  queue.push(Q0);

  while (queue.length > 0) {
    const M = queue.shift()!;
    for (const c of Sigma) {
      const N = capitalDelta(M, c, delta); // may be empty, which is fine
      const k = setKey(N);
      if (!seen.has(k)) {
        seen.set(k, N);
        queue.push(N);
      }
    }
  }
  // return the collected sets as a Set<Set<State>>
  return new Set(seen.values());
}
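
To see why only the reachable sets are generated, consider the following self-contained sketch. It explores a hypothetical three-state machine that accepts all words over $\{a, b\}$ whose second-to-last letter is `a`. To keep the example short, the machine is assumed to have no $\varepsilon$-transitions, so the step function is a simplified stand-in for `capitalDelta`. All names in the block are made up for this illustration.

```typescript
type State = string | number;
type TransRel = Map<string, Set<State>>;

// Canonical string key for a set of states (same idea as `setKey`).
function subsetKey(s: Set<State>): string {
  return JSON.stringify(
    Array.from(s).sort((a, b) => String(a).localeCompare(String(b)))
  );
}

// Δ(M, c) for a machine WITHOUT ε-transitions (simplifying assumption).
function stepNoEps(M: Set<State>, c: string, delta: TransRel): Set<State> {
  const N = new Set<State>();
  for (const q of M) {
    for (const r of delta.get(`${q},${c}`) ?? []) N.add(r);
  }
  return N;
}

// Breadth-first exploration of all sets reachable from `start`.
function reachableKeys(start: Set<State>, delta: TransRel, sigma: string[]): string[] {
  const seen = new Map<string, Set<State>>([[subsetKey(start), start]]);
  const queue: Set<State>[] = [start];
  while (queue.length > 0) {
    const M = queue.shift()!;
    for (const c of sigma) {
      const N = stepNoEps(M, c, delta);
      const k = subsetKey(N);
      if (!seen.has(k)) {
        seen.set(k, N);
        queue.push(N);
      }
    }
  }
  return Array.from(seen.keys());
}

// Hypothetical machine: state 0 loops on a and b and guesses the
// second-to-last letter, state 2 is the accepting state.
const nfaDelta: TransRel = new Map<string, Set<State>>([
  ["0,a", new Set<State>([0, 1])],
  ["0,b", new Set<State>([0])],
  ["1,a", new Set<State>([2])],
  ["1,b", new Set<State>([2])],
]);

const reachedKeys = reachableKeys(new Set<State>([0]), nfaDelta, ["a", "b"]);
console.log(reachedKeys); // [ '[0]', '[0,1]', '[0,1,2]', '[0,2]' ]
```

Only 4 of the 8 subsets of $\{0, 1, 2\}$ are reachable in this example, so the deterministic machine has 4 states instead of 8.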

Now we are ready to formally define how the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$
is computed from the non-deterministic <span style="font-variant:small-caps;">Fsm</span>
$F = \bigl\langle Q, \Sigma, \delta, q_0, A \bigr\rangle$.
We define: 
$$ \texttt{det}(F) := \bigl\langle \texttt{allStates}(\texttt{epsClosure}(q_0)), \Sigma, \Delta, \texttt{epsClosure}(q_0), \widehat{A} \bigr\rangle $$
where the components of this tuple are given as follows:
- The set of states of $\texttt{det}(F)$ is the set of all states that can be reached from the set $\texttt{epsClosure}(q_0)$.
- The input alphabet $\Sigma$ does not change when going from $F$ to $\texttt{det}(F)$.
  After all, the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ has to recognize the same language as the non-deterministic
  <span style="font-variant:small-caps;">Fsm</span> $F$.
- The function $\Delta$, which has been defined previously, specifies how the set of states changes when a
  character is read.
- The start state $\texttt{epsClosure}(q_0)$ of the deterministic <span style="font-variant:small-caps;">Fsm</span> $\texttt{det}(F)$ is the set of all states
  that can be reached from the start state $q_0$ of the non-deterministic <span style="font-variant:small-caps;">Fsm</span> $F$
  via $\varepsilon$-transitions.
- The set of accepting states $\widehat{A}$ is the set of those subsets of $Q$ that contain an accepting
  state of the <span style="font-variant:small-caps;">Fsm</span> $F$:
  $$\widehat{A} := \bigl\{ M \in 2^Q \mid M \cap A \not= \{\} \bigr\}. $$

In [16]:
function nfa2dfa(nfa: NFA): DFA {
  const { Sigma, delta, q0, A } = nfa;

  const newStart = epsClosure(q0, delta);
  const newStates = allStates(newStart, delta, Sigma);

  const newDelta: TransRelDet = new Map();
  for (const M of newStates) {
    const Mk = setKey(M);
    for (const c of Sigma) {
      const N = capitalDelta(M, c, delta);
      newDelta.set(`${Mk},${c}`, N);
    }
  }

  const newFinal = new Set<Set<State>>();
  for (const M of newStates) {
    for (const a of A) {
      if (M.has(a)) {
        newFinal.add(M);
        break;
      }
    }
  }

  return { Q: newStates, Sigma, delta: newDelta, q0: newStart, A: newFinal };
}

To test this function, use the notebook `02-Test-NFA-2-DFA.ipynb`.
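
Independently of that notebook, the following sketch shows how a machine of type `DFA` could be run on a word. The function `accepts` is hypothetical and not part of this notebook. The machine is written down by hand and is meant to be the deterministic counterpart of a made-up three-state machine that accepts all words over $\{a, b\}$ whose second-to-last letter is `a`. The key format `"setKey(M),c"` follows the comment in the definition of the type `DFA`.

```typescript
type State = string | number;
type Char = string;
type DFA = {
  Q: Set<Set<State>>;
  Sigma: Set<Char>;
  delta: Map<string, Set<State>>; // "setKey(M),c" -> N
  q0: Set<State>;
  A: Set<Set<State>>;
};

// Repeated from above so that the snippet is self-contained.
function setKey(s: Set<State>): string {
  return JSON.stringify(
    Array.from(s).sort((a, b) => String(a).localeCompare(String(b)))
  );
}

// Hypothetical helper: runs the DFA on `word` and reports acceptance.
function accepts(dfa: DFA, word: string): boolean {
  let current = dfa.q0;
  for (const c of word) {
    const next = dfa.delta.get(`${setKey(current)},${c}`);
    if (next === undefined) return false; // character outside Sigma
    current = next;
  }
  // Compare by key: equal sets may be stored as distinct Set objects.
  const acceptingKeys = new Set(Array.from(dfa.A, setKey));
  return acceptingKeys.has(setKey(current));
}

// Hand-written example DFA with four states.
const s0 = new Set<State>([0]);
const s01 = new Set<State>([0, 1]);
const s012 = new Set<State>([0, 1, 2]);
const s02 = new Set<State>([0, 2]);
const dfa: DFA = {
  Q: new Set<Set<State>>([s0, s01, s012, s02]),
  Sigma: new Set<Char>(["a", "b"]),
  delta: new Map<string, Set<State>>([
    ["[0],a", s01], ["[0],b", s0],
    ["[0,1],a", s012], ["[0,1],b", s02],
    ["[0,1,2],a", s012], ["[0,1,2],b", s02],
    ["[0,2],a", s01], ["[0,2],b", s0],
  ]),
  q0: s0,
  A: new Set<Set<State>>([s012, s02]),
};

console.log(accepts(dfa, "ab"), accepts(dfa, "ba")); // true false
```

The comparison via `setKey` matters for the output of `nfa2dfa` as well: the sets stored in `delta` are created by separate calls of `capitalDelta`, so they are in general not the same objects as the sets stored in `A`, and a direct `dfa.A.has(current)` would then fail.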